Machine Learning

CS 613

Winter 2015

Wednesday 6:00pm - 8:50pm


Course Staff

You can email the staff at CS613-staff at googlegroups.com.

Instructor
Dr. Rachel Greenstadt
Department of Computer Science
Drexel University
Office: University Crossings 140
Tel: 1 215 895 2920
Email: greenie AT cs drexel edu
Office Hours: Wed. 5:00-6:00 or by appt

TAs
Aylin Caliskan-Islam
Office: University Crossings 144
Email: ac993 at drexel edu
Office Hours: Wed. 4:00-5:00

Rebekah Overdorf
Office: University Crossings 144
Email: rjo43 at drexel edu
Office Hours: Wed. 5:00-6:00

Course Overview

This course introduces modern statistical machine learning. Lectures will cover the mathematical foundations and representative algorithms of selected topics in machine learning. Throughout the course, strong emphasis will be given to Bayesian modeling and inference. The topics expected to be covered include fundamentals of probability and decision theory, regression, classification (including support vector machines and relevance vector machines), graphical models (including Bayes nets and Markov random fields), mixture models, clustering, expectation maximization, hidden Markov models, Kalman filtering, and linear dynamical systems.

Multiple in-class quizzes will be given to reinforce understanding of each topic. Students are also expected to implement one or more algorithms from the covered topics and apply them to a real-world problem using appropriate data.

Students entering the class with a sound understanding of probability and statistics will be at an advantage, but the course is designed so that anyone with a strong background or interest in mathematical modeling and analysis can catch up and fully participate. The programming assignment will also be a good opportunity for students to apply the topics they have learned to problems in their own area of interest.

Prerequisites

CS 583 *or* CS 510. There are no official math prerequisites. However, a basic understanding of Linear Algebra and Calculus (fluency at the undergraduate level) will be necessary. For the final project, students will need to program in a language of their own choice (MATLAB, C/C++, Java, etc.). The program must run on tux.cs.drexel.edu or on any general Windows/Mac machine. Students will also perform some hands-on exercises with Weka that make use of the Java API.

Topics

The following is the envisioned schedule of topics to be covered. The timeline is preliminary and will most likely change.
  • Jan 7 week 1 Introduction, probability, basics (decision trees, Bayes) (Weka tutorial)
  • Jan 14 week 2 More probability, unsupervised methods, linear regression (Weka assignment 1 due)
  • Jan 21 week 3 Classification [overview, how to evaluate] (quiz 1)
  • Jan 28 week 4 Aylin Caliskan-Islam lecture Classification 2 [SVM, kernel methods, random forests]
  • Feb 4 week 5 Neural networks, feature selection, deep learning, big data learning (Weka assignment 2 due)
  • Feb 11 week 6 Midterm oral presentations
  • Feb 18 week 7 Graphical Models, mixture models (quiz 2)
  • Feb 25 week 8 Sequential Data
  • March 4 week 9 Bekah Overdorf lecture (secure learning, domain adaptation, review - how to pick a method) (HMM assignment due)
  • March 11 week 10 Final project presentations (quiz 3)
  • March 14-21 Final paper due

Textbook

Pattern Recognition and Machine Learning, by Christopher M. Bishop, Springer, 2006. (ISBN-10: 0387310738, ISBN-13: 978-0387310732) Lectures will follow this book. The Drexel Bookstore should have copies.

Coursework and Grading

Students will take several (likely three) short written quizzes, given either in class or as take-home exams. Each quiz will mainly cover the material lectured since the previous quiz.

There will also be three short, hands-on assignments to learn how to use and evaluate machine learning algorithms in practice.

Students will also implement one of the algorithms covered in the course and use it to infer meaningful information from data they collect themselves or from public data sets. Each student must choose a paper on which the final project will be based; the student should implement the method discussed in the paper and extend it. If the student chooses to tackle something completely novel, a paper directly relevant to the topic must be chosen for presentation at the midterm oral exam.

If a student does not have a clear idea of what to work on for the final project, they can consult the instructor for advice on papers in their area of interest. A student who chooses a completely novel topic must consult with the instructor to ensure that the chosen paper is relevant and covers one of the topics covered in the course.

Students may work on this programming project throughout the entire period of the course. At the end, students must submit the complete data, the code, a writeup in the form of a conference paper, and the slides used for the midterm oral exam.

For the midterm oral exam, each student will give one paper presentation on a topic of their choice. The paper should be the one the student intends to implement and extend for the final project, or one on a directly related topic. There is not enough time in that week for all students to present. As a result, about half of the students will give their presentations in other weeks.

You must be the sole original author of all assignments and examination solutions in their entirety. As the university's policy explains, penalties up to and including a failing grade for the course, with no opportunity to withdraw, will be given for plagiarism, fabrication, or cheating*.

*The standards for originality in a program are similar to those of other written works. Programs by different authors show clear and substantial differences as judged by most criteria, including but not limited to: choice of variable and procedure names, line spacing and indentation, choice of program structure, choice of algorithms, ordering of modules, module design, and ordering and choice of instructions. The original author of an assignment can explain each detail and how they came to create it on their own.

Below is the grading breakdown:

  • Midterm Oral Exam (Paper Presentation): 10%
  • Quizzes: 30%
  • Homework assignments: 15%
  • Project: 45%
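
As a worked illustration of how these weights combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the final score is a simple weighted sum; the syllabus does not state either, and the function name and example scores are hypothetical.

```python
# Weights taken from the grading breakdown above.
WEIGHTS = {
    "midterm_oral_exam": 0.10,
    "quizzes": 0.30,
    "homework": 0.15,
    "project": 0.45,
}


def final_score(scores):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "midterm_oral_exam": 90,
    "quizzes": 80,
    "homework": 100,
    "project": 85,
}
print(final_score(example))
```

With these hypothetical scores the weighted sum is 9 + 24 + 15 + 38.25 = 86.25, up to floating-point rounding. Because the project carries 45% of the weight, it affects the result more than any other component.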