University of Michigan, Winter 2014

Instructor: Clayton Scott (clayscot)

Classroom: EECS 1003

Time: MW 9-10:30

Office: 4433 EECS

Office hours: Monday 1-2 or by appt.

__Prerequisites__:

- Probability at the level of EECS 501 or equivalent
- Experience with formal mathematical proofs

__Recommended books__ (on reserve at Engineering library):

- Devroye, Gyorfi, and Lugosi, *A Probabilistic Theory of Pattern Recognition*, Springer, 1996.
- Mohri, Rostamizadeh, and Talwalkar, *Foundations of Machine Learning*, MIT Press, 2012.

__Surveys and tutorials__:

- Olivier Bousquet, Stephane Boucheron, and Gabor Lugosi, Introduction to Statistical Learning Theory, in O. Bousquet, U.v. Luxburg, and G. Ratsch (editors), Advanced Lectures in Machine Learning, Springer, pp. 169--207, 2004.
- Ambuj Tewari and Peter L. Bartlett, Learning Theory, in Rama Chellappa and Sergios Theodoridis (editors), Academic Press Library in Signal Processing, volume 1, chapter 14. Elsevier, 1st edition, 2013.

__Lecture notes__

- Probabilistic setting
- The Bayes classifier
- Hoeffding's inequality
- Empirical risk minimization
- Vapnik-Chervonenkis Theory. VC classes, Sauer's lemma, DKW theorem, monotone layers and convex sets.
- Sieve Estimators: Consistency and Rates of Convergence
- Dyadic Decision Trees
- Oracle Inequalities and Adaptive Rates. Structural risk minimization. Adapting to relevant features, intrinsic dimension.
- The Bounded Difference Inequality
- Rademacher Complexity. Proof of VC inequality.
- Kernels
- Reproducing Kernel Hilbert Spaces
- Kernel Methods and the Representer Theorem
- Calibrated Surrogate Losses
- Rademacher Complexity of Kernel Classes
- Margin Bounds
- Universal Consistency of Support Vector Machines and other Kernel Methods
- Rates for Linear SVMs under the Hard Margin Assumption
- Kernel Density Estimation
- Weakly Supervised Learning. Anomaly detection, classification with label noise.

__Grading__

Homework (40%)

Final report (40%)

Participation (20%)

__Homework__

Homework will be assigned progressively in lecture and will be due at
regular intervals. You will be given at least one week's advance notice
of the due date, but I recommend solving the problems as they are
assigned, since doing so will help with your understanding of the
lectures.

__Final report__

Each student will choose one or a few research papers
on a particular topic in statistical learning theory, and write a report
that summarizes the contributions of the paper(s), including at least a
sketch of the main technical ideas. Reports will be evaluated by your
peers in a manner that mimics a conference review process. I may be out of
town the last week of classes, so the reports may be due slightly before
then, such as the last Friday before classes end.

__Participation__

Participation consists of attendance, classroom interaction, and scribing
lecture notes.

__Honor Code__

All undergraduate and graduate students are expected to abide by the
College of Engineering Honor Code as stated in the Student Handbook and
the Honor Code Pamphlet.

__Students with Disabilities__

Any student with a documented disability needing academic adjustments or
accommodations is requested to speak with me during the first two weeks of
class. All discussions will remain confidential.