Statistical Learning
Theory - ECE695, fall 2003
Instructor: Shai Ben-David
Time: Wed. 9:30-11:00
Place: Phillips 203
Credit: 2 Grade: S/U Course ID: 047-846
Description:
Statistical Learning Theory focuses on the statistical aspects of automated
learning.
We wish to understand the issue of automated extraction of regularities (orpatterns, or structure)
from randomly generated data.
It is a relatively young field that employs tools ranging from statistics to
computational complexity,
combinatorics and geometry in an attempt to provide theoretical foundations to some important
applications emerging from need to process data sets whose sizes and complexities are beyond the
ability of humans to handle.
This will be a rather informal topics course aiming to demonstrate the type
of questions that this research is dealing with, as well as some of the tools it
can offer.
Prerequisite:
This course is aimed primarily for PhD students and requires some background in
probability and foundations
of computer science.
Syllabus Outline: I wish to cover the following topics (in practice, we shall probably have time for only
a subset, which will be influenced by the participants' interests):
1) The statistical learning problem.
2) Basic pitfalls: Overfitting and the ‘No Free Lunch’ principle.
3) Traditional performance guarantees (based on Chernoff and Hoefding, inequalities).
4) Occam’s Razor – Compression based solutions.
5) Covering numbers and their application to generalization bounds.
6) Combinatorial tools: The Vapnik-Chervonenkis dimension and Sauer’s Lemma.
7) e - Nets and e - Approximations.
8) Applications of VC dimension and e - Nets to Data Structures and other fields.
9) Uniform bounds based on the VC-dimension.
10) Uniform bounds for estimating real valued functions.
11) Data Dependent (including margin-based) generalization bounds.
12) Algorithmic considerations – some basic learning algorithms and computational complexity
lower bounds.
Lecture 2 -Basic generalization theorems (PDF)
Lecture 3 - Description-Size based Bounds (PDF)
Lecture 4 - VC dimension and Sauer's Lemma (PDF)
Lecture 5 - Sample Compression Schemes (PDF)
Lecture 6 -Sample size lower bounds (PDF)
Lecture 7 -Sample size upper bounds for obtaining e -nets (PDF) (draft)
Lecture 8 -VC-dim based upper bounds for classification (PDF) (draft)
Lecture 9 - The Perceptron Algorithm and margins-based sample size bounds (PDF) (draft)