Statistical Learning Theory - ECE695, fall 2003

Instructor: Shai Ben-David
Time: Wed. 9:30-11:00
Place: Phillips 203
Credit: 2  Grade: S/U Course ID: 047-846


Statistical Learning Theory focuses on the statistical aspects of automated learning.

We wish to understand the issue of automated extraction of regularities (orpatterns, or structure)

from randomly generated data.

It is a relatively young field that employs tools ranging from statistics to computational complexity,

combinatorics and geometry in an attempt to provide theoretical foundations to some important

applications emerging from need to process data sets whose sizes and complexities are beyond the

ability of humans to handle.

This will be a rather informal topics course aiming to demonstrate the type of questions that this research is dealing with, as well as some of the tools it can offer.


This course is aimed primarily for PhD students and requires some background in probability and foundations

 of computer science.

Syllabus Outline: I wish to cover the following topics (in practice, we shall probably have time for only

a subset, which will be influenced by the participants' interests):

      1)      The statistical learning problem.

2)      Basic pitfalls: Overfitting and the ‘No Free Lunch’ principle.

3)      Traditional performance guarantees (based on Chernoff and Hoefding, inequalities).

4)      Occam’s Razor – Compression based solutions.

5)      Covering numbers and their application to generalization bounds.

6)      Combinatorial tools: The Vapnik-Chervonenkis dimension and Sauer’s Lemma.

7)      e - Nets and e - Approximations.

8)      Applications of VC dimension and e - Nets to Data Structures and other fields.

9)      Uniform bounds based on the VC-dimension.

10)  Uniform bounds for estimating real valued functions.

11)  Data Dependent (including margin-based) generalization bounds.

      12)  Algorithmic considerations – some basic learning algorithms and computational complexity

            lower bounds.

Lecture 2 -Basic generalization theorems  (PDF)

Lecture 3 - Description-Size based Bounds (PDF)

Lecture 4 - VC dimension and Sauer's Lemma (PDF)

Lecture 5 - Sample Compression  Schemes (PDF)

Lecture 6 -Sample size lower bounds (PDF)

Lecture 7 -Sample size upper bounds for obtaining e -nets (PDF) (draft)

Lecture 8 -VC-dim based upper bounds for classification (PDF) (draft)

Lecture 9 - The Perceptron Algorithm and margins-based sample size bounds (PDF) (draft)