News
- Mar 22: Assignment 3 is now posted
- Mar 9: A list of suggested papers is posted
- Feb 21: Assignment 2 is now posted
- Jan 31: Assignment 1 is now posted
Information
- Lectures
- Tue 10:30am-12:15pm (MMW 703); Wed 2:30-3:15pm (MMW 705)
- Zoom
- 960 6586 2487 (for both lectures and tutorials). No passcode, but CUHK-authenticated Zoom accounts are required to join
- Instructor
- Siu On CHAN (office hours: Tue 2:30-4:30pm)
- Prerequisites
- Probability, Discrete Math, Engineering Math (Linear Algebra), Computational Complexity (NP-Hardness)
- Textbook
- An Introduction to Computational Learning Theory, Michael J. Kearns and Umesh V. Vazirani (link) (accessible online at the university library webpage, one user at a time)
- References
- Understanding Machine Learning: From Theory to Algorithms, Shai Shalev-Shwartz and Shai Ben-David (free online copy at the authors’ homepage)
- Forum
- Please sign up on Piazza
- Grading
- Homework (30%), Midterm exam (30%), Project report (40%)
- Similar courses
- Introduction to Computational Learning Theory (Spring 2021), Rocco Servedio
- Machine Learning Theory (Fall 2020), Ilias Diakonikolas
- Theoretical Foundations of Machine Learning (Spring 2018), Michael Kearns
- Computational Learning Theory (Hilary & Trinity 2021), Varun Kanade
Topics
This course focuses on the theoretical aspects of supervised learning, with an emphasis on provable guarantees for classification algorithms. It is intended for mathematically and theoretically minded students and complements the more application-oriented machine learning courses. A small illustrative sketch of one such guarantee follows the topic list below.
- Online mistake bound model
- Probably Approximately Correct (PAC) learning model
- VC dimension
- Occam’s Razor
- Weak learning vs strong learning
- Boosting
- AdaBoost
- Computational hardness of learning
- Statistical Query learning
- Learning with noise
- Differential privacy
- Rademacher complexity
- Inherent hardness of learning
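For a taste of the kind of provable guarantee studied in this course, here is a minimal sketch of the classic Halving algorithm from the online mistake-bound model (see the early lectures): predicting by majority vote over the hypotheses still consistent with the data makes at most log2|H| mistakes whenever some hypothesis in H labels the whole stream correctly. The hypothesis class `H` and the example stream below are made-up toy placeholders, not course material.

```python
# A minimal sketch of the Halving algorithm (online mistake-bound model).
# The hypothesis class H and the example stream are illustrative toy placeholders.
from math import log2

def halving_run(hypotheses, examples):
    """Predict by majority vote over the version space; after each mistake,
    discard every hypothesis that predicted wrongly. If some hypothesis in H
    is consistent with all examples, the number of mistakes is at most log2(|H|)."""
    version_space = list(hypotheses)
    mistakes = 0
    for x, label in examples:
        votes = sum(1 if h(x) else -1 for h in version_space)
        prediction = votes > 0  # majority vote (ties predict False)
        if prediction != label:
            mistakes += 1
            # A mistake means the majority was wrong, so at least half are eliminated.
            version_space = [h for h in version_space if bool(h(x)) == label]
    return mistakes

# Toy run: four single-literal hypotheses over two boolean attributes; target is x[1].
H = [lambda x: x[0], lambda x: not x[0], lambda x: x[1], lambda x: not x[1]]
stream = [((0, 1), True), ((1, 1), True), ((1, 0), False)]
assert halving_run(H, stream) <= log2(len(H))  # mistake bound: log2(4) = 2
```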
Lectures
Lecture recordings will be posted on Piazza → Resources after each class
# | Date | Topic | Readings | Notes | Tutorials | HW
---|---|---|---|---|---|---
1 | Jan 10 | Introduction; Learning models & problems | §1.1, Blum §3 | |||
2 | Jan 11 | Online mistake bound; decision lists | Blum §3.1 | |||
3 | Jan 17 | Winnow algorithms | Littlestone §5 | |||
4 | Jan 18 | Perceptron and Halving algorithms | Littlestone §3 | |||
Jan 24 | Lunar New Year Holiday | |||||
Jan 25 | Lunar New Year Holiday | |||||
5 | Jan 31 | Generic bounds for online learning; VC dimension | §3.1-3.3 | |||
6 | Feb 1 | Weighted majority | LW §1-2 | HW 1 | ||
7 | Feb 7 | PAC learning | §1.2 | |||
8 | Feb 8 | Online to PAC conversion | ||||
9 | Feb 14 | Occam’s Razor | §2.1 | |||
10 | Feb 15 | Chernoff bounds; Hypothesis testing | ||||
11 | Feb 21 | Proper vs improper learning; 3-term DNFs | §1.3-1.4 | |||
12 | Feb 22 | Sample size bounds via VC dimension | §3.5-3.6 | HW 2 | ||
Feb 28 | Midterm exam | |||||
13 | Mar 1 | Sauer–Shelah lemma | §3.4 | |||
Mar 7 | Reading week | |||||
Mar 8 | Reading week | |||||
14 | Mar 14 | Weak and strong learning; Boosting | §4.1-4.2 | |||
15 | Mar 15 | AdaBoost | ||||
16 | Mar 21 | Neural networks | §3.7 | |||
17 | Mar 22 | Random classification noise; Statistical Query model | §5.1-5.3 ||| HW 3 |
18 | Mar 28 | Learning with noise via Statistical Query model | §5.4 | |||
19 | Mar 29 | Lower bound for Statistical Query model | ||||
20 | Apr 4 | Differential privacy | ||||
Apr 5 | Ching Ming Festival | |||||
21 | Apr 11 | Differential privacy via Statistical Query algorithms | ||||
22 | Apr 12 | Rademacher complexity | ||||
23 | Apr 18 | Generalization bound from Rademacher complexity | ||||
24 | Apr 19 | Inherent hardness of learning |
Homework
There will be three homework assignments
Submit your answers as a single PDF file via Blackboard by 11:59pm on the due date
You are encouraged to either typeset your solution in LaTeX or Word, or scan or take photos of your handwritten solution (please bundle your scans/photos into a single PDF file)
Collaboration on homework and consulting references are encouraged, but you must write up your solutions in your own words, and list your collaborators and references on your solution sheet
- Assignment 1 due Wed Feb 15
- Assignment 2 due Wed Mar 8
- Assignment 3 due Wed Apr 5
Project
The course project involves writing a short report (5-10 pages) related to computational learning theory in an area of your interest. The report may be your exposition of a paper from the list below (or any similar paper), or a summary of your new follow-up results. Focus on results with rigorous, provable guarantees.
You are encouraged to include in your report any illustrative, concrete examples that help you understand the papers you read.
If your report is a summary of a paper you read, it should highlight the following:
- What is the main result? A typical paper contains one main result and several corollaries or extensions. Focus on the main result in your report, unless you find a particular extension interesting.
- If the main result is very general, can you illustrate it with interesting special cases?
- What are the key new ideas? Can you illustrate them with concrete examples? How did the authors come up with these ideas?
- What have you learned? How might the key ideas or new results be useful in your own research?
- Are key inequalities in the papers tight (perhaps up to constant factors)? If so, can you come up with tight examples? If not, what do you think are the correct bounds?
Write your summary as if you are explaining the main result and its key ideas to other students in this class, or to any graduate student with a solid background in math.
Explain in your own words and in simple language as much as possible. The more you can do so, the better you understand the papers.
Paper suggestions:
- Optimal Mean Estimation without a Variance, Yeshwanth Cherapanamjeri, Nilesh Tripuraneni, Peter L. Bartlett, Michael I. Jordan
- On The Memory Complexity of Uniformity Testing, Tomer Berg, Or Ordentlich, Ofer Shayevitz
- Generalization Bounds via Convex Analysis, Gábor Lugosi, Gergely Neu
- Better Private Algorithms for Correlation Clustering, Daogao Liu
- The Sample Complexity of Robust Covariance Testing, Ilias Diakonikolas, Daniel M. Kane
- Agnostic Proper Learning of Halfspaces under Gaussian Marginals, Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
- Robust Online Convex Optimization in the Presence of Outliers, Tim van Erven, Sarah Sachs, Wouter M. Koolen, Wojciech Kotłowski
- Deterministic Finite-Memory Bias Estimation, Tomer Berg, Or Ordentlich, Ofer Shayevitz
- Bounded Memory Active Learning through Enriched Queries, Max Hopkins, Daniel Kane, Shachar Lovett, Michal Moshkovitz
- The Sparse Vector Technique, Revisited, Haim Kaplan, Yishay Mansour, Uri Stemmer
- On Avoiding the Union Bound When Answering Multiple Differentially Private Queries, Badih Ghazi, Ravi Kumar, Pasin Manurangsi
- Johnson-Lindenstrauss Transforms with Best Confidence, Maciej Skorski
- On the Sample Complexity of Privately Learning Unbounded High-Dimensional Gaussians, Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath
- Non-uniform Consistency of Online Learning with Random Sampling, Changlong Wu, Narayana Santhanam
- Learning a mixture of two subspaces over finite fields, Aidao Chen, Anindya De, Aravindan Vijayaraghavan
- Differentially Private Assouad, Fano, and Le Cam, Jayadev Acharya, Ziteng Sun, Huanyu Zhang
- Near-tight closure bounds for Littlestone and threshold dimensions, Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi
- Online Boosting with Bandit Feedback, Nataly Brukhim, Elad Hazan