News
 Mar 22: Assignment 3 is now posted
 Mar 9: A list of suggested papers is posted
 Feb 21: Assignment 2 is now posted
 Jan 31: Assignment 1 is now posted
Information
 Lectures
 Tue 10:30–12:15pm (MMW 703), Wed 2:30–3:15pm (MMW 705)
 Zoom

960 6586 2487 (for both lectures and tutorials). No passcode, but CUHK-authenticated Zoom accounts are required to join
 Instructor
 Siu On CHAN. Office hours: Tue 2:30–4:30pm
 Prerequisites
 Probability, Discrete Math, Engineering Math (Linear Algebra), Computational Complexity (NP-Hardness)
 Textbook
 An Introduction to Computational Learning Theory, Michael J. Kearns and Umesh V. Vazirani (link) (accessible online at the university library webpage, one user at a time)
 References
 Understanding Machine Learning: From Theory to Algorithms, Shai Shalev-Shwartz and Shai Ben-David (free online copy at the authors’ homepage)
 Forum
 Please sign up on Piazza
 Grading
 Homework (30%), Midterm exam (30%), Project report (40%)
 Similar courses
 Introduction to Computational Learning Theory (Spring 2021), Rocco Servedio
 Machine Learning Theory (Fall 2020), Ilias Diakonikolas
 Theoretical Foundations of Machine Learning (Spring 2018), Michael Kearns
 Computational Learning Theory (Hilary & Trinity 2021), Varun Kanade
Topics
This course focuses on the theoretical aspects of supervised learning, with emphasis on provable guarantees for classification algorithms. It is intended for mathematically and theoretically minded students, and complements other, application-oriented machine learning courses.
 Online mistake bound model
 Probably Approximately Correct (PAC) learning model
 VC dimension
 Occam’s Razor
 Weak learning vs strong learning
 Boosting
 AdaBoost
 Computational hardness of learning
 Statistical Query learning
 Learning with noise
 Differential privacy
 Rademacher complexity
 Inherent hardness of learning
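As a small taste of the first topic, the online mistake-bound model, here is an illustrative sketch (not course material) of the Halving algorithm: the learner keeps every hypothesis consistent with the labels seen so far and predicts by majority vote, so each mistake at least halves the remaining set, giving at most log2 |H| mistakes. The threshold hypothesis class and the example stream below are made up for illustration.

```python
# Illustrative sketch of the Halving algorithm in the online
# mistake-bound model (toy example, not from the course notes).

def halving_predict(hypotheses, x):
    """Majority vote of the hypotheses still consistent with past labels."""
    votes = sum(h(x) for h in hypotheses)
    return 1 if 2 * votes >= len(hypotheses) else 0

def halving_update(hypotheses, x, y):
    """Discard every hypothesis that disagrees with the revealed label y."""
    return [h for h in hypotheses if h(x) == y]

# Toy hypothesis class: threshold functions h_t(x) = 1 iff x >= t, t = 0..7.
# The true labels come from the threshold t = 5.
H = [(lambda x, t=t: int(x >= t)) for t in range(8)]
target = lambda x: int(x >= 5)

mistakes = 0
for x in [0, 7, 4, 5, 3, 6]:
    guess = halving_predict(H, x)
    y = target(x)
    if guess != y:
        mistakes += 1
    H = halving_update(H, x, y)

print(mistakes)  # 2 mistakes here, within the log2(8) = 3 bound
```

Every mistake shrinks the consistent set by at least half, so the total number of mistakes never exceeds log2 |H| = 3, regardless of the order of examples.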
Lectures
Lecture recordings will be posted on Piazza → Resources after each class
Date  Topic  Readings  Notes  Tutorials  HW  

1  Jan 10  Introduction; Learning models & problems  §1.1, Blum §3  
2  Jan 11  Online mistake bound; decision lists  Blum §3.1  
3  Jan 17  Winnow algorithms  Littlestone §5  
4  Jan 18  Perceptron and Halving algorithms  Littlestone §3  
Jan 24  Lunar New Year Holiday  
Jan 25  Lunar New Year Holiday  
5  Jan 31  Generic bounds for online learning; VC dimension  §3.1–3.3  
6  Feb 1  Weighted majority  LW §1–2  HW 1  
7  Feb 7  PAC learning  §1.2  
8  Feb 8  Online to PAC conversion  
9  Feb 14  Occam’s Razor  §2.1  
10  Feb 15  Chernoff bounds; Hypothesis testing  
11  Feb 21  Proper vs improper learning; 3-term DNFs  §1.3–1.4  
12  Feb 22  Sample size bounds via VC dimension  §3.5–3.6  HW 2  
Feb 28  Midterm exam  
13  Mar 1  Sauer–Shelah lemma  §3.4  
Mar 7  Reading week  
Mar 8  Reading week  
14  Mar 14  Weak and strong learning; Boosting  §4.1–4.2  
15  Mar 15  AdaBoost  
16  Mar 21  Neural networks  §3.7  
17  Mar 22  Random classification noise; Statistical Query model  §5.1–5.3  HW 3  
18  Mar 28  Learning with noise via Statistical Query model  §5.4  
19  Mar 29  Lower bound for Statistical Query model 
20  Apr 4  Differential privacy  
Apr 5  Ching Ming Festival  
21  Apr 11  Differential privacy via Statistical Query algorithms  
22  Apr 12  Rademacher complexity  
23  Apr 18  Generalization bound from Rademacher complexity  
24  Apr 19  Inherent hardness of learning 
Homework
There will be three homework assignments
Submit your answers as a single PDF file via Blackboard by 11:59pm on the due date
You are encouraged to either typeset your solution in LaTeX or Word, or scan or take photos of your handwritten solution (please bundle your scans/photos into a single PDF file)
Collaboration on homework and consulting references is encouraged, but you must write up your own solutions in your own words, and list your collaborators and your references on the solution sheet
 Assignment 1 due Wed Feb 15
 Assignment 2 due Wed Mar 8
 Assignment 3 due Wed Apr 5
Project
The course project involves writing a short report (5–10 pages) related to computational learning theory in an area of your interest. The report may be your exposition of a paper from the list below (or any similar paper), or a summary of your new follow-up results. Focus on results with rigorous, provable guarantees.
You are encouraged to include in your report any illustrative, concrete examples that helped you understand the papers you read.
If your report is a summary of a paper you read, it should highlight the following:
 What is the main result? A typical paper contains one main result and several corollaries or extensions. Focus on the main result in your report, unless you find a particular extension interesting.
 If the main result is very general, can you illustrate it with interesting special cases?
 What are the key new ideas? Can you illustrate them with concrete examples? How did the authors come up with these ideas?
 What have you learned? How might the key ideas or new results be useful in your own research?
 Are key inequalities in the papers tight (perhaps up to constant factors)? If so, can you come up with tight examples? If not, what do you think are the correct bounds?
Write your summary as if you are explaining the main result and its key ideas to other students in this class, or to any graduate student with a solid background in math.
Explain in your own words and in language as simple as possible; the more you can do so, the better you understand the papers.
Paper suggestions:
 Optimal Mean Estimation without a Variance, Yeshwanth Cherapanamjeri, Nilesh Tripuraneni, Peter L. Bartlett, Michael I. Jordan
 On The Memory Complexity of Uniformity Testing, Tomer Berg, Or Ordentlich, Ofer Shayevitz
 Generalization Bounds via Convex Analysis, Gábor Lugosi, Gergely Neu
 Better Private Algorithms for Correlation Clustering, Daogao Liu
 The Sample Complexity of Robust Covariance Testing, Ilias Diakonikolas, Daniel M. Kane
 Agnostic Proper Learning of Halfspaces under Gaussian Marginals, Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
 Robust Online Convex Optimization in the Presence of Outliers, Tim van Erven, Sarah Sachs, Wouter M. Koolen, Wojciech Kotłowski
 Deterministic Finite-Memory Bias Estimation, Tomer Berg, Or Ordentlich, Ofer Shayevitz
 Bounded Memory Active Learning through Enriched Queries, Max Hopkins, Daniel Kane, Shachar Lovett, Michal Moshkovitz
 The Sparse Vector Technique, Revisited, Haim Kaplan, Yishay Mansour, Uri Stemmer
 On Avoiding the Union Bound When Answering Multiple Differentially Private Queries, Badih Ghazi, Ravi Kumar, Pasin Manurangsi
 Johnson–Lindenstrauss Transforms with Best Confidence, Maciej Skorski
 On the Sample Complexity of Privately Learning Unbounded High-Dimensional Gaussians, Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath
 Nonuniform Consistency of Online Learning with Random Sampling, Changlong Wu, Narayana Santhanam
 Learning a mixture of two subspaces over finite fields, Aidao Chen, Anindya De, Aravindan Vijayaraghavan
 Differentially Private Assouad, Fano, and Le Cam, Jayadev Acharya, Ziteng Sun, Huanyu Zhang
 Near-tight closure bounds for Littlestone and threshold dimensions, Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi
 Online Boosting with Bandit Feedback, Nataly Brukhim, Elad Hazan