CSCI3320: Fundamentals of Machine Learning

General Expectations: Student/Faculty's Expectations on Teaching and Learning

Classroom: After Jan. 13, 2017:

Instructor: Prof. John C.S. Lui

This course provides an introduction to machine learning. It is designed to give undergraduate students a taste of various machine learning techniques. Students need a good background in probability and statistics, a bit of optimization, and programming (e.g., Python) to appreciate the various methods.

Furthermore, students need to spend time reading the textbook and various resources on the Internet, do the homework, and attend the lectures and tutorials to understand and keep pace with this course. Students who skip classes are responsible for any missed lectures or announcements.

I know skipping classes is the norm at CUHK (I don't understand why), but I would like to emphasize that if you skip lectures or tutorials in this course, you will easily get lost and will not be able to keep pace with the lectures. So, a word of advice: do not skip any classes or tutorials.

Machine learning is essential knowledge in computer science/engineering, and a highly sought-after skill in industry. If you are well trained in this subject, you can surely find a good job. Nevertheless, the subject is not for the faint-hearted. I will discuss the mathematics and theory behind different machine learning methods, and students need to do various homework and exercises to understand the subject.

Teaching Assistants


Course Grades:



Final Examination:

Date: April 29th, 10:00 am till noon, LSB LT1. Please note that the final examination is NOT an open-book exam. You are allowed to bring in one sheet of A4 paper (i.e., two pages) as a "cheat sheet". Topics covered in the final exam are, in general, the materials we went through in the lectures and tutorials; these include:

Lecture Notes: (Password Protected)

Introduction on Machine Learning

Review on Statistics
    Statistical Sampling
    Estimation Theory
    Hypothesis Testing
    Curve Fitting
    Least Squares Regression
    Correlation Theory
    Q-Q Plot
    Derivation of Least Squares

Overview of Supervised Learning

Overview of Bayesian Decision Theory
    Bayes' Rule: Machine Learning perspective
    Loss/Risk Functions, discriminant functions
    Introduction to correlation and causality
    Introduction to causal and diagnostic inference
    Simple Bayesian Networks and Simple Bayes' Classifiers
    Association Rules

Linear Discrimination (Non-parametric method. Discussed in advance for programming)
    Generalizing the Linear Model
    Geometry of the Linear Discriminant
    Linear Discriminant via Pairwise Separation
    Logistic Discriminant: Two and Multiple Classes
    Discriminant by Regression
    Discriminant via Ranking

Parametric Methods
    Maximum likelihood estimator
    Estimator: bias vs. variance
    Unbiased estimator, consistent estimator, asymptotically unbiased estimator
    Bayes' estimator
    Parametric Classification
    Parametric Regression
    Bias/Variance Dilemma
    Illustration of Model Selection

Multivariate Parametric Methods
    Multivariate Parameters and Estimation
    Multivariate Normal Distributions
    Multivariate Parametric Classification in Multivariate Normal Distributions
    Multivariate Parametric Classification in Multivariate Bernoulli/Multinomial Distributions
    Multivariate Regression

Dimensionality Reduction


Nonparametric Methods
    Nonparametric density estimation: Histogram Estimator
    Nonparametric density estimation: Kernel Estimator
    Nonparametric density estimation: k-Nearest Neighbor Estimator
    Nonparametric density estimation: Generalization to Multivariate Data
    Condensed Nearest Neighbor
    Distance-Based Classification
    Nonparametric Regression: Smoothing Models

Decision Trees
    Univariate Trees
    Pruning of Decision Trees
    Rule Extraction from Decision Trees
    Learning Rules from Decision Trees
    Multivariate Decision Trees

Kernel Machines
    Quick Review of Logistic Classification/Regression
    From Logistic Classification to SVM Classification
    Concept of Large Margin
    Landmarks to Kernels
    Theory of Margin and Support Vectors
    Non-separable Case: Soft Margin Hyperplane
    Hinge Loss
    Kernel Tricks and Kernel Functions
    Multiple Kernel Learning and Multiclass Kernel Machines
    SVM for Regression
    SVM for Ranking
    Large Margin Nearest Neighbor
    Kernel Dimensionality Reduction
    Optional Reading 1: Constrained Optimization
    Optional Reading 2: Inequality Constraints and Kuhn-Tucker method

Graphical Models
    Conditional Independence
    Generative Models
    Belief Propagation
    Undirected Graphs and Markov Random Fields
    Learning Structures from Graphical Model
    Influence Diagram

Multilayer Perceptrons (Artificial Neural Networks)
    Training a Perceptron
    Learning Boolean Functions
    Multilayer Perceptrons
    Backpropagation Algorithm
    Training Procedures
    Tuning the Network Size
    Bayesian View of Learning
    Dimensionality Reduction
    Deep Learning

Hidden Markov Models
    Discrete Markov Processes
    Hidden Markov Models (HMM)
    Basic Problems of HMM
    Evaluation Problem
    Learning the State Sequence
    Learning the Model Parameters
    The HMM as a Graphical Model

Bayesian Estimation
    Bayesian Estimation of Parameters of a Discrete Distribution
    Bayesian Estimation of Parameters of a Gaussian Distribution
    Bayesian Estimation of Parameters of a Function
    Choosing a Prior
    Bayesian Model Comparison
    Bayesian Estimation of a Mixture Model
    Gaussian and Dirichlet Processes, Chinese Restaurants
    Latent Dirichlet Allocation
    Beta Processes and Indian Buffets

Reinforcement Learning (e.g., Game Theory, Markov Decision Process, etc.)
    Single State Case: K-Armed Bandit
    Elements of Reinforcement Learning
    Model-Based Learning
    Temporal Difference Learning
    Partially Observed States
    Brief Introduction to Game Theory: Game Theory
... etc.


Exploring Python by Timothy A. Budd
Think Python: How to Think Like a Computer Scientist, by Allen B. Downey
Python Tutorial
Python Programming on YouTube
Reference note on matrix differentiation
Matrix notations and operations
Vector notations and operations
The Matrix Cookbook by K.B. Petersen and M.S. Pedersen

Tutorial Notes (Password Protected)

Tutorial 0: part A , Tutorial 0: part B (Quick Introduction to Python)

Tutorial 1 (Quick Introduction to scikit-learn)

Tutorial 2 (Review on Linear Algebra And Matrix Calculus)

Tutorial 3 (Review on Gradient Descent For Linear Regression)

Tutorial 4 (Review on Linear Regression)

Tutorial 5 (Regularization and Cross Validation)

Tutorial 6 (Parametric Classification and Implementation)

Tutorial 7 (Principal Component Analysis)

Tutorial 8 (Feature Extraction for Text under scikit-learn)

Tutorial 9 (Project: What Makes People Happy)

Tutorial 10 (Project: What Makes People Happy: Visualization)

... etc.

Homework (Password Protected) Submission: please email your homework to

Homework 1 (Deadline: Feb 12, 2017, 23:59)

Homework 2 (Deadline: Feb 19, 2017, 23:59)

Homework 3 (Deadline: March 19, 2017, 23:59)

Homework 4 (Deadline: March 28, 2017, 23:59)

Programming homework (Password Protected)
Programming Homework 1; zipped data file (Deadline: March 2nd, 2017, 23:59)

Programming Homework 2; zipped data file (Deadline: March 17th, 2017, 23:59)

Programming Homework 3; zipped data file (Deadline: April 4th, 2017, 23:59)

Programming Project: (Password Protected)

Programming Project: What Makes People Happy?; zipped data file (Deadline: May 2nd, 2017, 23:59)