CSCI3320: Fundamentals of Machine Learning
General Expectations:
Student/Faculty's Expectations on Teaching and Learning
Classroom: Starting Jan. 8th, 2018:
 Lecture 1: Tuesday 8:30-10:15 am, LSK LT3
 Lecture 2: Wednesday 8:30-9:15 am, LSK LT4
 Tutorial: Thursday 2:30-3:15 pm, MMW LT1
Instructor:
Prof. John C.S. Lui
This course provides an introduction to machine learning.
It is designed to give undergraduate students
a taste of various machine learning techniques.
Students need to have a good background in
probability, statistics, a bit of optimization, as well as
programming (e.g., Python) to appreciate various methods.
Furthermore, students need to spend time to read the textbook,
as well as to put in the effort to read various resources on the Internet,
do the homework, attend the lectures and tutorials
to understand and keep pace with this course.
If students skip classes, they are responsible for
any missed lectures or announcements.
I know skipping classes is the norm at CUHK (I don't understand why),
but I would like to emphasize that if you skip lectures/tutorials in this course,
you will easily get lost and will not be able to keep pace with the lectures.
So, a word of advice: do not skip any classes or tutorials.
Machine learning is essential knowledge in computer science/engineering,
and a highly sought-after skill in the industry.
If you are well-trained in this subject,
surely you can find a good job.
Nevertheless, the subject is
not for faint-hearted students.
I will discuss the mathematics and theory behind different machine learning
methods, and students need to do various homework and exercises to understand
the subject.
Teaching Assistants
 Miss Tingwei Liu
Office hour: Office, HSH Eng Bldg, Room 120, Tue, 4:00-5:00 pm.
 Mr. Li Ye
Office hour: Office, HSH Eng Bldg, Room 120, Thu, 5:00-6:00 pm.
 Mr. Liu Xutong
Office hour: Office, HSH Eng Bldg, Room 120, Tue, 2:30-3:30 pm.
 Mr. Li Zhuahua
Office hour: Office, HSH Eng Bldg, Room 120, Wed, 5:00-6:00 pm.
References:

Bayesian Reasoning and Machine Learning, by David Barber

Pattern Recognition and Machine Learning, by Christopher M. Bishop

Machine Learning: A Probabilistic Perspective, by Kevin P. Murphy

Learning from Data, by Yaser S. Abu-Mostafa

Introduction to Machine Learning, by Ethem Alpaydin, MIT Press

Machine Learning: An Algorithmic Perspective, by Stephen Marsland

Machine Learning with R, by Brett Lantz

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert Tibshirani, Jerome Friedman

An Introduction to Statistical Learning: with Applications in R,
by Gareth James, Trevor Hastie

Mastering Machine Learning With scikit-learn,
by Gavin Hackeling

Machine Learning for Hackers, by Drew Conway, John Myles White

Probabilistic Graphical Models: Principles and Techniques, by Daphne Koller, Nir Friedman

Machine Learning in Action, by Peter Harrington
 Abundant resources available on the web.
Course Grades:
 Written homework: 0%; (Why 0%? Well, ....)
 Python/scikit-learn Programming: 15%;
 Project: 25%;
 Examination: 60%
(note: you need to get at least 30% in the final exam to pass the course)
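As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component is scored as a percentage from 0 to 100; the function names and the example scores are hypothetical, and the overall passing mark is not stated here, so the sketch only checks the 30% final-exam minimum.

```python
# Weights taken from the grading scheme above; written homework carries 0%.
WEIGHTS = {"programming": 0.15, "project": 0.25, "exam": 0.60}

# Minimum final-exam percentage needed to pass the course.
EXAM_MINIMUM = 30.0


def course_total(scores):
    """Return the weighted course total (0-100) from component percentages."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def meets_exam_minimum(scores):
    """Return True if the final-exam score reaches the required minimum."""
    return scores["exam"] >= EXAM_MINIMUM


# Hypothetical student: strong coursework, weak final exam.
example = {"programming": 90.0, "project": 80.0, "exam": 25.0}
total = course_total(example)
print(f"total = {total:.1f}, exam minimum met: {meets_exam_minimum(example)}")
```

In this made-up case the weighted total is 48.5, yet the exam minimum is not met, which shows why the examination cannot be offset by the programming work and the project alone.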
Policies:
Announcement:
 No class on March 13th. The makeup class will be on March 17th, TYW LT, 14:00-17:15!
 FINAL EXAM: April 28th, 9:30 am to 11:30 am, LSK LT5!
Discussion via Piazza
To facilitate efficient query/reply and discussion, please
consider doing it via the following site:
Piazza for CSCI3320
Final Examination:
Date: April 28th, 9:30 am to 11:30 am, LSK LT5
Please note that the final examination is NOT an open-book exam. You are allowed to
bring in one sheet of A4 paper (i.e., 2 pages) as a "cheat sheet". Topics to be covered
in the final exam are, in general, the materials we went through
in the lectures and tutorials. These include:
 Statistics, sampling, curve fitting, correlation theory
 Basic concepts in matrix calculus, linear algebra, Lagrangian Optimization
 Supervised and unsupervised learning
 VC dimension
 Bayesian Decision Theory
 Parametric Methods: Univariate and Multivariate methods
 Dimensionality Reduction via PCA, Feature Embedding, LDA.
 Clustering via the K-Means Algorithm
 EM Algorithm
 Matrix Factorization
 Linear Discriminant: logistic classification and regression
 Decision trees
 Random forests
 ...etc
Lecture Notes: (Password Protected)

Introduction on Machine Learning

Review on Statistics
 Statistical Sampling
 Estimation Theory
 Hypothesis Testing
 Curve Fitting
 Least Squares Regression
 Regression
 Correlation Theory
 Q-Q Plot

Derivation of Least Squares

Overview of Supervised Learning

Overview of Bayesian Decision Theory

Bayes' Rule: Machine Learning perspective

Loss/Risk Functions, discriminant functions

Introduction to correlation and causality

Introduction to causal and diagnostic inference

Simple Bayesian Networks and Simple Bayes' Classifiers

Association Rules

Linear Discrimination (Nonparametric method; discussed in advance for programming)

Generalizing the Linear Model

Geometry of the Linear Discriminant

Linear Discriminant via Pairwise Separation

Logistic Discriminant: Two and Multiple Classes

Discriminant by Regression

Discriminant via Ranking

Parametric Methods

Maximum likelihood estimator

Estimator: bias vs. variance

Unbiased estimator, consistent estimator, asymptotically unbiased estimator

Bayes' estimator

Parametric Classification

Parametric Regression

Bias/Variance Dilemma

Illustration of Model Selection

Multivariate Parametric Methods

Multivariate Parameters and Estimation

Multivariate Normal Distributions

Multivariate Parametric Classification in Multivariate Normal Distributions

Multivariate Parametric Classification in Multivariate
Bernoulli/Multinomial Distributions

Multivariate Regression

Dimensionality Reduction

Clustering

Nonparametric Methods

Nonparametric density estimation: Histogram Estimator

Nonparametric density estimation: Kernel Estimator

Nonparametric density estimation: k-Nearest Neighbor Estimator

Nonparametric density estimation: Generalization to Multivariate Data

Condensed Nearest Neighbor

Distance-Based Classification

Nonparametric Regression: Smoothing Models

Decision Trees

Univariate Trees

Pruning of Decision Trees

Rule Extraction from Decision Trees

Learning Rules from Decision Trees

Multivariate Decision Trees

Kernel Machines (To be uploaded)

Quick Review of Logistic Classification/Regression

From Logistic Classification to SVM Classification

Concept of Large Margin

Landmarks to Kernels

Theory of Margin and Support Vectors

Nonseparable Case: Soft Margin Hyperplane

Hinge Loss

Kernel Tricks and Kernel Functions

Multiple Kernel Learning and Multiclass Kernel Machines

SVM for Regression

SVM for Ranking

Large Margin Nearest Neighbor

Kernel Dimensionality Reduction

Optional Reading 1:
Constrained Optimization

Optional Reading 2:
Inequality Constraints and the Kuhn-Tucker method

Graphical Models (To be uploaded)

Conditional Independence

Generative Models

d-Separation

Belief Propagation

Undirected Graphs and Markov Random Fields

Learning Structures from Graphical Model

Influence Diagram

Multilayer Perceptrons (Artificial Neural Networks) (To be uploaded)
 Perceptron
 Training a Perceptron
 Learning Boolean Functions
 Multilayer Perceptrons
 Backpropagation Algorithm
 Training Procedures
 Tuning the Network Size
 Bayesian View of Learning
 Dimensionality Reduction
 Deep Learning

Hidden Markov Models (To be uploaded)
 Discrete Markov Processes
 Hidden Markov Models (HMM)
 Basic Problems of HMM
 Evaluation Problem
 Learning the State Sequence
 Learning the Model Parameters
 The HMM as a Graphical Model

Bayesian Estimation (To be uploaded)
 Bayesian Estimation of Parameters of a Discrete Distribution
 Bayesian Estimation of Parameters of a Gaussian Distribution
 Bayesian Estimation of Parameters of a Function
 Choosing a Prior
 Bayesian Model Comparison
 Bayesian Estimation of a Mixed Model
 Gaussian and Dirichlet Processes, Chinese Restaurants
 Latent Dirichlet Allocation
 Beta Processes and Indian Buffets

Reinforcement Learning (e.g., Game Theory, Markov Decision Processes, etc.) (To be uploaded)
 Single State Case: K-Armed Bandit
 Elements of Reinforcement Learning
 Model-Based Learning
 Temporal Difference Learning
 Partially Observed States

Brief Introduction to Game Theory:
Game Theory
References

Exploring Python by Timothy A. Budd

Think Python: How to Think Like a Computer Scientist, by Allen B. Downey

Python Tutorial

Python Programming on YouTube

Reference note on matrix differentiation

Matrix notations and operations

Vector notations and operations

The Matrix Cookbook by K.B. Petersen and M.S. Pedersen
Tutorial Notes (Password Protected)

Tutorial 1: Introduction to Python,
(Quick Introduction to Python)

Tutorial 1 (Quick Introduction to scikit-learn);
tutorial1.ipynb (Sample code)

Tutorial 2 (Review on Linear Algebra And Matrix Calculus)

Tutorial 3 (Review on Gradient Descent For Linear Regression)

Tutorial 4 (Review on Linear Regression)

Tutorial 5 (Regularization and Cross Validation)

Tutorial 6 (Parametric Classification and Implementation)

Tutorial 7 (Principal Component Analysis)

Project Tutorial (Horse Racing Prediction)

Tutorial 8 (Kernel Machines) (To be uploaded)

Tutorial 9 (Ensemble Methods) (To be uploaded)
Homework (Password Protected)

Homework 1

Homework 2

Homework 3

Homework 4
Programming homework (Password Protected)
Submission: please email your homework to cuhkcsci3320@gmail.com

Programming Homework 0;
zipped data file
(Deadline: Feb. 15th, 2018, 17:00)

Programming Homework 1;
zipped data file
(Deadline: Mar. 16th, 2018, 17:00)

Programming Homework 2;
zipped data file
(Deadline: April 13th, 2018, 17:00)
Programming Project: (Password Protected)
Programming Project;
zipped data file
(Deadline, newly updated: May 1st, 2018, 17:00)