CSCI3320: Fundamentals of Machine Learning
General Expectations:
Student/Faculty's Expectations on Teaching and Learning
Classroom: After Jan. 13, 2017:
 Lecture 1: Tuesday 8:30-10:15 am, ERB 407
 Lecture 2: Wednesday 10:30-11:15 am, LHC 103
 Tutorial: Thursday 2:30-3:15 pm, MMW LT2
Instructor:
Prof. John C.S. Lui
This course provides an introduction to machine learning.
It is designed to give undergraduate students
a taste of various machine learning techniques.
Students need to have a good background in
probability, statistics, a bit of optimization as well as
programming (e.g., Python) to appreciate various methods.
Furthermore, students need to spend time to read the textbook,
as well as to put in the effort to read various resources on the Internet,
do the homework, attend the lectures and tutorials
to understand and keep pace with this course.
If students skip classes, they are responsible
for any missed lectures or announcements.
I know skipping classes is the norm at CUHK (which I don't understand),
but I would like to emphasize that if you skip lectures or tutorials in this course,
you will easily get lost and will not be able to keep pace with the lectures.
So, a word of advice: do not skip any classes or tutorials.
Machine learning is essential knowledge in computer science/engineering,
and a highly sought-after skill in industry.
If you are well trained in this subject,
you can surely find a good job.
Nevertheless, the subject is
not for the faint-hearted.
I will discuss the mathematics and theory behind different machine learning
methods, and students need to do various homework and exercises to understand
the subject.
Teaching Assistants
 Miss Xiaowei Chen
Office hour: Office, HSH Eng Bldg, Room 120, Wednesday, 4:00-6:00 pm.
 Miss Tingwei Liu
Office hour: Office, HSH Eng Bldg, Room 120, Monday, 4:00-6:00 pm.
 Mr. Li Ye
Office hour: Office, HSH Eng Bldg, Room 120, Tuesday, 4:00-6:00 pm.
References:

Bayesian Reasoning and Machine Learning, by David Barber

Pattern Recognition and Machine Learning, by Christopher M. Bishop

Machine Learning: A Probabilistic Perspective, by Kevin P. Murphy

Learning from Data, by Yaser S. Abu-Mostafa

Introduction to Machine Learning, by Ethem Alpaydin, MIT Press

Machine Learning: An Algorithmic Perspective, by Stephen Marsland

Machine Learning with R, by Brett Lantz

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert Tibshirani, Jerome Friedman

An Introduction to Statistical Learning: with Applications in R,
by Gareth James, Trevor Hastie

Mastering Machine Learning With scikit-learn,
by Gavin Hackeling

Machine Learning for Hackers, by Drew Conway, John Myles White

Probabilistic Graphical Models: Principles and Techniques, by Daphne Koller, Nir Friedman

Machine Learning in Action, by Peter Harrington
 Abundant resources available on the web.
Course Grades:
 Homework: 5%;
 Python/scikit-learn Programming: 20%;
 Project: 25%;
 Examination: 50%
(note: you need to get at least 30% in the final exam to pass the course)
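As a worked example of how the weights above combine, here is a minimal Python sketch. The function name course_total and the sample scores are hypothetical, and the sketch assumes that the 30% requirement means 30 out of 100 on the final exam itself. The overall passing mark is not stated here, so the sketch does not check one.

```python
# Weights taken from the Course Grades list above; they sum to 100%.
WEIGHTS = {"homework": 0.05, "programming": 0.20, "project": 0.25, "exam": 0.50}

# Assumption: "at least 30% in the final exam" means 30 out of 100 on the exam.
EXAM_FLOOR = 30.0


def course_total(scores):
    """Return (weighted total out of 100, whether the exam floor is met).

    `scores` maps each component name to a percentage in [0, 100].
    """
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return total, scores["exam"] >= EXAM_FLOOR


# Hypothetical scores, for illustration only.
total, exam_ok = course_total(
    {"homework": 80, "programming": 90, "project": 70, "exam": 60}
)
print(f"weighted total: {total:.1f}, exam floor met: {exam_ok}")
```

Note that a high weighted total does not help if the exam floor is missed: under the assumption above, an exam score below 30 fails the course regardless of the other components.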
Policies:
Announcements:
 No class on April 11-12, 2017.
 Make-up class: April 8th, 9:00 am till noon, LSB LT5.
 Class on April 19th, 10:30-11:15 am will be changed to
LSK LT7 due to room renovation.
 FINAL EXAM: April 29th, 10:00 am till noon, LSB LT1.
Final Examination:
Date: April 29th, 10:00 am till noon, LSB LT1
Please note that the final examination is NOT an open-book exam. You are allowed to
bring in one piece of A4 paper (or 2 pages) as a "cheat sheet". Topics to be covered
in the final exam are, in general, the materials we went through
in the lectures and tutorials. These include:
 Statistics, sampling, curve fitting, correlation theory
 Basic concepts in matrix calculus, linear algebra, Lagrangian Optimization
 Supervised and unsupervised learning
 VC dimension
 Bayesian Decision Theory
 Parametric Methods: Univariate and Multivariate methods
 Dimensionality Reduction via PCA, Feature Embedding, LDA.
 Clustering via K-Means Algorithm
 EM Algorithm
 Matrix Factorization
 Nonparametric method: Kernel estimator and KNN
 Linear Discriminant: logistic classification and regression
 Decision trees
 Random forests
 Artificial neural networks
 Elementary game theory
 Elementary multi-armed bandit
 ...etc
Lecture Notes: (Password Protected)

Introduction on Machine Learning

Review on Statistics
 Statistical Sampling
 Estimation Theory
 Hypothesis Testing
 Curve Fitting
 Least Squares Regression
 Regression
 Correlation Theory
 Q-Q Plot

Derivation of Least Squares

Overview of Supervised Learning

Overview of Bayesian Decision Theory

Bayes' Rule: Machine Learning perspective

Loss/Risk Functions, discriminant functions

Introduction to correlation and causality

Introduction to causal and diagnostic inference

Simple Bayesian Networks and Simple Bayes' Classifiers

Association Rules

Linear Discrimination (Nonparametric method; discussed in advance for programming)

Generalizing the Linear Model

Geometry of the Linear Discriminant

Linear Discriminant via Pairwise Separation

Logistic Discriminant: Two and Multiple Classes

Discriminant by Regression

Discriminant via Ranking

Parametric Methods

Maximum likelihood estimator

Estimator: bias vs. variance

Unbiased estimator, consistent estimator, asymptotically unbiased estimator

Bayes' estimator

Parametric Classification

Parametric Regression

Bias/Variance Dilemma

Illustration of Model Selection

Multivariate Parametric Methods

Multivariate Parameters and Estimation

Multivariate Normal Distributions

Multivariate Parametric Classification in Multivariate Normal Distributions

Multivariate Parametric Classification in Multivariate
Bernoulli/Multinomial Distributions

Multivariate Regression

Dimensionality Reduction

Clustering

Nonparametric Methods

Nonparametric density estimation: Histogram Estimator

Nonparametric density estimation: Kernel Estimator

Nonparametric density estimation: k-Nearest Neighbor Estimator

Nonparametric density estimation: Generalization to Multivariate Data

Condensed Nearest Neighbor

Distance-Based Classification

Nonparametric Regression: Smoothing Models

Decision Trees

Univariate Trees

Pruning of Decision Trees

Rule Extraction from Decision Trees

Learning Rules from Decision Trees

Multivariate Decision Trees

Kernel Machines

Quick Review of Logistic Classification/Regression

From Logistic Classification to SVM Classification

Concept of Large Margin

Landmarks to Kernels

Theory of Margin and Support Vectors

Nonseparable Case: Soft Margin Hyperplane

Hinge Loss

Kernel Tricks and Kernel Functions

Multiple Kernel Learning and Multiclass Kernel Machines

SVM for Regression

SVM for Ranking

Large Margin Nearest Neighbor

Kernel Dimensionality Reduction

Optional Reading 1:
Constrained Optimization

Optional Reading 2:
Inequality Constraints and Kuhn-Tucker method

Graphical Models

Conditional Independence

Generative Models

d-Separation

Belief Propagation

Undirected Graphs and Markov Random Fields

Learning Structures from Graphical Model

Influence Diagram

Multilayer Perceptrons (Artificial Neural Networks)
 Perceptron
 Training a Perceptron
 Learning Boolean Functions
 Multilayer Perceptrons
 Backpropagation Algorithm
 Training Procedures
 Tuning the Network Size
 Bayesian View of Learning
 Dimensionality Reduction
 Deep Learning

Hidden Markov Models
 Discrete Markov Processes
 Hidden Markov Models (HMM)
 Basic Problems of HMM
 Evaluation Problem
 Learning the State Sequence
 Learning the Model Parameters
 The HMM as a Graphical Model

Bayesian Estimation
 Bayesian Estimation of Parameters of a Discrete Distribution
 Bayesian Estimation of Parameters of a Gaussian Distribution
 Bayesian Estimation of Parameters of a Function
 Choosing a Prior
 Bayesian Model Comparison
 Bayesian Estimation of a Mixed Model
 Gaussian and Dirichlet Processes, Chinese Restaurants
 Latent Dirichlet Allocation
 Beta Processes and Indian Buffets

Reinforcement Learning (e.g., Game Theory, Markov Decision Process, etc.)
 Single State Case: K-Armed Bandit
 Elements of Reinforcement Learning
 Model-Based Learning
 Temporal Difference Learning
 Partially Observed States

Brief Introduction to Game Theory:
Game Theory

... etc.
References

Exploring Python by Timothy A. Budd

Think Python: How to Think Like a Computer Scientist, by Allen B. Downey

Python Tutorial

Python Programming at Youtube

Reference note on matrix differentiation

Matrix notations and operations

Vector notations and operations

The Matrix Cookbook by K.B. Petersen and M.S. Pedersen
Tutorial Notes (Password Protected)

Tutorial 0: part A ,
Tutorial 0: part B
(Quick Introduction to Python)

Tutorial 1 (Quick Introduction to scikit-learn)

Tutorial 2 (Review on Linear Algebra And Matrix Calculus)

Tutorial 3 (Review on Gradient Descent For Linear Regression)

Tutorial 4 (Review on Linear Regression)

Tutorial 5 (Regularization and Cross Validation)

Tutorial 6 (Parametric Classification and Implementation)

Tutorial 7 (Principal Component Analysis)

Tutorial 8 (Feature Extraction for Text under scikit-learn)

Tutorial 9 (Project: What Makes People Happy)

Tutorial 10 (Project: What Makes People Happy: Visualization)

... etc.
Homework (Password Protected)
Submission: please email your homework to cuhkcsci3320@gmail.com

Homework 1
(Deadline: Feb 12, 2017, 23:59)

Homework 2
(Deadline: Feb 19, 2017, 23:59)

Homework 3
(Deadline: March 19, 2017, 23:59)

Homework 4
(Deadline: March 28, 2017, 23:59)
Programming homework (Password Protected)

Programming Homework 1;
zipped data file
(Deadline: March 2nd, 2017, 23:59)

Programming Homework 2;
zipped data file
(Deadline: March 17th, 2017, 23:59)

Programming Homework 3;
zipped data file
(Deadline: April 4th, 2017, 23:59)
Programming Project: (Password Protected)
Programming Project: What Makes People Happy?;
zipped data file
(Deadline: May 2nd, 2017, 23:59)