CSCI3320: Fundamentals of Machine Learning

General Expectations: Student/Faculty's Expectations on Teaching and Learning

Message: 2020-2021 was my last year of teaching CSCI3320, Fundamentals of Machine Learning, a course which I have taught for 7 or 8 years. It has been fun and, as someone said, it is usually the teacher who learns the most. In any case, students who want to take CSCI3320 should refer to the syllabus provided by the current course instructor.


Instructor: Prof. John C.S. Lui, office hours: Thursday, 8:30-10:30am.

Machine learning (ML) is a method of data analysis that automates analytical model building. Some people say that ML is a branch of artificial intelligence. Personally, I think that ML is really a branch of statistics. In any case, this course provides an introduction to machine learning. It is designed to give undergraduate students a taste of various machine learning techniques. Students need a good background in probability and statistics, a bit of optimization, and programming (e.g., Python) to appreciate the various methods.

Furthermore, students need to spend time reading the textbook and various resources on the Internet, doing the homework, and attending the lectures and tutorials in order to understand and keep pace with this course. If you skip some classes, please remember that you are solely responsible for your own actions regarding any missed lectures or announcements.

Although skipping classes is now a norm at CUHK, I would like to emphasize that if you skip lectures or tutorials in this course, you will easily get lost and will not be able to keep pace with the lectures. So, a word of advice: do not skip any classes or tutorials.

Machine learning is essential knowledge in computer science and engineering, and a highly sought-after skill in industry. If you are well trained in this subject, you can surely find a good job. Nevertheless, the subject is not for the faint-hearted. I will discuss the mathematics, theories, algorithms and programming techniques behind different machine learning methods, and students need to do various homework assignments and exercises to understand the subject.


References:

Course Grades:

Policies:

Announcements:

Final Examination: The final exam generally covers the material we went through in the lectures and tutorials. These include:


Lecture Notes: Lecture and tutorial notes and videos can be downloaded from Blackboard at CUHK.

Introduction to Machine Learning (online lecture)

Review of Statistics (pre-recorded lecture)

Some exercises on "Review of Statistics" (online lecture)

Overview of Supervised Learning (pre-recorded lecture)

Examining Your Data or Cleaning your Data: PANDAS Tutorial (online lecture with Python code in Jupyter notebook)

Overview of Bayesian Decision Theory (pre-recorded lecture)

Regression, Overfitting, Underfitting and Prediction in Python (online lecture with Python code in Jupyter notebook)

Evaluation metrics (pre-recorded lecture)

Data cleansing and data processing in scikit-learn (pre-recorded lecture) (with scikit-learn code in Jupyter notebook)

Classification in scikit-learn (pre-recorded lecture) (with scikit-learn code in Jupyter notebook)

Parametric Methods (pre-recorded lecture)

Introduction to Classification in Python and Scikit-learn (online lecture with Python and scikit-learn code in Jupyter notebook)

How to do regression in Python and Scikit-learn (online lecture with Python and scikit-learn code in Jupyter notebook)

Regression in scikit-learn (pre-recorded lecture with scikit-learn code in Jupyter notebook)

Real Life Classification: rating answers on Stack Overflow (online lecture with Python and scikit-learn code in Jupyter notebook)

Dimensionality Reduction (pre-recorded lecture)

Dimensionality Reduction in action (online lecture with Python and scikit-learn code in Jupyter notebook)

Clustering (pre-recorded lecture)

Text Pre-processing, NLTK and Finding top k documents via Clustering Technique (online lecture with Python and scikit-learn code in Jupyter notebook)

Multivariate Parametric Methods (pre-recorded lecture)

Linear Discrimination (pre-recorded video)

Recommender Systems (online lecture with Python and scikit-learn code in Jupyter notebook)

Nonparametric Methods (pre-recorded video)

Decision Trees (pre-recorded video)

Sentiment Analysis on Twitter-like data (online lecture with Python and scikit-learn code in Jupyter notebook)

Kernel Machines (pre-recorded video)

Multilayer Perceptrons (Artificial Neural Networks) (pre-recorded video)

Topic Modeling: Comparing or searching documents by topics instead of words (online lecture with Python and scikit-learn code in Jupyter notebook)

Music Genre Classification (with Python and scikit-learn code in Jupyter notebook; will be uploaded to Blackboard)

Graphical Models (To be uploaded if time allows)

Hidden Markov Models (To be uploaded if time allows)
Bayesian Estimation (To be uploaded if time allows)
Reinforcement Learning (e.g., Game Theory, Markov Decision Processes, etc.) (To be uploaded if time allows)

Additional References

Exploring Python by Timothy A. Budd
Think Python: How to Think Like a Computer Scientist, by Allen B. Downey
Python Tutorial
Python Programming on YouTube
Reference note on matrix differentiation
Matrix notations and operations
Vector notations and operations
The Matrix Cookbook by K.B. Petersen and M.S. Pedersen
Brief Introduction to Kalman Filters

Tutorial Notes (Available on Blackboard)

Tutorial 0 (Introduction to Python)

Tutorial 1 (Quick Introduction to scikit-learn with Jupyter notebook)

Tutorial 2 (Review on Linear Algebra and Matrix Calculus, with Jupyter notebook)

Tutorial 3 (Review on Gradient Descent for Linear Regression, with Jupyter notebook)

Tutorial 4 (Review on Linear Regression)

Tutorial 5 (Regularization and Cross Validation with Python code)

Tutorial 6 (Parametric Classification and Implementation with sample code)

Tutorial 7 (Principal Component Analysis)

Project Tutorial (Horse Racing Prediction)

Tutorial 8 (Kernel Machines) (To be uploaded)

Tutorial 9 (Ensemble Methods) (To be uploaded)


Homework (Available on Blackboard)

Will be posted on Blackboard.

Programming homework
Will be posted on Blackboard.

Programming Project:

Will be posted on Blackboard.