CSCI5640 Natural Language Processing


Course code CSCI5640
Course title Natural Language Processing
Course description Natural language processing (NLP) is a crucial part of artificial intelligence (AI) that aims to endow computers with the ability to process human language. This course gives an overview of modern deep learning techniques for NLP. It starts with basic linguistic concepts and moves from shallow bag-of-words representations to richer structural embeddings, which form the foundation for the successful use of deep learning in NLP. It then guides you through three fundamental tasks of NLP: language modeling (LM), natural language understanding (NLU), and natural language generation (NLG), followed by recent advances such as BERT and adversarial learning. Along the way, we introduce cutting-edge computational models together with insights from a linguistic perspective.
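As a rough illustration of why the description calls bag-of-words representations "shallow", the sketch below counts word occurrences and discards word order. It is not course material: the `tokenize` and `bag_of_words` helpers and the example sentences are hypothetical, and the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes (a crude, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Map each token to its count; word order is thrown away."""
    return Counter(tokenize(text))


a = bag_of_words("The dog bit the man")
b = bag_of_words("The man bit the dog")

# Both sentences contain the same words with the same counts,
# so their bag-of-words representations are identical.
print(a == b)
```

The two sentences mean different things but receive the same representation, which is the kind of limitation that motivates the richer, structure-aware embeddings mentioned in the description.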
Unit(s) 3
Course level Postgraduate
Semester 1 or 2
Grading basis Graded
Grade Descriptors
A/A-: EXCELLENT – exceptionally good performance and far exceeding expectation in all or most of the course learning outcomes; demonstration of superior understanding of the subject matter, the ability to analyze problems and apply extensive knowledge, and skillful use of concepts and materials to derive proper solutions.
B+/B/B-: GOOD – good performance in all course learning outcomes and exceeding expectation in some of them; demonstration of good understanding of the subject matter and the ability to use proper concepts and materials to solve most of the problems encountered.
C+/C/C-: FAIR – adequate performance and meeting expectation in all course learning outcomes; demonstration of adequate understanding of the subject matter and the ability to solve simple problems.
D+/D: MARGINAL – performance barely meets the expectation in the essential course learning outcomes; demonstration of partial understanding of the subject matter and the ability to solve simple problems.
F: FAILURE – performance does not meet the expectation in the essential course learning outcomes; demonstration of serious deficiencies and the need to retake the course.
Learning outcomes
At the end of the course of studies, students will have acquired the ability to
1. Understand basic concepts in NLP from both computational and linguistic perspectives
2. Understand fundamental tasks in NLP and their representative applications
3. Apply hands-on techniques to preprocess and analyze texts
4. Implement deep learning models for simple real-world applications, such as sentiment analysis for tweets and dialog agents
Assessment (for reference only)
Short answer test or exam: 40%
Project: 20%
Presentation: 20%
Essay test or exam: 20%
Recommended Reading List
1. Jacob Eisenstein, “Introduction to Natural Language Processing”, The MIT Press, 2019
2. Dan Jurafsky and James H. Martin, “Speech and Language Processing, 2nd Edition”, Prentice Hall, 2009. Third edition draft is available at


CSCIN programme learning outcomes Course mapping
Upon completion of their studies, students will be able to:  
1. identify, formulate, and solve computer science problems (K/S); Y
2. design, implement, test, and evaluate a computer system, component, or algorithm to meet desired needs (K/S);
3. receive the broad education necessary to understand the impact of computer science solutions in a global and societal context (K/V);
4. communicate effectively (S/V);
5. succeed in research or industry related to computer science (K/S/V);
6. have solid knowledge in computer science and engineering, including programming and languages, algorithms, theory, databases, etc. (K/S); Y
7. integrate well into and contribute to the local society and the global community related to computer science (K/S/V);
8. practise a high standard of professional ethics (V);
9. draw on and integrate knowledge from many related areas (K/S/V).
Remarks: K = Knowledge outcomes; S = Skills outcomes; V = Values and attitude outcomes; T = Teach; P = Practice; M = Measured