CS 4120: Natural Language Processing

Time and Location: Tuesdays 11:45 am - 1:25 pm, Thursdays 2:50 pm - 4:30 pm in Snell Library 037

Instructor: Lu Wang, Office Rm 2211, 177 Huntington Ave.

Staff and Office Hours:

Discussion Forum: Piazza, sign up at piazza.com/northeastern/spring2020/cs4120

Important Announcements

[4/8/2020] All the project presentations are available at this link.

[3/19/2020] The final exam will be held online on April 14th, 11:30 am - 1:30 pm.

[1/3/2020] We will have a quiz with 20 simple true-or-false questions (on probability, statistics, and linear algebra) at the end of the lecture on Jan 9. This quiz will be graded, but it will not count toward your final score if you're enrolled in CS4120. The purpose of the quiz is to indicate the expected background of students. 80% of the questions should be easy to answer. If you find yourself struggling with this quiz, you may need to catch up on the background, or it may be preferable to take one or two preliminary courses first.

Course Description (and Syllabus)

This course introduces fundamental tasks in natural language processing, recent advances based on machine learning algorithms (e.g., neural networks), and applications to interdisciplinary subjects (e.g., computational social science). The course material is delivered mostly through lectures, accompanied by readings. Students will be evaluated based on in-class quizzes, assignments, a research-driven course project, and an open-book exam.

Please find the syllabus here: [Link]

Textbooks and Reference


This course is designed for senior undergraduate students majoring in computer science, information science, linguistics, and other related areas. Students who take this course are expected to write code proficiently in at least one programming language (Python is recommended) and to have completed courses in algorithms (CS 3000, CS 3800, or CS 4810), multivariable calculus, probability, and statistics. Linear algebra is optional, but highly recommended. Prior knowledge of supervised machine learning would be beneficial.


Each assignment or report is due by the end of the day on the corresponding due date (i.e., 11:59 pm EST). Blackboard is used for electronic submission. An assignment or report turned in late loses 20 points (out of 100) for each late day (i.e., every 24 hours). Each student has a budget of 6 late days for the semester that can be used before any late penalty is applied. Use them wisely, e.g., save them for emergencies. Each 24 hours or part thereof that a submission is late uses up one full late day. Late days cannot be applied to the final presentation. For group submissions, each group member is charged the same number of late days, if any. There is no need to inform the instructors when late days are used; the timestamp of the last submission on Blackboard will be used for automatic grade calculation.
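
The late-day arithmetic above can be sketched in Python as follows. This is only an illustration of one reading of the policy, not the course's actual grading script. The function names are hypothetical, and the sketch assumes that free late days are spent before any penalty applies and that a score cannot drop below zero.

```python
import math

PENALTY_PER_DAY = 20  # points off (out of 100) per penalized late day, from the policy
FREE_LATE_DAYS = 6    # semester-wide late-day budget per student, from the policy


def late_days_used(hours_late: float) -> int:
    """Each 24 hours or part thereof counts as one full late day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(raw_score: float, hours_late: float, budget_remaining: int):
    """Return (final_score, new_budget) for one submission.

    Assumption: remaining free late days are consumed first; only the
    days beyond the budget cost 20 points each, with the score floored at 0.
    """
    days = late_days_used(hours_late)
    free = min(days, budget_remaining)
    penalized = days - free
    final_score = max(0, raw_score - PENALTY_PER_DAY * penalized)
    return final_score, budget_remaining - free
```

For example, under these assumptions a submission that is 30 hours late uses two late days. With the full budget of 6 days remaining, the score is unchanged and 4 days remain; with only 1 day remaining, a submission 50 hours late (three late days) would lose 40 points.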

Grades will be determined based on three assignments, eight in-class quizzes, one course project, one open-book exam, and participation:


Jan 7 & 9 (quiz0)

Jan 14 & 16 (quiz about LM)

Jan 21 & 23 (quiz about text categorization, evaluation, and NB)

Jan 28 & 30 (quiz about POS and WSD)

Feb 4 & 6 (quiz about ML Basics and Grammars)

Feb 11 & 13 (quiz about lexical semantics)

Feb 18 & 20 (quiz about SVD and neural LM)

Feb 25 & 27 (quiz about sentiment)

Mar 3 & 5 (No class: Spring Break)

Mar 10 & Mar 12 (quiz about QA and summarization covered on 3/10)

Mar 17 (CLASS CANCELLED due to students moving out) & Mar 19 (quiz about dialogue systems)

Mar 24 (quiz about dialogue systems) & Mar 26 (project feedback meetings)

Mar 31 (quiz about chatbots) & Apr 2 (quiz about discourse)

Apr 7 & Apr 9

Apr 14

Academic Integrity

This course follows the Northeastern University Academic Integrity Policy. All students in this course are expected to abide by the Academic Integrity Policy. Any work submitted by a student in this course for academic credit should be the student's own work. Collaborations are allowed only if explicitly permitted. Violations of the rules (e.g. cheating, fabrication, plagiarism) will be reported.