HG2051 – Language and the Computer (AY2023)

Hiram Ring <hiram.ring@ntu.edu.sg>

Fridays, 12:30-3:20pm (Sem1)

TR+29 (LHS-B2-06, The Hive)

Traditionally, linguistic analysis was done largely by hand, but computer-based methods and tools are increasingly common in contemporary research. This course provides an introduction to skills and resources that can assist the linguist in performing fast, flexible, and accurate quantitative analyses. Students will learn a programming language (Python) along with techniques for processing human language data. No previous programming experience is required: we will teach you the basics of programming and computational linguistics along with some good software engineering practices.

Schedule

| Week | Date | Topic | Notes |
|------|------|-------|-------|
| 1 | 18 Aug | What is Computational Linguistics? Why do it? Why use Python? CS basics | Setup, VS_Code |
| 2 | 25 Aug | Basic Types and Data Structures; Using Python to Count Things; Lists | PyT 3.1; NLTK 1 |
| | 01 Sep | Polling Day, no class | |
| 3 | 08 Sep | Assignment, Expressions, and Control; Strings (abbreviated class due to Student Union Day) | PyT 3.2, 4; NLTK 4.1 |
| 4 | 15 Sep | Text Corpora and Conditional Frequencies | DIP 2.2, 2.8; PT 5.3, 5.5, 4.7.1-2; NLTK 2.1-2; lecture; practice |
| 5 | 22 Sep | Lexical Resources and WordNet | NLTK 2.4, 2.5, (How To); lecture; practice |
| 6 | 29 Sep | Processing Raw Text | NLTK 3.1, 3.3; PT 7.1-7.3; lecture; practice |
| | 06 Oct | Recess | |
| 7 | 13 Oct | Mid-review; Working with Software Projects | PT 6, 6.4; Coding challenge |
| 8 | 20 Oct | Algorithmic Thinking and Regular Expressions | NLTK 3.4, 5, 6, 7, 8; lecture; practice |
| 9 | 27 Oct | N-Grams and Collocations | NLTK 4.5, 5; lecture; practice |
| 10 | 03 Nov | Part-of-speech Tagging | NLTK 5.4, 5, 7; practice |
| 11 | 10 Nov | Classification | NLTK 6.2, 5, 6; practice, enamdict; Project 1 due |
| 12 | 17 Nov | Ethics, Language Models, and Software Libraries | |
| 13 | 24 Nov | Review and Final Quiz: 12:30pm, TR+49 (LHS-02-03, The Hive) | Coding challenge |
| | 01 Dec | Project 2 due, 11:59pm | |

Course Pages

Grading Criteria

This course is graded with continuous assessment as follows:

You may also get 1–5% extra credit (not exceeding 100% in the course) by submitting a contribution (e.g., code or documentation) to an open-source project. Contact me for details.

Resources

Acknowledgments

The majority of the content for this course has been borrowed (with permission) from Michael Wayne Goodman and Francis Bond, who taught the course in previous years. Below are some of the archives of the previous courses: