Graduate Courses Schedule (Spring 2025)
For Ph.D./M.A. in Linguistics, M.A. in Computational Linguistics
This course is called “Statistics” but it is really “Practical Computational Linguistics
2”, and it follows up on “Practical Computational Linguistics 1”, which is the intro
Python graduate course (called “Comp Ling 1”).
The course will be an introduction to dealing with linguistic data in large quantities.
It focuses on knowledge that is useful both to theoretical linguists and to
people who are more interested in building language technology systems. It is not
meant as an introduction to contemporary Natural Language Processing (NLP), though
it should make it easier to take an NLP class subsequently.
All programming will be done in Python.
The course will cover the following issues:
• Some basic concepts in scientific methodology, probability, statistics, and information
  theory (2.5 weeks)
• Acquiring data (1 week)
    • The post-Twitter world
• Analyzing data (2 weeks)
    • Correlation, statistical significance
• Human-annotating data (1.5 weeks)
    • Inter-annotator agreement
• Using automatic annotation on data (2 weeks)
    • Representing and automatically tagging phonology, morphology, syntax, semantics
• Finding patterns in data: machine learning 1 (2 weeks)
    • Decision trees
    • Evaluation: metrics, statistical significance
• Finding patterns in data: machine learning 2 (3 weeks)
    • Neural network basics, deep learning, LLMs, zero-shot, fine-tuning
Students will choose a project at the beginning of the semester. I will provide projects
for those who do not come with a project. Each student will develop their project
over the course of the semester, implementing various concepts learned in class. Each
class, one or two students will present their current work. Experienced PhD students
will be asked to supplement the class with in-depth demos of how to use Python for
the tasks discussed in class.
A detailed consideration of recent developments in syntactic theory, including treatments
of constituency and word order, grammatical relations, typological variation and linguistic
universals, and constraints on grammatical rules and representations.
A study of recent developments in phonological theory, with particular attention to
nonlinear models of phonological representation and constraint-based models.
A survey of parsing theory for natural language processing and its applications in
psycholinguistic modeling. The course covers a wide variety of parsing algorithms
for context-free and mildly context-sensitive grammar formalisms. The performance
of these algorithms is carefully analyzed and set in relation to empirical phenomena
of human sentence processing.
An introduction to the theoretical foundation of computational linguistics. The course
emphasizes the importance of algorithms, algebra, logic, and formal language theory
in the development of new tools and software applications. Empirical phenomena in
phonology and syntax are sampled from a variety of languages to motivate and illustrate
the use of concepts such as strictly local string languages, tree transducers, and
semirings. Students will develop familiarity with the literature and tools of the
field.
Course Description: This course provides an overview of the history and structure of signed languages
and gestural systems. It is designed for graduate students in linguistics with no
prior sign language knowledge. You may pick up a few American Sign Language (ASL)
signs along the way, but this is not an ASL language course. Sample topics include
phonology, morphology, sign order & spatial grammar, grammatical use of facial expressions,
constructed action, and the role of iconicity. We will also discuss gesture, homesign,
language emergence, historical change, dialects, acquisition, bilingualism, and sign
language disorders. The course format includes Dr. Singleton’s lectures and student-led
discussions on the assigned readings. Students will contribute posts to four online
discussion prompts during the semester and complete a course paper on a sign language-related
topic (to be developed in consultation with Dr. Singleton).
This seminar will examine various issues in the syntax of WH, both then and now.
The seminar is divided into 4 primary modules:
(i) constraints and conditions; (ii) WH in situ; (iii) multiple WH; (iv) partial WH and
typology
Each for-credit student will develop an original research project related to WH and
will present it in blitz form at the WH Festival at the end of the semester, and write
a term paper on it.
The most important ongoing requirement of the course is READING! Every week you should
read carefully (at least) two things you have not read before (book chapters, articles,
etc.). Everyone is subject to “the collective arrangement” – the more you read, the
less you will have to write!
Additionally, each student will do a language “sketch” of the WH situation in a language
they do not know and have not worked on or studied. (There is an extensive “typology”
folder with tons of papers on various languages on Brightspace that can serve as a
starting point for “the sketch”.)
An overview of different perspectives on the nature of markedness and the role of
phonetic naturalness in phonological theory. A range of different theories and approaches
are examined, including Radical Underspecification, Natural Phonology, Optimality
Theory, Phonetically Based Phonology, Substance Free Phonology, and Evolutionary Phonology,
among others.
An overview of the acquisition of morphology as a computational problem. Topics covered
will include a characterization of the primary linguistic data, stages that child learners go through
as they acquire their morphologies, and learning algorithms that have been proposed for various aspects
of the acquisition process, including segmentation, inflection, and paradigm discovery.
For M.A. in TESOL