For Ph.D./M.A. in Linguistics, M.A. in Computational Linguistics
This course is called “Statistics” but it is really “Practical Computational Linguistics
2”, and it follows up on “Practical Computational Linguistics 1”, which is the intro
Python graduate course (called “Comp Ling 1”).
The course will be an introduction to dealing with linguistic data in large quantities.
It is focused on acquiring knowledge that is helpful both for theoretical linguists and
for people who are more interested in building language technology systems. It is not
meant as an introduction to contemporary Natural Language Processing (NLP), though
it should make it easier to take an NLP class subsequently.
All programming will be done in Python.
The course will cover the following issues:
• Some basic concepts in scientific methodology, probability, statistics, and information theory (2.5 weeks)
• Acquiring data (1 week)
  • The post-twitter world
• Analyzing data (2 weeks)
  • Correlation, statistical significance
• Human-annotating data (1.5 weeks)
  • Inter-annotator agreement
• Using automatic annotation on data (2 weeks)
  • Representing and automatically tagging phonology, morphology, syntax, semantics
• Finding patterns in data: machine learning 1 (2 weeks)
  • Decision trees
  • Evaluation: metrics, statistical significance
• Finding patterns in data: machine learning 2 (3 weeks)
  • Neural network basics, deep learning, LLMs, zero-shot, fine-tuning
Students will choose a project at the beginning of the semester. I will provide projects
for those who do not come with a project. Each student will develop their project
over the course of the semester, implementing various concepts learned in class. Each
class, one or two students will present their current work. Experienced Ph.D. students
will be asked to supplement the class with in-depth demos of how to use Python for
the tasks discussed in class.
A detailed consideration of recent developments in syntactic theory, including treatments
of constituency and word order, grammatical relations, typological variation and linguistic
universals, and constraints on grammatical rules and representations.
A study of recent developments in phonological theory, with particular attention to
nonlinear models of phonological representation and constraint-based models.
An introduction to the theoretical foundation of computational linguistics. The course
emphasizes the importance of algorithms, algebra, logic, and formal language theory
in the development of new tools and software applications. Empirical phenomena in
phonology and syntax are sampled from a variety of languages to motivate and illustrate
the use of concepts such as strictly local string languages, tree transducers, and
semirings. Students will develop familiarity with the literature and tools of the
field.
This course is a continuation of Semantics I, taught in fall 2023 as LIN 625. An
investigation of the role of semantics (the theory of meaning) in the overall theory
of grammar, structured around such topics as formal semantics, the interaction of
syntax and semantics, and lexical semantics. Prerequisite: LIN 625.
In this seminar we'll explore the phenomenon of locality in syntax, the observation
that two elements in a syntactic dependency are generally required to be 'close' to
each other in some relevant sense. Over the course of the semester we'll get a good
empirical overview of the kinds of phenomena where locality has been argued to play
an important role, but our primary focus will be theoretical. We'll discuss a series
of key works on locality, from some classics to cutting-edge papers from recent years,
aiming for solid coverage of the various ways that locality has been understood and
implemented in syntactic theory: e.g., in terms of domains (like phases), intervening
elements (like Relativized Minimality), and paths. I will also present some of my own
recent work, which tries to ground one popular approach to locality, phase theory,
by reimplementing it in a way that takes inspiration from ideas of modular design.
People taking the course for credit will be responsible for leading discussion on
one of the readings, and those signed up for 3 credits will write a final paper on
a relevant topic.
Epenthesis refers to the insertion of phonological material into a string. In this
course we will investigate insertions of different types of material (feature, segment,
syllable) and different motivations for the insertion, from phonetic intrusion, to
the repair of a phonological violation, or alignment of a morphological boundary with
a phonological one. In each class meeting we will discuss an article and then workshop
the applications of that article to the students’ research projects. Depending on
the interests of the students, the final assignment will be either an individual or a
group project that includes an abstract for submission to an upcoming conference.
This course provides an in-depth exploration of computational methods for speech analysis,
focusing on speech prosody (such as intonation, rhythm, and accents). Speech prosody
research is essential for understanding how the interplay between fine phonetic detail
and various linguistic structures (phonology, morphology, syntax, and semantics) shapes
communication beyond traditional linguistic boundaries. Throughout the course, you
will gain a comprehensive understanding of both theoretical foundations and practical
techniques for analyzing speech prosody. You will also develop a personalized research
project by engaging with recent studies in the field and analyzing prosodic patterns
in languages of your own interest using large corpus data.
For M.A. in TESOL
Content-based language and literacy instruction and assessment for children and adolescents
for whom English is not their first language, in alignment with current state, national,
and professional standards. Teacher candidates design standards-based and data-driven
curricular modules for teaching language through mathematics, the sciences, and the
social studies; engage in reflective and collaborative practices; and evaluate web-based
technologies. 3 credits, letter graded (A, A-, B+, etc.)
Study of the acquisition of a second language by children and adults. The focus is
on data: the systematicity of learners' errors, the ease of acquisition in childhood,
etc.; the adequacy of theories (e.g., interlanguage processes, the monitor model, the
critical period) to explain the data; and the reliability of methods of obtaining data.
Students conduct an empirical study testing a current hypothesis.
In-depth exploration of the theories of literacy and language development of native
English speakers and students who are English language learners, from preschool through
grade 12. The development and assessment of literacy skills among children at various
stages of learning development and across disciplines will be examined. Attention
will also be given to children with special needs and the integration of technology
in the development of literacy skills.
Study of the systematic errors made by foreign language learners and the potential
of various linguistic theories to predict and account for these errors.
An in-depth study of curriculum design and evaluation with a focus on needs analysis,
goals and objectives, approaches to language learning and teaching, assessment, resources,
and program evaluation.
Investigation and evaluation of instructional planning and assessment aligned with
current state, national, and professional standards. Teacher candidates practice content-based
curriculum development and the use of technologies for language and literacy development
among English language learners, and reflect on their teaching in multi-level classrooms.
Partnerships with colleagues, parents, and the respective communities are explored.
3 credits, letter graded (A, A-, B+, etc.)
Exploration, inquiry, and practice of English language instruction strategies and
approaches. Prerequisite: Admission to the M.A. TESOL Teacher Education Program.
Observation and practice of data-driven language and literacy instruction and assessment
across disciplines for children and adolescents for whom English is not their first
language. Teacher candidates are placed in diverse educational settings in pre-elementary
through secondary levels for 50 hours of field experience. 1 credit, S/U grading. May
be repeated for credit.
TESOL teacher candidates receive supervised practice teaching by arrangement with
selected schools across the region. The student teacher reports to the school to which
he or she is assigned each full school day for the entire semester. Applications must
be filed in the academic year preceding that in which the teacher candidate plans
to take the course. 3 credits, S/U grading.
TESOL teacher candidates receive supervised practice teaching by arrangement with
selected schools across the region. The student teacher reports to the school to which
he or she is assigned each full school day for the entire semester. Applications must
be filed in the academic year preceding that in which the teacher candidate plans
to take the course. 3 credits, S/U grading.