Dissertation Proposal Defense: Caplan

May 2, 2019 | Linguistics Department Library (3401-C Walnut Street, room 301C)

Working Title -- How to Get (Linguistically) Rich when 
you’re (Computationally) Poor: The Acquisition and Use of 
Abstract Linguistic Knowledge

Supervisors: Charles Yang, Mitch Marcus (CIS)
Proposal Committee: Tony Kroch (Chair), John Trueswell, 
Kathryn Schuler

Date: May 2nd, 2019

Time: 11:00am

Location: Linguistics Library, 3401-C Walnut Street, 3rd 
floor, Suite 300 (first door on the right once you enter the 
Department's suite)


The proposed dissertation explores the interaction between 
computation, mechanisms, and representation within language 
acquisition and language processing. The primary goal is to 
offer support for two main hypotheses. First, that an 
algorithmic-level understanding of these issues is crucially 
informative over and above purely high-level computational 
accounts. Second, that the bulk of the 'heavy lifting' in 
language processing and acquisition is due not to boundless 
computational power or explicit optimization, but rather to 
the specifics of the linguistic representations and 
abstractions at play. I evaluate these hypotheses through a 
set of five case studies.

In the domain of lexical acquisition, the first case study 
presents a model of word learning grounded in category 
formation. The model is evaluated through both computational 
simulation and a novel eye-tracking paradigm. A 
second case study develops a theory of syntactic category 
acquisition based on the iterative bootstrapping of simple 
distributional clusters. The third case study addresses the 
question of information-theoretic efficiency in language 
use. I argue that to whatever degree we can characterize the 
output of the language production system as ‘efficient’ in 
information ordering, this is an emergent property of 
incremental generation. The fourth case study uses a 
combination of corpus statistics and an acceptability task 
to investigate the factors responsible for conditioning the 
choice of (optional) embedded V2 in Swedish. I demonstrate 
that apparently stable lexical variation is not due to 
probabilistic representations of verbs themselves, but 
arises as an interaction between context and the formal 
properties of predicate classes. Finally, the fifth case 
study uses an accent-adaptation paradigm to argue that 
the acoustic-phonetic signal is not maintained over time during 
speech processing.