LING570 Distributional Learning

Monday 9:30am-noon (First meeting: Sept 14th as Sept 7 is Labor Day)

Zoom link to be announced. The real class page is on a password-protected server; email me if you would like to participate in this class.


The study of language acquisition has produced plenty of accurate and insightful descriptions of child language but relatively few explicit accounts of learning that incorporate language-specific experience into the child's knowledge. Likewise, experimental research has identified processes that could provide the bridge between the data and the grammar, but questions remain as to whether these laboratory findings generalize sufficiently to the full range of linguistic complexity. Most pressing is the question of discontinuity: How does the child go from not knowing some aspect of their language, such as a morphological rule, to knowing it at some later point of acquisition?

A similar tension exists in the study of cognitive development. The Piagetian theory of developmental stages overstates its case, as the study of infant cognition and perception has shown in the past few decades. However, there are quantal changes in children’s conceptual development, at least in some domains: a causal mechanism is called for. The coverage of cognitive development will be a minor part of this class, as we focus on the specific aspects where language, and language learning, seem to have a decisive impact.

This class studies the important connection between what children know and how they come to know it. We will have very little to say about aspects of language that are invariant, universal, and likely innate, but will focus on the use of language-specific experience, i.e., the structural and statistical properties of the input, in the acquisition of language. In other words, we do distributional learning. As we will see, once a good theory of learning is in place, we may start to question the validity of theories that traditionally allocate more weight of explanation to grammar-internal principles and constraints. Maybe language really is nothing but Merge.


For registered students, there will be 3-4 problem sets, all of which involve data analysis. You will be asked to extract child language data and child-directed input data from the CHILDES database, and to carry out statistical analysis or implement specific learning algorithms. Quantitative results are expected and will be subject to formal evaluation. The problems are all well-defined, and many have been extensively researched in the past, but additional data, including those from other languages, may still yield new insights.
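To give a sense of the kind of data extraction involved, here is a minimal Python sketch that counts word tokens by speaker in a CHAT-style transcript, the format used by CHILDES. The transcript lines are invented for illustration, and the function name word_counts is hypothetical; real transcripts carry many more annotation codes that this sketch does not handle, and the actual problem sets may call for different tools.

```python
from collections import Counter

# Hypothetical CHAT-style transcript, invented for illustration only.
# Main tiers start with "*SPEAKER:", headers with "@", dependent tiers with "%".
SAMPLE = """\
@Begin
*MOT:\twhere is the ball ?
*CHI:\tball .
%mor:\tn|ball .
*MOT:\tyes the ball is here .
*CHI:\tmore ball .
@End
"""


def word_counts(transcript, speaker):
    """Count word tokens on the main tier of one speaker (e.g. 'CHI')."""
    prefix = "*" + speaker + ":"
    counts = Counter()
    for line in transcript.splitlines():
        if not line.startswith(prefix):
            continue  # skip headers, dependent tiers, and other speakers
        utterance = line[len(prefix):]
        for token in utterance.split():
            if token.isalpha():  # drop punctuation marks such as '.' and '?'
                counts[token.lower()] += 1
    return counts


child = word_counts(SAMPLE, "CHI")    # child speech
mother = word_counts(SAMPLE, "MOT")   # child-directed input

print(child.most_common())
print(mother.most_common())
```

Keeping child speech and child-directed input in separate counters makes it straightforward to compare what the child produces against the distribution of the input.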

Reading materials and problem sets will be posted online. Registered students need to submit a research paper.

Major Topics, Roughly in Order

Readings (* = optional)

I realize it is nearly impossible to read all the materials listed below, some of which may be quite far removed from the core areas of study for most participants. I list them because I think they are important, for me anyway, for developing a broader perspective on language and learning.

Each unit has an introductory reading item (in red) which should be read first for background and overview.

Books will be distributed in some appropriate form.


Lewontin, R.C. 1983. The organism as the subject and object of evolution. Scientia. 118, 53-82.

Quine, W.V., 1957. The scope and language of science. The British Journal for the philosophy of Science, 8(29), pp.1-17.

Chomsky, N. 1965. Aspects of the theory of syntax. MIT Press. Part of chapter 1.

Labov, W., 1989. The child as linguistic historian. Language variation and change, 1(1), pp.85-97.


Berwick & Niyogi 1996. Formalizing triggers. Linguistic Inquiry.

Bush, R.R. and Mosteller, F., 1951. A mathematical model for simple learning. Psychological review, 58(5), p.313.

Gleitman, L.R. and Trueswell, J.C., 2020. Easy words: Reference resolution in a malevolent referent world. Topics in cognitive science, 12(1), pp.22-47.

*Pereira, A.F., Smith, L.B. and Yu, C., 2014. A bottom-up view of toddler word learning. Psychonomic bulletin & review, 21(1), pp.178-185.

*Pruden, S.M., Hirsh‐Pasek, K., Golinkoff, R.M. and Hennon, E.A., 2006. The birth of words: Ten‐month‐olds learn words through perceptual salience. Child development, 77(2), pp.266-280.

Stevens, J.S., Gleitman, L.R., Trueswell, J.C. and Yang, C., 2017. The pursuit of word meanings. Cognitive science, 41, pp.638-676.

Xu, F. and Tenenbaum, J.B., 2007. Word learning as Bayesian inference. Psychological review, 114(2), p.245.


* Cui, A., 2020. The Emergence of Phonological Categories (Doctoral dissertation, University of Pennsylvania).

Feldman, N.H., Griffiths, T.L. and Morgan, J.L., 2013.  A role for the developing lexicon in phonetic category acquisition. Psychological review, 120(4), p.751.

Johnson, E.K. and White, K.S. 2019. Six Questions in Infant Speech. Human Language: From Genes and Brains to Behavior, p.99.

*Lignos, C., 2012, April. Infant word segmentation: An incremental, integrated model. In Proceedings of the West Coast Conference on Formal Linguistics (Vol. 30, pp. 13-15).

Reeder, P.A., Newport, E.L. and Aslin, R.N., 2013. From shared contexts to syntactic categories: The role of distributional information in learning linguistic form-classes. Cognitive psychology, 66(1), pp.30-54.

Yang, C. 2004. Universal Grammar, statistics or both?. Trends in cognitive sciences, 8(10), pp.451-456.

Yang, C. 2009. Population structure and language change. Unpublished manuscript.

Yeung, H.H. and Werker, J.F., 2009. Learning words’ sounds before learning how words sound: 9-month-olds use distinct objects as cues to categorize speech information. Cognition, 113(2), pp.234-243.


Chemla, E., Mintz, T.H., Bernal, S. and Christophe, A., 2009. Categorizing words using ‘frequent frames’: what cross‐linguistic analyses reveal about distributional acquisition strategies. Developmental science, 12(3), pp.396-406.

*De Marcken, C., 1995. Lexical heads, phrase structure and the induction of grammar. In Third Workshop on Very Large Corpora.

Dye, C., Kedar, Y. and Lust, B., 2019. From lexical to functional categories: New foundations for the study of language development. First Language, 39(1), pp.9-32.

*Meylan, S.C., Frank, M.C., Roy, B.C. and Levy, R., 2017. The emergence of an abstract grammatical category in children’s early speech. Psychological science, 28(2), pp.181-192.

Shi, R. and Melançon, A., 2010. Syntactic categorization in French‐learning infants. Infancy, 15(5), pp.517-533.

Tomasello, M., 2000. Do young children have adult syntactic competence?. Cognition, 74(3), pp.209-253.

Valian, V., Solt, S. and Stewart, J., 2009. Abstract categories or limited-scope formulae? The case of children's determiners. Journal of child language, 36(4), pp.743-778.

Yang, C., 2013. Ontogeny and phylogeny of language. Proceedings of the National Academy of Sciences, 110(16), pp.6324-6327.

Yang & Valian. (Submitted). Determining the abstractness of determiners.


Albright, A. and Hayes, B., 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition, 90(2), pp.119-161.

Berko, J. 1958. The child's learning of English morphology. Word.

Chomsky, N. 1970. Remarks on nominalization. In: R. Jacobs and P. Rosenbaum (eds.) Readings in English Transformational Grammar, 184-221. Waltham: Ginn.

Dabrowska, E. 2001. Learning a morphological system without a default: the Polish genitive. J. Child Language.

*Maratsos, M. 2000. More regularization after all. J. Child Language.

Medin, D.L. and Schaffer, M.M., 1978. Context theory of classification learning. Psychological review, 85(3), p.207.

*Nosofsky, R.M., Palmeri, T.J. and McKinley, S.C., 1994. Rule-plus-exception model of classification learning. Psychological review, 101(1), p.53.

Pinker, S., Ullman, M.T., McClelland, J. and Patterson, K. 2002. The past tense debate. Trends in cognitive sciences, 6(11).

Yang, C., 2002. Knowledge and learning in natural language. Oxford University Press. Chapter 3.


Allen, S.E., 2015. Argument structure. In Cambridge handbook of child language (pp. 271-297). Cambridge University Press.

Berwick, R.C., 1985. The acquisition of syntactic knowledge. MIT press.

Bowerman, M. and Croft, W., 2008. The acquisition of the English causative alternation. Crosslinguistic perspectives on argument structure: Implications for learnability, pp.279-307.

Boyd, J.K. and Goldberg, A.E., 2011. Learning what not to say: The role of statistical preemption and categorization in a-adjective production. Language, pp.55-83.

*Irani, A. 2019. Learning from positive evidence: The acquisition of verb argument structure. Penn dissertation.

Perfors, A., Tenenbaum, J. and Wonnacott, E., 2010. Variability, negative evidence, and the acquisition of verb argument constructions. Journal of child language.

Pinker, S., 1989. Learnability and cognition: The acquisition of argument structure. MIT Press.

Yang, C., 2016. The price of linguistic productivity: How children learn to break the rules of language. MIT press.

Yang, C. 2017. Rage against the machine. Language acquisition. 24(2), 100-125.

Cognitive development

Gelman, S. 2003. The essential child.  Oxford University Press.

Carey, S. 2009. The origin of concepts. Oxford University Press.


The CHILDES database

SUBTLEX corpus

English Lexicon Project.

Unix for Poets.

Python bootcamp

CMU Pronunciation Dictionary