The Price of Linguistic Productivity:

How children learn and break rules of language

A user's guide to the Tolerance Principle: a discussion of the conceptual and methodological issues in using the Tolerance Principle for language research.

The Tolerance Principle was developed in 2002: a draft paper can be found here. The examples and narrative were crude, and the formal result was not yet available (Sam Gutmann provided it a bit later), but the main idea has remained the same.

Gutmann's proof led to an early publication in 2005. Few but the Penn folks cared. In 2006 I gave a job talk at Penn, ostensibly for a historical linguistics position. The slides are attached here.


“This is the best linguistics book that I have read in a decade. It presents a simple, elegant solution to the problem of reconciling patterns of regularity and irregularity. A compelling property of Yang’s Tolerance Principle is that it works better with small quantities of data, thus providing a novel and insightful answer to those who wonder how children can master a language with so little input. This is a wonderful book and deserves to attract a large audience.”
—Mark Aronoff, Distinguished Professor of Linguistics, Stony Brook University; author of Morphology by Itself: Stems and Inflectional Classes

“This excellent book addresses a range of extremely important issues, and brings novel arguments to bear on their resolution. It clarifies the elusive distinction between ‘core’ grammatical facts and the ‘periphery’; makes explicit a standard (though typically vague) approach to how inflectional gaps can be acquired; and presents coherent accounts of some perennial analytic conundrums such as the status of various plural marking rules in German. It is the most important work I’ve read in years for the deep basic insights it has to offer on fundamental questions in the theory of grammar.”
—Stephen R. Anderson, Dorothy R. Diebold Professor of Linguistics, Yale University; author of Languages: A Very Short Introduction

“Charles Yang’s new book does something that I never thought I would witness in my lifetime; it makes quantitative predictions in a linguistic domain. And by ‘quantitative’ I do not mean giving p-values or confidence intervals or rating scores. I mean numerical predictions about the size of a measurable effect; in fact, many, many measurable effects. That’s quantitative! So run, don’t walk, to your nearest book provider and read the damn thing! It is groundbreaking work.”
—Norbert Hornstein, Professor, Department of Linguistics, University of Maryland

“Charles Yang’s book is full of new insights into enduring questions. For decades our colleagues have debated about rules and exceptions—are there really ‘rules’ in mental representations, or is everything stored as lexical information, exemplars, instances? The Price of Linguistic Productivity provides new data showing that in fact children make a categorical distinction between these two types of representation—and, most important, an insightful computational account of when each will be formed. The Tolerance Principle not only accounts for findings in scores of languages; it also makes new predictions about nonlinguistic concepts—that is, when a generalization will occur in inductive learning, within languages and beyond. This book is a profoundly important contribution to our understanding of language acquisition and of learning.”
—Elissa L. Newport, Professor of Neurology, Psychology, and Linguistics, Georgetown University

“Charles Yang proposes a simple rule relating the number of exceptions that a productive rule of grammar can tolerate to the number of regular cases it generates, and provides a diverse set of case studies, including data concerning the course of child language acquisition. The case-studies suggest that it applies with great generality across languages, and across different distributions of regular and irregular forms. His book will be read by linguists, psychologists, cognitive scientists, and all who are concerned with questions of the fundamental nature of human language.”
—Mark Steedman, Professor of Cognitive Science, School of Informatics, University of Edinburgh


The initial goal was modest. In my earlier work (Knowledge and Learning in Natural Language, or KLNL, Chapter 3), I reported evidence that children learn and organize irregular verbs by rules. For instance, a rule "Change Rime to ought" takes care of verbs such as catch, think, buy, bring, teach, and seek, while verbs such as say-said, lose-lost, and sleep-slept fall under the rule of Vowel Shortening triggered by suffixation. In other words, the traditional rule-based approach to irregular morphology, going back to Bloch (1947) and SPE, is correct, and the holistic storage/retrieval account is incorrect.

These results raise a sharp question: How does the child figure out that the "ought" rule applies only to a fixed number of words, whereas the "-ed" rule is general and can apply to any verb? Children may occasionally over-use the -ed rule (go-goed), but no one ever says hatch-haught on the analogy of catch-caught. Given the ubiquity of exceptions in language, a general solution must be found. Furthermore, some deficiencies in the variational approach to language acquisition developed in KLNL also call for a principled distinction between noise and exception in learning.

Although the formal solution was worked out in 2002, it took many years to understand the scope of problems at hand, and accumulate a large number of cross-linguistic case studies. A brief outline is as follows.

Chapter 1: Border Wars

The tension between rules and exceptions, and productivity, in language. Why this has remained an unresolved question and how it has poisoned the water for so long.

Chapter 2: The Indispensability of Rules

A review of statistical facts of language, especially morphology, and of children’s acquisition of morphology, with a focus on productivity. Contrary to popular belief, productivity should be understood as a categorical notion in language, judging from the now extensive cross-linguistic studies of language acquisition.

Chapter 3: The Tipping Point

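Using the Elsewhere Condition as a basic principle of language, together with a performance/processing model, I derive the mathematical principle of productivity, which I call the Tolerance Principle: a rule applicable to N items is productive only if its exceptions number no more than N / ln N.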
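To make the threshold concrete, here is a minimal Python sketch. It assumes the Tolerance Principle in its standard published form, in which a rule over N items tolerates at most N / ln N exceptions. The function names are illustrative and do not come from the book.

```python
import math


def tolerance_threshold(n: int) -> float:
    """Largest number of exceptions a rule over n items can tolerate,
    assuming the threshold N / ln N."""
    if n < 2:
        # ln(1) = 0 and ln(n) is undefined for n < 1, so the ratio is unusable.
        raise ValueError("n must be at least 2")
    return n / math.log(n)


def is_productive(n: int, exceptions: int) -> bool:
    """True if a rule over n items with this many exceptions stays productive."""
    return exceptions <= tolerance_threshold(n)


print(round(tolerance_threshold(100), 2))  # 21.71
print(is_productive(100, 21))              # True
print(is_productive(100, 22))              # False
```

With a hypothetical vocabulary of 100 relevant items, the threshold is about 21.7, so a rule survives 21 exceptions but not 22.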

Chapter 4: Signal and Noise

A very detailed study of the acquisition of English inflectional morphology and nominalization morphology, the treatment of metrical stress in English and its acquisition, and the important case of German noun plurals where a very small rule (‘add -s’) can be productive. The entire discussion is driven by the equation, using child-directed language data.

Chapter 5: When Language Fails  

The theory predicts complete lexicalization when the number of exceptions to a rule exceeds the threshold. I show that this leads to morphological gaps: without a productive rule, you only know the derived form if you hear it; otherwise ineffability arises. Detailed numerical studies cover gaps in Russian (Morris Halle's famous 1973 paper), the English “stride-strode-stridden” gap, Spanish verbal inflections in the third conjugation, and the masc. sg. genitive in Polish and its acquisition. The Tolerance Principle also provides an integrated theory of language acquisition, variation, and change, in that it predicts the conditions under which language change is actuated. As a case study, the theory explains why, and when, the so-called dative sickness in Icelandic started to take shape in the late 19th century.

Chapter 6: The Logic of Evidence

A completely new conceptualization of the indirect negative evidence business in language acquisition, especially in syntax. Instead of thinking about retreating from over-generalization, a derivative application of the Tolerance Principle ensures that the child is much more careful before generalizing. Shows how the learner may acquire the fact that adjectives such as “asleep” do not allow attributive use in NPs (“*the asleep cat”), and how to resolve Baker’s classic problem of dative construction acquisition (“*I donated the museum a painting”). A critique of previous proposals, including Bayesian models of inference, is also included.

Chapter 7: On Language Design

Summary. How the current study impacts traditional problems in linguistics, and how it leads to a simplification of the theory of UG and language learning,  with a reduced role for domain-specific innate knowledge of language, leading to an arguably more plausible solution to the problem of language evolution. A novel account of why language learning must start small.
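One way to see the "start small" point, sketched under the assumption that the tolerance threshold is N / ln N: the tolerated share of exceptions is then 1 / ln N, which shrinks as the number of items N grows. The function name below is hypothetical, and the values of N are arbitrary illustrations rather than data from the book.

```python
import math


def tolerated_fraction(n: int) -> float:
    """Share of n items that may be exceptions, i.e. (n / ln n) / n = 1 / ln n."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return 1.0 / math.log(n)


for n in (10, 100, 1000, 10000):
    print(n, round(tolerated_fraction(n), 3))
# 10 0.434
# 100 0.217
# 1000 0.145
# 10000 0.109
```

On these numbers, a learner with 10 relevant items can tolerate over 40% exceptions, while one with 10,000 can tolerate only about 11%, so a smaller vocabulary makes a productive rule easier to reach.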