Colorless green ideas still do sleep furiously
Professor Robert C. Berwick, Massachusetts Institute of Technology
Recently it has been argued that models containing only information extracted from probabilistic or recurrent neural networks, without appeal to conventional generative grammars, can closely approximate sentence acceptability judgments. Because differences in acceptability judgments have been the primary evidence used to motivate sophisticated mental grammars, these results might be taken to imply that conventional generative grammars could be eliminated in favor of these alternatives. Here we examine to what extent such claims are warranted. Our conclusion is that they are not. Our statistical analysis shows that probabilistic or neural network acceptability models remain inadequate, as tested on four theoretically relevant datasets: 300 randomly sampled sentence types from Linguistic Inquiry; 230 from Adger’s 2003 Core Syntax; a new dataset of all 120 permutations of the five words in Chomsky’s “colorless green ideas sleep furiously”; and all 335 sentences from Lasnik and Uriagereka’s 1988 A Course in GB Syntax. The models were built from the British National Corpus and the Corpus of Contemporary American English. The correlations between human acceptability judgments and the model predictions are inadequately low: the models do capture some of the human variation in acceptability judgments, but leave a preponderance unaccounted for, despite the use of “best-in-breed” statistical modelling. Further, residual analysis reveals unexplained structure in the data that is non-random and larger than one would expect if surface probabilities alone were directly generating acceptability judgments. This suggests the influence of a third factor, viz., an underlying generative grammar, which can be measured via conventional “effects analysis” methodology.
As computational tools increase in sophistication, it is important for the field to explore to what extent probabilistic information might replace some part of grammatical theory. That is just good science, especially given that multiple factors are known to influence acceptability judgments. However, the results of this study suggest that acceptability judgments still provide strong evidence for the necessity of sophisticated mental grammars, even when the example sentences are nearly 60 years old.
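As a rough illustration of the quantities mentioned in the abstract, the Python sketch below builds the 120-permutation dataset and defines two helpers: the share of variance in human judgments that a model's scores explain (squared Pearson correlation), and the residuals left over after a straight-line fit. The function names are hypothetical and chosen only for this sketch. The abstract does not describe the authors' actual code, models, or scores, so no real data appears here.

```python
from itertools import permutations
from math import sqrt

# The five words of Chomsky's sentence, in their original order.
WORDS = ["colorless", "green", "ideas", "sleep", "furiously"]


def all_orderings(words):
    """Return every ordering of the given words as a sentence string."""
    return [" ".join(p) for p in permutations(words)]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def variance_explained(human, model):
    """Share of variance in human scores explained by model scores (r squared)."""
    return pearson_r(model, human) ** 2


def residuals(human, model):
    """Residuals of a least-squares line predicting human scores from model scores."""
    n = len(human)
    mean_h = sum(human) / n
    mean_m = sum(model) / n
    sxx = sum((m - mean_m) ** 2 for m in model)
    sxy = sum((m - mean_m) * (h - mean_h) for m, h in zip(model, human))
    slope = sxy / sxx
    intercept = mean_h - slope * mean_m
    return [h - (intercept + slope * m) for h, m in zip(human, model)]


sentences = all_orderings(WORDS)
print(len(sentences))  # 120, i.e. 5 factorial
```

Given one human rating and one model score per sentence, `variance_explained` below 0.5 would correspond to a model that leaves most of the variation unaccounted for, and any systematic pattern in `residuals` would be the kind of unexplained structure the abstract refers to.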
Joint work with: Jon Sprouse, Sandiway Fong, Sagar Indurkhya, and Beracah Yankama