TO-DO: example numbers
TO-DO: finish intro
TO-DO: final section: e-language vs. i-language
TO-DO: footnote with labov stats

1 Foundational issues


This book is an introduction to generative grammar from a Chomskyan perspective. By the time you finish this chapter, you will have a clearer understanding of what we mean by this sentence, and by the time you finish the entire book, your understanding of it should be clearer and deeper still. But for now, you have probably gained the impression that this book is about grammar of some sort. And right there, we have a problem. The problem is that there is an everyday sense of the term `grammar' and a quite different sense in which the term is used in linguistics.

** CONTINUE INTRO


Prescriptive versus descriptive grammar

In the everyday sense, `grammar' refers to a collection of rules concerning what counts as socially acceptable and unacceptable language use. Some of these rules, like the ones in (1), make reference to particular words and apply to both spoken and written language.

(1) a.   Don't use ain't.
b. Don't use seen as the past tense of see.

But mainly, the rules in question concern the proper composition of sentences in the written language, and you probably recall being taught rules like those in (2) at school.

(2) a. Don't start a sentence with a conjunction.
b. Don't use contractions.
c. Don't use sentence fragments.
d. Don't end a sentence with a linking verb.
e. Don't use dangling participles.
f. Don't end a sentence with a preposition.
g. Don't use an object pronoun for a subject pronoun in a conjoined noun phrase.
h. Don't use a plural pronoun to refer back to a singular noun like everyone, no-one, someone, and the like.
i. Don't split an infinitive.
j. Use whom, not who, as the object of a verb or preposition.

Someone who composes sentences in accordance with rules like those in (2) is said to have good grammar, whereas someone said to have bad grammar doesn't apply the rules when they ought to be applied,1 producing sentences like (3).

(3) a.   Over there is the guy who I went to the party with. (violates (2f) and (2j); should be: with whom I went to the party)
b. Bill and me went to the store. (violates (2g); should be: Bill and I)

Now if rules like those in (2) were the only ones that were used to form English sentences, then people who didn't follow them should produce rampantly variable and confusing sentences, leading in extreme cases to a complete breakdown of communication. However, even people who routinely produce sentences like those in (3) do not produce the likes of (5).

(5) a.   Over there is guy the who I went to party the with.
b. Over there is the who I went to the party with guy.
c. Bill and me the store to went.

The sentences in (3) may be instances of bad grammar in the everyday sense, but they are still English sentences. By contrast, we don't need to rely on rules learned at school to tell us that the examples in (5) are not English sentences, even though they contain exactly the same words as those in (3).

Since native speakers of English do not produce a variable mishmash of words, including word salad like (5), they must be composing sentences using rules of some other sort than those in (2). We can determine what some of these rules are by examining the sequences in (5) to see what it is that makes them into word salad. In (5a), the article the is in the wrong order with respect to guy and party, the nouns that it belongs with. In (5b), the relative clause (who I went to the party with) is in the wrong order with respect to the noun that it modifies (guy). In (5c), the preposition to is in the wrong order with respect to its object (the store). In other words, the sentences in (5) do not follow the rules in (6).

(6) a.   Articles precede the nouns they belong with.
b. Relative clauses follow the noun that they modify.
c. Prepositions precede their objects.

(There's a fourth rule that's not followed in (5), which you are asked to formulate in the Exercises.)

Rules like those in (6) have a quite different intention from those in (2). The rules in (2) are normative or prescriptive, whereas those in (6) are descriptive. Rules of prescriptive grammar have the same status as rules of etiquette (like table manners or dress codes) or of the laws of society, which divide the entire spectrum of possible human behavior into correct, socially acceptable, or legal behavior, on the one hand, and incorrect, socially unacceptable, or illegal behavior, on the other. Rules of prescriptive grammar make statements about how people ought to use language. In contrast, rules of descriptive grammar have the status of scientific observations, and they are intended as true and insightful generalizations about the way that human language is, rather than how it ought to be used. Descriptive rules are more general and basic than prescriptive rules in the sense that all sentences of a language are formed in accordance with them, not just the subset of sentences that count as correct. We can think of prescriptive rules as filtering out some (relatively small) portion of the entire output of the descriptive rules of a language.

In syntax, as in linguistics more generally, we adopt a resolutely descriptive perspective concerning language. In particular, when we say that a sentence is grammatical, we don't mean that it is correct from a prescriptive point of view, but that it conforms to descriptive rules like those in (6). In order to indicate that a sequence is ungrammatical (in the descriptive sense), we prefix it with an asterisk. Grammatical sentences are usually not specially marked, but they can be prefixed with `ok' for clarity. These conventions are illustrated in () and ().

() a. * Over there is guy the who I went to party the with. (= (5a))
b. * Over there is the who I went to the party with guy. (= (5b))
() a. ok Over there is the guy who I went to the party with. (= (3a))
b. ok Over there is the guy with whom I went to the party.

Prescriptive grammar is based on the view that there is a right way to do things and a wrong way to do things. In a situation of linguistic variability, prescriptive grammar is concerned with declaring one of the variants to be correct (and often with justifying the choice). In the same situation, the most basic aim of descriptive grammar is to document the variants without passing judgment on them. For instance, consider the variable subject-verb agreement pattern in (7). In (7a), the singular verb is (contracted to 's) agrees in number with the preverbal expletive subject there (in italics), whereas in (7b), the plural verb are agrees with the postverbal logical subject some boxes (in boldface). The typeface of the verb indicates which of the two subjects it agrees with.

(7) a.   There's still some boxes that need to be brought in.
b. There are still some boxes that need to be brought in.

The prescriptive and descriptive rules concerning this pattern are given in (8).

(8) In a sentence containing both the singular expletive subject there and a plural logical subject ...
a. Prescriptive rule:   ... the verb should agree in number with the logical subject.
b. Descriptive rule:   ... the verb can agree in number with either subject.

To take another example, let's consider the rule that says, "Don't end a sentence with a preposition."2 A prescriptive grammarian might argue that it is more logical to keep the preposition (in italics) together with its object (in boldface), as in ()a, rather than to separate the two, as in ()b.

() a. With which friend did you go to the party?
b. Which friend did you go to the party with?

But by that token, we ought to keep verbs and their objects together in questions and to prefer ()a over ()b. In fact, however, ()a is ungrammatical in English.

() a. * Adopt which cat did your friend?
b. ok Which cat did your friend adopt?

There is no logical reason why prepositions can be separated from their objects in English while verbs cannot. From a descriptive perspective, the grammaticality contrast between ()a and ()a is simply a matter of fact, irreducible to more basic considerations given our present state of knowledge. () highlights the differences between the relevant prescriptive and descriptive rules.

() When the object of a preposition appears in a position that isn't its ordinary one, ...
a. Prescriptive rule:   ... it should be preceded by the preposition.
b. Descriptive rule:   ... it can either be preceded by the preposition, or it may stand alone, with the preposition remaining in its ordinary position.

The contrasting attitude of prescriptive and descriptive grammar towards linguistic variation has a quasi-paradoxical consequence: namely, that prescriptive rules are never descriptive rules. The reason for this has to do with the way that social systems (not just language) work. If everyone in a community consistently behaves in a way that is socially acceptable in some respect, then there is no need for explicit prescriptive rules to ensure the behavior in question. It is only when behavior is perceived as socially unacceptable that prescriptive rules come to be formulated to check it. For example, if every customer who enters a store invariably wears both a shirt and shoes, there is no need for the store owner to put up the sign that says "No shirt, no shoes, no service." Conversely, it is precisely at illegal dump sites that "No dumping" signs are posted. In an analogous way, in the domain of language use, rules of prescriptive grammar are only ever formulated in situations with noticeable linguistic variation. But being prescriptive, they cannot treat all of the occurring variants as equally acceptable, so they can't ever be descriptive.

Rule formation in language acquisition

In addition to differing in intention, prescriptive and descriptive rules of grammar differ in another respect as well: namely, in how they come to be part of our knowledge. Prescriptive rules are taught at school, and because they are taught explicitly, we tend to be conscious of them, even if we haven't actually learned how to follow them. By contrast, we follow the rules of descriptive grammar consistently3 and effortlessly, yet without learning them at school. In fact, children have essentially mastered these rules on their own by first grade. Ordinarily, we are completely unconscious of the descriptive rules of language. If we do become conscious of them, it tends to be in connection with learning a foreign language whose descriptive grammar differs from that of English. In order to emphasize the difference between the unconscious way that we learn a native language (or several) in early childhood and the conscious way that we learn a foreign language later on in life, the first process is often called language acquisition rather than language learning.

As you consider the descriptive rules in (6), you might not find it all that surprising that a child raised in an English-speaking community would acquire the rule, say, that articles precede nouns. After all, all the child ever hears are articles and nouns in that order4. So what in the world would lead an English-speaking child to put the article and the noun in the other order? Isn't it just common sense that children learn their native language by imitating older speakers around them?

Well, yes and no. It is true that children learn some aspects of their native language by imitation and memorization. Children in English-speaking communities learn English words, children in Navajo-speaking communities learn Navajo words, children in Swahili-speaking communities learn Swahili words, and so on. But language acquisition isn't purely a process of memorization. In fact, given current human life spans, it couldn't possibly be!

To see this, let's consider a toy fragment of English that contains three-word sentences consisting of a noun, a transitive verb, and another noun, both ones like () that are sensible given the real world and ones like () that aren't, but that might be useful in fairy tale or science fiction contexts.

() a.   Cats detest peas.
b. Children eat tomatoes.
c. Cheetahs chase gazelles.

() a.   Peas detest cats. ("The secret life of peas")
b. Tomatoes eat children. ("The attack of the GM tomatoes")
c. Gazelles chase cheetahs. ("Gazelle's revenge")

For the sake of argument, let's assume a vocabulary of 1,000 nouns and 100 verbs. This gives us a list of 1,000 x 100 x 1,000, or 100 million, three-word sentences of the type in () and (). Numbers of this magnitude are difficult to put in human perspective, so let's estimate how long it would take a child to learn all the items on the list. Again for the sake of argument, let's assume that children can memorize sentences at the rate of one a second. The entire list of three-word sentences could then be memorized in 100 million seconds, which comes to 3.17 years. This may not sound like such a long time, but once we start adding vocabulary items, the number of sentences and the time that would have to be spent memorizing them quickly mushrooms. For instance, adding only 10 adjectives to the child's vocabulary would cause the number of five-word sentences of the form in () to grow to 10 billion.

() a.   Black cats detest green peas.
b. Happy children eat ripe tomatoes.
c. Hungry cheetahs chase fleet gazelles.

Even at the incredibly quick rate of one sentence per second that we're assuming, the list of all such five-word sentences would take a bit over 317 years to learn. Clearly, this is an absurd consequence. For instance, how could our memorious child ever come to know, as we plainly do, that the sentence in () is ungrammatical? In a memorization-based approach, the only way to determine this is to compare () to all of the 10 billion five-word sentences and to find that it matches none of them, which would take additional centuries beyond the time required to memorize the sentences.

() * Cats black detest peas green.

And even after all that time, a memorization-based language learner still wouldn't know what was wrong with ()!
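The estimates in this thought experiment are easy to verify. The following sketch simply recomputes them, using the vocabulary sizes and the one-sentence-per-second rate assumed in the text:

```python
# Back-of-the-envelope check of the memorization estimates in the text,
# assuming one memorized sentence per second.
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

nouns, verbs, adjectives = 1_000, 100, 10

# Three-word sentences: noun + transitive verb + noun
three_word = nouns * verbs * nouns
print(three_word)                               # 100000000
print(round(three_word / SECONDS_PER_YEAR, 2))  # 3.17 (years)

# Five-word sentences: adjective + noun + verb + adjective + noun
five_word = adjectives * nouns * verbs * adjectives * nouns
print(five_word)                                # 10000000000
print(round(five_word / SECONDS_PER_YEAR))      # 317 (years)
```

Note how adding a mere 10 adjectives multiplies the total by 100: each new word class multiplies, rather than adds to, the number of sentences to be memorized.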

In addition to this thought experiment, there is another reason to think that language acquisition isn't entirely based on rote memorization---namely, that children use what they hear of language as raw material to construct linguistic rules. How do we know this? We know because children sometimes produce rule-based forms that they have never heard before.

Rule-based word formation

One of the earliest demonstrations that children form linguistic rules, rather than simply imitating adult language use, was the wug experiment (Berko 1958). In it, the psycholinguist Jean Berko used invented words to examine (among other things) how children between the ages of 4 and 7 form plurals in English. She showed the children cards with simple line drawings of objects and animals and elicited plurals from them by reading them accompanying texts like the one in (10).

(10)     This is a wug. Now there is another one. There are two of them. There are two ___.

More than 75% of the children pluralized the invented words cra, lun, tor, and wug in exactly the same way that adults did in a control group: they added the sound -z to the word (Berko 1958:159-162).5 Since none of the children could have encountered the invented words before the experiment, their response clearly indicates that they had acquired a plural rule and were using it to produce the novel forms.

In the wug experiment, both the children being studied and the adults in the control group produced novel rule-based forms. Children also sometimes substitute novel rule-based forms for existing adult forms, producing overregular forms like comed or goed for irregular forms like came and went. Some further overregularizations are given in (9) (Marcus et al. 1992:148-149, based on Brown 1973).

(9) a.   beated, blowed, catched, cutted, doed, drawed, drived, falled, feeled, growed, holded, maked, sleeped, standed, sticked, taked, teached, throwed, waked, winned (Adam, between the ages of 2 and 5)
b. drinked, seed, weared (Eve, between the ages of 1 1/2 and 2)

Overregularizations don't amount to a large fraction of the forms that children produce overall (less than 5% in the case of past tense forms, according to Marcus et al. 1992:35), but they clearly show that even the acquisition of words doesn't boil down to rote memorization.

Question formation

In addition to morphological rules (which concern the structure of words), children also acquire syntactic rules (which concern the structure of sentences). These are of particular interest because children not only use or overgeneralize the same syntactic rules that adults use, but also form ones of their own. At the same time, however, children's syntactic rules don't differ in an arbitrary way from those of adults. Rather, they share certain abstract properties with adult rules, even when they differ from them.

Let's consider how young children form yes-no questions (as the name implies, yes-no questions are ones to which the expected answer is either `yes' or `no'). Some 3- to 5-year-olds form such questions from declarative sentences by copying the auxiliary element to the beginning of the sentence, as in (11) (Crain and Nakayama 1987:536). (We use the term `auxiliary element' as a convenient way of referring to be, can and other similar elements. See Modals and auxiliary verbs in English for detailed discussion.)

(11) a.   The girl is tall. ---> Is the girl is tall?
b. The red pig can stand on the house. ---> Can the red pig can stand on the house?

Eventually, the questions in (11) give way to those in (12), where we can think of the auxiliary element as having been moved rather than copied.

(12) a.   Is the girl ___ tall?
b. Can the red pig ___ stand on the house?

But now notice a striking indeterminacy, first pointed out by Chomsky 1971:26-27. When children produce questions as in (12), there is no way of telling whether they are using the adult rule for question formation in (13a) or the logically possible alternative rule in (13b).

(13) a.   Adult question formation rule: To form a question from a declarative sentence containing an auxiliary element, identify the subject, and invert the auxiliary with it.
b.   Logically possible alternative: To form a question from a declarative sentence containing an auxiliary element, identify the first auxiliary element, and move it to the beginning of the sentence.

Don't confuse `subject' with `simple subject.'

Subjects, in contrast to simple subjects, are possible responses to questions like Who is tall? and Who can stand on the house? So the subjects in (12) are the noun phrases the girl and the red pig. If the subject consists of a single word or of a clause, then the simple subject is identical to the subject; otherwise, the simple subject of a sentence is obtained by stripping the subject of any modifiers (yielding girl and pig as the simple subjects of (12)).

The notion of subject is basic to syntactic theory (see Chapter 3), but we will have no further use for the notion of simple subject.

Both rules in (13) give the same result for simple sentences, which are likely to form most of the data that young children attend to. Both rules also require children to identify auxiliary elements. However, the adult rule in addition requires children to identify the subject of the sentence by grouping together sequences of words such as the girl or the red pig into a single abstract structural unit. Because of this grouping requirement, the adult rule is called structure-dependent. By contrast, the rule in (13b) is structure-independent, requiring the child only to classify words according to their syntactic category and to count them, but not to group them into structural units. Since the rule in (13b) is simpler in the sense that it relies on both fewer and computationally less complex cognitive operations, it is reasonable to expect that at least some children would experiment with it in the course of acquiring question formation. Nevertheless, Chomsky predicted that children would not use structure-independent rules in the course of acquisition.

As we mentioned, both rules give the same result for simple sentences. So how could we possibly tell which of the two rules a child was actually using? Well, forming yes-no questions is not restricted to simple sentences. So although we can't tell which rule a child is using in connection with simple sentences, the rules in (13) give different results for complex sentences like (14), where the entire sentence contains the auxiliary is and the relative clause who was holding the plate contains a second auxiliary was.

(14)     The boy who was holding the plate is crying.

A child applying the structure-dependent rule to (14) would first identify the subject of the entire sentence (the boy who was holding the plate) and then invert the entire subject---including the relative clause and the auxiliary contained within it (was)---with the auxiliary of the entire sentence (is). On the other hand, a child applying the structure-independent rule would simply identify the first auxiliary (was) and move it to the beginning of the sentence. The two rules have very different results, as shown in (15).

(15) a.   Result of structure-dependent rule: Is [the boy who was holding the plate] ___ crying?
b. Result of structure-independent rule: Was the boy who ___ holding the plate is crying?
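The difference between the two rules in (13) can be made concrete with a toy sketch. It is a deliberate simplification: we hand the structure-dependent rule a pre-identified subject, since grouping words into a subject is precisely the structural work that the rule requires, while the structure-independent rule gets only a flat list of words and a list of auxiliaries.

```python
# Toy illustration (not a linguistic model) of the two candidate
# question formation rules in (13).
AUXILIARIES = {"is", "was", "can"}

def structure_dependent(subject_words, rest_words):
    # Adult rule (13a): invert the auxiliary with the entire subject.
    aux, *remainder = rest_words
    return [aux] + subject_words + remainder

def structure_independent(words):
    # Alternative rule (13b): front the linearly first auxiliary.
    i = next(i for i, w in enumerate(words) if w in AUXILIARIES)
    return [words[i]] + words[:i] + words[i + 1:]

subject = "the boy who was holding the plate".split()
rest = "is crying".split()

print(" ".join(structure_dependent(subject, rest)))
# is the boy who was holding the plate crying    (= (15a))
print(" ".join(structure_independent(subject + rest)))
# was the boy who holding the plate is crying    (= (15b))
```

Notice that the structure-independent rule is shorter and computationally simpler, exactly as the text observes; what rules it out for human learners is not complexity but the principle of structure dependence.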

Recall that Chomsky 1971 predicted that children would not use structure-independent rules, even though they are simpler than structure-dependent ones. This prediction was tested in an experiment with 3- to 5-year-old children by Crain and Nakayama 1987. In the experiment, the experimenter had the children pose yes-no questions to a doll (Jabba the Hutt from Star Wars). For instance, the experimenter would say to each child Ask Jabba if the boy who was holding the plate is crying. This task elicited various responses. Some children produced (15a), whereas others produced the copy question in (16a) or the restart question in (16b).

(16) a.   Is [the boy who was holding the plate] is crying?
b. Is [the boy who was holding the plate], is he crying?

Notice that although neither of the questions in (16) is based on the adult rule in (13a), the rules that the children used to produce them are structure-dependent in the same way that the adult rule is. This is because children who produced (16a) or (16b) must have identified the subject of the sentence, just like the children who produced the adult question in (15a). On the other hand, out of the 155 questions that the children produced, none were of the structure-independent type in (15b). Moreover, no child produced the structure-independent counterpart of (16a), shown in (17), which results from copying (rather than moving) the first auxiliary element in the sentence.

(17)     Was the boy who was holding the plate is crying?

In other words, regardless of whether a child succeeded in producing the adult question in (15a), every child in the experiment treated the sequence the boy who was holding the plate as a unit, thus confirming Chomsky's prediction.

Structure dependence

We have seen that young children are capable of forming and applying both morphological and syntactic rules. Moreover, as we have seen in connection with question formation, children do not immediately adopt the rules that adults use. Nevertheless, the syntactic rules that children postulate in the course of acquisition are a subset of the logically possible rules that they might experiment with in principle. In other words, children's syntactic rules are constrained by certain basic organizing principles of human language. As we have seen, one of these principles is structure dependence, which forces children learning language (and people using language more generally) to group appropriate words together into constituents rather than treating sentences as simple strings of words.

Intuitions about words belonging together

Evidence for structure dependence isn't restricted to data from child language acquisition. Further evidence comes from the intuitions that we as adults have that certain words in a sentence belong together, whereas others do not. For instance, in a sentence like (18), we have the strong intuition that the first the belongs with dog, but not with did, even though the is adjacent to both.

(18)     Did the dog chase the cat?

Similarly, the second the in (18) belongs with cat and not with chase. The doesn't always belong with the following word, though. For instance, in (19), the first the belongs with dog, just as in (18), but dog doesn't in turn belong with the second the.

(19)     Did the dog the children like chase the cat?

Words that belong together can sometimes be replaced by placeholder elements such as pronouns. This is illustrated in ().

The term `pronoun' is misleading since it suggests that pronouns substitute for nouns regardless of syntactic context. In fact, what pronouns substitute for is entire noun phrases. A less confusing term for them would be `pro-noun phrase,' but we'll continue to use the traditional term.

() a.   Did the dog chase the cat? --->   Did she chase him?
b.   Did the dog the children like chase the cat? --->   Did the dog they like chase him?

It's important to recognize that pronouns don't simply replace strings of words regardless of context. Just because a string like the dog is a constituent in ()a doesn't mean that it's always a constituent. We can see this by replacing the dog by a pronoun in ()b, which leads to an ungrammatical result.

()   Did the dog the children like chase the cat? ---> * Did she the children like chase the cat?

The ungrammaticality of () tells us that the and dog belong together less closely in () than in (). In (), the and dog combine directly, whereas in (), dog combines first with the relative clause, and only then with the.

In some sentences, we have the intuition that words belong together even when they are not adjacent. For instance, see and who in (20a) belong together in much the same way as see and Bill do in (20b).

(20) a.   Who will they see?
b. They will see Bill.

Finally, we can observe that there are various sorts of ways that words can belong together. For instance, in a phrase like the big dog, big belongs with dog, and we have the intuition that big modifies dog. On the other hand, the relation between see and Bill in (20b) isn't one of modification. Rather, we have the intuition that Bill is a participant in a seeing event.

In the course of this book, we will introduce more precise ways of expressing and representing intuitions like the ones just discussed. For the moment, however, what is important is simply that we have strong intuitions that words belong together in ways that go beyond adjacency.

Structural ambiguity

Another, particularly striking piece of evidence for structure dependence is the phenomenon of structural ambiguity. The classified advertisement in (21) is a humorous illustration.

(21)     Wanted: Man to take care of cow that does not smoke or drink.

World knowledge tells us that the intent of the advertiser is to hire a clean-living man to take care of a cow. But because of the way the advertisement is formulated, it also has an unintentionally comical interpretation---namely, that the advertiser has a cow that does not smoke or drink and that a man is wanted to take care of this clean-living cow. The intended and unintended interpretations describe sharply different situations; that is why we say that (21) is ambiguous, and not merely that it is vague. Moreover, the ambiguity of the sentence can't be pinned on a particular word, as it can be in the ambiguous sentences in (22).

(22) a.   As far as I'm concerned, any gender is a drag. (Patti Smith)
b. Our bikinis are exciting. They are simply the tops.

Examples like (22) are called instances of lexical ambiguity, because their ambiguity is based on their containing a lexeme that happens to have two distinct meanings. In (21), on the other hand, the words themselves have the same meanings in each of the two interpretations, and the ambiguity depends on how the reader groups the words in interpreting the sentence. In the intended interpretation, the relative clause that does not smoke or drink modifies man, whereas in the unintended interpretation it modifies cow.

To avoid any confusion, we should emphasize that we are considering structural ambiguity here from a purely descriptive perspective, focusing on what it tells us about the design features of human language and disregarding the practical aim of effective communication. If we were writing the advertisement ourselves, of course, we would be careful not to use (21), but to disambiguate it by means of an appropriate paraphrase. For the ordinary interpretation, in which the relative clause modifies man, we would place the relative clause next to the intended modifiee, as in (23a). The comical interpretation of (21), on the other hand, cannot be expressed unambiguously by moving the relative clause. If it were the desired interpretation, we would have to resort to a more drastic reformulation, such as (23b).

(23) a.   Wanted: Man that does not smoke or drink to take care of cow.
b. Wanted: Man to take care of nonsmoking, nondrinking cow.

Universal Grammar

Formal universals

Structure dependence is a general principle of the human language faculty (the part of the mind/brain that is devoted to language), often also referred to as Universal Grammar, especially when considered separately from any particular language. There are two sources of evidence for this. First, as we have seen, the syntactic rules that children form in the course of language acquisition, even when they are not the rules that adults use, are constrained by structure dependence. Second, even though it would be logically possible and computationally tractable for languages to have structure-independent question formation rules like those in (24), no known human language actually disregards structure in this way.

(24) a.   To form a question, switch the order of the first and second words in the corresponding declarative sentence.
The girl is tall. ---> Girl the is tall?
The blond girl is tall. ---> Blond the girl is tall?
b. To form a question, reverse the order of the words in the corresponding declarative sentence.
The girl is tall. ---> Tall is girl the?
The blond girl is tall. ---> Tall is girl blond the?

The principle of structure dependence is what is known as a formal universal of human language---a property shared by all human languages that is independent of the meanings of words. Formal universals are distinguished from substantive universals, which concern the substance, or meaning, of linguistic elements. An example of a substantive universal is the fact that all languages have indexical elements such as I, here, and now. These words have the special property that their meanings are fixed (they always denote the speaker, the speaker's location, and the time of speaking), but what they refer to varies from utterance to utterance, depending on who is speaking, and where and when.

Recursion

In addition to structure dependence, human language exhibits another formal universal: the property of recursion. A simple illustration of this property is the fact that it is possible for one sentence to contain another. For instance, the simple sentence in (25a) forms part of the complex sentence in (25b), and the resulting sentence can form part of a still more complex sentence. Recursive embedding is illustrated in (25) up to a level of five embeddings.

(25) a. She won.
b. The Times reported that
      [she won].
c. John told me that
      [the Times reported that
            [she won]].
d. I remember distinctly that
      [John told me that
            [the Times reported that
                  [she won]]].
e. They don't believe that
      [I remember distinctly that
            [John told me that
                  [the Times reported that
                        [she won]]]].
f. I suspect that
      [they don't believe that
            [I remember distinctly that
                  [John told me that
                        [the Times reported that
                              [she won]]]]].
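The recursive embedding in (25) can be mimicked by a short sketch (ours, not part of the text's formal apparatus; the list of embedding frames is our own, and capitalization is ignored). Each pass wraps the current sentence in the next frame, so a finite rule yields sentences of unbounded depth.

```python
# Recursive sentence embedding in the spirit of (25): each frame embeds
# the sentence built so far, innermost first.

frames = [
    "the Times reported that {}",
    "John told me that {}",
    "I remember distinctly that {}",
    "they don't believe that {}",
    "I suspect that {}",
]

def embed(sentence, depth):
    """Embed `sentence` under the first `depth` frames, innermost first."""
    for frame in frames[:depth]:
        sentence = frame.format(sentence)
    return sentence

print(embed("she won", 2))
# John told me that the Times reported that she won
```

Nothing in the procedure imposes an upper bound on `depth`; only practical limits like memory and patience do.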

Parameters

Formal universals like structure dependence and recursion are of particular interest to linguistics in the Chomskyan tradition. But of course, individual languages also differ from one another, and not just in the sense that they use different-sounding words. This means that Universal Grammar is not completely fixed, but allows some variation. The ways in which the grammars of languages can differ are called parameters.

One basic parameter concerns the order of verbs and their objects. Two orders are possible: verb-object (VO) or object-verb (OV), and human languages use either one or the other. As illustrated in (26) and (27), English and French are languages of the verb-object (VO) type, whereas Hindi, Japanese, and Korean are languages of the object-verb (OV) type.

(26) a. English
Peter read the book.
b. French
Pierre lisait      le  livre.
Pierre was.reading the book
`Pierre was reading the book.'
(27) a. Hindi
Peter-ne kitaab parh-ii.
b. Japanese
Peter-ga hon-o yon-da.
c. Korean
Peter-ka chayk-ul il-ess-ta.
Peter    book     read
`Peter read the book.'
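The idea that a parameter is a binary switch within an otherwise shared grammar can be pictured with a minimal sketch (ours; the function and parameter names are our own invention, not standard notation). The same verb-object combination is linearized differently depending on the setting.

```python
# The word-order parameter as a switch: the same structure (a verb and its
# object) is spelled out as VO or OV depending on the parameter setting.

def linearize_vp(verb, obj, order="VO"):
    """Linearize a verb and its object according to the word-order parameter."""
    return f"{verb} {obj}" if order == "VO" else f"{obj} {verb}"

print(linearize_vp("read", "the book", order="VO"))  # read the book
print(linearize_vp("read", "the book", order="OV"))  # the book read
```

On this picture, English and French have the switch set to VO, while Hindi, Japanese, and Korean have it set to OV; the underlying verb-object relation is the same in all five languages.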

Another parameter of Universal Grammar concerns the possibility, mentioned earlier, of separating a preposition from its object, or preposition stranding. (The idea is that the movement of the object of the preposition away from its ordinary position following the verb leaves the preposition high and dry.) The alternative to preposition stranding goes by the imaginative name of pied piping, by analogy to the Pied Piper of Hameln, who took revenge on the citizens of Hameln for mistreating him by luring the town's children away with him. In pied piping of the syntactic sort, the object of the preposition moves away from its usual position, just as in preposition stranding, but it manages to lure the preposition along with it. The examples we gave earlier for preposition stranding and pied piping are repeated in (30), now along with their appropriate labels.

(30) a. Preposition stranding: ok Which house does your friend live in?
b. Pied piping: ok In which house does your friend live?

Just as in English, preposition stranding and pied piping are both grammatical in Swedish. In fact, in Swedish, preposition stranding counts as prescriptively correct, and it is pied piping that is often frowned upon, on the grounds that it is artificial and stiff.

(31) a. Swedish ok
Vilket hus  bor   din  kompis i?
which   house lives your friend in
`Which house does your friend live in?'
b. ok I vilket hus bor din kompis?

On the other hand, preposition stranding is ungrammatical in French and Italian; that is, speakers of these languages reject examples like (32) as word salad. In these languages, only pied piping, as in (33), is grammatical.

(32) a. French *
Quelle maison est-ce que  ton  ami    habite dans?
which   house   is  it that your friend lives  in
Intended meaning: `Which house does your friend live in?'
b. Italian *
Quale casa abita il  tuo  amico  in?
which  house lives the your friend in
Intended meaning: `Which house does your friend live in?'
(33) a. French ok Dans quelle maison est-ce que ton ami habite?
b. Italian ok In quale casa abita il tuo amico?

Generative grammar

At the beginning of this chapter, we said that this book was an introduction to generative grammar from a Chomskyan perspective. So far, we have clarified our use of the term `grammar,' and we have indicated that a Chomskyan perspective on grammar is concerned both with the formal principles that all languages share and with the parameters that distinguish them. Let's now turn to the notion of a generative grammar.

()   A generative grammar is an algorithm for specifying, or generating, all and only the grammatical sentences in a language.

An algorithm, in turn, is an explicit, step-by-step procedure for accomplishing a task. If you have ever written a computer program, you have created an algorithm. Other everyday examples of algorithms include a recipe for sushi, a knitting pattern, the instructions for assembling an Ikea bookcase, or the steps to follow in balancing a checkbook.

An important point to keep in mind is that it is often difficult to construct an algorithm for even a seemingly trivial procedure. A quick way to gain an appreciation of this is to try to describe how to tie a bow. Like using language, tying a bow is a skill that we master around school age and that we perform more or less unconsciously thereafter, but describing how to do it is anything but easy. In an analogous way, constructing a generative grammar of English is a completely different task from speaking the language, and much more difficult!

Just like a cooking recipe, a generative grammar needs to specify the ingredients and procedures that are necessary for generating grammatical sentences. We won't introduce all of these in this first chapter, but in the remainder of the section, we'll introduce enough ingredients and procedures to give you a flavor of what's to come.
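The definition of a generative grammar as an algorithm can be made concrete with a toy example (ours, with a deliberately tiny vocabulary; the category labels follow the text, but everything else is a sketch, not a serious grammar of English). The procedure below generates all and only the word strings that its rules license.

```python
# A toy generative grammar: each category maps to its possible expansions.
# Anything that isn't a category is a vocabulary item (a terminal).

import itertools

grammar = {
    "Sentence": [["NounPhr", "VerbPhr"]],
    "NounPhr":  [["Det", "Noun"]],
    "VerbPhr":  [["TrVerb", "NounPhr"]],
    "Det":      [["the"], ["a"]],
    "Noun":     [["secretary"], ["letter"]],
    "TrVerb":   [["drafted"]],
}

def generate(category):
    """Yield every string of words that `category` can expand to."""
    if category not in grammar:       # a terminal: a vocabulary item
        yield [category]
        return
    for expansion in grammar[category]:
        # generate each daughter, then combine the results in order
        for parts in itertools.product(*(generate(d) for d in expansion)):
            yield [word for part in parts for word in part]

for words in generate("Sentence"):
    print(" ".join(words))
```

The grammar generates exactly sixteen sentences, including The secretary drafted a letter, and it generates nothing else: ungrammatical strings like slowly the letter simply never arise, because no rule licenses them.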

Elementary trees and substitution

The raw ingredients that sentences consist of are vocabulary items. These come in various syntactic categories, like noun, adjective, transitive verb, preposition, and so forth. Depending on their syntactic category, vocabulary items combine with one another to form constituents, which in turn each belong to their own syntactic category. For instance, determiners (a category that includes the articles a and the and the demonstratives this, that, these, and those) can combine with nouns to form noun phrases, but they can't combine with other syntactic categories like adverbs, verbs, or prepositions.

() a.   a house () a. * a slowly
b.   the cats b. * the went
c.   those books c. * those with

It's possible to represent the information contained in a constituent by using labeled bracketings. Each vocabulary item is enclosed in brackets that are labeled with the appropriate syntactic category. The constituent that results from combining vocabulary items is in turn enclosed in brackets that are labeled with the constituent's syntactic category. The labeled bracketings for the constituents in () are given in ().

() a.   [NounPhr [Det a ] [Noun house ] ]
b.   [NounPhr [Det the ] [Noun cats ] ]
c.   [NounPhr [Det those ] [Noun books ] ]

Noun phrases can in turn combine with other syntactic categories, such as prepositions or transitive verbs. Prepositions combine with a single noun phrase to form prepositional phrases. A transitive verb combines with one noun phrase to form a verb phrase, which in turn combines with a second noun phrase to form a complete sentence.

() a.   [PrepPhr [Prep on ] [NounPhr [Det the ] [Noun table ] ] ]
b.   [VerbPhr [TrVerb drafted ] [NounPhr [Det a ] [Noun letter ] ] ]
c.   [Sentence [NounPhr [Det the ] [Noun secretary ] ] [VerbPhr [TrVerb drafted ] [NounPhr [Det a ] [Noun letter ] ] ] ]
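Labeled bracketings translate straightforwardly into a machine-readable form. In the sketch below (our own encoding, not a standard one), a constituent is a nested list whose first element is the category label and whose remaining elements are the daughters; a short recursive function then prints the bracketing.

```python
# Labeled bracketings as nested lists: ["Category", daughter1, daughter2, ...],
# with vocabulary items as bare strings.

def bracketing(tree):
    """Render a nested-list tree as a labeled bracketing string."""
    if isinstance(tree, str):
        return tree
    label, *daughters = tree
    return "[" + label + " " + " ".join(bracketing(d) for d in daughters) + " ]"

sentence = ["Sentence",
            ["NounPhr", ["Det", "the"], ["Noun", "secretary"]],
            ["VerbPhr", ["TrVerb", "drafted"],
                        ["NounPhr", ["Det", "a"], ["Noun", "letter"]]]]

print(bracketing(sentence))  # matches the labeled bracketing for the sentence above
```

The recursion in `bracketing` mirrors the recursion in the constituent structure itself: a constituent is rendered by rendering its daughters, however deeply they are nested.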

Again, however, noun phrases don't combine with any and all syntactic categories. For instance, noun phrases can't combine with adverbs or determiners.

() a. * slowly the letter
b. * the this letter

As constituent structure grows more complex, labeled bracketings can grow difficult for humans to process. Because of this, it's often more convenient to use an alternative mode of representing constituent structure called tree diagrams, or trees for short. Trees represent exactly the same information as labeled bracketings, but the information is presented differently. Instead of enclosing an element in brackets that are labeled with a syntactic category, the category is placed immediately above the element and connected to it with a branch. The labeled bracketings that we have seen so far translate into the trees in ().6

() a.       b.       c.  
() a.       b.       c.  

Trees like those in () resemble dishes that are ready to serve; they don't provide a record of exactly how they were brought into being. We can provide such a record, however, by including combinatorial information for individual vocabulary items in the trees for the items. For example, prepositions and transitive verbs can be represented as trees with empty slots for noun phrases to fit into, as shown in ().

() a.       b.  

We'll refer to trees for vocabulary items like those in () as elementary trees. Since elementary trees are intended to represent the combinatorial possibilities of a vocabulary item, they generally contain unfilled nodes. Such nodes are called substitution nodes, and they are filled by a substitution operation, defined in ().

() a. Tree No. 1 has a substitution node of some syntactic category.
b. The topmost, or root, node in Tree No. 2 is of the same category.
c. The root node of Tree No. 2 is identified with the substitution node in Tree No. 1.
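The substitution operation can be sketched in code (again using our own nested-list encoding of trees, with a bare one-element list like ["NounPhr"] standing in for an unfilled substitution node; this is an illustration, not the text's formal definition).

```python
# Substitution: the root of tree2 is identified with the first matching
# substitution node (a bare one-element list) found in tree1.

import copy

def substitute(tree1, tree2):
    """Fill the first substitution node in tree1 whose category matches
    the root of tree2, returning a new combined tree."""
    tree1 = copy.deepcopy(tree1)
    root_cat = tree2[0]

    def fill(node):
        if isinstance(node, list):
            if len(node) == 1 and node[0] == root_cat:  # substitution node
                node.extend(tree2[1:])  # identify tree2's root with it
                return True
            return any(fill(daughter) for daughter in node[1:])
        return False

    fill(tree1)
    return tree1

prep_tree = ["PrepPhr", ["Prep", "on"], ["NounPhr"]]  # unfilled NP slot
np_tree = ["NounPhr", ["Det", "the"], ["Noun", "table"]]
print(substitute(prep_tree, np_tree))
# ['PrepPhr', ['Prep', 'on'], ['NounPhr', ['Det', 'the'], ['Noun', 'table']]]
```

Here `prep_tree` plays the role of Tree No. 1 and `np_tree` that of Tree No. 2: the NounPhr root of `np_tree` is identified with the empty NounPhr slot, yielding the prepositional phrase on the table.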

Elementary trees don't necessarily contain substitution nodes, though; ones that always play the role of Tree No. 2 in the substitution operation don't. The elementary tree for the noun in () below is an example.

Notice that there are two conceivable ways to arrive at the trees for noun phrases in (), depending on whether it is the noun that is taken as the substitution node, as in (), or the determiner, as in (). At this point, there is no reason to prefer one way over the other, but in what follows, we'll use () for reasons to be explained in Chapter **.

() a.       b.       () a.       b.  

In summary, a generative grammar as we've constructed it so far consists of a set of elementary trees, which represent the vocabulary items in a language and the range of their combinatorial possibilities, and a substitution operation, by means of which the elementary trees combine into larger constituents and ultimately into grammatical sentences. In later chapters, we will add two further operations to the grammar. The first, adjunction, will enable the grammar to generate sentences containing modifiers, such as adjectives or relative clauses modifying nouns (the big dog, the dog that the children like). The second, movement, will enable the grammar to express, among other things, both the similarities and the differences between declarative sentences (They will see Bill) and questions corresponding to them (Will they see Bill?, Who will they see?).

Grammaticality

Given the aim of a generative grammar---to generate all and only the grammatical sentences of a language---the notion of grammaticality is basic to syntactic theory, and so it is important to distinguish it from notions with which it is easily confused.

First of all, `grammatical' needs to be distinguished from `makes sense.' The sentences in (39) `make sense' in the sense that they are easily interpreted by speakers of English. Nevertheless, as indicated by the asterisks, they are not grammatical.

(39) a. * Me no like drink coffee.
b. * Is our children learning?
c. * I slept late because of that my alarm clock didn't go off.
d. * Gasoline was rationed while the war.

Conversely, there are English sentences that are grammatical, but that don't `make sense' because their meaning is anomalous. The `fairy tale' or `science fiction' sentences that we mentioned earlier are of this type. Two further examples are given in (40). Since such sentences are grammatical, they aren't preceded by an asterisk. However, if necessary, their anomalous status can be indicated by a prefixed pound sign.

(40) a. # Colorless green ideas sleep furiously. (Chomsky 1965:149) cf. Revolutionary new ideas appear infrequently.
b. # I plan to travel there last year. cf. I plan to travel there next year.

Second, `grammatical' must be distinguished from `easily processable by human beings.' This is because it turns out that certain well-motivated grammatical operations can be applied in ways that result in sentences that are virtually impossible for human beings to process. For instance, it is possible in English to modify a noun with a relative clause, and sentences containing nouns modified in this way, like those in (41), are normally perfectly acceptable and easily understood. (Here and in the following examples, the relative clauses are bracketed and the modified noun is underlined.)

(41) a.   The mouse [that the cat chased] escaped.
b.   The cat [that the dog scared] jumped out the window.

But now notice what happens when we modify the noun within the relative clause in (41a) with a relative clause of its own.

(42)     The mouse [that the cat [that the dog scared] chased] escaped.

Even though (42) differs from (41a) by only four additional words and one additional level of embedding, the result is virtually uninterpretable without pencil and paper. The reason is not that relative clause modification can't be applied more than once, since the variant of (41a) in (43), which contains exactly the same words and is exactly as long, is perfectly fine.

(43)     The mouse escaped [that the cat chased] [that the dog scared].

Rather, the unacceptability of (42) has to do with limitations on human short-term memory (Chomsky and Miller 1963:286, Miller and Chomsky 1963:471). Specifically, notice that in the acceptable (43), the subject of the main clause the mouse doesn't have to "wait" (that is, be kept active in short-term memory) for its verb escaped since the verb is immediately adjacent to the subject. The same is true for the subjects and verbs of each of the relative clauses (the cat and chased, and the dog and scared). In (42), on the other hand, the mouse must be kept active in memory, waiting for its verb escaped, for the length of the entire sentence. What is even worse, however, is that the period during which the mouse is waiting for escaped overlaps with the period during which the cat must be kept active, waiting for its verb chased. What makes (42) so difficult, then, is not the mere fact of recursion, but that two relations of exactly the same sort (the subject-verb relation) must be kept active in memory at the same time. In none of the other sentences in (41)-(43) is such double activation necessary. For instance, in (41a), the mouse must be kept active for the length of the relative clause, but the subject of the relative clause (the cat) needn't be kept active since it immediately precedes its verb chased.

A final point to bear in mind is that the sentences of a language consist of an expression (a sequence of words) paired with a particular interpretation. Grammaticality is always determined with respect to such pairings of form and meaning. An important consequence of this is that a particular sequence can be grammatical under one interpretation, but not under another. (), for instance, is grammatical under an object-subject-verb (OSV) interpretation (that is, when it is interpreted as Tom hired Sue). On this interpretation, Sue ordinarily receives a special intonation indicating focus or contrast (in writing, Sue would also ordinarily be set off from the rest of the sentence by a comma).

()     Sue Tom hired.

But () is ungrammatical under a subject-object-verb (SOV) interpretation (that is, when the sentence is interpreted as Sue hired Tom). In other words, the grammaticality of () depends on whether its interpretation is analogous to ()a or ()b.

() a. ok Her, he hired. (The others, he didn't even call back.)
b. * She him hired.

E-language versus I-language

**


Notes

1. It's also possible to overzealously apply rules like those in (2) where they shouldn't be applied, a phenomenon known as hypercorrection. Two common instances are illustrated in (i).

(i) a. There is the guy whom I think took her to the party. hypercorrect (the relative pronoun is not an object, but the subject of took her to the party);
should be: the guy who I think took her to the party; cf. the guy { who, *whom } took her to the party, I think.
b. This is strictly between you and I. hypercorrect (the second pronoun is not part of a subject, but part of the object of the preposition between);
should be: between you and me

2. The prescriptive rule is actually better stated as "Don't separate a preposition from its object," since the traditional formulation invites exchanges like ().

() A: Who are you going to the party with?
B: Didn't they teach you never to end a sentence with a preposition?
A: Sorry, let me rephrase that. Who are you going to the party with, Mr. Know-it-all?

3. ** LABOV STATS

4. Actually, that's an oversimplification. Not all the articles and nouns the child hears are in that order. To see why, carefully consider the underlined sentence in this footnote.

5. When children didn't respond this way, they either didn't respond at all, or they simply repeated the singular of the invented word. It's not clear what to make of the last response, since the children might have either intended the repetition as a plural (cf. deer and sheep) or been stumped by the experimental task.

6. Online corpora that are annotated with syntactic structure, such as the Penn Treebank, the Penn-Helsinki Corpus of Middle English, and others like them, tend to use labeled bracketing because the resulting files consist entirely of ASCII characters and are easy to search. The readability of such corpora for humans can be improved by suitable formatting of the labeled bracketing itself or by translating bracketed structures into tree diagrams.


Exercises and problems

Exercise 1.**

In the text, we give three descriptive rules of English that are violated by the sentences in (5). As we mentioned, there is a further rule that is violated. Formulate this fourth descriptive rule in a sentence or two.


Exercise 1.**

In a sentence or two, formulate a descriptive rule concerning subject-verb agreement in Belfast English (data from Henry 1995, chapter 2).

(1) a. ok The girl is late.     (2) a. * The girl are late.
b. ok She is late.     b. * She are late.
c. ok Is { the girl, she } late?     c. * Are { the girl, she } late?
(3) a. ok The girls are late.     (4) a. ok The girls is late.
b. ok They are late.     b. * They is late.
c. ok Are { the girls, they } late?     c. * Is { the girls, they } late?


Exercise 1.**

For each of the ambiguous examples in (1), explain whether the example is lexically or structurally ambiguous, or a mixture of both. (Material in parentheses is not itself ambiguous.)

(1) a.   British left waffles on Falkland Islands
b. Complaints about NBA referees growing ugly
c. Drunk gets nine months in violin case
d. Enraged cow injures farmer with ax
e. Hershey bars protest
f. I most enthusiastically recommend this candidate with no qualifications whatsoever.
g. In "What Women Want" (2000), Mel Gibson plays a man who develops the ability to understand what women are thinking after a freak accident.
h. Iraqi head seeks arms
i. Lawyers give poor free legal advice
j. March planned for next August
k. One morning I shot an elephant in my pajamas. (How he got into my pajamas I'll never know.) (Groucho Marx, in Animal Crackers)
l. Prostitutes appeal to Pope
m. Reporter's telegram: How old Cary Grant?
(Grant's reply: Old Cary Grant fine.)
n. Stolen painting found by tree
o. Teacher strikes idle kids
p. The bride was wearing an old lace gown that fell to the floor as she came down the aisle.
q. Time flies like an arrow. Fruit flies like a banana. (another Groucho Marx quote)
r. Two sisters reunited after 18 years in checkout counter
s. (Q. What did the Zen master say to the guy at the hot dog stand?)
A. Make me one with everything.


Exercise 1.**

In the text, we showed that sentences are recursive categories. Show that noun phrases and prepositional phrases are recursive categories as well.


Exercise 1.**

Which, if any, of the sentences in (1)-(4) are ungrammatical? Which, if any, are semantically anomalous? Briefly explain.

(1) a.   They decided to go tomorrow yesterday.
b. They decided to go yesterday tomorrow.
(2) a. They decided yesterday to go tomorrow.
b. They decided tomorrow to go yesterday.
(3) a. Yesterday, they decided to go tomorrow.
b. Tomorrow, they decided to go yesterday.
(4)     They decided to go yesterday yesterday.


Problem 1.**

Are structure dependence and recursion equally basic, or is one more basic than the other? (In other words, is structure dependence possible without recursion, or recursion without structure dependence?) Explain.