1 Foundational issues

This book is an introduction to generative grammar from a Chomskyan perspective. By the time you finish this chapter, you will have a clearer understanding of what we mean by this sentence, and by the time you finish the entire book, your understanding of it will be clearer and deeper still. But for the moment, you have probably gained the impression that this book is about grammar of some sort. And right there, we have a problem. The problem is that there is an everyday sense of the term 'grammar' and a quite different sense in which the term is used in linguistics.

Prescriptive versus descriptive grammar

In the everyday sense, 'grammar' refers to a collection of rules concerning what counts as socially acceptable and unacceptable language use. Some of these rules, like the ones in (1), make reference to particular words and apply to both spoken and written language.

(1)	a.		Don't use ain't.
	b.		Don't use seen as the past tense of see (as in I seen him at the party last night).

But mainly, the rules in question concern the proper composition of sentences in written language. You may recall being taught rules at school like those in (2).

(2)	a.	Don't end a sentence with a linking verb.
	b.	Don't end a sentence with a preposition.
	c.	Don't split infinitives.
	d.	Don't start a sentence with a conjunction.
	e.	Don't use a plural pronoun to refer back to a singular noun like everyone, no-one, someone, and the like.
	f.	Don't use an object pronoun for a subject pronoun in a conjoined subject.
	g.	Don't use contractions.
	h.	Don't use dangling participles.
	i.	Don't use sentence fragments.
	j.	Use whom, not who, as the object of a verb or preposition.

Someone who composes sentences in accordance with rules like those in (2) is said to have good grammar, whereas someone said to have bad grammar doesn't apply the rules when they ought to be applied ¹ and so produces sentences like (3).

(3)	a.		Over there is the guy who I went to the party with.	- violates (2b), (2j)
	b.		Bill and me went to the store.	- violates (2f)

From the amount of attention that people devote to rules like those in (1) and (2), it is easy to get the impression that they are the only linguistic rules there are. But it is also easy to see that that can't be so. The reason is that even people who don't follow the rules in (1) and (2) don't produce rampantly variable, confusing word salad. For instance, even people who invariably produce sentences like (3) do not produce the likes of (4).

(4)	a.	Over there is guy the who I went to party the with.
	b.	Over there is the who I went to the party with guy.
	c.	Bill and me the store to went.

The sentences in (3) may be instances of bad grammar in the everyday sense, but they are still English sentences. By contrast, we don't need to rely on school rules to tell us that the examples in (4) are not English sentences - even though they contain exactly the same English words as the sentences in (3).

Since native speakers of English do not produce a variable mishmash of words of the sort in (4), there must be another type of rules according to which sentences are composed. We can determine what some of them are by taking a closer look at the sequences in (4). Why exactly is it that they are word salad? In (4a), the article the is in the wrong order with respect to the nouns that it belongs with, guy and party. In (4b), the relative clause (who I went to the party with) is in the wrong order with respect to the noun that it modifies (guy). In (4c), the preposition to is in the wrong order with respect to its object (the store). In other words, the sentences in (4) do not follow the rules in (5).

(5)	a.	Articles precede nouns.
	b.	Relative clauses follow the noun that they modify.
	c.	Prepositions precede their objects.

(There's a further rule that's not followed in (4), which you are asked to formulate in Exercise 1.1.)

Rules like those in (5) have a different intention than those in (2). The rules in (2) are prescriptive; those in (5) are descriptive. Rules of prescriptive grammar have the same status as rules of etiquette (like table manners or dress codes) or the laws of society, which divide the spectrum of possible human behavior into socially acceptable or legal behavior, on the one hand, and socially unacceptable or illegal behavior, on the other. Rules of prescriptive grammar make statements about how people ought to use language, and often they are framed as negative injunctions ("Thou shalt not split infinitives", on a par with "Thou shalt not steal"). In contrast, rules of descriptive grammar have the status of scientific observations, and they are intended as insightful generalizations about the way that speakers use language in fact, rather than about the way that they ought to use it. Descriptive rules are more general and more fundamental than prescriptive rules in the sense that all sentences of a language are formed in accordance with them. A useful way to think about the descriptive rules of a language (to which we return in more detail below) is that they produce, or generate, all the sentences of a language. The prescriptive rules can then be thought of as filtering out some (relatively minute) portion of the entire output of the descriptive rules as socially unacceptable.

In syntax, as in modern linguistics more generally, we adopt a resolutely descriptive perspective concerning language. In particular, when linguists say that a sentence is grammatical, we don't mean that it is correct from a prescriptive point of view, but rather that it conforms to descriptive rules like those in (5). In order to indicate that a sequence of words or morphemes is ungrammatical in this descriptive sense, we prefix it with an asterisk. Grammatical sentences are usually not specially marked, but sometimes we prefix them with a checkmark (✓) for clarity. These conventions are illustrated in (6) and (7).

(6)	a.	*	Over there is guy the who I went to party the with.	(= (4a))
	b.	*	Over there is the who I went to the party with guy.	(= (4b))
(7)	a.	✓	Over there is the guy who I went to the party with.	(= (3a))
	b.	✓	Over there is the guy with whom I went to the party.

Prescriptive grammar is based on the idea that there is a single right way to do things. When there is more than one way of saying something, prescriptive grammar is generally concerned with declaring one (and only one) of the variants to be correct. The favored variant is usually justified as being better (whether more logical, more euphonious, or more desirable on some other grounds) than the deprecated variant. In the same situation of linguistic variability, descriptive grammar is content simply to document the variants - without passing judgment on them.

For instance, consider the variable subject-verb agreement pattern in (8).

(8)	a.		There	's	some boxes left on the porch.
	b.		There	are	some boxes left on the porch.

In (8a), the singular verb is (contracted to 's) agrees in number with the preverbal expletive subject there (in red), whereas in (8b), the plural verb are agrees with the postverbal logical subject some boxes (in blue). The color of the verb indicates which of the two subjects it agrees with.

The prescriptive and descriptive rules concerning this pattern are given in (9). The differences between the two rules are emphasized by underlining.

(9)		In a sentence containing both the singular expletive subject there and a plural logical subject ...
	a.	Prescriptive rule:	... the verb should agree in number with the logical subject.
	b.	Descriptive rule:	... the verb can agree in number with either the expletive subject or the logical subject.

To take another example, let's consider the prescriptive rule that says, "Don't end a sentence with a preposition."² A prescriptivist might argue that keeping the preposition (in italics) together with its object (in boldface), as in (10a), makes sentences easier to understand than does separating the two, as in (10b).

(10)	a.		With which friend did you go to the party?
	b.		Which friend did you go to the party with?

But by that reasoning, (11a), where the verb and its object are adjacent, is preferable to (11b), where they are not. In fact, however, (11a) is completely ungrammatical in English.

(11)	a.	*	Adopt which cat did your friend?
	b.	✓	Which cat did your friend adopt?

It is important to understand that there is no semantic or other conceptual reason that prepositions can be separated from their objects in English, but that verbs can't. From a descriptive perspective, the grammaticality contrast between (10a) and (11a) is simply a matter of fact, irreducible to more basic considerations (at least given our present state of knowledge). (12) highlights the difference between the relevant prescriptive and descriptive rules.

(12)		When the object of a preposition appears in a position other than its ordinary one (as in a question), ...
	a.	Prescriptive rule:	... it should be preceded by the preposition.
	b.	Descriptive rule:	... it can either be preceded by the preposition, or stand alone, with the preposition remaining in its ordinary position.

The contrasting attitude of prescriptive and descriptive grammar towards linguistic variation has a quasi-paradoxical consequence: namely, that prescriptive rules are never descriptive rules. The reason for this has to do with the way that social systems (not just language) work. If everyone in a community consistently behaves in a way that is socially acceptable in some respect, then there is no need for explicit prescriptive rules to ensure the behavior in question. It is only when behavior that is perceived as socially unacceptable becomes common that prescriptive rules come to be formulated to keep the unacceptable behavior in check. For example, if every customer entering a store invariably wears both a shirt and shoes, there is no need for the store owner to put up a sign that says "No shirt, no shoes, no service." Conversely, it is precisely at illegal dump sites that we observe "No dumping" signs. In an analogous way, in the domain of language use, rules of prescriptive grammar are only ever formulated in situations where linguistic variation is common. But being prescriptive, they cannot treat all of the occurring variants as equally acceptable - with the result that they can't ever be descriptive.

Rule formation and syntactic structure in language acquisition

As we have just seen, prescriptive and descriptive rules of grammar differ in intention. In addition, they differ in how they come to be part of a speaker's knowledge. Prescriptive rules are taught at school, and because they are taught, people tend to be conscious of them, even if they don't actually follow them. By contrast, we follow the rules of descriptive grammar consistently ³ and effortlessly, yet without learning them at school. In fact, children have essentially mastered these rules on their own by first grade. Ordinarily, we are completely unconscious of the descriptive rules of language. If we do become conscious of them, it tends to be in connection with learning a foreign language whose descriptive grammar differs from that of our native language. In order to emphasize the difference between the unconscious way that we learn a native language (or several) in early childhood and the conscious way that we learn a foreign language later on in life, the first process is often called language acquisition rather than language learning.

As you consider descriptive rules like those in (5), you might not find it all that surprising that a child raised in an English-speaking community would acquire, say, the rule that articles precede nouns. After all, you might say, all the child ever hears are articles and nouns in that order. (Hmmm, though, actually... - see the footnote.)⁴ So why would it ever occur to such a child to put the article and the noun in the other order? Isn't it just common sense that children learn their native language by imitating older speakers around them?

Well, yes and no. It is true that children learn some aspects of their native language by imitation and memorization. Children in English-speaking communities learn English words, children in Navajo-speaking communities learn Navajo words, children in Swahili-speaking communities learn Swahili words, and so on. But language acquisition isn't purely a process of memorization. In fact, given current human life spans, it couldn't possibly be!

A thought experiment

To see this, let's consider a toy version of English that contains three-word sentences consisting of a noun, a transitive verb, and another noun. The toy version contains sentences like (13) that are sensible given the real world as well as sentences like (14) that aren't, but that might be useful in fairy tale or science fiction contexts.

(13)	a.	Cats detest lemons.	(14)	a.	Lemons detest cats. ("Secret life of citrus fruits")
	b.	Children eat tomatoes.		b.	Tomatoes eat children. ("Attack of the genetically modified tomatoes")
	c.	Cheetahs chase gazelles.		c.	Gazelles chase cheetahs. ("Avenger gazelle")

Again for the sake of argument, let's assume a (small) vocabulary of 1,000 nouns and 100 verbs. This gives us a list of 1,000 x 100 x 1,000 (= 100 million) three-word sentences of the type in (13) and (14). Numbers of this magnitude are difficult to put in human perspective, so let's estimate how long it would take a child to learn all the sentences on the list. Again, for the sake of argument, let's assume that children can memorize sentences quickly, at a rate of one sentence a second. The entire list of three-word sentences could then be memorized in 100 million seconds, which comes to 3.17 years. So far, so good. However, the minute we start adding complexity to Toy English, the number of sentences and the time it would take to memorize them quickly mushroom. For instance, adding only 10 adjectives to the child's vocabulary would cause the number of five-word sentences of the form in (15) to grow to 10 billion (100 million x 10 x 10).

(15)	a.	Black cats detest green peas.
	b.	Happy children eat ripe tomatoes.
	c.	Hungry cheetahs chase speedy gazelles.

Even at the quick rate of one sentence per second that we're assuming, the list of all such five-word sentences would take a bit over 317 years to learn. Clearly, this is an absurd consequence. For instance, how could our memorious child ever come to know, as every English speaker plainly does, that the sentence in (16) is ungrammatical? If grammatical knowledge were based purely on rote memorization, the only way to determine this would be to compare (16) to all of the 10 billion five-word sentences and to find that it matches none of them.

(16)		*	Cats black detest peas green.

And even after performing the comparison, our fictitious language learner still wouldn't have the faintest clue as to why (16) is ungrammatical!
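The arithmetic behind this thought experiment can be checked with a short Python sketch. The vocabulary sizes and the rate of one sentence per second come from the text above; the function and variable names are our own, and a year is assumed to have 365 days.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def years_to_memorize(num_sentences, sentences_per_second=1):
    """Time needed to memorize a list of sentences by rote, in years."""
    return num_sentences / sentences_per_second / SECONDS_PER_YEAR


NOUNS, VERBS, ADJECTIVES = 1_000, 100, 10

# Noun - verb - noun, as in (13) and (14).
three_word = NOUNS * VERBS * NOUNS
# Adjective - noun - verb - adjective - noun, as in (15).
five_word = ADJECTIVES * NOUNS * VERBS * ADJECTIVES * NOUNS

print(three_word, round(years_to_memorize(three_word), 2))  # 100000000 3.17
print(five_word, round(years_to_memorize(five_word), 1))    # 10000000000 317.1
```

Adding a single slot with ten possible fillers multiplies the size of the list by ten, so two adjective slots multiply both the list and the memorization time by a hundred.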

In addition to this thought experiment with its comically absurd consequences, there is another reason to think that language acquisition isn't entirely based on rote memorization - namely, that children use what they hear of language as raw material to construct linguistic rules. How do we know this? We know because children sometimes produce rule-based forms that they have never heard before.

Rule-based word formation

One of the earliest demonstrations that children acquire linguistic rules, rather than simply imitating the forms of adult language, was the well-known wug experiment (Berko 1958). In it, the psycholinguist Jean Berko used invented words to examine (among other things) how children between the ages of 4 and 7 form plurals in English. She showed the children cards with simple line drawings of objects and animals and elicited plurals from them by reading them accompanying texts like (17).

(17)		This is a wug. Now there is another one. There are two of them. There are two ___.

More than 75% of the children pluralized the invented words cra, lun, tor, and wug in exactly the same way that adults did in a control group: they added the sound -z to the word (Berko 1958:159-162).⁵ Since none of the children had encountered the invented words before the experiment, their response clearly indicates that they had acquired a plural rule and were using it to produce the novel forms.

Children are also observed to produce novel rule-based forms instead of existing irregular adult forms (for instance, comed or goed instead of came or went). This process, which is known as overregularization, is further illustrated in (18) (Marcus et al. 1992:148-149, based on Brown 1973).

(18)	a.		beated, blowed, catched, cutted, doed, drawed, drived, falled, feeled, growed, holded, maked, sleeped, standed, sticked, taked, teached, throwed, waked, winned (Adam, between the ages of 2 and 5)
	b.		drinked, seed, weared (Eve, between the ages of 1 1/2 and 2)

Overregularized forms don't amount to a large fraction of the forms that children produce overall (less than 5% in the case of past tense forms, according to Marcus et al. 1992:35), but they are important because they clearly show that even the acquisition of words can't be completely reduced to rote memorization.
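One way to picture overregularization is as a general rule that applies whenever no memorized irregular form is available. The Python sketch below only illustrates that idea; it is not a model taken from the studies cited above. The spelling rule (add -d after a final e, otherwise -ed) and all names are simplifying assumptions, and the sketch ignores consonant doubling of the kind seen in cutted or winned.

```python
# Irregular past tense forms, which have to be memorized one by one.
IRREGULAR_PAST = {"come": "came", "go": "went", "see": "saw", "drink": "drank"}


def regular_past(verb):
    """Simplified spelling of the regular rule: -d after a final e, else -ed."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb, memorized):
    """Use a memorized form if there is one; otherwise fall back on the rule."""
    return memorized.get(verb, regular_past(verb))


print(past_tense("go", IRREGULAR_PAST))  # went
print(past_tense("go", {}))              # goed
print([past_tense(v, {}) for v in ["come", "see", "drink", "take"]])
# ['comed', 'seed', 'drinked', 'taked']
```

With an empty store of memorized forms, the rule alone yields forms like those in (18), none of which the child could have copied from adult speech.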

Question formation

In addition to morphological rules (which concern the structure of words), children also acquire syntactic rules (which concern the structure of sentences). Some of these rules are of particular interest because they differ from the corresponding adult rules that the children eventually acquire. At the same time, however, the children's novel rules don't differ from the rules of the adult grammar in completely arbitrary ways. Rather, the children's rules share certain abstract properties with the adult rules, even when they differ from them.

To see this, let's consider how young children form yes-no questions. Some 3- to 5-year-olds form such questions from declarative sentences by copying the auxiliary element to the beginning of the sentence, as in (19) (Crain and Nakayama 1987:536). (We use the term 'auxiliary element' as a convenient cover term for elements that invert with the subject in (adult) English questions, like forms of the verb to be or modals like can. See Modals and auxiliary verbs in English for more details.)

(19)	a.		The girl is tall.	→	Is the girl is tall?
	b.		The red pig can stand on the house.	→	Can the red pig can stand on the house?

In the course of language acquisition, the questions in (19) are eventually replaced by those in (20), where we can think of the auxiliary element as having been moved rather than copied.

(20)	a.		Is the girl ___ tall?
	b.		Can the red pig ___ stand on the house?

But now notice a striking indeterminacy, first pointed out by Chomsky 1971:26-27. When children produce questions like those in (20), there is no way of telling whether they are using the adult rule for question formation in (21a) or the logically possible alternative rule in (21b).

(21)	a.		Adult question formation rule:	To form a question from a declarative sentence containing an auxiliary element, find the subject of the sentence, and invert the subject and the auxiliary.
	b.		Logically possible alternative:	To form a question from a declarative sentence containing an auxiliary element, find the first auxiliary element, and move it to the beginning of the sentence.

Don't confuse 'subject' with 'simple subject.'

Subjects, in contrast to simple subjects, are possible responses to questions like Who is tall? and Who can stand on the house? The subjects in (20) are the noun phrases the girl and the red pig.

If the subject consists of a single word or a clause, then the simple subject is identical to the subject; otherwise, the simple subject of a sentence is obtained by stripping the subject of any modifiers (yielding girl and pig as the simple subjects of (20)). The notion of subject is basic to syntactic theory, but we will have no further use for the notion of simple subject.

Both rules in (21) give the same result for simple sentences, which are likely to form most of the data that young children attend to. Both rules also require children to identify auxiliary elements. However, the adult rule additionally requires children to identify the subject of the sentence by grouping together sequences of words like the girl or the red pig into a single abstract structural unit. Because of this grouping requirement, the adult rule is called structure-dependent. By contrast, the alternative rule in (21b) is not structure-dependent, since it requires the child only to classify words according to their syntactic category (Is this word an auxiliary element?), but not to group the words into structural units. The rule in (21b) is simpler in the sense that it relies on fewer, as well as computationally less complex, cognitive operations, and children might reasonably be expected to experiment with it in the course of acquiring question formation. Nevertheless, Chomsky 1971 predicted that children would use only structure-dependent rules in the course of acquisition.

As we mentioned, both rules give the same result for simple sentences. So how could we possibly tell which of the two rules a child was actually using? Well, forming yes-no questions is not restricted to simple sentences. So although we can't tell which rule a child is using in the case of simple sentences like (19), the rules in (21) give different results for a complex sentence like (22), which contains a relative clause (who was holding the plate).

(22)		The boy who was holding the plate is crying.

In particular, the sentence in (22) contains two auxiliary elements - one (was) for the relative clause, and another one (is) for the entire sentence (the so-called matrix sentence, which contains the relative clause). A child applying the structure-dependent question formation rule to (22) would first identify the subject of the matrix sentence (the boy who was holding the plate) and then invert the entire subject - including the relative clause and the auxiliary contained within it (was) - with the matrix auxiliary (is). On the other hand, a child applying the structure-independent rule would identify the first auxiliary (was) and move it to the beginning of the sentence. As shown in (23), the two rules have very different results.

(23)	a.	Structure-dependent rule:
		[ The boy who was holding the plate ] is crying.	→ Is [the boy who was holding the plate] ___ crying?
	b.	Structure-independent rule:
		The boy who was holding the plate is crying.	→ Was the boy who ___ holding the plate is crying?
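The contrast in (23) can be made concrete with a small Python sketch. The two functions are hypothetical illustrations of the rules in (21), not an implementation from the literature. The list of auxiliary elements is an assumption limited to the examples in this chapter, and the subject is grouped by hand, since identifying it automatically would require a parser.

```python
AUXILIARIES = {"is", "was", "can"}  # assumption: only the examples' auxiliaries


def structure_independent(words):
    """Rule (21b): move the first auxiliary in the word string to the front."""
    for i, word in enumerate(words):
        if word in AUXILIARIES:
            return [word] + words[:i] + words[i + 1:]
    raise ValueError("no auxiliary element found")


def structure_dependent(subject, predicate):
    """Rule (21a): invert the whole subject with the auxiliary that follows it.

    The subject is passed in as an already grouped unit; the predicate is
    assumed to begin with the matrix auxiliary.
    """
    auxiliary, rest = predicate[0], predicate[1:]
    if auxiliary not in AUXILIARIES:
        raise ValueError("predicate does not begin with an auxiliary element")
    return [auxiliary] + subject + rest


def as_question(words):
    """Join lowercase words, capitalize the first letter, add a question mark."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "?"


subject = "the boy who was holding the plate".split()
predicate = "is crying".split()

print(as_question(structure_dependent(subject, predicate)))
# Is the boy who was holding the plate crying?
print(as_question(structure_independent(subject + predicate)))
# Was the boy who holding the plate is crying?
```

For a simple sentence such as the girl is tall, both functions return Is the girl tall?, which is why simple sentences alone cannot reveal which rule a speaker is using.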

Recall that Chomsky predicted that children would not use structure-independent rules, even though they are (in some sense) simpler than structure-dependent ones. This prediction was tested in an experiment with 3- to 5-year-old children by Crain and Nakayama 1987. In the experiment, the experimenter had the children pose yes-no questions to a doll (Jabba the Hutt from Star Wars). For instance, the experimenter would say to each child Ask Jabba if the boy who was holding the plate is crying. This task elicited various responses. Some children produced the adult question in (23a), whereas others produced the copy question in (24a) or the restart question in (24b).

(24)	a.		Is [the boy who was holding the plate] is crying?
	b.		Is [the boy who was holding the plate], is he crying?

Although neither of the questions in (24) uses the adult rule in (21a), the rules that the children used to produce them are structure-dependent in the same way that the adult rule is. This is because children who produced (24a) or (24b) must have identified the subject of the sentence, just like the children who produced (23a). Out of the 155 questions that the children produced, none were of the structure-independent type in (23b). Moreover, no child produced the structure-independent counterpart of (24a), shown in (25), which results from copying (rather than moving) the first auxiliary element in the sentence.

(25)		Was the boy who was holding the plate is crying?

In other words, regardless of whether a child succeeded in producing the adult question in (23a), every child in the experiment treated the sequence the boy who was holding the plate as a structural unit, thus confirming Chomsky's prediction.

More evidence for syntactic structure

We have seen that young children are capable of forming and applying both morphological and syntactic rules. Moreover, as we have seen in connection with question formation, children do not immediately acquire the rules of the adult grammar. Nevertheless, the syntactic rules that children are observed to use in the course of acquisition are a subset of the logically possible rules that they might postulate in principle. In particular, as we have just seen, children's syntactic rules are structure-dependent, even when they differ from the target adult rules. Another way of putting this is that the objects that syntactic rules operate on (declarative sentences in the case of the question formation rule) are not simply strings of words, but rather groups of words that belong together, so-called syntactic constituents.

Intuitions about words belonging together

Evidence for syntactic structure isn't restricted to data from child language acquisition. Further evidence comes from the intuitions that speakers, whether adults or (older) children, have that certain words in a sentence belong together, whereas others do not. For instance, in a sentence like (26), we have the strong intuition that the first the belongs with dog, but not with did, even though the is adjacent to both.

(26)		Did the dog chase the cat?

Similarly, the second the in (26) belongs with cat and not with chase. But a word doesn't always belong with the following word. For instance, in (27), dog belongs with the preceding the, not with the following the.

(27)		Did the dog the children like chase the cat?

Words that belong together can sometimes be replaced by placeholder elements such as pronouns, as illustrated in (28).

(28)	a.		Did the dog chase the cat?	→		Did she chase him?
	b.		Did the dog the children like chase the cat?	→		Did the dog they like chase him?

Strictly speaking, the term 'pronoun' is misleading since it suggests that pronouns substitute for nouns regardless of syntactic context. In fact, pronouns substitute for noun phrases (as will be discussed in more detail in Chapter 2). A less confusing term would be 'pro-noun phrase,' but we'll continue to use the traditional term.

It's important to recognize that pronouns don't simply replace strings of words regardless of context. Just because a string like the dog is a constituent in (28a) doesn't mean that it's always a constituent. We can see this by replacing the dog by a pronoun in (28b), which leads to the ungrammatical result in (29).

(29)		Did the dog the children like chase the cat?	→	*	Did she the children like chase the cat?

The ungrammaticality in (29) is evidence that the and dog belong together less closely in (28b) than in (28a). In particular, in (28b), dog combines with the relative clause, and the combines with the result of this combination, not with dog directly, as it does in (28a). (We will consider the internal structure of noun phrases more closely in Chapter 5.)

In some sentences, we have the intuition that words belong together even when they are not adjacent. For instance, see and who in (30a) belong together in much the same way as see and Bill do in (30b).

(30)	a.		Who will they see?
	b.		They will see Bill.

Finally, we can observe that there are various sorts of ways that words can belong together. For instance, in a phrase like the big dog, big belongs with dog, and we have the intuition that big modifies dog. On the other hand, the relation between see and Bill in (30b) isn't one of modification. Rather, we have the intuition that Bill is a participant in a seeing event.

In the course of this book, we will introduce more precise ways of expressing and representing intuitions like the ones just discussed. For the moment, what is important is that we have strong intuitions that words belong together in ways that go beyond adjacency.

Structural ambiguity

A particularly striking piece of evidence for the existence of syntactic structure is the phenomenon of structural ambiguity. The classified advertisement in (31) is a humorous illustration.

(31)		Wanted: Man to take care of cow that does not smoke or drink.

World knowledge tells us that the intent of the advertiser is to hire a clean-living man to take care of a cow. But because of the way the advertisement is formulated, it also has an unintentionally comical interpretation - namely, that the advertiser has a clean-living cow and that the advertiser wants a man (possibly a chain-smoking alcoholic) to take care of this animal. The intended and unintended interpretations describe sharply different situations; that is why we say that (31) is ambiguous (like the two mutually exclusive interpretations of an optical illusion), rather than that it is vague (like a blurry image). Moreover, the ambiguity of the sentence can't be pinned on a particular word, as is possible in ambiguous sentences like those in (32).

(32)	a.		As far as I'm concerned, any gender is a drag. (Patti Smith)
	b.		Our bikinis are exciting. They are simply the tops.

Sentences like those in (32) are examples of lexical ambiguity; their ambiguity is based on a lexeme (= vocabulary item) with two distinct meanings. In (31), on the other hand, the words themselves have the same meanings in each of the two interpretations, and the ambiguity derives from the possibility of grouping the words in distinct ways. In the intended interpretation, the relative clause that does not smoke or drink modifies man; in the unintended interpretation, it modifies cow.
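The idea that a single string of words can correspond to two distinct groupings can be sketched with nested lists in Python. The bracketings below are simplified assumptions made purely for illustration (the internal structure of the phrases is left flat), and the helper name is our own.

```python
def flatten(tree):
    """Return the words of a nested grouping in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for part in tree:
        words.extend(flatten(part))
    return words


RELATIVE = ["that", "does", "not", "smoke", "or", "drink"]

# Intended reading: the relative clause groups with the phrase headed by man.
modifies_man = [["man", ["to", "take", "care", "of", "cow"]], RELATIVE]
# Comical reading: the relative clause groups with cow.
modifies_cow = ["man", ["to", "take", "care", "of", ["cow", RELATIVE]]]

print(" ".join(flatten(modifies_man)))
# man to take care of cow that does not smoke or drink
print(flatten(modifies_man) == flatten(modifies_cow))  # True: same word string
print(modifies_man == modifies_cow)                    # False: different groupings
```

The two groupings are indistinguishable once they are flattened into a word string, which is the situation a reader of (31) is in.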

To avoid any confusion, we should emphasize that we are here considering structural ambiguity from a purely descriptive perspective, focusing on what it tells us about the design features of human language and disregarding the practical issue of effective communication. As writers of advertisements ourselves, we would take care not to use (31), but to disambiguate it by means of an appropriate paraphrase. For the ordinary interpretation of (31), where the relative clause modifies man, we might move the relative clause next to the intended modifiee, as in (33a). The comical interpretation of (31), on the other hand, cannot be expressed unambiguously by moving the relative clause. If it were the desired interpretation, we would have to resort to a more drastic reformulation, such as (33b).

(33)	a.		Wanted: Man that does not smoke or drink to take care of cow.
	b.		Wanted: Man to take care of nonsmoking, nondrinking cow.

Universal Grammar

Formal universals

The structure-dependent character of syntactic rules is a general property of the human language faculty (the part of the mind/brain that is devoted to language), often also referred to as Universal Grammar, especially when considered in abstraction from any particular language. There are two sources of evidence for this. First, as we have seen, the syntactic rules that children acquire, even when they are not the rules that adults use, are structure-dependent. Second, even though structure-independent rules are logically possible and computationally tractable, no known human language actually has rules that disregard syntactic structure as a matter of course. For instance, (34) gives two examples of computationally very simple rules for question formation, but no known human language has rules of this type.

(34)	a.	To form a question, switch the order of the first and second words in the corresponding declarative sentence.	The girl is tall.	→	Girl the is tall?
			The blond girl is tall.	→	Blond the girl is tall?
	b.	To form a question, reverse the order of the words in the corresponding declarative sentence.	The girl is tall.	→	Tall is girl the?
			The blond girl is tall.	→	Tall is girl blond the?
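Rules like those in (34) are easy to state as operations on bare strings of words, which is exactly what makes their absence from human languages striking. The following Python sketch is our own illustration, not part of any linguistic analysis: the function names are hypothetical, and the treatment of capitalization and punctuation is an assumption made only to reproduce the examples in (34). Notice that neither function ever consults syntactic structure.

```python
def _words(declarative):
    """Split a declarative into words, dropping the final period."""
    words = declarative.rstrip(".").split()
    # Assumption: the first word is capitalized only because it is
    # sentence-initial (that is, it is not a proper name).
    words[0] = words[0].lower()
    return words


def _question(words):
    """Join words, capitalize the first letter, add a question mark."""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "?"


def swap_first_two(declarative):
    """Rule (34a): switch the order of the first and second words.

    Assumes the sentence contains at least two words.
    """
    words = _words(declarative)
    words[0], words[1] = words[1], words[0]
    return _question(words)


def reverse_words(declarative):
    """Rule (34b): reverse the order of all the words."""
    return _question(_words(declarative)[::-1])


print(swap_first_two("The blond girl is tall."))  # Blond the girl is tall?
print(reverse_words("The blond girl is tall."))   # Tall is girl blond the?
```

Both rules count words from the edge of the string and nothing more, so they give the same kind of result no matter how long the subject noun phrase is.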

The structure-dependent character of syntactic rules (often referred to more briefly as structure dependence) is what is known as a formal universal of human language - a property common to all human languages that is independent of the meanings of words. Formal universals are distinguished from substantive universals, which concern the substance, or meaning, of linguistic elements. An example of a substantive universal is the fact that all languages have indexical elements such as I, here, and now. These words have the special property that their meanings are predictable in the sense that they denote the speaker, the speaker's location, and the time of speaking, but that what exactly they refer to depends on the identity of the speaker.

Recursion

Another formal universal is the property of recursion. A simple illustration of this property is the fact that it is possible for one sentence to contain another. For instance, the simple sentence in (35a) forms part of the complex sentence in (35b), and the resulting sentence can form part of a still more complex sentence. Recursive embedding is illustrated in (35) up to a level of five embeddings.

(35)	a.	She won.
	b.	The Times reported that [she won].
	c.	John told me that [the Times reported that [she won]].
	d.	I remember distinctly that [John told me that [the Times reported that [she won]]].
	e.	They don't believe that [I remember distinctly that [John told me that [the Times reported that [she won]]]].
	f.	I suspect that [they don't believe that [I remember distinctly that [John told me that [the Times reported that [she won]]]]].
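The embedding pattern in (35) can be mimicked by a function that calls itself, which is the programming counterpart of a sentence containing a sentence. The sketch below is only an illustration under our own assumptions: the names embed and sentence are hypothetical, and each "frame" is simply the material that precedes that in the examples above.

```python
def embed(frames, core):
    """Wrap `core` in each frame in turn, outermost frame first."""
    if not frames:
        return core                      # base case: the simple sentence
    outer, *rest = frames
    # Recursive step: a sentence that contains another sentence.
    return f"{outer} that [{embed(rest, core)}]"


def sentence(frames, core):
    """Capitalize the result and add a final period."""
    text = embed(frames, core)
    return text[0].upper() + text[1:] + "."


frames = [
    "I suspect",
    "they don't believe",
    "I remember distinctly",
    "John told me",
    "the Times reported",
]

print(sentence([], "she won"))           # She won.
print(sentence(frames[-1:], "she won"))  # The Times reported that [she won].
print(sentence(frames, "she won"))       # reproduces (35f)
```

Nothing in the function limits the number of frames; the limit of five embeddings in (35) is a fact about the list that we happened to supply, not about the procedure.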

Parameters

Formal universals like structure dependence and recursion are of particular interest to linguistics in the Chomskyan tradition. This is not to deny, however, that individual languages differ from one another, and not just in the sense that their vocabularies differ. In other words, Universal Grammar is not completely fixed, but allows some variation. The ways in which grammars can differ are called parameters.

One simple parameter concerns the order of verbs and their objects. In principle, two orders are possible: verb-object (VO) or object-verb (OV), and different human languages use either one or the other. As illustrated in (36) and (37), English and French are languages of the VO type, whereas Hindi, Japanese, and Korean are languages of the OV type.

(36)	a.	English	Peter read the book.
	b.	French	Pierre lisait le livre. Pierre was.reading the book 'Pierre was reading the book.'
(37)	a.	Hindi	Peter-ne kitaab parh-ii. Peter book read 'Peter read the book.'
	b.	Japanese	Peter-ga hon-o yon-da.
	c.	Korean	Peter-ka chayk-ul ilk-ess-ta.
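One way to picture a parameter is as a single switch in an otherwise identical procedure. The following sketch is a deliberately crude illustration under our own assumptions: the function linearize and its order argument are hypothetical, and real grammars of course involve far more than ordering three elements.

```python
def linearize(subject, verb, obj, order):
    """Order a verb and its object according to the VO/OV setting."""
    if order == "VO":
        return " ".join([subject, verb, obj])
    if order == "OV":
        return " ".join([subject, obj, verb])
    raise ValueError(f"unknown parameter setting: {order!r}")


print(linearize("Peter", "read", "the book", "VO"))   # Peter read the book
print(linearize("Peter-ga", "yon-da", "hon-o", "OV"))  # Peter-ga hon-o yon-da
```

The procedure itself is the same for both language types; only the value of the switch differs.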

Another parameter of Universal Grammar concerns the possibility, mentioned earlier in connection with prescriptive rules, of separating a preposition from its object, or preposition stranding. (The idea behind the metaphor is that the movement of the object of the preposition away from its ordinary position leaves the preposition stranded high and dry.) The parametric alternative to preposition stranding goes by the name of pied piping,⁶ by analogy to the Pied Piper of Hameln, who took revenge on the citizens of Hameln for mistreating him by luring the town's children away with him. In pied piping of the syntactic sort, the object of the preposition moves away from its usual position, just as in preposition stranding, but it takes the preposition along with it. The two parametric options are illustrated in (38). (You'll note that a single language can exhibit two parameter settings. We return to this issue later on in the chapter, in the section on Grammar versus language.)

(38)	a.	Preposition stranding:	✓	Which house does your friend live in?
	b.	Pied piping:	✓	In which house does your friend live?

Just as in English, preposition stranding and pied piping are both grammatical in Swedish. (In Swedish, it is preposition stranding that counts as prescriptively correct! Pied piping is frowned upon, on the grounds that it sounds stiff and artificial.)

(39)	a.	Swedish	✓	Vilket hus bor din kompis i? which house lives your friend in 'Which house does your friend live in?'
	b.		✓	I vilket hus bor din kompis?

In other languages, such as French and Italian, preposition stranding is ungrammatical. Speakers of these languages reject examples like (40) as word salad, and accept only the corresponding pied-piping examples in (41).

(40)	a.	French	*	Quelle maison est-ce que ton ami habite dans? which house is it that your friend lives in Intended meaning: 'Which house does your friend live in?'
	b.	Italian	*	Quale casa abita il tuo amico in? which house lives the your friend in Intended meaning: 'Which house does your friend live in?'
(41)	a.	French	✓	Dans quelle maison est-ce que ton ami habite?
	b.	Italian	✓	In quale casa abita il tuo amico?

Generative grammar

At the beginning of this chapter, we said that this book was an introduction to generative grammar from a Chomskyan perspective. So far, we have clarified our use of the term 'grammar,' and we have explained that a Chomskyan perspective on grammar is concerned with the formal principles that all languages share as well as with the parameters that distinguish them. Let's now turn to the notion of a generative grammar.

(42)

A generative grammar is an algorithm for specifying, or generating, all and only the grammatical sentences in a language.

What's an algorithm? It's simply any finite, explicit procedure for accomplishing some task, beginning in some initial state and terminating in a defined end state. Computer programs are the algorithms par excellence. More ordinary examples of algorithms include recipes, knitting patterns, the instructions for assembling an Ikea bookcase, or a list of steps for balancing your checkbook.

An important point to keep in mind is that it is often difficult to construct an algorithm for even trivial tasks. A quick way to gain an appreciation for this is to describe how to tie a bow. Like speaking a language, tying a bow is a skill that most of us master around school age and that we perform more or less unconsciously thereafter. But describing (not demonstrating!) how to do it is not that easy, especially if we're not familiar with the technical terminology of knot-tying. In an analogous way, constructing a generative grammar of English is a completely different task than speaking the language, and much more difficult (or at least difficult in a different way)!

Just like a cooking recipe, a generative grammar needs to specify the ingredients and procedures that are necessary for generating grammatical sentences. We won't introduce all of these in this first chapter, but in the remainder of the section, we'll introduce enough ingredients and procedures to give a flavor of what's to come.

Elementary trees and substitution

The raw ingredients that sentences consist of are vocabulary items. These belong to various syntactic categories, like noun, adjective, transitive verb, preposition, and so forth. Depending on their syntactic category, vocabulary items combine with one another to form constituents, which in turn belong to syntactic categories of their own. For instance, determiners (a category that includes the articles a and the and the demonstratives this, that, these and those) can combine with nouns to form noun phrases, but they can't (or at least don't ordinarily) combine with other syntactic categories like adverbs, verbs, or prepositions.

(43)	a.	✓	a house	(44)	a.	*	a slowly
	b.	✓	the cats		b.	*	the went
	c.	✓	those books		c.	*	those of

It's possible to represent the information contained in a constituent by using labeled bracketing. Each vocabulary item is enclosed in brackets that are labeled with the appropriate syntactic category. The constituent that results from combining vocabulary items is in turn enclosed in brackets that are labeled with the constituent's syntactic category. The labeled bracketings for the constituents in (43) are given in (45).

(45)	a.	[_NounPhr [_Det a ] [_Noun house ] ]
	b.	[_NounPhr [_Det the ] [_Noun cats ] ]
	c.	[_NounPhr [_Det those ] [_Noun books ] ]

Noun phrases can combine with other syntactic categories, such as prepositions or transitive verbs. Prepositions combine with a noun phrase to form prepositional phrases. A transitive verb combines with one noun phrase to form a verb phrase, which in turn combines with a second noun phrase to form a complete sentence.

(46)	a.	[_PrepPhr [_Prep on ] [_NounPhr [_Det the ] [_Noun table ] ] ]
	b.	[_VerbPhr [_TrVerb drafted ] [_NounPhr [_Det a ] [_Noun letter ] ] ]
	c.	[_Sentence [_NounPhr [_Det the ] [_Noun secretary ] ] [_VerbPhr [_TrVerb drafted ] [_NounPhr [_Det a ] [_Noun letter ] ] ] ]

Noun phrases don't, however, combine with any and all syntactic categories. For instance, they can't combine with determiners (at least not in English).

(47)

*	the this letter

As constituent structure grows more complex, labeled bracketings very quickly grow difficult for humans to process, and it's often more convenient to represent constituent structure with tree diagrams. Tree diagrams, or trees for short, convey exactly the same information as labeled bracketings, but the information is presented differently. Instead of enclosing an element in brackets that are labeled with a syntactic category, the category is placed immediately above the element and connected to it with a line or branch. The labeled bracketings that we have seen so far translate into the trees in (48) and (49).⁷

(48)

(49)
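That labeled bracketings and trees carry the same information can be checked mechanically. The sketch below is our own illustration, assuming the bracketing notation used in (45) and (46), with category labels written as [_Label. The names parse, render, and leaves are hypothetical, and the parser assumes well-formed input. It converts a labeled bracketing into a nested structure and prints it with one level of indentation per branch.

```python
import re

# A token is an opening bracket with its label, a closing bracket, or a word.
TOKEN = re.compile(r"\[_(\w+)|(\])|([^\s\[\]]+)")


def parse(bracketing):
    """Return a (category, children) pair; vocabulary items are strings."""
    stack = [("ROOT", [])]
    for label, close, word in TOKEN.findall(bracketing):
        if label:                        # open a new constituent
            node = (label, [])
            stack[-1][1].append(node)
            stack.append(node)
        elif close:                      # finish the current constituent
            stack.pop()
        else:                            # attach a vocabulary item
            stack[-1][1].append(word)
    (tree,) = stack[0][1]                # exactly one top-level constituent
    return tree


def render(tree, depth=0):
    """Show the tree with each category placed above what it contains."""
    if isinstance(tree, str):
        return "  " * depth + tree
    category, children = tree
    lines = ["  " * depth + category]
    lines += [render(child, depth + 1) for child in children]
    return "\n".join(lines)


def leaves(tree):
    """Read the vocabulary items off the tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


pp = parse("[_PrepPhr [_Prep on ] [_NounPhr [_Det the ] [_Noun table ] ] ]")
print(render(pp))
print(leaves(pp))                        # ['on', 'the', 'table']
```

The indented output is only a rough stand-in for a tree diagram, but it makes the point that no information is gained or lost in the translation.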

Trees like those in (48) and (49) resemble dishes that are ready to serve; they don't provide a record of how they were brought into being. We can provide such a record by representing vocabulary items themselves in the form of trees that include combinatorial information. For example, prepositions and transitive verbs can be represented as trees with empty slots for noun phrases to fit into, as shown in (50).

(50)

We'll refer to trees for vocabulary items like those in (50) as elementary trees. The purpose of elementary trees is to represent a vocabulary item's combinatorial possibilities, and so they ordinarily contain unfilled nodes. Such nodes are called substitution nodes, and they are filled by a substitution operation, as shown in (51).

(51)	a.				b.				c.
			Tree (a) has a substitution node of some syntactic category.				The root (= topmost) node in Tree (b) has the same syntactic category as the substitution node in Tree (a).				Substitution occurs when the root node of Tree (b) is identified with the substitution node in Tree (a).

Elementary trees don't necessarily contain substitution nodes, though; ones that invariably play the role of Tree (b) in the substitution operation don't. The elementary tree for the noun in (52b) is an example.

Notice, by the way, that there are two conceivable ways to arrive at trees for noun phrases like those cats, depending on whether it is the noun that is taken as the substitution node, as in (52), or the determiner, as in (53). At this point, there is no reason to prefer one way over the other, but in Chapter 5, we will adopt a variant of (52).

(52)	a.				b.
(53)	a.				b.

In summary, a generative grammar as we've constructed it so far consists of a set of elementary trees, which represent the vocabulary items in a language and the range of their combinatorial possibilities, and a substitution operation, by means of which the elementary trees combine into larger constituents and ultimately into grammatical sentences. In Chapter 4, we will introduce two further formal operations. The first, adjunction, will enable the grammar to generate sentences containing modifiers, such as adjectives or relative clauses modifying nouns (the big dog, the dog that the children like). The second, movement, will enable the grammar to represent, among other things, the similarities and the differences between declarative sentences (They will see Bill) and questions corresponding to them (Will they see Bill?, Who(m) will they see?).
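To make the summary concrete, here is a minimal sketch of elementary trees and substitution. Everything in it is an assumption made for illustration: the class Node, the function names, and in particular the shape of the elementary trees, which we have modeled on the labeled bracketings in (46) and on option (52), where the noun fills a slot in the determiner's tree. A substitution node is represented as a node with neither a word nor children.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    category: str
    children: list = field(default_factory=list)
    word: Optional[str] = None

    def is_substitution_node(self):
        # An unfilled slot: no vocabulary item and no daughters.
        return self.word is None and not self.children


def substitute(tree, filler):
    """Identify the root of `filler` with the leftmost substitution node
    of the same category in `tree`; return the resulting new tree."""
    done = False

    def walk(node):
        nonlocal done
        if (not done and node.is_substitution_node()
                and node.category == filler.category):
            done = True
            return filler
        return Node(node.category, [walk(c) for c in node.children], node.word)

    result = walk(tree)
    if not done:
        raise ValueError(f"no open {filler.category} substitution node")
    return result


def words(node):
    """Read the vocabulary items off the tree from left to right."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in words(child)]


def det(word):   # hypothetical elementary tree for a determiner, as in (52a)
    return Node("NounPhr", [Node("Det", word=word), Node("Noun")])


def noun(word):  # elementary tree without substitution nodes, as in (52b)
    return Node("Noun", word=word)


# Hypothetical elementary tree for a transitive verb: two noun phrase slots.
drafted = Node("Sentence", [
    Node("NounPhr"),
    Node("VerbPhr", [Node("TrVerb", word="drafted"), Node("NounPhr")]),
])

the_secretary = substitute(det("the"), noun("secretary"))
a_letter = substitute(det("a"), noun("letter"))
clause = substitute(substitute(drafted, the_secretary), a_letter)
print(" ".join(words(clause)))           # the secretary drafted a letter
```

Because substitution requires the categories to match, the sketch refuses to combine a determiner with an adverb or a verb, in the spirit of (44); it says nothing yet about modifiers or questions, which is where the further operations come in.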

Grammaticality

As we mentioned earlier, the aim of a generative grammar is to generate all and only the grammatical sentences of a language. Since the notion of grammaticality is basic to syntactic theory, it is important to distinguish it from notions with which it is easily confused.

First and foremost, 'is grammatical' is not the same thing as 'makes sense.' The sentences in (54) all 'make sense' in the sense that it is easy to interpret them. Nevertheless, as indicated by the asterisks, they are not grammatical.⁸

(54)	a.	*	Is our children learning?
	b.	*	Me wants fabric.
	c.	*	To where are we be taking thou, sir?
	d.	*	The introduction explained that "the Genoese people, besides of hard worker, are good eater too, and even 'gourmand,' of that honest gourmandise which will not drive a man to hell but which is, after all, one of the few pleasures that mankind can enjoy in this often sorrowful world."

Conversely, sentences can be grammatical, but not 'make sense.' The 'fairy tale' or 'science fiction' sentences in (14) are of this type. Two further examples are given in (55). Since the sentences are grammatical, they aren't preceded by an asterisk. Their semantic anomaly can be indicated, if desired, by a prefixed pound sign (hash mark).

(55)	a.	#	Colorless green ideas sleep furiously. (Chomsky 1965:149)	- cf. Revolutionary new ideas appear infrequently.
	b.	#	I plan to travel there last year.	- cf. I plan to travel there next year.

Second, 'grammatical' must be distinguished from 'acceptable' or 'easily processable by human beings.' This is because it turns out that certain well-motivated simple grammatical operations can be applied in ways that result in sentences that are virtually impossible for human beings to process. For instance, it is possible in English to modify a noun with a relative clause, and sentences containing nouns that are modified in this way, like those in (56), are ordinarily perfectly acceptable and easily understood. (Here and in the following examples, the relative clauses are bracketed and the modified noun is underlined.)

(56)	a.		The mouse [that the cat chased] escaped.
	b.		The cat [that the dog scared] jumped out the window.

But now notice what happens when we modify the noun within the relative clause in (56a) with a relative clause of its own.

(57)

The mouse [that the cat [that the dog scared] chased] escaped.

Even though (57) differs from (56a) by only four additional words and a single additional level of embedding, the result is virtually uninterpretable without pencil and paper. The reason is not that relative clause modification can't apply more than once, since the variant of (57) in (58), which contains exactly the same words and is exactly as long, is perfectly fine (or at any rate much more acceptable than (57)).

(58)

The mouse escaped [that the cat chased ] [that the dog scared].

The reason that (57) is virtually uninterpretable is also not that it contains recursive structure (the relative clause that modifies mouse contains the relative clause that modifies cat). After all, the structures in (35) are recursive, yet they don't throw us for a loop the way that (57) does.

(57) is unacceptable not because it is ungrammatical, but because of certain limitations on human short-term memory (Chomsky and Miller 1963:286, Miller and Chomsky 1963:471). Specifically, notice that in the (relatively) acceptable (58), the subject of the main clause the mouse doesn't have to "wait" (that is, be kept active in short-term memory) for its verb escaped since the verb is immediately adjacent to the subject. The same is true for the subjects and verbs of each of the relative clauses (the cat and chased, and the dog and scared). In (57), on the other hand, the mouse must be kept active in memory, waiting for its verb escaped, for the length of the entire sentence. What is even worse, however, is that the period during which the mouse is waiting for its verb escaped overlaps the period during which the cat must be kept active, waiting for its verb chased. What makes (57) so difficult, then, is not the mere fact of recursion, but that two relations of exactly the same sort (the subject-verb relation) must be kept active in memory at the same time. In none of the other relative clause sentences is such double activation necessary. For instance, in (56a), the mouse must be kept active for the length of the relative clause, but the subject of the relative clause (the cat) needn't be kept active since it immediately precedes its verb chased.

Sentences like (56) and (57) are often referred to as center-embedding structures, and the dependencies between the subjects and their verbs are said to be nested.

The mouse that the cat chased escaped.
    |              |_____|       |
    |____________________________|

The mouse that the cat that the dog scared chased escaped.
    |              |            |_____|      |       |
    |              |_________________________|       |
    |________________________________________________|

By contrast, the corresponding dependencies in (58) are not nested.

The mouse escaped that the cat chased.
    |________|             |______|            

The mouse escaped that the cat chased that the dog scared.
    |________|             |______|            |_____|

This remains true even if we focus on the dependencies between the relative clauses and the nouns that they modify.

The mouse escaped that the cat chased that the dog scared.
      |____________|        |__________|
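The difference between the nested and the non-nested cases can be expressed as a simple count. The sketch below is a rough illustration under our own assumptions, not a model of human sentence processing: the function max_pending is hypothetical, a subject counts as "waiting" at every word that intervenes between it and its verb, and each subject and verb is assumed to occur only once in the sentence.

```python
def max_pending(sentence, dependencies):
    """Largest number of subjects waiting for their verbs at any one word.

    `dependencies` is a list of (subject, verb) word pairs.
    """
    words = sentence.rstrip(".").lower().split()
    spans = [(words.index(s), words.index(v)) for s, v in dependencies]
    return max(
        (sum(1 for s, v in spans if s < i < v) for i in range(len(words))),
        default=0,
    )


deps_56a = [("mouse", "escaped"), ("cat", "chased")]
deps_57 = deps_56a + [("dog", "scared")]

print(max_pending("The mouse that the cat chased escaped.", deps_56a))  # 1
print(max_pending(
    "The mouse that the cat that the dog scared chased escaped.", deps_57))  # 2
print(max_pending(
    "The mouse escaped that the cat chased that the dog scared.", deps_57))  # 0
```

On this measure, (57) is the only sentence in which two subject-verb relations are pending at once, which matches the observation that it alone is virtually uninterpretable.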

A final important point to bear in mind is that any sentence is an expression that is paired with a particular interpretation. Grammaticality is always determined with respect to a pairing of form and meaning. This means that a particular string can be grammatical under one interpretation, but not under another. For instance, (59) is ungrammatical under a subject-object-verb (SOV) interpretation (that is, when the sentence is interpreted as Sue hired Tom).

(59)

Sue Tom hired.

(59) is grammatical, however, under an object-subject-verb (OSV) interpretation (that is, when it is interpreted as Tom hired Sue). On this interpretation, Sue receives a special intonation marking contrast, which would ordinarily be indicated in writing by setting off Sue from the rest of the sentence by a comma. In other words, the grammaticality of (59) depends on whether its interpretation is analogous to (60a) or (60b).

(60)	a.	*	She him hired.
	b.	✓	Her, he hired. (The other job candidates, he didn't even call back.)

Grammar versus language

We conclude this chapter by considering the relationship between the concepts of grammar and language. The notion of language seems straightforward because we are used to thinking and speaking of "the English language," "the French language," "the Swahili language," and so forth. But these terms are actually much vaguer than they seem at first glance because they cover a plethora of varieties, including ones that differ enough to be mutually unintelligible. For instance, Ethnologue distinguishes 32 dialects of English in the United Kingdom alone. In addition, distinct dialects of English are spoken in former British colonies, including Australia, Canada, India, New Zealand, South Africa, the United States, and many other African, Asian, and Caribbean nations, and many of these dialects have subdialects of their own. Similarly, Ethnologue distinguishes 11 dialects of French in France and 10 dialects of Swahili in Kenya, and there are further dialects in other countries in which these languages are spoken. Moreover, we use terms like "the English language" to refer to historical varieties that differ as profoundly as present-day English does from Old English, which is about as intelligible to a speaker of modern English as German - that is to say, not.

Although the most salient differences between dialects are often phonological (that is, speakers of different dialects often have different accents), dialects of a so-called single language can differ syntactically as well. For instance, in standard French, as in the Romance languages more generally, adjectives ordinarily follow the noun that they modify. But that order is reversed in Walloon, a variety of French spoken in Belgium. The two parametric options are illustrated in (61) (Bernstein 1993:25-26).

(61)	a.		Standard French	un chapeau noir a hat black
	b.		Walloon	on neûr tchapê a black hat 'a black hat'

Another example of the same sort, though considerably more cathected for speakers of English, concerns multiple negation in sentences like (62a).

(62)	a.		The kids didn't eat nothing.
	b.		The kids didn't eat anything.

In present-day standard English, didn't and nothing each contribute their negative force to the sentence, and the overall force of (62a) isn't negative; rather, the sentence means that the kids ate something. In many nonstandard varieties of English, however, (62a) conveys exactly the same meaning as standard English (62b); that is, the sentence as a whole has negative force. In these dialects, the negation in nothing can be thought of as agreeing with (and reinforcing) the negation in didn't rather than cancelling it; hence the term negative concord for this phenomenon ('concord' is a variant term for 'agreement'). Negative concord is routinely characterized as "illogical" by prescriptivists,⁹ and it is one of the most heavily stigmatized features in present-day English.¹⁰ However, it was productive in earlier forms of English, and it is attested in renowned masters of the language such as Chaucer and Shakespeare. Moreover, negative concord is part of the standard forms of languages like French, Italian, Spanish, and modern Greek. From a descriptive and generative point of view, negative concord is simply a parametric option of Universal Grammar just like any other, and negative concord is no more illogical than the noun-adjective order in (61a) or preposition stranding.

In both of the examples just discussed, we have dialects of "the same language" (English and French, respectively) differing with respect to a parameter. The converse is also possible: two "different languages" that are parametrically (all but) indistinguishable. For example, the same linguistic variety spoken on the Dutch-German border may count as a dialect of Dutch or German depending on which side of the political border it is spoken, and the same is true of many other border dialects as well. According to Max Weinreich, "a language is a dialect with an army and a navy." A striking (and sad) confirmation of this aphorism concerns the recent terminological history of Serbo-Croatian. As long as Yugoslavia was a federal state, Serbo-Croatian was considered a single language with a number of regional dialects. The 14th edition of Ethnologue, published in 2000, still has a single entry for Serbo-Croatian. In the 15th edition, published in 2005, the single entry is replaced by three new entries for Bosnian, Croatian, and Serbian.

As the previous discussion has shown, the notion of language is based more on sociopolitical considerations than on strictly linguistic ones. By contrast, the term 'grammar' refers to a particular set of parametric options that a speaker acquires. For this reason, the distinction between language and grammar that we have been drawing is also referred to as the distinction between E-language and I-language (mnemonic for 'external' and 'internal' language) (Chomsky 1986).

As we have seen, the same language label can be associated with more than one grammar (the label "English" is associated with grammars both with and without negative concord), and a single grammar can be associated with more than one language label (as in the case of border dialects). It is important to distinguish the concept of shared grammar from mutual intelligibility. To a large extent, standard English and many of its nonstandard varieties are mutually intelligible even where their grammars differ with respect to one parameter or another. On the other hand, it is perfectly possible for two or more varieties that are mutually unintelligible to share a single grammar. For instance, in the Indian village of Kupwar (Gumperz and Wilson 1971), the three languages Marathi, Urdu, and Kannada, each spoken by a different ethnic group, have been in contact for about 400 years, and most of the men in the village are bi- or trilingual. Like the standard varieties of these languages, their Kupwar varieties have distinct vocabularies, thus rendering them mutually unintelligible to monolingual speakers, but in Kupwar, the considerable grammatical differences that exist among the languages as spoken in other parts of India have been virtually eliminated. The difference between standard French and Walloon with respect to prenominal adjectives is another instance of this same convergence phenomenon. Here, too, the adjective-noun order in Walloon is due to language contact and bilingualism, in this case between French and Flemish, the other language spoken in Belgium; in Flemish, as in the Germanic languages more generally, adjectives ordinarily precede the nouns that they modify.

Finally, we should point out that it is perfectly possible for a single speaker to acquire more than one grammar. This is most strikingly evident in balanced bilinguals. Speakers can also acquire more than one grammar in situations of syntactic change. For instance, in the course of its history, English changed from an OV to a VO language, and individual speakers during the transition period (which began in late Old English and continued into Middle English) acquired and used both parametric options. Speakers can also acquire more than one grammar in situations of diglossia or stable syntactic variation. For instance, English speakers whose vernacular grammar has negative concord or preposition stranding might acquire the parametric variants without negative concord or with pied piping in the course of formal education.

Notes

1. It is also possible to overzealously apply rules like those in (2), even in cases where they shouldn't be applied. This phenomenon is known as hypercorrection. Two common instances are illustrated in (i).

		Hypercorrect example	Explanation
(i)	a.	Over there is the guy whom I think took her to the party.	Should be: the guy who I think took her to the party (the relative pronoun who is the subject of the relative clause, not the object; cf. the guy { who, *whom } took her to the party)
	b.	This is strictly between you and I.	Should be: between you and me (the second pronoun is part of the conjoined object of the preposition between, not part of a subject)

2. The prescriptive rule is actually better stated as "Don't separate a preposition from its object," since the traditional formulation invites exchanges like (i).

(i)	A:	Who are you going to the party with?
	B:	Didn't they teach you never to end a sentence with a preposition?
	A:	Sorry, let me rephrase that. Who are you going to the party with, Mr. Know-it-all?

3. As William Labov has often pointed out, everyday speech (apart from false starts and other self-editing phenomena) hardly ever violates the rules of descriptive grammar.

4. Actually, that's an oversimplification. Not all the articles and nouns an English-speaking child hears appear in the article-noun order. To see why, carefully consider the underlined sentence in this footnote.

5. When children didn't respond this way, they either repeated the original invented word, or they didn't respond at all. It's not clear what to make of these responses. Either response might indicate that the children were stumped by the experimental task. Alternatively, repetition might have been intended as an irregular plural (cf. deer and sheep), and silence might indicate that some of the invented words (for instance, cra) struck the children as phonologically strange.

6. The term 'pied piping' was invented in the 1960s by John Robert Ross, a syntactician with a penchant for metaphorical terminology.

7. Online corpora that are annotated with syntactic structure, such as the Penn Treebank, the Penn Parsed Corpora of Historical English, and others like them, tend to use labeled bracketing because the resulting files are computationally extremely tractable. The readability of such corpora for humans can be improved by suitable formatting of the labeled bracketing or by providing an interface that translates the bracketed structures into tree diagrams.

8. (54a) is from a speech by George W. Bush (https://politicalhumor.about.com/library/blbushisms2000.htm). (54b) was the subject line of an email message in response to an offer of free fabric; the author is humorously attempting to imitate the language of a child greedy for goodies. (54c) is from "Pardon my French" (Calvin Trillin. 1990. Enough's enough (and other rules of life). 169). (54d) is from "Connoisseurs and patriots" (Joseph Wechsberg. 1948. Blue trout and black truffles: The peregrinations of an epicure. 127).

9. Two important references concerning the supposed illogicality of negative concord (and of nonstandard English more generally) are Labov 1972a, 1972b.

Those who argue that negative concord is illogical often liken the rules of grammar to those of formal logic or arithmetic, where one negation operator or subtraction operation cancels out another; that is, (NOT (NOT A)) is identical to A, and (-(-5)) = +5. Such prescriptivists never distinguish between sentences containing even and odd numbers of negative expressions. By their own reasoning, (i.a) should have a completely different status than (i.b) - not illogical, but at worst redundant.

(i)	a.		They never told nobody nothing.
	b.		They never told nobody.

10. Because of the social stigma associated with it, it is essentially impossible to study negative concord in present-day English. This is because even for those speakers of negative concord varieties who don't productively control standard English as a second dialect, the influence of prescriptive grammar is so pervasive that if such speakers reject negative concord sentences as unacceptable, we don't know whether they are rejecting them for grammatical or for social reasons.

Exercises and problems

Exercise 1.1

The sentences in (4) violate several descriptive rules of English, three of which were given in (5). As mentioned in the text, there is a fourth descriptive rule that is violated in (4). Formulate the rule (you shouldn't need more than a sentence).

Exercise 1.2

(1)-(4) illustrate the facts of subject-verb agreement in the variety of English spoken in Belfast, Ireland (data from Henry 1995, chapter 2). Describe the data as clearly and briefly as you can.

In order to avoid conflating morphological form with semantic function, you can refer to "is" and "are" as "the i- form" and "the a- form", rather than as "singular" and "plural".

(1)	a.	✓	The girl is late.	(2)	a.	*	The girl are late.
	b.	✓	She is late.		b.	*	She are late.
	c.	✓	Is { the girl, she } late?		c.	*	Are { the girl, she } late?
(3)	a.	✓	The girls are late.	(4)	a.	✓	The girls is late.
	b.	✓	They are late.		b.	*	They is late.
	c.	✓	Are { the girls, they } late?		c.	*	Is { the girls, they } late?

Exercise 1.3

A. Which of the newspaper headlines in (1) are lexically ambiguous, which are structurally ambiguous, and which are a mixture of both types of ambiguity? Explain.

(1)	a.	Beating witness provides names
	b.	Child teaching expert to speak
	c.	Drunk gets nine months in violin case
	d.	Enraged cow injures farmer with ax
	e.	Prostitutes appeal to pope
	f.	Teacher strikes idle kids
	g.	Teller stuns man with stolen check

B. At least two of the examples form a subgroup even within their type (lexical, structural, mixed). Can you find them and explain their similarity?

Exercise 1.4

In the text, we showed that sentences are recursive categories. In other words, one instance of the syntactic category 'sentence' can contain another instance of the same category. Provide evidence that noun phrases and prepositional phrases are recursive categories, too.

Be careful to give examples that are recursive, and not just ones in which the syntactic category in question occurs more than once. For instance, (1) does not provide the evidence required in this exercise, because the second prepositional phrase is not contained in the first. This is clearly shown by the fact that the order of the prepositional phrases can be switched.

(1)		The cat jumped [_PP onto the table ] [_PP without the slightest hesitation ].
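
The difference between a category that merely occurs twice and one that contains another instance of itself can also be checked mechanically. The following Python sketch is only an illustration, not part of the exercise: the function names parse and is_recursive are hypothetical, and so are the second test sentence and its label S, which stand in for the sentence recursion discussed in the text. The sketch reads a labeled bracketing in the notation of (1) and reports whether any constituent with a given label contains another constituent with the same label.

```python
import re

def parse(bracketing):
    """Turn a labeled bracketing into a list of words (str) and
    constituents, each constituent being a (label, children) tuple."""
    tokens = re.findall(r"\[_\w+|\]|[^\s\[\]]+", bracketing)
    pos = 0

    def items(inside):
        nonlocal pos
        out = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "]":
                if not inside:
                    raise ValueError("unmatched closing bracket")
                return out
            if tok.startswith("[_"):
                # Opening bracket with its label, e.g. "[_PP"
                out.append((tok[2:], items(True)))
            else:
                out.append(tok)
        if inside:
            raise ValueError("missing closing bracket")
        return out

    return items(False)

def contains_label(items, label):
    """True if a constituent with this label occurs anywhere in items."""
    return any(
        isinstance(it, tuple) and (it[0] == label or contains_label(it[1], label))
        for it in items
    )

def is_recursive(items, label):
    """True if some constituent with this label contains another one."""
    for it in items:
        if isinstance(it, tuple):
            lab, children = it
            if lab == label and contains_label(children, label):
                return True
            if is_recursive(children, label):
                return True
    return False

# Example (1) from the exercise: two PPs side by side, neither inside the other.
flat = "The cat jumped [_PP onto the table ] [_PP without the slightest hesitation ]."
# Hypothetical illustration of sentence recursion (label S is an assumption).
nested = "[_S She said [_S it rained ] ]"

print(is_recursive(parse(flat), "PP"))
print(is_recursive(parse(nested), "S"))
```

The check looks only at containment, so it treats (1) as non-recursive no matter how many prepositional phrases follow one another.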

Exercise 1.5

Which, if any, of the sentences in (1)-(5) are ungrammatical? Which, if any, are semantically or otherwise anomalous? Briefly explain.

(1)	a.	They decided to go tomorrow yesterday.
	b.	They decided to go yesterday tomorrow.
(2)	a.	They decided yesterday to go tomorrow.
	b.	They decided tomorrow to go yesterday.
(3)	a.	Yesterday, they decided to go tomorrow.
	b.	Tomorrow, they decided to go yesterday.
(4)		They decided to go yesterday yesterday.
(5)		How long didn't Tom wait?

Exercise 1.6

A. The following expressions are structurally ambiguous. For each reading (= interpretation), provide a paraphrase that is itself unambiguous or a diagnostic scenario (that is, a scenario that is compatible only with the reading in question).

(1)	a.	chocolate cake icing
	b.	clever boys and girls
	c.	John will answer the question precisely at noon.
	d.	Watch the man from across the street.
	e.	They should decide if they will come tomorrow.

B. For each reading, provide as much of a labeled bracketing as you can, focusing on distinguishing between the readings. In other words, you may not be able to give a full labeled bracketing, but for each reading, determine which words go together more or less closely.
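
When deciding which words go together, it can help to see the full set of candidate groupings for a short string. The Python sketch below is an illustration under stated assumptions: it considers only binary groupings, it leaves out category labels, and the function names bracketings and show as well as the placeholder words w1, w2, w3 are hypothetical. It lists every way of grouping a sequence of words into nested pairs.

```python
def bracketings(words):
    """Yield every binary grouping of words as nested tuples."""
    if len(words) == 1:
        yield words[0]
        return
    # Split the sequence at every possible point and group each half.
    for i in range(1, len(words)):
        for left in bracketings(words[:i]):
            for right in bracketings(words[i:]):
                yield (left, right)

def show(tree):
    """Render a nested tuple as an unlabeled bracketing."""
    if isinstance(tree, str):
        return tree
    return "[ " + " ".join(show(t) for t in tree) + " ]"

for tree in bracketings(["w1", "w2", "w3"]):
    print(show(tree))
```

For three words there are exactly two such groupings, one in which the last two words go together and one in which the first two do. Which of the groupings corresponds to an actual reading of a given expression is a question about the language, and the sketch cannot answer it.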

Problem 1.3

The following instructions assume that you've downloaded the Trees program. Download the grammar tool "which grammar". Open the Trees program and, from within it, click on "File" (top menu, leftmost item). Use "Choose Grammar" (second block, first item) to navigate to the grammar tool and open it. Then select the "File" menu item "New". This calls up three windows: an upper left window with one-letter expressions, a lower left window that is blank, and a large workspace on the right that is also blank. Follow the instructions given in the grammar tool, which are spelled out in more detail here.

Select a grammar (G1 or G2) using "choose-grammar" (top menu, rightmost item). (Don't confuse this with the earlier step of selecting the grammar tool.)
Click on the null symbol in the upper left window. A copy will appear in the lower left window. Drag that copy into the blank workspace.
Click on any Roman letter, and drag the copy from the lower left window onto the null symbol in the workspace.
Build more complex structures by clicking on Roman letters, and dragging the copies onto the root of already existing structures in the workspace.

Once you are able to construct complex expressions, briefly answer the following questions. There is no need to submit the trees you construct.

What is the difference between G1 and G2?
If presented with substrings generated by G1 and G2 containing only Roman letters (i.e., assuming that the null symbol is invisible), is it possible to tell which grammar has generated the string?