2    From phrase structure rules to lexical projection

From the facts presented in Chapter 1 and Assignment 1, we can conclude that the interpretation of a noun phrase (and in some cases, even the grammaticality of the sentence containing it) depends on its structural relations to other noun phrases. But what is the origin of these structural relations, and, more generally, of syntactic structure itself? In this chapter, we discuss two approaches to this question that have been pursued in the history of generative grammar. The first, based on so-called phrase structure rules, characterized the field from its beginnings in the 1950s until roughly 1980. It had several conceptual shortcomings, however, which over time led to its replacement by a second approach, still current, according to which syntactic structure is projected from a language's lexicon. This second approach to the generation of phrase structure is therefore referred to as lexicalist. The particular formulation of the lexicalist approach presented here is based on the mathematical formalism of Tree-Adjoining Grammar (Joshi et al. 1975, Kroch and Joshi 1985).

Phrase structure rules

From the 1950s until roughly 1980, syntactic structure was thought to be generated by phrase structure rules like those in (1).

(1) a.   VP ---> V NP PP
b. VP ---> V NP NP
c. VP ---> V NP
d. VP ---> V

From a mathematical point of view, such rules form part of a so-called context-free grammar, and they therefore have—by definition—certain formal properties. Specifically, the lefthand side of a phrase structure rule consists of exactly one symbol, whereas the righthand side of a phrase structure rule may consist of one or more symbols. The rightward-pointing rewrite arrow that connects the lefthand and righthand sides of a phrase structure rule indicates that the lefthand side of the rule can be rewritten as (= replaced by) the symbols on the rule's righthand side. Because of this, phrase structure rules are also called rewrite rules.
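To make the rewriting procedure concrete, here is a minimal sketch in Python with two illustrative rules; the names RULES and rewrite_once are my own, not part of any standard implementation.

# A toy context-free grammar: each nonterminal on the lefthand side
# maps to one righthand side (a list of symbols).
RULES = {
    "VP": ["V", "NP"],
    "NP": ["Article", "N"],
}

def rewrite_once(symbols):
    """Rewrite the first rewritable symbol in the string, if any."""
    for i, symbol in enumerate(symbols):
        if symbol in RULES:
            return symbols[:i] + RULES[symbol] + symbols[i + 1:]
    return symbols  # only unrewritable symbols remain

string = ["VP"]
while (rewritten := rewrite_once(string)) != string:
    string = rewritten
    print(string)
# ['V', 'NP']
# ['V', 'Article', 'N']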

Translating phrase structure rules into trees

As they stand, phrase structure rules are devices that manipulate strings without regard for structure. For instance, given the two rules in (2), the symbol in (3a) can be successively rewritten as the strings in (3b) and (3c).

(2) a.   VP ---> V NP
b. NP ---> Article N
(3) a.   VP
b. V NP     (by (2a))
c. V Article N (by (2b))

But although the phrase structure rule for NP in (2b) implies that the article and the noun form a structural unit, nothing in the final result of the rewriting in (3c) reflects this. That is, (3c) contains no indication that 'Article' and 'N' belong together more closely than do 'V' and 'Article'. In order to retain the structural information inherent in the system of phrase structure rules, it is necessary to maintain a history of how each symbol is rewritten. This can be done by means of the tree-drawing algorithm in (4).

(4) a.   Let the symbol on the lefthand side of a phrase structure rule correspond to a node in a tree.
b. Let the symbols on the righthand side of a phrase structure rule correspond to a set of nodes in the same order as that of the symbols in the rule.
c. Let the rewrite arrow correspond to a domination relation between the symbols from the lefthand and righthand sides of the rule. This relation is conventionally represented by downward-pointing branches connecting the dominating node (the mother or parent) and the dominated nodes (the daughters or children). The downward-pointing direction of the branches is generally left implicit.

Using (4), the rules in (1) and (2b) can be translated into the trees in (5).

(5) a.–e. [tree diagrams corresponding to the rules in (1a–d) and (2b)]

In the mathematical literature, the structures that result from applying the algorithm in (4) are known as derivation trees because they represent the history of how a particular sequence of symbols is derived from an original symbol. For instance, (6) represents the derivation of (3c) from (3a).

(6) [derivation tree for (3): VP dominates V and NP, and NP dominates Article and N]

In linguistics, structures like (6) are simply referred to as trees.
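The algorithm in (4) is equally easy to sketch in code. The following illustrative Python fragment (the class and function names are my own) performs the rewriting of (3) while recording its history, so that the grouping of Article and N under NP is preserved:

RULES = {"VP": ["V", "NP"], "NP": ["Article", "N"]}

class Node:
    """A tree node: a label plus the daughters it dominates; cf. (4a)."""
    def __init__(self, label):
        self.label = label
        self.daughters = []

def derive(label):
    """Rewrite a symbol exhaustively, recording the history as a tree."""
    mother = Node(label)
    for symbol in RULES.get(label, []):          # (4b): one daughter per symbol
        mother.daughters.append(derive(symbol))  # (4c): branches = domination
    return mother

def show(node, depth=0):
    print("  " * depth + node.label)
    for daughter in node.daughters:
        show(daughter, depth + 1)

show(derive("VP"))   # prints the derivation tree in (6):
# VP
#   V
#   NP
#     Article
#     N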

Lexical insertion rules

The phrase structure rules in (1) and (2) contain only nonterminal symbols (= syntactic categories; roughly speaking, parts of speech). In order to generate phrases and sentences consisting of actual words, there must also be rules available whose righthand side contains terminal symbols, like those in (7).

(7) a.   V ---> put
b. V ---> tell
c. V ---> devour
d. V ---> waited
e. Article ---> the

In contrast to nonterminals, terminal symbols cannot appear on the lefthand side of a phrase structure rule. This is what gives them their name; since they cannot be rewritten, they terminate the particular bit of the derivation in which they appear.

Because we can think of the terminal symbols in (7)—the lexical items—as being inserted into structures like those in (5), rules like those in (7) are called lexical insertion rules. Lexical insertion rules are like ordinary phrase structure rules in that the lefthand side in both must be a single nonterminal symbol. However, the righthand side of a lexical insertion rule is constrained to be a single terminal symbol. The result of inserting the lexical items in (7) into the structures in (5) is shown in (8).

(8) a.–e. [the trees in (5) with the lexical items from (7) inserted]

Subcategorization frames

Assuming appropriate phrase structure rules and lexical insertion rules, the approach just outlined makes it possible to build structures for entire sentences, such as those in (9).

(9) a.   They put the book on the shelf.
b. I will tell you the answer.
c. The lion will devour the wildebeest.
d. They waited.

But this approach also has a serious problem that was noted early on—it incorrectly allows structures to be generated for ungrammatical sentences like those in (10).

(10) a. * I will put you the answer.
b. * They devoured.
c. * They waited the book on the shelf.

In the terms of traditional grammar, the difficulty is that a purely rule-based approach to syntactic structure fails to distinguish among various subcategories of verbs, such as intransitive, transitive, and ditransitive verbs.

Generative grammarians therefore proposed to incorporate the relevant information into each verb's lexical entry. The idea is that each lexical item has an entry in our mental grammar, comparable to a conventional dictionary entry (though much more detailed). Included in the lexical entry are (among other things) the lexical item's pronunciation, its meaning, its syntactic category, and, crucially for present purposes, the syntactic environments in which it can occur. When represented as in (11), these environments are known as subcategorization frames. The blank line represents the position of the verb, and any remaining syntactic categories represent information about the verb's syntactic environment that is obligatory, but not predictable. For instance, since every English verb requires a subject, the presence of a subject is predictable, and so subjects are omitted in (11). As the examples in (11b,e) show, verbs (and lexical items more generally) may be associated with more than one subcategorization frame.

(11) a.   put ___ NP PP
b.   tell ___ NP NP, ___ NP PP
c.   devour ___ NP
d.   waited ___
e.   eat ___, ___ NP

Subcategorization frames are used as follows. At the point of lexical insertion, a verb's subcategorization frame is checked against the syntactic environment that the verb is being inserted into. If the environment matches the frame, lexical insertion goes forward, but if not, lexical insertion fails. Thus, the sentences in (9) are generated, but those in (10) are not.
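For concreteness, the checking step can be sketched as follows, under the simplifying assumption that a frame is just the sequence of category labels that follows the verb. The dictionary FRAMES transcribes (11); the function name is illustrative.

# Subcategorization frames from (11): for each verb, the category
# sequences it tolerates to its right (the blank line is left implicit).
FRAMES = {
    "put":    [["NP", "PP"]],
    "tell":   [["NP", "NP"], ["NP", "PP"]],
    "devour": [["NP"]],
    "waited": [[]],
    "eat":    [[], ["NP"]],
}

def can_insert(verb, environment):
    """Lexical insertion succeeds only if some frame matches the environment."""
    return environment in FRAMES[verb]

print(can_insert("put", ["NP", "PP"]))     # True:  (9a)
print(can_insert("put", ["NP", "NP"]))     # False: (10a)
print(can_insert("devour", []))            # False: (10b)
print(can_insert("waited", ["NP", "PP"]))  # False: (10c)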

Two conceptual difficulties

Although subcategorization frames solve the problem posed by sentences like those in (10), phrase structure rules, along with subcategorization frames, were eventually given up for two reasons.

First, nothing in the formal constraints on phrase structure rules prohibits 'crazy' rules like those in (12).

(12) a.   NP ---> V Adj
b. VP ---> Adj

But structures corresponding to such rules are simply not found in the world's languages. Rather, a phrase of a particular type (say, a verb phrase) contains a lexical item of that same type (in this case, a verb). In other words, as we have already been assuming, phrases have a grammatical core. In traditional grammar, this core is called the head—a term that has been adopted by generative grammarians.

Second, the architecture of a grammar that is based on phrase structure rules and subcategorization frames requires the information in each subcategorization frame to duplicate information in some phrase structure rule. Such redundancy violates (13), a fundamental methodological principle of parsimony in the sciences and elsewhere, which is commonly known as Occam's razor after the scholastic philosopher William of Occam.

(13)     Entia non sunt multiplicanda praeter necessitatem. (Latin)
entities not are to be multiplied beyond necessity
'Don't assume unnecessary entities.'

Much of the history of generative grammar has been driven by the assumption that redundancy in the theory indicates a failure of insight, and that more insight will be achieved by searching for ways to eliminate the redundancy.

Projection from the lexicon

Beginning in the 1970s and on through the 1980s, a new way of generating phrase structure was developed to eliminate the drawbacks of phrase structure rules. Since the grammar of a language must include the information in lexical entries no matter what, syntactic structure can be thought of as originating in the lexicon itself. In this view, each lexical item can be thought of as a small piece of syntactic structure—a syntactic atom, as it were. We will refer to these syntactic atoms as elementary trees. The derivation of the phrases and sentences of a language is then the process of composing the elementary trees with each other in a well-defined way.

Syntactic category

The question that we focus on for most of the rest of this chapter is what information needs to be included in the elementary trees. A first important piece of information is clearly a lexical item's syntactic category. This can easily be represented as the lexical item's mother, as shown in (14).

(14) a.–d. [lexical items with their syntactic categories as their mothers]

Phrasal projection

A second piece of information that needs to be represented is the fact that lexical items serve as heads of phrases. Again, we can represent this by having an appropriately labeled node dominate the lexical item. We can think of the representation as being generated in two steps: first, a phrasal node of as yet undetermined category is added to the elementary trees in (14), and then the lower syntactic category's type percolates up the elementary tree (to use a widespread metaphor) to determine the type of the entire phrase.

(15) a.–d. [the trees in (14) with a phrasal node of as yet undetermined category added on top]

(16) a.–d. [the trees in (15) with the head's category percolated up to the phrasal node]
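The two steps in (15) and (16) can be mirrored in a small sketch; the Node class and project function are illustrative names, and the '?P' label stands in for the as yet undetermined phrasal category.

class Node:
    """Minimal tree node (defined here so the sketch is self-contained)."""
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = list(daughters)

def project(word, category):
    """Project a lexical item up to its phrasal node, as in (14)-(16)."""
    lexical = Node(category, [Node(word)])  # (14): the category is the mother
    phrase = Node("?P", [lexical])          # (15): phrasal node, type unknown
    phrase.label = category + "P"           # (16): the category percolates up
    return phrase

vp = project("devour", "V")
print(vp.label, "over", vp.daughters[0].label)  # VP over V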

Syntactic dependents

The structures in (16), consisting of a lexical item, its syntactic category, and a phrase corresponding to the category, are an elementary tree's irreducible core—its spine. In many cases, an elementary tree also contains further slots for various other syntactic dependents. This permits the straightforward representation of the various subcategories of verbs, the fact that prepositions take objects, and so on. Elementary trees for prepositions and for ditransitive, transitive, and intransitive verbs are illustrated in (17).

(17) a.–d. [elementary trees for a preposition and for ditransitive, transitive, and intransitive verbs, with empty slots for their complements]

As is evident from the identity of (17b–d) with (8b–d), the difference between the rule-based and the projection-based approaches to phrase structure is not empirical, but conceptual. The structures in (8) are generated 'top down' by phrase structure rules, whereas those in (17) are generated 'bottom up' by projection from a lexical item.

Notice that the slots in the elementary trees in (17), apart from the anchoring lexical item itself, are empty. This is because each elementary tree is intended to provide exactly the information that is characteristic of the lexical item anchoring it—no less, but also no more. For instance, a verb like devour requires an object, but not a particular one; many different phrases will fill the bill, as long as they are noun phrases.

Multiple elementary trees per lexical item

A final issue concerns verbs such as eat, which can be used either with or without an explicit object. In the rule-based approach to generating syntactic structure, such verbs are associated with multiple subcategorization frames; recall (11b,e). In the projection-based approach, the notational counterpart to multiple subcategorization frames is to associate a single lexical item with more than one elementary tree, as shown in (18).

(18) a.–b. [the transitive and intransitive elementary trees for eat]

As we will see in a moment, the structures in (17) and (18) are oversimplified, but for now, they illustrate how it is possible to reconstruct trees like those in (8) without incurring the redundancy inherent in a system of phrase structure rules, lexical insertion rules, and subcategorization frames.

Intermediate projections

Empirical motivation

As stated earlier, an elementary tree is intended to specify exactly the information that is characteristic of the lexical item that anchors it. In contrast to subcategorization frames, elementary trees can include predictable information. It is therefore striking that a critical piece of information is still missing from the elementary trees in (17b–d)—namely, a slot for the verb's subject. The simplest way to add such a slot would be to add a leftmost daughter to the VP nodes in (17b–d), yielding so-called flat structures as in (19).
Note: For expository simplicity, the focus in what follows is on verbs. The discussion is extended to other types of heads in Chapters 3 and 4.

(19) a.–c. [flat structures for ditransitive, transitive, and intransitive verbs: the subject NP, V, and any objects are all daughters of VP]

Notice that in such flat structures, the subject and any objects of the verb c-command each other. However, we concluded on the basis of the distribution of reflexive pronouns (recall Assignment 1, Exercise 2) that subjects asymmetrically c-command objects. That is, subjects c-command objects, but objects don't c-command subjects. Since the representations in (19), at least those in (19a,b), fail to represent this fact, they must be rejected.

What is necessary is a node that groups the verb together with any objects, but that excludes the subject, as in (20). The required additional node is standardly called V' (read as 'V bar').

(20) a.–c. [structures in which a V' node groups V and any objects together, with the subject NP remaining a daughter of VP]

It is worth noting that we have no evidence as yet for V' in elementary trees for intransitive verbs like waited. For the moment, we will assume its existence on conceptual grounds—namely, on the grounds that including it makes the elementary tree in (20c) analogous to those in (20a,b). In Assignment 2, you will be asked to formulate an empirical (= data-based) argument for including V' in the elementary trees of intransitive verbs.

Notice further that it is now V' and not the node labeled VP that corresponds to the verb phrase of traditional grammar. Although the concepts and terms of generative grammar are often borrowed from traditional grammar and begin by coinciding with them, their further development can diverge. Such conceptual and terminological innovation is par for the course in the history of any science or technical discipline.

Terminology

For expository convenience, we now review and introduce some useful terminology to describe hierarchically structured elementary trees as in (20). We say that the lexical item projects the syntactic structure in the elementary tree. The lexical item's syntactic category, the head of the projected structure—here, V—is also called the lexical projection. V' is the intermediate projection, and VP is the maximal projection or phrasal projection. We extend the notion of spine introduced earlier to include all three of these projections. The three projections are said to have distinct bar levels, as summarized in (21).

(21)     Projection         Bar level   Example
         Lexical            0           V
         Intermediate       1           V'
         Maximal, phrasal   2           VP

The sister of the intermediate projection is called the specifier; it is the slot for subjects of sentences (see Chapter 6) and possessives in noun phrases. Each elementary tree is limited to a single specifier. In English, the specifier of VP is the intermediate projection's left sister, but in a VOS language like Malagasy (recall Assignment 1, Exercise 3), it is the right sister. Following traditional terminology, any sisters of the head are called complements. Taken together, the specifier and any complements are often referred to as a head's arguments.
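For concreteness, the three projections in (21), together with a single specifier slot and any complement slots, can be sketched as below. This is an illustrative simplification; in particular, placing the specifier as the left sister of the intermediate projection encodes the English order, and the names are my own.

class Node:
    """Minimal tree node (repeated so the sketch is self-contained)."""
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = list(daughters)

def elementary_tree(word, category, complements, specifier="NP"):
    """Project the three bar levels of (21) around a lexical anchor."""
    head = Node(category, [Node(word)])                     # bar level 0: V
    bar = Node(category + "'",                              # bar level 1: V'
               [head] + [Node(c) for c in complements])
    return Node(category + "P", [Node(specifier), bar])     # bar level 2: VP

transitive = elementary_tree("devour", "V", ["NP"])         # as in (20b)
intransitive = elementary_tree("waited", "V", [])           # as in (20c)
print(transitive.label, [d.label for d in transitive.daughters])  # VP ['NP', "V'"]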

Complements versus adjuncts

Do so substitution

Consider (22), where the direct object five apples and the temporal expression before lunch are both construed with the verb.

(22)     They eat five apples before lunch.

As mentioned earlier, certain verbs—among them, eat—are associated with more than one elementary tree. The grammaticality of (22) might therefore be taken to indicate that eat is associated with a ditransitive elementary tree as in (23a), in addition to the transitive and intransitive elementary trees in (23b,c).

(23) a.–c. [elementary trees for eat: a putative ditransitive tree with NP and PP complements, the transitive tree, and the intransitive tree]

Assuming (23a), (22) would then have the structure in (24). (For clarity, the internal structure of NPs and PPs in this and following sentence structures is omitted.)

(24) [structure for (22) based on (23a): V, NP, and PP are all sisters under a single V']

However, there turns out to be compelling evidence against the representations in (23a) and (24). This evidence comes from a syntactic phenomenon by the name of do so substitution, which is illustrated in (25).

(25) a.   They eat five apples before lunch, and we do so, too.
b.   They eat five apples before lunch, and we do so before dinner.

As indicated by the italicization, do so substitutes for the sequence eat five apples before lunch in (25a), but only for the sequence eat five apples in (25b). Let us now make the assumption in (26), which has been standard in syntactic theory since before the advent of generative grammar.

(26)     Substitution is possible only if the sequence of words being substituted for is a syntactic constituent (= unit of syntactic structure).

In trees, constituents are represented as nodes that exhaustively dominate the sequence in question.

(27)     A node exhaustively dominates a sequence of symbols iff it dominates all and only the symbols in the sequence.

For instance, A dominates the sequence B C in (28a–c), but exhaustively dominates it only in (28a). A fails to exhaustively dominate B C in (28b,c), because it dominates too much material. A also fails to exhaustively dominate B C in (28d), because it dominates too little material.

(28) a.–d. [trees illustrating (exhaustive) domination of the sequence B C: in a, A dominates exactly B C; in b and c, A dominates B C plus additional material; in d, A dominates only part of the sequence]

As is evident, domination is a necessary but not a sufficient condition for exhaustive domination.
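The definition in (27) lends itself to a procedural restatement. In the illustrative sketch below, a node's yield is the sequence of symbols at the leaves it dominates, and exhaustive domination is equality between that yield and the target sequence.

class Node:
    """Minimal tree node (repeated so the sketch is self-contained)."""
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = list(daughters)

def yield_of(node):
    """The sequence of leaf symbols that a node dominates, left to right."""
    if not node.daughters:
        return [node.label]
    leaves = []
    for daughter in node.daughters:
        leaves.extend(yield_of(daughter))
    return leaves

def exhaustively_dominates(node, sequence):
    """True iff node dominates all and only the symbols in sequence; cf. (27)."""
    return yield_of(node) == list(sequence)

# (28a): A dominates exactly the sequence B C.
print(exhaustively_dominates(Node("A", [Node("B"), Node("C")]), ["B", "C"]))             # True
# In the spirit of (28b,c): A dominates additional material beyond B C.
print(exhaustively_dominates(Node("A", [Node("B"), Node("C"), Node("D")]), ["B", "C"]))  # False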

The assumption in (26) raises the question of whether the representation in (24) is consistent with the do so substitution facts in (25). The sequence being substituted for in (25a) (eat five apples before lunch) is exhaustively dominated by V', and so the representation is indeed consistent with (25a). But (24) is not consistent with the grammaticality of do so substitution in (25b), because it contains no node that exhaustively dominates the sequence eat five apples; cf. the discussion of (28b). Since there is no reason to believe that the structure of the verb phrases in (25) differs according to which sequence do so substitutes for, the structure in (24) must be rejected, along with the elementary tree in (23a) that underlies it.

More branching structure

But what then is the correct syntactic structure for the verb phrase eat five apples before lunch? Given the assumption in (26) and the do so substitution facts, it must be the case that eat five apples before lunch and eat five apples are both constituents (= sequences exhaustively dominated by a single node). The tree in (29) has the proper shape.

(29) [tree for the verb phrase: a lower V' exhaustively dominates eat five apples, and a higher V' dominates the lower V' together with the PP before lunch]

In (29), the NP five apples is a sister of the head V, whereas the PP before lunch and the head V are more distantly related. Given our earlier definition of complements as sisters of heads, only the direct object is a complement. But then what relation does before lunch bear to the head V, being neither a complement nor a specifier? Again adopting a term from traditional grammar, we refer to such constituents as modifiers or adjuncts.

Observe that in (29), one instance of V' dominates another. We say that V' in (29) is a recursive category, and that (29) itself is a recursive structure. We return to the topic of recursion in Chapter 3.

A typology of syntactic dependents

The dependent elements that we have been discussing—complements, specifiers, and adjuncts—all stand in distinct structural relations to the head (and to the spine more generally). Both complements and adjuncts are daughters of intermediate projections, but they differ in that complements are sisters of heads, whereas adjuncts are sisters of the next higher projection level, intermediate projections. As sisters of intermediate projections, adjuncts resemble specifiers. But again, the two relations are distinct because adjuncts are daughters of intermediate projections, whereas specifiers are daughters of maximal projections. These structural relations and distinctions are summarized in (30).

(30)     Relation to head   Sister of …               Daughter of …
         Complement         Head                      Intermediate projection
         Adjunct            Intermediate projection   Intermediate projection
         Specifier          Intermediate projection   Maximal projection
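The distinctions in (30) amount to a small decision procedure. The sketch below states it for verbal projections; the function name and labels are illustrative.

def classify(sister, mother):
    """Classify a dependent of V by its sister and its mother; cf. (30)."""
    if sister == "V" and mother == "V'":
        return "complement"   # sister of the head
    if sister == "V'" and mother == "V'":
        return "adjunct"      # sister and daughter of intermediate projections
    if sister == "V'" and mother == "VP":
        return "specifier"    # daughter of the maximal projection
    return "not a dependent of V"

print(classify("V", "V'"))    # complement
print(classify("V'", "V'"))   # adjunct
print(classify("V'", "VP"))   # specifier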

Substitution and adjunction

Both traditional and generative grammar maintain that the grammatical relation between heads and their arguments is closer than that between heads and adjuncts. The idea is that adjuncts are not required by a particular lexical item; rather, they are optional specifications. Now recall that elementary trees are intended to provide exactly the information that is characteristic of the lexical item anchoring them—no less, but also no more. Elementary trees therefore cannot include slots for adjuncts. But if the structure in (31) is not a legal elementary tree, then how is it possible to build the structure in (29)?

(31) * [an elementary tree for eat containing a built-in slot for the adjunct PP; not a legal elementary tree]

Let us work up to the answer to this question by first considering the adjunctless sentence They eat five apples. It is clear that we can derive the structure for this sentence by substituting subtrees for they and five apples at the specifier and the complement nodes of the transitive elementary tree for eat in (23b), repeated here as (32a). This process of substitution yields (32b); as before, the internal structure of syntactic dependents is omitted for clarity.

(32) a.–b. [a: the transitive elementary tree for eat; b: the result of substituting they and five apples at its specifier and complement nodes]
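Substitution is easy to state procedurally: an empty slot is filled by a subtree of the same category. The fragment below is an illustrative sketch, not the official Tree-Adjoining Grammar machinery.

class Node:
    """Minimal tree node (repeated so the sketch is self-contained)."""
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = list(daughters)

def substitute(slot, subtree):
    """Fill an empty slot with a subtree of the same category."""
    assert not slot.daughters and slot.label == subtree.label
    slot.daughters = subtree.daughters

# The transitive elementary tree for eat, as in (32a) ...
spec, comp = Node("NP"), Node("NP")
tree = Node("VP", [spec, Node("V'", [Node("V", [Node("eat")]), comp])])

# ... and the substitutions that yield (32b).
substitute(spec, Node("NP", [Node("they")]))
substitute(comp, Node("NP", [Node("five apples")]))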

Of course, the question still remains of how adjuncts enter into syntactic structure. The answer is that they are integrated by a separate process called adjunction. Adjunction is a two-step process that targets a particular node. For the modifier of a head, the target of adjunction is that head's intermediate projection, as indicated by the box in (33a). The first step in adjunction is to make a copy of the target of adjunction right above the original node, as in (33b). The second step is to attach the tree for the adjunct phrase as a daughter of the newly created node, as in (33c).

(33) a.–c. [adjunction of before lunch at the V' of eat five apples: a. select the target of adjunction (the boxed V'); b. Step 1: clone the target of adjunction; c. Step 2: attach the adjunct as a daughter of the clone]

Compare (33c) with (29), and you will see that the two structures are identical—precisely the desired result.

It is important to attach the adjunct as a daughter of the higher copy of the intermediate projection. Attaching the adjunct as a daughter of the lower copy would result in a structure in which the attached constituent is incorrectly represented as a complement. Moreover, the higher copy would serve no useful function, thereby violating Occam's razor.
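The two-step process in (33) can be sketched along the same lines. In the illustrative fragment below, the clone is created directly above the target (Step 1) and the adjunct is attached as the clone's rightmost daughter (Step 2), so that the adjunct ends up on the higher copy, as just discussed.

class Node:
    """Minimal tree node (repeated so the sketch is self-contained)."""
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = list(daughters)

def adjoin(mother, target, adjunct):
    """Adjoin at target, a daughter of mother; cf. (33)."""
    clone = Node(target.label, [target, adjunct])  # Steps 1 and 2
    mother.daughters[mother.daughters.index(target)] = clone
    return clone

# Adjoin before lunch at the V' of eat five apples, yielding (33c).
v_bar = Node("V'", [Node("V", [Node("eat")]),
                    Node("NP", [Node("five apples")])])
vp = Node("VP", [Node("NP", [Node("they")]), v_bar])
adjoin(vp, v_bar, Node("PP", [Node("before lunch")]))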

(33a) contains only one instance of V' to serve as target of adjunction. In (33c), on the other hand, there are two V' nodes, and the question immediately arises whether either of them can serve as target of adjunction for a second adjunct. Selecting the lower V' as the target of adjunction is illustrated in (34).

(34) a.–c. [a second adjunction targeting the lower V' of (33c): a. target of adjunction: lower V'; b. Step 1: clone the lower V'; c. Step 2: attach the adjunct as a daughter of the clone]

(35) illustrates the alternative choice of the higher V' as target of adjunction.

(35) a.–c. [a second adjunction targeting the higher V' of (33c): a. target of adjunction: higher V'; b. Step 1: clone the higher V'; c. Step 2: attach the adjunct as a daughter of the clone]

Since the sentences that result in (34c) and (35c) are both grammatical, we conclude that any intermediate projection can serve as target of adjunction.

As expected given this conclusion, even more complex adjunction structures are possible, as illustrated in (36).

(36) a.–b. [structures with three and four adjuncts, respectively; cf. the bracketings in (47d,e)]

More on the distinction between complements and adjuncts

The most reliable guide to distinguishing between complements and adjuncts of verbs is to perform do so substitution, as just discussed, but this step can often be dispensed with. Bare noun phrase objects (that is, ones not introduced by a preposition) are always complements. Phrases specifying reason, time, and location are always adjuncts, even if the phrase is a bare noun phrase. Some examples are given in (37); the adjuncts in the first conjunct are underlined.

(37) a.   They waited for no good reason, but we did so for a very good one.
b. They waited (for) a week, but we did so (for) only a day.
c. They waited in the parking lot, but we did so across the street.

Almost all phrases specifying manner are also adjuncts. But in at least one exceptional case, namely that of the verb word, speakers report varying judgments concerning sentences like (38b), as indicated by the percent sign.

(38) a.   They worded the answer carelessly, and we did so, too.
b. % They worded the answer carelessly, but we did so carefully.

For speakers who accept (38b), the manner phrase is an adjunct, whereas for those who reject it, it is a complement. Depending on one's judgments, then, the verb word is associated in one's mental grammar with the elementary tree in (39a) or in (39b).

(39) a.–b. [elementary trees for word: a. without a manner slot, for speakers who accept (38b); b. with a manner-phrase complement, for speakers who reject (38b)]

The status of syntactic dependents not covered above is often not immediately clear. In such cases, it is necessary to appeal to the results of do so substitution, bearing in mind that it is not unusual for speakers to disagree about these results, as in the case of (38b).

A final word should be said about the correlation between a syntactic dependent's obligatory or optional character and its status as a complement or adjunct. It is tempting to posit the biconditional relationship in (40).

(40) a.   If a syntactic dependent is obligatory, it is a complement.        TRUE
b.   If a syntactic dependent is a complement, it is obligatory. FALSE

But as the annotation indicates, the biconditional in (40) cannot be maintained. It is true that obligatory syntactic dependents are complements. For instance, the contrast in (41) allows us to conclude that the noun phrase following devour is a complement, a conclusion that is borne out by do so substitution in (42).

(41)     It's amazing; every time I see him, …
a. * … he's devouring.
b. … he's devouring a six-inch steak.
(42) a.   He devoured a hamburger and french fries, and I did so, too.
b. * He devoured a hamburger and french fries, and I did so six samosas.

However, not all complements are obligatory, as we know from the existence of lexical items associated with more than one elementary tree, like eat. The relevant do so substitution facts are given in (43) and (44). Note particularly the contrast between (44a) and (44c).

(43)     It's amazing; every time I see him, …
a. … he's eating.
b. … he's eating a six-inch steak.
(44) a.   He ate, and I did so, too.
b.   He ate a hamburger and french fries, and I did so, too.
c. * He ate a hamburger and french fries, and I did so six samosas.

It is, however, possible to take the contrapositive of (40a), a valid inference of the propositional calculus, to deduce (45).

(45)     If a syntactic dependent is not a complement, it is not obligatory.

The two valid generalizations in (40a) and (45) can be expressed more succinctly as in (46).

(46) a.   Obligatory syntactic dependents are complements.
b.   Adjuncts are optional.

Labeled bracketing

We have been representing syntactic structure by means of tree diagrams, but it is often convenient to use bracketed structures (also known as labeled bracketings) instead. In a tree diagram, domination is represented by arranging the dominating node above the material that is dominated. In a bracketed structure, the dominated material is instead enclosed in square brackets that are annotated with the label of the dominating node. By convention, the label is placed on the inside of the left bracket.

The bracketed structures corresponding to (32b), (33c), (34c), and (36a,b) are given in (47).

(47) a.  [VP [NP they ] [V' [V eat ] [NP five apples ] ] ]
     b.  [VP [NP they ] [V' [V' [V eat ] [NP five apples ] ] [PP before lunch ] ] ]
     c.  [VP [NP they ] [V' [V' [V' [V eat ] [NP five apples ] ] [PP with their friends ] ] [PP before lunch ] ] ]
     d.  [VP [NP they ] [V' [V' [V' [V' [V eat ] [NP five apples ] ] [PP before lunch ] ] [PP with their friends ] ] [PP for fun ] ] ]
     e.  [VP [NP they ] [V' [V' [V' [V' [V' [V eat ] [NP five apples ] ] [PP before lunch ] ] [PP for fun ] ] [PP under the table ] ] [PP with their friends ] ] ]

It is important to realize that tree diagrams and bracketed structures are completely equivalent ways of representing the same information. Notice, for instance, that information concerning the internal structure of the NPs and PPs is missing in (47), just as it is in the corresponding tree diagrams.
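This equivalence is straightforward to demonstrate procedurally. The illustrative sketch below renders a tree as a labeled bracketing, placing each label on the inside of its left bracket as just described.

class Node:
    """Minimal tree node (repeated so the sketch is self-contained)."""
    def __init__(self, label, daughters=()):
        self.label = label
        self.daughters = list(daughters)

def bracket(node):
    """Render a tree diagram as an equivalent labeled bracketing."""
    if not node.daughters:
        return node.label  # a terminal: just the word itself
    inner = " ".join(bracket(d) for d in node.daughters)
    return "[" + node.label + " " + inner + " ]"

# The tree in (32b), rendered as the labeled bracketing in (47a).
tree = Node("VP", [Node("NP", [Node("they")]),
                   Node("V'", [Node("V", [Node("eat")]),
                               Node("NP", [Node("five apples")])])])
print(bracket(tree))
# [VP [NP they ] [V' [V eat ] [NP five apples ] ] ]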

Although tree diagrams are easier for humans to process, bracketed structures are often desirable for various reasons. Publishers prefer labeled bracketings because they are cheaper to typeset. In large syntactically annotated corpora such as the Penn Treebank or the Penn-Helsinki Parsed Corpus of Middle English, syntactic structure is also represented by means of labeled bracketing because the corpora can then be stored as ASCII files, which can be quickly and conveniently searched using dedicated string manipulation languages such as Perl.