From the facts presented in Chapter 1 and Assignment 1, we can conclude that the interpretation of a noun phrase (and in some cases, even the grammaticality of the sentence containing it) depends on its structural relations to other noun phrases. But what is the origin of these structural relations, and, more generally, of syntactic structure itself? In this chapter, we discuss two approaches to this question that have been pursued in the history of generative grammar. The first, based on so-called phrase structure rules, characterized the field from its beginnings in the 1950s until roughly 1980. It had several conceptual shortcomings, however, which over time led to its replacement by a second approach, still current, according to which syntactic structure is projected from a language's lexicon. This second approach to the generation of phrase structure is therefore referred to as lexicalist. The particular formulation of the lexicalist approach presented here is based on the mathematical formalism of Tree-Adjoining Grammar (Joshi et al. 1975, Kroch and Joshi 1985).
(1)  a.  VP ---> V NP PP
     b.  VP ---> V NP NP
From a mathematical point of view, such rules form part of a so-called context-free grammar, and they therefore have, by definition, certain formal properties. Specifically, the lefthand side of a phrase structure rule consists of exactly one symbol, whereas the righthand side of a phrase structure rule may consist of one or more symbols. The rightward-pointing rewrite arrow that connects the lefthand and righthand sides of a phrase structure rule indicates that the lefthand side of the rule can be rewritten as (= replaced by) the symbols on the rule's righthand side. Because of this, phrase structure rules are also called rewrite rules.
As they stand, phrase structure rules are devices that manipulate strings without regard for structure. For instance, given the two rules in (2), the symbol in (3a) can be successively rewritten as the strings in (3b) and (3c).
(3)  a.  VP
     b.  V NP           (by (2a))
     c.  V Article N    (by (2b))
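The successive rewriting in (3) can be sketched in a few lines of code. This is a minimal illustration, not part of the theory: the dictionary encoding of the rules in (2) and the `rewrite_once` helper are assumptions made purely for expository purposes.

```python
# The rules in (2), encoded as a mapping from a nonterminal
# to the sequence of symbols it rewrites as.
RULES = {
    "VP": ["V", "NP"],        # (2a) VP ---> V NP
    "NP": ["Article", "N"],   # (2b) NP ---> Article N
}

def rewrite_once(symbols, rules):
    """Rewrite the leftmost symbol that has a rule; return a new list."""
    for i, sym in enumerate(symbols):
        if sym in rules:
            return symbols[:i] + rules[sym] + symbols[i + 1:]
    return symbols  # nothing left to rewrite

# Start from the symbol in (3a) and rewrite until no rule applies.
derivation = [["VP"]]
while True:
    nxt = rewrite_once(derivation[-1], RULES)
    if nxt == derivation[-1]:
        break
    derivation.append(nxt)

for step in derivation:
    print(" ".join(step))
```

Running this prints the three strings in (3a-c) in order. Note that, exactly as the text goes on to observe, the final string retains no record of which symbols were rewritten together.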
But although the phrase structure rule for NP in (2b) implies that the article and the noun form a structural unit, nothing in the final result of the rewriting in (3c) reflects this. That is, (3c) contains no indication that 'Article' and 'N' belong together more closely than do 'V' and 'Article'. In order to retain the structural information inherent in the system of phrase structure rules, it is necessary to maintain a history of how each symbol is rewritten. This can be done by means of the tree-drawing algorithm in (4).
(4)  a.  Let the symbol on the lefthand side of a phrase structure rule correspond to a node in a tree.
     b.  Let the symbols on the righthand side of a phrase structure rule correspond to a set of nodes in the same order as that of the symbols in the rule.
     c.  Let the rewrite arrow correspond to a domination relation between the symbols from the lefthand and righthand sides of the rule. This relation is conventionally represented by downward-pointing branches connecting the dominating node (the mother or parent) and the dominated nodes (the daughters or children). The downward-pointing direction of the branches is generally left implicit.
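The algorithm in (4) can be sketched as follows. The `Node` class and the helper functions are illustrative assumptions about how one might encode trees; the point is simply that each rewriting step now records mother-daughter branches instead of discarding them.

```python
class Node:
    """A tree node: a label plus an ordered list of daughters (4a-b)."""
    def __init__(self, label):
        self.label = label
        self.daughters = []

def expand(node, rules):
    """(4c): rewriting a node attaches the result as its daughters."""
    if node.label in rules:
        node.daughters = [Node(sym) for sym in rules[node.label]]
        for d in node.daughters:
            expand(d, rules)

def frontier(node):
    """Read off the terminal yield of the tree, left to right."""
    if not node.daughters:
        return [node.label]
    return [leaf for d in node.daughters for leaf in frontier(d)]

RULES = {"VP": ["V", "NP"], "NP": ["Article", "N"]}

root = Node("VP")
expand(root, RULES)

# The frontier is the string in (3c), but the structure is retained:
# the fact that Article and N share the mother NP is now recoverable.
print(frontier(root))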
Using (4), the rules in (1) and (2b) can be translated into the trees in (5).
In the mathematical literature, the structures that result from applying the algorithm in (4) are known as derivation trees because they represent the history of how a particular sequence of symbols is derived from an original symbol. For instance, (6) represents the derivation of (3c) from (3a).
In linguistics, structures like (6) are simply referred to as trees.
The phrase structure rules in (1) and (2) contain only nonterminal symbols (= syntactic categories; roughly speaking, parts of speech). In order to generate phrases and sentences consisting of actual words, there must also be rules available whose righthand side contains terminal symbols, like those in (7).
In contrast to nonterminals, terminal symbols cannot appear on the lefthand side of a phrase structure rule. This is what gives them their name; since they cannot be rewritten, they terminate the particular bit of the derivation in which they appear.
Because we can think of the terminal symbols in (7), the lexical items, as being inserted into structures like those in (5), rules like those in (7) are called lexical insertion rules. Lexical insertion rules are like ordinary phrase structure rules in that the lefthand side in both must be a single nonterminal symbol. However, the righthand side of a lexical insertion rule is constrained to be a single terminal symbol. The result of inserting the lexical items in (7) into the structures in (5) is shown in (8).
(9)  a.  They put the book on the shelf.
     b.  I will tell you the answer.
     c.  The lion will devour the wildebeest.
But this approach also has a serious problem that was noted early on: it incorrectly allows structures to be generated for ungrammatical sentences like those in (10).
(10) a.  *I will put you the answer.
     c.  *They waited the book on the shelf.
In the terms of traditional grammar, the difficulty is that a purely rule-based approach to syntactic structure fails to distinguish among various subcategories of verbs, such as intransitive, transitive, and ditransitive verbs.
Generative grammarians therefore proposed to incorporate the relevant information into each verb's lexical entry. The idea is that each lexical item has an entry in our mental grammar, comparable to a conventional dictionary entry (though much more detailed). Included in the lexical entry are (among other things) the lexical item's pronunciation, its meaning, its syntactic category, and, crucially for present purposes, the syntactic environments in which it can occur. When represented as in (11), these environments are known as subcategorization frames. The blank line represents the position of the verb, and any remaining syntactic categories represent information about the verb's syntactic environment that is obligatory, but not predictable. For instance, since every English verb requires a subject, subjects are omitted in (11). As the examples of (11b,e) show, verbs (and lexical items more generally) may be associated with more than one subcategorization frame.
(11) a.  put    ___ NP PP
     b.  tell   ___ NP NP, ___ NP PP
     e.  eat    ___, ___ NP
Subcategorization frames are used as follows. At the point of lexical insertion, a verb's subcategorization frame is checked against the syntactic environment that the verb is being inserted into. If the environment matches the frame, lexical insertion goes forward, but if not, lexical insertion fails. Thus, the sentences in (9) are generated, but those in (10) are not.
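The checking procedure just described can be sketched computationally. The encoding of frames as tuples of category labels, and the `can_insert` helper, are illustrative assumptions; only the frames themselves come from (11).

```python
# Subcategorization frames from (11), with each frame encoded as the
# tuple of categories following the verb's position (the blank line).
SUBCAT = {
    "put":    [("NP", "PP")],
    "tell":   [("NP", "NP"), ("NP", "PP")],
    "eat":    [(), ("NP",)],   # (11e): intransitive or transitive
    "wait":   [()],
    "devour": [("NP",)],
}

def can_insert(verb, environment):
    """Lexical insertion succeeds iff the environment matches some frame."""
    return tuple(environment) in SUBCAT[verb]

assert can_insert("put", ["NP", "PP"])       # (9a)  They put the book on the shelf.
assert not can_insert("put", ["NP", "NP"])   # (10a) *I will put you the answer.
assert not can_insert("wait", ["NP", "PP"])  # (10c) *They waited the book on the shelf.
```

A verb with several frames, like eat, simply has several tuples to check against, which is why both the transitive and the intransitive uses are generated.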
First, nothing in the formal constraints on phrase structure rules prohibits 'crazy' rules like those in (12).
But structures corresponding to such rules are simply not found in the
world's languages. Rather, a phrase of a particular type (say, a verb
phrase) contains a lexical item of that same type (in this case, a verb).
In other words, as we have already been assuming, phrases have a
grammatical core. In traditional grammar, this core is called the
head, a term that has been adopted by generative grammarians.
Second, the architecture of a grammar that is
based on phrase structure rules and subcategorization frames requires the
information in each subcategorization frame to duplicate information in
some phrase structure rule. Such redundancy violates (13), a fundamental
methodological principle of parsimony in the sciences and elsewhere, which
is commonly known as Occam's
razor after the scholastic philosopher William Occam.
(13)  entia     non  sunt  multiplicanda     praeter  necessitatem
      entities  not  are   to-be-multiplied  beyond   necessity
      'Don't assume unnecessary entities.'
Much of the history of generative grammar has been driven by the assumption that redundancy in the theory indicates a failure of insight, and that more insight will be achieved by searching for ways to eliminate the redundancy.
The question that we focus on for most of the rest of this chapter is what information needs to be included in the elementary trees. A first important piece of information is clearly a lexical item's syntactic category. This can easily be represented as the lexical item's mother, as shown in (14).
A second piece of information that needs to be represented is the fact that lexical items serve as heads of phrases. Again, we can represent this by having an appropriately labeled node dominate the lexical item. We can think of the representation as being generated in two steps: first, a phrasal node of as yet undetermined category is added to the elementary trees in (14), and then the lower syntactic category's type percolates up the elementary tree (to use a widespread metaphor) to determine the type of the entire phrase.
The structures in (16), consisting of a lexical item, its syntactic category, and a phrase corresponding to the category, are an elementary tree's irreducible core, its spine. In many cases, an elementary tree also contains further slots for various other syntactic dependents. This permits the straightforward representation of the various subcategories of verbs, the fact that prepositions take objects, and so on. Elementary trees for prepositions and for ditransitive, transitive, and intransitive verbs are illustrated in (17).
As is evident from the identity of (17b-d) with (8b-d), the difference between the rule-based and the projection-based approaches to phrase structure is not empirical, but conceptual. The structures in (8) are generated 'top down' by phrase structure rules, whereas those in (17) are generated 'bottom up' by projection from a lexical item.
Notice that the slots in the elementary trees other than the lexical item anchoring it are empty. This is because each elementary tree is intended to provide exactly the information that is characteristic of the lexical item anchoring it: no less, but also no more. For instance, a verb like devour requires an object, but not a particular one; many different phrases will fill the bill, as long as they are noun phrases.
A final issue concerns verbs such as eat, which can be used either with or without an explicit object. In the rule-based approach to generating syntactic structure, such verbs are associated with multiple subcategorization frames; recall (11b,e). In the projection-based approach, the notational counterpart to multiple subcategorization frames is to associate a single lexical item with more than one elementary tree, as shown in (18).
As we will see in a moment, the structures in (17) and (18) are oversimplified, but for now, they illustrate how it is possible to reconstruct trees like those in (8) without incurring the redundancy inherent in a system of phrase structure rules, lexical insertion rules, and subcategorization frames.
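Bottom-up projection can be sketched in code. The tuple encoding of trees as `(label, daughters)` and the `elementary_tree` helper are illustrative assumptions, and the structures built here are the flat, pre-V' trees of (17), which the text refines shortly.

```python
def elementary_tree(word, category, complements=()):
    """Project a flat elementary tree from a lexical item, as in (14)-(17):
    the category node dominates the word, the phrasal node is projected
    above it, and each complement contributes an empty slot."""
    head = (category, [word])
    slots = [(c, []) for c in complements]  # empty slots: any NP/PP will do
    return (category + "P", [head] + slots)

# (17): a preposition and ditransitive, transitive, intransitive verbs.
put    = elementary_tree("put", "V", ("NP", "PP"))
devour = elementary_tree("devour", "V", ("NP",))
wait   = elementary_tree("wait", "V")
on     = elementary_tree("on", "P", ("NP",))

print(put)
```

Note how the phrasal label is determined by the lexical item's own category (the 'percolation' metaphor in the text), rather than being stipulated by a separate phrase structure rule.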
Note: For expository simplicity, the focus in what follows is on verbs. The discussion is extended to other types of heads in Chapters 3 and 4.
Notice that in such flat structures, the subject and any objects of the verb c-command each other. However, we concluded on the basis of the distribution of reflexive pronouns (recall Assignment 1, Exercise 2) that subjects asymmetrically c-command objects. That is, subjects c-command objects, but objects don't c-command subjects. Since the representations in (19), at least those in (19a,b), fail to represent this fact, they must be rejected.
What is necessary is a node that groups the verb together with any
objects, but that excludes the subject, as in (20). The required
additional node is standardly called V' (read as 'V bar').
It is worth noting that we have no evidence as yet for V' in elementary trees for intransitive verbs like waited. For the moment, we will assume its existence on conceptual grounds: namely, on the grounds that including it makes the elementary tree in (20c) analogous to those in (20a,b). In Assignment 2, you will be asked to formulate an empirical (= data-based) argument for including V' in the elementary trees of intransitive verbs.
Notice further that it is now V' and not the node labeled VP that
corresponds to the verb phrase of traditional grammar. Although the
concepts and terms of generative grammar are often borrowed from
traditional grammar and begin by coinciding with them, their further
development can diverge. Such conceptual and terminological innovation is
par for the course in the history of any science or technical discipline.
For expository convenience, we now review and introduce some useful
terminology to describe hierarchically structured elementary trees as in
(20). We say that the lexical item projects the syntactic structure
in the elementary tree. The lexical item's syntactic category, the
head of the projected structure (here, V) is also called the
lexical projection. V' is the intermediate projection, and VP
is the maximal projection or phrasal projection. We extend
the notion of spine introduced earlier to include all three of these
projections. The three projections are said to have distinct bar levels, as summarized in (21).
The sister of the intermediate projection is called the specifier; it is the slot for subjects of sentences (see Chapter 6) and possessives in noun phrases. Each elementary tree is limited to a single specifier. In English, the specifier of VP is the intermediate projection's left sister, but in a VOS language like Malagasy (recall Assignment 1, Exercise 3), it is the right sister. Following traditional terminology, any sisters of the head are called complements. Taken together, the specifier and any complements are often referred to as a head's arguments.
(22)  They eat five apples before lunch.
As mentioned earlier, certain verbs, among them eat, are associated with more than one elementary tree. The grammaticality of (22) might therefore be taken to indicate that eat is associated with a ditransitive elementary tree as in (23a), in addition to the transitive and intransitive elementary trees in (23b,c).
Assuming (23a), (22) would then have the structure in (24). (For clarity, the internal structure of NPs and PPs in this and following sentence structures is omitted.)
However, there turns out to be compelling evidence against the representations in (23a) and (24). This evidence comes from a syntactic phenomenon by the name of do so substitution, which is illustrated in (25).
(25) a.  They eat five apples before lunch, and we do so, too.
     b.  They eat five apples before lunch, and we do so before dinner.
Do so substitutes for the sequence eat five apples before lunch in (25a), but only for the sequence eat five apples in (25b). Let us now make the assumption in (26), which has been standard in syntactic theory from before the times of generative grammar.
(26)  Substitution is possible only if the sequence of words being substituted for is a syntactic constituent (= unit of syntactic structure).
In trees, constituents are represented as nodes that exhaustively dominate the sequence in question.
(27)  A node exhaustively dominates a sequence of symbols iff it dominates all and only the symbols in the sequence.
For instance, A dominates the sequence B C in (28a-c), but exhaustively dominates it only in (28a). A fails to exhaustively dominate B C in (28b,c), because it dominates too much material. A also fails to exhaustively dominate B C in (28d), because it dominates too little material.
As is evident, domination is a necessary but not a sufficient condition for exhaustive domination.
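The definition in (27) can be made concrete with a few lines of code. Encoding trees as `(label, daughters)` tuples with strings as leaves is an illustrative assumption; the key idea is that a node exhaustively dominates a sequence iff the terminals it dominates, read left to right, are exactly that sequence.

```python
def frontier(tree):
    """All terminals a node dominates, in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    label, daughters = tree
    return [leaf for d in daughters for leaf in frontier(d)]

def exhaustively_dominates(node, sequence):
    """(27): the node dominates all and only the symbols in the sequence."""
    return frontier(node) == list(sequence)

# Analogue of (28a): A exhaustively dominates B C.
a28a = ("A", ["B", "C"])
# Analogue of (28b): A dominates B C but also extra material,
# so it dominates the sequence without exhaustively dominating it.
a28b = ("A", ["B", "C", "D"])

assert exhaustively_dominates(a28a, ["B", "C"])
assert not exhaustively_dominates(a28b, ["B", "C"])
```

The second assertion illustrates the closing point: in (28b), domination holds but exhaustive domination fails, so domination alone is not sufficient.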
The assumption in (26) raises the question of whether the
representation in (24) is consistent with the do so substitution
facts in (25). The sequence being substituted for in (25a) (eat five
apples before lunch) is exhaustively dominated by V', and so the
representation is indeed consistent with (25a). But (24) is not consistent
with the grammaticality of do so substitution in (25b), because it
contains no node that exhaustively dominates the sequence eat five
apples; cf. the discussion of (28b). Since there is no reason to
believe that the structure of the verb phrases in (25) differs according to
which sequence do so substitutes for, the structure in (24) must be
rejected, along with the elementary tree in (23a) that underlies it.
More branching structure
But what then is the correct syntactic structure for the verb phrase eat five apples before lunch? Given the assumption in (26) and the do so substitution facts, it must be the case that eat five apples before lunch and eat five apples are both constituents (= sequences exhaustively dominated by a single node). The tree in (29) has the proper shape.
In (29), the NP five apples is a sister of the head V, whereas the PP before lunch and the head V are more distantly related. Given our earlier definition of complements as sisters of heads, only the direct object is a complement. But then what relation does before lunch bear to the head V, being neither a complement nor a specifier? Again adopting a term from traditional grammar, we refer to such constituents as modifiers or adjuncts.
Observe that in (29), one instance of V' dominates another. We say
that V' in (29) is a recursive category, and that (29) itself is a
recursive structure. We return to the topic of recursion in Chapter 3.
A typology of syntactic dependents
The dependent elements that we have been discussing (complements, specifiers, and adjuncts) all stand in distinct structural relations to the head (and to the spine more generally). Both complements and adjuncts are daughters of intermediate projections, but they differ in that complements are sisters of heads, whereas adjuncts are sisters of the next higher projection level, intermediate projections. As sisters of intermediate projections, adjuncts resemble specifiers. But again, the two relations are distinct because adjuncts are daughters of intermediate projections, whereas specifiers are daughters of maximal projections. These structural relations and distinctions are summarized in (30).
(30)  Relation to head   Sister of                 Daughter of
      Complement         Head                      Intermediate projection
      Adjunct            Intermediate projection   Intermediate projection
      Specifier          Intermediate projection   Maximal projection
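The typology in (30) is a small lookup: a dependent's status follows from which node it is a sister of and which node it is a daughter of. The bar-level labels ("X", "X'", "XP") and the `classify` helper below are illustrative assumptions standing in for head, intermediate projection, and maximal projection.

```python
def classify(sister_of, daughter_of):
    """Return a dependent's status given its structural relations, per (30)."""
    if sister_of == "X" and daughter_of == "X'":
        return "complement"   # sister of the head
    if sister_of == "X'" and daughter_of == "X'":
        return "adjunct"      # sister and daughter of intermediate projections
    if sister_of == "X'" and daughter_of == "XP":
        return "specifier"    # sister of X', daughter of the maximal projection
    raise ValueError("not a configuration in (30)")

assert classify("X", "X'") == "complement"
assert classify("X'", "X'") == "adjunct"
assert classify("X'", "XP") == "specifier"
```

Notice that the three relations are pairwise distinguished by exactly one of the two structural facts, which is why both columns of (30) are needed.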
Both traditional and generative grammar maintain that the grammatical relation between heads and their arguments is closer than that between heads and adjuncts. The idea is that adjuncts are not required by a particular lexical item; rather, they are optional specifications. Now recall that elementary trees are intended to provide exactly the information that is characteristic of the lexical item anchoring it: no less, but also no more. Elementary trees therefore cannot include slots for adjuncts. But if the structure in (31) is not a legal elementary tree, then how is it possible to build the structure in (29)?
Let us work up to the answer to this question by first considering the adjunctless sentence They eat five apples. It is clear that we can derive the structure for this sentence by substituting subtrees for they and five apples at the specifier and the complement nodes of the transitive elementary tree for eat in (23b), repeated here as (32a). This process of substitution yields (32b); as before, the internal structure of syntactic dependents is omitted for clarity.
Of course, the question still remains of how adjuncts enter into syntactic structure. The answer is that they are integrated by a separate process called adjunction. Adjunction is a two-step process that targets a particular node. For the modifier of a head, the target of adjunction is that head's intermediate projection, as indicated by the box in (33a). The first step in adjunction is to make a copy of the target of adjunction right above the original node, as in (33b). The second step is to attach the tree for the adjunct phrase as a daughter of the newly created node, as in (33c).
(33)  a. Select target of adjunction   b. Step 1: Clone target of adjunction   c. Step 2: Attach adjunct as daughter of clone
Compare (33c) with (29), and you will see that the two structures are identical, precisely the desired result.
It is important to attach the adjunct as a daughter of the higher copy of the intermediate projection. Attaching the adjunct as a daughter of the lower copy would result in a structure in which the attached constituent is incorrectly represented as a complement. Moreover, the higher copy would serve no useful function, thereby violating Occam's razor.
(33a) contains only one instance of V' to serve as target of adjunction. In (33c), on the other hand, there are two V' nodes, and the question immediately arises whether either of them can serve as target of adjunction for a second adjunct. Selecting the lower V' as the target of adjunction is illustrated in (34).
(34)  a. Target of adjunction: Lower V'   b. Step 1: Clone lower V'   c. Step 2: Attach adjunct as daughter of clone
(35) illustrates the alternative choice of the higher V' as target of adjunction.
(35)  a. Target of adjunction: Higher V'   b. Step 1: Clone higher V'   c. Step 2: Attach adjunct as daughter of clone
Since the sentences that result in (34c) and (35c) are both
grammatical, we conclude that any intermediate projection can serve as
target of adjunction.
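The two-step adjunction operation of (33)-(35) can be sketched directly. Trees are `(label, daughters)` tuples with strings as leaves; this encoding and the `adjoin` helper are illustrative assumptions.

```python
def adjoin(target, adjunct, left_sister=False):
    """Adjunction as in (33): Step 1 clones the target node right above the
    original; Step 2 attaches the adjunct as a daughter of the clone."""
    label, daughters = target
    original = (label, daughters)  # the target survives intact inside the clone
    if left_sister:
        return (label, [adjunct, original])
    return (label, [original, adjunct])

# The V' of (32b) for 'eat five apples' (internal structure of NP omitted).
v_bar = ("V'", [("V", ["eat"]), ("NP", ["five apples"])])

# (33c): adjoin the PP 'before lunch' to the V'.
pp = ("PP", ["before lunch"])
result = adjoin(v_bar, pp)

# The clone has the same label as its target, so V' is now recursive,
# and either V' in the result could serve as target of a second adjunction.
assert result == ("V'", [v_bar, pp])
```

Because the operation returns a node of the same category as its target, repeated adjunction yields exactly the stacked V' structures of (34)-(36), with the choice of target determining the nesting.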
As expected given this conclusion, even more complex adjunction
structures are possible, as illustrated in (36).
(37) a.  They waited for no good reason, but we did so for a very good one.
     b.  They waited (for) a week, but we did so (for) only a day.
     c.  They waited in the parking lot, but we did so across the street.
Almost all phrases specifying manner are also adjuncts. But in at least one exceptional case, namely that of the verb word, speakers report varying judgments concerning sentences like (38b), as indicated by the percent sign.
(38) a.  They worded the answer carelessly, and we did so, too.
     b.  %They worded the answer carelessly, but we did so carefully.
For speakers who accept (38b), the manner phrase is an adjunct, whereas for those who reject it, it is a complement. Depending on one's judgments, then, the verb word is associated in one's mental grammar with the elementary tree in (39a) or in (39b).
(39)  a. Speaker accepts (38b)   b. Speaker rejects (38b)
The status of syntactic dependents not covered above is often not immediately clear. In such cases, it is necessary to appeal to the results of do so substitution, bearing in mind that it is not unusual for speakers to disagree about these results, as in the case of (38b).
A final word should be said about the correlation between a syntactic dependent's obligatory or optional character and its status as a complement or adjunct. It is tempting to posit the biconditional relationship in (40).
(40) a.  If a syntactic dependent is obligatory, it is a complement.   TRUE
     b.  If a syntactic dependent is a complement, it is obligatory.   FALSE
But as the annotation indicates, the biconditional in (40) cannot be maintained. It is true that obligatory syntactic dependents are complements. For instance, the contrast in (41) allows us to conclude that the noun phrase following devour is a complement, a conclusion that is borne out by do so substitution in (42).
(41)     It's amazing; every time I see him,
     b.  he's devouring a six-inch steak.

(42) a.  He devoured a hamburger and french fries, and I did so, too.
     b.  *He devoured a hamburger and french fries, and I did so six samosas.
However, not all complements are obligatory, as we know from the existence of lexical items associated with more than one elementary tree, like eat. The relevant do so substitution facts are given in (43) and (44). Note particularly the contrast between (44a) and (44c).
(43)     It's amazing; every time I see him,
     b.  he's eating a six-inch steak.

(44) a.  He ate, and I did so, too.
     b.  He ate a hamburger and french fries, and I did so, too.
     c.  *He ate a hamburger and french fries, and I did so six samosas.
It is, however, possible to apply the modus tollens of the propositional calculus to (40a) to deduce (45).
(45)  If a syntactic dependent is not a complement, it is not obligatory.
The two valid generalizations in (40a) and (45) can be expressed more succinctly as in (46).
(46) a.  Obligatory syntactic dependents are complements.
     b.  Adjuncts are optional.
The bracketed structures corresponding to (32b), (33c), (34c), and (36a,b) are given in (47).
(47) a.  [VP [NP they ] [V' [V eat ] [NP five apples ] ] ]
     b.  [VP [NP they ] [V' [V' [V eat ] [NP five apples ] ] [PP before lunch ] ] ]
     c.  [VP [NP they ] [V' [V' [V' [V eat ] [NP five apples ] ] [PP with their friends ] ] [PP before lunch ] ] ]
     d.  [VP [NP they ] [V' [V' [V' [V' [V eat ] [NP five apples ] ] [PP before lunch ] ] [PP with their friends ] ] [PP for fun ] ] ]
     e.  [VP [NP they ] [V' [V' [V' [V' [V' [V eat ] [NP five apples ] ] [PP before lunch ] ] [PP for fun ] ] [PP under the table ] ] [PP with their friends ] ] ]
It is important to realize that tree diagrams and bracketed structures are completely equivalent ways of representing the same information. Notice, for instance, that information concerning the internal structure of the NPs and PPs is missing in (47), just as it is in the corresponding tree diagrams.
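The equivalence of the two notations is easy to demonstrate: the same encoded tree can be printed as a labeled bracketing mechanically. The `(label, daughters)` tuple encoding and the `bracket` helper are illustrative assumptions.

```python
def bracket(tree):
    """Render a tree as a labeled bracketing in the style of (47)."""
    if isinstance(tree, str):
        return tree
    label, daughters = tree
    return "[" + label + " " + " ".join(bracket(d) for d in daughters) + " ]"

# The tree in (32b) for 'they eat five apples', i.e. the bracketing in (47a).
tree_47a = ("VP", [("NP", ["they"]),
                   ("V'", [("V", ["eat"]), ("NP", ["five", "apples"])])])

print(bracket(tree_47a))
# [VP [NP they ] [V' [V eat ] [NP five apples ] ] ]
```

Since `bracket` visits every node and emits exactly one matched bracket pair per node, no structural information is gained or lost in the conversion, which is the point made in the text.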
Although tree diagrams are easier for humans to process, bracketed structures are often desirable for various reasons. Publishers prefer labeled bracketings because they are cheaper to typeset. In large syntactically annotated corpora such as the Penn Treebank or the Penn-Helsinki Parsed Corpus of Middle English, syntactic structure is also represented by means of labeled bracketing because the corpora can then be stored as ASCII files, which can be quickly and conveniently searched using dedicated string manipulation languages such as Perl.