I have drawn the examples used in this chapter from the corpus version of Malory's Morte d'Arthur. I have modernized the spelling to make the examples easier to understand and added my own notes in square brackets to help explain the language. At the end of the chapter all the examples will be shown in their corpus versions.
Here's a sentence from Malory that is understandable to an English speaker today:
Then the queen made great joy when she knew who was the father of her child.
(The queen is Igrayne; the child is Arthur; she has just discovered that Uther is his father.) Intuitively, any English speaker can divide the words in this sentence into distinct groups. For instance, it's clear that "great" belongs with "joy" (in a way that it does not belong, for instance, with "queen"). Similarly, "who was the father of her child" is a distinct group of words, since it answers the question, "what did the queen know?" Also, there are two clauses that could act as sentences on their own, namely "the queen made great joy" and "she knew who was the father of her child." Linguists call these natural word-groups constituents.
If we allow this as a sentence,
why does the following sentence seem wrong to us?
This has to do with the rules for replacement of constituents. We have replaced "the queen", which is a noun phrase, with "of her child", which is a prepositional phrase. The sentence will only sound right to us if we replace the noun phrase, "the queen", with another noun phrase. For instance, the following sentence sounds right, because "the father of her child" is a noun phrase (which contains a prepositional phrase):
As another example,
is a legitimate sentence, but
This suggests one of the basic premises of syntax, that some patterns of constituents are allowed in a language and some are not.
This notion that some patterns of constituents are allowed and some are not can be expressed in a formal grammar.
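As a minimal sketch of this idea, the Python below treats a sentence as a sequence of labelled constituents and accepts it only if the sequence of labels matches an allowed pattern. The labels `NP`, `V`, and `PP`, the single allowed pattern, and the function names are simplifications assumed for illustration; they are not the corpus labelling.

```python
# Illustrative only: one allowed pattern, with simplified labels.
ALLOWED_PATTERNS = {("NP", "V", "NP")}

def replace_constituent(sentence, index, constituent):
    """Return a copy of the sentence with one constituent swapped out."""
    new_sentence = list(sentence)
    new_sentence[index] = constituent
    return new_sentence

def is_allowed(sentence):
    """A sentence is allowed if its sequence of labels is a known pattern."""
    return tuple(label for label, _ in sentence) in ALLOWED_PATTERNS

original = [("NP", "the queen"), ("V", "made"), ("NP", "great joy")]
with_pp = replace_constituent(original, 0, ("PP", "of her child"))
with_np = replace_constituent(original, 0, ("NP", "the father of her child"))

print(is_allowed(original))  # True
print(is_allowed(with_pp))   # False: a PP cannot stand in for an NP
print(is_allowed(with_np))   # True: an NP replaces an NP
```

A real grammar would need many patterns, and patterns inside patterns, which is what the rewrite rules of a formal grammar provide.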
In the system of labelling used for the Middle English corpus, the sentence
The queen made great joy.
has the structure NP-SBJ — VBD — NP-OB1 where NP-SBJ means noun-phrase subject ("the queen"), VBD means verb in the past tense ("made") and NP-OB1 means noun-phrase direct object ("great joy"). Informally, the subject is whatever does the action of the verb; the object is whatever the action of the verb was done to. The long hyphens ("—") represent precedence: that is, the noun phrase precedes the verb, which precedes the object. So one rule of our formal grammar must look like this:
(1) IP → NP-SBJ — VBD — NP-OB1
"IP" stands for "inflectional phrase", which means, very roughly, "sentence." This rule says, "a sentence can be rewritten as a noun-phrase subject preceding a verb in the past tense preceding a noun-phrase object."
Once we have a rule (rule 1, above), we can build other sentences from it. For instance, for a noun phrase subject we can use the pronoun "he", for a verb we can use "kissed", and for the verb's object we can use "the lady Igrayne", to compose the sentence:
He kissed the lady Igrayne.
Notice that the two noun-phrase objects in these examples have different structures: "great joy" has the structure ADJ — N (adjective "great" precedes noun "joy"), but "the lady Igrayne" has the structure D — N — NPR (determiner "the" precedes noun "lady", which precedes proper noun "Igrayne"). A proper noun is a noun referring to exactly one object: there may be many ladies, but only one Igrayne. Proper nouns are usually capitalized in Modern English. This suggests these rules of our formal grammar:
We say that the constituent on the left-hand side of a rule "dominates" the constituents on the right-hand side of the rule. So "NP-OB1" dominates "ADJ — N" in the sentence "The queen made great joy", and "NP-OB1" dominates "D — N — NPR" in the sentence "He kissed the lady Igrayne." For that matter, "N" dominates "joy" in "great joy", and "N" dominates "lady" in "the lady Igrayne."
Thus a sentence can be entirely described in terms of precedence and dominance; precedence is expressed by the rules of the formal grammar, and dominance is expressed by the trail of rewrites made in building the sentence.
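The rules discussed so far can be sketched in Python as a small table of rewrites, with each sentence stored as a tree. This is an illustration under stated assumptions, not the corpus's own machinery: the nested-tuple tree format and the names `GRAMMAR`, `licensed`, and `words` are hypothetical, and the two `NP-SBJ` rules are read off the corpus parses shown at the end of the chapter rather than stated in the text.

```python
# Each rule maps a label to the sequences of labels it may be rewritten as.
GRAMMAR = {
    "IP": [("NP-SBJ", "VBD", "NP-OB1")],          # rule 1
    "NP-SBJ": [("D", "N"), ("PRO",)],             # assumed from the corpus parses
    "NP-OB1": [("ADJ", "N"), ("D", "N", "NPR")],  # the two object patterns
}

def is_leaf(node):
    """A leaf is (label, word); any other node is (label, child, child, ...)."""
    return len(node) == 2 and isinstance(node[1], str)

def licensed(node):
    """True if every rewrite used in the tree is a rule of GRAMMAR (dominance)."""
    if is_leaf(node):
        return True
    label, children = node[0], node[1:]
    child_labels = tuple(child[0] for child in children)
    return child_labels in GRAMMAR.get(label, []) and all(licensed(c) for c in children)

def words(node):
    """Read the words off the tree from left to right (precedence)."""
    if is_leaf(node):
        return [node[1]]
    return [word for child in node[1:] for word in words(child)]

queen_tree = ("IP",
              ("NP-SBJ", ("D", "the"), ("N", "queen")),
              ("VBD", "made"),
              ("NP-OB1", ("ADJ", "great"), ("N", "joy")))
kiss_tree = ("IP",
             ("NP-SBJ", ("PRO", "he")),
             ("VBD", "kissed"),
             ("NP-OB1", ("D", "the"), ("N", "lady"), ("NPR", "Igrayne")))
# Same constituents as queen_tree, but the verb comes first: no rule allows it.
scrambled_tree = ("IP",) + (queen_tree[2], queen_tree[1], queen_tree[3])

for tree in (queen_tree, kiss_tree, scrambled_tree):
    print(" ".join(words(tree)), "->", licensed(tree))
```

The first two trees are accepted and the scrambled one is rejected, because no rule for `IP` begins with `VBD`.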
Of course, our original sentence was longer than "the queen made great joy." There was an entire prepositional phrase, labelled "PP", "when she knew who was the father of her child", tacked onto the end, as well as an adverbial phrase describing time, labelled "ADVP-TMP", "Then", tacked onto the beginning. So we could rewrite rule 1 like this, using parentheses to show that the prepositional and adverbial phrases are optional (because we would accept "The queen made great joy" as a sentence on its own):
Remember that the prepositional phrase contained a sentence of its own, "she knew who was the father of her child", following the preposition "when". The clause that links the embedded sentence to the main sentence is called a "complementizer phrase". In the schema used for the Middle English corpus, the complementizer phrase in this example is labelled "CP-ADV", because it describes the time at which the main clause happens, which makes it adverbial. The "C" is the complementizer; in this example it is not represented by any word, so "0" is entered.
For contrast, here's an example where a complementizer phrase is linked by a non-0 complementizer:
I am sick for anger and for love of fair Igrayne, that I may not be whole.
Here, the embedded sentence "I may not be whole" is introduced by the complementizer "that".
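The parentheses that mark optional constituents in the extended version of rule 1 can be read as shorthand for several plain rules. The sketch below expands one rule with optional items into every pattern it allows. The `(label, is_optional)` encoding and the name `expand_optional` are assumptions made for this illustration.

```python
from itertools import product

# (label, is_optional) pairs for the extended rule 1.
RULE_1 = [
    ("ADVP-TMP", True),
    ("NP-SBJ", False),
    ("VBD", False),
    ("NP-OB1", False),
    ("PP", True),
]

def expand_optional(rule):
    """List every label sequence obtained by keeping or dropping optional items."""
    choices = [[(label,), ()] if optional else [(label,)] for label, optional in rule]
    return [tuple(label for part in combo for label in part)
            for combo in product(*choices)]

for pattern in expand_optional(RULE_1):
    print(" ".join(pattern))
```

Two optional constituents give four patterns, ranging from the full "Then the queen made great joy when ..." shape down to the bare NP-SBJ, VBD, NP-OB1 of the original rule 1.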
This embedding of one sentence in another sentence illustrates a basic feature of natural-language syntax, that of recursion. Since it is possible to embed one sentence in another sentence, there is no limit to possible sentence length and thus an infinite number of sentences can be generated. A classic example of recursion is the children's rhyme "The House that Jack Built", where each sentence embeds the one before it:
In theory this game could go on forever but in practice it mercifully ends:
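The way each verse of the rhyme embeds the previous one can be sketched with a recursive function. Only the first few links of the rhyme are included here, and the names `LINKS`, `noun_phrase`, and `verse` are hypothetical choices for this illustration.

```python
# Each link wraps the previous noun phrase in a relative clause.
LINKS = [("malt", "lay in"), ("rat", "ate"), ("cat", "killed")]
BASE = "the house that Jack built"

def noun_phrase(depth):
    """Build a noun phrase with `depth` levels of embedding."""
    if depth == 0:
        return BASE
    noun, verb = LINKS[depth - 1]
    # The recursive call embeds the shallower phrase inside the new one.
    return f"the {noun} that {verb} {noun_phrase(depth - 1)}"

def verse(depth):
    return f"This is {noun_phrase(depth)}."

for depth in range(len(LINKS) + 1):
    print(verse(depth))
```

Nothing in `noun_phrase` itself limits the depth; the rhyme stops only because the list of links runs out.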
The embedded sentence "she knew who was the father of her child", of course, has a structure of its own. In the Middle English corpus it is represented like this:
where "she" is the noun-phrase subject, "knew" is the verb in the past tense, and "CP-QUE" is an indirect question. That is, it answers the question, "what did the queen know?" The indirect question "who was the father of her child", in turn, has structure:
where WNP means wh-noun-phrase, a noun phrase beginning with "who", "which", or "what", in this case the pronoun "who", and C is the 0 complementizer which links the embedded sentence "was the father of her child", labelled IP.
It is not obvious why "was the father of her child" is labelled an embedded sentence. Most English speakers would not accept it as a sentence on its own because it appears to have no subject. In this type of analysis, the subject is a "trace", which can be thought of as a placeholder for a constituent that has been moved away. The constituent that has been moved in this case is the "who". It is a formal requirement of English grammar that wh-words move to the WNP position.
This is the rule describing the IP "was the father of her child":
where NP-SBJ represents the noun-phrase subject trace, BED means a past-tense version of the verb "to be", in this case "was", and NP-OB1 is again a noun-phrase object.
This is the rule describing the NP-OB1 "the father of her child":
where D represents the determiner "the", N represents the noun "father", and PP represents the prepositional phrase "of her child".
This is the rule describing the prepositional phrase "of her child":
where P represents the preposition "of" and NP represents the noun phrase "her child".
This is the rule describing the noun phrase "her child":
where PRO$ represents the possessive pronoun "her", and "N" represents the noun "child".
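The rules just described can be replayed as a trail of rewrites, starting from a single IP and ending with one label per word. This is a sketch under assumptions: the arrow-free list encoding, the plain `IP` label for both clauses, and the names `rewrite`, `DERIVATION`, and `WORDS` are hypothetical, and "she" is treated directly as the NP-SBJ, as in the prose above.

```python
def rewrite(symbols, lhs, rhs):
    """Replace the leftmost occurrence of `lhs` with the labels in `rhs`."""
    i = symbols.index(lhs)
    return symbols[:i] + rhs + symbols[i + 1:]

# One (left-hand side, right-hand side) pair per rule, in the order discussed.
DERIVATION = [
    ("IP", ["NP-SBJ", "VBD", "CP-QUE"]),   # she knew [indirect question]
    ("CP-QUE", ["WNP", "C", "IP"]),        # who 0 [embedded sentence]
    ("IP", ["NP-SBJ", "BED", "NP-OB1"]),   # [trace] was [object]
    ("NP-OB1", ["D", "N", "PP"]),          # the father [of her child]
    ("PP", ["P", "NP"]),                   # of [her child]
    ("NP", ["PRO$", "N"]),                 # her child
]

symbols = ["IP"]
for lhs, rhs in DERIVATION:
    symbols = rewrite(symbols, lhs, rhs)
    print(" ".join(symbols))

# One entry per final label; "0" and "*T*" are the silent complementizer and trace.
WORDS = ["she", "knew", "who", "0", "*T*", "was", "the", "father", "of", "her", "child"]
spoken = " ".join(word for word in WORDS if word not in {"0", "*T*"})
print(spoken)
```

Each printed line is one step of the derivation, so the output traces dominance from the top of the clause down to the word-level labels.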
We have now worked our way through the structure of the sentence,
Then the queen made great joy when she knew who was the father of her child.
Below is the example sentence as it appears, parsed and labelled, in the Middle English corpus.
Note: All the labels used in the corpus are listed in the appendix, with a brief description of each label. IPs that mark the main clause of a sentence are labelled "IP-MAT" ("MAT" for "matrix"), and other IPs are labelled "IP-SUB" ("SUB" for "subordinate").
( (IP-MAT (ADVP-TMP (ADV Thenne)) (NP-SBJ (D the) (N quene)) (VBD made) (NP-OB1 (ADJ grete) (N joye)) (PP (P whan) (CP-ADV (C 0) (IP-SUB (NP-SBJ (PRO she)) (VBD knewe) (CP-QUE (WNP-1 (WPRO who)) (C 0) (IP-SUB (NP-SBJ *T*-1) (BED was) (NP-OB1 (D the) (N fader) (PP (P of) (NP (PRO$ her) (N Child))))))))) (E_S .)) (ID CMMALORY,5.135))
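The bracketed format above is regular enough to read mechanically. The following is a minimal sketch of a reader for it, not the corpus's own tooling: the functions `tokenize`, `parse`, and `leaves` are hypothetical, and the choice to skip `ID` and `E_S` nodes, the `0` complementizer, and anything starting with `*` is an assumption about what counts as a spoken word.

```python
import re

PARSE = (
    "( (IP-MAT (ADVP-TMP (ADV Thenne)) (NP-SBJ (D the) (N quene)) (VBD made) "
    "(NP-OB1 (ADJ grete) (N joye)) "
    "(PP (P whan) (CP-ADV (C 0) (IP-SUB (NP-SBJ (PRO she)) (VBD knewe) "
    "(CP-QUE (WNP-1 (WPRO who)) (C 0) (IP-SUB (NP-SBJ *T*-1) (BED was) "
    "(NP-OB1 (D the) (N fader) (PP (P of) (NP (PRO$ her) (N Child))))))))) "
    "(E_S .)) (ID CMMALORY,5.135))"
)

def tokenize(text):
    """Split into parentheses and whitespace-separated symbols."""
    return re.findall(r"\(|\)|[^\s()]+", text)

def parse(tokens, pos=0):
    """Parse the bracketed node that opens at tokens[pos]; return (node, next_pos)."""
    assert tokens[pos] == "("
    pos += 1
    node = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
            node.append(child)
        else:
            node.append(tokens[pos])
            pos += 1
    return node, pos + 1

def leaves(node):
    """Yield (label, word) pairs from left to right."""
    if len(node) == 2 and all(isinstance(part, str) for part in node):
        yield tuple(node)
        return
    for part in node:
        if isinstance(part, list):
            yield from leaves(part)

tree, _ = parse(tokenize(PARSE))
spoken = [word for label, word in leaves(tree)
          if label not in {"ID", "E_S"} and word != "0" and not word.startswith("*")]
print(" ".join(spoken))
```

Reading the leaves in order recovers the Middle English spelling of the sentence, with the empty complementizers and the trace left out.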
And here are the other examples:
And so he kissed the lady Igrayne.
( (IP-MAT (CONJ and) (ADVP (ADV so)) (NP-SBJ (PRO he)) (VBD kist) (NP-OB1 (D the) (N lady) (NP-PRN (NPR Igrayne)))) (ID CMMALORY,4.95))
I am sick for anger and for love of fair Igrayne, that I may not be whole.
( (IP-MAT-SPE (' ') (NP-SBJ (PRO I)) (BEP am) (ADJP (ADJ seke)) (PP (PP (P for) (NP (N angre))) (CONJP (CONJ and) (PP (P for) (NP (N love) (PP (P of) (NP (ADJ fayre) (NPR Igrayne))))))) (, ,) (CP-ADV (C that) (IP-SUB (NP-SBJ (PRO I)) (MD may) (NEG not) (BE be) (ADJP (ADJ hool)))) (E_S .)
And so they set a pavilion over the stone and the sword.
( (IP-MAT (CONJ and) (ADVP (ADV so)) (NP-SBJ (PRO they)) (VBD sette) (NP-OB1 (D a) (N pavelione)) (PP (P over) (NP (NP (D the) (N stone)) (CONJP (CONJ and) (NP (D the) (N swerd))))) (E_S ,)) (ID CMMALORY,10.288))
In theory, it is possible to write a formal grammar which will describe every sentence in the English language (the rules above would constitute a very small subset of such a grammar). The most exhaustive grammar so far is The Major Syntactic Structures of English, written in 1973 by Robert P. Stockwell, Paul Schachter, and Barbara Hall Partee. The book is 847 pages long. Changes in theory since it was written have rendered it obsolete.