Our general strategy in extending the guidelines for the PPCME2 to the later historical corpora has been to minimize the number of changes between the corpora. In particular, we have attempted to make changes to the original annotation scheme only when forced to do so by distributional changes in the texts. In some instances (notably, in connection with quantifiers), we have modified the annotation guidelines because they proved too difficult to implement in a consistent manner. Finally (in connection with the post-modifier rule), we enforce the annotation guidelines more strictly in the later historical corpora than in the PPCME2.
In subsequent editions of the corpora, we hope to further minimize the differences described below.
In the PPCME2, collective nouns (FOLK, HORS, PEOPLE, etc.) are tagged as N. In early texts, before the universalization of plural -S, it can be quite difficult to distinguish reliably between singular and plural. For texts from the Middle English period M1, we have therefore tried to follow the translation that accompanies the text used, or when this is lacking, a separate translation. For details, consult the information for the individual texts.
In the later corpora, PEOPLE is tagged as singular (N) when
preceded by an unambiguously singular determiner (A, THAT, THIS), and as
plural (NS) elsewhere.
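The rule for PEOPLE in the later corpora can be sketched as a small function. This is only an illustration: the determiner list comes from the text above, while the function name and the assumption that the determiner of the noun phrase has already been identified and spelling-normalized are hypothetical.

```python
# Determiners named in the guideline as unambiguously singular.
SINGULAR_DETERMINERS = {"a", "that", "this"}

def tag_people(determiner):
    """Return the POS tag for PEOPLE in the later corpora.

    `determiner` is the (spelling-normalized) determiner of the noun
    phrase headed by PEOPLE, or None if there is no determiner.
    """
    if determiner is not None and determiner.lower() in SINGULAR_DETERMINERS:
        return "N"   # singular
    return "NS"      # plural elsewhere

print(tag_people("this"))  # N
print(tag_people("the"))   # NS
print(tag_people(None))    # NS
```

Historical spelling variants of the determiners would need to be normalized before such a lookup could be used on real text.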
Concessive clauses
In the PPCME2, ALL BE IT (THAT) and SO BE IT (THAT) clauses are treated
similarly to V1 conditionals.
HOW BE IT (THAT) clauses are treated as adverbial free relatives.
In the later corpora, ALL BE IT and HOW BE IT (though not SO BE IT) come to be used absolutely. Moreover, regardless of whether they appear absolutely or introduce subordinate clauses, these items come to be spelled as single words (ALBEIT, HOWBEIT). We therefore treat them as unitary adverbs or prepositions. In the later corpora, SO BE IT clauses cease to pattern with ALL BE IT clauses. Instead, they are simply word order variants of IT BE SO and are annotated accordingly.
See below for further details and examples.
ALL BE IT (THAT), ALBEIT
In the PPCME2, ALL BE IT (THAT) clauses, like SO BE IT (THAT) clauses,
are treated similarly to V1
conditionals. ALL is POS-tagged Q, surrounded by
ADVP brackets, and treated as a daughter of CP-ADV.
This is not intended as the correct analysis of the construction, but
rather to fit in with the annotation of V1 conditionals.
( (IP-MAT (CONJ and) (PP (P atte) (NP (N risyng) (PP (P of) (NP (D the) (N sonne))))) (NP-SBJ (PRO I)) (VBD fond) (NP-OB1 (D the) (ADJ secunde) (N degre) (PP (P of) (NP (NPR Aries)))) (IP-PPL (VAG sittyng) (PP (P upon) (NP (PRO$ myn) (N est) (N orisonte)))) (, ,) (CP-ADV (ADVP (Q all)) ← ALL BE IT
(IP-SUB (BEP be) (NP-SBJ-1 (PRO it)) (CP-THT-1 (C that) (IP-SUB (NP-SBJ (PRO it)) (BEP was) (ADJP (FP but) (ADJ litel)))))) (. .))
(ID CMASTRO,673.C1.364))
( (IP-MAT-SPE (CONJ And) (, ,) (CP-ADV (ADVP (Q al)) ← ALL BE IT
(IP-SUB (BED were) (NP-SBJ-1 (PRO it)) (ADVP (ADV so)) (CP-THT-1 (C that) (IP-SUB (NP-SBJ (PRO she)) (ADVP-TMP (ADV right) (ADV now)) (BED were) (ADJP (ADJ deed)))))) (, ,) (NP-SBJ (PRO ye)) (NEG ne) (MD oughte) (NEG nat) (, ,) (PP (P as) (PP (P for) (NP (PRO$ hir) (N deeth)))) (, ,) (NP-OB1 (PRO$+N youreself)) (TO to) (VB destroye) (. .))
(ID CMCTMELI,217.C1b.18))
In the later corpora, ALBEIT (like HOWBEIT) is treated as a unitary adverb (when used absolutely) or as a unitary preposition (when introducing a subordinate clause).
(NODE (CP-CAR (WNP-1 (WPRO Which)) (C 0) (IP-SUB (PP (P in) (NP (NP-POS (D the) (N$ kinges)) (NS daies))) (, ,) (PP-LFD (P albeit) ← ALBEIT
(CP-ADV (C 0) (IP-SUB (NP-SBJ (PRO he)) (BED was) (ADVP (ADV sore)) (VAN ennamored) (PP (P vpon) (NP (PRO her)))))) (, ,) (ADVP-RSP (ADV yet)) (NP-SBJ-RSP=1 (PRO he)) (VBD forbare) (NP-OB1 (PRO her)) (, ,) (PP (CONJ either) (PP (P for) (NP (N reuerence))) (, ,) (CONJP (CONJ or) (PP (P for) (NP (D a) (ADJ certain) (ADJ frendly) (N faithfulnes)))))))
(ID MORERIC,55.118))
( (IP-MAT (CONJ And) (ADVP (CP-FRL (WADVP-1 (WADV how)) ← HOW BE IT
(C 0) (IP-SUB (ADVP *T*-1) (BEP be) (NP-SBJ-2 (PRO it)) (CP-THT-2 (C 0) (IP-SUB (IP-SUB-3 (NP-SBJ (PRO thou)) (HVP hast) (ADVP-TMP (ADV often)) (ADVP-TMP (ADV before)) (PP (P in) (NP (NP (PRO$ thy) (ADJ yonge) (N age)) (CONJP (CONJ and) (NP (ADJ myddell) (N age))))) (VBN dyvydyd) (NP-OB1 (PRO$ thy) (N lyfe)) (NP-TMP (Q+N somtyme)) (PP (P to) (NP (N vertue)))) (, ,) (IP-SUB=3 (NP-TMP (Q+N somtyme)) (PP (P to) (NP (N vyce))))))))) (, ,) (NP-SBJ (PRO ye)) (PP (P as) (ADVP-TMP (ADV now))) (PP (P in) (NP (PRO$ thy) (ADJR latter) (N age))) (VBP kepe) (IP-SMC (NP-SBJ (PRO$ thy) (N lyfe)) (ADJP (ADJ holy))) (PP (P in) (NP (N vertue))) (. .))
(ID CMINNOCE,11.189))
In the later corpora, HOWBEIT (like ALBEIT) is treated as a unitary adverb (when used absolutely) or as a unitary preposition (when introducing a subordinate clause).
( (IP-MAT-SPE (ADVP (ADV (ADV31 How) (ADV32 be) (ADV33 it))) ← HOWBEIT
(NP-SBJ (PRO he) (CP-REL-SPE (WNP-1 0) (C that) (IP-SUB-SPE (NP-SBJ *T*-1) (HVP hath) (VBN receaved) (NP-OB1 (PRO$ hys) (N testimonye))))) (HVP hath) (VBN set) (RP to) (NP-OB1 (PRO$ his) (N seale) (CP-THT-SPE (C that) (IP-SUB-SPE (NP-SBJ (NPR God)) (BEP is) (ADJP (ADJ true))))) (. .))
(ID TYNDNEW,III,20J.218))
( (IP-MAT-SPE (CONJ and) (NP-SBJ (Q all)) (MD $shal) (BE be) (VAN delyverde) (, ,) (PP (P so) (CP-ADV (C 0) (IP-SUB (NP-SBJ (PRO thou)) (MD wolte) (VB telle) (NP-OB2 (PRO me)) (NP-OB1 (PRO$ thy) (N name))))) (, ,) (CP-ADV (ADVP (ADV so)) ← SO BE IT
(IP-SUB (BEP be) (NP-SBJ-1 (PRO hit)) (CP-THT-1 (C that) (IP-SUB (NP-SBJ (PRO thou)) (BEP be) (NEG nat) (NP-OB1 (NPR sir) (NPR Launcelot)))))) (. .) (' '))
(ID CMMALORY,191.2824))
In the later corpora, SO BE IT is a word order variant of IT BE SO and is tagged accordingly.
( (IP-MAT (PP-LFD (P IF) (CP-ADV (C 0) (IP-SUB (ADVP (ADV SO)) ← SO BE IT
(BEP BE) (NP-SBJ=1 (PRO IT)) (, ,) (CP-THT-1 (C THAT) (IP-SUB (PP (P IN) (NP (Q ANY) (N TRIANGLE))) (, ,) (NP-SBJ (D THE) (N SQUARE) (PP (P OF) (NP (D THE) (ONE ONE) (N SYDE)))) (BEP BE) (ADJP (ADJ =L) (PP (P TO) (NP (D THE) (NUM .IJ.) (NS SQUARES) (PP (P OF) (NP (D THE) (OTHER OTHER) (NUM IJ.) (NS SIDES))))))))))) (, ,) (ADVP-RSP (ADV THAN)) (MD MUST) (NP-ADV (N NEDES)) (NP-SBJ (D THAT) (N CORNER)) (BE BE) (NP-OB1 (D A) (ADJ RIGHT) (N CORNER) (, ,) (CP-REL (WNP-3 (WPRO WHICH)) (C 0) (IP-SUB (NP-SBJ *T*-3) (BEP IS) (VAN CONTEINED) (PP (P BETWENE) (NP (D THOSE) (NUM TWO) (ADJR LESSER) (NS SYDES)))))) (. .))
(ID RECORD,2.E4V.296))
In the older construction, which continues into Early Modern English with LIKE, the theme is labelled NP-SBJ, and the experiencer NP-OB1. This contrasts with the annotation of impersonal copular constructions of the type IT/THERE IS NEED (TO) ME, where the experiencer is labelled NP-OB2 because of the presence of the verb BE (see NP-OB2 in copular constructions).
(NODE (IP-SUB (NP-SBJ (D this) (ADJ wise) (N man)) (VBD saugh) (CP-THT (C that) (IP-SUB (NP-OB1 (PRO hym)) (VBD wanted) (NP-SBJ (N audience)))))
(ID CMCTMELI,219.C2.95))
(NODE (PP (P if) (CP-ADV (C 0) (IP-SUB (NP-SBJ (D +tat)) (NP-OB1 (PRO +gow)) (VBP nede+t))))
(ID CMBRUT3,51.1503))
In the modern construction, the experiencer is labelled
NP-SBJ, and the theme NP-OB1.
In the later corpora, the general rule is applied consistently, and
these items are surrounded by both types of brackets.
In this connection, it is worth noting that the PPCME2 and the later
corpora do not always agree on which instances of postnominal ELSE and
ENOUGH are tagged as adjectival (ADJR) or as adverbial
(ADVR).
In the PPCME2, LESS, LEAST and MUCH, MORE, MOST are generally tagged
as quantifiers (Q, QR, QS), but as adjectives (ADJ, ADJR, ADJS) under
conditions described below. The distinction between the adjectival use
and the pure quantifier use is not always easy to make in a consistent
way and becomes more difficult over time. In the later corpora, these
items are therefore uniformly tagged as quantifiers (Q, QR, QS).
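For searches that span both corpora, one option is to normalize the PPCME2 adjectival tags on these items to the quantifier tags used in the later corpora. The following is a minimal sketch over word_TAG text. The tag mapping comes from the paragraph above; the function name and the spelling list are hypothetical, and the list is deliberately incomplete.

```python
# Tag mapping taken from the guideline text: ADJ/ADJR/ADJS -> Q/QR/QS.
ADJ_TO_Q = {"ADJ": "Q", "ADJR": "QR", "ADJS": "QS"}

# Hypothetical, incomplete list of spellings of LESS, LEAST, MUCH, MORE, MOST.
QUANTIFIER_FORMS = {"less", "lesse", "least", "much", "michel", "michele",
                    "more", "most"}

def retag_quantifiers(line):
    """Retag adjectival LESS/LEAST/MUCH/MORE/MOST as quantifiers."""
    out = []
    for token in line.split():
        word, sep, tag = token.rpartition("_")
        if sep and word.lower() in QUANTIFIER_FORMS and tag in ADJ_TO_Q:
            token = word + sep + ADJ_TO_Q[tag]
        out.append(token)
    return " ".join(out)

print(retag_quantifiers("the_D more_ADJR worship_N"))
# the_D more_QR worship_N
```

Tokens that already carry Q, QR, or QS, and adjectives that are not on the list, pass through unchanged.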
LESS, LEAST and MUCH, MORE, MOST are treated as adjectives (ADJ,
ADJR, ADJS) in the PPCME2 under the following conditions.
See Comparative adjectives as heads of ADJP and Superlative adjectives
as heads of ADJP for further relevant discussion.
Quantified expressions functioning as clause-level measure phrases
are tagged NP-ADV in the PPCME2 (see below), but as NP-MSR in the later
corpora. See also Q+N, Q+WPRO.
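To compare measure phrases across the corpora, PPCME2 trees could be relabelled heuristically. The sketch below is an assumption-laden approximation, not part of the guidelines: it treats any NP-ADV whose first daughter is tagged Q, QP, Q+N, or Q+WPRO as a measure phrase and does not check that the phrase is at the clause level. The function name is hypothetical.

```python
import re

# Heuristic: NP-ADV immediately followed by a Q, QP, Q+N, or Q+WPRO daughter.
_MEASURE = re.compile(r"\(NP-ADV(?=\s+\((?:Q\+N|Q\+WPRO|QP|Q)\s)")

def relabel_measure_phrases(tree_text):
    """Rewrite matching NP-ADV labels as NP-MSR; other NP-ADVs are untouched."""
    return _MEASURE.sub("(NP-MSR", tree_text)

print(relabel_measure_phrases("(NP-ADV (Q+N no+ting))"))
# (NP-MSR (Q+N no+ting))
print(relabel_measure_phrases("(NP-ADV (N NEDES))"))
# (NP-ADV (N NEDES))
```

Because the pattern looks only at the first daughter, its output would need manual review before being relied on.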
Post-head modifiers
In the PPCME2, ELSE and ENOUGH in post-head position are surrounded by
POS brackets, but not by phrasal brackets, contrary to the general rule
that post-head modifiers are always bracketed as phrases.
PPCME2                                      Later corpora
(NP (Q no) (N thynge) (ADJ elles))          (NP (Q+N nothing) (ADJP (ADJ else)))
(ADVP-LOC (Q+WADV anywhere) (ADV elles))    (ADVP-LOC (Q+N anywhere) (ADVP (ADV else)))
(NP (N blisse) (ADJR inoh))                 (NP (N bliss) (ADJP (ADJR enough)))
(ADJP (ADJ rich) (ADVR ynow))               (ADJP (ADJ rich) (ADVP (ADVR enough)))
(ADVP (ADV quickly) (ADVR ynow))            (ADVP (ADV quickly) (ADVP (ADVR enough)))
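The difference shown in the table can be illustrated with a small tree rewriter that adds the missing phrasal brackets to PPCME2-style trees. This is a sketch under stated assumptions: the parser handles only labelled nodes, "post-head" is approximated as "any non-initial daughter", and the spelling list and all function names are hypothetical.

```python
import re

ELSE_ENOUGH = {"else", "elles", "enough", "inoh", "ynow"}  # hypothetical, incomplete

def parse(text):
    """Parse one bracketed tree into nested lists: [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0
    def node():
        nonlocal pos
        pos += 1                       # skip "("
        items = [tokens[pos]]          # node label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])  # the word of a leaf
                pos += 1
        pos += 1                       # skip ")"
        return items
    return node()

def render(tree):
    return "(" + " ".join(render(c) if isinstance(c, list) else c for c in tree) + ")"

def is_leaf(n):
    return isinstance(n, list) and len(n) == 2 and isinstance(n[1], str)

def wrap_post_head(tree):
    """Wrap non-initial ADJ*/ADV* leaves spelling ELSE or ENOUGH in ADJP/ADVP."""
    if is_leaf(tree):
        return tree
    new = [tree[0]]
    for i, child in enumerate(tree[1:]):
        if is_leaf(child):
            tag, word = child
            if i > 0 and word.lower() in ELSE_ENOUGH and tag[:3] in ("ADJ", "ADV"):
                child = [tag[:3] + "P", child]
        else:
            child = wrap_post_head(child)
        new.append(child)
    return new

print(render(wrap_post_head(parse("(NP (Q no) (N thynge) (ADJ elles))"))))
# (NP (Q no) (N thynge) (ADJP (ADJ elles)))
```

Trees that already follow the later-corpus convention are left unchanged, because the wrapped item is then the first daughter of its own phrase.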
Quantifiers and quantified expressions
LESS, LEAST and MUCH, MORE, MOST
a_D michel_ADJ lust_N
fram_P +te_D michel_ADJ conseil_N of_P +te_D vntrew_ADJ
ouer_P +tat_D michele_ADJ water_N
his_PRO$ michele_ADJ wisdom_N
Hit_PRO is_BEP a_D michel_ADJ reunesse_N ← MICHEL = ADJ (preceding determiner)
of_P mani_Q mann_N +de_C is_BEP
on_P michele_Q dwele_N on_P him_PRO seluen_N ← MICHEL = Q (no determiner)
and_CONJ +te_D more_ADJR fysches_NS swolwen_VBP +te_D lesse_ADJR ;_. ← elided head noun
'_' and_CONJ I_PRO shall_MD ensure_VB you_PRO ye_PRO shall_MD have_HV
the_D more_ADJR worship_N than_P ever_ADV ye_PRO had_HVD ._. '_'
And_CONJ hem_PRO thinketh_VBP +tat_C the_D more_ADJR peyne_N &_CONJ
the_D more_ADJR tribulacioun_N +tat_C +tei_PRO suffren_VBP for_P loue_N
of_P here_PRO$ god_N ,_, the_D <P_117>_CODE more_ADJR ioye_N +tei_PRO
schull_MD haue_HV in_P another_D+OTHER world_N
for_CONJ he_PRO shal_MD ben_BE michel_ADJ bifore_P gode_NPR
And_CONJ sum_Q men_NS maken_VBP hem_PRO more_ADJR
and_CONJ maki+t_VBP his_PRO$ myracle_N more_ADJR
Ne_NEG dowte_VBP we_PRO not_NEG how_WADV byleue_N may_MD now_ADV be_BE
lesse_ADJR and_CONJ now_ADV be_BE more_ADJR
and_CONJ bifore_P gode_NPR ben_BEP michel_ADJ and_CONJ mihti_ADJ
that_C ony_Q sholde_MD be_BE accompted_VAN more_QR hardy_ADJ or_CONJ
more_ADJR of_P prouesse_N
Measure phrases
(NODE (IP-SUB (NP-SBJ (D +te) (N water))
(MD wolde)
(NP-ADV (Q+N no+ting)) ← nothing
(DO done)
(NP-OB1 (PRO$ his) (N commandement)))
(ID CMBRUT3,123.3740))
( (IP-MAT (CONJ &)
(NP-SBJ (D +tis) (NPR Harolde))
(HVD hade)
(NP-ADV (Q+N no+ting)) ← nothing
(NP-OB1 (NP (D +te) (NS condicions))
(CONJP (CONJ ne)
(NP (NS maners)))
(PP (P of)
(NP (NPR Kyng) (NPR Knoght)
(CP-REL (WNP-1 0)
(C +tat)
(IP-SUB (NP-SBJ *T*-1)
(BED was)
(NP-OB1 (PRO$ his) (N fader)))))))
(. ,))
(ID CMBRUT3,124.3772))
(NODE (IP-SUB (PP *ICH*-2)
(NP-SBJ (D the) (N werre))
(VBP liketh)
(NP-OB1 (PRO yow))
(NP-ADV (Q no) (N thyng))) ← nothing
(ID CMCTMELI,235.C1.699))
( (IP-MAT (NEG Ne)
(MD +terf)
(NP-SBJ (D +tt) (ADJ seli) (N meiden)
(CP-REL (WNP-1 0)
(C +tt)
(IP-SUB (NP-SBJ *T*-1)
(HVP haue+d)
(ADVP (Q al))
(DON idon)
(NP-OB1 (PRO hire))
(PP (RP ut) (P of)
(NP (ADJ +tullich) (N +teowdom)))
(PP (P as)
(NP (NP (NPR$ godes) (ADJ freo) (N dohter))
(CONJP (CONJ &)
(NP (NP-POS (PRO$ his) (N$ sunes))
(N spuse))))))))
(, .)
(VB drehe)
(NP-ADV (Q+N nawiht)) ← nought
(NP-OB1 (SUCH swucches))
(. .))
(ID CMHALI,157.417))
( (IP-MAT-SPE (NP-OB1 (Q Alle) (PRO$ +tine) (CODE <P_47>) (NS +treates))
(NEG ne)
(VBP drede)
(NP-SBJ (PRO ich))
(IP-MAT-PRN (VBD q+d)
(NP-SBJ (PRO ha)))
(NP-ADV (QP (ADV riht) (Q noht))) ← nought
(. .))
(ID CMKATHE,47.442))
(NODE (NP (D $+te) (CODE {TEXT:bi+te}) (N mu+d)
(CP-REL (WNP-2 0)
(C +tt)
(IP (NP-SBJ *T*-2)
(NP-ADV (Q+N eawicht)) ← ought
(VBP (VBP21 mis) (VBP22 sei+d))
(NP-OB1 (PRO +te)))))
(ID CMANCRIW,II.100.1211))
(NODE (IP-SUB (NP-ADV (Q oghte)) ← ought
(NP-SBJ (PRO it))
(BEP es)
(NP-OB1 (QP (ADVR swa) (Q lyttill))
(CONJP (CONJ and)
(ADJP (ADVR swa) (ADJ schorte))))
(, ,)
(PP (P for)
(NP (OTHER othire) (NS thoghtes)
(CP-REL (WNP-1 0)
(C +tat)
(IP-SUB (NP-SBJ *T*-1)
(BEP are)
(PP (P in)
(NP (PRO thaym))))))))
(ID CMROLLTR,9.249))
(NODE (IP-SUB (NP-SBJ (PRO ic))
(NP-OB1 (PRO hit))
(NP-ADV (Q ouht)) ← ought
(VBP wite)
(, ,)
(PP (P to)
(NP (OTHER o+der) (NS +tinge))))
(ID CMVICES1,53.588))
(NODE (IP-SUB (NP-SBJ (PRO ha))
(BED wes)
(NP-ADV (Q+N sumdel)) ← somedeal
(VAN (VAN offruht) (CONJ ant) (VAN offert)))
(ID CMKATHE,29.161))
(NODE (IP-SUB (NP-SBJ (PRO ich))
(NP-OB1 (D +tis))
(IP-MAT-PRN (VBP sei+d)
(NP-SBJ (N warschipe)))
(NP-ADV (Q+N sumdel))
(VBP understonde))
(ID CMSAWLES,182.236))
( (IP-IMP (CONJ and)
(PP (P among)
(NP (QP (ADVR so) (Q muche))
(N ioye)))
(VBI antermete)
(NP-OB1 (PRO +te))
(NP-ADV (Q+WPRO sumwhat)) ← somewhat
(. ,))
(ID CMAELR3,40.410))
(NODE (IP-SUB (NP-SBJ (D +tis) (N worde)
(NP-PRN (N Gaste)))
(VBP sownnes)
(NP-ADV (Q+WPRO sumwhate)) ← somewhat
(PP (P into)
(NP (N fellenes))))
(ID CMEDTHOR,48.744))
Q+N, Q+WPRO (e.g., SOMETHING, SOMEWHAT)
To facilitate searches, quantified expressions of the form Q+N,
Q+WPRO (e.g., SOMETHING, SOMEWHAT) below the clause level are
always enclosed in NP-MSR brackets in the later corpora,
regardless of whether they are spelled as one word or two.
PPCME2                                             Later corpora
(ADJP (Q+WPRO somewhat) (ADJ late))                (ADJP (NP-MSR (Q+WPRO somewhat)) (ADJ late))
(ADJP (NP-MSR (Q some) (WPRO what)) (ADJ late))    (ADJP (NP-MSR (Q some) (WPRO what)) (ADJ late))
(QP (Q+WPRO sumdele) (QR moor))                    (QP (NP-MSR (Q+WPRO somewhat)) (QR more))
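The one-word rows of the table can be reproduced with a token-level rewriter that tracks the label of the enclosing phrase. This is a rough sketch, not the corpus tooling: the set of parent phrases treated as "below the clause level" (ADJP, ADVP, QP) is an assumption, the function name is hypothetical, only labelled nodes are handled, and two-word spellings such as (Q some) (WPRO what) are not covered.

```python
import re

MEASURE_TAGS = {"Q+N", "Q+WPRO"}
MODIFIER_PARENTS = {"ADJP", "ADVP", "QP"}  # assumption, not from the guidelines

def wrap_measure(text):
    """Enclose Q+N / Q+WPRO leaves in NP-MSR when the parent is ADJP, ADVP, or QP."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    out, stack, i = [], [], 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            label = tokens[i + 1]
            nxt = tokens[i + 2:i + 4]
            leaf = len(nxt) == 2 and nxt[0] not in ("(", ")") and nxt[1] == ")"
            if leaf:
                # Strip dash tags so that e.g. ADVP-TMP counts as ADVP.
                parent = stack[-1].split("-")[0] if stack else ""
                piece = f"({label} {nxt[0]})"
                if label in MEASURE_TAGS and parent in MODIFIER_PARENTS:
                    piece = f"(NP-MSR {piece})"
                out.append(piece)
                i += 4
            else:
                stack.append(label)
                out.append("(" + label)
                i += 2
        elif tok == ")":
            stack.pop()
            out.append(")")
            i += 1
        else:
            out.append(tok)
            i += 1
    return " ".join(out).replace(" )", ")")

print(wrap_measure("(ADJP (Q+WPRO somewhat) (ADJ late))"))
# (ADJP (NP-MSR (Q+WPRO somewhat)) (ADJ late))
```

A Q+N item that heads an ordinary NP, as in (NP (Q+N nothing) (ADJP (ADJ else))), is left alone because its parent is not in the assumed set.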
Splitting and joining words
In keeping with our general strategy of minimizing changes to the
annotation guidelines, most items are split or joined in the same way in
the later corpora as in the PPCME2. However, some items that are
treated as PPs in the PPCME2 (like AFTERNOON, TODAY, and TONIGHT) have a
wider distribution in Modern English (the afternoon, today's lecture)
and are therefore reclassified as unitary nouns. By contrast, the
distribution of fused forms like ALIVE and ASLEEP continues to reflect
their phrasal origin (*an asleep child), and so these items continue to
be tagged with complex (+) tags.
In a few cases (for instance, ALMIGHTY, BETIME(S)), we have changed the treatment of items in the later corpora for sheer convenience.