Known issues

This page lists known issues in the annotation of the Penn historical English corpora:

Backwards gapping
Instances of backwards gapping may be mistagged as ordinary gapping.

CODE material is not always attached as high as possible.

Conjunction of unlike categories
When unlike categories are conjoined at the word level, the CONJP required by the guidelines may be missing.

Direct speech fragments
Fragments of direct speech may be labelled as FRAG rather than QTP.

The PPCME2 and the later corpora do not always agree on whether a given instance of post-head ELSE or ENOUGH is tagged as adjectival (ADJR) or adverbial (ADVR).

Exceptional Case-Marking (ECM) versus object control

Fragments, direct speech
See Direct speech fragments.

Infinitival adjuncts
Infinitival adjuncts may lack the -ADT dash tag.

Left-dislocation (-LFD)
In the PPCME2, PPs are annotated as left-dislocated only if the resumptive element is a matching PP, that is, only if the left-dislocated and resumptive PPs have the same preposition and the objects of the prepositions corefer. There are only a handful of examples.
( (IP-IMP (VBI Thynk)
	  (ALSO eek)
	  (CP-THT (C that)
		  (IP-SUB (PP-LFD (P of)
				  (NP (SUCH swich)
				      (N seed)
				      (PP (P as)
					  (CP-CMP (WNP-1 0)
						  (C 0)
						  (IP-SUB (PP (P (CODE {of}))
							      (NP *T*-1))
							  (NP-SBJ (NS cherles))
							  (VBP spryngen))))))
			  (, ,)
			  (PP-RSP (P of)
				  (NP (SUCH swich) (N seed)))
			  (NP-SBJ=2 *exp*)
			  (VBP spryngen)
			  (NP-2 (NS lordes))))
	  (. .))
  (ID CMCTPARS,314.C1.1108))

In the later corpora, the -LFD dash tag is used for such examples, but also more liberally to indicate a relationship between a pre-subject PP and any more broadly resumptive PP or ADVP.

( (IP-MAT-SPE (PP-LFD (P Though)
		      (CP-ADV-SPE (C 0)
				  (IP-SUB-SPE (NP-SBJ (PRO I))
					      (VBP beare)
					      (NP-OB1 (N record))
					      (PP (P of)
						  (NP (PRO$ my) (N selfe))))))
	      (, ,)
	      (ADVP-RSP (ADV yet))
	      (NP-SBJ (PRO$ my) (N record))
	      (BEP is)
	      (ADJP (ADJ true))
	      (. :))
  (ID AUTHNEW-E2-H,VIII,1J.1030))
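
Because the two annotation practices differ, it can be useful to list -LFD/-RSP pairs and see whether the resumptive element belongs to the same category as the left-dislocated phrase. The following is a minimal Python sketch, not part of the corpus tooling: the function names (parse, split_label, find_lfd_pairs) are hypothetical, the sample is a condensed version of the AUTHNEW example, and the same-category flag is only a rough proxy for the stricter PPCME2 criterion (it does not check that the prepositions are identical or that their objects corefer).

```python
import re

def parse(text):
    """Parse one bracketed tree into nested lists of the form [label, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1  # skip the opening "("
        items = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                items.append(node())
            else:
                items.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced tree: missing ')'")
        pos += 1  # skip the closing ")"
        return items

    if not tokens or tokens[0] != "(":
        raise ValueError("tree must start with '('")
    tree = node()
    if pos != len(tokens):
        raise ValueError("unbalanced tree: trailing material")
    return tree

def label(n):
    """Return the node label, or None for the unlabelled wrapper node."""
    return n[0] if n and isinstance(n[0], str) else None

def split_label(lab):
    """Split a label such as 'NP-SBJ=2' into ('NP', ['SBJ', '2'])."""
    base, *tags = re.split(r"[-=]", lab)
    return base, tags

def find_lfd_pairs(n):
    """Collect (parent, lfd_label, rsp_label, same_category) for sister nodes."""
    kids = [c for c in n if isinstance(c, list)]
    labels = [label(c) for c in kids if label(c)]
    lfd = [l for l in labels if "LFD" in split_label(l)[1]]
    rsp = [l for l in labels if "RSP" in split_label(l)[1]]
    pairs = [(label(n), l, r, split_label(l)[0] == split_label(r)[0])
             for l in lfd for r in rsp]
    for c in kids:
        pairs.extend(find_lfd_pairs(c))
    return pairs

# Condensed version of the AUTHNEW example.
SAMPLE = """( (IP-MAT-SPE (PP-LFD (P Though) (CP-ADV-SPE (C 0) (IP-SUB-SPE (NP-SBJ (PRO I)) (VBP beare) (NP-OB1 (N record)) (PP (P of) (NP (PRO$ my) (N selfe))))))
 (, ,) (ADVP-RSP (ADV yet)) (NP-SBJ (PRO$ my) (N record)) (BEP is) (ADJP (ADJ true)) (. :))
 (ID AUTHNEW-E2-H,VIII,1J.1030))"""

for pair in find_lfd_pairs(parse(SAMPLE)):
    print(pair)
```

Pairs whose last field is False (here, a PP-LFD resumed by an ADVP-RSP) are the ones that fall under the more liberal practice of the later corpora. The parser also raises ValueError on unbalanced brackets, which is a cheap sanity check before searching.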

Measure phrases (NP-MSR, QP)
Measure phrase modifiers of nouns may be tagged as NP-POS rather than NP-MSR.
The distinction between NP-MSR and QP is not straightforward, and it is likely that some instances of one category are mistagged as the other. Searches for one category should therefore generally include the other.
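
One way to follow this advice when working with the bracketed files directly is to count several labels in a single pass. The sketch below is an illustration in Python under stated assumptions, not a replacement for a proper corpus query: count_labels is a hypothetical helper, the sample string is an invented fragment (not corpus data), and a target is taken to match a label that either equals it or extends it with further dash tags, once any numeric index has been stripped.

```python
import re
from collections import Counter

def count_labels(text, targets):
    """Count nodes whose label equals a target or extends it with dash tags."""
    counts = Counter()
    # A label is the run of non-space, non-bracket characters right after "(".
    for lab in re.findall(r"\(([^\s()]+)", text):
        core = lab.split("=")[0]           # drop "=N" indices
        core = re.sub(r"-\d+$", "", core)  # drop a trailing "-N" index
        for t in targets:
            if core == t or core.startswith(t + "-"):
                counts[t] += 1
    return counts

# Invented fragment for illustration only (assumption, not corpus data).
SAMPLE = ("( (IP-MAT (NP-SBJ (PRO he)) (VBD walked) "
          "(NP-MSR (NUM three) (NS miles)) (QP (Q much)) (. .)) (ID HYPOTHETICAL,1.1))")

print(count_labels(SAMPLE, ["NP-MSR", "QP"]))
```

Passing both labels in one call makes it harder to forget the second category. The same helper would apply to other pairs of labels that are easily confused.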

Object control versus Exceptional Case-Marking (ECM)
See Exceptional Case-Marking (ECM) versus object control.

Participial clauses (IP-PPL)
The distinction between participial clauses functioning as adjuncts (IP-PPL) and complements (IP-PPL-OB1) is not implemented in the PPCME2. However, participial clauses functioning as complements are likely to be rare in that corpus.

Participial clauses versus reduced relative clauses
Reduced relatives (RRC) headed by participles are not always easy to distinguish from participial clauses (IP-PPL). Searches for one category should therefore generally include the other.

Participles, adjectival
Some adjectival uses of participles, notably passive participles, are likely to be mistagged as ADJ, contrary to the rule in Verbs and other categories.

Proper nouns
Many inconsistencies and outright errors likely remain with respect to the tagging of proper nouns (NPR).
The guidelines for proper nouns of the form THE N OF NP (THE WAR OF THE ROSES) have the counterintuitive result that none of the nouns is tagged NPR.

Purpose infinitives
Purpose infinitives are not always easy to distinguish from bare infinitives, and some infinitives that should be tagged as purpose infinitives may lack the -PRP tag, particularly in connection with go and send.

Reduced relative clauses versus participial clauses
See Participial clauses versus reduced relative clauses.

Resumptive elements (-RSP)
See Left-dislocation (-LFD).

Right-node raising
Not all instances of right-node raising are annotated with an index, particularly in the statutes.

Secondary predication versus small clauses
It is not always easy to distinguish instances of secondary predication from small clauses. See Secondary predicate NPs for a list of predicates that license NP-SPR and ADJP-SPR. See Small clauses for a list of predicates that license IP-SMC.

Single NP object with LIKE and similar verbs (LACK, NEED, WANT)
In the PPCME2, the experiencer argument of LIKE (and similar verbs) in the ME LIKE(N) PEARS construction may be mistagged as NP-OB1 rather than NP-OB2.

Small clauses versus secondary predication
See Secondary predication versus small clauses.

Verb fronting in free relatives
In COME WHAT MAY, the entire construction is currently treated as a free relative with movement or copying of an infinitive to the pre-wh position. In reality, the construction is a V1 optative clause containing a free relative (here, the subject WHAT MAY), either with or without elision. The current annotation is correct in those rare cases where verb fronting occurs in complement clauses.