Known issues

We hope to address the following issues in future releases:

ANOTHER is currently treated as a single lexical item, as in (i), but might be better split, as in (ii).
(i)  (D another)
(ii) (D an@) (ADJ @other)
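
If the split representation in (ii) were adopted, searches on whole word forms would need to rejoin the pieces. The following is a minimal sketch in Python, assuming only that a trailing @ on one leaf and a leading @ on the next mark the join, as in (ii). The function name rejoin_split_words is hypothetical and is not part of any corpus tool.

```python
def rejoin_split_words(leaves):
    """Rejoin leaves split with the @ convention, e.g. an@ + @other -> another.

    `leaves` is a list of word strings in surface order (an assumption;
    reading leaves out of the parsed files is not shown here).
    """
    words = []
    for leaf in leaves:
        # A leaf starting with @ attaches to a preceding leaf ending with @.
        if leaf.startswith("@") and words and words[-1].endswith("@"):
            words[-1] = words[-1][:-1] + leaf[1:]
        else:
            words.append(leaf)
    return words


print(rejoin_split_words(["an@", "@other", "day"]))
```

Leaves without a matching @ partner are passed through unchanged, so the sketch leaves unsplit tokens such as the current (D another) untouched.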

Complex "plus" tags
Morphologically complex quantifiers, which combine one of the quantifiers ANY-, EVERY-, NO-, or SOME- with one of the nominal heads -BODY, -ONE, or -THING, are currently tagged with special "plus" tags, as in (i), but might be better split, as in (ii).
(i)  (Q+N everyone)
(ii) (Q every@) (N @one)
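
Users who prefer the split representation in (ii) can derive it from the current one. The following is a minimal sketch in Python, assuming the inventory of quantifiers and nominal heads listed above is exhaustive. The names split_complex_quantifier and format_leaves are hypothetical and are not part of any corpus tool.

```python
# Inventory taken from the description above; assumed to be exhaustive.
QUANTIFIERS = ("any", "every", "no", "some")
HEADS = ("body", "one", "thing")


def split_complex_quantifier(word):
    """Split a Q+N word into (tag, text) leaves using the @ convention.

    Returns None if the word is not a quantifier + nominal head combination.
    """
    lower = word.lower()
    for q in QUANTIFIERS:
        for n in HEADS:
            if lower == q + n:
                # Preserve the original capitalization of each half.
                return [("Q", word[:len(q)] + "@"), ("N", "@" + word[len(q):])]
    return None


def format_leaves(leaves):
    """Render (tag, text) pairs in the bracketed style used in this section."""
    return " ".join(f"({tag} {text})" for tag, text in leaves)


print(format_leaves(split_complex_quantifier("everyone")))
```

Matching on the whole word rather than on a prefix keeps unrelated words that merely begin with one of the quantifiers from being split.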

Empty heads
Empty heads of phrases are not always explicitly included.

Extralinguistic material
There is no current annotation guideline for whether non-linguistic material such as coughs, laughter, and so on, forms part of an adjacent token or stands alone. In the future, we plan to impose a stand-alone default.

Extraposed elaborations on subjects inconsistently attach as sisters of the subject (daughters of IP) or as daughters of VP.

The corpus contains sporadic *ICH* traces of extraposition.

False starts
The internal structure of short false starts is sometimes annotated, contrary to the current guidelines.

There are no current guidelines about whether to annotate disfluent material as a single long false start or as a sequence of several shorter ones.

POS tags within false starts are more likely to contain errors than those in the main parsed structure.

According to the current guidelines, trailing hyphens are intended to mark incomplete words. In many cases, however, trailing hyphens mark complete words within false starts, notably the final word of a false start.

Some FRAGs should likely be annotated as IPs with missing subjects, and vice versa.

The conventions for hyphenation in the Praat transcripts and in the parsed corpus are not completely consistent. Users unable to find a hyphenated word from the parsed corpus in the Praat transcripts should keep this in mind and revise searches accordingly.
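
One way to revise such a search is to make it insensitive to hyphenation. The following is a minimal sketch in Python, assuming the transcript text is already available as a plain string (reading the Praat files is not shown). The function name hyphen_insensitive_pattern is hypothetical.

```python
import re


def hyphen_insensitive_pattern(word):
    """Build a regex matching `word` however it is hyphenated or spaced.

    Hyphens and spaces in `word` are dropped, and an optional hyphen or
    space is allowed between every pair of remaining characters.
    """
    chars = [re.escape(c) for c in word if c != "-" and not c.isspace()]
    if not chars:
        raise ValueError("word contains no searchable characters")
    return re.compile(r"\b" + r"[-\s]?".join(chars) + r"\b", re.IGNORECASE)


# Hypothetical transcript lines, for illustration only.
pattern = hyphen_insensitive_pattern("a-going")
for line in ["he was agoing home", "he was a-going home", "he was a going home"]:
    print(line, "->", bool(pattern.search(line)))
```

The pattern is deliberately permissive and can over-match, for example by treating two adjacent short words as one hyphenated word, so hits should be checked by hand.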

Interjections sometimes form separate sentence tokens, even where the current guidelines say they should not.

Punctuation in the AAPCAppE is not always consistent with the current guidelines. But since punctuation is not a part of the audio signal, even widespread inconsistency should not affect the usability of the corpus for linguistic research.

Even the punctuation in this documentation (in sections other than those devoted to punctuation) may be inconsistent with the guidelines.

There are likely to be instances of direct speech that are not marked as QTP.

Some material tagged as XX probably contains overlooked false starts.