PPCHE RELEASE NUMBER: 5
Last updated: July 15, 2025
RELEASE DATE: TBA
(babel: /htdocs/ppche/ppche-README)
Penn Parsed Corpora of Historical English

The Penn Parsed Corpora of Historical English (PPCHE) consist of running texts and text samples of British English prose across its history, from the earliest Middle English documents up to the First World War. They include three corpora:
A 2016 release of PPCHE added 2 million words to the Modern British English corpus, for a total of 3 million words, and included a substantial number of corrections to all three corpora. In addition, the 2016 annotation guidelines slightly streamlined earlier versions.

As of July 2025, the 2016 release is superseded by PPCHE2, which again corrects annotation errors and inconsistencies and further streamlines the annotation guidelines. Unlike earlier releases, PPCHE2 contains only tagged and parsed versions of the texts. It is available from the Linguistic Data Consortium (LDC) at the University of Pennsylvania under catalog number LDC2025T09. The 2016 release remains available under catalog number LDC2020T16.

For questions concerning distribution, please contact LDC (ldc AT ldc DOT upenn DOT edu). For other issues, contact Beatrice Santorini (beatrice DOT santorini AT gmail DOT com). We especially welcome reports of annotation errors or inconsistencies, so that we can continue to improve the quality of the corpora.
With respect to the above-listed grants, any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Endowment for the Humanities or the National Science Foundation.
Byland Abbey, Yorkshire. It was at abbeys like Byland, throughout Britain, that the manuscripts on which our knowledge of Middle English is based were largely written, copied, and preserved. The monastic orders that built and inhabited these monasteries were dissolved by Henry VIII, whereupon the buildings were dismantled for building materials by the landlords who succeeded to the monastic estates. Most of the abbeys' manuscripts were lost, but some came into private hands and so survived. Photo © A. Kroch 1998.