Annotation guidelines
- English
- French
- Portuguese
- Spanish
Cheat sheets and tutorials
Presentations and workshops
DIGS 13, University of Pennsylvania, June 2011
NWAV 36, University of Pennsylvania, October 2007
CorpusSearch Workshop, University of Ottawa, August 2007
IV Encontro de Corpora, Universidade de São Paulo, August 2004
Deutsch Diachron Digital, Humboldt-Universität Berlin, December 2003
References
Parser
- Bikel, Daniel. 2004.
- On the parameter space of generative lexicalized
statistical parsing models. Ph.D. dissertation, Department of Computer
Science, University of Pennsylvania.
http://www.cis.upenn.edu/~dbikel/software.html ("Multilingual Statistical Parsing Engine")
- Collins, Michael. 1999.
- Head-driven statistical models for natural language
parsing. Ph.D. dissertation, Department of Computer Science, University
of Pennsylvania.
http://people.csail.mit.edu/mcollins/code.html
- Kulick, Seth, Daniel Bikel, and Anthony Kroch. 2006.
- Treebank construction by levels using constrained chart parsing.
Proceedings of the 5th International Conference on Treebanks and
Linguistic Theories, Prague, Czech Republic.
Tagger
- Brill, Eric. 1993.
- A corpus-based approach to language learning. Ph.D. dissertation,
Department of Computer Science, University of Pennsylvania.
- Florian, Radu, and Grace Ngai. 2001.
- Multidimensional transformation-based learning. Proceedings of CONLL '01, 1-8. Toulouse.
http://nlp.cs.jhu.edu/~rflorian/fntbl/
- Penn tagging wiki
- Instructions for using the fnTBL tagger
for Modern British English and for Middle French.
http://tagging.xwiki.com/xwiki/bin/view/Main/
Other
- Marcus, Mitch, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993.
- Building a large annotated corpus of English: The Penn Treebank.
Computational Linguistics 19, 313-330. Reprinted in Susan Armstrong, ed., 1994,
Using large corpora. Cambridge, MA: MIT Press. 273-290.
- Penn Treebank Project.
- http://www.cis.upenn.edu/~treebank
- Randall, Beth. 2005-2007.
- CorpusSearch 2.
http://corpussearch.sourceforge.net