Beatrice Santorini, Curriculum Vitae

July 2023

Beatrice Santorini

https://www.ling.upenn.edu/~beatrice

Education

1983-1989 Ph.D. in linguistics. University of Pennsylvania. The generalization of the verb-second constraint in the history of Yiddish. Committee: Anthony Kroch, Ellen Prince, Jack Hoeksema. Also available on github.
1981-1982 Graduate work in linguistics. University of Konstanz, Germany.
1975-1980 Equivalent of B.A. in English, American studies and comparative linguistics. University of Tübingen, Germany.
1975 Abitur, Gymnasium Ernestinum, Celle, Germany. Specialization in Greek and Latin.

Positions held

2010- Senior fellow. Department of Linguistics, University of Pennsylvania.
July-August 2017 Visiting instructor (with Anthony Kroch and Emanuela Sanfelici). Summer school in "Historical linguistics", University of Göttingen. Methods in historical syntax.
1998-2010 Lecturer. Department of Linguistics, University of Pennsylvania.
1998-2000 Lecturer. Department of German, University of Pennsylvania.
1997-1998 Visiting assistant professor. Department of Linguistics, University of Pennsylvania.
1991-1997 Assistant professor. Department of Linguistics, Northwestern University.
Summer 1992 Visiting lecturer. Summer school, Dutch National Graduate Network for Fundamental Linguistics, University of Amsterdam.
1990-1991 Postdoctoral research fellow. Department of Linguistics, University of Pennsylvania. NSF grant BNS 89-19701, "Head-Complement Word Order in the History of the West Germanic Clause." Principal Investigator: Anthony Kroch, University of Pennsylvania.
1989-1991 Administrator, Penn Treebank Project. Department of Computer and Information Science, University of Pennsylvania. DARPA grant N0014-85-K0018, DARPA/AFOSR grant AFOSR-90-0066, ARO grant DAAL 03-89-C0031 PRI, and General Electric Corporation grant J01746000. Principal Investigator: Mitchell Marcus, University of Pennsylvania.
1989-1990 Postdoctoral research fellow (half-time). Department of Computer and Information Science, University of Pennsylvania. DARPA grant N00014-85-K-0018. Principal Investigator: Aravind Joshi, University of Pennsylvania.
1988-1989 Visiting instructor in linguistics. General Programs, Haverford College.

Other research activities

2020-2022 Principal Investigator (with Seth Kulick), NSF grant BCS 13-046668, "Annotating and extracting detailed syntactic information from a 1.1-billion-word corpus."
2017-2020 Research grant subawardee. "Enhancing data and tools for research and education on African American English." NSF grant BCS 13-58724, Principal Investigator: Tyler Kendall, University of Oregon.
2016-2020 Co-Principal Investigator (with Christina Tortora et al.). NSF grant BCS 16-29348, "Collaborative research: A corpus of New York City English: Audio-aligned and parsed."
2013-2016 Research consultant. "Studying variation in syntax: A parsed corpus of Swiss German." Swiss National Science Foundation Grant 100015_146450. Principal Investigator: Eric Haeberli, University of Geneva.
2012-2015 Co-Principal Investigator (with Christina Tortora). NSF grant BCS 11-51630, "Collaborative research: A syntactically annotated corpus of Appalachian English."
2012-2015 Research associate. Department of Linguistics, University of Pennsylvania. NSF grant BCS 11-47499, "Testing and improving methods for efficient annotation through the construction of a large parsed corpus." Principal Investigators: Anthony Kroch and Seth Kulick, University of Pennsylvania.
Summer 2010 Research consultant. PRSY 40668-00-01, "The comparative morpho-syntax of Appalachian English." Principal Investigator: Marcel den Dikken, City University of New York.
2005-2010 Research consultant. Canadian Social Science and Humanities Research Council, "Modéliser le changement: Les voies du français" (Modeling change: The paths of French), grant number 412-2004-1002. Lead Principal Investigator: France Martineau.
2004-2010 Research associate. Department of Linguistics, University of Pennsylvania. NSF grant BCS 05-08731, "A parsed historical corpus of Modern British English." Principal Investigator: Anthony Kroch, University of Pennsylvania.
2004-2008 Co-Principal Investigator (with Mark Liberman, Steven Bird, Susan Davidson, and Michael Maxwell). NSF grant BCS 03-17826, "Querying linguistic databases."
August 2004 Fulbright Senior Specialist. Consultant for the Tycho Brahe Corpus of Historical Portuguese in the project Rhythmic patterns, parameter setting and language change. Principal Investigator: Charlotte Galves, University of Campinas, Brazil.
2000-2004 Research associate. Department of Linguistics, University of Pennsylvania. NEH grant PA 23382-99, "Creating an electronic parsed corpus of Early Modern English," and NSF grant BCS 99-05488, "The emergence of Modern English syntax." Principal Investigator: Anthony Kroch, University of Pennsylvania.

Parsed corpora and related materials

2022- Revised and corrected version of Ann Taylor, Arja Nurmi, Anthony Warner, Susan Pintzuk, and Terttu Nevalainen. 2006. Parsed Corpus of Early English Correspondence, Compiled by the CEEC Project Team. York: University of York and Helsinki: University of Helsinki. https://github.com/beatrice57/pceec2
2021- Parsed examples from Ellegård 1953. https://github.com/beatrice57/ellegard-examples-parsed
2021- Penn Parsed Corpus of Historical Yiddish, 1.0. https://github.com/beatrice57/penn-parsed-corpus-of-historical-yiddish
2021- Somme des vices et vertus, various materials. https://github.com/beatrice57/somme-des-vices-et-vertus
2021- Anthony Kroch and Beatrice Santorini, eds. Penn-BFM Parsed Corpus of Historical French, version 1.0. https://github.com/beatrice57/mcvf-plus-ppchf
2021- France Martineau, Paul Hirschbühler, Anthony Kroch, and Yves Charles Morin, eds. MCVF Corpus, parsed, version 2.0. https://github.com/beatrice57/mcvf-plus-ppchf
2017 Christina Tortora, Beatrice Santorini, Frances Blanchette, and C.E.A. Diertani, eds. The Audio-Aligned and Parsed Corpus of Appalachian English (AAPCAppE), version 0.1. www.aapcappe.org
2017 Beatrice Santorini and Ariel Diertani. Syntactic annotation manual for audio-aligned parsed corpora. https://www.ling.upenn.edu/~beatrice/annotation-audio-aligned-corpora/index.html.
2010 Annotation manual for the Penn Parsed Historical Corpora of English and the York-Helsinki Parsed Corpus of Early English Correspondence. https://www.ling.upenn.edu/hist-corpora/annotation/index.html.
2010 Anthony Kroch, Beatrice Santorini, and Ariel Diertani. Penn Parsed Corpus of Modern British English. https://www.ling.upenn.edu/hist-corpora/PPCMBE2-RELEASE-1/index.html.
2004 Anthony Kroch, Beatrice Santorini, and Lauren Delfs. Penn Parsed Corpus of Early Modern English. https://www.ling.upenn.edu/hist-corpora/PPCEME-RELEASE-2/index.html.
1990 Part-of-speech tagging guidelines for the Penn Treebank Project. Department of Computer and Information Science, University of Pennsylvania, Technical Report MS-CIS-90-47.

Articles in refereed journals

2022 Molly Diesing and Beatrice Santorini. On the symmetry of V2 in Yiddish and some of its consequences for extraction. Journal of Germanic Linguistics 34, Special Issue 2: Yiddish, 186-208. https://doi.org/10.1017/S1470542721000131
2007 Technological and linguistic issues in the construction of parsed corpora. Sprache und Datenverarbeitung 31, 25-30.
1996 Shahrzad Mahootian and Beatrice Santorini. Code switching and the complement/adjunct distinction. Linguistic Inquiry 27, 464-479.
1995 Beatrice Santorini and Shahrzad Mahootian. Code-switching and the syntactic status of adnominal adjectives. Lingua 95, 1-27.
1993 Das Jiddische als OV/VO-Sprache. Linguistische Berichte 123, 230-245.
1993 The rate of phrase structure change in the history of Yiddish. Language Variation and Change 5, 257-283.
1993 Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics 19, 313-330. Reprinted in Susan Armstrong, ed., 1994, Using large corpora. Cambridge, MA: MIT Press. 273-290.
1992 Variation and change in Yiddish subordinate clause word order. Natural Language and Linguistic Theory 10, 595-640.

Book chapters

2020 Molly Diesing and Beatrice Santorini. The scope of embedded V2 in Yiddish. In Theresa Biberauer, Sam Wolfe, and Rebecca Woods, eds., Rethinking verb-second. Oxford University Press. 665-681.
2003 Ann Taylor, Mitch Marcus, and Beatrice Santorini. The Penn Treebank: An overview. In Anne Abeillé, ed., Treebanks. Text, speech and language technology, vol. 20. Springer. https://doi.org/10.1007/978-94-010-0201-1_1
1995 Two types of verb-second in the history of Yiddish. In Adrian Battye and Ian Roberts, eds., Clause structure and language change. Oxford: Oxford University Press. 53-79.
1994 Some similarities and differences between Icelandic and Yiddish. In Norbert Hornstein and David Lightfoot, eds., Verb movement. Cambridge: Cambridge University Press. 87-106.
1994 Young-Suk Lee and Beatrice Santorini. Towards resolving Webelhuth's paradox: Evidence from German and Korean. In Norbert Corver and Henk van Riemsdijk, eds., Studies on scrambling. Movement and non-movement approaches to free word-order phenomena (Studies in generative grammar 41). Berlin: Mouton de Gruyter. 257-300.
1991 Anthony Kroch and Beatrice Santorini. The derived structure of the West Germanic verb raising construction. In Robert Freidin, ed., Principles and parameters in comparative grammar (Current studies in linguistics 20). Cambridge, MA: MIT Press. 269-338.

Conference proceedings

2023 Seth Kulick, Neville Ryant, and Beatrice Santorini. Parsing Early English Books Online for linguistic search. In Proceedings of the Society for Computation in Linguistics, vol. 6, article 21. https://doi.org/10.7275/kr54-n102.
2022a Seth Kulick, Neville Ryant, and Beatrice Santorini. Parsing Early Modern English for linguistic search. In Proceedings of the Society for Computation in Linguistics 2022, 143-157. https://doi.org/10.7275/twww-ef90.
2022b Seth Kulick, Neville Ryant, and Beatrice Santorini. Penn-Helsinki Parsed Corpus of Early Modern English: First parsing results and analysis. Findings of the Association for Computational Linguistics: NAACL 2022, 578-593. Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.findings-naacl.44.
2014a Seth Kulick, Anthony Kroch, and Beatrice Santorini. The Penn Parsed Corpus of Modern British English: First results and analysis. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, MD: Association for Computational Linguistics. 662-667.
2014b Seth Kulick, Ann Bies, Justin Mott, Anthony Kroch, Mark Liberman, and Beatrice Santorini. Parser evaluation using derivation trees: A complement to evalb. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, MD: Association for Computational Linguistics. 668-673.
2013 Seth Kulick, Ann Bies, Justin Mott, Mohamed Maamouri, Beatrice Santorini, and Anthony Kroch. Using derivation trees for informative treebank inter-annotator agreement evaluation. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Atlanta, GA: Association for Computational Linguistics. 550-555.
1995 Owen Rambow and Beatrice Santorini. Incremental phrase structure generation and a universal theory of V2. In Jill Beckman, ed., Proceedings of the 25th Annual Meeting of the North-Eastern Linguistic Society. Amherst, MA: Graduate Linguistic Student Association. 373-387.
1994 Shahrzad Mahootian and Beatrice Santorini. Adnominal adjectives, code-switching and lexicalized TAG. In Anne Abeillé, Sophie Aslanides, and Owen Rambow, eds., 3e colloque international sur les grammaires d'arbres adjoints (Technical Report TALANA-RT-94-01), Paris: TALANA. 73-76.
1993 Caroline Heycock and Beatrice Santorini. Head movement and the licensing of non-thematic positions. In Jonathan Mead, Luc Moritz, and Martin Wessels, eds., Proceedings of the West Coast Conference on Formal Linguistics XI. Stanford: CSLI. 262-276.
1991 E. Black, S. Abney, D. Flickenger, C. Gdaniec, R. Grishman, P. Harrison, D. Hindle, R. Ingria, F. Jelinek, J. Klavans, M. Liberman, M. Marcus, S. Roukos, B. Santorini, and T. Strzalkowski. A procedure for quantitatively comparing the syntactic coverage of English grammars. In Proceedings of the DARPA Speech and Natural Language Workshop. 306-311.
1990 Eric Brill, David Magerman, Mitchell P. Marcus, and Beatrice Santorini. Deducing linguistic structure from the statistics of large corpora. In Proceedings of the DARPA Speech and Natural Language Workshop. 275-282.
1988 Anthony Kroch, Beatrice Santorini, and Caroline Heycock. Bare infinitives and external arguments. In Proceedings of the 18th Annual Meeting of the North-Eastern Linguistic Society, vol. 1, Amherst, MA: Graduate Linguistic Student Association. 271-285.
1985 Zero subject complements of the German verb lassen. In Jan Terje Faarlund, ed., Germanic linguistics. Papers from a symposium at the University of Chicago, Bloomington: Indiana University Linguistics Club. 135-156.

Other publications

2022 Seth Kulick, Neville Ryant, Beatrice Santorini, and Joel Wallenberg. A part-of-speech tagger for Yiddish: First steps in tagging the Yiddish Book Center Corpus. arXiv:2204.01175
2018 Christina Tortora, Beatrice Santorini, and Frances Blanchette. Romance parsed corpora: Editors' introduction. In Christina Tortora, Beatrice Santorini, and Frances Blanchette, eds. Linguistic Variation 18, Special Issue on Romance parsed corpora. Benjamins.
1996 Review of Hubert Haider, Susan Olsen, and Sten Vikner, eds., Studies in comparative Germanic syntax. American Journal of Germanic Linguistics and Literatures 8, 19-27.
1990 Mitchell P. Marcus, Beatrice Santorini, and David Magerman. First steps towards an annotated database of American English. Department of Computer and Information Science, University of Pennsylvania, Technical Report MS-CIS-90-46.
1988 Variable rules vs. variable grammars in the history of Yiddish. Groninger Arbeiten zur germanistischen Linguistik 29, 63-73.
1988 Beatrice Santorini and Caroline Heycock. Remarks on causatives and passive. Department of Computer and Information Science, University of Pennsylvania, Technical Report MS-CIS-88-33.
1987 Review of Werner Abraham, ed., Erklärende Syntax des Deutschen, Studies in Language 11, 261-272.

Invited presentations

February 2021 Using linguistic corpora to study grammars in situations of variation and change. Poster Day, English Linguistics, University of the Saarland.
December 2017 Anthony Kroch and Beatrice Santorini. Using the Penn Parsed Corpora of Historical English with CorpusSearch, Workshop, Waseda University.
December 2017 Anthony Kroch and Beatrice Santorini. Evidence for two kinds of OV word order. Exploiting parsed corpora: Applications in research, pedagogy, and processing. International symposium, National Institute of Japanese Language and Linguistics.
December 2017 Constructing a syntactically annotated corpus for grammatical research. Colloquium, National Institute of Japanese Language and Linguistics.
March 2017 Anthony Kroch and Beatrice Santorini. Detecting grammatical properties in usage data. Keynote address, 39th annual meeting of the German Society for Linguistics, University of the Saarland.
July 2014 Anthony Kroch and Beatrice Santorini. Evidence for "underlying" XV word order in Early Old French. Diachronic Generative Syntax 16, Budapest. (Also presented in September 2014 at the Yale Linguistics Monday Colloquium, Yale University.)
April 2014 Searching parsed corpora with CorpusSearch. GLEEFUL 21. Michigan State University.
June 2013 Using parsed corpora for diachronic research. LSA Summer Institute 2013, Workshop on Diachronic Syntax, University of Michigan, Ann Arbor.
April 2013 Anthony Kroch and Beatrice Santorini. The syntactic evolution of French as seen in the MCVF corpus. Linguistic Symposium on Romance Languages 43, Special Session on Parsed Corpora of Romance Languages, CUNY.
June 2011 Beatrice Santorini and Joel Wallenberg. Using annotated corpora for linguistic research. Diachronic Generative Syntax 13, University of Pennsylvania.
October 2007 Anthony Kroch and Beatrice Santorini. Finding a needle in a haystack: Using annotated corpora for linguistic research. NWAV 36, University of Pennsylvania.
August 2007 CorpusSearch Workshop, University of Ottawa.
April 2007 Technological and linguistic issues in the construction of parsed corpora. Workshop on Diachronic Corpora, Historical Syntax, and Text Technology, University of Frankfurt.
August 2004 Constructing a large parsed corpus of Early Modern English. IV Encontro de Corpora, University of São Paulo.
December 2003 Building and searching large parsed corpus of diachronic texts. Deutsch Diachron Digital, Humboldt University Berlin.
November 2003 Anthony Kroch, Beatrice Santorini, and Laura Whitton. Methods in the quantitative analysis of historical syntax. Workshop, NWAVE 32, University of Pennsylvania.
November 1995 An annotated diachronic corpus of Yiddish texts. Panel on the use of corpora in diachronic syntax. Diachronic Generative Syntax 4, University of Quebec at Montreal.
April 1994 A Tree-Adjoining Grammar (TAG) approach to code-switching. Department of Linguistics, Indiana University. Joint work with Shahrzad Mahootian.
April 1994 XV/VX in the history of Yiddish. Washington Area Generative Society, Georgetown University.
June 1992 Evidence for mixed phrase structure in Yiddish. 12. Groninger Grammatikgespräche, Rijksuniversiteit Groningen.
December 1991 Exploiting quantitative patterns in analyzing syntactic change. Department of Linguistics, University of Illinois at Urbana-Champaign.
October 1991 On some differences between Icelandic and Yiddish. Conference on Verb Movement, University of Maryland at College Park.
March 1991 Mitchell P. Marcus and Beatrice Santorini. Building very large natural language corpora: Some methodological considerations. Joint Meeting of the Association for Computing in the Humanities and the Association for Literary and Linguistic Computing, Arizona State University.
March 1991 Structural and quantitative evidence for variation and change in the history of Yiddish. Department of Linguistics, Northwestern University.
November 1990 Scrambling and INFL in German. Department of Linguistics, City University of New York.
October 1990 The history of V2 in Yiddish. Department of Germanic Languages, University of Lund.
May 1990 Real and apparent changes in the grammar of Yiddish. Department of Linguistics, University of Delaware.
May 1990 The case of V2 subordinate clauses in Yiddish. Jersey Syntax Circle, Princeton University.
May 1989 V2 in the history of Yiddish. Colloque sur Syntaxe Historique, Department of Linguistics, University of Quebec at Montreal.
February 1989 The use of diachronic syntax. Department of Linguistics, Cornell University. Also presented at the Department of Linguistics, Brown University, March 1989.
February 1989 Nominative case assignment in West Germanic. Department of Linguistics, Cornell University.

Unpublished conference presentations

July 2023 Héctor Vázquez Martínez and Beatrice Santorini. A Variational Model of the loss of English OV. Diachronic Generative Syntax (DiGS) 24, Université de Paris Cité.
May 2022 Seth Kulick, Neville Ryant, and Beatrice Santorini. Penn-Helsinki Parsed Corpus of Early Modern English: First parsing results and analysis. Poster session, 3rd International Workshop on Computational Approaches to Historical Language Change, Dublin.
January 2017 Christina Tortora, Beatrice Santorini and Greg Johnson. Infinitival perfects in Appalachian English: Modals vs. infinitival to. Poster session, Annual Meeting of the Linguistic Society of America, Austin, TX.
November 2016 Christina Tortora, Beatrice Santorini and Greg Johnson. Infinitival perfects in Appalachian English: Modals vs. infinitival to. NWAV 45, Simon Fraser University.
May 2016 Anthony Kroch and Beatrice Santorini. Evidence for OV word order in Old French, Icelandic, and Yiddish. FWAV 3, City University of New York.
September 2014 Anthony Kroch and Beatrice Santorini. On the word order of Early Old French. SinFonIJA 7, University of Graz.
July 2014 Anthony Kroch and Beatrice Santorini. Evidence for "underlying" XV word order in Early Old French. DiGS 16, University of Budapest.
December 2012 Anthony Kroch and Beatrice Santorini. The evolution of word order frequencies in medieval English and French. Historical Corpora 2012, Frankfurt am Main.
July 2012 Anthony Kroch and Beatrice Santorini. Comparing Old French and Middle English clause structure. Diachronic Generative Syntax 14, University of Lisbon.
September 2010 Anthony Kroch and Beatrice Santorini. Prosody, topicalization and verb-second: How they interact in the history of English and French. Fourth Workshop on Prosody, Syntax, and Information Structure, University of Delaware.
July 2009 Anthony Kroch and Beatrice Santorini. The comparative evolution of word order in French and English. DiGS 11, University of Campinas, Brazil.
January 1995 Antisymmetry and scope in West Germanic. Annual Meeting of the Linguistic Society of America, New Orleans, LA.
October 1993 Susan Pintzuk and Beatrice Santorini. Reanalysis: Precondition or consequence of syntactic change? New Ways of Analyzing Variation in English 22, University of Ottawa.
November 1992 Comments on Haeberli and Haegeman's "Old English word order: Evidence from negative concord." Diachronic Generative Syntax 2, University of Pennsylvania.
January 1991 AGR as argument: Evidence from scrambling in German. Annual Meeting of the Linguistic Society of America, Chicago, IL.
August 1990 Anthony Kroch, Beatrice Santorini, and Aravind K. Joshi. A TAG analysis of the German "third construction." First International Conference on Tree-Adjoining Grammars, Schloss Dagstuhl, Germany.
April 1990 Real and apparent syntactic changes in the history of Yiddish. Diachronic Generative Syntax 1, University of York, Great Britain.
December 1988 The generalization of the verb-second constraint in Yiddish. Annual Meeting of the Linguistic Society of America, New Orleans, LA.
October 1988 Against a uniform analysis of all verb-second clauses. Eastern States Conference on Linguistics 5, University of Pennsylvania.
May 1987 Anthony Kroch and Beatrice Santorini. Verb raising without the verb cluster. Comparative Germanic Syntax Workshop 4, McGill University. Also presented at the Verb Raising Workshop, Linguistic Society of America Summer Institute, Stanford University, July 1987.
December 1985 Anthony Kroch and Beatrice Santorini. Questioning the West Germanic verb cluster. Annual Meeting of the Linguistic Society of America, Seattle, WA.

Other unpublished work

1995 The syntax of verbs in Yiddish. Unpublished manuscript, Northwestern University. Also available on github.

Curriculum development

Redesigned Linguistics 110: The history of words.
Co-developed (with Jami Fisher) new course for the ASL curriculum, Linguistics 247: The structure of ASL.

Ph.D. dissertations supervised

1998 Maria Pilar Ron. The position of the subject in Spanish and clausal structure: Evidence from dialectal variation.
1993 Shahrzad Mahootian. A null theory of codeswitching.
Current position: Professor emerita, Department of Linguistics, Northeastern Illinois University.

Awards and honors

Fall 1997 Northwestern University Interfraternity Council/Panhellenic Association Award for Excellence in Teaching and Encouraging High Academic Achievement.

Professional and administrative service

Journals

Peer review

Other professional service

Administrative service

Winter 1997 Coordinator, Language and Cognition speaker series, Northwestern University.
1995-1996 Freshman adviser, Northwestern University.
1994 Member, Cognitive Science Liaison Committee, Northwestern University.
1992-1993 Member, Cognitive Science Program Subcommittee, Northwestern University.

Languages