Linguistics 300, F08, Assignment 4

Due: M 9/22 (extended to W 9/24)

Warner 2005:262-263 gives two tables (Tables 2 and 4), in which he cross-tabulates Early Modern English texts from two time periods (1500-1575 and 1600-1712) by lexical complexity and percentage of auxiliary do. Your job in this assignment is to replicate Warner's results on the basis of the texts in the PPCEME. The spreadsheet containing the data for the assignment is here. The spreadsheet is a corrected version of the one in Assignment 3; the column headings are the same as in that assignment.

In particular, you will be filling in the cells of two tables of the following form:

Occurrence of do in texts of low versus high lexical complexity, xxxx-xxxx

               | Texts of low lexical complexity | Texts of high lexical complexity | Total
High DO %      |                                 |                                  |
Low DO %       |                                 |                                  |
DO % average   |                                 |                                  |

You will need to use your ingenuity to construct a measure of lexical complexity. Please begin by constructing a measure that is based on raw average word length and raw type/token ratio, as Warner does. In other words, use Columns F-H in the spreadsheet. Later on, if you want, you can experiment with the corresponding measures that are based on open-class items (Columns J-L). You can also experiment with incorporating sentence length (Column I) into your measure of lexical complexity.
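As a starting point, here is a minimal sketch of one possible way to combine average word length and type/token ratio into a single score. It is not necessarily the procedure Warner used. It assumes that each measure is standardized (converted to a z-score) across all texts in a period, that the two z-scores are simply added, and that texts above the median score count as "high" complexity. The function names, the input layout, and the sample values are all hypothetical; the numbers are made up and do not come from the spreadsheet.

```python
from statistics import mean, median, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # All values identical: no text differs from the mean.
        return [0.0 for _ in values]
    return [(v - m) / sd for v in values]


def complexity_classes(texts):
    """Map each text name to (score, 'high' or 'low').

    `texts` maps a text name to a pair
    (average word length, type/token ratio).
    Assumption: equal weighting of the two measures, median split.
    """
    names = list(texts)
    word_len_z = z_scores([texts[n][0] for n in names])
    ttr_z = z_scores([texts[n][1] for n in names])
    scores = [a + b for a, b in zip(word_len_z, ttr_z)]
    cutoff = median(scores)
    return {
        n: (s, "high" if s > cutoff else "low")
        for n, s in zip(names, scores)
    }


# Made-up values for illustration only (not from the PPCEME spreadsheet).
sample = {
    "text_a": (3.9, 0.10),
    "text_b": (4.1, 0.12),
    "text_c": (4.4, 0.15),
    "text_d": (4.6, 0.18),
}
classes = complexity_classes(sample)
for name, (score, label) in classes.items():
    print(name, round(score, 2), label)
```

Other choices are equally defensible, for instance ranking the texts on each measure and averaging the ranks, or weighting the two measures unequally. Whichever you choose, apply it separately to each time period and describe it in your write-up.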

Later on in the class, we will present a test that will allow you to evaluate the statistical significance of the various results that you obtain.

Together with the tables that you submit, please include a brief discussion of how you constructed the measure of lexical complexity and how you arrived at the figures in your tables, so that Caitlin and I can follow your reasoning.

In calculating the cells of the tables, please take the following points into account: