Linguistics 001 Fall 2007 Homework 4 Due Mo 10/08
1. The first part of the assignment requires you to estimate the size of someone's "mental lexicon," by obtaining a sample of 100 words from a medium-sized English dictionary, and checking what fraction of that sample your subject knows.
Each time you access this link you'll get a random sample of 100 headwords from a recent Collegiate dictionary, sampled from a list with 78,984 entries overall. If you access this link instead, you'll get a sample from the same dictionary as a plain text file rather than an HTML page (for easier cutting and pasting into some word processors).
How to do it
If you are a native speaker of English, you can use yourself as the subject. If you are not, then find a friend or acquaintance who is a native English speaker, and get them to do it for you.
Print out your sample of 100 words. For each one, answer yes ("I definitely know this word"), no ("I have no clue whatsoever about this word"), or maybe ("I have some sort of idea about this word, and could try to use it, but I might be at least partly wrong").
Note that some of the "entries" are words with internal spaces, such as time capsule or press box. You can treat these just like the solid entries -- it's no harder to figure out if you know what Ponzi scheme means than it is to see if you know the meaning of androcentric or microcode.
Of course, it isn't easy in any case -- without looking at the whole dictionary entry, you may sometimes be unsure whether your belief about a word's meaning is in fact correct. In such cases, you can look the word up in a regular paper dictionary, or on-line here:
In many cases, with or without checking the word in the dictionary, it's clear that you have partial knowledge -- for instance, you might know that exocrine is the opposite of endocrine, and has something to do with hormones and stuff, but be a little vague on exactly what it means beyond that. In this sort of case, you should score the word as "maybe."
If you total your "yes", "maybe" and "no" items separately, you'll wind up with a range of estimates, such as "I know between 65 (the count of 'yes' answers) and 78 (the count of 'yes' and 'maybe' answers together) of the items on my list."
Count up the answers in the three categories, and give the totals. What vocabulary size (relative to the contents of the overall dictionary) does this result suggest?
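If you keep your worksheet in electronic form, the tallying step can be sketched in a few lines of Python. This is only an illustration, not a required part of the assignment: the function name tally and the made-up list of answers are assumptions, chosen to reproduce the 65-to-78 example above.

```python
from collections import Counter

VALID_ANSWERS = {"yes", "maybe", "no"}

def tally(answers):
    """Count yes/maybe/no answers and return (counts, lower, upper).

    lower is the number of "yes" answers; upper adds the "maybe" answers.
    """
    counts = Counter(a.strip().lower() for a in answers)
    unexpected = set(counts) - VALID_ANSWERS
    if unexpected:
        raise ValueError(f"unexpected answers: {sorted(unexpected)}")
    lower = counts["yes"]
    upper = counts["yes"] + counts["maybe"]
    return counts, lower, upper

# Hypothetical worksheet (not real data): 65 yes, 13 maybe, 22 no.
answers = ["yes"] * 65 + ["maybe"] * 13 + ["no"] * 22
counts, lower, upper = tally(answers)
print(f"yes={counts['yes']} maybe={counts['maybe']} no={counts['no']}")
print(f"I know between {lower} and {upper} of the {len(answers)} items.")
```

The check for unexpected answers is there so that a typo on the worksheet (say, "mabye") is reported rather than silently dropped from the totals.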
Note that you are not going to be graded on the size of your vocabulary! To get full credit, you just need to turn in your worksheet and the resulting counts. You will not get a higher grade if the answer is higher -- in fact we might get suspicious if the answer is too high. In order to match the estimated average word stock of U.S. high school graduates, you only need to "know" 51 of 100.
How to interpret your results
If you "know" 70 out of your sample of 100 entries, you can estimate that you know roughly 70/100 of the entire 78,984 entries, or about 55,289. This is surely an underestimate of your "mental lexicon", since this wordlist doesn't include acronyms such as UN, ASAP, DWI, fubar; company names like IBM, Coca-Cola, Microsoft; institutional names like Harvard, Penn, Louvre; place names like Philadelphia, Albany, Katmandu; proper names like Clinton, Elvis, Socrates; and so on. Among these categories, you surely know several thousand (probably tens of thousands) of additional items.
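The scaling arithmetic can be checked with a short Python sketch. The function name estimate_vocabulary is an assumption made for illustration; the dictionary size of 78,984 and the sample size of 100 come from the assignment itself.

```python
DICTIONARY_SIZE = 78_984  # entries in the sampled word list (from the assignment)

def estimate_vocabulary(known, sample_size=100, dictionary_size=DICTIONARY_SIZE):
    """Scale the number of known sample words up to the whole dictionary."""
    if not 0 <= known <= sample_size:
        raise ValueError("known must be between 0 and sample_size")
    return round(known * dictionary_size / sample_size)

# 70 known words out of 100, as in the example above.
print(estimate_vocabulary(70))   # 55289

# A yes-only count and a yes-plus-maybe count give a low and a high estimate.
low, high = estimate_vocabulary(65), estimate_vocabulary(78)
print(f"between {low:,} and {high:,} entries")
```

Because the sample has only 100 words, each word stands for about 790 dictionary entries, so the estimate is rough: a different random sample could easily shift it by a few thousand entries.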
After everyone turns in the homework, we'll give you some graphs summarizing the answers from the class as a whole.
2. Here is the second part of the assignment (based on an exercise in Farmer & Demers' A Linguistics Workbook):
The table below gives a set of 22 words in Telugu, each of which is translated by an English sentence. (This is an uncharacteristically simple sample of Telugu verb morphology.)
A. List the Telugu morphemes corresponding to these English words:
B. List the order in which the morphemes occur in the Telugu words. Use terms such as verb, tense, and subject.