Linguistics 001      Fall 2003     Homework 3     Due Mo 9/29

1. The first part of the assignment requires you to estimate the size of someone's "mental lexicon," by obtaining a sample of 100 words from a medium-sized English dictionary, and checking what fraction of that sample your subject knows.

Each time you access this link you'll get a random sample of 100 headwords from a recent Collegiate dictionary, drawn from a list of 78,984 entries overall. If you access this link instead, you'll get the sample as a plain text file rather than an HTML page (for easier cutting and pasting into some word processors).

How to do it

If you are a native speaker of English, you can use yourself as the subject. If you are not, then find a friend or acquaintance who is a native English speaker, and get them to do it for you.

Print out your sample of 100 words. For each one, answer yes ("I definitely know this word"), no ("I have no clue whatsoever about this word"), or maybe ("I have some sort of idea about this word, and could try to use it, but I might be at least partly wrong").

Note that some of the "entries" are words with internal spaces, such as time capsule or press box. You can treat these just like the solid entries -- it's no harder to figure out if you know what Ponzi scheme means than it is to see if you know the meaning of androcentric or microcode.

Of course, it isn't easy in any case -- without looking at the whole dictionary entry, you may sometimes be unsure whether your belief about a word's meaning is in fact correct. In such cases, you can look the word up in a regular paper dictionary, or on-line here:


In many cases, with or without checking the word in the dictionary, it's clear that you have partial knowledge -- for instance, you might know that exocrine is the opposite of endocrine, and has something to do with hormones and stuff, but be a little vague on exactly what it means beyond that. In this sort of case, you should score the word as "maybe."

If you total your "yes", "maybe" and "no" items separately, you'll wind up with a range of estimates, such as "I know between 65 (the count of 'yes' answers) and 78 (the count of 'yes' and 'maybe' answers) of the items on my list."

Count up the answers in the three categories, and give the totals. What vocabulary size (relative to the contents of the overall dictionary) does this result suggest?
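
As a minimal sketch of the tallying step, assume the worksheet is recorded as a mapping from each sampled word to "yes", "maybe", or "no". The function name and the three-word excerpt below are hypothetical, chosen only for illustration:

```python
from collections import Counter

def tally_answers(worksheet):
    """Count yes/maybe/no answers and return the totals plus the known-word range."""
    counts = Counter(worksheet.values())
    yes, maybe, no = counts["yes"], counts["maybe"], counts["no"]
    # Lower bound: only definite "yes" answers; upper bound: "yes" plus "maybe".
    return {"yes": yes, "maybe": maybe, "no": no, "low": yes, "high": yes + maybe}

# Hypothetical excerpt of a worksheet, for illustration only.
example = {"time capsule": "yes", "exocrine": "maybe", "androcentric": "no"}
result = tally_answers(example)
```

For a full 100-word worksheet, the "low" and "high" values are the two ends of the range of items known.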

Note that you are not going to be graded on the size of your vocabulary! To get full credit, you just need to turn in your worksheet and the resulting counts. You will not get a higher grade if the answer is higher -- in fact we might get suspicious if the answer is too high. In order to match the estimated average word stock of U.S. high school graduates, you only need to "know" 51 of 100.

How to interpret your results

If you "know" 70 out of your sample of 100 entries, you can estimate that you know roughly 70/100 of the entire 78,984 entries, or about 55,289 entries. This is surely an underestimate of your "mental lexicon", since this wordlist doesn't include acronyms such as UN, ASAP, DWI, fubar; company names like IBM, Coca-Cola, Microsoft; institutional names like Harvard, Penn, Louvre; place names like Philadelphia, Albany, Katmandu; proper names like Clinton, Elvis, Socrates; and so on. Among these categories, you surely know several thousand (probably tens of thousands of) additional items.
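
The scaling step is simple proportion arithmetic. Here is a minimal sketch, assuming only the dictionary size of 78,984 entries given above (the function and constant names are hypothetical):

```python
DICTIONARY_ENTRIES = 78_984  # size of the headword list the samples are drawn from

def estimate_vocabulary(known, sample_size=100, total_entries=DICTIONARY_ENTRIES):
    """Scale the fraction of known sample words up to the whole dictionary."""
    if not 0 <= known <= sample_size:
        raise ValueError("known must be between 0 and sample_size")
    # Multiply before dividing so the result is rounded only once.
    return round(known * total_entries / sample_size)

point = estimate_vocabulary(70)  # 70 known out of 100
low = estimate_vocabulary(65)    # "yes" answers only
high = estimate_vocabulary(78)   # "yes" plus "maybe" answers
```

With 65 "yes" answers and 78 "yes" or "maybe" answers, this gives a range of about 51,340 to 61,608 dictionary entries.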

After everyone turns in the homework, we'll give you some graphs summarizing the answers from the class as a whole.

2. Here is the second part of the assignment (inspired by an exercise in Farmer & Demers' A Linguistics Workbook):

In his novel A Clockwork Orange, Anthony Burgess invents a slang vocabulary based mostly on borrowings from Russian. In the early 1960s, when this novel was written, it was actually plausible that the direction of cultural influence might result in large-scale borrowings from Russian into English. If you're curious about why this might have been, read this.

A short passage from the beginning of Burgess' book is quoted below:

There was me, that is Alex, and my three droogs, that is Pete, Georgie and Dim, Dim being really dim, and we sat in the Korova Milkbar making up our rassoodocks what to do with the evening [. . .] The Korova Milkbar was a milkplus mesto, and you may, O my brothers, have forgotten what these mestos were like, things changing so skorry these days and everybody very quick to forget, newspapers not being read much neither. Well, what they sold there was milk plus something else. They had no license for selling liquor, but there was no law yet against prodding some of the new veshches which they used to put into the old moloko, so you could peet it with vellocet or synthemesc or drencrom or one or two other veshches which would give you a nice quiet horrorshow fifteen minutes admiring Bog And All His Holy Angels And Saints in your left shoe with lights bursting all over your mozg. Or you could peet milk with knives in it, as we used to say, and this would sharpen you up [. . .] and that was what we were peeting this evening I'm starting off the story with.

Match each of the words in boldface with one of the glosses given below.

  1. friend (noun)
  2. God (noun)
  3. a type of drug (noun)
  4. thing (noun)
  5. quickly (adverb)
  6. mind (noun)
  7. place (noun)
  8. milk (noun)
  9. to produce (verb)
  10. to drink (verb)
  11. brain (noun)

You should be able to do this fairly easily without cheating (that is, without looking the words up in an online glossary of nadsat).

In each case, give a morphological analysis that breaks the word down into its parts (if any). For each part, indicate whether it is a root, a derivational affix, or an inflectional affix.

For example, the word "Baghdadis" would be analyzed as

baghdad    i                     s
root       derivational affix    inflectional affix

Also in each case, describe briefly how the context gives you clues about the form and meaning of the word. For example, in figuring out the word "cistus", in the following sentence from Patrick O'Brian's Master and Commander

How did he manage to make a living in the sparse thin grass of that stony, sun-beaten landscape, so severe and parched, with no more cover than a few tumbles of pale stone, a few low creeping hook-thorned caper-bushes and a cistus whose name Stephen did not know?

you might observe that its phrasal context shows that it is a noun ("a __ whose name Stephen did not know"), that it occurs in a list whose previous member is "a few ... caper-bushes", and that in this situation it seems like a word for a type of plant, since it would work well to substitute things like "cactus" or "evergreen". And you would be right, as the OED would tell you -- cistus is

[a] genus of handsome shrubs (family Cistaceæ) known as Rock-Rose and Gum Cistus, with large spotted red or white flowers, which seldom last more than a few hours after expansion.

Finally, do you know any Russian? If so, (optionally) indicate what the Russian source is for (some of) the words in the paragraph above.
