Linguistics 001      Fall 2002     Homework 3     Due Mo 10/07

This assignment has three parts: estimation of vocabulary size, analysis of some English morphology, and exercises on aspects of the morphology of some other languages.

1. Estimation of vocabulary size

The first part of the assignment requires you to estimate the size of someone's "mental lexicon," by obtaining a sample of 100 words from a medium-sized English dictionary, and checking what fraction of that sample your subject knows.

Each time you access this link you'll get a random sample of 100 headwords from a recent Collegiate dictionary, sampled from a list with 78,984 entries overall. If you access this link instead, you'll get a sample from the same dictionary as a plain text file rather than an HTML page (for easier cutting and pasting into some word processors).

Question 1(a)

If you are a native speaker of English, you can use yourself as the subject. If you are not, then find a friend or acquaintance who is a native English speaker, and get them to do it for you.

Print out your sample of 100 words. For each one, answer yes ("I definitely know this word"), no ("I have no clue whatsoever about this word"), or maybe ("I have some sort of idea about this word, and could try to use it, but I might be at least partly wrong").

Note that some of the "entries" are words with internal spaces, such as time capsule or press box. You can treat these just like the solid entries -- it's no harder to figure out if you know what Ponzi scheme means than it is to see if you know the meaning of androcentric or microcode.

Of course, it isn't easy in any case -- without looking at the whole dictionary entry, you may sometimes be unsure whether your belief about a word's meaning is in fact correct. In such cases, you can look the word up in a regular paper dictionary, or on-line here:


In many cases, with or without checking the word in the dictionary, it's clear that you have partial knowledge -- for instance, you might know that exocrine is the opposite of endocrine, and has something to do with hormones and stuff, but be a little vague on exactly what it means beyond that. In this sort of case, you should score the word as "maybe."

If you total your "yes", "maybe" and "no" items separately, you'll wind up with a range of estimates, such as "I know between 65 (the count of "yes" answers) and 78 (the count of "yes" and "maybe" answers) of the items on my list."

Count up the answers in the three categories, and give the totals. What vocabulary size (relative to the contents of the overall dictionary) does this result suggest?

Note that you are not going to be graded on the size of your vocabulary! To get full credit, you just need to turn in your worksheet and the resulting counts. You will not get a higher grade if the answer is higher -- in fact we might get suspicious if the answer is too high. In order to match the estimated average word stock of U.S. high school graduates, you only need to "know" 51 of 100.

How to interpret your results

If you "know" 70 out of your sample of 100 entries, you can estimate that you know roughly 70/100 of the entire 78,984 entries, or 55,289. Perform the analogous computation based on your own answers. Give the results in two forms: one based on just the "yes" answers, and one based on the sum of the "yes" and "maybe" answers.
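
The scaling step can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this sketch, and the counts (65 "yes", 13 "maybe") are the illustrative figures from the example earlier in this handout, not real data. Substitute your own counts.

```python
DICTIONARY_SIZE = 78_984  # entries in the list the sample is drawn from
SAMPLE_SIZE = 100         # words in each random sample


def estimate_vocabulary(known, sample_size=SAMPLE_SIZE, total=DICTIONARY_SIZE):
    """Scale the number of sampled words known up to the whole dictionary."""
    return round(known * total / sample_size)


yes, maybe = 65, 13  # hypothetical counts; replace with your own totals
low = estimate_vocabulary(yes)           # based on "yes" answers only
high = estimate_vocabulary(yes + maybe)  # based on "yes" plus "maybe" answers
print(f"Estimated vocabulary: between {low} and {high} entries")
```

Calling estimate_vocabulary(70) gives 55289, matching the worked figure above.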

After everyone turns in the homework, we'll give you some graphs summarizing the answers from the class as a whole.

Question 1(b)

This estimate of your "mental lexicon size" is surely too low, since the wordlist we used doesn't include acronyms such as ASAP, DWI, fubar; company names like TWA, Exxon, Microsoft; institutional names like Harvard, Penn, Louvre; place names like Philadelphia, Albany, Katmandu; proper names like Clinton, Elvis, Socrates; and so on. Among these categories, you surely know several thousand (or even tens of thousands of) additional items. How could you make the estimate more accurate?

2. Some English morphology

A short passage from the abstract of a biomedical article is given below.

In this passage:

2(a) Find two examples of regular inflectional suffixes.

2(b) Find two examples of derivational suffixes that are at least semi-regular, in the sense that the pattern of derivation corresponds to many other examples in the language. In each case, give the base form, the category of the base form, the suffix, the combined form, and the category of the combined form. Example: rehabilitate (V) + -ion == rehabilitation (N). For each of your examples, also give an analogous case (a different word derived with the same suffix) not found in the cited passage.

2(c) Find a case where two productive derivational prefixes are used (in different sentences) with the same base verb. By "productive" we mean that the form and meaning of the combination are easily predictable from the form and meaning of the parts. In this case, the base forms (the verb forms without the prefixes) could also have been used in the same contexts, with a slightly different meaning.

2(d) Are there any inflectional prefixes in this passage?

NADPH-cytochrome-P450 reductase both purified from rat hepatic
microsomes and involved in microsomal fraction was inactivated by
treatment with alpha-lipoic acid. Since alpha-lipoic acid contains
disulfide bond, it reacts with SH-groups of the reductase via the
reaction of thiol-disulfide exchange resulting in the loss of the
enzyme reducing activity. NADP+ completely protected reductase from
the inactivation. The modification of reductase was reversible: the
modified enzyme was partially reactivated with dithiothreitol and
dihydrolipoic acid in the case when cytochrome c was used as a
substrate of reductase. In the case when inorganic substrate,
K3Fe(CN)6, was used for assay the activity of modified reductase no
reactivation was observed. It was found that the order of the reaction
of inactivation of membrane-bound microsomal reductase is equal to 1.2
+/- 0.2, which is in an agreement with pseudo-first order kinetics,
and the second-order-rate constant of 26 M-1min-1. The results have
shown that well known therapeutic agent alpha-lipoic acid is an
efficient inhibitor of both purified and microsomal reductase.

3. Morphology in Arabic and Nahuatl

These examples were contributed by Uri Horesh and Sergio Romero from languages they have worked on.

3.1 Syrian Arabic

In Syrian Arabic, any given noun is either feminine or masculine. In most cases where an animate noun can be of either gender, a suffix, -e or -a, is added to the otherwise masculine form to designate the feminine.


(1)  najjaar 'carpenter (m)'
     najjaara 'carpenter (f)'

(2)  Haddaad 'blacksmith (m)'
     Haddaade 'blacksmith (f)'
[note: a double consonant in the transcription denotes a geminate consonant, i.e., one which is pronounced doubly in length.]

There are two basic types of plurals in Syrian Arabic. Traditionally, they have been referred to as "sound plurals" vs. "broken plurals". Sound plurals are formed by adding a suffix to the base form of a noun, whereas broken plurals are formed by "changing around" the vowels within the base form of a noun, without adding anything at the end of it.

Examples of "sound plurals":

(3)  najjaariin 'carpenters (m)'
     najjaaraat 'carpenters (f)'
(4)  Haddaadiin 'blacksmiths (m)'
     Haddaadaat 'blacksmiths (f)'
[note: a double vowel in the transcription denotes a long vowel, i.e., one which is pronounced doubly in length.]

As can be inferred from (3) and (4), -iin is the sound plural marker for masculine nouns, and -aat is its feminine counterpart. Note that in the feminine forms, only the feminine plural suffix is present, not the feminine singular one.
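
The sound plural rule just described can be written out as a small Python sketch. The function names are invented for this illustration, and the sketch assumes nouns that behave like those in (1)-(4); it says nothing about which nouns take sound plurals rather than broken ones.

```python
FEMININE_SINGULAR_SUFFIXES = ("a", "e")  # as seen in examples (1) and (2)


def masculine_base(feminine_singular):
    """Strip the feminine singular suffix -a or -e to recover the base."""
    if feminine_singular.endswith(FEMININE_SINGULAR_SUFFIXES):
        return feminine_singular[:-1]
    return feminine_singular


def sound_plural(base, feminine=False):
    """Attach -aat (feminine) or -iin (masculine) to the base form."""
    return base + ("aat" if feminine else "iin")


print(sound_plural("najjaar"))                                  # najjaariin
print(sound_plural(masculine_base("Haddaade"), feminine=True))  # Haddaadaat
```

Note that the feminine plural is built on the base, not on the feminine singular, which is why the singular suffix is stripped first.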

Broken plurals have various forms. In many cases, they are predictable by looking at the corresponding singular forms.


(5)  xariiTa 'map (f)'
     xaraayeT 'maps'
(6)  baHr 'sea (m)'
     bHaar 'seas'
(7)  jariide 'newspaper (f)'
     jaraayed 'newspapers'
(8)  tall 'hill (m)'
     tlaal 'hills'

[note: 'x' denotes a voiceless uvular fricative, similar to 'ch' in German "Buch" 'book'; 'T' denotes an "emphatic" t-like consonant; 'H' denotes a voiceless pharyngeal fricative (use your imagination...).]

Here are a couple of things that you should notice about these examples:

  • Whether a noun takes a "sound plural" or a "broken plural", and the nature of the broken plural if one is used, depends (in these cases) on its pattern of consonants and vowels.
  • Double consonants work like any other sequences of consonants, so that /tall/ has the same consonant-vowel pattern as /baHr/.

Using these examples as a model, try to figure out what the plural forms of the following singular nouns would be:

3.1(a)  kalb 'dog (m)'
3.1(b)  xayyaaT 'tailor (m)'
3.1(c)  faDiiHa 'scandal (f)' 
[note: 'D' denotes an "emphatic" d-like consonant].
Answer the following questions:
3.1(d)  The word for 'seamstress' is the grammatical feminine of the word for 'tailor' (see [3.1(b)] above). What is the word for 'seamstress'?
3.1(e)  What is the word for 'seamstresses'?

3.2 Nahuatl

Nahuatl is a language spoken by about two million people in Central and Southern Mexico. It is a highly agglutinative language. As shown in the following examples, words have a very complex structure. Long chains of morphemes are often strung together forming words whose meaning can only be translated into English using full sentences.

[note: in the examples below, colons follow long vowels, and 'X' is a 'mystery category' that you will need to identify]

(1)  o:-ni- mitz-wal- tlaxkal- chi:wa-li
     PE-1SN-2SX- come-tortilla-make-  AP
     "I came to make tortillas for you"
PE    Perfect marker
1SN   1st Person Singular Subject marker
2SX   2nd Person Singular X marker
AP    Applicative suffix
(2)  ni- mitz-maka se  tlaxkal- li
     1SN-2SX- give one tortilla-AB
     "I give you one tortilla"
AB   Absolutive marker (Marks uninflected form of nouns)
(3)  ni- k-  maka no- kone  i:- kak
     1SN-3SX-give 1SP-child 3SP-sandal
     "I give my child his sandal"
3SX  3rd Person Singular X marker
1SP  1st Person Possessive marker
3SP  3rd Person Possessive marker
(4)  ni- k-  toka  masa:tl i:- te:ch tepe:tl
     1SN-3SX-chase deer    3SP-LOC   hill
     "I chase/hunt deer in the hill"
LOC   Locative (Same as prepositional "in" in English)

(5)  tla 0-  ne:ch-itta-s mo- na:n-  tzin, ni- k-  no:tza:-    s
     if  3SN-1SX-  see- F 2SP-mother-REV   1SN-3SX-call/salute-F
     "If your mother sees me, I will salute her."

3SN    3rd Person Singular Subject marker
1SX    1st Person Singular X marker
F      Future marker
2SP    2nd Person Possessive marker
REV    Reverential suffix 
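
One way to read the interlinear examples is to pair each morpheme with the label printed beneath it. The Python sketch below does this for example (2), retyped without the padding spaces used for alignment; the helper name is hypothetical, and the sketch only lines up morphemes with labels without interpreting them.

```python
def align_gloss(morpheme_line, gloss_line):
    """Pair each hyphen-separated morpheme with the gloss label below it."""
    pairs = []
    for word, gloss in zip(morpheme_line.split(), gloss_line.split()):
        morphemes = word.split("-")
        labels = gloss.split("-")
        if len(morphemes) != len(labels):
            raise ValueError(f"mismatch in {word!r} / {gloss!r}")
        pairs.extend(zip(morphemes, labels))
    return pairs


# Example (2), written without the alignment padding.
for morpheme, label in align_gloss("ni-mitz-maka se tlaxkal-li",
                                   "1SN-2SX-give one tortilla-AB"):
    print(f"{morpheme:8} {label}")
```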

Examine the Nahuatl sentences above and answer the following questions:

3.2(a) What does the X marker refer to in each example?
3.2(b) What seems to be its function(s)?
3.2(c) Is it ambiguous? Justify your answer.

