PII: S0911-6044(00)00002-6 Copyright ©
2000 Elsevier Science Ltd. All rights reserved.
Word association norms for two cohorts of British adults
Katherine W. Hirsh (a) and Jeremy J. Tree (b)
(a) School of Psychology, Cardiff University, PO Box 901, Cardiff CF10 3YG, UK
(b) Department of Psychology, University of Plymouth, Drake Circus, Plymouth PL4 8AA, UK
Available online 18
Word association data were obtained from two cohorts of British adults. Young
adults (21–30 years of age) and older adults (66–81)
responded to 90 words in a discrete word association task. An associative
frequency measure was calculated by counting how many participants produced a
particular word and then converting this number into a proportion. The degree of
overlap between the cohorts in terms of dominant responses, the responses with
the highest association frequencies, was moderate. Dominant responses were
common to the two cohorts for only 36 of the 90 items. When the top three
responses were considered the degree of overlap increased to approximately 60%.
Four measures of response heterogeneity were calculated for each stimulus item.
Comparison of the responses of the younger and older adults indicates that there
was less response heterogeneity amongst the older cohort. These norms should be
of use to investigators interested in developmental changes in the structure of
semantic memory across the adult lifespan as well as to researchers interested
in comparing results from neurologically impaired older adults to a normative
sample from the same age cohort.
Author Keywords: Ageing; Word association; Semantic memory
There has been a recent upsurge of interest in developmental changes in
semantic memory and other linguistic functions (see Burke ,
for a review with respect to language production). For example, the number of
studies investigating semantic priming in older adults has multiplied rapidly
over the past 10–15 years [2,
In part this increase stems from an interest in individuals with suspected
Alzheimer's type dementia, a disorder where semantic memory is thought to be
compromised ( 
although see 
for a dissenting view). While some studies have compared the responses of
dementing and non-dementing older adults on word association and related tasks,
as far as we are aware there are no normative data available on word association
in older British adults. We therefore set out to collect data on 90 words from
both a young and an older cohort of British adults.
Previous work by Burke and Peters 
suggests that the level of response variability in word association is not
influenced by age. Burke and Peters measured between-participant variability in
two ways, firstly by examining the proportion of each participant's responses
that were the first or second most popular response to each item and secondly by
counting the number of unique responses each participant gave (unique in that no
other individual in the cohort gave that response to that stimulus). They found
no contribution of participant age to either measure. They did find differences
however in the words that were given as the most popular (or dominant)
responses. There was only 60.5% overlap in the three most popular responses to
stimulus items across the two cohorts.
Burke and Peters 
also examined variability across cohorts in terms of the type of response given.
Following Deese's 
criteria, they classified responses as either paradigmatic or syntagmatic.
Paradigmatic responses were those that shared form class with the stimulus item
as well as sharing features in terms of meaning (e.g., boy–girl,
wise–clever, carrot–vegetable). Syntagmatic responses
were those from a different form class than the stimulus item and as such they
were words that could co-occur with the stimulus in a sentence
(e.g., blue–sky). Burke and Peters found that the majority of responses
were paradigmatic, and that the proportion of paradigmatic responses was not
influenced by age. This finding is important as several studies have revealed a
reduction in the proportion of paradigmatic responses in older adults with
dementia of the Alzheimer's type (DAT; [7,
suggesting that such a reduction might be a marker of a breakdown in semantic
In addition to providing normative data we report data on between-participant
agreement in our two cohorts. We were interested to determine whether there were
age-related changes in response variability. We utilised four agreement measures
to assess between-participant variability. We also examined variability in terms
of type of response produced. A change in the degree of variability or the type
of response given would suggest that there are age-related changes in the
structure of semantic memory. If DAT is indeed a function of changes in semantic
memory, finding an increase in the degree of variability or a change in the type
of response given in older adults would be consistent with the idea that there
is a continuum between old age and DAT [12
The participants were 90 adult volunteers. Forty-five of them were between 66
and 81 years of age (the older cohort) and the remainder were between 21 and 30
years of age (the young cohort). The educational attainment of the older cohort
was: secondary school to age 14 (10 participants); secondary school to age 15
(13 participants); secondary school to age 16 (13 participants); secondary
school to age 17 (2 participants); secondary school to age 18 (1
participant); and post-secondary education (6 participants). All participants in the
young cohort were first year postgraduate students in the School of Psychology,
Cardiff University. All participants were native speakers of British English.
The materials were 90 written words. The stimulus words were selected either
because they were the names of concrete, picturable objects or because it was
hoped that they would elicit the name of a concrete object as a frequent
response. The norms should therefore be particularly useful to researchers
interested in picture comprehension or picture naming. The norms may also be of
use to individuals looking to design semantically-based therapy programmes.
Each word appeared on a separate page in 36-point New York font. The first
letter of each stimulus word was capitalised. Forty-one of the stimulus words
were unequivocally nouns. The remaining 49 items could stand as members of more
than one form class. For example, the word blue is primarily thought of
as an adjective but it may also be used as a noun or a verb. The frequency of
the words ranged from 0 to 312 per million 
and the length ranged from 3 to 10 letters. Frequency values are available in Appendix
Participants were tested individually. They were instructed to write down
only their first response to each stimulus word. Items were ordered
pseudo-randomly such that neither semantically nor phonologically related words
occurred consecutively. No time limit was placed on the participants; however,
they were encouraged to respond quickly. The instructions to participants were
taken from Moss and Older :
In this booklet you will find a list of simple words. Please read each word
and then write down the first word it brings to mind. Write your response in the
space provided. For example, if the word is butter, the first
word you think of might be bread or milk or
cup. Please work through the booklet quickly. Remember, we are
interested in the word that comes to mind immediately, not after thinking about
it for a while. If you can't think of anything at all, leave the space blank.
Don't go back and change your mind about any of the words after you have written
your first response. Thank you very much for taking part in the experiment.
Participants recorded their responses in writing next to the stimulus words.
A typical participant took 15 min to complete the task.
Responses were tabulated and a full list of the stimuli, their associated
responses and the proportion of participants who produced each response can be
found in Appendix
A (unless otherwise noted, N=45). Following Moss and Older 
we treated morphological variants of a word as separate responses except in the
case of plurals.
Four measures of response heterogeneity (between-participants variability)
were computed for each stimulus item. The first of these was the association
frequency of the dominant response (ADOM). The dominant response is that
produced by the largest number of participants. Association frequency of a
response word is the proportion of participants who produced that particular
word in response to the stimulus word. Thus ADOM is the association frequency of
the word produced most often by participants in response to a target stimulus.
Dominant responses and their ADOM values can be found in Appendix
B. The second heterogeneity measure was the number of unique responses (NUR)
given to each stimulus word, that is, the number of different responses that
were each produced by only a single participant. The third measure
was the number of different responses that were each given by more than one
participant (NMR), as opposed to by only a single participant. The final heterogeneity
measure was the information statistic H which was calculated according to
the following formula:
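A minimal reconstruction, assuming the standard information statistic with base-2 logarithms (an assumption, although it does reproduce the values of 0.92 and 1.45 quoted in the example that follows), for a stimulus word that elicited k different responses:

```latex
H = \sum_{i=1}^{k} p_i \log_2 \frac{1}{p_i}
```

Here p_i is the proportion of participants who produced response i, that is, the association frequency of that response.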
The H statistic offers information about the distribution of
responses. If every participant gave the same response the H statistic would be zero.
Increasing values of H signal decreasing response agreement and typically
a smaller percentage of overlapping responses (i.e., an increase in the number
of unique responses). For example, consider two items where the dominant response
was produced by 30 people. In one case a single additional response was produced
by the other 15 participants, while in the other case three further responses were
each given by five people. These two items will have an identical dominant response
proportion (ADOM) but the latter item will have a higher H statistic than
the former (1.45 vs 0.92). Number of unique responses (NUR), Number of responses
given by multiple participants (NMR) and H values for each stimulus item
may be found in Appendix
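The four measures can be sketched in a few lines of code. The function name `heterogeneity` and the placeholder response labels are illustrative assumptions rather than part of the published procedure, and H is computed with base-2 logarithms, which is an assumption that matches the two worked values quoted above.

```python
from collections import Counter
from math import log2

def heterogeneity(responses):
    """Return (ADOM, NUR, NMR, H) for one stimulus word.

    `responses` holds one response string per participant.
    """
    counts = Counter(responses)
    n = len(responses)
    adom = max(counts.values()) / n                   # share giving the dominant response
    nur = sum(1 for c in counts.values() if c == 1)   # responses given by exactly one person
    nmr = sum(1 for c in counts.values() if c > 1)    # responses given by two or more people
    h = sum((c / n) * log2(n / c) for c in counts.values())  # assumed base-2 information statistic
    return adom, nur, nmr, h

# The two 45-participant items from the worked example (labels are placeholders).
item_a = ["r1"] * 30 + ["r2"] * 15
item_b = ["r1"] * 30 + ["r2"] * 5 + ["r3"] * 5 + ["r4"] * 5

result_a = heterogeneity(item_a)
result_b = heterogeneity(item_b)
print(round(result_a[3], 2), round(result_b[3], 2))   # 0.92 1.45
```

Both items share an ADOM of 30/45, but the second spreads the remaining responses over more alternatives and so has the higher H.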
As can be seen from Table
1, the proportion of young adults contributing dominant responses was lower
than the proportion of older adults producing dominant responses. A Wilcoxon
Signed Ranks test comparing stimulus ADOM values for the two groups confirmed
the significance of this difference (z=-3.25,
p<0.002). The NUR values were also greater for the young adults than
for the old (z=-4.58, p<0.0001). The NMR values were
greater for the young than for the old (z=-2.45,
p<0.02). As NUR+NMR is equal to the total number of different responses made
to a stimulus word, these two results indicate that the young adults produced
a larger number of different responses on average to a stimulus word than did
the older adults. Table
1 also shows that the H statistic values were higher for the younger
adults, indicating yet again that there was significantly less agreement amongst
this cohort in terms of their associative responses (z=-5.24,
p<0.0001). Overall the findings from the four agreement measures show
that younger adults were less consistent in their responses to the stimulus
words than were the older adults, that is, between-participant variability
decreased with age.
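The item-wise comparisons above pair each stimulus word's value in one cohort with its value in the other. As a rough sketch of how such a paired comparison yields a z value, the following implements the normal approximation to the Wilcoxon signed-ranks statistic. The function name `wilcoxon_z` and the eight pairs of counts are invented for illustration; the sketch applies no tie or continuity correction, so a statistics package may give slightly different values, and the sign of z depends on the order of subtraction.

```python
from math import sqrt

def wilcoxon_z(x, y):
    """Normal-approximation z for the Wilcoxon signed-ranks test on paired values."""
    diffs = [a - b for a, b in zip(x, y) if a != b]    # zero differences are dropped
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                       # average ranks over tied |differences|
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)          # no tie correction in this sketch
    return (w_plus - mean) / sd

# Invented data: number of participants (out of 45) giving the dominant
# response to each of eight hypothetical stimulus words.
young_counts = [21, 15, 18, 12, 25, 10, 14, 20]
old_counts = [33, 22, 20, 19, 24, 18, 25, 29]

z = wilcoxon_z(young_counts, old_counts)
print(round(z, 2))   # -2.38
```

With these invented counts the young cohort has the lower value on seven of the eight items, which is what drives z well below zero.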
Table 1. Mean agreement values for the young and the older cohort
We utilised linear regression to examine whether the frequency of occurrence
of the stimulus word affected any of the agreement measures. We found no
significant relationship between stimulus frequency and any of the agreement
measures for the younger cohort. However, in the older cohort, log stimulus
frequency was related to ADOM (r²=0.14, F(1,88)=15.09,
p<0.0005), NUR (r²=0.12, F(1,88)=12.68,
p<0.001) and the H statistic (r²=0.13,
F(1,88)=14.34, p<0.0005). As the stimulus frequency increased
so did the number of unique responses and the H statistic; the
association frequency of the dominant response decreased. Thus high frequency
items resulted in lower levels of agreement than low frequency items. The
dominant responses of both groups were significantly more frequent than the
stimulus items that evoked them (older cohort z=-2.58, p<0.01;
young cohort z=-3.02, p<0.005). There was no significant
difference however between the frequency of the dominant responses produced by
the two groups (z=-1.13, p=0.26).
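The regression of an agreement measure on log stimulus frequency can be illustrated with a short least-squares sketch. Everything below is an assumption made for illustration: the helper name `simple_regression`, the five invented data points, and the use of log10(frequency + 1), which is one common way of handling the zero-frequency items and is not necessarily the transformation used for the analyses reported here.

```python
from math import log10

def simple_regression(x, y):
    """Least-squares fit of y on x; returns slope, r squared and F on (1, n-2) df."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)          # proportion of variance accounted for
    f = r2 / (1 - r2) * (n - 2)          # F statistic for a single predictor
    return slope, r2, f

# Invented values: stimulus frequency per million and ADOM for five items.
freq_per_million = [0, 2, 11, 60, 312]
adom = [0.80, 0.71, 0.64, 0.50, 0.47]

log_freq = [log10(v + 1) for v in freq_per_million]   # +1 avoids log(0); an assumption
slope, r2, f_stat = simple_regression(log_freq, adom)
print(slope < 0)   # True: ADOM falls as frequency rises in this toy set
```

A negative slope for ADOM, as in this toy set, corresponds to the pattern described above in which higher frequency items attract lower agreement.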
In addition to assessing agreement in a cohort-internal fashion, we also
examined the degree of agreement across cohorts in terms of the dominant
response. There were a total of 36 stimulus items where the two age cohorts had
the same dominant response (40%). Of the remaining 54 items there were 13 (14%
of the total) where the dominant response for one cohort was not produced by
any member of the other cohort. When the top three responses are
considered the degree of overlap between the cohorts increases to 57% (cf. 60.5%
Finally, it is worth noting that some responses appeared as the dominant
response for more than one stimulus item, that is some responses were repeatedly
produced. The older adult cohort produced ache, clothes, dress, fire, pet(s),
pond, ring, sky and vegetable as the dominant response to two
different stimulus items and flower(s) was the dominant response given to
three different stimulus items. Thus although there were 90 stimulus items only
79 unique dominant responses were made by the older cohort. The young adult
cohort produced ball, fire and sky as the dominant response to two
stimulus items and cat was the dominant response given to three of the
stimuli. Thus there were 85 unique dominant responses made by the young participants.
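The cross-cohort comparison of dominant responses amounts to a simple tabulation, sketched below. The four stimulus words and their dominant responses are taken from examples mentioned in this paper, but all of the counts, and the non-dominant responses, are invented for illustration; the helper name `dominant` is likewise hypothetical.

```python
def dominant(counts):
    """Return the response produced by the most participants (first one wins a tie)."""
    return max(counts, key=counts.get)

# Invented counts (each row sums to 45); only the dominant responses follow the text.
young = {
    "knife": {"fork": 21, "cut": 10, "sharp": 14},
    "couch": {"potato": 18, "sofa": 15, "settee": 12},
    "basket": {"ball": 20, "shopping": 12, "case": 13},
    "skirt": {"dress": 16, "blouse": 14, "mini": 15},
}
old = {
    "knife": {"fork": 33, "cut": 7, "sharp": 5},
    "couch": {"settee": 25, "sofa": 15, "bed": 5},
    "basket": {"shopping": 24, "case": 11, "ball": 10},
    "skirt": {"blouse": 27, "dress": 13, "top": 5},
}

# Items on which both cohorts share a dominant response.
shared = [s for s in young if dominant(young[s]) == dominant(old[s])]
overlap = len(shared) / len(young)

# Number of distinct words that serve as a dominant response in one cohort.
distinct_old = len({dominant(c) for c in old.values()})

print(shared, overlap, distinct_old)   # ['knife'] 0.25 4
print(36 / 90)                         # 0.4, the reported 40% overlap
```

The same tabulation over all 90 items gives the reported figures of 36 shared dominant responses and 79 (older) or 85 (young) distinct dominant words.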
We adopted Bandera et al.'s 
classification of responses as hierarchical–categorical or
propositional–relational rather than the more commonly used
paradigmatic or syntagmatic. Hierarchical–categorical (corresponding
to paradigmatic) responses were those where the response and the stimulus were
from the same form class and included coordinates, subordinates, superordinates,
synonyms, antonyms and metonyms of the stimulus. A response was considered a
propositional–relational (syntagmatic) response when the stimulus and
response were "only syntactically contiguous within the frame of a
sentence...[the response was] an action related to the stimulus, or an attribute
of the stimulus, or a noun in a typical sentence containing the stimulus" [7,
p.296]. Note that in this system responses classified as syntagmatic may be from
the same form class as the stimulus. Classification of the dominant responses
was performed individually by each of the authors and then any discrepancies (of
which there were only 10 out of a possible 180) were resolved through
discussion. Participants in this study produced predominantly
propositional–relational responses. The young cohort produced
slightly more propositional–relational responses than did the older
cohort (70% and 60%, respectively). The reversal of the typical predominance of
paradigmatic-type over syntagmatic-type responses is due to classification
without regard to form class. When form class is the decisive factor for
classification, the dominance of paradigmatic responses re-emerges: 14%
syntagmatic responses for the older cohort and 13% for the young cohort.
We also categorised the propositional–relational stimulus-response
pairs in terms of whether the relationship was one of phrasal collocation (e.g.,
cave–man versus grill–bacon). Twenty-two of
the fifty-five propositional–relational dominant responses made by
the older cohort were phrasal collocations. Twenty-nine of the sixty-three
propositional–relational responses made by the young cohort were
phrasal collocations. Seven of the phrasal collocations were common to both
cohorts: cloak–dagger, daisy–chain,
drain–pipe, gas–fire, peanut–butter,
sly–fox and tummy–ache. It is worth noting that
a large number of the collocational responses would have been classified as
paradigmatic responses under a form-class criterion because the stimulus and
response were from the same form
class: alley and cat are not related in the same way as
alley and lane; however, both pairs are composed of two nouns. It
was because we felt that phrasal collocations were properly considered
"syntagmatic" that we selected Bandera et al.'s classification scheme 
rather than that of Deese .
In Appendix A we have provided a set of nouns that may be used by researchers interested
in studying semantic memory in participants who are speakers of British English.
Appendix A contains the word association responses of an older and a young cohort to
90 stimulus words and as such it offers age-appropriate normative data for the
construction of experimental materials. The importance of utilising
age-appropriate normative data is demonstrated through our analysis of
cohort-based differences in between-participant variability.
We have shown that there are cohort effects in word association. The degree
of overlap between the two groups in terms of their dominant responses showed
this very clearly. The same word was produced as the most popular response by
both young and old for only 36 of the 90 stimulus items presented. Even when the
top three responses to each item were considered the degree of overlap still
only reached 57%. When we look at the between-participant variability in
responding we also find differences between the cohorts. Younger adults produced
a wider variety of responses: fewer participants contributed to the
dominant response proportion (ADOM), more unique responses were produced (NUR),
more non-unique responses were produced (NMR) and as a result the average
H agreement statistic for the younger cohort was higher. If we look
instead at the response types we see that the two cohorts were quite similar.
Whether responses are classified as
hierarchical–categorical/propositional–relational or as
paradigmatic/syntagmatic, the two cohorts produced very similar proportions of
each response type. Taken together, the lack of agreement in terms of actual
responses produced and the similarity in terms of the type of responses produced
suggest that cohort effects are not due to changes in the structure of semantic
memory; rather, they can be related to differences in the content stored in
similarly structured semantic memory systems. The lack of evidence for
structural changes to semantic memory in old age suggests that the changes in
word association performance seen in dementia are qualitatively different to
what is seen in normal ageing.
One implication of these results is that word association data collected from
young adults may not be the ideal source of experimental materials for research
involving older adults. Cohort differences were most apparent in the
collocational stimulus-response pairs. Here only seven dominant responses were
common to both cohorts. Some of the differences appear to result from the
introduction into the language of new collocations: the dominant response to the
word couch was potato in the young cohort but settee in the
older cohort; for the word basket it was ball for the young participants
and shopping for the older participants. Other differences are more
difficult to explain in this way: the dominant response to the word skirt
in the young cohort was dress whereas in the older cohort it was blouse.
Moreover, even when the same response was dominant the level of agreement
differed across cohorts: 47% of the younger cohort produced the word fork
in response to the stimulus knife while in the older cohort fork
was produced by 73% of respondents.
One intriguing idea is that the higher level of between-participant agreement
seen in the older cohort may account for the finding in a meta-analysis that
"semantic priming effects are reliably larger for older than for younger adults"
(hyperpriming; [3, p.34]). If older participants are in general more likely to
produce the dominant response and dominance is a reasonable measure of
association strength, then the same item may be a potent prime for 40% of older
participants to whom it is presented but only 20% of young participants. In a
given experiment then, a larger proportion of older participants would be
receiving a highly associated prime and therefore the effects of prime
presentation appear to be augmented. This hypothesis, although rather
speculative, is amenable to experimental investigation. It would be possible to
select from these norms a set of stimuli where the same response was the
dominant one for both cohorts but the associative frequency of that response was
greater for one of the cohorts. If the hypothesis is correct one would expect to
see reliably larger associative priming effects in a sample from the age-cohort
that produced the dominant response more frequently be they old or young. These
norms will allow experimenters interested in the effects of ageing on language
to select materials in such a way as to equate associative frequency for the two
age-cohorts. It is possible that such matching could eliminate hyperpriming.
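The two selection strategies described above, contrasting and equating associative frequency across cohorts, could be sketched as a simple filter over the norms. In the illustration below only the knife–fork proportions (0.47 and 0.73) come from this paper; the other rows, the 0.05 tolerance and the helper name `split_by_gap` are hypothetical.

```python
def split_by_gap(norms, tolerance=0.05):
    """Split items that share a dominant response into matched and contrasting sets.

    An item is 'matched' when the two cohorts' ADOM values differ by no more
    than `tolerance`, and 'contrasting' otherwise.
    """
    matched, contrasting = [], []
    for row in norms:
        if row["young_dom"] != row["old_dom"]:
            continue                                   # different dominant responses: skip
        gap = abs(row["young_adom"] - row["old_adom"])
        (matched if gap <= tolerance else contrasting).append(row["stimulus"])
    return matched, contrasting

norms = [
    # knife/fork proportions are those reported in the text
    {"stimulus": "knife", "young_dom": "fork", "young_adom": 0.47,
     "old_dom": "fork", "old_adom": 0.73},
    # ADOM values invented; the dominant responses differ, so the row is skipped
    {"stimulus": "couch", "young_dom": "potato", "young_adom": 0.40,
     "old_dom": "settee", "old_adom": 0.56},
    # entirely hypothetical row with near-equal ADOM values
    {"stimulus": "placeholder", "young_dom": "resp", "young_adom": 0.60,
     "old_dom": "resp", "old_adom": 0.62},
]

matched, contrasting = split_by_gap(norms)
print(matched, contrasting)   # ['placeholder'] ['knife']
```

Contrasting items would suit a test of the hypothesis that priming tracks associative frequency, while matched items would suit studies that aim to equate prime strength across age-cohorts.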
The finding that for the older cohort there was an inverse relationship
between frequency of occurrence and three of the agreement measures also
deserves some discussion. In a paper examining the effects of frequency and
imageability on word association (in young adults), de Groot 
argued that conceptual representations of high frequency words may be linked to
a larger number of other conceptual representations than are the representations
of low frequency words. A wider range of links could account for the decrease in
proportion of participants providing the dominant response as well as the
increase in the number of unique responses: as the number of links increases so
too does the number of possible responses and arguably thereby the amount of
overlap between choices will decrease if selection is somewhat random. What is
difficult to explain however is why this effect occurred only for the older
Finally, we must note that in addition to differing in age our two cohorts
also differed in level of educational attainment. All of the young participants
were university graduates, however only one of the older participants was a
graduate. This does present a difficulty in interpreting the cohort effects, in
that they could be due to age differences or to educational differences. We do
not feel however that this lessens the value of the norms we have provided. The
level of educational attainment found in our older sample is typical of this
age-cohort and therefore the norms provided here are likely to be more
appropriate to this group whatever the underlying cause of the cohort
[1] D.M. Burke, Language production and aging. In: S. Kemper and R. Kliegl,
Editors, Constraints on language: Aging, grammar, and memory, Kluwer,
London (1999), pp. 3–28.
[2] N.L. Bowles, Age and semantic inhibition in word retrieval. Journal of
Gerontology: Psychological Sciences 44 (1989), pp.
[3] G.D. Laver and D.M. Burke, Why do automatic semantic priming effects increase in
old age? A meta-analysis. Psychology and Aging 8 (1993),
pp. 34–43.
[4] B.A. Ober, K. Shenaut, W.J. Jagust and R.C. Stillman, Automatic semantic priming
with various category relations in Alzheimer's disease and normal aging.
Psychology and Aging 6 (1991), pp. 647–660.
[5] A. Martin, Degraded knowledge representations in patients with Alzheimer's
disease: Implications for models of semantic memory and repetition priming. In:
L.R. Squire and N. Butters, Editors, Neuropsychology of Memory (2nd
ed.), Guilford Press, London (1992), pp. 230–232.
[6] R.D. Nebes, Semantic memory dysfunction in Alzheimer's disease: Disruption of
semantic knowledge or information-processing limitation? In: L.R. Squire and N.
Butters, Editors, Neuropsychology of Memory (2nd ed.), Guilford Press,
London (1992), pp. 233–240.
[7] L. Bandera, S. Della Sala, M. Laiacona, C. Luzzatti and H. Spinnler, Generative
associative naming in dementia of the Alzheimer's type. Neuropsychologia
29 (1991), pp. 291–304.
[8] L.R. Gewirth, A.G. Shindler and D.B. Hier, Altered patterns of word association
in dementia and aphasia. Brain and Language 21 (1984),
pp. 307–317.
[9] M.J. Santo Pietro and R. Goldfarb, Characteristic patterns of word association
responses in institutionalized elderly with and without senile dementia.
Brain and Language 26 (1985), pp. 230–243.
[10] D.M. Burke and L. Peters, Word associations in old age: Evidence for consistency in
semantic encoding during adulthood. Psychology and Aging 4:283–92.
[11] J. Deese, Form class and the determinants of association. Journal of Verbal
Learning and Verbal Behavior 1 (1962), pp.
[12] F.A. Huppert and C. Brayne, What is the relationship between dementia and normal
aging? In: F.A. Huppert and D.W. O'Connor, Editors, Dementia and normal
aging, Cambridge University Press, Cambridge (1994), pp. 3–11.
[13] E. LaBarge, D.A. Balota, M. Storandt and D.S. Smith, An analysis of
confrontation naming errors in senile dementia of the Alzheimer type.
Neuropsychology 6 (1992), pp. 77–95.
[14] Centre for Lexical Information, The Celex lexical database, The Max
Planck Institute for Psycholinguistics, Nijmegen (1993).
[15] H. Moss and L. Older, Birkbeck word association norms, Psychology
Press, Hove, UK (1996).
[16] A.M.B. de Groot, Representational aspects of word imageability and word
frequency as assessed through word association. Journal of Experimental
Psychology: Learning, Memory, and Cognition 15 (1989), pp.
Fax: +44-29-20874858; email: firstname.lastname@example.org