Linguistics 520: Design a Perception Experiment

10/16/2013 -- Due 10/23/2013


1. Learn about different speech-perception experiment designs
2. Learn about different issues that such experiments address
3. Design your own experiment and submit the plan: We'll pick one and carry it out together!

Here's an incomplete list of experimental paradigms used in speech perception research. I've left out techniques like eye-tracking and functional brain imaging that require equipment we won't have access to, and have focused on things you can do (these days) with a computer and a loudspeaker or a set of headphones. I've also left out techniques that use subject populations we can't easily supply, like ferrets or newborn infants.

One way to organize a taxonomy of speech perception experiments is by the type of response elicited [and how the responses are analyzed]:

1. Identification: What did (s)he say? [Distribution of errors; Confusion matrix]

2. Discrimination: Same or different? [Success rate as a function of whatever]

3. Auditory lexical decision: Was that a word or not? [Reaction time as a function of whatever]

4. Shadowing: Imitate while the talking continues. [Lag as a function of whatever]

5. Monitoring: Hit a key when you hear X. [Success rate, reaction time as a function of whatever]
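To make the first two analysis types concrete, here's a minimal Python sketch (the trial data are invented for illustration) that tabulates identification responses into a confusion matrix and reads off per-stimulus accuracy:

```python
from collections import Counter

# Hypothetical identification trials: (stimulus, response) pairs.
trials = [
    ("ba", "ba"), ("ba", "pa"), ("pa", "pa"),
    ("pa", "pa"), ("da", "ta"), ("da", "da"),
]

# Confusion matrix: counts[stimulus][response] = number of trials.
counts = {}
for stim, resp in trials:
    counts.setdefault(stim, Counter())[resp] += 1

# Per-stimulus accuracy is the diagonal of the (normalized) matrix.
for stim in sorted(counts):
    total = sum(counts[stim].values())
    accuracy = counts[stim][stim] / total
    print(stim, dict(counts[stim]), f"accuracy={accuracy:.2f}")
```

The off-diagonal cells (e.g. /ba/ heard as /pa/) are the "distribution of errors" -- in a real experiment you'd look at which confusions cluster by place, manner, or voicing.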

There's a long tail of other sorts of questions to ask about spoken stimuli: speaker characteristics (intrinsic, like sex, or transitory, like emotional state); similarity between exemplars; and so on. And there are responses in other modalities: gaze direction (determined by eye tracking); electrophysiological responses (N100, P300, N400, P600, "mismatch negativity"); etc.

Another way to look at things is in terms of the context or method of stimulus presentation:

1. Gating: Cut the stimulus off early and see what happens (e.g. in terms of identification).

2. Priming: Present other stuff first (speech, noises, words, pictures, ...).

3. Distortion: Add noise or otherwise mess speech up and see what happens.

4. Stream Selection: Combine streams of speech or other sounds and ask about one of them.

5. Dichotic listening: Present different things to the right and left ears.

6. Cross-modal perception: Use a (real or synthetic) talking face, or add other visual stimuli.
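As one concrete example of stimulus preparation, here's a sketch of how gated stimuli might be cut, assuming 16 kHz audio stored as a NumPy array; the function name, durations, and the short offset ramp (to avoid clicks at the cut point) are my own choices, not taken from any particular study:

```python
import numpy as np

def gate(signal, sr, gate_ms, ramp_ms=5):
    """Truncate signal at gate_ms, with a cosine offset ramp to avoid clicks."""
    n = int(sr * gate_ms / 1000)
    out = signal[:n].copy()
    r = int(sr * ramp_ms / 1000)
    if 0 < r <= len(out):
        out[-r:] *= 0.5 * (1 + np.cos(np.linspace(0, np.pi, r)))
    return out

sr = 16000
word = np.random.default_rng(1).standard_normal(sr)  # 1 s stand-in "word"
gates = [gate(word, sr, ms) for ms in (50, 100, 150, 200)]
```

In an actual gating experiment each successive gate would be played to listeners for identification, and you'd plot identification accuracy (or confidence) as a function of gate duration.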

We could also make lists of ways of choosing or creating stimuli -- this is a more open-ended list, which we'll discuss in class.

Researchers often use several different techniques to explore an effect. Thus the "phoneme restoration effect" was first explored by replacing a short piece of speech with a noise burst, and asking whether subjects could accurately determine which bit had been replaced (they can't) -- see Warren 1970 below. We can see this as a kind of "stream selection" experiment. Later, the same phenomenon was explored by modifying the signal-to-noise ratio in a short region of speech, and asking what the SNR threshold is for subjects to be able to hear the difference (the more redundant the material, the worse the performance) -- see Samuel 1981 below, which explores "phoneme restoration" using a discrimination task.
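Samuel's manipulation requires controlling the local signal-to-noise ratio. Here's a minimal sketch, assuming equal-length NumPy arrays, of scaling a noise segment so it mixes with speech at a chosen SNR (the function name and the stand-in signals are hypothetical):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so the speech/noise power ratio equals snr_db, then mix."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power: p_speech / p_noise_scaled = 10 ** (snr_db / 10)
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)  # stand-in "speech"
noise = rng.standard_normal(16000)
mixed = mix_at_snr(speech, noise, snr_db=0.0)
```

In a Samuel-style discrimination task, you'd apply this only to a short region of the word (e.g. one phoneme's worth of samples) and vary snr_db across trials to estimate the detection threshold.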

A few classic speech perception papers are given below. The point here is just to give you a sense of how and why people have deployed various techniques to learn about how speech perception works. In some cases, the emphasis is on the experimental techniques, and in other cases the paper is mostly about some point of theory.

Doreen Kimura, "Functional Asymmetry of the Brain in Dichotic Listening", Cortex 1967.
Richard Warren, "Perceptual Restoration of Missing Speech Sounds", Science 1970.
William Marslen-Wilson, "Linguistic structure and speech shadowing at very short latencies", Nature 1973.
David Pisoni and Joan Lazarus, "Categorical and noncategorical modes of speech perception along the voicing continuum", JASA 1974.
Francois Grosjean, "Spoken word recognition processes and the gating paradigm", Perception & Psychophysics 1980.
Arthur Samuel, "Phonemic Restoration: Insights From a New Methodology", Journal of Experimental Psychology 1981.
Paul Luce & David Pisoni, "Recognizing Spoken Words: The Neighborhood Activation Model", Ear & Hearing 1988.
Anne Cutler et al., "The monolingual nature of speech segmentation by bilinguals", Cognitive Psychology 1992.
Jean Andruski, Sheila Blumstein & Martha Burton, "The effect of subphonetic differences on lexical access", Cognition 1994.

Here are a few more recent speech perception papers:

Tessa Bent et al., "The Influence of Linguistic Experience on the Cognitive Processing of Pitch in Speech and Nonspeech Sounds", Journal of Experimental Psychology 2006.
Jennifer Pardo, "On phonetic convergence during conversational interaction", JASA 2006.
Alan Yu, "Perceptual Compensation Is Correlated with Individuals' 'Autistic' Traits: Implications for Models of Sound Change", PLoS ONE 2010.
Hannah Rohde and Marc Ettlinger, "Integration of Pragmatic and Phonetic Cues in Spoken Word Recognition", Journal of Experimental Psychology 2012.
Fei Chen et al., "Assessing the perceptual contributions of vowels and consonants to Mandarin sentence intelligibility", JASA 2013.
Megan Sumner, "A phonetic explanation of pronunciation variant effects", JASA 2013.

There's a useful exploration of part of this space in Grant McGuire, "A Brief Primer on Experimental Designs for Speech Perception Research", and a discussion from a different perspective in Katie Drager, "Conducting Speech Perception Experiments".

I'll add some more papers before Monday's lecture -- or you can poke around on Google Scholar for other research that relates to some topic of interest to you.

What you should do

Work out a way to explore some question that you care about, by finding or constructing a set of speech materials, presenting them in some controlled way to listeners, getting some pattern of responses, and analyzing the results.

Explain your plan briefly but clearly, and send me the explanation.

If you have questions about any aspect of this -- and you should! -- ask them on the class Piazza site so that everyone can learn from the discussion.

Proposals (as of 10/29/2013)

Number 1: Auditory discrimination in neurotypical and ASD populations: Speech and non-speech contexts
Number 2: Perception of English STRUT and TRAP vowels by Italian learners of English
Number 3: Processing costs in tracking Liaison and H-aspiré
Number 4: Effect of F0 on vowel identification
Number 5: Parameters of phonetic restoration
Number 6: Consonant voicing as a salient or natural category
Number 7: Relative contribution of vowels and consonants to Korean sentence intelligibility
Number 8: The effect of perceptual assimilation on Mandarin speakers' perception of English […™]
Number 9: Perception of reduced him vs. them
Number 10: "Reading" speech waveforms
Number 11: Audio-visual ("McGurk") effects on phoneme restoration
Number 12: Categorical perception of emotional speech in Mandarin Chinese
Number 13: Perception of Korean stop manner distinctions (plain/aspirated/tense)
Number 14: Perception of different split-/ae/ systems

"High-entropy speech recognition, automatic and otherwise"