Linguistics 520 -- Fall 2008 -- Lab 6

Effects of phrase position on duration

1. Record a digit-string set. This is a set of 100 grouped strings of the digits from 0 to 9, designed so that each digit occurs 10 times in each position, and each pair of digits occurs once spanning each pair of positions.
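The balance conditions above can be checked mechanically. The following is a minimal Python sketch, not the logic of digitstring.R; the function names are made up for illustration. It assumes each string is given as bare digits (no hyphen or space between groups), and it reads "each pair of positions" as every pair, adjacent or not, which is an assumption. Restricting the loop to adjacent positions gives the weaker check.

```python
from collections import Counter
from itertools import combinations


def position_counts(strings):
    """For each position, count how often each digit occurs there."""
    length = len(strings[0])
    return [Counter(s[p] for s in strings) for p in range(length)]


def pair_counts(strings, p, q):
    """Count each (digit at position p, digit at position q) combination."""
    return Counter((s[p], s[q]) for s in strings)


def is_balanced(strings, digits="0123456789"):
    """True if every digit is equally frequent in every position and every
    ordered digit pair occurs exactly once across every pair of positions."""
    length = len(strings[0])
    per_position = len(strings) // len(digits)
    for counts in position_counts(strings):
        if any(counts[d] != per_position for d in digits):
            return False
    for p, q in combinations(range(length), 2):
        pairs = pair_counts(strings, p, q)
        if any(pairs[(a, b)] != 1 for a in digits for b in digits):
            return False
    return True


# Toy 3-digit set (not one of the lab's sets): the third digit is the sum
# of the first two modulo 10, which makes all three columns balanced.
toy = ["%d%d%d" % (a, b, (a + b) % 10) for a in range(10) for b in range(10)]
print(is_balanced(toy))
```

Running the same check on a transcribed list is a quick way to catch a mistyped or skipped string before recording.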

The sets linked below are 7 digits long, grouped 3+4 in the style of U.S. telephone numbers.

You can record such a set in (nearly) any language. For English, the rules should be:

For reading in another language, please adapt these rules as appropriate.

Pick one of the ten sets here (1, 2, 3, 4, 5, 6, 7, 8, 9, 10); or make your own using digitstring.R and digitstring.sh.

The recording should be monophonic (single channel) and at a moderate sampling rate (typically 11025 Hz). If you end up recording at, e.g., 44.1 kHz stereo, convert the result before going on.
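For the specific case of 16-bit PCM recorded at an integer multiple of the target rate (44100 = 4 x 11025), the conversion can be sketched in Python with only the standard library. This is a crude illustration, not a recommended tool: it averages the channels and then averages each block of frames, which is only a rough low-pass filter. A proper resampler, such as Praat's own Resample command, is preferable for real data. The function names are hypothetical.

```python
import struct
import wave


def downmix_and_decimate(interleaved, channels, factor):
    """Average the channels to mono, then average each run of `factor`
    mono frames into one output sample (a crude anti-aliasing step)."""
    n_frames = len(interleaved) // channels
    mono = [
        sum(interleaved[i * channels:(i + 1) * channels]) / channels
        for i in range(n_frames)
    ]
    out = []
    for start in range(0, n_frames - factor + 1, factor):
        block = mono[start:start + factor]
        out.append(int(round(sum(block) / factor)))
    return out


def convert_wav(src_path, dst_path, target_rate=11025):
    """Write a mono, `target_rate` copy of a 16-bit PCM WAV file."""
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        rate = src.getframerate()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        if rate % target_rate != 0:
            raise ValueError("source rate must be a multiple of the target")
        raw = src.readframes(src.getnframes())
    # 16-bit little-endian signed samples, channels interleaved.
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    out = downmix_and_decimate(samples, channels, rate // target_rate)
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(target_rate)
        dst.writeframes(struct.pack("<%dh" % len(out), *out))
```

A call such as `convert_wav("mylist_44k.wav", "mylist_11k.wav")` (file names invented here) would produce the single-channel file to load into Praat.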

2. Use Praat to create a TextGrid interval tier, in which each digit string is segmented into its constituent digits.

3. Transfer your measurements to R (or another statistics program) for analysis.

Once you've recorded your material, found the segmentation points, and saved the .TextGrid file containing them, you could do the rest of the process with pencil and paper -- but this would be quite tedious.

The good news is that the rest of the labor can be entirely automated.

The bad news is that this requires some programming. So you'll either have to learn some (new) programming techniques, or get Josh or Mark to help you with this piece of the lab.

Some of this needs to be in Praat's built-in scripting language, to put durations (and measurements of pitch, formants and other parameters) into an output file in a form that is convenient for the program you will use for statistical analysis, graphing and so on. And some of the programming may need to take place in between Praat and the statistics program, so as to make the transfer easy at both ends.
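As an illustration of the kind of transfer involved, here is a minimal Python sketch that pulls labeled intervals out of a TextGrid and writes a tab-separated table of the sort that R's read.delim can load. It is not the GetDurations script, and it takes a different route (reading the saved file directly rather than scripting Praat). It assumes the TextGrid was saved in Praat's long text format, contains a single interval tier, and has already been read into a string; unlabeled intervals are treated as silence and skipped. The names `digit_durations` and `to_table` are invented for this example.

```python
import re

# One interval in Praat's long text format: index line, xmin, xmax, text.
INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([-+0-9.eE]+)\s*'
    r'xmax = ([-+0-9.eE]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def digit_durations(textgrid_text):
    """Return (label, start, duration) for every non-empty interval."""
    rows = []
    for xmin, xmax, label in INTERVAL.findall(textgrid_text):
        label = label.replace('""', '"').strip()  # Praat doubles quotes
        if label:
            start, end = float(xmin), float(xmax)
            rows.append((label, start, end - start))
    return rows


def to_table(rows):
    """Format rows as tab-separated text with a header line."""
    lines = ["label\tstart\tduration"]
    lines += ["%s\t%.4f\t%.4f" % row for row in rows]
    return "\n".join(lines)


# A hand-made fragment for illustration, not taken from the lab's files.
SAMPLE = '''
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.25
            text = ""
        intervals [2]:
            xmin = 0.25
            xmax = 0.5
            text = "five"
        intervals [3]:
            xmin = 0.5
            xmax = 0.875
            text = "two"
'''

print(to_table(digit_durations(SAMPLE)))
```

Keeping the start time alongside each duration makes it easy to spot a misplaced boundary later, since the rows should appear in strictly increasing time order.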

There are many ways to do this. Below you'll find links to one example of how it can be done for my own recordings of two number-string lists from this experiment.

You can get everything that you need in one (18 MB) zip file: LAB6.zip

Or you can fetch the contents individually as needed. (If you're only doing the duration part, you won't need the .wav files, but only the .TextGrid files and the scripts dodurations, GetDurations, and Digitstring1.R.)

The recordings:

MYL_List8.wav     MYL_NList4.wav      MYL_NList4a.wav

The TextGrids:

MYL_List8.TextGrid     MYL_NList4.TextGrid      MYL_NList4a.TextGrid

The shell script that runs the whole process: it extracts the parameters, transforms them for ease of access in the statistics system R, reads them into R, and creates some plots. This script should run on a Unix (Linux or Mac OS X) system like those in the lab. The comments in the script should help you pull pieces out and adapt them to your needs.

doitall

An alternative script that only does the (first) duration part of the process:

dodurations

Three needed Praat scripts (only the first one is needed for durations alone):

GetDurations      NumPitchScript       NumFormScript

Three needed R scripts (only the first one is needed for durations alone):

Digitstring1.R       Digitstring2.R       Digitstring3.R
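To give a rough idea of what the final duration analysis computes, here is a small Python sketch that averages digit durations by position within the string. It is not a translation of Digitstring1.R. It assumes the durations are already in recording order, with silences removed and exactly 7 digits per string; the function name is hypothetical.

```python
from collections import defaultdict


def mean_duration_by_position(durations, string_length=7):
    """Group a flat list of digit durations (seconds) by position in the
    string, numbered from 1, and return the mean duration per position."""
    if len(durations) % string_length != 0:
        raise ValueError("number of digits is not a multiple of the string length")
    by_position = defaultdict(list)
    for i, d in enumerate(durations):
        by_position[i % string_length + 1].append(d)
    return {pos: sum(v) / len(v) for pos, v in sorted(by_position.items())}
```

With a full 100-string set, each of the 7 means is taken over 100 tokens, 10 of each digit, so differences between positions are not confounded with differences between digits.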