Linguistics 520: Lab Assignment #1

8/24/2013: Due 9/4/2013


1. Access to basic computer programs.
2. Recording audio.
3. Understanding SNR (signal to noise ratio) and measuring it.
4. Extracting and saving segments from a longer recording.

If you haven't already done it, install Praat and Audacity on your computer. If you don't have a computer of your own, you can use the versions installed on computers in the Phonetics Lab.

If you don't have a decent-quality microphone, use one of those that are available in the phonetics lab; or buy an inexpensive head-mounted headphone/microphone combination. The ones with USB connections generally produce better results -- they cost $10-30 online or at the Computer Connection.

Experiment with using that arrangement to make a series of recordings with different amounts of background noise -- in a reasonably quiet place (like the phonetics lab), in a quieter place (like the sound booth in the lab), or in a place with more background noise (like the phonetics lab with your classmates rustling papers or drumming their fingers on the table). In each case, check the level of background noise and the level of the voice; in the lab session, we'll show you how to measure them and compare them so as to estimate the signal-to-noise ratio ("SNR"). Get a sense of what recordings at different SNRs look like.
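As a rough illustration of the comparison (not the procedure we'll use in the lab session), suppose you have picked out one stretch of samples containing only background noise and another containing speech. One common estimate takes the ratio of their root-mean-square (RMS) amplitudes and expresses it in decibels. The function names and the sample values below are made up for the sketch.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of sample values."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def snr_db(speech_samples, noise_samples):
    """Estimate SNR in dB as 20 * log10(RMS of speech / RMS of noise)."""
    return 20.0 * math.log10(rms(speech_samples) / rms(noise_samples))

# Made-up sample values, for illustration only
noise = [0.01, -0.01, 0.01, -0.01]
speech = [0.1, -0.1, 0.1, -0.1]

# The speech stretch has ten times the RMS amplitude of the noise stretch
print(round(snr_db(speech, noise), 1))
```

Note that the "speech" stretch of a real recording also contains the background noise, so at low SNRs this simple ratio will overestimate somewhat.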

Familiarize yourself with the concepts of "amplitude" and "power" as applied to audio signals, and with the notions of "bel" and "decibel" as relational measurement units. We will go over these concepts in class -- some background is here.
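For orientation, here is a small sketch of the standard definitions: a bel is the base-10 logarithm of a power ratio, and a decibel is a tenth of a bel. Since power is proportional to amplitude squared, the same ratio expressed in terms of amplitudes picks up a factor of 20 rather than 10. The function names are just for this illustration.

```python
import math

def power_ratio_to_db(power, reference_power):
    """Decibels from a ratio of powers: 10 * log10(P / P_ref)."""
    return 10.0 * math.log10(power / reference_power)

def amplitude_ratio_to_db(amplitude, reference_amplitude):
    """Decibels from a ratio of amplitudes: 20 * log10(A / A_ref).

    Power goes as amplitude squared, and log10(x ** 2) == 2 * log10(x),
    which is where the factor of 20 comes from.
    """
    return 20.0 * math.log10(amplitude / reference_amplitude)

# Ten times the power is one bel, i.e. 10 dB
print(power_ratio_to_db(10.0, 1.0))

# Doubling the amplitude is about 6 dB
print(round(amplitude_ratio_to_db(2.0, 1.0), 2))
```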

Your homework for next week has three parts.

1. The first part is to record yourself (or someone you can persuade to do it for you) reading 100 ten-digit number sequences. (Or, for variety, choose a second version or a third version.) These lists have been crafted so that every digit occurs equally often in every position (10 times), and every pair of digits occurs once spanning every adjacent pair of positions. We'll discuss in the lab session how to read these sequences: the basic rules are

(a) Read the digits as separate words, so that 681-103-7539 is read as "six eight one, one zero three, seven five three nine";
(b) Use the hyphens as guidance to group the digits, in the style of telephone numbers;
(c) Despite the grouping, avoid silent pauses within any ten-digit string;
(d) Read the string number before each digit string, as (8) 657-27-4174 "Number eight [pause]: six five seven, two seven, four one seven four";
(e) If you're reading in English, use "zero" rather than "oh" for digit 0.

2. After you've recorded the 100-string set (which should take a little less than ten minutes), the second part of the assignment is to cut the recording up into 100 individual files, each containing one of the digit strings (without the introductory string number). Leave at least a tenth of a second of silence around each selection. There are various ways to do this -- the slowest and most painful is to select each region (in Praat or Audacity) and export it to an appropriately named file. This method should not take you more than an hour for the whole job. (Later on, we'll learn some other, less labor-intensive ways to do this sort of thing -- or you can find for yourself a way to get Praat or Audacity to do this more conveniently -- but you should know the basic slow and reliable method...)

3. The third part of the assignment is to listen to the 100 files and make sure that you read the right number-string in the right way, and cut it out correctly. Again, there are more and less painful ways to do this -- over the course of the term, we'll learn some labor-saving techniques.
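If you want to convince yourself that a list has the balance properties described in part 1, a check along the following lines would do. This is only a sketch of a checker, not the code that generated the lists; the function name is made up, and it assumes the strings are given as text with optional hyphens.

```python
from collections import Counter

def check_design(strings):
    """Return True if 100 ten-digit strings are balanced as described in part 1.

    Checks that every digit occurs 10 times in every position, and that
    every ordered pair of digits occurs exactly once across every pair of
    adjacent positions. Hyphens are ignored.
    """
    digits = [s.replace("-", "") for s in strings]
    if len(digits) != 100:
        return False
    if any(len(d) != 10 or not set(d) <= set("0123456789") for d in digits):
        return False
    # Each of the 10 digits should occur exactly 10 times in each position
    for pos in range(10):
        counts = Counter(d[pos] for d in digits)
        if len(counts) != 10 or set(counts.values()) != {10}:
            return False
    # With 100 strings, 100 distinct pairs means each possible pair occurs once
    for pos in range(9):
        pairs = Counter(d[pos:pos + 2] for d in digits)
        if len(pairs) != 100:
            return False
    return True

# Usage, assuming `prompts` is your list of 100 strings such as "681-103-7539":
#     check_design(prompts)
```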

You'll use this recording later, to learn how to do phonetic segmentation, and to look at the prosodic effects of phrasing.

N.B. You'll find it easier to proof-listen your reading and your file division if the 100 individual audio files that you've extracted collate (e.g. in folder listings) in the same order as the prompts. Thus you might use a naming scheme in which each file name includes the string number, written with a fixed number of digits.

The leading zeros will help keep numerical order and sorting order in sync.
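To see why the leading zeros matter, here is a small sketch that builds 100 zero-padded file names and compares their sorting order with that of unpadded names. The "lab" prefix and the three-digit width are just assumptions for the example; use whatever prefix you like.

```python
def segment_filename(n, prefix="lab", ext=".wav"):
    """File name for digit string number n, zero-padded to three digits."""
    return f"{prefix}{n:03d}{ext}"

padded = [segment_filename(n) for n in range(1, 101)]
print(padded[0], padded[9], padded[99])   # lab001.wav lab010.wav lab100.wav

# Zero-padded names sort alphabetically in the same order as the prompts
print(sorted(padded) == padded)           # True

# Unpadded names do not: string 10 and string 100 sort ahead of string 2
unpadded = [f"lab{n}.wav" for n in range(1, 101)]
print(sorted(unpadded)[:3])               # ['lab1.wav', 'lab10.wav', 'lab100.wav']
```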

Notes for more advanced students:

This is not part of the basic content of the course -- it's here for completeness, and to keep students with more background from getting bored...

(1) The Python code I used to make the number lists is here.

(2) Eric Doty has contributed a Praat script for cutting up a sound file based on the segments in a TextGrid.

(3) If your audio file is called Lab1.wav, and you use Audacity to create and export a label file Lab1.alab with something like the following content:

3.715193 7.523265 1
9.160272 12.678095 2

(meaning that the segment labelled "1" extends from 3.715193 seconds to 7.523265 seconds, etc.), then this simple GNU Octave program, invoked as e.g.

cutitup Lab1

will cut up Lab1.wav into Lab1X001.wav, Lab1X002.wav, etc.

If you use Praat to do the segmentation, and write out your TextGrid in the form of a tab-separated table, it might look like this:

tmin text tmax
3.715193 lab001 7.523265
9.160272 lab002 12.678095

and an appropriately-modified version of the same program might do what you want.
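For those who would rather work in Python, the same idea can be sketched with the standard library's wave module. This is a rough analogue under stated assumptions, not a translation of the Octave program: the function names are made up, the label file is assumed to have whitespace-separated start time, end time and a numeric label on each line, and the audio is assumed to be an uncompressed PCM WAV file. The optional pad argument widens each segment, which may help with the tenth of a second of silence asked for in part 2.

```python
import wave

def read_audacity_labels(path):
    """Parse 'start end label' lines into (start, end, label) tuples."""
    segments = []
    with open(path) as f:
        for line in f:
            fields = line.strip().split(None, 2)
            if len(fields) == 3:
                segments.append((float(fields[0]), float(fields[1]), fields[2]))
    return segments

def cut_wav(stem, segments, pad=0.0):
    """Cut stem.wav into stemX001.wav, stemX002.wav, ... and return the new paths.

    Each segment is widened by `pad` seconds on both sides, clipped to the
    start and end of the recording. Labels are assumed to be integers.
    """
    written = []
    with wave.open(stem + ".wav", "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for start, end, label in segments:
            first = max(0, int(round((start - pad) * rate)))
            last = min(total, int(round((end + pad) * rate)))
            src.setpos(first)
            frames = src.readframes(last - first)
            out_path = f"{stem}X{int(label):03d}.wav"
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)      # frame count is corrected on close
                dst.writeframes(frames)
            written.append(out_path)
    return written

# Usage, assuming Lab1.wav and Lab1.alab are in the current directory:
#     cut_wav("Lab1", read_audacity_labels("Lab1.alab"), pad=0.1)
```

For the tab-separated Praat table, only the parsing step would need to change (skip the header line, and take the end time from the third column and the name from the second).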