Creating an aligned transcript in Praat

These are instructions for using the program Praat to create a simple transcription, aligned phrase-by-phrase or word-by-word with an audio file, as required by Homework 4 for Linguistics 001, Fall 2014 edition.

I'm sorry to say that the design of Praat makes it much harder than it should be to do some very simple things. The good news is that it's not a lot harder to do some very sophisticated things -- but you won't get to that part of the Praat experience in this exercise, alas. On the other hand, you may well choose to build on what you've learned in this exercise when it comes time to do your Final Project.

1. Download and install Praat.

2. Download ObamaMTP092014x.wav, the audio file to be transcribed, into an appropriate folder.

3. Start Praat. Two windows will appear, labelled "Praat Objects" and "Praat Picture". Get rid of the "Praat Picture" window, as you won't be using it for this exercise.

4. Open the ObamaMTP092014x.wav file in Praat. Click on the "Open" menu in the "Praat Objects" window, and use Open>>Read from file... to browse to wherever you put the ObamaMTP092014x.wav file, and open it.

Now the Praat Objects window should look like this:

5. Set up to edit a "TextGrid" associated with the audio file. Click on the "Annotate -" button:

And then select "To TextGrid..." from the resulting pop-up menu:

This causes a box to pop up that looks like this:

Edit the box to look like this:

And click "OK". Now the (top of the) "Praat Objects" window should look like this:

"Command-click" (on Mac) or "control-click" (on Windows) on the "1. Sound ObamaMTP092014x" item, after which the (top of the) Praat Objects window should look like this:

Now click on "View & Edit", and a new window labelled "2. TextGrid ObamaMTP092014x" should appear:

There are three panes in that window. The top pane is the "waveform", which shows time left-to-right, and the local value of the sound signal (deviations from ambient air pressure) in the up-and-down dimension. The middle pane is the "spectrogram", about which more in a later lesson. The bottom pane (here in yellow) is where you'll put the transcription.
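To make the waveform's time axis concrete: a digital recording is a sequence of numbered samples, and the time of each sample is its index divided by the sample rate. Here is a minimal Python sketch using the standard `wave` module, assuming an uncompressed PCM WAV file. The function names `wav_summary` and `sample_time` are illustrative and have nothing to do with Praat itself.

```python
import wave


def wav_summary(path):
    """Return (sample_rate_hz, n_samples, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()   # samples per second
        n = w.getnframes()        # number of samples per channel
    return rate, n, n / rate


def sample_time(index, rate):
    """Time in seconds (left-to-right position) of sample number `index`."""
    return index / rate
```

Calling `wav_summary("ObamaMTP092014x.wav")` from the folder where you saved the file should report the same total duration that Praat shows at the bottom of the editing window.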

6. Start the transcription. Click and drag (left-click on Windows) to highlight the first "breath group" in the waveform pane. Then click on the "sel" (for "select") button.

The result will look something like this:

If you click on the region towards the bottom of the display, labelled "Visible part X.XXXXXXXX seconds", the program should play the portion of the recording that you're seeing. As you can hear, Obama says something like

w- we're gonna have to act- work smarter, we're gonna have to

Now place the mouse pointer at the start of Obama's vocalization, click, and hit RETURN. This will create a blank label in the TextGrid, from the start of the file to the point where you placed the cursor. The display should look something like this:

Now place the cursor at the end of the visible audio region -- by putting the mouse pointer where you want it to go, and clicking to set the cursor. Type (or cut-and-paste) the transcript, and hit RETURN. This will establish a transcript segment from the previous segmentation point to the current cursor, labelled with whatever you type before hitting RETURN:

Before going on, spend a little time picking out different parts of the audio using click-and-drag motions of the mouse, and see if you can find specific syllables or words or word sequences, checking what you've found by playing the selected region. For example, here's the region corresponding to the word "we're" -- the bar indicated by the red arrow tells you that the duration of this region is about 210 milliseconds. If you click on that bar segment, the selected region will be played:

7. Continue the transcription. You can move around in the recording with the slider:

Or you can zoom in and out with the "in" and "out" buttons:

So after a few more iterations of the same thing, you might have something like this:

This should be enough of an introduction to get you started -- for more help, try trial-and-error, or ask your TA, or ask a question on the course Piazza page.

8. When you're done with the transcription phase, you can save your TextGrid in a compact form using Save>>Save as short text file... in the Save menu in the Praat Objects window. The resulting file will contain lines like these:

         "w- we're gonna have to act- work smarter, we're gonna have to"
         3.0181715343616795
         3.3925821300399512
         ""
         3.3925821300399512
         3.9075768182866035
         "train"
         3.9075768182866035
         4.261635666456177
         " "
         4.261635666456177
         4.634107099263886
         "uh"
         4.634107099263886
         4.792897128139938

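Rounding the absurdly many digits after the decimal point off to milliseconds, these lines tell you where the various labels begin and end. Each label comes after the start and end times of its interval, so the first label shown above belongs to an interval whose times are cut off -- thus "train" is 3.908-3.393 = 0.515 seconds long.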
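If you want to compute such durations automatically, here is a minimal Python sketch. It assumes that each interval in the short text format is written as start time, end time, then label, so the first label in the excerpt is skipped because its times are not shown. The name `parse_intervals` is illustrative and not part of Praat, and the parser is meant only for interval lines like these, not for a whole TextGrid file with its header.

```python
# Interval lines copied from the excerpt above (indentation removed).
EXCERPT = '''
"w- we're gonna have to act- work smarter, we're gonna have to"
3.0181715343616795
3.3925821300399512
""
3.3925821300399512
3.9075768182866035
"train"
3.9075768182866035
4.261635666456177
" "
4.261635666456177
4.634107099263886
"uh"
4.634107099263886
4.792897128139938
'''


def is_number(s):
    """True if the string can be read as a floating-point number."""
    try:
        float(s)
        return True
    except ValueError:
        return False


def parse_intervals(text):
    """Return (start, end, label) for every complete start/end/label triple."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    intervals = []
    i = 0
    while i + 2 < len(lines):
        a, b, c = lines[i:i + 3]
        if is_number(a) and is_number(b) and c.startswith('"'):
            # Drop the surrounding quotes; doubled quotes inside labels
            # are not handled in this sketch.
            intervals.append((float(a), float(b), c[1:-1]))
            i += 3
        else:
            i += 1  # skip lines that do not begin a complete triple
    return intervals


for start, end, label in parse_intervals(EXCERPT):
    print(f"{label!r}: {start:.3f} to {end:.3f} ({end - start:.3f} s)")
```

Empty or blank labels such as `""` and `" "` mark stretches that were left untranscribed, typically pauses, and the final pair of times in the excerpt is ignored because its label is not shown.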