Problem set #3 -- COGS501

(due 10/05/2011)

Follow the instructions given in problem set #2 to set up the Peterson/Barney vowel data.

1. Apply principal components analysis to the F1, F2, F3 data as a whole.
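
As a starting point, here is a minimal sketch of PCA computed directly from the singular value decomposition, so that no toolbox is needed. The matrix X below is random placeholder data, not the Peterson/Barney measurements, and the names X, coeff, score, latent and explained are illustrative choices rather than anything defined in problem set #2. Substitute your own N-by-3 matrix of F1, F2, F3 values.

```matlab
% Placeholder data, NOT the real measurements: 200 fake tokens whose
% columns stand in for F1, F2, F3 (Hz). Substitute your own matrix.
N = 200;
X = [500 + 150*randn(N,1), 1500 + 500*randn(N,1), 2500 + 300*randn(N,1)];

mu = mean(X, 1);                  % 1-by-3 row of column means
Xc = X - repmat(mu, N, 1);        % center each column
[U, S, V] = svd(Xc, 'econ');      % Xc = U*S*V'

coeff     = V;                    % columns are the new basis vectors
score     = Xc * V;               % each token's coordinates in the new basis
latent    = diag(S).^2 / (N - 1); % variance along each component
explained = 100 * latent / sum(latent);   % percent of total variance
```

The first two columns of score are what you would plot. This sketch works on the raw values in Hz; whether to rescale the columns first is a separate choice that it does not make for you.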

Plot the first two dimensions of the results (the scores on the first two principal components), using letters or strings to mark which vowels go where. Compare this to a conventional plot of F1 against F2, or of F1 against F2-F1. (Note that it's traditional to put the origin in the upper right-hand corner, so that the F1-against-F2 plot is congruent with the IPA vowel quadrilateral.)
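
One way to get string-labeled points with the origin at the upper right is sketched below. The data are made up (three fake vowel clusters with arbitrary centers), and the variable names F1, F2 and vowel, as well as the label strings, are assumptions for illustration only. Note that text() does not rescale the axes, so the limits are set by hand.

```matlab
% Placeholder data (NOT real measurements): three fake vowel clusters.
labels  = {'iy', 'aa', 'uw'};                 % hypothetical label strings
centers = [300 2300; 700 1100; 300 900];      % arbitrary [F1 F2] centers in Hz
nPer = 20;
F1 = []; F2 = []; vowel = {};
for k = 1:3
    F1 = [F1; centers(k,1) + 40*randn(nPer,1)];
    F2 = [F2; centers(k,2) + 120*randn(nPer,1)];
    vowel = [vowel; repmat(labels(k), nPer, 1)];
end

figure;
% text() does not autoscale, so set the axis limits explicitly.
axis([min(F2)-100, max(F2)+100, min(F1)-50, max(F1)+50]);
% Reverse both axes so that the origin sits in the upper right-hand corner.
set(gca, 'XDir', 'reverse', 'YDir', 'reverse');
for i = 1:numel(F1)
    text(F2(i), F1(i), vowel{i}, 'HorizontalAlignment', 'center');
end
xlabel('F2 (Hz)'); ylabel('F1 (Hz)');
```

The same loop works for the PCA results if you pass the first two columns of the scores in place of F2 and F1 (and drop the axis reversal if it no longer makes sense).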

Since there may be too many vowels to see clearly, explore other ways of showing where the various vowels are in this space (e.g. plotting mean values, plotting ellipses including a given proportion of the points, etc.).
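
For the mean-plus-ellipse idea, a possible sketch for a single vowel category follows. It assumes the tokens are roughly bivariate normal, in which case the ellipse at squared Mahalanobis distance -2*log(1-p) from the mean encloses about a proportion p of the points. The data, the label 'aa', and the choice p = 0.90 are all placeholders.

```matlab
% Placeholder tokens for ONE vowel (NOT real data): columns are F2, F1 in Hz.
n = 100;
P = [1100 + 120*randn(n,1), 700 + 40*randn(n,1)];

p  = 0.90;                    % target proportion (an arbitrary choice)
m  = mean(P, 1);              % category mean
C  = cov(P);                  % 2-by-2 sample covariance
r2 = -2 * log(1 - p);         % chi-square (2 d.o.f.) quantile in closed form
[V, D] = eig(C);              % principal axes of the scatter
t  = linspace(0, 2*pi, 200);
E  = V * sqrt(r2 * D) * [cos(t); sin(t)];   % ellipse centered at the origin
E  = E + repmat(m', 1, numel(t));           % shift it to the mean

figure; hold on;
plot(P(:,1), P(:,2), '.', 'Color', [0.7 0.7 0.7]);   % raw tokens in gray
plot(E(1,:), E(2,:), 'k-');                          % the ellipse
text(m(1), m(2), 'aa', 'FontWeight', 'bold');        % hypothetical label at the mean
set(gca, 'XDir', 'reverse', 'YDir', 'reverse');
xlabel('F2 (Hz)'); ylabel('F1 (Hz)');
```

To cover all the vowels, you would wrap the middle section in a loop over categories and skip the gray points if the plot gets too crowded.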

You might also show where the new principal component axes (i.e. the lines corresponding to the new basis vectors) fall on the F1/F2 plot.
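
One possible way to draw those axes is sketched here, again on random placeholder data rather than the real measurements. Each basis vector lives in the three-dimensional F1/F2/F3 space, so the sketch draws a segment through the mean along that vector and keeps only its F1 and F2 coordinates, which amounts to projecting the axis onto the F1/F2 plane. The segment length of two standard deviations either side of the mean is an arbitrary choice.

```matlab
% Placeholder data (NOT real measurements): columns are F1, F2, F3 in Hz.
N = 150;
X = [500 + 150*randn(N,1), 1500 + 500*randn(N,1), 2500 + 300*randn(N,1)];
mu = mean(X, 1);
[U, S, V] = svd(X - repmat(mu, N, 1), 'econ');
sd = diag(S) / sqrt(N - 1);       % standard deviation along each component

colors = {'r', 'b'};
figure; hold on;
plot(X(:,2), X(:,1), '.', 'Color', [0.7 0.7 0.7]);
for k = 1:2
    % Endpoints at mean -/+ 2 s.d. along component k (rows are 3-D points).
    ends = [mu - 2*sd(k)*V(:,k)'; mu + 2*sd(k)*V(:,k)'];
    % Plot only the F2 (x) and F1 (y) coordinates of the segment.
    plot(ends(:,2), ends(:,1), [colors{k} '-'], 'LineWidth', 2);
end
set(gca, 'XDir', 'reverse', 'YDir', 'reverse');
xlabel('F2 (Hz)'); ylabel('F1 (Hz)');
```

The basis vectors are orthogonal, but the drawn segments will only look perpendicular if the two axes use the same scale (axis equal) and the third coordinate of each vector is small.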

2. Apply principal components analysis to the F0, F1, F2, F3 data, and repeat the plots and comparisons from problem 1.

3. Replot the results from problems 1 and 2, marking each point by speaker category (man, woman, child) rather than by vowel identity.
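
A simple way to mark categories is to plot each one separately with its own marker, selecting rows by a logical mask. In the sketch below, score is a random stand-in for your PCA scores, and speaker is a hypothetical cell array holding one category string per token; your own variables from problem set #2 may be organized differently.

```matlab
% Placeholder data (NOT real measurements).
N = 90;
score   = randn(N, 2);                    % stand-in for the first two PCA scores
types   = {'man', 'woman', 'child'};
speaker = types(mod(0:N-1, 3) + 1)';      % hypothetical N-by-1 cell of labels
styles  = {'bo', 'r+', 'g^'};             % one marker style per category

figure; hold on;
counts = zeros(1, 3);
for k = 1:3
    idx = strcmp(speaker, types{k});      % logical mask for this category
    plot(score(idx,1), score(idx,2), styles{k});
    counts(k) = sum(idx);
end
legend(types{:});
xlabel('PC 1'); ylabel('PC 2');
```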

4. Was PCA a useful thing to do here? Why or why not?

There are two goals here:

a) to be sure that you understand how principal components analysis works, and

b) to give you experience doing graphical exploratory data analysis in Matlab.

If there's something you want to do and you can't figure out quickly how to do it, feel free to ask (e.g. "How do I plot a bunch of points labeled with different strings according to their category?").