CIS 558 / Linguistics 525
Computer Analysis and Modeling of Biological Signals and Systems
Homework 5

Due: 3/20/2017

Linear Prediction

Part I: Analysis

Make a vector of samples 40234:40444 of the file SX133.wav from the first homework assignment. This is a segment of the vowel from the word "quick".
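
As a quick sanity check on the indexing, here is a minimal sketch of pulling out that sample range. It uses a random placeholder signal so that it runs without the file, and the sampling rate fs = 16000 is an assumption; in practice both the samples and fs would come from reading SX133.wav (for example with audioread, shown in the comment).

```matlab
% Stand-in for the file contents so this runs without SX133.wav.
% In the assignment the samples would instead come from something like
%   [y, fs] = audioread('SX133.wav');
fs = 16000;                    % ASSUMED sampling rate in Hz
y  = randn(50000, 1);          % placeholder signal, not real speech

first = 40234;  last = 40444;  % sample range given in the assignment
seg    = y(first:last);        % MATLAB ranges include both endpoints
n      = numel(seg);           % number of samples in the segment
dur_ms = 1000 * n / fs;        % duration implied by the assumed fs
```

The range contains 211 samples, so a 256 point fft of the segment is zero-padded.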

  1. Plot the vector and its log magnitude spectrum, using a 256-point FFT.  Make the x axis of the spectrum plot be frequency in Hz (cycles per second).
  2. Write a function lpana that takes as input a vector of speech samples and a number k and returns the vector c of length k that best (in the least squares sense) linearly predicts each speech sample from the k samples that precede it.
  3. Apply your function lpana to the samples extracted from SX133.wav and compare the (log magnitude) linear prediction filter spectrum you obtain with the spectrum of the speech samples, by plotting them in different colors on the same graph.
  4. Compare the first 70 samples of the impulse response of the linear prediction filter to the first 70 samples of the speech segment that you analyzed.
  5. Find the poles of the linear prediction filter and their angles.  Identify the three complex conjugate pole pairs with lowest frequencies and show their locations with respect to the unit circle on a plot of the complex plane (where the x direction corresponds to the real part and the y direction to the imaginary part).  Do all three of those pole pairs correspond to peaks in the spectrum?
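
The steps in Part I can be tried out first on a signal whose answer is known. The sketch below is one possible setup, not the required solution: it builds a synthetic signal from a known two-coefficient predictor, recovers the coefficients by least squares, and computes the two log magnitude spectra and the pole frequencies. The names lp_sketch, c_true and pole_hz are made up for illustration, fs = 16000 is an assumed sampling rate, and the sign convention assumed is A(z) = 1 - c(1)z^-1 - ... - c(k)z^-k.

```matlab
% Sketch: least-squares linear prediction on a synthetic signal.
% All names here (lp_sketch, c_true, fs) are illustrative assumptions.
fs     = 16000;                 % ASSUMED sampling rate in Hz
nfft   = 256;                   % FFT length used in the assignment
c_true = [1.3; -0.8];           % known predictor for the test signal

% Test signal: impulse response of 1/A(z), A(z) = 1 - sum_i c(i) z^-i
s = filter(1, [1; -c_true], [1; zeros(199, 1)]);

% Least squares fit: each row of the matrix holds the k samples that
% precede the sample being predicted, most recent first.
lp_sketch = @(s, k) toeplitz(s(k:end-1), s(k:-1:1)) \ s(k+1:end);
c = lp_sketch(s, 2);            % should be close to c_true

% Log magnitude spectra (dB) on a frequency axis in Hz, 0 .. fs/2
f = (0:nfft/2)' * fs / nfft;
S = 20*log10(abs(fft(s, nfft)));          % spectrum of the signal
H = -20*log10(abs(fft([1; -c], nfft)));   % spectrum of 1/A(z)
S = S(1:nfft/2+1);
H = H(1:nfft/2+1);

% Poles of 1/A(z) and the frequencies that their angles correspond to
p       = roots([1; -c].');
p_up    = p(imag(p) > 0);       % one pole from each conjugate pair
pole_hz = angle(p_up) * fs / (2*pi);
```

Something like plot(f, S, 'b', f, H, 'r') with xlabel('Frequency (Hz)') would then overlay the two spectra in different colors. With real speech the fit is not exact, so the filter spectrum follows the envelope of the speech spectrum rather than matching it.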

Part II: Reconstruction

  1. Inverse filter the speech segment to find the prediction residual.  Plot samples 19:end of the residual together with the corresponding samples from the speech segment, using the same y scale for both.  How do they relate to each other?
  2. Samples 48:117 of the segment you have analyzed cover one pitch period.  Make a vector of samples 48:117 from the residual.  Supposing that you have called the vector pp, make a longer residual with falling pitch by executing the code

    % pp must be a column vector
    fall=pp;
    for j=1:40
      % a gap that grows with j lengthens the period; randn() jitters the amplitude
      fall=[fall; zeros(round(j/4),1); pp*(1+randn()/10)];
    end

     (Feel free to use a number larger or smaller than 40 if you think it improves the final outcome.)
  3. Filter the falling residual with the linear prediction filter.  Apply the function taper to the result to make the start and end less abrupt. For slightly greater realism, put samples 19312:21026 of SX133.wav (a /z/ sound) at the end to make an approximation of the word "is". You will have to adjust the relative amplitudes of the filtered sound and the /z/. Try making the maximum amplitude of the /z/ between 1 and 1.5 times the maximum amplitude of the filtered vowel.  Find a relationship that sounds reasonable to you.
  4. Take the three lowest frequency pole pairs that correspond to peaks in the linear prediction spectrum (so not necessarily the ones you plotted above) and make a new filter using just these poles.  Filter the falling residual with it.  Taper and stick on the /z/ as above.  You will have to readjust the amplitudes.  What does it sound like?
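
The mechanics of Part II can also be checked on synthetic data before using the speech segment. The sketch below is only an illustration under stated assumptions: the signal is made from a known all-pole filter, so inverse filtering should give back the excitation exactly; a reduced filter is then built from the lowest-frequency pole pair; and a random stand-in for the /z/ is scaled relative to the filtered sound. The names poles_true, exc, a_low and zz are hypothetical, fs = 16000 is assumed, and the course function taper is left out.

```matlab
% Sketch: inverse filtering, resynthesis, and a reduced-pole filter.
% The signal is synthetic; poles_true, a, exc are illustrative assumptions.
fs = 16000;                                  % ASSUMED sampling rate (Hz)

% Build a stable all-pole filter 1/A(z) from two conjugate pole pairs
poles_true = [0.95*exp(1i*2*pi*500/fs); 0.90*exp(1i*2*pi*1500/fs)];
a = real(poly([poles_true; conj(poles_true)]));  % coefficients of A(z)

% Excitation: impulse train with an 80-sample period
exc = zeros(400, 1);
exc(1:80:end) = 1;
s = filter(1, a, exc);                       % synthetic "speech"

% Inverse filtering with A(z) gives the residual; here it should
% reproduce the excitation because the same A(z) made the signal.
res = filter(a, 1, s);

% Filtering the residual with 1/A(z) resynthesizes the signal
s_hat = filter(1, a, res);

% Keep only the lowest-frequency pole pair and build a new filter
p     = roots(a);
p_up  = p(imag(p) > 0);                      % one pole per pair
[~, order] = sort(angle(p_up));              % ascending frequency
keep  = p_up(order(1));
a_low = real(poly([keep; conj(keep)]));      % reduced A(z)
y_low = filter(1, a_low, res);

% Scale a second sound relative to the first before concatenating
zz    = randn(100, 1);                       % stand-in for the /z/
ratio = 1.25;                                % inside the suggested 1 to 1.5
zz    = zz * ratio * max(abs(y_low)) / max(abs(zz));
out   = [y_low; zz];
```

Taking real(poly(...)) of a pole together with its conjugate keeps the filter coefficients real; the result could be played with soundsc(out, fs) to judge the amplitude balance by ear.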