Towards the Discrete Fourier Transform

We spent some time earlier on the first slogan of signal processing:

The response of a linear shift-invariant system S to an arbitrary input x is the convolution of x with the impulse response of S.

This turned out to follow pretty directly from the definition of the terms -- but it is still a very useful idea!

Now we're going to turn our attention to a second slogan:

The Fourier transform of the response of a linear shift-invariant system S to an arbitrary input x is the product of the Fourier transform of x with the Fourier transform of the impulse response of S.

This is just Slogan #1, with "convolution" changed to "product" and "(the) Fourier transform of (the)" stuck in three times.

In its own way, Slogan #2 is just as important as Slogan #1. We're not quite ready to understand what it says, though, much less why it's important. We know what "PRODUCT" means, more or less, and the rest of the terms from Slogan #1 carry over, but what about FOURIER TRANSFORM?

Well, a TRANSFORM (in the sense of interest to us here) is just a linear operator---that is, a matrix. It takes vectors as inputs, and produces vectors as outputs. And FOURIER is just the name of a French physicist, Baron Jean Baptiste Fourier, 1768-1830, who invented a certain technique for modeling the way that an iron ring will cool off if you heat it in a non-uniform way.

So Slogan #2 means that there is some transform (some matrix operator F) with a fairly magical property:

If we transform the input x, the impulse response h, and the output y all using the same matrix F, then we will find that the transformed output is just the transformed input multiplied element-wise by the transformed impulse response. In the new coordinates, LSI systems are just scalars -- they just multiply their inputs!

Thus we should be able to say in MATLAB:

% y = conv(x,h)
% fy = (F*x).*(F*h)
% y1 = F\fy 

and have y1 come out equal to y. However, as stated, this can't work, because the second line implies that x, h and fy all have the same length, while the first line implies that the length of y is the length of x plus the length of h minus 1. This is the problem of "edge effects" in convolution again; we'll see how to get all the lengths to agree shortly (by circular convolution).
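Here is a minimal sketch of the length problem, assuming an input of length 128 and an impulse response of length 8:

x = rand(1,128);      % an arbitrary input
h = rand(1,8);        % an arbitrary impulse response
length(conv(x,h))     % 135 = 128 + 8 - 1, not 128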

But first, let's see how we could get a transform with the right properties for this approach to work at all.

In the end, the answer is fairly easy: we just fill the rows of F with sampled sinusoids of different frequencies, and hey presto!

In order to see why this works, we have to understand two things:

  1. sinusoids of different frequencies are orthogonal
  2. sinusoids are eigenfunctions of LSI systems

These are both pretty easy, actually; as with slogan #1, they follow from the definitions involved and from a bit of simple high-school-level algebra (in this case along with a couple of easy trigonometric identities).

There will be just one non-9th-grade hump to get over: in order to get things to work out in the desired neat way, we need to express sinusoids as complex powers of e, instead of in terms of the more familiar sine and cosine functions. So first we will show (1) and (2) in terms of sines and cosines, and then we will take a small detour into the land of complex exponentials, for those to whom these are not already second nature.

Basic formulae for sinusoids.

The general formula for a sinusoid is

a*sin(f*x + p),

with three constants: 'a' (amplitude), 'f' (frequency) and 'p' (phase).

Because sin(a+b) = sin(a)*cos(b) + cos(a)*sin(b), there is an alternative general formula for a sinusoid as a weighted sum of sin and cos:

a*sin(f*x + p) = a*cos(p)*sin(f*x) + a*sin(p)*cos(f*x)

In this alternative formulation, there are also three constants: the cosine weight, the sine weight, and the frequency. The equation above shows us how to go from amplitude/phase to sine-weight/cosine-weight.

In the other direction (from sine-weight/cosine-weight to amplitude/phase):

A*sin(f*x) + B*cos(f*x) = (A*sqrt(1+((B/A)^2))) * sin(f*x + arctan(B/A))

It's not especially important to memorize these relationships -- what is important is to know that they exist.
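Still, it is easy to check the second relation numerically. Here is a minimal sketch, with arbitrary positive weights A and B (so that arctan(B/A) lands in the right quadrant) and an arbitrary frequency:

x = (0:127)*2*pi/128;
A = 0.8; B = 0.3; f = 5;                            % arbitrary positive weights and frequency
y1 = A*sin(f*x) + B*cos(f*x);
y2 = (A*sqrt(1+((B/A)^2))) * sin(f*x + atan(B/A));  % same thing via amplitude and phase
norm(y1 - y2)                                       % effectively zero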

Orthogonality of sinusoids

Sinusoids of different frequencies are orthogonal; for example:

>> x = (0:127)*2*pi/128;
>> 
>> sin(x)*sin(3*x)'
ans =
   4.3012e-15      % effectively zero -- 
>> cos(x)*cos(2*x)'
ans =
   2.4425e-15
>> sin(x)*cos(x)'
ans =
  -1.9360e-15
>> sin(x)*sin(23*x)'
ans =
  -6.4913e-14
>> sin(x)*sin(x)'
ans =
    64

Actually, this reflects a reasonably general property of periodic sequences with appropriate symmetries. For instance, as long as we take integral numbers of periods related by even factors, square wave sequences are also orthogonal. In addition, two appropriately symmetric sequences of the same frequency with one advanced by 1/4 period will generally be orthogonal (just as sine and cosine are):

>> oo = ones(1,128);
>> mm = -1*oo;
>> sq1 = [oo(1:64) mm(65:128)];                     % square wave with frequency 1
>> sq2 = [oo(1:32) mm(33:64) oo(65:96) mm(97:128)]; % square wave with frequency 2
>> sq4 = [oo(1:16) mm(17:32) oo(33:48) mm(49:64) oo(65:80) mm(81:96) ...
       oo(97:112) mm(113:128)];                     % square wave with frequency 4
>> csq1 = [oo(1:32) mm(33:96) oo(97:128)];          % freq. 1 sq. wave advanced 1/4 period
>> sq1 * sq2'
ans =
     0
>> sq1 * sq4'
ans =
     0
>> sq2 * sq4'
ans =
     0
>> sq1 * csq1'
ans =
     0
>> sq1*sq1'
ans =
   128

Orthogonal sequences such as these are the basis of the Walsh-Hadamard transform.

In the case of square waves, sequences with numbers of periods related by odd factors will not be orthogonal. Sinusoidal sequences in such cases will remain orthogonal.
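Here is a minimal sketch of a check of this claim, using N = 96 so that square waves with 1 and 3 periods (an odd factor apart) both fit exactly:

N = 96;
x = (0:N-1)*2*pi/N;
sq1 = [ones(1,N/2) -ones(1,N/2)];                % square wave, 1 period
sq3 = repmat([ones(1,N/6) -ones(1,N/6)], 1, 3);  % square wave, 3 periods
sq1*sq3'                                         % NOT zero
sin(x)*sin(3*x)'                                 % effectively zero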

We can prove the orthogonality of the sinusoids via the trigonometric identity

cos(a)*cos(b) = cos(a-b)/2 + cos(a+b)/2

Take a = m*x and b = n*x, for two different frequencies m and n. Then the product cos(m*x)*cos(n*x) is a sum of two cosines, at frequencies m-n and m+n. In the case of discretely-sampled finite-length sequences, as long as the frequencies m-n and m+n both "fit" into the sequence in a way we will make precise later on, the sum of the samples of each will be 0 -- because of the symmetry of the cosine, the sum of samples over any integral number of periods is always 0. If each term adds up to 0, their sum (which is the inner product of cos(m*x) and cos(n*x)) will also be 0, and therefore cos(m*x) and cos(n*x) are orthogonal.

If m = n, then the product is cos(0)/2 + cos(2*m*x)/2, whose samples sum to N/2 (for N sample points) rather than 0.
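A minimal numerical sketch of this argument, with arbitrary frequencies m = 5 and n = 2:

x = (0:127)*2*pi/128;
m = 5; n = 2;
sum(cos((m-n)*x))        % effectively zero: integral number of periods
sum(cos((m+n)*x))        % effectively zero, likewise
cos(m*x)*cos(n*x)'       % ... so the inner product is effectively zero too
sum(cos((m-m)*x))        % but when the frequencies are equal, this term sums to 128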

An orthonormal basis made out of sinusoids.

If we fill up the rows of a square matrix with sampled sines and cosines of different frequencies -- and an integer number of periods within each row -- we can count on having an orthogonal basis. To make the basis orthonormal, we just have to divide each row by its (geometric) length.

If we have an N-dimensional space, N/2 - 1 sines and N/2 + 1 cosines will do the trick. Why not (say) N sines? Well,

>> x = (0:127)*2*pi/128;
>> plot(x,sin(127*x))

which purports to show us a sine wave of frequency 127 sampled in a vector with 128 elements, but instead produces a plot that looks suspiciously like sin(-x) -- because it is!

As we'll discuss in detail later on, a vector of length N can correctly show us sinusoids only up to frequency N/2. After that, the frequencies wrap around, or "alias", into negative frequencies going back down towards 0.
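A quick numerical check of the aliasing claim (a sketch):

x = (0:127)*2*pi/128;
norm(sin(127*x) - sin(-x))     % effectively zero: frequency 127 aliases to -1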

Why the +1 and -1 in "N/2+1 cosines and N/2-1 sines"? Well, we need both cos(0*x) (for the DC component) and cos((N/2)*x) (for the highest possible frequency), whereas sin(0*x) and sin((N/2)*x) would just be all zeros, and thus no help at all as basis vectors...
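Here is a minimal sketch of such a basis, for N = 8: the rows are cos(k*x) for k = 0 .. N/2 and sin(k*x) for k = 1 .. N/2-1, each divided by its length:

N = 8;
x = (0:N-1)*2*pi/N;
F = [];
for k = 0:N/2
   F = [F; cos(k*x)];                  % N/2 + 1 cosine rows
end
for k = 1:N/2-1
   F = [F; sin(k*x)];                  % N/2 - 1 sine rows
end
F = diag(1./sqrt(sum(F.^2,2))) * F;    % normalize each row to unit length
norm(F*F' - eye(N))                    % effectively zero: the rows form an orthonormal basis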

Sinusoids as eigenfunctions of LSI systems

In order to see that sinusoids are eigenfunctions of linear shift-invariant systems, we will need:

  1. Definition of eigenfunction
  2. Slogan #1, turned "sideways"
  3. The fact that the sum of two sinusoids of the same frequency, with arbitrary amplitude and phase, is always another sinusoid of the same frequency.

We'll take these in reverse order.

If we add up two sinusoids of the same frequency, one of which is a shifted (and perhaps scaled) copy of the other, the result is always a (possibly shifted and scaled) sinusoid of the same frequency. Here are a couple of examples:

>> a = sin(2*x);
>> r=rand(1,2)
r =
    0.9501    0.2311
>> b = r(1)*sin(2*x-r(2)*pi);
>> figure(1)
>> plot(x,a,'.',x,b,'o',x,(a+b),'+')
>> r=rand(1,2)
r =
    0.6068    0.4860
>> b = r(1)*sin(2*x-r(2)*pi);
>> figure(2)
>> plot(x,a,'.',x,b,'o',x,(a+b),'+')

For a general proof, we can use the trig identity

cos(a) + cos(b) = 2*cos((a+b)/2)*cos((a-b)/2)

to show that a sum of cosines with the same frequency but different phases is another cosine of the same frequency:

   cos(n*x-p1) + cos(n*x-p2) =

   2*cos((n*x-p1+n*x-p2)/2)*cos((n*x-p1-n*x+p2)/2) =

   2*cos((2*n*x-p1-p2)/2)*cos((p2-p1)/2) =

   2*cos((p2-p1)/2) *   cos(n*x -        (p1+p2)/2)

           ^                 ^               ^
        constant     original freq        average phase

A bit more algebra shows that the same thing holds when the two sinusoids also have different amplitudes.

Let us repeat: the sum of two sinusoids of the same frequency (with arbitrary amplitude and phase) is another sinusoid of the same frequency (with amplitude and phase depending on the input).
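Here is one more way to see this numerically (a sketch, peeking ahead to MATLAB's fft, which computes the discrete Fourier transform we are building towards): the transform of the sum has energy only at frequency 2, i.e. bins 3 and 127 of a 128-point transform.

x = (0:127)*2*pi/128;
a = sin(2*x);
b = 0.7*sin(2*x - 0.4*pi);       % an arbitrary scaled and shifted copy
c = fft(a + b);
find(abs(c) > 1e-9)              % only bins 3 and 127, i.e. frequencies +2 and -2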

Now go back to the facts about linear shift-invariant systems. Remember that the response of such a system to an input x is gotten by convolving x with the impulse response h.

This can be thought of as a sum of scaled and shifted copies of the impulse response, where the scaling factors are the values of the signal at each shift.

Equivalently, because convolution is commutative, we can think of the same operation as a sum of scaled and shifted copies of the input x, where the scaling factors are the values of the impulse response at each shift.

Thus the response of an arbitrary LSI system to a sinusoid will be a sum of scaled and shifted copies of that sinusoid. But we have just seen that the result of adding up scaled and shifted sinusoids of a given frequency is just a sinusoid of that same frequency.

So this means that an LSI system, poked with a sinusoid, will always output a sinusoid, just scaling it and shifting it in some way. Let's try it:


>> x = (0:127)*2*pi/128;
>> y = sin(3*x);
>> h = rand(8,1)  % a random impulse response
h =
    0.9501
    0.2311
    0.6068
    0.4860
    0.8913
    0.7621
    0.4565
    0.0185
>> b = cconv(h,y);
>> plot(x,y,'rx',x,b,'go')

and for good measure we'll show that this doesn't work for square waves:

>> s = square(3*x);
>> r = cconv(s,h);
>> figure(2);
>> plot(x,s,':',x,r,'-',x,s,'x',x,r,'o')

Note that to get the above to work, we snuck in cconv(), "circular convolution". In this form of convolution, instead of imagining that the vectors are embedded in a sea of zeros, we think of the longer of the two (the "signal") as being embedded in a sequence of indefinitely many copies of itself, while the shorter (the "impulse response") is still thought of as being embedded in zeros. Now the output will be an indefinitely repeated pattern with a period that is the length of the "signal". We (conceptually) pick one period of this pattern as a representative (they are all the same, after all).

In this approach, the "output" has the same size (number of elements) as the "input". More important, with this approach, sinusoids really are eigenfunctions of LSI systems -- rather than being "polluted" with non-sinusoidal stuff around the edges, as they would be if we used regular convolution (try it!).

%% Circular convolution, of size of larger vector
function c = cconv(a,b)
 if ( length(a) > length(b) )
    long = a; short = b;
 else
    long = b; short = a;
 end
 ll = length(long);
 sl = length(short);
 % pad:
 clong = [long long long];
 % convolve:
 cc = conv(short, clong);
 % crop:
 c = cc(ll+1:ll+ll);

Now it is easy to see that a version of Slogan #2 will be true: the transformation F*x decomposes x into a sum of sinusoids; the only thing that an LSI system can do to a sinusoid of a given frequency is shift or scale it; therefore each sine/cosine pair in F*x will be treated by the LSI system S independently of the others.

However, this gives us the rather ugly SINE-AND-COSINE VERSION OF SLOGAN #2:

Take the discrete Fourier transform (sine and cosine version) of an arbitrary vector x, and also of the impulse response h of an arbitrary LSI system S. Use the relation

A*sin(f*x) + B*cos(f*x) = (A*sqrt(1+((B/A)^2))) * sin(f*x + arctan(B/A))

to map sine and cosine amplitudes onto amplitude and phase of a sine wave. Then at each frequency, multiply the amplitude of x's sine wave by the amplitude of h's sine wave, and add h's phase to x's phase. Then use the relation

a*sin(f*x + p) = a*cos(p)*sin(f*x) + a*sin(p)*cos(f*x)

to map back to the sine and cosine amplitudes, and use these coefficients to transform back to the original coordinate system. This predicts the effect of S on x.

It would be nice if we could treat this effect as a single simple operation on the sine/cosine pair rather than as this rather complicated business.

Sinusoids as complex exponentials

The complex-exponential way of writing sinusoids can be seen as a notational trick that lets us treat arbitrary shifting and scaling of a sinusoid as a simple (pair of) multiplications by complex scalars.

As a result, we can cash in the promise of Slogan #2 in terms of simple multiplication...as long as we adopt this slightly non-obvious way of notating sinusoids.

This notational trick starts with the fact that a number raised to imaginary powers traces out a sinusoid, and specifically that e^(ix) is equal to cos(x) + i*sin(x). This is called Euler's relation (or Euler's formula). This relationship between complex exponentials and sinusoids takes some getting used to. The accessible proofs of this relation are not very enlightening. However, it's important for the meaning of this relation to become second nature to you. It may be helpful to go through some ways of deriving the relation.
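In MATLAB, at least, we can do a quick numerical sanity check (not a proof, of course):

t = 0.7;                           % an arbitrary angle
exp(i*t) - (cos(t) + i*sin(t))     % effectively zero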

Once you accept Euler's relation, then the rest of the Fourier notation trick goes like this:

 e^(ix) =  cos(x) + i*sin(x)
 e^(-ix) = cos(x) - i*sin(x)

 sin(x) = (e^(ix)-e^(-ix))/(2*i)
 cos(x) = (e^(ix)+e^(-ix))/2

So for some complex number a+ib and its complex conjugate a-ib

 (a+ib)e^(ix) + (a-ib)e^(-ix)  =
 ae^(ix)+ibe^(ix) + ae^(-ix)-ibe^(-ix) =
 a(e^(ix) + e^(-ix)) + ib(e^(ix) - e^(-ix)) =
 2 a cos(x) + 2 i^2 b sin(x) =
 2 a cos(x) - 2 b sin(x)

Thus in the other direction,

 c*cos(x) + d*sin(x) =
 (c/2 - (d/2)i) e^(ix) + (c/2 + (d/2)i) e^(-ix)

This gives us a new (third) way to write a real-valued sinusoid of frequency x. Starting from the weighted sum of cosine and sine terms, we get:

 a complex exponential with frequency x,
           multiplied by some complex constant k,
 plus a complex exponential with frequency -x,
           multiplied by the complex conjugate of k;

 where the real part of k is half the cosine weight
 and the imaginary part of k is minus half the sine weight.

The connection between the complex constant k and the amplitude-and-phase expression for a sinusoid is more commonly needed, and is also fairly simple.

It is easy to derive this from the same equations that we used above to derive the representational trick in the first place, but we'll just state it here.

If we think of k as a vector in a 2-dimensional space whose first dimension (x) is the real part and whose second dimension (y) is the imaginary part, then the magnitude of the resulting sinusoid is just the length of that vector, and the phase is just its angle relative to the vector (1,0) (if we express the sinusoid as a cosine).

Thus we can exemplify as follows.
We are using the MATLAB functions

exp(x)   for e^x
abs(x)   which in the case of complex numbers returns the magnitude,
         i.e. the length of the vector in the complex plane
angle(x) which gives the angle of a complex number
         relative to the positive x axis,
         expressed in radians between -pi and pi

Note that this convention about phase angle will "work"
     if we express our sinusoid as a cosine
     with the phase constant added -- cos(f*x + p)

>> x = (0:127)*2*pi/128;
>> r = rand(1,2)
r =
    0.4565    0.0185
>> k = r(1) + i*r(2);
>> y1 = k*exp(i*x) + conj(k)*exp(-i*x);
>> a = 2*abs(k)
a =
    0.9137
>> p = angle(k)
p =
    0.0405
>> y2 = a*cos(x+p);
>> norm(y1 - y2)
ans =
   1.6349e-15        % effectively zero --
>> plot(x,y1,'x', x, y2,'o')

Needless to say, the conventions about cos vs. sin, and the sign of the phase, and so on, do matter:

>> y2a = a*sin(x+p);
>> y2b = a*cos(x-p);
>> norm(y1-y2a)
ans =
   10.3372
>> norm(y1-y2b)
ans =
    0.5921

Going in the other direction:

>> r = rand(1,2)
r =
    0.8214    0.4447
>> a = r(1);  p = r(2);
>> y3 = a*cos(x+p);
>> k = (a/2)*exp(i*p)
k =
   0.3708 + 0.1767i
>> y4 = k*exp(i*x) + conj(k)*exp(-i*x);
>> norm(y4-y3)
ans =
   1.3322e-15
>> 
>> plot(x,y3,'x', x, y4,'o')

Note that the imaginary part of the sum of the two complex exponentials is exactly zero in these examples (not just effectively zero), because the two terms are complex conjugates of each other -- which is the point of the notational trick discussed above:

>> k = r(1) + i*r(2);
>> y1 = k*exp(i*x) + conj(k)*exp(-i*x);
>> imag(y1(1:64))
ans =
  Columns 1 through 12 
     0     0     0     0     0     0     0     0     0     0     0     0
  Columns 13 through 24 
     0     0     0     0     0     0     0     0     0     0     0     0
  Columns 25 through 36 
     0     0     0     0     0     0     0     0     0     0     0     0
  Columns 37 through 48 
     0     0     0     0     0     0     0     0     0     0     0     0
  Columns 49 through 60 
     0     0     0     0     0     0     0     0     0     0     0     0
  Columns 61 through 64 
     0     0     0     0

Now we are just about ready to put it all together. We can express an arbitrary real sinusoid as the weighted sum of two complex exponentials whose weights are a complex conjugate pair k and conj(k).

We can therefore express any arbitrary scaling or shifting of a real sinusoid as some operation on k.

What operation is that?
Well, the elementary properties of complex numbers give the answer.

Suppose that we express the complex number k in polar coordinates, as

   r*e^(i*theta)

Then if we multiply k by some other complex number

  r1*e^(i*theta1)

the result is

  r*e^(i*theta) * r1*e^(i*theta1) =
  r*r1*e^(i*(theta+theta1))

that is, a new complex number whose magnitude is r*r1 and whose phase angle is theta+theta1 (modulo the issue of staying within -pi to pi).
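A quick numerical check (a sketch) that the magnitudes multiply and the angles add:

k1 = 2.0*exp(i*0.3);        % r = 2.0,  theta = 0.3
k2 = 0.5*exp(i*1.1);        % r1 = 0.5, theta1 = 1.1
abs(k1*k2)                  % 1.0 = 2.0*0.5
angle(k1*k2)                % 1.4 = 0.3 + 1.1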

So we can express any scaling and shifting of a sinusoid of freq x (expressed in the form k*exp(ix)+conj(k)*exp(-ix)) simply in terms of multiplying the complex constant k by some other complex number.

Since we have shown that the effect of any LSI system on a sinusoidal input of a particular frequency is merely to shift and scale it, it follows that the effect of any LSI system on a sinusoidal input of a particular frequency can be represented simply as multiplication by a complex number.

Here is a simple example showing how we can use complex factors to encode arbitrary scaling and shifting of sinusoids.

>> % time base for sampled sinusoid
>> x = (0:127)*2*pi/128;
>> % random amplitude and phase:
>> r = rand(2,1)*2 - 1;
>> a = r(1)   % amplitude
a =
    0.2309
>> p = r(2)*pi  % phase
p =
    1.8343
>> % complex-number encoding of same:
>> k = (a/2)*exp(i*p)   % k corresponding to a and p
k =
  -0.0301 + 0.1114i
>> 
>> y1 = a*cos(x+p); y2 = k*exp(i*x) + conj(k)*exp(-i*x);
>> plot(x,y1,'x', x,y2,'o');  % compare the results

So far, so good. Now let's try using a single complex multiplication to scale and shift:

>> % random amplitude scaling and phase shift:
>> s = rand(2,1)*2 - 1;  % two random numbers
>> scale = s(1)   % scaling factor
scale =
    0.8436
>> shift = s(2)*pi  % phase shift
shift =
    1.4967
>> % complex-number encoding of scale and shift:
>> q = scale*exp(i*shift)
q =
   0.0625 + 0.8413i
>> y3 = scale*a* cos(x + p + shift);    % scaling and shifting the cosine
>> y4 = q*k*exp(i*x) + conj(q)*conj(k)*exp(-i*x);  % and the complex exponential
>> plot(x,y3,'x', x,y4,'o');  % compare the results

So:
if we are constructing a real-valued sinusoid out of a sum of positive- and negative-frequency complex exponentials whose coefficients are complex conjugates (which is the notational trick employed in the DFT)
then to subject this sinusoid to scaling by factor 'scale' and to a phase shift by 'shift' radians,
we need to multiply the positive frequency complex exponential by scale*exp(i*shift)
and the negative frequency complex exponential by conj( scale*exp(i*shift) )

When we calculate the DFT of the (real-valued) impulse response h of some arbitrary LSI system S, the positive- and negative-frequency coefficients of the DFT of h will have exactly the correct complex-conjugate relationship to express the scaling and shifting imposed by S on each sinusoidal component. Thus for input x and output y, it will be true that

DFT(y) == DFT(x).*DFT(h)
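Here is a minimal sketch of a numerical check of this claim, using the cconv function defined above (not the Signal Processing Toolbox function of the same name); fft(h,128) zero-pads h to the length of the signal:

sig = randn(1,128);            % an arbitrary input
h   = rand(1,8);               % an arbitrary impulse response
y   = cconv(sig, h);           % response via circular convolution
fy  = fft(sig) .* fft(h,128);  % product of the DFTs
norm(y - ifft(fy))             % effectively zero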

We can now state more exactly what we mean by saying that "sinusoids are eigenfunctions of LSI systems" --

Just as a regular eigenvector x of a matrix A has the property that

A*x = lambda*x

so that the effect of A on x is just the same as simple scaling by lambda,
likewise a sinusoidal function s(x) has the property that when it is operated on by any LSI matrix A,

A*s(x) = q*s(x)

where s(x) is short for k*e^(ix)+conj(k)*e^(-ix), and q is some complex scalar (more precisely, the positive-frequency term is multiplied by q and the negative-frequency term by conj(q), so that the result stays real).

Since complex exponentials of different frequencies are mutually orthogonal just as sinusoids are, we can easily find a set of N mutually orthogonal complex exponentials to use as a basis for expressing arbitrary N-dimensional vectors. Expressed in this form, any vector will be a sum of components that are individually guaranteed to be eigenfunctions of any LSI system, in the sense just defined.

This enables us to cash in Slogan #2:

The (FOURIER TRANSFORM OF THE) response of a linear shift-invariant system S to an arbitrary input x is the PRODUCT of (THE FOURIER TRANSFORM OF) x with (THE FOURIER TRANSFORM OF) the impulse response h of S.

The (discrete) Fourier transform of x expresses x as a weighted sum of sinusoids.

The (discrete) Fourier transform of an impulse is all ones:

>> x = [1 0 0 0 0 0 0 0];
>> fft(x)
ans =
     1     1     1     1     1     1     1     1

Thus an impulse is a sum of all the sinusoids in the Fourier basis set, equally weighted.

Since the system impulse response h shows what the system S does to an impulse, each value in the Fourier transform of h shows what the system does to one sinusoidal component of that impulse.

Thus each (complex) value in the (discrete) Fourier transform of h is S's eigenvalue for the sinusoidal component at the corresponding frequency.

If we now take an arbitrary input x and express it as a sum of sinusoids (i.e. take its Fourier transform), then the effect of system S on x will just be to multiply each sinusoidal component of x by some complex number, namely S's eigenvalue for a sinusoid at that particular frequency. But as we have just seen, this multiplicative factor (representing the scaling and shifting imposed by S on a sinusoid of the frequency in question) is just the corresponding element in the (discrete) Fourier transform of h.
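As a final sketch of this last point, we can check one eigenvalue directly, again using the cconv defined above: feed the system a single complex exponential and compare its output to the input scaled by the corresponding element of the DFT of h.

N = 128; n = 0:N-1;
h = rand(1,8);                 % an arbitrary impulse response
H = fft(h, N);                 % the system's eigenvalues, one per frequency
f = 5;                         % pick an arbitrary frequency bin
e = exp(i*2*pi*f*n/N);         % a complex exponential at that frequency
y = cconv(e, h);               % the system's response
norm(y - H(f+1)*e)             % effectively zero: e is an eigenfunction, H(f+1) its eigenvalue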