Leonhard Euler is credited with discovering the simple but profound equation
One way to derive it is by looking at the power series expansion
The real part (terms 1, 3, 5 etc.) gives the power series for the cosine. The imaginary part gives the power series for the sine. However, if you are not already familiar with power series, this is entirely unenlightening.
In Linear Algebra and its Applications, Strang writes (p. 185):
I remember the day when a letter came to MIT from a prisoner in New York, asking if Euler's formula was true. It really is astonishing, when you think of it, that three of the key functions of mathematics should come together in such a graceful way. Our best answer was to look at the power series [...] The formula is correct, and I wish we had sent a more beautiful proof.
If you're willing to take Euler's equation on faith -- and don't have a trouble remembering it -- you can skip the rest of this page. If you want more, an alternative way to assimilating Euler's relation comes from Chapter 22 of Feynman's Lectures on Physics. It uses only elementary algebra, and some simple arithmetic of a kind originally used in the 17th century to facilitate the calculation of tables of logarithms.
We assume zero, the positive integers and the idea of counting upwards.
This enables us to define the operation of addition of positive integers:
a + b means to start from a, and count upwards b times.
We can then define multiplication of integers as repeated addition: starting from zero, we add a (i.e. count upwards a times) b times over: this is written as b * a.
Similarly, we can define exponentiation of integers
as repeated multiplication:
starting from 1, we multiply by a, b times over: this is written as a ^ b.
It follows from these definitions that
a + 0 = a a * 1 = a a ^ 1 = a a + b = b + a a + (b + c) = (a + b) + c a * b = b * a a * (b * c) = (a * b) * c a * (b + c) = (a * b) + (a * c) (a * b) ^ c = (a ^ c) * (b ^ c) (a ^ b) * (a ^ c) = a ^ (b + c) (a ^ b) ^ c = a ^ (b * c)
Call these equations (and some similar ones) The Rules. We learn them, at least implicitly, in elementary and junior high school.
We can define inverse operations for addition, multiplication and exponentiation by asking for the value of b satisfying equations like
a + b = c a * b = c a ^ b = c b ^ a = c
if a and c are known.
The inverse operations, in the corresponding cases, can be expressed as:
b = c - a b = c / a b = log_a(c) b = root_a(c) (also written b = c ^ (1/a) )
There are two inverse operations definable for exponentiation
since a ^ b is not equal to b ^ a:
logarithm finds the exponent given the base, while root finds the base given the exponent.
Having introduced all these definitions based on zero and the successor function counting up through the integers, we now run into the problem that lots of simple equations do not find any solution among the integers.
The trick is to generalize the notion of number in such a way that all such equations can be solved, while leaving all the basic rules about these operations (their algebraic structure) intact.
Thus equations like
b = 3 - 5
lead us to introduce negative numbers: b = -2
We can decide that this is the right answer by observing that the Rules previously noted permit us to manipulate 3 - 5 and show it must be equal to 0 - 2.
But this leads a definitional problem: how to multiply and exponentiate using negative numbers, in a way that preserves The Rules?
Doing this for multiplication gives us the familiar principle of minus times minus is plus, and so on. We can't say any longer that the original definitions of the operations apply: it doesn't make sense to say that -2 time 5 is 5 added up -2 times. But (the relevant pieces of) The Rules continue to apply over the new set of numbers, which is what is important.
Now what does it mean to raise a number to a negative power? For instance, what does (a ^ -2) mean?
Well, we know that whatever it is, it must be the same as a ^ (3 - 5). So what is this?
Well, we know that (3 - 5) + 5 = 3.
Therefore, we know that (a ^ (3 - 5 )) * (a ^ 5) = a ^ 3
Therefore, we know that a ^ (3 - 5) = (a ^ 3) / (a ^ 5)
Since the right side is (a*a*a)/(a*a*a*a*a), it obviously reduces to 1/(a*a) = 1/(a^2)
This (suitably generalized) shows that negative powers must be the reciprocals of the corresponding positive powers -- although we still don't know what this means, since we haven't defined how to divide 1 by (say) 2!
To handle the problem of equations like b = 1 / 5, we introduce fractional numbers. We will not have much trouble getting fractions to follow the rules for addition and multiplication. But what does a fractional exponent mean?
For instance, what is a ^ (3 / 5) ?
We know that (3 / 5) * 5 = 3.
So we know that (a ^ (3 / 5)) ^ 5 = a ^ 3. because of the Rule that *a^b)^c = a^(b*c)
We can take the fifth root of both sides, and see that
a ^ (3 / 5) = root_5(a ^ 3)
In other words, fractional powers will turn out to be a combination of a regular power (the numerator) and a root (the denominator)...
The process of abstraction and generalization continues...
How can we solve an equation like b = 2 ^ (1 / 2) ?
There is no fractional number exactly equal to b in this case. A rigorous definition involves the notion of a sequence of fractions that are successively better approximations to b, and can be said to uniquely identify one of a new kind of number, an irrational number. We can write such numbers down using expressions like root_2(2). The mathematical details are a complicated, but once we accept that such numbers exist, we can continue with the process of making sure that The Rules continue to apply to their new domain.
For instance, what is an irrational power, such as 10 ^ (root_2(2)) ?
We can approximate root_2(2) by a rational number to a certain degree of approximation, and then calculate 10 raised to this rational power (which we already know how to do), thus yielding an approximation to 10 ^ (root_2(2)). If we want a better approximation, we can use a better approximation to the exponent.
On a modern computer or calculator (for instance in MATLAB), this is so easy that we don't give it a second thought.
However, in earlier times, the amount of labor involved in calculating irrational powers and the corresponding logarithms was daunting. So it was convenient to prepare tables of pre-calculated logarithms (or powers) to save time.
Consider the general problem of producing such tables. The choice of 10 as a base is traditional but arbitrary, based on our count of fingers and toes. Luckily, we can work out from The Rules that
log_a(x) = log_b(x)/log_b(a)
so once we have logs with respect to any base, logs to another base are only related by a scaling factor.
So back to the problem of calculating a table of base 10 logarithms in the olden days.
As MATLAB is happy to tell us, sqrt(2) is 1.41421356237310
Since this is 1 + .4 + .01 + .004 etc., we could make use of the equation
10 ^ (1 + .4 + .01 + .004 ...) = (10^1) * (10^.4) * (10^.01) * (10^.004) ...
This reduces the unique and difficult calculation
10 ^ (14142 / 10000 )
to a set of more general calculations.
These more general calculations are still difficult, however.
There is another trick, first used by Henry Briggs in 1617... and his tables were the basis for all log tables used up to the 1930's.
Instead of calculating 10 ^ (1/10), 10 ^ (1/100) and so on, we calculate 10 ^ (1 / 2), 10 ^ (1 / 4), etc.
These are easy to do because they are just successive
square roots of 10:
(10 ^ (1 / 2)) ^ (1 / 2) etc.
Thus 10 ^ (1 / 1024) is just
as MATLAB is happy to tell us.
>> 10 ^ (1 / 1024) ans = 1.00225114829291 >> sqrt(sqrt(sqrt(sqrt(sqrt(sqrt(sqrt(sqrt(sqrt(sqrt(10)))))))))) ans = 1.00225114829291
Square roots are relatively easy to calculate by hand, unlike (say) thousandth-roots.
Let's take a look at the first 20 square roots of 10:
(We start with some MATLAB code to let us print out the results is a reasonable way: read about the function 'printmat' if you want to understand it...) >> S = 2.^(-(1:20)); >> P = 2.^(1:20); >> R = 10.^S; >> D = (10.^S-1)./S; >> M = [P;S;R;D]'; >> RLAB = '1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20'; >> CLAB = 'P=2^N S=1/P 10^S (10^S-1)/S' CLAB = P=2^N S=1/P 10^S (10^S-1)/S >> printmat(M,'Successive Square Roots of 10',RLAB, CLAB); Successive Square Roots of 10 = P=2^N S=1/P 10^S (10^S-1)/S 1 2.00000 0.50000 3.16228 4.32456 2 4.00000 0.25000 1.77828 3.11312 3 8.00000 0.12500 1.33352 2.66817 4 16.00000 0.06250 1.15478 2.47651 5 32.00000 0.03125 1.07461 2.38745 6 64.00000 0.01562 1.03663 2.34451 7 128.00000 0.00781 1.01815 2.32342 8 256.00000 0.00391 1.00904 2.31297 9 512.00000 0.00195 1.00451 2.30777 10 1024.00000 0.00098 1.00225 2.30518 11 2048.00000 0.00049 1.00112 2.30388 12 4096.00000 0.00024 1.00056 2.30323 13 8192.00000 0.00012 1.00028 2.30291 14 16384.00000 6.10352e-05 1.00014 2.30275 15 32768.00000 3.05176e-05 1.00007 2.30267 16 65536.00000 1.52588e-05 1.00004 2.30263 17 131072.00000 7.62939e-06 1.00002 2.30261 18 262144.00000 3.81470e-06 1.00001 2.30260 19 524288.00000 1.90735e-06 1.00000 2.30259 20 1.04858e+06 9.53674e-07 1.00000 2.30259
We can observe that the right hand column, (10^S-1)/S, becomes 2.30259, roughly, as S gets to be small, so that we have 10^S = 2.30259*S + 1 (for S near zero).
Let's have MATLAB check this for us for some pretty small number, 1e-8:
>> 10^(1e-8) ans = 1.00000002302585 >> 2.30259*(1e-8)+1 ans = 1.00000002302590
This means that for S small, S is log10(2.30259*S+1). Thus for X close to 1, log10(X) is (X-1)/2.30259
As we said before, log_a(x) = log_10(x)/log_10(a)
We just observed that
log10(X) -> (X-1)/2.30259... as X approaches 1 from above.
This means that there must be some base, call it e,
such that log_e(X) -> (X-1) as X approaches 1 from above...
This would be a sort of "natural" base for logarithms, the number e such that (e^x-1)/x goes to 1 as x approaches 1, or more simply, the number e such that e^x = 1+x for small x.
The scaling factor for converting to this base from base 10, 1/log10(e), must be 2.30259..., and so e = 10^(1/2.30259...)
MATLAB tells us this is about
>> 10^(1/2.30259) ans = 2.71827603558628
which is appropriately close to what e should be using the official MATLAB e^x function, exp(x):
>> exp(1) ans = 2.71828182845905
Returning to the successive square roots of 10, these enabled the clever Mr. Briggs and his successors to calculate (for instance) log10(2) by observing that 2, like any number, can be expressed as a product of terms taken from the series of successive square roots of ten:
2 = (10^(1/4))*(10^(1/32))*(10^(1/64))*(10^(1/256))... = 1.77828... * 1.07461... * 1.03663... * 1.00904... * ...
MATLAB tells us that we're getting pretty close with these first 4 terms:
>> (10^(1/4))*(10^(1/32))*(10^(1/64))*(10^(1/256)) ans = 1.99885481187351
and that what's left out amounts roughly to multiplying by
>> 2/((10^(1/4))*(10^(1/32))*(10^(1/64))*(10^(1/256))) ans = 1.00057292211505
Since 1.0057292211505 is reasonably close to 1, our previous discovery tells us that log_10(1.00057292211505) is pretty close to .00057292211505/2.30259
and so log10(2) will be pretty close to
1/4 + 1/32 + 1/64 + 1/256 + .00057292211505/2.30259
and MATLAB obliges us:
>> 1/4 + 1/32 + 1/64 + 1/256 + .00057292211505/2.30259 ans = 0.30103006638288 >> log10(2) ans = 0.30102999566398
Briggs did exactly this kind of calculation, carried out to 14 digits (rather than the 6 digits or so in the example just worked), in order to produce his tables.
We still can't solve all simple equations within the bounds of the numbers and operators we've define so far.
For instance, what is the solution to
x^2 = -1 ?
Well, it seems like it should be sqrt(-1), but we don't have any numbers yet that work as the value of this expression, due to our rules for multiplication.
So we start out by widening the scope of our set of numbers by fiat to include the kind of numbers we need; let's just add a new number, 'i', defined as sqrt(-1) All we know about this number 'i' is that i^2 = -1
Now the problem is to make all The Rules work out.
We know that if i^2 is -1 then -i^2 must be the same thing, since the old Rules tell us that for any number a, a^2 = -a^2. In fact, since our only constraint on the meaning of 'i' is that i^2 = -1, in general, any equation containing i must stay true if the sign of i is changed everywhere. The expression obtained by changing the sign of i everywhere it occurs is called a "complex conjugate."
We can pretty easily see that by adding up i's and adding i's to other numbers and multiplying i's by other numbers and so on, we will want to write the new numbers in the general case as p+qi, where p and q are any numbers of the old ('real') kind.
Things don't get any worse when we multiply new ('complex') numbers: (p+qi)(r+si) just equals what we expect according to The Rules (using adjacency for multiplication):
pr + psi + qir + pisi = pr + (ps+qr)i + psii = (pr-ps)+(ps+qr)i
The general case of adding is even easier: The Rules tell us just to add real parts and add imaginary parts:
p+qi + r+si = (p+r)+(q+s)i
The tricky problem is to figure out how to compute complex powers of complex numbers.
First, let's limit our attention to the problem of computing complex powers of real numbers. For instance, what is this?
We know by The Rules that this must be the same as
(10^r) * (10^(is))
The first part is already solved, so we need only solve the second part, a real number taken to an imaginary power.
We believe that equations stay true if we replace all i's with -i's, so if
10^(is) = x+iy
10^(-is) = x-iy
If we multiply these together we see that
10^(is)*10^(-is) = 10^0 = 1 = (x+iy)*(x-iy) = x^2 + y^2
so if can find x (the real part of the answer) we can use this equation to calculate y (the imaginary part).
We're still stuck on the basic problem of computing a real number to an imaginary power, however.
To solve the problem (the last real problem in our program of generalizing operations and types of numbers on the path from adding integers to all of elementary algebra), we make one additional observation, and one leap of faith.
The observation is that if we can solve the problem for even one number, The Rules will tell us how to solve it for all others. Thus if we know 10^(is) for some s, and we want it for 2*s, can get this by squaring that value. And so forth.
The leap of faith is this: let's assume that our observation
10^S = 2.30259*S + 1 for small S,
is true for imaginary numbers as well as real ones!
So let's start with S = i/1024.
Our estimate that 10^S is roughly 2.30259*S +1 gives us:
10^(i/1024) = 2.30259i/1024 + 1 = 1 + *2.30259/1024)*i = 1 + 0.0022486i
Now we have needed single value of a real number taken to an imaginary power, from which we can in principle calculate all other examples by using Rules like (a^b)*(a^c) = a^(b+c)
For instance, 10^(i/1024) * 10^(i/1024) = (10^(i/1024))^2 = 10^(i/512) and (10^(i/512))^2 = 10^(i/256), and so on.
So now we can calculate a bunch of other values for 10 raised to complex powers by squaring our result 10^(i/1024) = 2.30259i/1024 + 1 = 1 + 0.0022486i over and over again.
Actually we'll let MATLAB do the arithmetic:
>> i=sqrt(-1); >> R = 1:11; >> S = [1024 512 256 128 64 32 16 8 4 2 1]; >> R(1) = 1 + 2.30259i/1024; >> for n = 2:11, R(n) = R(n-1)^2; end >> ReR = real(R); >> ImR = imag(R); >> CLAB = 'Power=i/S re(10^(i/S)) im(10*(i/S))'; >> RLAB = '1 2 3 4 5 6 7 8 9 10 11'; >> M = [S; ReR; ImR]';
Now to print the results on a fresh screen:
>> format short >> format compact >> printmat(M,'Successive Squares of 10^(i/1024) as 2.30259i/1024',RLAB,CLAB); Successive Squares of 10^(i/1024) as 2.30259i/1024 = Power=i/S re(10^(i/S)) im(10*(i/S)) 1 1024.00000 1.00000 0.00225 2 512.00000 0.99999 0.00450 3 256.00000 0.99997 0.00899 4 128.00000 0.99986 0.01799 5 64.00000 0.99939 0.03597 6 32.00000 0.99749 0.07190 7 16.00000 0.98982 0.14344 8 8.00000 0.95917 0.28396 9 4.00000 0.83938 0.54473 10 2.00000 0.40783 0.91447 11 1.00000 -0.66993 0.74591
Thus the eighth successive square is an estimate of the value of 10^(i/8):
>> R(8) ans = 0.9592 + 0.2840i
We can check this against the oracle, i.e. MATLAB's estimate:
>> 10^(i/8) ans = 0.9589 + 0.2839i
So we still have about 3 significant figures left, which is not too bad!
In order to follow what happens over a larger span and using more evenly spaced values of the complex exponent, let's take this estimate of 10^(i/8) and take a series of successive integer powers, representing 10^(i/8), 10^(2*i/8), 10^(3*i/8), etc. We can also back up to 1+0i, which is 10^(0*i/8). Note that each step is just a matter of multiplying the previous result by 0.959174+0.283958i, the value for 10^(i/8) we have calculated earlier.
>> X = 0:24; >> Y = 0:24; >> Y(1) = 1+0i; >> Y(2) = R(8); >> for n = 3:25, Y(n) = Y(n-1)*R(8); end >>
Now let's print the results on a fresh screen:
>> Y Y = Columns 1 through 4 1.0000 0.9592 + 0.2840i 0.8394 + 0.5447i 0.6504 + 0.7608i Columns 5 through 8 0.4078 + 0.9145i 0.1315 + 0.9929i -0.1558 + 0.9898i -0.4305 + 0.9051i Columns 9 through 12 -0.6699 + 0.7459i -0.8544 + 0.5252i -0.9687 + 0.2612i -1.0033 - 0.0245i Columns 13 through 16 -0.9553 - 0.3084i -0.8288 - 0.5671i -0.6339 - 0.7793i -0.3867 - 0.9275i Columns 17 through 20 -0.1076 - 0.9994i 0.1806 - 0.9892i 0.4541 - 0.8975i 0.6904 - 0.7319i Columns 21 through 24 0.8701 - 0.5060i 0.9782 - 0.2382i 1.0059 + 0.0493i 0.9509 + 0.3329i Column 25 0.8175 + 0.5893i
We can see that the numbers are going up and down, but this printout is not very easy to read. A graphical presentation would be easier to assimilate. Since the result of each step is a complex number, let's graph the real and imaginary parts separately.
>> newplot >> hold on >> X = (0:24)/8; >> plot(X, real(Y),'o'); >> plot(X,real(Y)); >> plot(X,imag(Y),'x'); >> plot(X,imag(Y)); >> hold off
These curves look like our familiar friends the sinusoids.
The real part starts with 1 and thus looks like a cosine wave.
The imaginary part starts from 0 and thus looks like a sine wave.
This should be easy to remember: for any real number a, a^(ix) will obviously be 1+0i when x is 0.
There is one little issue: the period is somewhere around 2.8, whereas we expect the period of a sinuoid to be 2*pi, or about 6.3. What's up? Well, back to our friend e, the "natural" logarithmic base.
Whereas we began our experiment from the premise that 10^S = 2.30259*S + 1 for small S, now we will start from the premise that e^S = S + 1 for small S.
Repeating everything otherwise as before, except that now we calculate e^(ix) instead of 10^(ix), we get:
>> R = 1:11; >> S = [1024 512 256 128 64 32 16 8 4 2 1]; >> R(1) = 1 + i/1024; >> for n = 2:11, R(n) = R(n-1)^2; end >> ReR = real(R); >> ImR = imag(R); >> CLAB = 'Power=i/S real(e^(i/S)) imag(e*(i/S))'; >> RLAB = '1 2 3 4 5 6 7 8 9 10 11'; >> M = [S; ReR; ImR]';
Now to print the results on a fresh screen:
>> format short >> format compact >> printmat(M,'Successive Squares of e^(i/1024) as 2.30259i/1024',RLAB,CLAB); Successive Squares of e^(i/1024) as 2.30259i/1024 = Power=i/S real(e^(i/S) imag(e*(i/S) 1 1024.00000 1.00000 0.00098 2 512.00000 1.00000 0.00195 3 256.00000 0.99999 0.00391 4 128.00000 0.99997 0.00781 5 64.00000 0.99989 0.01562 6 32.00000 0.99953 0.03125 7 16.00000 0.99808 0.06246 8 8.00000 0.99226 0.12468 9 4.00000 0.96903 0.24743 10 2.00000 0.87780 0.47954 11 1.00000 0.54057 0.84188
Now the 9th successive square is an estimate of the value of e^(i/4):
>> R(9) ans = 0.9690 + 0.2474i
We can check this against the oracle, i.e. MATLAB's estimate:
>> exp(1)^(i/4) ans = 0.9689 + 0.2474i
Again, at least 3 significant figures left.
Again, to follow what happens over a larger span and using more evenly spaced values of the complex exponent, we'll start with this estimate of e^(i/4) and take a series of successive integer powers, representing e^(i/4), e^(2*i/4), e^(3*i/4), etc. We again back up to e^(0*i/4), which is of course 1+0i.
As before, each step is just a matter of multiplying the previous result by the value for e^(i/4) we have calculated earlier.
>> Y = 1:28; >> Y(1) = 1+0i; >> Y(2) = R(9); >> for n = 3:28, Y(n) = Y(n-1)*R(9); end >>
Once again, we will see better by graphing the real and imaginary parts separately. This time we will also plot a vertical line at 2*pi in order to check the periodicity:
>> newplot >> hold on >> X = (0:27)/4; >> plot(X, real(Y),'o'); >> plot(X,real(Y)); >> plot(X,imag(Y),'x'); >> plot(X,imag(Y)); >> plot([2*pi 2*pi], [-1 1]); >> hold off
Pretty good! Mr. Briggs' trick leads us to a strong suspicion that
e^(i*x) = cos(x) + i*sin(x). Setting x equal to pi, we see that e^(i*pi) = -1, or
e^(i*pi) + 1 = 0
This little equation, so neatly relating the fundamental constants e, pi, i, 1 and 0, might have reconciled Pythagoras (if he had only known it!) to the irrationality of sqrt(2), or even to the failure of (3/2)^12 to equal 2^7.
Aside from its value as an aid to contemplation of the mystic wonders of algebra, the fact that e^(i*x) = cos(x) + i*sin(x) is quite useful in many branches of science and engineering. As we will see, it leads to a neater formulation of the Fourier transform, and to a set of powerful tools for certain problems of system analysis.