Chapter I
Linear Equations
I.1. Solving Linear Equations
Prerequisites and Learning Goals
From your work in previous courses, you should be able to
• Write a system of linear equations using matrix notation.
• Use Gaussian elimination to bring a system of linear equations into upper triangular form
and reduced row echelon form (rref).
• Determine whether a system of equations has a unique solution, infinitely many solutions
or no solutions, and compute all solutions if they exist; express the set of all solutions in
parametric form.
• Recognize when matrix sums and matrix products are defined, and to compute them.
• Compute the inverse of a matrix when it exists, use the inverse to solve a system of equations,
describe for what systems of equations this is possible.
• Find the transpose of a matrix.
• Interpret a matrix as a linear transformation acting on vectors.
After completing this section, you should be able to
• Write down the definition and explain the meaning of the standard Euclidean norm, the 1-norm and the infinity norm of a vector; compute norms by hand or using the MATLAB/Octave command norm.
• Calculate the Hilbert-Schmidt norm of a matrix.
• Define the matrix norm of a matrix; describe the connection between the matrix norm and
how a matrix stretches the length of vectors and express it in a mathematical form; explain
what range of values the matrix norm can take; compute the matrix norm of a diagonal
matrix and, given sufficient information, of other types of matrices.
• Find solutions to linear equations of the form Ax = b by executing and interpreting the
output of the MATLAB/Octave commands rref and \.
• Define the condition number of a matrix and its relation to the matrix norm; compute it by
hand when possible or using the MATLAB/Octave command cond.
• Discuss the sensitivity of a linear equation Ax = b by relating changes in the solution x to
small changes in b; define the relative error, explain why the condition number is useful in
making estimates of relative errors in the solution x, and use it to make such estimates.
I.1.1. Review: Systems of linear equations
The first part of the course is about systems of linear equations. You will have studied such systems
in a previous course, and should remember how to find solutions (when they exist) using Gaussian
elimination.
Many practical problems can be solved by turning them into a system of linear equations. In
this chapter we will study a few examples: the problem of finding a function that interpolates
a collection of given points, and the approximate solutions of differential equations. In practical
problems, the question of existence of solutions, although important, is not the end of the story.
It turns out that some systems of equations, even though they may have a unique solution, are
very sensitive to changes in the coefficients. This makes them very difficult to solve reliably. We
will see some examples of such ill-conditioned systems, and learn how to recognize them using the
condition number of a matrix.
Recall that a system of linear equations, like this system of 2 equations in 3 unknowns
$$\begin{aligned} x_1 + 2x_2 + x_3 &= 0 \\ x_1 - 5x_2 + x_3 &= 1 \end{aligned}$$
can be written as a matrix equation
$$\begin{pmatrix} 1 & 2 & 1 \\ 1 & -5 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
A general system of m linear equations in n unknowns can be written as
Ax = b
where A is a given m × n (m rows, n columns) matrix, b is a given m-component vector, and x
is the n-component vector of unknowns.
A system of linear equations may have no solutions, a unique solution, or infinitely many solutions. This is easy to see when there is only a single variable x, so that the equation has the
form
ax = b
where a and b are given numbers. The solution is easy to find if a ≠ 0: x = b/a. If a = 0 then the
equation reads 0x = b. In this case, the equation either has no solutions (when b ≠ 0) or infinitely
many (when b = 0), since in this case every x is a solution.
To solve a general system Ax = b, form the augmented matrix [A|b] and use Gaussian elimination
to reduce the matrix to reduced row echelon form. This reduced matrix (which represents a system
of linear equations that has exactly the same solutions as the original system) can be used to decide
whether solutions exist, and to find them. If you don’t remember this procedure, you should review
it.
In the example above, the augmented matrix is
$$\left[\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 1 & -5 & 1 & 1 \end{array}\right].$$
The reduced row echelon form is
$$\left[\begin{array}{ccc|c} 1 & 0 & 1 & 2/7 \\ 0 & 1 & 0 & -1/7 \end{array}\right],$$
which leads to a family of solutions (one for each value of the parameter s)
$$x = \begin{pmatrix} 2/7 \\ -1/7 \\ 0 \end{pmatrix} + s \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}.$$
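As a quick check (a sketch, not part of the original notes), you can verify in MATLAB/Octave that every member of this family solves the system:
>A=[1 2 1; 1 -5 1];
>x=@(s) [2/7; -1/7; 0] + s*[-1; 0; 1];   % the whole parametric family as a function of s
>A*x(0), A*x(3)                          % both products should return [0; 1]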
I.1.2. Solving a non-singular system of n equations in n unknowns
Let’s start with a system of equations where the number of equations is the same as the number
of unknowns. Such a system can be written as a matrix equation
Ax = b,
where A is a square matrix, b is a given vector, and x is the vector of unknowns we are trying
to find. When A is non-singular (invertible) there is a unique solution. It is given by x = A⁻¹b,
where A⁻¹ is the inverse matrix of A. Of course, computing A⁻¹ is not the most efficient way to
solve a system of equations.
For our first introduction to MATLAB/Octave, let’s consider an example:
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & -1 \\ 1 & -1 & 1 \end{pmatrix} \qquad b = \begin{pmatrix} 3 \\ 1 \\ 1 \end{pmatrix}.$$
First, we define the matrix A and the vector b in MATLAB/Octave. Here is the input (after the
prompt symbol >) and the output (without a prompt symbol).
>A=[1 1 1;1 1 -1;1 -1 1]
A =
   1   1   1
   1   1  -1
   1  -1   1
>b=[3;1;1]
b =
   3
   1
   1
Notice that the entries on the same row are separated by spaces (or commas) while rows are separated by semicolons. In MATLAB/Octave, column vectors are n by 1 matrices and row vectors are 1
by n matrices. The semicolons in the definition of b make it a column vector. In MATLAB/Octave,
X' denotes the transpose of X. Thus we get the same result if we define b as
>b=[3 1 1]'
b =
3
1
1
The solution can be found by computing the inverse of A and multiplying
>x = A^(-1)*b
x =
1
1
1
However if A is a large matrix we don’t want to actually calculate the inverse. The syntax for
solving a system of equations efficiently is
>x = A\b
x =
1
1
1
If you try this with a singular matrix A, MATLAB/Octave will complain and print a warning
message. If you see the warning, the answer is not reliable! You can always check to see that x
really is a solution by computing Ax.
>A*x
ans =
3
1
1
As expected, the result is b.
By the way, you can check to see how much faster A\b is than A^(-1)*b by using the functions
tic() and toc(). The function tic() starts the clock, and toc() stops the clock and prints the
elapsed time. To try this out, let’s make A and b really big with random entries.
A=rand(1000,1000);
b=rand(1000,1);
Here we are using the MATLAB/Octave command rand(m,n) that generates an m × n matrix
with random entries chosen between 0 and 1. Each time rand is used it generates new numbers.
Notice the semicolon ; at the end of the inputs. This suppresses the output. Without the
semicolon, MATLAB/Octave would start writing the 1,000,000 random entries of A to our screen!
Now we are ready to time our calculations.
tic();A^(-1)*b;toc();
Elapsed time is 44 seconds.
tic();A\b;toc();
Elapsed time is 13.55 seconds.
So we see that A\b is quite a bit faster.
I.1.3. Reduced row echelon form
How can we solve Ax = b when A is singular, or not a square matrix (that is, the number of
equations is different from the number of unknowns)? In your previous linear algebra course you
learned how to use elementary row operations to transform the original system of equations to
an upper triangular system. The upper triangular system obtained this way has exactly the same
solutions as the original system. However, it is much easier to solve. In practice, the row operations
are performed on the augmented matrix [A|b].
If efficiency is not an issue, then additional row operations can be used to bring the system into
reduced row echelon form. In this form, the pivot columns have a 1 in the pivot position and
zeros elsewhere. For example, if A is a square non-singular matrix then the reduced row echelon
form of [A|b] is [I|x], where I is the identity matrix and x is the solution.
In MATLAB/Octave you can compute the reduced row echelon form in one step using the function
rref(). For the system we considered above we do this as follows. First define A and b as before.
This time I’ll suppress the output.
>A=[1 1 1;1 1 -1;1 -1 1];
>b=[3 1 1]';
In MATLAB/Octave, the square brackets [ ... ] can be used to construct larger matrices from
smaller building blocks, provided the sizes match correctly. So we can define the augmented matrix
C as
>C=[A b]
C =
   1   1   1   3
   1   1  -1   1
   1  -1   1   1
Now we compute the reduced row echelon form.
>rref(C)
ans =
   1   0   0   1
   0   1  -0   1
   0   0   1   1
The solution appears on the right.
Now let’s try to solve Ax = b with
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \qquad b = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
This time the matrix A is singular and doesn’t have an inverse. Recall that the determinant of a
singular matrix is zero, so we can check by computing it.
>A=[1 2 3; 4 5 6; 7 8 9];
>det(A)
ans = 0
However we can still try to solve the equation Ax = b using Gaussian elimination.
>b=[1 1 1]';
>rref([A b])
ans =
   1.00000   0.00000  -1.00000  -1.00000
   0.00000   1.00000   2.00000   1.00000
   0.00000   0.00000   0.00000   0.00000
Letting x3 = s be a parameter, and proceeding as you learned in previous courses, we arrive at the
general solution
$$x = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + s \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$$
On the other hand, if
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} \qquad b = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix},$$
then
>rref([1 2 3 1;4 5 6 1;7 8 9 0])
ans =
   1.00000   0.00000  -1.00000   0.00000
   0.00000   1.00000   2.00000   0.00000
   0.00000   0.00000   0.00000   1.00000
tells us that there is no solution.
I.1.4. Gaussian elimination steps using MATLAB/Octave
If C is a matrix in MATLAB/Octave, then C(1,2) is the entry in the 1st row and 2nd column.
The whole first row can be extracted using C(1,:) while C(:,2) yields the second column. Finally
we can pick out the submatrix of C consisting of rows 1-2 and columns 2-4 with the notation
C(1:2,2:4).
Let’s illustrate this by performing a few steps of Gaussian elimination on the augmented matrix
from our first example. Start with
C=[1 1 1 3; 1 1 -1 1; 1 -1 1 1];
The first step in Gaussian elimination is to subtract the first row from the second.
>C(2,:)=C(2,:)-C(1,:)
C =
   1   1   1   3
   0   0  -2  -2
   1  -1   1   1
Next, we subtract the first row from the third.
>C(3,:)=C(3,:)-C(1,:)
C =
   1   1   1   3
   0   0  -2  -2
   0  -2   0  -2
To bring the system into upper triangular form, we need to swap the second and third rows. Here
is the MATLAB/Octave code.
>temp=C(3,:);C(3,:)=C(2,:);C(2,:)=temp
C =
   1   1   1   3
   0  -2   0  -2
   0   0  -2  -2
I.1.5. Norms for a vector
Norms are a way of measuring the size of a vector. They are important when we study how vectors
change, or want to know how close one vector is to another. A vector may have many components
and it might happen that some are big and some are small. A norm is a way of capturing information
about the size of a vector in a single number. There is more than one way to define a norm.
In your previous linear algebra course, you probably have encountered the most common norm,
called the Euclidean norm (or the 2-norm). The word norm without qualification usually refers to
this norm. What is the Euclidean norm of the vector
$$a = \begin{pmatrix} -4 \\ 3 \end{pmatrix}?$$
When you draw the vector as an arrow on the plane, this norm is the Euclidean distance between the tip and the tail. This leads to the formula
$$\|a\| = \sqrt{(-4)^2 + 3^2} = 5.$$
This is the answer that MATLAB/Octave gives too:
> a=[-4 3]
a =
-4
3
> norm(a)
ans = 5
The formula is easily generalized to n dimensions. If x = [x1, x2, . . . , xn]^T then
$$\|x\| = \sqrt{|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2}.$$
The absolute value signs in this formula, which might seem superfluous, are put in to make the
formula correct when the components are complex numbers. So, for example
$$\left\|\begin{pmatrix} i \\ 1 \end{pmatrix}\right\| = \sqrt{|i|^2 + |1|^2} = \sqrt{1+1} = \sqrt{2}.$$
Does MATLAB/Octave give this answer too?
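One way to check (a sketch, not part of the original notes; it assumes the imaginary unit i has not been overwritten by another variable) is
> norm([i; 1])
which should print approximately 1.4142, that is, √2.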
There are situations where other ways of measuring the norm of a vector are more natural.
Suppose that the tip and tail of the vector a = [−4, 3]T are locations in a city where you can only
walk along the streets and avenues.
[Figure: the vector a = (−4, 3)^T drawn on a street grid; walking from its tail to its tip requires 4 blocks in one direction and 3 in the other.]
If you defined the norm to be the shortest distance that you can walk to get from the tail to the
tip, the answer would be
$$\|a\|_1 = |-4| + |3| = 7.$$
This norm is called the 1-norm and can be calculated in MATLAB/Octave by adding 1 as an extra
argument in the norm function.
> norm(a,1)
ans = 7
The 1-norm is also easily generalized to n dimensions. If x = [x1, x2, . . . , xn]^T then
$$\|x\|_1 = |x_1| + |x_2| + \cdots + |x_n|.$$
Another norm that is often used measures the largest component in absolute value. This norm
is called the infinity norm. For a = [−4, 3]^T we have
$$\|a\|_\infty = \max\{|-4|, |3|\} = 4.$$
To compute this norm in MATLAB/Octave we use inf as the second argument in the norm function.
> norm(a,inf)
ans = 4
Here are three properties that the norms we have defined all have in common:
1. For every vector x and every number s, ‖sx‖ = |s|‖x‖.
2. The only vector with norm zero is the zero vector, that is, ‖x‖ = 0 if and only if x = 0.
3. For all vectors x and y, ‖x + y‖ ≤ ‖x‖ + ‖y‖. This inequality is called the triangle inequality.
It says that the length of the longest side of a triangle is smaller than the sum of the lengths
of the two shorter sides.
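These properties are easy to spot-check numerically; here is a small sketch (not from the notes), using the Euclidean norm:
>x=[3; -4]; y=[1; 2]; s=-2;
>norm(s*x) - abs(s)*norm(x)      % property 1: should be 0 (up to round-off)
>norm(x+y) <= norm(x) + norm(y)  % property 3: should be 1, meaning true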
What is the point of introducing many ways of measuring the length of a vector? Sometimes one
of the non-standard norms has natural meaning in the context of a given problem. For example,
when we study stochastic matrices, we will see that multiplication of a vector by a stochastic matrix
decreases the 1-norm of the vector. So in this situation it is natural to use 1-norms. However, in
this course we will almost always use the standard Euclidean norm. If v is a vector then ‖v‖ (without
any subscripts) will always denote the standard Euclidean norm.
I.1.6. Matrix norms
Just as for vectors, there are many ways to measure the size of a matrix A.
For a start we could think of a matrix as a vector whose entries just happen to be written in a box, like
$$A = \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix},$$
rather than in a single column, like
$$a = \begin{pmatrix} 1 \\ 2 \\ 0 \\ 2 \end{pmatrix}.$$
Taking this point of view, we would define the norm of A to be √(1² + 2² + 0² + 2²) = 3. In fact, the norm computed in this way is sometimes used for matrices. It is called the Hilbert-Schmidt norm. For a general matrix A = [ai,j], the formula for the Hilbert-Schmidt norm is
$$\|A\|_{HS} = \sqrt{\sum_i \sum_j |a_{i,j}|^2}.$$
The Hilbert-Schmidt norm does measure the size of a matrix in some sense. It has the advantage of
being easy to compute from the entries ai,j . But it is not closely tied to the action of A as a linear
transformation.
When A is considered as a linear transformation or operator, acting on vectors, there is another
norm that is more natural to use.
Starting with a vector x the matrix A transforms it to the vector Ax. We want to say that a matrix is big if it increases the size of vectors, in other words, if ‖Ax‖ is big compared to ‖x‖. So it is natural to consider the stretching ratio ‖Ax‖/‖x‖. Of course, this ratio depends on x, since some vectors get stretched more than others by A. Also, the ratio is not defined if x = 0. But in this case Ax = 0 too, so there is no stretching.
We now define the matrix norm of A to be the largest of these ratios,
$$\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|}.$$
This norm measures the maximum factor by which A can stretch the length of a vector. It is
sometimes called the operator norm.
Since ‖A‖ is defined to be the maximum of a collection of stretching ratios, it must be bigger than or equal to any particular stretching ratio. In other words, for any non-zero vector x we know ‖A‖ ≥ ‖Ax‖/‖x‖, or
$$\|Ax\| \le \|A\| \|x\|.$$
This is how the matrix norm is often used in practice. If we know ‖x‖ and the matrix norm ‖A‖,
then we have an upper bound on the norm of Ax.
In fact, since the maximum of a collection of numbers is the smallest number that is larger than or equal to every number in the collection (draw a picture on the number line to see this), the matrix norm ‖A‖ is the smallest number that is bigger than ‖Ax‖/‖x‖ for every choice of non-zero x. Thus ‖A‖ is the smallest number C for which
$$\|Ax\| \le C\|x\|$$
for every x.
An equivalent definition for ‖A‖ is
$$\|A\| = \max_{\|x\|=1} \|Ax\|.$$
Why do these definitions give the same answer? The reason is that the quantity ‖Ax‖/‖x‖ does not change if we multiply x by a non-zero scalar (convince yourself!). So, when calculating the maximum over all non-zero vectors in the first expression for ‖A‖, all the vectors pointing in the same direction will give the same value for ‖Ax‖/‖x‖. This means that we need only pick one vector in any given direction, and might as well choose the unit vector. For this vector, the denominator is equal to one, so we can ignore it.
Here is another way of saying this. Consider the image of the unit sphere under A. This is the set of vectors {Ax : ‖x‖ = 1}. The length of the longest vector in this set is ‖A‖.
The picture below is a sketch of the unit sphere (circle) in two dimensions, and its image under
$$A = \begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix}.$$
This image is an ellipse.
[Figure: the unit circle and its image ellipse under A; an arrow of length ‖A‖ points from the origin to the farthest point on the ellipse.]
The norm of the matrix is the distance from the origin to the point on the ellipse farthest from the origin. In this case this turns out to be
$$\|A\| = \sqrt{9/2 + (1/2)\sqrt{65}}.$$
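Here is a rough numerical check of this value (a sketch, not from the notes): sample many random unit vectors, record the stretching ratios, and compare the largest one with the built-in matrix norm.
>A=[1 2; 0 2];
>ratios=zeros(1,10000);
>for k=1:10000, x=randn(2,1); x=x/norm(x); ratios(k)=norm(A*x); end
>max(ratios)              % close to the matrix norm
>norm(A)                  % for a matrix, norm(A) returns the matrix (operator) norm
>sqrt(9/2+sqrt(65)/2)     % the exact value, about 2.9208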
It’s hard to see how this expression can be obtained from the entries of the matrix. There is no
easy formula. However, if A is a diagonal matrix the norm is easy to compute.
To see this, let’s consider a diagonal matrix
$$A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
If
$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$
then
$$Ax = \begin{pmatrix} 3x_1 \\ 2x_2 \\ x_3 \end{pmatrix}$$
so that
$$\begin{aligned} \|Ax\|^2 &= |3x_1|^2 + |2x_2|^2 + |x_3|^2 \\ &= 3^2|x_1|^2 + 2^2|x_2|^2 + |x_3|^2 \\ &\le 3^2|x_1|^2 + 3^2|x_2|^2 + 3^2|x_3|^2 \\ &= 3^2\|x\|^2. \end{aligned}$$
This implies that for any unit vector x
$$\|Ax\| \le 3,$$
and taking the maximum over all unit vectors x yields ‖A‖ ≤ 3. On the other hand, the maximum of ‖Ax‖ over all unit vectors x is larger than the value of ‖Ax‖ for any particular unit vector. In particular, if
$$e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$
then
$$\|A\| \ge \|Ae_1\| = 3.$$
Thus we see that ‖A‖ = 3.
In general, the matrix norm of a diagonal matrix with diagonal entries λ1 , λ2 , · · · , λn is the largest
value of |λk |.
The MATLAB/Octave code for a diagonal matrix with diagonal entries 3, 2 and 1 is diag([3 2 1])
and the expression for the norm of A is norm(A). So for example
>norm(diag([3 2 1]))
ans =
3
I.1.7. Condition number
Let’s return to the situation where A is a square matrix and we are trying to solve Ax = b. If
A is a matrix arising from a real world application (for example if A contains values measured in
an experiment) then it will almost never happen that A is singular. After all, a tiny change in
any of the entries of A can change a singular matrix to a non-singular one. What is much more
likely to happen is that A is close to being singular. In this case A⁻¹ will still exist, but will have
some enormous entries. This means that the solution x = A⁻¹b will be very sensitive to the tiniest
changes in b so that it might happen that round-off error in the computer completely destroys the
accuracy of the answer.
To check whether a system of linear equations is well-conditioned, we might therefore think of using ‖A⁻¹‖ as a measure. But this isn’t quite right, since we actually don’t care if ‖A⁻¹‖ is large, provided it stretches each vector about the same amount. For example, if we simply multiply each entry of A by 10⁻⁶ the size of A⁻¹ will go way up, by a factor of 10⁶, but our ability to solve the system accurately is unchanged. The new solution is simply 10⁶ times the old solution, that is, we have simply shifted the position of the decimal point.
It turns out that for a square matrix A, the ratio of the largest stretching factor to the smallest
stretching factor of A is a good measure of how well conditioned the system of equation Ax = b
is. This ratio is called the condition number and is denoted cond(A).
Let’s first compute an expression for cond(A) in terms of matrix norms. Then we will explain
why it measures the conditioning of a system of equations.
We already know that the largest stretching factor for a matrix A is the matrix norm ‖A‖. So
let’s look at the smallest stretching factor. We might as well assume that A is invertible. Otherwise,
there is a non-zero vector that A sends to zero, so that the smallest stretching factor is 0 and the
condition number is infinite.
$$\min_{x\neq 0} \frac{\|Ax\|}{\|x\|} = \min_{x\neq 0} \frac{\|Ax\|}{\|A^{-1}Ax\|} = \min_{y\neq 0} \frac{\|y\|}{\|A^{-1}y\|} = \frac{1}{\displaystyle\max_{y\neq 0} \frac{\|A^{-1}y\|}{\|y\|}} = \frac{1}{\|A^{-1}\|}.$$
Here we used the fact that if x ranges over all non-zero vectors so does y = Ax and that the minimum of a collection of positive numbers is one divided by the maximum of their reciprocals. Thus the smallest stretching factor for A is 1/‖A⁻¹‖. This leads to the following formula for the condition number of an invertible matrix:
$$\mathrm{cond}(A) = \|A\| \|A^{-1}\|.$$
In our applications we will use the condition number as a measure of how accurately we can solve
the equations that come up.
Now, let us try to see why the condition number of A is a good measure of how well we can solve
the equations Ax = b accurately.
Starting with Ax = b we change the right side to b′ = b + ∆b. The new solution is
$$x' = A^{-1}(b + \Delta b) = x + \Delta x,$$
where x = A⁻¹b is the original solution and the change in the solutions is ∆x = A⁻¹∆b. Now the absolute errors ‖∆b‖ and ‖∆x‖ are not very meaningful, since an absolute error ‖∆b‖ = 100 is not very large if ‖b‖ = 1,000,000, but is large if ‖b‖ = 1. What we really care about are the relative errors ‖∆b‖/‖b‖ and ‖∆x‖/‖x‖. Can we bound the relative error in the solution in terms of the relative error in the equation? The answer is yes. Beginning with
$$\|\Delta x\| \|b\| = \|A^{-1}\Delta b\| \|Ax\| \le \|A^{-1}\| \|\Delta b\| \|A\| \|x\|,$$
we can divide by ‖b‖‖x‖ to obtain
$$\frac{\|\Delta x\|}{\|x\|} \le \|A^{-1}\| \|A\| \frac{\|\Delta b\|}{\|b\|} = \mathrm{cond}(A)\, \frac{\|\Delta b\|}{\|b\|}.$$
This inequality gives the real meaning of the condition number. If the condition number is near to
1 then the relative error of the solution is about the same as the relative error in the equation.
However, a large condition number means that a small relative error in the equation can lead to a
large relative error in the solution.
In MATLAB/Octave the condition number is computed using cond(A).
> A=[2 0; 0 0.5];
> cond(A)
ans = 4
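To see the estimate in action, here is a small experiment (a sketch, not part of the notes), using the built-in hilb(n), which produces a famously ill-conditioned matrix:
>A=hilb(8);
>b=A*ones(8,1);                 % so the exact solution is all ones
>db=1e-8*randn(8,1);            % a tiny change in the right side
>x=A\b; dx=A\(b+db)-x;
>(norm(dx)/norm(x))/(norm(db)/norm(b))   % how much the relative error grew
>cond(A)                                 % the bound on that growth, roughly 1.5e10
The amplification factor observed in the first number will vary with the random perturbation, but it should not appreciably exceed cond(A).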
I.1.8. Summary of MATLAB/Octave commands used in this section
How to create a row vector
[ ] square brackets are used to construct matrices and vectors. Create a row in the matrix
by entering elements within brackets. Separate each element with a comma or space. For
example, to create a row vector a with three columns (i.e. a 1-by-3 matrix), type
a=[1 1 1] or equivalently a=[1,1,1]
How to create a column vector or a matrix with more than one row
; when the semicolon is used inside square brackets, it terminates rows. For example,
a=[1;1;1] creates a column vector with three rows
B=[1 2 3; 4 5 6] creates a 2-by-3 matrix
' when a matrix (or a vector) is followed by a single quote ' (or apostrophe) MATLAB flips rows
with columns, that is, it generates the transpose. When the original matrix is a simple row
vector, the apostrophe operator turns the vector into a column vector. For example,
a=[1 1 1]' creates a column vector with three rows
B=[1 2 3; 4 5 6]' creates a 3-by-2 matrix where the first row is 1 4
How to use specialized matrix functions
rand(n,m) returns an n-by-m matrix with random numbers between 0 and 1.
How to extract elements or submatrices from a matrix
A(i,j) returns the entry of the matrix A in the i-th row and the j-th column
A(i,:) returns a row vector containing the i-th row of A
A(:,j) returns a column vector containing the j-th column of A
A(i:j,k:m) returns a matrix containing a specific submatrix of the matrix A. Specifically, it
returns all rows between the i-th and the j-th rows of A, and all columns between the k-th
and the m-th columns of A.
How to perform specific operations on a matrix
det(A) returns the determinant of the (square) matrix A
rref(A) returns the reduced row echelon form of the matrix A
norm(V) returns the 2-norm (Euclidean norm) of the vector V
norm(V,1) returns the 1-norm of the vector V
norm(V,inf) returns the infinity norm of the vector V
I.2. Interpolation
Prerequisites and Learning Goals
From your work in previous courses, you should be able to
• compute the determinant of a square matrix; apply the basic linearity properties of the
determinant, and explain what its value means about existence and uniqueness of solutions.
After completing this section, you should be able to
• Give a definition of an interpolating function and show how a given interpolation problem
(i.e. the problem of finding an interpolating function of given shape/form) translates into
finding the solutions to a linear system; explain the idea of getting a unique interpolating
function by restricting the class of functions under consideration.
• Define the problem of Lagrange interpolation and express it in terms of a system of equations
where the unknowns are the coefficients of a polynomial of given degree; set up the system
in matrix form using the Vandermonde matrix, derive the formula for the determinant of the
Vandermonde matrix; explain why a solution to the Lagrange interpolation problem always
exists.
• Define the mathematical problem of interpolation using splines, this includes listing the conditions that the splines must satisfy and translating them into mathematical equations; express
the spline-interpolation problem in terms of a system of equations where the unknowns are
related to the coefficients of the splines, derive the corresponding matrix equation.
• Compare and contrast spline interpolation with Lagrange interpolation.
• Explain how minimizing the bending energy leads to a description of the shape of the spline
as a piecewise polynomial function.
• Given a set of points, use MATLAB/Octave to calculate and plot the interpolating functions,
including the interpolating polynomial in Lagrange interpolation and the piecewise cubic
function for splines; this requires you be able to execute and interpret the output of specific
commands such as vander, polyval, plot.
• Discuss how the condition numbers arising in the above interpolation problems vary; explain
why Lagrange interpolation is not a practical method for large numbers of points.
I.2.1. Introduction
Suppose we are given some points (x1 , y1 ), . . . , (xn , yn ) in the plane, where the points xi are all
distinct.
Our task is to find a function f (x) that passes through all these points. In other words, we require
that f (xi ) = yi for i = 1, . . . , n. Such a function is called an interpolating function. Problems like
this arise in practical applications in situations where a function is sampled at a finite number of
points. For example, the function could be the shape of the model we have made for a car. We
take a bunch of measurements (x1 , y1 ), . . . , (xn , yn ) and send them to the factory. What’s the best
way to reproduce the original shape?
Of course, it is impossible to reproduce the original shape with certainty. There are infinitely
many functions going through the sampled points.
To make our problem of finding the interpolating function f (x) have a unique solution, we must
require something more of f (x), either that f (x) lies in some restricted class of functions, or that
f (x) is the function that minimizes some measure of “badness”. We will look at both approaches.
I.2.2. Lagrange interpolation
For Lagrange interpolation, we try to find a polynomial p(x) of lowest possible degree that passes
through our points. Since we have n points, and therefore n equations p(xi ) = yi to solve, it makes
sense that p(x) should be a polynomial of degree n − 1
$$p(x) = a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n$$
with n unknown coefficients a1 , a2 , . . . , an . (Don’t blame me for the screwy way of numbering the
coefficients. This is the MATLAB/Octave convention.)
The n equations p(xi) = yi are n linear equations for these unknown coefficients, which we may write as
$$\begin{pmatrix} x_1^{n-1} & x_1^{n-2} & \cdots & x_1^2 & x_1 & 1 \\ x_2^{n-1} & x_2^{n-2} & \cdots & x_2^2 & x_2 & 1 \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ x_n^{n-1} & x_n^{n-2} & \cdots & x_n^2 & x_n & 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_{n-1} \\ a_n \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$
Thus we see that the problem of Lagrange interpolation reduces to solving a system of linear
equations. If this system has a unique solution, then there is exactly one polynomial p(x) of degree
n − 1 running through our points. The matrix for this system of equations has a special form and
is called a Vandermonde matrix.
To decide whether the system of equations has a unique solution we need to determine whether
the Vandermonde matrix is invertible or not. One way to do this is to compute the determinant.
It turns out that the determinant of a Vandermonde matrix has a particularly simple form, but it’s
a little tricky to see this. The 2 × 2 case is simple enough:
$$\det\begin{pmatrix} x_1 & 1 \\ x_2 & 1 \end{pmatrix} = x_1 - x_2.$$
To go on to the 3 × 3 case we won’t simply expand the determinant, but recall that the determinant
is unchanged under row (and column) operations of the type ”add a multiple of one row (column)
to another.” Thus if we start with a 3 × 3 Vandermonde determinant, add −x1 times the second
column to the first, and then add −x1 times the third column to the second, the determinant
doesn’t change and we find that
$$\det\begin{pmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1 \end{pmatrix} = \det\begin{pmatrix} 0 & x_1 & 1 \\ x_2^2 - x_1 x_2 & x_2 & 1 \\ x_3^2 - x_1 x_3 & x_3 & 1 \end{pmatrix} = \det\begin{pmatrix} 0 & 0 & 1 \\ x_2^2 - x_1 x_2 & x_2 - x_1 & 1 \\ x_3^2 - x_1 x_3 & x_3 - x_1 & 1 \end{pmatrix}.$$
Now we can take advantage of the zeros in the first row, and calculate the determinant by expanding
along the top row. This gives
$$\det\begin{pmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1 \end{pmatrix} = \det\begin{pmatrix} x_2^2 - x_1 x_2 & x_2 - x_1 \\ x_3^2 - x_1 x_3 & x_3 - x_1 \end{pmatrix} = \det\begin{pmatrix} x_2(x_2 - x_1) & x_2 - x_1 \\ x_3(x_3 - x_1) & x_3 - x_1 \end{pmatrix}.$$
Now, we recall that the determinant is linear in each row separately. This implies that
$$\det\begin{pmatrix} x_2(x_2 - x_1) & x_2 - x_1 \\ x_3(x_3 - x_1) & x_3 - x_1 \end{pmatrix} = (x_2 - x_1)\det\begin{pmatrix} x_2 & 1 \\ x_3(x_3 - x_1) & x_3 - x_1 \end{pmatrix} = (x_2 - x_1)(x_3 - x_1)\det\begin{pmatrix} x_2 & 1 \\ x_3 & 1 \end{pmatrix}.$$
But the determinant on the right is a 2 × 2 Vandermonde determinant that we have already computed. Thus we end up with the formula
$$\det\begin{pmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ x_3^2 & x_3 & 1 \end{pmatrix} = -(x_2 - x_1)(x_3 - x_1)(x_3 - x_2).$$
The general formula is
$$\det\begin{pmatrix} x_1^{n-1} & x_1^{n-2} & \cdots & x_1^2 & x_1 & 1 \\ x_2^{n-1} & x_2^{n-2} & \cdots & x_2^2 & x_2 & 1 \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ x_n^{n-1} & x_n^{n-2} & \cdots & x_n^2 & x_n & 1 \end{pmatrix} = \pm \prod_{i>j} (x_i - x_j),$$
where ± = (−1)^{n(n−1)/2}. It can be proved by induction using the same strategy as we used for
the 3 × 3 case. The product on the right is the product of all differences xi − xj . This product is
non-zero, since we are assuming that all the points xi are distinct. Thus the Vandermonde matrix
is invertible, and a solution to the Lagrange interpolation problem always exists.
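A quick numerical spot-check of this formula (a sketch, not part of the notes):
>X=[2 5 7]; n=length(X);
>p=1;
>for i=1:n, for j=1:i-1, p=p*(X(i)-X(j)); end, end   % product of all differences xi - xj with i > j
>det(vander(X))          % gives -30
>(-1)^(n*(n-1)/2)*p      % the formula gives -30 as well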
Now let’s use MATLAB/Octave to see how this interpolation works in practice.
We begin by putting some points xi into a vector X and the corresponding points yi into a vector
Y.
>X=[0 0.2 0.4 0.6 0.8 1.0]
>Y=[1 1.1 1.3 0.8 0.4 1.0]
We can use the plot command in MATLAB/Octave to view these points. The command plot(X,Y)
will pop open a window and plot the points (xi , yi ) joined by straight lines. In this case we are
not interested in joining the points (at least not with straight lines) so we add a third argument:
'o' plots the points as little circles. (For more information you can type help plot on the MATLAB/Octave command line.) Thus we type
>plot(X,Y,'o')
>axis([-0.1, 1.1, 0, 1.5])
>hold on
The axis command adjusts the axis. Normally when you issue a new plot command, the existing
plot is erased. The hold on prevents this, so that subsequent plots are all drawn on the same
graph. The original behaviour is restored with hold off.
When you do this you should see a graph appear that looks something like this.
[Figure: the six data points plotted as circles.]
Now let’s compute the interpolation polynomial. Luckily there are built-in functions in MATLAB/Octave that make this very easy. To start with, the function vander(X) returns the Vandermonde matrix corresponding to the points in X. So we define
>V=vander(X)
V =
   0.00000   0.00000   0.00000   0.00000   0.00000   1.00000
   0.00032   0.00160   0.00800   0.04000   0.20000   1.00000
   0.01024   0.02560   0.06400   0.16000   0.40000   1.00000
   0.07776   0.12960   0.21600   0.36000   0.60000   1.00000
   0.32768   0.40960   0.51200   0.64000   0.80000   1.00000
   1.00000   1.00000   1.00000   1.00000   1.00000   1.00000
We saw above that the coefficients of the interpolation polynomial are given by the solution a to
the equation V a = y. We find those coefficients using
>a=V\Y'
Let’s have a look at the interpolating polynomial. The MATLAB/Octave function polyval(a,X)
takes a vector X of x values, say x1 , x2 , . . . xk and returns a vector containing the values p(x1 ), p(x2 ), . . . p(xk ),
where p is the polynomial whose coefficients are in the vector a, that is,
$$p(x) = a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n.$$
So plot(X,polyval(a,X)) would be the command we want, except that with the present definition
of X this would only plot the polynomial at the interpolation points. What we want is to plot the
polynomial for all points, or at least for a large number. The command linspace(0,1,100)
produces a vector of 100 linearly spaced points between 0 and 1, so the following commands do the
job.
>XL=linspace(0,1,100);
>YL=polyval(a,XL);
>plot(XL,YL);
>hold off
The result looks pretty good:
[Figure: the degree-5 interpolating polynomial passing smoothly through all six points.]
The MATLAB/Octave commands for this example are in lagrange.m.
Unfortunately, things get worse when we increase the number of interpolation points. One clue
that there might be trouble ahead is that even for only six points the condition number of V is
quite high (try it!). Let’s see what happens with 18 points. We will take the x values to be equally
spaced between 0 and 1. For the y values we will start off by taking yi = sin(2πxi ). We repeat the
steps above.
>X=linspace(0,1,18);
>Y=sin(2*pi*X);
>plot(X,Y,'o')
>axis([-0.1 1.1 -1.5 1.5])
>hold on
>V=vander(X);
>a=V\Y';
>XL=linspace(0,1,500);
>YL=polyval(a,XL);
>plot(XL,YL);
The resulting picture looks okay.
[Figure: the degree-17 interpolating polynomial through 18 equally spaced samples of sin(2πx); it follows the sine curve closely.]
But look what happens if we change one of the y values just a little. We add 0.02 to the fifth y
value, redo the Lagrange interpolation and plot the new values in red.
>Y(5) = Y(5)+0.02;
>plot(X(5),Y(5),'or')
>a=V\Y';
>YL=polyval(a,XL);
>plot(XL,YL,'r');
>hold off
The resulting graph makes a wild excursion and even though it goes through the given points, it
would not be a satisfactory interpolating function in a practical situation.
[Figure: the perturbed Lagrange interpolant (red) oscillates wildly between the data points, even though it still passes through all of them.]
A calculation reveals that the condition number is
>cond(V)
ans =
1.8822e+14
If we try to go to 20 points equally spaced between 0 and 1, the Vandermonde matrix is so ill
conditioned that MATLAB/Octave considers it to be singular.
I.2.3. Cubic splines
In the last section we saw that Lagrange interpolation becomes impossible to use in practice if
the number of points becomes large. Of course, the constraint we imposed, namely that the
interpolating function be a polynomial of low degree, does not have any practical basis. It is simply
mathematically convenient. Let’s start again and consider how ship and airplane designers actually
drew complicated curves before the days of computers. Here is a picture of a draughtsman’s spline
(taken from http://pages.cs.wisc.edu/~deboor/draftspline.html where you can also find a
nice photo of such a spline in use)
It consists of a bendable but stiff strip held in position by a series of weights called ducks. We
will try to make a mathematical model of such a device.
We begin again with points (x1 , y1 ), (x2 , y2 ), . . . (xn , yn ) in the plane. Again we are looking for
a function f (x) that goes through all these points. This time, we want to find the function that
has the same shape as a real draughtsman’s spline. We will imagine that the given points are the
locations of the ducks.
Our first task is to identify a large class of functions that represent possible shapes for the spline.
We will write down three conditions for a function f (x) to be acceptable. Since the spline has no
breaks in it the function f (x) should be continuous. Moreover f (x) should pass through the given
points.
Condition 1: f (x) is continuous and f (xi ) = yi for i = 1, . . . , n.
The next condition reflects the assumption that the strip is stiff but bendable. If the strip were not
stiff, say it were actually a rubber band that just is stretched between the ducks, then our resulting
function would be a straight line between each duck location (xi , yi ). At each duck location there
would be a sharp bend in the function. In other words, even though the function itself would be
continuous, the first derivative would be discontinuous at the duck locations. We will interpret the
words “bendable but stiff” to mean that the first derivatives of f (x) exist. This leads to our second
condition.
Condition 2: The first derivative f'(x) exists and is continuous everywhere, including each interior duck location xi.
In between the duck locations we will assume that f (x) is perfectly smooth and that higher
derivatives behave nicely when we approach the duck locations from the right or the left. This
leads to
Condition 3: For x in between the duck points xi the higher order derivatives f''(x), f'''(x), . . . all exist and have left and right limits as x approaches each xi.
In this condition we are allowing for the possibility that f''(x) and higher order derivatives have
a jump at the duck locations. This happens if the left and right limits are different.
The set of functions satisfying conditions 1, 2 and 3 are all the possible shapes of the spline. How
do we decide which one of these shapes is the actual shape of the spline? To do this we need to
invoke a bit of the physics of bendable strips. The bending energy E[f ] of a strip whose shape is
described by the function f is given by the integral
$$E[f] = \int_{x_1}^{x_n} \big(f''(x)\big)^2\, dx.$$
The actual spline will relax into the shape that makes E[f] as small as possible. Thus, among all the functions satisfying conditions 1, 2 and 3, we want to choose the one that minimizes E[f].
This minimization problem is similar to ones considered in calculus courses, except that instead of real numbers, the variables in this problem are functions f satisfying conditions 1, 2 and 3. In
calculus, the minimum is calculated by “setting the derivative to zero.” A similar procedure is
described in the next section. Here is the result of that calculation: Let F (x) be the function
describing the shape that makes E[f ] as small as possible. In other words,
• F(x) satisfies conditions 1, 2 and 3.
• If f(x) also satisfies conditions 1, 2 and 3, then E[F] ≤ E[f].
Then, in addition to conditions 1, 2 and 3, F (x) satisfies
Condition a: In each interval (xi, xi+1), the function F(x) is a cubic polynomial. In other words, for each interval there are coefficients Ai, Bi, Ci and Di such that F(x) = Ai x³ + Bi x² + Ci x + Di for all x between xi and xi+1. The coefficients can be different for different intervals.
Condition b: The second derivative F''(x) is continuous.
Condition c: When x is an endpoint (either x1 or xn) then F''(x) = 0.
As we will see, there is exactly one function satisfying conditions 1, 2, 3, a, b and c.
I.2.4. The minimization procedure
In this section we explain the minimization procedure leading to a mathematical description of the
shape of a spline. In other words, we show that if among all functions f (x) satisfying conditions 1,
2 and 3, the function F (x) is the one with E[f ] the smallest, then F (x) also satisfies conditions a,
b and c.
The idea is to assume that we have found F (x) and then try to deduce what properties it must
satisfy. There is actually a hidden assumption here — we are assuming that the minimizer F(x) exists. This is not true for every minimization problem (think of minimizing the function (x² + 1)⁻¹ for −∞ < x < ∞). However the spline problem does have a minimizer, and we will leave out the step of proving it exists.
Given the minimizer F(x) we want to wiggle it a little and consider functions of the form F(x) + εh(x), where h(x) is another function and ε is a number. We want to do this in such a way that for every ε, the function F(x) + εh(x) still satisfies conditions 1, 2 and 3. Then we will be able to compare E[F] with E[F + εh]. A little thought shows that functions of the form F(x) + εh(x) will satisfy conditions 1, 2 and 3 for every value of ε if h satisfies
Condition 1’: h(xi ) = 0 for i = 1, . . . , n.
together with conditions 2 and 3 above.
Now, the minimization property of F says that for each fixed function h satisfying 1', 2 and 3, the function of ε given by E[F + εh] has a local minimum at ε = 0. From calculus we know that this implies that
$$\frac{d}{d\varepsilon} E[F + \varepsilon h]\bigg|_{\varepsilon=0} = 0. \qquad \text{(I.1)}$$
Now we will actually compute this derivative with respect to ε and see what information we can
get from the fact that it is zero for every choice of h(x) satisfying conditions 1’, 2 and 3. To simplify
the presentation we will assume that there are only three points (x1 , y1 ), (x2 , y2 ) and (x3 , y3 ). The
goal of this computation is to establish that equation (I.1) can be rewritten as (I.2).
To begin, we compute
$$\begin{aligned} 0 = \frac{dE[F+\varepsilon h]}{d\varepsilon}\bigg|_{\varepsilon=0} &= \int_{x_1}^{x_3} \frac{d}{d\varepsilon}\big(F''(x)+\varepsilon h''(x)\big)^2\bigg|_{\varepsilon=0}\, dx \\ &= \int_{x_1}^{x_3} 2\big(F''(x)+\varepsilon h''(x)\big)h''(x)\Big|_{\varepsilon=0}\, dx \\ &= 2\int_{x_1}^{x_3} F''(x)h''(x)\,dx \\ &= 2\int_{x_1}^{x_2} F''(x)h''(x)\,dx + 2\int_{x_2}^{x_3} F''(x)h''(x)\,dx. \end{aligned}$$
We divide by 2 and integrate by parts in each integral. This gives
$$0 = F''(x)h'(x)\Big|_{x=x_1}^{x=x_2} - \int_{x_1}^{x_2} F'''(x)h'(x)\,dx + F''(x)h'(x)\Big|_{x=x_2}^{x=x_3} - \int_{x_2}^{x_3} F'''(x)h'(x)\,dx.$$
In each boundary term we have to take into account the possibility that F''(x) is not continuous across the points xi. Thus we have to use the appropriate limit from the left or the right. So, for the first boundary term
$$F''(x)h'(x)\Big|_{x=x_1}^{x=x_2} = F''(x_2-)h'(x_2) - F''(x_1+)h'(x_1).$$
Notice that since h'(x) is continuous across each xi we need not distinguish the limits from the left and the right. Expanding and combining the boundary terms we get
$$0 = -F''(x_1+)h'(x_1) + \big(F''(x_2-) - F''(x_2+)\big)h'(x_2) + F''(x_3-)h'(x_3) - \int_{x_1}^{x_2} F'''(x)h'(x)\,dx - \int_{x_2}^{x_3} F'''(x)h'(x)\,dx.$$
Now we integrate by parts again. This time the boundary terms all vanish because h(xi ) = 0 for
every i. Thus we end up with the equation
$$0 = -F''(x_1+)h'(x_1) + \big(F''(x_2-) - F''(x_2+)\big)h'(x_2) + F''(x_3-)h'(x_3) + \int_{x_1}^{x_2} F''''(x)h(x)\,dx + \int_{x_2}^{x_3} F''''(x)h(x)\,dx \qquad \text{(I.2)}$$
as desired.
Recall that this equation has to be true for every choice of h satisfying conditions 1’, 2 and 3. For
different choices of h(x) we can extract different pieces of information about the minimizer F (x).
To start, we can choose h that is zero everywhere except in the open interval (x1, x2). For all such h we then obtain $0 = \int_{x_1}^{x_2} F''''(x)h(x)\,dx$. This can only happen if
$$F''''(x) = 0 \qquad \text{for } x_1 < x < x_2.$$
Thus we conclude that the fourth derivative F''''(x) is zero in the interval (x1, x2).
Once we know that F''''(x) = 0 in the interval (x1, x2), then by integrating both sides we can conclude that F'''(x) is constant. Integrating again, we find F''(x) is a linear polynomial. By integrating four times, we see that F(x) is a cubic polynomial in that interval. When doing the integrals, we must not extend the domain of integration over the boundary point x2, since F''''(x) may not exist (let alone be zero) there.
Similarly F''''(x) must also vanish in the interval (x2, x3), so F(x) is a (possibly different) cubic polynomial in the interval (x2, x3).
(An aside: to understand better why the polynomials might be different in the intervals (x1, x2) and (x2, x3), consider the function g(x) (unrelated to the spline problem) given by
$$g(x) = \begin{cases} 0 & \text{for } x_1 < x < x_2 \\ 1 & \text{for } x_2 < x < x_3 \end{cases}$$
Then g'(x) = 0 in each interval, and an integration tells us that g is constant in each interval. However, g'(x2) does not exist, and the constants are different.)
We have established that F (x) satisfies condition a.
Now that we know that F''''(x) vanishes in each interval, we can return to (I.2) and write it as
$$0 = -F''(x_1+)h'(x_1) + \big(F''(x_2-) - F''(x_2+)\big)h'(x_2) + F''(x_3-)h'(x_3).$$
Now choose h(x) with h'(x1) = 1 and h'(x2) = h'(x3) = 0. Then the equation reads
$$F''(x_1+) = 0.$$
Similarly, choosing h(x) with h'(x3) = 1 and h'(x1) = h'(x2) = 0 we obtain
$$F''(x_3-) = 0.$$
This establishes condition c. Finally, choosing h(x) with h'(x2) = 1 and h'(x1) = h'(x3) = 0 we obtain
$$F''(x_2-) - F''(x_2+) = 0.$$
In other words, F'' must be continuous across the interior duck position. This shows that condition b holds, and the derivation is complete.
This calculation is easily generalized to the case where there are n duck positions x1 , . . . , xn .
A reference for this material is Essentials of numerical analysis, with pocket calculator demonstrations, by Henrici.
I.2.5. The linear equations for cubic splines
Let us now turn this description into a system of linear equations. In each interval (xi , xi+1 ), for
i = 1, . . . n − 1, f (x) is given by a cubic polynomial pi (x) which we can write in the form
$$p_i(x) = a_i(x - x_i)^3 + b_i(x - x_i)^2 + c_i(x - x_i) + d_i$$
for coefficients ai, bi, ci and di to be determined. For each i = 1, . . . , n − 1 we require that pi(xi) = yi and pi(xi+1) = yi+1. Since pi(xi) = di, the first of these equations is satisfied if di = yi. So let’s simply make that substitution. This leaves the n − 1 equations
$$p_i(x_{i+1}) = a_i(x_{i+1} - x_i)^3 + b_i(x_{i+1} - x_i)^2 + c_i(x_{i+1} - x_i) + y_i = y_{i+1}.$$
Secondly, we require continuity of the first derivative across interior xi's. This translates to p'_i(xi+1) = p'_{i+1}(xi+1), or
$$3a_i(x_{i+1} - x_i)^2 + 2b_i(x_{i+1} - x_i) + c_i = c_{i+1}$$
for i = 1, . . . , n − 2, giving an additional n − 2 equations. Next, we require continuity of the second derivative across interior xi's. This translates to p''_i(xi+1) = p''_{i+1}(xi+1), or
$$6a_i(x_{i+1} - x_i) + 2b_i = 2b_{i+1}$$
for i = 1, . . . , n − 2, once more giving an additional n − 2 equations. Finally, we require that p''_1(x1) = p''_{n−1}(xn) = 0. This yields two more equations
$$2b_1 = 0 \qquad\qquad 6a_{n-1}(x_n - x_{n-1}) + 2b_{n-1} = 0,$$
for a total of 3(n − 1) equations for the same number of variables.
We now specialize to the case where the distances between the points xi are equal. Let L = xi+1 − xi be the common distance. Then the equations read
$$\begin{aligned} a_i L^3 + b_i L^2 + c_i L &= y_{i+1} - y_i \\ 3a_i L^2 + 2b_i L + c_i - c_{i+1} &= 0 \\ 6a_i L + 2b_i - 2b_{i+1} &= 0 \end{aligned}$$
for i = 1, . . . , n − 2, together with
$$\begin{aligned} a_{n-1} L^3 + b_{n-1} L^2 + c_{n-1} L &= y_n - y_{n-1} \\ 2b_1 &= 0 \\ 6a_{n-1} L + 2b_{n-1} &= 0. \end{aligned}$$
We make one more simplification. After multiplying some of the equations with suitable powers of L we can write these as equations for αi = ai L³, βi = bi L² and γi = ci L. They have a very simple block structure. For example, when n = 4 the matrix form of the equations is
$$\begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 3 & 2 & 1 & 0 & 0 & -1 & 0 & 0 & 0 \\ 6 & 2 & 0 & 0 & -2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 3 & 2 & 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 6 & 2 & 0 & 0 & -2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 6 & 2 & 0 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \beta_1 \\ \gamma_1 \\ \alpha_2 \\ \beta_2 \\ \gamma_2 \\ \alpha_3 \\ \beta_3 \\ \gamma_3 \end{pmatrix} = \begin{pmatrix} y_2 - y_1 \\ 0 \\ 0 \\ y_3 - y_2 \\ 0 \\ 0 \\ y_4 - y_3 \\ 0 \\ 0 \end{pmatrix}$$
Notice that the matrix in this equation does not depend on the points (xi , yi ). It has a 3 × 3 block
structure. If we define the 3 × 3 blocks
$$N = \begin{pmatrix} 1 & 1 & 1 \\ 3 & 2 & 1 \\ 6 & 2 & 0 \end{pmatrix} \quad M = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -2 & 0 \end{pmatrix} \quad 0 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad T = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad V = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 6 & 2 & 0 \end{pmatrix}$$
then the matrix in our equation has the form
$$S = \begin{pmatrix} N & M & 0 \\ 0 & N & M \\ T & 0 & V \end{pmatrix}.$$
Once we have solved the equation for the coefficients αi, βi and γi, the function F(x) in the interval (xi, xi+1) is given by
$$F(x) = p_i(x) = \alpha_i\Big(\frac{x - x_i}{L}\Big)^3 + \beta_i\Big(\frac{x - x_i}{L}\Big)^2 + \gamma_i\Big(\frac{x - x_i}{L}\Big) + y_i.$$
Now let us use MATLAB/Octave to plot a cubic spline. To start, we will do an example with
four interpolation points. The matrix S in the equation is defined by
>N=[1 1 1;3 2 1;6 2 0];
>M=[0 0 0;0 0 -1; 0 -2 0];
>Z=zeros(3,3);
>T=[0 0 0;0 2 0; 0 0 0];
>V=[1 1 1;0 0 0;6 2 0];
>S=[N M Z; Z N M; T Z V]
S =
   1   1   1   0   0   0   0   0   0
   3   2   1   0   0  -1   0   0   0
   6   2   0   0  -2   0   0   0   0
   0   0   0   1   1   1   0   0   0
   0   0   0   3   2   1   0   0  -1
   0   0   0   6   2   0   0  -2   0
   0   0   0   0   0   0   1   1   1
   0   2   0   0   0   0   0   0   0
   0   0   0   0   0   0   6   2   0
Here we used the function zeros(n,m) which defines an n × m matrix filled with zeros.
To proceed we have to know what points we are trying to interpolate. We pick four (x, y) values
and put them in vectors. Remember that we are assuming that the x values are equally spaced.
>X=[1, 1.5, 2, 2.5];
>Y=[0.5, 0.8, 0.2, 0.4];
We plot these points on a graph.
>plot(X,Y,'o')
>hold on
Now let’s define the right side of the equation
>b=[Y(2)-Y(1),0,0,Y(3)-Y(2),0,0,Y(4)-Y(3),0,0];
and solve the equation for the coefficients.
>a=S\b';
Now let’s plot the interpolating function in the first interval. We will use 50 closely spaced points
to get a smooth looking curve.
>XL = linspace(X(1),X(2),50);
Put the first set of coefficients (α1 , β1 , γ1 , y1 ) into a vector
>p = [a(1) a(2) a(3) Y(1)];
Now we put the values p1 (x) into the vector YL. First we define the values (x − x1 )/L and put
them in the vector XLL. To get the values x − x1 we want to subtract the vector with X(1) in
every position from X. The vector with X(1) in every position can be obtained by taking a vector
with 1 in every position (in MATLAB/Octave this is obtained using the function ones(n,m)) and
multiplying by the number X(1). Then we divide by the (constant) spacing between the xi values.
>L = X(2)-X(1);
>XLL = (XL - X(1)*ones(1,50))/L;
Now we evaluate the polynomial p1 (x) and plot the resulting points.
>YL = polyval(p,XLL);
>plot(XL,YL);
To complete the plot, we repeat these steps for the intervals (x2, x3) and (x3, x4).
>XL = linspace(X(2),X(3),50);
>p = [a(4) a(5) a(6) Y(2)];
>XLL = (XL - X(2)*ones(1,50))/L;
>YL = polyval(p,XLL);
>plot(XL,YL);
>XL = linspace(X(3),X(4),50);
>p = [a(7) a(8) a(9) Y(3)];
>XLL = (XL - X(3)*ones(1,50))/L;
>YL = polyval(p,XLL);
>plot(XL,YL);
The result looks like this:
[Figure: the cubic spline through the four points (1, 0.5), (1.5, 0.8), (2, 0.2), (2.5, 0.4).]
I have automated the procedure above and put the result in two files splinemat.m and plotspline.m.
splinemat(n) returns the 3(n − 1) × 3(n − 1) matrix used to compute a spline through n points
while plotspline(X,Y) plots the cubic spline going through the points in X and Y. If you put these
files in your MATLAB/Octave directory you can use them like this:
>splinemat(3)
ans =
   1   1   1   0   0   0
   3   2   1   0   0  -1
   6   2   0   0  -2   0
   0   0   0   1   1   1
   0   2   0   0   0   0
   0   0   0   6   2   0
and
>X=[1, 1.5, 2, 2.5];
>Y=[0.5, 0.8, 0.2, 0.4];
>plotspline(X,Y)
to produce the plot above.
Let’s use these functions to compare the cubic spline interpolation with the Lagrange interpolation
by using the same points as we did before. Remember that we started with the points
>X=linspace(0,1,18);
>Y=sin(2*pi*X);
Let’s plot the spline interpolation of these points
>plotspline(X,Y);
Here is the result with the Lagrange interpolation added (in red). The red (Lagrange) curve covers
the blue one and it’s impossible to tell the curves apart.
[Figure: the spline (blue) and the Lagrange interpolant (red) through 18 samples of sin(2πx); the two curves coincide.]
Now we move one of the points slightly, as before.
>Y(5) = Y(5)+0.02;
Again, plotting the spline in blue and the Lagrange interpolation in red, here are the results.
[Figure: after perturbing the fifth data point, the spline (blue) stays close to the sine shape while the Lagrange interpolant (red) oscillates wildly.]
This time the spline does a much better job! Let’s check the condition number of the matrix for
the splines. Recall that there are 18 points.
>cond(splinemat(18))
ans =
32.707
Recall the Vandermonde matrix had a condition number of 1.8822e+14. This shows that the
system of equations for the splines is very much better conditioned, by 13 orders of magnitude!!
Code for splinemat.m and plotspline.m
function S=splinemat(n)
% Build the 3(n-1) x 3(n-1) block matrix S for the cubic spline equations.
L=[1 1 1;3 2 1;6 2 0];
M=[0 0 0;0 0 -1; 0 -2 0];
Z=zeros(3,3);
T=[0 0 0;0 2 0; 0 0 0];
V=[1 1 1;0 0 0;6 2 0];
S=zeros(3*(n-1),3*(n-1));
for k=[1:n-2]
  for l=[1:k-1]
    S(3*k-2:3*k,3*l-2:3*l) = Z;
  end
  S(3*k-2:3*k,3*k-2:3*k) = L;
  S(3*k-2:3*k,3*k+1:3*k+3) = M;
  for l=[k+2:n-1]
    S(3*k-2:3*k,3*l-2:3*l) = Z;
  end
end
S(3*(n-1)-2:3*(n-1),1:3)=T;
for l=[2:n-2]
  S(3*(n-1)-2:3*(n-1),3*l-2:3*l) = Z;
end
S(3*(n-1)-2:3*(n-1),3*(n-1)-2:3*(n-1))=V;
end
function plotspline(X,Y)
% Plot the cubic spline through the equally spaced points in X with values Y.
n=length(X);
L=X(2)-X(1);
S=splinemat(n);
b=zeros(1,3*(n-1));
for k=[1:n-1]
  b(3*k-2)=Y(k+1)-Y(k);
  b(3*k-1)=0;
  b(3*k)=0;
end
a=S\b';
npoints=50;
XL=[];
YL=[];
for k=[1:n-1]
  XL = [XL linspace(X(k),X(k+1),npoints)];
  p = [a(3*k-2),a(3*k-1),a(3*k),Y(k)];
  XLL = (linspace(X(k),X(k+1),npoints) - X(k)*ones(1,npoints))/L;
  YL = [YL polyval(p,XLL)];
end
plot(X,Y,'o')
hold on
plot(XL,YL)
hold off
I.2.6. Summary of MATLAB/Octave commands used in this section
How to access elements of a vector
a(i) returns the i-th element of the vector a
How to create a vector with linearly spaced elements
linspace(x1,x2,n) generates n points between the values x1 and x2.
How to create a matrix by concatenating other matrices
C= [A B] takes two matrices A and B and creates a new matrix C by concatenating A and B
horizontally
Other specialized matrix functions
zeros(n,m) creates an n-by-m matrix filled with zeros
ones(n,m) creates an n-by-m matrix filled with ones
vander(X) creates the Vandermonde matrix corresponding to the points in the vector X. Note
that the columns of the Vandermonde matrix are powers of the vector X.
Other useful functions and commands
polyval(a,X) takes a vector X of x values and returns a vector containing the values of a polynomial p evaluated at the x values. The coefficients of the polynomial p (in descending powers)
are the values in the vector a.
sin(X) takes a vector X of values x and returns a vector containing the values of the function sin x
plot(X,Y) plots vector Y versus vector X. Points are joined by a solid line. To change line types
(solid, dashed, dotted, etc.) or plot symbols (point, circle, star, etc.), include an additional
argument. For example, plot(X,Y,'o') plots the points as little circles.
I.3. Finite difference approximations
Prerequisites and Learning Goals
From your work in previous courses, you should be able to
• explain what is meant by a boundary value problem.
After completing this section, you should be able to
• compute an approximate solution to a second order linear boundary value problem by deriving the corresponding finite difference equation and computing its solution with MATLAB/Octave; model different types of boundary conditions, including conditions on the value
of the solution at the boundary and on the value of the first derivative at the boundary.
I.3.1. Introduction and example
One of the most important applications of linear algebra is the approximate solution of differential
equations. In a differential equation we are trying to solve for an unknown function. The basic idea
is to turn a differential equation into a system of N × N linear equations. As N becomes large,
the vector solving the system of linear equations becomes a better and better approximation to the
function solving the differential equation.
In this section we will learn how to use linear algebra to find approximate solutions to a boundary
value problem of the form
f 00 (x) + q(x)f (x) = r(x)
for
0≤x≤1
subject to boundary conditions
f (0) = A,
f (1) = B.
This is a differential equation where the unknown quantity to be found is a function f (x). The
functions q(x) and r(x) are given (known) functions.
As differential equations go, this is a very simple one. For one thing it is an ordinary differential
equation (ODE), because it only involves one independent variable x. But the finite difference
methods we will introduce can also be applied to partial differential equations (PDE).
It can be useful to have a picture in your head when thinking about an equation. Here is a
situation where an equation like the one we are studying arises. Suppose we want to find the shape
of a stretched hanging cable. The cable is suspended above the points x = 0 and x = 1 at heights
of A and B respectively and hangs above the interval 0 ≤ x ≤ 1. Our goal is to find the height
f (x) of the cable above the ground at every point x between 0 and 1.
[Figure: the cable hangs over the interval 0 ≤ x ≤ 1, suspended at height A above x = 0 and height B above x = 1.]
The loading of the cable is described by a function 2r(x) that takes into account both the weight
of the cable and any additional load. Assume that this is a known function. The height function
f (x) is the function that minimizes the sum of the stretching energy and the gravitational potential
energy given by
Z
1
E[f ] =
[(f 0 (x))2 + 2r(x)f (x)]dx
0
subject to the condition that f (0) = A and f (1) = B. An argument similar (but easier) to the one
we did for splines shows that the minimizer satisfies the differential equation
f 00 (x) = r(x).
So we end up with the special case of our original equation where q(x) = 0. Actually, this special
case can be solved by simply integrating twice and adjusting the constants of integration to ensure
f (0) = A and f (1) = B. For example, when r(x) = r is constant and A = B = 1, the solution
is f(x) = 1 − rx/2 + rx²/2. We can use this exact solution to compare against the approximate
solution that we will compute.
I.3.2. Discretization
In the finite difference approach to solving differential equations approximately, we want to approximate a function by a vector containing a finite number of sample values. Pick equally spaced
points xk = k/N , k = 0, . . . , N between 0 and 1. We will represent a function f (x) by its values
fk = f(xk) at these points. Let

F = [f0, f1, . . . , fN]^T.

[Figure: the graph of a function f(x) together with its sample values f0, . . . , f8 at the points x0, . . . , x8.]
At this point we throw away all the other information about the function, keeping only the values
at the sampled points.
[Figure: only the sample values f0, . . . , f8 at the points x0, . . . , x8 remain.]
If this is all we have to work with, what should we use as an approximation to f'(x)? It seems reasonable to use the slopes of the line segments joining our sampled points.
[Figure: the sample points joined by straight line segments; the slopes of these segments approximate f'(x).]
Notice, though, that there is one slope for every interval (xi , xi+1 ) so the vector containing the
slopes has one fewer entry than the vector F. The formula for the slope in the interval (xi , xi+1 )
is (fi+1 − fi )/∆x where the distance ∆x = xi+1 − xi (in this case ∆x = 1/N ). Thus the vector
containing the slopes is

F' = (∆x)⁻¹ [f1 − f0, f2 − f1, f3 − f2, . . . , fN − fN−1]^T = (∆x)⁻¹ DN F

where DN is the N × (N + 1) finite difference matrix

     [ −1   1   0   0  · · ·   0   0 ]
     [  0  −1   1   0  · · ·   0   0 ]
DN = [  0   0  −1   1  · · ·   0   0 ]
     [  ⋮                ⋱    ⋱    ⋮ ]
     [  0   0   0   0  · · ·  −1   1 ]

The vector F' is our approximation to the first derivative function f'(x).
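For concreteness, here is one way to play with DN in MATLAB/Octave. This is only a quick sketch using the built-in diff command and an assumed test function f(x) = x³; it is an aside, not the construction used later in this section.

>N=8;
>DN=diff(eye(N+1));        % N x (N+1) matrix whose rows look like [... -1 1 ...]
>X=linspace(0,1,N+1)';
>F=X.^3;                   % sample values of the test function f(x) = x^3
>Fp=N*DN*F;                % approximate f'(x); here dx = 1/N, so (dx)^(-1) = N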
To approximate the second derivative f''(x), we repeat this process to define the vector F''. There will be one entry in this vector for each adjacent pair of slopes, that is, each adjacent pair of entries of F'. These are naturally labelled by the interior points x1, x2, . . . , xN−1. Thus we obtain

F'' = (∆x)⁻² DN−1 DN F, where the (N − 1) × (N + 1) matrix DN−1 DN is

[ 1  −2   1   0  · · ·  0   0   0 ]
[ 0   1  −2   1  · · ·  0   0   0 ]
[ 0   0   1  −2  · · ·  0   0   0 ]
[ ⋮               ⋱    ⋱   ⋱    ⋮ ]
[ 0   0   0   0  · · ·  1  −2   1 ]
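Continuing the earlier sketch, the product DN−1 DN can be produced in one step with second differences; again this is just an illustrative aside.

>D2=diff(eye(N+1),2);      % (N-1) x (N+1) matrix whose rows look like [... 1 -2 1 ...]
>Fpp=N^2*D2*F;             % approximate f''(x) at the interior points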
Let rk = r(xk) be the sampled values of the load function r(x) and define the vector approximation for r at the interior points

r = [r1, r2, . . . , rN−1]^T.
The reason we only define this vector at the interior points is that those are the only points where F'' is defined. Now we can write down the finite difference approximation to f''(x) = r(x) as

(∆x)⁻² DN−1 DN F = r

or

DN−1 DN F = (∆x)² r.
This is a system of N − 1 equations in N + 1 unknowns. To get a unique solution, we need two more
equations. That is where the boundary conditions come in! We have two boundary conditions,
which in this case can simply be written as f0 = A and fN = B. Combining these with the N − 1
equations for the interior points, we may rewrite the system of equations as

[ 1   0   0   0  · · ·  0   0   0 ]       [     A      ]
[ 1  −2   1   0  · · ·  0   0   0 ]       [ (∆x)² r1   ]
[ 0   1  −2   1  · · ·  0   0   0 ]       [ (∆x)² r2   ]
[ 0   0   1  −2  · · ·  0   0   0 ]  F =  [ (∆x)² r3   ]
[ ⋮               ⋱    ⋱   ⋱    ⋮ ]       [     ⋮      ]
[ 0   0   0   0  · · ·  1  −2   1 ]       [ (∆x)² rN−1 ]
[ 0   0   0   0  · · ·  0   0   1 ]       [     B      ]
Note that it is possible to incorporate other types of boundary conditions by simply changing the
first and last equations.
Let’s define L to be the (N + 1) × (N + 1) matrix of coefficients for this equation, so that the
equation has the form
LF = b.
The first thing to do is to verify that L is invertible, so that we know that there is a unique
solution to the equation. It is not too difficult to compute the determinant if you recall that the
elementary row operations that add a multiple of one row to another do not change the value
of the determinant. Using only this type of elementary row operation, we can reduce L to an
upper triangular matrix whose diagonal entries are 1, −2, −3/2, −4/3, −5/4, . . . , −N/(N − 1), 1.
The determinant is the product of these entries, and this equals ±N . Since this value is not zero,
the matrix L is invertible.
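As a quick numerical check (an aside, using the diag-based construction of L that appears below), we can compute the determinant for a small value of N; the formula above predicts ±N, and for N = 5 the sign works out to +5.

>N=5;
>L=diag(-2*ones(1,N+1)) + diag(ones(1,N),1) + diag(ones(1,N),-1);
>L(1,1)=1; L(1,2)=0; L(N+1,N+1)=1; L(N+1,N)=0;
>det(L)                     % returns 5, up to rounding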
It is worthwhile pointing out that a change in boundary conditions (for example, prescribing the
values of the derivative f 0 (0) and f 0 (1) rather than f (0) and f (1)) results in a different matrix L
that may fail to be invertible.
We should also ask about the condition number of L to determine how large the relative error of
the solution can be. We will compute this using MATLAB/Octave below.
Now let’s use MATLAB/Octave to solve this equation. We will start with the test case where
r(x) = 1 and A = B = 1. In this case we know that the exact solution is f (x) = 1 − x/2 + x2 /2.
We will work with N = 50. Notice that, except for the first and last rows, L has a constant value
of −2 on the diagonal, and a constant value of 1 on the off-diagonals immediately above and below.
Before proceeding, we introduce the MATLAB/Octave command diag. For any vector D, diag(D)
is a diagonal matrix with the entries of D on the diagonal. So for example
>D=[1 2 3 4 5];
>diag(D)
ans =

   1   0   0   0   0
   0   2   0   0   0
   0   0   3   0   0
   0   0   0   4   0
   0   0   0   0   5
An optional second argument offsets the diagonal. So, for example
>D=[1 2 3 4];
>diag(D,1)
ans =

   0   1   0   0   0
   0   0   2   0   0
   0   0   0   3   0
   0   0   0   0   4
   0   0   0   0   0
>diag(D,-1)
ans =

   0   0   0   0   0
   1   0   0   0   0
   0   2   0   0   0
   0   0   3   0   0
   0   0   0   4   0
Now returning to our matrix L we can define it as
>N=50;
>L=diag(-2*ones(1,N+1)) + diag(ones(1,N),1) + diag(ones(1,N),-1);
>L(1,1) = 1;
>L(1,2) = 0;
>L(N+1,N+1) = 1;
>L(N+1,N) = 0;
The condition number of L for N = 50 is
>cond(L)
ans = 1012.7
We will denote the right side of the equation by b. To start, we will define b to be (∆x)2 r(xi )
and then adjust the first and last entries to account for the boundary values. Recall that r(x) is
the constant function 1, so its sampled values are all 1 too.
>dx = 1/N;
>b=ones(N+1,1)*dx^2;
>A=1; B=1;
>b(1) = A;
>b(N+1) = B;
Now we solve the equation for F.
>F=L\b;
The x values are N + 1 equally spaced points between 0 and 1,
>X=linspace(0,1,N+1);
Now we plot the result.
>plot(X,F)
[Plot of the approximate solution F against X.]
Let’s superimpose the exact solution in red.
>hold on
>plot(X,ones(1,N+1)-X/2+X.^2/2,’r’)
(The . before an operator tells MATLAB/Octave to apply that operator element by element, so
X.^2 returns an array with each element the corresponding element of X squared.)
[Plot of the approximate solution together with the exact solution in red.]
The two curves are indistinguishable.
What happens if we increase the load at a single point? Recall that we have set the loading
function r(x) to be 1 everywhere. Let’s increase it at just one point. Adding, say, 5 to one of the
values of r is the same as adding 5(∆x)2 to the right side b. So the following commands do the
job. We are changing b11 which corresponds to changing r(x) at x = 0.2.
>b(11) = b(11) + 5*dx^2;
>F=L\b;
>hold on
>plot(X,F);
Before looking at the plot, let’s do this one more time, this time making the cable really heavy at
the same point.
>b(11) = b(11) + 50*dx^2;
>F=L\b;
>hold on
>plot(X,F);
Here is the resulting plot.
[Plot of the original solution and the two solutions computed with extra load at x = 0.2.]
So far we have only considered the case of our equation f''(x) + q(x)f(x) = r(x) where q(x) = 0.
What happens when we add the term containing q? We must sample the function q(x) at the
interior points and add the corresponding vector. Since we multiplied the equations for the interior
points by (∆x)² we must do the same to these terms. Thus we must add the term

      [    0      ]         [ 0   0   0   0  · · ·    0    0 ]
      [  q1 f1    ]         [ 0  q1   0   0  · · ·    0    0 ]
      [  q2 f2    ]         [ 0   0  q2   0  · · ·    0    0 ]
(∆x)² [  q3 f3    ] = (∆x)² [ 0   0   0  q3  · · ·    0    0 ] F.
      [    ⋮      ]         [ ⋮               ⋱            ⋮ ]
      [ qN−1 fN−1 ]         [ 0   0   0   0  · · ·  qN−1   0 ]
      [    0      ]         [ 0   0   0   0  · · ·    0    0 ]
In other words, we replace the matrix L in our equation with L + (∆x)2 Q where Q is the (N +
1) × (N + 1) diagonal matrix with the interior sampled points of q(x) on the diagonal.
I’ll leave it to a homework problem to incorporate this change in a MATLAB/Octave calculation.
One word of caution: the matrix L by itself is always invertible (with reasonable condition number).
However L + (∆x)2 Q may fail to be invertible. This reflects the fact that the original differential
equation may fail to have a solution for some choices of q(x) and r(x).
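As a sketch of how the matrix Q might be assembled in MATLAB/Octave (the choice q(x) = x is just an assumed example, not part of the problem above):

>q=@(x) x;                     % an assumed example of q(x)
>Q=diag([0, q(X(2:N)), 0]);    % interior samples of q on the diagonal, zeros in the boundary rows
>M=L + dx^2*Q;                 % the matrix that replaces L in the equation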
Let us briefly discuss what changes need to be made if we replace the boundary condition f(0) = A with a condition on the first derivative of the form f'(0) = α.
Here is a summary of what we have done to model the condition f(0) = A. When we represent f(x) by a vector F = [f0, f1, . . . , fN]^T the boundary condition is represented by the equation f0 = A. This equation corresponds to the first row in the matrix equation LF + (∆x)²QF = b, since the first row of L picks out f0, the first row of Q is zero and the first entry of b is A.
A reasonable way to model the boundary condition f'(0) = α is to set the first entry in the vector F' equal to α. In other words we want (f1 − f0)/(∆x) = α, or (f1 − f0) = (∆x)α. So let's change the first row of our matrix equation LF + (∆x)²QF = b so that it corresponds to this equation. To do this, change the first row of L to [−1, 1, 0, . . . , 0] and the first entry of b to (∆x)α.
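In MATLAB/Octave this change amounts to the following (a sketch; the value alpha = 0 is just an assumed example, and q(x) = 0 as in the test case above):

>alpha=0;                   % an assumed value for f'(0)
>L(1,:)=0;
>L(1,1)=-1; L(1,2)=1;       % first row of L is now [-1, 1, 0, ..., 0]
>b(1)=dx*alpha;             % first entry of b is (dx)*alpha
>F=L\b;                     % solve as before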
A similar change can be used to model a boundary condition for f'(1) at the other endpoint of our interval. You are asked to do this in a homework problem.
I.3.3. Another example: the heat equation
In the previous example involving the loaded cable there was only one independent variable, x,
and as a result we ended up with an ordinary differential equation which determined the shape.
In this example we will have two independent variables, time t, and one spatial dimension x. The
quantities of interest can now vary in both space and time. Thus we will end up with a partial
differential equation which will describe how the physical system behaves.
Imagine a long thin rod (a one-dimensional rod) where the only important spatial direction is
the x direction. Given some initial temperature profile along the rod and boundary conditions at
the ends of the rod, we would like to determine how the temperature, T = T (x, t), along the rod
varies over time.
Consider a small section of the rod between x and x + ∆x. The rate of change of internal energy,
Q(x, t), in this section is proportional to the heat flux, q(x, t), into and out of the section. That is
∂Q/∂t (x, t) = −q(x + ∆x, t) + q(x, t).
Now the internal energy is related to the temperature by Q(x, t) = ρCp ∆xT (x, t), where ρ and Cp
are the density and specific heat of the rod (assumed here to be constant). Also, from Fourier’s
law, the heat flux through a point in the rod is proportional to the (negative) temperature gradient
at the point, i.e., q(x, t) = −K0 ∂T (x, t)/∂x, where K0 is a constant (the thermal conductivity);
this basically says that heat “flows” from hotter to colder regions. Substituting these two relations
into the above energy equation we get
ρ Cp ∆x ∂T/∂t (x, t) = K0 [ ∂T/∂x (x + ∆x, t) − ∂T/∂x (x, t) ]

⇒   ∂T/∂t (x, t) = (K0 / (ρ Cp)) [ ∂T/∂x (x + ∆x, t) − ∂T/∂x (x, t) ] / ∆x.
Taking the limit as ∆x goes to zero we obtain
∂T/∂t (x, t) = k ∂²T/∂x² (x, t),
where k = K0 /ρCp is a constant. This partial differential equation is known as the heat equation
and describes how the temperature along a one-dimensional rod evolves.
We can also include other effects. If there was a temperature source or sink, S(x, t), then this
will contribute to the local change in temperature:
∂T/∂t (x, t) = k ∂²T/∂x² (x, t) + S(x, t).
And if we also allow the rod to cool down along its length (because, say, the surrounding air is a
different temperature than the rod), then the differential equation becomes
∂T/∂t (x, t) = k ∂²T/∂x² (x, t) − H T(x, t) + S(x, t),
where H is a constant (here we have assumed that the surrounding air temperature is zero).
In certain cases we can think about what the steady state of the rod will be. That is, after a sufficiently long time (so that the heat has had plenty of time to “move around” and things have heated up or cooled down), the temperature will cease to change in time. Once this steady state is reached, everything becomes independent of time, and the differential equation becomes
0 = k ∂²T/∂x² (x) − H T(x) + S(x),
which is of the same form as the ordinary differential equation that we considered at the start of
this section.
Chapter II
Subspaces, Bases and Dimension
II.1. Subspaces, basis and dimension
Prerequisites and Learning Goals
From your work in previous courses, you should be able to
• Write down a vector as a linear combination of a set of vectors, and express the linear
combination in matrix form.
• Define linear independence for a collection of vectors.
• Define a basis for a vector subspace.
After completing this section, you should be able to
• Explain what a vector space is, give examples of vector spaces.
• Define vector addition and scalar multiplication for vector spaces of functions.
• Give a definition of subspace and decide whether a given collection of vectors forms a subspace.
• Recast the dependence or independence of a collection of vectors in Rn or Cn as a statement
about existence of solutions to a system of linear equations, and use it to decide if a collection
of vectors are dependent or independent.
• Define the span of a collection of vectors; show that given a set of vectors v1 , . . . , vk the span
span(v1 , . . . , vk ) is a subspace; determine when a given vector is in the span.
• Describe the significance of the two parts (independence and span) of the definition of a basis.
• Check if a collection of vectors is a basis.
• Show that any basis for a subspace has the same number of elements.
• Show that any set of k linearly independent vectors v1 , . . . , vk in a k dimensional subspace
S is a basis of S.
• Define the dimension of a subspace.
• Determine possible reactions from the formula matrix for a chemical system.
• Given the formula matrix for a chemical system, determine if there is a fixed ratio of molar
amounts of the elements in every possible sample composed of species in the system.
II.1.1. Vector spaces and subspaces


In your previous linear algebra course, and for most of this course, vectors are n-tuples [x1, . . . , xn]^T of numbers, either real or complex. The sets of all n-tuples, denoted Rn or Cn, are examples of vector spaces.
In more advanced applications vector spaces of functions often occur. For example, an electrical
signal can be thought of as a real valued function x(t) of time t. If two signals x(t) and y(t) are
superimposed, the resulting signal is the sum that has the value x(t)+y(t) at time t. This motivates
the definition of vector addition for functions: the vector sum of the functions x and y is the new
function x + y defined by (x + y)(t) = x(t) + y(t). Similarly, if s is a scalar, the scalar multiple sx
is defined by (sx)(t) = sx(t). If you think of t as being a continuous index, these definitions mirror
the componentwise definitions of vector addition and scalar multiplication for vectors in Rn or Cn .
It is possible to give an abstract definition of a vector space as any collection of objects (the vectors) that can be added and multiplied by scalars, provided the addition and scalar multiplication operations obey a set of rules. We won't follow this abstract approach in this course.
A collection of vectors V contained in a given vector space is called a subspace if vector addition
and scalar multiplication of vectors in V stay in V . In other words, for any vectors v1 , v2 ∈ V and
any scalars c1 and c2 , the vector c1 v1 + c2 v2 lies in V too.
In three dimensional space R3 , examples of subspaces are lines and planes through the origin. If
we add or scalar multiply two vectors lying on the same line (or plane) the resulting vector remains
on the same line (or plane). Additional examples of subspaces are the trivial subspace, containing
the single vector 0, as well as the whole space itself.
Here is another example of a subspace. The set of n × n matrices can be thought of as an n²-dimensional vector space. Within this vector space, the set of symmetric matrices (satisfying A^T = A) is a subspace. To see this, suppose A1 and A2 are symmetric. Then, using the linearity property of the transpose, we see that

(c1 A1 + c2 A2)^T = c1 A1^T + c2 A2^T = c1 A1 + c2 A2

which shows that c1 A1 + c2 A2 is symmetric too.
We have encountered subspaces of functions in the section on interpolation. In Lagrange interpolation we considered the set of all polynomials of degree at most m. This is a subspace of the
space of functions, since adding two polynomials of degree at most m results in another polynomial,
again of degree at most m, and scalar multiplication of a polynomial of degree at most m yields
another polynomial of degree at most m.
Another example of a subspace of functions is the set of all functions y(t) that satisfy the differential equation y''(t) + y(t) = 0. To check that this is a subspace, we must verify that if y1(t) and y2(t) both solve the differential equation, then so does c1 y1(t) + c2 y2(t) for any choice of scalars c1 and c2.
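Explicitly, using the linearity of differentiation,

(c1 y1 + c2 y2)''(t) + (c1 y1 + c2 y2)(t) = c1 (y1''(t) + y1(t)) + c2 (y2''(t) + y2(t)) = c1 · 0 + c2 · 0 = 0.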
II.1.2. Linear dependence and independence
To begin we review the definition of linear dependence and independence. A linear combination of vectors v1, . . . , vk is a vector of the form

∑_{i=1}^{k} ci vi = c1 v1 + c2 v2 + · · · + ck vk

for some choice of numbers c1, c2, . . . , ck.

The vectors v1, . . . , vk are called linearly dependent if there exist numbers c1, c2, . . . , ck that are not all zero, such that the linear combination ∑_{i=1}^{k} ci vi = 0.

On the other hand, the vectors are called linearly independent if the only linear combination of the vectors equaling zero has every ci = 0. In other words

∑_{i=1}^{k} ci vi = 0   implies   c1 = c2 = · · · = ck = 0.

For example, the vectors [1, 1, 1]^T, [1, 0, 1]^T and [7, 1, 7]^T are linearly dependent because

1 · [1, 1, 1]^T + 6 · [1, 0, 1]^T − 1 · [7, 1, 7]^T = [0, 0, 0]^T.
If v1 , . . . , vk are linearly dependent, then at least one of the vi ’s can be written as a linear
combination of the others. To see this suppose that
c1 v1 + c2 v2 + · · · + ck vk = 0
with not all of the ci ’s zero. Then we can solve for any of the vi ’s whose coefficient ci is not zero.
For instance, if c1 is not zero we can write
v1 = −(c2 /c1 )v2 − (c3 /c1 )v3 − · · · − (ck /c1 )vk
This means any linear combination we can make with the vectors v1 , . . . , vk can be achieved without
using v1 , since we can simply replace the occurrence of v1 with the expression on the right.
Sometimes it helps to have a geometrical picture. In three dimensional space R3 , three vectors
are linearly dependent if they lie in the same plane.
The columns of a matrix in echelon form are linearly independent if and only if every column is
a pivot column. We illustrate this with two examples.
The matrix

[ 1  ∗  ∗ ]
[ 0  2  ∗ ]
[ 0  0  3 ]

is an example of a matrix in echelon form where each column is a pivot column. Here ∗ denotes an arbitrary entry.

To see that the columns are linearly independent suppose that

c1 [1, 0, 0]^T + c2 [∗, 2, 0]^T + c3 [∗, ∗, 3]^T = [0, 0, 0]^T.

Then, equating the bottom entries we find 3c3 = 0 so c3 = 0. But once we know c3 = 0 then the equation reads

c1 [1, 0, 0]^T + c2 [∗, 2, 0]^T = [0, 0, 0]^T

which implies that c2 = 0 too, and similarly c1 = 0.
Similarly, for a matrix in echelon form (even if, as in the example below, it is not completely
reduced), the pivot columns are linearly independent. For example the first, second and fifth
columns in the matrix

[ 1  1  1  1  0 ]
[ 0  1  2  5  5 ]
[ 0  0  0  0  1 ]

are independent. However, the non-pivot columns can be written as linear combinations of the pivot columns. For example

[1, 2, 0]^T = − [1, 0, 0]^T + 2 [1, 1, 0]^T

so if there are non-pivot columns, then the set of all columns is linearly dependent. This is particularly easy to see for a matrix in reduced row echelon form, like

[ 1  0  1  1  0 ]
[ 0  1  2  5  0 ]
[ 0  0  0  0  1 ]
In this case the pivot columns are standard basis vectors (see below), which are obviously independent. It is easy to express the other columns as linear combinations of these.
Recall that for a matrix U in echelon form, the presence or absence of non-pivot columns determines whether the homogeneous equation U x = 0 has any non-zero solutions. By the discussion
above, we can say that the columns of a matrix U in echelon form are linearly dependent exactly
when the homogeneous equation U x = 0 has a non-zero solution.
In fact, this is true for any matrix. Suppose that the vectors v1 , . . . , vk are the columns of a
matrix A so that
A = [v1 |v2 | · · · |vk].
If we put the coefficients c1, c2, . . . , ck into a vector c = [c1, c2, . . . , ck]^T then

Ac = c1 v1 + c2 v2 + · · · + ck vk

is the linear combination of the columns v1, . . . , vk with coefficients ci.
Now it follows directly from the definition of linear dependence that the columns of A are linearly
dependent if there is a non-zero solution c to the homogeneous equation
Ac = 0
On the other hand, if the only solution to the homogeneous equation is c = 0 then the columns
v1 , . . . , vk are linearly independent.
To compute whether a given collection of vectors is dependent or independent we can place them
in the columns of a matrix A and reduce to echelon form. If the echelon form has only pivot
columns, then the vectors are independent. On the other hand, if the echelon form has some
non-pivot columns, then the equation Ac = 0 has some non-zero solutions and so the vectors are
dependent.
Let’s try this with the vectors in the example above in MATLAB/Octave.
>V1=[1 1 1]';
>V2=[1 0 1]';
>V3=[7 1 7]';
>A=[V1 V2 V3]
A =

   1   1   7
   1   0   1
   1   1   7
>rref(A)
ans =

   1   0   1
   0   1   6
   0   0   0
Since the third column is a non-pivot column, the vectors are linearly dependent.
II.1.3. Span
Given a collection of vectors v1 , . . . , vk we may form a subspace of all possible linear combinations.
This subspace is called span(v1 , . . . , vk ) or the space spanned by the vi ’s. It is a subspace because
if we start with any two elements of span(v1 , . . . , vk ), say c1 v1 + c2 v2 + · · · + ck vk and d1 v1 + d2 v2 +
· · ·+dk vk then a linear combination of these linear combinations is again a linear combination since
s1 (c1 v1 + c2 v2 + · · · + ck vk ) + s2 (d1 v1 + d2 v2 + · · · + dk vk ) =
(s1 c1 + s2 d1)v1 + (s1 c2 + s2 d2)v2 + · · · + (s1 ck + s2 dk)vk.

For example, the span of the three vectors [1, 0, 0]^T, [0, 1, 0]^T and [0, 0, 1]^T is the whole three dimensional space, because every vector is a linear combination of these. The span of the four vectors [1, 0, 0]^T, [1, 1, 1]^T, [0, 0, 1]^T and [0, 1, 0]^T is the same.
II.1.4. Basis
A collection of vectors v1 , . . . , vk contained in a subspace V is called a basis for that subspace if
1. span(v1 , . . . , vk ) = V , and
2. v1 , . . . , vk are linearly independent.
Condition (1) says that any vector in V can be written as a linear combination of v1 , . . . , vk .
Condition (2) says that there is exactly one way of doing this. Here is the argument. Suppose there
are two ways of writing the same vector v ∈ V as a linear combination:
v = c1 v1 + c2 v2 + · · · + ck vk
v = d1 v1 + d2 v2 + · · · + dk vk
Then by subtracting these equations, we obtain
0 = (c1 − d1 )v1 + (c2 − d2 )v2 + · · · + (ck − dk )vk
Linear independence now says that every coefficient in this sum must be zero. This implies c1 = d1 ,
c2 = d2 . . . ck = dk .
Example: Rn has the standard basis e1, e2, . . . , en where

e1 = [1, 0, 0, . . . , 0]^T,   e2 = [0, 1, 0, . . . , 0]^T,   . . . ,   en = [0, 0, . . . , 0, 1]^T.

Another basis for R2 is [1, 1]^T, [1, −1]^T. To see this, notice that saying that any vector x can be written in a unique way as c1 [1, 1]^T + c2 [1, −1]^T is the same as saying that the equation

[ 1   1 ] [ c1 ]
[ 1  −1 ] [ c2 ]  = x

always has a unique solution. This is true.
A basis for the vector space P2 of polynomials of degree at most two is given by {1, x, x²}. These polynomials clearly span P2 since every polynomial p ∈ P2 can be written as a linear combination p(x) = c0 · 1 + c1 x + c2 x². To show independence, suppose that c0 · 1 + c1 x + c2 x² is the zero polynomial. This means that c0 · 1 + c1 x + c2 x² = 0 for every value of x. Taking the first and second derivatives of this equation yields that c1 + 2 c2 x = 0 and 2 c2 = 0 for every value of x. Substituting x = 0 into each of these equations we find c0 = c1 = c2 = 0.
Notice that if we represent the polynomial p(x) = c0 · 1 + c1 x + c2 x² ∈ P2 by the vector of coefficients [c0, c1, c2]^T ∈ R3, then the vector space operations in P2 are mirrored perfectly in R3. In other words, adding or scalar multiplying polynomials in P2 is the same as adding or scalar multiplying the corresponding vectors of coefficients in R3.
This sort of correspondence can be set up whenever we have a basis v1, v2, . . . , vk for a vector space V. In this case every vector v has a unique representation c1 v1 + c2 v2 + · · · + ck vk and we can represent the vector v ∈ V by the vector [c1, c2, . . . , ck]^T ∈ Rk (or Ck). In some sense this says that we can always think of finite dimensional vector spaces as being copies of Rn or Cn. The only catch is that the correspondence that gets set up between vectors in V and vectors in Rn or Cn depends on the choice of basis.
It is intuitively clear that, say, a plane in three dimensions will always have a basis of two vectors.
Here is an argument that shows that any two bases for a subspace S of Rk or Ck will always have
the same number of elements. Let v1 , . . . , vn and w1 , . . . , wm be two bases for a subspace S. Let’s
try to show that n must be the same as m. Since the vi's span S we can write each wj as a linear combination of the vi's. We write

wj = ∑_{i=1}^{n} ai,j vi

for each j = 1, . . . , m. Let's put all the coefficients into an n × m matrix A = [ai,j]. If we form the k × m matrix W = [w1 |w2 | · · · |wm] and the k × n matrix V = [v1 |v2 | · · · |vn] then the equation above can be rewritten

W = V A

To understand this construction consider the two bases {[1, 0, −1]^T, [1, −1, 0]^T} and {[4, 2, −6]^T, [1, −2, 1]^T} for a subspace in R3 (in fact this subspace is the plane through the origin with normal vector [1, 1, 1]^T).
Then we may write

[4, 2, −6]^T = 6 [1, 0, −1]^T − 2 [1, −1, 0]^T
[1, −2, 1]^T = − [1, 0, −1]^T + 2 [1, −1, 0]^T
and the equation W = V A for this example reads

[  4   1 ]   [  1   1 ]
[  2  −2 ] = [  0  −1 ] [  6  −1 ]
[ −6   1 ]   [ −1   0 ] [ −2   2 ]
Returning now to the general case, suppose that m > n. Then A has more columns than rows.
So its echelon form must have some non-pivot columns which implies that there must be some
non-zero solution to Ac = 0. Let c ≠ 0 be such a solution. Then

W c = V A c = 0.

But this is impossible because the columns of W are linearly independent. So it can't be true that
m > n. Reversing the roles of V and W we find that n > m is impossible too. So it must be that
m = n.
We have shown that any basis for a subspace S has the same number of elements. Thus it makes
sense to define the dimension of a subspace S to be the number of elements in any basis for S.
Here is one last fact about bases: any set of k linearly independent vectors {v1 , . . . , vk } in a k
dimensional subspace S automatically spans S and is therefore a basis. To see this (in the case that
S is a subspace of Rn or Cn ) we let {w1 , . . . , wk } be a basis for S, which also will have k elements.
Form V = [v1 | · · · |vk] and W = [w1 | · · · |wk]. Then the construction above gives V = W A for a k × k matrix A. The matrix A must be invertible. Otherwise there would be a non-zero solution c to Ac = 0. This would imply V c = W Ac = 0, contradicting the independence of the columns of V. Thus we can write W = V A⁻¹ which shows that every wj is a linear combination of the vi's.
This shows that the vi's must span S because every vector in S is a linear combination of the basis vectors wj's which in turn are linear combinations of the vi's.
As an example of this, consider again the space P2 of polynomials of degree at most 2. We claim
that the polynomials {1, (x − a), (x − a)2 } (for any constant a) form a basis. We already know
that the dimension of this space is 3, so we only need to show that these three polynomials are
independent. The argument for that is almost the same as before.
II.2. The four fundamental subspaces for a matrix
From your work in previous courses, you should be able to
• Recognize and use the property of transposes for which (AB)T = B T AT for any matrices A
and B.
• Define the inner (dot) product of two vectors, and its properties (symmetry, linearity), and
explain its geometrical meaning.
• Use the inner product to decide if two vectors are orthogonal, and to compute the angle
between two vectors.
• State the Cauchy-Schwarz inequality and know for which vectors the inequality is an equality.
• For a linear system classify variables as basic and free.
After completing this section, you should be able to
• Define the four fundamental subspaces N (A), R(A), N (AT ), and R(AT ), associated to a
matrix A and its transpose AT , and show that they are subspaces.
• Express the Gaussian elimination process performed to reduce a matrix A to its row reduced
echelon form matrix U as a matrix factorization, A = EU , using elementary matrices, and
perform the steps using MATLAB/Octave.
• Compute bases for each of the four fundamental subspaces N (A), R(A), N (AT ) and R(AT )
of a matrix A; infer information on a matrix given the bases of its fundamental subspaces.
• Define and compute the rank of a matrix.
• State the formulas for the dimension of each of the four subspaces and explain why they are
true; when possible, use such formulas to find the dimension of the subspaces.
• Explain what it means for two subspaces to be orthogonal (V ⊥ W ) and for one subspace to
be the orthogonal complement of another (V = W ⊥ ).
• State which of the fundamental subspaces are orthogonal to each other and explain why,
verify the orthogonality relations in examples, and use the orthogonality relation for R(A) to
test whether the equation Ax = b has a solution.
II.2.1. Nullspace N (A) and Range R(A)
There are two important subspaces associated to any matrix. Let A be an n × m matrix. If x is m
dimensional, then Ax makes sense and is a vector in n dimensional space.
The first subspace associated to A is the nullspace (or kernel ) of A denoted N (A) (or Ker(A)).
It is defined as all vectors x solving the homogeneous equation for A, that is
N (A) = {x : Ax = 0}
This is a subspace because if Ax1 = 0 and Ax2 = 0 then
A(c1 x1 + c2 x2 ) = c1 Ax1 + c2 Ax2 = 0 + 0 = 0.
The nullspace is a subspace of m dimensional space Rm .
The second subspace is the range (or column space) of A denoted R(A) (or C(A)). It is defined
as all vectors of the form Ax for some x. From our discussion above, we see that R(A) is the
span (or set of all possible linear combinations) of its columns. This explains the name “column
space”. The range is a subspace of n dimensional space Rn .
The four fundamental subspaces for a matrix are the nullspace N (A) and range R(A) for A
together with the nullspace N (AT ) and range R(AT ) for the transpose AT .
II.2.2. Finding basis and dimension of N (A)
Example: Let

A = [ 1  3   3  10 ]
    [ 2  6  −1  −1 ]
    [ 1  3   1   4 ]
To calculate a basis for the nullspace N (A) and determine its dimension we need to find the solutions
to Ax = 0. To do this we first reduce A to reduced row echelon form U and solve U x = 0 instead,
since this has the same solutions as the original equation.
>A=[1 3 3 10;2 6 -1 -1;1 3 1 4];
>rref(A)
ans =

   1   3   0   1
   0   0   1   3
   0   0   0   0
This means that x = [x1, x2, x3, x4]^T is in N(A) if

[ 1  3  0  1 ]
[ 0  0  1  3 ] x = 0.
[ 0  0  0  0 ]
We now divide the variables into basic variables, corresponding to pivot columns, and free variables,
corresponding to non-pivot columns. In this example the basic variables are x1 and x3 while the
free variables are x2 and x4 . The free variables are the parameters in the solution. We can solve
for the basic variables in terms of the free ones, giving x3 = −3x4 and x1 = −3x2 − x4 . This leads
to

x = [x1, x2, x3, x4]^T = [−3x2 − x4, x2, −3x4, x4]^T = x2 [−3, 1, 0, 0]^T + x4 [−1, 0, −3, 1]^T.

The vectors [−3, 1, 0, 0]^T and [−1, 0, −3, 1]^T span the nullspace since every element of N(A) is a linear combination
of them. They are also linearly independent because if the linear combination on the right of the
equation above is zero, then by looking at the second entry of the vector (corresponding to the
first free variable) we find x2 = 0 and looking at the last entry (corresponding to the second free
variable) we find x4 = 0. So both coefficients must be zero.
To find a basis for N (A) in general we first compute U = rref(A) and determine which variables
are basic and which are free. For each free variable we form a vector as follows. First put a 1 in
the position corresponding to that free variable and a zero in every other free variable position.
Then fill in the rest of the vector in such a way that U x = 0. (This is easy to do!) The set of all such vectors - one for each free variable - is a basis for N(A).
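As a quick check in MATLAB/Octave (an aside, reusing the matrix A from this example), we can verify that the two vectors found above really are in N(A):

>N1=[-3 1 0 0]'; N2=[-1 0 -3 1]';
>A*N1, A*N2                 % both return the zero vector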
II.2.3. The matrix version of Gaussian elimination
How are a matrix A and its reduced row echelon form U = rref(A) related? If A and U are n × m matrices, then there exists an invertible n × n matrix E such that

A = EU,   or equivalently   E⁻¹A = U.

This immediately explains why N(A) = N(U): if Ax = 0 then U x = E⁻¹Ax = 0, and conversely if U x = 0 then Ax = EU x = 0.
What is this matrix E? It can be thought of as a matrix record of the Gaussian elimination
steps taken to reduce A to U . It turns out performing an elementary row operation is the same
as multiplying on the left by an invertible square matrix. This invertible square matrix, called an
elementary matrix, is obtained by doing the row operation in question to the identity matrix.
Suppose we start with the matrix
>A=[1 3 3 10;2 6 -1 -1;1 3 1 4]
A =

    1    3    3   10
    2    6   -1   -1
    1    3    1    4
The first elementary row operation that we want to do is to subtract twice the first row from the
second row. Let’s do this to the 3×3 identity matrix I (obtained with eye(3) in MATLAB/Octave)
and call the result E1
>E1 = eye(3)
E1 =

   1   0   0
   0   1   0
   0   0   1
>E1(2,:) = E1(2,:)-2*E1(1,:)
E1 =

    1    0    0
   -2    1    0
    0    0    1
Now if we multiply E1 and A we obtain

>E1*A
ans =

    1    3    3   10
    0    0   -7  -21
    1    3    1    4
which is the result of doing that elementary row operation to A. Let’s do one more step. The
second row operation we want to do is to subtract the first row from the third. Thus we define
>E2 = eye(3)
E2 =

   1   0   0
   0   1   0
   0   0   1
>E2(3,:) = E2(3,:)-E2(1,:)
E2 =

    1    0    0
    0    1    0
   -1    0    1
and we find

>E2*E1*A
ans =

    1    3    3   10
    0    0   -7  -21
    0    0   -2   -6
which is one step further along in the Gaussian elimination process. Continuing in this way we eventually arrive at U, so that

Ek Ek−1 · · · E2 E1 A = U.

Thus A = EU with E = E1⁻¹ E2⁻¹ · · · Ek−1⁻¹ Ek⁻¹. For the example above it turns out that

E = [ 1   3   −6 ]
    [ 2  −1  −18 ]
    [ 1   1   −9 ]
which we can check:
>A=[1 3 3 10;2 6 -1 -1;1 3 1 4]
A =

    1    3    3   10
    2    6   -1   -1
    1    3    1    4

>U=rref(A)
U =

   1   3   0   1
   0   0   1   3
   0   0   0   0

>E=[1 3 -6; 2 -1 -18; 1 1 -9];
>E*U
ans =

    1    3    3   10
    2    6   -1   -1
    1    3    1    4
If we do a partial elimination then at each step we can write A = E'U' where U' is the resulting matrix at the point we stopped, and E' is obtained from the Gaussian elimination steps up to that point. A common place to stop is when U' is in echelon form, but the entries above the pivots have not been set to zero. If we can achieve this without doing any row swaps along the way then E' turns out to be a lower triangular matrix. Since U' is upper triangular, this is called the LU decomposition of A.
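MATLAB/Octave can produce such a factorization directly with the built-in lu command (a side remark, not needed for what follows). Because the routine may perform row swaps, it also returns a permutation matrix P, so that P A = LU:

>A=[1 3 3 10;2 6 -1 -1;1 3 1 4];
>[Lfac,Ufac,P]=lu(A);       % P*A = Lfac*Ufac with Lfac lower and Ufac upper triangular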
II.2.4. A basis for R(A)
The ranges or column spaces R(A) and R(U ) are not the same in general, but they are related.
In fact, the vectors in R(A) are exactly all the vectors in R(U ) multiplied by E, where E is the
invertible matrix in the equation A = EU . We can write this relationship as
R(A) = ER(U )
To see this notice that if x ∈ R(U ), that is, x = U y for some y then Ex = EU y = Ay is in R(A).
Conversely if x ∈ R(A), that is, x = Ay for some y then x = EE −1 Ay = EU y so x is E times a
vector in R(U ).
Now if we can find a basis u1 , u2 , . . . , uk for R(U ), the vectors Eu1 , Eu2 , . . . , Euk form a basis
for R(A). (Homework exercise)
But a basis for the column space R(U) is easy to find: it consists of exactly the pivot columns of U. If we multiply these by E we get a basis for R(A). But if

A = [a1 |a2 | · · · |am],   U = [u1 |u2 | · · · |um]

then the equation A = EU can be written

[a1 |a2 | · · · |am] = [Eu1 |Eu2 | · · · |Eum].
From this we see that the columns of A that correspond to pivot columns of U form a basis for
R(A). This implies that the dimension of R(A) is the number of pivot columns in U .
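In MATLAB/Octave the second output of rref lists the pivot columns, so a basis for R(A) can be read off directly (a quick sketch using the example matrix from above):

>A=[1 3 3 10;2 6 -1 -1;1 3 1 4];
>[U,jb]=rref(A);            % jb lists the pivot columns; here jb = [1 3]
>A(:,jb)                    % these columns of A form a basis for R(A)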
II.2.5. The rank of a matrix
We define the rank of the matrix A, denoted r(A) to be the number of pivot columns of U . Then
we have shown that for an n × m matrix A
dim(R(A)) = r(A)
dim(N (A)) = m − r(A)
II.2.6. Bases for R(AT ) and N (AT )
Of course we could find R(AT ) and N (AT ) by computing the reduced row echelon form for AT and
following the steps above. But then we would miss an important relation between the dimensions
of these spaces.
Let’s start with the column space R(AT ). The columns of AT are the rows of A (written as
column vectors instead of row vectors). So R(AT ) is the row space of A.
It turns out that R(AT ) and R(U T ) are the same. This follows from A = EU . To see this take
the transpose of this equation. Then AT = U T E T . Now suppose that x ∈ R(AT ). This means that
x = AT y for some y. But then x = U T E T y = U T y0 where y0 = E T y so x ∈ R(U T ). Similarly, if
x = U T y for some y then x = U T E T (E T )−1 y = AT (E T )−1 y = AT y0 for y0 = (E T )−1 y. So every
vector in R(U T ) is also in R(AT ). Here we used that E and hence E T is invertible.
Now we know that R(AT) = R(UT) is spanned by the columns of UT. But since U is in reduced row echelon form, the non-zero columns of UT are independent (homework exercise). Therefore, the
non-zero columns of U T form a basis for R(AT ). There is one of these for every pivot. This leads
to
dim(R(AT )) = r(A) = dim(R(A))
The final subspace to consider is N (AT ). From our work above we know that
dim(N (AT )) = n − dim(R(AT )) = n − r(A).
Finding a basis is trickier. It might be easiest to find the reduced row echelon form of AT. But if we insist on using A = EU or AT = UT ET we could proceed by multiplying on the right by the inverse of ET. This gives

AT (ET)⁻¹ = UT

Now notice that the last n − r(A) columns of UT are zero, since U is in reduced row echelon form. So the last n − r(A) columns of (ET)⁻¹ are in the nullspace of AT. They also have to be independent, since (ET)⁻¹ is invertible.
Thus the last n − r(A) columns of (ET)⁻¹ form a basis for N(AT).
From a practical point of view, this is not so useful since we have to compute the inverse of a
matrix. It might be just as easy to reduce AT . (Actually, things are slightly better if we use the
LU decomposition. The same argument shows that the last n − r(A) columns of (LT )−1 also form
a basis for N (AT ). But LT is an upper triangular matrix, so its inverse is faster to compute.)
II.2.7. Orthogonal vectors and subspaces
In preparation for our discussion of the orthogonality relations for the fundamental subspaces of
matrix we review some facts about orthogonal vectors and subspaces.
Recall that the dot product, or inner product, of two vectors

x = [x1, x2, . . . , xn]^T,   y = [y1, y2, . . . , yn]^T

is denoted by x · y or ⟨x, y⟩ and defined by

x^T y = x1 y1 + x2 y2 + · · · + xn yn = ∑_{i=1}^{n} xi yi
Some important properties of the inner product are symmetry

x · y = y · x

and linearity

(c1 x1 + c2 x2) · y = c1 x1 · y + c2 x2 · y.
The (Euclidean) norm, or length, of a vector is given by

‖x‖ = √(x · x) = ( ∑_{i=1}^{n} xi² )^{1/2}

An important property of the norm is that ‖x‖ = 0 implies that x = 0.
The geometrical meaning of the inner product is given by

x · y = ‖x‖ ‖y‖ cos(θ)

where θ is the angle between the vectors. The angle θ can take values from 0 to π.
The Cauchy–Schwarz inequality states

|x · y| ≤ ‖x‖ ‖y‖.

It follows from the previous formula because |cos(θ)| ≤ 1. The only time that equality occurs in the Cauchy–Schwarz inequality, that is |x · y| = ‖x‖ ‖y‖, is when cos(θ) = ±1 and θ is either 0 or π. This means that the vectors are pointed in the same or in the opposite directions.
The vectors x and y are orthogonal if x · y = 0. Geometrically this means either that one of
the vectors is zero or that they are at right angles. This follows from the formula above, since
cos(θ) = 0 implies θ = π/2.
Another way to see that x · y = 0 means that the vectors are orthogonal is from Pythagoras' formula. If x and y are at right angles then ‖x‖² + ‖y‖² = ‖x + y‖².

[Figure: a right triangle with perpendicular sides x and y and hypotenuse x + y.]

But ‖x + y‖² = (x + y) · (x + y) = ‖x‖² + ‖y‖² + 2 x · y so Pythagoras' formula holds exactly when x · y = 0.
To compute the inner product of (column) vectors X and Y in MATLAB/Octave we use the formula x · y = x^T y. Thus the inner product can be computed using X'*Y. (If X and Y are row vectors, the formula is X*Y'.)
The norm of a vector X is computed by norm(X). In MATLAB/Octave inverse trig functions
are computed with asin(), acos() etc. So the angle between column vectors X and Y could be
computed as
> acos(X’*Y/(norm(X)*norm(Y)))
Two subspaces V and W are said to be orthogonal if every vector in V is orthogonal to every
vector in W . In this case we write V ⊥ W .
[Figure: two pairs of orthogonal subspaces in the plane: V ⊥ W and S ⊥ T.]
In this figure V ⊥ W and also S ⊥ T .
A related concept is the orthogonal complement. The orthogonal complement of V , denoted V ⊥ ,
is the subspace containing all vectors orthogonal to V. In the figure W = V⊥ but T ≠ S⊥ since T contains only some of the vectors orthogonal to S.
If we take the orthogonal complement of V ⊥ we get back the original space V : This is certainly
plausible from the pictures. It is also obvious that V ⊆ (V ⊥ )⊥ , since any vector in V is perpendicular to vectors in V ⊥ . If there were a vector in (V ⊥ )⊥ not contained in V we could subtract its
projection onto V (defined in the next chapter) and end up with a non-zero vector in (V ⊥ )⊥ that
is also in V ⊥ . Such a vector would be orthogonal to itself, which is impossible. This shows that
(V ⊥ )⊥ = V.
One consequence of this formula is that V = W ⊥ implies V ⊥ = W . Just take the orthogonal
complement of both sides and use (W ⊥ )⊥ = W .
II.2.8. Orthogonality relations for the fundamental subspaces of a matrix
Let A be an n × m matrix. Then N (A) and R(AT ) are subspaces of Rm while N (AT ) and R(A)
are subspaces of Rn .
These two pairs of subspaces are orthogonal:
N (A) = R(AT )⊥
N (AT ) = R(A)⊥
We will show that the first equality holds for any A. The second equality then follows by applying
the first one to AT .
These relations are based on the formula
(AT x) · y = x · (Ay)
This formula follows from the product formula (AB)T = B T AT for transposes, since
(AT x) · y = (AT x)T y = xT (AT )T y = xT Ay = x · (Ay)
First, we show that N (A) ⊆ R(AT )⊥ . To do this, start with any vector x ∈ N (A). This means
that Ax = 0. If we compute the inner product of x with any vector in R(AT ), that is, any
vector of the form AT y, we get (AT y) · x = y · Ax = y · 0 = 0. Thus x ∈ R(AT )⊥ . This shows
N (A) ⊆ R(AT )⊥ .
Now we show the opposite inclusion, R(AT )⊥ ⊆ N (A). This time we start with x ∈ R(AT )⊥ .
This means that x is orthogonal to every vector in R(AT ), that is, to every vector of the form AT y.
So (AT y) · x = y · (Ax) = 0 for every y. Pick y = Ax. Then (Ax) · (Ax) = kAxk2 = 0. This
implies Ax = 0 so x ∈ N (A). We can conclude that R(AT )⊥ ⊆ N (A).
These two inclusions establish that N (A) = R(AT )⊥ .
Let's verify these orthogonality relations in an example. Let

A = [ 1  2  1  1 ]
    [ 1  3  0  1 ]
    [ 2  5  1  2 ]

Then

rref(A) = [ 1  0   3  1 ]
          [ 0  1  −1  0 ]
          [ 0  0   0  0 ]

rref(AT) = [ 1  0  1 ]
           [ 0  1  1 ]
           [ 0  0  0 ]
           [ 0  0  0 ]
Thus we get

N(A) = span{ [−3, 1, 1, 0]^T, [−1, 0, 0, 1]^T }

R(A) = span{ [1, 1, 2]^T, [2, 3, 5]^T }

N(AT) = span{ [−1, −1, 1]^T }

R(AT) = span{ [1, 0, 3, 1]^T, [0, 1, −1, 0]^T }
We can now verify directly that every vector in the basis for N (A) is orthogonal to every vector in
the basis for R(AT ), and similarly for N (AT ) and R(A).
Does the equation

Ax = [2, 1, 3]^T

have a solution? We can use the ideas above to answer this question easily. We are really asking whether [2, 1, 3]^T is contained in R(A). But, according to the orthogonality relations, this is the same as asking whether [2, 1, 3]^T is contained in N(AT)⊥. This is easy to check. Simply compute the dot product

[2, 1, 3]^T · [−1, −1, 1]^T = −2 − 1 + 3 = 0.
Since the result is zero, we conclude that a solution exists.
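The same conclusion can be reached numerically (a quick aside, not part of the argument above): the system is solvable exactly when adjoining b to A does not increase the rank.

>A=[1 2 1 1;1 3 0 1;2 5 1 2];
>b=[2 1 3]';
>rank([A b])==rank(A)       % returns 1 (true), so Ax = b has a solution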
II.2.9. The formula matrix of a chemical system
The fundamental subspaces of the formula matrix of a chemical system give information about the
possible reactions and invariant ratios between quantities of species in the system.
Formula vectors, the formula matrix A and its range R(A)
A chemical system consists of a collection of chemical species, each composed of a number of
elements. To each species we associate a formula vector which lists the amount of each element
in that species. For example, in the chemical system consisting of the species CH4 , S2 , CS2 and
H2 S, the components (elements) are C, H and S so that the formula vectors are [1, 4, 0]T for CH4 ,
[0, 0, 2]T for S2 , [1, 0, 2]T for CS2 and [0, 2, 1]T for H2 S. Taking all the formula vectors in the system
as columns of a matrix, we obtain the formula matrix, A, whose rows are labelled by the elements
and whose columns are labelled by the species. In this example
          CH4   S2   CS2   H2S
    C   [  1     0    1     0  ]
A = H   [  4     0    0     2  ]
    S   [  0     2    2     1  ]
A sample of n1 moles of CH4 , n2 moles of S2 , n3 moles of CS2 and n4 moles of H2 S is described
by the linear combination

b = n1 [1, 4, 0]^T + n2 [0, 0, 2]^T + n3 [1, 0, 2]^T + n4 [0, 2, 1]^T
The components b1 , b2 and b3 of b give the molar amounts of C, H and S in our sample. The vector
b is called the element abundance vector. We can write the equation above as
b = An
where n = [n1 , n2 , n3 , n4 ]T is called the species abundance vector. The entries of n (and hence also
of b) are positive numbers. In this context it is natural for them to be integers too.
The element abundance vector b is in the range R(A). However, not every vector in R(A) is an
element abundance vector, because of the positivity condition.
Chemical reactions and the null space N (A)
In a chemical system, some of the species may react to form other species in the system. We
can use the null space N (A) of the formula matrix to find all reactions satisfying the constraints
imposed by mass balance.
In the example chemical system above, a possible reaction is
CH4 + 2S2 = CS2 + 2H2 S.
This can be written in vector form as

[1, 4, 0]^T + 2 [0, 0, 2]^T = [1, 0, 2]^T + 2 [0, 2, 1]^T,
or
An1 = An2 ,
with n1 = [1, 2, 0, 0]T and n2 = [0, 0, 1, 2]T . This equation expresses the mass balance of the
reaction, that is, the fact that the number of atoms of each component, C, H and S, is the same
on the left and right sides of the equation.
Notice that n = n1 − n2 is in the null space N(A) since An = An1 − An2 = 0. In general, given a formula matrix A, a vector n is called a reaction vector if it is contained in N(A) and satisfies an additional non-degeneracy condition.
The condition n ∈ N (A) ensures that a reaction vector n will contain coefficients for a mass
balanced equation. Species corresponding to the positive coefficients are on one side of the reaction
equation, while the species corresponding to the negative coefficients are on the other side.
The reaction equation An = 0 is a linear relation among columns of A for which the corresponding
entry in n is non-zero. The non-degeneracy condition is imposed to ensure that there is no other
independent relation among these columns. In other words, up to scaling, there can only be one
reaction equation involving a given collection of species. To state the condition, collect all the
columns of A for which the corresponding entry in n is non-zero. If we form a matrix à from these
columns then the condition is that N (Ã) is one dimensional. Another way of saying this is that if
we drop any one column from our collection the resulting set is linearly independent. Under this
condition n can have at most rank(A) + 1 non-zero entries.
The non-degeneracy condition is automatically satisfied if we compute a basis for N (A) in the
standard way from the row reduction of A. This is because exactly one of the non-zero entries of
such a basis vector is in a non-pivot position. This implies that the columns in à are a collection
of linearly independent columns (the pivot columns) and one additional column. There can only
be one independent linear relation in such a collection.
Since a reaction vector remains a reaction vector when multiplied by non-zero constant, we can
normalize, for example, to make a particular coefficient equal to one, or to make all the coefficients
integers.
Suppose we wish to find what reactions are possible when we add the species H2 to the collection
above. The new formula matrix is
          CH4   S2   CS2   H2S   H2
    C   [  1     0    1     0    0 ]
A = H   [  4     0    0     2    2 ]
    S   [  0     2    2     1    0 ]
To find the null space we reduce A to reduced row echelon form. Here are the MATLAB/Octave
commands to do this.
1> A=[1 0 1 0 0;4 0 0 2 2;0 2 2 1 0];
2> rref(A)
ans =

   1.00000   0.00000   0.00000   0.50000   0.50000
   0.00000   1.00000   0.00000   1.00000   0.50000
   0.00000   0.00000   1.00000  -0.50000  -0.50000
This shows that the null space is two dimensional and spanned by [−1, −2, 1, 2, 0]T and [−1, −1, 1, 0, 2]T .
Here we have normalized to make the reaction coefficients integers. To write the reaction we collect
all the positive coefficients on one side and all the negative coefficients on the other. Then from
[−1, −2, 1, 2, 0]T we recover the reaction
CH4 + 2S2 = CS2 + 2H2 S
from before. In addition, the vector [−1, −1, 1, 0, 2] yields the new reaction
CH4 + S2 = CS2 + 2H2 .
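As a quick sanity check in MATLAB/Octave (an aside), both normalized vectors are indeed in the null space of the formula matrix A:

>n1=[-1 -2 1 2 0]'; n2=[-1 -1 1 0 2]';
>A*n1, A*n2                 % both return the zero vector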
The example above is from Smith and Missen [1], where more information on chemical stoichiometry can be found.
Preserved ratios and N (AT )
Non-zero vectors in the nullspace N (AT ) of the transpose of A also give information about the
chemical system. Consider the system consisting of only two species CH4 and H2 S with formula
matrix

        CH4   H2S
    C [  1     0  ]
A = H [  4     2  ]
    S [  0     1  ]
Here N (AT ) is one dimensional, spanned by [4, −1, 2]T .
As we saw above, if we combine n1 moles of CH4 and n2 moles of H2S then the corresponding element abundance vector

b = n1 [1, 4, 0]^T + n2 [0, 2, 1]^T = A [n1, n2]^T
is in the range R(A). From the orthogonality relation R(A) = N (AT )⊥ we see that b must be
orthogonal to [4, −1, 2]T . This is true for any choice of n1 and n2 , so that for any possible sample,
4b1 − b2 + 2b3 = 0. Thus the molar amounts b1, b2, and b3 of C, H and S in any sample of this chemical system occur in a fixed ratio:

(4b1 + 2b3) / b2 = 1.
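For instance (a numerical illustration with assumed molar amounts n1 = 3 and n2 = 5):

>A=[1 0;4 2;0 1];
>b=A*[3;5]                  % gives b = [3; 22; 5]
>(4*b(1)+2*b(3))/b(2)       % equals 1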
For more information about linear algebra applied to petrology see T. Gordon [2].
[1] William R. Smith and Ronald W. Missen, What is Chemical Stoichiometry?, Chemical Engineering Education, Winter 1979, 26–32.
[2] T. Gordon, Vector subspace analysis applied to inverse geochemical mass transfer problems, at
people.ucalgary.ca/~gordon/InverseMassTransferProblems/GAC-MAC2000Slides.pdf
II.3. Graphs and Networks
Prerequisites and Learning Goals
From your work in previous courses you should be able to
• State Ohm’s law for a resistor.
• State Kirchhoff’s laws for a resistor network.
After completing this section, you should be able to
• Write down the incidence matrix of a directed graph, and draw the graph given the incidence
matrix.
• Define the Laplace operator L, or Laplacian, for a graph; given a graph, determine the entries
of L and describe how they are related to the nodes and edges of the graph.
• When the edges of a graph represent resistors or batteries in a circuit, you should be able to
– define voltage, voltage difference, current, and loop vectors;
– interpret each of the four subspaces associated with the incidence matrix D and their
dimensions in terms of voltage, current, and loop vectors; relate the dimension of N (D)
to the number of disconnected pieces in a graph; find bases for each subspace of D and
verify the orthogonality relations between such subspaces;
– interpret the range and the nullspace of the Laplacian in terms of voltage and current
vectors, and give a physical justification of such interpretation;
– explain what happens to L if two nodes in the graph are renamed and why and when it
is useful to do so;
– construct the voltage-to-current map for a pair of nodes and use it to calculate voltages, currents, or effective resistances; perform all necessary computations in MATLAB/Octave.
II.3.1. Directed graphs and their incidence matrix
A directed graph is a collection of vertices (or nodes) connected by edges with arrows. Here is a
graph with 4 vertices and 5 edges.
[Figure: a directed graph with vertices 1, 2, 3, 4 and edges 1, 2, 3, 4, 5: edge 1 goes from vertex 1 to vertex 2, edge 2 from vertex 2 to vertex 3, edge 3 from vertex 3 to vertex 4, edge 4 from vertex 2 to vertex 4, and edge 5 from vertex 4 to vertex 1.]
Graphs come up in many applications. For example, the nodes could represent computers and the
arrows internet connections. Or the nodes could be factories and the arrows represent movement
of goods. We will mostly focus on a single interpretation where the edges represent resistors or
batteries hooked up in a circuit.
In this interpretation we will be assigning a number to each edge to indicate the amount of
current flowing through that edge. This number can be positive or negative. The arrows indicate
the direction associated to a positive current.
The incidence matrix of a graph is an n × m matrix, where n is the number of edges and m is the
number of vertices. We label the rows by the edges in the graph and the columns by the vertices.
Each row of the matrix corresponds to an edge in the graph. It has a −1 in the place corresponding
to the vertex where the arrow starts and a 1 in the place corresponding to the vertex where the
arrow ends.
Here is the incidence matrix for the illustrated graph.
        1    2    3    4
  1 [  −1    1    0    0 ]
  2 [   0   −1    1    0 ]
  3 [   0    0   −1    1 ]
  4 [   0   −1    0    1 ]
  5 [   1    0    0   −1 ]
The columns of the matrix have the following interpretation. The column representing a given
vertex has a +1 for each arrow coming in to that vertex and a −1 for each arrow leaving the vertex.
Given an incidence matrix, the corresponding graph can easily be drawn. What is the graph for

[ −1   1   0 ]
[  0  −1   1 ]  ?
[  1   0  −1 ]
(Answer: a triangular loop.)
II.3.2. Nullspace and range of incidence matrix and its transpose
We now wish to give an interpretation of the fundamental subspaces associated with the incidence matrix of a graph. Let's call the matrix D. In our example D acts on vectors v ∈ R4 and produces a vector Dv in R5. We can think of the vector v = [v1, v2, v3, v4]^T as an assignment of a voltage to each of the nodes in the graph. Then the vector

Dv = [v2 − v1, v3 − v2, v4 − v3, v4 − v2, v1 − v4]^T

assigns to each edge the voltage difference across that edge. The matrix D is similar to the derivative matrix when we studied finite difference approximations. It can be thought of as the derivative matrix for a graph.
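To experiment with this example in MATLAB/Octave (a sketch; the voltage vector used here is just an assumed example):

>D=[-1 1 0 0; 0 -1 1 0; 0 0 -1 1; 0 -1 0 1; 1 0 0 -1];
>v=[1 2 3 4]';
>D*v                        % the voltage differences: [1; 1; 1; 2; -3]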
II.3.3. The null space N (D)
This is the set of voltages v for which the voltage differences in Dv are all zero. This means that any two nodes connected by an edge will have the same voltage. In our example, this implies all the voltages are the same, so every vector in N(D) is of the form v = s [1, 1, 1, 1]^T for some s. In other words, the null space is one dimensional with basis [1, 1, 1, 1]^T.
For a graph that has several disconnected pieces, Dv = 0 will force v to be constant on each
connected component of the graph. Each connected component will contribute one basis vector to
N (D). This is the vector that is equal to 1 on that component and zero everywhere else. Thus
dim(N (D)) will be equal to the number of disconnected pieces in the graph.
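Continuing the sketch above, we can confirm in MATLAB/Octave that the constant vector lies in N(D):

>D*ones(4,1)                % returns the zero vector in R5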
II.3.4. The range R(D)
The range of D consists of all vectors b in R5 that are voltage differences, i.e., b = Dv for some v.
We know that the dimension of R(D) is 4 − dim(N (D)) = 4 − 1 = 3. So the set of voltage difference
vectors must be restricted in some way. In fact a voltage difference vector will have the property
that the sum of the differences around a closed loop is zero. In the example the edges 1, 4, 5 form a loop, so if
$$b = \begin{pmatrix}b_1\\ b_2\\ b_3\\ b_4\\ b_5\end{pmatrix}$$
is a voltage difference vector then b1 + b4 + b5 = 0. We can check this directly in the example. Since
$$b = Dv = \begin{pmatrix}v_2-v_1\\ v_3-v_2\\ v_4-v_3\\ v_4-v_2\\ v_1-v_4\end{pmatrix}$$
we check that (v2 − v1) + (v4 − v2) + (v1 − v4) = 0.
In the example graph there are three loops, namely 1, 4, 5 and 2, 3, 4 and 1, 2, 3, 5. The corresponding equations that the components of a vector b must satisfy to be in the range of D are
b1 + b4 + b5 = 0
b2 + b3 − b4 = 0
b1 + b2 + b3 + b5 = 0
Notice the minus sign in the second equation corresponding to a backwards arrow. However these
equations are not all independent, since the third is obtained by adding the first two. There are
two independent equations that the components of b must satisfy. Since R(D) is 3 dimensional,
there can be no additional constraints.

Now we wish to find interpretations for the null space and the range of D^T. Let
$$y = \begin{pmatrix}y_1\\ y_2\\ y_3\\ y_4\\ y_5\end{pmatrix}$$
be a vector in R^5 which we interpret as being an assignment of a current to each edge in the graph. Then
$$D^T y = \begin{pmatrix}y_5-y_1\\ y_1-y_2-y_4\\ y_2-y_3\\ y_3+y_4-y_5\end{pmatrix}.$$
This vector assigns to each node the amount of current collecting at that node.
II.3.5. The null space N(D^T)

This is the set of current vectors y ∈ R^5 which do not result in any current building up (or draining away) at any of the nodes. We know that the dimension of this space must be 5 − dim(R(D^T)) = 5 − dim(R(D)) = 5 − 3 = 2. We can guess at a basis for this space by noting that current running around a loop will not build up at any of the nodes. The loop vector [1; 0; 0; 1; 1] represents a current running around the loop 1, 4, 5. We can verify that this vector lies in the null space of D^T:
$$\begin{pmatrix}-1 & 0 & 0 & 0 & 1\\ 1 & -1 & 0 & -1 & 0\\ 0 & 1 & -1 & 0 & 0\\ 0 & 0 & 1 & 1 & -1\end{pmatrix}\begin{pmatrix}1\\ 0\\ 0\\ 1\\ 1\end{pmatrix} = \begin{pmatrix}0\\ 0\\ 0\\ 0\end{pmatrix}$$
The current vectors corresponding to the other two loops are [0; 1; 1; −1; 0] and [1; 1; 1; 0; 1]. However these three vectors are not linearly independent. Any choice of two of these vectors is independent, and forms a basis.
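These claims are easy to confirm numerically. A minimal MATLAB/Octave sketch, assuming the incidence matrix D has been entered as in the earlier sketch (the matrix Y below is just a convenient container for the three loop vectors):

Y = [1  0  1;
     0  1  1;
     0  1  1;
     1 -1  0;
     1  0  1];   % columns: loop vectors for loops 1,4,5 and 2,3,4 and 1,2,3,5
D'*Y             % every column should come out as the zero vector in R^4
rank(Y)          % should be 2: only two of the three loop vectors are independent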

II.3.6. The range R(D^T)

This is the set of vectors in R^4 of the form
$$\begin{pmatrix}x_1\\ x_2\\ x_3\\ x_4\end{pmatrix} = D^T y.$$
With our interpretation these are vectors which measure how the currents in y are building up or draining away from each node. Since the current that is building up at one node must have come from some other nodes, it must be that
$$x_1 + x_2 + x_3 + x_4 = 0$$
In our example, this can be checked directly. This one condition in R^4 results in a three dimensional subspace.
II.3.7. Summary and Orthogonality relations
The two subspaces R(D) and N(D^T) are subspaces of R^5. The subspace N(D^T) contains all linear combinations of loop vectors, while R(D) contains all vectors whose dot product with loop vectors is zero. This verifies the orthogonality relation R(D) = N(D^T)^⊥.
The two subspaces N(D) and R(D^T) are subspaces of R^4. The subspace N(D) contains the constant vectors, while R(D^T) contains all vectors orthogonal to constant vectors. This verifies the other orthogonality relation N(D) = R(D^T)^⊥.
II.3.8. Resistors and the Laplacian
Now we suppose that each edge of our graph represents a resistor. This means that we associate
with the ith edge a resistance Ri . Sometimes it is convenient to use conductances γi which are
defined to be the reciprocals of the resistances, that is, γi = 1/Ri .
[Figure: the example circuit; edge i carries the resistor R_i.]
We begin by assigning a voltage to every node and putting these numbers in a vector v ∈ R^4. Then Dv ∈ R^5 represents the vector of voltage differences for each of the edges.
Given the resistance R_i for each edge, we can now invoke Ohm's law to compute the current flowing through each edge. For each edge, Ohm's law states that
$$(\Delta V)_i = j_i R_i,$$
where (ΔV)_i is the voltage drop across the edge, j_i is the current flowing through that edge, and R_i is the resistance. Solving for the current we obtain
$$j_i = R_i^{-1}(\Delta V)_i.$$
Notice that the voltage drop (ΔV)_i in this formula is exactly the ith component of the vector Dv. So if we collect all the currents flowing along each edge in a vector j indexed by the edges, then Ohm's law for all the edges can be written as
$$j = R^{-1} D v$$
where
$$R = \begin{pmatrix}R_1 & 0 & 0 & 0 & 0\\ 0 & R_2 & 0 & 0 & 0\\ 0 & 0 & R_3 & 0 & 0\\ 0 & 0 & 0 & R_4 & 0\\ 0 & 0 & 0 & 0 & R_5\end{pmatrix}$$
is the diagonal matrix with the resistances on the diagonal.
Finally, if we multiply j by the matrix D^T the resulting vector
$$J = D^T j = D^T R^{-1} D v$$
has one entry for each node, representing the total current flowing in or out of that node along the edges that connect to it.
The matrix
$$L = D^T R^{-1} D$$
appearing in this formula is called the Laplacian. It is similar to the second derivative matrix that appeared when we studied finite difference approximations.
One important property of the Laplacian is symmetry, that is the fact that L^T = L. To see this recall that the transpose of a product of matrices is the product of the transposes in reverse order ((ABC)^T = C^T B^T A^T). This implies that
$$L^T = (D^T R^{-1} D)^T = D^T (R^{-1})^T (D^T)^T = D^T R^{-1} D = L$$
Here we used that (D^T)^T = D and that R^{-1}, being a diagonal matrix, satisfies (R^{-1})^T = R^{-1}.
Let's determine the entries of L. To start we consider the case where all the resistances have the same value 1 so that R = R^{-1} = I. In this case L = D^T D. Let's start with the example graph above. Then
$$L = D^T D = \begin{pmatrix}-1 & 0 & 0 & 0 & 1\\ 1 & -1 & 0 & -1 & 0\\ 0 & 1 & -1 & 0 & 0\\ 0 & 0 & 1 & 1 & -1\end{pmatrix}\begin{pmatrix}-1 & 1 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & 0 & -1 & 1\\ 0 & -1 & 0 & 1\\ 1 & 0 & 0 & -1\end{pmatrix} = \begin{pmatrix}2 & -1 & 0 & -1\\ -1 & 3 & -1 & -1\\ 0 & -1 & 2 & -1\\ -1 & -1 & -1 & 3\end{pmatrix}$$
Notice that the ith diagonal entry is the total number of edges connected to the ith node. The i, j
entry is −1 if the ith node is connected to the jth node, and 0 otherwise.
This pattern describes the Laplacian L for any graph. To see this, write
$$D = [d_1 | d_2 | d_3 | \cdots | d_m]$$
Then the i, j entry of D^T D is d_i^T d_j. Recall that d_i has an entry of −1 for every edge leaving the ith node, and a 1 for every edge coming in. So d_i^T d_i, the diagonal entries of D^T D, are the sum of (±1)^2, with one term for each edge connected to the ith node. This sum gives the total number of edges connected to the ith node. To see this in the example graph, let's consider the first node. This node has two edges connected to it and
$$d_1 = \begin{pmatrix}-1\\ 0\\ 0\\ 0\\ 1\end{pmatrix}$$
Thus the 1, 1 entry of the Laplacian is
$$d_1^T d_1 = (-1)^2 + 1^2 = 2$$
On the other hand, if i ≠ j then the vectors d_i and d_j have a non-zero entry in the same position only if one of the edges leaving the ith node is coming in to the jth node or vice versa. For a graph with at most one edge connecting any two nodes (we usually assume this) this means that d_i^T d_j will equal −1 if the ith and jth nodes are connected by an edge, and zero otherwise. For example, in the graph above the first edge leaves the first node, so that d_1 has a −1 in the first position. This first edge comes in to the second node so d_2 has a +1 in the first position. Otherwise, there is no overlap in these vectors, since no other edges touch both these nodes. Thus
$$d_1^T d_2 = \begin{pmatrix}-1 & 0 & 0 & 0 & 1\end{pmatrix}\begin{pmatrix}1\\ -1\\ 0\\ -1\\ 0\end{pmatrix} = -1$$
What happens if the resistances are not all equal to one? In this case we must replace D with R^{-1}D in the calculation above. This multiplies the kth row of D with γ_k = 1/R_k. Making this change in the calculations above leads to the following prescription for calculating the entries of L. The diagonal entries are given by
$$L_{i,i} = \sum_k \gamma_k$$
where the sum goes over all edges touching the ith node. When i ≠ j then
$$L_{i,j} = \begin{cases} -\gamma_k & \text{if nodes } i \text{ and } j \text{ are connected with edge } k\\ 0 & \text{if nodes } i \text{ and } j \text{ are not connected}\end{cases}$$
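Here is a small MATLAB/Octave check of this prescription for the example graph; the resistance values are made up for illustration:

D = [-1 1 0 0; 0 -1 1 0; 0 0 -1 1; 0 -1 0 1; 1 0 0 -1];
R = diag([1 2 2 4 5]);   % made-up resistances for the five edges
L = D'*(R\D)             % the Laplacian L = D^T R^{-1} D
% The diagonal entry L(i,i) is the sum of the conductances 1/R_k of the
% edges touching node i, and L(i,j) = -1/R_k if edge k joins nodes i and j.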
II.3.9. Kirchhoff’s law and the null space of L
Kirchhoff’s law states that currents cannot build up at any node. If v is the voltage vector for a
circuit, then we saw that Lv is the vector whose ith entry is the total current building up at the
ith node. Thus, for an isolated circuit that is not hooked up to any batteries, Kirchhoff’s law can
be written as
Lv = 0
By definition, the solutions are exactly the vectors in the nullspace N (L) of L. It turns out that
N (L) is the same as N (D), which contains all constant voltage vectors. This is what we should
expect. If there are no batteries connected to the circuit the voltage will be the same everywhere
and no current will flow.
To see N(L) = N(D) we start with a vector v ∈ N(D). Then Dv = 0 implies Lv = D^T R^{-1} Dv = D^T R^{-1} 0 = 0. This shows that v ∈ N(L) too, that is, N(D) ⊆ N(L).
To show the opposite inclusion we first note that the matrix R^{-1} can be factored into a product of invertible matrices R^{-1} = R^{-1/2} R^{-1/2}, where R^{-1/2} is the diagonal matrix with diagonal entries 1/√(R_i). This is possible because each R_i is a positive number. Also, since R^{-1/2} is a diagonal matrix it is equal to its transpose, that is, R^{-1/2} = (R^{-1/2})^T.
Now suppose that Lv = 0. This can be written D^T (R^{-1/2})^T R^{-1/2} Dv = 0. Now we multiply on the left with v^T. This gives
$$v^T D^T (R^{-1/2})^T R^{-1/2} Dv = (R^{-1/2}Dv)^T R^{-1/2}Dv = 0$$
But for any vector w, the number w^T w is the dot product of w with itself, which is equal to the length of w squared. Thus the equation above can be written
$$\|R^{-1/2}Dv\|^2 = 0$$
This implies that R^{-1/2}Dv = 0. Finally, since R^{-1/2} is invertible, this yields Dv = 0. We have shown that any vector in N(L) is also contained in N(D). Thus N(L) ⊆ N(D) and together with the previous inclusion this yields N(L) = N(D).
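This can also be checked numerically. A brief sketch, reusing D and R from the sketch above:

L = D'*(R\D);
L*ones(4,1)   % the zero vector: constant voltage vectors lie in N(L)
rank(L)       % 3, so dim N(L) = 4 - 3 = 1 = dim N(D)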
II.3.10. Connecting a battery
To see more interesting behaviour in a circuit, we pick two nodes and connect them to a battery.
For example, let’s take our example circuit above and connect the nodes 1 and 2.
[Figure: the example circuit with a battery connected across nodes 1 and 2.]
The terminals of a battery are kept at a fixed voltage. Thus the voltages v1 and v2 are now
known, say,
v1 = b1
v2 = b2
Of course, it is only voltage differences that have physical meaning, so we could set b1 = 0. Then
b2 would be the voltage of the battery.
At the first and second nodes there now will be current flowing in and out from the battery. Let’s
call these currents J1 and J2 . At all the other nodes the total current flowing in and out is still
zero, as before.
How are the equations for the circuit modified? For simplicity let’s set all the resistances Ri = 1.
The new equations are
$$\begin{pmatrix}2 & -1 & 0 & -1\\ -1 & 3 & -1 & -1\\ 0 & -1 & 2 & -1\\ -1 & -1 & -1 & 3\end{pmatrix}\begin{pmatrix}b_1\\ b_2\\ v_3\\ v_4\end{pmatrix} = \begin{pmatrix}J_1\\ J_2\\ 0\\ 0\end{pmatrix}$$
Two of the voltages v1 and v2 have changed their role in these equations from being unknowns to
being knowns. On the other hand, the first two currents, which were originally known quantities
(namely zero) are now unknowns.
Since the current flowing into the network should equal the current flowing out, we expect that J1 = −J2. This follows from the orthogonality relations for L. The vector [J1; J2; 0; 0] is contained in R(L). But R(L) = N(L^T)^⊥ = N(L)^⊥ (since L^T = L). But we know that N(L) consists of all constant vectors. Hence
$$\begin{pmatrix}J_1\\ J_2\\ 0\\ 0\end{pmatrix}\cdot\begin{pmatrix}1\\ 1\\ 1\\ 1\end{pmatrix} = J_1 + J_2 = 0$$
To solve this system of equations we write it in block matrix form
$$\begin{pmatrix}A & B^T\\ B & C\end{pmatrix}\begin{pmatrix}b\\ v\end{pmatrix} = \begin{pmatrix}J\\ 0\end{pmatrix}$$
where
$$A = \begin{pmatrix}2 & -1\\ -1 & 3\end{pmatrix},\qquad B = \begin{pmatrix}0 & -1\\ -1 & -1\end{pmatrix},\qquad C = \begin{pmatrix}2 & -1\\ -1 & 3\end{pmatrix}$$
and
$$b = \begin{pmatrix}b_1\\ b_2\end{pmatrix},\qquad v = \begin{pmatrix}v_3\\ v_4\end{pmatrix},\qquad J = \begin{pmatrix}J_1\\ J_2\end{pmatrix},\qquad 0 = \begin{pmatrix}0\\ 0\end{pmatrix}$$
Our system of equations can then be written as two 2 × 2 systems.
$$Ab + B^T v = J$$
$$Bb + Cv = 0$$
We can solve the second equation for v. Since C is invertible
$$v = -C^{-1}Bb$$
Using this value of v in the first equation yields
$$J = (A - B^T C^{-1} B)b$$
The matrix A − B^T C^{-1} B is the voltage-to-current map. In our example
$$A - B^T C^{-1} B = \frac{8}{5}\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix}$$
In fact, for any circuit the voltage-to-current map is given by
$$A - B^T C^{-1} B = \gamma\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix}$$
This can be deduced from two facts: (i) A − B^T C^{-1} B is symmetric and (ii) R(A − B^T C^{-1} B) = span{[1; −1]}. You are asked to carry this out in a homework problem.
Notice that this form of the matrix implies that if b1 = b2 then the currents are zero. Another way of seeing this is to notice that if b1 = b2 then [b1; b2] is orthogonal to the range of A − B^T C^{-1} B by (ii) and hence lies in the nullspace N(A − B^T C^{-1} B).
The number
$$R = \frac{1}{\gamma},$$
the ratio of the applied voltage to the resulting current, is the effective resistance of the network between the two nodes.
So in our example circuit, the effective resistance between nodes 1 and 2 is 5/8.
If the battery voltages are b1 = 0 and b2 = b then the voltages at the remaining nodes are
$$\begin{pmatrix}v_3\\ v_4\end{pmatrix} = -C^{-1}B\begin{pmatrix}0\\ b\end{pmatrix} = b\begin{pmatrix}4/5\\ 3/5\end{pmatrix}$$
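The numbers in this section can be reproduced with a few lines of MATLAB/Octave; this is just a sketch for the example circuit with all resistances equal to 1:

L = [2 -1 0 -1; -1 3 -1 -1; 0 -1 2 -1; -1 -1 -1 3];
A = L(1:2,1:2);  B = L(3:4,1:2);  C = L(3:4,3:4);
M = A - B'*(C\B)        % voltage-to-current map: (8/5)*[1 -1; -1 1]
Reff = 1/M(1,1)         % effective resistance between nodes 1 and 2: 0.625 = 5/8
v34 = -(C\(B*[0; 1]))   % voltages at nodes 3 and 4 when b1 = 0, b2 = 1: [0.8; 0.6]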
II.3.11. Two resistors in series
Let’s do a trivial example where we know the answer. If we connect two resistors in series, the
resistances add, and the effective resistance is R1 + R2 . The graph for this example looks like
[Figure: three nodes in a line; R1 joins nodes 1 and 2, R2 joins nodes 2 and 3.]
The Laplacian for this circuit is
$$L = \begin{pmatrix}\gamma_1 & -\gamma_1 & 0\\ -\gamma_1 & \gamma_1+\gamma_2 & -\gamma_2\\ 0 & -\gamma_2 & \gamma_2\end{pmatrix}$$
with γi = 1/Ri , as always. We want the effective resistance between nodes 1 and 3. Although it
is not strictly necessary, it is easier to see what the submatrices A, B and C are if we reorder the
vertices so that the ones we are connecting, namely 1 and 3, come first. This reshuffles the rows
and columns of L yielding
$$\begin{array}{c|ccc} & 1 & 3 & 2\\ \hline 1 & \gamma_1 & 0 & -\gamma_1\\ 3 & 0 & \gamma_2 & -\gamma_2\\ 2 & -\gamma_1 & -\gamma_2 & \gamma_1+\gamma_2\end{array}$$
Here we have labelled the re-ordered rows and columns with the nodes they represent. Now the
desired submatrices are
$$A = \begin{pmatrix}\gamma_1 & 0\\ 0 & \gamma_2\end{pmatrix},\qquad B = \begin{pmatrix}-\gamma_1 & -\gamma_2\end{pmatrix},\qquad C = \gamma_1 + \gamma_2$$
and
$$A - B^T C^{-1} B = \begin{pmatrix}\gamma_1 & 0\\ 0 & \gamma_2\end{pmatrix} - \frac{1}{\gamma_1+\gamma_2}\begin{pmatrix}\gamma_1^2 & \gamma_1\gamma_2\\ \gamma_1\gamma_2 & \gamma_2^2\end{pmatrix} = \frac{\gamma_1\gamma_2}{\gamma_1+\gamma_2}\begin{pmatrix}1 & -1\\ -1 & 1\end{pmatrix}$$
This gives an effective resistance of
$$R = \frac{\gamma_1+\gamma_2}{\gamma_1\gamma_2} = \frac{1}{\gamma_1} + \frac{1}{\gamma_2} = R_1 + R_2$$
as expected.
II.3.12. Example: a resistor cube
Hook up resistors along the edges of a cube. If each resistor has resistance Ri = 1, what is the
effective resistance between opposite corners of the cube?
[Figure: a cube with vertices numbered 1 to 8 and a unit resistor on each edge.]
We will use MATLAB/Octave to solve this problem. To begin we define the Laplace matrix L.
Since each node has three edges connecting it, and all the resistances are 1, the diagonal entries
are all 3. The off-diagonal entries are −1 or 0, depending on whether the corresponding nodes are
connected or not.
>L=[ 3 -1  0 -1 -1  0  0  0;
    -1  3 -1  0  0 -1  0  0;
     0 -1  3 -1  0  0 -1  0;
    -1  0 -1  3  0  0  0 -1;
    -1  0  0  0  3 -1  0 -1;
     0 -1  0  0 -1  3 -1  0;
     0  0 -1  0  0 -1  3 -1;
     0  0  0 -1 -1  0 -1  3];
We want to find the effective resistance between 1 and 7. To compute the submatrices A, B and
C it is convenient to re-order the nodes so that 1 and 7 come first. In MATLAB/Octave, this can
be achieved with the following statement.
>L=L([1,7,2:6,8],[1,7,2:6,8]);
In this statement the entries in the first bracket [1,7,2:6,8] indicate the new ordering of the rows. Here 2:6 stands for 2,3,4,5,6. The second bracket indicates the re-ordering of the columns,
which is the same as for the rows in our case.
Now it is easy to extract the submatrices A, B and C and compute the voltage-to-current map DN
>N = length(L);
>A = L(1:2,1:2);
>B = L(3:N,1:2);
>C = L(3:N,3:N);
>DN = A - B’*C^(-1)*B;
The effective resistance is the reciprocal of the first entry in DN. The command format rat gives
the answer in rational form. (Note: this is just a rational approximation to the floating point
answer, not an exact rational arithmetic as in Maple or Mathematica.)
>format rat
>R = 1/DN(1,1)
R = 5/6
Chapter III
Orthogonality
III.1. Projections
Prerequisites and Learning Goals
After completing this section, you should be able to
• Write down the definition of an orthogonal projection matrix and determine when a matrix
is an orthogonal projection.
• Identify the range of a projection matrix as the subspace onto which it projects and use
properties of the projection matrix to derive the orthogonality relation between the range
and nullspace of the matrix.
• Given the projection matrix onto a subspace S, compute the projection matrix onto S ⊥ ;
describe the relations between the nullspace and range of the two matrices.
• Use orthogonal projection matrices to decompose a vector into components parallel to and
perpendicular to a given subspace.
• Explain how the problem of finding the projection of a vector b onto a certain subspace
spanned by the columns of a matrix A translates into finding a vector x that minimizes the
length ‖Ax − b‖; show that x always exists and satisfies the least squares equation; discuss
how the results of the minimization problem can vary depending on the type of norm used;
discuss the sensitivity of a least squares fit to outliers.
• Compute the orthogonal projection matrix whose range is the span of a given collection of
vectors.
• Perform least squares calculations to find polynomial fits to a given set of points, or in other
applications where overdetermined systems arise. You should be able to perform all necessary
computations and plot your results using MATLAB/Octave.
• Interpret the output of the MATLAB/Octave \ command when applied to systems that have
no solutions.
III.1.1. Warm up: projections onto lines and planes in R3
Let a be a vector in three dimensional space R3 and let L = span(a) = {sa : s ∈ R} be the line
through a. The line L can also be identified as the range R(a), where a is considered to be a 3 × 1
matrix.
The projection of a vector x onto L is defined to be the vector in L that is closest to x. In the
diagram below, p is the projection of x onto L.
[Figure: the projection p of x onto the line L through a.]
If sa is a point on the line L then the distance from sa to x is ‖sa − x‖. To compute the projection we must find the s that minimizes ‖sa − x‖. This is the same as minimizing the square ‖sa − x‖^2. Now
$$\|sa - x\|^2 = (sa - x)\cdot(sa - x) = s^2\|a\|^2 - 2s\,a\cdot x + \|x\|^2.$$
To minimize this quantity, we can use elementary calculus: differentiate with respect to s, set the derivative equal to zero and solve for s. This yields
$$s = \frac{a\cdot x}{\|a\|^2} = \frac{1}{\|a\|^2}\, a^T x,$$
and therefore the projection is given by
$$p = \frac{1}{\|a\|^2}(a^T x)\, a$$
It is useful to rewrite this formula. The product (a^T x)a of the scalar (a^T x) times the vector a can also be written as a matrix product aa^T x, if we consider x and a to be 3 × 1 matrices. Thus
$$p = \frac{1}{\|a\|^2}\, aa^T x.$$
This formula says that the projection of x onto the line through a can be obtained by multiplying x by the 3 × 3 matrix P given by
$$P = \frac{1}{\|a\|^2}\, aa^T.$$
We now make some observations about the matrix P .
To begin, we observe that the matrix P satisfies the equation P^2 = P. To see why this must be true, notice that P^2 x = P(Px) is the vector in L closest to Px. But Px is already in L so the closest vector to it in L is Px itself. Thus P^2 x = Px, and since this is true for every x it must be true that P^2 = P. We can also verify the equation P^2 = P directly by the calculation
$$P^2 = \frac{1}{\|a\|^4}(aa^T)(aa^T) = \frac{1}{\|a\|^4}a(a^T a)a^T = \frac{\|a\|^2}{\|a\|^4}aa^T = \frac{1}{\|a\|^2}aa^T = P$$
(here we used that matrix multiplication is associative and a^T a = ‖a‖^2).
Another fact about P is that it is equal to its transpose, that is, P^T = P. This can also be verified directly by the calculation
$$P^T = \frac{1}{\|a\|^2}(aa^T)^T = \frac{1}{\|a\|^2}(a^T)^T a^T = \frac{1}{\|a\|^2}aa^T = P.$$
(here we use that (AB)^T = B^T A^T and (A^T)^T = A).
Clearly, the range of P is
R(P ) = L.
The equation P T = P lets us determine the null space too. Using one of the orthogonality relations
for the four fundamental subspaces of P we find that
N (P ) = R(P T )⊥ = R(P )⊥ = L⊥ .

Example: Compute the matrix P that projects onto the line L through a = [1; 2; −1]. Verify that P^2 = P and P^T = P. What vector in L is closest to x = [1; 1; 1]?
Let’s use MATLAB/Octave to do this calculation.
>x = [1 1 1]';
>a = [1 2 -1]';
>P = (a'*a)^(-1)*a*a'
P =

   0.16667   0.33333  -0.16667
   0.33333   0.66667  -0.33333
  -0.16667  -0.33333   0.16667

>P*P

ans =

   0.16667   0.33333  -0.16667
   0.33333   0.66667  -0.33333
  -0.16667  -0.33333   0.16667

This verifies the equation P^2 = P. The fact that P^T = P can be seen by inspection. The vector in L closest to x is given by

>P*x

ans =

   0.33333
   0.66667
  -0.33333
Now we consider the plane L⊥ orthogonal to L. Given a vector x, how can we find the projection of x onto L⊥, that is, the vector q in L⊥ closest to x? Looking at the picture,
[Figure: x, its projection p onto L, and its projection q onto the orthogonal plane L⊥.]
we can guess that q = x − p, where p = P x is the projection of x onto L. This would say that
q = x − P x = (I − P )x, where I denotes the identity matrix. In other words, Q = I − P is the
matrix that projects on L⊥ . We will see below that this guess is correct.
 
 
Example: Compute the vector q in the plane orthogonal to a = [1; 2; −1] that is closest to x = [1; 1; 1].
As in the previous example, let’s use MATLAB/Octave. Assume that a, x and P have been defined
in the previous example. The 3 × 3 identity matrix is computed using the command eye(3). If we
compute
>Q=eye(3)-P

Q =

   0.83333  -0.33333   0.16667
  -0.33333   0.33333   0.33333
   0.16667   0.33333   0.83333

then the vector we are seeking is

> Q*x

ans =

   0.66667
   0.33333
   1.33333
III.1.2. Orthogonal projection matrices
A matrix P is called an orthogonal projection matrix if
• P^2 = P
• P^T = P.
The matrix (1/‖a‖^2) aa^T defined in the last section is an example of an orthogonal projection matrix. This matrix projects onto its range, which is one dimensional and equal to the span of a. We will see below that every orthogonal projection matrix projects onto its range, but the range can have any dimension.
So, let P be an orthogonal projection matrix, and let Q = I − P . Then
1. Q is also an orthogonal projection matrix.
2. P +Q = I and P Q = QP = 0. (A consequence of this is that any vector in R(P ) is orthogonal
to any vector in R(Q) since (P x) · (Qy) = x · (P T Qy) = x · (P Qy) = 0.)
3. P projects onto its range R(P ). (In other words, P x is the closest vector in R(P ) to x.)
4. Q projects onto N (P ) = R(P )⊥ .
Let’s verify these statements in order:
1. This follows from Q2 = (I − P )(I − P ) = I − 2P + P 2 = I − 2P + P = I − P = Q and
(I − P )T = I T − P T = I − P
2. The identity P + Q = I follows immediately from the definition of Q. The second identity
follows from P Q = P (I − P ) = P − P 2 = P − P = 0. The identity QP = 0 has a similar
proof.
3. We want to find the closest vector in R(P ) (i.e., of the form P y for some y) to x. To do this
we must find the vector y that minimizes ‖Py − x‖^2. We have
$$\|Py - x\|^2 = \|P(y-x) - Qx\|^2 \qquad \text{(using } x = Px + Qx\text{)}$$
$$= (P(y-x) - Qx)\cdot(P(y-x) - Qx) = \|P(y-x)\|^2 + \|Qx\|^2 \qquad \text{(the cross terms vanish by 2.)}$$
This is obviously minimized when y = x. Thus P x is the closest vector in R(P ) to x.
4. Since Q is an orthogonal projection (by 1.) we know it projects onto R(Q) (by 3.). Since we
know that R(P )⊥ = N (P ) (from the basic subspace relation N (P ) = R(P T )⊥ and the fact
that P T = P ) it remains to show that R(Q) = N (P ). First, note that x ∈ R(Q) ⇔ Qx = x.
(The implication ⇐ is obvious, while the implication ⇒ can be seen as follows. Suppose
x ∈ R(Q). This means x = Qy for some y. Then Qx = Q2 y = Qy = x.) Now we can
complete the argument: x ∈ R(Q) ⇔ Qx = x ⇔ (I − P )x = x ⇔ P x = 0 ⇔ x ∈ N (P ).
This section has been rather theoretical. We have shown that an orthogonal projection matrix projects onto its range. But suppose that a subspace is presented as span(a_1, . . . , a_k) (or equivalently R([a_1| · · · |a_k])) for a given collection of vectors a_1, . . . , a_k. How can we compute the projection matrix P whose range is this given subspace, so that P projects onto it? We will answer this question in the next section.
III.1.3. Least squares and the projection onto R(A)
We now consider linear equations
Ax = b
that do not have a solution. This is the same as saying that b ∉ R(A). What vector x is closest to being a solution?
[Figure: the vector b, the plane R(A) of possible values of Ax, and the error vector Ax − b.]
We want to determine x so that Ax is as close as possible to b. In other words, we want to minimize ‖Ax − b‖. This will happen when Ax is the projection of b onto R(A), that is, Ax = Pb, where P is the projection matrix. In this case Qb = (I − P)b is orthogonal to R(A). But (I − P)b = b − Ax. Therefore (and this is also clear from the picture), we see that Ax − b is orthogonal to R(A). But the vectors orthogonal to R(A) are exactly the vectors in N(A^T). Thus the vector we are looking for will satisfy A^T(Ax − b) = 0, or the equation
$$A^T A x = A^T b$$
This is the least squares equation, and a solution to this equation is called a least squares solution.
(Aside: We can also use calculus to derive the least squares equation. We want to minimize ‖Ax − b‖^2. Computing the gradient and setting it to zero results in the same equations.)
It turns out that the least squares equation always has a solution. Another way of saying this is R(A^T) = R(A^T A). Instead of checking this, we can verify that the orthogonal complements N(A) and N(A^T A) are the same. But this is something we showed before, when we considered the incidence matrix D for a graph.
If x solves the least squares equation, the vector Ax is the projection of b onto the range R(A), since Ax is the closest vector to b in the range of A. In the case where A^T A is invertible (this happens when N(A) = N(A^T A) = {0}), we can obtain a formula for the projection. Starting with the least squares equation we multiply by (A^T A)^{-1} to obtain
$$x = (A^T A)^{-1} A^T b$$
so that
$$Ax = A(A^T A)^{-1} A^T b.$$
Thus the projection matrix is given by
$$P = A(A^T A)^{-1} A^T$$
Notice that the formula for the projection onto a line through a is a special case of this, since then A^T A = ‖a‖^2.
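As a quick illustration of this formula in MATLAB/Octave, take an arbitrary matrix A with independent columns (the numbers below are made up):

A = [1 0; 1 1; 1 2];        % two independent columns spanning a plane in R^3
P = A*inv(A'*A)*A';         % projection matrix onto R(A)
b = [0; 0; 3];
x = (A'*A)\(A'*b)           % least squares solution of Ax = b
norm(P*P - P), norm(P' - P) % both (numerically) zero: P^2 = P and P^T = P
norm(A*x - P*b)             % zero: Ax is the projection of b onto R(A)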
It is worthwhile pointing out that if we say that the solution of the least squares equation gives the “best” approximation to a solution, what we really mean is that it minimizes the distance, or equivalently, its square
$$\|Ax - b\|^2 = \sum_i \big((Ax)_i - b_i\big)^2.$$
There are other ways of measuring how far Ax is from b, for example the so-called L1 norm
$$\|Ax - b\|_1 = \sum_i |(Ax)_i - b_i|$$
Minimizing the L1 norm will result in a different “best” solution, that may be preferable under some circumstances. However, it is much more difficult to find!
III.1.4. Polynomial fit
Suppose we have some data points (x_1, y_1), (x_2, y_2), . . . , (x_n, y_n) and we want to fit a polynomial p(x) = a_1 x^{m−1} + a_2 x^{m−2} + · · · + a_{m−1} x + a_m through them. This is like the Lagrange interpolation problem we considered before, except that now we assume that n > m. This means that in general there will be no such polynomial. However we can look for the least squares solution.
To begin, let's write down the equations that express the desired equalities p(x_i) = y_i for i = 1, . . . , n. These can be written in matrix form
$$\begin{pmatrix} x_1^{m-1} & x_1^{m-2} & \cdots & x_1 & 1\\ x_2^{m-1} & x_2^{m-2} & \cdots & x_2 & 1\\ \vdots & \vdots & & \vdots & \vdots\\ x_n^{m-1} & x_n^{m-2} & \cdots & x_n & 1\end{pmatrix}\begin{pmatrix}a_1\\ a_2\\ \vdots\\ a_m\end{pmatrix} = \begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix}$$
or Aa = y, where A is a submatrix of the Vandermonde matrix. To find the least squares approximation we solve A^T A a = A^T y. In a homework problem, you are asked to do this using MATLAB/Octave.
In the case where the polynomial has degree one this is a straight line fit, and the equations we want to solve are
$$\begin{pmatrix}x_1 & 1\\ x_2 & 1\\ \vdots & \vdots\\ x_n & 1\end{pmatrix}\begin{pmatrix}a_1\\ a_2\end{pmatrix} = \begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix}$$
These equations will not have a solution (unless the points really do happen to lie on the same line). To find the least squares solution, we compute
$$A^T A = \begin{pmatrix}x_1 & x_2 & \cdots & x_n\\ 1 & 1 & \cdots & 1\end{pmatrix}\begin{pmatrix}x_1 & 1\\ x_2 & 1\\ \vdots & \vdots\\ x_n & 1\end{pmatrix} = \begin{pmatrix}\sum x_i^2 & \sum x_i\\ \sum x_i & n\end{pmatrix}$$
and
$$A^T y = \begin{pmatrix}x_1 & x_2 & \cdots & x_n\\ 1 & 1 & \cdots & 1\end{pmatrix}\begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix} = \begin{pmatrix}\sum x_i y_i\\ \sum y_i\end{pmatrix}$$
This results in the familiar equations
$$\begin{pmatrix}\sum x_i^2 & \sum x_i\\ \sum x_i & n\end{pmatrix}\begin{pmatrix}a_1\\ a_2\end{pmatrix} = \begin{pmatrix}\sum x_i y_i\\ \sum y_i\end{pmatrix}$$
which are easily solved.
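For example, here is a sketch of a straight line fit in MATLAB/Octave with some made-up data; the backslash command applied to an overdetermined system returns the same least squares solution as the normal equations:

x = [0; 1; 2; 3; 4];
y = [1.1; 1.9; 3.2; 3.9; 5.1];   % made-up data, roughly on a line
A = [x ones(5,1)];
a = (A'*A)\(A'*y)                % solve the least squares equations
a_alt = A\y                      % same answer from the backslash command
plot(x, y, 'o', x, A*a, '-')     % data points together with the fitted line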
III.1.5. Football rankings
We can try to use least squares to rank football teams. To start with, suppose we have three teams.
We pretend each team has a value v1 , v2 and v3 such that when two teams play, the difference in
scores is the difference in values. So, if the season’s games had the following results
1
1
2
3
3
vs.
vs.
vs.
vs.
vs.
2 30 40
2 20 40
3 10 0
1 5 0
2 5 5
then the vi ’s would satisfy the equations
v2 − v1 = 10
v2 − v1 = 20
v2 − v3 = 10
v3 − v1 = 5
v2 − v3 = 0
Of course, there is no solution to these equations. Nevertheless we can find the least squares
solution. The matrix form of the equations is
Dv = b
with
$$D = \begin{pmatrix}-1 & 1 & 0\\ -1 & 1 & 0\\ 0 & 1 & -1\\ -1 & 0 & 1\\ 0 & 1 & -1\end{pmatrix},\qquad b = \begin{pmatrix}10\\ 20\\ 10\\ 5\\ 0\end{pmatrix}$$
The least squares equation is
$$D^T D v = D^T b$$
or
$$\begin{pmatrix}3 & -2 & -1\\ -2 & 4 & -2\\ -1 & -2 & 3\end{pmatrix} v = \begin{pmatrix}-35\\ 40\\ -5\end{pmatrix}$$
Before going on, notice that D is an incidence matrix. What is the graph? (Answer: the nodes are the teams and they are joined by an edge with the arrow pointing from the losing team to the winning team. This graph may have more than one edge joining two nodes, if two teams play more than once. This is sometimes called a multi-graph.) We saw that in this situation N(D) is not just the zero vector, but contains the vectors whose entries are all the same. The situation is the same as for resistances: it is only differences in the v_i's that have a meaning.
We can solve this equation in MATLAB/Octave. The straightforward way is to compute
>L = [3 -2 -1;-2 4 -2;-1 -2 3];
>b = [-35; 40; -5];
>rref([L b])
ans =

   1.00000   0.00000  -1.00000  -7.50000
   0.00000   1.00000  -1.00000   6.25000
   0.00000   0.00000   0.00000   0.00000
As expected, the solution is not unique. The general solution, depending on the parameter s, is
$$v = s\begin{pmatrix}1\\ 1\\ 1\end{pmatrix} + \begin{pmatrix}-7.5\\ 6.25\\ 0\end{pmatrix}$$
We can choose s so that the v_i for one of the teams is zero. This is like grounding a node in a circuit. So, by choosing s = 7.5, s = −6.25 and s = 0 we obtain the solutions
$$\begin{pmatrix}0\\ 13.75\\ 7.5\end{pmatrix},\qquad \begin{pmatrix}-13.75\\ 0\\ -6.25\end{pmatrix}\qquad\text{or}\qquad \begin{pmatrix}-7.5\\ 6.25\\ 0\end{pmatrix}.$$
Actually, it is easier to compute a solution with one of the v_i's equal to zero directly. If v = [0; v_2; v_3] then the vector [v_2; v_3] satisfies the equation L_2 [v_2; v_3] = b_2, where the matrix L_2 is the bottom right 2 × 2 block of L and b_2 contains the last two entries of b.
>L2 = L(2:3,2:3);
>b2 = b(2:3);
>L2\b2
ans =

   13.7500
    7.5000
We can try this on real data. The football scores for the 2007 CFL season can be found at
http://www.cfl.ca/index.php?module=sked&func=view&year=2007. The differences in scores
for the first 20 games are in cfl.m. The order of the teams is BC, Calgary, Edmonton, Hamilton,
Montreal, Saskatchewan, Toronto, Winnipeg. Repeating the computation above for this data we
find the ranking to be (running the file cfl.m)
v =
0.00000
-12.85980
-17.71983
-22.01884
-11.37097
-1.21812
0.87588
-20.36966
Not very impressive, if you consider that the second-lowest ranked team (Winnipeg) ended up in
the Grey Cup game!
III.2. Complex vector spaces and inner product
Prerequisites and Learning Goals
From your work in previous courses, you should be able to
• Perform arithmetic with complex numbers.
• Write down the definition of and compute the complex conjugate, modulus and argument of
a complex number.
After completing this section, you should be able to
• Define and perform basic matrix calculations with complex vectors and complex matrices.
• Define and compute the complex inner product and the norm of complex vectors, state basic
properties of the complex inner product.
• Define and compute the matrix adjoint for a complex matrix; explain its relation to the
complex inner product; compare its properties to the properties of the transpose of a real
matrix.
• Define an orthonormal basis for Cn and determine whether a set of complex vectors is an
orthonormal basis; determine the coefficients in the expansion of a complex vector in an
orthonormal basis.
• Write down the definition of a unitary matrix and list its properties; recognize when a matrix
is unitary.
• Define and compute the inner product and the norm for complex- (or real-) valued functions
that are defined on a given interval; define what it means for two functions to be orthonormal,
and verify it in specific examples.
• Define the complex exponential function, compute its value at given points and perform basic
computations (addition, differentiation, integration) involving complex exponential functions.
• Explain what are the elements of the vector space L2 ([a, b]) for an interval [a, b].
• Use complex numbers in MATLAB/Octave computations, specifically real(z), imag(z),
conj(z), abs(z), exp(z) and A’ for complex matrices.
III.2.1. Why use complex numbers?
So far the numbers (or scalars) we have been using have been real numbers. Now we will start using complex numbers as well. Here are two reasons why.
1. Solving polynomial equations (finding roots, factoring): If we use complex numbers, then every polynomial
$$p(z) = a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_{n-1} z + a_n$$
with a_1 ≠ 0 (so that p(z) has degree n − 1) can be completely factored as
$$p(z) = a_1 (z - r_1)(z - r_2)\cdots(z - r_{n-1}).$$
The numbers r_1, . . . , r_{n−1} are called the roots of p(z) and are the values of z for which p(z) = 0. Thus the equation p(z) = 0 always has solutions. There might not be n − 1 distinct solutions, though, since it may happen that a given root r occurs more than once. If r occurs k times in the list, then we say r has multiplicity k. An important point is that the roots of a polynomial may be complex even when the coefficients a_1, . . . , a_n are real. For example z^2 + 1 = (z + i)(z − i).
2. Complex exponential: The complex exponential function eiθ is more convenient to use than
cos(θ) and sin(θ) because it is easier to multiply, differentiate and integrate exponentials than trig
functions.
Solving polynomial equations will be important when studying eigenvalues, while the complex
exponential appears in Fourier series and the discrete Fourier transform.
III.2.2. Review of complex numbers
Complex numbers can be thought of as points on the (x, y) plane. The point (x, y), thought of as a complex number, is written x + iy (or x + jy if you are an electrical engineer).
If z = x+iy then x is called the real part of z and is denoted Re(z) while y is called the imaginary
part of z and is denoted Im(z).
Complex numbers are added just like vectors in two dimensions. If z = x + iy and w = s + it,
then
z + w = (x + iy) + (s + it) = (x + s) + i(y + t)
The rule for multiplying two complex numbers is
zw = (x + iy)(s + it) = (xs − yt) + i(xt + ys)
Notice that i is a square root of −1 since
i2 = (0 + i)(0 + i) = (0 − 1) + i(0 + 0) = −1
This fact is all you need to remember to recover the rule for multiplying two complex numbers. If
you multiply the expressions for two complex numbers formally, and then substitute −1 for i2 you
will get the right answer. For example, to multiply 1 + 2i and 2 + 3i, compute
(1 + 2i)(2 + 3i) = 2 + 3i + 4i + 6i2 = 2 − 6 + i(3 + 4) = −4 + 7i
Complex addition and multiplication obey the usual rules of algebra:
z1 + z2 = z2 + z1,  z1 z2 = z2 z1,
z1 + (z2 + z3) = (z1 + z2) + z3,  z1(z2 z3) = (z1 z2)z3,
0 + z1 = z1,  1·z1 = z1,
z1(z2 + z3) = z1 z2 + z1 z3.
The negative of any complex number z = x + iy is defined by −z = −x + (−y)i, and obeys
z + (−z) = 0.
The modulus of a complex number, denoted |z|, is the length of the corresponding vector in two dimensions. If z = x + iy then
$$|z| = |x + iy| = \sqrt{x^2 + y^2}$$
An important property is
|zw| = |z||w|
The complex conjugate of a complex number z, denoted z̄, is the reflection of z across the x axis. Thus
$$\overline{x + iy} = x - iy.$$
The complex conjugate obeys
$$\overline{z + w} = \bar z + \bar w, \qquad \overline{zw} = \bar z\, \bar w$$
This means that the complex conjugate of an algebraic expression can be obtained by changing all the i's to −i's, either before or after performing arithmetic operations. The complex conjugate also obeys
$$z\bar z = |z|^2.$$
This last equality is useful for simplifying fractions of complex numbers by turning the denominator into a real number, since
$$\frac{z}{w} = \frac{z\bar w}{|w|^2}$$
For example, to simplify (1 + i)/(1 − i) we can write
$$\frac{1+i}{1-i} = \frac{(1+i)^2}{(1-i)(1+i)} = \frac{1-1+2i}{2} = i$$
A complex number z is real (i.e. the y part in x + iy is zero) whenever z̄ = z. We also have the
following formulas for the real part and imaginary part of z. If z = x+iy then Re(z) = x = (z+ z̄)/2
and Im(z) = y = (z − z̄)/(2i)
We define the exponential, e^{it}, of a purely imaginary number it to be the number
$$e^{it} = \cos(t) + i\sin(t)$$
lying on the unit circle in the complex plane.
The complex exponential satisfies the familiar rule e^{i(s+t)} = e^{is} e^{it} since by the addition formulas for sine and cosine
$$\begin{aligned} e^{i(s+t)} &= \cos(s+t) + i\sin(s+t)\\ &= \cos(s)\cos(t) - \sin(s)\sin(t) + i\big(\sin(s)\cos(t) + \cos(s)\sin(t)\big)\\ &= \big(\cos(s) + i\sin(s)\big)\big(\cos(t) + i\sin(t)\big)\\ &= e^{is}e^{it}\end{aligned}$$
Any complex number can be written in polar form
$$z = re^{i\theta}$$
where r and θ are the polar co-ordinates of z. This means r = |z| and θ is the angle that the line joining z to 0 makes with the real axis. The angle θ is called the argument of z, denoted arg(z). Since e^{i(θ+2πk)} = e^{iθ} for k ∈ Z the argument is only defined up to an integer multiple of 2π. In other words, there are infinitely many choices for arg(z). We can always choose a value of the argument with −π < θ ≤ π; this choice is called the principal value of the argument.
The polar form lets us understand the geometry of the multiplication of complex numbers. If z_1 = r_1 e^{iθ_1} and z_2 = r_2 e^{iθ_2} then
$$z_1 z_2 = r_1 r_2 e^{i(\theta_1+\theta_2)}$$
This shows that when we multiply two complex numbers, their arguments are added.
The exponential of a number that has both a real and imaginary part is defined in the natural way.
$$e^{a+ib} = e^a e^{ib} = e^a\big(\cos(b) + i\sin(b)\big)$$
The derivative of a complex exponential is given by the formula
$$\frac{d}{dt}\, e^{(a+ib)t} = (a+ib)\, e^{(a+ib)t}$$
while the anti-derivative, for (a + ib) ≠ 0, is
$$\int e^{(a+ib)t}\,dt = \frac{1}{a+ib}\, e^{(a+ib)t} + C$$
If (a + ib) = 0 then e^{(a+ib)t} = e^0 = 1 so in this case
$$\int e^{(a+ib)t}\,dt = \int dt = t + C$$
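The antiderivative formula is easy to sanity-check numerically in MATLAB/Octave. A minimal sketch with the arbitrary choice a = 1, b = 2, comparing a numerical integral over [0, 1] with the formula:

w = 1 + 2i;                        % a + ib
t = linspace(0, 1, 10001);
numeric = trapz(t, exp(w*t))       % numerical integral of e^{(a+ib)t} over [0,1]
formula = (exp(w) - 1)/w           % value from the antiderivative formula
abs(numeric - formula)             % should be very small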
III.2.3. Complex vector spaces and inner product
The basic example of a complex vector space is the space Cn of n-tuples of complex numbers.
Vector addition and scalar multiplication are defined as before:
$$\begin{pmatrix}z_1\\ z_2\\ \vdots\\ z_n\end{pmatrix} + \begin{pmatrix}w_1\\ w_2\\ \vdots\\ w_n\end{pmatrix} = \begin{pmatrix}z_1+w_1\\ z_2+w_2\\ \vdots\\ z_n+w_n\end{pmatrix},\qquad s\begin{pmatrix}z_1\\ z_2\\ \vdots\\ z_n\end{pmatrix} = \begin{pmatrix}sz_1\\ sz_2\\ \vdots\\ sz_n\end{pmatrix},$$
where now z_i, w_i and s are complex numbers.
For complex matrices (or vectors) we define the complex conjugate matrix (or vector) by conjugating each entry. Thus, if A = [a_{i,j}], then
$$\bar A = [\bar a_{i,j}].$$
The product rule for complex conjugation extends to matrices and we have
$$\overline{AB} = \bar A\, \bar B$$
The inner product of two complex vectors w = [w_1; . . . ; w_n] and z = [z_1; . . . ; z_n] is defined by
$$\langle w, z\rangle = \bar w^T z = \sum_{i=1}^n \bar w_i z_i$$
When the entries of w and z are all real, then this is just the usual dot product. (In these notes we will reserve the notation w · z for the case when w and z are real.) When the vectors are complex it is important to remember the complex conjugate in this definition. Notice that for complex vectors the order of w and z in the inner product matters: we have
$$\langle z, w\rangle = \overline{\langle w, z\rangle}.$$
With this definition for the inner product the norm of z is always positive since
$$\langle z, z\rangle = \|z\|^2 = \sum_{i=1}^n |z_i|^2$$
For complex matrices and vectors we have to modify the rule for bringing a matrix to the other
side of an inner product. We have
$$\langle w, Az\rangle = \bar w^T A z = (A^T \bar w)^T z = \big(\overline{\bar A^T w}\big)^T z = \langle \bar A^T w, z\rangle$$
This leads to the definition of the adjoint of a matrix
$$A^* = \bar A^T.$$
(In physics you will also see the notation A†.) With this notation
$$\langle w, Az\rangle = \langle A^* w, z\rangle.$$
MATLAB/Octave deals seamlessly with complex matrices and vectors. Complex numbers can
be entered like this
>z= 1 + 2i
z =
1 + 2i
There is a slight danger here in that if i has been defined to be something else (e.g. i=16) then z=i would set z to be 16. In this case, if you do want z to be equal to the number 0 + i, you could use z=1i to get the desired result, or use the alternative syntax
>z= complex(0,1)
z =
0 + 1i
The functions real(z), imag(z), conj(z), abs(z) compute the real part, imaginary part, conjugate and modulus of z.
The function exp(z) computes the complex exponential if z is complex.
If a matrix A has complex entries then A’ is not the transpose, but the adjoint (conjugate
transpose).
>z = [1; 1i]
z =
1 + 0i
0 + 1i
>z'
ans =
1 - 0i
0 - 1i
Thus the square of the norm of a complex vector is given by
>z’*z
ans =
2
This gives the same answer as
>norm(z)^2
ans =
2.0000
(Warning: the function dot in Octave does not compute the correct inner product for complex
vectors (it doesn’t take the complex conjugate). This has been fixed in the latest versions, so you
should check. In MATLAB dot works correctly for complex vectors.)
III.3. Orthonormal bases, Orthogonal Matrices and Unitary Matrices
Prerequisites and Learning Goals
After completing this section, you should be able to
• Write down the definition of an orthonormal basis, and determine when a given set of vectors
is an orthonormal basis.
• Compute the coefficients in the expansion of a vector in an orthonormal basis.
• Compute the norm of a vector from its coefficients in its expansion in an orthonormal basis.
• Write down the definition of an orthogonal (unitary) matrix; recognize when a matrix is orthogonal (unitary); describe the action of an orthogonal (unitary) matrix on vectors; describe
the properties of the rows and columns of an orthogonal (unitary) matrix.
III.3.1. Orthonormal bases
A basis q_1, q_2, . . . is called orthonormal if
1. ‖q_i‖ = 1 for every i (normal)
2. ⟨q_i, q_j⟩ = 0 for i ≠ j (ortho).
The standard basis given by
$$e_1 = \begin{pmatrix}1\\ 0\\ 0\\ \vdots\end{pmatrix},\quad e_2 = \begin{pmatrix}0\\ 1\\ 0\\ \vdots\end{pmatrix},\quad e_3 = \begin{pmatrix}0\\ 0\\ 1\\ \vdots\end{pmatrix},\ \cdots$$
is an orthonormal basis for R^n. For example, e_1 and e_2 form an orthonormal basis for R^2. Another orthonormal basis for R^2 is
$$q_1 = \frac{1}{\sqrt 2}\begin{pmatrix}1\\ 1\end{pmatrix},\qquad q_2 = \frac{1}{\sqrt 2}\begin{pmatrix}-1\\ 1\end{pmatrix}$$
The vectors in a basis for Rn can also be considered to be vectors in Cn . Any orthonormal basis
for Rn is also an orthonormal basis for Cn if we are using complex scalars (homework problem).
Thus the two examples above are also orthonormal bases for Cn and C2 respectively. On the other
hand, the basis
$$q_1 = \frac{1}{\sqrt 2}\begin{pmatrix}1\\ i\end{pmatrix},\qquad q_2 = \frac{1}{\sqrt 2}\begin{pmatrix}1\\ -i\end{pmatrix}$$
is an orthonormal basis for C^2 but not for R^2.
If you expand a vector in an orthonormal basis, it's very easy to find the coefficients in the expansion. Suppose
$$v = c_1 q_1 + c_2 q_2 + \cdots + c_n q_n$$
for some orthonormal basis q_1, q_2, . . . , q_n. Then, if we take the inner product of both sides with q_k, we get
$$\langle q_k, v\rangle = c_1\langle q_k, q_1\rangle + c_2\langle q_k, q_2\rangle + \cdots + c_k\langle q_k, q_k\rangle + \cdots + c_n\langle q_k, q_n\rangle = 0 + 0 + \cdots + c_k + \cdots + 0 = c_k$$
This gives a convenient formula for each c_k. For example, in the expansion
$$\begin{pmatrix}1\\ 2\end{pmatrix} = c_1\, \frac{1}{\sqrt 2}\begin{pmatrix}1\\ 1\end{pmatrix} + c_2\, \frac{1}{\sqrt 2}\begin{pmatrix}-1\\ 1\end{pmatrix}$$
we have
$$c_1 = \frac{1}{\sqrt 2}\begin{pmatrix}1\\ 1\end{pmatrix}\cdot\begin{pmatrix}1\\ 2\end{pmatrix} = \frac{3}{\sqrt 2},\qquad c_2 = \frac{1}{\sqrt 2}\begin{pmatrix}-1\\ 1\end{pmatrix}\cdot\begin{pmatrix}1\\ 2\end{pmatrix} = \frac{1}{\sqrt 2}$$
Notice also that the norm of v is easily expressed in terms of the coefficients c_i. We have
$$\|v\|^2 = \langle v, v\rangle = \langle c_1 q_1 + \cdots + c_n q_n,\ c_1 q_1 + \cdots + c_n q_n\rangle = |c_1|^2 + |c_2|^2 + \cdots + |c_n|^2$$
Another way of saying this is that the vector c = [c_1, c_2, . . . , c_n]^T of coefficients has the same norm as v.
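Here is a short MATLAB/Octave check of these formulas for the basis and vector used above:

q1 = [1; 1]/sqrt(2);  q2 = [-1; 1]/sqrt(2);
v  = [1; 2];
c1 = q1'*v                          % 3/sqrt(2), about 2.1213
c2 = q2'*v                          % 1/sqrt(2), about 0.7071
c1*q1 + c2*q2                       % reproduces v
norm(v)^2 - (abs(c1)^2 + abs(c2)^2) % zero, as claimed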
III.3.2. Orthogonal matrices and Unitary matrices
If we put the vectors of an orthonormal basis into the columns of a matrix, the resulting matrix is called orthogonal (if the vectors are real) or unitary (if the vectors are complex). If q_1, q_2, . . . , q_n is an orthonormal basis then the expansion
$$v = c_1 q_1 + c_2 q_2 + \cdots + c_n q_n$$
can be expressed as a matrix equation v = Qc where c = [c_1, c_2, . . . , c_n]^T and Q is the orthogonal (or unitary) matrix
$$Q = \begin{pmatrix}q_1 & q_2 & \cdots & q_n\end{pmatrix}$$
The fact that the columns of Q are orthonormal means that Q*Q = I (equivalently Q* = Q^{-1}). When the entries of Q are real, so that Q is orthogonal, then Q* = Q^T. So for orthogonal matrices Q^T Q = I (equivalently Q^T = Q^{-1}).
To see this, we compute
$$Q^* Q = \begin{pmatrix}\bar q_1^T\\ \bar q_2^T\\ \vdots\\ \bar q_n^T\end{pmatrix}\begin{pmatrix}q_1 & q_2 & \cdots & q_n\end{pmatrix} = \begin{pmatrix}\langle q_1, q_1\rangle & \langle q_1, q_2\rangle & \cdots & \langle q_1, q_n\rangle\\ \langle q_2, q_1\rangle & \langle q_2, q_2\rangle & \cdots & \langle q_2, q_n\rangle\\ \vdots & \vdots & \ddots & \vdots\\ \langle q_n, q_1\rangle & \langle q_n, q_2\rangle & \cdots & \langle q_n, q_n\rangle\end{pmatrix} = \begin{pmatrix}1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{pmatrix}.$$
Another way of recognizing unitary and orthogonal matrices is by their action on vectors. Suppose Q is unitary. We already observed in the previous section that if v = Qc then ‖v‖ = ‖c‖. We can also see this directly from the calculation
$$\|Qv\|^2 = \langle Qv, Qv\rangle = \langle v, Q^*Qv\rangle = \langle v, v\rangle = \|v\|^2$$
This implies that ‖Qv‖ = ‖v‖. In other words, unitary matrices don't change the lengths of vectors.
The converse is also true. If a matrix Q doesn't change the lengths of vectors then it must be unitary (or orthogonal, if the entries are real). We can show this using the following identity, called the polarization identity, that expresses the inner product of two vectors in terms of norms.
$$\langle v, w\rangle = \frac{1}{4}\left(\|v+w\|^2 - \|v-w\|^2 + i\|v-iw\|^2 - i\|v+iw\|^2\right)$$
(You are asked to prove this in a homework problem.) Now suppose that Q doesn't change the length of vectors, that is, ‖Qv‖ = ‖v‖ for every v. Then, using the polarization identity, we find
$$\begin{aligned}\langle Qv, Qw\rangle &= \frac{1}{4}\left(\|Qv+Qw\|^2 - \|Qv-Qw\|^2 + i\|Qv-iQw\|^2 - i\|Qv+iQw\|^2\right)\\ &= \frac{1}{4}\left(\|Q(v+w)\|^2 - \|Q(v-w)\|^2 + i\|Q(v-iw)\|^2 - i\|Q(v+iw)\|^2\right)\\ &= \frac{1}{4}\left(\|v+w\|^2 - \|v-w\|^2 + i\|v-iw\|^2 - i\|v+iw\|^2\right)\\ &= \langle v, w\rangle\end{aligned}$$
Thus ⟨v, Q*Qw⟩ = ⟨Qv, Qw⟩ = ⟨v, w⟩ for all vectors v and w. In particular, if v is the standard basis vector e_i and w = e_j, then ⟨e_i, Q*Qe_j⟩ is the i, jth entry of the matrix Q*Q while ⟨e_i, e_j⟩ is the i, jth entry of the identity matrix I. Since these two quantities are equal for every i and j we may conclude that Q*Q = I. Therefore Q is unitary.
Recall that for square matrices a left inverse is automatically also a right inverse. So if Q*Q = I then QQ* = I too. This means that Q* is a unitary matrix whenever Q is. This proves the (non-obvious) fact that if the columns of a square matrix form an orthonormal basis, then so do the (complex conjugated) rows!
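These properties can be verified in MATLAB/Octave; here is a sketch using the orthonormal basis of C^2 from section III.3.1 as the columns of Q:

Q = [1 1; 1i -1i]/sqrt(2);   % columns are q1 = (1/sqrt(2))[1; i], q2 = (1/sqrt(2))[1; -i]
Q'*Q                         % the 2 x 2 identity (recall Q' is the adjoint Q*)
Q*Q'                         % also the identity: the rows are orthonormal too
v = [1+2i; 3];
norm(Q*v) - norm(v)          % zero: Q does not change the length of vectors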
III.4. Fourier series
Prerequisites and Learning Goals
After completing this section, you should be able to
• Show that the functions e_n(x) = e^{2πinx/L} for n = 0, ±1, ±2, . . ., a < x < b and L = b − a form an orthonormal (scaled by √L) set in L^2([a, b]).
• Use the fact that the functions en (x) form an infinite orthonormal basis to expand a L2
function in a Fourier series; explain how this leads to a formula for the coefficients of the
series, and compute the coefficients (in real and complex form).
• State and derive Parseval’s formula and use it to sum certain infinite series.
• Use MATLAB/Octave to compute and plot the partial sums of Fourier series.
• Explain what an amplitude-frequency plot is and generate it for a given function using MATLAB/Octave; describe the physical interpretation of the plot when the function is a sound
wave.
III.4.1. Vector spaces of complex-valued functions
Let [a, b] be an interval on the real line. Recall that we introduced the vector space of real valued
functions defined for x ∈ [a, b]. The vector sum f + g of two functions f and g was defined to be
the function you get by adding the values, that is, (f + g)(x) = f (x) + g(x) and the scalar multiple
sf was defined similarly by (sf )(x) = sf (x).
In exactly the same way, we can introduce a vector space of complex valued functions. The
independent variable x is still real, taking values in [a, b]. But now the values f (x) of the functions
may be complex. Examples of complex valued functions are f (x) = x + ix2 or f (x) = eix =
cos(x) + i sin(x).
Now we introduce the inner product of two complex valued functions on [a, b]. In analogy with the inner product for complex vectors we define
$$\langle f, g\rangle = \int_a^b \overline{f(x)}\, g(x)\,dx$$
and the associated norm defined by
$$\|f\|^2 = \langle f, f\rangle = \int_a^b |f(x)|^2\,dx$$
For real valued functions we can ignore the complex conjugate.
Example: the inner product of f(x) = 1 + ix and g(x) = x^2 over the interval [0, 1] is
$$\langle 1+ix,\ x^2\rangle = \int_0^1 \overline{(1+ix)}\, x^2\,dx = \int_0^1 (1-ix)x^2\,dx = \int_0^1 x^2 - ix^3\,dx = \frac{1}{3} - \frac{i}{4}$$
It will often happen that a function, like f(x) = x, is defined for all real values of x. In this case we can consider inner products and norms for any interval [a, b] including semi-infinite and infinite intervals, where a may be −∞ or b may be +∞. Of course the values of the inner product and norm depend on the choice of interval.
There are technical complications when dealing with spaces of functions. In this course we will deal with aspects of the subject where these complications don't play an important role. However, it is good to be aware that they exist, so we will mention a few.
One complication is that the integral defining the inner product may not exist. For example for the interval (−∞, ∞) = R the norm of f(x) = x is infinite since
$$\int_{-\infty}^{\infty} |x|^2\,dx = \infty$$
Even if the interval is finite, like [0, 1], the function might have a spike. For example, if f(x) = 1/x then
$$\int_0^1 \frac{1}{|x|^2}\,dx = \infty$$
too. To overcome this complication we agree to restrict our attention to square integrable functions.
For any interval [a, b], these are the functions f (x) for which |f (x)|2 is integrable. They form a
vector space that is usually denoted L2 ([a, b]). It is an example of a Hilbert space and is important
in Quantum Mechanics. The L in this notation indicates that the integrals should be defined as
Lebesgue integrals rather than as Riemann integrals usually taught in elementary calculus courses.
This plays a role when discussing convergence theorems. But for any functions that come up in
this course, the Lebesgue integral and the Riemann integral will be the same.
The question of convergence is another complication that arises in infinite dimensional vector spaces of functions. When discussing infinite orthonormal bases, infinite linear combinations of vectors (functions) will appear. There are several possible meanings for an equation like
$$\sum_{i=0}^{\infty} c_i \varphi_i(x) = \varphi(x),$$
since we are talking about convergence of an infinite series of functions. The most obvious interpretation is that for every fixed value of x the infinite sum of numbers on the left hand side equals the number on the right.
Here is another interpretation: the difference of φ and the partial sums $\sum_{i=0}^{N} c_i\varphi_i$ tends to zero when measured in the L^2 norm, that is
$$\lim_{N\to\infty}\Big\|\sum_{i=0}^{N} c_i\varphi_i - \varphi\Big\| = 0$$
With this definition, it might happen that there are individual values of x where the first equation
doesn’t hold. This is the meaning that we will give to the equation.
III.4.2. An infinite orthonormal basis for L2 ([a, b])
Let [a, b] be an interval of length L = b − a. For every integer n, define the function
$$e_n(x) = e^{2\pi inx/L}.$$
Then the infinite collection of functions
$$\{\ldots, e_{-2}, e_{-1}, e_0, e_1, e_2, \ldots\}$$
forms an orthonormal basis for the space L^2([a, b]), except that each function e_n has norm √L instead of 1. (Since this is the usual normalization, we will stick with it. To get a true orthonormal basis, we must divide each function by √L.)
Let's verify that these functions form an orthonormal set (scaled by √L). To compute the norm we calculate
$$\|e_n\|^2 = \langle e_n, e_n\rangle = \int_a^b \overline{e^{2\pi inx/L}}\, e^{2\pi inx/L}\,dx = \int_a^b e^{-2\pi inx/L}\, e^{2\pi inx/L}\,dx = \int_a^b 1\,dx = L$$
This shows that ‖e_n‖ = √L for every n. Next we check that if n ≠ m then e_n and e_m are orthogonal.
$$\langle e_n, e_m\rangle = \int_a^b e^{-2\pi inx/L}\, e^{2\pi imx/L}\,dx = \int_a^b e^{2\pi i(m-n)x/L}\,dx = \frac{L}{2\pi i(m-n)}\Big[e^{2\pi i(m-n)x/L}\Big]_{x=a}^{x=b} = \frac{L}{2\pi i(m-n)}\Big(e^{2\pi i(m-n)b/L} - e^{2\pi i(m-n)a/L}\Big) = 0$$
Here we used that e^{2πi(m−n)b/L} = e^{2πi(m−n)(b−a+a)/L} = e^{2πi(m−n)} e^{2πi(m−n)a/L} = e^{2πi(m−n)a/L}.
This shows that the functions {. . . , e_{−2}, e_{−1}, e_0, e_1, e_2, . . .} form an orthonormal set (scaled by √L).
To show these functions form a basis we have to verify that they span the space L^2([a, b]). In other words, we must show that any function f ∈ L^2([a, b]) can be written as an infinite linear combination
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e_n(x) = \sum_{n=-\infty}^{\infty} c_n e^{2\pi inx/L}.$$
This is a bit tricky, since it involves infinite series of functions. For a finite dimensional space, to show that an orthogonal set forms a basis, it suffices to count that there are the same number of elements in an orthogonal set as there are dimensions in the space. For an infinite dimensional space this is no longer true. For example, the set of e_n's with n even is also an infinite orthonormal set, but it doesn't span all of L^2([a, b]).
In this course, we will simply accept that it is true that {. . . , e_{−2}, e_{−1}, e_0, e_1, e_2, . . .} span L^2([a, b]).
Once we accept this fact, it is very easy to compute the coefficients in a Fourier expansion. The procedure is the same as in finite dimensions. Starting with
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e_n(x)$$
we simply take the inner product of both sides with e_m. The only term in the infinite sum that survives is the one with n = m. Thus
$$\langle e_m, f\rangle = \sum_{n=-\infty}^{\infty} c_n \langle e_m, e_n\rangle = c_m L$$
and we obtain the formula
$$c_m = \frac{1}{L}\int_a^b e^{-2\pi imx/L} f(x)\,dx$$
III.4.3. Real form of the Fourier series
Fourier series are often written in terms of sines and cosines as
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\big(a_n\cos(2\pi nx/L) + b_n\sin(2\pi nx/L)\big)$$
To obtain this form, recall that
$$e^{\pm 2\pi inx/L} = \cos(2\pi nx/L) \pm i\sin(2\pi nx/L)$$
Using this formula we find
$$\begin{aligned}\sum_{n=-\infty}^{\infty} c_n e^{2\pi inx/L} &= c_0 + \sum_{n=1}^{\infty} c_n e^{2\pi inx/L} + \sum_{n=1}^{\infty} c_{-n} e^{-2\pi inx/L}\\ &= c_0 + \sum_{n=1}^{\infty} c_n\big(\cos(2\pi nx/L) + i\sin(2\pi nx/L)\big) + \sum_{n=1}^{\infty} c_{-n}\big(\cos(2\pi nx/L) - i\sin(2\pi nx/L)\big)\\ &= c_0 + \sum_{n=1}^{\infty}\big((c_n + c_{-n})\cos(2\pi nx/L) + i(c_n - c_{-n})\sin(2\pi nx/L)\big)\end{aligned}$$
Thus the real form of the Fourier series holds with
$$a_0 = 2c_0, \qquad a_n = c_n + c_{-n} \quad\text{for } n > 0, \qquad b_n = ic_n - ic_{-n} \quad\text{for } n > 0.$$
Equivalently
$$c_0 = \frac{a_0}{2}, \qquad c_n = \frac{a_n}{2} + \frac{b_n}{2i} \quad\text{for } n > 0, \qquad c_n = \frac{a_{-n}}{2} - \frac{b_{-n}}{2i} \quad\text{for } n < 0.$$
The coefficients a_n and b_n in the real form of the Fourier series can also be obtained directly. The set of functions
$$\{1/2,\ \cos(2\pi x/L),\ \cos(4\pi x/L),\ \cos(6\pi x/L),\ \ldots,\ \sin(2\pi x/L),\ \sin(4\pi x/L),\ \sin(6\pi x/L),\ \ldots\}$$
also forms an orthogonal basis where each vector has norm √(L/2). This leads to the formulas
$$a_n = \frac{2}{L}\int_a^b \cos(2\pi nx/L)\, f(x)\,dx$$
for n = 0, 1, 2, . . . and
$$b_n = \frac{2}{L}\int_a^b \sin(2\pi nx/L)\, f(x)\,dx$$
for n = 1, 2, . . .. The desire to have the formula for a_n work out for n = 0 is the reason for dividing by 2 in the constant term a_0/2 in the real form of the Fourier series.
One advantage of the real form of the Fourier series is that if f (x) is a real valued function,
then the coefficients an and bn are real too, and the Fourier series doesn’t involve any complex
numbers. However, it is often easier to calculate the coefficients cn because exponentials are easier
to integrate than sines and cosines.
III.4.4. An example
Let's compute the Fourier coefficients for the square wave function. In this example L = 1.
$$f(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1/2\\ -1 & \text{if } 1/2 < x \le 1\end{cases}$$
If n = 0 then e^{−i2πnx} = e^0 = 1 so c_0 is simply the integral of f.
$$c_0 = \int_0^1 f(x)\,dx = \int_0^{1/2} 1\,dx - \int_{1/2}^1 1\,dx = 0$$
Otherwise, we have
$$\begin{aligned}c_n &= \int_0^1 e^{-i2\pi nx} f(x)\,dx = \int_0^{1/2} e^{-i2\pi nx}\,dx - \int_{1/2}^1 e^{-i2\pi nx}\,dx\\ &= \left.\frac{e^{-i2\pi nx}}{-i2\pi n}\right|_{x=0}^{x=1/2} - \left.\frac{e^{-i2\pi nx}}{-i2\pi n}\right|_{x=1/2}^{x=1}\\ &= \frac{2 - 2e^{i\pi n}}{2\pi i n}\\ &= \begin{cases} 0 & \text{if } n \text{ is even}\\ 2/(i\pi n) & \text{if } n \text{ is odd}\end{cases}\end{aligned}$$
Thus we conclude that
$$f(x) = \sum_{\substack{n=-\infty\\ n\ \mathrm{odd}}}^{\infty} \frac{2}{i\pi n}\, e^{i2\pi nx}$$
To see how well this series is approximating f(x) we go back to the real form of the series. Using a_n = c_n + c_{−n} and b_n = ic_n − ic_{−n} we find that a_n = 0 for all n, b_n = 0 for n even and b_n = 4/(πn) for n odd. Thus
$$f(x) = \sum_{\substack{n=1\\ n\ \mathrm{odd}}}^{\infty} \frac{4}{\pi n}\sin(2\pi nx) = \sum_{n=0}^{\infty} \frac{4}{\pi(2n+1)}\sin(2\pi(2n+1)x)$$
We can use MATLAB/Octave to see how well this series is converging. The file ftdemo1.m
contains a function that takes an integer N as an argument and plots the sum of the first 2N + 1
terms in the Fourier series above. Here is a listing:
function ftdemo1(N)
X=linspace(0,1,1000);
F=zeros(1,1000);
for n=[0:N]
F = F + 4*sin(2*pi*(2*n+1)*X)/(pi*(2*n+1));
end
plot(X,F)
end
Here are the outputs for N = 0, 1, 2, 10, 50:
[Plots of the partial sums of the Fourier series for N = 0, 1, 2, 10 and 50.]
III.4.5. Parseval’s formula
If v_1, v_2, . . . , v_n is an orthonormal basis in a finite dimensional vector space and the vector v has the expansion
$$v = c_1 v_1 + \cdots + c_n v_n = \sum_{i=1}^n c_i v_i$$
then, taking the inner product of v with itself, and using the fact that the basis is orthonormal, we obtain
$$\langle v, v\rangle = \sum_{i=1}^n\sum_{j=1}^n \bar c_i c_j \langle v_i, v_j\rangle = \sum_{i=1}^n |c_i|^2$$
The same formula is true in Hilbert space. If
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e_n(x)$$
then
$$\int_0^1 |f(x)|^2\,dx = \langle f, f\rangle = \sum_{n=-\infty}^{\infty} |c_n|^2$$
In the example above, we have $\langle f, f\rangle = \int_0^1 1\,dx = 1$ so we obtain
$$1 = \sum_{\substack{n=-\infty\\ n\ \mathrm{odd}}}^{\infty} \frac{4}{\pi^2 n^2} = \frac{8}{\pi^2}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2}$$
or
$$\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} = \frac{\pi^2}{8}$$
III.4.6. Interpretation of Fourier series
What is the meaning of a Fourier series in a practical example? Consider the sound made by a
musical instrument in a time interval [0, T ]. This sound can be represented by a function y(t) for
t ∈ [0, T ], where y(t) is the air pressure at a point in space, for example, at your eardrum.
A complex exponential e^{2πiωt} = cos(2πωt) + i sin(2πωt) can be thought of as a pure oscillation with frequency ω. It is a periodic function whose values are repeated when t increases by ω^{-1}. If t has units of time (seconds) then ω has units of Hertz (cycles per second). In other words, in one second the function e^{2πiωt} cycles through its values ω times.
The Fourier basis functions can be written as e^{2πiω_n t} with ω_n = n/T. Thus Fourier's theorem states that for t ∈ [0, T]
$$y(t) = \sum_{n=-\infty}^{\infty} c_n e^{2\pi i\omega_n t}.$$
In other words, the audio signal y(t) can be synthesized as a superposition of pure oscillations with
frequencies ωn = n/T . The coefficients cn describe how much of the frequency ωn is present in
the signal. More precisely, writing the complex number cn as cn = |cn |e2πiτn we have cn e2πiωn t =
|cn |e2πi(ωn t+τn ) . Thus |cn | represents the amplitude of the oscillation with frequency ωn while τn
represents a phase shift.
A frequency-amplitude plot for y(t) is a plot of the points (ωn , |cn |). It should be thought of as
a graph of the amplitude as a function of frequency and gives a visual representation of how much
of each frequency is present in the signal.
If y(t) is defined for all values of t we can use any interval that we want and expand the restriction
of y(t) to this interval. Notice that the frequencies ωn = n/T in the expansion will be different for
different values of T .
Example: Let’s illustrate this with the function y(t) = e2πit and intervals [0, T ]. This function is
itself a pure oscillation with frequency ω = 1. So at first glance one would expect that there will be
only one term in the Fourier expansion. This will turn out to be correct if number 1 is one of the
available frequencies, that is, if there is some value of n for which ωn = n/T = 1. (This happens
if T is an integer.) Otherwise, it is still possible to reconstruct y(t), but more frequencies will be
required. In this case we would expect that |cn | should be large for ωn close to 1. Let’s do the
calculation. Fix T . Let’s first consider the case when T is an integer. Then
c_n = \frac{1}{T}\int_0^T e^{-2\pi int/T} e^{2\pi it}\,dt = \frac{1}{T}\int_0^T e^{2\pi i(1-n/T)t}\,dt
    = \begin{cases} 1 & n = T \\ \dfrac{1}{2T\pi i(1-n/T)}\left(e^{2\pi i(T-n)} - 1\right) = 0 & n \ne T, \end{cases}
as expected. Now let’s look at what happens when T is not an integer. Then
c_n = \frac{1}{T}\int_0^T e^{-2\pi int/T} e^{2\pi it}\,dt = \frac{e^{2\pi i(T-n)} - 1}{2\pi i(T-n)}
A calculation (that we leave as an exercise) results in
|c_n| = \frac{\sqrt{2 - 2\cos(2\pi T(1-\omega_n))}}{2\pi T\,|1-\omega_n|}
We can use MATLAB/Octave to do an amplitude-frequency plot. Here are the commands for
T = 10.5 and T = 100.5
N=[-200:200];
T=10.5;
omega=N/T;                      % frequencies omega_n = n/T
absc=sqrt(2-2*cos(2*pi*T*(1-omega)))./(2*pi*T*abs(1-omega));
plot(omega,absc)
T=100.5;
omega=N/T;
absc=sqrt(2-2*cos(2*pi*T*(1-omega)))./(2*pi*T*abs(1-omega));
hold on;
plot(omega,absc, 'r')
Here is the result
[Plot of |c_n| against ω_n = n/T for T = 10.5 and T = 100.5 (red), over the frequency range -20 to 20; both curves are sharply peaked near ω = 1.]
As expected, the values of |cn | are largest when ωn is close to 1.
Let us return to the sound made by a musical instrument, represented by a function y(t) for
t ∈ [0, T ]. The frequency content of the sound is captured by the infinite Fourier series and can
be displayed using a frequency-amplitude plot. In practical situations, though, we cannot measure
y(t) for infinitely many t values, but must sample this function at a discrete set of t values. How
can we perform a frequency analysis with this finite sample? To do this, we will use the discrete
Fourier transform, which is the subject of the next section.
III.5. The Discrete Fourier Transform
Prerequisites and Learning Goals
After completing this section, you should be able to
• Explain why the vectors in C^n obtained by sampling the exponential Fourier basis functions e_n(t)
form an orthogonal basis for C^n (the discrete Fourier basis).
• Use the discrete Fourier basis to expand a vector in C^n, obtaining the discrete Fourier transform of the vector; recognize the matrix that implements the discrete Fourier transform as
a unitary matrix.
• Use the Fast Fourier transform (fft) algorithm to compute the discrete Fourier transform, and
explain why the Fast Fourier transform algorithm is a faster method. You should be able
to perform Fourier transform computations by executing and interpreting the output of the
MATLAB/Octave fft command.
• Explain the relation between the coefficients in the Fourier series of a function f defined
on [0, L] and the coefficients in the discrete Fourier transform of the corresponding sampled
values of f , and discuss its limitations.
• Construct a frequency-amplitude plot for a sampled signal using MATLAB/Octave; give a
physical interpretation of the resulting plot; explain the relation between this plot and the
infinite frequency-amplitude plot.
III.5.1. Definition
In the previous section we saw that the functions ek (x) = e2πikx for k ∈ Z form an infinite
orthonormal basis for the Hilbert space of functions L2 ([0, 1]). Now we will introduce a discrete,
finite dimensional version of this basis.
To motivate the definition of this basis, imagine taking a function defined on the interval [0, 1]
and sampling it at the N points 0, 1/N, 2/N, . . . , j/N, . . . , (N − 1)/N . If we do this to the basis
functions e_k(x) we end up with vectors e_k given by
e_k = \begin{pmatrix} e^{2\pi i\cdot 0\cdot k/N} \\ e^{2\pi ik/N} \\ e^{2\pi i2k/N} \\ \vdots \\ e^{2\pi i(N-1)k/N} \end{pmatrix} = \begin{pmatrix} 1 \\ \omega_N^{k} \\ \omega_N^{2k} \\ \vdots \\ \omega_N^{(N-1)k} \end{pmatrix}
where
\omega_N = e^{2\pi i/N}
The complex number ω_N lies on the unit circle, that is, |ω_N| = 1. Moreover ω_N is a primitive Nth
root of unity. This means that ω_N^N = 1 and ω_N^j ≠ 1 unless j is a multiple of N.
Because ω_N^{k+N} = ω_N^k ω_N^N = ω_N^k we see that e_{k+N} = e_k. Thus, although the vectors e_k are defined
for every integer k, they start repeating themselves after N steps. Thus there are only N distinct
vectors, e_0, e_1, . . . , e_{N-1}.
These vectors, ek for k = 0, . . . , N − 1 form an orthogonal basis for CN . To see this we use the
formula for the sum of a geometric series:
\sum_{j=0}^{N-1} r^j = \begin{cases} N & r = 1 \\ \dfrac{1 - r^N}{1 - r} & r \ne 1 \end{cases}
Using this formula, we compute
\langle e_k, e_l\rangle = \sum_{j=0}^{N-1} \overline{\omega_N^{kj}}\, \omega_N^{lj} = \sum_{j=0}^{N-1} \omega_N^{(l-k)j} = \begin{cases} N & l = k \\ \dfrac{1 - \omega_N^{(l-k)N}}{1 - \omega_N^{l-k}} = 0 & l \ne k \end{cases}
Now we can expand any vector f ∈ CN in this basis. Actually, to make our discrete Fourier
transform agree with MATLAB/Octave we divide each basis vector by N . Then we obtain
f = \frac{1}{N}\sum_{j=0}^{N-1} c_j e_j
where
c_k = \langle e_k, f\rangle = \sum_{j=0}^{N-1} e^{-2\pi ikj/N} f_j
The map that sends the vector f to the vector of coefficients c = [c_0, . . . , c_{N-1}]^T is the discrete
Fourier transform. We can write this in matrix form as
c = Ff,    f = F^{-1}c
where the matrix F^{-1} has the vectors e_k, divided by N, as its columns. Since these vectors are an
orthogonal basis, the inverse is the conjugate transpose, up to a factor of N. Explicitly
F = \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \overline{\omega}_N & \overline{\omega}_N^{2} & \cdots & \overline{\omega}_N^{N-1} \\
1 & \overline{\omega}_N^{2} & \overline{\omega}_N^{4} & \cdots & \overline{\omega}_N^{2(N-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \overline{\omega}_N^{N-1} & \overline{\omega}_N^{2(N-1)} & \cdots & \overline{\omega}_N^{(N-1)(N-1)}
\end{pmatrix}
and
F^{-1} = \frac{1}{N}\begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \omega_N & \omega_N^{2} & \cdots & \omega_N^{N-1} \\
1 & \omega_N^{2} & \omega_N^{4} & \cdots & \omega_N^{2(N-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \cdots & \omega_N^{(N-1)(N-1)}
\end{pmatrix}
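A quick check in MATLAB/Octave that this convention matches the built-in fft (a minimal sketch for N = 4; the variable names are ours):
N = 4;
w = exp(2i*pi/N);                 % omega_N
[J,K] = meshgrid(0:N-1, 0:N-1);   % K indexes rows (k), J indexes columns (j)
F = conj(w).^(K.*J);              % F(k+1,j+1) = (conjugate of omega_N)^(k*j)
f = [1; 2; 3; 4];
[F*f, fft(f)]                     % the two columns agree up to rounding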
The matrix F̃ = N −1/2 F is a unitary matrix (F̃ −1 = F̃ ∗ ). Recall that unitary matrices preserve
the length of complex vectors. This implies that the lengths of the vectors f = [f0 , f1 , . . . , fN −1 ]
and c = [c0 , c1 , . . . , cN −1 ] are related by
‖c‖^2 = N‖f‖^2
or
\sum_{k=0}^{N-1} |c_k|^2 = N \sum_{k=0}^{N-1} |f_k|^2
This is the discrete version of Parseval’s formula.
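This identity can also be checked numerically (a sketch; the data vector is an arbitrary choice):
f = rand(1,8);
c = fft(f);
sum(abs(c).^2) - 8*sum(abs(f).^2)    % zero up to rounding error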
III.5.2. The Fast Fourier transform
Multiplying an N × N matrix with a vector of length N normally requires N^2 multiplications,
since each entry of the product requires N multiplications, and there are N entries. It turns out that the discrete
Fourier transform, that is, multiplication by the matrix F , can be carried out using only N log2 (N )
multiplications (at least if N is a power of 2). The algorithm that achieves this is called the Fast
Fourier Transform, or FFT. This represents a tremendous saving in time: calculations that would
require weeks of computer time can be carried out in seconds.
The basic idea of the FFT is to break the sum defining the Fourier coefficients ck into a sum
of the even terms and a sum of the odd terms. Each of these turns out to be (up to a factor we
can compute) a discrete Fourier transform of half the length. This idea is then applied recursively.
Starting with N = 2n and halving the size of the Fourier transform at each step, it takes n = log2 (N )
steps to arrive at Fourier transforms of length 1. This is where the log2 (N ) comes in.
To simplify the notation, we will ignore the factor of 1/N in the definition of the discrete Fourier
transform (so one should divide by N at the end of the calculation.) We now also assume that
N = 2^n
so that we can divide N by 2 repeatedly. The basic formula, splitting the sum for ck into a sum
over odd and even j’s is
c_k = \sum_{j=0}^{N-1} e^{-i2\pi kj/N} f_j
    = \sum_{\substack{j=0 \\ j \text{ even}}}^{N-1} e^{-i2\pi kj/N} f_j + \sum_{\substack{j=0 \\ j \text{ odd}}}^{N-1} e^{-i2\pi kj/N} f_j
    = \sum_{j=0}^{N/2-1} e^{-i2\pi k(2j)/N} f_{2j} + \sum_{j=0}^{N/2-1} e^{-i2\pi k(2j+1)/N} f_{2j+1}
    = \sum_{j=0}^{N/2-1} e^{-i2\pi kj/(N/2)} f_{2j} + e^{-i2\pi k/N} \sum_{j=0}^{N/2-1} e^{-i2\pi kj/(N/2)} f_{2j+1}
Notice that the two sums on the right are discrete Fourier transforms of length N/2.
To continue, it is useful to write the integers j in base 2. Let's assume that N = 2^3 = 8. Once
you understand this case, the general case N = 2^n will be easy. Recall that
0 = 000,  1 = 001,  2 = 010,  3 = 011,  4 = 100,  5 = 101,  6 = 110,  7 = 111   (base 2)
The even j’s are the ones whose binary expansions have the form ∗ ∗ 0, while the odd j’s have
binary expansions of the form ∗ ∗ 1.
For any pattern of bits like ∗ ∗ 0, I will use the notation F <pattern> to denote the discrete Fourier
transform where the input data is given by all the fj ’s whose j’s have binary expansion fitting the
pattern. Here are some examples. To start, Fk∗∗∗ = ck is the original discrete Fourier transform,
since every j fits the pattern ∗ ∗ ∗. In this example k ranges over 0, . . . , 7, that is, the values start
repeating after that.
Only even j’s fit the pattern ∗ ∗ 0, so F ∗∗0 is the discrete Fourier transform of the even j’s given
by
F_k^{**0} = \sum_{j=0}^{N/2-1} e^{-i2\pi kj/(N/2)} f_{2j}.
Here k runs from 0 to 3 before the values start repeating. Similarly, F ∗00 is a transform of length
N/4 = 2 given by
F_k^{*00} = \sum_{j=0}^{N/4-1} e^{-i2\pi kj/(N/4)} f_{4j}.
In this case k = 0, 1 and then the values repeat. Finally, the only j matching the pattern 010 is
j = 2, so F_k^{010} is a transform of length one, given by
F_k^{010} = \sum_{j=0}^{N/8-1} e^{-i2\pi kj/(N/8)} f_2 = \sum_{j=0}^{0} e^{0} f_2 = f_2
With this notation, the basic even–odd formula can be written
F_k^{***} = F_k^{**0} + \overline{\omega}_N^{\,k} F_k^{**1}.
Recall that ω_N = e^{i2π/N}, so \overline{\omega}_N = e^{-i2π/N}.
Let's look at this equation when k = 0. We will represent the formula by the following diagram.
[Diagram: F_0^{**0} and F_0^{**1} are combined, the latter multiplied by \overline{\omega}_N^{\,0}, to give F_0^{***}.]
This diagram means that F_0^{***} is obtained by adding F_0^{**0} to \overline{\omega}_N^{\,0} F_0^{**1}. (Of course \overline{\omega}_N^{\,0} = 1 so we
could omit it.) Now let's add the diagrams for k = 1, 2, 3.
[Diagram: for k = 0, 1, 2, 3 the outputs F_k^{***} are formed as F_k^{**0} + \overline{\omega}_N^{\,k} F_k^{**1} from the two length-4 transforms F^{**0} and F^{**1}.]
Now when we get to k = 4, we recall that F^{**0} and F^{**1} are discrete transforms of length
N/2 = 4. Therefore, by periodicity F_4^{**0} = F_0^{**0}, F_5^{**0} = F_1^{**0}, and so on. So in the formula
F_4^{***} = F_4^{**0} + \overline{\omega}_N^{\,4} F_4^{**1} we may replace F_4^{**0} and F_4^{**1} with F_0^{**0} and F_0^{**1} respectively. Making
such replacements, we complete the first part of the diagram as follows.
[Diagram: the last stage of the length-8 transform. The outputs F_k^{***}, k = 0, . . . , 7, are obtained from F_{k mod 4}^{**0} and F_{k mod 4}^{**1} with weights \overline{\omega}_N^{\,k}.]
To move to the next level we analyze the discrete Fourier transforms on the left of this diagram
in the same way. This time we use the basic formula for the transform of length N/2, namely
F_k^{**0} = F_k^{*00} + \overline{\omega}_{N/2}^{\,k} F_k^{*10}
and
F_k^{**1} = F_k^{*01} + \overline{\omega}_{N/2}^{\,k} F_k^{*11}.
The resulting diagram shows how to go from the length two transforms to the final transform on
the right.
[Diagram: the middle stage added. The length-2 transforms F^{*00}, F^{*10}, F^{*01}, F^{*11} are combined with weights \overline{\omega}_{N/2}^{\,k} into the length-4 transforms F^{**0} and F^{**1}, which are then combined with weights \overline{\omega}_N^{\,k} into the outputs F_k^{***}.]
Now we go down one more level. Each transform of length two can be constructed from transforms
of length one, i.e., from the original data in some order. We complete the diagram as follows. Here
we have inserted the value N = 8.
[Diagram: the complete diagram for N = 8. The inputs appear in bit reversed order, f_0, f_4, f_2, f_6, f_1, f_5, f_3, f_7, identified with the length-1 transforms F_0^{000}, F_0^{100}, F_0^{010}, F_0^{110}, F_0^{001}, F_0^{101}, F_0^{011}, F_0^{111}. They are combined pairwise with weights \overline{\omega}_2^{\,k}, then with weights \overline{\omega}_4^{\,k}, and finally with weights \overline{\omega}_8^{\,k} to give the outputs F_k^{***} = c_k, k = 0, . . . , 7.]
Notice that the fj ’s on the left of the diagram are in bit reversed order. In other words, if we
reverse the order of the bits in the binary expansion of the j’s, the resulting numbers are ordered
from 0 (000) to 7 (111).
Now we can describe the algorithm for the fast Fourier transform. Starting with the original
data [f_0, . . . , f_7] we arrange the values in bit reversed order. Then we combine them pairwise, as
indicated by the left side of the diagram, to form the transforms of length 2. To do this we
need to compute \overline{\omega}_2 = e^{-iπ} = -1. Next we combine the transforms of length 2 according to the
middle part of the diagram to form the transforms of length 4. Here we use that \overline{\omega}_4 = e^{-iπ/2} = -i.
Finally we combine the transforms of length 4 to obtain the transform of length 8. Here we need
\overline{\omega}_8 = e^{-iπ/4} = 2^{-1/2} - i2^{-1/2}.
The algorithm for values of N other than 8 is entirely analogous. For N = 2 or 4 we stop
at the first or second stage. For larger values of N = 2^n we simply add more stages. How
many multiplications do we need to do? Well, there are N = 2^n multiplications per stage of the
algorithm (one for each circle on the diagram), and there are n = log_2(N) stages. So the number
of multiplications is 2^n n = N log_2(N).
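Here is a minimal recursive implementation of this even-odd splitting in MATLAB/Octave (a sketch only: the function name myfft is ours, it expects a row vector whose length is a power of 2, and it uses the same convention as fft, with no factor of 1/N):
function c = myfft(f)
N = length(f);
if N == 1
  c = f;                           % a transform of length one is the data itself
else
  ce = myfft(f(1:2:N-1));          % transform of the even-indexed samples f_0, f_2, ...
  co = myfft(f(2:2:N));            % transform of the odd-indexed samples f_1, f_3, ...
  w = exp(-2i*pi*(0:N/2-1)/N);     % the weights (conjugate of omega_N)^k
  c = [ce + w.*co, ce - w.*co];    % uses (conjugate of omega_N)^(k+N/2) = -(conjugate of omega_N)^k
end
end
Calling myfft([1 2 3 4]) reproduces the output of fft([1 2 3 4]) computed below.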
As an example let us compute the discrete Fourier transform with N = 4 of the data [f0 , f1 , f2 , f3 ] =
[1, 2, 3, 4]. First we compute the bit reversed order of 0 = (00), 1 = (01), 2 = (10), 3 = (11) to be
(00) = 0, (10) = 2, (01) = 1, (11) = 3. We then do the rest of the computation as follows.
f_0 = 1 and f_2 = 3 combine into the length-2 transform of the even-indexed data: 1 + 3 = 4 and 1 - 3 = -2.
f_1 = 2 and f_3 = 4 combine into the length-2 transform of the odd-indexed data: 2 + 4 = 6 and 2 - 4 = -2.
Combining these two transforms with the weights \overline{\omega}_4^{\,k} = 1, -i, -1, i then gives
c_0 = 4 + (1)(6) = 10
c_1 = -2 + (-i)(-2) = -2 + 2i
c_2 = 4 + (-1)(6) = -2
c_3 = -2 + (i)(-2) = -2 - 2i
The MATLAB/Octave command for computing the fast Fourier transform is fft. Let’s verify
the computation above.
> fft([1 2 3 4])
ans =
   10 + 0i   -2 + 2i   -2 + 0i   -2 - 2i
The inverse fft is computed using ifft.
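For example, ifft(fft([1 2 3 4])) returns the original data [1 2 3 4] (up to rounding).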
III.5.3. A frequency-amplitude plot for a sampled audio signal
Recall that a frequency-amplitude plot for the function y(t) defined on the interval [0, T ] is a plot
of the points (ωn , |cn |), where ωn = n/T and cn are the numbers appearing in the Fourier series
y(t) = \sum_{n=-\infty}^{\infty} c_n e^{2\pi i\omega_n t} = \sum_{n=-\infty}^{\infty} c_n e^{2\pi int/T}
If y(t) represents the sound of a musical instrument, then the frequency-amplitude plot gives a
visual representation of the strengths of the various frequencies present in the sound.
Of course, for an actual instrument there is no formula for y(t) and the best we can do is to
sample this function at a discrete set of points. Let tj = jT /N for j = 0, . . . , N − 1 be N equally
spaced points, and let y_j = y(t_j) be the sampled values of y(t). Put the results in a vector
y = [y_0, y_1, . . . , y_{N-1}]^T. How can we make an approximate frequency-amplitude plot with this
information?
The key is to realize that the coefficients in the discrete Fourier transform of y can be used to
approximate the Fourier series coefficients cn . To see this, do a Riemann sum approximation of the
integral in the formula for cn . Using the equally spaced points tj with ∆tj = T /N we obtain
c_n = \frac{1}{T}\int_0^T e^{-2\pi int/T} y(t)\,dt
    \approx \frac{1}{T}\sum_{j=0}^{N-1} e^{-2\pi int_j/T} y(t_j)\,\Delta t_j
    = \frac{1}{T}\cdot\frac{T}{N}\sum_{j=0}^{N-1} e^{-2\pi inj/N} y_j
    = \frac{1}{N}\tilde{c}_n
where \tilde{c}_n is the nth coefficient in the discrete Fourier transform of y.
The frequency corresponding to c_n is n/T. So, for an approximate frequency-amplitude plot, we
can plot the points (n/T, |c̃_n|/N). Typically we are not given T but rather the vector y of samples,
from which we may determine the number N of samples, and the sampling frequency F_s = N/T.
Then the points to be plotted can also be written as (nF_s/N, |c̃_n|/N).
It is important to realize that the approximation cn ≈ c̃n /N is only good for small n. The reason
is that the Riemann sum will do a worse job in approximating the integral when the integrand is
oscillating rapidly, that is, when n is large. So we should only plot a restricted range of n. In fact,
it never makes sense to plot more than N/2 points. To see why, recall the formula
\tilde{c}_n = \sum_{j=0}^{N-1} e^{-2\pi inj/N} y_j.
Notice that, although in the discrete Fourier transform n ranges from 0 to N - 1, the formula for
c̃_n makes sense for any integer n. With this extended definition of c̃_n, (1) c̃_{n+N} = c̃_n, and (2) for
y real valued, c̃_{-n} = \overline{c̃_n}. Relation (2) implies that |c̃_{-n}| = |c̃_n|, so that the plot of |c̃_n| is symmetric
about n = 0. But there is also a symmetry about n = N/2, since using (2) and then (1) we find
|c̃_{N/2+k}| = |c̃_{-N/2-k}| = |c̃_{N/2-k}|
Here is a typical plot of |c̃n | for N = 8 illustrating the two lines of symmetry.
[Plot: the values of |c̃_n| for N = 8, shown for n = -4, . . . , 9; the plot is symmetric about n = 0 and about n = N/2 = 4.]
The coefficients cn for the Fourier series obey the symmetry (2) but not (1) so if we were to add
these to the plot (using the symbol ◦) the result might look like this:
[Plot: the same values of |c̃_n| together with the Fourier coefficients |c_n| (marked ◦); the two agree near n = 0 but not for larger |n|, and the |c_n| do not have the periodicity (1).]
So we see that |c̃_7| should be thought of as an approximation for |c_{-1}| rather than for |c_7|.
To further compare the meanings of the coefficients cn and c̃n it is instructive to consider the
formulas (both exact) for the Fourier series and the discrete Fourier transform for yj = y(tj ):
y_j = \frac{1}{N}\sum_{n=0}^{N-1} \tilde{c}_n e^{2\pi inj/N}
y(t_j) = \sum_{n=-\infty}^{\infty} c_n e^{2\pi int_j/T} = \sum_{n=-\infty}^{\infty} c_n e^{2\pi inj/N}
The coefficients cn and c̃n /N are close for n close to 0, but then their values must diverge so that
the infinite sum and the finite sum above both give the same answer.
Now let’s try and make a frequency amplitude plot using MATLAB/Octave for a sampled flute
contained in the audio file F6.baroque.au available at
http://www.phys.unsw.edu.au/music/flute/baroque/sounds/F6.baroque.au.
This file contains a sampled baroque flute playing the note F6 , which has a frequency of 1396.91
Hz. The sampling rate is Fs = 22050 samples/second.
Audio processing is one area where MATLAB and Octave are different. The Octave code to load
the file F6.baroque.au is
y=loadaudio(’F6.baroque’,’au’,8);
while the MATLAB code is
y=auread(’F6.baroque.au’);
After this step the sampled values are loaded in the vector y. Now we compute the FFT of y and
store the resulting values c̃n in a vector tildec. Then we compute a vector omega containing the
frequencies and make a plot of these frequencies against |c̃n |/N . We plot the first Nmax=N/4 values.
tildec = fft(y);                       % discrete Fourier transform of the samples
N=length(y);
Fs=22050;                              % sampling frequency in samples/second
omega=[0:N-1]*Fs/N;                    % frequencies n*Fs/N
Nmax=floor(N/4);
plot(omega(1:Nmax), abs(tildec(1:Nmax)/N));
Here is the result.
[Frequency-amplitude plot for the flute sample, frequencies 0 to about 5000 Hz: a large spike near 1396 Hz and much smaller spikes at higher frequencies.]
Notice the large spike at ω ≈ 1396 corresponding to the note F6 . Smaller spikes appear at the
overtone series, but evidently these are quite small for a flute.
Chapter IV
Eigenvalues and Eigenvectors
IV.1. Eigenvalues and Eigenvectors
Prerequisites and Learning Goals
After completing this section, you should be able to
• Write down the definition of eigenvalues and eigenvectors and compute them using the standard procedure involving finding the roots of the characteristic polynomial. You should be
able to perform relevant calculations by hand or using specific MATLAB/Octave commands
such as poly, roots, and eig.
• Define algebraic and geometric multiplicities of eigenvalues and eigenvectors; discuss when it
is possible to find a set of eigenvectors that form a basis.
• Determine when a matrix is diagonalizable and use eigenvalues and eigenvectors to perform
matrix diagonalization.
• Recognize the form of the Jordan Canonical Form for non-diagonalizable matrices.
• Explain the relationship between eigenvalues and the determinant and trace of a matrix.
• Use eigenvalues to compute powers of a diagonalizable matrix.
IV.1.1. Definition
Let A be an n × n matrix. A number λ and non-zero vector v are an eigenvalue eigenvector pair
for A if
Av = λv
Although v is required to be nonzero, λ = 0 is possible. If v is an eigenvector, so is sv for any
number s 6= 0.
Rewrite the eigenvalue equation as
(λI − A)v = 0
Then we see that v is a non-zero vector in the nullspace N (λI − A). Such a vector only exists if
λI − A is a singular matrix, or equivalently if
det(λI − A) = 0
IV.1.2. Standard procedure
This leads to the standard textbook method of finding eigenvalues. The function of λ defined by
p(λ) = det(λI − A) is a polynomial of degree n, called the characteristic polynomial, whose zeros
are the eigenvalues. So the standard procedure is:
• Compute the characteristic polynomial p(λ)
• Find all the zeros (roots) of p(λ). This is equivalent to completely factoring p(λ) as
p(λ) = (λ − λ1 )(λ − λ2 ) · · · (λ − λn )
Such a factorization always exists if we allow the possibility that the zeros λ1 , λ2 , . . . are
complex numbers. But it may be hard to find. In this factorization there may be repetitions
in the λi ’s. The number of times a λi is repeated is called its algebraic multiplicity.
• For each distinct λ_i find N(λ_i I - A), that is, all the solutions to
(λ_i I - A)v = 0
The non-zero solutions are the eigenvectors for λi .
IV.1.3. Example 1
This is the typical case where all the eigenvalues are distinct. Let
A = \begin{pmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{pmatrix}
Then, expanding the determinant, we find
det(λI - A) = λ^3 - 12λ^2 + 44λ - 48
This can be factored as
λ^3 - 12λ^2 + 44λ - 48 = (λ - 2)(λ - 4)(λ - 6)
So the eigenvalues are 2, 4 and 6.
These steps can be done with MATLAB/Octave using poly and roots. If A is a square matrix,
the command poly(A) computes the characteristic polynomial, or rather, its coefficients.
> A=[3 -6 -7; 1 8 5; -1 -2 1];
> p=poly(A)
p =
   1.0000  -12.0000   44.0000  -48.0000
Recall that the coefficient of the highest power comes first. The function roots takes as input a
vector representing the coefficients of a polynomial and returns the roots.
>roots(p)
ans =
6.0000
4.0000
2.0000
To find the eigenvector(s) for λ1 = 2 we must solve the homogeneous equation (2I − A)v = 0.
Recall that eye(n) is the n × n identity matrix I
>rref(2*eye(3) - A)
ans =
   1   0  -1
   0   1   1
   0   0   0
From this we can read off the solution
v_1 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}
Similarly we find for λ_2 = 4 and λ_3 = 6 that the corresponding eigenvectors are
v_2 = \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}, \qquad v_3 = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}
The three eigenvectors v_1, v_2 and v_3 are linearly independent and form a basis for R^3.
The MATLAB/Octave command for finding eigenvalues and eigenvectors is eig. The command
eig(A) lists the eigenvalues
>eig(A)
ans =
4.0000
2.0000
6.0000
while the variant [V,D] = eig(A) returns a matrix V whose columns are eigenvectors and a diagonal
matrix D whose diagonal entries are the eigenvalues.
>[V,D] = eig(A)
V =

   5.7735e-01   5.7735e-01  -8.9443e-01
   5.7735e-01  -5.7735e-01   4.4721e-01
  -5.7735e-01   5.7735e-01   2.2043e-16

D =

   4.00000   0.00000   0.00000
   0.00000   2.00000   0.00000
   0.00000   0.00000   6.00000
Notice that the eigenvectors have been normalized to have length one. Also, since they have been
computed numerically, they are not exactly correct. The entry 2.2043e-16 (i.e., 2.2043 × 10−16 )
should actually be zero.
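A quick way to confirm this output is to check the defining relation directly (a sketch; the exact size of the residual will vary):
>norm(A*V - V*D)
The result is a number on the order of machine precision.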
IV.1.4. Example 2
This example has a repeated eigenvalue.
A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & -1 & 1 \end{pmatrix}
The characteristic polynomial is
det(λI - A) = λ^3 - 4λ^2 + 5λ - 2 = (λ - 1)^2(λ - 2)
In this example the eigenvalues are 1 and 2, but the eigenvalue 1 has algebraic multiplicity 2.
To find the eigenvector(s) for λ_1 = 1 we compute
I - A = \begin{pmatrix} 0 & -1 & 0 \\ 0 & -1 & 0 \\ 0 & 1 & 0 \end{pmatrix}
From this it is easy to see that there are two linearly independent eigenvectors for this eigenvalue:
v_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad\text{and}\quad w_1 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
In this case we say that the geometric multiplicity is 2. In general, the geometric multiplicity is
the number of independent eigenvectors, or equivalently the dimension of N(λ_i I - A).
The eigenvalue λ_2 = 2 has eigenvector
v_2 = \begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}
So, although this example has repeated eigenvalues, there still is a basis of eigenvectors.
IV.1.5. Example 3
Here is an example where the geometric multiplicity is less than the algebraic multiplicity. If
A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}
then the characteristic polynomial is
det(λI - A) = (λ - 2)^3
so there is one eigenvalue λ1 = 2 with algebraic multiplicity 3.
To find the eigenvectors we compute
2I - A = \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}
From this we see that there is only one independent solution
v_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
Thus the geometric multiplicity dim(N (2I − A)) is 1. What does MATLAB/Octave do in this
situation?
>A=[2 1 0; 0 2 1; 0 0 2];
>[V D] = eig(A)
V =

   1.00000  -1.00000   1.00000
   0.00000   0.00000  -0.00000
   0.00000   0.00000   0.00000

D =

   2   0   0
   0   2   0
   0   0   2
It simply returned the same eigenvector three times.
In this example, there does not exist a basis of eigenvectors.
IV.1.6. Example 4
Finally, here is an example where the eigenvalues are complex, even though the matrix has real
entries. Let
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
Then
det(λI - A) = λ^2 + 1
which has no real roots. However
λ^2 + 1 = (λ + i)(λ - i)
so the eigenvalues are λ1 = i and λ2 = −i. The eigenvectors are found with the same procedure as
before, except that now we must use complex arithmetic. So for λ_1 = i we compute
iI - A = \begin{pmatrix} i & 1 \\ -1 & i \end{pmatrix}
There is a trick for computing the null space of a singular 2 × 2 matrix. Since the two rows must be
multiples of each other (in this case the second row is i times the first row), we simply need to find
a vector \begin{pmatrix} a \\ b \end{pmatrix} with ia + b = 0. This is easily achieved by flipping the entries in the first row and
changing the sign of one of them. Thus
v_1 = \begin{pmatrix} 1 \\ -i \end{pmatrix}
If a matrix has real entries, then the eigenvalues and eigenvectors occur in conjugate pairs. This
can be seen directly from the eigenvalue equation Av = λv. Taking complex conjugates (and using
that the conjugate of a product is the product of the conjugates) we obtain \bar{A}\bar{v} = \bar{λ}\bar{v}. But if A is
real then \bar{A} = A, so A\bar{v} = \bar{λ}\bar{v}, which shows that \bar{λ} and \bar{v} are also an eigenvalue eigenvector pair.
From this discussion it follows that v_2 is the complex conjugate of v_1:
v_2 = \begin{pmatrix} 1 \\ i \end{pmatrix}
IV.1.7. A basis of eigenvectors
In three of the four examples above the matrix A had a basis of eigenvectors. If all the eigenvalues
are distinct, as in the first example, then the corresponding eigenvectors are always independent
and therefore form a basis.
To see why this is true, suppose A has eigenvalues λ_1, . . . , λ_n that are all distinct, that is, λ_i ≠ λ_j
for i ≠ j. Let v_1, . . . , v_n be the corresponding eigenvectors.
Now, starting with the first two eigenvectors, suppose a linear combination of them equals zero:
c1 v1 + c2 v2 = 0
Multiplying by A and using the fact that these are eigenvectors, we obtain
c1 Av1 + c2 Av2 = c1 λ1 v1 + c2 λ2 v2 = 0
On the other hand, multiplying the original equation by λ2 we obtain
c1 λ2 v1 + c2 λ2 v2 = 0.
Subtracting the equations gives
c1 (λ2 − λ1 )v1 = 0
Since (λ_2 - λ_1) ≠ 0 and, being an eigenvector, v_1 ≠ 0, it must be that c_1 = 0. Now returning to
the original equation we find c2 v2 = 0 which implies that c2 = 0 too. Thus v1 and v2 are linearly
independent.
Now let’s consider three eigenvectors v1 , v2 and v3 . Suppose
c1 v1 + c2 v2 + c3 v3 = 0
As before, we multiply by A to get one equation, then multiply by λ3 to get another equation.
Subtracting the resulting equations gives
c1 (λ1 − λ3 )v1 + c2 (λ2 − λ3 )v2 = 0
But we already know that v_1 and v_2 are independent. Therefore c_1(λ_1 - λ_3) = c_2(λ_2 - λ_3) = 0.
Since λ_1 - λ_3 ≠ 0 and λ_2 - λ_3 ≠ 0 this implies c_1 = c_2 = 0 too. Therefore v_1, v_2 and v_3 are
independent.
Repeating this argument, we eventually find that all the eigenvectors v1 , . . . , vn are independent.
In example 2 above, we saw that it might be possible to have a basis of eigenvectors even when
there are repeated eigenvalues. For some classes of matrices (for example symmetric matrices
(AT = A) or orthogonal matrices) a basis of eigenvectors always exists, whether or not there are
repeated eigenvalues. We will consider this in more detail later in the course.
IV.1.8. When there are not enough eigenvectors
Let's try to understand a little better the exceptional situation where there are not enough eigenvectors to form a basis. Consider
A = \begin{pmatrix} 1 & 1 \\ 0 & 1+ε \end{pmatrix}
When ε = 0 this matrix has a single eigenvalue λ = 1 and only one eigenvector v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. What
happens when we change ε slightly? Then the eigenvalues change to 1 and 1 + ε, and being distinct,
they must have independent eigenvectors. A short calculation reveals that they are
v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 1 \\ ε \end{pmatrix}
These two eigenvectors are almost, but not quite, dependent. When ε becomes zero they collapse
and point in the same direction.
In general, if you start with a matrix with repeated eigenvalues and too few eigenvectors, and
change the entries of the matrix a little, some of the eigenvectors (the ones corresponding to the
eigenvalues whose algebraic multiplicity is higher than the geometric multiplicity) will split into
several eigenvectors that are almost parallel.
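A quick numerical illustration in MATLAB/Octave (the value ε = 10^{-6} is an arbitrary choice):
>epsilon = 1e-6;
>[V, D] = eig([1 1; 0 1+epsilon])
The two columns of V are nearly parallel; they differ only by terms of order ε.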
IV.1.9. Diagonalization
Suppose A is an n × n matrix with eigenvalues λ1 , · · · , λn and a basis of eigenvectors v1 , . . . , vn .
Form the matrix with eigenvectors as columns
S = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
Then
AS = \begin{pmatrix} Av_1 & Av_2 & \cdots & Av_n \end{pmatrix} = \begin{pmatrix} λ_1 v_1 & λ_2 v_2 & \cdots & λ_n v_n \end{pmatrix}
   = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} \begin{pmatrix} λ_1 & 0 & 0 & \cdots & 0 \\ 0 & λ_2 & 0 & \cdots & 0 \\ 0 & 0 & λ_3 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & λ_n \end{pmatrix}
   = SD
where D is the diagonal matrix with the eigenvalues on the diagonal. Since the columns of S are
independent, the inverse exists and we can write
A = SDS^{-1},    S^{-1}AS = D
This is called diagonalization.
Notice that the matrix S is exactly the one returned by the MATLAB/Octave call [S D] = eig(A).
>A=[1 2 3; 4 5 6; 7 8 9];
>[S D] = eig(A);
>S*D*S^(-1)
ans =
   1.0000   2.0000   3.0000
   4.0000   5.0000   6.0000
   7.0000   8.0000   9.0000
IV.1.10. Jordan canonical form
If A is a matrix that cannot be diagonalized, there still exists a similar factorization called the
Jordan Canonical Form. It turns out that any matrix A can be written as
A = SBS^{-1}
where B is a block diagonal matrix. The matrix B has the form
B = \begin{pmatrix} B_1 & 0 & 0 & \cdots & 0 \\ 0 & B_2 & 0 & \cdots & 0 \\ 0 & 0 & B_3 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & B_k \end{pmatrix}
where each submatrix B_i (called a Jordan block) has a single eigenvalue on the diagonal and 1's
on the superdiagonal:
B_i = \begin{pmatrix} λ_i & 1 & 0 & \cdots & 0 \\ 0 & λ_i & 1 & \cdots & 0 \\ 0 & 0 & λ_i & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & λ_i \end{pmatrix}
If all the blocks are of size 1 × 1 then there are no 1’s and the matrix is diagonalizable.
IV.1.11. Eigenvalues, determinant and trace
Recall that the determinant satisfies det(AB) = det(A) det(B) and det(S −1 ) = 1/ det(S). Also, the
determinant of a diagonal matrix (or more generally of an upper triangular matrix) is the product
of the diagonal entries. Thus if A is diagonalizable then
det(A) = det(SDS −1 ) = det(S) det(D) det(S −1 ) = det(S) det(D)/ det(S) = det(D) = λ1 λ2 · · · λn
Thus the determinant of a matrix is the product of the eigenvalues. This is true for non-diagonalizable
matrices as well, as can be seen from the Jordan Canonical Form. Notice that the number of times
a particular λ_i appears in this product is the algebraic multiplicity of that eigenvalue.
The trace of a matrix is the sum of the diagonal entries. If A = [a_{i,j}] then tr(A) = \sum_i a_{i,i}. Even
though it is not true that AB = BA in general, the trace is not sensitive to the change in order:
tr(AB) = \sum_{i,j} a_{i,j} b_{j,i} = \sum_{i,j} b_{j,i} a_{i,j} = tr(BA)
Thus (taking A = SD and B = S −1 )
tr(A) = tr(SDS −1 ) = tr(S −1 SD) = tr(D) = λ1 + λ2 + · · · + λn
Thus the trace of a diagonalizable matrix is the sum of the eigenvalues. Again, this is true for
non-diagonalizable matrices as well, and can be seen from the Jordan Canonical Form.
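These relations are easy to check numerically (a sketch, using the matrix from Example 1):
>A = [3 -6 -7; 1 8 5; -1 -2 1];
>[det(A), prod(eig(A))]     % both equal 48
>[trace(A), sum(eig(A))]    % both equal 12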
IV.1.12. Powers of a diagonalizable matrix
If A is diagonalizable then its powers A^k are easy to compute:
A^k = SDS^{-1}SDS^{-1}SDS^{-1} \cdots SDS^{-1} = SD^kS^{-1}
because all of the factors S^{-1}S cancel. Since powers of the diagonal matrix D are given by
D^k = \begin{pmatrix} λ_1^k & 0 & 0 & \cdots & 0 \\ 0 & λ_2^k & 0 & \cdots & 0 \\ 0 & 0 & λ_3^k & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & λ_n^k \end{pmatrix}
this formula provides an effective way to understand and compute A^k for large k.
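A quick sketch in MATLAB/Octave, reusing the matrix from Example 1 (this is only a check of the formula, not a recommendation for how to compute powers in practice):
>A = [3 -6 -7; 1 8 5; -1 -2 1];
>[S, D] = eig(A);
>norm(S*D^5*S^(-1) - A^5)    % zero up to rounding error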
IV.2. Hermitian matrices and real symmetric matrices
Prerequisites and Learning Goals
After completing this section, you should be able to
• Define and compare real symmetric and Hermitian matrices.
• Show that the eigenvalues of Hermitian matrices are real and that eigenvectors corresponding
to distinct eigenvalues are orthogonal.
• Explain why a Hermitian matrix can be diagonalized with a unitary matrix.
• Calculate the effective resistance between any two points in a resistor network using the
eigenvalues and eigenvectors of the Laplacian matrix.
IV.2.1. Hermitian and real symmetric matrices
An n × n matrix A is called Hermitian if A^* = A. (Recall that A^* = \overline{A}^T.) If A = [a_{i,j}] then this is
the same as saying a_{j,i} = \overline{a_{i,j}}. An equivalent condition is
\langle v, Aw\rangle = \langle Av, w\rangle
for all vectors v and w. Here are some examples of Hermitian matrices:
\begin{pmatrix} 2 & i \\ -i & 1 \end{pmatrix}, \qquad \begin{pmatrix} 2 & 1+i & 1 \\ 1-i & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}
An n × n matrix A is called real symmetric if it is Hermitian and all its entries are real. In other
words, A is real symmetric if it has real entries and A^T = A.
IV.2.2. Eigenvalues of Hermitian matrices are real
A general n × n matrix may have complex eigenvalues, even if all its entries are real. However,
the eigenvalues of a Hermitian (possibly real symmetric) matrix are all real. To see this, let’s start
with an eigenvalue eigenvector pair λ, v for a Hermitian matrix A. For the moment, we allow the
possibility that λ and v are complex. Since A is Hermitian, we have
\langle v, Av\rangle = \langle Av, v\rangle
and since Av = λv this implies
\langle v, λv\rangle = \langle λv, v\rangle
Here we are using the inner product for complex vectors given by \langle v, w\rangle = v^*w = \overline{v}^T w. Thus
\langle λv, v\rangle = \bar{λ}\langle v, v\rangle, so that
λ\langle v, v\rangle = \bar{λ}\langle v, v\rangle.
Since v is an eigenvector, it cannot be zero. So \langle v, v\rangle = ‖v‖^2 ≠ 0. Therefore we may divide by
\langle v, v\rangle to conclude
λ = \bar{λ}.
This shows that λ is real.
IV.2.3. Eigenvectors of Hermitian matrices corresponding to distinct eigenvalues are
orthogonal
Suppose Av_1 = λ_1 v_1 and Av_2 = λ_2 v_2 with λ_1 ≠ λ_2. Since A is Hermitian,
\langle Av_1, v_2\rangle = \langle v_1, Av_2\rangle.
Thus we find that
λ_1\langle v_1, v_2\rangle = λ_2\langle v_1, v_2\rangle
Here λ_1 should appear as \bar{λ}_1, but we already know that eigenvalues are real, so \bar{λ}_1 = λ_1. This can
be written
(λ_1 - λ_2)\langle v_1, v_2\rangle = 0
and since λ_1 - λ_2 ≠ 0 this implies
\langle v_1, v_2\rangle = 0
This calculation shows that if a Hermitian matrix A has distinct eigenvalues then the eigenvectors
are all orthogonal, and by rescaling them, we can obtain an orthonormal basis of eigenvectors. If we
put these orthonormal eigenvectors in the columns of a matrix U then U is unitary and U ∗ AU = D
where D is the diagonal matrix with the eigenvalues of A as diagonal entries.
In fact, even if A has repeated eigenvalues, it is still true that an orthonormal basis of eigenvectors
exists, as we will show below.
If A is real symmetric, then the eigenvectors can be chosen to be real. This implies that the matrix
that diagonalizes A can be chosen to be an orthogonal matrix. One way to see this is to notice
that once we know that λ is real, all the calculations involved in computing the nullspace
of λI − A only involve real numbers. Another way to see this is to start with any eigenvector v
(possibly complex) for A. We can write in terms of its real and imaginary parts as v = x + iy
where x and y are real vectors. Then Av = λv implies Ax + iAy = λx + iλy. Now we equate real
and imaginary parts. Using that A and λ are real, this yields Ax = λx and Ay = λy. Thus x and
y are both eigenvectors with real entries. This might seem too good to be true - now we have two
eigenvectors instead of one. But in most cases, x and y will be multiples of each other, so that
they are the same up to scaling.
If A is any square matrix with real entries, then A + A^T is real symmetric. (The matrix A^T A is
also real symmetric, and has the additional property that all its eigenvalues are non-negative.) We can
use this to produce random symmetric matrices in MATLAB/Octave like this:
>A = rand(4,4);
>A = A+A’
A =
   0.043236   1.240654   0.658890   0.437168
   1.240654   1.060615   0.608234   0.911889
   0.658890   0.608234   1.081767   0.706712
   0.437168   0.911889   0.706712   1.045293
Let's check the eigenvalues and eigenvectors of A:
>[V, D] = eig(A)
V =

  -0.81345   0.33753   0.23973   0.40854
   0.54456   0.19491   0.55585   0.59707
   0.15526   0.35913  -0.78824   0.47497
  -0.13285  -0.84800  -0.11064   0.50100

D =

  -0.84168   0.00000   0.00000   0.00000
   0.00000   0.36240   0.00000   0.00000
   0.00000   0.00000   0.55166   0.00000
   0.00000   0.00000   0.00000   3.15854
The eigenvalues are real, as expected. Also, the eigenvectors contained in the columns of the matrix
V have been normalized. Thus V is orthogonal:
>V'*V
ans =

   1.0000e+00   6.5066e-17  -1.4700e-16   1.4366e-16
   9.1791e-17   1.0000e+00  -1.0432e-16   2.2036e-17
  -1.4700e-16  -1.0012e-16   1.0000e+00  -1.2617e-16
   1.3204e-16   2.2036e-17  -1.0330e-16   1.0000e+00
(at least, up to numerical error.)
IV.2.4. Every Hermitian matrix can be diagonalized by a unitary matrix
Now we will show that every Hermitian matrix A (even if it has repeated eigenvalues) can be
diagonalized by a unitary matrix. Equivalently, every Hermitian matrix has an orthonormal basis of eigenvectors.
If A is real symmetric, then the basis of eigenvectors can be chosen to be real. Therefore this will
show that every real symmetric matrix can be diagonalized by an orthogonal matrix.
The argument we present works whether or not A has repeated eigenvalues and also gives a new
proof of the fact that the eigenvalues are real.
To begin we show that if A is any n × n matrix (not necessarily Hermitian), then there exists a
unitary matrix U such that U^*AU is upper triangular. To see this, start with any eigenvector v of A
with eigenvalue λ. (Every matrix has at least one eigenvalue.) Normalize v so that ‖v‖ = 1. Now
choose an orthonormal basis q_2, . . . , q_n for the orthogonal complement of the subspace spanned
by v, so that v, q_2, . . . , q_n form an orthonormal basis for all of C^n. Form the matrix U_1 with these
vectors as columns. Then, using ∗ to denote a number that is not necessarily 0, we have
U_1^* A U_1 = \begin{pmatrix} \overline{v}^T \\ \overline{q}_2^T \\ \vdots \\ \overline{q}_n^T \end{pmatrix} A \begin{pmatrix} v & q_2 & \cdots & q_n \end{pmatrix}
            = \begin{pmatrix} \overline{v}^T \\ \overline{q}_2^T \\ \vdots \\ \overline{q}_n^T \end{pmatrix} \begin{pmatrix} λv & Aq_2 & \cdots & Aq_n \end{pmatrix}
            = \begin{pmatrix} λ‖v‖^2 & * & \cdots & * \\ λ\langle q_2, v\rangle & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ λ\langle q_n, v\rangle & * & \cdots & * \end{pmatrix}
            = \begin{pmatrix} λ & * & \cdots & * \\ 0 & * & \cdots & * \\ \vdots & \vdots & & \vdots \\ 0 & * & \cdots & * \end{pmatrix}
            = \begin{pmatrix} λ & * & \cdots & * \\ 0 & & & \\ \vdots & & A_2 & \\ 0 & & & \end{pmatrix}
Here A_2 is an (n - 1) × (n - 1) matrix.
Repeating the same procedure with A2 we can construct an (n − 1) × (n − 1) unitary matrix Ũ2
with
\tilde{U}_2^* A_2 \tilde{U}_2 = \begin{pmatrix} λ_2 & * & \cdots & * \\ 0 & & & \\ \vdots & & A_3 & \\ 0 & & & \end{pmatrix}
Now we use the (n - 1) × (n - 1) unitary matrix \tilde{U}_2 to construct an n × n unitary matrix
U_2 = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & \tilde{U}_2 & \\ 0 & & & \end{pmatrix}
Then it is not hard to see that
U_2^* U_1^* A U_1 U_2 = \begin{pmatrix} λ & * & * & \cdots & * \\ 0 & λ_2 & * & \cdots & * \\ 0 & 0 & & & \\ \vdots & \vdots & & A_3 & \\ 0 & 0 & & & \end{pmatrix}
Continuing in this way, we find unitary matrices U_3, U_4, . . . , U_{n-1} so that
U_{n-1}^* \cdots U_2^* U_1^* A U_1 U_2 \cdots U_{n-1} = \begin{pmatrix} λ & * & * & \cdots & * \\ 0 & λ_2 & * & \cdots & * \\ 0 & 0 & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & * \\ 0 & 0 & \cdots & 0 & λ_n \end{pmatrix}
Define U = U_1 U_2 \cdots U_{n-1}. Then U is unitary, since the product of unitary matrices is unitary,
and U^* = U_{n-1}^* \cdots U_2^* U_1^*. Thus the equation above says that U^*AU is upper triangular, that is,
U^* A U = \begin{pmatrix} λ & * & * & \cdots & * \\ 0 & λ_2 & * & \cdots & * \\ 0 & 0 & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & * \\ 0 & 0 & \cdots & 0 & λ_n \end{pmatrix}
Notice that if we take the adjoint of this equation, we get
U^* A^* U = \begin{pmatrix} \bar{λ} & 0 & 0 & \cdots & 0 \\ * & \bar{λ}_2 & 0 & \cdots & 0 \\ * & * & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & 0 \\ * & * & \cdots & * & \bar{λ}_n \end{pmatrix}
Now let's return to the case where A is Hermitian. Then A^* = A, so that the matrices appearing
in the previous two equations are equal. Thus
\begin{pmatrix} λ & * & * & \cdots & * \\ 0 & λ_2 & * & \cdots & * \\ 0 & 0 & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & * \\ 0 & 0 & \cdots & 0 & λ_n \end{pmatrix} = \begin{pmatrix} \bar{λ} & 0 & 0 & \cdots & 0 \\ * & \bar{λ}_2 & 0 & \cdots & 0 \\ * & * & \ddots & & \vdots \\ \vdots & \vdots & & \ddots & 0 \\ * & * & \cdots & * & \bar{λ}_n \end{pmatrix}
This implies that all the entries denoted ∗ must actually be zero. This also shows that \bar{λ}_i = λ_i for
every i. In other words, Hermitian matrices can be diagonalized by a unitary matrix, and all the
eigenvalues are real.
IV.2.5. Powers and other functions of Hermitian matrices
The diagonalization formula for a Hermitian matrix A with eigenvalues λ1 , . . . , λn and orthonormal
basis of eigenvectors v_1, . . . , v_n can be written
A = U D U^* = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} \begin{pmatrix} λ_1 & 0 & \cdots & 0 \\ 0 & λ_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & λ_n \end{pmatrix} \begin{pmatrix} \overline{v}_1^T \\ \overline{v}_2^T \\ \vdots \\ \overline{v}_n^T \end{pmatrix}
This means that for any vector w
Aw = \sum_{i=1}^{n} λ_i v_i \overline{v}_i^T w.
Since this is true for any vector w it must be that
A = \sum_{i=1}^{n} λ_i P_i
where P_i = v_i \overline{v}_i^T. The matrices P_i are the orthogonal projections onto the one dimensional subspaces spanned by the eigenvectors. We have P_i P_i = P_i and P_i P_j = 0 for i ≠ j.
Recall that to compute powers A^k we simply need to replace λ_i with λ_i^k in the diagonalization
formula. Thus
A^k = \sum_{i=1}^{n} λ_i^k P_i.
This formula is valid for negative powers k provided none of the λi are zero. In particular, when
k = −1 we obtain a formula for the inverse.
In fact we can define f (A) for any function f to be the matrix
f(A) = \sum_{i=1}^{n} f(λ_i) P_i.
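As a sketch in MATLAB/Octave, we can check this formula for f(x) = e^x against the built-in matrix exponential expm, using a random real symmetric matrix as above:
>A = rand(4,4); A = A + A';
>[V, D] = eig(A);
>fA = zeros(4,4);
>for i = [1:4] fA = fA + exp(D(i,i))*V(:,i)*V(:,i)'; end
>norm(fA - expm(A))     % zero up to rounding error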
IV.2.6. Effective resistance revisited
In this section we will use the eigenvalues and eigenvectors of the Laplacian matrix to express
the effective resistance between any two nodes of a resistor network. The Laplacian L is a real
symmetric matrix.
The basic equation for a resistor network is
Lv = J
where the entries of v = [v_1, v_2, . . . , v_n]^T are the voltages at the nodes, and the entries of J =
[J_1, J_2, . . . , J_n] are the currents that are flowing in or out of each node. If we connect a b volt
battery to nodes i and j we are fixing the voltage difference between these nodes to be the battery
voltage so that
vi − vj = b.
In this situation there are no currents flowing in or out of the network except at nodes i and
j, so except for these two nodes J has all entries equal to zero. Since the current flowing in must
equal the current flowing out we have Ji = c, Jj = −c for some value of c. This can be written
J = c(ei − ej )
where ei and ej are the standard basis vectors. The effective resistance between the ith and jth
nodes is
R = b/c.
This is the quantity that we wish to compute.
Let λ_i and u_i, i = 1, . . . , n, be the eigenvalues and eigenvectors for L. Since L is real symmetric,
we know that the eigenvalues λ_i are real and the eigenvectors u_i can be chosen to be orthonormal.
However for the Laplacian we know more than this. We will assume that all the nodes in the network
are connected. Then L has a one dimensional nullspace spanned by the vector with constant entries.
Thus λ_1 = 0 and u_1 = (1/\sqrt{n})[1, 1, . . . , 1]^T. Furthermore all the eigenvalues of L are non-negative
(see homework problem). Since only one of them is zero, all the other eigenvalues λ2 , . . . , λn must
be strictly positive.
Write L = \sum_{i=1}^{n} λ_i P_i where P_i = u_i u_i^T is the projection onto the subspace spanned by the
eigenvector u_i. Since λ_1 = 0, the matrix L is not invertible. However we can define the partial
inverse G = \sum_{i=2}^{n} λ_i^{-1} P_i. Then
GL = \left(\sum_{i=2}^{n} λ_i^{-1} P_i\right)\left(\sum_{j=1}^{n} λ_j P_j\right) = \sum_{i=2}^{n} P_i = P
where P is the projection onto the orthogonal complement of the nullspace.
Since v and P v differ by a multiple of the vector u1 with constant entries, we have
b = v_i - v_j = (Pv)_i - (Pv)_j
  = \langle e_i - e_j, Pv\rangle = \langle e_i - e_j, GLv\rangle
  = \langle e_i - e_j, GJ\rangle = c\langle e_i - e_j, G(e_i - e_j)\rangle
  = c\sum_{k=2}^{n} \langle e_i - e_j, λ_k^{-1} P_k(e_i - e_j)\rangle = c\sum_{k=2}^{n} λ_k^{-1} |\langle e_i - e_j, u_k\rangle|^2.
Therefore the effective resistance is
R = b/c = \sum_{k=2}^{n} λ_k^{-1} |\langle e_i - e_j, u_k\rangle|^2 = \sum_{k=2}^{n} λ_k^{-1} |(u_k)_i - (u_k)_j|^2
Let’s redo the calculation of the effective resistance in section II.3.10 using this formula. We first
define L and compute the eigenvalues and eigenvectors.
> L=[2 -1 0 -1;-1 3 -1 -1;0 -1 2 -1;-1 -1 -1 3];
> [U,D]=eig(L);
Somewhat confusingly (uk )i = U (i, k), and λk = D(k, k). Thus the effective resistance is obtained
by summing the terms D(k,k)^(-1)*(U(1,k)-U(2,k))^2.
> R=0;
> for k=[2:4] R=R+D(k,k)^(-1)*(U(1,k)-U(2,k))^2 end
R = 0.25000
R = 0.25000
R = 0.62500
This gives the same answer 5/8 = 0.62500 as before.
Although the formula in this section can be used for calculations, the real advantage lies in
the connection it exposes between the effective resistance and the eigenvalues of L. In many
applications, one is interested in networks with low effective resistance between any two points.
Such networks are highly connected - a desirable property when designing, say, communications
networks. The formula in this section shows that the effective resistance will be low if the non-zero
eigenvalues λ2 , λ3 , . . . are large.
IV.3. Power Method for Computing Eigenvalues
Prerequisites and Learning Goals
After completing this section, you should be able to
• Explain how to implement the power method in order to compute certain eigenvalues/eigenvectors,
write out the steps of the algorithm and discuss its output; implement the algorithm in MATLAB/Octave and use it to compute the largest and smallest eigenvalues and the eigenvalue
closest to a given number (and corresponding eigenvectors) of a Hermitian matrix; discuss
what it means when the algorithm does not converge.
IV.3.1. The power method
The power method is a very simple method for finding a single eigenvalue–eigenvector pair.
Suppose A is an n × n matrix. We assume that A is real symmetric, so that all the eigenvalues
are real. Now let x0 be any vector of length n. Perform the following steps:
• Multiply by A
• Normalize to unit length.
repeatedly. This generates a series of vectors x0 , x1 , x2 , . . .. It turns out that these vectors converge
to the eigenvector corresponding to the eigenvalue whose absolute value is the largest.
To verify this claim, let’s first find a formula for xk . At each stage of this process, we are
multiplying by A and then by some number. Thus xk must be a multiple of Ak x0 . Since the
resulting vector has unit length, that number must be 1/kAk x0 k. Thus
x_k = \frac{A^k x_0}{‖A^k x_0‖}
We know that A has a basis of eigenvectors v_1, v_2, . . . , v_n. Order them so that |λ_1| > |λ_2| ≥ · · · ≥
|λ_n|. (We are assuming here that |λ_1| ≠ |λ_2|, otherwise the power method runs into difficulty.) We
may expand our initial vector x_0 in this basis
x_0 = c_1 v_1 + c_2 v_2 + · · · + c_n v_n
We need that c_1 ≠ 0 for this method to work, but if x_0 is chosen at random, this is almost always
true.
Since A^k v_i = λ_i^k v_i we have
A^k x_0 = c_1 λ_1^k v_1 + c_2 λ_2^k v_2 + · · · + c_n λ_n^k v_n
       = λ_1^k \left(c_1 v_1 + c_2 (λ_2/λ_1)^k v_2 + · · · + c_n (λ_n/λ_1)^k v_n\right)
       = λ_1^k (c_1 v_1 + ε_k)
where ε_k → 0 as k → ∞. This is because |λ_i/λ_1| < 1 for every i > 1, so the powers tend to zero.
Thus
‖A^k x_0‖ = |λ_1|^k ‖c_1 v_1 + ε_k‖
so that
x_k = \frac{A^k x_0}{‖A^k x_0‖} = \left(\frac{λ_1}{|λ_1|}\right)^k \frac{c_1 v_1 + ε_k}{‖c_1 v_1 + ε_k‖} → (±1)^k \left(± \frac{v_1}{‖v_1‖}\right)
We have shown that xk converges, except for a possible sign flip at each stage, to a normalized
eigenvector corresponding to λ1 . The sign flip is present exactly when λ1 < 0. Knowing v1 (or a
multiple of it) we can find λ1 with
λ_1 = \frac{\langle v_1, Av_1\rangle}{‖v_1‖^2}
This gives a method for finding the largest eigenvalue (in absolute value) and the corresponding
eigenvector. Let’s try it out.
>A = [4 1 3;1 3 2; 3 2 5];
>x=rand(3,1);
>for k = [1:10]
>y=A*x;
>x=y/norm(y)
>end
x =
0.63023
0.38681
0.67319
x =
0.58690
0.37366
0.71828
x =
0.57923
0.37353
0.72455
x =
0.57776
0.37403
0.72546
x =
0.57745
0.37425
0.72559
x =
0.57738
0.37433
0.72561
x =
0.57736
0.37435
0.72562
x =
0.57735
0.37436
0.72562
x =
0.57735
0.37436
0.72562
x =
0.57735
0.37436
0.72562
This gives the eigenvector. We compute the eigenvalue with
>x'*A*x/norm(x)^2
ans =
8.4188
Let’s check:
>[V D] = eig(A)
V =

   0.577350   0.577350   0.577350
   0.441225  -0.815583   0.374359
  -0.687013  -0.038605   0.725619

D =

   1.19440   0.00000   0.00000
   0.00000   2.38677   0.00000
   0.00000   0.00000   8.41883
As expected, we have computed the largest eigenvalue and eigenvector. Of course, a serious program
that uses this method would not just iterate a fixed number (above it was 10) times, but check
for convergence, perhaps by checking whether kxk − xk−1 k was less than some small number, and
stopping when this was achieved.
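Here is a sketch of such a loop with a convergence test (the tolerance and the maximum number of iterations are our choices; as written it assumes λ_1 > 0, so that there is no sign flipping):
>A = [4 1 3;1 3 2; 3 2 5];
>x=rand(3,1);
>for k = [1:1000]
>y=A*x;
>xnew=y/norm(y);
>if norm(xnew-x) < 1e-10
>break
>end
>x=xnew;
>end
>lambda = x'*A*x/norm(x)^2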
So far, the power method only computes the eigenvalue with the largest absolute value, and the
corresponding eigenvector. What good is that? Well, it turns out that with an additional twist we
can compute the eigenvalue closest to any number s. The key observation is that the eigenvalues
of (A − sI)−1 are exactly (λi − s)−1 (unless, of course, A − sI is not invertible. But then s is an
eigenvalue itself and we can stop looking.) Moreover, the eigenvectors of A and (A − sI)−1 are the
same.
Let’s see why this is true. If
Av = λv
then
(A − sI)v = (λ − s)v.
Now if we multiply both sides by (A − sI)−1 and divide by λ − s we get
(λ − s)−1 v = (A − sI)−1 v.
These steps can be run backwards to show that if (λ − s)−1 is an eigenvalue of (A − sI)−1 with
eigenvector v, then λ is an eigenvalue of A with the same eigenvector.
Now start with an arbitrary vector x0 and define
x_{k+1} = \frac{(A - sI)^{-1} x_k}{‖(A - sI)^{-1} x_k‖}.
Then xk will converge to the eigenvector vi of (A − sI)−1 for which |λi − s|−1 is the largest. But,
since the eigenvectors of A and A − sI are the same, vi is also an eigenvector of A. And since
|λi − s|−1 is largest when λi is closest to s, we have computed the eigenvector vi of A for which λi
is closest to s. We can now compute λi by comparing Avi with vi
Here is a crucial point: when computing (A − sI)−1 xk in this procedure, we should not actually
compute the inverse. We don’t need to know the whole matrix (A − sI)−1 , but just the vector (A −
sI)−1 xk . This vector is the solution y of the linear equation (A − sI)y = xk . In MATLAB/Octave
we would therefore use something like (A - s*eye(n))\Xk.
Let’s try to compute the eigenvalue of the matrix A above closest to 3.
>A = [4 1 3;1 3 2; 3 2 5];
>x=rand(3,1);
>for k = [1:10]
>y=(A-3*eye(3))\x;
>x=y/norm(y)
>end
x =
0.649008
-0.756516
0.080449
x =
-0.564508
0.824051
0.047657
x =
0.577502
-0.815593
-0.036045
x =
-0.576895
0.815917
0.038374
x =
0.577253
-0.815659
-0.038454
x =
-0.577311
0.815613
0.038562
x =
0.577338
-0.815593
-0.038590
x =
-0.577346
0.815587
0.038600
x =
0.577349
-0.815585
-0.038603
x =
-0.577350
0.815584
0.038604
This gives the eigenvector. Now we can find the eigenvalue
> lambda = x'*A*x/norm(x)^2
lambda = 2.3868
Comparing with the results of eig above, we see that we have computed the middle eigenvalue and
eigenvector.
IV.4. Recursion Relations
Prerequisites and Learning Goals
After completing this section, you should be able to
• Derive a matrix equation from a recursion relation and use it to find the n-th term in the
sequence given some initial conditions. You should be able to perform these computations by
hand or with MATLAB/Octave.
• Discuss how the relative magnitude of the eigenvalues of the matrix obtained from a recursion
relation determines the behaviour of the high power terms of the recursion.
• Determine initial values for which the solution of a recursion relation will become large or
small (depending on the eigenvalues of the associated matrix).
IV.4.1. Fibonacci numbers
Consider the sequence of numbers given by a multiple of powers of the golden ratio
\frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n, \qquad n = 1, 2, 3, . . . .
When n is large, these numbers are almost integers:
>format long;
>((1+sqrt(5))/2)^30/sqrt(5)
ans =
832040.000000241
>((1+sqrt(5))/2)^31/sqrt(5)
ans =
1346268.99999985
>((1+sqrt(5))/2)^32/sqrt(5)
ans =
2178309.00000009
Why? To answer this question we introduce the Fibonacci sequence:
0   1   1   2   3   5   8   13   21   . . .
where each number in the sequence is obtained by adding the previous two. If you go far enough
along in this sequence you will encounter
...
832040
1346269
2178309
...
and you can check (without using MATLAB/Octave, I hope) that the third number is the sum of
the previous two.
But why should powers of the golden ratio be very nearly, but not quite, equal to Fibonacci
numbers? The reason is that the Fibonacci sequence is defined by a recursion relation. For the
Fibonacci sequence F0 , F1 , F2 , . . . the recursion relation is
Fn+1 = Fn + Fn−1
This equation, together with the identity F_n = F_n, can be written in matrix form as
\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_n \\ F_{n-1} \end{pmatrix}
Thus, taking n = 1, we find
\begin{pmatrix} F_2 \\ F_1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_1 \\ F_0 \end{pmatrix}
Similarly
\begin{pmatrix} F_3 \\ F_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} F_2 \\ F_1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^2 \begin{pmatrix} F_1 \\ F_0 \end{pmatrix}
and continuing like this we find
\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n \begin{pmatrix} F_1 \\ F_0 \end{pmatrix}
Finally, since F_0 = 0 and F_1 = 1 we can write
\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^n \begin{pmatrix} 1 \\ 0 \end{pmatrix}
We can diagonalize the matrix to get a formula for the Fibonacci numbers. The eigenvalues and
eigenvectors of \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} are
λ_1 = \frac{1+\sqrt{5}}{2}, \quad v_1 = \begin{pmatrix} \frac{1+\sqrt{5}}{2} \\ 1 \end{pmatrix}
and
λ_2 = \frac{1-\sqrt{5}}{2}, \quad v_2 = \begin{pmatrix} \frac{1-\sqrt{5}}{2} \\ 1 \end{pmatrix}
This implies
\begin{pmatrix} λ_1 & λ_2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} λ_1^n & 0 \\ 0 & λ_2^n \end{pmatrix} \begin{pmatrix} λ_1 & λ_2 \\ 1 & 1 \end{pmatrix}^{-1} = \frac{1}{\sqrt{5}} \begin{pmatrix} λ_1 & λ_2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} λ_1^n & 0 \\ 0 & λ_2^n \end{pmatrix} \begin{pmatrix} 1 & -λ_2 \\ -1 & λ_1 \end{pmatrix}
so that
\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} λ_1 & λ_2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} λ_1^n & 0 \\ 0 & λ_2^n \end{pmatrix} \begin{pmatrix} 1 & -λ_2 \\ -1 & λ_1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} λ_1^{n+1} - λ_2^{n+1} \\ λ_1^n - λ_2^n \end{pmatrix}
In particular
F_n = \frac{1}{\sqrt{5}}\left(λ_1^n - λ_2^n\right)
Since λ_2 ≈ -0.618034 is smaller than 1 in absolute value, the powers λ_2^n become small very
quickly as n becomes large. This explains why
F_n ≈ \frac{1}{\sqrt{5}}\, λ_1^n
for large n.
If we want to use MATLAB/Octave to compute Fibonacci numbers, we don’t need to bother
diagonalizing the matrix.
>[1 1;1 0]^30*[1;0]
ans =
1346269
832040
produces the same Fibonacci numbers as above.
IV.4.2. Other recursion relations
The idea that was used to solve for the Fibonacci numbers can be used to solve other recursion
relations. For example the three-step recursion
x_{n+1} = a x_n + b x_{n-1} + c x_{n-2}
can be written as a matrix equation
\begin{pmatrix} x_{n+1} \\ x_n \\ x_{n-1} \end{pmatrix} = \begin{pmatrix} a & b & c \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_n \\ x_{n-1} \\ x_{n-2} \end{pmatrix}
so given three initial values x0 , x1 and x2 we can find the rest by computing powers of a 3 × 3
matrix.
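For instance, here is a sketch in MATLAB/Octave for the recursion x_{n+1} = x_n + 2x_{n-1} + x_{n-2} (the coefficients a = 1, b = 2, c = 1 and the initial values are arbitrary choices):
>M = [1 2 1; 1 0 0; 0 1 0];
>v = [1; 1; 1];            % [x_2; x_1; x_0]
>n = 10;
>M^(n-2)*v                 % the top entry is x_n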
In the next section we will solve a recurrence relation that arises in Quantum Mechanics.
IV.5. The Anderson Tight Binding Model
Prerequisites and Learning Goals
After completing this section, you should be able to
• describe a bound state with energy E for the discrete Schrodinger equation for a single electron
moving in a one dimensional semi-infinite crystal.
• describe a scattering state with energy E.
• compute the energies for which a bound state exists and identify the conduction band, for a
potential that has only one non-zero value.
• compute the conduction bands for a one dimensional crystal.
IV.5.1. Description of the model
Previously we studied how to approximate differential equations by matrix equations. If we apply
this discretization procedure to the Schrödinger equation for an electron moving in a solid we obtain
the Anderson tight binding model.
We will consider a single electron moving in a one dimensional semi-infinite crystal. The electron
is constrained to live at discrete lattice points, numbered 0, 1, 2, 3, . . .. These can be thought of
as the positions of the atoms. For each lattice point n there is a potential Vn that describes how
much the atom at that lattice point attracts or repels the electron. Positive Vn ’s indicate repulsion,
whereas negative Vn ’s indicate attraction. Typical situations studied in physics are where the Vn ’s
repeat the same pattern periodically (a crystal), or where they are chosen at random (disordered
media). In fact, the term Anderson model usually refers to the random case, where the potentials
are chosen at random, independently for each site.
The wave function for the electron is a sequence of complex numbers Ψ = {ψ0 , ψ1 , ψ2 , . . .}. The
sequence Ψ is called a bound state with energy E if it satisfies the following three conditions:
(1) The discrete version of the time independent Schrödinger equation
−ψn+1 − ψn−1 + Vn ψn = Eψn ,
(2) the boundary condition
ψ0 = 0,
(3) and the normalization condition
N^2 = \sum_{n=0}^{\infty} |ψ_n|^2 < \infty.
These conditions are trivially satisfied if Ψ = {0, 0, 0, . . .}, so we specifically exclude this case. (In
fact Ψ is actually the eigenvector of an infinite matrix so this is just the condition that eigenvectors
must be non-zero.)
Given an energy E, it is always possible to find a wave function Ψ to satisfy conditions (1)
and (2). However for most energies E, none of these Ψ’s will be getting small for large n, so the
normalization condition (3) will not be satisfied. There is only a discrete set of energies E for
which a bound state satisfying all three conditions exists. In other words, the energy E is
quantized.
If E is one of the allowed energy values and Ψ is the corresponding bound state, then the numbers
|ψn |2 /N 2 are interpreted as the probabilities of finding an electron with energy E at the nth site.
These numbers add up to 1, consistent with the interpretation as probabilities.
Notice that if Ψ is a bound state with energy E, then so is any non-zero multiple aΨ =
{aψ0 , aψ1 , aψ2 , . . .}. Replacing Ψ with aΨ has no effect on the probabilities because N changes to
|a|N , so the factors of a cancel in |ψn |^2 /N^2 .
IV.5.2. Recursion relation
The discrete Schrödinger equation (1) together with the initial condition (2) is a recursion relation
that can be solved using the method we saw in the previous section. We have
$$\begin{bmatrix} \psi_{n+1} \\ \psi_n \end{bmatrix} = \begin{bmatrix} V_n - E & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \psi_n \\ \psi_{n-1} \end{bmatrix}$$
so if we set
$$x_n = \begin{bmatrix} \psi_{n+1} \\ \psi_n \end{bmatrix} \qquad \text{and} \qquad A(z) = \begin{bmatrix} z & -1 \\ 1 & 0 \end{bmatrix}$$
then this implies
$$x_n = A(V_n - E)A(V_{n-1} - E) \cdots A(V_1 - E)\,x_0.$$
Condition (2) says that
$$x_0 = \begin{bmatrix} \psi_1 \\ 0 \end{bmatrix},$$
since ψ0 = 0.
In fact, we may assume ψ1 = 1, since replacing Ψ with aΨ where a = 1/ψ1 results in a bound
state where this is true. Dividing by ψ1 is possible unless ψ1 = 0. But if ψ1 = 0 then x0 = 0
and the recursion implies that every ψk = 0. This is not an acceptable bound state. Thus we may
assume
$$x_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
So far we are able to compute xn (and thus ψn ) satisfying conditions (1) and (2) for any values
of V1 , V2 , . . .. Condition (3) is a statement about the large n behaviour of ψn . This can be very
difficult to determine, unless we know more about the values Vn .
IV.5.3. A potential with most values zero
We will consider the very simplest situation where V1 = −a and all the other Vn ’s are equal to zero.
Let us try to determine for what energies E a bound state exists.
In this situation
$$x_n = A(-E)^{n-1} A(-a-E)\,x_0 = A(-E)^{n-1} x_1$$
where
$$x_1 = A(-a-E)\,x_0 = \begin{bmatrix} -a-E & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -(a+E) \\ 1 \end{bmatrix}.$$
The large n behavior of xn can be computed using the eigenvalues and eigenvectors of A(−E).
Suppose they are λ1 , v1 and λ2 , v2 . Then we expand
x1 = a1 v1 + a2 v2
and conclude that
$$x_n = A(-E)^{n-1}(a_1 v_1 + a_2 v_2) = a_1 \lambda_1^{n-1} v_1 + a_2 \lambda_2^{n-1} v_2.$$
Keep in mind that all the quantities in this equation depend on E. Our goal is to choose E so that
the xn become small for large n.
Before computing the eigenvalues of A(−E), let’s note that det(A(−E)) = 1. This implies that
λ1 λ2 = 1
Suppose the eigenvalues are complex. Then, since A(−E) has real entries, they must be complex
conjugates. Thus λ2 = λ̄1 and 1 = λ1 λ2 = λ1 λ̄1 = |λ1|^2. This means that λ1 and λ2 lie on the
unit circle in the complex plane. In other words, λ1 = e^{iθ} and λ2 = e^{−iθ} for some θ. This implies
that λ1^{n−1} = e^{i(n−1)θ} is also on the unit circle, and is not getting small. Similarly λ2^{n−1} is not
getting small. So complex eigenvalues will never lead to bound states. In fact, complex eigenvalues
correspond to scattering states, and the energy values for which the eigenvalues are complex are the
energies at which the electron can move freely through the crystal.
Suppose the eigenvalues are real. If |λ1| > 1 then |λ2| = 1/|λ1| < 1 and vice versa. So one of the
products λ1^{n−1}, λ2^{n−1} will be growing large, and one will be getting small. So the only way that xn
can be getting small is if the coefficient a1 or a2 sitting in front of the growing product is zero.
Now let us actually compute the eigenvalues. They are
$$\lambda = \frac{-E \pm \sqrt{E^2 - 4}}{2}.$$
If −2 < E < 2 then the eigenvalues are complex, so there are no bound states. The interval
[−2, 2] is the conduction band, where the electrons can move through the crystal.
If E = ±2 then there is only one eigenvalue, namely −E/2 = ∓1. In this case there actually is only one
eigenvector, so our analysis doesn't apply. However there are no bound states in this case.
Now let us consider the case E < −2. Then the large eigenvalue is λ1 = (−E + √(E^2 − 4))/2 and
the corresponding eigenvector is
$$v_1 = \begin{bmatrix} -1 \\ E + \lambda_1 \end{bmatrix}.$$
The small eigenvalue is λ2 = (−E − √(E^2 − 4))/2 and the corresponding eigenvector is
$$v_2 = \begin{bmatrix} -1 \\ E + \lambda_2 \end{bmatrix}.$$
We must now compute a1 and determine when it is zero. We have
$$[v_1 \,|\, v_2] \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = x_1.$$
This is a 2 × 2 matrix equation that we can easily solve for a1 and a2. A short calculation gives
$$a_1 = (\lambda_1 - \lambda_2)^{-1}\bigl(-(a+E)(E+\lambda_2) + 1\bigr).$$
Thus we see that a1 = 0 whenever
$$(a+E)\left(E - \sqrt{E^2-4}\right) - 2 = 0.$$
Let’s consider the case a = 5 and plot this function on the interval [−10, −2]. To see if it crosses
the axis, we also plot the function zero.
>N=500;
>E=linspace(-10,-2,N);
>ONE=ones(1,N);
>plot(E,(5*ONE+E).*(E-sqrt(E.^2 - 4*ONE)) - 2*ONE)
>hold on
>plot(E,zeros(1,N))
Here is the result
[Plot of (5 + E)(E − √(E^2 − 4)) − 2 on the interval [−10, −2], together with the zero function; the curve crosses zero once, just below E = −5.]
We can see that there is a single bound state in this interval, just below −5. In fact, the solution
is E = −5.2.
The case E > 2 is similar. This time we end up with
$$(a+E)\left(E + \sqrt{E^2-4}\right) - 2 = 0.$$
When a = 5 this never has a solution for E > 2. In fact the left side of this equation is bigger
than (5 + 2)(2 + 0) − 2 = 12 and so can never equal zero.
In conclusion, if V1 = −5, and all the other Vn ’s are zero, then there is exactly one bound state
with energy E = −5.2. Here is a diagram of the energy spectrum for this potential.
[Energy spectrum diagram: the single bound state energy at E = −5.2, to the left of the conduction band [−2, 2], marked on the E axis.]
For the bound state energy E = −5.2, the corresponding wave function Ψ, and thus the probability of finding the electron at the nth lattice point, can now also be computed. The evaluation
of the infinite sum that gives the normalization constant N^2 can be done using a geometric series.
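For instance, here is a small MATLAB/Octave sketch (not part of the original text) that checks this numerically for V1 = −5 and E = −5.2: the recursion produces a wave function decaying like λ2^n = (0.2)^n, so the sum giving N^2 is essentially a geometric series.

% bound state check for V1 = -a with a = 5 and E = -5.2
E = -5.2; a = 5;
A = @(z) [z -1; 1 0];
x = [1; 0];                 % x0 = [psi_1; psi_0], taking psi_1 = 1
psi = [0; 1];               % psi_0 and psi_1
x = A(-a - E)*x;            % the step with V_1 = -a; now x = [psi_2; psi_1]
psi(end+1) = x(1);
for n = 3:20
  x = A(-E)*x;              % all remaining V_n are zero
  psi(end+1) = x(1);        % psi_n
end
psi(end)/psi(end-1)         % ratio of successive psi's approaches lambda_2 = 0.2
N2 = sum(psi.^2)            % normalization N^2, approximately 1/(1 - 0.04)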
IV.5.4. Conduction bands for a crystal
The atoms in a crystal are arranged in a periodic array. We can model a one dimensional crystal
in the tight binding model by considering potential values that repeat a fixed pattern. Let’s focus
on the case where the pattern is 1, 2, 3, 4 so that the potential values are
V1 , V2 , V3 , . . . = 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, . . .
In this case, if we start with the formula
$$x_n = A(V_n - E)A(V_{n-1} - E)\cdots A(V_1 - E)\begin{bmatrix} 1 \\ 0 \end{bmatrix}$$
we can group the matrices into groups of four. The product
T (E) = A(V4 − E)A(V3 − E)A(V2 − E)A(V1 − E)
is repeated, so that
$$x_{4n} = T(E)^n \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
Notice that the matrix T (E) has determinant 1 since it is a product of matrices with determinant
1. So, as above, the eigenvalues λ1 and λ2 are either real with λ2 = 1/λ1 , or complex conjugates
on the unit circle. As before, the conduction bands are the energies E for which the eigenvalues of
T (E) are complex conjugates. It turns out that this happens exactly when
|tr(T (E))| ≤ 2
To see this, start with the characteristic polynomial for T (E)
$$\det(\lambda I - T(E)) = \lambda^2 - \operatorname{tr}(T(E))\,\lambda + \det(T(E))$$
(see homework problem). Since det(T(E)) = 1 the eigenvalues are given by
$$\lambda = \frac{\operatorname{tr}(T(E)) \pm \sqrt{\operatorname{tr}(T(E))^2 - 4}}{2}.$$
When |tr(T(E))| < 2 the quantity under the square root sign is negative, and so the eigenvalues
have a non-zero imaginary part.
Let's use MATLAB/Octave to plot the values of tr(T(E)) as a function of E. For convenience
we first define a function that computes the matrices A(z). To do this we type the following lines
into a file called A.m in our working directory.
function A=A(Z)
A=[Z -1; 1 0];
end
Next we start with a range of E values and define another vector T that contains the corresponding
values of tr(T (E)).
N=100;
E=linspace(-1,6,N);
T=[];
for e = E
T=[T trace(A(4-e)*A(3-e)*A(2-e)*A(1-e))];
end
Finally, we plot T against E. At the same time, we plot the constant functions 2 and −2.
plot(E,T)
hold on
plot(E,2*ones(1,N));
plot(E,-2*ones(1,N));
axis([-1,6,-10,10])
On the resulting picture the energies where tr(T(E)) lies between −2 and 2 have been highlighted.

[Plot of tr(T(E)) for −1 ≤ E ≤ 6, together with the horizontal lines at 2 and −2; the energies where the curve lies between the two lines are highlighted.]

We see that there are four conduction bands for this crystal.
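As a possible follow-up (not in the original text), the band edges can be estimated numerically, continuing with the vectors E and T computed above:

% find where |tr(T(E))| <= 2 and report the energies where this changes
in_band = abs(T) <= 2;
band_edges = E(find(diff(in_band) ~= 0))   % approximate edges of the conduction bands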
IV.6. Markov Chains
Prerequisites and Learning Goals
After completing this section, you should be able to
• Write down the definition of a stochastic matrix, list its properties and explain why they are
true; recognize when a matrix is stochastic.
• Write down the stochastic matrix and any relevant state vectors for a random walk, use
them to describe the long-time behaviour of the random walk, and explain why the probabilities
in a random walk approach limiting values; use stochastic matrices to solve practical
Markov chain problems. You should be able to perform these computations by hand or with
MATLAB/Octave.
• For an internet specified by a directed graph, write down the stochastic matrix associated
with the Google PageRank algorithm for a given damping factor α (and related algorithms),
and compute the ranking of the sites (by hand or with MATLAB/Octave).
• Use the Metropolis algorithm to produce a stochastic matrix with a predetermined limiting
probability distribution; use the algorithm to perform a random walk that converges with
high probability to the maximum value of a given positive function.
IV.6.1. Random walk
In the diagram below there are three sites labelled 1, 2 and 3. Think of a walker moving from site
to site. At each step the walker either stays at the same site, or moves to one of the other sites
according to a set of fixed probabilities. The probability of moving to the ith site from the jth site
is denoted pi,j . These numbers satisfy
0 ≤ pi,j ≤ 1
because they are probabilities (0 means “no chance” and 1 means “for sure”). On the diagram they
label the arrows indicating the relevant transitions. Since the walker has to go somewhere at each
step, the sum of all the probabilities leaving a given site must be one. Thus for every j,
$$\sum_i p_{i,j} = 1.$$
[Diagram: three sites labelled 1, 2 and 3, with arrows between sites and from each site to itself, labelled by the transition probabilities p_{i,j}.]
Let xn,i be the probability that the walker is at site i after n steps. We collect these probabilities
into a sequence of vectors called state vectors. Each state vector contains the probabilities for the
nth step in the walk.
$$x_n = \begin{bmatrix} x_{n,1} \\ x_{n,2} \\ \vdots \\ x_{n,k} \end{bmatrix}$$
The probability that the walker is at site i after n + 1 steps can be calculated from probabilities
for the previous step. It is the sum over all sites of the probability that the walker was at that site,
times the probability of moving from that site to the ith site. Thus
$$x_{n+1,i} = \sum_j p_{i,j}\, x_{n,j}.$$
This can be written in matrix form as
xn = P xn−1
where P = [pi,j ]. Using this relation repeatedly we find
xn = P n x0
where x0 contains the probabilities at the beginning of the walk.
The matrix P has two properties:
1. All entries of P are non-negative.
2. Each column of P sums to 1.
A matrix with these properties is called a stochastic matrix.
The goal is to determine where the walker is likely to be located after many steps. In other words,
we want to find the large n behaviour of xn = P n x0 . Let’s look at an example. Suppose there are
three sites, the transition probabilities are given by
$$P = \begin{bmatrix} 0.5 & 0.2 & 0.1 \\ 0.4 & 0.2 & 0.8 \\ 0.1 & 0.6 & 0.1 \end{bmatrix}$$
and the walker starts at site 1 so that the initial state vector is
$$x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}.$$
Now let’s use MATLAB/Octave to calculate the subsequent state vectors for n = 1, 10, 100, 1000.
>P=[.5 .2 .1; .4 .2 .8; .1 .6 .1];
>X0=[1; 0; 0];
>P*X0
ans =
0.50000
0.40000
0.10000
>P^10*X0
ans =
0.24007
0.43961
0.32032
>P^100*X0
ans =
0.24000
0.44000
0.32000
>P^1000*X0
ans =
0.24000
0.44000
0.32000
The state vectors converge. Let’s see what happens if the initial vector is different, say with equal
probabilities of being at the second and third sites.
>X0 = [0; 0.5; 0.5];
>P^100*X0
ans =
0.24000
0.44000
0.32000
The limit is the same. Of course, we know how to compute high powers of a matrix using the
eigenvalues and eigenvectors. A little thought would lead us to suspect that P has an eigenvalue
of 1 that is largest in absolute value, and that the corresponding eigenvector is the limiting vector,
up to a multiple. Let’s check
>eig(P)
ans =
1.00000
0.35826
-0.55826
>P*[0.24000; 0.44000; 0.32000]
ans =
0.24000
0.44000
0.32000
IV.6.2. Properties of stochastic matrices
The fact that the matrix P in the example above has an eigenvalue of 1 and an eigenvector that is
a state vector is no accident. Any stochastic matrix P has the following properties:
(1) If x is a state vector, so is P x.
(2) P has an eigenvalue λ1 = 1.
(3) The corresponding eigenvector v1 has all non-negative entries.
(4) The other eigenvalues λi have |λi | ≤ 1
If P or some power P k has all positive entries (that is, no zero entries) then
(3’) The eigenvector v1 has all positive entries.
(4’) The other eigenvalues λi have |λi | < 1
(Since eigenvectors are only defined up to non-zero scalar multiples, strictly speaking, (3) and (3’)
should say that after possibly multiplying v1 by −1 the entries are non-negative or positive.) These
properties explain the convergence properties of the state vectors of the random walk. Suppose (3’)
and (4’) hold and we expand the initial vector x0 in a basis of eigenvectors. (Here we are assuming
that P is diagonalizable, which is almost always true.) Then
x0 = c1 v1 + c2 v2 + · · · + ck vk
so that
xn = P n x0 = c1 λn1 v1 + c2 λn2 v2 + · · · + ck λnk vk
Since λ1 = 1 and |λi| < 1 for i ≠ 1 we find
$$\lim_{n\to\infty} x_n = c_1 v_1.$$
Since each xn is a state vector, so is the limit c1 v1 . This allows us to compute c1 easily. It is the
reciprocal of the sum of the entries of v1 . In particular, if we chose v1 to be a state vector then
c1 = 1.
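Here is a small MATLAB/Octave sketch (reusing the three-site example from the previous section) that computes the limiting state vector directly from the eigenvalue-1 eigenvector:

% limiting distribution of the 3-site random walk
P = [.5 .2 .1; .4 .2 .8; .1 .6 .1];
[V D] = eig(P);
[m k] = min(abs(diag(D) - 1));   % locate the eigenvalue 1
v1 = V(:,k);
v1/sum(v1)                       % the limiting state vector [0.24; 0.44; 0.32]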
Now we will go through the properties above and explain why they are true.
(1) P preserves state vectors:
Suppose x is a state vector, that is, x has non-negative entries which sum to 1. Then P x has
non-negative entries too, since all the entries of P are non-negative. Also
$$\sum_i (Px)_i = \sum_i \sum_j P_{i,j}\, x_j = \sum_j x_j \sum_i P_{i,j} = \sum_j x_j = 1.$$
Thus the entries of P x also sum to one, and P x is a state vector.
(2) P has an eigenvalue λ1 = 1
The key point here is that P and P T have the same eigenvalues. To see this recall that det(A) =
det(AT ). This implies that
det(λI − P ) = det((λI − P )T ) = det(λI T − P T ) = det(λI − P T )
So P and P T have the same characteristic polynomial. Since the eigenvalues are the zeros of the
characteristic polynomial, they must be the same for P and P T . (Notice that this does not say
that the eigenvectors are the same.)
Since P has columns adding up to 1, P^T has rows that add up to 1. This fact can be expressed
as the matrix equation
$$P^T \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$
But this equation says that 1 is an eigenvalue for P T . Therefore 1 is an eigenvalue for P as well.
(4) Other eigenvalues of P have modulus ≤ 1
To show that this is true, we use the 1-norm introduced at the beginning of the course. Recall
that the norm ‖·‖_1 is defined by
$$\left\| \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \right\|_1 = |x_1| + |x_2| + \cdots + |x_n|.$$
Multiplication by P does not increase the length of vectors if we use this norm to measure length. In other
words
$$\|Px\|_1 \leq \|x\|_1$$
for any vector x. This follows from the calculation (almost the same as the one above, and again using
the fact that the columns of P are non-negative and sum to 1)
$$\|Px\|_1 = \sum_i |(Px)_i| = \sum_i \Bigl|\sum_j P_{i,j}\, x_j\Bigr| \leq \sum_i \sum_j P_{i,j}\, |x_j| = \sum_j |x_j| \sum_i P_{i,j} = \sum_j |x_j| = \|x\|_1.$$
Now suppose that λ is an eigenvalue, so that P v = λv for some non-zero v. Then
$$\|\lambda v\|_1 = \|Pv\|_1.$$
Since ‖λv‖_1 = |λ|‖v‖_1 and ‖Pv‖_1 ≤ ‖v‖_1 this implies
$$|\lambda|\,\|v\|_1 \leq \|v\|_1.$$
Finally, since v is not zero, ‖v‖_1 > 0. Therefore we can divide by ‖v‖_1 to obtain
$$|\lambda| \leq 1.$$
(3) The eigenvector v1 (or some multiple of it) has all non-negative entries
We can give a partial proof of this, in the situation where the eigenvalues other than λ1 = 1 obey
the strict inequality |λi | < 1. In this case the power method implies that starting with any initial
vector x0 the vectors P n x0 converge to a multiple c1 v1 of v1 . If we choose the initial vector x0 to
have only positive entries, then every vector in the sequence P n x0 has only non-negative entries.
This implies that the limiting vector must have non-negative entries.
(3) and (4) vs. (3’) and (4’) and P n with all positive entries
Saying that P n has all positive entries means that there is a non-zero probability of moving
between any two sites in n steps. The fact that in this case all the eigenvalues other than λ1 = 1
obey the strict inequality |λi | < 1 follows from a famous theorem in linear algebra called the
Perron–Frobenius theorem.
Although we won’t be able to prove the Perron–Frobenius theorem, we can give some examples
to show that if the condition that P n has all positive entries for some n is violated, then (3’) and
(4’) need not hold.
The first example is the matrix
$$P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
This matrix represents a random walk with two sites that isn't very random. Starting at site one,
the walker moves to site two with probability 1, and vice versa. The powers P^n of P are equal
to I or P depending on whether n is even or odd. So the state vectors P^n x_0 don't converge: they are equal to
$\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ for even n and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$ for odd n. The eigenvalues of P are easily computed to be 1 and −1. They both
have modulus one.
For the second example, consider a random walk where the sites can be divided into two sets A
and B and the probability of going from any site in B to any site in A is zero. In this case the
i, j entries of P^n with the ith site in A and the jth site in B are always zero. Also, applying P to any
state vector drains probability from A to B without sending any back. This means that in the limit
P^n x_0 (that is, the eigenvector v_1) will have zero entries for all sites in A. For example, consider a
three site random walk (the very first example above) where there is no chance of ever going back
to site 1. The matrix
$$P = \begin{bmatrix} 1/3 & 0 & 0 \\ 1/3 & 1/2 & 1/2 \\ 1/3 & 1/2 & 1/2 \end{bmatrix}$$
corresponds to such a walk. We can check
>P=[1/3 0 0;1/3 1/2 1/2; 1/3 1/2 1/2];
>[V D] = eig(P)
V =

   0.00000   0.00000   0.81650
   0.70711   0.70711  -0.40825
   0.70711  -0.70711  -0.40825

D =

   1.00000   0.00000   0.00000
   0.00000   0.00000   0.00000
   0.00000   0.00000   0.33333

The eigenvector corresponding to the eigenvalue 1 is
$$\begin{bmatrix} 0 \\ 1/2 \\ 1/2 \end{bmatrix}$$
(after normalization to make it a state vector).
IV.6.3. Google PageRank
I’m going to refer you to the excellent article by David Austin:
http://www.ams.org/featurecolumn/archive/pagerank.html
Here are the main points:
The sites are web pages and connections are links between pages. The random walk is a web
surfer clicking links at random. The rank of a page is the probability that the surfer will land on
that page in the limit of infinitely many clicks.
We assume that the surfer is equally likely to click on any link on a given page. In other words,
we assign to each outgoing link a transition probability of
1/(total number of outgoing links for that page).
This rule doesn’t tell us what happens when the surfer lands on a page with no outgoing links
(so-called dangling nodes). In this case, we assume that any page on the internet is chosen at
random with equal probability.
The two rules above define a stochastic matrix
P = P1 + P2
where P1 contains the probabilities for the outgoing links and P2 contains the probabilities for the
dangling nodes. The matrix P1 is very sparse, since each web page has typically about 10 outgoing
links. This translates to 10 non-zero entries (out of about 2 billion) for each column of P1 . The
matrix P2 has a non-zero column for each dangling node. The entries in each non-zero column are
all the same, and equal to 1/N where N is the total number of sites.
If x is a state vector, then P1 x is easy to compute, because P1 is so sparse, and P2 x is a vector
with all entries equal to the total probability that the state vector x assigns to dangling nodes,
divided by N , the total number of sites.
We could try to use the matrix P to define the rank of a page, by taking the eigenvector v1
corresponding to the eigenvalue 1 of P and defining the rank of a page to be the entry in v1
corresponding to that page. There are problems with this, though. Because P has so many zero
entries, it is virtually guaranteed that P will have many eigenvalues with modulus 1 (or very close
to 1) so we can’t use the power method to compute v1 . Moreover, there probably will also be many
web pages assigned a rank of zero.
To avoid these problems we choose a number α between 0 and 1 called the damping factor. We
modify the behaviour of the surfer so that with probability α the rules corresponding to the matrix
P above are followed, and with probability 1 − α the surfer picks a page at random. This behaviour
is described by the stochastic matrix
S = (1 − α)Q + αP
where Q is the matrix where each entry is equal to 1/N (N is the total number of sites.) If x is a
state vector, then Qx is a vector with each entry equal to 1/N .
The value of α used in practice is about 0.85. The final ranking of a page can then be defined to
be the entry in v1 for S corresponding to that page. The matrix S has all non-zero entries, so the problems described above do not arise.
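As an illustration only (this tiny internet is made up, not taken from the text), here is a MATLAB/Octave sketch of the whole construction for four pages:

% hypothetical 4-page internet: page 1 links to pages 2 and 3, page 2 links to page 3,
% page 3 links to page 1, and page 4 has no outgoing links (a dangling node)
N = 4;
P1 = [0   0   1   0;
      1/2 0   0   0;
      1/2 1   0   0;
      0   0   0   0];
P2 = [zeros(N,3) ones(N,1)/N];   % column for the dangling node
P = P1 + P2;
alpha = 0.85;
Q = ones(N)/N;
S = (1-alpha)*Q + alpha*P;
[V D] = eig(S);
[m k] = min(abs(diag(D) - 1));   % eigenvalue 1
r = V(:,k); r = r/sum(r)         % the ranking vector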
IV.6.4. The Metropolis algorithm
So far we have concentrated on the situation where a stochastic matrix P is given, and we are
interested in finding the invariant distribution (that is, the eigenvector with eigenvalue 1). Now we want
to turn the situation around. Suppose we have a state vector π and we want to find a stochastic
matrix that has this vector as its invariant distribution. The Metropolis algorithm, named after
the mathematician and physicist Nicholas Metropolis, takes an arbitrary stochastic matrix P and
modifies it to produce a stochastic matrix Q that has π as its invariant distribution.
This is useful in a situation where we have an enormous number of sites and some function p
that gives a non-negative value for each site. Suppose that there is one site where p is very much
larger than for any of the other sites, and that our goal is to find that site. In other words, we have
a discrete maximization problem. We assume that it is not difficult to compute p for any given
site, but the number of sites is too huge to simply compute p for every site and see where it is the
largest.
To solve this problem let’s assume that the sites are labeled by integers 1, . . . , N . The vector
p = [p_1, . . . , p_N]^T has non-negative entries, and our goal is to find the largest one. We can form a
state vector π (in principle) by normalizing p, that is, π = p/∑_i p_i. Then the state vector π gives
a very large probability to the site we want to find. Now we can use the Metropolis algorithm to
define a random walk with π as its invariant distribution. If we step from site to site according to
this random walk, chances are high that after a while we end up at the site where π is large.
In practice we don't want to compute the sum ∑_i p_i in the denominator of the expression for
the vector π, since the number of terms is huge. An important feature of the Metropolis algorithm is
that this computation is not required.
You can more learn about the Metropolis algorithm in the article The Markov chain Monte Carlo
revolution by Persi Diaconis at
http://www.ams.org/bull/2009-46-02/S0273-0979-08-01238-X/home.html.
In this article Diaconis presents an example where the Metropolis algorithm is used to solve a
substitution cipher, that is, a code where each letter in a message is replaced by another letter.
The “sites” in this example are all permutations of a given string of letters and punctuation. The
function p is defined by analyzing adjacent pairs of letters. Every adjacent pair of letters has a
certain probability of occurring in an English text. Knowing these probabilities it is possible to
construct a function p that is large on strings that are actually English. Here is the output of a
random walk that is attempting to maximize this function.
IV.6.5. Description of the algorithm
To begin, notice that a square matrix P with non-negative entries is stochastic if P T 1 = 1, where
1 = [1, 1, . . . , 1]T is the vector with entries all equal to 1. This is just another way of saying that
the columns of P add up to 1.
Next, suppose we are given a state vector π = [π1 , π2 , . . . , πn ]T . Form the diagonal matrix Π that
has these entries on the diagonal. Then, if P is a stochastic matrix, the condition
$$P\Pi = \Pi P^T$$
implies that π is the invariant distribution for P. To see this, notice that Π1 = π, so applying
both sides of the equation to 1 yields Pπ = π.
This condition can be written as a collection of conditions on the components of P .
pi,j πj = πi pj,i
Notice that for diagonal entries p_{i,i} this condition is always true. So we may make any changes we
want to the diagonal entries without affecting this condition. For the off-diagonal entries (i ≠ j)
there is one equation for each pair p_{i,j}, p_{j,i}.
Here is how the Metropolis algorithm works. We start with a stochastic matrix P where these
equations are not necessarily true. For each off-diagonal pair p_{i,j}, p_{j,i} of matrix entries, we make
the equation above hold by decreasing the value of one of the entries, while leaving the other
entry alone. It is easy to see that the adjusted value will still be non-negative.
Let Q denote the adjusted matrix. Then QΠ = ΠQ^T as required, but Q is not necessarily a
stochastic matrix. However we have the freedom to change the diagonal entries of Q. So set the
diagonal entries of Q equal to
$$q_{i,i} = 1 - \sum_{j \neq i} q_{j,i}.$$
Then the columns of Q add up to 1 as required. The only thing left to check is that the entries
we just defined are non-negative. Since we have always made the adjustment so that q_{i,j} ≤ p_{i,j} we
have
$$q_{i,i} \geq 1 - \sum_{j \neq i} p_{j,i} = p_{i,i} \geq 0.$$
In practice we may not be presented with the state vector π but with a positive vector p = [p_1, p_2, . . . , p_n]^T
whose maximal component we want to find. Although in principle it is easy to obtain π as
π = p/(∑_i p_i), in practice there may be so many terms in the sum that it is impossible to compute. However notice that the crucial equation p_{i,j} π_j = π_i p_{j,i} is true for π_i and π_j if and only if it
is true for p_i and p_j. Thus we may use the values of p instead.
Notice that if one of pi,j or pj,i is 0 then qi,j = qj,i = 0. So it is possible that this algorithm
would yield Q = I. This is indeed a stochastic matrix where every state vector is an invariant
distribution, but it is not very useful. To avoid examples like this, one could restrict the use of the
Metropolis algorithm to matrices P where pi,j and pj,i are always either both zero or both nonzero.
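Here is a minimal MATLAB/Octave sketch of the adjustment; the three-site matrix P and the vector p are made up for illustration.

p = [1; 4; 2];                          % positive values; the maximum is at site 2
P = [0 1/2 1/2; 1/2 0 1/2; 1/2 1/2 0];  % any stochastic matrix to start from
n = length(p);
Q = P;
for i = 1:n
  for j = 1:n
    if i ~= j && P(i,j) > 0
      % enforce q(i,j) p(j) = p(i) q(j,i) by decreasing the larger entry
      Q(i,j) = min(P(i,j), P(j,i)*p(i)/p(j));
    end
  end
end
for i = 1:n                             % reset the diagonal so each column sums to 1
  Q(i,i) = 1 - (sum(Q(:,i)) - Q(i,i));
end
norm(Q*diag(p) - diag(p)*Q')            % detailed balance check, should be ~0
norm(Q*(p/sum(p)) - p/sum(p))           % p/sum(p) is the invariant distribution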
IV.6.6. An example
In this example we will use the Metropolis algorithm to try to maximize a function f (x, y) defined
in the square −1 ≤ x ≤ 1 and −1 ≤ y ≤ 1. We put down a uniform (2N + 1) × (2N + 1)
grid and take the resulting lattice points to be our sites. The initial stochastic matrix P is the
one that gives equal probabilities for travelling from a site to each neighbouring site. We use the
Metropolis algorithm to modify this, obtaining a stochastic matrix Q whose invariant distribution
is proportional to the sampled values of f . We then use Q to produce a random path that has a
high probability of converging to the maximum of f .
To begin we write a MATLAB/Octave function f.m that defines a Gaussian function f with a
peak at (0, 0).
function f=f(x,y)
a = 10;
f = exp(-(a*x)^2-(a*y)^2);
end
Next we write a function p.m such that p(i,j,k,l,N) defines the matrix element for stochastic
matrix P corresponding to the sites (i, j) and (k, l) when the grid size is (2N + 1) × (2N + 1).
function p = p(i,j,k,l,N)
% [i,j] and [k,l] are two points on a grid with
% -N <= i,j,k,l <= N. The probabilities are those for
% a uniform random walk, taking into account the edges
% Note: with our convention, p(i,j,k,l,N) is the
% probability of going from [k,l] to [i,j]
% Note: if i,j,k,l are out of range, then p=0
if( ([k,l]==[N,N] | [k,l]==[N,-N] | [k,l]==[-N,N] | [k,l]==[-N,-N])
& abs(i-k)+abs(j-l)==1)
% starting point [k,l] is a corner:
p = 1/2;
elseif( (k==N | k==-N | l==N | l==-N) & abs(i-k)+abs(j-l)==1)
% starting point [k,l] is on the edge:
p = 1/3;
elseif(abs(i-k)+abs(j-l)==1)
% starting point is inside
p = 1/4;
else
p = 0;
end
% handle out of range cases:
if(i<-N | i>N | j<-N | j>N | k<-N | k>N | l<-N | l>N )
p=0;
end
end
The next function q.m implements the Metropolis algorithm to give the matrix elements of Q.
function qq = q(i,j,k,l,N)
% Metropolis adjustment: enforce the detailed balance condition
% q(i,j,k,l)*f(k/N,l/N) = f(i/N,j/N)*q(k,l,i,j)
% by decreasing the larger of the two off-diagonal entries
if(i~=k | j~=l)
  if(p(i,j,k,l,N)==0)
    qq = 0;
  elseif(p(i,j,k,l,N)*f(k/N,l/N) > f(i/N,j/N)*p(k,l,i,j,N))
    qq = p(k,l,i,j,N)*f(i/N,j/N)/f(k/N,l/N);
  else
    qq = p(i,j,k,l,N);
  end
else
  % diagonal entry: make the column sum to 1
  qq = 1 - q(k+1,l,k,l,N) - q(k-1,l,k,l,N) - q(k,l+1,k,l,N) - q(k,l-1,k,l,N);
end
end
Finally, here are the commands that compute a random walk defined by Q
% set the grid size
N=50;
% set the number of iterations
niter=1000
% pick starting grid point [k,l] at random
k=discrete_rnd(1,[-N:N],ones(1,2*N+1)/(2*N+1));
l=discrete_rnd(1,[-N:N],ones(1,2*N+1)/(2*N+1));
% or start in a corner (comment out these two lines to keep the random start)
k=N;
l=N;
% initialize: X and Y contain the path
% F contains f values along the path
X=zeros(1,niter);
Y=zeros(1,niter);
F=zeros(1,niter);
% the main loop
for count=1:niter
% pick the direction to go in the grid,
% according to the probabilites in the stochastic matrix q
probs = [q(k,l+1,k,l,N),q(k+1,l,k,l,N),q(k,l-1,k,l,N),q(k-1,l,k,l,N),q(k,l,k,l,N)];
dn = discrete_rnd(1,[1,2,3,4,5],probs);
switch dn
case 1
l=l+1;
case 2
k=k+1;
case 3
l=l-1;
case 4
k=k-1;
end
% calculate X, Y and F values for the new grid point
X(count)=k/N;
Y(count)=l/N;
F(count)=f(k/N,l/N);
end
% plot the path
subplot(1,2,1);
plot(X,Y);
axis([-1,1,-1,1]);
axis equal
%plot the values of F along the path
subplot(1,2,2);
plot(F);
The resulting plot looks like

[Left panel: the random walk path in the square −1 ≤ x ≤ 1, −1 ≤ y ≤ 1. Right panel: the values of F (that is, f along the path) for the 1000 iterations.]
IV.7. The Singular Value Decomposition
Prerequisites and Learning Goals
After completing this section, you should be able to
• explain what the singular value decomposition of a matrix is
• explain why a Hermitian matrix always has real eigenvalues and an orthonormal basis of
eigenvectors.
• write down the relationship between the singular values of a matrix and its matrix norm
IV.7.1. The matrix norm and eigenvalues
Recall that the matrix norm ‖D‖ of a diagonal matrix
$$D = \begin{bmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & \lambda_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$
is the largest absolute value of a diagonal entry, that is, the largest value of |λi |. Since, for a
diagonal matrix the diagonal entries λi are also the eigenvalues of D it is natural to conjecture that
for any matrix A the matrix norm ‖A‖ is the largest absolute value of an eigenvalue of A.
Let’s test this conjecture on a 2 × 2 example.
A=[1 2; 3 4]
A =

   1   2
   3   4

eig(A)
ans =
-0.37228
5.37228
norm(A)
ans = 5.4650
Since 5.37228 ≠ 5.4650 we see that the conjecture is false. But before giving up on this idea, let's
try one more time.
B=[1 3; 3 4]
B =

   1   3
   3   4

eig(B)
ans =
-0.85410
5.85410
norm(B)
ans = 5.8541
This time the norm and the largest absolute value of an eigenvalue are both equal to 5.8541.
The difference between these two examples is that B is a Hermitian matrix (in fact, a real
symmetric matrix) while A is not. It turns out that for any Hermitian matrix (recall this means
A∗ = A) the matrix norm is equal to the largest eigenvalue in absolute value. This is related to
the fact that every Hermitian matrix A can be diagonalized by a unitary matrix U: A = UDU∗ with D
diagonal with the eigenvalues of A on the diagonal. The point is that multiplying A by
a unitary matrix doesn't change the norm. Thus D has the same norm as UD, which has the same
norm as UDU∗ = A. But the norm of D is the largest eigenvalue in absolute value.
The singular value decomposition is a factorization similar to this that is valid for any matrix. For an arbitrary
matrix it takes the form
$$A = U\Sigma V^*$$
where U and V are unitary and Σ is a diagonal matrix with non-negative entries on the diagonal. These
numbers are called the singular values of A. As we will see, it is the largest singular value
that is equal to the matrix norm. The matrix A does not have to be a square matrix. In this case,
the unitary matrices U and V are not only different, but have different sizes. The matrix Σ has
the same dimensions as A.
IV.7.2. The singular value decomposition
When A is Hermitian we can write A = UDU∗ where U is unitary and D is diagonal. There is
a similar factorization that is valid for any n × m matrix A. This factorization is called the singular
value decomposition, and takes the form
$$A = U\Sigma V^*$$
where U is an n × n unitary matrix, V is an m × m unitary matrix and Σ is an n × m diagonal
matrix with non-negative entries. (If n ≠ m the diagonal of Σ starts at the top left corner and runs
into one of the sides of the matrix at the other end.) Here is an example.
$$\begin{bmatrix} 1 & 1 \\ 1 & -1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1/\sqrt{3} & -1/\sqrt{2} & -1/\sqrt{6} \\ -1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & 2/\sqrt{6} \end{bmatrix} \begin{bmatrix} \sqrt{3} & 0 \\ 0 & \sqrt{2} \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$
The positive diagonal entries of Σ are called the singular values of A.
Here is how to construct the singular value decomposition in the special case where A is an
invertible square (n × n) matrix. In this case U, Σ and V will all be n × n as well.
To begin, notice that A∗A is Hermitian, since (A∗A)∗ = A∗A∗∗ = A∗A. Moreover, the (real)
eigenvalues are all positive. To see this, suppose that A∗Av = λv. Then, taking the inner product
of both sides with v, we obtain ⟨v, A∗Av⟩ = ⟨Av, Av⟩ = λ⟨v, v⟩. The eigenvector v is by definition
not the zero vector, and because we have assumed that A is invertible Av is not zero either. Thus
λ = ‖Av‖^2/‖v‖^2 > 0.
Write the positive eigenvalues of A∗A as σ_1^2, σ_2^2, . . . , σ_n^2 and let Σ be the matrix with σ_1, σ_2, . . . , σ_n
on the diagonal. Notice that Σ is invertible, and Σ^2 has the eigenvalues of A∗A on the diagonal.
Since A∗A is Hermitian, we can find a unitary matrix V such that A∗A = VΣ^2V∗. Define U =
AVΣ^{-1}. Then U is unitary, since U∗U = Σ^{-1}V∗A∗AVΣ^{-1} = Σ^{-1}Σ^2Σ^{-1} = I.
With these definitions
$$U\Sigma V^* = AV\Sigma^{-1}\Sigma V^* = AVV^* = A$$
so we have produced the singular value decomposition.
Notice that U is in fact the unitary matrix that diagonalizes AA∗, since UΣ^2U∗ = AVΣ^{-1}Σ^2Σ^{-1}V∗A∗ =
AVV∗A∗ = AA∗. Moreover, this formula shows that the eigenvalues of AA∗ are the same as those
of A∗A, since they are the diagonal elements of Σ^2.
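Here is a quick MATLAB/Octave check of this construction (the matrix A is made up for illustration):

% build the SVD of an invertible 2x2 matrix from the eigenvectors of A'*A
A = [2 1; 1 3];
[V D] = eig(A'*A);
Sigma = sqrt(D);
U = A*V/Sigma;            % same as A*V*inv(Sigma)
norm(U'*U - eye(2))       % ~0: U is unitary (orthogonal)
norm(U*Sigma*V' - A)      % ~0: U*Sigma*V' recovers A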
In MATLAB/Octave, the singular value decomposition is computed using [U,S,V]=svd(A). Let’s
try this on the example above:
[U S V] = svd([1 1;1 -1;0 1])
U =

   0.57735  -0.70711  -0.40825
  -0.57735  -0.70711   0.40825
   0.57735   0.00000   0.81650

S =

   1.73205   0.00000
   0.00000   1.41421
   0.00000   0.00000

V =

   0  -1
   1  -0
IV.7.3. The matrix norm and singular values
The matrix norm of a matrix is the value of its largest singular value. This follows from the fact
that multiplying a matrix by a unitary matrix doesn't change its matrix norm. So, if A = UΣV∗,
then the matrix norm ‖A‖ is the same as the matrix norm ‖Σ‖. But Σ is a diagonal matrix
with the singular values of A along the diagonal. Thus the matrix norm of Σ is the largest singular
value of A.
Actually, the Hilbert-Schmidt norm is also related to the singular values. Recall that for a matrix
with real entries ‖A‖_HS = √(tr(A^T A)). For a matrix with complex entries, the definition is
‖A‖_HS = √(tr(A∗A)). Now, if A = UΣV∗ is the singular value decomposition of A, then A∗A = VΣ^2V∗. Thus
tr(A∗A) = tr(VΣ^2V∗) = tr(V∗VΣ^2) = tr(Σ^2). Here we used that tr(AB) = tr(BA) for any two
matrices A and B and that V∗V = I. Thus
$$\|A\|_{HS} = \sqrt{\sum_i \sigma_i^2},$$
where the σ_i are the singular values.
Notice that if we define the vector of singular values σ = [σ_1, . . . , σ_n]^T then ‖A‖ = ‖σ‖_∞ and
‖A‖_HS = ‖σ‖_2. By taking other vector norms for σ, like the 1-norm, or the p-norm for other values
of p, we can define a whole new family of norms for matrices. These norms are called Schatten
p-norms and are useful in some situations.
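Here is a quick numerical check of these two facts (the matrix A is made up for illustration):

A = [1 2; 3 4];
s = svd(A);                % vector of singular values
[norm(A), max(s)]          % matrix norm = largest singular value
[norm(A,'fro'), norm(s)]   % Hilbert-Schmidt (Frobenius) norm = 2-norm of the singular values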
IV.7.4. The null space of a measured formula matrix
We return to the formula matrix of a chemical system considered in section II.2.9, and consider
a situation where the formula matrix A contains experimentally measured, rather than exact,
values. For example, the chemical system could be a rock sample, instead of elements the chemical
constituents could be oxides O1 , . . . On and the species a collection of minerals M1 , . . . Mm found
in the rock. The formula matrix
$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,m} \end{bmatrix}$$
(with rows labelled by the oxides O_1, . . . , O_n and columns by the minerals M_1, . . . , M_m)
contains measured values ai,j . As before, we want to determine possible reactions by finding the
null space N (A). However, if n ≥ m, a generic n × m matrix does not have a non-zero nullspace.
So, because of the errors in the measured values ai,j , there will practically never be a non-zero null
space N (A), even if it is actually present in the chemical system. We can use the singular value
decomposition of A to determine if the null space is present, within experimental error.
According to the singular value decomposition A can be written as a product
A = U ΣV T
where U is an n × n orthogonal matrix, V is an m × m orthogonal matrix, and (for n ≥ m)
$$\Sigma = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_m \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}.$$
The singular values σ_i are non-negative numbers arranged in decreasing order σ_1 ≥ σ_2 ≥ · · · ≥
σ_m. The dimension of N(A) is equal to the number of σ_i that equal zero. In a situation, say,
where σ_1, . . . , σ_{m−1} are large but σ_m is very small, we can simply set σ_m = 0 in Σ. If Σ̂ is the
resulting matrix, we compute Â = UΣ̂V^T. Then N(Â) will be one dimensional and we can find the
corresponding reaction. In fact, since Σ̂[0, 0, . . . , 1]^T = 0 it is easy to see that ÂV[0, 0, . . . , 1]^T =
UΣ̂V^T V[0, 0, . . . , 1]^T = UΣ̂[0, 0, . . . , 1]^T = 0. Therefore V[0, 0, . . . , 1]^T, which is the last column of
V, is a non-zero vector in N(Â) from which we can read off the reaction. If k of the σ_i are small
enough to be discarded, then the corresponding reactions can be read off from the last k columns
of V. (To satisfy the non-degeneracy condition we may need to take linear combinations if we are
using more than one column.)
How do we decide if a σ_i is small enough to throw away? We need to know the error ∆a in
our measurements. Form the matrix Σ̂_i where all the singular values except σ_i are set to zero. The
largest entry of Â_i = UΣ̂_iV^T is the largest change that will occur if we discard σ_i. So if the largest
entry of Â_i is less than ∆a then σ_i can be discarded. The k, j-th entry of Â_i can be written as an
inner product ⟨e_k, Â_i e_j⟩ where e_k and e_j are standard unit vectors. Since
$$|\langle e_k, \hat A_i e_j\rangle| \leq \|e_k\|\,\|\hat A_i\|\,\|e_j\| = \|U\hat\Sigma_i V^T\| \leq \|U\|\,\|\hat\Sigma_i\|\,\|V^T\| = \|\hat\Sigma_i\| = \sigma_i,$$
a sufficient condition is σi ≤ ∆a.
Here is an example with made up data. Suppose that the measured formula matrix is
$$A = \begin{bmatrix} 33.2 & 6.3 & 1.0 & 22.2 & 31.7 \\ 16.7 & 59.4 & 3.7 & 8.7 & 0.0 \\ 2.9 & 9.0 & 5.0 & 35.9 & 26.1 \\ 10.0 & 5.3 & 74.8 & 36.8 & 31.3 \\ 37.1 & 20.1 & 15.3 & 3.8 & 16.4 \end{bmatrix}$$
and the error in the measurements is ∆a = 0.1.
We calculate the singular value decomposition in MATLAB/Octave as follows.
> A = [33.2, 6.3, 1.0, 22.2, 31.7; 16.7, 59.4, 3.7, 8.7, 0.0; 2.9, 9.0, 5.0,
35.9, 26.1;10.0, 5.3, 74.8, 36.8, 31.3; 37.1, 20.1, 15.3, 3.8, 16.4];
> [U S V]=svd(A)
U =

  -0.3607348   0.1958034   0.6938994   0.1309949  -0.5769536
  -0.2782364   0.7614613  -0.4810714  -0.2626182  -0.2058233
  -0.3285601  -0.0042285   0.3696774  -0.7213246   0.4848297
  -0.7515318  -0.5244669  -0.3700403   0.0623230  -0.1389982
  -0.3459814   0.3267328   0.1161153   0.6242425   0.6085894

S =

Diagonal Matrix

   1.0824e+02            0            0            0            0
            0   6.5227e+01            0            0            0
            0            0   4.4061e+01            0            0
            0            0            0   3.2641e+01            0
            0            0            0            0   7.0541e-03

V =

  -3.5041e-01   3.9986e-01   3.7864e-01   6.6340e-01   3.6587e-01
  -3.0206e-01   7.6983e-01  -4.6536e-01  -2.5699e-01  -1.8306e-01
  -5.9629e-01  -4.7893e-01  -5.7058e-01   2.9917e-01   3.5851e-04
  -4.7299e-01  -1.1098e-01   2.5679e-01  -6.3131e-01   5.4724e-01
  -4.5463e-01  -7.6054e-02   4.9857e-01  -7.6154e-02  -7.3018e-01
Since σ_5 = 0.0070541 ≤ 0.1 = ∆a we can set it to zero. The reaction coefficients are then given
by the last column of V. Thus the reaction is
0.36587 M1 + 0.00035851 M3 + 0.54724 M4 = 0.18306 M2 + 0.73018 M5
In fact, we can get a nicer looking equation if we normalize the coefficients. Treating the third
coefficient as zero, we normalize so that the next smallest is one.
> VN=V(:,5)/V(2,5)
VN =
-1.9986309
1.0000000
-0.0019584
-2.9893892
3.9887438
This yields the equivalent reaction 2M1 + 3M4 = M2 + 4M5 .
Applications of the linear algebra to petrology can be found in [4]. T. Gordon’s student Curtis
Brett tested their results using the singular value decomposition in [5]. See also [6] for the use of
the singular value decomposition in petrology.
References
[3] E. Froese, An outline of phase equilibrium, preprint
[4] C. J. N. Fletcher and H. J. Greenwood, Metamorphism and Structure of Penfold Creek Area,
near Quesnel Lake, British Columbia, J. Petrology (1979) 20 (4): 743-794.
[5] C. Brett, Algebraic Applications to Petrology, report submitted to Terence M. Gordon in partial
fulfillment of the requirements for the degree of Masters of Science, Earth and Oceanic Science
department UBC, (2007)
[6] G. W. Fisher, Matrix analysis of metamorphic mineral assemblages and reactions, Contributions
to Mineralogy and Petrology, (1989) Volume 102, Number 1, 69-77
IV.8. Principal Coordinates Analysis (PCA)
IV.8.1. Introduction
Imagine we are interested in comparing a selection of n different wines from different vineyards. A
connoisseur is given samples from pairs of wines and compares them, giving a (somewhat subjective)
comparison rating for how similar or dissimilar they are, ranging from 0 if the wines cannot be told apart up to
10 if they are totally different. This comparison rating is referred to as the dissimilarity between
the two wines. At the end of the day the connoisseur has compared all the different wines and
assigned ratings to describe the similarity and dissimilarity of each pair. This data is combined
into a dissimilarity matrix T where Tij is the rating given to the comparison between wines i and
j.
Even for a moderate selection, trying to make sense out of this data can be challenging. We want
to group the wines into similar-tasting families but, unless there is a very conspicuous characteristic
that is different (such as half of the selection being red and the other half white), it may be difficult
to identify significant patterns or relationships just by staring at the data.
The idea of Principal Coordinates Analysis (PCA) is firstly to represent the different objects
under consideration (in this case the wines) graphically, as points v1 , v2 , . . . , vn in Rp for some
suitable choice of dimension p. The distance ‖v_i − v_j‖ between points in the plot is chosen to reflect
as closely as possible the entry Tij in the dissimilarity matrix T . Our eye is much more capable of
spotting relationships and patterns in such a plot than it is spotting them in the raw data. However
another problem arises in doing this: we can only visualize two or possibly three dimensions (p = 2
or 3) whereas the data would most naturally be graphed in a much higher dimensional, possibly
even n − 1-dimensional, space. The second key idea of PCA is to reduce the number of dimensions
being considered to those in which the variation is greatest.
IV.8.2. Definitions and useful properties
A real square matrix A is called positive definite if
$$\langle x, Ax\rangle > 0$$
for any x ≠ 0.
A real square matrix A is called positive semi-definite if
$$\langle x, Ax\rangle \geq 0$$
for any x ≠ 0.
A real matrix of the form A^T A for any real matrix A is always positive semi-definite because
$$\langle x, A^T Ax\rangle = \langle Ax, Ax\rangle = \|Ax\|^2 \geq 0.$$
A positive semi-definite matrix has all non-negative eigenvalues. The proof is as follows. Let v
be an eigenvector of A with eigenvalue λ so
$$Av = \lambda v.$$
Because A is positive semi-definite we have
$$\langle v, Av\rangle \geq 0.$$
Using the fact that v is an eigenvector we also have
$$\langle v, Av\rangle = \langle v, \lambda v\rangle = \lambda \langle v, v\rangle.$$
Combining these two results and remembering that ⟨v, v⟩ > 0 for an eigenvector, we have our
result:
$$\lambda \geq 0.$$
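Here is a quick numerical illustration in MATLAB/Octave (using a random matrix, not an example from the text):

A = randn(4,3);
eig(A'*A)        % A'*A is positive semi-definite, so all eigenvalues are >= 0 (up to rounding)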
IV.8.3. Reconstructing points in Rp
We begin with a simple example, taking our objects to be a set of n points w1 , w2 , . . . , wn in Rp .
We take the dissimilarity between objects i and j to be the usual distance between those points
$$T_{ij} = \|w_i - w_j\|.$$
Is it possible to reconstruct the relative location of the points from the data in T alone?
It is important to notice that any reconstruction that we find is not unique. We are free to rotate,
reflect and translate the points and they still satisfy the only requirement that we make of them,
namely that the distances between the points are as specified.
Reconstructing two points in Rp
Consider two points, w1 and w2 , that are a known distance, say 2, apart. A single line can always
be placed between any two points and therefore we expect that the points can be represented in
one dimension.
If we only know the distance between the points, then a possible representation of them is v_1 = [0] and v_2 = [2].
Reconstructing three points in Rp
Now consider three points, w_1, w_2 and w_3 with dissimilarity matrix:
$$T = \begin{bmatrix} 0 & 3 & 5 \\ 3 & 0 & 4 \\ 5 & 4 & 0 \end{bmatrix}.$$
A plane can always be placed through any three points and therefore we expect that the points
can be represented in two dimensions. We can find such a representation using trilateration. First
we choose a point to represent w_1.
Next we draw a circle of radius 3 centred at our first point and choose a second point on the
circle to represent w_2.
Finally we draw a circle of radius 5 centred at our first point and a circle of radius 4 centred at
our second point and choose our third point to be at one of the intersections of these two circles to
represent w_3.

[Diagrams: the three steps of the trilateration construction.]

Thus we have been able to create a representation of the three points:
$$v_1 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 3 \\ 0 \end{bmatrix} \quad \text{and} \quad v_3 = \begin{bmatrix} 3 \\ 4 \end{bmatrix}.$$
Arbitrary number of points in Rp
The approach above works for a low dimensional space and with few points, but how can we handle
many dimensions and/or many points?
Let v1 , . . . , vn be the representation of the points that we are trying to find. We assume that
these points are in Rn (from the discussion above we can infer that the points can be chosen to be
in an n − 1 dimensional space, but choosing n makes the analysis below a little cleaner).
Let V be the matrix whose rows are the vectors for the points we are trying to find:
$$V = \begin{bmatrix} v_1^T \\ v_2^T \\ v_3^T \\ \vdots \\ v_n^T \end{bmatrix}$$
We assume that the points are chosen such that the sum of each column of V is 0, in other words
the centroid of the set of points is placed at the origin of Rn . We are allowed to do this because of
the translational freedom that we have in finding a solution.
All we know about the points is the distance that they are required to be apart. We have
$$(T_{ij})^2 = \|v_i - v_j\|^2 = \langle v_i - v_j, v_i - v_j\rangle = \langle v_i, v_i\rangle + \langle v_j, v_j\rangle - 2\langle v_i, v_j\rangle. \tag{IV.1}$$
It is easier to work with this equation in matrix form so we make the following definitions. Let
B be the matrix whose entries are the inner products between the points
$$B_{ij} = \langle v_i, v_j\rangle = v_i^T v_j,$$
or equivalently
$$B = VV^T.$$
Let ∆ be the matrix containing the squares of the dissimilarities, with entries
$$\Delta_{ij} = (T_{ij})^2.$$
Finally, let Q be the matrix with entries
$$Q_{ij} = \|v_i\|^2,$$
or equivalently
$$Q = qe^T,$$
where
$$q = \begin{bmatrix} \|v_1\|^2 \\ \|v_2\|^2 \\ \vdots \\ \|v_n\|^2 \end{bmatrix} \quad \text{and} \quad e = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$
We can now rewrite (IV.1) as the matrix equation
$$\Delta = Q + Q^T - 2VV^T. \tag{IV.2}$$
We want to find the matrix V, and know the matrix ∆. So the next step is to eliminate Q from
this equation. Notice that qe^T e = nq. This suggests that post-multiplying equation (IV.2) by
$$H = I - \frac{1}{n}\, ee^T$$
will eliminate Q:
$$QH = qe^T\Bigl(I - \tfrac{1}{n}\, ee^T\Bigr) = qe^T - qe^T = 0,$$
where 0 here represents the zero matrix. We also have that H is symmetric (H = H^T) so HQ^T =
H^T Q^T = (QH)^T = 0. Therefore pre-multiplying by H eliminates Q^T. Applying these operations
to (IV.2) we obtain the matrix equation
$$H\Delta H = -2HVV^TH = -2HVV^TH^T = -2HV(HV)^T.$$
We can simplify this by finding HV:
$$HV = \Bigl(I - \frac{1}{n}\, ee^T\Bigr)V = V - \frac{1}{n}\, ee^TV$$
but each entry of the row vector e^TV is the sum of a column of V, which we assumed was zero.
Therefore e^TV = 0^T and
$$HV = V.$$
We have now obtained the equation
$$VV^T = -\frac{1}{2}\, H\Delta H,$$
which is an equation that we are able to solve to find V . The matrix on the left is symmetric and
positive semi-definite (using the properties we saw in §IV.8.2). The matrix on the right is symmetric
(the dissimilarity matrix should be symmetric). In order for a solution to exist, it must also be
positive semi-definite. In the example we are considering, we know that a solution of the problem
exists because the dissimilarity matrix was constructed from a set of points in Rn . Therefore in
this case the matrix on the right must be positive semi-definite.
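Here is a quick numerical check of this fact in MATLAB/Octave (using random points, not an example from the text): if the dissimilarities really do come from points in R^p, then −(1/2)H∆H is positive semi-definite.

n = 6;
W = randn(n,2);            % random points in the plane; rows are the w_i
T = zeros(n);
for i = 1:n
  for j = 1:n
    T(i,j) = norm(W(i,:) - W(j,:));
  end
end
Delta = T.^2;
H = eye(n) - ones(n)/n;
B = -0.5*H*Delta*H;
min(eig((B+B')/2))         % smallest eigenvalue; should be >= 0 up to rounding error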
We can find a solution as follows. Because −(1/2)H∆H is a real symmetric matrix it can be
diagonalized as
$$-\frac{1}{2}\, H\Delta H = SDS^T,$$
where S is the matrix of normalized eigenvectors and D is the diagonal matrix of eigenvalues.
Because it is positive semi-definite all eigenvalues are non-negative, and we can take the square
root D^{1/2} of D, where the entries on the leading diagonal are the square roots of the eigenvalues.
We can now write
$$VV^T = (SD^{1/2})(SD^{1/2})^T \tag{IV.3}$$
and a solution for V is
$$V = SD^{1/2}.$$
Cautionary note: all we have actually shown is (IV.3). Our claim that SD^{1/2} is a possible solution
for V needs more justification. It is not immediately clear that just because this relation between V
and SD^{1/2} holds, the rows of SD^{1/2} must satisfy the same distance requirements as the rows of V.
This result is in fact a particular property of matrices of the form AA^T. Assuming AA^T = BB^T,
the fundamental idea is that the entries ij of AA^T and BB^T are the dot products of the rows i and
j of A and B respectively. This means that corresponding rows of A and B must have the same
length, and also the angle between rows i and j of A must equal the angle between the same rows
of B for all i and j. Therefore the points represented by the rows of A and the points represented
by the rows of B have the same magnitudes and relative angles and so we can find rotations and
reflections mapping the points of A to the points of B.
Example
Consider the five points in R^2 given by
$$w_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad w_2 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad w_3 = \begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad w_4 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}, \quad w_5 = \begin{bmatrix} -3 \\ -3 \end{bmatrix}.$$

[Plot of the five points, labelled 1 to 5, in the plane.]
The dissimilarity matrix for these points is
$$T = \begin{bmatrix} 0 & \sqrt{10} & \sqrt{26} & 2 & 5 \\ \sqrt{10} & 0 & 2 & \sqrt{26} & \sqrt{61} \\ \sqrt{26} & 2 & 0 & 5\sqrt{2} & \sqrt{89} \\ 2 & \sqrt{26} & 5\sqrt{2} & 0 & \sqrt{17} \\ 5 & \sqrt{61} & \sqrt{89} & \sqrt{17} & 0 \end{bmatrix}$$
Knowing only T , can we find a representation of the points?
Using MATLAB/Octave we have
> T
T =

   0.00000   3.16228   5.09902   2.00000   5.00000
   3.16228   0.00000   2.00000   5.09902   7.81025
   5.09902   2.00000   0.00000   7.07107   9.43398
   2.00000   5.09902   7.07107   0.00000   4.12311
   5.00000   7.81025   9.43398   4.12311   0.00000

> H = eye(5)-1/5*ones(5,1)*ones(1,5);
> Delta = T.^2; % this finds the square of each element in T
> [S D] = eig(-0.5*H*Delta*H)
S =

  -0.0450409   0.9633529   0.2644287   0.2207222   0.7175132
   0.3690803   0.0063295   0.0398071  -0.6871823   0.3848777
   0.6031796   0.1253245  -0.3538343  -0.2243887  -0.3502305
  -0.2791402  -0.1936833   0.6580702  -0.5270235  -0.4611091
  -0.6480788   0.1367176  -0.6084717  -0.3885333   0.0419632

D =

  56.60551    0.00000    0.00000    0.00000    0.00000
   0.00000   -0.00000    0.00000    0.00000    0.00000
   0.00000    0.00000    5.79449    0.00000    0.00000
   0.00000    0.00000    0.00000    0.00000    0.00000
   0.00000    0.00000    0.00000    0.00000   -0.00000
Remember that MATLAB/Octave returns the eigenvectors normalized, so we do not need to normalize the columns of S.
> V = S*D^0.5
V =

Columns 1 through 3:

  -0.33887 + 0.00000i   0.00000 + 0.00000i   0.63653 + 0.00000i
   2.77684 + 0.00000i   0.00000 + 0.00000i   0.09582 + 0.00000i
   4.53812 + 0.00000i   0.00000 + 0.00000i  -0.85174 + 0.00000i
  -2.10016 + 0.00000i  -0.00000 + 0.00000i   1.58409 + 0.00000i
  -4.87593 + 0.00000i   0.00000 + 0.00000i  -1.46470 + 0.00000i

Columns 4 and 5:

   0.00000 + 0.00000i   0.00000 + 0.00000i
  -0.00000 + 0.00000i   0.00000 + 0.00000i
  -0.00000 + 0.00000i   0.00000 + 0.00000i
  -0.00000 + 0.00000i   0.00000 + 0.00000i
  -0.00000 + 0.00000i   0.00000 + 0.00000i
Each row of V is now one of the points we want to find in R5 . Notice that only the first and third
entries of each row are non-zero (this is a result of the second, fourth and fifth eigenvalues in D
being zero). Therefore we can plot the result in R2 as follows:
> plot(V(:,1),V(:,3),'rs')
> axis([-6 5 -2 2])
and we obtain:

[Plot of the five reconstructed points, labelled 1 to 5, in the plane.]
This plot is rotated, translated and reflected from the plot that we started off with, but the
relative positions of the points are the same as those in the original.
IV.8.4. The dimension of the plot
In the example above, if we had only been given the dissimilarity matrix T our best guess initially
would have been that the 5 points came from a four-dimensional space. However we saw that
we only needed two dimensions to plot the data: the components in the second, fourth and fifth
coordinate directions were all zero because the corresponding eigenvalues were zero. Often most of
the variation takes place in only a few directions (the directions of the largest eigenvalues) and we
can look at the projection of the data points onto those coordinate directions (in the example above
the first and third coordinates). Because the matrix S in the construction contains orthonormal
vectors, the coordinate axes chosen in this way will be orthonormal.
There is a bit of an art (and controversy) to choosing the correct number of coordinates, but if
the first two or three eigenvalues are substantially larger than the remainder it is reasonably safe
just to use those for a comparison.
IV.8.5. Real examples
In most real-life applications, the dissimilarity matrix T does not produce a matrix −H∆H/2 that
is positive semi-definite and so taking the square root of the matrix of eigenvalues D gives imaginary
numbers. Consequently the decomposition V = SD1/2 does not make sense.
An example is four cities connected by roads as follows:
[Diagram: four cities, with city 4 joined by a road to each of cities 1, 2 and 3.]
We take the dissimilarity between two cities to be the driving time between them. If the journey
time between cities along each road is exactly 1 hour (some roads are much better than others),
then cities 1, 2 and 3 must all be an equal distance 2 apart in our representation. Therefore they
are the vertices of an equilateral triangle. But now city 4 is only an hour apart from all three other
cities and so must be a distance 1 apart from all the vertices of the equilateral triangle. But this
is impossible (even with arbitrarily many dimensions):
[Diagram: cities 1, 2 and 3 as the vertices of an equilateral triangle with side 2; there is no point at distance 1 from all three vertices.]
One solution for this problem is to use only the coordinate directions with real eigenvalues. If
these eigenvalues have significantly larger magnitudes than the negative eigenvalues, then this still
produces a useful representation of the objects under consideration. This is the approach we will
use in the examples below.
A difficult aspect of multi-dimensional scaling is determining how best to measure the dissimilarity
between objects. This will depend on such things as what particular properties of the objects we
are interested in and whether this data is quantitative (for example if we are comparing people we
may be interested in their heights or their hair or eye colour). For our examples, we will assume
that an appropriate choice of measurement has already been made.
Example 1
Take the set of objects to be the following towns and cities: Dawson Creek, Fort Nelson, Kamloops,
Nanaimo, Penticton, Prince George, Prince Rupert, Trail, Vancouver and Victoria. We index them
as

1  Dawson Creek
2  Fort Nelson
3  Kamloops
4  Nanaimo
5  Penticton
6  Prince George
7  Prince Rupert
8  Trail
9  Vancouver
10 Victoria
If our interest in the cities is their relative geographical location, then an appropriate measure
of dissimilarity is the distance ‘as the crow flies’ between them. Then the entry T11 is the distance
from Dawson Creek to itself (so 0 km), T12 and T21 are the distance between Dawson Creek and
Fort Nelson, 374 km, and so on. The full dissimilarity matrix in km is:
$$T = \begin{bmatrix}
0 & 374 & 558 & 772 & 698 & 261 & 670 & 752 & 756 & 841 \\
374 & 0 & 913 & 1077 & 1059 & 551 & 695 & 1124 & 1074 & 1159 \\
558 & 913 & 0 & 306 & 151 & 384 & 782 & 261 & 260 & 330 \\
772 & 1077 & 306 & 0 & 318 & 530 & 718 & 453 & 58 & 98 \\
698 & 1059 & 151 & 318 & 0 & 535 & 912 & 140 & 260 & 295 \\
261 & 551 & 384 & 530 & 535 & 0 & 504 & 629 & 524 & 610 \\
670 & 695 & 782 & 718 & 912 & 504 & 0 & 1041 & 753 & 816 \\
752 & 1124 & 261 & 453 & 140 & 629 & 1041 & 0 & 395 & 417 \\
756 & 1074 & 260 & 58 & 260 & 524 & 753 & 395 & 0 & 86 \\
841 & 1159 & 330 & 98 & 295 & 610 & 816 & 417 & 86 & 0
\end{bmatrix}$$
We can now find the PCA representation of the points using MATLAB/Octave (the m-file is on
the web in distance.m):
> T = [  0.  374.  558.  772.  698.  261.  670.  752.  756.  841.;
       374.    0.  913. 1077. 1059.  551.  695. 1124. 1074. 1159.;
       558.  913.    0.  306.  151.  384.  782.  261.  260.  330.;
       772. 1077.  306.    0.  318.  530.  718.  453.   58.   98.;
       698. 1059.  151.  318.    0.  535.  912.  140.  260.  295.;
       261.  551.  384.  530.  535.    0.  504.  629.  524.  610.;
       670.  695.  782.  718.  912.  504.    0. 1041.  753.  816.;
       752. 1124.  261.  453.  140.  629. 1041.    0.  395.  417.;
       756. 1074.  260.   58.  260.  524.  753.  395.    0.   86.;
       841. 1159.  330.   98.  295.  610.  816.  417.   86.    0.];
> Delta = T.^2;
> H = eye(size(T)) - 1./length(T)*ones(length(T),1)*ones(1,length(T));
> [S D] = eig(-0.5*H*Delta*H);
We would like to sort the eigenvalues such that the largest corresponds to the first coordinate, the
next largest to the second, and so on. We can use the MATLAB/Octave command sort for this:
> [lambda,I] = sort(diag(D),’descend’)
lambda =
1.4615e+06
4.4276e+05
7.6808e+02
2.4605e+02
1.5347e+02
3.9772e+00
1.4233e-11
-2.9002e+02
-4.5881e+02
-1.1204e+03
I =
1
2
4
7
8
10
9
6
5
3
Here the vector lambda contains the eigenvalues sorted from largest to smallest, and I contains the
original indices of the elements in the sorted vector. We see that the two largest eigenvalues have
substantially greater magnitudes than the following ones. This is what we would expect: it should
be approximately possible to represent these towns and cities as points on a plane, so only two
dimensions are needed. We also notice that it is not possible to represent the points perfectly in a
Euclidean space of any dimension, because there are some negative eigenvalues. These result partly
from the fact that the distances are measured along the (curved) surface of the Earth and partly
from rounding the distances in T.
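As a quick numerical check of how dominant these two eigenvalues are, we can compare them with the total weight of all the eigenvalues (this line is not in distance.m; it is just a sketch using the lambda computed above):
> sum(lambda(1:2))/sum(abs(lambda))   % very close to 1: two coordinates capture nearly everything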
If we just take the first two coordinates and plot the points
> X = S(:,I(1))*sqrt(D(I(1),I(1)));
> Y = S(:,I(2))*sqrt(D(I(2),I(2)));
> plot(X,Y,'bo');
> axis equal
we obtain
[Plot: the ten towns and cities plotted using the first two principal coordinates (axes in km); each point is labelled with its name.]
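The names shown in the plot can be added with something like the following (a sketch; the cell array names is ours and is not part of distance.m):
> names = {'Dawson Creek','Fort Nelson','Kamloops','Nanaimo','Penticton', ...
>          'Prince George','Prince Rupert','Trail','Vancouver','Victoria'};
> text(X,Y,names)            % label each plotted point with its name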
Notice that although the configuration of cities is both rotated and reflected compared with a standard map, their relative positions are exactly what we would expect.
Example 2
Instead, we may be interested in how easily and quickly we can get between the different towns and
cities. In this case, a more appropriate measure of dissimilarity is the driving distance between the
different towns and cities, or better still the driving time between them. The dissimilarity matrix
for driving times (in hours and minutes) is:
T = \begin{pmatrix}
0:00 & 5:27 & 10:06 & 15:02 & 12:14 & 4:27 & 12:38 & 15:28 & 12:47 & 15:20 \\
5:27 & 0:00 & 15:42 & 20:39 & 17:50 & 10:03 & 18:14 & 20:55 & 18:23 & 20:56 \\
10:06 & 15:42 & 0:00 & 5:49 & 2:53 & 5:39 & 13:49 & 5:43 & 3:33 & 6:04 \\
15:02 & 20:39 & 5:49 & 0:00 & 8:45 & 10:35 & 18:46 & 9:24 & 2:27 & 1:31 \\
12:14 & 17:50 & 2:53 & 8:45 & 0:00 & 7:47 & 15:57 & 3:29 & 4:25 & 6:56 \\
4:27 & 10:03 & 5:39 & 10:35 & 7:47 & 0:00 & 8:11 & 11:12 & 8:20 & 10:52 \\
12:38 & 18:14 & 13:49 & 18:46 & 15:57 & 8:11 & 0:00 & 19:23 & 16:31 & 19:03 \\
15:28 & 20:55 & 5:43 & 9:24 & 3:29 & 11:12 & 19:23 & 0:00 & 7:09 & 9:41 \\
12:47 & 18:23 & 3:33 & 2:27 & 4:25 & 8:20 & 16:31 & 7:09 & 0:00 & 3:03 \\
15:20 & 20:56 & 6:04 & 1:31 & 6:56 & 10:52 & 19:03 & 9:41 & 3:03 & 0:00
\end{pmatrix}
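The times have to be converted to ordinary numbers before the same commands can be run; one natural choice is decimal hours. Here is a minimal sketch of the conversion (the matrices Thr and Tmin of hour and minute parts are our own names, and the hours.m file on the website may do this differently):
> T = Thr + Tmin/60;          % driving times in decimal hours
> Delta = T.^2;               % then proceed exactly as before
> H = eye(size(T)) - 1./length(T)*ones(length(T),1)*ones(1,length(T));
> [S D] = eig(-0.5*H*Delta*H);
> [lambda,I] = sort(diag(D),'descend')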
If we try the same code as used for the geographical distance with the dissimilarity matrix
containing the driving times between the cities we find
lambda =
4.7754e+02
1.7095e+02
7.5823e+01
1.0814e+01
1.2364e+00
-1.4217e-15
-4.6973e-01
-3.4915e+00
-1.0043e+01
-3.3626e+01
and using just the first two coordinates we obtain this plot:
[Plot: the ten towns and cities plotted using the first two principal coordinates of the driving-time data (axes in hours); each point is labelled with its name.]
We notice that the structure is similar to that found for the distances, but there are a number
of important differences. Roughly speaking major roads appear as straight lines on this plot:
Vancouver to Kamloops is approximately Highways 1 and 5, then north from Kamloops to Prince
George is Highway 97. Dawson Creek and Fort Nelson are found continuing along Highway 97
from Prince George while Prince Rupert is found by turning on to Highway 16. However there is
a problem with what we have found: Trail is placed almost on top of Nanaimo and Victoria, when
it is on the other side of the province!
The problem is that we need to consider more principal coordinates to distinguish between Trail
and Vancouver Island: the third eigenvalue is not much smaller than the second. If we add another
principal coordinate
> X = S(:,I(1))*sqrt(D(I(1),I(1)));
> Y = S(:,I(2))*sqrt(D(I(2),I(2)));
> Z = S(:,I(3))*sqrt(D(I(3),I(3)));
> plot3(X,Y,Z,'bo')
> axis equal
then we see the following:
[3D plot: the towns and cities plotted using the first three principal coordinates of the driving-time data; each point is labelled with its name (FN = Fort Nelson, PG = Prince George, PR = Prince Rupert).]
(The plot is much easier to understand if you plot it yourself and rotate it. The m-file is on the
website in hours.m.)
We can now identify Highways 3 and 97 to Trail branching off at Kamloops, and the route by
ferry to Vancouver Island (where a large part of the journey time is the ferry crossing, which is
similar for both Nanaimo and Victoria).