Linear Algebra Background

advertisement
Linear Algebra Basics
Introduction
The name – from matrix laboratory – says it all: MATLAB was designed from the get-go
to work with matrices. It even treats individual numbers as matrices. The upside is that it’s
easier to handle complex calculations, the downside is that you really need to have a certain
comfort level with a few concepts and techniques from linear algebra. The following is
intended provide you with what you need to know about matrix arithmetic, the matrix
transpose, matrix inverses, and powers of matrices
Matrices, vectors, and scalars
A matrix is simply a rectangular (or square) array of numbers arranged in rows and
columns. In fact, the term “array” is used pretty much interchangeably with matrix.
Programmers tend to use “array”, while mathematicians and biologists tend to use “matrix”.
The entries in a matrix are termed elements of the matrix. A general representation of a
matrix (we’ll call it A) would be:
 a11
a
 21
A  a31

 
 a r1
a12
a 22
a32

ar 2
a13
a 23
a33

ar 3





a1c 
a 2c 
a3c   a rc 

 
a rc 
where the a’s represent the elements, usually but not always numerical, of the matrix. The
elements of a matrix are usually enclosed in brackets, as above, or sometimes parentheses.
The subscripts r and c are used here to emphasize that the first number in a matrix element’s
subscript represents its row number and the second represents its column number. In the real
world, i and j are the standard subscripts, and we will adhere to that usage from now on.
Examples of matrices include:
a 22 matrix; an
1  3 
:
example
of a
2 7.2

 square matrix.
a 23
6  3 1
:
rectangular
 2 0 9

 matrix
 8
0 1.4  1.3 a 14 matrix; an
example of a
: row vector.
0 0.33  a 33 matrix;
 5
 0.26 13
0  : another square

 4.91 6  1.35 matrix.
2.5  a 32
8
 0  1.1 :

 rectangular
 5  3  matrix
 9 .1 
 2 .7 

:
  0 .7 


 2 .3 
a 41 matrix;
an example of
a column
vector.
Once you start working with MATLAB, you will quickly encounter the terms “column
dimension” and “row dimension”. These terms are not part of standard linear algebra jargon,
and their use in MATLAB is perhaps unfortunate, because the term “dimension” has a specific
– and very different – meaning in linear algebra (which you will also encounter when
working with MATLAB). Nevertheless, the terms are useful in the context of matrix
manipulations, for example when applying the MATLAB sum command to the matrix
1 2 3
A  4 5 6 ,
7 8 9
we can instruct MATLAB to generate the three column sums as follows:
sum(A,1)
= 12 15 18
where we have summed along the row dimension ( = dimension 1) to produce a 31 row
vector. Similarly, we can generate the three row sums by summing along the column
dimension (dimension 2):
6
 
sum(A,2) = 15
 
 24 
to produce a 13 column vector
Something to note, here: you’re probably accustomed to entering data into columns of an
Excel spreadsheet, hitting the <Enter> key after each entry to drop down to the next row.
Data entry in Excel is thus equivalent to creating column vectors. In contrast, you will
probably find that data entry in MATLAB is best done in the form of row vectors, which
brings us to an important MATLAB technique: concatenation. Concatenation basically means
the joining of two or more arrays to produce one. Thus, the matrix
3 4 9
A  7 8 9
4 2 0
could be thought of as the concatenation along the column dimension in MATLAB-speak):of
the three column vectors:
3
4
9 
a1  7  , a2  8  and a3  9 ,
4
2
0
or as the three row vectors:
a1  3 4 9 , a2  7 8 9 and a3  4 2 0 .
concatenated along the row dimension:
Concatenation is readily accomplished in MATLAB, and is typically used to merge
complementary sets of data from different sources, including the results of calculations that
have been performed by different programs or parts of programs. You’ll be able to work
with MATLAB more efficiently if you can train yourself to think of matrices as concatenated
column or row vectors, so let’s concatenate:
1 2
5 6
A
, B

,
3 4
7 8 
to see how concatenation works, and to get you started thinking in MATLAB. The Matlab
command cat(<dimension>,A,B) concatenates matrices A and B along the dimension, row or
column, that you specify. Thus, cat(1,A,B) causes concatenation of A and B along the row
dimension and yields
1
3

5

7
2
4
6

8
while cat(2,A,B) concatenates along the column dimension and yields
1 2 5 6
3 4 7 8


If you’ve ever tried to work with arrays in a programming language such as C/C++ or Java,
you’re probably starting to appreciate how easy matrix manipulations are to carry out with
MATLAB!
Diagonal and Triangular Matrices
These are special forms of square matrices. The elements of a matrix for which i = j are
referred to as the diagonal elements of the matrix, while the other elements (i  j) are referred
to as the off-diagonal elements. Thus, in the matrix
9 3 6
A  1 7 9
0 7 2
the diagonal elements are 9, 7, and 2. A specific type of diagonal in which all of the offdiagonal elements are zero:
9 0 0
A  0 7 0
0 0 2
is termed a diagonal matrix. Diagonal matrices are extremely useful in a variety of
modeling contexts. The trace of a matrix is the sum of its diagonal elements. Thus, the trace
for the above matrix is 18.
Triangular matrices come in two forms: upper triangular, in which all of the elements
below the diagonal are zero,
9 3 6
A  0 7 9
0 0 2
and lower, in which zeros comprise all of the elements above the diagonal:
9 0 0
A  1 7 0
0 7 2
Although we will probably not have much use for triangular matrices in this course,
execution of MATLAB code can be dramatically increased by their incorporation into models.
The Identity Matrix
One especially important diagonal matrix is termed the identity matrix. The identity
matrix is, of course, always a square matrix, and its diagonal elements are all ones, while its
off-diagonal elements are all zeros. A 44 identity matrix would thus look like this:
1
0
I 
0

0
0 0 0
1 0 0
0 1 0

0 0 1
The identity matrix is the functional equivalent of the number 1, because multiplying a
matrix by its identity matrix yields the same matrix. I.e., AI  A .
The Transpose of a Matrix
The transpose of a matrix is an important construct that is frequently encountered when
~
working with matrices, and is represented variously by AT, A, Atr, tA, or rarely A . Most
linear algebra texts use AT, while MATLAB and a number of publications use A. For that
reason, we’ll generally use A to represent the transpose of a matrix. Operationally, the
transpose of a matrix is created by ‘converting’ of its rows into the corresponding columns of
its transpose, meaning the first row of a matrix becomes the first column of its transpose, the
second row becomes the second column, and so on. Thus, if
1 2 3
A  4 5 6 ,
7 8 9
the transpose of A is

1 2 3
1 4 7 


A  4 5 6  2 5 8 
7 8 9
3 6 9 
Depending on where you go with mathematical biology, you may not have much need for
matrix transposes, but taking the transpose of vector is something you will do frequently
when working with MATLAB. For example, as stated earlier, it’s generally easier to enter
data into MATLAB as row vectors, but your model may require that your data be in column
vector form – or you may more simply be more comfortable working with column vectors.
In either case, go ahead and enter your data as a row vector, then take its transpose to get
your column vector.
Practice Problems:
Calculate the transpose of the following:
6
A    1
0.5
0
 11 
5  3
11
B  4 1 3
solution
Matrix Addition & Subtraction:
Matrix addition is straightforward. Mathematically, we express C = A + B as

cij  aij  bij

meaning you simply sum the corresponding elements of A and B to get the elements of C.
Thus, if
5  6
4
1 2 3



A  4 5 6 and B   7 8
9  ,
 1  2 3 
7 8 9
summing the two matrices yields
 a11  b11
C  A  B  a 21  b21
 a31  b31
a12  b12
a 22  b22
a32  b32
a13  b13  1  4 2  5 3  6  5 7  3
a 23  b23   4  7 5  8 6  9   3 13 15 
a33  b33   7  1 8  2 9  3  8 6 12 
For subtraction, you simply subtract the corresponding elements:
2  5 3  (6)  3  3 9 
 a11  b11 a12  b12 a13  b13   1  4



D  A  B  a 21  b21 a 22  b22 a 23  b23   4  (7)
58
6  9    11  3  3
 a31  b31 a32  b32 a33  b33   7  1
8  (2)
9  3   6 10 6 
Practice Problem
Calculate A  B and A  B for the following:
7 8 2
8 4 2


A  9 2 3, B  8 4 9
1 6 4
3 1 6
solution
Matrix Multiplication:
Scalar Multiplication
In linear algebra, individual numbers are referred to as scalars, from the Latin word for
ladder. Multiplication of a matrix by a scalar proceeds element-by-element, with the scalar
being multiplied by each element in turn. Mathematically, we express multiplication of a
matrix A by a scalar as
   
cA  c aij  caij
where c is the scalar. Thus, multiplication by the scalar 3 is accomplished as:
 3 1  3  3 3  1  9 3
3



1.7 2 3  1.7 3  2 5.1 6
Multiplication of Matrices
Multiplication of one matrix by another is more complicated than scalar multiplication,
and is carried out in accordance with a strict rule. Mathematically, if C is a matrix resulting
from the multiplication of two matrices, A and B, then the elements cij of C are given by:
n
cij   a ik bkj
Equation 1
k 1
where k = 1, 2,…, n is the number of columns in A and the number of rows in B. Look
carefully at the subscripts of a and b, and note that Equation 1 requires that the number of
columns in the left-hand matrix must be the same as the number of rows in the right-hand
matrix. Also note that Equation 1 tells us that the product matrix has i rows and j columns.
What does Equation 1 mean? Well, if we wish to calculate the product of two matrices A
and B:
 a11
A  a 21
a31
a12
a 22
a32
a13 
b11 b12

a 23  and B  b21 b22
b31 b32
a33 
b13 
b23  ,
b33 
then n = 3, and the product C = AB is defined by Equation 1 as:
 a11b11  a12b21  a13b31
AB  a 21b11  a 22b21  a 23b31
 a31b11  a32b21  a33b31
a11b12  a12b22  a13b32
a 21b12  a 22b22  a 23b32
a31b12  a32b22  a33b32
a11b13  a12b23  a13b33 
a 21b13  a 22b23  a 23b33 
a31b13  a32b23  a33b33 
In other words, you multiply each of the elements of a row in the left-hand matrix by the
corresponding elements of a column in the right-hand matrix (that’s why the number of
elements in the row and the column must be equal), and then sum the resulting n products to
obtain one element in the product matrix.
The choice of the row and column to be used in the multiplication and summation is
based on which element of the product matrix ( C ) that you wish to calculate:


The left-hand matrix row you work with is the same as the row of the product
matrix element you wish to calculate.
The right-hand matrix column you work with is the same as the column of the
product matrix element you wish to calculate.
For example, suppose you define the matrix C as the product of the two 33 matrices, A and
B, shown above. If you wish to calculate the value of c11 ,
 c11 c12 c13 
c

 21 c22 c23 
c31 c32 c33 
you work element-by-element across the first row of the left-hand matrix and element-byelement down the first column of the right-hand matrix as follows:
 a11
c11  a 21
a31
a12
a 22
a32
a13 
a 23 
a33 
b11 b12
b
 21 b22
b31 b32
b13 
b23   c11  a11b11  a12b21  a13b31
b33 
Similarly, to calculate the value of c23
 c11 c12
c
 21 c 22
c31 c32
c13 
c 23 
c33 
you work across the second row of the left-hand matrix and down the third column of the
right-hand matrix:
 a11
c23  a 21
a31
a12
a 22
a32
a13 
a 23 
a33 
b11 b12
b
 21 b22
b31 b32
b13 
b23   c23  a21b13  a22b23  a23b33
b33 
Example: find the product of two matrices, A and B. Let
1 2 3
 4 5 6


A  4 5 6 and B  7 8 9 .
7 8 9
1 2 3
We first note that multiplication of A by B is allowed by Equation 1 because the number of
columns in A is the same as the number of rows in B, which allows us to calculate C = AB as:
1 2 3   4 5 6 
C  AB  4 5 6 7 8 9
7 8 9 1 2 3
1* 4  2 * 7  3 *1 1* 5  2 * 8  3 * 2 1* 6  2 * 9  3 * 3 
 4 * 4  5 * 7  6 *1 4 * 5  5 * 8  6 * 2 4 * 6  5 * 9  6 * 3
7 * 4  8 * 7  9 *1 7 * 5  8 * 8  9 * 2 7 * 6  8 * 9  9 * 3
 21 27 33 
 57 72 87 
93 117 141
Example: What about multiplication of vectors? For example, suppose you wanted to
calculate the product of two vectors, F = 1 2 3 and G = 4 5 6 . Well, a quick look at
Equation 1 shows that the following possible setups cannot be used for vector multiplication:
1   4 
1 2 34 5 6 , 2 5 , and
3 6
1 
24 5 6
 
3
because of mismatch between the number of columns in the left-hand matrix and the number
of rows in the right-hand matrix. The only allowable combination is:
 4
1 2 3 5  FG   4  10  18  32
6
Note that we took the transpose of G in order to match the number of columns and rows, as
required by Equation 1. Also note that the product of vector multiplication is a scalar.
Finally, the two matrices being multiplied don’t have to be square, or even the same size.
All that’s required is that the number of columns in the left-hand matrix be the same as the
number of rows in the right-hand matrix.
Practice Problems:
For each of the following, determine whether the indicated multiplication is allowed and, if it
is, calculate the corresponding product matrix:
2 1 3  1 6 1 
a. 1 8 3  3 5 3 



5 7 1  2 4  9
 4 3 1  5 
b. 2 2 2 4
1 4 3 3
5 
c. 4
 
3
4 3 1 
 2 2 2


1 4 3
2 3  2  0.75
d. 


4 8  1 0.5 
1  2 6 5 3
e. 


2 5 1.3 5 5 1
solutions
IMPORTANT
Commutivity of Matrix Operations
This may strike you but as a little arcane – ok, perhaps as a lot arcane – but it’s essential
that you understand the concept of commutivity. A mathematical operation applied to two
or more numbers or variables is said to be commutative if the result is independent of the
order in which the operation is carried out. Thus:
3+4=4+3=7
3 - 4 = -4 + 3 = -1
3  4 = 4  3 = 12
which we take for granted in our everyday lives. The same is true of matrix addition and
subtraction, but – and this has important ramifications for your work with MATLAB – matrix
multiplication is generally not commutative: that is, if A and B are two matrices, in general
AB  BA . The significance of this is that the ordering of matrices is important when you’re
setting up the calculation of their product. In fact, left multiplication and right
multiplication are terms that you will encounter in linear algebra and when working with
MATLAB.
There are certain situations in which matrix multiplication is commutative, such as
multiplication of vectors and multiplication by the identity matrix.. But, except for those
special cases, you have to be careful about how you set up any problem that involves matrix
multiplication. If you aren’t, your MATLAB program may crash or, worse, you may end up
with results that look good but are not valid. You must write your MATLAB code with this in
mind. I cannot stress this too strongly.
Practice Problem:
Verify the non-commutivity of matrix multiplication for yourself by calculating AB and
BA for the matrices:
1 2
 5 6
A
, B


3 4
7 8 
solution
Now, just for fun, confirm the assertion that vector multiplication is commutative, using
vectors of your own choosing, composed of at least 3 elements.
Practice Problems: Multiplication By the Identity Matrix
I earlier asserted that the identity matrix was the matrix equivalent of the number 1
because the result of multiplication of any matrix by its corresponding identity matrix is
simply the matrix itself. That is, for any matrix A, AI  A . Check this assertion by
multiplying the following matrices by their corresponding identity matrix (while you’re at it,
check to see if multiplication involving the identity matrix is indeed commutative):
1 6 7 
2 5 8 


3 4 9
8

 1

0
 0 .4 

1 
 1
 8
 
0 

 0.4 1 
solutions
Question: how are those last two matrices related? Also, convince yourself that
multiplication by the identity matrix is commutative
Matrix Division
In terms of the way we usually think of division of numbers, there really isn’t such a
thing as matrix division. That is, in contrast with the quotient 4 9 , the matrix quotient A B isn’t
defined. To see why this is true, suppose you were presented with the following:
 a11 a12 


A a 21 a 22 

B b11 b12 
b

 21 b22 
How would you proceed? Well, you might think that, analogous with the operation of
addition, dividing each element of A by the corresponding element of B might be the way to
go. Let’s try it, using
9 2
1 2
and B  
A


3 7 
3 4
Element-wise division of A by B yields
9 1 
C

1 1.75
which might seem a reasonable result. However, recalling that if C  A , then BC  A (or,
B
CB  A , since matrix multiplication isn’t commutative), let’s check our result by carrying
out both possible multiplications of B and C. This results in the following:
1 2 9 1  11 4.5
BC  


A
3 4 1 1.75 31 10 
and
9 1  1 2  12 22
CB  


A
1 1.75 3 4 6.25 9 
Clearly, element-wise division of two matrices does not accomplish what we intended when
A
we set up the quotient .
B
It turns out that the functional equivalent of matrix division is defined, but it’s even less
straightforward than multiplication. To understand matrix division, recall that there are three
ways that division of numbers may be represented. For example, division of 7 by 3 can be
represented (and carried out) as follows:
7
1
 7   7  3 1
3
3
Likewise for variables:
1
x
 x   xy 1
y
 y
A
And therein lies the clue to how we can accomplish matrix division: with
undefined per
B
se, we take advantage of the fact that matrix multiplication is defined and carry out matrix
division by multiplying the dividend matrix by the inverse of the divisor matrix. Thus, we
A
express the quotient of two matrices, , as
B
A
1
 A   AB 1
B
B
1
where B represents the inverse of matrix B. Let’s illustrate this using A and B. First, we
need the inverse of B. You can calculate the inverse for yourself, or you can accept my word
that the inverse of B is:
1 
 2
B 1  

1.5  0.5
With B-1 in hand, we then accomplish the division of A by B as follows:
C
1   15
8 
9 2 1 2 9 2  2
A
 AB 1  









B
3 7 3 4 3 7 1.5  0.5  4.5  0.5
and, if we check our result by carrying out the multiplication CB, we get:
8  1 2 9 2
 15
CB  


A
 4.5  0.5 3 4 3 7
So, the technique works. The technique is readily extended to larger matrices; ‘all’ you need
is the inverse of the divisor matrix.
And therein lies the rub: not all matrices are invertible, so not all matrices can be used as
divisors. First of all, only square matrices can be inverted. Inverses of vectors and
rectangular matrices are not defined, so division involving vectors or rectangular matrices is
not possible.
But, there’s a more insidious problem, one that can – and probably will – give you fits
from time-to-time when you’re developing and running models. Consider the following:
9 2 
4 5.
E

C  
3
F 1
1.8 5.4


If you try to calculate the inverse of matrix F, you will find that the result is
  
F 1  

  
which renders the product EF 1 meaningless, because it’s a matrix whose elements are all
precisely zero, no matter what the values of the elements of the E might be. A matrix such as
F whose inverse consists entirely of infinite elements is said to be singular, and
mathematicians, clever devils that they are, sidestep the problem by stating that a singular
matrix cannot be inverted, or equivalently, that its inverse does not exist.
Fortunately, truly singular matrices are not frequently encountered in mathematical
biology. But, matrices that are close enough to singular to cause significant problems can be
produced under many modeling scenarios. To appreciate this, let’s look at the result of
telling MATLAB to divide E by F as presented above. Do that, and instead of a numerical
result you get the following message:
>> C = E/F
Warning: Matrix is singular to working precision.
ans =
Inf Inf
Inf Inf
indicating that you have tried to divide one matrix (E) by a singular matrix (F) whose inverse
doesn’t exist. So far, so good; MATLAB has made you aware of the problem.
But, there’s potential for disaster lurking in that phrase “singular to working precision”.
It refers to the inherent limit on the accuracy of computer arithmetic that results from the way
computers store and work with numbers, a topic that we’ll delve into deeply during lab
sessions next fall. Suffice for now to point out that if we were to increase just one of the
elements of F by only 210-15, say from 5.4 to 5.400000000000002, MATLAB will blithely go
ahead and calculate an answer that might look reasonable at first glance, and give no warning
at all that there’s a potential problem. Because of its significance for computer modeling
we’ll spend a fair amount of lab time investigating the causes and ramifications of this
situation. One consequence is that you need to be extremely wary when your models include
operations involving matrix division.
Powers of Matrices
The need to take powers of matrices arises frequently in biological models, for example
in models of population dynamics and models involving Markov processes. We will confine
ourselves to the situation where the power is a integer, positive or negative, and proceed by
first recalling that the nth power (n a positive integer) of a number or a variable is simply the
number multiplied by itself n - 1 times. Similarly, the nth power (n a negative integer) of a
number or a variable is simply the reciprocal, or inverse, of its nth power (n positive). Thus,
32  3  3  9 , 33  3  3  3  27 , x 2  x  x , x 3  x  x  x , while 33  1 33 , x 3  1 3 , and so
x
on.. It’s pretty much the same when working with matrices. If A is a square matrix, then
A2  A  A
A3  A  A  A
 
A 3  A 1
3
 1 A3
and so on. (why won’t it work with a rectangular matrix?) Also, as with numbers and
variables,
A m A n  A m n
and
A 
m n
 A mn
Sources for more information:
If you want to learn more about linear algebra, or just would like an alternate presentation
of any of the above topics, here are some useful URLs:



http://www.sosmath.com/matrix/matrix.html
http://www.math.ucdavis.edu/~daddel/linear_algebra_appl/OTHER_PAGES/other_pages.ht
ml
http://pax.st.usm.edu/cmi/mat-linalg_html/linal.html
And, of course, Googling the appropriate terms or phrases will bring up a wealth of
information, especially with respect to applications, MATLAB scripts, etc.
Solutions to Practice Problems
Matrix transpose
 6  1 0.5  6 11 0 
A'  11 
5     1  11 
 0 11  3 0.5
5  3
 4
B   1 
3
return
Matrix addition & subtraction
15 12 4 
A  B  17 6 12
 4 7 19
0 
1 4

A  B   1  2  6
 2 5  2
return
Matrix Multiplication
a. Multiplication is allowed:
29  22
 5

answer =  17 58  2 
 14 69 17 
b. Multiplication is allowed:
35 
answer =  24 
30 
c. Multiplication is not allowed.
d. Multiplication is allowed:
1 0
answer = 
 (what kind of matrix is this? What does this result suggest
0 1 
about the relationship between the two matrices?)
e. Multiplication is not allowed.
return
Matrix multiplication isn’t commutative
1 2 5 6 19 22
AB  



3 4 7 8 43 50
5 6 1 2 23 34
BA  



7 8 3 4 31 46
return
Multiplication by the identity matrix
1 6 7 1 0 0 1 6 7
AI = 2 5 8 0 1 0  2 5 8


 

3 4 9 0 0 1 3 4 9
8
AI = 
 1

0
1 0 0
 0.4 
 8
 0 1 0   
1 
1
0 0 1 

0
 0.4

1 
 1
 1
 8
 8
1
0



AI =  
0  
0 
 

0
1
  0.4 1 
 0.4 1  


return
Download