Linear Algebra Basics

Introduction

The name – from matrix laboratory – says it all: MATLAB was designed from the get-go to work with matrices. It even treats individual numbers as matrices. The upside is that it's easier to handle complex calculations; the downside is that you really need to have a certain comfort level with a few concepts and techniques from linear algebra. The following is intended to provide you with what you need to know about matrix arithmetic, the matrix transpose, matrix inverses, and powers of matrices.

Matrices, vectors, and scalars

A matrix is simply a rectangular (or square) array of numbers arranged in rows and columns. In fact, the term "array" is used pretty much interchangeably with matrix. Programmers tend to use "array", while mathematicians and biologists tend to use "matrix". The entries in a matrix are termed elements of the matrix. A general representation of a matrix (we'll call it A) would be:

A = [ a11 a12 a13 ... a1c
      a21 a22 a23 ... a2c
      a31 a32 a33 ... a3c
      ...
      ar1 ar2 ar3 ... arc ]

where the a's represent the elements, usually but not always numerical, of the matrix. The elements of a matrix are usually enclosed in brackets, as above, or sometimes parentheses. The subscripts r and c are used here to emphasize that the first number in a matrix element's subscript represents its row number and the second represents its column number. In the real world, i and j are the standard subscripts, and we will adhere to that usage from now on. Examples of matrices include:

[ 1  3   ]    a 2×2 matrix; an example of a square matrix
[ 2  7.2 ]

[ 6  3  1 ]   a 2×3 rectangular matrix
[ 2  0  9 ]

[ 8  0  1.4  1.3 ]   a 1×4 matrix; an example of a row vector

[ 0     0.33  5    ]
[ 0.26  13    0    ]   a 3×3 matrix; another square matrix
[ 4.91  6     1.35 ]

[ 2.5  8   ]
[ 0    1.1 ]   a 3×2 rectangular matrix
[ 5    3   ]

[ 9.1 ]
[ 2.7 ]   a 4×1 matrix; an example of a column vector
[ 0.7 ]
[ 2.3 ]

Once you start working with MATLAB, you will quickly encounter the terms "column dimension" and "row dimension".
These terms are not part of standard linear algebra jargon, and their use in MATLAB is perhaps unfortunate, because the term "dimension" has a specific – and very different – meaning in linear algebra (which you will also encounter when working with MATLAB). Nevertheless, the terms are useful in the context of matrix manipulations. For example, when applying the MATLAB sum command to the matrix

A = [ 1 2 3
      4 5 6
      7 8 9 ]

we can instruct MATLAB to generate the three column sums as follows:

sum(A,1) = [12 15 18]

where we have summed along the row dimension (= dimension 1) to produce a 1×3 row vector. Similarly, we can generate the three row sums by summing along the column dimension (dimension 2):

sum(A,2) = [ 6
             15
             24 ]

to produce a 3×1 column vector. Something to note here: you're probably accustomed to entering data into columns of an Excel spreadsheet, hitting the <Enter> key after each entry to drop down to the next row. Data entry in Excel is thus equivalent to creating column vectors. In contrast, you will probably find that data entry in MATLAB is best done in the form of row vectors, which brings us to an important MATLAB technique: concatenation. Concatenation basically means the joining of two or more arrays to produce one. Thus, the matrix

A = [ 3 4 9
      7 8 9
      4 2 0 ]

could be thought of as the concatenation (along the column dimension, in MATLAB-speak) of the three column vectors

a1 = [3; 7; 4],  a2 = [4; 8; 2]  and  a3 = [9; 9; 0]

(written here in MATLAB's bracket notation, where a semicolon starts a new row), or as the three row vectors

a1 = [3 4 9],  a2 = [7 8 9]  and  a3 = [4 2 0]

concatenated along the row dimension. Concatenation is readily accomplished in MATLAB, and is typically used to merge complementary sets of data from different sources, including the results of calculations that have been performed by different programs or parts of programs.
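If you want to check sums like these outside MATLAB, the same computation can be sketched in Python with NumPy (a stand-in for MATLAB here, not part of MATLAB itself). One wrinkle: NumPy numbers its dimensions ("axes") from 0, so MATLAB's dimension 1 is NumPy's axis 0.

```python
import numpy as np

# The matrix A from the text
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Like MATLAB sum(A,1): sum along the row dimension -> the column sums
col_sums = A.sum(axis=0)
print(col_sums)   # [12 15 18]

# Like MATLAB sum(A,2): sum along the column dimension -> the row sums
row_sums = A.sum(axis=1)
print(row_sums)   # [ 6 15 24]
```

Either way, the result of summing along a dimension is a vector whose length matches the other dimension.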
You'll be able to work with MATLAB more efficiently if you can train yourself to think of matrices as concatenated column or row vectors, so let's concatenate

A = [ 1 2      and      B = [ 5 6
      3 4 ]                   7 8 ]

to see how concatenation works, and to get you started thinking in MATLAB. The MATLAB command cat(<dimension>,A,B) concatenates matrices A and B along the dimension, row or column, that you specify. Thus, cat(1,A,B) causes concatenation of A and B along the row dimension and yields

[ 1 2
  3 4
  5 6
  7 8 ]

while cat(2,A,B) concatenates along the column dimension and yields

[ 1 2 5 6
  3 4 7 8 ]

If you've ever tried to work with arrays in a programming language such as C/C++ or Java, you're probably starting to appreciate how easy matrix manipulations are to carry out with MATLAB!

Diagonal and Triangular Matrices

These are special forms of square matrices. The elements of a matrix for which i = j are referred to as the diagonal elements of the matrix, while the other elements (i ≠ j) are referred to as the off-diagonal elements. Thus, in the matrix

A = [ 9 3 6
      1 7 9
      0 7 2 ]

the diagonal elements are 9, 7, and 2. A square matrix in which all of the off-diagonal elements are zero:

A = [ 9 0 0
      0 7 0
      0 0 2 ]

is termed a diagonal matrix. Diagonal matrices are extremely useful in a variety of modeling contexts. The trace of a matrix is the sum of its diagonal elements; thus, the trace of the matrix above is 18. Triangular matrices come in two forms: upper triangular, in which all of the elements below the diagonal are zero,

A = [ 9 3 6
      0 7 9
      0 0 2 ]

and lower triangular, in which zeros comprise all of the elements above the diagonal:

A = [ 9 0 0
      1 7 0
      0 7 2 ]

Although we will probably not have much use for triangular matrices in this course, the execution speed of MATLAB code can be dramatically increased by incorporating them into models.

The Identity Matrix

One especially important diagonal matrix is termed the identity matrix.
The identity matrix is, of course, always a square matrix, and its diagonal elements are all ones, while its off-diagonal elements are all zeros. A 4×4 identity matrix would thus look like this:

I = [ 1 0 0 0
      0 1 0 0
      0 0 1 0
      0 0 0 1 ]

The identity matrix is the functional equivalent of the number 1, because multiplying a matrix by its identity matrix yields the same matrix, i.e., AI = A.

The Transpose of a Matrix

The transpose of a matrix is an important construct that is frequently encountered when working with matrices, and is represented variously by AT, A′, Atr, tA, or rarely Ã. Most linear algebra texts use AT, while MATLAB and a number of publications use A′. For that reason, we'll generally use A′ to represent the transpose of a matrix. Operationally, the transpose of a matrix is created by 'converting' each of its rows into the corresponding column of its transpose, meaning the first row of a matrix becomes the first column of its transpose, the second row becomes the second column, and so on. Thus, if

A = [ 1 2 3
      4 5 6
      7 8 9 ]

the transpose of A is

A′ = [ 1 4 7
       2 5 8
       3 6 9 ]

Depending on where you go with mathematical biology, you may not have much need for matrix transposes, but taking the transpose of a vector is something you will do frequently when working with MATLAB. For example, as stated earlier, it's generally easier to enter data into MATLAB as row vectors, but your model may require that your data be in column vector form – or you may simply be more comfortable working with column vectors. In either case, go ahead and enter your data as a row vector, then take its transpose to get your column vector.

Practice Problems: Calculate the transpose of the following:

A = [ 6    11  0
      1    5   11
      0.5  1   3 ]

B = [ 11 4
      1  3 ]

solution

Matrix Addition & Subtraction

Matrix addition is straightforward. Mathematically, we express C = A + B as cij = aij + bij, meaning you simply sum the corresponding elements of A and B to get the elements of C.
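The element-by-element rule is easy to check in code. In NumPy (again standing in for MATLAB), the ordinary + and - operators applied to two arrays do exactly this; the matrices below are chosen just to illustrate the mechanics.

```python
import numpy as np

# Two small matrices, chosen just to illustrate c_ij = a_ij + b_ij
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

C = A + B   # element-by-element sum
D = A - B   # element-by-element difference

print(C)    # [[11 22] [33 44]]
print(D)    # [[ -9 -18] [-27 -36]]
```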
Thus, if

A = [ 1 2 3        and        B = [  4  5 -6
      4 5 6                         -7  8  9
      7 8 9 ]                        1 -2  3 ]

summing the two matrices yields

C = A + B = [ a11+b11  a12+b12  a13+b13     [ 1+4     2+5     3+(-6)       [  5  7  -3
              a21+b21  a22+b22  a23+b23  =    4+(-7)  5+8     6+9       =    -3 13  15
              a31+b31  a32+b32  a33+b33 ]     7+1     8+(-2)  9+3 ]           8  6  12 ]

For subtraction, you simply subtract the corresponding elements:

D = A - B = [ a11-b11  a12-b12  a13-b13     [ 1-4     2-5     3-(-6)       [ -3 -3   9
              a21-b21  a22-b22  a23-b23  =    4-(-7)  5-8     6-9       =    11 -3  -3
              a31-b31  a32-b32  a33-b33 ]     7-1     8-(-2)  9-3 ]           6 10   6 ]

Practice Problem: Calculate A + B and A - B for the following:

A = [ 7 8 2        B = [ 8 4 2
      9 2 3              8 4 9
      1 6 4 ]            3 1 6 ]

solution

Matrix Multiplication

Scalar Multiplication

In linear algebra, individual numbers are referred to as scalars, from the Latin word for ladder. Multiplication of a matrix by a scalar proceeds element-by-element, with the scalar being multiplied by each element in turn. Mathematically, we express multiplication of a matrix A by a scalar c as cA = [c·aij], i.e., each element aij is replaced by c·aij. Thus, multiplication by the scalar 3 is accomplished as:

3 × [ 1    3 ]  =  [ 3×1    3×3 ]  =  [ 3    9 ]
    [ 1.7  2 ]     [ 3×1.7  3×2 ]     [ 5.1  6 ]

Multiplication of Matrices

Multiplication of one matrix by another is more complicated than scalar multiplication, and is carried out in accordance with a strict rule. Mathematically, if C is a matrix resulting from the multiplication of two matrices, A and B, then the elements cij of C are given by:

cij = Σ (k = 1 to n) aik·bkj        (Equation 1)

where the summation index k runs over 1, 2, …, n, and n is both the number of columns in A and the number of rows in B. Look carefully at the subscripts of a and b, and note that Equation 1 requires that the number of columns in the left-hand matrix be the same as the number of rows in the right-hand matrix. Also note that Equation 1 tells us that the product matrix has as many rows as the left-hand matrix and as many columns as the right-hand matrix. What does Equation 1 mean?
Well, if we wish to calculate the product of two matrices A and B:

A = [ a11 a12 a13        B = [ b11 b12 b13
      a21 a22 a23              b21 b22 b23
      a31 a32 a33 ]            b31 b32 b33 ]

then n = 3, and the product C = AB is defined by Equation 1 as:

AB = [ a11b11+a12b21+a13b31   a11b12+a12b22+a13b32   a11b13+a12b23+a13b33
       a21b11+a22b21+a23b31   a21b12+a22b22+a23b32   a21b13+a22b23+a23b33
       a31b11+a32b21+a33b31   a31b12+a32b22+a33b32   a31b13+a32b23+a33b33 ]

In other words, you multiply each of the elements of a row in the left-hand matrix by the corresponding elements of a column in the right-hand matrix (that's why the number of elements in the row and the column must be equal), and then sum the resulting n products to obtain one element of the product matrix. The choice of the row and column to be used in the multiplication and summation is based on which element of the product matrix (C) you wish to calculate: the left-hand matrix row you work with is the same as the row of the product-matrix element you wish to calculate, and the right-hand matrix column you work with is the same as the column of the product-matrix element you wish to calculate. For example, suppose you define the matrix C as the product of the two 3×3 matrices, A and B, shown above. If you wish to calculate the value of c11, you work element-by-element across the first row of the left-hand matrix and element-by-element down the first column of the right-hand matrix:

c11 = a11b11 + a12b21 + a13b31

Similarly, to calculate the value of c23, you work across the second row of the left-hand matrix and down the third column of the right-hand matrix:

c23 = a21b13 + a22b23 + a23b33

Example: find the product of two matrices, A and B. Let

A = [ 1 2 3        and        B = [ 4 5 6
      4 5 6                         7 8 9
      7 8 9 ]                       1 2 3 ]

We first note that multiplication of A by B is allowed by Equation 1 because the number of columns in A is the same as the number of rows in B, which allows us to calculate C = AB as:

C = AB = [ 1·4+2·7+3·1   1·5+2·8+3·2   1·6+2·9+3·3       [ 21  27  33
           4·4+5·7+6·1   4·5+5·8+6·2   4·6+5·9+6·3   =     57  72  87
           7·4+8·7+9·1   7·5+8·8+9·2   7·6+8·9+9·3 ]       93 117 141 ]

Example: What about multiplication of vectors? For example, suppose you wanted to calculate the product of two vectors, F = [1 2 3] and G = [4 5 6]. Well, a quick look at Equation 1 shows that setups such as

[1 2 3][4 5 6]        and        [1; 2; 3][4; 5; 6]

cannot be used for vector multiplication, because of the mismatch between the number of columns in the left-hand matrix and the number of rows in the right-hand matrix. The allowable combination is:

FG′ = [1 2 3][4; 5; 6] = 4 + 10 + 18 = 32

Note that we took the transpose of G in order to match the number of columns and rows, as required by Equation 1. Also note that the product of this vector multiplication is a scalar. Finally, the two matrices being multiplied don't have to be square, or even the same size. All that's required is that the number of columns in the left-hand matrix be the same as the number of rows in the right-hand matrix.

Practice Problems: For each of the following, determine whether the indicated multiplication is allowed and, if it is, calculate the corresponding product matrix:

a. [ 2 1 3       [  1  6 -1
     1 8 3   ×     -3  5 -3
     5 7 1 ]        2  4  9 ]

b. [ 4 3 1       [ 5
     2 2 2   ×     4
     1 4 3 ]       3 ]

c. [ 4 3 4       [ 2 2 1
     3 1 2 ]  ×    4 3 2 ]

d. [ 4 8       [  0.75  -1
     2 6 ]  ×    -0.25   0.5 ]

e. [2 5 1.3] × [5 5 1]

solutions

IMPORTANT: Commutativity of Matrix Operations

This may strike you as a little arcane – ok, perhaps as a lot arcane – but it's essential that you understand the concept of commutativity. A mathematical operation applied to two or more numbers or variables is said to be commutative if the result is independent of the order in which the operation is carried out.
Thus:

3 + 4 = 4 + 3 = 7
3 - 4 = -4 + 3 = -1
3 × 4 = 4 × 3 = 12

which we take for granted in our everyday lives. The same is true of matrix addition and subtraction, but – and this has important ramifications for your work with MATLAB – matrix multiplication is generally not commutative: that is, if A and B are two matrices, in general AB ≠ BA. The significance of this is that the ordering of matrices is important when you're setting up the calculation of their product. In fact, left multiplication and right multiplication are terms that you will encounter in linear algebra and when working with MATLAB. There are certain situations in which matrix multiplication is commutative, such as multiplication of vectors and multiplication by the identity matrix. But, except for those special cases, you have to be careful about how you set up any problem that involves matrix multiplication. If you aren't, your MATLAB program may crash or, worse, you may end up with results that look good but are not valid. You must write your MATLAB code with this in mind. I cannot stress this too strongly.

Practice Problem: Verify the non-commutativity of matrix multiplication for yourself by calculating AB and BA for the matrices:

A = [ 1 2      B = [ 5 6
      3 4 ]          7 8 ]

solution

Now, just for fun, confirm the assertion that vector multiplication is commutative, using vectors of your own choosing, composed of at least 3 elements.

Practice Problems: Multiplication by the Identity Matrix

I earlier asserted that the identity matrix was the matrix equivalent of the number 1 because the result of multiplication of any matrix by its corresponding identity matrix is simply the matrix itself. That is, for any matrix A, AI = A. Check this assertion by multiplying the following matrices by their corresponding identity matrix (while you're at it, check to see if multiplication involving the identity matrix is indeed commutative):

[ 1 6 7
  2 5 8
  3 4 9 ]

[ 8    0
  0.4  1 ]

[ 8  0.4
  0  1   ]

solutions

Question: how are those last two matrices related?
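If you'd rather have the computer do the checking, here is a NumPy sketch of these two practice problems (@ is NumPy's matrix-multiplication operator, and np.eye builds an identity matrix; both are stand-ins for MATLAB's * and eye):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

AB = A @ B   # [[19 22], [43 50]]
BA = B @ A   # [[23 34], [31 46]]
print(np.array_equal(AB, BA))    # False: matrix multiplication is not commutative

I = np.eye(2, dtype=int)         # the 2x2 identity matrix
print(np.array_equal(A @ I, A))  # True: AI = A
print(np.array_equal(I @ A, A))  # True: IA = A as well
```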
Also, convince yourself that multiplication by the identity matrix is commutative.

Matrix Division

In terms of the way we usually think of division of numbers, there really isn't such a thing as matrix division. That is, in contrast with the quotient 4/9, the matrix quotient A/B isn't defined. To see why this is true, suppose you were presented with the following:

A = [ a11 a12      B = [ b11 b12
      a21 a22 ]          b21 b22 ]

How would you proceed? Well, you might think that, analogous with the operation of addition, dividing each element of A by the corresponding element of B might be the way to go. Let's try it, using

A = [ 9 2      and      B = [ 1 2
      3 7 ]                   3 4 ]

Element-wise division of A by B yields

C = [ 9/1  2/2       [ 9  1
      3/3  7/4 ]  =    1  1.75 ]

which might seem a reasonable result. However, recalling that if C = A/B, then we should have CB = A (or BC = A; we must consider both, since matrix multiplication isn't commutative), let's check our result by carrying out both possible multiplications of B and C. This results in the following:

BC = [ 1 2   [ 9  1         [ 11   4.5
       3 4 ]   1  1.75 ]  =   31  10   ]  ≠ A

and

CB = [ 9  1      [ 1 2      [ 12    22
       1  1.75 ]   3 4 ]  =   6.25   9 ]  ≠ A

Clearly, element-wise division of two matrices does not accomplish what we intended when we set up the quotient A/B. It turns out that the functional equivalent of matrix division is defined, but it's even less straightforward than multiplication. To understand matrix division, recall that there are three ways that division of numbers may be represented. For example, division of 7 by 3 can be represented (and carried out) as follows:

7/3 = 7 × (1/3) = 7 × 3^-1

Likewise for variables:

x/y = x × (1/y) = x × y^-1

And therein lies the clue to how we can accomplish matrix division: with A/B undefined per se, we take advantage of the fact that matrix multiplication is defined, and carry out matrix division by multiplying the dividend matrix by the inverse of the divisor matrix. Thus, we express the quotient of two matrices as

A/B = A × B^-1 = AB^-1

where B^-1 represents the inverse of matrix B. Let's illustrate this using A and B. First, we need the inverse of B.
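If you'd like to have a machine compute the inverse, NumPy's inv routine (the Python analogue of MATLAB's inv function) will do it:

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

B_inv = np.linalg.inv(B)
print(B_inv)   # [[-2.   1. ]  [ 1.5 -0.5]]

# Sanity check: B times its inverse gives the identity matrix
print(np.allclose(B @ B_inv, np.eye(2)))   # True
```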
You can calculate the inverse for yourself, or you can accept my word that the inverse of B is:

B^-1 = [ -2    1
          1.5 -0.5 ]

With B^-1 in hand, we then accomplish the division of A by B as follows:

C = A/B = AB^-1 = [ 9 2   [ -2    1          [ -15   8
                    3 7 ]    1.5 -0.5 ]  =     4.5  -0.5 ]

and, if we check our result by carrying out the multiplication CB, we get:

CB = [ -15   8      [ 1 2      [ 9 2
        4.5 -0.5 ]    3 4 ]  =   3 7 ]  = A

So, the technique works. The technique is readily extended to larger matrices; 'all' you need is the inverse of the divisor matrix. And therein lies the rub: not all matrices are invertible, so not all matrices can be used as divisors. First of all, only square matrices can be inverted. Inverses of vectors and rectangular matrices are not defined, so division involving vectors or rectangular matrices is not possible. But there's a more insidious problem, one that can – and probably will – give you fits from time to time when you're developing and running models. Consider the following:

E = [ 9 2      F = [ 1    3
      4 5 ]          1.8  5.4 ]

If you try to calculate the inverse of matrix F, you will find that its elements are all infinite (F's determinant is precisely zero), which renders the product EF^-1 meaningless, no matter what the values of the elements of E might be. A matrix such as F, whose inverse consists entirely of infinite elements, is said to be singular, and mathematicians, clever devils that they are, sidestep the problem by stating that a singular matrix cannot be inverted or, equivalently, that its inverse does not exist. Fortunately, truly singular matrices are not frequently encountered in mathematical biology. But matrices that are close enough to singular to cause significant problems can be produced under many modeling scenarios. To appreciate this, let's look at the result of telling MATLAB to divide E by F as presented above. Do that, and instead of a numerical result you get the following message:

>> C = E/F
Warning: Matrix is singular to working precision.
C =
   Inf   Inf
   Inf   Inf

indicating that you have tried to divide one matrix (E) by a singular matrix (F) whose inverse doesn't exist. So far, so good; MATLAB has made you aware of the problem. But there's potential for disaster lurking in that phrase "singular to working precision". It refers to the inherent limit on the accuracy of computer arithmetic that results from the way computers store and work with numbers, a topic that we'll delve into deeply during lab sessions next fall. Suffice it for now to point out that if we were to increase just one of the elements of F by only 2×10^-15, say from 5.4 to 5.400000000000002, MATLAB will blithely go ahead and calculate an answer that might look reasonable at first glance, and give no warning at all that there's a potential problem. Because of its significance for computer modeling, we'll spend a fair amount of lab time investigating the causes and ramifications of this situation. One consequence is that you need to be extremely wary when your models include operations involving matrix division.

Powers of Matrices

The need to take powers of matrices arises frequently in biological models, for example in models of population dynamics and models involving Markov processes. We will confine ourselves to the situation where the power is an integer, positive or negative, and proceed by first recalling that the nth power (n a positive integer) of a number or a variable is simply the number multiplied by itself n - 1 times, while raising a number or a variable to a negative integer power -n yields the reciprocal, or inverse, of its nth power. Thus:

3^2 = 3 × 3 = 9
3^3 = 3 × 3 × 3 = 27
x^2 = x·x
x^3 = x·x·x
3^-3 = 1/3^3
x^-3 = 1/x^3

and so on. It's pretty much the same when working with matrices. If A is a square matrix, then

A^2 = A·A
A^3 = A·A·A
A^-3 = (A^-1)^3 = (A^3)^-1

and so on. (Why won't it work with a rectangular matrix?)
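A quick numerical check of these rules, again sketched in NumPy (np.linalg.matrix_power is the analogue of raising a matrix to an integer power in MATLAB; a negative exponent inverts first, just as described above). The matrix A below is an arbitrary invertible example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])   # a small invertible square matrix

A2  = np.linalg.matrix_power(A, 2)    # A*A
A3  = np.linalg.matrix_power(A, 3)    # A*A*A
Am3 = np.linalg.matrix_power(A, -3)   # (A^-1)^3, equivalently (A^3)^-1

print(A2)                                    # [[5. 3.] [3. 2.]]
print(np.allclose(A2 @ A, A3))               # True
print(np.allclose(Am3, np.linalg.inv(A3)))   # True
```

Trying the same call on a rectangular array raises an error, which is NumPy's way of making the point asked in the question above: repeated multiplication only makes sense when the number of columns keeps matching the number of rows, i.e., for square matrices.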
Also, as with numbers and variables,

A^m A^n = A^(m+n)        and        (A^m)^n = A^(mn)

Sources for more information:

If you want to learn more about linear algebra, or just would like an alternate presentation of any of the above topics, here are some useful URLs:

http://www.sosmath.com/matrix/matrix.html
http://www.math.ucdavis.edu/~daddel/linear_algebra_appl/OTHER_PAGES/other_pages.html
http://pax.st.usm.edu/cmi/mat-linalg_html/linal.html

And, of course, Googling the appropriate terms or phrases will bring up a wealth of information, especially with respect to applications, MATLAB scripts, etc.

Solutions to Practice Problems

Matrix transpose

A′ = [ 6   1   0.5
       11  5   1
       0   11  3   ]

B′ = [ 11  1
       4   3 ]

return

Matrix addition & subtraction

A + B = [ 15  12  4
          17  6   12
          4   7   10 ]

A - B = [ -1   4   0
           1  -2  -6
          -2   5  -2 ]

return

Matrix Multiplication

a. Multiplication is allowed:

answer = [   5  29   22
           -17  58    2
           -14  69  -17 ]

b. Multiplication is allowed:

answer = [ 35
           24
           30 ]

c. Multiplication is not allowed.

d. Multiplication is allowed:

answer = [ 1 0
           0 1 ]

(What kind of matrix is this? What does this result suggest about the relationship between the two matrices?)

e. Multiplication is not allowed.

return

Matrix multiplication isn't commutative

AB = [ 1 2   [ 5 6      [ 19 22
       3 4 ]   7 8 ]  =   43 50 ]

BA = [ 5 6   [ 1 2      [ 23 34
       7 8 ]   3 4 ]  =   31 46 ]

return

Multiplication by the identity matrix

AI = [ 1 6 7   [ 1 0 0      [ 1 6 7
       2 5 8     0 1 0   =    2 5 8
       3 4 9 ]   0 0 1 ]      3 4 9 ]

AI = [ 8    0   [ 1 0      [ 8    0
       0.4  1 ]   0 1 ]  =   0.4  1 ]

AI = [ 8  0.4   [ 1 0      [ 8  0.4
       0  1   ]   0 1 ]  =   0  1   ]

return
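Finally, for readers who want to check the matrix-division discussion numerically, here is a NumPy sketch of the A/B = AB^-1 example and of the singular matrix F. (One caveat: MATLAB's E/F right-division actually solves a system rather than forming an explicit inverse, but for this small illustration the explicit inverse is fine.)

```python
import numpy as np

A = np.array([[9.0, 2.0],
              [3.0, 7.0]])
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# "Matrix division" A/B carried out as A times the inverse of B
C = A @ np.linalg.inv(B)
print(C)                      # [[-15.    8. ]  [  4.5  -0.5]]
print(np.allclose(C @ B, A))  # True: the technique works

# The singular matrix F from the text: its determinant is exactly zero,
# so no inverse exists
F = np.array([[1.0, 3.0],
              [1.8, 5.4]])
print(np.linalg.det(F))       # 0, to within floating-point rounding
```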