Introduction to Matrix Multiplication Through Affine Transformations
Ad Astra Education
By Dov Kruger PhD
Dov.Kruger@AdAstraEducation.org

This is a draft of a concept paper resulting from discussions with mathematics teachers on Edmodo. It is being shared online in the hope that it will be helpful, but it is not a complete, polished course unit.

Matrices are rather abstract. People see matrix multiplication and don't understand what it means. A teacher reached out on Edmodo to ask how to motivate students to learn this topic, because the book just offers matrices as a notation to represent systems of linear equations, and students feel they don't need matrices and wonder why they have to learn them. So let's look at some examples that have a geometric application, and that incidentally power all those high-powered graphics cards we like to play 3-D video games on. At the end of this document, you can see an application that will let you exercise this material in PostScript.

I'm going to use Matlab or Octave notation here because I can run commands in Octave and paste them and their results right in here. Let me know if you have any problems understanding.

Notation:

A row vector is written:

    x = [1 2 3]

Octave then responds:

    x =
       1   2   3

A column vector is written:

    y = [1;2;3]

Octave replies with:

    y =
       1
       2
       3

A matrix can be entered in Octave with a semicolon at the end of each row:

    octave:11> A = [1 0 0;0 1 0 ; 0 0 1]
    A =
       1   0   0
       0   1   0
       0   0   1

For 2-D, let's make it even simpler and go to a 2x2 matrix:

    octave:12> A = [1 0;0 1]
    A =
       1   0
       0   1

Suppose you have a point at (3, 1).
If you multiply matrix A by this point, nothing happens:

    octave:13> x=[3;1]
    x =
       3
       1
    octave:14> A*x
    ans =
       3
       1

That's because the matrix multiplication stands for the following:

\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

\[ x' = ax + by = 1x + 0y = x \]
\[ y' = cx + dy = 0x + 1y = y \]

In other words, x' and y', the new values of x and y, are not changed at all because A is the identity matrix: the matrix that leaves anything you multiply by it unchanged. Can you think of the analogy in regular scalar numbers? What number, multiplied by anything, gives the same number?

Suppose you want to stretch the point out. Make it twice as big in the x and y:

    octave:15> A = [2 0 ; 0 2]
    A =
       2   0
       0   2
    octave:16> A*x
    ans =
       6
       2

With twos down the main diagonal, this matrix doubles both coordinates.

So how does this work in graphics on a computer? Well, suppose you are drawing a shape on a screen, like a square. It has four corners:

    (3,9)     (10,9)

    (3,2)     (10,2)

If you apply A*x to each corner of the square, the entire square will double in width and height. Why? Because each point gets twice as far away from the origin. Try it on graph paper!

There are a few more tricks you can pull, but they require a bit more advanced math. If you don't know trigonometry yet, just take it on faith that for any angle θ, you can compute two functions: sin(θ) and cos(θ). For θ = 45°:

\[ \sin\theta = \cos\theta = \frac{\sqrt{2}}{2} \approx 0.707 \]

If you make your matrix:

\[ A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 0.707 & 0.707 \\ -0.707 & 0.707 \end{bmatrix} \]

and draw a graph showing the original point and the new point as vectors, you can see that the point has been rotated by 45 degrees clockwise.

    A = [.707 .707; -.707 .707]   % the rotation matrix above
    quiver(0,0,x(1),x(2))         % original point, drawn as an arrow
    hold on
    x2 = A*x                      % rotated once
    quiver(0,0,x2(1), x2(2))
    x3 = A*x2                     % rotated a second time
    quiver(0,0,x3(1), x3(2))
    axis equal

So what do you think would happen if you applied this operation to the four corners of a rectangle, and then drew the rectangle?

Translation

There are three kinds of operations that can be performed on points to transform them.
You have already seen two:

    Scaling (making the distance from the origin grow and shrink)
    Rotation (rotating around the origin)

The third kind of transformation is translation, moving a point in the x and y. That is addition, not multiplication, so it can't be done with a 2x2 matrix. Instead, we need a 3x3 matrix:

\[ \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} ax + by + c \\ dx + ey + f \\ gx + hy + i \end{bmatrix} \]

For the transformations used here, the bottom row is always g = 0, h = 0, i = 1, so the last entry of the result is just 1. If we ignore the fact that the point really has only two values, not three, the multiplication for x' involves a and b in the matrix as before, and c is just added. Notice that we don't really need a full 3x3 matrix, we just need the first two rows of it, so it's really a 2x3 matrix.

The following example shows a matrix with only translation values. It shifts the point (4,2) by (+3,-3):

\[ \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} (1)4 + (0)2 + 3 \\ (0)4 + (1)2 - 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 7 \\ -1 \\ 1 \end{bmatrix} \]

Taking this Further: Using Matrix Transforms

If you want to take this further, you can try this out in a number of computer languages. All of them require a little bit of extra work, but then you can use transformations to draw. The simplest one is probably PostScript. Here is a small program written in PostScript. If you download Ghostscript, a free open-source program that displays PostScript, you can show this program on the screen.

The coordinate system in PostScript is as follows, with the origin at the bottom-left corner of the page:

    (0,792)          (612,792)

    (0,0)            (612,0)

    % black filled square at the origin
    0 0 moveto 100 0 lineto 100 100 lineto 0 100 lineto closepath fill

    % same square, moved by (200,300) and filled in red
    200 300 translate
    0 0 moveto 100 0 lineto 100 100 lineto 0 100 lineto closepath
    1 0 0 setrgbcolor fill

    % transformations accumulate: translated, then rotated 30 degrees, green outline
    30 rotate
    0 0 moveto 100 0 lineto 100 100 lineto 0 100 lineto closepath
    0 1 0 setrgbcolor stroke

    % additionally scaled by 0.5 in x and 2 in y, blue outline
    0.5 2 scale
    0 0 moveto 100 0 lineto 100 100 lineto 0 100 lineto closepath
    0 0 1 setrgbcolor stroke

    showpage

There are many nice PostScript tutorials out there that you can use to learn how to draw more. Here are a couple:

    http://paulbourke.net/dataformats/postscript/
    http://merganser.math.gvsu.edu/david/psseminar/

Exercises

1. Given the point (4,1), define a matrix that will scale it by 2 in the x and by 0.5 in the y. Write the matrix and do the multiplication. What should the result be?
2. Given the point (3,2), define a matrix that will invert it, in other words flip it to (-3,-2).
3. Given the point (2,5), define a matrix that will rotate it left (counterclockwise) by 90 degrees.
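
If you would rather check your graph-paper work in Octave, here is a small sketch that applies the three transformations from this paper to the square with corners (3,9), (10,9), (10,2), (3,2). The variable names (P, Ph, S, R, T) are just my choices for illustration. Stacking the corners as the columns of one matrix is a shortcut I am assuming here so that a single multiplication transforms every corner at once; it gives the same numbers as doing A*x one point at a time.

```octave
% Corners of the square from the text, one point per column.
P = [3 10 10 3;
     9  9  2 2];

% Scaling: doubles x and y, as in the 2x2 example.
S = [2 0;
     0 2];

% Rotation by 45 degrees clockwise (same sign layout as the text).
theta = pi/4;
R = [ cos(theta) sin(theta);
     -sin(theta) cos(theta)];

% Translation by (+3, -3) needs the 3x3 form, so append a row of ones.
T = [1 0  3;
     0 1 -3;
     0 0  1];
Ph = [P; ones(1, 4)];

scaled     = S * P;    % every corner twice as far from the origin
rotated    = R * P;    % every corner turned about the origin
translated = T * Ph;   % every corner shifted; the last row stays all ones

disp(scaled)
disp(rotated)
disp(translated(1:2, :))
```

To draw a transformed square, you could pass the rows to plot and repeat the first corner to close the outline, for example plot(scaled(1,[1:4 1]), scaled(2,[1:4 1])).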