METHODS FOR FINDING THE SOLUTION TO SYSTEMS OF LINEAR EQUATIONS

A RESEARCH PAPER SUBMITTED TO THE HONORS COMMITTEE IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE HONORS PROGRAM

by WILLIAM C. MERX

ADVISOR - DR. EARL H. MCKINNEY

BALL STATE UNIVERSITY
MUNCIE, INDIANA
JUNE, 1965

ACKNOWLEDGEMENTS

I wish to express my warmest thanks to Dr. Earl H. McKinney, who added to his very busy schedule the task of reading and correcting this paper, for his helpful suggestions and encouragement.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
PROBLEM
INTRODUCTION
I. GRAPHIC METHODS
II. ELIMINATION METHODS
      The Gauss Elimination Process
III. MATRIC METHODS
      The Gauss Elimination Process
      The Crout Method
      The Gauss-Jordan Elimination Process
      Inverse Methods
IV. DETERMINANT METHODS
      Cramer's Rule
V. RELAXATION
VI. ITERATION
      The Jacobi Method
      The Gauss-Seidel Method
VII. WHICH METHOD TO USE
      Conditioning of a System
      Choosing a Method
APPENDIX
      I. MATRICES
      II. DETERMINANTS
LIST OF REFERENCES

PROBLEM

The solution to many physical problems involves a system of linear equations. For example, the process of finding the amount of current through a component in an electrical circuit entails breaking the circuit into loops and writing each of them as a linear equation. By solving these equations simultaneously the experimenter can find the amount of current in each of the loops, and, hence, the current through each component.

The solution to such systems, however, is oftentimes not easily obtained. If the number of equations and the number of variables is greater than five, the number of arithmetic calculations is large, and the solution, though theoretically easily obtained, is found only after much tedious labor.

The purpose of this paper is to investigate a few of the more popular methods for finding the solution to such systems and to determine when each can be used to its best advantage. Little consideration will be given to the theory involved in the development of each of these methods.

INTRODUCTION

A system of linear equations can be denoted in various ways, but this paper will use the three methods that follow. Throughout this paper these systems will be referred to frequently, and will be distinguished as System I, System II, and System III, respectively. Note, however, that these are equivalent systems.
System I

a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n = b_2
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n = b_3
  .  .  .  .  .  .  .  .  .  .
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n = b_m

System II

\sum_{j=1}^{n} a_{ij} x_j = b_i, where i = 1, 2, ..., m.

System III

AX = B, where

A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & & & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{bmatrix},
\quad
X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix},
\quad
B = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_m \end{bmatrix}.

The notation in the above systems warrants some explanation. The x's of course are the variables, and their subscripts merely indicate specific variables. The a's and b's are scalars that are fixed for each system considered. The first subscript of a and the subscript of b denote the number of the equation. The second subscript of a denotes the variable of which a is the coefficient. For instance, a_{34} is the coefficient of the fourth variable in the third equation.

Note that each of the above systems involves m equations in n variables. From the Law of Trichotomy, either m < n, m = n, or m > n. This paper is mostly concerned with systems of the second case, i.e., where m = n, and in all of the systems used henceforth an n will be written in place of m. If one wishes to solve a system for which m > n, he may do so by deleting (m - n) of the equations, thus obtaining a system with as many equations as variables. The methods of this paper can then be applied. A system of equations such that m < n can be solved by treating (n - m) of the variables as constants, and applying the methods of this paper.

BALL STATE UNIVERSITY
MUNCIE, INDIANA
June 2, 1965

Dr. Goutor,

Mr. William Merx has done a fine job in preparing this honors thesis, "Methods of Finding the Solution to Systems of Linear Equations." I approve the thesis completely.

Dr. Earl H. McKinney
Head, Mathematics Department

I - GRAPHIC METHODS

The reader will recall from geometry that a linear equation in two variables can be represented on a two-dimensional coordinate system by a line. Also, every point on that line is a solution to the equation. Likewise, a linear equation in three variables can be represented in a three-dimensional space by a plane, where every point in that plane is a solution to the equation. Analogously, a linear equation with four variables has as its representation in a four-dimensional space a hyperplane. Every point in this hyperplane is a solution to the equation. This discussion can be carried further, but it is not necessary to do so here.

Suppose that in the same two-dimensional space two linear equations are represented by two lines. Obviously, these two lines either intersect, or are parallel, or coincide. If they intersect, the point of intersection (and it can be shown that there is only one) is a point on both lines, and thus is a solution to both equations. Hence that point is the solution to the system composed of the two equations represented in the space. If the two lines coincide, every solution of one equation is also a solution to the other. Thus, the two equations are identical, and there is no unique solution to the system. If the two lines are parallel, there is no point in the plane that lies on both lines, and it is said that a solution to this type of system does not exist.

This discussion can be extended to three-space. Suppose that two linear equations are represented by planes in the same three-dimensional space. If these planes intersect, and do not coincide, they intersect in a line.
Thus every point on that line is a solution to the two equations. If a third equation were represented by a third plane, not parallel to or coinciding with either of the first two, the plane will intersect the line in one and only one point. This point is the solution to the system of three equations. Of course this method can be carried on to four or more variables, but since graphical representation is then not practical, there is no need to discuss it here.

Example 1. Find the solution to the following system.

2x + 3y = 12
x - y = 1

(The two lines are graphed on the same coordinate system; they intersect in a single point.) By inspection, the solution is x = 3, y = 2.

II - ELIMINATION METHODS

Probably the most widely used methods for finding the solution to a system of linear equations are the elimination methods. The most popular of these are the Gauss Elimination Process and the Gauss-Jordan Elimination Process. The former will be discussed here, and the latter will be discussed in Section III. The discussion is facilitated there by a simplification of notation.

The Gauss Elimination Process

The first equation (i.e., when i = 1) of System II is

\sum_{j=1}^{n} a_{1j} x_j = b_1,    (1)

which can be written

a_{11} x_1 + \sum_{j=2}^{n} a_{1j} x_j = b_1.    (2)

By multiplying (2) by the multiplicative inverse of a_{11}, one obtains

x_1 + \sum_{j=2}^{n} (a_{11})^{-1} a_{1j} x_j = (a_{11})^{-1} b_1.    (3)

Equation (3) provides a means of removing the first terms (i.e., a_{i1} x_1, where i = 2, 3, ..., n) from the remaining (n - 1) equations. This is accomplished by multiplying equation (3) by -a_{i1} in each case and adding the result to the respective equation. These steps modify the remaining equations of the system to the following:

(a_{i1} - a_{i1}) x_1 + \sum_{j=2}^{n} (a_{ij} - a_{i1} (a_{11})^{-1} a_{1j}) x_j = b_i - a_{i1} (a_{11})^{-1} b_1, where i = 2, 3, ..., n.    (4)

This simplifies to

\sum_{j=2}^{n} a^1_{ij} x_j = b^1_i, where i = 2, 3, ..., n,    (5)

when the following substitutions are made:

a^1_{ij} = a_{ij} - a_{i1} (a_{11})^{-1} a_{1j}    (6)

and

b^1_i = b_i - a_{i1} (a_{11})^{-1} b_1.    (7)

Note that equation (5) represents a system of equations each with (n - 1) variables. The first equation of system (5) can now be written

a^1_{22} x_2 + \sum_{j=3}^{n} a^1_{2j} x_j = b^1_2.    (8)

Multiplying (8) by the multiplicative inverse of a^1_{22} yields

x_2 + \sum_{j=3}^{n} (a^1_{22})^{-1} a^1_{2j} x_j = (a^1_{22})^{-1} b^1_2.    (9)

Equation (9) provides a method of reducing the remaining equations of system (5) to a system of (n - 2) equations in (n - 2) variables. The new system is

\sum_{j=3}^{n} a^2_{ij} x_j = b^2_i, where i = 3, 4, ..., n,    (10)

where

a^2_{ij} = a^1_{ij} - a^1_{i2} (a^1_{22})^{-1} a^1_{2j}    (11)

and

b^2_i = b^1_i - a^1_{i2} (a^1_{22})^{-1} b^1_2.    (12)

It will be seen that if one carries out this process n times, System II will reduce to

x_1 + c_{12} x_2 + c_{13} x_3 + \cdots + c_{1(n-1)} x_{n-1} + c_{1n} x_n = d_1
      x_2 + c_{23} x_3 + \cdots + c_{2(n-1)} x_{n-1} + c_{2n} x_n = d_2
            x_3 + \cdots + c_{3(n-1)} x_{n-1} + c_{3n} x_n = d_3
                  .  .  .  .  .  .  .  .  .  .
                              x_{n-1} + c_{(n-1)n} x_n = d_{n-1}
                                                  x_n = d_n    (13)

where the c's and d's are known scalars.

System (13) does not directly yield the solution to System II. However, a process known as "back substitution" can be applied to find the solution. This process is described here. One will note that the last equation in (13) is solved, i.e.,

x_n = d_n.    (14)

If this value is substituted into the next to the last equation of (13), one will find that

x_{n-1} = d_{n-1} - c_{(n-1)n} d_n.    (15)

If this value is substituted in the second from the last equation in (13), the value of x_{n-2} will be found. This process can be repeated until values for all of the variables are found.
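The process just described is mechanical enough to state as a short program. The following sketch in the Python language is an illustration only, not part of the method's original statement: the function name and the storage of the system as a list of coefficient rows a and a list of constants b are assumptions of ours, and it presumes that no pivot element a_{kk} met along the way is zero.

```python
def gauss_solve(a, b):
    """Gauss Elimination with back substitution (a sketch).

    a: n x n list of coefficient rows; b: list of n constants.
    Assumes every pivot a[k][k] encountered is nonzero.
    """
    n = len(a)
    a = [row[:] for row in a]          # work on copies
    b = b[:]
    for k in range(n):
        inv = 1.0 / a[k][k]            # multiplicative inverse of the pivot, as in (3)
        for j in range(k, n):
            a[k][j] *= inv
        b[k] *= inv
        for i in range(k + 1, n):      # remove x_k from the rows below, as in (4)
            factor = a[i][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # back substitution, equations (14) and (15)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
    return x

# the system of Example 2 below
print(gauss_solve([[1, -3, -1], [2, -1, -3], [-5, 1, 2]],
                  [-32, 5, -242]))     # -> approximately [67.0, 21.0, 36.0]
```

Run on the system of Example 2 below, the sketch reproduces the hand computation step for step.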
Example 2. Find the solution to the following system.

x - 3y - z = -32
2x - y - 3z = 5
-5x + y + 2z = -242

Multiplying the first equation by its multiplicative inverse leaves it unchanged. Multiplying it by -2 and adding it to the second equation yields 5y - z = 69. Multiplying the first equation by 5 and adding it to the third yields -14y - 3z = -402. Hence the new system is

x - 3y - z = -32
5y - z = 69
-14y - 3z = -402.

Multiplying the second equation by 5^{-1} yields y - (1/5)z = 69/5. Multiplying this equation by 14 and adding it to the third equation yields -(29/5)z = -1044/5. The new system is now reduced to

x - 3y - z = -32
y - (1/5)z = 69/5
-(29/5)z = -1044/5.

Multiplying the third equation of this system by -5/29 will yield the new system

x - 3y - z = -32
y - (1/5)z = 69/5
z = 36,

which is of the same form as system (13) above. The process of back substitution yields

z = 36
y = 69/5 + 36/5 = 21
x = -32 + 3(21) + 36 = 67,

the solution to the system.

III - MATRIC METHODS

The Gauss Elimination Process

The discussion of the previous section can be simplified somewhat if a matrix is used in place of System II. The augmented matrix is made up of the matrix A in System III augmented by including the column of constants, B. Thus the augmented matrix of System II is

\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \end{bmatrix}.    (16)

This matrix can be simplified by applying the elementary row operations of Appendix I. The reader should note that these operations are nearly identical to the operations of Section II. The goal in simplifying matrix (16) is to arrive at the form

\begin{bmatrix} 1 & c_{12} & c_{13} & \cdots & c_{1n} & d_1 \\ 0 & 1 & c_{23} & \cdots & c_{2n} & d_2 \\ 0 & 0 & 1 & \cdots & c_{3n} & d_3 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & d_n \end{bmatrix}.    (17)

Note the similarity between matrix (17) and system (13). Matrix (17) is known as a triangular matrix, and the process of arriving at such a matrix is known as "triangularization." If one were working with matrices in solving a system, after triangularizing the matrix one would apply the definition of the augmented matrix and write matrix (17) as system (13). Using the process of back substitution, one could then find the values of the variables. Note that this discussion is identical to the one in Section II, except for a simplification of notation.

The Crout Method

Note that the process of triangularization transforms a given matrix into an equivalent matrix involving new scalars, but each of these new scalars is derived from the original scalars. It would be convenient if one could find the resulting matrix (17) without going through the tedious process of applying the elementary row operations. P. D. Crout developed such a method, which bears his name. To justify this method it is convenient to make some observations about the following matrix.

\begin{bmatrix}
r^1_{11} & r^1_{12} & r^1_{13} & r^1_{14} & r^1_{15} & \cdots & r^1_{1n} & s^1_1 \\
r^1_{21} & r^2_{22} & r^2_{23} & r^2_{24} & r^2_{25} & \cdots & r^2_{2n} & s^2_2 \\
r^1_{31} & r^2_{32} & r^3_{33} & r^3_{34} & r^3_{35} & \cdots & r^3_{3n} & s^3_3 \\
r^1_{41} & r^2_{42} & r^3_{43} & r^4_{44} & r^4_{45} & \cdots & r^4_{4n} & s^4_4 \\
\vdots & & & & & & & \vdots \\
r^1_{n1} & r^2_{n2} & r^3_{n3} & r^4_{n4} & r^5_{n5} & \cdots & r^n_{nn} & s^n_n
\end{bmatrix}    (18)

(The superscripts here are not exponents; they indicate the step of the process at which the element is computed.) Ignoring the superscripts for the moment and comparing matrix (18) with matrix (17), one sees that

r_{ij} = c_{ij} if i < j; 1 if i = j; 0 if i > j.    (19)

By taking note of how the values c_{ij}, 1 ≤ i ≤ n, 1 ≤ j ≤ n, were derived, equations can be written to yield each r_{ij} directly. One method of attacking this problem is to let

r^1_{i1} = a_{i1}, i ≥ 1 (or i ≥ j),    (20)

in matrix (16). If one divides the first row of matrix (16) by r^1_{11}, r^1_{1j} is obtained, i.e.,

r^1_{1j} = a_{1j} / r^1_{11}, 1 < j (or i < j),    (21)

and

s^1_1 = b_1 / r^1_{11}.    (22)

Note that this division yields the requirement that r^1_{11} = 1, but it is not necessary to make this statement.
Since one value has previously been assigned to r^1_{11}, and that value is the only one used in the calculation of the values of other terms, the value of this term in matrix (18) is assumed to be 1 without making a formal statement. This intuitive convention simplifies the notation of the outcome of this discussion.

Using the first row to perform an elementary row operation on each of the remaining rows yields

a^1_{ij} = a_{ij} - r^1_{i1} r^1_{1j}, i ≥ 2, j ≥ 2,    (23)

b^1_i = b_i - r^1_{i1} s^1_1,    (24)

as new values for the elements of those rows. This operation makes r_{i1} = 0, as desired, but in order to keep the notation as simple as possible, this statement is not made formally. At this time, we are only concerned with what happens in the second column, i.e.,

r^2_{i2} = a_{i2} - r^1_{i1} r^1_{12}, i ≥ 2 (or i ≥ j).    (25)

If row two is divided by r^2_{22}, one obtains

r^2_{2j} = (a_{2j} - r^1_{21} r^1_{1j}) / r^2_{22}, 2 < j (or i < j),    (26)

and

s^2_2 = (b_2 - r^1_{21} s^1_1) / r^2_{22}.    (27)

This step makes r^2_{22} = 1, which is desired, but again this is not stated. Using the second row to perform an elementary row operation on the remaining rows, one can obtain

(a_{ij} - r^1_{i1} r^1_{1j}) - r^2_{i2} r^2_{2j}, i ≥ 3, j ≥ 3,    (28)

(b_i - r^1_{i1} s^1_1) - r^2_{i2} s^2_2,    (29)

as new values for the elements of those rows. This step also makes r_{i2} = 0, where i ≥ 3, but again this is not stated.

This process follows the same pattern and will not be carried out here. The reader may verify that

r^3_{43} = a_{43} - r^1_{41} r^1_{13} - r^2_{42} r^2_{23},    (30)

r^4_{44} = a_{44} - r^1_{41} r^1_{14} - r^2_{42} r^2_{24} - r^3_{43} r^3_{34},    (31)

s^4_4 = (b_4 - r^1_{41} s^1_1 - r^2_{42} s^2_2 - r^3_{43} s^3_3) / r^4_{44}.    (33)

Note, however, that r^3_{43} becomes 0 and r^4_{44} becomes 1 after further operations, and these new values are the ones used when values are substituted in (18). By evaluating the elements in the following order: elements of the first column; elements of the first row to the right of the first column; elements of the second column below the first row; elements of the second row to the right of the second column; etc., one can use the following summation equations to find the elements of matrix (18):

r^j_{ij} = a_{ij} - \sum_{k=1}^{j-1} r^k_{ik} r^k_{kj}, i ≥ j,    (34)

r^i_{ij} = (a_{ij} - \sum_{k=1}^{i-1} r^k_{ik} r^k_{kj}) / r^i_{ii}, i < j,    (35)

s^i_i = (b_i - \sum_{k=1}^{i-1} r^k_{ik} s^k_k) / r^i_{ii}.    (36)
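Equations (34), (35), and (36) lend themselves directly to machine computation, a point returned to in Section VII. The following Python sketch is an illustration under assumptions of ours (the function name, the data layout, and the premise that no divisor r_{ii} turns out to be zero); it builds the working array of matrix (18) in the order just described and finishes with back substitution.

```python
def crout_solve(a, b):
    """Crout reduction by equations (34)-(36), then back substitution.

    a: n x n list of coefficient rows; b: list of n constants.
    Assumes no divisor r[j][j] is zero.
    """
    n = len(a)
    r = [[0.0] * n for _ in range(n)]   # the working array of matrix (18)
    s = [0.0] * n                       # the constant column
    for j in range(n):
        # column j, on and below the diagonal -- equation (34)
        for i in range(j, n):
            r[i][j] = a[i][j] - sum(r[i][k] * r[k][j] for k in range(j))
        # row j, to the right of the diagonal -- equation (35)
        for p in range(j + 1, n):
            r[j][p] = (a[j][p] - sum(r[j][k] * r[k][p] for k in range(j))) / r[j][j]
        # the constant column -- equation (36)
        s[j] = (b[j] - sum(r[j][k] * s[k] for k in range(j))) / r[j][j]
    # the final array is unit upper triangular, so back substitution finishes
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = s[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))
    return x

print(crout_solve([[1, -3, -1], [2, -1, -3], [-5, 1, 2]],
                  [-32, 5, -242]))      # -> approximately [67.0, 21.0, 36.0]
```

On the system of Example 3 below, the intermediate values the sketch produces (r^2_{22} = 5, r^2_{23} = -1/5, s^2_2 = 69/5, r^3_{33} = -29/5, s^3_3 = 36) agree with the hand computation.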
Example 3. Find the solution to the following system by triangularizing the augmented matrix.

x - 3y - z = -32
2x - y - 3z = 5
-5x + y + 2z = -242

The augmented matrix is

\begin{bmatrix} 1 & -3 & -1 & -32 \\ 2 & -1 & -3 & 5 \\ -5 & 1 & 2 & -242 \end{bmatrix}.

Since this system has relatively few variables, an easy method is the Gauss Elimination Method (or triangularization). But since that method was used in Example 2, the Crout Method will be used here to provide the reader with an example. One should recall that the goal of this method is to reduce the matrix

\begin{bmatrix} r^1_{11} & r^1_{12} & r^1_{13} & s^1_1 \\ r^1_{21} & r^2_{22} & r^2_{23} & s^2_2 \\ r^1_{31} & r^2_{32} & r^3_{33} & s^3_3 \end{bmatrix}
\quad\text{to}\quad
\begin{bmatrix} 1 & c_{12} & c_{13} & d_1 \\ 0 & 1 & c_{23} & d_2 \\ 0 & 0 & 1 & d_3 \end{bmatrix}.

Using equations (34), (35), and (36), one can obtain the following:

r^1_{11} = a_{11} = 1
r^1_{21} = a_{21} = 2
r^1_{31} = a_{31} = -5
r^1_{12} = a_{12} / r^1_{11} = -3
r^1_{13} = a_{13} / r^1_{11} = -1
s^1_1 = b_1 / r^1_{11} = -32
r^2_{22} = a_{22} - r^1_{21} r^1_{12} = -1 - (2)(-3) = 5
r^2_{32} = a_{32} - r^1_{31} r^1_{12} = 1 - (-5)(-3) = -14
r^2_{23} = (a_{23} - r^1_{21} r^1_{13}) / r^2_{22} = (-3 - (2)(-1)) / 5 = -1/5
r^3_{33} = a_{33} - r^1_{31} r^1_{13} - r^2_{32} r^2_{23} = 2 - (-5)(-1) - (-14)(-1/5) = -29/5
s^2_2 = (b_2 - r^1_{21} s^1_1) / r^2_{22} = (5 - (2)(-32)) / 5 = 69/5
s^3_3 = (b_3 - r^1_{31} s^1_1 - r^2_{32} s^2_2) / r^3_{33} = (-242 - (-5)(-32) - (-14)(69/5)) / (-29/5) = 36

Of course some of the elements listed above become 0 after further operations, and the augmented matrix becomes

\begin{bmatrix} 1 & -3 & -1 & -32 \\ 0 & 1 & -1/5 & 69/5 \\ 0 & 0 & 1 & 36 \end{bmatrix}.

Applying the definition of an augmented matrix, one has

x - 3y - z = -32
y - (1/5)z = 69/5
z = 36,

which, after the process of "back substitution" is applied, yields

x = 67
y = 21
z = 36.

The Gauss-Jordan Elimination Process

A variation of the method known as triangularization is a method known as diagonalization of a matrix, or the Gauss-Jordan Elimination Process. One will recall that in triangularization, zeros were introduced in place of all the elements below the diagonal by multiplying the elements of one row by a scalar and adding them to another row. In diagonalization, the same process is also applied to the elements above the diagonal. Hence, the following matrix results.

\begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & f_1 \\ 0 & 1 & 0 & \cdots & 0 & f_2 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & f_n \end{bmatrix}    (37)

After applying the definition of the augmented matrix, the solution can be read directly, i.e.,

x_1 = f_1, x_2 = f_2, ..., x_n = f_n.    (38)

Inverse Methods

Another method of finding the solution to a system of linear equations is that of using the inverse of the coefficient matrix. The properties of matrices allow one to make the following statements about System III:

A^{-1}(AX) = A^{-1}B, where A^{-1} exists,    (39)
(A^{-1}A)X = A^{-1}B,    (40)
IX = A^{-1}B, where I is the identity matrix.    (41)

Hence, the equation

X = A^{-1}B    (42)

leads one to another method for finding the solution to a system of equations. Of course this method is not always of value, because of the difficulty of finding the inverse matrix. Below are a few of the many methods of finding the inverse of a matrix.

1. Direct Solution. If B is the inverse of A, one knows

\begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.    (43)

Hence, by the definition of the multiplication of matrices,

\sum_{k=1}^{n} b_{ik} a_{kj} = 1 when i = j; 0 when i ≠ j.    (44)

Thus, all that remains is to find the variables b_{ij}, 1 ≤ i ≤ n, 1 ≤ j ≤ n, in the above system, and one will have the inverse matrix. But this involves n systems of equations, each with n equations in n variables. Since this is the type of problem we are trying to solve, this method is of little practical value.

2. Operations on the Identity Matrix. Another method of finding the inverse of a matrix is to reduce the matrix to the identity matrix by using the elementary row operations. By applying the same operations to the identity matrix, one can obtain the inverse of the original matrix.

3. Inversion by Use of the Adjoint Matrix. One should recall that

A^{-1} = (adj A) / |A|.    (45)

Thus if the adjoint of A is calculated, the inverse can be found directly. See Appendix I for a definition of the adjoint matrix.
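Method 2 above can be sketched briefly in Python. The sketch is illustrative only (names of ours; it assumes no zero pivot is met while reducing A to the identity): the same elementary row operations are performed on A and on the identity matrix, and the solution is then formed as X = A^{-1}B, as in equation (42).

```python
def inverse(a):
    """Find A^{-1} by applying to the identity matrix the same elementary
    row operations that reduce A to the identity (method 2 above).
    Assumes no zero pivot is encountered."""
    n = len(a)
    a = [row[:] for row in a]
    inv = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(n):
        p = 1.0 / a[k][k]
        for j in range(n):               # scale row k in both arrays
            a[k][j] *= p
            inv[k][j] *= p
        for i in range(n):               # clear column k elsewhere
            if i != k:
                f = a[i][k]
                for j in range(n):
                    a[i][j] -= f * a[k][j]
                    inv[i][j] -= f * inv[k][j]
    return inv

def solve_by_inverse(a, b):
    """Equation (42): X = A^{-1} B."""
    return [sum(r * bk for r, bk in zip(row, b)) for row in inverse(a)]

print(solve_by_inverse([[1, -3, -1], [2, -1, -3], [-5, 1, 2]],
                       [-32, 5, -242]))  # -> approximately [67.0, 21.0, 36.0]
```

As the text remarks, this route does more arithmetic than elimination applied to the system directly; the sketch is given only to make the statement X = A^{-1}B concrete.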
IV - DETERMINANT METHODS

Cramer's Rule

Another important method used for finding the solution to a system of linear equations is Cramer's Rule. If one were to multiply both sides of System III by the adjoint of matrix A, i.e., adj A, one would obtain

(adj A) A X = (adj A) B.    (46)

However,

(adj A) A = |A| I,    (47)

and (46) becomes

|A| I X = (adj A) B,    (48)

or

\begin{bmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & |A| \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} c_{11} & c_{21} & c_{31} & \cdots & c_{n1} \\ c_{12} & c_{22} & c_{32} & \cdots & c_{n2} \\ \vdots & & & & \vdots \\ c_{1n} & c_{2n} & c_{3n} & \cdots & c_{nn} \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},    (49)

where c_{ki} is the cofactor of a_{ki}. But (49) can be written as

\begin{bmatrix} |A| x_1 \\ |A| x_2 \\ \vdots \\ |A| x_n \end{bmatrix}
=
\begin{bmatrix} \sum_{k=1}^{n} c_{k1} b_k \\ \sum_{k=1}^{n} c_{k2} b_k \\ \vdots \\ \sum_{k=1}^{n} c_{kn} b_k \end{bmatrix}.    (50)

Hence,

x_i = \frac{1}{|A|} \sum_{k=1}^{n} c_{ki} b_k, i = 1, 2, ..., n.    (51)

Expansion of |A| by cofactors yields

|A| = \sum_{k=1}^{n} a_{ki} c_{ki}, 1 ≤ i ≤ n.    (52)

The reader should note that the numerator of equation (51) and the right side of equation (52) are the same except that the i-th column of a's in (52) is replaced by the column of scalars, b_k, 1 ≤ k ≤ n, from the original system.

Thus, Cramer's Rule is summarized by the equation

x_k = |A_k| / |A|,    (53)

where |A_k| is the same as |A| except that the k-th column is replaced by the column of scalars in the system. Hence, the solution to any system of linear equations can be easily indicated by (53), and all that remains is to evaluate (n + 1) determinants of order n. Below are two methods for evaluating such determinants.

1. Evaluation by Cofactors. The most widely known method of evaluation of determinants is the method of evaluation by cofactors. The general procedure is to write an n-th order determinant in terms of n determinants each of order (n - 1). Each of these can then be written as (n - 1) determinants of order (n - 2), and so on. It can be shown that when the elements of any row are multiplied by their respective cofactors and the products are summed, the result is the value of the original determinant. Thus, if

D = \begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{vmatrix},    (54)

then

D = \sum_{k=1}^{n} a_{kj} c_{kj}.    (55)

2. Evaluation by the Method of Chiò. A slight variation of the preceding method reduces the amount of arithmetic in evaluating a determinant considerably. Suppose some element, say a_{22}, of the general determinant (54) equals one. (If no element equals one, the elements of a row can be divided by a scalar to obtain a one somewhere. See Appendix II.) Applying the properties of a determinant, one can make the remaining elements in the second row equal to zero. Thus

D = \begin{vmatrix}
a_{11} - a_{21}a_{12} & a_{12} & a_{13} - a_{23}a_{12} & \cdots & a_{1n} - a_{2n}a_{12} \\
0 & 1 & 0 & \cdots & 0 \\
a_{31} - a_{21}a_{32} & a_{32} & a_{33} - a_{23}a_{32} & \cdots & a_{3n} - a_{2n}a_{32} \\
\vdots & & & & \vdots \\
a_{n1} - a_{21}a_{n2} & a_{n2} & a_{n3} - a_{23}a_{n2} & \cdots & a_{nn} - a_{2n}a_{n2}
\end{vmatrix}.    (56)

Applying the previous method to row two, one obtains

D = a_{22} \begin{vmatrix}
a_{11} - a_{21}a_{12} & a_{13} - a_{23}a_{12} & \cdots & a_{1n} - a_{2n}a_{12} \\
a_{31} - a_{21}a_{32} & a_{33} - a_{23}a_{32} & \cdots & a_{3n} - a_{2n}a_{32} \\
\vdots & & & \vdots \\
a_{n1} - a_{21}a_{n2} & a_{n3} - a_{23}a_{n2} & \cdots & a_{nn} - a_{2n}a_{n2}
\end{vmatrix}.    (57)

The process can be applied again to further reduce the order of the determinant.
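Equations (53) and (55) together give a complete, if laborious, procedure, and a short sketch makes the bookkeeping plain. The following Python fragment is illustrative only (function names of ours); the recursive cofactor expansion it uses is practical only for small n, in keeping with the remarks on computation in Section VII.

```python
def det(m):
    """Evaluation by cofactors, equation (55), expanding along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]   # delete row 1, column j
        total += (-1) ** j * m[0][j] * det(minor)          # cofactor sign (-1)^{1+j}
    return total

def cramer_solve(a, b):
    """Cramer's Rule, equation (53): x_k = |A_k| / |A|, where A_k is A with
    its k-th column replaced by the column of constants."""
    d = det(a)
    x = []
    for k in range(len(a)):
        a_k = [row[:k] + [bk] + row[k + 1:] for row, bk in zip(a, b)]
        x.append(det(a_k) / d)
    return x

print(cramer_solve([[1, -3, -1], [2, -1, -3], [-5, 1, 2]],
                   [-32, 5, -242]))    # -> [67.0, 21.0, 36.0]
```

Run on the system of Example 4 below, the sketch produces |A| = -29 and the determinants -1943, -609, and -1044 found there.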
Example 4. Find the solution to the following system.

x - 3y - z = -32
2x - y - 3z = 5
-5x + y + 2z = -242

The determinant of the coefficient matrix is

|A| = \begin{vmatrix} 1 & -3 & -1 \\ 2 & -1 & -3 \\ -5 & 1 & 2 \end{vmatrix} = -29.

Applying equation (53), one obtains

x = \begin{vmatrix} -32 & -3 & -1 \\ 5 & -1 & -3 \\ -242 & 1 & 2 \end{vmatrix} / |A| = -1943 / -29 = 67,

y = \begin{vmatrix} 1 & -32 & -1 \\ 2 & 5 & -3 \\ -5 & -242 & 2 \end{vmatrix} / |A| = -609 / -29 = 21,

z = \begin{vmatrix} 1 & -3 & -32 \\ 2 & -1 & 5 \\ -5 & 1 & -242 \end{vmatrix} / |A| = -1044 / -29 = 36.

V - RELAXATION

The relaxation and iterative methods for finding the solution to a system of linear equations differ from the above methods in that they find values for all of the variables simultaneously. These methods produce decimal approximations, which converge to the solution, if one exists.

In order to discuss relaxation it is necessary to establish the following definition. In System II the residual, R_i, of each equation is

R_i = b_i - \sum_{j=1}^{n} a_{ij} x_j, i = 1, 2, ..., n.    (58)

Of course, if the exact solution to the system were substituted in (58), the residuals would each equal zero. Hence, the overall aim of relaxation is to reduce the residuals to zero. In the example

-5x + y + 2z = -242
x - 3y - z = -32    (59)
2x - y - 3z = 5

the residuals are

R_1 = -242 + 5x - y - 2z
R_2 = -32 - x + 3y + z    (60)
R_3 = 5 - 2x + y + 3z.

Usually in starting the relaxation process, the initial residuals are found by setting all the variables equal to zero. The initial residuals in this example are then

R_1 = -242, R_2 = -32, R_3 = 5.    (61)

One will note that if x is changed one unit in the positive direction, R_1 is increased five units. Likewise, R_2 is decreased one unit, and R_3 is decreased two units. If y is changed one unit in the positive direction, R_1 is decreased one unit, R_2 is increased three units, and R_3 is increased one unit. If z is changed one unit in the positive direction, R_1 is decreased two units, R_2 is increased one unit, and R_3 is increased three units. The process of changing the values of the residuals by using these properties is known as the basic unit operation. The properties are summarized below in an operations table.

            ΔR_1   ΔR_2   ΔR_3
Δx = 1        5     -1     -2
Δy = 1       -1      3      1    (62)
Δz = 1       -2      1      3

These operations can be used to any extent, as needed; thus if Δx = n, then R_1 is changed 5n units, R_2 is changed -n units, and R_3 is changed -2n units.

Relaxation, then, consists of the repeated application of the basic unit operators in order to change the residuals to zero. A basic rule used in determining which operator to use and to what extent to use it is this: reduce the currently largest residual to zero, or as near to zero as possible. Hence one simply chooses the operator that will reduce the residual with the largest absolute value to zero by using it the least number of times.

Obviously, R_1 is the largest residual in the example considered here. From (62) one can see that R_1 can be made nearly zero by letting

Δx = 48,    (63)

i.e., by applying the basic unit operator Δx forty-eight times. The application of this operator reduces R_1 to -2, but at the same time R_2 becomes -80, and R_3 becomes -91. The above paragraph can be summarized by the following table, which is known as a relaxation table.

                 R_1    R_2    R_3
x = y = z = 0   -242    -32      5
Δx = 48           -2    -80    -91    (64)

Now the aim is to reduce R_3 to zero or nearly so. From (62) one can see that this can be accomplished by letting Δz = 30. Thus one has

                 R_1    R_2    R_3
x = y = z = 0   -242    -32      5
Δx = 48           -2    -80    -91    (65)
Δz = 30          -62    -50     -1

This same process can be repeated, and is summarized here.
                 R_1    R_2    R_3
x = y = z = 0   -242    -32      5
Δx = 48           -2    -80    -91
Δz = 30          -62    -50     -1
Δx = 12           -2    -62    -25
Δy = 20          -22     -2     -5
Δx = 4            -2     -6    -13
Δz = 4           -10     -2     -1
Δx = 2             0     -4     -5
Δz = 1            -2     -3     -2
Δy = 1            -3      0     -1
Δx = 1             2     -1     -3
Δz = 1             0      0      0
x = 67, y = 21, z = 36    0      0      0    (67)

The values in the first column can be added, yielding Δx = 67, Δy = 21, and Δz = 36. This indicates that x changed 67 units in the positive direction, y changed 21 units, and z changed 36 units. If one substitutes

x = 67, y = 21, and z = 36    (68)

in equations (60), he will find that

R_1 = R_2 = R_3 = 0.    (69)

This step is shown in the last line of (67). Hence, the solution to system (59) is denoted by equations (68) above.

Since the solution to the above example was integral, the residuals were capable of being made exactly zero. However, if this condition were not met, the residuals would not become zero, and it would be necessary to use decimal fractions to approximate the solution. The relaxation table for the following system is given below without explanation, to illustrate this technique.

7x - 3y = 96
-3x + 10y = 141    (70)

The operations table for this system is

            ΔR_1   ΔR_2
Δx = 1       -7      3
Δy = 1        3    -10    (71)

The relaxation table is

                        R_1     R_2
x = y = 0                96     141
Δy = 14                 138       1
Δx = 20                  -2      61
Δy = 6                   16       1
Δx = 2                    2       7
Δy = 1                    5      -3
Δx = 1                   -2       0
x = 23, y = 21         -2.0     0.0
Δx = -0.3               0.1    -0.9
Δy = -0.1              -0.2     0.1
x = 22.7, y = 20.9    -0.20    0.10
Δx = -0.03             0.01    0.01
x = 22.67, y = 20.90  0.010   0.010
Δx = 0.001            0.003   0.013
Δy = 0.001            0.006   0.003
x = 22.671, y = 20.901  0.006  0.003    (72)

Hence, the solution correct to two decimal places is

x = 22.67, y = 20.90.    (73)

The reader should note that at several places in the relaxation process a sum of the variables was made and the values substituted in the original system to find the residuals at that point. This acts as a check on the work, since arithmetic errors can occur quite easily in the process. If an error is found, it can be corrected by continuing from the values found in the check step, making it unnecessary to repeat the earlier steps.

The above discussion summarizes the basic process of relaxation. With a little practice one will find that it is relatively easy to apply, and will find ways to simplify it. A few of these methods are discussed here.

1. Overrelaxation. Overrelaxation is the process whereby the residual is reduced beyond zero, because the processor sees that the next step will increase the residual.

2. Block Relaxation and Group Relaxation. Sometimes more than one basic operator can be applied at a single step. If the operators are all used to the same extent, the process is known as block relaxation. If they are not used to the same extent, the process is termed group relaxation.

3. Multiplying Factors. This is a process which reduces the residuals in proportionate steps. That is, if the residuals are all reduced in the same proportion after several steps, they will be reduced in the same proportion when the steps are repeated proportionately. Hence, the work can be done in one step.
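The basic rule of relaxation can itself be stated as a program. The Python sketch below is illustrative only: it replaces the hand computer's rounding to convenient whole units with exact division, and it assumes (as in the examples above) that each residual is best reduced through the variable with the largest coefficient in its equation, so that for a system like (59), where each equation is dominated by a different variable, the process settles to the solution.

```python
def relax(a, b, steps=200, tol=1e-4):
    """Basic relaxation (a sketch).  At each step the residual of largest
    absolute value is driven to zero by adjusting the variable whose
    coefficient in that equation is largest in absolute value."""
    n = len(b)
    x = [0.0] * n
    for _ in range(steps):
        # residuals, equation (58)
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        i = max(range(n), key=lambda k: abs(r[k]))      # largest residual
        if abs(r[i]) < tol:
            break
        j = max(range(n), key=lambda k: abs(a[i][k]))   # strongest operator
        x[j] += r[i] / a[i][j]                          # drive R_i to zero
    return x

print(relax([[-5, 1, 2], [1, -3, -1], [2, -1, -3]],
            [-242, -32, 5]))            # -> approximately [67.0, 21.0, 36.0]
```

The first two adjustments the sketch makes (Δx = 48.4, then Δz = 30.6) are the unrounded counterparts of the table entries Δx = 48 and Δz = 30 above.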
VI - ITERATION

Iterative techniques of finding the solution to a system of linear equations, like the relaxation method of the previous section, begin with a guess at the solution and proceed toward the true solution. Iterative techniques differ from relaxation, however, in that the former entail a definite routine for proceeding from one approximation to the next, where relaxation does not. The Jacobi Method of Iteration and the Gauss-Seidel Method of Iteration follow essentially the same procedure, and both are explained best by considering an example.

The Jacobi Method

Suppose one is asked to find the solution to the following system.

-5x + y + 2z = -242
x - 3y - z = -32    (74)
2x - y - 3z = 5

The first step is to use the successive equations to "solve" for the variables, i.e.,

x = (1/5)(242 + y + 2z)
y = (1/3)(32 + x - z)    (75)
z = -(1/3)(5 - 2x + y)

If the trial solution

x = y = z = 0    (76)

is selected and substituted in the right-hand side of system (75), one obtains

x = (1/5)(242 + 0 + 0) = 242/5 = 48.400
y = (1/3)(32 + 0 - 0) = 32/3 = 10.666    (77)
z = -(1/3)(5 - 0 + 0) = -5/3 = -1.666

If the trial solution were the true solution, each variable would be unchanged from one iterative step to the next, but this is not the case here. Hence the trial solution is not the true solution, and the step must be repeated, this time using the values in (77) as the trial solution. The result follows.

x = (1/5)(242 + 10.666 - 3.332) = 49.866
y = (1/3)(32 + 48.400 + 1.666) = 27.355    (78)
z = -(1/3)(5 - 96.800 + 10.666) = 27.044

The next step is to use the values in (78) as the trial solution in the same manner. The results of the next sixteen steps are listed below.

x = 64.688   y = 18.274   z = 22.459
x = 61.038   y = 24.743   z = 35.367
x = 67.495   y = 19.223   z = 30.777
x = 64.555   y = 22.906   z = 36.922
x = 67.750   y = 19.877   z = 33.734
x = 65.869   y = 22.005   z = 36.874
x = 67.550   y = 20.331   z = 34.911
x = 66.430   y = 21.546   z = 36.589
x = 67.344   y = 20.612   z = 35.438
x = 66.697   y = 21.302   z = 36.358
x = 67.203   y = 20.779   z = 35.698
x = 66.835   y = 21.168   z = 36.209
x = 67.112   y = 20.875   z = 35.834
x = 66.908   y = 21.092   z = 36.116
x = 67.064   y = 20.930   z = 35.908
x = 66.949   y = 21.052   z = 36.066    (79)

The reader will note that the values in (79) are slowly converging to

x = 67, y = 21, z = 36.    (80)

The Gauss-Seidel Method

The Gauss-Seidel Method differs from the Jacobi Method in one respect. The method is begun by choosing the trial solution (76), and a new value is found for x as in (77). However, instead of using this trial solution to find a new value for y, the trial solution

x = 48.400, y = z = 0    (81)

is used. Hence

y = (1/3)(32 + 48.400 - 0) = 26.800.    (82)

Now the trial solution

x = 48.400, y = 26.800, z = 0    (83)

is used to find a new value for z in (75). Hence

z = -(1/3)(5 - 96.800 + 26.800) = 21.666.    (84)

The reader should note that the difference between the two methods is that the Gauss-Seidel Method uses the most recently found values in its trial solution, as opposed to the Jacobi Method, which uses the same trial solution to find new values for all the variables. The results of the next six steps of the Gauss-Seidel Method are listed below.

x = 62.426   y = 24.253   z = 31.866
x = 65.997   y = 22.043   z = 34.983
x = 66.801   y = 21.272   z = 35.776
x = 66.964   y = 21.062   z = 35.955
x = 66.994   y = 21.013   z = 35.991
x = 66.999   y = 21.002   z = 35.998    (85)

Just as in (79), the values in (85) converge to (80), but (85) converges much faster.
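Both routines are short enough to state in full. The Python sketch below is illustrative (function names and data layout are ours); both routines assume, as the example suggests, that the iteration converges for the system given, which is not true of every system.

```python
def jacobi(a, b, sweeps=60):
    """The Jacobi Method: every new value is computed from the previous
    trial solution, as in steps (77) and (78)."""
    n = len(b)
    x = [0.0] * n                      # the trial solution (76)
    for _ in range(sweeps):
        x = [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
             for i in range(n)]
    return x

def gauss_seidel(a, b, sweeps=60):
    """The Gauss-Seidel Method: each new value is used immediately,
    as in steps (81) through (84)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            x[i] = (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
    return x

system = ([[-5, 1, 2], [1, -3, -1], [2, -1, -3]], [-242, -32, 5])
print(jacobi(*system))        # converges slowly toward [67, 21, 36]
print(gauss_seidel(*system))  # converges to the same values faster
```

The first Jacobi sweep reproduces (77), and the first Gauss-Seidel sweep reproduces (81) through (84); the faster convergence of the latter, noted above, is visible after only a few sweeps.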
It should be noted that either no solution exists or there are an infinite number of solutions, depending on whether the planes coincide or are parallel. Consider the system z == 16 x + 3y 2z 2x + 6y == 8 y + 26z = 32 • 7x - (86) The determinant of the coefficient matrix is IAI = 1 3 -1 2 6 -2 7 -1 26 == 0 Of course a glance at system (86) will tell the reader that the first two equations describe parallel planes. -- Hence, no solution 39 -- exists. But this fact raises a question. I I = 0, what happens when I AI is when A coefficients)? If no solution exists small (in relation to the Consider the system x + y = 1 (88) 1.001x + y = 2 • Obviously the solution to this system is x = 1000, y = -999. Also note that A= 1 1 1.001 1 = -0.001 (89) which is small in relation to the coefficients. Suppose the coefficient of x in the second equation is changed .001 in the negative direction, a change of less t.han .1% of the previous value. S,ystem (88) becomes x+y=l (90) x+y=2 which has no solution. But suppose the same coefficient is changed .002. in the negative direction. S,ystem (88) becomes x + .999x + y =1 y = 2 which has the solution x = -1000, y = 1001. A system such as (88) is known as an ill-conditi.oned· system of equations. Obvious~ it is virtually senseless to carry out an involved method of solution to such a system, as a small - error in a coefficient will yield a large error in the solution. 40 Choosing ~ Method. The choice of a method for finding the solution to a system of linear equations depends on bro things -- the type of solution that is needed and the calculation involved in arriving at the solution. The Graphic Method of finding a solution yields only an approximate solution that depends on the precision of the graph and the accuracy to shich the graph may be read. The Gauss Elimination Process along with the method of "back substitution" will yield an exact fractional solution to a system or a decimal approximation as accurate as desired. The computation involved is quite excessive, however, but is a "necessary evil" that accompanies the accuracy. The matrix method of triangularization yields the same accuracy and involves the same comput.ation, but has the advantage of not involving the repeated writing of the variables. The simplification of the notation makes the method somewhat faster. The Gauss-Jordan Elimination Process does not require the method of "back substitution", and is somewhat faster. The Crout Method of Elimination can be used with either matrices or directly on the system. made as accurate as desired. The solution obtained can be The summation equations can be programmed for a computer or can be used with a desk calculator, a property that makes it considerably faster than the Gauss Elimi- - nation method or the Gauss-Jordan Elimination method. 41 In general, inverse methods are seldom used (in hand cal- culations) because of the difficulty involved in the calculation of the inverse matrix. If a system has a matrix such that the inverse is easily calculated, the solution to the system will be more easily found by a method other than the inverse method. Determinant Methods yield a very accurate solution, but the computation involved in solving a determinant is excessive. These methods are no more accurate than some of the above methods, and hence, little is gained by carrying out the calculation. Since the Crout Method has the advantage of being used with a computer or desk calculator, it is a good substitute for the Determina.nt Methods. 
Choosing a Method

The choice of a method for finding the solution to a system of linear equations depends on two things -- the type of solution that is needed and the calculation involved in arriving at the solution.

The Graphic Method of finding a solution yields only an approximate solution that depends on the precision of the graph and the accuracy to which the graph may be read.

The Gauss Elimination Process, along with the method of "back substitution," will yield an exact fractional solution to a system or a decimal approximation as accurate as desired. The computation involved is quite excessive, however, but is a "necessary evil" that accompanies the accuracy. The matrix method of triangularization yields the same accuracy and involves the same computation, but has the advantage of not involving the repeated writing of the variables. The simplification of the notation makes the method somewhat faster. The Gauss-Jordan Elimination Process does not require the method of "back substitution," and is somewhat faster.

The Crout Method of Elimination can be used either with matrices or directly on the system. The solution obtained can be made as accurate as desired. The summation equations can be programmed for a computer or can be used with a desk calculator, a property that makes it considerably faster than the Gauss Elimination method or the Gauss-Jordan Elimination method.

In general, inverse methods are seldom used (in hand calculations) because of the difficulty involved in the calculation of the inverse matrix. If a system has a matrix such that the inverse is easily calculated, the solution to the system will be more easily found by a method other than the inverse method.

Determinant methods yield a very accurate solution, but the computation involved in evaluating a determinant is excessive. These methods are no more accurate than some of the above methods, and hence, little is gained by carrying out the calculation. Since the Crout Method has the advantage of being usable with a computer or desk calculator, it is a good substitute for the determinant methods.

Relaxation is a good method to use if extreme accuracy is not desired. A solution to one or two decimal places can be found relatively easily. However, this method is not well suited to finding an accurate solution because of the excessive amount of computation needed. The same can be said for the Jacobi and Gauss-Seidel methods of iteration. Accurate solutions are not easy to obtain, but rough approximations are. The Gauss-Seidel method is preferable because it converges more rapidly.

Many times a system of equations will have a coefficient matrix that is symmetric. There are methods especially designed for this situation. The Doolittle Process, which is similar to the Crout Method, and the Square-Root Method of Banachiewicz and Dwyer are two such methods. For the details of these methods, see Kaiser S. Kunz, Numerical Analysis (New York: McGraw-Hill Book Company, Inc., 1957).

As can be seen, the process of finding the solution to a system of linear equations is quite involved, and the method chosen for finding the solution should depend on the type of solution desired and the nature of the coefficient matrix.

APPENDIX I

MATRICES

Definitions:

1. A matrix (denoted by a capital letter, e.g., A) is a rectangular array of numbers of some algebraic system. The elements of A are denoted by (a_rs).

2. If matrix A has n rows and m columns, the dimension of A is n x m.

3. Two matrices are equal if and only if they have the same dimension and they are element-wise equal, i.e., A = B if and only if (a_rs) = (b_rs).

4. The sum of two matrices with the same dimension is the matrix formed by the sum of the corresponding elements of the two matrices, i.e., A + B = (a_rs + b_rs).

5. The product of two matrices A and B of dimension n x m and m x p, respectively, is given by AB = (\sum_{k=1}^{m} a_{rk} b_{ks}), where AB is of dimension n x p.

6. The product of a scalar c and a matrix A is given by cA = (c a_rs).

7. The transpose of a matrix A = (a_rs) is A' = (a_sr).

8. The adjoint of a matrix A is the transpose of the cofactor matrix: adj A = C' = (c_sr). See Appendix II for the definition of a cofactor.
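Definitions 5 and 7 translate directly into short routines, which may help fix the index conventions. The following Python sketch (illustrative names only; not part of the appendix proper) computes the product and the transpose exactly as defined above.

```python
def mat_mult(a, b):
    """Definition 5: element (r, s) of AB is the sum over k of a_rk * b_ks."""
    return [[sum(a[r][k] * b[k][s] for k in range(len(b)))
             for s in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    """Definition 7: A' = (a_sr)."""
    return [list(col) for col in zip(*a)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(mat_mult(A, B))   # -> [[19, 22], [43, 50]]
print(transpose(A))     # -> [[1, 3], [2, 4]]
```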
APPENDIX II

DETERMINANTS

Definitions:

1. A determinant is a number associated with any square matrix. That number is from the same system as the elements of the matrix.

2. The principal minor m_rs is the determinant of the matrix formed by deleting row r and column s from the original matrix.

3. The cofactor c_rs of the element a_rs is given by c_rs = (-1)^{r+s} m_rs.

Properties of Determinants:

1. If any two rows (or columns) of a determinant are interchanged, the determinant changes sign.

2. If the corresponding elements of two rows (or columns) of a determinant are identical, the determinant is zero.

3. If the rows are written as columns, and the columns as rows, the value of the determinant is unchanged.

4. If every element of any row (or column) contains the same factor, the determinant contains that factor.

5. If every element in any row (or column) is resolved into the sum of two other elements, the determinant may be resolved into the sum of two other determinants.

6. If the elements of any row (or column) are proportional to the corresponding elements of any other row (or column), the determinant is zero.

7. If the elements of one row (or column) are equal to the sum of fixed multiples of the corresponding elements of the other rows (or columns), the determinant is zero.

8. If to the elements of any row (or column) there is added a fixed multiple of the corresponding elements of another row (or column), the determinant is unchanged.

Elementary Row Operations:

1. Any two rows may be interchanged.

2. The elements of any row may be multiplied by any nonzero scalar.

3. A multiple of any row may be added to any other row.

LIST OF REFERENCES

Allen, D. N. de G. Relaxation Methods. New York: McGraw-Hill Book Company, Inc., 1954. 257 pages.

Fuller, Leonard E. Basic Matrix Theory. Englewood Cliffs: Prentice-Hall, Inc., 1962. 245 pages.

Kunz, Kaiser S. Numerical Analysis. New York: McGraw-Hill Book Company, Inc., 1957. 381 pages.

Macon, Nathaniel. Numerical Analysis. New York: John Wiley & Sons, Inc., 1963. 161 pages.

Redish, K. A. An Introduction to Computational Methods. New York: John Wiley & Sons, Inc., 1961. 211 pages.

Shaw, Frederick S. Relaxation Methods. New York: Dover Publications, Inc., 1953. 395 pages.

Stanton, Ralph G. Numerical Methods for Science and Engineering. Englewood Cliffs: Prentice-Hall, Inc., 1961. 266 pages.

Stoll, Robert R. Linear Algebra and Matrix Theory. New York: McGraw-Hill Book Company, Inc., 1952. 272 pages.