Section 6a: Error Analysis and Norms I

Our interest in this section will be to analyze the propagation of error from observed values to computed values. This is somewhat separable from the analysis of roundoff or truncation error. Let us begin by considering the matrix equation $Av = w$, where $A$ is observed and supposed to be invertible, one of $v$ and $w$ is observed, and the other is to be computed from the equation. We set

$$A = A_0 + \Delta A, \qquad v = v_0 + \Delta v, \qquad w = w_0 + \Delta w.$$

Here, the subscripted symbol denotes the observed or computed value, the unsubscripted symbol denotes the "true" value, and the symbol preceded by $\Delta$ denotes an error whose precise value is unknown, but can be bounded in some sense. What we would like to do is to deduce bounds on the error of the computed value from known bounds on the errors of the observed values.

The case in which $v$ is measured and $w$ is computed is much easier, and we will deal with it first. We have the obvious identities

$$w = Av = (A_0 + \Delta A)(v_0 + \Delta v) = A_0 v_0 + A_0 \Delta v + \Delta A\, v_0 + \Delta A\, \Delta v = w_0 + A_0 \Delta v + \Delta A\, v_0 + \Delta A\, \Delta v,$$

so that

$$\Delta w = A_0 \Delta v + \Delta A\, v_0 + \Delta A\, \Delta v.$$

The object of the game here is to attach numerical measures of length to all these objects that behave somewhat like the absolute values of numbers and, therefore, allow us to make similar deductions. Specifically, we want to attach a non-negative, real-valued "norm", written $\|\cdot\|$, to each vector and matrix that will satisfy the following conditions for all vectors and matrices. We will write $\alpha$ for a generic scalar.

Axioms for Vector and Matrix Norms

$$\|v\| = 0 \text{ if and only if } v = 0; \qquad \|A\| = 0 \text{ if and only if } A = 0$$
$$\|v + w\| \le \|v\| + \|w\|; \qquad \|A + B\| \le \|A\| + \|B\|$$
$$\|\alpha v\| = |\alpha|\, \|v\|; \qquad \|\alpha A\| = |\alpha|\, \|A\|$$
$$\|Av\| \le \|A\|\, \|v\|; \qquad \|AB\| \le \|A\|\, \|B\|$$

Notice that the analogy to the properties of the absolute value of a number breaks down only in the last condition; for ordinary numbers (or even complex numbers) the product of the absolute values is the absolute value of the product.
For vectors and matrices, this would be incompatible with the first condition, since it is perfectly possible for a matrix product to be zero even if neither factor is zero.

Any function of vectors and matrices taking non-negative values and satisfying the axioms above is called a norm. In particular, these conditions are satisfied if we set

$$\|v\| = \sqrt{v \cdot v}; \qquad \|A\| = \max_{v \ne 0} \frac{\|Av\|}{\|v\|}.$$

This norm is called the 2-norm, written $\|\cdot\|_2$ where there is the possibility of confusion with other norms. Indeed, if we take any norm for vectors that satisfies those of the axioms above that involve only vectors, then we can define a matrix norm in terms of the vector norm, exactly as we have in the case of the 2-norm. The two most important norms other than the 2-norm are the 1-norm and the $\infty$-norm, defined respectively by

$$\|v\|_1 = \sum_i |v_i|; \qquad \|v\|_\infty = \max_i |v_i|.$$

In each case, we define the matrix norm in terms of the vector norm, as we did for the 2-norm. The matrix norms, so defined, can be determined directly from the matrices because of the following proposition.

Proposition 6a.1:
  o $\|A\|_2$ is the largest singular value of $A$.
  o $\|A\|_1$ is the largest 1-norm of any column of $A$.
  o $\|A\|_\infty$ is the largest 1-norm of any row of $A$.

See page 15 of Helzer's notes for a partial proof.

Returning to the discussion that began this section, we may deduce for any norm that

$$\|\Delta w\| \le \|A_0\|\, \|\Delta v\| + \|\Delta A\|\, \|v_0\| + \|\Delta A\|\, \|\Delta v\|.$$

However, in order for this to be useful, we must also be able to relate the norm of a vector or matrix to the absolute values of its components or entries. Before addressing this, we adopt some additional notation. $|v|$ and $|A|$ denote respectively the vector whose components are the absolute values of the corresponding components of $v$ and the matrix whose entries are the absolute values of the corresponding entries of $A$. We emphasize that $|v|$ and $|A|$ are respectively a vector and a matrix, while $\|v\|$ and $\|A\|$ are non-negative real numbers.
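The bound on $\|\Delta w\|$ and the row and column characterizations in Proposition 6a.1 can be tried out on a small case. The following is only a sketch in plain Python: the matrices `A0`, `dA` and vectors `v0`, `dv` are made-up illustrative values, not data from these notes, and the helper names are hypothetical. It uses the $\infty$-norm throughout and compares the actual size of $\Delta w$ with the bound.

```python
def mat_vec(A, v):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vec_norm_inf(v):
    """Infinity-norm of a vector: largest absolute component."""
    return max(abs(x) for x in v)

def mat_norm_1(A):
    """Matrix 1-norm: largest 1-norm of any column (Proposition 6a.1)."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def mat_norm_inf(A):
    """Matrix infinity-norm: largest 1-norm of any row (Proposition 6a.1)."""
    return max(sum(abs(a) for a in row) for row in A)

# Illustrative (assumed) observed values and errors, not taken from the notes.
A0 = [[2.0, 1.0], [1.0, 3.0]]
v0 = [1.0, 2.0]
dA = [[0.01, -0.02], [0.0, 0.01]]
dv = [0.05, -0.05]

# "True" values A = A0 + dA and v = v0 + dv.
A = [[a + d for a, d in zip(r0, rd)] for r0, rd in zip(A0, dA)]
v = [x + d for x, d in zip(v0, dv)]

# Actual error in the computed w.
w0 = mat_vec(A0, v0)
w = mat_vec(A, v)
dw = [x - y for x, y in zip(w, w0)]
actual = vec_norm_inf(dw)

# Bound ||A0|| ||dv|| + ||dA|| ||v0|| + ||dA|| ||dv|| in the infinity-norm.
bound = (mat_norm_inf(A0) * vec_norm_inf(dv)
         + mat_norm_inf(dA) * vec_norm_inf(v0)
         + mat_norm_inf(dA) * vec_norm_inf(dv))

print("actual error:", actual)
print("bound:       ", bound)
```

In practice only bounds on the entries of the errors are known, not the errors themselves, so the norms of `dA` and `dv` would be replaced by upper estimates; the comparison here is just a check that the inequality behaves as stated.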
Moreover, while, as we shall see, there are many different vector and matrix norms, the absolute value of a vector or matrix is defined once and for all.

The matrix and vector inequalities $v \le w$ and $A \le B$ are interpreted as asserting that each component or entry on the left is less than or equal to the corresponding component or entry on the right. We can make similar interpretations of $v < w$ and $A < B$, but it is important to observe that the usual trichotomy law does not hold. Furthermore, it does not follow from $v \le w$ that either $v < w$ or $v = w$. (Consider the example $v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$; $w = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$.) On the other hand, it is still correct to deduce $v = w$ from $v \le w$ and $w \le v$, since the deduction can be carried out at every component separately. All the remarks in this paragraph apply equally well to inequalities involving matrices.

The most useful properties connecting the norm of a vector or matrix with the absolute values of its entries are:

Additional Properties of Norms
  o $|A| \le |B| \Rightarrow \|A\| \le \|B\|$; $\quad |v| \le |w| \Rightarrow \|v\| \le \|w\|$.
  o The norm of a vector or matrix is greater than or equal to the absolute value of any of its components or entries.

These properties are called respectively Property I and Property II in Helzer's notes. They are separate from the axioms for a norm, but they hold for all the norms we have defined, with the single exception that Property I does not hold for the matrix 2-norm. The following proposition summarizes some obvious consequences of Property I.

Proposition 6a.2: If $v$ is a vector in $\mathbb{R}^n$ and no component of $v$ has absolute value larger than $\varepsilon$, then
  o $\|v\|_1 \le n\varepsilon$
  o $\|v\|_2 \le \sqrt{n}\,\varepsilon$
  o $\|v\|_\infty \le \varepsilon$
If $A$ is an $m \times n$ matrix and no entry of $A$ has absolute value greater than $\varepsilon$, then
  o $\|A\|_1 \le m\varepsilon$
  o $\|A\|_\infty \le n\varepsilon$

In order to say anything about the 2-norm of a matrix whose entries are bounded, we need to establish some relations among the various norms we have defined. Proposition 6a.3 summarizes these, and provides the missing bound on the 2-norm of a matrix whose entries are bounded.
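The exception noted above, that Property I fails for the matrix 2-norm, can be seen in a small example. The sketch below is plain Python and is restricted to real $2 \times 2$ matrices, where the largest singular value can be written in closed form as the square root of the largest eigenvalue of $M^T M$. The matrices `A` and `B` are chosen here purely for illustration, and the helper names are hypothetical.

```python
import math

def spectral_norm_2x2(M):
    """2-norm of a real 2x2 matrix: sqrt of the largest eigenvalue of M^T M."""
    (a, b), (c, d) = M
    # Entries of the symmetric matrix M^T M = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam_max = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(lam_max)

def entrywise_abs(M):
    """The matrix |M| of absolute values of the entries of M."""
    return [[abs(x) for x in row] for row in M]

def entrywise_leq(M, N):
    """True if every entry of M is <= the corresponding entry of N."""
    return all(x <= y for rm, rn in zip(M, N) for x, y in zip(rm, rn))

# Illustrative matrices (an assumption of this sketch, not from the notes).
A = [[1.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, -1.0]]

# |A| <= |B| holds entrywise (in fact |A| = |B|) ...
print(entrywise_leq(entrywise_abs(A), entrywise_abs(B)))
# ... yet the 2-norm of A (equal to 2) exceeds that of B (equal to sqrt(2)).
print(spectral_norm_2x2(A), spectral_norm_2x2(B))
```

The same pair satisfies Property I for the 1-norm and the $\infty$-norm, since both matrices have every row sum and column sum of absolute values equal to 2.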
Proposition 6a.3: If $v$ is a vector in $\mathbb{R}^n$, then

$$\|v\|_\infty \le \|v\|_2 \le \|v\|_1 \le \sqrt{n}\, \|v\|_2 \le n\, \|v\|_\infty.$$

If $A$ is an $m \times n$ matrix, then

$$\|A\|_1 \le \sqrt{m}\, \|A\|_2 \le m\, \|A\|_\infty \le m\sqrt{n}\, \|A\|_2 \le mn\, \|A\|_1.$$

If $A$ is an $m \times n$ matrix and no entry of $A$ has absolute value greater than $\varepsilon$, then

$$\|A\|_2 \le \varepsilon \min(m\sqrt{n},\, n\sqrt{m}).$$

Proof: The first assertion is Proposition 1.15 on page 10 of Helzer's notes, and is proved there. The second assertion follows easily from the first, and the third from the second together with Proposition 6a.2. Note that the second assertion is a generalization of Proposition 1.18 on page 13 of Helzer's notes.

We turn now to the more difficult problem of analyzing the error if $v$ is computed from $A$ and $w$ via the matrix equation $Av = w$. This problem is more subtle than the one we have already considered, primarily because if $A$ is close to being singular, then a small error in $w$ may cause a large error in $v$ and, worse yet, a small error in $A$ may cause $A^{-1}$ not to exist.

We start by observing that, for any vector norm $\|\cdot\|$, if $A$ is invertible, we have

$$\|A^{-1}\| = \max_{v \ne 0} \frac{\|v\|}{\|Av\|}.$$

In the case of the 2-norm, it is clear that $\|A^{-1}\|_2$ is the inverse of the smallest singular value of $A$. There is no correspondingly transparent characterization of the 1-norm or the $\infty$-norm of the inverse of a matrix; to compute them, one must generally perform the inversion. The following proposition gives conditions that both guarantee that $A$ will be invertible and provide an error bound for $A^{-1}$.

Proposition 6a.4: If $A = A_0 + \Delta A$, $A_0$ is invertible and, for some norm, $\|\Delta A\|\, \|A_0^{-1}\| < 1$, then $A$ is invertible and

$$\|A^{-1} - A_0^{-1}\| \le \frac{\|A_0^{-1}\|^2\, \|\Delta A\|}{1 - \|A_0^{-1}\|\, \|\Delta A\|}.$$

Proof: We write $A = A_0 (I + A_0^{-1} \Delta A)$, so that the product on the right is invertible provided both factors are. We have assumed that $A_0$ is invertible. We invert $I + A_0^{-1} \Delta A$ by forming the matrix power series

$$I - A_0^{-1} \Delta A + (A_0^{-1} \Delta A)^2 - (A_0^{-1} \Delta A)^3 + \cdots.$$

The condition $\|\Delta A\|\, \|A_0^{-1}\| < 1$ now both forces the power series to converge to an inverse for $I + A_0^{-1} \Delta A$ and provides the bound for $\|A^{-1} - A_0^{-1}\|$: multiplying the series on the right by $A_0^{-1}$ gives $A^{-1}$, the terms after the first give $A^{-1} - A_0^{-1}$, and their norms are dominated by the geometric series $\|A_0^{-1}\| \sum_{k \ge 1} \left( \|A_0^{-1}\|\, \|\Delta A\| \right)^k$.

Problems:

1. Consider the matrix equation $Av = w$ where $A_0 = \begin{pmatrix} 1 & 1 & 2 \\ 2 & 3 & 4 \end{pmatrix}$ and $v_0 = \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}$, and suppose that no component of $\Delta v$ or $\Delta A$ has absolute value greater than .05.
   a. Obtain bounds for $\|\Delta w\|_1$, $\|\Delta w\|_2$ and $\|\Delta w\|_\infty$, using Propositions 6a.1 to 6a.3.
   b. Is any of the bounds in part a. a consequence of the others?
   c. Taking all three bounds together, sketch the smallest region in $\mathbb{R}^2$ that is certain to contain $w$.

2. Suppose $A = A_0 + \Delta A$, $A_0 = \begin{pmatrix} 3 & 2 \\ 7 & 5 \end{pmatrix}$ and no component of $\Delta A$ has absolute value greater than .03. Use Proposition 6a.4 to determine which of the 1-norm, the 2-norm, and the $\infty$-norm gives the best bound for the entries of $A^{-1} - A_0^{-1}$.

3. Let $A$, $A_0$ and $\Delta A$ be as in Problem 2. Let $w = w_0 + \Delta w$ with $w_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and assume that no component of $\Delta w$ has absolute value greater than .02. Use the results of Problem 2, and obtain the smallest range of possible values you can for $v$, where $Av = w$.
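As a numerical companion to Proposition 6a.4, the following sketch checks the hypothesis $\|\Delta A\|\, \|A_0^{-1}\| < 1$ and compares the actual size of $A^{-1} - A_0^{-1}$ with the bound that the power series in the proof yields, assumed here to be $\|A_0^{-1}\|^2 \|\Delta A\| / (1 - \|A_0^{-1}\|\, \|\Delta A\|)$. It is plain Python, limited to $2 \times 2$ matrices and the $\infty$-norm; the matrices `A0` and `dA` are made-up values chosen for illustration (deliberately not those of the problems), and the helper names are hypothetical.

```python
def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_norm_inf(M):
    """Matrix infinity-norm: largest 1-norm of any row."""
    return max(sum(abs(x) for x in row) for row in M)

def mat_add(M, N):
    return [[x + y for x, y in zip(rm, rn)] for rm, rn in zip(M, N)]

def mat_sub(M, N):
    return [[x - y for x, y in zip(rm, rn)] for rm, rn in zip(M, N)]

# Illustrative (assumed) observed matrix and error, not from the notes.
A0 = [[4.0, 1.0], [2.0, 3.0]]
dA = [[0.1, -0.05], [0.02, 0.1]]

A0_inv = inverse_2x2(A0)

# Hypothesis of the proposition: x = ||dA|| * ||A0^{-1}|| must be below 1.
x = mat_norm_inf(dA) * mat_norm_inf(A0_inv)
if x >= 1:
    raise ValueError("hypothesis ||dA|| ||A0^-1|| < 1 fails; no bound available")

# Bound ||A0^{-1}|| * x / (1 - x), i.e. ||A0^{-1}||^2 ||dA|| / (1 - ||A0^{-1}|| ||dA||).
bound = mat_norm_inf(A0_inv) * x / (1 - x)

# Actual difference of the inverses for this particular dA.
A_inv = inverse_2x2(mat_add(A0, dA))
actual = mat_norm_inf(mat_sub(A_inv, A0_inv))

print("x      =", x)
print("actual =", actual)
print("bound  =", bound)
```

When only entrywise bounds on $\Delta A$ are known, as in the problems, `mat_norm_inf(dA)` would be replaced by the estimate from Proposition 6a.2; by Property II, a bound on the norm of $A^{-1} - A_0^{-1}$ is also a bound on each of its entries.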