Handout 6a

Section 6a: Error Analysis and Norms I
Our interest in this section will be to analyze the propagation of error from observed
values to computed values. This is somewhat separable from the analysis of roundoff or
truncation error. Let us begin by considering the matrix equation $Av = w$, where $A$ is
observed and supposed to be invertible, one of $v$ and $w$ is observed, and the other is to
be computed from the equation. We set $A = A_0 + \Delta A$, $v = v_0 + \Delta v$, $w = w_0 + \Delta w$. Here,
the subscripted symbol denotes the observed or computed value, the unsubscripted
symbol denotes the "true" value, and the symbol preceded by $\Delta$ denotes an error whose
precise value is unknown, but can be bounded in some sense. What we would like to do
is to deduce bounds on the error of the computed value from known bounds on the errors
of the observed values. The case in which v is measured and w is computed is much
easier, and we will deal with it first.
We have the obvious identities
$$w = Av = (A_0 + \Delta A)(v_0 + \Delta v) = A_0 v_0 + A_0 \Delta v + \Delta A\, v_0 + \Delta A\, \Delta v = w_0 + A_0 \Delta v + \Delta A\, v_0 + \Delta A\, \Delta v,$$
so that $\Delta w = A_0 \Delta v + \Delta A\, v_0 + \Delta A\, \Delta v$. The object of the game here is to attach numerical
measures of length to all these objects that behave somewhat like the absolute values of
numbers and therefore allow us to make similar deductions. Specifically, we want to
attach a non-negative, real-valued "norm", written $\|\cdot\|$, to each vector and matrix that
will satisfy the following conditions for all vectors and matrices. We will write $\alpha$ for a
generic scalar.
Axioms for Vector and Matrix Norms
• $\|v\| = 0$ if and only if $v = 0$; $\|A\| = 0$ if and only if $A = 0$
• $\|v + w\| \le \|v\| + \|w\|$; $\|A + B\| \le \|A\| + \|B\|$
• $\|\alpha v\| = |\alpha|\,\|v\|$; $\|\alpha A\| = |\alpha|\,\|A\|$
• $\|Av\| \le \|A\|\,\|v\|$; $\|AB\| \le \|A\|\,\|B\|$
Notice that the analogy to the properties of the absolute value of a number breaks down
only in the last condition; for ordinary numbers (or even complex numbers) the product
of the absolute values is the absolute value of the product. For vectors and matrices,
this would be incompatible with the first condition, since it is perfectly possible for a
matrix product to be zero even if neither factor is zero.
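To make the last remark concrete, here is a minimal Python sketch of two nonzero matrices whose product is zero. The matrices and the helper `matmul` are illustrative choices, not taken from the handout. By the first axiom, any norm gives $\|A\| > 0$ and $\|B\| > 0$ while $\|AB\| = 0$, so equality in the last axiom is impossible here.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Illustrative matrices: neither one is the zero matrix.
A = [[0, 1],
     [0, 0]]
B = [[1, 0],
     [0, 0]]

AB = matmul(A, B)
print(AB)  # [[0, 0], [0, 0]]
```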
Any function of vectors and matrices taking non-negative values and satisfying the
axioms above is called a norm. In particular, these conditions are satisfied if we set
$$\|v\| = \sqrt{v \cdot v}; \qquad \|A\| = \max_{v \ne 0} \frac{\|Av\|}{\|v\|}.$$
This norm is called the 2-norm, written $\|\cdot\|_2$ where
there is the possibility of confusion with other norms. Indeed, if we take any norm for
vectors that satisfies those of the axioms above that involve only vectors, then we can
define a matrix norm in terms of the vector norm, exactly as we have in the case of the
2-norm. The two most important norms other than the 2-norm are the 1-norm and the
$\infty$-norm, defined respectively by $\|v\|_1 = \sum_i |v_i|$; $\|v\|_\infty = \max_i |v_i|$. In each case, we
define the matrix norm in terms of the vector norm, as we did for the 2-norm. The matrix
norms, so defined, can be determined directly from the matrices because of the following
proposition.
Proposition 6a.1
• $\|A\|_2$ is the largest singular value of $A$.
• $\|A\|_1$ is the largest 1-norm of any column of $A$.
• $\|A\|_\infty$ is the largest 1-norm of any row of $A$.
See page 15 of Helzer's notes for a partial proof.
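The column and row characterizations in Proposition 6a.1 can be compared with the definition $\max_{v \ne 0} \|Av\| / \|v\|$ on a small example. The sketch below is only an illustration: the matrix and all helper names are made up, and it relies on the standard facts that the maximum is attained at a standard basis vector for the 1-norm and at a vector with entries $\pm 1$ for the $\infty$-norm.

```python
from itertools import product

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vec_norm_1(v):
    return sum(abs(x) for x in v)

def vec_norm_inf(v):
    return max(abs(x) for x in v)

def mat_norm_1(A):
    """Largest 1-norm of any column of A."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def mat_norm_inf(A):
    """Largest 1-norm of any row of A."""
    return max(sum(abs(x) for x in row) for row in A)

# Illustrative 2 x 3 matrix (not from the handout).
A = [[1, -2, 3],
     [-4, 5, -6]]
n = len(A[0])

# Candidate maximizers of ||Av|| / ||v||.
basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
signs = [list(s) for s in product([-1, 1], repeat=n)]

ratio_1 = max(vec_norm_1(matvec(A, e)) / vec_norm_1(e) for e in basis)
ratio_inf = max(vec_norm_inf(matvec(A, s)) / vec_norm_inf(s) for s in signs)

print(mat_norm_1(A), ratio_1)      # 9 9.0
print(mat_norm_inf(A), ratio_inf)  # 15 15.0
```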
Returning to the discussion that began this section, we may deduce for any norm that
$\|\Delta w\| \le \|A_0\|\,\|\Delta v\| + \|\Delta A\|\,\|v_0\| + \|\Delta A\|\,\|\Delta v\|$. However, in order for this to be useful,
we must also be able to relate the norm of a vector or matrix to the absolute values of its
components or entries. Before addressing this, we adopt some additional notation. | v |
and | A | denote respectively the vector whose components are the absolute values of the
corresponding components of v and the matrix whose entries are the absolute values of
the corresponding entries of A .
We emphasize that | v | and | A | are respectively a vector and a matrix, while || v || and
|| A || are non-negative real numbers. Moreover while, as we shall see, there are many
different vector and matrix norms, the absolute value of a vector or matrix is defined once
and for all.
The matrix and vector inequalities $v \le w$ and $A \le B$ are interpreted as asserting that
each component or entry on the left is less than or equal to the corresponding component
or entry on the right. We can make similar interpretations of $v < w$ and $A < B$, but it is
important to observe that the usual trichotomy law does not hold. Furthermore, it does not
follow from $v \le w$ that either $v < w$ or $v = w$. (Consider the example
$v = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$; $w = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$.)
On the other hand, it is still correct to deduce $v = w$ from $v \le w$ and $w \le v$, since the
deduction can be carried out at every component separately. All the remarks in this
paragraph apply equally well to inequalities involving matrices.
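The componentwise orderings are easy to experiment with. In the following minimal sketch the helper names `leq` and `lt` and the second pair of vectors are illustrative assumptions. The first pair is the example from the text; the second pair shows two vectors that are not comparable at all.

```python
def leq(v, w):
    """Componentwise v <= w."""
    return all(a <= b for a, b in zip(v, w))

def lt(v, w):
    """Componentwise v < w."""
    return all(a < b for a, b in zip(v, w))

# The example from the text: v <= w holds, but neither v < w nor v = w.
v = [1, 1]
w = [1, 2]
print(leq(v, w), lt(v, w), v == w)  # True False False

# An illustrative incomparable pair: neither p <= q nor q <= p.
p = [1, 2]
q = [2, 1]
print(leq(p, q), leq(q, p))         # False False
```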
The most useful properties connecting the norm of a vector or matrix with the absolute
values of its entries are:
Additional Properties of Norms
• $|A| \le |B| \Rightarrow \|A\| \le \|B\|$; $|v| \le |w| \Rightarrow \|v\| \le \|w\|$
• The norm of a vector or matrix is greater than or equal to the absolute value of
any of its components or entries.
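These properties are called respectively Property I and Property II in Helzer's notes.
They are separate from the axioms for a norm, but they hold for all the norms we have
defined with the single exception that Property I does not hold for the matrix 2-norm.
(For example, if every entry of the $2 \times 2$ matrix $A$ is 1 and $B = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, then $|A| \le |B|$ but $\|A\|_2 = 2 > \sqrt{2} = \|B\|_2$.)
The following proposition summarizes some obvious consequences of Property I.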
Proposition 6a.2:
• If $v$ is a vector in $\mathbb{R}^n$ and no component of $v$ has absolute value larger than $\epsilon$,
then
  o $\|v\|_1 \le n\epsilon$
  o $\|v\|_2 \le \sqrt{n}\,\epsilon$
  o $\|v\|_\infty \le \epsilon$
• If $A$ is an $m \times n$ matrix and no entry of $A$ has absolute value greater than $\epsilon$,
then
  o $\|A\|_1 \le m\epsilon$
  o $\|A\|_\infty \le n\epsilon$
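Combining Proposition 6a.2 with the inequality $\|\Delta w\| \le \|A_0\|\,\|\Delta v\| + \|\Delta A\|\,\|v_0\| + \|\Delta A\|\,\|\Delta v\|$ gives a computable bound on $\|\Delta w\|_\infty$. The sketch below works one case under stated assumptions: the values of `A0`, `v0` and `eps` are invented for illustration and are not those of the Problems. It compares the bound with the error produced by perturbations whose entries are all $\pm\epsilon$.

```python
from itertools import product

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vec_norm_inf(v):
    return max(abs(x) for x in v)

def mat_norm_inf(A):
    """Largest 1-norm of any row of A (Proposition 6a.1)."""
    return max(sum(abs(x) for x in row) for row in A)

# Hypothetical observed data and entrywise error bound.
A0 = [[2.0, -1.0],
      [1.0, 3.0]]
v0 = [1.0, 2.0]
eps = 0.01
m, n = len(A0), len(A0[0])

# Proposition 6a.2: ||dA||_inf <= n*eps and ||dv||_inf <= eps.
bound = (mat_norm_inf(A0) * eps
         + (n * eps) * vec_norm_inf(v0)
         + (n * eps) * eps)

# Compare with the errors produced by perturbations with entries +/- eps.
w0 = matvec(A0, v0)
worst = 0.0
for dA_flat in product([-eps, eps], repeat=m * n):
    A = [[A0[i][j] + dA_flat[i * n + j] for j in range(n)] for i in range(m)]
    for dv in product([-eps, eps], repeat=n):
        v = [v0[j] + dv[j] for j in range(n)]
        dw = [wi - w0i for wi, w0i in zip(matvec(A, v), w0)]
        worst = max(worst, vec_norm_inf(dw))

print(bound, worst)  # the sampled error stays below the bound
```

In this example the bound is not attained, because the term $\|\Delta A\|_\infty \|v_0\|_\infty$ overestimates what a perturbation with entries of size $\epsilon$ can do to $v_0$.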
In order to say anything about the 2-norm of a matrix whose entries are bounded, we
need to establish some relations among the various norms we have defined. Proposition
6a.3 summarizes these, and provides the missing bound on the 2-norm of a matrix whose
entries are bounded.
Proposition 6a.3:
• If $v$ is a vector in $\mathbb{R}^n$, then $\|v\|_\infty \le \|v\|_2 \le \|v\|_1 \le \sqrt{n}\,\|v\|_2 \le n\,\|v\|_\infty$.
• If $A$ is an $m \times n$ matrix, then $\|A\|_1 \le \sqrt{m}\,\|A\|_2 \le m\,\|A\|_\infty \le m\sqrt{n}\,\|A\|_2 \le mn\,\|A\|_1$.
• If $A$ is an $m \times n$ matrix and no entry of $A$ has absolute value greater than $\epsilon$,
then $\|A\|_2 \le \epsilon \min(m\sqrt{n}, n\sqrt{m})$.
Proof: The first assertion is Proposition 1.15 on page 10 of Helzer's notes, and is proved
there. The second assertion follows easily from the first, and the third from the second.
Note that the second assertion is a generalization of Proposition 1.18 on page 13 of
Helzer's notes.
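The chains of inequalities in Proposition 6a.3 can be spot-checked numerically. The following is a sketch on one invented vector and one invented $2 \times 3$ matrix, not a proof. To stay within the standard library, the 2-norm is computed only for matrices with two rows, as the square root of the largest eigenvalue of the $2 \times 2$ matrix $AA^T$ (which equals the largest singular value of $A$); the helper names are hypothetical.

```python
import math

def vec_norms(v):
    """Return (1-norm, 2-norm, infinity-norm) of a vector."""
    return (sum(abs(x) for x in v),
            math.sqrt(sum(x * x for x in v)),
            max(abs(x) for x in v))

def mat_norm_1(A):
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def mat_norm_inf(A):
    return max(sum(abs(x) for x in row) for row in A)

def mat_norm_2_two_rows(A):
    """2-norm of a 2 x n matrix via the largest eigenvalue of A A^T."""
    r0, r1 = A
    a = sum(x * x for x in r0)
    b = sum(x * y for x, y in zip(r0, r1))
    d = sum(y * y for y in r1)
    trace, det = a + d, a * d - b * b
    lam_max = (trace + math.sqrt(trace * trace - 4 * det)) / 2
    return math.sqrt(lam_max)

def nondecreasing(xs):
    return all(x <= y for x, y in zip(xs, xs[1:]))

# First assertion, on an illustrative vector in R^3.
v = [3.0, -4.0, 1.0]
n = len(v)
v1, v2, vinf = vec_norms(v)
vector_chain = [vinf, v2, v1, math.sqrt(n) * v2, n * vinf]

# Second assertion, on an illustrative 2 x 3 matrix.
A = [[1.0, -2.0, 3.0],
     [-4.0, 5.0, -6.0]]
m, n = len(A), len(A[0])
a1, a2, ainf = mat_norm_1(A), mat_norm_2_two_rows(A), mat_norm_inf(A)
matrix_chain = [a1, math.sqrt(m) * a2, m * ainf,
                m * math.sqrt(n) * a2, m * n * a1]

# Third assertion, with eps taken as the largest absolute entry of A.
eps = max(abs(x) for row in A for x in row)
entry_bound = eps * min(m * math.sqrt(n), n * math.sqrt(m))

print(nondecreasing(vector_chain),
      nondecreasing(matrix_chain),
      a2 <= entry_bound)  # True True True
```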
We turn now to the more difficult problem of analyzing the error if $v$ is computed from
$A$ and $w$ via the matrix equation $Av = w$. This problem is more subtle than the one we
have already considered, primarily because if $A$ is close to being singular, then a small
error in $w$ may cause a large error in $v$ and, worse yet, a small error in $A$ may cause
$A^{-1}$ not to exist.
We start by observing that, for any vector norm $\|\cdot\|$, if $A$ is invertible, we have
$$\|A^{-1}\| = \max_{v \ne 0} \frac{\|v\|}{\|Av\|}.$$
In the case of the 2-norm, it is clear that $\|A^{-1}\|_2$ is the reciprocal of
the smallest singular value of $A$. There is no correspondingly transparent
characterization of the 1-norm or the $\infty$-norm of the inverse of a matrix; to compute
them, one must generally perform the inversion. The following proposition gives
conditions that both guarantee that $A$ will be invertible and provide an error bound for
$A^{-1}$.
Proposition 6a.4:
If $A = A_0 + \Delta A$, $A_0$ is invertible, and for some norm $\|\Delta A\|\,\|A_0^{-1}\| = \delta < 1$, then $A$ is
invertible and
$$\|A^{-1} - A_0^{-1}\| \le \frac{\|A_0^{-1}\|\,\delta}{1 - \delta}.$$
Proof: We write $A = A_0 (I + A_0^{-1} \Delta A)$, so that the product on the right is invertible
provided both factors are. We have assumed that $A_0$ is invertible. We invert $I + A_0^{-1} \Delta A$
by forming the matrix power series $I - A_0^{-1} \Delta A + (A_0^{-1} \Delta A)^2 - (A_0^{-1} \Delta A)^3 + \cdots$. The
condition $\|\Delta A\|\,\|A_0^{-1}\| = \delta < 1$ now both forces the power series to converge to an
inverse for $I + A_0^{-1} \Delta A$ and provides the bound for $\|A^{-1} - A_0^{-1}\|$.
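Both parts of this argument can be checked on a small example. The sketch below uses the $\infty$-norm and invented $2 \times 2$ matrices `A0` and `dA` (they are not the matrices of the Problems). It verifies the bound of Proposition 6a.4 and confirms that a truncated power series, multiplied on the right by $A_0^{-1}$, reproduces $A^{-1}$. The cutoff of 30 terms is an arbitrary choice.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_norm_inf(A):
    """Largest 1-norm of any row of A."""
    return max(sum(abs(x) for x in row) for row in A)

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical observed matrix and error.
A0 = [[4.0, 1.0],
      [2.0, 3.0]]
dA = [[0.1, -0.2],
      [0.05, 0.1]]
A = [[A0[i][j] + dA[i][j] for j in range(2)] for i in range(2)]

A0_inv = inverse_2x2(A0)
delta = mat_norm_inf(dA) * mat_norm_inf(A0_inv)
bound = mat_norm_inf(A0_inv) * delta / (1 - delta)

A_inv = inverse_2x2(A)
diff = [[A_inv[i][j] - A0_inv[i][j] for j in range(2)] for i in range(2)]
error = mat_norm_inf(diff)

# Power series I - X + X^2 - X^3 + ... with X = A0^{-1} dA.
X = matmul(A0_inv, dA)
neg_X = [[-x for x in row] for row in X]
term = [[1.0, 0.0], [0.0, 1.0]]
series = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(30):
    term = matmul(term, neg_X)
    series = [[series[i][j] + term[i][j] for j in range(2)] for i in range(2)]

A_inv_series = matmul(series, A0_inv)
series_gap = max(abs(A_inv_series[i][j] - A_inv[i][j])
                 for i in range(2) for j in range(2))

print(delta < 1, error <= bound, series_gap < 1e-12)  # True True True
```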
Problems:
1. Consider the matrix equation $Av = w$ where $A_0 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix}$ and $v_0 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$, and
suppose that no component of $\Delta v$ or $\Delta A$ has absolute value greater than .05.
a. Obtain bounds for $\|\Delta w\|_1$, $\|\Delta w\|_2$ and $\|\Delta w\|_\infty$, using Propositions 6a.1-6a.3.
b. Is any of the bounds in part a. a consequence of the others?
c. Taking all three bounds together, sketch the smallest region in $\mathbb{R}^2$ that is
certain to contain $w$.
2. Suppose $A = A_0 + \Delta A$, $A_0 = \begin{pmatrix} 3 & 2 \\ 7 & 5 \end{pmatrix}$ and no component of $\Delta A$ has absolute
value greater than .03. Use Proposition 6a.4 to determine which of the 1-norm,
the 2-norm, and the $\infty$-norm gives the best bound for the entries of $A^{-1} - A_0^{-1}$.
3. Let $A$, $A_0$ and $\Delta A$ be as in problem 2. Let $w = w_0 + \Delta w$ with $w_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and
assume that no component of $\Delta w$ has absolute value greater than .02. Use the
results of Problem 2, and obtain the smallest range of possible values you can for
$v$, where $Av = w$.