Lecture XXVII: Orthonormal Bases and Projections

Suppose that a set of vectors \{x_1, \dots, x_r\} forms a basis for some subspace S of R^m, where r \leq m. For mathematical simplicity, we may want to form an orthonormal basis for this space. One way to form such a basis is Gram-Schmidt orthonormalization. In this procedure, we generate a new set of vectors \{y_1, \dots, y_r\} that are mutually orthogonal and then normalize them to unit length. The Gram-Schmidt process is:

y_1 = x_1

y_2 = x_2 - \frac{x_2' y_1}{y_1' y_1} y_1

y_3 = x_3 - \frac{x_3' y_1}{y_1' y_1} y_1 - \frac{x_3' y_2}{y_2' y_2} y_2

z_i = \frac{y_i}{(y_i' y_i)^{1/2}}

Example. Let x_1 = (1, 3, 4)' and x_2 = (9, 7, 16)'. Then y_1 = x_1 and

y_2 = x_2 - \frac{x_2' y_1}{y_1' y_1} y_1 = \begin{bmatrix} 9 \\ 7 \\ 16 \end{bmatrix} - \frac{94}{26} \begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix} = \frac{1}{13} \begin{bmatrix} 70 \\ -50 \\ 20 \end{bmatrix}.

The vectors can then be normalized to unit length. To test for orthogonality:

y_2' y_1 = \frac{1}{13}\left(70 \cdot 1 - 50 \cdot 3 + 20 \cdot 4\right) = 0.

Theorem 2.13. Every r-dimensional vector space, except the zero-dimensional space \{0\}, has an orthonormal basis.

Theorem 2.14. Let \{z_1, \dots, z_r\} be an orthonormal basis for some vector subspace S of R^m. Then each x in R^m can be expressed uniquely as x = u + v, where u is in S and v is a vector that is orthogonal to every vector in S.

Definition 2.10. Let S be a vector subspace of R^m. The orthogonal complement of S, denoted S^{\perp}, is the collection of all vectors in R^m that are orthogonal to every vector in S; that is, S^{\perp} = \{x : x \in R^m \text{ and } x'y = 0 \text{ for all } y \in S\}.

Theorem 2.15. If S is a vector subspace of R^m, then its orthogonal complement S^{\perp} is also a vector subspace of R^m.

Projection Matrices

The orthogonal projection of an m x 1 vector x onto a vector space S can be expressed in matrix form. Let \{z_1, \dots, z_r\} be any orthonormal basis for S while \{z_1, \dots, z_m\} is an orthonormal basis for R^m. Any vector x can be written as

x = \alpha_1 z_1 + \dots + \alpha_r z_r + \alpha_{r+1} z_{r+1} + \dots + \alpha_m z_m = u + v.

Aggregating the coefficients as \alpha = (\alpha_1', \alpha_2')', where \alpha_1 = (\alpha_1, \dots, \alpha_r)' and \alpha_2 = (\alpha_{r+1}, \dots, \alpha_m)', and partitioning Z = [Z_1 \; Z_2] in the same way, the vector x can be written as

x = Z\alpha = Z_1 \alpha_1 + Z_2 \alpha_2, \qquad u = Z_1 \alpha_1, \qquad v = Z_2 \alpha_2.

Given orthogonality, we know that Z_1' Z_1 = I_r and Z_1' Z_2 = 0, and so

Z_1 Z_1' x = Z_1 Z_1' [Z_1 \; Z_2] \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = Z_1 [I_r \; 0] \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} = Z_1 \alpha_1 = u.

Theorem 2.17. Suppose the columns of the m x r matrix Z_1 form an orthonormal basis for the vector space S, which is a subspace of R^m. If x is in R^m, the orthogonal projection of x onto S is given by Z_1 Z_1' x.

Projection matrices allow the division of the space into a spanned space and a set of orthogonal deviations from the spanning set. One such separation involves the Gram-Schmidt system. In general, if we define the m x r matrix X_1 = (x_1, \dots, x_r) and let A be the linear transformation of this matrix that produces an orthonormal basis, so that

Z_1 = X_1 A,

we are left with the result that

Z_1' Z_1 = A' X_1' X_1 A = I_r.

Because the matrix A is nonsingular, this implies X_1' X_1 = (AA')^{-1}, so the projection matrix that maps any vector x onto the spanning set becomes

P_S = Z_1 Z_1' = X_1 A A' X_1' = X_1 (X_1' X_1)^{-1} X_1'.

Ordinary least squares is also a spanning decomposition. In the traditional linear model we have y = Xb with fitted values \hat{y} = X\hat{b}, where b is chosen to minimize the squared error between y and the fitted values:

(y - Xb)'(y - Xb).

This problem amounts to minimizing the distance between the observed y and the plane spanned by the columns of X, which implies orthogonality of the residual to that plane. If X has full column rank, the projection matrix becomes X(X'X)^{-1}X' and the projection of y is

Xb = X(X'X)^{-1}X'y.

Premultiplying each side by X' yields

X'Xb = X'X(X'X)^{-1}X'y

b = (X'X)^{-1}X'X(X'X)^{-1}X'y

b = (X'X)^{-1}X'y.

¹ Idempotent matrices can be defined as any matrix A such that AA = A.
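As a concrete check on the algebra above, here is a minimal NumPy sketch (not part of the original notes). It reruns the Gram-Schmidt example, builds the projection matrix P_S both from the orthonormal basis Z_1 and directly as X_1(X_1'X_1)^{-1}X_1', and recovers the OLS coefficients as a projection. The response vector y used in the OLS step is an arbitrary illustrative choice, not a value from the lecture.

```python
import numpy as np

# Gram-Schmidt on the example vectors x1 = (1, 3, 4)' and x2 = (9, 7, 16)'
x1 = np.array([1.0, 3.0, 4.0])
x2 = np.array([9.0, 7.0, 16.0])

y1 = x1
y2 = x2 - (x2 @ y1) / (y1 @ y1) * y1            # (70/13, -50/13, 20/13)
print(y2 * 13)                                   # [ 70. -50.  20.]
print(y1 @ y2)                                   # ~0, confirming orthogonality

# Normalize to obtain the orthonormal basis Z1 and form the projection P_S = Z1 Z1'
Z1 = np.column_stack([y1 / np.linalg.norm(y1), y2 / np.linalg.norm(y2)])
P_S = Z1 @ Z1.T

# The same projection built from the original (non-orthonormal) basis: X1 (X1'X1)^{-1} X1'
X1 = np.column_stack([x1, x2])
P_alt = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
print(np.allclose(P_S, P_alt))                   # True

# OLS as a projection: b = (X'X)^{-1} X'y and yhat = P_S y  (y is purely illustrative)
y = np.array([2.0, -1.0, 3.0])
b = np.linalg.solve(X1.T @ X1, X1.T @ y)         # numerically safer than an explicit inverse
yhat = X1 @ b
print(np.allclose(yhat, P_S @ y))                # True: fitted values are the projection of y onto S
```

Using np.linalg.solve rather than forming (X'X)^{-1} explicitly is a standard numerical choice; the algebraic result is identical to the derivation above.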
Note that the sum of squared errors under the projection can be expressed as

SSE = (y - Xb)'(y - Xb)
    = [y - X(X'X)^{-1}X'y]'[y - X(X'X)^{-1}X'y]
    = [(I_n - X(X'X)^{-1}X')y]'[(I_n - X(X'X)^{-1}X')y]
    = y'(I_n - X(X'X)^{-1}X')(I_n - X(X'X)^{-1}X')y.

In general, the matrix I_n - X(X'X)^{-1}X' is referred to as an idempotent matrix; an idempotent matrix is one for which AA = A:

(I_n - X(X'X)^{-1}X')(I_n - X(X'X)^{-1}X')
    = I_n - X(X'X)^{-1}X' - X(X'X)^{-1}X' + X(X'X)^{-1}X'X(X'X)^{-1}X'
    = I_n - X(X'X)^{-1}X'.

Thus, the SSE can be expressed as

SSE = y'(I_n - X(X'X)^{-1}X')y = y'y - y'X(X'X)^{-1}X'y = v'v,

which is the sum of squares of the orthogonal errors from the regression.

Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors (or, more traditionally, latent roots and characteristic vectors) are defined by the solutions of Ax = \lambda x for a nonzero x. Mathematically, we can solve for the eigenvalues by rearranging terms:

Ax - \lambda x = 0
(A - \lambda I)x = 0.

Solving for \lambda then involves solving the characteristic equation implied by

|A - \lambda I| = 0.

Again using the matrix from the previous example,

A - \lambda I = \begin{bmatrix} 1 & 9 & 5 \\ 3 & 7 & 8 \\ 2 & 3 & 5 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1-\lambda & 9 & 5 \\ 3 & 7-\lambda & 8 \\ 2 & 3 & 5-\lambda \end{bmatrix},

so the characteristic equation is

|A - \lambda I| = -\lambda^3 + 13\lambda^2 + 14\lambda - 5 = 0.

In general, there are m roots to the characteristic equation, and some of these roots may be repeated. In the above case, the roots are real but irrational, so in practice they are found numerically. Turning to another example,

A = \begin{bmatrix} 5 & -3 & 3 \\ 4 & -2 & 3 \\ 4 & -4 & 5 \end{bmatrix}, \qquad \lambda \in \{1, 2, 5\}.

The eigenvectors are then determined from the linear dependence in the A - \lambda I matrix. Taking the last example with \lambda = 1,

A - I = \begin{bmatrix} 4 & -3 & 3 \\ 4 & -3 & 3 \\ 4 & -4 & 4 \end{bmatrix}.

Obviously, the first and second rows are linearly dependent (they are identical). The reduced system then implies that any vector with x_1 = 0 and x_2 = x_3 satisfies (A - I)x = 0, so the eigenvector associated with \lambda = 1 is proportional to (0, 1, 1)'.

Theorem 11.5.1. For any symmetric matrix A there exists an orthogonal matrix H (that is, a square matrix satisfying H'H = I) such that

H'AH = L,

where L is a diagonal matrix. The diagonal elements of L are called the characteristic roots (or eigenvalues) of A. The ith column of H is called the characteristic vector (or eigenvector) of A corresponding to the ith characteristic root of A.

The proof follows directly from the definition of eigenvalues. Letting H be the matrix with the eigenvectors of A in its columns, the eigenvalue definition gives AH = HL, and hence

H'AH = H'HL = L.

Kronecker Products

Two special matrix operations that you will encounter are the Kronecker product and the vec(.) operator. The Kronecker product multiplies each element of the first matrix by the entire second matrix:

A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix}.

The vec(.) operator then involves stacking the columns of a matrix on top of one another.
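The sketch below (again an illustration, not part of the original notes) verifies the pieces of this section numerically: the eigen-example with the sign pattern reconstructed above, the symmetric diagonalization of Theorem 11.5.1 applied to an arbitrarily chosen symmetric matrix S, the idempotency of I_n - X(X'X)^{-1}X' on simulated data, and the Kronecker product and vec(.) operator. S, X, and y here are made-up inputs used only for the demonstration.

```python
import numpy as np

# Eigen-example; the sign pattern is the reconstruction used above, chosen so the eigenvalues are {1, 2, 5}
A = np.array([[5.0, -3.0, 3.0],
              [4.0, -2.0, 3.0],
              [4.0, -4.0, 5.0]])
lam, V = np.linalg.eig(A)
print(np.sort(lam.real))                        # approximately [1. 2. 5.]

# The eigenvector for lambda = 1 is proportional to (0, 1, 1)'
x = np.array([0.0, 1.0, 1.0])
print(np.allclose(A @ x, 1.0 * x))              # True

# Theorem 11.5.1: for a symmetric matrix, an orthogonal H diagonalizes it, H'SH = L
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
L, H = np.linalg.eigh(S)                        # eigh returns orthonormal eigenvectors for symmetric input
print(np.allclose(H.T @ S @ H, np.diag(L)))     # True

# Idempotent annihilator M = I - X(X'X)^{-1}X' and SSE = y'My on simulated data
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = rng.normal(size=10)
M = np.eye(10) - X @ np.linalg.inv(X.T @ X) @ X.T
print(np.allclose(M @ M, M))                    # idempotency: MM = M
b = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose((y - X @ b) @ (y - X @ b), y @ M @ y))   # SSE matches y'My

# Kronecker product and vec (column stacking)
Amat = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
print(np.kron(Amat, B))                         # each a_ij multiplies the whole of B
print(Amat.T.reshape(-1))                       # vec(Amat) stacks columns: [1 3 2 4]
```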