AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 11: Orthogonal Matrices; Projectors; Linear Least Squares Problems Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 14 Outline 1 Orthogonal Vectors and Matrices 2 Projectors 3 Linear Least Squares Problems Xiangmin Jiao Numerical Analysis I 2 / 14 Orthogonal Vectors Definition A pair of vectors are orthogonal if x T y = 0. In other words, the angle between them is 90 degrees Xiangmin Jiao Numerical Analysis I 3 / 14 Orthogonal Vectors Definition A pair of vectors are orthogonal if x T y = 0. In other words, the angle between them is 90 degrees Definition Two sets of vectors X and Y are orthogonal if every x ∈ X is orthogonal to every y ∈ Y . Xiangmin Jiao Numerical Analysis I 3 / 14 Orthogonal Vectors Definition A pair of vectors are orthogonal if x T y = 0. In other words, the angle between them is 90 degrees Definition Two sets of vectors X and Y are orthogonal if every x ∈ X is orthogonal to every y ∈ Y . Definition A set of nonzero vectors S is orthogonal if they are pairwise orthogonal. They are orthonormal if it is orthogonal and in addition each vector has unit Euclidean length. Xiangmin Jiao Numerical Analysis I 3 / 14 Orthogonal Vectors Theorem The vectors in an orthogonal set S are linearly independent. Xiangmin Jiao Numerical Analysis I 4 / 14 Orthogonal Vectors Theorem The vectors in an orthogonal set S are linearly independent. Proof. Prove by contradiction. If a vector can be expressed as linear combination of the other vectors in the set, then it is orthogonal to itself. Question: If the column vectors of an m × n matrix A are orthogonal, what is the rank of A? Xiangmin Jiao Numerical Analysis I 4 / 14 Orthogonal Vectors Theorem The vectors in an orthogonal set S are linearly independent. Proof. Prove by contradiction. If a vector can be expressed as linear combination of the other vectors in the set, then it is orthogonal to itself. Question: If the column vectors of an m × n matrix A are orthogonal, what is the rank of A? Answer: n = min{m, n}. In other words, A has full rank. Xiangmin Jiao Numerical Analysis I 4 / 14 Components of Vector Given an orthonormal set {q1 , q2 , . . . , qm } forming a basis of Rm , vector Pvm can Tbe decomposed into orthogonal components as v = i=1 (qi v )qi Xiangmin Jiao Numerical Analysis I 5 / 14 Components of Vector Given an orthonormal set {q1 , q2 , . . . , qm } forming a basis of Rm , vector Pvm can Tbe decomposed into orthogonal components as v = i=1 (qi v )qi Another way to express the condition is v = Pm T i=1 (qi qi )v qi qiT is an orthogonal projection matrix. Note that it is NOT an orthogonal matrix. Xiangmin Jiao Numerical Analysis I 5 / 14 Components of Vector Given an orthonormal set {q1 , q2 , . . . , qm } forming a basis of Rm , vector Pvm can Tbe decomposed into orthogonal components as v = i=1 (qi v )qi Another way to express the condition is v = Pm T i=1 (qi qi )v qi qiT is an orthogonal projection matrix. Note that it is NOT an orthogonal matrix. More generally, given an orthonormal set {q1 , q2 , . . . , qn } with n ≤ m, we have v =r+ n n X X (qiT v )qi = r + (qi qiT )v and r T qi = 0, 1 ≤ i ≤ n i=1 i=1 Let Q bePcomposed of column vectors {q1 , q2 , . . . , qn }. QQ T = ni=1 (qi qiT ) is an orthogonal projection matrix. Xiangmin Jiao Numerical Analysis I 5 / 14 Orthogonal Matrices Definition A matrix is orthogonal if Q T = Q −1 , i.e., if Q T Q = QQ T = I . In the real case, we say the matrix is orthogonal. Its column vectors are orthonormal. In other words, qiT qj = δij , the Kronecker delta. Note: If Q ∈ Cm×m , then Q is said to be unitary (instead of being orthogonal) Xiangmin Jiao Numerical Analysis I 6 / 14 Orthogonal Matrices Definition A matrix is orthogonal if Q T = Q −1 , i.e., if Q T Q = QQ T = I . In the real case, we say the matrix is orthogonal. Its column vectors are orthonormal. In other words, qiT qj = δij , the Kronecker delta. Note: If Q ∈ Cm×m , then Q is said to be unitary (instead of being orthogonal) Question: What is the geometric meaning of multiplication by an orthogonal matrix? Xiangmin Jiao Numerical Analysis I 6 / 14 Orthogonal Matrices Definition A matrix is orthogonal if Q T = Q −1 , i.e., if Q T Q = QQ T = I . In the real case, we say the matrix is orthogonal. Its column vectors are orthonormal. In other words, qiT qj = δij , the Kronecker delta. Note: If Q ∈ Cm×m , then Q is said to be unitary (instead of being orthogonal) Question: What is the geometric meaning of multiplication by an orthogonal matrix? Answer: It preserves angles and Euclidean length. In the real case, multiplication by an orthogonal matrix Q is a rotation (if det(Q) = 1) or reflection (if det(Q) = −1). Xiangmin Jiao Numerical Analysis I 6 / 14 Outline 1 Orthogonal Vectors and Matrices 2 Projectors 3 Linear Least Squares Problems Xiangmin Jiao Numerical Analysis I 7 / 14 Projectors A projector satisfies P 2 = P. They are also said to be idempotent. I I Orthogonal projector. Oblique projector. Example: I I 0 0 α 1 is an oblique projector if α 6= 0, is orthogonal projector if α = 0. Xiangmin Jiao Numerical Analysis I 8 / 14 Projectors A projector satisfies P 2 = P. They are also said to be idempotent. I I Orthogonal projector. Oblique projector. Example: I I 0 0 α 1 is an oblique projector if α 6= 0, is orthogonal projector if α = 0. Xiangmin Jiao Numerical Analysis I 8 / 14 Complementary Projectors Complementary projectors: P vs. I − P. What space does I − P project? Xiangmin Jiao Numerical Analysis I 9 / 14 Complementary Projectors Complementary projectors: P vs. I − P. What space does I − P project? I I I Answer: null(P). range(I − P) ⊇ null(P) because Pv = 0 ⇒ (I − P)v = v . range(I − P) ⊆ null(P) because for any v (I − P)v = v − Pv ∈ null(P). A projector separates Rm into two complementary subspace: range space and null space (i.e., range(P) + null(P) = Rm and range(P) ∩ null(P) = {0} for projector P ∈ Rm×m ) It projects onto range space along null space I In other words, x = Px + r , where r ∈ null(P) Question: Are range space and null space of projector orthogonal to each other? Xiangmin Jiao Numerical Analysis I 9 / 14 Orthogonal Projector An orthogonal projector is one that projects onto a subspace S1 along a space S2 , where S1 and S2 are orthogonal. Theorem A projector P is orthogonal if and only if P = P T . Xiangmin Jiao Numerical Analysis I 10 / 14 Orthogonal Projector An orthogonal projector is one that projects onto a subspace S1 along a space S2 , where S1 and S2 are orthogonal. Theorem A projector P is orthogonal if and only if P = P T . Proof. “If” direction: If P = P T , then (Px)T (I − P)y = x T (P − P 2 )y = 0. “Only if” direction: Suppose P projects onto S1 along S2 where S1 ⊥ S2 , and S1 has dimension n. Let {q1 , . . . , , qn } be orthonormal basis of S1 and {qn+1 , . . . , qm } be a basis for S2 . Let Q be orthogonal matrix whose jth column is qj , and we have PQ = (q1 , q2 , . . . , qn , 0, . . . , 0), so Q T PQ = diag(1, 1, · · · , 1, 0, · · · ) = Σ, and P = QΣQ T . Xiangmin Jiao Numerical Analysis I 10 / 14 Orthogonal Projector An orthogonal projector is one that projects onto a subspace S1 along a space S2 , where S1 and S2 are orthogonal. Theorem A projector P is orthogonal if and only if P = P T . Proof. “If” direction: If P = P T , then (Px)T (I − P)y = x T (P − P 2 )y = 0. “Only if” direction: Suppose P projects onto S1 along S2 where S1 ⊥ S2 , and S1 has dimension n. Let {q1 , . . . , , qn } be orthonormal basis of S1 and {qn+1 , . . . , qm } be a basis for S2 . Let Q be orthogonal matrix whose jth column is qj , and we have PQ = (q1 , q2 , . . . , qn , 0, . . . , 0), so Q T PQ = diag(1, 1, · · · , 1, 0, · · · ) = Σ, and P = QΣQ T . Question: Are orthogonal projectors orthogonal matrices? Xiangmin Jiao Numerical Analysis I 10 / 14 Basis of Projections Projection with orthonormal basis I I I I Given any matrix Q̂ ∈ Rm×n whose columns are orthonormal, then P = Q̂ Q̂ T is orthogonal projector, so is I − P We write I − P as P⊥ In particular, if Q̂ = q, we write Pq = qq T and P⊥q = I − Pq T For arbitrary vector a, we write Pa = aaaT a and P⊥a = I − Pa Xiangmin Jiao Numerical Analysis I 11 / 14 Basis of Projections Projection with orthonormal basis I I I I Given any matrix Q̂ ∈ Rm×n whose columns are orthonormal, then P = Q̂ Q̂ T is orthogonal projector, so is I − P We write I − P as P⊥ In particular, if Q̂ = q, we write Pq = qq T and P⊥q = I − Pq T For arbitrary vector a, we write Pa = aaaT a and P⊥a = I − Pa Projection with arbitrary basis I Given any matrix A ∈ Rm×n that has full rank and m ≥ n P = A(AT A)−1 AT I is orthogonal projection What does P project onto? Xiangmin Jiao Numerical Analysis I 11 / 14 Basis of Projections Projection with orthonormal basis I I I I Given any matrix Q̂ ∈ Rm×n whose columns are orthonormal, then P = Q̂ Q̂ T is orthogonal projector, so is I − P We write I − P as P⊥ In particular, if Q̂ = q, we write Pq = qq T and P⊥q = I − Pq T For arbitrary vector a, we write Pa = aaaT a and P⊥a = I − Pa Projection with arbitrary basis I Given any matrix A ∈ Rm×n that has full rank and m ≥ n P = A(AT A)−1 AT I is orthogonal projection What does P project onto? F I range(A) (AT A)−1 AT is called the pseudo-inverse of A, denoted as A+ Xiangmin Jiao Numerical Analysis I 11 / 14 Outline 1 Orthogonal Vectors and Matrices 2 Projectors 3 Linear Least Squares Problems Xiangmin Jiao Numerical Analysis I 12 / 14 Linear Least Squares Problems Overdetermined system of equations Ax ≈ b, where A has more rows than columns and has full rank, in general has no solutions Example application: Polynomial least squares fitting In general, minimize the residual r = b − Ax In terms of 2-norm, we obtain linear least squares problem: Given A ∈ Rm×n , m ≥ n, and b ∈ Rm , find x ∈ Rn such that kb − Axk2 is minimized If A has full rank, the minimizer x is the solution to the normal equation AT Ax = AT b or in terms of the pseudoinverse A+ , x = A+ b, where A+ = (AT A)−1 AT ∈ Rn×m Xiangmin Jiao Numerical Analysis I 13 / 14 Geometric Interpretation Ax is in range(A), and the point in range(A) closest to b is its orthogonal projection onto range(A) Residual r is then orthogonal to range(A), and hence AT r = AT (b − Ax) = 0 Ax is orthogonal projection of b, where x = A+ b, so P = AA+ = A(AT A)−1 AT is orthogonal projection Xiangmin Jiao Numerical Analysis I 14 / 14