MATH 311 Topics in Applied Mathematics I
Lecture 26: Least squares problems (continued). Orthogonal bases. The Gram-Schmidt orthogonalization process.

Least squares solution

System of linear equations:

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2        ⇐⇒  Ax = b
· · · · · · · · ·
am1 x1 + am2 x2 + · · · + amn xn = bm

For any x ∈ Rⁿ define the residual r(x) = b − Ax. A least squares solution x of the system is one that minimizes ‖r(x)‖ (or, equivalently, ‖r(x)‖²).

Theorem. A vector x̂ is a least squares solution of the system Ax = b if and only if it is a solution of the associated normal system AᵀAx = Aᵀb.

Problem. Find the constant function that is the least squares fit to the following data:

x    | 0 1 2 3
f(x) | 1 0 1 2

Setting f(x) = c gives the inconsistent system c = 1, c = 0, c = 1, c = 2, i.e., Ac = b with A = (1, 1, 1, 1)ᵀ and b = (1, 0, 1, 2)ᵀ. The normal system AᵀAc = Aᵀb reads 4c = 1 + 0 + 1 + 2, hence

c = (1 + 0 + 1 + 2)/4 = 1   (the arithmetic mean of the data values).

Problem. Find the linear polynomial that is the least squares fit to the same data.

Setting f(x) = c1 + c2 x gives the system

c1       = 1
c1 +  c2 = 0       ⇐⇒  Ac = b,
c1 + 2c2 = 1
c1 + 3c2 = 2

where

    [1 0]        [1]
A = [1 1],   b = [0].
    [1 2]        [1]
    [1 3]        [2]

The normal system AᵀAc = Aᵀb is

[4  6] [c1]   [4]
[6 14] [c2] = [8]   ⇐⇒  c1 = 0.4, c2 = 0.4,

so f(x) = 0.4 + 0.4x.

Problem. Find the quadratic polynomial that is the least squares fit to the same data.

Setting f(x) = c1 + c2 x + c3 x² gives the system

c1             = 1
c1 +  c2 +  c3 = 0       ⇐⇒  Ac = b,
c1 + 2c2 + 4c3 = 1
c1 + 3c2 + 9c3 = 2

where

    [1 0 0]        [1]
A = [1 1 1],   b = [0].
    [1 2 4]        [1]
    [1 3 9]        [2]

The normal system AᵀAc = Aᵀb is

[ 4  6 14] [c1]   [ 4]
[ 6 14 36] [c2] = [ 8]   ⇐⇒  c1 = 0.9, c2 = −1.1, c3 = 0.5,
[14 36 98] [c3]   [22]

so f(x) = 0.9 − 1.1x + 0.5x².

Orthogonal sets

Let ⟨·, ·⟩ denote the scalar product in Rⁿ.

Definition. Nonzero vectors v1, v2, . . . , vk ∈ Rⁿ form an orthogonal set if they are orthogonal to each other: ⟨vi, vj⟩ = 0 for i ≠ j. If, in addition, all vectors are of unit length, ‖vi‖ = 1, then v1, v2, . . . , vk is called an orthonormal set.

Example.
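The three fits above can be reproduced numerically by forming the normal system AᵀAc = Aᵀb directly. A minimal sketch, assuming NumPy is available; the function name polyfit_normal is illustrative, not part of the lecture:

```python
import numpy as np

# Data points from the fitting problems above.
x = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 0.0, 1.0, 2.0])

def polyfit_normal(degree):
    """Least squares polynomial fit of the given degree,
    computed by solving the normal system (A^T A) c = A^T b."""
    # Design matrix: column j holds x**j, so A @ c evaluates the polynomial.
    A = np.vander(x, degree + 1, increasing=True)
    return np.linalg.solve(A.T @ A, A.T @ b)

c_const = polyfit_normal(0)   # c = 1 (the arithmetic mean)
c_lin   = polyfit_normal(1)   # f(x) = 0.4 + 0.4x
c_quad  = polyfit_normal(2)   # f(x) = 0.9 - 1.1x + 0.5x^2
```

In numerical practice one would prefer np.linalg.lstsq over forming AᵀA explicitly, since the normal equations square the condition number of A; for these small integer data the direct solve reproduces the lecture's answers exactly.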
The standard basis e1 = (1, 0, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, . . . , 0, 1) is an orthonormal set.

Orthonormal bases

Suppose v1, v2, . . . , vn is an orthonormal basis for Rⁿ (i.e., it is a basis and an orthonormal set).

Theorem. Let x = x1 v1 + x2 v2 + · · · + xn vn and y = y1 v1 + y2 v2 + · · · + yn vn, where xi, yj ∈ R. Then
(i) ⟨x, y⟩ = x1 y1 + x2 y2 + · · · + xn yn,
(ii) ‖x‖ = √(x1² + x2² + · · · + xn²).

Proof: (ii) follows from (i) when y = x. As for (i),

⟨x, y⟩ = ⟨Σi xi vi, Σj yj vj⟩ = Σi Σj xi yj ⟨vi, vj⟩ = Σi xi yi,

since ⟨vi, vj⟩ = 0 for i ≠ j and ⟨vi, vi⟩ = 1.

Orthogonal projection

Suppose V is a subspace of Rⁿ. Let p be the orthogonal projection of a vector x ∈ Rⁿ onto V.

If V is a one-dimensional subspace spanned by a vector v, then

p = (⟨x, v⟩/⟨v, v⟩) v.

If V admits an orthogonal basis v1, v2, . . . , vk, then

p = (⟨x, v1⟩/⟨v1, v1⟩) v1 + (⟨x, v2⟩/⟨v2, v2⟩) v2 + · · · + (⟨x, vk⟩/⟨vk, vk⟩) vk.

Indeed,

⟨p, vi⟩ = Σj (⟨x, vj⟩/⟨vj, vj⟩) ⟨vj, vi⟩ = (⟨x, vi⟩/⟨vi, vi⟩) ⟨vi, vi⟩ = ⟨x, vi⟩

=⇒ ⟨x − p, vi⟩ = 0 =⇒ x − p ⊥ vi =⇒ x − p ⊥ V.

Coordinates relative to an orthogonal basis

Theorem. If v1, v2, . . . , vn is an orthogonal basis for Rⁿ, then

x = (⟨x, v1⟩/⟨v1, v1⟩) v1 + (⟨x, v2⟩/⟨v2, v2⟩) v2 + · · · + (⟨x, vn⟩/⟨vn, vn⟩) vn

for any vector x ∈ Rⁿ.

Corollary. If v1, v2, . . . , vn is an orthonormal basis for Rⁿ, then

x = ⟨x, v1⟩ v1 + ⟨x, v2⟩ v2 + · · · + ⟨x, vn⟩ vn

for any vector x ∈ Rⁿ.

The Gram-Schmidt orthogonalization process

Let V be a subspace of Rⁿ. Suppose x1, x2, . . . , xk is a basis for V. Let

v1 = x1,
v2 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1,
v3 = x3 − (⟨x3, v1⟩/⟨v1, v1⟩) v1 − (⟨x3, v2⟩/⟨v2, v2⟩) v2,
. . . . . . . . .
vk = xk − (⟨xk, v1⟩/⟨v1, v1⟩) v1 − · · · − (⟨xk, vk−1⟩/⟨vk−1, vk−1⟩) vk−1.

Then v1, v2, . . . , vk is an orthogonal basis for V.

[Figure: v3 = x3 − p3, where p3 is the orthogonal projection of x3 onto Span(v1, v2) = Span(x1, x2).]

Any basis x1, x2, . . . , xk −→ Orthogonal basis v1, v2, . . .
, vk.

Properties of the Gram-Schmidt process:
• vj = xj − (α1 x1 + · · · + αj−1 xj−1), 1 ≤ j ≤ k;
• the span of v1, . . . , vj is the same as the span of x1, . . . , xj;
• vj is orthogonal to x1, . . . , xj−1;
• vj = xj − pj, where pj is the orthogonal projection of the vector xj onto the subspace spanned by x1, . . . , xj−1;
• ‖vj‖ is the distance from xj to the subspace spanned by x1, . . . , xj−1.

Normalization

Let V be a subspace of Rⁿ. Suppose v1, v2, . . . , vk is an orthogonal basis for V. Let

w1 = v1/‖v1‖,  w2 = v2/‖v2‖,  . . . ,  wk = vk/‖vk‖.

Then w1, w2, . . . , wk is an orthonormal basis for V.

Theorem. Any non-trivial subspace of Rⁿ admits an orthonormal basis.

Orthogonalization / Normalization

An alternative form of the Gram-Schmidt process combines orthogonalization with normalization. Suppose x1, x2, . . . , xk is a basis for a subspace V ⊂ Rⁿ. Let

v1 = x1,  w1 = v1/‖v1‖,
v2 = x2 − ⟨x2, w1⟩w1,  w2 = v2/‖v2‖,
v3 = x3 − ⟨x3, w1⟩w1 − ⟨x3, w2⟩w2,  w3 = v3/‖v3‖,
. . . . . . . . .
vk = xk − ⟨xk, w1⟩w1 − · · · − ⟨xk, wk−1⟩wk−1,  wk = vk/‖vk‖.

Then w1, w2, . . . , wk is an orthonormal basis for V.

Problem. Let Π be the plane spanned by the vectors x1 = (1, 1, 0) and x2 = (0, 1, 1).
(i) Find the orthogonal projection of the vector y = (4, 0, −1) onto the plane Π.
(ii) Find the distance from y to Π.

First we apply the Gram-Schmidt process to the basis x1, x2:

v1 = x1 = (1, 1, 0),
v2 = x2 − (⟨x2, v1⟩/⟨v1, v1⟩) v1 = (0, 1, 1) − (1/2)(1, 1, 0) = (−1/2, 1/2, 1).

Now that v1, v2 is an orthogonal basis for Π, the orthogonal projection of y onto Π is

p = (⟨y, v1⟩/⟨v1, v1⟩) v1 + (⟨y, v2⟩/⟨v2, v2⟩) v2
  = (4/2)(1, 1, 0) + (−3/(3/2))(−1/2, 1/2, 1)
  = (2, 2, 0) + (1, −1, −2) = (3, 1, −2).

The distance from y to Π is ‖y − p‖ = ‖(1, −1, 1)‖ = √3.
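The Gram-Schmidt process and the projection formula can be checked numerically on this worked problem. A minimal sketch, assuming NumPy is available; the function names gram_schmidt and project are illustrative, not from the lecture:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: turn linearly independent x1, ..., xk
    into an orthogonal basis v1, ..., vk with the same span."""
    basis = []
    for x in vectors:
        x = np.asarray(x, dtype=float)
        v = x.copy()
        # Subtract the projection of x onto each previously built vi.
        for vi in basis:
            v -= (x @ vi) / (vi @ vi) * vi
        basis.append(v)
    return basis

def project(x, basis):
    """Orthogonal projection of x onto the span of an orthogonal basis."""
    x = np.asarray(x, dtype=float)
    return sum((x @ v) / (v @ v) * v for v in basis)

# The plane spanned by x1 = (1, 1, 0) and x2 = (0, 1, 1).
v1, v2 = gram_schmidt([(1, 1, 0), (0, 1, 1)])
# v1 = (1, 1, 0), v2 = (-1/2, 1/2, 1), with <v1, v2> = 0.

y = np.array([4.0, 0.0, -1.0])
p = project(y, [v1, v2])       # p = (3, 1, -2)
dist = np.linalg.norm(y - p)   # sqrt(3), the distance from y to the plane
```

Dividing each vi by its norm (vi / np.linalg.norm(vi)) would give the orthonormal basis of the combined orthogonalization/normalization form above.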