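Inner Product Spaces

Up to this point, we have been studying abstract vector spaces, which were generalized from the algebraic properties of vectors in $\mathbb{R}^n$. But certain geometric aspects of $\mathbb{R}^n$ have been missing, namely, distance and angle. The axioms for a vector space say nothing about these properties.

In this chapter, we will address these issues by generalizing properties of the dot product. Let us begin by reviewing some of the known properties of the dot product.

Recall that for $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, the dot product is
\[ \mathbf{x} \cdot \mathbf{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n = \sum_{j=1}^{n} x_j y_j . \]
If elements of $\mathbb{R}^n$ are interpreted as column vectors, then
\[ \mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T \mathbf{y} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} . \]

The basic properties of the dot product are collected in Theorem 3.2.2, page 138:

Theorem
If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in $\mathbb{R}^n$ and $k$ is a scalar, then the following hold:
(a) $\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$ (symmetry)
(b) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$ (distributivity)
(c) $(k\mathbf{u}) \cdot \mathbf{v} = k(\mathbf{u} \cdot \mathbf{v})$ (homogeneity)
(d) $\mathbf{v} \cdot \mathbf{v} \ge 0$, and if $\mathbf{v} \cdot \mathbf{v} = 0$, then $\mathbf{v} = \mathbf{0}$. (positivity)

Proof. Properties (a)–(c) follow easily from the rules of matrix algebra. Also, since the square of a real number is nonnegative,
\[ \mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + \cdots + v_n^2 \ge 0, \]
and this sum can only be zero if $v_1 = v_2 = \cdots = v_n = 0$.

Now these are only a few of the many properties of the dot product, but we will use them as the basis of our generalization. We will discover that most other familiar properties will follow from them.

Here is our generalization from $\mathbb{R}^n$ to abstract vector spaces.

Definition
Let $V$ be a real vector space. An inner product on $V$ is a scalar-valued function of two vectors that associates with every pair of vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$ a real number $\langle \mathbf{u}, \mathbf{v} \rangle$ satisfying the following axioms for any vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ in $V$ and any scalar $k$.
1. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$ (symmetry)
2. $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$ (distributivity)
3. $\langle k\mathbf{u}, \mathbf{v} \rangle = k \langle \mathbf{u}, \mathbf{v} \rangle$ (homogeneity)
4. $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$, and if $\langle \mathbf{v}, \mathbf{v} \rangle = 0$, then $\mathbf{v} = \mathbf{0}$. (positivity)

So the theorem quoted above says that the dot product is an example of an inner product on $\mathbb{R}^n$. It is called the Euclidean inner product or sometimes the standard inner product on $\mathbb{R}^n$.

But there are many other examples of inner products on $\mathbb{R}^n$, as well as on vector spaces that are quite different from $\mathbb{R}^n$.

For example, on $V = \mathbb{R}^2$, define
\[ \langle \mathbf{u}, \mathbf{v} \rangle = 2u_1 v_1 + 5u_2 v_2 . \]
We will proceed to check that all four axioms for an inner product are satisfied.

For Axiom 1, we have
\[ \langle \mathbf{v}, \mathbf{u} \rangle = 2v_1 u_1 + 5v_2 u_2 = 2u_1 v_1 + 5u_2 v_2 = \langle \mathbf{u}, \mathbf{v} \rangle . \]
For Axiom 2, we have
\[ \begin{aligned} \langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle &= 2(u_1 + v_1)w_1 + 5(u_2 + v_2)w_2 \\ &= 2u_1 w_1 + 2v_1 w_1 + 5u_2 w_2 + 5v_2 w_2 \\ &= (2u_1 w_1 + 5u_2 w_2) + (2v_1 w_1 + 5v_2 w_2) \\ &= \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle . \end{aligned} \]
For Axiom 3,
\[ \langle k\mathbf{u}, \mathbf{v} \rangle = 2(ku_1)v_1 + 5(ku_2)v_2 = k\,(2u_1 v_1 + 5u_2 v_2) = k \langle \mathbf{u}, \mathbf{v} \rangle . \]
Finally, Axiom 4 holds for the same reason it did for the Euclidean inner product, namely, because a square cannot be negative: $v_1^2 \ge 0$ and $v_2^2 \ge 0$, so $\langle \mathbf{v}, \mathbf{v} \rangle = 2v_1^2 + 5v_2^2 \ge 0$, and this sum can only be zero if $v_1 = v_2 = 0$.

This is an example of a weighted Euclidean inner product on $\mathbb{R}^2$, with weights 2 and 5. This example can be generalized to $\mathbb{R}^n$.

Definition
Given positive real numbers $w_1, w_2, \ldots, w_n$, for any $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, define
\[ \langle \mathbf{x}, \mathbf{y} \rangle = w_1 x_1 y_1 + w_2 x_2 y_2 + \cdots + w_n x_n y_n = \sum_{j=1}^{n} w_j x_j y_j . \]
Then $\langle \, , \rangle$ is an inner product on $\mathbb{R}^n$, called the weighted Euclidean inner product on $\mathbb{R}^n$ with weights $w_1, w_2, \ldots, w_n$.

Note that it is crucial in the definition above that the weights be positive. For example, on $\mathbb{R}^2$, define
\[ \langle \mathbf{u}, \mathbf{v} \rangle = 2u_1 v_1 - 5u_2 v_2 . \qquad (*) \]
Then we can easily see that Axioms 1–3 still hold, and the proofs are similar. But when it comes to Axiom 4, we find that
\[ \langle \mathbf{v}, \mathbf{v} \rangle = 2v_1^2 - 5v_2^2 , \]
and it is no longer obvious that $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$. In fact, if $\mathbf{v} = (1, 1)$, then
\[ \langle \mathbf{v}, \mathbf{v} \rangle = 2(1)^2 - 5(1)^2 = 2 - 5 = -3 < 0 . \]
So $(*)$ is not an inner product.

And even requiring $w_j \ge 0$ is not enough. For example, on $\mathbb{R}^3$, let $w_1 = 3$, $w_2 = 4$, and $w_3 = 0$. Then
\[ \langle \mathbf{x}, \mathbf{y} \rangle = 3x_1 y_1 + 4x_2 y_2 + 0x_3 y_3 . \]
Now this still satisfies Axioms 1–3, and $\langle \mathbf{x}, \mathbf{x} \rangle \ge 0$ is satisfied. However, if $\mathbf{x} = (0, 0, 1)$, then we have
\[ \langle (0, 0, 1), (0, 0, 1) \rangle = 3(0)^2 + 4(0)^2 + 0(1)^2 = 0 , \]
even though $\mathbf{x} \ne \mathbf{0}$. This violates the second part of Axiom 4, so this is not an inner product.

These examples illustrate that a given vector space $V$ can have more than one inner product. We usually only consider one inner product at a time, but if we need to distinguish them, we can use subscripts, like $\langle \, , \rangle_1$, $\langle \, , \rangle_2$, etc.

If we select a particular inner product on $V$, then the combination of the two is called an inner product space. We will explore a wide variety of inner product spaces, but first we will derive some useful consequences of the axioms.

First, note that Axioms 2 and 3 are "one-sided". They deal with sums or scalar multiples in the left argument only. But the symmetry axiom allows us to extend that:

Theorem
If $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$ are vectors in an inner product space and $k$ is a scalar, then
2.* $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$.
3.* $\langle \mathbf{u}, k\mathbf{v} \rangle = k \langle \mathbf{u}, \mathbf{v} \rangle$.

Also, by combining addition and scalar multiplication, we can deal with arbitrary linear combinations.

Theorem
In an inner product space, the inner product "preserves" linear combinations in each argument:
(a) $\langle c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k , \mathbf{w} \rangle = c_1 \langle \mathbf{v}_1, \mathbf{w} \rangle + c_2 \langle \mathbf{v}_2, \mathbf{w} \rangle + \cdots + c_k \langle \mathbf{v}_k, \mathbf{w} \rangle$
(b) $\langle \mathbf{u}, c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k \rangle = c_1 \langle \mathbf{u}, \mathbf{v}_1 \rangle + c_2 \langle \mathbf{u}, \mathbf{v}_2 \rangle + \cdots + c_k \langle \mathbf{u}, \mathbf{v}_k \rangle$
Or, in sigma notation,
(c) $\left\langle \sum_{j=1}^{k} c_j \mathbf{v}_j , \mathbf{w} \right\rangle = \sum_{j=1}^{k} c_j \langle \mathbf{v}_j, \mathbf{w} \rangle$
(d) $\left\langle \mathbf{u}, \sum_{j=1}^{k} c_j \mathbf{v}_j \right\rangle = \sum_{j=1}^{k} c_j \langle \mathbf{u}, \mathbf{v}_j \rangle$

Here are some useful consequences of the axioms for an inner product.

Theorem
Suppose $V$ is an inner product space, and $\mathbf{x}$, $\mathbf{y}$, $\mathbf{u}$, $\mathbf{v}$ are vectors in $V$.
(a) $\langle -\mathbf{x}, \mathbf{y} \rangle = \langle \mathbf{x}, -\mathbf{y} \rangle = -\langle \mathbf{x}, \mathbf{y} \rangle$.
(b) $\langle \mathbf{x} - \mathbf{y}, \mathbf{v} \rangle = \langle \mathbf{x}, \mathbf{v} \rangle - \langle \mathbf{y}, \mathbf{v} \rangle$ and $\langle \mathbf{v}, \mathbf{x} - \mathbf{y} \rangle = \langle \mathbf{v}, \mathbf{x} \rangle - \langle \mathbf{v}, \mathbf{y} \rangle$.
(c) $\langle \mathbf{0}, \mathbf{x} \rangle = \langle \mathbf{x}, \mathbf{0} \rangle = 0$.
(d) $\langle \mathbf{x} + \mathbf{y}, \mathbf{u} + \mathbf{v} \rangle = \langle \mathbf{x}, \mathbf{u} \rangle + \langle \mathbf{x}, \mathbf{v} \rangle + \langle \mathbf{y}, \mathbf{u} \rangle + \langle \mathbf{y}, \mathbf{v} \rangle$.
(e) $\langle \mathbf{x} + \mathbf{y}, \mathbf{x} + \mathbf{y} \rangle = \langle \mathbf{x}, \mathbf{x} \rangle + 2\langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{y}, \mathbf{y} \rangle$.
(f) $\langle \mathbf{x} + \mathbf{y}, \mathbf{x} - \mathbf{y} \rangle = \langle \mathbf{x}, \mathbf{x} \rangle - \langle \mathbf{y}, \mathbf{y} \rangle$.
(g) If $\langle \mathbf{v}, \mathbf{x} \rangle = 0$ for all $\mathbf{x} \in V$, then $\mathbf{v} = \mathbf{0}$.
(h) If $\langle \mathbf{u}, \mathbf{x} \rangle = \langle \mathbf{v}, \mathbf{x} \rangle$ for all $\mathbf{x} \in V$, then $\mathbf{u} = \mathbf{v}$.

Proof:
(a): $\langle -\mathbf{x}, \mathbf{y} \rangle = \langle (-1)\mathbf{x}, \mathbf{y} \rangle = (-1)\langle \mathbf{x}, \mathbf{y} \rangle = -\langle \mathbf{x}, \mathbf{y} \rangle$, etc.
(b): Using (a), $\langle \mathbf{x} - \mathbf{y}, \mathbf{v} \rangle = \langle \mathbf{x} + (-\mathbf{y}), \mathbf{v} \rangle = \langle \mathbf{x}, \mathbf{v} \rangle + \langle -\mathbf{y}, \mathbf{v} \rangle = \langle \mathbf{x}, \mathbf{v} \rangle - \langle \mathbf{y}, \mathbf{v} \rangle$.
(c): $\langle \mathbf{0}, \mathbf{x} \rangle = \langle 0\mathbf{x}, \mathbf{x} \rangle = 0\langle \mathbf{x}, \mathbf{x} \rangle = 0$, etc.
(d): Using Axiom 2 first and then 2*,
\[ \langle \mathbf{x} + \mathbf{y}, \mathbf{u} + \mathbf{v} \rangle = \langle \mathbf{x}, \mathbf{u} + \mathbf{v} \rangle + \langle \mathbf{y}, \mathbf{u} + \mathbf{v} \rangle = \langle \mathbf{x}, \mathbf{u} \rangle + \langle \mathbf{x}, \mathbf{v} \rangle + \langle \mathbf{y}, \mathbf{u} \rangle + \langle \mathbf{y}, \mathbf{v} \rangle . \]
(e): By (d), with $\mathbf{u} = \mathbf{x}$ and $\mathbf{v} = \mathbf{y}$, and then symmetry, we have
\[ \begin{aligned} \langle \mathbf{x} + \mathbf{y}, \mathbf{x} + \mathbf{y} \rangle &= \langle \mathbf{x}, \mathbf{x} \rangle + \langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{y}, \mathbf{x} \rangle + \langle \mathbf{y}, \mathbf{y} \rangle \\ &= \langle \mathbf{x}, \mathbf{x} \rangle + \langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{y}, \mathbf{y} \rangle \\ &= \langle \mathbf{x}, \mathbf{x} \rangle + 2\langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{y}, \mathbf{y} \rangle . \end{aligned} \]
(f): Again using (d), but with $\mathbf{u} = \mathbf{x}$ and $\mathbf{v} = -\mathbf{y}$, and then applying (a) and symmetry, we have
\[ \begin{aligned} \langle \mathbf{x} + \mathbf{y}, \mathbf{x} - \mathbf{y} \rangle &= \langle \mathbf{x}, \mathbf{x} \rangle + \langle \mathbf{x}, -\mathbf{y} \rangle + \langle \mathbf{y}, \mathbf{x} \rangle + \langle \mathbf{y}, -\mathbf{y} \rangle \\ &= \langle \mathbf{x}, \mathbf{x} \rangle - \langle \mathbf{x}, \mathbf{y} \rangle + \langle \mathbf{x}, \mathbf{y} \rangle - \langle \mathbf{y}, \mathbf{y} \rangle \\ &= \langle \mathbf{x}, \mathbf{x} \rangle - \langle \mathbf{y}, \mathbf{y} \rangle . \end{aligned} \]
(g): Suppose $\langle \mathbf{v}, \mathbf{x} \rangle = 0$ for all $\mathbf{x}$. Then just put $\mathbf{x} = \mathbf{v}$. We get $\langle \mathbf{v}, \mathbf{v} \rangle = 0$. Then Axiom 4 says that $\mathbf{v}$ must be $\mathbf{0}$.
(h): If $\langle \mathbf{u}, \mathbf{x} \rangle = \langle \mathbf{v}, \mathbf{x} \rangle$ for all $\mathbf{x} \in V$, then
\[ 0 = \langle \mathbf{u}, \mathbf{x} \rangle - \langle \mathbf{v}, \mathbf{x} \rangle = \langle \mathbf{u} - \mathbf{v}, \mathbf{x} \rangle \quad \text{for all } \mathbf{x}. \]
Then (g) shows that $\mathbf{u} - \mathbf{v} = \mathbf{0}$, so $\mathbf{u} = \mathbf{v}$.

Norm (length)

In $\mathbb{R}^3$, we have a distance formula, which we used to define the length of a vector:
\[ \text{length of } \mathbf{x} = \|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + x_3^2} = \sqrt{\mathbf{x} \cdot \mathbf{x}} , \]
and this was generalized to $\mathbb{R}^n$:
\[ \text{length of } \mathbf{x} = \|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \sqrt{\mathbf{x} \cdot \mathbf{x}} . \]
And we defined the distance between two vectors $\mathbf{x}$ and $\mathbf{y}$ (interpreted as points) by
\[ d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\| = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2} . \]

Since an inner product is a generalization of the dot product, we can use it in a similar way to define length and distance in an inner product space.

This depends critically on Axiom 4. Since $\langle \mathbf{v}, \mathbf{v} \rangle \ge 0$, it is legitimate to take its square root (within the real number system). The second part of Axiom 4 is significant as well: if $\langle \mathbf{v}, \mathbf{v} \rangle = 0$, then $\mathbf{v} = \mathbf{0}$.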
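To see concretely why length depends on Axiom 4, the weighted forms discussed earlier can be evaluated numerically. The following is a minimal sketch in plain Python; the helper name `weighted_inner` and the sample vectors used for the spot checks are assumptions made for illustration, not part of the text. A spot check on a few vectors is of course not a proof of Axioms 1–3, only a sanity check.

```python
def weighted_inner(weights, x, y):
    """Return sum_j w_j * x_j * y_j (hypothetical helper for illustration)."""
    if not (len(weights) == len(x) == len(y)):
        raise ValueError("weights, x, and y must have the same length")
    return sum(w * a * b for w, a, b in zip(weights, x, y))


# Weights 2 and 5: spot-check Axioms 1-3 on arbitrarily chosen sample vectors.
good = (2, 5)
u, v, w, k = (1, -2), (3, 4), (-1, 5), 3
u_plus_v = tuple(a + b for a, b in zip(u, v))
k_times_u = tuple(k * a for a in u)

symmetric = weighted_inner(good, u, v) == weighted_inner(good, v, u)
additive = weighted_inner(good, u_plus_v, w) == (
    weighted_inner(good, u, w) + weighted_inner(good, v, w)
)
homogeneous = weighted_inner(good, k_times_u, v) == k * weighted_inner(good, u, v)

# Weights 2 and -5: <v, v> is negative at v = (1, 1), so no real square root.
negative_case = weighted_inner((2, -5), (1, 1), (1, 1))

# Weights 3, 4, 0: the nonzero vector (0, 0, 1) would have "length" zero.
zero_case = weighted_inner((3, 4, 0), (0, 0, 1), (0, 0, 1))

print(symmetric, additive, homogeneous)  # True True True
print(negative_case, zero_case)          # -3 0
```

The last two values match the hand computations for the two failed weight choices: one breaks the first part of Axiom 4, the other breaks the second part.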
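Definition
Let $V$ be an inner product space. Then for any $\mathbf{v}$ in $V$, we define the norm (or "length") of $\mathbf{v}$ by
\[ \|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle} . \]
So we always have $\langle \mathbf{v}, \mathbf{v} \rangle = \|\mathbf{v}\|^2$.
And the distance between two vectors $\mathbf{u}$ and $\mathbf{v}$ in $V$, denoted by $d(\mathbf{u}, \mathbf{v})$, is
\[ d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{\langle \mathbf{u} - \mathbf{v}, \mathbf{u} - \mathbf{v} \rangle} . \]

If the second part of Axiom 4 were not satisfied, we could have a nonzero vector with length 0. Or, we could have two different vectors whose distance is zero. Axiom 4 avoids these anomalies, so we have the following facts.

Theorem
If $\mathbf{u}$ and $\mathbf{v}$ are vectors in an inner product space $V$ and $k$ is a scalar, then
(a) $\|\mathbf{v}\| \ge 0$, and if $\|\mathbf{v}\| = 0$, then $\mathbf{v} = \mathbf{0}$.
(b) $d(\mathbf{u}, \mathbf{v}) \ge 0$, and if $d(\mathbf{u}, \mathbf{v}) = 0$, then $\mathbf{u} = \mathbf{v}$.
(c) $\|k\mathbf{v}\| = |k| \, \|\mathbf{v}\|$. So, for example, $\|-\mathbf{v}\| = \|\mathbf{v}\|$.

Proof. (a) and (b) are easy exercises. To prove (c), note that, using Axioms 3 and 3*,
\[ \langle k\mathbf{v}, k\mathbf{v} \rangle = k \langle \mathbf{v}, k\mathbf{v} \rangle = k^2 \langle \mathbf{v}, \mathbf{v} \rangle . \]
Recalling that $\sqrt{k^2} = |k|$ (NB: not $k$!), we have
\[ \|k\mathbf{v}\| = \sqrt{k^2 \langle \mathbf{v}, \mathbf{v} \rangle} = \sqrt{k^2} \, \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle} = |k| \, \|\mathbf{v}\| . \]

Examples

We will now look at several examples of inner products on various spaces.

Example 1.
Let $V = \mathbb{R}^2$ with the weighted inner product
\[ \langle \mathbf{x}, \mathbf{y} \rangle = 4x_1 y_1 + 7x_2 y_2 . \]
If $\mathbf{u} = (-3, 2)$, $\mathbf{v} = (1, 4)$, and $\mathbf{w} = (-1, 1)$, compute each of the following:
(a) $\langle \mathbf{u}, \mathbf{v} \rangle$
(b) $\langle \mathbf{u}, \mathbf{w} \rangle$
(c) $\langle \mathbf{v}, \mathbf{w} \rangle$
(d) $\langle 7\mathbf{u} - 9\mathbf{v}, \mathbf{w} \rangle$
(e) $\|\mathbf{w}\|$
(f) $d(\mathbf{u}, \mathbf{v})$.

Solution:
(a): $\langle \mathbf{u}, \mathbf{v} \rangle = \langle (-3, 2), (1, 4) \rangle = 4(-3)(1) + 7(2)(4) = -12 + 56 = 44$.
(b): $\langle \mathbf{u}, \mathbf{w} \rangle = \langle (-3, 2), (-1, 1) \rangle = 4(-3)(-1) + 7(2)(1) = 12 + 14 = 26$.
(c): $\langle \mathbf{v}, \mathbf{w} \rangle = \langle (1, 4), (-1, 1) \rangle = 4(1)(-1) + 7(4)(1) = -4 + 28 = 24$.
(d): Here we can use one of the "linear combination" facts above:
\[ \langle 7\mathbf{u} - 9\mathbf{v}, \mathbf{w} \rangle = 7\langle \mathbf{u}, \mathbf{w} \rangle - 9\langle \mathbf{v}, \mathbf{w} \rangle = 7(26) - 9(24) = 182 - 216 = -34 . \]
Or, the "hard" way: $7\mathbf{u} - 9\mathbf{v} = (-30, -22)$, so
\[ \langle (-30, -22), (-1, 1) \rangle = 4(-30)(-1) + 7(-22)(1) = 120 - 154 = -34 . \]
(e): $\langle \mathbf{w}, \mathbf{w} \rangle = 4(-1)(-1) + 7(1)(1) = 11$, so $\|\mathbf{w}\| = \sqrt{11}$.
(f): $\mathbf{u} - \mathbf{v} = (-3, 2) - (1, 4) = (-4, -2)$, so
\[ \|(-4, -2)\|^2 = \langle (-4, -2), (-4, -2) \rangle = 4(-4)^2 + 7(-2)^2 = 64 + 28 = 92 . \]
Then $d(\mathbf{u}, \mathbf{v}) = \sqrt{92} = 2\sqrt{23}$.

Remember that the standard inner product (i.e., the dot product) in $\mathbb{R}^n$ can be written as a matrix multiplication:
\[ \langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T \mathbf{y} . \]
The weighted Euclidean inner products can also be expressed in matrix form. Given positive weights $w_1, w_2, \ldots, w_n$, define the diagonal matrix
\[ W = \begin{bmatrix} w_1 & 0 & \cdots & 0 \\ 0 & w_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_n \end{bmatrix} . \]
Then clearly, $\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^T W \mathbf{y}$.

Matrix generated inner products

This example can be generalized to a whole class of inner products.

Theorem
Given an invertible $n \times n$ matrix $A$, let $G = A^T A$. For $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^n$, define
\[ \langle \mathbf{x}, \mathbf{y} \rangle_A = \mathbf{x}^T G \mathbf{y} . \]
Then $\langle \, , \rangle_A$ is an inner product on $\mathbb{R}^n$, called the inner product on $\mathbb{R}^n$ generated by $A$.

Observe that we can write this inner product as
\[ \langle \mathbf{x}, \mathbf{y} \rangle_A = \mathbf{x}^T A^T A \mathbf{y} = (A\mathbf{x})^T (A\mathbf{y}) . \]

Proof: This can be proved using rules of matrix (and transpose) algebra. (See Exercise 31, page 344.)

Axiom 1: First, observe that $G$ is a symmetric matrix: $G^T = (A^T A)^T = A^T (A^T)^T = A^T A = G$. Since the value of an inner product is a scalar (essentially a $1 \times 1$ matrix), it is symmetric: $\langle \mathbf{x}, \mathbf{y} \rangle_A^T = \langle \mathbf{x}, \mathbf{y} \rangle_A$. But
\[ \langle \mathbf{x}, \mathbf{y} \rangle_A^T = (\mathbf{x}^T G \mathbf{y})^T = \mathbf{y}^T G^T (\mathbf{x}^T)^T = \mathbf{y}^T G \mathbf{x} = \langle \mathbf{y}, \mathbf{x} \rangle_A . \]
Axiom 2: $\langle \mathbf{x} + \mathbf{y}, \mathbf{z} \rangle_A = (\mathbf{x} + \mathbf{y})^T G \mathbf{z} = (\mathbf{x}^T + \mathbf{y}^T) G \mathbf{z} = \mathbf{x}^T G \mathbf{z} + \mathbf{y}^T G \mathbf{z} = \langle \mathbf{x}, \mathbf{z} \rangle_A + \langle \mathbf{y}, \mathbf{z} \rangle_A$.
Axiom 3: $\langle k\mathbf{x}, \mathbf{y} \rangle_A = (k\mathbf{x})^T G \mathbf{y} = k(\mathbf{x}^T) G \mathbf{y} = k \langle \mathbf{x}, \mathbf{y} \rangle_A$.
Axiom 4: $\langle \mathbf{x}, \mathbf{x} \rangle_A = \mathbf{x}^T G \mathbf{x} = \mathbf{x}^T A^T A \mathbf{x} = (A\mathbf{x})^T (A\mathbf{x})$. Let $\mathbf{y} = A\mathbf{x}$. Then we have $\mathbf{y}^T \mathbf{y}$, which is just the dot product, so $\langle \mathbf{x}, \mathbf{x} \rangle_A = \mathbf{y} \cdot \mathbf{y} \ge 0$. Finally, if $\langle \mathbf{x}, \mathbf{x} \rangle_A = 0$, then $\mathbf{y} \cdot \mathbf{y} = 0$, so $\mathbf{y} = A\mathbf{x} = \mathbf{0}$. But we assumed $A$ is invertible. Therefore $\mathbf{x} = \mathbf{0}$.

Every weighted Euclidean inner product is in fact generated by a diagonal matrix. Let
\[ W = \begin{bmatrix} w_1 & 0 & \cdots & 0 \\ 0 & w_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_n \end{bmatrix} , \]
where $w_j > 0$ for every $j$. Then the square roots of the weights exist, so let
\[ A = \begin{bmatrix} \sqrt{w_1} & 0 & \cdots & 0 \\ 0 & \sqrt{w_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{w_n} \end{bmatrix} . \]
Then $A^T A = W$, so the weighted inner product is generated by $A$.

A matrix inner product

Example 2
Let $A = \begin{bmatrix} 0 & 2 \\ 1 & -1 \end{bmatrix}$, and let $\langle \, , \rangle_A$ be the matrix inner product on $\mathbb{R}^2$ generated by $A$.
If $\mathbf{u} = (-3, 2)$, $\mathbf{v} = (1, 4)$, and $\mathbf{w} = (-1, 1)$, compute each of the following:
(a) $\langle \mathbf{u}, \mathbf{w} \rangle_A$
(b) $\langle \mathbf{v}, \mathbf{w} \rangle_A$
(c) $\langle 7\mathbf{u} - 9\mathbf{v}, \mathbf{w} \rangle_A$
(d) $\|\mathbf{w}\|$ (the norm induced by $\langle \, , \rangle_A$).

First, we compute $G$:
\[ G = A^T A = \begin{bmatrix} 0 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 0 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ -1 & 5 \end{bmatrix} . \]
Then everything comes from simple matrix products.
(a): $\langle \mathbf{u}, \mathbf{w} \rangle_A = \begin{bmatrix} -3 & 2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 5 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 & 2 \end{bmatrix} \begin{bmatrix} -2 \\ 6 \end{bmatrix} = 6 + 12 = 18$.
(b): $\langle \mathbf{v}, \mathbf{w} \rangle_A = \begin{bmatrix} 1 & 4 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 5 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 4 \end{bmatrix} \begin{bmatrix} -2 \\ 6 \end{bmatrix} = -2 + 24 = 22$.
(c): $\langle 7\mathbf{u} - 9\mathbf{v}, \mathbf{w} \rangle_A = 7\langle \mathbf{u}, \mathbf{w} \rangle_A - 9\langle \mathbf{v}, \mathbf{w} \rangle_A = 7(18) - 9(22) = 126 - 198 = -72$.
(d): $\langle \mathbf{w}, \mathbf{w} \rangle_A = \begin{bmatrix} -1 & 1 \end{bmatrix} \begin{bmatrix} -2 \\ 6 \end{bmatrix} = 2 + 6 = 8$, so $\|\mathbf{w}\| = \sqrt{8} = 2\sqrt{2}$.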
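The matrix computations in Example 2 can be reproduced with a short script. This is a sketch under stated assumptions: the helper names `transpose`, `matmul`, and `gram_inner` are hypothetical, matrices are stored as plain lists of rows, and no external library is used. It also revisits the weighted inner product of Example 1 as the special case $G = W = \mathrm{diag}(4, 7)$.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gram_inner(g, x, y):
    """Compute x^T G y for a square matrix G and vectors x, y."""
    gy = [sum(g_ij * y_j for g_ij, y_j in zip(row, y)) for row in g]
    return sum(x_i * t for x_i, t in zip(x, gy))


# Example 2: inner product generated by A, with G = A^T A.
A = [[0, 2], [1, -1]]
G = matmul(transpose(A), A)
u, v, w = [-3, 2], [1, 4], [-1, 1]

uw = gram_inner(G, u, w)
vw = gram_inner(G, v, w)
combo = gram_inner(G, [7 * a - 9 * b for a, b in zip(u, v)], w)
ww = gram_inner(G, w, w)

# Example 1 as a special case: the weighted inner product with W = diag(4, 7).
W = [[4, 0], [0, 7]]
diff = [a - b for a, b in zip(u, v)]
dist_squared = gram_inner(W, diff, diff)

print(G)                  # [[1, -1], [-1, 5]]
print(uw, vw, combo, ww)  # 18 22 -72 8
print(diff, dist_squared) # [-4, -2] 92
```

Computing part (c) directly from $7\mathbf{u} - 9\mathbf{v}$ gives the same value as the linear-combination shortcut, which is a useful cross-check on the hand calculation.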