Chapter 2  Bases and Dimension

One of the main things we'll do in the first half of this course is to "classify" vector spaces. Classification is one of the great aims of pure maths: when you're faced with a lot of interesting objects that share common properties, you want to try and write down all objects with those properties. Roughly speaking, for vector spaces it turns out that the two properties that matter are the field the space is defined over and the dimension of the space. Now, you should already have an idea of what the dimension of some vector spaces is: R² is a 2-dimensional space, R³ is 3-dimensional, and so on. The main purpose of this section is to formalize this notion and generalize it to arbitrary vector spaces.

2.1  The span of a set

Definition 2.1. Let V be a vector space over K, and let v1, . . . , vn ∈ V. A vector v ∈ V is said to be a linear combination of v1, . . . , vn if there exist scalars α1, . . . , αn ∈ K such that

    v = α1v1 + · · · + αnvn.

Example 2.2. In R³, the vector (2, 2, −12) is a linear combination of the vectors (1, 1, 1), (0, 0, 3), (1, 1, 0), because

    (2, 2, −12) = 3(1, 1, 1) − 5(0, 0, 3) − 1(1, 1, 0).

Definition 2.3. If S is a non-empty subset of a vector space V, then the collection of all linear combinations of vectors from S is called the span of S, denoted Sp(S). By convention, if S is the empty set ∅, then we set Sp(S) = {0}. Note that we always have S ⊆ Sp(S). For a finite subset {v1, . . . , vk} of V, we sometimes write Sp(v1, . . . , vk) instead of Sp({v1, . . . , vk}). If S is a subset of V such that Sp(S) = V, then we say that S spans V, or that V is spanned by S.

Example 2.4. Let V = K[x], and let S = {x, 1 + x², x⁷}. Then

    Sp(S) = {α + βx + αx² + γx⁷ | α, β, γ ∈ K}.

Example 2.5. Let K = R. We'll show that

    Sp((−1, 2, 4), (0, 1, 1), (2, 3, 1)) = R³.

We have to show that any vector (x, y, z) ∈ R³ can be written as a linear combination of the given vectors.
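Before carrying out the hand computation, it can be reassuring to check the claim by machine. The sketch below is not part of the original notes: it is plain Python with exact rational arithmetic via `fractions.Fraction`, and the helper name `rank` is ours. It verifies that the matrix whose columns are the three given vectors has rank 3 (equivalent to the vectors spanning R³), and then spot-checks, at one sample point, the explicit solution found below by Gaussian elimination.

```python
from fractions import Fraction

def rank(rows):
    """Row-reduce a matrix of Fractions and count the pivots (= its rank)."""
    rows = [list(r) for r in rows]
    r = 0  # index of the next pivot row
    for c in range(len(rows[0])):
        # find a row at or below r with a nonzero entry in column c
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):  # clear column c in every other row
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# The vectors of Example 2.5 as the columns of a 3x3 matrix.
A = [[Fraction(-1), Fraction(0), Fraction(2)],
     [Fraction(2),  Fraction(1), Fraction(3)],
     [Fraction(4),  Fraction(1), Fraction(1)]]
print(rank(A))  # 3, so the columns span R^3

# Spot-check the solution derived in the text at (x, y, z) = (1, 2, 3).
x, y, z = Fraction(1), Fraction(2), Fraction(3)
alpha = x - y + z
beta = -5 * x + Fraction(9, 2) * y - Fraction(7, 2) * z
gamma = x - y / 2 + z / 2
v = [-alpha + 2 * gamma, 2 * alpha + beta + 3 * gamma, 4 * alpha + beta + gamma]
print(v == [x, y, z])  # True
```

Exact fractions are used instead of floats so that the rank computation cannot be fooled by rounding error.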
So we have to see if there are real numbers α, β, γ such that

    (x, y, z) = α(−1, 2, 4) + β(0, 1, 1) + γ(2, 3, 1) = (−α + 2γ, 2α + β + 3γ, 4α + β + γ).

That is, we must try to solve the three equations

    −α + 2γ = x
    2α + β + 3γ = y
    4α + β + γ = z.

Using Gaussian elimination we quickly find that there is a solution given by

    α = x − y + z,
    β = −5x + (9/2)y − (7/2)z,
    γ = x − (1/2)y + (1/2)z.

So any vector in R³ can be written as a linear combination of the given vectors, as we wanted.

2.2  Linear Independence

Definition 2.6. A subset S of a vector space V is said to be linearly dependent if there are distinct vectors v1, . . . , vn in S and scalars α1, . . . , αn, not all zero, such that

    α1v1 + · · · + αnvn = 0.

A set that is not linearly dependent is linearly independent.

Remark. Any set containing 0 must be linearly dependent, because 1 · 0 = 0 is a linear dependence. Any set consisting of a single nonzero vector v is linearly independent, because we have αv = 0 if and only if α = 0.

Remark. Given a finite subset S = {v1, . . . , vk} of a vector space V, we often want to decide whether S is linearly independent or not. To do this, we usually suppose there exist elements α1, . . . , αk ∈ K such that

    0 = α1v1 + · · · + αkvk.

Then S is linearly independent if and only if the only possibility is α1 = · · · = αk = 0.

For example, consider the vector space V = K[x] and the set S = {1 − x, 2 + x, 3 + x²}. Let α, β, γ ∈ K and write

    0 = α(1 − x) + β(2 + x) + γ(3 + x²) = (α + 2β + 3γ) + (β − α)x + γx².

For the x² term to disappear we must have γ = 0, and this leaves the two equations β − α = 0 and α + 2β = 0. The only solution to these two equations is α = β = 0, so we can conclude that S is a linearly independent subset of V.

We now show that if a vector space V is spanned by a finite set of vectors, then it is spanned by a finite linearly independent set.

Theorem 2.7. Suppose that the vector space V ≠ {0} is spanned by the set S = {v1, . . . , vk}.
Then there is a linearly independent subset of S that also spans V.

Proof. If S is linearly independent there is nothing to prove. Otherwise, there is a linear dependence between the elements of S; i.e., there exist α1, . . . , αk ∈ K, not all 0, such that

    0 = α1v1 + · · · + αkvk.

Let αi be the first nonzero coefficient. Then we can rearrange to get

    vi = −(1/αi) Σ_{j≠i} αjvj,

and we have written vi as a linear combination of the other vectors in S. Delete vi to give the subset S1 = {v1, . . . , vi−1, vi+1, . . . , vk}. Then S1 must also span V (why? Check this!). If S1 is linearly independent then we are done. If S1 is linearly dependent we can delete one of its vectors to obtain a smaller subset that still spans V. We continue deleting appropriate vectors and obtain successively smaller subsets S ⊃ S1 ⊃ S2 ⊃ · · · that each span V. This process must terminate in a linearly independent subset Sr which spans V after at most k − 1 steps, since after k − 1 steps we would reach a subset consisting of a single element, which is nonzero (it spans V ≠ {0}) and hence automatically linearly independent.

2.3  Bases and Dimension

Definition 2.8. A basis for a vector space V is a subset that spans V and is also linearly independent. By Theorem 2.7, every vector space that is spanned by a finite set of vectors has a finite basis. Such spaces are called finite-dimensional.

Remark. In a finite-dimensional vector space V with a basis {e1, . . . , ek}, it will often make a difference what order the vectors appear in the basis. To emphasize this, from now on we denote a finite basis by an ordered tuple (e1, . . . , ek). This allows us to distinguish between the bases (e1, e2, . . . , ek) and (e2, e1, . . . , ek), for example.

Lemma 2.9 (Steinitz Exchange Lemma). Suppose B = (e1, . . . , en) is a basis for the vector space V, and suppose v ∈ V is any nonzero vector. Then there exists 1 ≤ i ≤ n such that B′ = (e1, . . . , ei−1, ei+1, . . . , en, v) is a basis for V (i.e., we can exchange the vector v for ei and still have a basis).

Proof. Since B is a basis for V, we can write

    v = α1e1 + · · · + αnen    (∗)

for some αj ∈ K, 1 ≤ j ≤ n. Now, since v ≠ 0, not all αj = 0. Let αi be the first nonzero coefficient. Then we can write

    ei = (1/αi)v − Σ_{j≠i} (αj/αi)ej.

Now B′ = (e1, . . . , ei−1, ei+1, . . . , en, v) still spans V (any element of V can be written as a combination of the ej, and wherever ei appears it can be replaced with a linear combination of v and the remaining ej). To show linear independence, suppose

    0 = Σ_{j≠i} βjej + βv

for some β, βj ∈ K with 1 ≤ j ≤ n, j ≠ i. Then, substituting for v from equation (∗), we get

    0 = Σ_{j≠i} βjej + β(α1e1 + · · · + αnen) = Σ_{j≠i} (βj + βαj)ej + βαiei.

By linear independence of B, we conclude that βj + βαj = 0 for each j ≠ i and βαi = 0. Since αi ≠ 0 by our earlier choice, we must have β = 0, and then we get βj = 0 for all j ≠ i as well. Thus B′ is linearly independent.

We can now finally prove the crucial result which allows us to define the dimension of a vector space:

Theorem 2.10. Let V be a finite-dimensional vector space. Then every basis for V has the same size.

Proof. Since V is finite-dimensional, there exists a finite basis B = (e1, . . . , en). Let B′ = (v1, v2, . . .) be any other basis (possibly infinite). First suppose that the size of B′ is at least n. We proceed inductively, successively replacing elements of B by elements of B′ using the Steinitz Exchange Lemma 2.9. Suppose we have successfully performed k such exchanges; i.e., we have a new basis Bk = (ei1, . . . , ein−k, v1, . . . , vk), where the numbers i1 < i2 < · · · < in−k are elements of the set {1, . . . , n}. Then our next step, according to the procedure in the Steinitz Exchange Lemma, is to write

    vk+1 = α1ei1 + · · · + αn−kein−k + β1v1 + · · · + βkvk

and then find the first nonzero coefficient.
If αj = 0 for all 1 ≤ j ≤ n − k, then we have written vk+1 as a linear combination of v1, . . . , vk, contradicting the fact that B′ is a basis, and hence is linearly independent. Hence the first nonzero coefficient is αj for some 1 ≤ j ≤ n − k, and by the Steinitz Exchange Lemma we can replace eij with vk+1 to get a new basis

    Bk+1 = (ei1, . . . , eij−1, eij+1, . . . , ein−k, v1, . . . , vk, vk+1).

So, starting with B0 = B and performing n exchanges, we conclude that Bn = (v1, . . . , vn) is a basis for V. Now if B′ contains more than n elements, we can write vn+1 as a linear combination of v1, . . . , vn, because Bn is a basis. But then B′ is not linearly independent, which is a contradiction. Thus we conclude that B′ has precisely n elements.

Finally, we rule out the case that B′ has size strictly less than n. Suppose B′ = (v1, . . . , vk) and k < n. Then we can run the procedure above with the roles of B and B′ reversed, successively exchanging the vi for vectors of B. After k exchanges every vi has been replaced, and we conclude that some k of the vectors e1, . . . , en form a basis for V. But then the remaining n − k vectors of B are linear combinations of these k vectors, contradicting the fact that B is linearly independent. This completes the proof.

Theorem 2.10 leads immediately to the following definition:

Definition 2.11. The dimension dim V of a finite-dimensional vector space V is the number of vectors in any basis; it is an invariant of the vector space. By convention, we say that the dimension of the trivial vector space {0} is zero.

Examples 2.12. Many familiar vector spaces are finite-dimensional, and have a so-called standard basis.

(i) Let V = Kⁿ. For each 1 ≤ i ≤ n, let ei be the column vector with a 1 in the ith position and 0s elsewhere. Then (e1, . . . , en) is a basis for V, which therefore has dimension n (as expected!).

(ii) Let V = Mm×n(K). For each 1 ≤ i ≤ m and 1 ≤ j ≤ n, let Eij be the matrix with a 1 in the (i, j)-position and 0s elsewhere. Then (Eij)1≤i≤m,1≤j≤n is a basis for V, which has size mn, so V has dimension mn.

(iii) Let V = K[x].
Then V has an infinite basis (1, x, x², x³, . . .), so V is not finite-dimensional.

We now record some basic but useful facts about finite-dimensional spaces.

Lemma 2.13. Suppose V is a vector space with basis B = (e1, . . . , en), so dim V = n. Then the following are true:

(i) Any linearly independent set has size less than or equal to n. Hence, any subset containing more than n vectors is linearly dependent.

(ii) Any finite spanning set for V contains a basis. Hence no subset containing fewer than n vectors can span V.

(iii) Any linearly independent set containing n vectors is a basis for V. Similarly, any spanning set of size n is a basis for V.

(iv) Given v ∈ V, write v = α1e1 + · · · + αnen for scalars αi. Then the αi are uniquely determined.

Proof. (i) Suppose S is linearly independent and suppose, for contradiction, that S contains more than n elements. Then, by n successive applications of the Steinitz Exchange Lemma 2.9, as in the proof of Theorem 2.10, we can replace the elements of B by n elements from S, obtaining a new basis B′ which is a subset of S. But now the remaining elements of S can be written as linear combinations of elements in B′, because B′ is a basis. This contradicts the linear independence of S. Hence S must have size less than or equal to n.

(ii) Apply Theorem 2.7.

(iii) Suppose S is a linearly independent subset which does not span V. Then there exists nonzero v ∈ V which is not in the span of S. We now show that S ∪ {v} is still linearly independent. The set S is finite by part (i), say S = {v1, . . . , vr} with r ≤ n. Suppose that α1, . . . , αr, α ∈ K are such that

    α1v1 + · · · + αrvr + αv = 0.

If α ≠ 0, then we can rearrange to get v as a linear combination of the vi, which contradicts the choice of v. Hence α = 0, and then all αi = 0 as well, by the linear independence of S. Thus S ∪ {v} is linearly independent.
But S ∪ {v} has size one larger than S, so by (i) this construction is only possible if S has size strictly less than n. Hence any linearly independent subset of size n must span V. For the second part, any finite spanning set contains a basis by (ii). But every basis has size n by Theorem 2.10, so a spanning set of size n must already be a basis.

(iv) Suppose we also have β1, . . . , βn such that v = β1e1 + · · · + βnen. Then

    0 = v − v = (α1 − β1)e1 + · · · + (αn − βn)en.

By linear independence of B, we must have αi − βi = 0 for all i, and we are done.

The proof of part (iii) above shows how to construct a basis from a linearly independent subset in a finite-dimensional space: keep adding vectors outside the span of your subset until you reach the maximal size for a linearly independent subset. We record this as a theorem.

Theorem 2.14. In a finite-dimensional vector space V, any linearly independent set can be extended to a basis.
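The proof behind Theorem 2.14 is constructive, and the construction is easy to run in a concrete space such as Qⁿ. The sketch below is not part of the original notes: it is plain Python over Q (exact arithmetic via `fractions.Fraction`), and the helper names `independent` and `extend_to_basis` are ours. It extends a given independent list by repeatedly adjoining a vector outside the current span, exactly as in the proof of Lemma 2.13(iii); the candidates are drawn from the standard basis of Example 2.12(i), which spans Qⁿ.

```python
from fractions import Fraction

def independent(vectors):
    """Return True if the given vectors in Q^n are linearly independent,
    i.e. Gaussian elimination finds as many pivots as there are vectors."""
    rows = [[Fraction(a) for a in v] for v in vectors]
    r = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):  # eliminate column c below the pivot
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r == len(rows)

def extend_to_basis(S, n):
    """Extend a linearly independent list S of vectors in Q^n to a basis:
    keep adjoining a standard basis vector outside the current span
    (Lemma 2.13(iii)) until the list reaches size n (Theorem 2.10)."""
    basis = list(S)
    for i in range(n):
        if len(basis) == n:
            break
        e = [1 if j == i else 0 for j in range(n)]
        if independent(basis + [e]):  # e lies outside Sp(basis)
            basis.append(e)
    return basis

S = [[1, 1, 0], [0, 1, 1]]  # linearly independent in Q^3
B = extend_to_basis(S, 3)
print(B)  # [[1, 1, 0], [0, 1, 1], [1, 0, 0]]
```

Note that adjoining a vector only when it keeps the list independent is the same test as "v is not in the span of S": by the argument in part (iii) of Lemma 2.13, the enlarged list is independent exactly when the new vector lies outside the span of the old ones.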