Preface

The goal of the course MAA 101 Algebra is to introduce the theory of linear algebra, and to provide tools to solve systems of linear equations. We will define an abstract setting, the theory of vector spaces, which can be applied to very different mathematical objects. With these tools, one will be able to study linear problems. The notion of linearity will be rigorously defined in the course, and it occurs in very different areas of mathematics (systems of linear equations, sequences defined by a linear recurrence relation, linear differential equations...). Linear algebra is a very important topic in mathematics, for one often tries to reduce a complicated problem to a linear one.

A concrete way to represent linear transformations is the use of matrices, which is introduced in the course (and whose study will continue in an advanced algebra course). Matrices are in particular used when one considers functions defined on space, and are thus common in fields other than mathematics (especially physics and mechanics).

Contents

Preface

Preliminaries
  0.1 Vectors
    0.1.1 Plane vectors
    0.1.2 Space vectors
    0.1.3 Higher dimension
  0.2 Sequences and polynomials
    0.2.1 Sequences
    0.2.2 Polynomials
  0.3 Solving a system of linear equations
    0.3.1 Elimination of variables
    0.3.2 Linear combinations of equations
  0.4 Exercises

1 Vector spaces
  1.1 Definition and examples
    1.1.1 Definition of a vector space
    1.1.2 Standard vector spaces
    1.1.3 Exercises
  1.2 Vector subspaces
    1.2.1 Definition
    1.2.2 Operations on vector subspaces
    1.2.3 Direct sum
    1.2.4 Exercises
  1.3 Family of vectors
    1.3.1 Spanning family
    1.3.2 Linearly independent family
    1.3.3 Basis
    1.3.4 Exercises
  1.4 Dimension of a vector space
    1.4.1 Definition
    1.4.2 Dimension and families
    1.4.3 Dimension and vector subspaces
    1.4.4 Exercises

2 Linear maps and matrices
  2.1 Linear maps
    2.1.1 Definition and examples
    2.1.2 Linear maps and vector subspaces
    2.1.3 Construction of linear maps
    2.1.4 Case of finite dimension
    2.1.5 Rank and nullity
    2.1.6 Duality
    2.1.7 Exercises
  2.2 Matrices
    2.2.1 Definition
    2.2.2 Operations on matrices
    2.2.3 Invertible matrices
    2.2.4 Matrix of a linear map
    2.2.5 Change of coordinates matrix
    2.2.6 Exercises

3 Determinant and applications
  3.1 Determinant
    3.1.1 Invertible 2 × 2 matrices
    3.1.2 Definition of the determinant
    3.1.3 Properties
    3.1.4 The determinant of a linear map
    3.1.5 Computation of the determinant of a matrix
    3.1.6 Exercises
  3.2 Applications to systems of linear equations
    3.2.1 Definition and properties
    3.2.2 Cramer's rule
    3.2.3 Gaussian elimination
    3.2.4 Applications of Gaussian elimination
    3.2.5 Computation of the rank and the inverse of a matrix
    3.2.6 Exercises

Preliminaries

0.1 Vectors

0.1.1 Plane vectors

We will define the set of vectors in the real plane. We recall that $\mathbb{R}$ denotes the set of real numbers.

Definition 0.1.1. The set of plane vectors is denoted by $\mathbb{R}^2$ and is defined by
\[ \mathbb{R}^2 = \left\{ \begin{pmatrix} x \\ y \end{pmatrix},\ x \in \mathbb{R},\ y \in \mathbb{R} \right\} \]

Given two points $A, B$ in the plane, one can define the vector $\overrightarrow{AB}$. If $A$ has coordinates $\begin{pmatrix} x_A \\ y_A \end{pmatrix}$ and $B$ has coordinates $\begin{pmatrix} x_B \\ y_B \end{pmatrix}$, then
\[ \overrightarrow{AB} = \begin{pmatrix} x_B - x_A \\ y_B - y_A \end{pmatrix} \]

One can add two vectors.

Definition 0.1.2. Let $u_1 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$ and $u_2 = \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$ be elements of $\mathbb{R}^2$. One defines their sum as
\[ u_1 + u_2 = \begin{pmatrix} x_1 + x_2 \\ y_1 + y_2 \end{pmatrix} \]

This sum has the following geometric interpretation (Chasles' relation).

Proposition 0.1.3. Let $A, B, C$ be points in the real plane. Then
\[ \overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC} \]

Proof. Assume that the coordinates of $A, B, C$ are respectively $\begin{pmatrix} x_A \\ y_A \end{pmatrix}$, $\begin{pmatrix} x_B \\ y_B \end{pmatrix}$ and $\begin{pmatrix} x_C \\ y_C \end{pmatrix}$. Then
\[ \overrightarrow{AB} + \overrightarrow{BC} = \begin{pmatrix} x_B - x_A \\ y_B - y_A \end{pmatrix} + \begin{pmatrix} x_C - x_B \\ y_C - y_B \end{pmatrix} = \begin{pmatrix} (x_B - x_A) + (x_C - x_B) \\ (y_B - y_A) + (y_C - y_B) \end{pmatrix} = \begin{pmatrix} x_C - x_A \\ y_C - y_A \end{pmatrix} = \overrightarrow{AC} \]

One can also multiply a vector by a real.

Definition 0.1.4. Let $u = \begin{pmatrix} x \\ y \end{pmatrix}$ be a plane vector, and let $\lambda \in \mathbb{R}$. One defines the plane vector $\lambda \cdot u$ as
\[ \lambda \cdot u = \begin{pmatrix} \lambda x \\ \lambda y \end{pmatrix} \]

If $\lambda$ is rational, one can give a geometric interpretation to the multiplication of a vector by $\lambda$.

Exercise 0.1.5. Let $A, B$ be two points in the real plane, and let $M$ be the midpoint of the segment $[AB]$. Prove that
\[ \overrightarrow{AM} = \frac{1}{2} \cdot \overrightarrow{AB} \]
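These coordinatewise operations are easy to experiment with on a computer. The following short Python sketch (purely illustrative and not part of the course material; all names are ours) encodes plane vectors as pairs and checks Chasles' relation and the midpoint formula on a concrete choice of points.

```python
# Illustrative sketch: plane vectors as pairs (x, y).
def vec(a, b):
    """Vector from point a to point b (points given as coordinate pairs)."""
    return (b[0] - a[0], b[1] - a[1])

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(lam, u):
    return (lam * u[0], lam * u[1])

A, B, C = (1, 2), (4, -1), (0, 5)
# Chasles' relation: AB + BC = AC
assert add(vec(A, B), vec(B, C)) == vec(A, C)
# The midpoint M of [AB] satisfies AM = (1/2) * AB
M = add(A, scale(0.5, vec(A, B)))
assert vec(A, M) == scale(0.5, vec(A, B))
```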
0.1.2 Space vectors

One can consider points in space, and thus space vectors.

Definition 0.1.6. The set of space vectors is denoted by $\mathbb{R}^3$ and is defined by
\[ \mathbb{R}^3 = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix},\ x \in \mathbb{R},\ y \in \mathbb{R},\ z \in \mathbb{R} \right\} \]

The set of space vectors is related to the points in real space in the following way. Given two points $A, B$ in space, one can define the vector $\overrightarrow{AB}$. If $A$ has coordinates $\begin{pmatrix} x_A \\ y_A \\ z_A \end{pmatrix}$ and $B$ has coordinates $\begin{pmatrix} x_B \\ y_B \\ z_B \end{pmatrix}$, then
\[ \overrightarrow{AB} = \begin{pmatrix} x_B - x_A \\ y_B - y_A \\ z_B - z_A \end{pmatrix} \]

The properties of space vectors are analogous to those of plane vectors. There is also an addition, and a multiplication by a real.

Definition 0.1.7. Let $u_1 = \begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix}$ and $u_2 = \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix}$ be elements of $\mathbb{R}^3$. One defines their sum as
\[ u_1 + u_2 = \begin{pmatrix} x_1 + x_2 \\ y_1 + y_2 \\ z_1 + z_2 \end{pmatrix} \]
If $\lambda \in \mathbb{R}$, one defines
\[ \lambda \cdot u_1 = \begin{pmatrix} \lambda x_1 \\ \lambda y_1 \\ \lambda z_1 \end{pmatrix} \]

0.1.3 Higher dimension

A plane vector consists of exactly two coordinates, and a space vector of three. One can define a generalization of these objects in higher dimension. Let $n \geq 1$ be an integer.

Definition 0.1.8. We define the space $\mathbb{R}^n$ as
\[ \mathbb{R}^n = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},\ x_1, x_2, \dots, x_n \in \mathbb{R} \right\} \]
We will call the elements of $\mathbb{R}^n$ $n$-vectors.

One can still define the addition of two $n$-vectors, and multiply an $n$-vector by a real.

Definition 0.1.9. Let $u = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$ and $v = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$ be elements of $\mathbb{R}^n$. One defines their sum as
\[ u + v = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix} \]
If $\lambda \in \mathbb{R}$, one defines
\[ \lambda \cdot u = \begin{pmatrix} \lambda x_1 \\ \vdots \\ \lambda x_n \end{pmatrix} \]

Note that we have not defined the multiplication of two vectors.

A 1-vector is just a real, and the operations defined above are the usual addition and multiplication of reals. The 2-vectors are the plane vectors, and the 3-vectors are the space vectors. For arbitrary $n$, the $n$-vectors might not have a geometric interpretation. One can still remark that 4-vectors are used in physics: in relativity, one considers points in space-time. The coordinates of such a point are then $\begin{pmatrix} x \\ y \\ z \\ t \end{pmatrix}$, the first three coordinates referring to the position in space, and the last one to the time.

0.2 Sequences and polynomials

0.2.1 Sequences

If $n \geq 1$, an $n$-vector has a finite number of coordinates (equal to $n$). If one wants to allow an infinite number of coordinates, one is led to introduce the set of sequences.

Definition 0.2.1. The set of real sequences, denoted by $\mathbb{R}^{\mathbb{N}}$, is defined as the set of maps from $\mathbb{N}$ to $\mathbb{R}$.

Let $u \in \mathbb{R}^{\mathbb{N}}$; then $u$ is a map from $\mathbb{N}$ to $\mathbb{R}$, and is thus determined by its values $u(0), u(1), \dots$. Define $u_n := u(n)$ for all $n \in \mathbb{N}$; the sequence $u$ will be written as $u = (u_n)_{n \in \mathbb{N}}$.

Example 0.2.2. Let $a$ and $\lambda$ be reals. The sequence $v = (\lambda a^n)_{n \in \mathbb{N}}$ is called a geometric sequence.

The usual addition and multiplication by a real are possible for sequences. These operations are performed on each coordinate.

Definition 0.2.3. Let $u = (u_n)_{n \in \mathbb{N}}$ and $v = (v_n)_{n \in \mathbb{N}}$ be real sequences. Their sum is defined by
\[ u + v = (u_n + v_n)_{n \in \mathbb{N}} \]
If $\lambda \in \mathbb{R}$, then one defines
\[ \lambda \cdot u = (\lambda u_n)_{n \in \mathbb{N}} \]

0.2.2 Polynomials

A real sequence has an infinite number of coordinates (indexed by $\mathbb{N}$). An $n$-vector has a fixed number of coordinates. One can construct a set whose elements have a finite number of coordinates, but without prescribing that number in advance.

Definition 0.2.4. A real polynomial is a sequence $(u_n)_{n \in \mathbb{N}}$ such that there exists an integer $N \geq 0$ with $u_n = 0$ for all $n > N$.

Let $P = (u_n)_{n \in \mathbb{N}}$ be a real polynomial. By definition, there exists $N \geq 0$ such that the coordinates $u_n$ are all zero for $n > N$. We will write
\[ P = u_0 + u_1 X + \dots + u_N X^N \]
The letter $X$ is called an indeterminate. One may also write $P(X)$ for the polynomial $P$. The set of real polynomials will be denoted by $\mathbb{R}[X]$.
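Concretely, Definition 0.2.4 says that a real polynomial is a sequence with only finitely many nonzero coordinates. As an illustration (not part of the course material), the short Python sketch below stores a polynomial as the list $[u_0, u_1, \dots, u_N]$ of its coefficients and normalizes it by trimming trailing zeros.

```python
# Illustrative sketch: a polynomial as its coefficient list [u0, u1, ..., uN].
def trim(coeffs):
    """Drop trailing zeros, so that the zero polynomial becomes []."""
    c = list(coeffs)
    while c and c[-1] == 0:
        c.pop()
    return c

# P = 5X^3 + 2X - 1 is stored as the list of coefficients u0, u1, u2, u3:
P = trim([-1, 2, 0, 5])
assert P == [-1, 2, 0, 5]
assert trim([0, 0, 0]) == []   # the zero polynomial
```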
If $P$ is a real polynomial, and $x$ is a real, then one can substitute $x$ for $X$ and get a real $P(x)$. One can thus attach to $P$ a function from $\mathbb{R}$ to $\mathbb{R}$.

Definition 0.2.5. Let $P = a_0 + a_1 X + \dots + a_N X^N$ be a real polynomial. The polynomial function attached to it is the function $\mathbb{R} \to \mathbb{R}$ defined by
\[ x \mapsto a_0 + a_1 x + \dots + a_N x^N \]

We will still denote this polynomial function by $P$. This gives a more concrete (and more intuitive) description of the set of real polynomials: it is the subset of the real functions of the form $x \mapsto a_0 + a_1 x + \dots + a_N x^N$ for some integer $N \geq 0$ and some reals $a_0, a_1, \dots, a_N$.

A real polynomial has a finite number of coefficients. One can attach to it an invariant, called the degree.

Definition 0.2.6. Let $P = a_0 + a_1 X + \dots + a_N X^N \in \mathbb{R}[X]$. If $P = 0$, then we set by convention the degree of $P$ to be $-\infty$. If $P \neq 0$, the degree of $P$ is the largest integer $k$ such that $a_k \neq 0$.

The degree of a polynomial $P \in \mathbb{R}[X]$ will be written $\deg(P)$. A nonzero polynomial can thus be written in the form $a_0 + a_1 X + \dots + a_{\deg(P)} X^{\deg(P)}$ with $a_{\deg(P)} \neq 0$.

Example 0.2.7. The degree of $P = 5X^3 + 2X - 1$ is 3, and the degree of $Q = -X^7 + 3X^6 - 2X^2$ is 7.

From our definition, the set of real polynomials is a subset of the set of real sequences. We have defined the addition of two sequences, and the multiplication of a sequence by a real. These operations stabilize the set of polynomials.

Proposition 0.2.8. Let $P, Q \in \mathbb{R}[X]$, and $\lambda \in \mathbb{R}$. Then $P + Q$ and $\lambda \cdot P$ are in $\mathbb{R}[X]$. Moreover, $\deg(P + Q) \leq \max(\deg(P), \deg(Q))$, and $\deg(\lambda \cdot P) = \deg(P)$ if $\lambda \neq 0$.

Proof. We view $P, Q$ as sequences $P = (a_n)_{n \in \mathbb{N}}$ and $Q = (b_n)_{n \in \mathbb{N}}$. The result is true if $P = 0$ or $Q = 0$, therefore we can assume $P \neq 0$ and $Q \neq 0$. Let $N = \deg(P)$ and $M = \deg(Q)$. Then $a_n = 0$ for all $n \geq N + 1$, and $b_n = 0$ for all $n \geq M + 1$.

One has $P + Q = (a_n + b_n)_{n \in \mathbb{N}}$, and $a_n + b_n = 0$ for all $n \geq \max(N, M) + 1$. This proves that $P + Q \in \mathbb{R}[X]$, and that $\deg(P + Q) \leq \max(N, M) = \max(\deg(P), \deg(Q))$.

Moreover, $\lambda \cdot P = (\lambda a_n)_{n \in \mathbb{N}}$, and $\lambda a_n = 0$ for all $n \geq N + 1$. This proves that $\lambda \cdot P \in \mathbb{R}[X]$, and $\deg(\lambda \cdot P) \leq N$. If $\lambda \neq 0$, then $\lambda a_N \neq 0$ ($a_N$ is nonzero since $N$ is the degree of $P$). In this case, one has $\deg(\lambda \cdot P) = N = \deg(P)$.

Remark 0.2.9. The inequality $\deg(P + Q) \leq \max(\deg(P), \deg(Q))$ might not be an equality. For example, if $P = X^2 + X$ and $Q = -X^2 + 1$, then $P + Q = X + 1$, and $\deg(P + Q) = 1$, but $\deg(P) = \deg(Q) = 2$.

These operations are just the usual ones on polynomial functions. One can also multiply two polynomials.

Definition 0.2.10. Let $P = (a_n)_{n \in \mathbb{N}}$ and $Q = (b_n)_{n \in \mathbb{N}}$ be real polynomials. We define $P \cdot Q$ as the sequence $(c_n)_{n \in \mathbb{N}}$ with
\[ c_n = a_0 b_n + a_1 b_{n-1} + \dots + a_n b_0 \]

This formula corresponds to the usual multiplication of polynomial functions; one has $X^i \cdot X^j = X^{i+j}$ for all non-negative integers $i, j$, and the usual properties of multiplication remain true.

Exercise 0.2.11. Compute $(X^4 - X^2) \cdot (X^3 + 2X)$.

Proposition 0.2.12. If $P, Q \in \mathbb{R}[X]$, then $P \cdot Q \in \mathbb{R}[X]$. Moreover
\[ \deg(P \cdot Q) = \deg(P) + \deg(Q) \]

Proof. The result is easy if $P = 0$ or $Q = 0$, therefore we can assume $P \neq 0$ and $Q \neq 0$. Let $N = \deg(P)$ and $M = \deg(Q)$, and write $P = (a_n)_{n \in \mathbb{N}}$ and $Q = (b_n)_{n \in \mathbb{N}}$. The polynomial $P \cdot Q$ has coefficients $(c_n)_{n \in \mathbb{N}}$, with
\[ c_n = \sum_{k=0}^{n} a_k b_{n-k} \]
for all $n \geq 0$. Note that the quantity $a_k b_{n-k}$ is zero unless $k \leq N$ and $n - k \leq M$. Hence if $a_k b_{n-k}$ is nonzero, then $n = k + (n - k) \leq N + M$. This proves that $c_n = 0$ for all $n \geq N + M + 1$. One concludes that $P \cdot Q \in \mathbb{R}[X]$ and that $\deg(P \cdot Q) \leq N + M$. Moreover, $c_{N+M} = a_N b_M \neq 0$, and one gets
\[ \deg(P \cdot Q) = N + M = \deg(P) + \deg(Q) \]
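The coefficient formula of Definition 0.2.10 is a convolution of the two coefficient lists, which is straightforward to implement. Here is a minimal Python sketch (illustrative only, reusing the coefficient-list representation suggested above); the final assertion checks the degree formula of Proposition 0.2.12 on a small example.

```python
# Illustrative sketch: polynomial multiplication via the convolution
# formula c_n = a_0*b_n + a_1*b_(n-1) + ... + a_n*b_0.
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists [a0, a1, ...]."""
    if not a or not b:
        return []                       # the zero polynomial
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# deg(P*Q) = deg(P) + deg(Q): with P = 5X^3 + 2X - 1 and Q = X^2 + 1,
P, Q = [-1, 2, 0, 5], [1, 0, 1]
R = poly_mul(P, Q)                      # -> [-1, 2, -1, 7, 0, 5]
assert len(R) - 1 == (len(P) - 1) + (len(Q) - 1)   # degree 3 + 2 = 5
```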
One can finally define the composition of two polynomials.

Definition 0.2.13. Let $P = a_0 + a_1 X + \dots + a_n X^n$ be a polynomial, and let $Q$ be another polynomial. We define the polynomial $P(Q(X))$ by
\[ P(Q(X)) = a_0 + a_1 Q(X) + \dots + a_n Q(X)^n \]

We will mostly be interested in the case where $Q$ has degree 1 (i.e. we consider $P(aX + b)$, where $a$ is nonzero).

Since we are able to multiply two polynomials, one has the notion of divisibility. We say that a polynomial $P$ is divisible by a polynomial $Q$ if there exists a polynomial $R$ such that $P = Q \cdot R$. If $Q$ is of the form $X - a$, one has the following characterization.

Proposition 0.2.14. Let $P \in \mathbb{R}[X]$, and $a \in \mathbb{R}$. Then $P$ is divisible by $X - a$ if and only if $P(a) = 0$. If this is the case, one says that $a$ is a root of $P$.

Proof. If $P$ is divisible by $X - a$, there exists $Q \in \mathbb{R}[X]$ such that $P = (X - a)Q$. Then $P(a) = (a - a)Q(a) = 0$.

Conversely, assume that $P(a) = 0$. Let $Q$ be the polynomial defined by $Q(X) = P(X + a)$. Thus $Q(0) = P(a) = 0$. There exist reals $b_0, \dots, b_n$ such that
\[ Q = b_0 + b_1 X + \dots + b_n X^n \]
Moreover, $Q(0) = b_0 = 0$, hence
\[ Q = X(b_1 + \dots + b_n X^{n-1}) \]
Since $P(x) = Q(x - a)$ for all $x \in \mathbb{R}$, one gets
\[ P = (X - a)(b_1 + \dots + b_n (X - a)^{n-1}) \]
and $P$ is divisible by $X - a$.

0.3 Solving a system of linear equations

One of the goals of this course is to develop techniques to solve systems of linear equations in a very general setting. In this section, we will start by studying simple examples and introducing some elementary techniques. As an example, let us consider the following system of equations:
\[ \begin{cases} 2x + 3y = 5 \\ 7x + 11y = -3 \end{cases} \]

0.3.1 Elimination of variables

The most intuitive way to solve the previous system is to eliminate one variable. Indeed, one has two equations to solve, and two unknowns ($x$ and $y$). One can use the first equation to express the unknown $x$ in terms of $y$. This gives the relation
\[ x = \frac{5 - 3y}{2} \]
One can then substitute this value into the second equation. One gets the equation
\[ 7 \cdot \frac{5 - 3y}{2} + 11y = -3 \]
One now has one equation in one variable. Simplifying the equation, one has
\[ \frac{y}{2} = -\frac{41}{2} \]
One can solve this equation, and find the value of the unknown $y$. One finds $y = -41$. Once we have the value of $y$, one goes back to the expression of $x$ in terms of $y$ to obtain the value of $x$. One gets
\[ x = \frac{5 - 3y}{2} = 64 \]
One then checks that $(x, y) = (64, -41)$ is a solution of the considered system.

This method can be adapted to more general systems of equations: if one has $p$ equations in $n$ variables, then one solves the first equation, and obtains the expression of one variable in terms of the others. One substitutes this value into the other equations. This variable has thus been eliminated, and one is left with $p - 1$ equations in only $n - 1$ variables. One can then repeat this process, and eliminate another variable.

Important remark: In the previous discussion, to solve the system of equations we have used implications. It means that we have assumed the existence of a solution $(x, y)$, and derived information on $x$ and $y$. After solving the system, we have obtained a unique candidate value for the pair $(x, y)$. But since we have used implications, these values may not be a solution of the original system of equations. That is why one needs to check that the candidate solution we got is actually a solution.
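This verification step is purely mechanical. As an illustration (not part of the course material), the following Python lines substitute the candidate values into both original equations using exact arithmetic.

```python
# Illustrative sketch: verify the candidate solution of
#   2x + 3y = 5
#   7x + 11y = -3
from fractions import Fraction

x, y = Fraction(64), Fraction(-41)
assert 2 * x + 3 * y == 5
assert 7 * x + 11 * y == -3   # both hold, so (64, -41) is indeed a solution
```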
Alternatively, one can reason by equivalences. Two systems are equivalent if the set of solutions of the first one is the same as that of the second one. The goal is then to transform the original system into simpler systems, so that at the end one has the explicit solution, all the systems being equivalent. For this example, one can write the justification this way:
\[ \begin{cases} 2x + 3y = 5 \\ 7x + 11y = -3 \end{cases} \ \Leftrightarrow\ \begin{cases} x = \frac{5-3y}{2} \\ 7x + 11y = -3 \end{cases} \ \Leftrightarrow\ \begin{cases} x = \frac{5-3y}{2} \\ 7 \cdot \frac{5-3y}{2} + 11y = -3 \end{cases} \ \Leftrightarrow\ \begin{cases} x = \frac{5-3y}{2} \\ y = -41 \end{cases} \ \Leftrightarrow\ \begin{cases} x = 64 \\ y = -41 \end{cases} \]
The use of equivalences guarantees that $(x, y) = (64, -41)$ is the only solution of the system of equations.

0.3.2 Linear combinations of equations

The elimination of variables is an intuitive technique, but a drawback is the number of computations needed to obtain the value of one variable. Another technique consists in forming linear combinations of equations. For example, let us consider our system again:
\[ \begin{cases} 2x + 3y = 5 & (1) \\ 7x + 11y = -3 & (2) \end{cases} \]
One can look at $11 \cdot (1) - 3 \cdot (2)$. This gives the equation
\[ x = 64 \]
and one has directly the value of $x$! Similarly, considering $-7 \cdot (1) + 2 \cdot (2)$, one gets
\[ y = -41 \]
This process lets us find the value of one particular variable very quickly, and gives the solution of the considered system with only two linear combinations to compute. As before, since only implications have been used, one needs to check that the values found for $x$ and $y$ are indeed a solution of the system.

In general, the linear combination technique can be used, starting with two equations in $n$ variables, to end up with one equation in $n - 1$ variables. It can thus be seen as a variant of the elimination of variables technique, but without explicitly expressing one variable in terms of the others.

Let us consider the following system of equations:
\[ \begin{cases} -2x + 3y + z = 3 & (3) \\ -4x + 2y - 2z = 2 & (4) \\ 3x + y - z = 6 & (5) \end{cases} \]
One will use the first equation to eliminate the variable $z$, using linear combinations. One replaces the equation (4) by $(4) + 2 \cdot (3)$, and the equation (5) by $(5) + (3)$. The above system is thus equivalent to
\[ \begin{cases} -2x + 3y + z = 3 & (6) \\ -8x + 8y = 8 & (7) \\ x + 4y = 9 & (8) \end{cases} \]
Now we concentrate on the last two equations: this is a system of two equations in two variables. Forming the linear combinations $(7) - 2 \cdot (8)$ and $(7) + 8 \cdot (8)$, one gets that the considered system is equivalent to
\[ \begin{cases} -2x + 3y + z = 3 \\ -10x = -10 \\ 40y = 80 \end{cases} \]
One then easily solves these equations, and one gets a unique solution $(x, y, z) = (1, 2, -1)$. Since we have used equivalences to solve the system, there is no need to check that this is actually a solution.
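The eliminate-then-back-substitute strategy is systematic enough to be coded directly; it is a special case of the Gaussian elimination studied in Chapter 3. The Python sketch below (illustrative only; it assumes the pivots encountered are nonzero) solves the 3 × 3 system above with exact rational arithmetic.

```python
# Illustrative sketch: solve a square system by elimination and
# back-substitution, with exact rationals (assumes nonzero pivots).
from fractions import Fraction

def solve(A, b):
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for i in range(n):                  # eliminate column i below row i
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            M[r] = [a - f * p for a, p in zip(M[r], M[i])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):        # back-substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# The system (3)-(5) above:
print(solve([[-2, 3, 1], [-4, 2, -2], [3, 1, -1]], [3, 2, 6]))
# -> [Fraction(1, 1), Fraction(2, 1), Fraction(-1, 1)], i.e. (1, 2, -1)
```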
0.4 Exercises

Exercise 0.4.1. Let $a \in \mathbb{R}$. Solve the following system of equations.
\[ \begin{cases} 2x + y = 1 \\ ax - 5y = -5 \end{cases} \]

Exercise 0.4.2. Let $a \in \mathbb{R}$. Solve the following system of equations.
\[ \begin{cases} x + 2y - 3z = 1 \\ -x + 7y + z = 2 \\ 3x - 3y - 7z = a \end{cases} \]

Exercise 0.4.3. Prove that there exists a unique real polynomial $P$ of degree $d \leq 2$ such that $P(1) = 2$, $P(2) = 3$ and $P(3) = 1$.

Exercise 0.4.4. Let $P, Q$ be nonzero polynomials with $\deg(Q) \geq 1$. Prove that
\[ \deg(P(Q(X))) = \deg(P) \cdot \deg(Q) \]

Exercise 0.4.5. Let $P, Q$ be nonzero polynomials with $\deg(Q) < \deg(P)$. Prove that
\[ \deg(P + Q) = \deg(P) \]

Exercise 0.4.6. Let $a, b \in \mathbb{R}$, and let $(u_n)_{n \geq 0}$ be the sequence defined by $u_0 = a$, $u_1 = b$, and $u_{n+2} = 3u_{n+1} - 2u_n$ for every $n \geq 0$. We want to prove that there exist reals $\alpha, \beta$ such that $u_n = \alpha + \beta \cdot 2^n$ for all $n \geq 0$.
1. Prove that there exists at most one possibility for $\alpha$ and $\beta$, and give their expression in terms of $a$ and $b$.
2. Prove the desired result by induction.

Exercise 0.4.7. Let $a, b, c, d \in \mathbb{R}$. On what condition does the following system of equations have a solution? (One may first separate the cases $a = 1$ and $a \neq 1$.)
\[ \begin{cases} ax + y + z = b \\ x + ay + z = c \\ x + y + az = d \end{cases} \]

Exercise 0.4.8. Let $a, b \in \mathbb{R}$. Solve the following system of equations.
\[ \begin{cases} x + 2y - z = 3 \\ ax + 2y - z = 7 \\ -x - 2y + bz = -3 \end{cases} \]

Chapter 1. Vector spaces

In the preliminary chapter, we have encountered different sets enjoying similar structures. Indeed, whether we consider the set of $n$-vectors, real sequences, or real polynomials, one can add two elements, and multiply an element by a real. The goal of this chapter is to give a common framework for these sets, and to develop general tools which can be applied to each of them.

The sets of $n$-vectors, sequences and polynomials have been defined over the set of real numbers $\mathbb{R}$. It is possible to define them over the set of complex numbers $\mathbb{C}$: the set of complex sequences consists of sequences whose elements are complex numbers, the set of complex polynomials consists of polynomials whose coefficients are complex numbers, and so on. We do not want to make a distinction between these two cases, since every tool and result developed in this course can be applied to either of them. For this reason, we will use throughout the course the letter $K$, which can be either $\mathbb{R}$ or $\mathbb{C}$.¹ This will allow us to treat simultaneously the case where the coefficients are real or complex numbers.

¹ More generally, one will only need that $K$ is a field in this course. This notion will be defined in another algebra course. The important property of a field is that one can define the inverse of a nonzero element. For example, $\mathbb{Q}$ is a field, but $\mathbb{Z}$ is not.

1.1 Definition and examples

1.1.1 Definition of a vector space

In this section one wants to define a common structure, which can be applied to $n$-vectors, sequences or polynomials. The first step is to define an addition.

Definition 1.1.1. Let $E$ be a set. An addition '+' on $E$ is a map
\[ E \times E \to E, \quad (x, y) \mapsto x + y \]
such that
- $x + y = y + x$ for all $x, y \in E$
- $x + (y + z) = (x + y) + z$ for all $x, y, z \in E$.

Example 1.1.2. The sets $\mathbb{R}$ and $\mathbb{C}$ with the usual addition.

Remark 1.1.3. The first property is called commutativity, and the second one associativity. They state that when one wants to add several elements, the order in which one performs the operations does not matter.

For the $n$-vectors, one has the zero vector, as well as the opposite of a vector. These can also be defined for sequences and polynomials. Let us give a precise definition.

Definition 1.1.4. Let $E$ be a set with an addition denoted '+'. We say that $(E, +)$ is a commutative group if
- There exists an element $0 \in E$ such that $0 + x = x$ for all $x \in E$.
- For all $x \in E$, there exists an element $-x \in E$ such that $x + (-x) = 0$.
The element $-x$ is called the opposite of $x$. It is actually unique, as is proved by the following proposition.

Proposition 1.1.5. Let $(E, +)$ be a commutative group, and $x \in E$. There exists a unique element $y$ such that $x + y = 0$.

Proof. The existence follows from the properties of a commutative group. To prove the uniqueness, assume that there exist $y, z \in E$ such that $x + y = x + z = 0$. Then $z + (x + y) = z + 0 = z$. But on the other hand
\[ z + (x + y) = (z + x) + y = (x + z) + y = 0 + y = y \]
One concludes that $y = z$.

We have given a precise definition for the addition of two elements. Now we want to be able to multiply elements by real numbers (or complex numbers). Recall that $K$ can refer to either $\mathbb{R}$ or $\mathbb{C}$.

Definition 1.1.6. Let $E$ be a set.
An external multiplication '·' on $E$ is a map
\[ K \times E \to E, \quad (\lambda, x) \mapsto \lambda \cdot x \]

Of course, if $E$ is a set with an addition and an external multiplication, one would like some compatibility between these two laws. One is then led to the definition of a vector space.

Definition 1.1.7. Let $E$ be a set with an addition '+' and an external multiplication '·'. Then $(E, +, \cdot)$ is a $K$-vector space if
- $(E, +)$ is a commutative group
- $\lambda \cdot (x + y) = \lambda \cdot x + \lambda \cdot y$
- $(\lambda + \mu) \cdot x = \lambda \cdot x + \mu \cdot x$
- $(\lambda\mu) \cdot x = \lambda \cdot (\mu \cdot x)$
- $1 \cdot x = x$.

If the context is clear, one might just call $E$ a vector space. The elements of $E$ will be called vectors, and the elements of $K$ scalars. From the axioms satisfied by a vector space, one can deduce the following equalities.

Proposition 1.1.8. For all $\lambda \in K$ and $x \in E$, we have
- $0 \cdot x = 0$
- $\lambda \cdot 0 = 0$
- $(-1) \cdot x = -x$

Exercise 1.1.9. Prove the previous proposition.

One also has the following property.

Proposition 1.1.10. Let $\lambda \in K$ and $x \in E$. If $\lambda \cdot x = 0$, then $\lambda = 0$ or $x = 0$.

Proof. If $\lambda \cdot x = 0$ and $\lambda \neq 0$, then
\[ x = 1 \cdot x = (\lambda^{-1}\lambda) \cdot x = \lambda^{-1} \cdot (\lambda \cdot x) = \lambda^{-1} \cdot 0 = 0 \]

1.1.2 Standard vector spaces

In this section, we will define the standard vector spaces; these will mostly be the examples given in the preliminary chapter. Let us start with a very basic (yet important) vector space.

Definition 1.1.11. The zero vector space is defined as $\{0\}$. The addition and external multiplication are easily defined, and one checks that all the axioms of a vector space are satisfied.

The following vector space is the most important of the course.

Definition 1.1.12. Let $n \geq 1$, and define
\[ K^n = \left\{ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},\ x_1, \dots, x_n \in K \right\} \]
The zero element is $\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}$. The addition of two vectors is defined by
\[ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} x'_1 \\ \vdots \\ x'_n \end{pmatrix} = \begin{pmatrix} x_1 + x'_1 \\ \vdots \\ x_n + x'_n \end{pmatrix} \]
The opposite of a vector is defined by
\[ -\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} -x_1 \\ \vdots \\ -x_n \end{pmatrix} \]
Finally, the external multiplication is defined by
\[ \lambda \cdot \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \lambda x_1 \\ \vdots \\ \lambda x_n \end{pmatrix} \]
With these laws, the set $K^n$ is a vector space.

Remark 1.1.13. If $n = 1$, the vector space $K^n$ is just the set $K$, with the usual addition and multiplication.

Definition 1.1.14. The set of sequences is defined as
\[ S = \{ (x_n)_{n \geq 0},\ x_n \in K \text{ for all } n \geq 0 \} \]
The zero element of $S$ is the constant sequence equal to 0, and one can define the addition of two sequences, the opposite of a sequence and the multiplication of a sequence by a scalar in a similar way as for $K^n$. With these laws, $S$ is a vector space.

Definition 1.1.15. The set of polynomials with coefficients in $K$ is defined as
\[ K[X] = \{ a_0 + a_1 X + \dots + a_n X^n,\ n \geq 0,\ a_i \in K \} \]
This is again a vector space, with the laws defined in the preliminary chapter.

Definition 1.1.16. Let $X$ be a non-empty set, and let $\mathcal{F}(X) = \{ f : X \to K \}$ be the set of functions from $X$ to $K$. With the obvious addition of functions and multiplication by a scalar, this is a vector space.

Note that no assumption has been made on the set $X$. The laws on $\mathcal{F}(X)$ come from the ones on $K$. Actually, one can consider functions from a set $X$ to any vector space.

Proposition 1.1.17. Let $X$ be a non-empty set, and $E$ be a vector space. Let $\mathcal{F}(X, E) = \{ f : X \to E \}$ be the set of functions from $X$ to $E$. Then $\mathcal{F}(X, E)$ is a vector space.

Proof. The zero element is defined as the zero function, i.e. the function $f$ defined by $f(x) = 0$ for all $x \in X$ (here 0 is the zero element of $E$). If $f, g \in \mathcal{F}(X, E)$, we define $f + g$ by $(f + g)(x) = f(x) + g(x)$ for all $x \in X$. The opposite of a function $f$ is the function defined by $(-f)(x) = -f(x)$ for any $x \in X$.
Finally, if $f \in \mathcal{F}(X, E)$ and $\lambda \in K$, one defines the external multiplication by $(\lambda \cdot f)(x) = \lambda \cdot f(x)$ for all $x \in X$. One then checks the axioms of a vector space.

1.1.3 Exercises

Exercise 1.1.1. Let $(E, +, \cdot)$ be a $K$-vector space. Prove the following relations.
1. $0 \cdot x = 0$, for all $x \in E$.
2. $\lambda \cdot 0 = 0$, for all $\lambda \in K$.
3. $(-1) \cdot x = -x$, for all $x \in E$.

Exercise 1.1.2. Let $G = (-1, 1) = \{ x \in \mathbb{R},\ -1 < x < 1 \}$, and define
\[ x \star y = \frac{x + y}{1 + xy} \]
for $x, y \in G$.
1. Check that $x \star y$ is well defined for $x, y \in G$, and that it is also an element of $G$.
2. Prove that $\star$ defines an addition on $G$.
3. Prove that $(G, \star)$ is a commutative group.

Exercise 1.1.3. Let $(E, +, \cdot)$ be a $K$-vector space. Prove that the relation $x + y = y + x$ for all $x, y \in E$ is a consequence of the other axioms of a vector space and the relations
\[ 0 + x = x + 0 = x, \quad x + (-x) = (-x) + x = 0 \]
for all $x \in E$.

Exercise 1.1.4. Define explicitly a structure of $\mathbb{R}$-vector space on the set $\mathbb{R}_{>0} = \{ x \in \mathbb{R},\ x > 0 \}$. (Hint: use multiplication instead of addition.)

1.2 Vector subspaces

The definition of a vector space is quite involved: one has to define the zero element, addition, opposite and external multiplication, and one should then check all the axioms of a vector space. Even though all this is quite routine in practice, it still takes a lot of time to do everything rigorously. Luckily, we have defined the standard vector spaces ($K^n$, the set of sequences, the set of polynomials and the set of functions). In practice, we will only be concerned with subsets of these standard vector spaces. We then need a criterion to decide when a subset of a vector space is itself a vector space. This leads to the notion of vector subspaces.

1.2.1 Definition

Let $E$ be a $K$-vector space, and let $F \subseteq E$ be a subset of $E$.

Definition 1.2.1. We say that $F$ is a vector subspace of $E$ if
- $0 \in F$
- for all $x, y \in F$, and all $\lambda, \mu \in K$, $\lambda \cdot x + \mu \cdot y \in F$.

Let $F$ be a vector subspace of $E$. Then for $x, y \in F$, $1 \cdot x + 1 \cdot y = x + y \in F$. If $x \in F$ and $\lambda \in K$, then $\lambda \cdot x + 0 \cdot 0 = \lambda \cdot x \in F$. One thus sees that $F$ is stable under addition and external multiplication.

Proposition 1.2.2. If $F$ is a vector subspace of $E$, then one can define an addition and an external multiplication on $F$. With these laws, $F$ is itself a vector space.

Proof. We define the zero element of $F$ to be 0 (which is in $F$ by the definition of a vector subspace). Let $x, y \in F$ and $\lambda \in K$. Since $x + y \in F$ by the previous remark, this defines an addition on $F$; moreover $-x = (-1) \cdot x$ is also in $F$, and this defines the opposite of an element. Finally, $\lambda \cdot x \in F$, and this defines the external multiplication. One has to check the axioms of a vector space for $F$. But since each axiom is satisfied for every vector of $E$, it is in particular satisfied for every vector of $F$.

Example 1.2.3. If $E$ is a vector space, then $\{0\}$ and $E$ are vector subspaces of $E$.

Example 1.2.4. Let $E = K^3$, and define
\[ F = \left\{ \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} \in K^3,\ x, y \in K \right\} \]
The zero vector is in $F$. If $u = \begin{pmatrix} x \\ y \\ 0 \end{pmatrix}$ and $v = \begin{pmatrix} x' \\ y' \\ 0 \end{pmatrix}$ are in $F$, then for all $\lambda, \mu \in K$
\[ \lambda \cdot u + \mu \cdot v = \begin{pmatrix} \lambda x + \mu x' \\ \lambda y + \mu y' \\ 0 \end{pmatrix} \in F \]
This proves that $F$ is a vector subspace of $E$.

Example 1.2.5. Let
\[ F = \{ P \in K[X],\ P(0) = 0 \} \]
Then $F$ is a vector subspace of $K[X]$:
- $0 \in F$
- if $P, Q \in F$ and $\lambda, \mu \in K$, then
\[ (\lambda P + \mu Q)(0) = \lambda P(0) + \mu Q(0) = 0 \]
so $\lambda P + \mu Q \in F$.
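The two conditions of Definition 1.2.1 can at least be probed numerically before writing a proof. The following Python sketch (illustrative only; a sanity check on random samples, not a proof) tests the criterion for the subspace $F$ of Example 1.2.4.

```python
# Illustrative sketch: probe the subspace criterion for
# F = {(x, y, 0)} inside R^3 (a sanity check, not a proof).
import random

def in_F(v):
    return v[2] == 0

def lin_comb(lam, u, mu, v):
    return tuple(lam * a + mu * b for a, b in zip(u, v))

assert in_F((0, 0, 0))                     # 0 is in F
for _ in range(1000):
    u = (random.uniform(-9, 9), random.uniform(-9, 9), 0)
    v = (random.uniform(-9, 9), random.uniform(-9, 9), 0)
    lam, mu = random.uniform(-9, 9), random.uniform(-9, 9)
    assert in_F(lin_comb(lam, u, mu, v))   # closed under linear combinations
```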
A vector subspace is in general defined by linear equations. We will give a precise meaning to this term in the next chapter, but it roughly means that the equations defining the subset contain no squares or terms of higher order (nor any other functions such as the cosine, the exponential...).

Example 1.2.6. Let $E = K^2$, and define
\[ F = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \in K^2,\ x^2 = y \right\} \]
Then $u = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \in F$, but $u + u = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$ is not in $F$. This proves that $F$ is not a vector subspace of $E$.

Example 1.2.7. Let $K[X]_n \subset K[X]$ be the subset of polynomials $P$ with $\deg(P) \leq n$. Then $K[X]_n$ is a vector subspace of $K[X]$.

1.2.2 Operations on vector subspaces

One can define the product of two vector spaces. Recall that for two sets $E_1, E_2$, their product is defined as
\[ E_1 \times E_2 = \{ (x, y),\ x \in E_1,\ y \in E_2 \} \]

Proposition 1.2.8. One can define an addition, opposite, and external multiplication on $E_1 \times E_2$ which make it a vector space.

Proof. The zero element is obviously $(0, 0)$, where the first entry is the zero element of $E_1$, and the second one the zero element of $E_2$. The addition, opposite and external multiplication are defined by
- $(x, y) + (x', y') = (x + x', y + y')$
- $-(x, y) = (-x, -y)$
- $\lambda \cdot (x, y) = (\lambda \cdot x, \lambda \cdot y)$
One then checks all the axioms of a vector space.

One can form the intersection of two vector subspaces.

Proposition 1.2.9. Let $E$ be a vector space, and let $F_1, F_2$ be vector subspaces of $E$. Then $F_1 \cap F_2$ is a vector subspace of $E$.

Recall that
\[ F_1 \cap F_2 = \{ x \in E,\ x \in F_1 \text{ and } x \in F_2 \} \]

Proof. Since the zero element is in $F_1$ and $F_2$, one has $0 \in F_1 \cap F_2$. If $x, y \in F_1 \cap F_2$ and $\lambda, \mu \in K$, then
\[ \lambda \cdot x + \mu \cdot y \in F_1 \quad \text{and} \quad \lambda \cdot x + \mu \cdot y \in F_2 \]
and then $\lambda \cdot x + \mu \cdot y \in F_1 \cap F_2$. This concludes the proof.

The union of two vector subspaces is not a vector subspace in general (see Exercise 1.2.3). Instead, one can define the sum of two vector subspaces.

Definition 1.2.10. Let $E$ be a vector space, and let $F_1, F_2$ be vector subspaces of $E$. The sum of $F_1$ and $F_2$ is defined as
\[ F_1 + F_2 = \{ x_1 + x_2,\ x_1 \in F_1,\ x_2 \in F_2 \} \]

Proposition 1.2.11. Let $E$ be a vector space, and let $F_1, F_2$ be vector subspaces of $E$. Then $F_1 + F_2$ is a vector subspace of $E$.

Proof. We have $0 = 0 + 0 \in F_1 + F_2$. If $u, v \in F_1 + F_2$, then there exist $x_1, x'_1 \in F_1$ and $x_2, x'_2 \in F_2$ such that $u = x_1 + x_2$ and $v = x'_1 + x'_2$. Then for any $\lambda, \mu \in K$
\[ \lambda u + \mu v = \lambda(x_1 + x_2) + \mu(x'_1 + x'_2) = (\lambda x_1 + \mu x'_1) + (\lambda x_2 + \mu x'_2) \]
and $\lambda u + \mu v \in F_1 + F_2$.

Example 1.2.12. Let $E = K^3$, and let
\[ F_1 = \left\{ \begin{pmatrix} x \\ y \\ 0 \end{pmatrix} \in K^3,\ x, y \in K \right\} \quad F_2 = \left\{ \begin{pmatrix} 0 \\ 0 \\ z \end{pmatrix} \in K^3,\ z \in K \right\} \quad F'_2 = \left\{ \begin{pmatrix} 0 \\ y \\ 0 \end{pmatrix} \in K^3,\ y \in K \right\} \]
Then
- $F_1 \cap F_2 = \{0\}$
- $F_1 \cap F'_2 = F'_2$
- $F_1 + F_2 = K^3$
- $F_1 + F'_2 = F_1$

Remark 1.2.13. If $E$ is a vector space, $k \geq 1$ and $F_1, \dots, F_k$ are vector subspaces, one can define their sum
\[ F_1 + \dots + F_k = \{ x_1 + \dots + x_k,\ x_1 \in F_1, \dots, x_k \in F_k \} \]
It is a vector subspace of $E$.

1.2.3 Direct sum

Let $E$ be a vector space, and let $F_1, F_2$ be vector subspaces of $E$. The condition $E = F_1 + F_2$ means that every element $x$ of $E$ can be written as $x = x_1 + x_2$, with $x_1 \in F_1$, $x_2 \in F_2$. However, the elements $x_1, x_2$ might not be unique. When they are always unique, one says that $E$ is the direct sum of $F_1$ and $F_2$.

Definition 1.2.14. We say that $E$ is the direct sum of $F_1$ and $F_2$, and we write $E = F_1 \oplus F_2$, if every element $x \in E$ can be written uniquely in the form $x = x_1 + x_2$ with $x_1 \in F_1$ and $x_2 \in F_2$.

If $E = F_1 \oplus F_2$, one says that $F_1$ and $F_2$ are complementary subspaces. One has the following characterization of the direct sum.

Proposition 1.2.15. One has $E = F_1 \oplus F_2$ if and only if $E = F_1 + F_2$ and $F_1 \cap F_2 = \{0\}$.

Proof. Assume $E = F_1 \oplus F_2$. Then $E = F_1 + F_2$.
Let $x \in F_1 \cap F_2$. Then
\[ 0 = 0 + 0 = x + (-x) \]
Since $x \in F_1$ and $-x \in F_2$, one gets by the uniqueness property $x = 0$. This proves that $F_1 \cap F_2 = \{0\}$.

Conversely, assume that $E = F_1 + F_2$ and $F_1 \cap F_2 = \{0\}$. If $x \in E$, there exist $x_1 \in F_1$, $x_2 \in F_2$ with $x = x_1 + x_2$. We will now prove that these elements $x_1, x_2$ are unique. Assume there are other elements $x'_1 \in F_1$, $x'_2 \in F_2$ such that
\[ x = x_1 + x_2 = x'_1 + x'_2 \]
Then the element $y = x_1 - x'_1 = x'_2 - x_2$ is in $F_1 \cap F_2 = \{0\}$. Thus $x'_1 = x_1$ and $x'_2 = x_2$. One concludes that $E = F_1 \oplus F_2$.

Example 1.2.16. Consider the vector space $E = K^2$, and let
\[ F_1 = \left\{ \begin{pmatrix} x \\ 0 \end{pmatrix} \in E,\ x \in K \right\} \quad F_2 = \left\{ \begin{pmatrix} 0 \\ y \end{pmatrix} \in E,\ y \in K \right\} \]
Then $E = F_1 \oplus F_2$.

More generally, we say that two vector subspaces $F, G$ of a vector space $E$ are in direct sum if $F \cap G = \{0\}$. In this case, one writes
\[ F + G = F \oplus G \]

One can also define the direct sum for an arbitrary number of vector subspaces.

Definition 1.2.17. Let $E$ be a vector space, $k \geq 2$ an integer, and $F_1, \dots, F_k$ vector subspaces of $E$. We say that $E$ is the direct sum of $F_1, \dots, F_k$, and we write $E = F_1 \oplus \dots \oplus F_k$, if every element $x \in E$ can be written uniquely in the form $x = x_1 + \dots + x_k$ with $x_1 \in F_1, \dots, x_k \in F_k$.

One still has a characterization, but it is more involved.

Proposition 1.2.18. Let $E$ be a vector space, $k \geq 2$ an integer, and $F_1, \dots, F_k$ vector subspaces of $E$. Then $E = F_1 \oplus \dots \oplus F_k$ if and only if $E = F_1 + \dots + F_k$ and for every $1 \leq i \leq k$
\[ F_i \cap \Big( \sum_{j \neq i} F_j \Big) = \{0\} \]

Proof. Assume $E = F_1 \oplus \dots \oplus F_k$. Then $E = F_1 + \dots + F_k$. Let $x \in F_1 \cap (F_2 + \dots + F_k)$. Then there exist $x_2 \in F_2, \dots, x_k \in F_k$ such that $x = x_2 + \dots + x_k$. Then
\[ 0 = 0 + \dots + 0 = x + (-x_2) + \dots + (-x_k) \]
Since $x \in F_1$ and $-x_j \in F_j$ for $2 \leq j \leq k$, one gets by the uniqueness property $x = 0$. This proves that $F_1 \cap (F_2 + \dots + F_k) = \{0\}$. The proof that $F_i \cap (\sum_{j \neq i} F_j) = \{0\}$ for $2 \leq i \leq k$ is similar.

Conversely, assume that $E = F_1 + \dots + F_k$ and $F_i \cap (\sum_{j \neq i} F_j) = \{0\}$ for all $1 \leq i \leq k$. If $x \in E$, there exist $x_1 \in F_1, \dots, x_k \in F_k$ with $x = x_1 + \dots + x_k$. We will now prove that these elements $x_1, \dots, x_k$ are unique. Assume there are other elements $x'_1 \in F_1, \dots, x'_k \in F_k$ such that
\[ x = x_1 + \dots + x_k = x'_1 + \dots + x'_k \]
Then the element $y = x_1 - x'_1 = (x'_2 - x_2) + \dots + (x'_k - x_k)$ is in $F_1 \cap (F_2 + \dots + F_k) = \{0\}$. Thus $x'_1 = x_1$ and $(x'_2 - x_2) + \dots + (x'_k - x_k) = 0$. One then proves by induction that $x_j = x'_j$ for all $2 \leq j \leq k$. This concludes the proof that $E = F_1 \oplus \dots \oplus F_k$.

1.2.4 Exercises

Exercise 1.2.1. Are the following subsets vector subspaces? Justify your answer.
1. $\left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in K^3,\ x + y + z = 0 \right\} \subseteq K^3$
2. $\left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in K^3,\ x^2 + y + z = 0 \right\} \subseteq K^3$
3. $\left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in K^3,\ x^2 + 4z^2 + 4xz = 0 \right\} \subseteq K^3$

Exercise 1.2.2. Are the following subsets vector subspaces? Justify your answer.
1. $\{ P \in K[X],\ P(1) = P(2) \} \subset K[X]$
2. $\{ P \in K[X],\ P(X)^2 = P(X^2) \} \subset K[X]$
3. $\{ P \in K[X],\ P(X) = P(-X) \} \subset K[X]$

Exercise 1.2.3. Let $E$ be a $K$-vector space, and let $F_1, F_2$ be two vector subspaces. Define
\[ F_1 \cup F_2 = \{ x \in E \mid x \in F_1 \text{ or } x \in F_2 \} \]
Prove that $F_1 \cup F_2$ is a vector subspace if and only if $F_1 \subseteq F_2$ or $F_2 \subseteq F_1$.

Exercise 1.2.4. Let
\[ F = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in K^3 \mid x + y + z = 0 \right\} \]
Find a vector subspace $G$ such that $F \oplus G = K^3$.

Exercise 1.2.5. Find a vector space $E$, and vector subspaces $F_1, F_2, F_3$ such that $E = F_1 + F_2 + F_3$ and $F_i \cap F_j = \{0\}$ if $i \neq j$, but one does not have $E = F_1 \oplus F_2 \oplus F_3$.

Exercise 1.2.6. A sequence $(u_n)_{n \geq 0}$ is periodic if there exists an integer $m \geq 1$ such that $u_{n+m} = u_n$ for all $n \geq 0$.
Is the set of periodic sequences a vector subspace of the space of sequences?

Exercise 1.2.7. Let $n \geq 1$ be an integer. Let $F$ be the set of polynomials of degree less than or equal to $n - 1$, and let $G$ be the set of polynomials which are divisible by $X^n$.
1. Prove that $F$ and $G$ are vector subspaces of $K[X]$.
2. Prove that $K[X] = F \oplus G$.

Exercise 1.2.8. Let $m \geq 1$ be an integer; a sequence $(u_n)_{n \geq 0}$ is $m$-periodic if $u_{n+m} = u_n$ for all $n \geq 0$. For all $m \geq 1$, let $S_m$ be the set of $m$-periodic sequences.
1. Prove that $S_m$ is a vector space for all $m \geq 1$.
2. Compute $S_2 \cap S_3$. Justify your answer.
3. Compute $S_2 + S_3$. Justify your answer.
4. Find a vector subspace $F \subseteq S_6$ such that $(S_2 + S_3) \oplus F = S_6$.

1.3 Family of vectors

Throughout this section, $E$ will denote a vector space.

1.3.1 Spanning family

We start with the definition of a family of vectors.

Definition 1.3.1. A family of vectors is a collection $(e_1, \dots, e_n)$ of elements of $E$. The integer $n$ is called the cardinality of the family.

Families can be used to define vector subspaces. First let us define the notion of linear combination.

Definition 1.3.2. Let $(e_1, \dots, e_n)$ be a family of vectors. A linear combination of $(e_1, \dots, e_n)$ is an element of the form
\[ \lambda_1 e_1 + \dots + \lambda_n e_n \]
for some scalars $\lambda_1, \dots, \lambda_n \in K$.

The set of all possible linear combinations of a family will be called the subspace generated (or spanned) by the family. It is defined as follows.

Definition 1.3.3. Let $(e_1, \dots, e_n)$ be a family of vectors. We define
\[ \mathrm{Vect}(e_1, \dots, e_n) = \{ \lambda_1 e_1 + \dots + \lambda_n e_n \mid \lambda_1, \dots, \lambda_n \in K \} \]

One can prove that the set $\mathrm{Vect}(e_1, \dots, e_n)$ is a vector subspace of $E$ that contains the vectors $e_1, \dots, e_n$. (One may also write $\mathrm{Span}(e_1, \dots, e_n)$ instead of $\mathrm{Vect}(e_1, \dots, e_n)$; the former notation is common in English, whereas the latter is used in French.) It is actually the smallest one, as is shown by the following proposition.

Proposition 1.3.4. Let $(e_1, \dots, e_n)$ be a family of vectors. Then $\mathrm{Vect}(e_1, \dots, e_n)$ is the smallest vector subspace of $E$ containing the elements $e_1, \dots, e_n$.

Proof. If $F$ is a vector subspace of $E$ containing $e_1, \dots, e_n$, then it also contains all the possible linear combinations of $(e_1, \dots, e_n)$, and hence $\mathrm{Vect}(e_1, \dots, e_n)$. Let us now prove that $\mathrm{Vect}(e_1, \dots, e_n)$ is actually a vector subspace. Let $x = \lambda_1 e_1 + \dots + \lambda_n e_n$ and $y = \mu_1 e_1 + \dots + \mu_n e_n$ be elements of $\mathrm{Vect}(e_1, \dots, e_n)$. If $\alpha, \beta$ are in $K$, then
\[ \alpha x + \beta y = (\alpha\lambda_1 + \beta\mu_1) e_1 + \dots + (\alpha\lambda_n + \beta\mu_n) e_n = \gamma_1 e_1 + \dots + \gamma_n e_n \]
with $\gamma_i = \alpha\lambda_i + \beta\mu_i$ for all $1 \leq i \leq n$. This proves that $\alpha x + \beta y \in \mathrm{Vect}(e_1, \dots, e_n)$ and concludes the proof.

We can now introduce the notion of spanning family.

Definition 1.3.5. Let $(e_1, \dots, e_n)$ be a family of vectors. We say that $(e_1, \dots, e_n)$ is a spanning family if
\[ E = \mathrm{Vect}(e_1, \dots, e_n) \]
Equivalently, $(e_1, \dots, e_n)$ is a spanning family if every element of $E$ is a linear combination of $(e_1, \dots, e_n)$.

One has the following behavior when one considers a smaller family.

Proposition 1.3.6. Let $(e_1, \dots, e_n)$ be a family of vectors, and let $1 \leq k \leq n$. Then $\mathrm{Vect}(e_1, \dots, e_k) \subseteq \mathrm{Vect}(e_1, \dots, e_n)$. Consequently, if $(e_1, \dots, e_k)$ is a spanning family, so is $(e_1, \dots, e_n)$.

Proof. Let $x \in \mathrm{Vect}(e_1, \dots, e_k)$. There exist scalars $\lambda_1, \dots, \lambda_k$ such that $x = \lambda_1 e_1 + \dots + \lambda_k e_k$. Then
\[ x = \lambda_1 e_1 + \dots + \lambda_k e_k + 0 \cdot e_{k+1} + \dots + 0 \cdot e_n \in \mathrm{Vect}(e_1, \dots, e_n) \]
This proves the inclusion $\mathrm{Vect}(e_1, \dots, e_k) \subseteq \mathrm{Vect}(e_1, \dots, e_n)$. If $(e_1, \dots, e_k)$ is a spanning family, then $\mathrm{Vect}(e_1, \dots, e_k) = E$, and one gets $E \subseteq \mathrm{Vect}(e_1, \dots, e_n)$. Since the other inclusion is always satisfied, one concludes that $E = \mathrm{Vect}(e_1, \dots, e_n)$, and that $(e_1, \dots, e_n)$ is a spanning family.

Example 1.3.7. Let $E = K^2$, and let
\[ e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad e_3 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
Then
- $(e_1, e_2)$ is a spanning family.
- $(e_1, e_3)$ is a spanning family.
- $(e_1, e_2, e_3)$ is a spanning family.
But $(e_1)$ is not a spanning family.

1.3.2 Linearly independent family

A family is spanning if every vector of $E$ is a linear combination of this family. One can then ask when such a linear combination is uniquely determined. This leads to the notion of linear independence.

Definition 1.3.8. Let $(e_1, \dots, e_n)$ be a family of vectors. It is linearly independent if for $\lambda_1, \dots, \lambda_n \in K$,
\[ \sum_{i=1}^{n} \lambda_i e_i = 0 \ \Rightarrow\ \lambda_1 = \lambda_2 = \dots = \lambda_n = 0 \]
If not, the family is said to be linearly dependent. In this case, there exist $\lambda_1, \dots, \lambda_n \in K$, not all zero, with
\[ \sum_{i=1}^{n} \lambda_i e_i = 0 \]
An equation of the form $\sum_{i=1}^{n} \lambda_i e_i = 0$ is called a dependence relation. The family is linearly independent if there are no non-trivial dependence relations.

For a family of one vector, one has the following result.

Proposition 1.3.9. Let $e_1 \in E$. Then the family $(e_1)$ is linearly dependent if and only if $e_1 = 0$.

Proof. If $e_1 = 0$, then one has the dependence relation $1 \cdot e_1 = 0$, and the family $(e_1)$ is linearly dependent. Conversely, assume that $(e_1)$ is linearly dependent. There exists $\lambda_1 \in K$ with $\lambda_1 e_1 = 0$ and $\lambda_1 \neq 0$. One then gets $e_1 = 0$ thanks to Proposition 1.1.10.

For families of greater cardinality, one has the following characterization.

Proposition 1.3.10. Let $n \geq 2$, and let $(e_1, \dots, e_n)$ be a family of vectors. Then the family $(e_1, \dots, e_n)$ is linearly dependent if and only if there exists $1 \leq k \leq n$ such that $e_k$ is a linear combination of the family $(e_i)_{i \neq k}$.

Proof. If $e_k$ is a linear combination of the family $(e_i)_{i \neq k}$, then there exist $(\lambda_i)_{i \neq k}$ such that
\[ e_k = \sum_{i \neq k} \lambda_i e_i \]
This gives a non-trivial dependence relation. Conversely, assume that the family $(e_1, \dots, e_n)$ is linearly dependent. Then there exists $(\lambda_1, \dots, \lambda_n) \in K^n$, not all zero, with
\[ \sum_{i=1}^{n} \lambda_i e_i = 0 \]
There exists an integer $k$ such that $\lambda_k \neq 0$. Then one has
\[ e_k = -\sum_{i \neq k} \frac{\lambda_i}{\lambda_k} e_i \]
which proves that $e_k$ is a linear combination of the family $(e_i)_{i \neq k}$.

If one considers a bigger family, one has the following behavior.

Proposition 1.3.11. Let $(e_1, \dots, e_n)$ be a family of vectors, and $1 \leq k \leq n$. If $(e_1, \dots, e_n)$ is linearly independent, so is $(e_1, \dots, e_k)$.

Proof. Assume that the family $(e_1, \dots, e_n)$ is linearly independent. Let $\lambda_1, \dots, \lambda_k$ be scalars and assume that $\lambda_1 e_1 + \dots + \lambda_k e_k = 0$. Then
\[ \lambda_1 e_1 + \dots + \lambda_k e_k + 0 \cdot e_{k+1} + \dots + 0 \cdot e_n = 0 \]
Since the family $(e_1, \dots, e_n)$ is linearly independent, one gets $\lambda_1 = \dots = \lambda_k = 0$. This proves that the family $(e_1, \dots, e_k)$ is linearly independent.

One also has this useful lemma.

Lemma 1.3.12. Let $n \geq 2$, and let $(e_1, \dots, e_n)$ be a family of vectors. Then it is linearly independent if and only if $(e_1, \dots, e_{n-1})$ is linearly independent and $e_n \notin \mathrm{Vect}(e_1, \dots, e_{n-1})$.

Proof. Assume that $(e_1, \dots, e_n)$ is linearly independent. Then so is $(e_1, \dots, e_{n-1})$. Moreover $e_n \notin \mathrm{Vect}(e_1, \dots, e_{n-1})$, since otherwise $e_n$ would be a linear combination of $(e_1, \dots, e_{n-1})$, giving a non-trivial dependence relation. This proves one implication.
Conversely, assume that $(e_1, \dots, e_{n-1})$ is linearly independent and $e_n \notin \mathrm{Vect}(e_1, \dots, e_{n-1})$. Let $\lambda_1, \dots, \lambda_n$ be scalars and assume that $\lambda_1 e_1 + \dots + \lambda_n e_n = 0$. If $\lambda_n \neq 0$, one gets
\[ e_n = -\frac{\lambda_1}{\lambda_n} e_1 - \dots - \frac{\lambda_{n-1}}{\lambda_n} e_{n-1} \]
One gets a contradiction, since $e_n$ is not a linear combination of $(e_1, \dots, e_{n-1})$. Thus $\lambda_n = 0$, and $\lambda_1 e_1 + \dots + \lambda_{n-1} e_{n-1} = 0$. Since the family $(e_1, \dots, e_{n-1})$ is linearly independent, one gets $\lambda_1 = \dots = \lambda_{n-1} = 0$. One concludes that the family $(e_1, \dots, e_n)$ is linearly independent.

For example, to check that $(e_1, e_2)$ is linearly independent, it is enough to check that $e_1 \neq 0$ and $e_2 \notin \mathrm{Vect}(e_1)$.

Example 1.3.13. Let $E = K^2$, and let
\[ e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad e_3 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
Then
- $(e_1, e_2)$ is a linearly independent family.
- $(e_1, e_3)$ is a linearly independent family.
- $(e_1)$ is a linearly independent family.
But $(e_1, e_2, e_3)$ is a linearly dependent family.

1.3.3 Basis

When a family is both spanning and linearly independent, it is called a basis.

Definition 1.3.14. A family $(e_1, \dots, e_n)$ is a basis of $E$ if it is a spanning and linearly independent family.

One has the following characterization of a basis.

Proposition 1.3.15. If $(e_1, \dots, e_n)$ is a basis, any element $x \in E$ can be written uniquely as
\[ x = \sum_{i=1}^{n} \lambda_i e_i \]
with $\lambda_1, \dots, \lambda_n \in K$.

Proof. The fact that $(e_1, \dots, e_n)$ is a spanning family gives the existence. We will now prove the uniqueness. Assume that there exist $\lambda_1, \dots, \lambda_n, \lambda'_1, \dots, \lambda'_n \in K$ such that
\[ x = \lambda_1 e_1 + \dots + \lambda_n e_n = \lambda'_1 e_1 + \dots + \lambda'_n e_n \]
Then
\[ (\lambda_1 - \lambda'_1) e_1 + \dots + (\lambda_n - \lambda'_n) e_n = 0 \]
Since the family $(e_1, \dots, e_n)$ is linearly independent, one gets $\lambda_i = \lambda'_i$ for all $1 \leq i \leq n$, which proves the uniqueness.

If $(e_1, \dots, e_n)$ is a basis, any element $x \in E$ can thus be written uniquely as $x = \sum_{i=1}^{n} \lambda_i e_i$. The scalars $\lambda_1, \dots, \lambda_n$ are therefore well defined, and are called the coordinates of $x$ with respect to the basis $(e_1, \dots, e_n)$.

Example 1.3.16. Let $E = K^2$, and let
\[ e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \quad e_3 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
Then
- $(e_1, e_2)$ is a basis.
- $(e_1, e_3)$ is a basis.
But $(e_1)$ and $(e_1, e_2, e_3)$ are not bases.

Proposition 1.3.17. Let $n \geq 1$ and consider the vector space $K^n$. Let
\[ e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad e_n = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} \]
Then $(e_1, \dots, e_n)$ is a basis of $K^n$, called the standard basis.

Proposition 1.3.18. Let $n \geq 0$, and consider $K[X]_n$, the vector space of polynomials of degree less than or equal to $n$. Then the family $(1, X, \dots, X^n)$ is a basis of $K[X]_n$, called the standard basis.

In this course, the most important examples of vector spaces are $K^n$ and its vector subspaces. One has seen that the standard vector space $K^n$ has a standard basis. Let us now consider a vector subspace $F$ of $K^n$. To find a basis of $F$, one should find elements $f_1, \dots, f_k$ belonging to $F$ such that the family $(f_1, \dots, f_k)$ is linearly independent, and every element of $F$ can be expressed as a linear combination of $f_1, \dots, f_k$. It is important to check that the vectors actually belong to $F$, since any basis of $K^n$ satisfies the other two properties.

If the vector subspace $F$ is defined by some equations, the simplest way is to solve these equations to get an explicit description of $F$. One then finds vectors $f_1, \dots, f_k$ such that $F = \mathrm{Vect}(f_1, \dots, f_k)$, i.e. a spanning family for $F$. The next step is to check linear independence. If the family is linearly independent, it is a basis. Otherwise, one of the vectors is a linear combination of the others, and one can remove it from the family. One then gets a spanning family with $k - 1$ elements, and one can repeat this process until the family is linearly independent; a computational sketch of the independence check follows below.
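For vectors of $K^n$, checking linear independence amounts to deciding whether the homogeneous system $\sum_i \lambda_i f_i = 0$ has only the zero solution, which can be done by row reduction. The Python sketch below (illustrative only; all function names are ours) computes the rank of a family with exact rationals: the family is linearly independent exactly when its rank equals its cardinality.

```python
# Illustrative sketch: a family of vectors in K^n is linearly
# independent iff the rank of the matrix whose rows are the
# vectors equals the cardinality of the family.
from fractions import Fraction

def rank(rows):
    M = [[Fraction(x) for x in r] for r in rows]
    rk, col = 0, 0
    while rk < len(M) and col < len(M[0]):
        pivot = next((r for r in range(rk, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        M[rk], M[pivot] = M[pivot], M[rk]          # swap the pivot row up
        for r in range(rk + 1, len(M)):            # clear entries below it
            f = M[r][col] / M[rk][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk, col = rk + 1, col + 1
    return rk

def is_independent(vectors):
    return rank(vectors) == len(vectors)

# The families of Example 1.3.13, with e1 = (1,0), e2 = (0,1), e3 = (1,1):
assert is_independent([(1, 0), (0, 1)])
assert is_independent([(1, 0), (1, 1)])
assert not is_independent([(1, 0), (0, 1), (1, 1)])
```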
Example 1.3.19. Let $F = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \in K^3,\ x + y + z = 0 \right\}$. This is a vector subspace of $K^3$. Let $u = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$ be an element of $K^3$. Then
\[ u \in F \ \Leftrightarrow\ z = -x - y \ \Leftrightarrow\ \exists x, y \in K,\ u = \begin{pmatrix} x \\ y \\ -x - y \end{pmatrix} \ \Leftrightarrow\ u \in \mathrm{Vect}(f_1, f_2) \]
with $f_1 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$ and $f_2 = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}$. The family $(f_1, f_2)$ is thus spanning for $F$. One then checks that it is a linearly independent family, hence a basis of $F$.

1.3.4 Exercises

Exercise 1.3.1. Let $E$ be a vector space, and let $(e_1, \dots, e_n)$ be a family of vectors such that every $x \in E$ can be written uniquely in the form
\[ x = \sum_{i=1}^{n} \lambda_i e_i \]
for some scalars $\lambda_1, \dots, \lambda_n \in K$. Prove that $(e_1, \dots, e_n)$ is a basis.

Exercise 1.3.2. Let $E = K[X]$, $n \geq 1$, and let $(P_1, \dots, P_n)$ be a family of vectors of $E$. Prove that it is never a spanning family.

Exercise 1.3.3. Let $a \in K$. Consider the vector space $K^3$, and the elements
\[ e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad e_2 = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix} \quad e_3 = \begin{pmatrix} -1 \\ 3 \\ a \end{pmatrix} \]
On what condition is the family $(e_1, e_2, e_3)$ linearly independent? In this case, prove that it is a basis.

Exercise 1.3.4. Let $E$ be a vector space, and let $(e_1, \dots, e_n)$ be a linearly independent family. Let $\alpha_1, \dots, \alpha_n$ be elements of $K$, and define $y = \sum_{i=1}^{n} \alpha_i e_i$. On what condition is the family $(y + e_i)_{1 \leq i \leq n}$ linearly independent? Justify your answer.

Exercise 1.3.5. Let $E$ be the set of functions from $\mathbb{R}$ to $\mathbb{R}$. If $n \in \mathbb{Z}$ is an integer, let $f_n \in E$ be the function defined by $x \mapsto \exp(nx)$. Let $k \geq 1$ and $n_1 < n_2 < \dots < n_k$ be integers. Prove that the family $(f_{n_1}, \dots, f_{n_k})$ is linearly independent.

1.4 Dimension of a vector space

If a vector space admits a basis, then each vector can be expressed in this basis, and is thus determined by a finite number of coordinates. Unfortunately, some vector spaces do not have any basis, and are then more complicated to understand. To distinguish between these two cases, one has the notion of dimension.

1.4.1 Definition

Let $E$ be a vector space.

Definition 1.4.1. One says that $E$ has finite dimension if it has a spanning family. If this is not the case, one says that $E$ has infinite dimension.

Example 1.4.2. The vector spaces $\{0\}$, $K^n$ and $K[X]_n$ ($n \geq 1$) are finite dimensional.

We will now define the dimension of a finite dimensional vector space. It will be a non-negative integer.

Definition 1.4.3. We define the dimension of the zero vector space $\{0\}$ to be zero. If $E$ is a non-zero vector space of finite dimension, we define its dimension to be the minimum of the cardinalities of its spanning families. We write $\dim E$ for the dimension of $E$.

The dimension enjoys the following properties.

Proposition 1.4.4. Let $E$ be a finite dimensional vector space, and let $n = \dim E$.
- $n \in \mathbb{Z}_{\geq 0}$
- $n = 0 \Leftrightarrow E = \{0\}$
- Every spanning family has cardinality greater than or equal to $n$.
- If $n \geq 1$, $E$ has a spanning family of cardinality $n$.

Example 1.4.5. The vector space $K$ admits $(1)$ as a spanning family, hence its dimension is less than or equal to 1. Since it is non-zero, its dimension is exactly 1.

Exercise 1.4.6. Prove that $\dim K^2 = 2$.

For infinite dimensional vector spaces, one has the following proposition.

Proposition 1.4.7. Let $E$ be a vector space of infinite dimension. Then for all $n \geq 1$, there exists a linearly independent family of cardinality $n$.

Proof. Let $n \geq 1$ be an integer.
We will construct by induction a linearly independent family $(e_1, \dots, e_n)$. Since $E$ is not equal to $\{0\}$, there exists a nonzero vector $e_1$. The family $(e_1)$ is then linearly independent. Assume that $(e_1, \dots, e_k)$ has been constructed for a certain integer $k$, and let us construct the vector $e_{k+1}$. Since $E$ has infinite dimension, the family $(e_1, \dots, e_k)$ is not spanning. Therefore, there exists a vector $e_{k+1}$ not in $\mathrm{Vect}(e_1, \dots, e_k)$. The family $(e_1, \dots, e_{k+1})$ is then linearly independent by Lemma 1.3.12. This concludes the construction.

1.4.2 Dimension and families

If one considers a finite dimensional vector space, then every spanning family has cardinality greater than or equal to the dimension. We will prove an analogous result for linearly independent families. First, let us start with a technical lemma.

Lemma 1.4.8. Let $n \geq 1$, and let $E$ be a vector space generated by $n$ elements. Then every family of vectors of cardinality $m > n$ is linearly dependent.

Proof. We prove this lemma by induction on $n$.

Let us start with the case $n = 1$. Let $E$ be a vector space generated by a vector $e_1$. It is enough to prove that every family of cardinality 2 is linearly dependent. Let $(x, y)$ be a family of vectors of $E$; there exist $\lambda, \mu \in K$ such that
\[ x = \lambda e_1 \quad y = \mu e_1 \]
If $\lambda = 0$, then $x = 0$ and the family $(x, y)$ is linearly dependent. Otherwise, $e_1 = \frac{1}{\lambda} x$, hence
\[ y = \frac{\mu}{\lambda} x \]
and the family $(x, y)$ is linearly dependent.

Assume the lemma true for an integer $n \geq 1$, and let us prove it for $n + 1$. Let $E$ be a vector space generated by a family $(e_1, \dots, e_{n+1})$. It is enough to prove that every family of cardinality $n + 2$ is linearly dependent. Let $(x_1, \dots, x_{n+2})$ be a family of vectors; there exist scalars $(\lambda_{i,j})_{i,j}$ in $K$ such that for any $1 \leq i \leq n + 2$
\[ x_i = \sum_{j=1}^{n+1} \lambda_{i,j} e_j \]
If $x_1 = 0$, we are done. Otherwise, one of the elements $(\lambda_{1,j})_j$ is non-zero. Without loss of generality, one can assume $\lambda_{1,n+1} \neq 0$. Define
\[ y_i = x_i - \frac{\lambda_{i,n+1}}{\lambda_{1,n+1}} x_1 \]
for $2 \leq i \leq n + 2$. Then one has
\[ y_i = \sum_{j=1}^{n} \left( \lambda_{i,j} - \frac{\lambda_{i,n+1}}{\lambda_{1,n+1}} \lambda_{1,j} \right) e_j \]
The vectors $(y_2, \dots, y_{n+2})$ thus belong to $F = \mathrm{Vect}(e_1, \dots, e_n)$. Thanks to the induction hypothesis, one gets
\[ \sum_{i=2}^{n+2} \mu_i y_i = 0 \]
for some scalars $\mu_2, \dots, \mu_{n+2}$, not all zero. Finally, one has
\[ \sum_{i=2}^{n+2} \mu_i x_i - \left( \sum_{i=2}^{n+2} \mu_i \frac{\lambda_{i,n+1}}{\lambda_{1,n+1}} \right) x_1 = 0 \]
which is a non-trivial dependence relation for $(x_1, \dots, x_{n+2})$. This concludes the proof.

If $E$ is a nonzero vector space of finite dimension, it has a spanning family of cardinality $\dim E$. The previous lemma then implies that a linearly independent family has cardinality less than or equal to $\dim E$. Let us summarize the situation.

Proposition 1.4.9. Let $E$ be a non-zero vector space of finite dimension.
- Every spanning family has cardinality $\geq \dim E$.
- There exists a spanning family of cardinality $\dim E$.
- Every linearly independent family has cardinality $\leq \dim E$.

As a consequence, all bases have the same cardinality, equal to the dimension.

Corollary 1.4.10. Let $E$ be a non-zero vector space of finite dimension. Every basis of $E$ has cardinality $\dim E$.

This gives a simple way to compute the dimension of a vector space: one first constructs a basis, and the dimension is then its cardinality.

Example 1.4.11. One has $\dim K^n = n$ for all $n \geq 1$. One has $\dim K[X]_n = n + 1$ for all $n \geq 0$.

Exercise 1.4.12. Let $E_1, E_2$ be finite dimensional vector spaces. Prove that $E_1 \times E_2$ is finite dimensional and
\[ \dim(E_1 \times E_2) = \dim(E_1) + \dim(E_2) \]

If $E$ has dimension $n \geq 1$, then a basis necessarily has $n$ elements.
If one starts with a family of $n$ elements, to prove that it is a basis it is enough to check either the spanning property or the linear independence.

Proposition 1.4.13. Let $E$ be a vector space of finite dimension $n \geq 1$ and let $(e_1, \dots, e_n)$ be a family of vectors of $E$. Then the following assertions are equivalent.
- $(e_1, \dots, e_n)$ is a basis.
- $(e_1, \dots, e_n)$ is a linearly independent family.
- $(e_1, \dots, e_n)$ is a spanning family.

Proof. If the family is a basis, it is linearly independent. Conversely, assume it is linearly independent. If it were not spanning, there would exist a vector $e_{n+1}$ not in $\mathrm{Vect}(e_1, \dots, e_n)$. The family $(e_1, \dots, e_{n+1})$ would then be linearly independent by Lemma 1.3.12, and its cardinality is $n + 1 > \dim E$, which is impossible. We have thus proved that the family is a basis if and only if it is linearly independent.

Let us now prove that it is a basis if and only if it is spanning. If it is a basis, it is of course spanning. So, let us assume that the family is spanning. If $n = 1$, then $e_1 \neq 0$ because $E$ is nonzero, and the family is linearly independent. Assume $n \geq 2$, and that the family is linearly dependent. Then one of the vectors is a linear combination of the others; without loss of generality one has $e_n \in \mathrm{Vect}(e_1, \dots, e_{n-1})$. The family $(e_1, \dots, e_{n-1})$ is then spanning, and has cardinality $n - 1 < \dim E$, which is impossible. This proves that the family is a basis if and only if it is spanning.

Corollary 1.4.14. Let $E$ be a nonzero vector space of finite dimension. Then $E$ has a basis.

Proof. We know that $E$ has a spanning family of cardinality $\dim E$. By the previous proposition, it is a basis.

We finish this section with the following characterization of infinite dimensional vector spaces.

Proposition 1.4.15. Let $E$ be a vector space. Then $E$ has infinite dimension if and only if for each integer $n \geq 1$ there exists a linearly independent family of cardinality $n$.

Proof. We have already seen that if $E$ is infinite dimensional, it has linearly independent families of any cardinality (Proposition 1.4.7). Conversely, assume that for each integer $n \geq 1$, $E$ has a linearly independent family of cardinality $n$. If $E$ were finite dimensional, every linearly independent family would have cardinality less than or equal to the dimension of $E$. This cannot be the case, hence $E$ has infinite dimension.

1.4.3 Dimension and vector subspaces

If $E$ is a finite dimensional vector space, one has information on its vector subspaces.

Proposition 1.4.16. Let $E$ be a finite dimensional vector space and $F \subseteq E$ be a vector subspace. Then $F$ has finite dimension, and $\dim F \leq \dim E$. Moreover, $\dim F = \dim E$ if and only if $F = E$.

Proof. If $F$ had infinite dimension, there would exist a linearly independent family of vectors of $F$ (thus of $E$) of cardinality $\dim E + 1$, which is impossible. This proves that $F$ has finite dimension.

For the next part, if $F = \{0\}$, the proposition is true. Otherwise, let $k = \dim F \geq 1$ and let $(e_1, \dots, e_k)$ be a basis of $F$. Then $(e_1, \dots, e_k)$ is a linearly independent family of vectors of $E$. This implies $k \leq \dim E$, thus $\dim F \leq \dim E$. If $k = \dim E$, then $(e_1, \dots, e_k)$ is a basis of $E$, and
\[ F = \mathrm{Vect}(e_1, \dots, e_k) = E \]

If $F \subseteq E$ is a vector subspace, and $E$ is finite dimensional, we will prove the existence of a complementary subspace. First, let us present an intermediary result.

Theorem 1.4.17 (Incomplete basis theorem). Let $E$ be a vector space of dimension $n \geq 2$.
Let 1 ≤ k < n and let (e1 , . . . , ek ) be a linearly independent family. There exist vectors (ek+1 , . . . , en ) such that (e1 , . . . , en ) is a basis for E. Proof. Since k < n, the family (e1 , . . . , ek ) is not a basis. Since it is linearly independent, it is not spanning. Let ek+1 be a vector with ek+1 ∈ / Vect(e1 , . . . , ek ). Then (e1 , . . . , ek+1 ) is a linearly independent family by Lemma 1.3.12. If k+ 1 = n, it is a basis. If not, one repeats the process to construct ek+2 , . . . , en . At the end, one has a linearly independent family (e1 , . . . , en ). Since n = dim E, it is a basis for E. We have the following property for complementary subspaces. Proposition 1.4.18. Let E be a finite dimensional vector space, and F, G vector subspaces with E = F ⊕ G. Then dim E = dim F + dim G. If F and G are non zero, and if (e1 , . . . , ek ) is a basis of F , (f1 , . . . , fl ) a basis of G, then (e1 , . . . , ek , f1 , . . . , fl ) is a basis of E. Proof. The first part is true if F or G is {0}. Therefore, one can restrict to the case where F and G are both nonzero. We will first prove the second part. Let us prove that (e1 , . . . , ek , f1 , . . . , fl ) is linearly independent. Assume that there exist scalars λ1 , . . . , λk , µ1 , . . . , µl such that λ1 e1 + · · · + λk ek + µ1 f1 + · · · + µl fl = 0 Then the element y = λ1 e1 +· · ·+λk ek = −(µ1 f1 +· · ·+µl fl ) is in F ∩G = {0}. Since the families (e1 , . . . , ek ) and (f1 , . . . , fl ) are linearly independent, one gets λ1 = · · · = λk = µ1 = · · · = µl = 0. This proves that the family (e1 , . . . , ek , f1 , . . . , fl ) is linearly independent. Let us now prove that it is spanning. Let x ∈ E; since E = F + G, there exist u ∈ F and v ∈ G such that x = u + v. Since (e1 , . . . , ek ) is basis for F , there 34 CHAPTER 1. VECTOR SPACES exist scalars λ1 , . . . , λk such that u = λ1 e1 + · · · + λk ek . Similarly, there exist scalars µ1 , . . . , µl such that v = µ1 f1 + · · · + µl fl . Finally, one has x = u + v = λ1 e1 + · · · + λk ek + µ1 f1 + · · · + µl fl This proves that (e1 , . . . , ek , f1 , . . . , fl ) is a spanning family. We have then proved that it is a basis. Its cardinality is equal to the dimension of E, and thus dim E = k + l = dim F + dim G We will now prove that every vector subspace admits a complementary subspace. Proposition 1.4.19. Let E be a finite dimensional vector space, and F ⊆ E a vector subspace. Then F admits a complementary subspace. Proof. If F = {0}, then E is a complementary subspace. If F = E, {0} is a complementary subspace. Otherwise, let (e1 , . . . , ek ) be a basis of F , completed in a basis (e1 , . . . , en ) of E. Let G = Vect(ek+1 , . . . , en ) One then proves that E =F ⊕G Indeed, F + G = Vect(e1 , . . . , ek ) + Vect(ek+1 , . . . , en ) = Vect(e1 , . . . , en ) = E. Let x ∈ F ∩ G; there exist scalars λ1 , . . . , λn with x = λ1 e1 + · · · + λk ek = λk+1 ek+1 + · · · + λn en Since the family (e1 , . . . , en ) is linearly independent, one gets λ1 = · · · = λn = 0, and thus x = 0. This finishes the proof. Let E be a finite dimensional vector space, and let F, G be vector subspaces. If F ∩ G = {0}, then F + G = F ⊕ G and Proposition 1.4.18 gives dim(F + G) = dim F + dim G One has a more general formula. Theorem 1.4.20. Let E be a finite dimensional vector space, and let F, G be vector subspaces. Then dim(F + G) = dim F + dim G − dim(F ∩ G) Proof. Let F 0 ⊆ F be a vector subspace such that F = (F ∩ G) ⊕ F 0 Then dim F 0 = dim F − dim(F ∩ G). 
One then proves that F + G = F0 ⊕ G 1.4. DIMENSION OF A VECTOR SPACE 35 Indeed, if x ∈ F 0 ∩G, then x is in F ∩G (because F 0 ⊆ F ). Thus x ∈ (F ∩ G)∩ F 0 = {0}, and x = 0. This proves that F 0 ∩ G = {0}. Let us now take y ∈ F + G; there exist u ∈ F , and v ∈ G such that y = u + v. Since F = (F ∩ G) ⊕ F 0 , there exist u1 ∈ F ∩ G and u2 ∈ F 0 such that u = u1 + u2 . Finally y = u1 + u2 + v. The element u2 is in F 0 , and u1 + v ∈ G. This proves that F + G = F 0 + G. We have proved that F + G = F 0 ⊕ G. Taking the dimension, one gets dim(F + G) = dim F 0 + dim G = dim F + dim G − dim(F ∩ G) If E is finite dimensional, and F, G are complementary subspaces, then dim E = dim F + dim G. If this condition holds, the previous theorem implies that only one of two conditions for complementary subspaces needs to be checked. Corollary 1.4.21. Let E be a finite dimensional vector space, and F, G vector subspaces with dim E = dim F + dim G. Then the following assertions are equivalent • E =F ⊕G • E =F +G • F ∩ G = {0} Proof. If E = F ⊕ G, then by definition E = F + G and F ∩ G = {0}. Assume now that E = F + G. Then dim(F ∩ G) = dim F + dim G − dim(F + G) = dim E − dim E = 0 which implies that F ∩ G = {0}, and E = F ⊕ G. This proves the equivalence between the first two assertions. Assume that F ∩ G = {0}. Then dim(F + G) = dim F + dim G − dim(F ∩ G) = dim E Thus F + G = E and E = F ⊕ G. This proves that the first and third assertions are equivalent. 1.4.4 Exercises Exercise 1.4.1. Let k ≥ 1, and let Sk be the set of k-periodic sequences (i.e. the set of sequences (un )n≥0 such that un+k = un for all n ≥ 0). Find a basis for Sk . What is its dimension? Exercise 1.4.2. Prove that the space of sequences is infinite dimensional. Exercise 1.4.3. Let E be a vector space, and assume that (e1 , . . . , en ) is a basis of E. Let λ1 , . . . , λn be scalars. On what condition is the family (e1 , . . . , en−1 , n X i=1 a basis for E? Justify your answer. λi ei ) 36 CHAPTER 1. VECTOR SPACES Exercise 1.4.4. Let n ≥ 1, and let 0 < x1 < · · · < xn < 1 be reals. We set x0 = 0 and xn+1 = 1. Let E be the set of continuous functions f from [0, 1] to R such that the restriction of f to [xi , xi+1 ) is affine for all 0 ≤ i ≤ n (i.e. there exist ai , bi ∈ R such that f (x) = ai x + bi for x ∈ [xi , xi+1 )). Compute the dimension of E. What happens if one removes the continuity assumption? Exercise 1.4.5. Let a, b ∈ K, and let S be the set of sequences (un )n≥0 such that un+2 = aun+1 + bun Prove that S is a vector space, and compute its dimension. Exercise 1.4.6. Let E be a nonzero vector space, n ≥ 1 an integer, and F = (e1 , . . . , en ) a spanning family. Prove that there exists a subfamily of F which is a basis, i.e. that there exist integers i1 , . . . , ik such that (ei1 , . . . , eik ) is a basis. Exercise 1.4.7. Let E ⊆ K[X] be a nonzero vector subspace. Assume that E has finite dimension. 1. Prove that there exists a basis for E consisting of polynomials with the same degree. 2. Prove that there exists a basis for E consisting of polynomials with distinct degrees. Chapter 2 Linear maps and matrices 2.1 2.1.1 Linear maps Definition and examples Definition 2.1.1. Let E, F be vector spaces. A map u : E → F is called a linear map if u(λx + µy) = λu(x) + µu(y) for all λ, µ ∈ K and x, y ∈ E. If u : E → F is a linear map, then u(0) = 0 (note that the first 0 is the zero element of E, whereas the second one is the zero element of F ). Indeed, one has u(0) = u(0 + 0) = u(0) + u(0) Example 2.1.2. 
Let E be a vector space, and e ∈ E. The map u : K → E defined by u(λ) = λ · e is a linear map. Definition 2.1.3. If E, F are vector spaces, we denote by L(E, F ) the set of linear maps from E to F . Proposition 2.1.4. The set L(E, F ) is a vector space. Proof. Let F(E, F ) be the set of functions from E to F ; it is a vector space. Let us prove that L(E, F ) is a vector subspace of F(E, F ). First, the zero function is a linear map. Let u, v ∈ L(E, F ), λ, µ, α, β ∈ K, and x, y ∈ E. Then (αu + βv)(λx + µy) = αu(λx + µy) + βv(λx + µy) = α(λu(x) + µu(y)) + β(λv(x) + µv(y)) = λ(αu(x) + βv(x)) + µ(αu(y) + βv(y)) = λ(αu + βv)(x) + µ(αu + βv)(y) Thus αu + βv ∈ L(E, F ). 37 38 CHAPTER 2. LINEAR MAPS AND MATRICES Let E, F, G be vectors spaces; if u : E → F and v : F → G are functions, one can form the composition v ◦ u. This operation preserves linear maps. Proposition 2.1.5. Let E, F, G be vector spaces, u ∈ L(E, F ) and v ∈ L(F, G). Then v ◦ u ∈ L(E, G). Proof. Let λ, µ ∈ K and x, y ∈ E. Then (v ◦ u)(λx + µy) = v(u(λx + µy)) = v(λu(x) + µu(y)) = λv(u(x)) + µv(u(y)) = λv ◦ u(x) + µv ◦ u(y) A special case of linear maps is when E = F . Definition 2.1.6. Let E be a vector space. We denote by L(E) the vector space L(E, E). An element of L(E) is called an endomorphism of E. If E is a vector space, then • The identity of E, denoted by id, is in L(E). Recall that id is the function E → E defined by id(x) = x for all x ∈ E. • If u, v ∈ L(E), then v ◦ u and u ◦ v are in L(E). 2.1.2 Linear maps and vector subspaces Let E, F be vector spaces, and u : E → F a function. One can then look at the direct image of subsets of E, and the inverse image of subsets of F . If u is a linear map, these operations will preserve the vector subspaces. Proposition 2.1.7. Let E, F be vector spaces, and u ∈ L(E, F ). If G is a vector subspace of E, then u(G) is a vector subspace of F . If H is a vector subspace of F , then u−1 (H) is a vector subspace of E. Proof. Let λ, µ ∈ K and x, y ∈ u(G). There exist s, t ∈ G with x = u(s) and y = u(t). Then λx + µy = λu(s) + µu(t) = u(λs + µt) ∈ u(G) since λs + µt ∈ G. Of course, the zero element of F is in u(G) (since u(0) = 0). This proves that u(G) is a vector subspace of F . Suppose now that a, b ∈ u−1 (H). Then u(λa + µb) = λu(a) + µu(b) ∈ H since u(a) and u(b) are in the vector subspace H. Thus λa + µb ∈ u−1 (H). Since u(0) = 0, the zero element of E is in u−1 (H). This proves that u−1 (H) is a vector subspace of E. 2.1. LINEAR MAPS 39 Two particular cases are very important. Definition 2.1.8. Let E, F be vector spaces, and u ∈ L(E, F ). Then u−1 ({0}) is a vector subspace of E, called the kernel of u, and is written Ker u. u(E) is a vector subspace of F , called the image of u, and is written Im u. These vector subspaces can be used to determine if the linear map is injective or surjective. Proposition 2.1.9. Let E, F be vector spaces, and u ∈ L(E, F ). Then u is injective if and only if Ker u = {0}. It is surjective if and only if Im u = F . Proof. The surjectivity part is obvious. Assume that u is injective, and let x ∈ Ker u. Then u(x) = 0 = u(0). By injectivity, one gets x = 0. Thus Ker u = {0}. Conversely, assume that Ker u = {0}, and let us prove that u is injective. Let x, y ∈ E and assume that u(x) = u(y). Then u(x − y) = u(x) − u(y) = 0 In other words, the element x − y is in the kernel of u, and is then 0. Hence x = y, which proves that u is injective. If a linear map is bijective, we have a special word for this. Definition 2.1.10. Let E, F be vector spaces. 
If an element of L(E, F ) is bijective, we say that it is an isomorphism. We say that E and F are isomorphic if there exists an isomorphism u : E → F . If an element of L(E) is bijective, we say it is an automorphism. Proposition 2.1.11. Let E, F be vector spaces and let u ∈ L(E, F ) be an isomorphism. Its inverse u−1 is a linear map. Proof. Let x, y ∈ F , and a = u−1 (x), b = u−1 (y). Then for λ, µ ∈ K, one has u(λa + µb) = λu(a) + µu(b) = λx + µy Thus u−1 (λx + µy) = λa + µb = λu−1 (x) + µu−1 (y) This proves that u−1 is a linear map. 2.1.3 Construction of linear maps In this section, we will see how to construct some linear maps between two vector spaces. First, let us define the restriction of a linear map. 40 CHAPTER 2. LINEAR MAPS AND MATRICES Definition 2.1.12. Let E, F be vector spaces, and u ∈ L(E, F ). If E1 ⊆ E is a vector subspace, the restriction of u to E1 is the function u|E1 : E1 → F induced by u. Explicitly, one has u|E1 (x) = u(x) for every x ∈ E1 . The function u|E1 is a linear map, and thus belongs to L(E1 , F ). The next proposition shows how one can glue two linear maps defined on complementary subspaces. Proposition 2.1.13. Let E, F be vector spaces. Assume that E1 , E2 are two vector subspaces of E with E = E1 ⊕ E2 . Let u1 ∈ L(E1 , F ) and u2 ∈ L(E2 , F ). There exists a unique linear map u ∈ L(E, F ) such that u|E1 = u1 u|E2 = u2 Proof. Assume that such a linear map u exists, and let x ∈ E. There exist x1 ∈ E1 and x2 ∈ E2 such that x = x1 + x2 . Then u(x) = u(x1 ) + u(x2 ) = u1 (x1 ) + u2 (x2 ) This proves that u is necessarily unique. To prove the existence, one will construct explicitly the linear map u by the above formula. Let x ∈ E; there are unique elements x1 ∈ E1 , x2 ∈ E2 with x = x1 + x2 . Define u(x) := u1 (x1 ) + u2 (x2 ) One checks that such a formula defines indeed a linear map from E to F . Its restriction to E1 is obviously u1 , and similarly for E2 . An important application is the construction of projections. Let E be a vector space, and E1 , E2 two vector subspaces with E = E1 ⊕ E2 . Definition 2.1.14. Let x ∈ E decomposed as x = x1 +x2 , with x1 ∈ E1 , x2 ∈ E2 . The projection along E2 onto E1 if the map p1 : E → E defined by p1 (x) = x1 The projection p1 is thus constructed by the previous proposition. One imposes that p1 is the identity on E1 (i.e. p1 (x) = x for all x ∈ E1 ), and is zero on E2 . The map p1 is then a linear map. The projection along E1 onto E2 is the map p2 defined by p2 = id −p1 . Proposition 2.1.15. One has Ker p1 = E2 and Im p1 = E1 . Proof. Let x = x1 + x2 be an element of E. Then p1 (x) = 0 ⇔ x1 = 0 ⇔ x = x2 ⇔ x ∈ E2 This proves that the kernel of p1 is E2 . From the definition of p1 , its image is included in E1 . If x ∈ E1 , then p1 (x) = x, and x is then in the image of p1 . The image of p1 is then exactly E1 . 2.1. LINEAR MAPS 41 One has the following characterization of the projections. Proposition 2.1.16. Let E be a vector space, and p ∈ L(E). Then p is a projection if and only if p ◦ p = p. Proof. One can check that the projection p1 defined previously satisfies p1 ◦ p1 = p1 . Indeed, if E = E1 ⊕ E2 , and x = x1 + x2 (with x1 ∈ E1 , x2 ∈ E2 ), then p1 ◦ p1 (x) = p1 (x1 ) = x1 = p1 (x) Conversely, let p be an endomorphism of E with p ◦ p = p. Define E1 = Im p and E2 = Ker p. Any x ∈ E can be written as x = p(x) + (x − p(x)) and p(x − p(x)) = p(x) − p(p(x)) = p(x) − p(x) = 0 Thus x − p(x) ∈ E2 . Since p(x) ∈ E1 , this implies that E = E1 + E2 . Moreover, let y ∈ E1 ∩ E2 . 
The element y is in the image of p, and there exists z ∈ E such that y = p(z). The element y is in the kernel of p, hence p(y) = 0. But p(y) = p(p(z)) = p(z) = y. One concludes that E1 ∩ E2 = {0}, and that E = E1 ⊕ E2 . Since the decomposition of an element x ∈ E in this direct sum is given by x = p(x) + (x − p(x)), p is the projection along F2 onto F1 . 2.1.4 Case of finite dimension We are mostly interested by the case where the vector spaces are finite dimensional. In this case, a powerful tool is the existence of bases. A linear map is then entirely determined by the image of a basis. Proposition 2.1.17. Let E be a vector space of dimension n ≥ 1, and let (e1 , . . . , en ) be a basis for E. Let F be another vector space, and let f1 , . . . , fn be vectors of F . Then there exists a unique linear map u : E → F such that u(ei ) = fi for 1 ≤ i ≤ n. In other words, a linear map u ∈ L(E, F ) is entirely determined by the vectors u(e1 ), . . . , u(en ). Proof. Assume that such a linear map u exists, and let x ∈ E. There exist scalars λ1 , . . . , λn such that x = λ1 e1 + · · · + λn en . Thus u(x) = u(λ1 e1 + · · · + λn en ) = λ1 u(e1 ) + · · · + λn u(en ) = λ1 f1 + · · · + λn fn This proves that u is unique. Moreover, one checks that the above formula defines a linear map from E to F sending ei to fi for each 1 ≤ i ≤ n. Exercise 2.1.18. Let E be a vector space with basis (e1 , . . . , en ). Let F be another vector space, and let u ∈ L(E, F ). Prove that Im(u) = Vect(u(e1 ), . . . , u(en )) 42 CHAPTER 2. LINEAR MAPS AND MATRICES The next two lemmas say that an injective linear map preserves the linearly independent families, whereas a surjective one preserves the spanning families. Lemma 2.1.19. Let E, F be vector spaces, and assume that u ∈ L(E, F ) is injective. Let (e1 , . . . , en ) be a linearly independent family of vectors of E. Then the family (u(e1 ), . . . , u(en )) is linearly independent. Proof. Assume that there exists scalars λ1 , . . . , λn with λ1 u(e1 ) + · · · + λn u(en ) = 0 Then u(λ1 e1 + · · · + λn en ) = 0. Since u is injective, one gets λ 1 e1 + · · · + λ n en = 0 Using the linearly independence of the family (e1 , . . . , en ), one gets λ1 = · · · = λn = 0. The family (u(e1 ), . . . , u(en )) is then linearly independent. Lemma 2.1.20. Let E, F be vector spaces, and assume that u ∈ L(E, F ) is surjective. Let (e1 , . . . , en ) be a spanning family for E. Then the family (u(e1 ), . . . , u(en )) is a spanning family for F . Proof. Let y ∈ F . Since u is surjective, there exists x ∈ E with y = u(x). Since (e1 , . . . , en ) is a spanning family for E, there exist scalars λ1 , . . . , λn with x = λ 1 e1 + · · · + λ n en Thus y = u(x) = λ1 u(e1 ) + · · · + λn u(en ) This proves that (u(e1 ), . . . , u(en )) is a spanning family for F . One has the following criterion for being an isomorphism. Theorem 2.1.21. Let E, F be vector spaces, and u ∈ L(E, F ). Assume that E has finite dimension n ≥ 1, and let (e1 , . . . , en ) be a basis for E. Then u is an isomorphism if and only if (u(e1 ), . . . , u(en )) is a basis for F . Proof. If u is an isomorphism, the previous lemmas imply that (u(e1 ), . . . , u(en )) is a basis for F . Assume that (u(e1 ), . . . , u(en )) is a basis for F , and let us prove that u is bijective. Let x ∈ E with u(x) = 0. Write x = λ1 e1 + · · · + λn en . Then 0 = u(x) = λ1 u(e1 ) + · · · + λn u(en ) Since (u(e1 ), . . . , u(en )) is a basis, one gets λ1 = · · · = λn = 0, and x = 0. This proves the injectivity. Now take y ∈ F . Since (u(e1 ), . . . 
, u(en )) is a basis, there exist λ1 , . . . , λn with y = λ1 u(e1 ) + · · · + λn u(en ) = u(λ1 e1 + · · · + λn en ) This proves the surjectivity. 2.1. LINEAR MAPS 43 As a corollary, one has a simple criterion for isomorphic vector spaces (in the case of finite dimension). Corollary 2.1.22. Let E, F be vector spaces, and assume that E has finite dimension. Then E is isomorphic to F if and only if F has finite dimension and dim E = dim F . Proof. Assume that there exist an isomorphism between E and F . If E = {0}, then F is also the zero vector space, and the result holds. Otherwise, E admits a basis (e1 , . . . , en ). From the previous theorem, (u(e1 ), . . . , u(en )) is a basis for F , which has thus dimension n = dim E. Conversely, assume that dim F = dim E. If this integer is 0, the result holds. Otherwise, let n ≥ 1 be this integer, and let (e1 , . . . , en ), (f1 , . . . , fn ) be bases respectively for E and F . Let u ∈ L(E, F ) be the linear map defined by u(ei ) = fi for 1 ≤ i ≤ n. By the previous theorem, u is an isomorphism (it sends a basis to a basis). The vector spaces E and F are thus isomorphic. Remark 2.1.23. If E is vector space of dimension n ≥ 1, it is isomorphic to Kn . Exercise 2.1.24. Let E, F be vector spaces, and u : E → F an isomorphism. Let E0 be a vector subspace of E, and F0 a vector subspace of F . Prove that dim u(E0 ) = dim E0 and dim u−1 (F0 ) = dim F0 . When one considers two vector spaces of the same dimension, the injectivity (or surjectivity) of a linear map implies the bijectivity. Therefore, to prove that a linear map is an isomorphism in this setting, it is enough to check the injectivity (or surjectivity). Proposition 2.1.25. Let E, F be finite dimensional vector spaces with dim E = dim F , and let u ∈ L(E, F ). Then u is bijective ⇔ u is injective ⇔ u is surjective Proof. Let n = dim E = dim F . If n = 0 the result holds; let us then assume that n ≥ 1. Let (e1 , . . . , en ) be a basis for E. Assume that u is injective. Then (u(e1 ), . . . , u(en )) is linearly independent. This family has cardinality n = dim E = dim F . It is thus a basis for F . This proves that u is bijective. The converse is obviously true. Now assume that u is surjective. Then (u(e1 ), . . . , u(en )) is a spanning family for F . This family has cardinality n = dim E = dim F . It is thus a basis for F . This proves that u is bijective. The converse being true, this finishes the proof. Let E, F be finite dimensional vector spaces with dim E = dim F , and let u ∈ L(E, F ). Then u is bijective if and only if there exists v ∈ L(F, E) such that v ◦ u = idE u ◦ v = idF The previous proposition tells us that one can ask for weaker conditions. 44 CHAPTER 2. LINEAR MAPS AND MATRICES Corollary 2.1.26. Let E, F be finite dimensional vector spaces with dim E = dim F , and let u ∈ L(E, F ). The following properties are equivalent • u is an isomorphism • There exists v ∈ L(F, E) such that v ◦ u = idE . • There exists v ∈ L(F, E) such that u ◦ v = idF . Proof. Indeed, if there exists v ∈ L(F, E) with v ◦ u = idE , then u is injective, hence bijective since dim E = dim F . If there exists v ∈ L(F, E) with u ◦ v = idF , then u is surjective, hence bijective. 2.1.5 Rank and nullity We will now associate to a linear map some numerical invariants. Definition 2.1.27. Let E, F be finite dimensional vector spaces, and u ∈ L(E, F ). The rank of u is the integer rk u = dim Im u The nullity of u is the integer nul u = dim Ker u One has the following properties. Proposition 2.1.28. 
Let E, F be finite dimensional vector spaces, and u ∈ L(E, F ). • u = 0 ⇔ rk u = 0 ⇔ nul u = dim E • rk u ≤ min(dim E, dim F ) • u is surjective if and only if rk u = dim F . • u is injective if and only if nul u = 0. • If dim E = dim F = n, then u is an isomorphism ⇔ rk u = n ⇔ nul u = 0 Proof. All these properties follow from the previous section, except the second one. Since the image of u is a vector subspace of F , one gets that the rank if u is less or equal to the dimension of F . Let us now prove that rk u ≤ dim E. Let n = dim E; if n = 0, then u = 0 and the result holds. Let us then assume that n ≥ 1, and let (e1 , . . . , en ) be a basis for E. Then (u(e1 ), . . . , u(en )) is a spanning family for Im u, and rk u = dim Im u ≤ n = dim E 2.1. LINEAR MAPS 45 Exercise 2.1.29. Let E, F be vector spaces, and u : E → F an isomorphism. Let G be another vector space, and let v ∈ L(G, E), w ∈ L(F, G). Prove that rk u ◦ v = rk v and rk w ◦ u = rk w. The last assertion of the proposition implies that the rank and the nullity are not independent. Actually, there is simple formula between these integers. Let us first prove an intermediary result. Proposition 2.1.30. Let E, F be finite dimensional vector spaces, and u ∈ L(E, F ). Let E 0 be a complementary subspace of Ker u in E. Then u induces an isomorphism E 0 → Im u Proof. Let ũ : E 0 → Im u be the linear map induced by u. Explicitly, one has ũ(x) = u(x) for x ∈ E 0 . Let x ∈ Ker ũ. Then x ∈ E 0 and u(x) = 0. Thus x ∈ E 0 ∩ Ker u = {0}, and x = 0. This proves the injectivity of ũ. Let us prove the surjectivity. Let y ∈ Im u. There exists x ∈ E such that y = u(x). Since E = Ker u⊕E 0 , there exist x1 ∈ Ker u, x2 ∈ E 0 with x = x1 +x2 . Then y = u(x) = u(x1 + x2 ) = u(x1 ) + u(x2 ) = u(x2 ) = ũ(x2 ) This finishes the proof. As a consequence, one deduces the following formula (the most important of the course!). Theorem 2.1.31 (Rank-nullity theorem). Let E, F be finite dimensional vector spaces, and u ∈ L(E, F ). Then dim E = rk u + nul u Proof. Let E 0 be a complementary subspace of Ker u in E. Since E = Ker u⊕E 0 , one has dim E = dim Ker u + dim E 0 , and dim E 0 = dim E − nul u On the other hand, one has dim Im u = rk u. The previous proposition implies that E 0 and Im u have the same dimension, hence the formula. 2.1.6 Duality If E is a vector space, the set of linear maps from E to another vector space is still a vector space. If these vector spaces are finite dimensional, so is the set of linear maps, and one can compute its dimension. Proposition 2.1.32. Let E, F be finite dimensional vector spaces. Then L(E, F ) is finite dimensional and dim L(E, F ) = dim E · dim F 46 CHAPTER 2. LINEAR MAPS AND MATRICES Proof. Let n = dim E; the case E = {0} is easy, so assume that n ≥ 1. Let (e1 , . . . , en ) be a basis for E. Consider the function φ : L(E, F ) → F n defined by φ(u) = (u(e1 ), . . . , u(en )) Then φ is a linear map. Moreover, it is bijective. Since F n is finite dimensional, so is L(E, F ) and dim L(E, F ) = dim F n = n · dim F = dim E · dim F An important case is the the set of linear maps from E to K. Such a linear map will be called a linear form. Definition 2.1.33. Let E be a vector space. A linear form on E is an element φ ∈ L(E, K). Proposition 2.1.34. Let E be a vector space of dimension n ≥ 1, and let φ 6= 0 be a linear form. Then rk u = 1 and nul u = n − 1. Proof. The image of φ is a vector subspace of K, and is hence either {0} or K. The first case occurs only when φ = 0, which is excluded. 
The image of φ is then K, and its rank is 1. The rank-nullity theorem gives the nullity. Definition 2.1.35. Let E be a vector space. The dual space of E is E ∗ = L(E, K) Exercise 2.1.36. Prove that if E is a vector space of dimension n, then so is E∗. We have defined the dual of a vector space. One can also define the dual of a linear map. Definition 2.1.37. Let E, F be vector spaces, and u ∈ L(E, F ). Define u∗ : F ∗ → E ∗ φ→φ◦u The element u∗ is a linear map from F ∗ to E ∗ , and thus in L(F ∗ , E ∗ ). 2.1.7 Exercises Exercise 2.1.1. For each of the following vector spaces E and functions f : E → E, is f a linear map? Justify your answer. • E = K3 and x x+y−z f y = 2x + 3y + z z x + 2y + 2z 2.1. LINEAR MAPS 47 • E = K[X] and f (P (X)) = P (X + 1). • E = K[X] and f (P (X)) = P (X 2 ). • E = K[X] and f (P (X)) = P (X)2 . • E is the set of sequences and f ((un )n≥0 ) = (un+1 )n≥0 . Exercise 2.1.2. In the previous exercise, in the cases where f was a linear map, compute the kernel and image of f . Exercise 2.1.3. Let E = K3 and define 1 0 F1 = Vect 2 , 3 0 −1 1 F2 = Vect 0 1 Prove that E = F1 ⊕ F2 and compute the projection onto F1 along F2 . Exercise 2.1.4. Let a ∈ K, and let p : K2 → K2 be the map defined by x −x + y p = y −2x + ay On which condition is p a projection? Exercise 2.1.5. Let E be a vector space, n ≥ 1 an integer, and (e1 , . . . , en ) a family of vectors. We define a map f : Kn → E by x1 .. . → x1 e1 + · · · + xn en xn Prove that f is a linear map. Give a criterion (on f ) for the family (e1 , . . . , en ) to be • linearly independent • spanning • a basis Exercise 2.1.6. Let E be the set of functions from R to R. Consider the function φ : E → E defined by φ(f )(x) = xf (x) for f ∈ E and all x ∈ R. Prove that φ is a linear map. Compute its kernel and image. 48 CHAPTER 2. LINEAR MAPS AND MATRICES Exercise 2.1.7. What happens if one replaces E by the set of continuous functions from R to R in the previous exercise? Exercise 2.1.8. Let E be a vector space, and s ∈ L(E) with s ◦ s = id. Let E1 = Ker(s − id) and E2 = Ker(s + id). 1. Prove that s is an automorphism and compute s−1 . 2. Let x ∈ E1 and y ∈ E2 . Compute s(x) and s(y). 3. Prove that E1 ∩ E2 = {0}. 4. Prove that E = E1 ⊕ E2 . 5. Let s ∈ L(K2 ) be the linear map defined by 1 0 0 1 s = s = 0 1 1 0 Prove that s ◦ s = id. Make a picture. Exercise 2.1.9. Let E, F, G be finite dimensional vector spaces, u ∈ L(E, G) and v ∈ L(F, G). Prove that ∃w ∈ L(E, F ) such that u = v ◦ w ⇔ Im u ⊆ Im v Exercise 2.1.10. Let E be a finite dimensional vector space, and u, v, w ∈ L(E). Prove that ∃a, b ∈ L(E) such that u = v ◦ a + w ◦ b ⇔ Im u ⊆ Im v + Im w Exercise 2.1.11. Let E be a vector space of dimension n ≥ 1 and u an endomorphism of E. If k ≥ 1, define uk = u ◦ · · · ◦ u (such that u1 = u, u2 = u ◦ u and uk = u ◦ uk−1 ). Define Nk = Ker uk and Ik = Im uk for every k ≥ 1. • Let k ≥ 1 be an integer. Prove that Nk ⊆ Nk+1 and Ik+1 ⊆ Ik . • Let k ≥ 1 be an integer. Prove that Ik = Ik+1 ⇔ Nk = Nk+1 . • Let k ≥ 1 be an integer with Ik = Ik+1 . Prove that Il = Ik for every l ≥ k. • Define dk = dim Ik for every k ≥ 1. Prove that the sequence (dk )k≥1 is decreasing. • Prove that the sequence (dk − dk+1 )k≥1 is decreasing, ultimately equal to 0 (i.e. dk − dk+1 = 0 if k is greater than a certain integer N ). One might use the rank-nullity theorem to the restriction of u to Ik . Exercise 2.1.12. Let E be a vector space of dimension n ≥ 1 and u an endomorphism of E. Prove that Ker u = Ker u2 ⇔ Im u = Im u2 ⇔ E = Ker u ⊕ Im u 2.2. 
MATRICES 49 Exercise 2.1.13. Let E, F, G be finite dimensional vector spaces, u ∈ L(E, G) and v ∈ L(E, F ). Prove that ∃w ∈ L(F, G) such that u = w ◦ v ⇔ Ker v ⊆ Ker u Exercise 2.1.14. Let E be a finite dimensional vector space, and u, v, w ∈ L(E). Prove that ∃a, b ∈ L(E) such that u = a ◦ v + b ◦ w ⇔ Ker v ∩ Ker w ⊆ Ker u Exercise 2.1.15. Let E, F, G be finite dimensional vector spaces, u ∈ L(E, F ) and v ∈ L(F, G). Prove that rk(v ◦ u) ≥ rk u + rk v − dim F (One might apply the rank-nullity theorem to the restriction of v to the image of u). Exercise 2.1.16. Let E be a vector space of dimension n ≥ 1. 1. Compute the dimension of E ∗ . 2. Let (e1 , . . . , en ) be a basis for E and 1 ≤ i ≤ n. Define e∗i : E → K by e∗i (λ1 e1 + · · · + λn en ) = λi . Prove that e∗i ∈ E ∗ . 3. Prove that (e∗1 , . . . , e∗n ) is a basis for E ∗ . 4. Let x ∈ E. Prove that the map ψx : E ∗ → K defined by ψx (φ) = φ(x) is a linear map. 5. Let ψ : E → E ∗∗ be the map defined by ψ(x) = ψx . Prove that ψ is a linear map. 6. Prove that ψ is an isomorphism. Exercise 2.1.17. Let E, F be vector spaces, and u ∈ L(E, F ). Define u∗ : F ∗ → E ∗ by u∗ (φ) = φ ◦ u. Prove that u∗ is a linear map. Let G be another vector space, and v ∈ L(F, G). Prove that (v ◦ u)∗ = u∗ ◦ v ∗ . Exercise 2.1.18. Let E, F be finite dimensional vector spaces, and u ∈ L(E, F ). Prove that u∗ is surjective if and only if u is injective. Prove that u∗ is injective if and only if u is surjective. 2.2 Matrices The matrices will be a tool used to represent linear maps between finite dimensional vector spaces. 50 CHAPTER 2. LINEAR MAPS AND MATRICES 2.2.1 Definition 2 3 Let a linear us consider map u : K → K . Then u is determined by the vectors 1 0 u and u . These are elements in K3 , and have then three 0 1 coordinates each. Let a d 1 0 u = b u = e 0 1 c f The linear map u is then determined by the formula u x y ax + dy 1 0 1 0 =u x +y = xu +yu = bx + ey 0 1 0 1 cx + f y The linear map u is then determined by sent u by the following table: a b c six scalars a, b, c, d, e, f . We will repre d e f This table has three rows (the dimension of K3 ) and two columns (the dimension of K2 ). We call it the matrix associated to u. Let us first define the set of matrices. Definition 2.2.1. Let n, p ≥ 1 be integers. A matrix with p rows and n columns is a table a1,1 . . . a1,n .. .. A = ... . . ap,1 ... ap,n where the coefficients are in K. One also writes A = (ai,j ) 1≤i≤p , by convention the coefficient ai,j lies at the 1≤j≤n i-th row and j-th column. One says that A is a p × n matrix. Definition 2.2.2. We denote by Mp,n (K) the set of p × n matrices. We denote by Mn (K) the set of n × n matrices. Proposition 2.2.3. The sets Mp,n (K) and Mn (K) are vector spaces. The addition is defined by (ai,j ) 1≤i≤p + (bi,j ) 1≤i≤p = (ai,j + bi,j ) 1≤i≤p 1≤j≤n 1≤j≤n 1≤j≤n The external multiplication is defined by λ · (ai,j ) 1≤i≤p = (λai,j ) 1≤i≤p 1≤j≤n 1≤j≤n 2.2. MATRICES 51 Proposition 2.2.4. The vector space Mp,n (K) has dimension np. The vector space Mn (K) has dimension n2 . Proof. Let 1 ≤ i ≤ p and 1 ≤ j ≤ n. Define Ei,j as the matrix with all coefficients equal to 0, except the coefficient in the i-th row and j-th column which is equal to 1. Then the matrices (Ei,j ) 1≤i≤p form a basis for Mp,n (K). This space has thus 1≤j≤n dimension np. One has thus a standard basis for Mp,n (K). Example 2.2.5. 
1 • E1,1 = 0 0 • E1,2 = 0 0 • E2,1 = 1 0 • E2,2 = 0 2.2.2 A basis for the vector space M2 (K) consists of the matrices 0 0 1 0 0 0 0 1 Operations on matrices We will associate a linear map to a matrix. Let n, p be positive integers and let A ∈ Mp,n (K). Let C1 , . . . , Cn be the columns of A; these are elements of Kp . Definition 2.2.6. We define the linear map ΦA : Kn 1 0 0 .. ΦA . = C1 . . . ΦA . .. 0 0 1 → Kp by = Cn x1 Equivalently, one has the formula for every x = ... ∈ Kn xn ΦA (x) = x1 C1 + · · · + xn Cn This gives a bijection {linear maps Kn → Kp } ↔ {p × n matrices} Indeed, we have associated a linear map from Kn to Kp to a p × n matrix. If Φ is a linear map from Kn to Kp , the image of the standard basis of Kn by Φ gives 52 CHAPTER 2. LINEAR MAPS AND MATRICES n vectors in Kp . One can then form the matrix with these vectors as columns. It has indeed n columns and p rows. One can evaluate a linear map from Kn to Kp on every vector on Kn . One has a corresponding operation on matrices. Definition 2.2.7. Let A ∈ Mp,n (K) and X ∈ Kn . We define the multiplication of X by A as A · X = ΦA (X) It is a vector x1 .. If X = . in Kp . and if A has columns C1 , . . . , Cn , then xn A · X = x1 C 1 + · · · + xn C n In the world of linear maps, one can form the composition of two linear maps. In the world of matrices, it will correspond to the multiplication of matrices. More precisely, let n, p, q be positive integers. Let A ∈ Mp,n (K), and B ∈ Mn,q (K). One has the associated linear maps ΦA : Kn → Kp ΦB : Kq → Kn One can then consider the composition ΦA ◦ ΦB ; it is a linear map from Kq to Kp . Definition 2.2.8. We define the product A · B as the matrix associated to the linear map ΦA ◦ ΦB . It is a p × q matrix. The following recipe allows us to compute the matrix the columns of B. By definition 1 0 0 .. ΦB . = C1 . . . ΦB . .. 0 0 1 A · B. Let C1 , . . . , Cq be = Cq Then ΦA ◦ ΦB 1 0 .. . = A · C1 0 ... 0 .. ΦA ◦ ΦB . = A · Cq 0 1 The columns of A · B are then A · C1 , . . . , A · Cq . Exercise 2.2.9. Compute 1 0 2 −3 4 0 2 2 −1 · 0 0 −1 −1 1 7 0 0 2 2.2. MATRICES 53 We have attached several invariants to linear maps, and one can do the same for matrices. Definition 2.2.10. Let A ∈ Mp,n (K), and ΦA : Kn → Kp be the associated linear map. We define the Kernel, the Image, the rank and the nullity of A as those of ΦA . Ker A is a vector subspace of Kn , and Im A a vector subspace of Kp . C1 , . . . , Cn are the columns of A, then If Im A = Vect(C1 , . . . , Cn ) If one exchanges the rows and columns of a matrix, one defined its transpose. Definition 2.2.11. Let n, p be positive integers, and A = (ai,j ) 1≤i≤p ∈ Mp,n (K). 1≤j≤n The transpose of A is the matrix t A ∈ Mn,p (K) defined by t A = (bi,j )1≤i≤n 1≤j≤p with bi,j = aj,i for all 1 ≤ i ≤ n and 1 ≤ j ≤ p. 5 Example 2.2.12. The transpose of the matrix −2 3 7 5 0 is 7 −1 −2 0 3 −1 Given several matrices, one can form a block matrix. Definition 2.2.13. Let n1 , n2 , p1 , p2 be positive integers. Let Ai,j ∈ Mpi ,nj for i, j ∈ {1, 2}. One can then form the block matrix A1,1 A1,2 A= A2,1 A2,2 The matrix A is a (p1 + p2 ) × (n1 + n2 )-matrix. More precisely, the matrix A has coefficients (ai,j ) 1≤i≤p1 +p2 with 1≤j≤n1 +n2 • ai,j = a1,1 i,j if i ≤ p1 and j ≤ n1 • ai,j = a1,2 i,j−n1 if i ≤ p1 and j > n1 • ai,j = a2,1 i−p1 ,j if i > p1 and j ≤ n1 • ai,j = a2,2 i−p1 ,j−n1 if i > p1 and j > n1 where (ak,l i,j )1≤i≤pk are the coefficients of Ak,l . 1≤j≤nl . 54 CHAPTER 2. 
LINEAR MAPS AND MATRICES 2.2.3 Invertible matrices We have already seen that if a linear map between two vector spaces is an isomorphism, then the vector spaces must have the same dimension. If A is a p × n matrix (with n, p positive integers), and ΦA is an isomorphism, then necessarily n = p. Definition 2.2.14. Let n ≥ 1 be an integer, and A ∈ Mn (K). We say that A is invertible if the associated linear map is an isomorphism. We denote by GLn (K) the set of invertible n × n-matrices. The properties proved for linear maps immediately give the following proposition. We denote by In the identity matrix of size n; its associated linear map is the identity id : Kn → Kn . Proposition 2.2.15. Let n ≥ 1 and A ∈ Mn (K). The following assertions are equivalent • A is invertible. • There exists B ∈ Mn (K) with AB = BA = In . • There exists B ∈ Mn (K) with AB = In . • There exists B ∈ Mn (K) with BA = In . • For all Y ∈ Kn , the equation AX = Y has a unique solution X ∈ Kn . If A ∈ Mn (K) is invertible, we will denote by A−1 the unique matrix such that AA−1 = A−1 A = In . It is called the inverse of A. To see if a matrix is invertible or not, and to compute its inverse, one proceeds as follows. One takes a vector Y ∈ Kn , and one considers the equation AX = Y , looking for X ∈ Kn . If there exist some vectors Y for which this equation has no solution, the matrix is not invertible. Otherwise, this equation has a unique solution X ∈ Kn . The coordinates of X depend on Y , and one has the relation X = A−1 Y . This is how one can compute the matrix A−1 . 2 −5 y1 x1 Example 2.2.16. Let A = . For Y = and X = , −1 3 y2 x2 one has AX = Y ⇔ 2x1 − 5x2 = y1 ⇔ −x1 + 3x2 = y2 Thus A is invertible and A−1 = 3 1 5 2 x1 = 3y1 + 5y2 ⇔X= x2 = y1 + 2y2 . 3 1 5 2 Y 2.2. MATRICES 2.2.4 55 Matrix of a linear map In the previous sections, we have used matrices to represent linear maps from Kn to Kp (for some positive integers n, p). One can do the same for linear maps between finite dimensional vector spaces, once bases have been fixed. Indeed, the choice of a basis for a vector space of dimension n ≥ 1, gives an isomorphism between this vector space and Kn . Let E be a vector space of dimension n ≥ 1, and let B = (e1 , . . . , en ) be a basis for E. Let x ∈ E; there exist unique scalars x1 , . . . , xn with x = x1 e1 + · · · + xn en x1 Definition 2.2.17. We say that the vector X = ... represents the coorxn dinates of x in the basis B. Thanks to the basis B, one associates to every element in E a vector in Kn . Let F be another vector space of dimension p ≥ 1, and B1 = (f1 , . . . , fp ) a basis for F . Let u ∈ L(E, F ). For each 1 ≤ i ≤ n, let Ci ∈ Kp be the coordinates of u(ei ) in the basis B1 . Definition 2.2.18. The matrix of u with respect to the bases B and B1 is the matrix U ∈ Mp,n (K) having C1 , . . . , Cn as columns. Example 2.2.19. Let E be a vector space of dimension 2, with basis (e1 , e2 ). Let F be a vector space of dimension 3, with basis (f1 , f2 , f3 ). Let u ∈ L(E, F ) be the linear map defined by u(e1 ) = 2f1 − 2f2 + f3 u(e2 ) = −4f1 + 7f2 The matrix of u with respect to these bases is 2 −4 −2 7 1 0 When one considers an endomorphism u of a vector space E, then one needs only one basis (although one could choose a basis for E as the domain, and another one for E considered as the target space). More precisely, let E be a vector space of dimension n ≥ 1, u ∈ L(E) and fix a basis for E. Then the matrix of u with respect to this basis is an element in Mn (K). 
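To make these definitions concrete, here is a minimal sketch (Python with NumPy, assuming K = R and identifying vectors with their coordinate columns) that builds the matrix of the linear map u of Example 2.2.19 column by column, and uses it to compute the coordinates of an image vector.

import numpy as np

# The i-th column holds the coordinates of u(e_i) in the basis (f1, f2, f3).
U = np.column_stack([
    [2.0, -2.0, 1.0],   # u(e1) = 2 f1 - 2 f2 + f3
    [-4.0, 7.0, 0.0],   # u(e2) = -4 f1 + 7 f2
])

X = np.array([3.0, 1.0])  # coordinates of x = 3 e1 + e2 in the basis (e1, e2)
print(U @ X)              # [2. 1. 3.], i.e. u(x) = 2 f1 + f2 + 3 f3

This mirrors Definition 2.2.18: applying u in coordinates is exactly the multiplication of the coordinate vector by the matrix U.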
2.2.5 Change of coordinates matrix When one considers a linear map between finite dimensional vector spaces, the choice of bases allow us to attach to this linear map a matrix. But this construction depends on the choice of bases for the vector spaces. We will now 56 CHAPTER 2. LINEAR MAPS AND MATRICES describe what happens when other bases are considered. Let E be a vector space of dimension n ≥ 1 and let B = (e1 , . . . , en ) be a basis for E. Let B 0 = (e01 , . . . , e0n ) be another basis for E. For each 1 ≤ i ≤ n, let Pi be the coordinates of e0i in the basis B. Definition 2.2.20. The change of coordinates matrix from B to B 0 is the matrix P ∈ Mn (K) with columns P1 , . . . , Pn . Proposition 2.2.21. The matrix P is in GLn (K). Proof. Indeed, let us consider the linear map v ∈ L(E) sending ei to e0i . Then P is the matrix of v in the basis B. Since v sends a basis to a basis, it is an isomorphism, and the matrix P is then invertible. Let x ∈ E, and let X (resp X 0 ) be the coordinates of x in the basis B (resp. B 0 ). Proposition 2.2.22. One has X = P X 0 , where P is the change of coordinates matrix from B to B 0 . 0 x1 .. 0 Proof. Assume that X = . , and consider the linear map v ∈ L(E) sending ei to e0i x0n as before. Then x = x01 e01 + · · · + x0n e0n = x01 v(e1 ) + · · · + x0n v(en ) = v(x01 e1 + · · · + x0n en ) x1 Assume that X = ... , so that x = x1 e1 + · · · + xn en . Putting everything xn together, one gets x1 e1 + · · · + xn en = v(x01 e1 + · · · + x0n en ) Since P is the matrix of v in the basis B, one gets X = P X 0 . We have described how the coordinates of an element of E behave with respect to a change of basis. We will now do the same for the matrix of a linear map. Let E be a vector space of dimension n, and B, B 0 be two bases for E. Let F be a vector space of dimension p, and B1 , B10 be two bases for F . Let P (resp. Q) be the change of coordinates matrix from B to B 0 (resp. from B1 to B10 ). Let u ∈ L(E, F ) and let U (resp. U 0 ) be the matrix of u with respect to the bases B and B1 (resp. B 0 and B10 ). Proposition 2.2.23. One has U 0 = Q−1 U P . 2.2. MATRICES 57 Proof. Let x ∈ E and let X (resp. X 0 ) be the coordinates of x in the basis B (resp. B 0 ). Let y = u(x), and let Y (resp. Y 0 ) be the coordinates of y in the basis B1 (resp. B10 ). Then one has Y = UX X = P X0 Y = QY 0 One gets Y 0 = Q−1 Y = Q−1 U X = Q−1 U P X 0 This implies that U 0 = Q−1 U P . When one considers an endomorphism of a vector space, one needs only one basis to construct its matrix. One gets the following formula for the change of basis. Corollary 2.2.24. Let w ∈ L(E) be an endomorphism of E, and let A (resp. A0 ) be the matrix of w with respect to the basis B (resp. B 0 ). Then A0 = P −1 AP This corollary motivates the definition of similar matrices. Definition 2.2.25. Let n ≥ 1 and let A, B ∈ Mn (K). We say that A, B are similar if there exist a matrix P ∈ GLn (K) such that B = P −1 AP If A and B are similar, they represent the same endomorphism, but with different bases. 2.2.6 Exercises Exercise 2.2.1. Let n ≥ 1 be a integer and A, B ∈ Mn (K) be diagonal matrices. We call a1 , . . . , an (resp. b1 , . . . , bn ) the coefficients on the diagonal of A (resp. B). Compute the matrix A · B. Exercise 2.2.2. Let θ ∈ R, and Rθ be the rotation of angle θ on the real plane R2 . Prove that Rθ : R2 → R2 is a linear map, and write its matrix. 1 a Exercise 2.2.3. Let a ∈ K and A = ∈ M2 (K). Compute An for 0 1 every n ≥ 1. Exercise 2.2.4. 
Let n ≥ 1 and A ∈ Mn (K) such that AB = BA for all B ∈ Mn (K). Prove that there exists λ ∈ K with A = λIn . Exercise 2.2.5. Let E, F be vector spaces with dim E = n ≥ 1, dim F = p ≥ 1 and let u ∈ L(E, F ). Prove that there exist bases of E and F such that the matrix of u in these bases is Id 0 0 0 This matrix is written as a block matrix, and the top left block has size d × d. What is the integer d equal to? 58 CHAPTER 2. LINEAR MAPS AND MATRICES Exercise 2.2.6. Let n ≥ 1 and D ∈ Mn (K) be a diagonal matrix with diagonal coefficients d1 , . . . , dn . On which condition is D invertible? If this is the case, compute the inverse of D. Exercise 2.2.7. Let n ≥ 1 and let F ⊂ Mn (K) be the set of non-invertible matrices. Is F a vector subspace of Mn (K)? Exercise 2.2.8. Let n, p ≥ 1 and (ai,j ) 1≤i≤p be a p × n matrix. We define the 1≤j≤n transpose of A as the matrix t A = (aj,i )1≤i≤n 1≤j≤p This is a n × p matrix. 1. Write the transpose of the matrix 2 −2 1 −4 7 0 2. Let A, B ∈ Mp,n (K) and λ ∈ K. Prove that t (A + B) = t A + t B t (λA) = λt A t t ( A) = A 3. Let q ≥ 1 and A ∈ Mp,n (K), B ∈ Mn,q (K). Prove that t (A · B) = t B · t A 4. If A ∈ Mn (K) is invertible, prove that t A is invertible and that t A−1 . t A −1 = Exercise 2.2.9. Let E, F be vector spaces, with dim E = n, dim F = p and let u ∈ L(E, F ). Fix bases for E and F , and let U be the matrix of u with respect to these bases. Consider the dual u∗ ∈ L(F ∗ , E ∗ ). We endow the vector spaces E ∗ , F ∗ with the dual bases. Compute the matrix of u∗ with respect to these bases. Exercise 2.2.10. Let n ≥ 1; we recall that a n × n matrix A = (ai,j ) 1≤i≤n is 1≤j≤n upper triangular if ai,j = 0 when i > j. Let E be a vector space of dimension n, and let B = (e1 , . . . , en ) be a basis. For 1 ≤ i ≤ n − 1, define Ei = Vect(e1 , . . . , ei ). 1. Let u ∈ L(E), and let U be its matrix in the basis B. Prove that U is upper triangular if and only if u(Ei ) ⊆ Ei for 1 ≤ i ≤ n − 1. 2. Prove that the product of two upper triangular matrices is upper triangular. 2.2. MATRICES 59 3. Prove that if an upper triangular matrix is invertible, then its inverse is upper triangular. 4. Let Un (K) be the set of upper triangular matrices. Prove that Un (K) is a vector subspace of Mn (K), and compute its dimension. Exercise 2.2.11. Let n, p ≥ 1 and A ∈ Mn,p (K). Using exercises 2.2.5 and 2.2.8, prove that rk A = rk t A 60 CHAPTER 2. LINEAR MAPS AND MATRICES Chapter 3 Determinant and applications 3.1 3.1.1 Determinant Invertible 2 × 2 matrices a c Let us consider a matrix A = ∈ M2 (K), with a, b, c, d scalars. We b d would like to have a criterion to know one would if A is invertible. Equivalently, a c like to know when the vectors x = and y = form a basis for K2 . b d If the vector x is the zero vector, it is not the case. Assume that x 6= 0; then (x, y) is linearly dependent if and only if there exists λ ∈ K such that y = λx. One then tries to solve the equations c = λa d = λb Let us distinguish two cases. First, consider the case where a = 0. The equations become c = 0 and d = λb. Since x 6= 0, one has b 6= 0, and the second equation has a unique solution λ ∈ K. The first equation gives a condition on the vector y. In this case, the family (x, y) is linearly dependent if and only if c = 0. Let us assume now that a 6= 0. The first equation gives a value for λ : one gets λ = ac . Substituting in the second equation, one gets the relation d = bc a . In this case, the family (x, y) is linearly dependent if and only if ad = bc. 
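This case analysis can be sanity-checked numerically. The sketch below (Python with NumPy, assuming K = R; the helper names are ours, for illustration only) compares the scalar condition ad = bc with a rank computation on a few examples.

import numpy as np

def dependent_by_criterion(a, b, c, d):
    # the condition obtained from the case analysis above
    return a * d - b * c == 0

def dependent_by_rank(a, b, c, d):
    # columns x = (a, b) and y = (c, d); dependent iff rank < 2
    A = np.array([[a, c], [b, d]], dtype=float)
    return np.linalg.matrix_rank(A) < 2

for (a, b, c, d) in [(1, 2, 2, 4), (1, 2, 3, 4), (0, 1, 0, 5)]:
    assert dependent_by_criterion(a, b, c, d) == dependent_by_rank(a, b, c, d)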
Putting all the cases together, one gets the following criterion.

Proposition 3.1.1. Consider two vectors x = (a, b) and y = (c, d) in K2. Then the family (x, y) is a basis if and only if

ad − bc ≠ 0

Proof. From the previous discussion, one has that (x, y) is linearly dependent if and only if ad − bc = 0, hence the result.

This suggests the definition of the determinant.

Definition 3.1.2. Let x = (a, b) and y = (c, d) be in K2. One defines the determinant of the family (x, y) as

det(x, y) = ad − bc

From the previous proposition, the family (x, y) is a basis if and only if the quantity det(x, y) is non-zero.

3.1.2 Definition of the determinant

Let n ≥ 1 be an integer. One would like to define the determinant of a family of n vectors in Kn. More precisely, one would like to define a function

Kn × · · · × Kn → K, (x1, . . . , xn) → det(x1, . . . , xn)

such that (x1, . . . , xn) is a basis if and only if det(x1, . . . , xn) ≠ 0. For n = 1, one can take the function K → K defined by x → x. For n = 2, one can take the function (x, y) → det(x, y) previously defined. Note that this function is not a linear map; indeed one has

det(λ · (x, y)) = λ² det(x, y)

for every λ ∈ K and (x, y) ∈ K2 × K2. But if y is fixed, then the function x → det(x, y) is a linear form (and similarly if x is fixed). One says that the function (x, y) → det(x, y) is a 2-linear form. Let us define this notion. Let us fix a vector space E of dimension n ≥ 1.

Definition 3.1.3. Let

d : E × · · · × E → K, (x1, . . . , xn) → d(x1, . . . , xn)

be a function. We say that d is an n-linear form if for all 1 ≤ i ≤ n and all fixed (xj)j≠i, the map

E → K, xi → d(x1, . . . , xn)

is a linear form.

If n = 1, a 1-linear form is just a linear form. For n = 2, the determinant defined in the previous section is a 2-linear form. We denote by Ln(E, K) the set of n-linear forms.

Proposition 3.1.4. The set Ln(E, K) is a vector space.

Proof. The set of n-linear forms is a subset of the set of functions from E × · · · × E to K, which is a vector space. One then proves that Ln(E, K) is a vector subspace of this vector space.

Definition 3.1.5. Let d ∈ Ln(E, K). We say that d is alternating if

d(x1, . . . , xn) = 0

whenever there exists i ≠ j with xi = xj.

This definition is motivated by the fact that a family of vectors (x1, . . . , xn) is not a basis if two vectors of the family are equal. We denote by An(E) the set of alternating n-linear forms. This is a vector subspace of Ln(E, K).

Proposition 3.1.6. An n-linear form d is alternating if and only if for every i < j and all x1, . . . , xn

d(x1, . . . , xi, . . . , xj, . . . , xn) = −d(x1, . . . , xj, . . . , xi, . . . , xn)

Proof. If d satisfies the property in the proposition, let x1, . . . , xn be vectors in E with xi = xj for some integers i < j. We then get

d(x1, . . . , xi, . . . , xi, . . . , xn) = −d(x1, . . . , xi, . . . , xi, . . . , xn)

which implies that d(x1, . . . , xn) = 0. This proves that d is alternating. Conversely, assume that d is alternating. Take for simplicity i = 1 and j = 2. Then for all x1, . . . , xn, one has

d(x1 + x2, x1 + x2, x3, . . . , xn) = 0

Using the n-linearity, one gets

d(x1, x1, x3, . . . , xn) + d(x2, x2, x3, . . . , xn) + d(x1, x2, x3, . . . , xn) + d(x2, x1, x3, . . . , xn) = 0

The first two terms are 0 by the alternating property, and one can thus conclude.

If d is an alternating n-linear form, then the quantity d(x1, . . .
, xn ) gets multiplied by −1 if one exchanges the place of two vectors. One has a more general result when one changes the order of the vectors x1 , . . . , xn . We denote by Sn the set of permutations of the set {1, . . . , n}, i.e. the set of bijections from {1, . . . , n} to itself. It is a finite set of cardinality n!. To each permutation σ, one can attach a sign ε(σ) ∈ {−1, 1}, called the signature of σ. Corollary 3.1.7. Let d be an alternating n-linear form. Then for all x1 , . . . , xn ∈ E and σ ∈ Sn one has d(xσ(1) , . . . , xσ(n) ) = ε(σ)d(x1 , . . . , xn ) where ε(σ) is the signature of σ. 64 CHAPTER 3. DETERMINANT AND APPLICATIONS Proof. The above property is true for transpositions. Let σ ∈ Sn be a permutation. Since the set of permutations is generated by transpositions, there exist k ≥ 1 and transpositions τ1 , . . . , τk such that σ = τ1 · · · · · τk . Then d(xσ(1) , . . . , xσ(n) ) = (−1)k d(x1 , . . . , xn ) Since the signature of σ is (−1)k , this concludes the proof. We are now ready to define the determinant for E. Recall that E is a vector space of dimension n. One will need a basis B = (e1 , . . . , en ) for E. Theorem 3.1.8. The space An (E) has dimension 1. Moreover, there exists a unique element d ∈ An (E) such that d(e1 , . . . , en ) = 1 This n-linear form is called the determinant with respect to the basis B, and is denoted by detB . Proof. Let d ∈ An (E), and x1 , . . . xn ∈ E. Let us write for every i between 1 and n xi = xi,1 e1 + · · · + xi,n en Then by n-linearity d(x1 , . . . , xn ) = d(x1,1 e1 + · · · + x1,n en , . . . , xn,1 e1 + · · · + xn,n en ) n n X X = ··· x1,i1 . . . xn,in d(ei1 , . . . , ein ) i1 =1 in =1 The quantity d(ei1 , . . . , ein ) is zero unless there exists σ ∈ Sn with ik = σ(k), in which case it is equal to ε(σ)d(e1 , . . . , en ) One then gets d = λ detB , with λ = d(e1 , . . . , en ) and X detB (x1 , . . . , xn ) = ε(σ)x1,σ(1) . . . xn,σ(n) σ∈Sn One checks that the function detB is a n-linear form. It is alternating since detB (xτ (1) , . . . , xτ (n) ) = ε(τ )detB (x1 , . . . , xn ) for any permutation τ ∈ Sn . This concludes the proof, remarking that detB (e1 , . . . , en ) = 1 3.1. DETERMINANT 65 Let us now consider Kn , for an integer n ≥ 1. In this case, one will take the standard basis for Kn . Definition 3.1.9. If x1 , . . . , xn are in Kn , we define det(x1 , . . . , xn ) as the determinant with respect to the standard basis of Kn . One can also define the determinant of a matrix. Definition 3.1.10. Let A ∈ Mn (K), and let C1 , . . . , Cn be the columns of A. We define the determinant of A as det(A) = det(C1 , . . . , Cn ) We also note |A| for the determinant of A. 3.1.3 Properties The determinant can be used to determine if a family of vectors is a basis. Let n ≥ 1 be an integer, and let E be vector space of dimension n. Let B be a basis for E. We have defined an alternating n-linear form detB in the previous section. It enjoys the following property. Proposition 3.1.11. Let x1 , . . . , xn be elements in E. Then the family (x1 , . . . , xn ) is a basis for E if and only if detB (x1 , . . . , xn ) 6= 0. Proof. Assume that B 0 = (x1 , . . . , xn ) is a basis. Then the n-linear forms detB and detB0 are non-zero elements of An (E), which has dimension 1. Therefore, there exists λ ∈ K, non-zero, such that detB = λdetB0 Evaluating at (x1 , . . . , xn ), one gets λ = detB (x1 , . . . , xn ), and this quantity is then non-zero. Assume now that (x1 , . . . , xn ) is linearly dependent. 
One of the vectors is a linear combination of the others. Assume for simplicity that x1 = λ 2 x2 + · · · + λ n xn Then one has detB (x1 , . . . , xn ) = λ2 detB (x2 , x2 , . . . , xn ) + · · · + λn detB (xn , x2 , . . . , xn ) All of these terms are 0 by the alternating property, and thus detB (x1 , . . . , xn ) = 0. 66 CHAPTER 3. DETERMINANT AND APPLICATIONS One has immediately the analogous result for vectors in Kn and matrices. Recall that one has an explicit formula for the determinant of a matrix. Let A = (ai,j ) 1≤i≤n be a n × n matrix. Then 1≤j≤n X ε(σ)aσ(1),1 . . . aσ(n),n det(A) = σ∈Sn Corollary 3.1.12. Let A ∈ Mn (K) be a matrix. Then A is invertible if and only if det(A) 6= 0. The following result tells us that a matrix and its transpose have the same determinant. Proposition 3.1.13. Let A ∈ Mn (K); one has det(t A) = det(A) Proof. One has det(A) = X ε(σ)aσ(1),1 . . . aσ(n),n σ∈Sn det(t A) = X ε(σ)a1,σ(1) . . . an,σ(n) σ∈Sn Notice that for each σ ∈ Sn a1,σ(1) . . . an,σ(n) = aσ−1 (1),1 . . . aσ−1 (n),n Therefore det(t A) = X ε(σ)aσ−1 (1),1 . . . aσ−1 (n),n = σ∈Sn X ε(σ −1 )aσ(1),1 . . . aσ(n),n σ∈Sn One can then conclude using the fact that σ and σ −1 have the same signature for every σ ∈ Sn . 3.1.4 The determinant of a linear map Let E be a vector space of dimension n ≥ 1 and u ∈ L(E). If we choose a basis for E, one can associate to u its matrix with respect to this basis, and compute its determinant. It is however unclear if this quantity depends on the choice of the basis. It is actually the case, as is shown by the following proposition. Proposition 3.1.14. Let E be a vector space of dimension n ≥ 1 and u ∈ L(E). One can associate to u a scalar det(u), called the determinant of u. It is equal to the determinant of the matrix of u with respect to any basis for E. Proof. Consider the function An (E) → An (E) d(x1 , . . . , xn ) → d(u(x1 ), . . . , u(xn )) 3.1. DETERMINANT 67 This function is well defined, and is a linear map. Since the vector space An (E) has dimension 1, this linear map is the multiplication by a scalar that we will denote by det(u). By definition, one has d(u(x1 ), . . . , u(xn )) = det(u)d(x1 , . . . , xn ) for all d ∈ An (E) and x1 , . . . , xn ∈ E. Let B = (e1 , . . . , en ) be a basis for E. We apply the previous result with d = detB and (x1 , . . . , xn ) = (e1 , . . . , en ). One gets det(u) = detB (u(e1 ), . . . , u(en )) and the right hand side is the determinant of the matrix of u with respect to B. From the proof of this proposition, one gets that the determinant of the identity of E is 1 (det(id) = 1). We will also be able to compute the determinant of a composition of endomorphisms. Proposition 3.1.15. Let E be a (non zero) finite dimensional vector space, and u, v ∈ L(E). Then det(v ◦ u) = det(v) · det(u) Proof. Let B = (e1 , . . . , en ) be a basis for E. Then det(v ◦ u) = detB (v(u(e1 )), . . . , v(u(en ))) = det(v) · detB (u(e1 ), . . . , u(en )) = det(v) · det(u) In the world of matrices, one has the following result. Corollary 3.1.16. Let n ≥ 1. Then det(In ) = 1. Let A, B ∈ Mn (K). Then det(AB) = det(A) · det(B) If A is invertible, then det(A−1 ) = 1 det(A) Proof. Set E = Kn . The matrix of the identity of E is In , and the determinant of this matrix is thus 1. Let ΦA and ΦB be the endomorphisms respectively associated to A and B. Then the endomorphism associated to AB is ΦA ◦ ΦB , and thus det(AB) = det(ΦA ◦ ΦB ) = det(ΦA ) det(ΦB ) = det(A) det(B) If A is invertible, then AA−1 = In . 
Taking the determinant and using the previous relation, one gets det(In ) = det(A) det(A−1 ). Since the determinant of the identity matrix is 1, one gets the desired result. 68 3.1.5 CHAPTER 3. DETERMINANT AND APPLICATIONS Computation of the determinant of a matrix In this section, we will develop tools for computing the determinant of a n × n matrix. Let us start with formulas for small values of n. Recall that the determinant of a matrix A is denoted by |A|. Case n = 1: let a ∈ K; then |a| = a Case n = 2: let a, b, c, d ∈ K; then a b c = ad − bc d Case n = 3: let a, b, c, d, e, f, g, h, i ∈ K; then a b c d e f g h = aei + bf g + cdh − ceg − af h − bdi i For general n (and often for n = 3), one will not use the general formula to compute a determinant. The first is the expansion against a column. tool x1 Let n ≥ 2, A ∈ Mn (K), and let ... be the first column of A. xn Proposition 3.1.17. One has det(A) = x1 D1 − x2 D2 + · · · + (−1)n+1 xn Dn where Di is the determinant obtained by removing the first column, and the i-th row. To compute the determinant of a n×n matrix, one is then reduced to compute n determinants of (n − 1) × (n − 1) matrices. One can then compute a determinant by induction (when one ends up with 2 × 2-matrices, one can use the formula). The above proposition is called the expansion against the first column. One can similarly expand a determinant against other columns, but one should be careful about the signs. Let A ∈ Mn (K), and let j be an integer between 1 and y1 n. Let ... be the j-th column of A. yn Proposition 3.1.18. One has det(A) = (−1)j+1 y1 E1 + (−1)j+2 y2 E2 + · · · + (−1)j+n yn En where Ei is the determinant obtained by removing the j-th column, and the i-th row. 3.1. DETERMINANT 69 The sign associated to the coefficient in place (i, j) is (−1)i+j , for all i, j between 1 and n. To compute a determinant, one can also form linear combinations of columns. Let A ∈ Mn (K), and let C1 , . . . , Cn be the P columns of A. Let 1 ≤ i ≤ n, and let (λj )j6=i be scalars. Define Ci0 = Ci + j6=i λj Cj and Cj0 = Cj if j 6= i. Let A0 be the matrix with columns C10 , . . . , Cn0 . Proposition 3.1.19. One has det(A) = det(A0 ). In other words, the determinant does not change if one adds to a column a linear combination of the other columns. Here are other rules for the determinant. • If you swap two columns, the determinant gets multiplied by −1. • If you multiply a column by a scalar λ, the determinant gets multiplied by λ. We have seen that the determinant of a matrix is equal to the determinant of its transpose. Since the transposition exchanges lines and columns, one can do the same operations on the lines of a matrix. Explicitly, one can • expand the determinant along a line • add to a line a linear combination of the other lines • swap two lines (the determinant is multiplied by −1) • multiply a line by a scalar λ (the determinant is multiplied by λ) 3.1.6 Exercises Exercise 3.1.1. Let n ≥ 1 and D ∈ Mn (K) be a diagonal matrix with coefficients d1 , . . . , dn . Compute det D. Exercise 3.1.2. Let n ≥ 1, A ∈ Mn (K) and λ ∈ K. Express the determinant of λA in function of the determinant of A. Exercise 3.1.3. Let n ≥ 1 and A = (ai,j ) 1≤i≤n ∈ Mn (K). We define the trace 1≤j≤n of A by Tr A = a1,1 + a2,2 + · · · + an,n 1. Prove that Tr is a linear form on Mn (K). 2. Compute Tr(AB) for A, B ∈ M2 (K). 3. Prove that Tr(AB) = Tr(BA) for A, B ∈ M2 (K). 4. Prove that Tr(AB) = Tr(BA) for A, B ∈ Mn (K). 70 CHAPTER 3. DETERMINANT AND APPLICATIONS 5. 
3.1.6 Exercises

Exercise 3.1.1. Let n ≥ 1 and let D ∈ M_n(K) be a diagonal matrix with coefficients d_1, . . . , d_n. Compute det D.

Exercise 3.1.2. Let n ≥ 1, A ∈ M_n(K) and λ ∈ K. Express the determinant of λA in terms of the determinant of A.

Exercise 3.1.3. Let n ≥ 1 and A = (a_{i,j})_{1≤i,j≤n} ∈ M_n(K). We define the trace of A by

    Tr A = a_{1,1} + a_{2,2} + · · · + a_{n,n}

1. Prove that Tr is a linear form on M_n(K).
2. Compute Tr(AB) for A, B ∈ M_2(K).
3. Prove that Tr(AB) = Tr(BA) for A, B ∈ M_2(K).
4. Prove that Tr(AB) = Tr(BA) for A, B ∈ M_n(K).
5. Let E be a vector space of dimension n and u ∈ L(E). Let U ∈ M_n(K) be the matrix of u in a basis B. Prove that Tr U does not depend on the choice of the basis B. We will call this element the trace of u, and write it Tr u.
6. Let E be a vector space of dimension n, and let p ∈ L(E) be a projection. Prove that Tr p = rk p.

Exercise 3.1.4. (Vandermonde matrix) Let n ≥ 2 and x_1, . . . , x_n ∈ K. We define the n × n matrix

    A =
    ( 1  x_1  x_1²  . . .  x_1^{n−1} )
    ( 1  x_2  x_2²  . . .  x_2^{n−1} )
    ( 1  x_3  x_3²  . . .  x_3^{n−1} )
    ( ⋮   ⋮    ⋮     ⋱        ⋮      )
    ( 1  x_n  x_n²  . . .  x_n^{n−1} )

1. Prove that

    det A =
    | 1  x_1  x_1²  . . .  x_1^{n−2}  P(x_1) |
    | 1  x_2  x_2²  . . .  x_2^{n−2}  P(x_2) |
    | 1  x_3  x_3²  . . .  x_3^{n−2}  P(x_3) |
    | ⋮   ⋮    ⋮     ⋱        ⋮         ⋮    |
    | 1  x_n  x_n²  . . .  x_n^{n−2}  P(x_n) |

for any polynomial P of degree n − 1 with leading coefficient 1 (i.e. a polynomial of the form P = X^{n−1} + a_{n−2} X^{n−2} + · · · + a_0).

2. Compute the determinant of A.

3. Under which condition is A invertible?

Exercise 3.1.5. Let n ≥ 1 and let A ∈ M_n(K) be a matrix such that Tr(AB) = 0 for every matrix B ∈ M_n(K). Prove that A = 0.

Exercise 3.1.6. Let n ≥ 1 and let φ be a linear form on M_n(K). Prove that there exists A ∈ M_n(K) such that φ(X) = Tr(AX) for all X ∈ M_n(K).

Exercise 3.1.7. 1. Let n ≥ 0. Prove that there exist polynomials T_n, U_n of degree n such that for every x ∈ R

    cos(nx) = T_n(cos(x))
    sin((n + 1)x) = sin(x) U_n(cos(x))

(Remark: these polynomials are called the Chebyshev polynomials.)

2. Let n ≥ 2, and x_1, . . . , x_n ∈ R. Compute

    | 1  cos(x_1)  cos(2x_1)  . . .  cos((n − 1)x_1) |
    | 1  cos(x_2)  cos(2x_2)  . . .  cos((n − 1)x_2) |
    | 1  cos(x_3)  cos(2x_3)  . . .  cos((n − 1)x_3) |
    | ⋮      ⋮         ⋮        ⋱          ⋮         |
    | 1  cos(x_n)  cos(2x_n)  . . .  cos((n − 1)x_n) |

Exercise 3.1.8. Let n ≥ 2 and consider the matrix whose coefficients are all 1

    J =
    ( 1  . . .  1 )
    ( ⋮    ⋱    ⋮ )
    ( 1  . . .  1 )

1. Prove that for every matrix A ∈ M_n(K) the function x → det(A + xJ) is a polynomial of degree at most 1.

2. Let a, b, x_1, . . . , x_n ∈ K with a ≠ b. Compute

    | x_1  a    a    . . .  a   |
    | b    x_2  a    . . .  a   |
    | b    b    x_3  . . .  a   |
    | ⋮    ⋮    ⋮     ⋱     ⋮   |
    | b    b    b    . . .  x_n |

(the coefficients above the diagonal are equal to a, those below the diagonal are equal to b).

3. Treat the case a = b.

3.2 Applications to systems of linear equations

3.2.1 Definition and properties

We start by defining linear equations and systems of linear equations.

Definition 3.2.1. Let n ≥ 1. A linear equation in x_1, . . . , x_n is an equation of the form

    a_1 x_1 + · · · + a_n x_n = b

with a_1, . . . , a_n, b ∈ K.

Definition 3.2.2. A system of linear equations is a collection of linear equations.

Example 3.2.3.

    2x + 3y = 5
    7x + 11y = −3

In general, a system of linear equations has p equations, n variables x_1, . . . , x_n, and has the form

    a_{1,1} x_1 + · · · + a_{1,n} x_n = b_1
    ⋮
    a_{p,1} x_1 + · · · + a_{p,n} x_n = b_p

We will write A = (a_{i,j})_{1≤i≤p, 1≤j≤n}. A is the matrix associated to the system of equations; one has A ∈ M_{p,n}(K). If we write X = (x_1, . . . , x_n) and B = (b_1, . . . , b_p), viewed as column vectors, the system of equations is equivalent to

    AX = B

Then one has the following cases.

• If B ∉ Im A, there is no solution.

• Assume that B ∈ Im A. There exists then X_0 ∈ K^n such that AX_0 = B. Then

    AX = B ⇔ AX = AX_0 ⇔ A(X − X_0) = 0 ⇔ X − X_0 ∈ Ker A

The set of solutions is then

    {X_0 + X_1, X_1 ∈ Ker A}

If Ker A = {0}, there is at most one solution. If not, there are either no solutions or infinitely many (since Ker A is in this case a vector space of positive dimension, and hence has an infinite number of elements).

In the special case where n = p and A is invertible, one always has a unique solution. Indeed, in this case AX = B ⇔ X = A⁻¹B. One even has an explicit formula for the solution. However, computing the inverse of A is complicated in general (more complicated than solving the system of linear equations directly).
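In practice one solves AX = B directly rather than by computing A⁻¹. As an illustration, here is a minimal sketch solving the system of Example 3.2.3 with numpy, whose solver is based on Gaussian elimination (the method of Section 3.2.3 below), not on matrix inversion:

    # Solving the system of Example 3.2.3 numerically.
    import numpy as np

    A = np.array([[2.0, 3.0], [7.0, 11.0]])
    B = np.array([5.0, -3.0])
    X = np.linalg.solve(A, B)      # unique solution, since det(A) = 1 != 0
    assert np.allclose(A @ X, B)   # X is approximately (64, -41)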
3.2.2 Cramer's rule

In this section, we assume that the matrix A ∈ M_n(K) is invertible, and we study the equation AX = B, where B ∈ K^n is given. Let C_1, . . . , C_n be the columns of A. We define

    ∆ = det(A) = det(C_1, . . . , C_n)

and

    ∆_1 = det(B, C_2, . . . , C_n)
    ⋮
    ∆_n = det(C_1, . . . , C_{n−1}, B)

Let X = (x_1, . . . , x_n) be the unique solution of AX = B.

Proposition 3.2.4 (Cramer's rule). One has

    x_1 = ∆_1/∆, . . . , x_n = ∆_n/∆

Note that ∆ ≠ 0 since A is invertible.

Proof. Since X satisfies AX = B, one has

    x_1 C_1 + · · · + x_n C_n = B

One then computes

    ∆_1 = det(B, C_2, . . . , C_n)
        = det(x_1 C_1 + · · · + x_n C_n, C_2, . . . , C_n)
        = x_1 det(C_1, C_2, . . . , C_n) + x_2 det(C_2, C_2, . . . , C_n) + · · · + x_n det(C_n, C_2, . . . , C_n)
        = x_1 det(C_1, . . . , C_n) = x_1 ∆

One finally gets x_1 = ∆_1/∆, and similarly for x_2, . . . , x_n.

Example 3.2.5. Let us consider the system of linear equations

    2x + 5y = 5
    4x + 11y = −3

One has

    ∆ = | 2  5  | = 2      ∆_1 = | 5   5  | = 70      ∆_2 = | 2  5  | = −26
        | 4  11 |                | −3  11 |                 | 4  −3 |

Thus the solution is

    x = ∆_1/∆ = 35        y = ∆_2/∆ = −13

3.2.3 Gaussian elimination

Cramer's rule gives an explicit formula for the solution of a system of linear equations. Unfortunately, it applies only when the associated matrix is invertible. We will describe in this section a process to solve systems of linear equations in the general case.

First let us consider the case where p = n and

    A =
    ( d_1  ∗    ∗  . . .  ∗   )
    ( 0    d_2  ∗  . . .  ∗   )
    ( ⋮         ⋱         ⋮   )
    ( 0    0    0  . . .  d_n )

with the scalars d_1, . . . , d_n non-zero. Then

    det A = d_1 d_2 · · · d_n ≠ 0

so one could use Cramer's rule. Alternatively, one can solve the equation AX = B starting from the last equation:

    d_n x_n = b_n ⇔ x_n = b_n/d_n

This equation gives the value of x_n. Then the (n − 1)-th equation is

    d_{n−1} x_{n−1} + λ x_n = b_{n−1}

for some scalar λ, and this equation gives us the value of x_{n−1}. Repeating this process, one solves the equation AX = B.

Now let p, n be positive integers, and consider a matrix A ∈ M_{p,n}(K) of the form

    A =
    ( d_1  ∗    . . .  ∗    ∗  . . .  ∗ )
    ( 0    d_2  . . .  ∗    ∗  . . .  ∗ )
    ( ⋮          ⋱     ⋮    ⋮         ⋮ )
    ( 0    0    . . .  d_r  ∗  . . .  ∗ )
    ( 0    0    . . .  0    0  . . .  0 )
    ( ⋮    ⋮           ⋮    ⋮         ⋮ )
    ( 0    0    . . .  0    0  . . .  0 )

for some integer r ≥ 1 and non-zero scalars d_1, . . . , d_r.

Lemma 3.2.6. The above matrix has rank r, and its image is Vect(e_1, . . . , e_r), where (e_1, . . . , e_p) is the standard basis for K^p.

Proof. It is clear that the image of A lies inside Vect(e_1, . . . , e_r). Moreover, d_1 e_1 is in the image of A; since d_1 ≠ 0, so is e_1. There exists λ ∈ K such that d_2 e_2 + λ e_1 is in the image of A. Since e_1 is in the image of A and d_2 ≠ 0, one concludes that e_2 belongs to the image of A. Repeating this process, one sees that the vectors e_1, . . . , e_r are in the image of A, hence the result.

One can now solve the equation AX = B for a vector B = (b_1, . . . , b_p).

Proposition 3.2.7. The system of equations AX = B has a solution if and only if

    b_{r+1} = · · · = b_p = 0

If this is the case, the solution X is entirely determined by the coefficients x_{r+1}, . . . , x_n.

Proof. If there exists a solution, one necessarily has b_{r+1} = · · · = b_p = 0, since the corresponding rows of A are zero. Let us assume that these conditions are satisfied, and consider the equation AX = B. We have thus p equations, and the equations number r + 1, r + 2, . . . , p are automatically satisfied. The equation number r gives

    d_r x_r + λ_{r+1} x_{r+1} + · · · + λ_n x_n = b_r

Since d_r ≠ 0, this gives the value of x_r in terms of x_{r+1}, . . . , x_n. The equation number r − 1 is

    d_{r−1} x_{r−1} + µ_r x_r + µ_{r+1} x_{r+1} + · · · + µ_n x_n = b_{r−1}

Substituting the value found for x_r, this gives the value of x_{r−1} in terms of x_{r+1}, . . . , x_n. Repeating this process, one computes the values of x_1, . . . , x_r in terms of the variables x_{r+1}, . . . , x_n. There thus exist solutions to the equation AX = B, and a solution is uniquely determined by the values of x_{r+1}, . . . , x_n.
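The back substitution described in the proofs above is straightforward to implement. A minimal Python sketch for the square triangular case (our own illustration):

    # Back substitution for an upper triangular system AX = B whose
    # diagonal coefficients A[0][0], ..., A[n-1][n-1] are non-zero.
    def back_substitution(A, B):
        n = len(B)
        X = [0.0] * n
        for i in range(n - 1, -1, -1):
            # equation i reads  A[i][i]*x_i + sum_{j > i} A[i][j]*x_j = b_i
            X[i] = (B[i] - sum(A[i][j] * X[j] for j in range(i + 1, n))) / A[i][i]
        return X

    # Example: x + 2y = 5 and 3y = 6 give y = 2, then x = 1.
    assert back_substitution([[1.0, 2.0], [0.0, 3.0]], [5.0, 6.0]) == [1.0, 2.0]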
We turn to the general case, with A ∈ M_{p,n}(K). The method of Gaussian elimination consists in performing elementary operations on A to reach a matrix of the form studied previously. These operations are:

• Swap two rows of A. This amounts to switching the order of two equations.
• Add to a row a scalar multiple of another row. This amounts to adding to one equation a scalar multiple of another equation.
• Swap two columns of A. This amounts to switching the order of two variables.

One starts with A = (a_{i,j})_{1≤i≤p, 1≤j≤n} ∈ M_{p,n}(K). If A = 0, we are done. If not, up to swapping some columns, one can reduce to the case where the first column of A is non-zero. Up to swapping some rows, one reduces to the case where a_{1,1} ≠ 0. Now let L_1, . . . , L_p be the rows of A. We replace L_i by L_i − (a_{i,1}/a_{1,1}) L_1 for every 2 ≤ i ≤ p. After these operations, the matrix has the form

    A′ =
    ( a_{1,1}  ∗  . . .  ∗ )
    ( 0        ∗  . . .  ∗ )
    ( ⋮        ⋮         ⋮ )
    ( 0        ∗  . . .  ∗ )

Now we apply the same algorithm to the matrix A_1 ∈ M_{p−1,n−1}(K) obtained from A′ by removing the first row and the first column.

Example 3.2.8. Let a, b, c ∈ K and consider the system of equations

    x − 3y = a
    3x − 8y = b
    −7x − 11y = c

The coefficient of x in the first equation is not zero. Adding a scalar multiple of the first equation to each of the other ones gives the system

    x − 3y = a
    y = b − 3a
    −32y = c + 7a

Now we add to the last equation a multiple of the second one, which gives

    x − 3y = a
    y = b − 3a
    0 = c + 32b − 89a

In conclusion, the previous system has a solution if and only if

    c + 32b − 89a = 0

in which case the solution is

    x = −8a + 3b
    y = −3a + b

3.2.4 Applications of the Gaussian elimination

Let n, p be positive integers, and let A ∈ M_{p,n}(K). Thanks to the Gaussian algorithm, elementary operations on the matrix A lead to a matrix of the form

    ( D  ∗ )
    ( 0  0 )

where D ∈ M_r(K) is an r × r matrix which is upper triangular and whose diagonal coefficients are non-zero. It is not hard to see that the elementary operations preserve the rank; therefore r is equal to the rank of A. Let us sum up the result of the Gaussian elimination algorithm.

Theorem 3.2.9. Let A ∈ M_{p,n}(K), and B ∈ K^p. Let r be the rank of A. For the equation AX = B to have a solution, B must satisfy p − r conditions. If these conditions are satisfied, the solution will depend on n − r parameters.

We will now study the applications of this result, and what the Gaussian elimination allows us to compute. We have already seen that it can be used to compute the rank of a matrix.

If A ∈ M_n(K) is invertible, then one can solve the equation AX = Y for every vector Y ∈ K^n. Since the solution is X = A⁻¹Y, solving AX = e_i for each standard basis vector e_i yields the i-th column of A⁻¹; one can thus compute the matrix A⁻¹.
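Here is a minimal Python sketch of the elimination loop, computing the rank (our own illustration; instead of swapping columns it simply looks for a pivot in the next non-zero column, which produces the same staircase shape):

    # Gaussian elimination computing the rank of a p x n matrix.
    def rank(A):
        A = [row[:] for row in A]      # work on a copy
        p, n = len(A), len(A[0])
        r = 0                          # index of the next pivot row
        for j in range(n):
            # find a pivot in column j, among the rows not yet used
            i = next((i for i in range(r, p) if A[i][j] != 0), None)
            if i is None:
                continue
            A[r], A[i] = A[i], A[r]    # swap two rows
            for k in range(r + 1, p):  # eliminate below the pivot
                factor = A[k][j] / A[r][j]
                A[k] = [A[k][l] - factor * A[r][l] for l in range(n)]
            r += 1
        return r

    # The matrix of the system of Example 3.2.8 has rank 2.
    assert rank([[1.0, -3.0], [3.0, -8.0], [-7.0, -11.0]]) == 2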
One can also compute the dimension of a vector subspace, knowing a spanning family. If f_1, . . . , f_n ∈ K^p and F = Vect(f_1, . . . , f_n), one can consider the matrix A ∈ M_{p,n}(K) whose columns are the vectors f_1, . . . , f_n. The dimension of F is then the rank of A.

One can find a basis for a vector subspace defined by some linear equations. Let A ∈ M_{p,n}(K), and F = Ker A ⊆ K^n. Since F is defined by the equation AX = 0, the Gaussian elimination algorithm allows us to find a basis for F.

Example 3.2.10. Let A = ( 2  −1  2 ) and F = Ker A. Then (x, y, z) ∈ F if and only if 2x − y + 2z = 0. The vector subspace F consists of the vectors of the form

    (    x     )
    ( 2x + 2z  )
    (    z     )

A basis for F is then

    ( 1 )   ( 0 )
    ( 2 ) , ( 2 )
    ( 0 )   ( 1 )

Conversely, take a vector subspace F ⊆ K^n, with basis (f_1, . . . , f_k). One might want to find equations defining F, i.e. find a matrix A′ ∈ M_{p,n}(K) for a certain integer p such that F = Ker A′. Let X ∈ K^n. Then X ∈ F if and only if there exist λ_1, . . . , λ_k ∈ K such that X = λ_1 f_1 + · · · + λ_k f_k. This is equivalent to the fact that the equation AZ = X has a solution, where A ∈ M_{n,k}(K) is the matrix whose columns are f_1, . . . , f_k. Since the rank of A is k, the fact that the equation AZ = X has a solution gives n − k conditions on the vector X. These conditions are equivalent to X ∈ F.

Example 3.2.11. Let F ⊆ K³ be the vector subspace with basis

    ( 1 )   ( 0 )
    ( 2 ) , ( 2 )
    ( 0 )   ( 1 )

Let X = (x, y, z) ∈ K³; then X ∈ F if and only if there exist λ, µ ∈ K such that

    λ = x
    2λ + 2µ = y
    µ = z

One sees that this system has a solution if and only if y = 2x + 2z. Therefore

    F = {(x, y, z) ∈ K³, y = 2x + 2z}

3.2.5 Computation of the rank and the inverse of a matrix

We have described in the previous section an algorithm to compute the rank and the inverse of a matrix. This is an inductive process, which can be carried out by hand or implemented on a computer. However, one might be interested in having an explicit formula for the rank or the inverse of a matrix.

First, let us define some objects attached to a matrix. Let n, p be positive integers, let A ∈ M_{p,n}(K), and let r ≤ min(n, p).

Definition 3.2.12. Let 1 ≤ i_1 < · · · < i_r ≤ p and 1 ≤ j_1 < · · · < j_r ≤ n be integers. Consider the submatrix A′ of A obtained by keeping only the coefficients in positions (i_k, j_l) for integers k, l between 1 and r. This is an r × r matrix. The determinant of A′ is called a minor of A of order r.

Alternatively, the submatrix A′ is obtained from A by removing some rows and some columns.

When p = n, one can use the minors of A to define the comatrix of A. Let n be a positive integer, A ∈ M_n(K), and let i, j be integers between 1 and n. Let M_{i,j} be the minor of A obtained by removing the i-th row and the j-th column, and define

    C_{i,j} = (−1)^{i+j} M_{i,j}

The elements C_{i,j} are called the cofactors of A.

Definition 3.2.13. The comatrix of A is the matrix com(A) ∈ M_n(K) whose coefficients are the cofactors (C_{i,j})_{1≤i,j≤n}.
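Cofactors are easy to compute mechanically. A minimal Python sketch (our own illustration; det is the same recursion as in the sketch of Section 3.1.5, and indices start at 0 rather than 1):

    # Cofactors C_{i,j} = (-1)^(i+j) M_{i,j} and the comatrix of a square matrix.
    def det(A):
        if len(A) == 1:
            return A[0][0]
        return sum((-1) ** i * A[i][0]
                   * det([row[1:] for k, row in enumerate(A) if k != i])
                   for i in range(len(A)))

    def cofactor(A, i, j):
        # M_{i,j}: remove row i and column j
        minor = [[A[k][l] for l in range(len(A)) if l != j]
                 for k in range(len(A)) if k != i]
        return (-1) ** (i + j) * det(minor)

    def comatrix(A):
        n = len(A)
        return [[cofactor(A, i, j) for j in range(n)] for i in range(n)]

    # 2 x 2 check, anticipating Example 3.2.18: com((a b; c d)) = (d -c; -b a)
    assert comatrix([[1, 2], [3, 4]]) == [[4, -3], [-2, 1]]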
The minors of a matrix can be used to compute its rank.

Proposition 3.2.14. Let n, p be positive integers, and let A ∈ M_{p,n}(K) be a non-zero matrix. Let r be an integer with 1 ≤ r ≤ min(p, n). If r > rk A, then every minor of order r is 0. Moreover, there exists a non-zero minor of order rk A.

Proof. Let C_1, . . . , C_n be the columns of A, and let r > rk A. Let 1 ≤ i_1 < · · · < i_r ≤ n and 1 ≤ j_1 < · · · < j_r ≤ p be integers. The vectors (C_{i_1}, . . . , C_{i_r}) lie in the image of A, which has dimension rk A < r, and are thus linearly dependent. Let C′_{i_1}, . . . , C′_{i_r} be the vectors in K^r obtained by keeping the coordinates of C_{i_1}, . . . , C_{i_r} in positions j_1, . . . , j_r. The family of vectors (C′_{i_1}, . . . , C′_{i_r}) is linearly dependent, and the determinant of the associated matrix is thus 0. But this determinant is exactly the corresponding minor of A. This proves that every minor of A of order r is 0.

Let now r = rk A. We want to prove that there exists a non-zero minor of order r. Up to reordering the columns of A, one can assume that (C_1, . . . , C_r) is linearly independent. Let A′ ∈ M_{p,r}(K) be the matrix whose columns are C_1, . . . , C_r; it has rank r. Let us now consider the matrix ᵗA′; it also has rank r (see Exercise 2.2.11). Therefore, one can extract r columns of ᵗA′ which are linearly independent. Up to reordering, one can assume that these are the first r columns. Let A_1 ∈ M_r(K) be the matrix obtained by keeping the first r rows of A′. Then A_1 has rank r, and its determinant is thus non-zero. This construction gives a non-zero minor of A of order rk A.

The rank of A is then the maximum of the integers r such that there exists a non-zero minor of order r: computing all the minors of A gives us directly the rank of A.

Example 3.2.15. Consider the matrix

    A =
    ( 1   2   3  )
    ( −1  4   7  )
    ( 2  −2  −4  )

One computes that det(A) = 0, therefore rk A ≤ 2. However

    | 1   2 |
    | −1  4 | = 6 ≠ 0

This gives rk A ≥ 2. Finally, one concludes that rk A = 2.

Note that even for a 3 × 3 matrix, the number of minors is quite large: there is one minor of order 3 (the determinant of the matrix), 9 minors of order 2, and 9 minors of order 1 (these are just the coefficients of the matrix).

The comatrix will be used to compute the inverse of a matrix.

Proposition 3.2.16. Let n ≥ 1, and A ∈ M_n(K). One has

    A · ᵗcom(A) = ᵗcom(A) · A = det(A) · I_n

Proof. Let B = ᵗcom(A) · A and let i, j be integers between 1 and n. Recall that the coefficients of com(A) are the cofactors (C_{i,j})_{1≤i,j≤n}. The coefficient of B in position (i, j) is then

    a_{1,j} C_{1,i} + · · · + a_{n,j} C_{n,i}

If i = j, this quantity is equal to the determinant of A (it is the expansion along the i-th column). If j ≠ i, it is equal to the determinant of A′, where A′ is the matrix obtained from A by replacing the i-th column by the j-th column of A. Since A′ has two columns which are equal, this quantity is then 0. The proof is similar for A · ᵗcom(A).

This proposition allows us to give an explicit formula for the inverse of a matrix.

Corollary 3.2.17. Let n ≥ 1, and A ∈ GL_n(K). Then

    A⁻¹ = (1/det(A)) ᵗcom(A)

Example 3.2.18. Let n = 2 and let a, b, c, d ∈ K. Consider the matrix

    A =
    ( a  b )
    ( c  d )

Then

    com(A) =            ᵗcom(A) =
    ( d   −c )          ( d   −b )
    ( −b   a )          ( −c   a )

If A is invertible, then

    A⁻¹ = 1/(ad − bc) ( d   −b )
                      ( −c   a )

Remark 3.2.19. If A is invertible, then Cramer's rule gives an explicit solution to the equation AX = Y, for every vector Y ∈ K^n. This solution is X = A⁻¹Y. One checks that the solution given by Cramer's rule is compatible with the expression of A⁻¹ given by the previous corollary.
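The corollary translates directly into code. A minimal Python sketch (our own illustration), reusing det and comatrix from the sketch following Definition 3.2.13:

    # Inverse via the comatrix: A^{-1} = (1/det(A)) * transpose(com(A)).
    def inverse(A):
        n, d = len(A), det(A)          # requires det(A) != 0
        C = comatrix(A)
        return [[C[j][i] / d for j in range(n)] for i in range(n)]

    A = [[2, 5], [4, 11]]              # the matrix of Example 3.2.5, det = 2
    assert inverse(A) == [[5.5, -2.5], [-2.0, 1.0]]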
3.2.6 Exercises

Exercise 3.2.1. Let a, b, c, d ∈ K. Under which condition does the following system of equations have a solution? Compute the solution when it does.

    x − 2y + z = a
    2x − 4y + 3z = b
    y + 7z = c
    x + 4z = d

Exercise 3.2.2. Let A ∈ M_3(K).

1. Prove that switching the first and second rows of A corresponds to multiplying A on the left by

    ( 0  1  0 )
    ( 1  0  0 )
    ( 0  0  1 )

2. Let λ ∈ K. Prove that adding λ times the first row to the second row of A corresponds to multiplying A on the left by

    ( 1  0  0 )
    ( λ  1  0 )
    ( 0  0  1 )

3. Prove that switching the first and second columns of A corresponds to multiplying A on the right by

    ( 0  1  0 )
    ( 1  0  0 )
    ( 0  0  1 )

4. Find generalizations of these results for A ∈ M_{p,n}(K).

Exercise 3.2.3. Let n, p ≥ 1 and A ∈ M_{p,n}(K). Prove that the elementary operations on A do not change the rank. If p = n, do the elementary operations preserve the determinant?

Exercise 3.2.4. Let a, b, c, d ∈ K with ad − bc ≠ 0. Using Cramer's rule, compute the inverse of

    ( a  b )
    ( c  d )

Exercise 3.2.5. Let n ≥ 1 and A ∈ M_n(K).

1. Let f : K → K be the function defined by f(x) = det(A + x · I_n). Prove that f is a non-zero polynomial function.

2. We say that a sequence of matrices (A_m)_{m≥0} (with A_m ∈ M_n(K) for every m ≥ 0) converges to A if each coefficient of the matrices (A_m)_{m≥0} converges to the corresponding coefficient of A. Prove that there exists a sequence (A_m)_{m≥0} converging to A with A_m ∈ GL_n(K) for every m ≥ 0.

Exercise 3.2.6. Let n ≥ 1, let E be the space of functions from R to R, and let f_1, . . . , f_n ∈ E. Prove that (f_1, . . . , f_n) is linearly independent if and only if there exist x_1, . . . , x_n ∈ R with det(f_i(x_j))_{i,j} ≠ 0.
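The claims of Exercise 3.2.2 can be checked numerically before being proved. A minimal sketch (our own illustration) verifying the first one with numpy:

    # Left multiplication by the permutation matrix of Exercise 3.2.2
    # swaps the first two rows of A.
    import numpy as np

    P = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
    A = np.arange(9).reshape(3, 3)            # an arbitrary 3 x 3 matrix
    assert (P @ A == A[[1, 0, 2]]).all()      # rows 1 and 2 are swapped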