MODERN ALGEBRA NOTES FOR MATH 401-402
at Binghamton University

by Matthew G. Brin
1st Edition
Department of Mathematical Sciences
Binghamton University
State University of New York
© 2011 Matthew G. Brin. All rights reserved.

Preface

This book is designed for a two-semester undergraduate course in modern algebra. It makes no pretense that it can be used for a graduate course. The assumption is made that students are just barely familiar with rigorous proofs and that proofs by induction are things that they have seen and done a few times, but not mastered. It is assumed that the students have seen sets to the extent of being familiar with unions and intersections, but little else. It is assumed that students have had some very basic linear algebra. The topics needed from linear algebra are the notions used in bases (span and linear independence) and basic facts about solutions to systems of homogeneous linear equations. It will be assumed that the students are familiar with basic algebraic manipulations. Lastly, it is not assumed that the students know anything about complex numbers, although it is hoped that they have heard of them.

We do not follow a foundational point of view. That is, we do not start with a blank slate and then add structures with axioms, one structure at a time, stopping after each structure is introduced to develop its theory as much as is needed before moving on to the next structure. In spite of this, the book is self-contained. Ultimately it proves from scratch enough about the usual core subjects (groups, rings and fields) to develop Galois theory to the extent of proving the existence of specific non-solvable fifth degree polynomial equations. We do this without referring to any outside facts except one: the intermediate value theorem from calculus. This is used in the proof of the fundamental theorem of algebra. [Note: There is not really time to do the Fundamental Theorem of Algebra.
As of now, the FTA is taken as a black box.] The pace is very slow. The ideal goal is to have the students learn absolutely everything. This is not realistic, but some students will come close. The rest should try to learn as much of the theory as they can.

Contents

Preface

Part I: Preliminaries

1 Tools
1.1 Why we look at a bit of history
1.2 The quadratic
1.2.1 Classical solutions
1.2.2 Before the Greeks
1.2.3 After the Greeks
1.2.4 More algebra from geometry
1.2.5 Solvability by radicals
1.2.6 On the roots and coefficients of the quadratic
1.3 The cubic
1.3.1 Mixing the old and the new
1.3.2 Reducing the cubic
1.3.3 Solving the reduced cubic
1.4 Complex arithmetic
1.4.1 Complex numbers and the basic operations
1.4.2 Complex numbers as a vector space
1.4.3 Complex numbers in Cartesian and polar coordinates
1.4.4 Complex conjugation
1.4.5 Powers and roots of complex numbers
1.4.6 Roots of 1
1.5 The cubic revisited
1.5.1 Picking out the solutions from the formula
1.5.2 Symmetry and Asymmetry
1.5.3 The symmetric and the asymmetric
1.6 The quartic (optional)
1.6.1 Reduction
1.6.2 The resolvent
2 Objects of study
2.1 First looks
2.1.1 Groups
2.1.2 Fields
2.1.3 Rings
2.1.4 Homomorphisms
2.1.5 And more
2.2 Functions
2.2.1 Sets
2.2.2 Functions
2.2.3 Function vocabulary
2.2.4 Inverse functions
2.2.5 Special functions
2.3 Groups
2.3.1 The definition
2.3.2 Operations
2.3.3 Examples
2.3.4 The symmetric groups
2.4 The integers mod k
2.4.1 Equivalence relations
2.4.2 Equivalence classes
2.4.3 The groups
2.4.4 Groups that act and groups that exist
2.5 Rings
2.5.1 Definitions
2.5.2 Examples
2.6 Fields
2.6.1 Definitions
2.6.2 Examples
2.6.3 The irrationality of √2
2.7 Properties of the ring of integers
2.7.1 An outline
2.7.2 Well ordering and induction
2.7.3 The division algorithm
2.7.4 Greatest common divisors
2.7.5 Factorization into primes
2.7.6 Euclid's first theorem about primes
2.7.7 Uniqueness of prime factorization
2.8 The fields Zp
2.9 Homomorphisms
2.9.1 Complex conjugation
2.9.2 The projection from Z to Zk

3 Theories
3.1 Introduction
3.2 Groups
3.2.1 The definition
3.2.2 First results
3.2.3 Subgroups
3.2.4 Homomorphisms
3.2.5 Subgroups associated to a homomorphism
3.2.6 Homomorphisms that are one-to-one and onto
3.2.7 The group of automorphisms of a group
3.3 Rings
3.3.1 The definition
3.3.2 First results
3.3.3 Subrings
3.3.4 Homomorphisms
3.3.5 Subrings associated to homomorphisms
3.3.6 Isomorphisms and automorphisms
3.4 Fields
3.4.1 The definition
3.4.2 First results
3.4.3 Field extensions
3.4.4 Homomorphisms and isomorphisms
3.4.5 Automorphisms
3.5 On leaving Part I

Part II: Group Theory

4 Group actions I: groups of permutations
4.1 Consequences of Lemma 3.2.1
4.2 Examples
4.2.1 Dihedral groups
4.2.2 Stabilizers
4.3 Conjugation
4.3.1 Definition and basics
4.3.2 Conjugation of permutations
4.3.3 Conjugation of stabilizers
4.3.4 Conjugation and cycle structure
4.3.5 Permutations that are conjugate in Sn
4.3.6 One more example
4.3.7 Overview

5 Group actions II: general actions
5.1 Definition and examples
5.1.1 The definition
5.1.2 Examples
5.2 Stabilizers
5.3 Orbits and fixed points
5.4 Cosets and counting arguments
5.4.1 Cosets
5.4.2 Lagrange's theorem
5.4.3 The index of a subgroup
5.4.4 Sizes of orbits
5.4.5 Cauchy's theorem

6 Subgroups
6.1 Subgroup generated by a set of elements
6.1.1 Strategy
6.1.2 The strategy applied
6.1.3 Generators

7 Quotients and homomorphic images
7.1 The outline
7.1.1 On the groups Z and Zk
7.1.2 The new outline
7.2 Cosets
7.2.1 Identifying the equivalence classes with cosets
7.2.2 Cosets of normal subgroups
7.3 The construction
7.3.1 The multiplication
7.3.2 The projection homomorphism
7.3.3 The first isomorphism theorem
7.3.4 Abelian groups and products
7.3.5 Examples
7.3.6 The correspondence theorem
7.3.7 Another isomorphism theorem
8 Classes of groups
8.1 Abelian groups
8.1.1 Subgroups of abelian groups
8.1.2 Quotients of abelian groups
8.2 Solvable groups
8.2.1 The definition
8.2.2 Subgroups of solvable groups
8.2.3 Quotients of solvable groups
8.2.4 Finite solvable groups

9 Permutation groups
9.1 Odd and even permutations
9.1.1 Crossing number of a permutation
9.2 The alternating groups
9.2.1 The A5 menagerie
9.2.2 Getting a three-cycle
9.2.3 Getting all of A5
9.3 Showing a subgroup is all of Sn

Part III: Field theory

10 Field basics
10.1 Introductory remarks
10.2 Review
10.3 Fixed fields of automorphisms
10.4 Automorphisms and polynomials
10.5 On the degree of an extension
10.5.1 Comparing degree with index
10.5.2 Properties of the degree
10.6 The characteristic of a field
10.6.1 Definition and properties
10.6.2 A minimal field of each characteristic
10.6.3 Consequences of Theorem 10.6.1
11 Polynomials
11.1 Motivation: the construction of Zp from Z
11.2 Rings
11.2.1 Ring definitions
11.3 Polynomials
11.3.1 Introductory remarks on polynomials
11.3.2 Polynomial basics
11.3.3 Degree
11.4 The division algorithm for polynomials
11.4.1 Roots and linear factors
11.5 Greatest common divisors and consequences
11.5.1 Divisors and units
11.5.2 GCD of polynomials
11.5.3 Irreducible polynomials
11.6 Uniqueness of factorization
11.7 Roots of polynomials
11.7.1 Counting roots
11.7.2 Polynomials as functions
11.7.3 Automorphisms and roots
11.8 Derivatives and multiplicities of roots
11.8.1 The derivative
11.8.2 Multiplicities of roots
11.9 Factoring polynomials over the reals

12 Constructing field extensions
12.1 Smallest extensions
12.2 Algebraic and transcendental elements
12.3 Extension by an algebraic element
12.3.1 The construction
12.3.2 The structure of F[x]/P(x)
12.3.3 A result about automorphisms
12.3.4 Examples

13 Multiple extensions
13.1 Multiple extensions
13.2 Algebraic extensions
13.3 Automorphisms
13.3.1 Relativizing Proposition 12.3.5
13.3.2 Applying the relative proposition
13.4 Splitting fields
13.4.1 Existence
13.4.2 Uniqueness
13.4.3 An application
13.5 Fixed fields
13.5.1 Independence of automorphisms
13.5.2 Sizes of fixed fields
13.6 A criterion for irreducibility
13.6.1 Primitive polynomials and content
13.6.2 The Eisenstein Irreducibility Criterion
13.6.3 Applications of the irreducibility criterion

14 Galois theory basics
14.1 Separability
14.2 The primitive element theorem
14.3 Galois extensions
14.3.1 Finite, separable, normal extensions
14.3.2 Splitting fields
14.3.3 Characterizations of Galois extensions
14.4 The fundamental theorem of Galois theory
14.4.1 Some permutation facts
14.4.2 The Fundamental Theorem

15 Galois theory in C
15.1 Radical extensions
15.1.1 An outline
15.2 Improving radical extensions
15.2.1 The first improvement
15.2.2 The second improvement
15.3 On the improved extension
15.4 An example

Part I: Preliminaries

Chapter 1: Tools

1.1 Why we look at a bit of history

Polynomial equations

Modern algebra grew out of classical algebra. Classical algebra is learned today in middle and high school, and it developed irregularly over thousands of years. The earliest recorded evidence that people thought about algebraic questions at all dates from around 3000 BC. By 1600 AD most of the tools currently taught in middle and high school were in place. They were well understood and in common practice within a few decades of 1600, and it is no accident that calculus, a triumphant marriage of geometry and algebra, was invented when it was (around 1665 AD).

Standard topics of classical algebra include solutions to polynomial equations such as the quadratic equation

    ax^2 + bx + c = 0,

the cubic equation

    ax^3 + bx^2 + cx + d = 0,

and the quartic (also called the biquadratic) equation

    ax^4 + bx^3 + cx^2 + dx + e = 0.
Solutions to the quadratic equation were known before 1000 BC, and solutions to the cubic and quartic equations were first published in 1545 AD. The quintic equation

    ax^5 + bx^4 + cx^3 + dx^2 + ex + f = 0

resisted all attacks until the early 1800s, when it was finally understood that there was no formula of the expected type that gave the solutions. (We define "expected type" in Section 1.2.5.) In addition, techniques were developed by Galois in 1832 and refined by others over the next 100 years that detected which particular quintics (with coefficients replaced by specific numbers) had solutions in the expected form and which did not. In fact, Galois' techniques apply to polynomial equations of any degree, not just to degree five.

A bit on what will be covered

These notes will cover those aspects of modern algebra (whose development started around 1800) sufficient to describe the theory of Galois and some of its applications to the solutions of polynomials. This gives names to the topics in these notes, but gives no details. Still giving no details, we point out that the advances of the early 1800s were based on the introduction of new tools. The absence of these tools between 1600 and 1800 prevented any real progress on the quintic.

We have arrived at the theme (given by the title) of this chapter: tools. In fact, the teaching of new tools is a major goal of these notes. Our fascination with tools is one reason for learning a bit of mathematical history. By seeing how something familiar (solving the quadratic) was done centuries ago, we can see how the absence of certain tools affects the difficulty of a problem.

We will also look at the less familiar cubic and quartic. We do this for several reasons. First, they were done several decades before the full development of classical algebra and we can again point out some difficulties that a lack of tools created.
Second, we will get hints on the nature of the new tools that were developed around 1800 and will be able to summarize the topics that will follow. Third, we will have opportunities to cover some basic material (complex numbers, for example) that will be used throughout the rest of these notes.

Comments

The problem of deciding which polynomial equations have solutions in an expected form may not seem all that important. In fact, while Galois' techniques make the decision possible in theory, they do not make the decision easy in practice. However, the tools developed by Galois are much more important than their application to this particular problem. An advanced course on the theory will cover many other topics. Still, the problem is important in that its resolution led to an important part of mathematics.

The arrangement of the material in these notes was chosen for several reasons. The first is that the material constitutes a reasonable introduction to modern algebra. The second is that the ultimate topic of these notes, Galois theory, is one of the truly astounding developments of modern mathematics. The third is that the linking of polynomial equations to Galois theory shows how modern algebra grew out of and connects to classical algebra.

In spite of the discussion of the last few paragraphs, these notes do not follow a historical development. History will be mentioned, but will not be followed blindly. Ideas will be placed where they make the most sense, and not always in the order in which they were discovered. Even in this chapter, where most of the historical references will occur, the history given is so incomplete as to be almost non-existent.

Modern algebra has developed far beyond Galois theory. If you wish to learn more, you will have to take more advanced courses, perhaps in graduate school. In later versions of these notes, extra topics may be included.
1.2 The quadratic

We now look at a familiar topic from classical algebra: solving quadratic equations. We know how to solve the equation ax^2 + bx + c = 0. In fact we know how to solve ax^2 + bx = c and ax^2 = bx + c, since we can rearrange the second and third equations to look like the first.

When equations such as these were first considered, negative quantities were not accepted. The first form would not have been considered, since positive quantities cannot sum to zero, and the second and third forms would have been considered different kinds of problems. Since we want to see how these equations were solved some thousands of years ago, we have to consider forms other than the one that we consider standard.

This causes temporary problems in terminology. The quantity ax^2 + bx + c is a polynomial of degree 2 in the variable x, and we have the convenient term "root" of the polynomial, which refers to any value that makes the polynomial equal to zero when that value is substituted for x. So saying we are looking for roots of the polynomial ax^2 + bx + c is the same as saying we are looking for solutions to the equation ax^2 + bx + c = 0. The word "root" does not cooperate with ax^2 + bx = c, so we have to use the word "solution" when we consider such an equation. Much of this course will involve a search for roots of polynomials, and so the word "root" will be a large part of our language. However, in this chapter, it will have to share space with the word "solution." To make sure that we know what we are talking about, we will quickly review some terms. They will be repeated more carefully later.

Some definitions

A polynomial in one variable (say x) is a linear combination of powers of x, where we will have more to say about the coefficients and where they come from later. An example is 3x^5 − 7x^3 + 2x − 5. The summands are the terms, and the degree of a term is the power of the variable in the term. In the example, −7x^3 is a term and it has degree 3.
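The vocabulary of terms and degrees can be made concrete with a short computation. The sketch below is an illustration added here, not part of the notes proper; it represents the example polynomial 3x^5 − 7x^3 + 2x − 5 by a dictionary that maps each degree to its coefficient, an arbitrary convention chosen just for this example.

```python
# Represent 3x^5 - 7x^3 + 2x - 5 as {degree: coefficient}.
# (The dictionary encoding is a convention for this sketch only.)
p = {5: 3, 3: -7, 1: 2, 0: -5}

# Each (degree, coefficient) pair is one term of the polynomial.
for deg, coeff in sorted(p.items(), reverse=True):
    print(f"term {coeff}x^{deg} has degree {deg}")

# Evaluate the polynomial at a value of x by summing its terms.
def evaluate(poly, x):
    return sum(coeff * x**deg for deg, coeff in poly.items())

print(evaluate(p, 2))  # 3*32 - 7*8 + 2*2 - 5 = 39
```

Substituting a value and checking whether the sum is zero is exactly the root test described above; evaluate(p, 2) gives 39, so 2 is not a root of this polynomial.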
The constant term (−5 in the example) has degree 0, since x^0 (usually omitted when writing out the polynomial) is part of the term. The degree of the polynomial is the highest degree of a term with a non-zero coefficient. That non-zero coefficient in the term of highest degree is the leading coefficient. The polynomial above has degree 5 and leading coefficient 3.

As mentioned above, a root of a polynomial with variable x is a value of x that makes the value of the whole polynomial equal to zero. The quadratic formula gives roots of polynomials of degree 2. We now take a look at that formula.

1.2.1 Classical solutions

We start with the more familiar: the way the formulas for the roots of quadratic polynomials have been derived since the 1600s. If three numbers a, b and c are given with a ≠ 0, then the (as many as two) values x1 and x2 for x that make

    ax^2 + bx + c = 0                                        (1.1)

true are given by the formulas

    x1 = (−b + √(b^2 − 4ac)) / (2a),
    x2 = (−b − √(b^2 − 4ac)) / (2a).                         (1.2)

We require a ≠ 0 since having a = 0 would give us a polynomial that is not of degree 2. Also, the formulas in (1.2) would make no sense if a = 0.

The validity of (1.2) can be checked by substituting x1 and x2 into (1.1) and simplifying. For example, direct multiplication computes (x1)^2 as

    (x1)^2 = (b^2 − 2b√(b^2 − 4ac) + (b^2 − 4ac)) / (4a^2)
           = (b^2 − b√(b^2 − 4ac) − 2ac) / (2a^2)

so that

    a(x1)^2 + bx1 + c = (b^2 − b√(b^2 − 4ac) − 2ac) / (2a) + (−b^2 + b√(b^2 − 4ac)) / (2a) + c
                      = (b^2 − b√(b^2 − 4ac) − 2ac − b^2 + b√(b^2 − 4ac) + 2ac) / (2a)
                      = 0.

More importantly, we can explain how the formulas in (1.2) are derived. The standard method involves completing the square. If we subtract c from both sides of (1.1), and then divide both sides of the result by a, we get

    x^2 + (b/a)x = −c/a.                                     (1.3)

The technique of completing the square has us add (b/(2a))^2 to both sides of (1.3) to get

    x^2 + (b/a)x + b^2/(4a^2) = b^2/(4a^2) − c/a

which simplifies to

    (x + b/(2a))^2 = (b^2 − 4ac) / (4a^2).                   (1.4)

The right side of (1.4) has (as many as two) square roots and we can write

    x + b/(2a) = ±√((b^2 − 4ac) / (4a^2))

from which we get

    x = −b/(2a) ± √((b^2 − 4ac) / (4a^2)) = (−b ± √(b^2 − 4ac)) / (2a).

We can also justify the steps for completing the square. Using the distributive laws three times and the commutative law once, we can write

    (x + k)^2 = (x + k)(x + k) = (x + k)x + (x + k)k = x^2 + kx + xk + k^2 = x^2 + 2kx + k^2.

This tells us that if we are given an expression x^2 + hx, we can think of h as 2k and k as h/2. So if we add the square of h/2 to x^2 + hx, we get the square of (x + h/2).

We can make a partial list of the tools we used in our analysis of (1.1). We have already noted our use of such laws of arithmetic as the distributive and commutative laws, and we did not bother to note our use of other laws such as the associative law. Our calculation of (x + k)^2 was used implicitly in our calculation of (x1)^2. We used laws of algebra such as "equals added to equals give equal results," we have used laws involving the addition of fractions, we have used our knowledge of square roots, and most importantly, we have used our skill with the manipulation of symbols. The manipulated symbols represent both numerical quantities as well as operations such as addition, subtraction, multiplication, division and the taking of square roots. In other words, we have used standard tools of classical algebra. In the next two sections we will give some of the details of how quadratics were handled before most of these tools were worked out.

1.2.2 Before the Greeks

The oldest mathematical texts we know of now date from about 2000 BC to 1000 BC (give or take several hundred years) depending on the geographical area. These come from Egypt, Mesopotamia, India and China. Some of these texts include solutions to the quadratic equation.

There are commonalities among these texts. Most include tables of calculations, and all include solved problems with specific numbers. A problem might
read "A square and six of its sides equals 40." In modern notation this would read x^2 + 6x = 40. The solution would read "Take 40 and the square of half of six to get 49. Take the square root of 49 to get 7. Now remove from 7 half of six to get 4." In modern notation this would read

    √((6/2)^2 + 40) − 6/2 = √49 − 3 = 7 − 3 = 4.

The other root −10 would be ignored because negative quantities would not have been accepted. There would be no proof of the correctness of the procedure, although with the wisdom of hindsight one can see in some cases from the wording and arrangement of steps that a geometric justification lay behind the solution. In the next section, we will look at later writings where the geometric justification is written out.

A more common version of a quadratic problem might give the area and perimeter of a rectangle. This is equivalent to the information xy = a and x + y = b, which reduces by substitution (a modern technique) to the quadratic x(b − x) = a or x^2 − bx + a = 0. Once again, a specific problem would be given with specific numbers for a and b, and a solution would simply tell how to manipulate a and b to get the answer. We will look again at this form of the problem before we leave the topic of quadratic equations.

As mentioned, the absence of negative numbers meant that an equation such as x^2 = 6x + 40 was considered a different type of problem from x^2 + 6x = 40. Thus the order of operations and the wording of the solution would read differently for each of the two equations.

In spite of the restrictions described above, some texts had mathematics that would be regarded today as quite advanced and calculations that were extremely exact. One text from before 1600 BC has an approximation to the square root of two that is off by only 0.0000006.[1]

1.2.3 After the Greeks

Around 500 BC, the first Greek texts on mathematics start to appear. The Greek texts brought in significant changes.
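Before moving on, the ancient recipe quoted in the previous subsection for x^2 + 6x = 40 is easy to check numerically. The sketch below is an illustration added here (not from the notes); it follows the ancient steps one at a time and then confirms the answer against the modern quadratic formula.

```python
import math

# The ancient recipe for x^2 + 6x = 40
# ("a square and six of its sides equals 40"), step by step.
b, c = 6, 40
step1 = c + (b / 2) ** 2   # "take 40 and the square of half of six": 49
step2 = math.sqrt(step1)   # "take the square root of 49": 7
root = step2 - b / 2       # "remove from 7 half of six": 4
print(root)                # 4.0

# Check against the quadratic formula applied to x^2 + 6x - 40 = 0.
a = 1
x1 = (-b + math.sqrt(b * b - 4 * a * (-c))) / (2 * a)
print(x1)                  # also 4.0
```

The second (negative) root, (−6 − 14)/2 = −10, is exactly the one the ancient solver would have ignored.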
The most obvious is that not only were proofs included, they were emphasized. As part of their structure of a mathematical world built on proofs, the Greeks developed the axiomatic system. That is, a small number of assumed truths were listed upon which all succeeding arguments and conclusions were based. However, this system was mostly applied to geometry and number theory, and much less so to algebra. Algebraic manipulations were still done on a verbal basis, with no rules for the manipulation of symbols representing quantities, and no symbols for arithmetic operations.

To see the first recorded attempt to bring basic rules into algebra, we shift our attention to the Middle East. Located geographically between all the areas that we have mentioned so far, and with regimes that from time to time strongly supported the development and preservation of the sciences, the Middle East (centered around Baghdad) brought together many of the contributions from different locations, and made many contributions of its own. We are not giving a detailed history, and so we look at just one book.

The book appeared around 830 AD and its title has been variously translated over the years. It has been given the title The Compendious Book on Calculation by Completion and Balancing and also the shorter title The Algebra of Mohammed Ben Musa [4].

[1] The study of ancient mathematical texts is a swiftly developing and changing field. Some areas produce hundreds of source texts in a variety of notations and languages, and interpretations of these texts change with time as they are more thoroughly studied, and as more information gets integrated from knowledge of the history and culture of the relevant times. I have tried to limit myself to statements that are not controversial, and have stayed away from questions as to depth of understanding, flow of ideas over time and geography, credit, originality, or worth of any of the mathematics contained in the ancient texts.
The author’s full name has been given as Abu Abdallah Muhammad ibn Musa al-Khwarizmi where the last part “al-Khwarizmi” is generally taken to be where the author was from. In spite of this, the author is usually referred to as “al-Khwarizmi.” It is thought that he was born in Persia (now Iran) and lived his adult life in Baghdad (now in Iraq). The importance of al-Khwarizmi’s book is based on the fact that it is the earliest known text to combine all of the following features about an algebraic topic (the solution of quadratic equations): 1. a systematic listing of all forms of the problem, 2. a small set of operations from which a reduction to one of the forms could be obtained, and 3. for each form, a solution and a proof that the solution is correct. Second, the book introduced these notions to the west (that is, to Europe).² Since the standard of proof of the time was geometric proof in the Greek tradition, the proofs are geometric. We discuss the above numbered items.

The six forms

The forms of the quadratic listed in [4] are:

ax^2 = bx,  ax^2 = c,  bx = c,
ax^2 + bx = c,  ax^2 + c = bx,  ax^2 = bx + c.  (1.5)

Given that all the numbers are to be positive, it is seen that this covers all possibilities. As noted, without negative quantities, the form ax^2 + bx + c = 0 is not possible. Also note that even the case where a = 0 is covered and that the only restrictions are that the unknown must appear and that there be at least two terms.

²Another book by al-Khwarizmi, on arithmetic, introduced to Europe the numbers we call arabic numbers. The name “arabic numbers” comes from the fact that Europe learned of them from al-Khwarizmi’s book in spite of the fact that the book itself states that the numbers come from India.

The operations

The operations for reduction to one of the forms are of two types.
The first operation, called al-Jabr, is the moving of a subtracted quantity to the other side of the equation as an added quantity by adding the quantity to both sides. Al-Jabr means “the completion,” usually of something defective. Thus x^2 − 3x = 2 becomes x^2 = 3x + 2 by this operation. The object being completed is x^2, which is less than complete before the operation because of the subtraction of the quantity 3x. A corruption of “al-Jabr” gives us the word algebra.³ The second operation, called al-Mukabalah (English spellings vary greatly), means “the balancing” or “the reckoning.” It is used to cancel like terms that might be on opposite sides of the equality. Thus 3x^2 + 6 = x^2 + 12 is balanced by two applications of al-Mukabalah to become 2x^2 = 6. Using the operations of al-Jabr and al-Mukabalah, al-Khwarizmi argued that any combination of squares, first powers and constants could be reduced to one of the six forms of the quadratic. It was taken for granted that like quantities on the same side could be combined. Recall that all equations were expressed in words and it was not questioned that “six squares and two squares” is the same as “eight squares.”

The solutions and proofs

The three equations in (1.5) with only two terms are trivial and we will not discuss those. We give the treatment of one of the remaining three equations in (1.5) as was done in [4], and leave the other two as exercises. It is likely that the solutions and arguments below were known long before 830 AD. The importance of al-Khwarizmi’s book is its completeness, its building from a small set of rules, and its inclusion of justifications for all parts of the solutions. To explain the solution of x^2 + px = q, the following figure is given in [4].

[Figure (1.6): a central square C, four rectangles labeled B along its sides, and four small squares labeled A at its corners.]

³A corruption of the name “al-Khwarizmi” gives us the word algorithm.

The following explanation is given of the figure. We will use modern symbols. Al-Khwarizmi used words.
The square C has the unknown x as the length of its sides. The rectangles B have x as the length of one side and p/4 as the length of the other side. Thus the sum of the areas of the four rectangles labeled B is px. The small squares A have p/4 as the length of their sides. The sum of the areas of the four squares labeled A is p^2/4. It is given that x^2 + px = q, but x^2 + px is the combined area of C and all the rectangles labeled B. Thus the area of the entire figure is q + p^2/4. The side of the full figure is x + p/2. Thus we know that

x + p/2 = √(q + p^2/4).

Finally we get

x = √(q + p^2/4) − p/2 = (−p + √(p^2 + 4q))/2

which agrees with what we know from the quadratic formula applied to x^2 + px − q = 0. Note that since p is a positive number, −p is negative and we would end up with a negative solution (not allowed) unless a larger positive number is added to it. Since √(p^2 + 4q) > p (recall that q is also positive), we have a positive solution. Note further that we cannot use the negative square root since then a negative solution would result. Thus there is only one solution given in [4] for this case.

Exercises (1)

The two problems in this exercise set cover the last two forms in (1.5). In the following all quantities are to be assumed positive.

1. The following figure in [4] is used to analyze x^2 + q = px.

[Figure (1.7): the square cdhg with additional labeled points a, b, e, f, i, j, k, l.]

The square cdhg has sides whose lengths are the unknown x. The length ad is p. The point b is halfway from a to d. The lengths lk, lg, and lf are all equal. What represents q? There are two positive solutions to x^2 + q = px. Why? Show that the diagram justifies the smaller of the two. In order to have a real solution, the fact q < (p/2)^2 must be built into the figure. Show how this is the case. The book [4] gives no diagram to justify the larger solution, but points out one length in the diagram above that is the larger solution. Which is it? Can you draw a diagram that justifies the larger solution?

2.
The following figure in [4] is used to analyze x^2 = px + q.

[Figure (1.8): the square abcd with additional labeled points e, f, g, i, j, k, l, m.]

The square abcd has sides whose lengths are the unknown x. The length bi is p. The point l is halfway between b and i. The lengths of gi and il are equal. The lengths of gf and ic are equal. What represents q? There is only one positive solution to x^2 = px + q. Why? Show that the diagram justifies the solution.

1.2.4 More algebra from geometry

In [4], the following alternate figure is given to justify the case x^2 + px = q.

[Figure: a square of side x + p/2, divided into a square A with side x, two rectangles B and C, and a small square with side p/2.]

In the figure, areas C and B sum to px, and area A is x^2. One can think of starting with the small square on the lower left with sides p/2, and expanding the square A until areas A, B and C add up to q. This makes

(x + p/2)^2 = q + p^2/4

which agrees with the solution given before for x^2 + px = q. More interesting about the figure above is that it is a pictorial representation of the square of a binomial. If we label the sides differently as shown

[Figure: the same square with the sides relabeled x and y.]

then we immediately get that

(x + y)^2 = x^2 + 2xy + y^2. (1.9)

This illustrates the power of geometric figures to capture what we know from simple algebraic manipulations. However, it must be remembered that without algebraic notation all algebraic discussion must be done with words. In [4], the explanations that accompany (1.7) and (1.8) occupy two pages each.

Exercises (2)

The following exercises about the quadratic use modern techniques. The first two exercises are relevant to techniques that will be used in our discussion of the cubic and quartic.

1. The formula (1.4) makes it appear that we are solving for the quantity x + b/(2a). If we let y = x + b/(2a), then x = y − b/(2a). What happens when we substitute this value for x into (1.1)? Assuming you did the calculation right and saw the right phenomenon, can you explain why this particular substitution works (and another such as x = y − b does not)?

2.
If you solve the equation obtained by the substitution above for y, how would you recover x, the solution to the original equation?

3. The standard solutions to ax^2 + bx + c = 0 given by the quadratic formula (1.2) assume a ≠ 0. This is sensible since if a = 0, then the equation is not quadratic. However, there is a form of the solution that allows for a = 0. What happens when you “rationalize the numerator” of the solutions given by (1.2)? (This is exactly the operation done in Calculus I to derive the derivative of the square root function from the definition of the derivative.) Do this for both solutions in (1.2) since they have important differences. What happens when a = 0? What happens when c = 0?

1.2.5 Solvability by radicals

The form of the solutions in (1.2) indicates what we meant in Section 1.1 by “expected type.” In (1.2), the coefficients of (1.1) and a few constants (specifically 2 and 4) are combined using the operations of addition, subtraction, multiplication, division and the taking of square roots. In Section 1.3 we consider the cubic, and will also have to take cube roots. All of these observations lead to the following definition. A polynomial is said to be solvable by radicals if its roots can be expressed as formulas in the coefficients of the polynomial and a number of constants using the operations of addition, subtraction, multiplication, division and the taking of n-th roots for various positive integers n. We will discuss what constants are allowed later in these notes. For now, we will just say that the same constants (and thus the same formula) must be used no matter what the coefficients are. It would not be reasonable to allow the constants (and the formula) to change with the polynomial. The fifth operation, that of taking n-th roots, has a very different status from the four operations of addition, subtraction, multiplication and division. This theme will unfold slowly.
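One way to see what Exercise 3 above is after: rationalizing the numerator of (−b + √(b^2 − 4ac))/(2a) gives 2c/(−b − √(b^2 − 4ac)), a form that still makes sense when a = 0. The following is only a numerical sketch (the function names are ours, not from the text):

```python
import math

def root_standard(a, b, c):
    # usual quadratic formula: (-b + sqrt(b^2 - 4ac)) / (2a); fails when a = 0
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)

def root_rationalized(a, b, c):
    # the same root after "rationalizing the numerator":
    # 2c / (-b - sqrt(b^2 - 4ac)); this form survives a = 0
    return 2 * c / (-b - math.sqrt(b * b - 4 * a * c))

# both forms agree on a genuine quadratic, e.g. x^2 - 5x + 6 = 0 (root 3)
print(root_standard(1, -5, 6), root_rationalized(1, -5, 6))

# the rationalized form still works at a = 0: 0*x^2 + 2x - 6 = 0 has root 3
print(root_rationalized(0, 2, -6))
```

When a = 0 and b > 0 the rationalized form reduces to −c/b, the root of the linear equation bx + c = 0, which is the phenomenon the exercise asks about.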
Here we will point out that a more accurate description of the fifth operation is that it extracts solutions to equations of the form x^n = c. When c ≠ 0 and n = 2, we know that there are two solutions. For arbitrary n, we will see shortly that there are n solutions whenever c ≠ 0. On the other hand, the operation of addition always gives a unique result. This is also true of subtraction, multiplication and division. (We always refuse to divide by zero, so we never have to deal with the fact that zero divided by zero might be said to give infinitely many results.)

1.2.6 On the roots and coefficients of the quadratic

Let us make some observations using the techniques available since 1600 AD. It is somewhat simpler to deal with ax^2 + bx + c = 0 if a = 1. If we have a true quadratic, then a ≠ 0 and we can divide both sides by a. So it is not a serious restriction to require a = 1. In general, a polynomial with leading coefficient equal to 1 is called a monic polynomial. We have just argued that finding roots for any polynomial can be done if we know how to find roots for monic polynomials. We will see later that monic polynomials have other conveniences. Since the 1600s, we have known that if r1 and r2 are the roots of x^2 + bx + c, then

x^2 + bx + c = (x − r1)(x − r2) = x^2 − (r1 + r2)x + r1 r2. (1.10)

So we must have r1 + r2 = −b and r1 r2 = c.⁴ This brings us back to one topic discussed in Section 1.2.2.

⁴The first equality in (1.10) and the use of the word “must” assume facts about polynomials that we have not yet proven. These facts will be proven in due time (Section 11.6), and when we do we will point out that the discussion here has been fully justified.

If we are told that two unknowns have a known sum and a known product, then the information gleaned from (1.10) lets us write down a quadratic whose roots are the two unknowns. Specifically, if r1 r2 = p and r1 + r2 = s, then r1 and r2 are roots of x^2 − sx + p.
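The recipe can be tried numerically. A minimal sketch (the helper name is ours): two numbers with sum s and product p are recovered as the roots of x^2 − sx + p via the quadratic formula.

```python
import math

def from_sum_and_product(s, p):
    """Roots of x^2 - s*x + p, i.e. the two numbers with sum s and product p."""
    d = math.sqrt(s * s - 4 * p)   # discriminant of x^2 - s x + p
    return ((s + d) / 2, (s - d) / 2)

# the two numbers with sum 5 and product 6 are the roots of x^2 - 5x + 6
print(from_sum_and_product(5, 6))   # (3.0, 2.0)
```

(For sums and products that force complex roots, as in some of the exercises below, the square root of a negative number appears; that situation is taken up in the later section on complex numbers.)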
This will be exploited in the following section on cubics. The fact that the coefficient of x is the negative of the sum of the roots and the constant term is the product of the roots can be verified directly (assuming a = 1) from the solutions given in (1.2). There is a two way relationship between the coefficients of a polynomial, and the roots of the polynomial. In one direction, the coefficients clearly control the roots since the roots must make the value of the polynomial equal to zero. Understanding this direction is the goal to finding the roots. The formulas in (1.2) give this direction for the quadratic, and we will see formulas (or at least how to get them) for the cubic and quartic. The other direction is not mysterious at all. We see that for the quadratic it is quite simple. One coefficient is obtained by simple addition and negation, and the other by multiplication. We put off discussing this further until we have analyzed the cubic. At that point, we will have more information available to discuss.

Exercises (3)

1. Verify that the roots as given in (1.2) (when a = 1) multiply to the constant term and add to the negative of the coefficient of x in (1.1).

2. Solve each of the following for x and y using the technique discussed above. Using any other technique avoids learning the point of this discussion. Of course, you should be willing to write down answers that are negative and that involve the square roots of negative numbers. Square roots of negative numbers will be discussed more thoroughly in a later section.

(a) s = 10, p = 2.
(b) s = 2, p = 10.
(c) s = 10, p = −2.
(d) s = −10, p = 2.
(e) s = −10, p = −2.

1.3 The cubic

The full solution to the cubic equation appeared in Girolamo Cardano’s book [2] in 1545. Solutions of very special cases with carefully arranged coefficients had been worked out long before. A non-numerical, graphical solution to the
cubic by the Persian poet and mathematician Omar Khayyám⁵ (which involved drawing a circle and a hyperbola and measuring a coordinate of an intersection point) dates from the eleventh century. As with the quadratic, non-acceptance of negative numbers required that the general cubic be treated in a number of cases. Regarding cases with only two terms as trivial, and cases with no cube or no constant term as reducible to lower degree problems, one is left with thirteen different cases.

x^3 + ax^2 = c,  x^3 + bx = c,  x^3 + c = bx,
x^3 + c = ax^2,  x^3 = bx + c,  x^3 = ax^2 + c,
x^3 + ax^2 + bx = c,  x^3 + ax^2 + c = bx,  x^3 + bx + c = ax^2,
x^3 + ax^2 = bx + c,  x^3 + bx = ax^2 + c,  x^3 + c = ax^2 + bx,
x^3 = ax^2 + bx + c.  (1.11)

Cardano learned of the solution to one or two of the thirteen cases from another mathematician, worked out the remaining cases himself and published the full account in [2].⁶ In [2], each of the thirteen cases in (1.11) is given its own chapter, most running to two or three pages, and a few, with extra examples and alternate approaches, running to quite a few more.

1.3.1 Mixing the old and the new

We will not try to give an exact account of how cubic equations were solved in [2]. First, the translations available have already modernized the arguments to some extent. Second, the arguments will take too long to work out completely in two different ways. However, we will give one idea of the restrictions under which Cardano worked. Section 1.2.4 showed how the square of a binomial could be understood geometrically. Relevant to the solution of the cubic will be the cube of a binomial. The following figure illustrates how (x + y)^3 can be understood.

⁵There is speculation that the eleventh century Persian poet Omar Khayyám and the eleventh century Persian mathematician Omar Khayyám were two different people.
⁶The publication of [2] started a bitter, ten year feud between Cardano and Niccolò Tartaglia, who had given Cardano the solution to a special case under a promise that Cardano would not publish the solution until Tartaglia published his solutions first. The feud went on in spite of the fact that Cardano named Tartaglia in [2] as the source of Cardano’s original knowledge for one of the cases. Tartaglia was not the first to solve a special case. Some thirty years earlier the case x^3 + bx^2 = c had been worked out by Scipione del Ferro but never published. It is said that Cardano published [2] after learning about del Ferro’s results because he then knew that Tartaglia was not the first to solve a special case, and because, after some five years, Tartaglia had still not published his own solutions. The feud illustrates the importance that mathematics had in certain circles of society in the middle of the Renaissance.

[Figure (1.12): a cube with side x + y, cut into eight rectangular pieces by planes that divide each edge into lengths x and y.]

The full cube shown is made up of eight pieces. There is a small cube of volume y^3 in the front, upper, right corner, and a larger cube of volume x^3 in the back, lower, left corner. There are three “bars” of volume xy^2 that touch the small cube of volume y^3. There are three “slabs” of volume x^2 y that touch the larger cube of volume x^3. The full figure has volume (x + y)^3, so we see that

(x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3. (1.13)

We will shortly make use of this equality. Later we will use

(x + y)^3 = x^3 + 3xy(x + y) + y^3 (1.14)

which we can obtain from (1.13) very easily with our knowledge of the distributive law. However in 1545, the equality (1.14) was obtained by modifying the figure (1.12).
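Both identities are easy to spot-check numerically; the following sketch evaluates each side of (1.13) and (1.14) at a few sample points:

```python
# spot-check the binomial-cube identities (1.13) and (1.14) at sample points
for x in (-2.0, 0.5, 3.0):
    for y in (-1.0, 2.0):
        lhs = (x + y) ** 3
        assert abs(lhs - (x**3 + 3*x**2*y + 3*x*y**2 + y**3)) < 1e-9   # (1.13)
        assert abs(lhs - (x**3 + 3*x*y*(x + y) + y**3)) < 1e-9         # (1.14)
print("identities (1.13) and (1.14) check out")
```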
If each “bar” in (1.12) is combined with one of the three “slabs” in (1.12), then the following figure results.

[Figure (1.15): the cube of side x + y, now cut into a cube of volume x^3, a cube of volume y^3, and three larger “slabs” of volume xy(x + y) each.]

The larger “slabs” in (1.15) now have volume xy(x + y) each, and the full figure verifies (1.14). This is as far as we will go in trying to reproduce the efforts of the sixteenth century in dealing with algebraic quantities geometrically. We will use (1.13) and (1.14) in the arguments that follow, but will use all the algebra that we know now in their application. In particular, we will not shy away from negative numbers, or from square roots of negative numbers.

1.3.2 Reducing the cubic

The general form of the cubic equation is ax^3 + bx^2 + cx + d = 0. If a = 0, then we really do not have a cubic. So we assume a ≠ 0. This lets us divide both sides by a which gives us a monic cubic with the same solutions. So from now on we work with equations of the form

x^3 + px^2 + qx + r = 0. (1.16)

We now modify a technique introduced in the problems in Section 1.2.4. If we add the right constant to x and call the result y, we might end up with a simpler equation. We can figure out exactly what to add. If we let y = x + k, then x = y − k. We can substitute this in for x in (1.16) and then apply (1.13) and (1.9). This gives

x^3 + px^2 + qx + r = (y − k)^3 + p(y − k)^2 + q(y − k) + r
  = (y^3 − 3y^2 k + 3yk^2 − k^3) + p(y^2 − 2yk + k^2) + q(y − k) + r
  = y^3 + (p − 3k)y^2 + (q + 3k^2 − 2pk)y + (r − k^3 + pk^2 − qk).

Thus we see that if p = 3k, or k = p/3, then the equation in y will have no term with y^2.
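The claim can be checked with exact rational arithmetic; this is only a sketch (the function name is ours), reading the coefficients off the expansion just given:

```python
from fractions import Fraction as F

def reduced_coefficients(p, q, r):
    """Coefficients (of y^2, y, 1) after substituting x = y - p/3 into
    x^3 + p x^2 + q x + r.  Pass Fractions to keep the arithmetic exact."""
    k = p / 3
    return (p - 3*k, q + 3*k**2 - 2*p*k, r - k**3 + p*k**2 - q*k)

p, q, r = F(-6), F(4), F(-5)          # the cubic x^3 - 6x^2 + 4x - 5 = 0
c2, c1, c0 = reduced_coefficients(p, q, r)
print(c2, c1, c0)                      # the y^2 coefficient is 0

# double-check against direct substitution x = y - p/3 at a sample value of y
y = F(7)
x = y - p / 3
assert x**3 + p*x**2 + q*x + r == y**3 + c2*y**2 + c1*y + c0
```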
If we set k = p/3, then the last expression above becomes

y^3 + (p − 3(p/3))y^2 + (q + 3(p/3)^2 − 2p(p/3))y + (r − (p/3)^3 + p(p/3)^2 − q(p/3))
  = y^3 + (q + p^2/3 − 2p^2/3)y + (r − p^3/27 + p^3/9 − qp/3)
  = y^3 + (q − p^2/3)y + (r + 2p^3/27 − qp/3).

To use this calculation, one starts with the problem given by (1.16). The problem (1.16) is then replaced by the problem

y^3 + (q − p^2/3)y + (r + 2p^3/27 − qp/3) = 0. (1.17)

If a solution y is found for (1.17), then x = y − k = y − p/3 is a solution to (1.16). The form of the equation in (1.17) is so bad that there is little reason to try to remember it. It is much more useful to remember that when presented with (1.16) the substitution x = y − p/3 will remove the square term. For example (to pick an example where the numbers are not so bad), given x^3 − 6x^2 + 4x − 5 = 0 one puts in x = y − (−6/3) = y + 2 and gets

(y + 2)^3 − 6(y + 2)^2 + 4(y + 2) − 5
  = (y^3 + 6y^2 + 12y + 8) − 6(y^2 + 4y + 4) + 4y + 8 − 5
  = y^3 + 12y + 8 − 24y − 24 + 4y + 8 − 5
  = y^3 − 8y − 13 = 0.

If a solution y is found for the above, then x = y + 2 is a solution to the original. Note that carrying along the terms for y^2 verifies the fact that they vanish in the end, but if one is confident in the technique, then they can be ignored.

Exercises (4)

1. In each of the following, write down a cubic in y having no y^2 term so that the solutions of the original and the solutions of the equation in y differ by a single constant. State the relationship between the solutions in x and the solutions in y. You should do at least one of the problems “the hard way.” That is, do not use (1.17). You can even try “the harder way.” That is, pretend you do not know in advance what the constant difference between x and y is.

(a) x^3 + 9x^2 − 2x + 3 = 0.
(b) 2x^3 − 12x^2 + 6x + 3 = 0.
(c) x^3 + x^2 + 1 = 0.

1.3.3 Solving the reduced cubic

We will not completely solve the cubic here.
We will get a nice formula for the solution, but we will have trouble interpreting it. The full interpretation of the formula will require learning more about complex numbers. We will derive the formula and then take a break to learn about complex numbers. Then we will return to give the last word on the solution of the cubic in Section 1.5. The solution to the (reduced) cubic is based on two key steps. Just as the reduction above can be boiled down to one idea (replace x by y − p/3), and the solution to the quadratic can be boiled down to one idea (complete the square), the derivation of the formula for the cubic can be boiled down to a small number (two) of ideas. These ideas and how to implement them should be learned, rather than trying to memorize the whole derivation. The two ideas come from recognizing two resemblances. The first is the following. The reduced cubic can be written as x^3 + qx + r = 0. The equation we derived in (1.14) can be written as

(x + y)^3 − 3xy(x + y) − (x^3 + y^3) = 0.

Since x is in use in the cubic, we rewrite the above as

(u + v)^3 − 3uv(u + v) − (u^3 + v^3) = 0

to avoid having x used in a different way in two different places. Now if x = u + v, if −3uv = q and if −(u^3 + v^3) = r, then the equation above resembles the reduced cubic. Then if we can use −3uv = q and −(u^3 + v^3) = r to solve for u and v, then we get x as u + v. But we can solve for u and v. This requires the second idea. From −3uv = q, we get

uv = −q/3, or u^3 v^3 = −(q/3)^3. (1.18)

From −(u^3 + v^3) = r, we get u^3 + v^3 = −r. But this means that we know the sum and product of the two unknowns, u^3 and v^3. From Section 1.2.6, we know that u^3 and v^3 are roots of a quadratic. The specific quadratic is

z^2 + rz − (q/3)^3 = 0. (1.19)
The solutions to (1.19) are

z1 = −r/2 + √((r/2)^2 + (q/3)^3) = (−r + √(r^2 + 4(q/3)^3))/2,
z2 = −r/2 − √((r/2)^2 + (q/3)^3) = (−r − √(r^2 + 4(q/3)^3))/2.

The unknown u will be the cube root of one of them (z1, say), and the other unknown v will be the cube root of the other. Since x = u + v, we get

x = ∛(−r/2 + √((r/2)^2 + (q/3)^3)) + ∛(−r/2 − √((r/2)^2 + (q/3)^3)). (1.20)

This needs some interpretation. The fact that there are two values whose square is (r/2)^2 + (q/3)^3 is taken into account by using one value for z1 and the other for z2. The formula (1.20) makes it clear what to do with these two values. However, we will see in the next section that there are three possible values for each of the cube roots u and v. If all three values are used for both u and v, then it might seem that there are nine possible values for x = u + v. That this is not the case, and how to pick out the right combinations will be covered after a thorough discussion of complex numbers.

Two examples

There are other difficulties hidden in (1.20). The following examples will illustrate this. We will also refer to these examples later when we return to the topic of the relationships between the roots and the coefficients. Substituting 1 for x shows that x = 1 is a solution for each of the following.

x^3 + 3x − 4 = 0. (1.21)
x^3 − 9x + 8 = 0. (1.22)

If we apply (1.20) to (1.21), we get

x = ∛(2 + √5) + ∛(2 − √5) (1.23)

which hardly looks like 1. The formula in (1.23), while strange, is at least trying to give a real number. The square root of 5 is a specific real number which is slightly larger than two. Thus 2 + √5 is positive and 2 − √5 is negative. A positive real number has a positive real cube root and a negative real number has a negative real cube root, so there is a real value that (1.23) can be specifying for x. There is no other real solution to (1.21). The left side of (1.21) has derivative equal to 3x^2 + 3 which is positive for all real x.
Thus x^3 + 3x − 4 is a strictly increasing function of x, and it crosses the x-axis exactly once. Thus if our derivation of (1.20) is correct and complete, then (1.23) must be equal to 1. The equation (1.22) has more problems. It is of the form f(x) = 0 where f(x) = x^3 − 9x + 8. We have f′(x) = 3x^2 − 9 which is zero at x = ±√3. Now f(−√3) = 8 + 6√3 which is positive, and f(√3) = 8 − 6√3 which can be checked and found negative. (√3 is approximately 1.732.) It follows from standard curve sketching arguments that f(x) crosses the x-axis three times and (1.22) has three real solutions. But (1.20) gives

x = ∛(−4 + √−11) + ∛(−4 − √−11). (1.24)

The expression (1.24) does not look as though it gives any real solutions, let alone three real solutions of which one is x = 1. In [2], Cardano struggled with expressions such as (1.24) and attempted to make sense of them. He was willing to manipulate such expressions formally (as you will in the next exercise) in spite of the fact that he had no confidence that they had any meaning. With the later development (and slow acceptance) of complex numbers, expressions such as (1.24) became easy to handle and interpret. The next tool that we will look at will be complex numbers and their rules of arithmetic.

Exercises (5)

1. Plug the solutions (1.23) and (1.24) into their respective equations and show that they are indeed solutions. This is just a brute force calculation. Use (1.13) to cube the solutions. There will be only one tricky part to the simplification.

2. [XXXXXXXXXXxx need some cubics to work on.]

1.4 Complex arithmetic

As seen from the examples in the last section, we need to understand square and cube roots thoroughly in order to make sense out of (1.20). This is done by understanding complex arithmetic thoroughly. We refer to the topic as complex arithmetic as opposed to complex analysis since we are only going to add, subtract, multiply, divide and take roots.
We will not differentiate, integrate or take limits. Students who have taken complex analysis will find what we do familiar, but the emphasis will be very different. A prior knowledge of complex analysis is not at all necessary.

1.4.1 Complex numbers and the basic operations

There is no real number whose square is −1. So we invent a new number i with the property that i^2 = −1. Almost everything that follows is forced if we want to end up with numbers that are as well behaved as real numbers. A desire to have laws of arithmetic such as the commutative law, associative law, distributive law and so forth dictates what happens next. A complex number is an expression of the form a + bi where a and b are real numbers. We say that a + bi = c + di if and only if a = c and b = d. The set of all complex numbers is denoted C. We now look at operations. The “calculation”

(a + bi) + (c + di) = (a + c) + (bi + di) = (a + c) + (b + d)i

is motivated by the laws of arithmetic mentioned above. The word “calculation” is in quotes since we really do not yet have a definition for the sum of two complex numbers. But the desire to cooperate with the laws of arithmetic leads us to use the result as a definition. Thus we say that

(a + bi) + (c + di) = (a + c) + (b + d)i

is a definition. Note that our definition does in fact take two complex numbers and give us a third complex number since both (a + c) and (b + d) are real numbers. The “calculation”

(a + bi)(c + di) = ac + adi + bci + bdi^2 = ac + adi + bci + bd(−1) = (ac − bd) + (ad + bc)i

leads to the definition

(a + bi)(c + di) = (ac − bd) + (ad + bc)i.

Now that we have definitions for addition and multiplication, we can see how the definitions behave. Direct calculations (no longer in quotes since we have definitions)

(0 + 0i) + (c + di) = (0 + c) + (0 + d)i = c + di

and

(1 + 0i)(c + di) = (1c − 0d) + (1d + 0c)i = c + di

show that 0 + 0i is an identity for addition and 1 + 0i is an identity for multiplication.
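These definitions agree with the complex arithmetic built into many programming languages. A small sketch (the helper mult is ours) compares the multiplication rule above with Python's built-in complex type:

```python
def mult(a, b, c, d):
    """Product of a+bi and c+di using the definition in the text:
    (ac - bd) + (ad + bc)i, returned as the pair (real, imaginary)."""
    return (a * c - b * d, a * d + b * c)

a, b, c, d = 2.0, 3.0, -1.0, 4.0
re, im = mult(a, b, c, d)
z = complex(a, b) * complex(c, d)
print((re, im), z)             # both give -14 + 5i
assert z == complex(re, im)
```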
The check

(a + bi) + ((−a) + (−b)i) = (a + (−a)) + (b + (−b))i = 0 + 0i

shows that (−a) + (−b)i is an additive inverse for a + bi. So we set −(a + bi) = (−a) + (−b)i. Now that we have additive inverses, we can subtract by adding the additive inverse. That is, if z and w are complex numbers (and yes, it is legitimate to use a single letter to represent a complex number), then z − w = z + (−w). We note that (0 + 0i)(c + di) = 0 + 0i by direct calculation from the definition. Multiplicative inverses will allow us to divide in the same way that additive inverses allow us to subtract. But multiplicative inverses are slightly more complicated. Given a + bi, we want c + di so that

(a + bi)(c + di) = (ac − bd) + (ad + bc)i = 1 + 0i. (1.25)

There are three ways to find c + di: easy, medium and hard. It will turn out that hard is the most useful for the rest of the course, but here we will do medium and easy. The calculation above leads to the medium technique directly. We note that a + bi cannot be 0 + 0i since no value of c + di will make (0 + 0i)(c + di) = 1 + 0i. We now use the information in (1.25) to write down two equations

ac − bd = 1,
ad + bc = 0. (1.26)

These are two linear equations in the two unknowns c and d. The values a, b, 0 and 1 are given. Straightforward linear algebra gives the solutions

c = a/(a^2 + b^2),
d = −b/(a^2 + b^2) (1.27)

so that we can write

(a + bi)^(−1) = a/(a^2 + b^2) − (b/(a^2 + b^2))i.

The easy technique is to know a standard trick. We want to put 1/(a + bi) in the form of a complex number. We make the denominator real by doing

1/(a + bi) = (1/(a + bi)) · 1 = (1/(a + bi)) · ((a − bi)/(a − bi)) = (a − bi)/(a^2 + b^2) = a/(a^2 + b^2) − (b/(a^2 + b^2))i. (1.28)

All of the calculations above are fine, but the last equality is not yet justified. We are trying to get multiplicative inverses so that we can divide. The last equality assumes that we already know how to divide by a real number.
That means the calculation in (1.28) should be preceded by a definition of division by a real number. This is easy to do, and so the outline using (1.28) to get multiplicative inverses is sometimes used. We leave the hard technique for much later. Note that the assumption a + bi ≠ 0 + 0i shows up in the formula. If we do not make the assumption, then dividing by a^2 + b^2 might be dividing by zero, and if we do make the assumption, then dividing by a^2 + b^2 is never dividing by zero since a and b are real and a^2 + b^2 cannot be zero if at least one of a or b is not zero. Lastly we mention some relaxations of rules when it comes to writing down complex numbers. We often write bi for 0 + bi, and we often write a for a + 0i. This means that 0 + 0i is often written as 0, and 1 + 0i is often written as 1.

Exercises (6)

1. Prove all that were not already done in class of commutativity and associativity for addition and multiplication, and distributivity of multiplication over addition in complex arithmetic.

2. Do the linear algebra that gives the solutions (1.27) to (1.26).

1.4.2 Complex numbers as a vector space

This minor section is here only because the concept becomes very important later, and it is worth seeing more than once. If r and s are real numbers, then they are also complex numbers by thinking of r as r + 0i and s as s + 0i. If r and s are real numbers and z, y and w are complex numbers, then we know the following facts.

1. z + w = w + z.
2. (z + y) + w = z + (y + w).
3. z + 0 = z.
4. z + (−z) = 0.
5. (rs)z = r(sz).
6. (r + s)z = rz + sz.
7. r(z + y) = rz + ry.
8. 1z = z.

These facts make C a vector space with the real numbers as scalars. The definition of a complex number says that every complex number is uniquely represented as a + bi with a and b real. Since a + bi = a·1 + b·i, this can be reinterpreted to say that every complex number is a unique linear combination of the two complex numbers 1 and i.
This makes the set {1, i} a basis for C with the real numbers as scalars. We will have more to say about this later.

1.4.3 Complex numbers in Cartesian and polar coordinates

Cartesian coordinates

Since each complex number is specified by a pair of real numbers, it is possible to plot the complex numbers in the Cartesian plane. The point (a, b) in the plane will correspond to the complex number a + bi and vice-versa. This makes all points in the x-axis correspond to complex numbers of the form a + 0i, which are regarded as real numbers. For that reason, the x-axis is called the real axis. The y-axis corresponds to complex numbers of the form 0 + bi, which are called imaginary numbers, so the y-axis is called the imaginary axis. The addition rule (a + bi) + (c + di) = (a + c) + (b + d)i says that when viewed as points in the plane, complex numbers are added coordinate by coordinate, just as 2-dimensional vectors are. This continues the observations made in the previous section. Similarly, negation, where −(a + bi) = (−a) + (−b)i, is also done coordinate by coordinate. In conclusion, the Cartesian representation of complex numbers as points in the plane, where the a and b in a + bi are x and y coordinates, cooperates well with addition, negation and thus subtraction. But multiplication and division do not do as well in the Cartesian representation. The x and y coordinates get mixed badly under multiplication and division.

Polar coordinates

It turns out that polar coordinates cooperate beautifully with multiplication and division. Since n-th roots are a multiplicative concept, polar coordinates cooperate beautifully with the taking of n-th roots as well. This is our main reason for looking at polar coordinates. To do polar coordinates we need the distance of a point to the origin, and an angle that the line to the origin makes with respect to the positive x-axis.
In the following figure

[Figure: the point z = a + bi plotted in the plane, at distance r from the origin, with the segment to the origin making angle θ with the positive x-axis.] (1.29)

the Cartesian coordinates of z are (a, b) and the polar coordinates are (r, θ). The relationships that we will need between the various quantities are as follows:

a = r cos(θ),    b = r sin(θ),    r = √(a² + b²).

In particular, if the polar coordinates (r, θ) are known, then the Cartesian coordinates are (r cos(θ), r sin(θ)), making

z = r cos(θ) + r sin(θ)i = r(cos(θ) + i sin(θ)). (1.30)

Putting the i before the sin(θ) in (1.30) is traditional. Let us multiply two complex numbers written in the form (1.30). For z = r(cos(θ) + i sin(θ)) and w = s(cos(φ) + i sin(φ)), we get

zw = r(cos(θ) + i sin(θ)) · s(cos(φ) + i sin(φ))
   = rs((cos(θ) cos(φ) − sin(θ) sin(φ)) + i(cos(θ) sin(φ) + sin(θ) cos(φ)))
   = rs(cos(θ + φ) + i sin(θ + φ)) (1.31)

where the last equal sign follows from two of the standard trigonometry identities. This result is extremely important and needs some discussion to bring out its admirable qualities.

Modulus and argument

In (1.29), the length r is called the modulus of the complex number z. It is simply the distance from z to the origin. The modulus of z is usually denoted |z|, which explains why it is sometimes called the absolute value of z. Note that |z| ≥ 0 for any complex number z and that |z| is zero if and only if z = 0. The angle θ in (1.29) is called the argument of the complex number z and is denoted Arg(z). The expression (1.30) expresses a complex number z in terms of its modulus and argument. It expresses z as a real number |z| times a complex number cos(θ) + i sin(θ) which has modulus 1 since

√(cos²(θ) + sin²(θ)) = 1.

In (1.31), we multiply z, which has modulus r and argument θ, times w, which has modulus s and argument φ. We see that the result has modulus rs and argument θ + φ. We can turn this into an easily stated rule: when complex numbers are multiplied, the moduli are multiplied and the arguments are added.
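The rule just stated can be sketched numerically with Python's built-in complex type and the cmath module (an illustration added here; the two sample numbers are arbitrary).

```python
# Numerical sketch of the rule: when complex numbers are multiplied,
# the moduli are multiplied and the arguments are added.
# abs() gives the modulus |z|; cmath.phase() gives the argument Arg(z).
import cmath
import math

z = 1 + 1j    # modulus sqrt(2), argument pi/4
w = -2 + 2j   # modulus 2*sqrt(2), argument 3*pi/4
zw = z * w    # expect modulus sqrt(2)*2*sqrt(2) = 4, argument pi/4 + 3*pi/4 = pi

print(abs(zw))          # 4.0
print(cmath.phase(zw))  # 3.141592653589793, that is, pi
```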
Exercises (7)

1.4.4 Complex conjugation

In (1.28), we saw that (a + bi)(a − bi) = a² + b², which is the square of the modulus of a + bi. The complex number a − bi is called the complex conjugate of the complex number a + bi. If z is a complex number, then its complex conjugate is written as z̄. There are many nice properties of the complex conjugate. To write some of them down efficiently, we adopt some standard notation. If z = a + bi, then a is the real part of z, written Re(z), and b (a real number) is the imaginary part of z, written Im(z). This makes z = Re(z) + Im(z)i. We can now record several facts about complex conjugates. The first has already been noted.

zz̄ = |z|²,    (z + z̄)/2 = Re(z),    (z − z̄)/(2i) = Im(z),
the conjugate of z + w is z̄ + w̄,    the conjugate of −z is −z̄,
the conjugate of zw is z̄ · w̄,    the conjugate of 1/z is 1/z̄,
the conjugate of z̄ is z. (1.32)

If we plot two complex numbers that are the conjugates of each other,

[Figure: z and z̄ plotted in the plane, mirror images of each other across the real axis.]

then from the Cartesian view one is obtained from the other by reflection about the real axis. From the polar view, we can say that they have the same modulus, and that the arguments are the negatives of each other.

Exercises (8)

1. Prove all the facts in (1.32).

2. Prove that a complex number z is real if and only if z = z̄. What is the shortest proof you can give?

1.4.5 Powers and roots of complex numbers

Let the complex number z have modulus r and argument θ so that z = r(cos(θ) + i sin(θ)). Then zⁿ, which is just z · z · ⋯ · z with n copies of z, has modulus r · r · ⋯ · r with n copies of r, and has argument θ + θ + ⋯ + θ with n copies of θ. That is, zⁿ has modulus rⁿ and argument nθ. In particular,

zⁿ = rⁿ(cos(nθ) + i sin(nθ)).

This completely analyzes powers of complex numbers when given in polar form. Now that powers are understood, we can look at roots. If we look for the n-th root of this same z, then we want some complex number w with modulus s and argument φ so that sⁿ = r and nφ = θ. That is, s = ⁿ√r and φ = θ/n. The modulus is straightforward.
It is supposed to be a non-negative real number, and there is only one non-negative real number ⁿ√r that can be the n-th root of the non-negative real number r. The argument is less straightforward. Saying that θ is the argument of z specifies θ as an angle, but not as a real number. All of θ, θ + 2π, θ + 4π, θ − 2π specify the same angle. Since they specify the same angle, there is no reason to use any one over the other when specifying the argument of z. But when they are divided by n, they can end up specifying different angles. For example,

θ/3,    (θ + 2π)/3,    (θ + 4π)/3

all specify different angles, but when multiplied by 3, they all become the same angle as θ. However,

(θ + 6π)/3 = θ/3 + 2π

specifies the same angle as θ/3, which is one of the angles above. We can analyze the situation completely as follows. If z has argument θ, then anything of the form θ + k(2π) represents the same angle. If we divide k by n, we get a quotient q and remainder⁷ m so that k = qn + m, and we can require that the remainder satisfy 0 ≤ m < n. Now

(θ + k(2π))/n = (θ + (qn + m)(2π))/n = θ/n + q(2π) + (m/n)(2π),

which represents the same angle as θ/n + (m/n)(2π). So we see that the angle we get for the n-th root depends only on the remainder and not the quotient. Now if we use two different values of k with two different remainders (m1 and m2, say), then the two angles that result from this will differ by

(θ/n + (m1/n)(2π)) − (θ/n + (m2/n)(2π)) = ((m1 − m2)/n)(2π),

which is a nonzero angle of absolute value less than 2π, since m1 − m2 is nonzero and less than n in absolute value, and thus does not represent the angle 0. This makes

θ/n + (m1/n)(2π)    and    θ/n + (m2/n)(2π)

two different angles. So the angles of the n-th roots do not depend on the quotient, but depend completely on the remainder. This lets us list all the n-th roots. If z ≠ 0 has modulus r and argument θ, then all the n-th roots of z are of the form

ⁿ√r (cos(θ/n + (m/n)(2π)) + i sin(θ/n + (m/n)(2π))),    m ∈ {0, 1, . . . , n − 1}.
If z = 0, then the modulus is zero and all roots have modulus zero. But the only complex number with modulus zero is 0 itself. So all n-th roots of 0 equal 0.

⁷The use of quotients and remainders will be extremely important later in these notes.

Examples

1. Let us find the cube roots of z = 1 = 1 + 0i. The modulus is 1 and the argument is zero. The cube roots all have modulus ³√1 = 1, and the arguments of the cube roots are

0/3 + (0/3)(2π) = 0,    0/3 + (1/3)(2π) = 2π/3,    0/3 + (2/3)(2π) = 4π/3.

The three cube roots are pictured in the xy-plane below.

[Figure: the three cube roots 1, ω and ω² of 1 spaced evenly around the unit circle.]

They are

1 = 1(cos(0) + i sin(0)) = 1 + 0i,
ω = 1(cos(2π/3) + i sin(2π/3)) = −1/2 + i√3/2,
ω² = 1(cos(4π/3) + i sin(4π/3)) = −1/2 − i√3/2.

The fact that the third root ω² is the square of the root ω follows from the way complex multiplication behaves when expressed in polar coordinates. The use of ω (the Greek letter omega) and ω² for the two non-real cube roots of 1 is standard and will be used this way for the rest of these notes.

2. Let us find the fourth roots of 2ω = −1 + i√3. The modulus is 2 and the argument is 2π/3. The fourth roots all have modulus ⁴√2, and the arguments are

2π/12 + (0/4)(2π) = π/6,    2π/12 + (1/4)(2π) = 2π/3,
2π/12 + (2/4)(2π) = 7π/6,    2π/12 + (3/4)(2π) = 5π/3.

The four fourth roots are shown below.

[Figure: the four fourth roots a, b, c and d of 2ω spaced evenly around the circle of radius ⁴√2.]

They are

a = ⁴√2 (cos(π/6) + i sin(π/6)) = ⁴√2 (√3/2 + i(1/2)),
b = ⁴√2 (cos(2π/3) + i sin(2π/3)) = ⁴√2 (−1/2 + i√3/2),
c = ⁴√2 (cos(7π/6) + i sin(7π/6)) = ⁴√2 (−√3/2 − i(1/2)),
d = ⁴√2 (cos(5π/3) + i sin(5π/3)) = ⁴√2 (1/2 − i√3/2).

Notice that a², a³ are not other fourth roots of 2ω. (See this by computing the modulus and argument of a² and a³ using the principle stated at the end of Section 1.4.3.) This differs from the behavior of ω and ω², which are both cube roots of 1.
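The recipe above translates directly into a short function (a sketch added for illustration; the helper name nth_roots is ours, not from the notes).

```python
# All n-th roots of a nonzero complex z, following the recipe: the roots are
#   (n-th root of r) * (cos(theta/n + (m/n)*2*pi) + i*sin(theta/n + (m/n)*2*pi))
# for m = 0, 1, ..., n-1, where r = |z| and theta = Arg(z).
import cmath
import math

def nth_roots(z, n):
    r, theta = abs(z), cmath.phase(z)
    s = r ** (1.0 / n)  # the unique non-negative real n-th root of r
    return [complex(s * math.cos(theta / n + m * 2 * math.pi / n),
                    s * math.sin(theta / n + m * 2 * math.pi / n))
            for m in range(n)]

# Example 1: the cube roots of 1 are 1, omega, omega^2.
for w in nth_roots(1, 3):
    print(w)  # each satisfies w**3 == 1, up to rounding

# Example 2: the fourth roots of 2*omega = -1 + i*sqrt(3).
roots = nth_roots(complex(-1, math.sqrt(3)), 4)
```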
However, there are nice relationships between a, b, c and d which we will explore in the next section.

Exercises (9)

1. This exercise will compute ω and ω² algebraically. The cube roots of 1 should be solutions to x³ = 1, or to the equivalent equation x³ − 1 = 0. However, x = 1 is clearly one solution, so x − 1 should be a factor of x³ − 1. What is the other factor? If you don't already know the answer, you should do the long division (x³ − 1)/(x − 1) to find out. Then you should memorize the answer since it is important. The other factor is a quadratic. Solve the quadratic. If all is done correctly, your answers should be ω and ω².

2. Verify that the conjugate of ω is ω² and that the conjugate of ω² is ω. Find all complex numbers z so that z² = z̄.

3. (a) Find all cube roots of i. (b) Find all cube roots of 2. (c) Find all sixth roots of −1. (d) Find all fourth roots of −1.

1.4.6 Roots of 1

The n-th roots of 1 (also called roots of unity) occupy a special place in these discussions. They will turn out to be important not only now, but much later in the notes. The modulus of 1 is 1, every power of 1 is 1, and the only positive real n-th root of 1 is 1. Thus all n-th roots of 1 lie on the unit circle (the circle of radius one with center at the origin). Since the argument of 1 is zero, the n-th roots of 1 are spaced evenly around the unit circle with angle exactly 2π/n between them, starting at 1. If α is the root with argument exactly 2π/n, then the various powers of α give all the n-th roots of 1. Here are the twelve 12-th roots of 1.

[Figure: the twelve 12-th roots of 1 spaced evenly around the unit circle, with 1, α, α⁵, α⁸ and α¹¹ labeled.]

We have labeled α⁵, α⁸ and α¹¹ for no particular purpose other than for illustrative examples.

Application to n-th roots of arbitrary complex numbers

Let z be a complex number (other than zero) and let w be an n-th root of z. We know that w ≠ 0 since z ≠ 0. Let β be an n-th root of 1. Then

(βw)ⁿ = βⁿwⁿ = 1z = z

shows that βw is another n-th root of z.
Further, if y is another n-th root of z, then γ = y/w satisfies

γⁿ = (y/w)ⁿ = yⁿ/wⁿ = z/z = 1,

which shows that γ is an n-th root of 1. Since y = γw, our two arguments have shown that multiplying an n-th root of z by an n-th root of 1 gives another n-th root of z, and that every n-th root of z can be obtained from one single n-th root of z by multiplying by the various n-th roots of 1. Let α be the n-th root of 1 with argument exactly equal to 2π/n. Let w be one n-th root of z. Then all the n-th roots of z form precisely the set

{w = 1w, αw, α²w, . . . , αⁿ⁻¹w} = {αᵐw | m = 0, 1, 2, . . . , n − 1} (1.33)

of n complex numbers. We can review these observations from the polar view of complex multiplication. If w is an n-th root of z and γ an n-th root of 1, then the modulus of γ is 1. That makes the modulus of γw the same as the modulus of w, and the correct modulus to be an n-th root of z. The argument of γw will differ from that of w by a multiple of 2π/n, and thus be a correct argument for an n-th root of z. To summarize: if z has modulus r and argument θ, then one n-th root w of z has modulus ⁿ√r and argument θ/n. Now we form all the n-th roots of z as specified by (1.33).

Exercises (10)

1. What are all the 6-th roots of 64? (You are supposed to know that 2⁶ = 64.)

2. What are all the 6-th roots of −64?

3. What are all the cube roots of −27i?

1.5 The cubic revisited

1.5.1 Picking out the solutions from the formula

In Section 1.3, we reduced an arbitrary cubic to the form x³ + qx + r = 0 and found that the solutions are given by

x = ∛(−r/2 + √((r/2)² + (q/3)³)) + ∛(−r/2 − √((r/2)² + (q/3)³)). (1.20)

We know that each of the cube roots can take on three values (unless the value is zero). But the two cube roots are not independent. The two cube roots correspond to the values u and v chosen in Section 1.3.3 so that x = u + v. The values u and v satisfied various equalities, of which the most relevant, (1.18), is that uv = −q/3.
This means that

v = −q/(3u),

and the second cube root in (1.20) is specified once the first cube root is known. In particular, if the first cube root is real and q is real, the second cube root must be real. We know how to get all cube roots of a number once we have one cube root of the number. In a very similar manner, we can get all roots of x³ + qx + r once we have one root. The equality uv = −q/3 is the key. Assume that x = u + v is one root of x³ + qx + r, where u is one cube root of

−r/2 + √((r/2)² + (q/3)³), (1.34)

and v is one cube root of

−r/2 − √((r/2)² + (q/3)³). (1.35)

Then the other cube roots of (1.34) are ωu and ω²u, and the other cube roots of (1.35) are ωv and ω²v, where ω and ω² are the non-real cube roots of 1. Given that uv = −q/3 and that any two cube roots used to make x also have to multiply to −q/3, we must have that the three roots given by (1.20) are

x1 = u + v,
x2 = ωu + ω²v, (1.36)
x3 = ω²u + ωv,

since these are the only combinations where the two parts have the same product as uv. We can apply this to the examples worked out in Section 1.3.3. We saw that the formula (1.20) applied to x³ + 3x − 4 = 0 gave

x = ∛(2 + √5) + ∛(2 − √5). (1.23)

Since all the numbers in (1.23) are real, there are real values of the two cube roots. These must go together, since uv = −q/3 and q = 3 is real. If we take the cube roots in (1.23) as representing real numbers, then the other two solutions to x³ + 3x − 4 = 0 are

ω ∛(2 + √5) + ω² ∛(2 − √5),    and    ω² ∛(2 + √5) + ω ∛(2 − √5).

The formula (1.20) applied to x³ − 9x + 8 = 0 gave

x = ∛(−4 + √(−11)) + ∛(−4 − √(−11)). (1.24)

The two numbers inside the cube root signs, α = −4 + i√11 and ᾱ = −4 − i√11, are both not real and are complex conjugates of each other. So they have the same modulus, and their arguments are the negatives of each other.
If we take the (real) cube roots of the modulus and 1/3 of the two arguments, then we get two complex numbers that are cube roots of α and ᾱ and that are also complex conjugates of each other. If we call these β and β̄, then their product is real and must be −q/3 = 9/3 = 3. We can verify that by noting that αᾱ = |α|² = 16 + 11 = 27, and the cube root of 27 is 3. So x = β + β̄ must be one solution. This is real by one of the facts from (1.32). The other two solutions are ωβ + ω²β̄ and ω²β + ωβ̄. From the comments in Section 1.3.3, we know that all the roots of x³ − 9x + 8 = 0 are supposed to be real. But this can be verified from a key observation from Exercise Set (9), that the conjugate of ω² is ω. Now the conjugate of ω²β̄ is (the conjugate of ω²)(the conjugate of β̄) = ωβ, so the two parts of the second solution are complex conjugates of each other and the sum is real. A similar calculation shows that the third solution is real. Recall that one of the solutions must be 1. We leave it to the reader to decide which. Recall that one solution is less than −√3, and another is greater than √3. Also note that the argument of −4 + i√11 is just slightly larger than 3π/4. (It is almost exactly 140 degrees.)

Exercises (11)

1. Which of the solutions to x³ − 9x + 8 = 0 must give x = 1? Hint: the relative positions of the three solutions on the line and the facts mentioned are all that is needed.

2. Combine all the steps from Sections 1.3.2 and 1.3.3 to solve 2x³ + 12x² + 18x + 12 = 0. The numbers were chosen to come out not completely horrendous, but not exactly nice.

1.5.2 Symmetry and Asymmetry

This continues the discussion started in Section 1.2.6. There we looked at the relationship between the roots and coefficients of the quadratic. Here we will make the same study for the cubic. As might be expected from the more complicated situation, we will have more to say. In fact, the complexities are high enough to give hints about what will come later in the notes.
We also continue the discussion started in Section 1.2.5 on the differences between the four operations of addition, subtraction, multiplication and division, and the fifth operation consisting of the taking of n-th roots.

Symmetry

If we are given a monic cubic equation

x³ + px² + qx + r = 0 (1.37)

and we know that the three roots⁸ are r1, r2 and r3, then the equation (1.37) must⁹ be the same as

0 = (x − r1)(x − r2)(x − r3) = x³ − (r1 + r2 + r3)x² + (r1r2 + r2r3 + r3r1)x − r1r2r3.

From this we get

p = −(r1 + r2 + r3),    q = r1r2 + r2r3 + r3r1,    r = −r1r2r3. (1.38)

These formulas are only slightly more complicated than the ones we get for the quadratic from (1.10). Also, the formulas in (1.38) share with the corresponding formulas for the quadratic the property discussed in the next paragraph.

⁸Later we will need to justify our claim that there are three roots.
⁹The word "must" has the same qualifications that we mentioned in Section 1.2.6.

There are no instructions how to assign the three roots to the symbols r1, r2 and r3. In fact, there are 6 ways of doing so, corresponding to the six ways of ordering three things. The formulas in (1.38) come out the same no matter how the values of the roots are assigned to the three symbols. For p, this comes down to the commutativity of addition, and for r it comes down to the commutativity of multiplication. For q the reason is a bit more complicated, but it is easy to see that it is true. The word that is attached to these observations is "symmetric." To discuss this more precisely, we make some definitions. The discussion can be held from two points of view. The first uses orderings. When we assign the three roots of the cubic to the symbols r1, r2 and r3, we are giving an ordering to the roots. The one assigned to r1 is the first, the one assigned to r2 is the second, and the one assigned to r3 is the third. There are six possible ways to do this with three values.
There are three choices for which of the three is the first, two choices remain for which is to be second, and there is then only one choice left for which is to be the third. The product of 3, 2, and 1 is six. To make a different assignment of the roots to r1, r2 and r3 is to choose a different ordering of the roots. The statement that the formulas in (1.38) are symmetric means that the values for p, q and r come out the same no matter which ordering is used for the roots. We can use orderings to introduce the second point of view, which discusses permutations. Permutations will have a major role in these notes. If f : A → A is a function from a set to itself that is one-to-one and onto, then f is said to be a permutation of the set A. If A has three elements (for example, if it is the set of roots of a cubic), then there are six permutations of A. This is easy to see by referring back to orderings. If an ordering is picked for the three elements, then we can talk about the first element, the second element, and the third. There are three places a permutation can send the first, there are only two places the permutation can send the second (this uses the one-to-one aspect), and there is then only one place to send the last. Again, there are 6 possibilities. We can write out all 6 permutations easily. If we simply list, in order, where r1, r2 and r3 are taken, then a list such as r3, r1, r2 describes the permutation that takes r1 to r3, takes r2 to r1, and takes r3 to r2. With this convention, the 6 permutations of r1, r2 and r3 are

r1, r2, r3,    r1, r3, r2,    r2, r1, r3,
r2, r3, r1,    r3, r1, r2,    r3, r2, r1. (1.39)

Note that the first permutation "does nothing." It takes each of r1, r2 and r3 to itself. However, this is a valid one-to-one and onto function from {r1, r2, r3} to itself, and so it is a valid permutation. This permutation is called the identity permutation.
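As an illustration (added here; the sample root values are arbitrary), the six orderings can be generated mechanically, and the symmetry of (1.38) checked by evaluating p, q and r at every ordering of the same three values.

```python
# The six orderings of (1.39), generated mechanically, and a check that the
# right-hand sides of (1.38) give the same p, q, r under every ordering.
from itertools import permutations

roots = (2, -1, 5)  # arbitrary sample values for the three roots
orderings = list(permutations(roots))
print(len(orderings))  # 6

values = set()
for r1, r2, r3 in orderings:
    p = -(r1 + r2 + r3)
    q = r1 * r2 + r2 * r3 + r3 * r1
    r = -(r1 * r2 * r3)
    values.add((p, q, r))
print(values)  # a single triple: {(-6, 3, 10)}
```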
From the point of view of permutations, the statement that the formulas in (1.38) are symmetric means that the values of the right hand sides of the equalities in (1.38) do not change if a permutation is applied to the values of r1, r2 and r3. Recall that in Section 1.2.6, we pointed out that if r1 and r2 are the roots of x² + bx + c, then b = −(r1 + r2) and c = r1r2. Note that the values of b and c do not change if a permutation is applied to the values of r1 and r2.

Asymmetry

As interesting as the symmetries in (1.38) are, they get more interesting when compared to formulas that are less symmetric. Both the quadratic and cubic are solvable by radicals. From the coefficients and constants, one can get to the roots by the five operations of addition, subtraction, multiplication, division and the taking of n-th roots. If we look at the sequence of steps involved in these calculations, we get a sequence of intermediate values on the way to the roots. Thus for one root of x² + bx + c, the intermediate values might be listed as b² first, then b² − 4c, then √(b² − 4c), then −b + √(b² − 4c), and finally (1/2)(−b + √(b² − 4c)). Since the coefficients can be computed from the roots, each intermediate value can be computed from the roots as well, by plugging in r1r2 for c and −(r1 + r2) for b. However, we can get some intermediate values more easily by other means. Let us look at the first value √(b² − 4c) where a square root occurs. From

r1 = (1/2)(−b + √(b² − 4c))    and    r2 = (1/2)(−b − √(b² − 4c)),

we get

√(b² − 4c) = r1 − r2. (1.40)

The formula is very simple, but symmetry is lost. If we switch r1 with r2, we get −√(b² − 4c), which is the other number whose square is b² − 4c. Thus permuting the roots moves the value r1 − r2 among the solutions (in y) of

y² − (b² − 4c) = 0. (1.41)

The intermediate values, such as b² or b² − 4c, involving no square roots must have symmetric formulas in terms of r1 and r2.
This is because both b and c do not change when r1 and r2 are permuted, so neither will b² or b² − 4c. We have b² = (r1 + r2)² from the formula for b, and b² − 4c = (r1 − r2)² from the formula for √(b² − 4c). For the cubic x³ + qx + r, the intermediate values that involve n-th roots are

√((r/2)² + (q/3)³) (1.42)

as well as

∛(−r/2 + √((r/2)² + (q/3)³))    and    ∛(−r/2 − √((r/2)² + (q/3)³)). (1.43)

The two quantities in (1.43) have already been given convenient names. One is u and the other is v. Further, u³ and v³ are the same except for the sign in front of the quantity (1.42). Thus we get that (1.42) is equal to (1/2)(u³ − v³). So if we have u and v expressed in terms of r1, r2 and r3, then we have (1.42) expressed in terms of r1, r2 and r3 as well. The three roots in terms of u and v are given by

r1 = u + v,
r2 = ωu + ω²v, (1.36)
r3 = ω²u + ωv.

Since there are two unknowns to solve for (u and v), we only need the first two of the equations in (1.36). This is not surprising, since we are assuming that the coefficient of x² is zero. This means that 0 = −(r1 + r2 + r3), and we can get the third root from the first two. In spite of this, we get nicer expressions for u and v if we use all three equations. This requires one important observation and one trick. The observation is that the numbers ω and ω² are the roots of the quadratic that is left when x³ − 1 has x − 1 factored out. See Exercise Set (9). That is, they are the two roots of x² + x + 1. From this we get that ω² + ω + 1 = 0. Note that if we plug the expressions in (1.36) into r1 + r2 + r3, then the fact ω² + ω + 1 = 0 immediately verifies that r1 + r2 + r3 = 0. The trick is to multiply r2 by ω² and r3 by ω. This gives

r1 = u + v,
ω²r2 = u + ωv,
ωr3 = u + ω²v.

Now if the three expressions are added, we get

r1 + ω²r2 + ωr3 = 3u + (1 + ω + ω²)v = 3u + 0v = 3u,

so that

u = (1/3)(r1 + ω²r2 + ωr3). (1.44)

In an almost identical manner, we get

v = (1/3)(r1 + ωr2 + ω²r3). (1.45)
So the formulas for the important intermediate values u and v are fairly simple, but not symmetric. The lack of symmetry will be explored in exercises. To find the value of (1.42), we must cube u and v and take the difference. While lengthy, this is straightforward. The reader can show

(a + b + c)³ = a³ + b³ + c³ + 3(a²b + a²c + ab² + b²c + ac² + bc²) + 6abc. (1.46)

If this is used to evaluate u³, we get

(1/27)(r1³ + r2³ + r3³ + 3(r1²r2ω² + r1²r3ω + r1r2²ω + r2²r3ω² + r1r3²ω² + r2r3²ω) + 6r1r2r3).

Similarly, for v³ we get

(1/27)(r1³ + r2³ + r3³ + 3(r1²r2ω + r1²r3ω² + r1r2²ω² + r2²r3ω + r1r3²ω + r2r3²ω²) + 6r1r2r3).

Now u³ − v³ calculates as

(1/27) · 3(ω² − ω)(r1²r2 − r1²r3 − r1r2² + r2²r3 + r1r3² − r2r3²).

But ω² − ω = −i√3. So we have

√((r/2)² + (q/3)³) = (1/2)(u³ − v³)
= (−3i√3/54)(r1²r2 − r1²r3 − r1r2² + r2²r3 + r1r3² − r2r3²)
= (i√3/18)(−r1²r2 + r1²r3 + r1r2² − r2²r3 − r1r3² + r2r3²).

Direct computation shows that

(r1 − r2)(r2 − r3)(r3 − r1) = r1r2r3 − r1²r2 − r1r3² + r1²r3 − r2²r3 + r1r2² + r2r3² − r1r2r3
= −r1²r2 + r1²r3 + r1r2² − r2²r3 − r1r3² + r2r3².

Combining the last two calculations gives

√((r/2)² + (q/3)³) = (i√3/18)(r1 − r2)(r2 − r3)(r3 − r1). (1.47)

The formulas (1.44), (1.45) and (1.47) share properties with (1.40). They are all polynomials in the roots (combinations of products and powers, but with no divisions by the roots or taking of n-th roots) that are fairly simple, but not symmetric. The effects of permuting the roots in (1.40) are obvious, and in (1.47) they are almost as obvious. The effects in (1.47), as well as in (1.44) and (1.45), will be covered in the next exercises. After the exercises we will discuss the implications of all of these observations.

Exercises (12)

1. Derive (1.45) in a manner similar to the derivation of (1.44).

2. Derive (1.46). You can do this in two ways.
One is to just work out the cube and get 27 terms that have to be gathered. Another is to use (1.13) on (a + (b + c))³, which will then need another use of (1.13) for the (b + c)³ which will appear.

3. Verify all the calculations after (1.46) that lead to (1.47).

4. Verify the calculations of u³, v³ and u³ − v³, and verify the claim that ω² − ω = −i√3.

5. This will study the effect of permuting the values of r1, r2 and r3 by considering some specific permutations. The first is easy: what happens to u and v if r1 is kept the same while r2 is switched with r3? Next, what happens to u and v if r1, r2 and r3 are rotated by sending r1 to r2, sending r2 to r3 and sending r3 to r1? Lastly, what happens to u and v if r1 and r2 are switched while r3 is kept the same?

6. Verify that all the permutations of r1, r2 and r3 leave the values of the right sides of the equations in (1.38) the same.

7. Assume r1 ≠ r2 and verify that one permutation of r1 and r2 leaves the right side of (1.40) the same, and one does not.

8. Assume that r1, r2 and r3 are all different and verify that the only permutation of r1, r2 and r3 that leaves the right side of (1.44) the same is the identity permutation. Do the same for (1.45).

9. Assume that r1, r2 and r3 are all different and verify that any permutation of r1, r2 and r3 can only either leave (1.47) the same, or introduce a minus sign. Continue with the assumption and determine which permutations leave (1.47) the same and which introduce a minus sign.

10. This problem addresses the fact that the discussion in this section assumed p = 0 in x³ + px² + qx + r = 0. Without the assumption p = 0, the techniques of Section 1.3.2 would be brought in to reduce the cubic to one with p = 0. The reduction to a monic cubic need not be considered, since the roots do not change under that reduction. But the reduction to p = 0 changes the roots.
The roots of the original are obtained from the roots of the reduced cubic by subtracting p/3 from the roots of the reduced cubic. (See the remarks after (1.17).) This has the effect of subtracting p/3 from each line of (1.36). Show that this has no effect on the formulas (1.44) and (1.45) for u and v, and the formula (1.47) for the intermediate square root.

1.5.3 The symmetric and the asymmetric

Effects of permuting the roots

If the problems in the previous section were done correctly, they would reveal that the effect of permuting the roots of the cubic is to move u and v among all the cube roots of

−r/2 + √((r/2)² + (q/3)³)    and    −r/2 − √((r/2)² + (q/3)³).

There is another way to express this. The cube roots of the expressions above are the solutions of the following equations with unknown y:

y³ − (−r/2 + √((r/2)² + (q/3)³)) = 0 (1.48)

and

y³ − (−r/2 − √((r/2)² + (q/3)³)) = 0. (1.49)

A clever way to combine two equations where one side is zero is to multiply them. Any solution to the product must be a solution to one of the originals.¹⁰ Thus when the roots of the cubic are permuted, the values of u and v move among the solutions to

(y³ + r/2 − √((r/2)² + (q/3)³))(y³ + r/2 + √((r/2)² + (q/3)³)) = 0. (1.50)

This simplifies to

(y³ + r/2)² − ((r/2)² + (q/3)³) = 0,

which then simplifies even further to

y⁶ + ry³ − (q/3)³ = 0. (1.51)

Of course, this is just the quadratic (1.19) with the variable z in (1.19) replaced by y³, and we have come full circle. In summary, the values of u and v move among the solutions of (1.51) when the roots of x³ + qx + r are permuted. We have already noted that the values obtained from (1.40) move among the roots of y² − (b² − 4c) when the roots of x² + bx + c are permuted, and the previous problems show that the values of (1.47) move among the roots of

y² − ((r/2)² + (q/3)³) (1.52)

when the roots of x³ + qx + r are permuted.

¹⁰This assumes that if ab = 0, then one of a or b is 0.
This property will be explored more carefully later in the notes.

Our observations so far

We can now add to comments that were started in Sections 1.2.5 and 1.2.6. Those sections discussed the requirements on formulas giving roots in terms of coefficients, and the nature of the formulas giving the coefficients in terms of the roots. We have observed the following.

1. If a polynomial is solvable by radicals, then there is a chain of intermediate values in going from the coefficients to the roots, where each new value in the chain is obtained from previous values by one of five allowable operations: addition, subtraction, multiplication, division, and the taking of n-th roots for various n.

2. The formulas giving the coefficients from the roots are simple (they are polynomials in the roots) and symmetric. Permutations of the roots do not change the values of the formulas.

3. The formulas giving the intermediate values from the roots are also polynomials in the roots, and are symmetric if the intermediate values are computed from the coefficients using only the four operations of addition, subtraction, multiplication and division. Permutations of the roots do not change these values.

4. The formulas for the intermediate values that involve the taking of n-th roots are also polynomials in the roots, but are not symmetric. Permutations of the roots do change the values of these formulas.

5. For intermediate values that change with permutations of the roots, the values are constrained to move among the roots of other polynomials.

6. The number of permutations that change an intermediate value and the number of permutations that leave that value unchanged can depend on the particular value.

Attempts to explain the observations

In 1832, Galois was able to create a unified system that took all of these observations into account.
He was then able to study the system in enough detail to tell the difference between polynomials that were "solvable" and those that were not. In the creation of his system he had the help of earlier work of mathematicians who were going off in a wrong direction.

We have observed that important intermediate values in the computation of the roots of the quadratic and cubic are themselves roots of other polynomials such as (1.41) for the quadratic, and (1.51) and (1.52) for the cubic. These polynomials are easier to solve than the original, and so the original is ultimately solvable. There are similar polynomials for the quartic. These polynomials whose roots give intermediate values are called resolvents, and an intense search for resolvents for the fifth degree polynomial had been under way for some time by 1800. The resolvents for the cubic and quartic were obtained in 1545 by tricky algebraic manipulation that was peculiar to each degree. Up to 1800, it was hoped that a unified technique could be found that would build resolvents for any degree of equation and so help solve the equation. Even though this attempt was doomed to failure, it still generated some interesting mathematics.

The formulas for the intermediate values (1.44), (1.45) and (1.47) exhibit certain symmetries even if they are not completely symmetric. Further, when the resolvents are written out in terms of the roots, other symmetries appear. This is not surprising, since the resolvents use combinations of the coefficients from the original equation, and the coefficients have symmetric expressions in terms of the roots. This led to a separate study of symmetries and permutations. The goal of the study (building useful resolvents) was not to be realized, but the study itself turned out to be more important than the unrealized goal. Two names associated to the study of permutations, Lagrange and Cauchy, will appear again in these notes.
Galois was able to take what was known about permutations and create new objects to study. He focused on the interaction between the permutations that left certain values unchanged, and the values that were unchanged by the permutations. He referred to the collection of permutations that left certain values unchanged as a "group" of permutations. The name stuck, and groups became important objects of study. Galois did not name the collections of values that were left unchanged by certain permutations, but they evolved into another set of objects called fields. These new objects and their interactions will be a major focus of these notes.

A look at things to come

This chapter has almost run its course. From this point, new objects, new behaviors, new rules, and new techniques will be introduced in fairly rapid succession. Each of these will be a new tool. This chapter had the task of emphasizing the importance of tools, but it was not the job of this chapter to introduce them all. That will be the task of the rest of these notes.

We have named much of what will occupy us. New objects, such as groups and fields, will be defined and studied. These will not be the only new objects, but they are the only ones we can list now. We will also study more familiar objects such as polynomials. The introduction of new objects is such an important concept that it deserves its own chapter, and will be the subject of Chapter 2. In that chapter other objects will be introduced, and some that are already familiar (such as the integers, real numbers, complex numbers and polynomials) will be reviewed as examples of the objects of study or as ingredients from which more examples can be constructed.

The last section of this chapter will discuss the quartic equation. It will show more of what we have already learned, so it will be labeled as optional. The effects of permuting the roots will be more complicated. This is not surprising.
There are six ways to permute three roots. Quartic equations have four roots, and there are twenty-four ways to permute them.

1.6 The quartic (optional)

This section introduces no new phenomena, but it gives richer examples of the phenomena that we have already observed. It is arranged as a series of exercises and so may be viewed as a larger project.

1.6.1 Reduction

Exercises (13)

1. Find a way to reduce the general quartic ax⁴ + bx³ + cx² + dx + e = 0 to x⁴ + qx² + rx + s = 0. Familiarity with the corresponding reduction of the cubic will make this easy.

1.6.2 The resolvent

We will introduce the idea given in [2] that lets us solve a quartic if we can solve a cubic. Once the idea is in place, getting the cubic is reasonably straightforward. The form x⁴ + qx² + rx + s has the powers 4, 2, 1, and 0 of x. The powers 4, 2 and 0 form a quadratic in the variable x², and it is possible to complete the square for the sum of those powers alone. The powers 2, 1 and 0 form a quadratic in the variable x, and it is possible to complete the square for the sum of those powers alone. However, the two sets of powers are mixed. The idea is to complete the square of both sets at the same time. If this is done so that there is no extra constant term remaining, then an equation of the form

(x² + k)² + t(x + j)² = 0

results, which can be turned into

(x² + k)² = −t(x + j)²

which can be solved easily for x. The problem is to complete the square of the group of powers 4, 2 and 0 simultaneously with the group of powers 2, 1 and 0 so there is no constant term remaining. The trick is to figure out how much of the x² term should be in one of the groups so that the remaining part of the x² cooperates with the other group.

Exercises (14)

1. In x⁴ + qx² + rx + s = 0, break qx² into zx² + (q − z)x². Complete the square of x⁴ + zx² and complete the square of (q − z)x² + rx, and show that this can
be done so there is no constant term remaining in x⁴ + qx² + rx + s = 0 if z is a solution of the cubic equation
\[ z^3 - qz^2 - 4sz + (4sq - r^2) = 0. \tag{1.53} \]
Do not attempt to solve this cubic. It is too painful. We will refer to it as the resolvent of x⁴ + qx² + rx + s = 0.

2. Assume that values of z can be found by solving the cubic (1.53), and show that the solutions to the quartic equation x⁴ + qx² + rx + s = 0 are the solutions to the two quadratics
\[ x^2 + \sqrt{z - q}\,x + \frac{z}{2} + \frac{r\sqrt{z - q}}{2(q - z)} = 0, \qquad x^2 - \sqrt{z - q}\,x + \frac{z}{2} - \frac{r\sqrt{z - q}}{2(q - z)} = 0. \tag{1.54} \]
These can be simplified somewhat, but it is not worth it.

3. Let r1 and r2 be the roots of the first quadratic in (1.54), and let r3 and r4 be the roots of the second quadratic in (1.54). Show that z = r1 r2 + r3 r4.

4. List the 24 permutations of r1, r2, r3 and r4. This is easy if you adopt the convention used in (1.39).

5. Convince yourself that of the 24 permutations of r1, r2, r3 and r4, there are 8 that do not change the value of z, and write them out. Note that 8 is one third of 24. Note also that we expect 3 solutions to (1.53). We will see later that this is not a coincidence.

6. Show that the resolvent of x⁴ − 23x² + 18x + 40 = 0 is z³ + 23z² − 160z − 4004 = 0. Show that the solutions of the resolvent equation are −22, 13 and −14. Do not solve the resolvent equation. Just plug in the numbers. A calculator will help.

7. Find the solutions of x⁴ − 23x² + 18x + 40 = 0 by finding the solutions to the two quadratics in (1.54). Let r1 and r2 be the solutions to one of the quadratics in (1.54), and let r3 and r4 be the solutions to the other quadratic in (1.54). What value do you get for r1 r2 + r3 r4? Compare this with the numbers in Problem 6 above.

8. Evaluate r1 r2 + r3 r4 with the numbers from the previous problem after applying all 24 permutations to r1, r2, r3 and r4. Compare the results with the numbers in Problem 6 above.

9.
Do problems 6 through 8 with x⁴ − 15x² − 10x + 24 = 0. You must figure out the resolvent yourself. The resolvent should have solutions 10, −14, −11.

10. Same with x⁴ − 25x² − 60x − 36 = 0. The resolvent should have solutions 0, −9, −16.

11. Same with x⁴ − 17x² − 36x − 20 = 0. The resolvent should have solutions −1, −8, −8.

Chapter 2

Objects of study

The shift from classical to modern mathematics came with a shift in the objects studied. Classical calculus might study individual functions from the reals to the reals, while modern calculus (or analysis) would study the set of all functions from the reals to the reals as a single object. While this might seem interesting, it is not obvious that it is useful. It turns out that this kind of shift is extremely useful once the properties of the new objects are sufficiently understood.

In this chapter we will introduce new mathematical objects that form the core of modern algebra. They will be motivated to various degrees by the discussions in the previous chapter. This chapter will do no more than give the definitions of the new objects and give a few examples. Some of the examples are very familiar, and we will have the opportunity to review their basic properties. We will also build new examples and will study the techniques that go into their construction. The chapter following this one will explore some very elementary properties of several of the new objects. Later chapters will focus on single objects for deeper study.

2.1 First looks

Here, we give extremely brief and non-rigorous introductions to some of the new objects. Later in this chapter, we will give full definitions.

2.1.1 Groups

For Galois, a "group" of permutations consisted of all permutations of the roots of a polynomial that kept certain values, given as formulas in the roots, unchanged. Since permutations are functions from the roots to themselves, we can compose them.
If two permutations are given names such as f and g, then we can form the composition f g (where g is applied first). If there is a formula in the roots whose value is unchanged when we apply f and also when we apply g, then we can ask what happens when we apply f g. A reasonable guess is that the value is unchanged by f g as well, and we will verify this carefully later when the proper definitions are in place. Thus a "group" of permutations as defined by Galois is closed under composition: if two permutations are in the group, then so is their composition. These and other observations were eventually gathered together to form a definition of an abstract group. The full definition will be given in Section 2.3. While there are many examples of groups, one of the best examples is the set of all permutations of a given set.

As consequences of the definitions, groups have restrictions on their internal structures. These restrictions will provide useful information in our investigations of solutions of polynomial equations. To give a simple example of such a restriction, recall that there are 6 possible permutations of the three roots of a cubic. We have seen that there are values that are unchanged by three of the six permutations and changed by the remaining three. However, we will eventually see that there can be no value that is unchanged by four of the six permutations and changed by the remaining two.

2.1.2 Fields

Several systems of numbers have come up in previous discussions, and often the same discussion had more than one system. While the coefficients used in the equation x² + x + 1 = 0 are all real, the solutions are not. Thus to discuss the polynomial x² + x + 1, the real numbers suffice, but to discuss its roots, a jump to a larger system of numbers is required. Both the real numbers and the complex numbers are self contained when the four operations of addition, subtraction, multiplication and division are applied.
Given any two real numbers and one of these operations, the result (except for dividing by zero) is another real number. The same can be said about the complex numbers. But the taking of square roots does not cooperate well with all real numbers. The square root of −1 is not real. A similar discussion could be had about rational numbers. Sums, products, etc. of rational numbers are all rational, but the square root of 2 is not rational.

With the right definitions in place, the set of all numbers left unchanged by groups of permutations of roots of polynomials has similar properties. Sums, products, etc. of numbers from the set give other numbers from the set, but this does not always work with n-th roots. Galois understood the importance of such collections of numbers and the fact that the taking of n-th roots would often require moving to a larger system of numbers. Eventually, systems of numbers preserved by sums, products, etc. and satisfying the usual laws (associative, distributive) came to be called fields. (Several names in several languages were used previously, with the English word "field" used for the first time in 1893 by E. H. Moore.) The best examples for now are the rational numbers, the real numbers, and the complex numbers.

2.1.3 Rings

Polynomials are central to our study. Polynomials share many properties with the integers. One can add, subtract, and multiply polynomials to get other polynomials, just as one can add, subtract, and multiply integers to get other integers. But dividing an integer by an integer does not always give an integer, and similarly dividing a polynomial by a polynomial does not always give a polynomial. Systems with addition, subtraction, and multiplication, but not necessarily division, are called rings. The most fundamental example of a ring is the ring of integers, and next in importance are rings of polynomials. There are even more parallels between integers and polynomials.
Primes can be discussed in both settings, as well as uniqueness of factorization, greatest common divisors, and so forth. We will investigate rings in much less depth than groups and fields. This is only because of time constraints and the limited nature of our goals, and not because rings have any lesser status as mathematical objects.

2.1.4 Homomorphisms

We have discussed several times the effect of permuting roots. If a formula such as r1 + r2 is given, we can give its value a name (such as s for sum) so that we can ask "what happens to s if we apply the permutation that switches r1 and r2?" If this permutation is f, so that f(r1) = r2 and f(r2) = r1, then we are really asking if there is a reasonable way to assign a value to the notation f(s). And if d = r1 − r2, then we are also asking if there is a reasonable way to assign a value to f(d). To be consistent with what we have been saying before, we should say that s is not changed and d is negated. That is, f(s) = r2 + r1 = f(r1) + f(r2) and f(d) = r2 − r1 = f(r1) − f(r2). Similarly, if p = r1 r2, then f(p) should be f(p) = f(r1)f(r2) = r2 r1.

At this point, we have invented the homomorphism. There are homomorphisms of groups, homomorphisms of rings, and homomorphisms of fields. Since we have been talking mostly about fields, we will say that a homomorphism from one field to another is a function f from the first field to the second so that for each two elements x and y of the first field, we have f(x + y) = f(x) + f(y) and f(xy) = f(x)f(y). You might ask about subtraction and division, but that will come in due time. Structures like groups, rings and fields interact with other groups, rings and fields through homomorphisms. The behavior of homomorphisms is important enough to declare homomorphisms as separate objects of study.

2.1.5 And more

Other objects will be defined.
It would be difficult to indicate what they are just now, and we have listed most of the important ones. In later chapters, when we have more machinery available, we will define more.

2.2 Functions

Functions are too basic to put off for long. Homomorphisms are functions, permutations are functions, and we will have more uses for functions than these.

2.2.1 Sets

Notation

Functions need sets. We assume the reader is familiar with the very basics of sets and with set operations such as union, intersection and cross product. We need notations for sets. The reader should review basic notations for sets such as:

1. The list: A = {1, 2, 4} is a set in which the only elements are 1, 2 and 4.

2. The implied list: B = {2, 4, 6, . . .} is a set whose only elements are the positive, even integers.

3. Set builder: C = {x a real number | x > 7} is a set whose only elements are those real numbers for which x > 7 is true.

We write x ∈ A to mean that x is an element of A. So in the examples above, the set A has x ∈ A true only when x is one of 1, 2 or 4. We reserve certain letters for certain sets. We use N for the set of nonnegative integers. In other notation, N = {0, 1, 2, 3, . . .}. The elements of N are called the natural numbers. We use Z for the set of integers, Q for the set of rational numbers, R for the set of real numbers, and C for the set of complex numbers.

We can denote B and C above in different ways. For example,

B = {2n + 2 | n ∈ N},   C = {x ∈ R | x > 7}.

Note that in set builder notation, {x | test on x} is the set of all x that pass the test. The notation {x | test on x} is read out loud as "the set of all x such that 'test on x' is true." This needs to be kept in mind when we discuss union and intersection.

The empty set

There is a special set, denoted ∅ and called the empty set, which has no elements. That is, a ∈ ∅ is always false.
Union and intersection

If A and B are two sets, then

A ∪ B = {x | x ∈ A or x ∈ B},   A ∩ B = {x | x ∈ A and x ∈ B},   (2.1)

where A ∪ B is called the union of A and B, and A ∩ B is called the intersection of A and B. Note that the word "or" used in the definition of A ∪ B is always interpreted so that a sentence "P or Q" is considered to be true if either one of P or Q is true or both P and Q are true, and is only false if both P and Q are false. This convention means that we are using the inclusive or. In order to invoke the exclusive or, which does not allow both P and Q to be true at the same time for the sentence to be true, we would have to use extra words and say "P or Q, but not both."

The right sides of the equalities in (2.1) are in set builder notation. So for A ∪ B, the sentence "x ∈ A or x ∈ B" is a test. If x passes this test, it is an element of A ∪ B. This should correspond to the definition of union that you have seen before. Similarly, if x passes the test "x ∈ A and x ∈ B," then x is an element of A ∩ B.

Cross product

Cross products are sets of ordered pairs. An ordered pair (x, y) has two elements, like a set with two elements, but unlike a set with two elements, one element x is considered to be the first element and the other y is considered to be the second. Thus for ordered pairs, (x, y) = (a, b) if and only if x = a and y = b, but for two element sets, {x, y} = {a, b} if and only if either x = a and y = b, or x = b and y = a. If A and B are two sets, then the cross product of A and B, written A × B, is defined as

A × B = {(a, b) | a ∈ A, and b ∈ B}.

For example, if A = {1, 2} and B = {a, b, c}, then

A × B = {(1, a), (2, a), (1, b), (2, b), (1, c), (2, c)}.

Another example is that the Cartesian plane, which consists of all (x, y) where x and y are real, can be described as R × R.

Subsets and equality

If A and B are sets, then we say that A is a subset of B and write A ⊆ B if for every a ∈ A, we have a ∈ B.
That is, every element of A is an element of B. If A and B are sets, then we say A = B if both A ⊆ B and B ⊆ A are true.

The definition of subset used the phrase "for every" in an essential way. This phrase occurs frequently enough in mathematical statements to have a symbol of its own. When ∀x is written followed by a condition on x, then for the entire statement to be true, the condition on x must be met for every possible x. The phrase "for every possible x" is a bit vague, so it is very common to write ∀x ∈ A, where A is some set, so that the meaning is changed to "for every possible x in the set A."

At this point, we desperately need an example. The definition of A ⊆ B can now be written

∀x ∈ A, (x ∈ B).

According to the rules of ∀ given above, this is true if x ∈ B is true for every possible x ∈ A. Of course, this is just a repetition of the definition of A ⊆ B that we gave in words three paragraphs back. The importance of the ∀ symbol is that it emphasizes what must be proven if a statement with "for all" in it is to be proven.

One standard way to prove a statement starting with ∀x ∈ A is to prove that the condition that follows holds for an unspecified element of A that we call x. If the proof is successful, then we can claim that the proof would work with x taking on any value in A. This is often worded in the proof by saying "let x be an arbitrary element of A." This is so standard that often this sentence is abbreviated to "let x be in A." Of course, to be arbitrary, no further restrictions can be placed on x. To follow "let x be an arbitrary element of A" with "assume x = 7" would not result in a proof of anything that holds for all x. We illustrate this with a proof of the following.

Lemma 2.2.1 If A ⊆ B and B ⊆ C, then A ⊆ C.

Proof. We must prove ∀x ∈ A, (x ∈ C). Let x be in A. Since A ⊆ B, we know that x ∈ B. Since B ⊆ C, we know that x ∈ C.
We could end the proof with an explanation that since our proof worked for an unrestricted element of A, symbolized by x, we have proven what is needed for every element of A. However, this final argument is so well known that it is usually left out. If you are more comfortable saying "since x ∈ A was arbitrary, we have proven ∀x ∈ A, (x ∈ C)," then you can do so. Also, if you are more comfortable, you may say "let x ∈ A be arbitrary" instead of the shorter "let x be in A." However, we will not include such extra words in these notes.

Lemma 2.2.2 If A = B and B = C, then A = C.

Proof. We must prove A ⊆ C and C ⊆ A. But A = B and B = C mean that A ⊆ B, B ⊆ A, B ⊆ C and C ⊆ B are all true. By two applications of the previous lemma, A ⊆ C and C ⊆ A are both true, and we are done.

Note that the proof of this lemma did not need to refer to the definition of subset. This is because we had a useful fact about the subset relation already proven that we could make use of. One of the greater difficulties in a math course is to keep track of all the facts that have been proven and to make use of them at opportune times. A standard trap that students fall into is to try to prove everything from the definitions.

Disjoint sets

If X and Y are two sets, we say that X and Y are disjoint if X ∩ Y = ∅. That is, there is no element that is simultaneously in both sets. If P is a collection of sets (yes, a set of sets), then the collection is said to be of pairwise disjoint sets if for any two sets X and Y from P that are not equal, we have that X and Y are disjoint. That is, two different sets in P cannot overlap.

The collection consisting of the three sets {1, 3}, {2, 5} and {4, 6} is a collection of pairwise disjoint sets. No two of the sets that are different have any element in common. The collection consisting of the three sets {1, 3}, {2, 5} and {1, 6} is not a collection of pairwise disjoint sets.
The second set has nothing in common with the first and third, but the first and third sets have 1 in common. Note that in this example, if we take the intersection of all three of the sets in the collection, we get the empty set. So intersecting all the sets in a collection to see if you get the empty set is not a valid test of whether the collection is of pairwise disjoint sets.

Exercises (15)

If A and B are sets, prove:

1. A ⊆ (A ∪ B).

2. (A ∩ B) ⊆ A.

3. (A ∩ B) ⊆ (A ∪ B).

4. If A ⊆ B and C ⊆ D, then (A × C) ⊆ (B × D).

5. Let A and B be two sets. Argue that showing ∀x ∈ A, (x ∉ B) is enough to show that A and B are disjoint. If this seems too skimpy, then review the notion of contrapositive from previous courses and recall that a statement and its contrapositive always have the same truth value. Verify that the contrapositive of "if x ∈ A, then x ∉ B" is "if x ∈ B, then x ∉ A." Use this to convince yourself that if ∀x ∈ A, (x ∉ B) has been shown, then showing ∀x ∈ B, (x ∉ A) adds nothing new.

2.2.2 Functions

If f : A → B is a function from the set A to the set B, then for every a ∈ A, the expression f(a) represents an element of B. There are two rules that are so self evident that it may seem they need no explanation. They nonetheless need to be invoked whenever it is claimed that a function has been specified. The first rule is that f(a) has to specify an element of B for every a ∈ A. The second rule is that if a = a′ and a and a′ are elements of A, then f(a) = f(a′). That is, f(a) and f(a′) denote the same element of B. If this is thought to be obvious, note that f(a) and f(a′) are not identical as printed on the page, so it is necessary to point out that there is a reason that they are equal.

Consider the following "definition" of a function from the rationals to the rationals. We declare that f(m/n) = 1/m. Now 2/3 = 4/6 as elements of Q, but f(2/3) = 1/2 and f(4/6) = 1/4. Note that 1/2 ≠ 1/4 as elements of Q.
Further, f(0/3) = 1/0, and 1/0 does not represent an element of Q. So our "definition" violates both rules of a function.

Our emphasis has been on notation. However, functions are not just viewed as grammatical constructs. One usually thinks of a function f : A → B as "taking" or "sending" each element a of A to its corresponding element f(a) in B. The element f(a) in B is often referred to as the image of a (under f).

Exercises (16)

1. Which rule(s) in the definition of a function are violated by the following definition for f : Q → Q? Set f(m/n) = 1/n.

2. Which rule(s) in the definition of a function are violated by the following definition for f : Q → Q? Set f(m/n) = n/m.

3. Which rule(s) in the definition of a function are violated by the following definition for f : Q → Q? Set f(m/n) = 2m/2n.

4. Which rule(s) in the definition of a function are violated by the following definition for f : Q → Q? Set f(m/n) = n/n.

2.2.3 Function vocabulary

If f : A → B is a function, then A is called the domain of f and B is called the range of f. If U ⊆ A is a subset of A, then we would like to define f(U), the image of U, to be the set of all elements in B that are the images of elements of U. We could attempt to use the notation f(U) = {f(a) | a ∈ U} for this. However, this does not fit exactly the conventions we have for set builder notation. Rather than try to stretch the notation for set builder (and admittedly, some books do), we will introduce a symbol for "there exists" for two reasons. It will give notation for f(U) that doesn't stretch the conventions of set builder notation, and it helps outline proofs.

We write ∃x followed by a condition on x to mean that there is some x that makes the condition true. As with ∀, we often restrict where the x can come from that makes the condition true by writing ∃x ∈ A followed by the condition on x, to mean that there is some x in the set A that makes the condition true.
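As an aside that is not part of the notes: when the sets involved are finite, the quantifiers ∀ and ∃ behave exactly like Python's built-in all and any, which can make the notation concrete. A small sketch (the particular sets A and B are arbitrary choices):

```python
# ∀x ∈ A, (x ∈ B)   corresponds to   all(x in B for x in A)
# ∃x ∈ A, (x > 2)   corresponds to   any(x > 2 for x in A)
A = {1, 2, 4}
B = {1, 2, 3, 4, 5}

assert all(x in B for x in A)          # A ⊆ B: every element of A is in B
assert any(x > 2 for x in A)           # some element of A exceeds 2
assert not any(x in B for x in set())  # an ∃ over the empty set is always false
```

Note that all over the empty set is vacuously true, just as ∀x ∈ ∅, (anything) is true because a ∈ ∅ is always false.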
Now given f : A → B and U ⊆ A, we can define the image of U as

f(U) = {x ∈ B | ∃a ∈ U, f(a) = x}.

This fits with our previous use of set builder notation.

Onto functions

We can use ∃ to define a condition on functions. We say that a function f : A → B is onto or a surjection if

∀x ∈ B, ∃a ∈ A, f(a) = x.

In words, every x in B is the image of some a in A. We illustrate the use of ∃ in proofs by proving the following lemma. The lemma discusses the composition of functions, which should be familiar from calculus.

Lemma 2.2.3 If f : A → B and g : B → C are both onto, then so is their composition gf : A → C.

Proof. Let c be in C. Since g is onto, we can let b ∈ B be such that g(b) = c. Since f is onto, we can let a ∈ A be such that f(a) = b. Now gf(a) = g(f(a)) = g(b) = c. Since we have found an a ∈ A for which gf(a) = c, we know that ∃a ∈ A, gf(a) = c is true.

Comments

In the paragraphs before Lemma 2.2.1, we discussed the procedure for proving a conclusion having a ∀, and the procedure was illustrated in the proof of that lemma. Lemma 2.2.3 illustrates how to use an ∃ that comes from the hypothesis, and how to prove a conclusion having an ∃. The ∃ in the conclusion is easy. To prove that "there exists an x with a certain property," one only needs to find some x with the property. However, the correct use of ∃ when it is assumed is more subtle. In Lemma 2.2.3, both f and g are assumed to be onto, so the definition of being onto gets to be used twice. Each time we know something exists with a certain property, so twice we are allowed to bring a value into the proof with that property. The rule that must be followed here is that there must be no other restriction on the value that is brought in. As an example, if we know that some function f from the reals to the reals is onto, then there is an x so that f(x) = 13. We are not allowed to say "let x be such that f(x) = 13 and also assume that x = 7." The extra restriction is not legal.
In the proof of Lemma 2.2.3, we used the letter b the first time we invoked the definition of onto, and the letter a the second time. It would not have been legal to use the same letter both times, since the second time the letter was used, we would have been introducing the extra restriction "and assume the new value is the same as the one we introduced earlier." Unfortunately, this is not a very compelling example, since had we used the same letter twice, we also would have ended up with the odd looking statement f(b) = b. We will see more compelling examples later.

Note that we have not covered how to use a ∀ that is assumed to be true. This is quite easy. If "∀x ∈ S, some statement" is assumed to be true, and you know p is in S, then you can write down that the statement must be true about p.

One-to-one functions

A function f : A → B is one-to-one or an injection if ∀x, y ∈ A, if f(x) = f(y), then x = y. That is, if two elements in A have the same image in B, then they must have been the same element in the first place.

Lemma 2.2.4 If f : A → B and g : B → C are both one-to-one, then so is their composition gf : A → C.

The proof will be left as an exercise.

One-to-one correspondences

A function f : A → B is a one-to-one correspondence or a bijection if it is both one-to-one and onto.

Lemma 2.2.5 If f : A → B and g : B → C are both one-to-one correspondences, then so is their composition gf : A → C.

The proof will be left as an exercise.

Domain, range and image

If f : A → B is a function, then A is the domain of f and B is the range of f. The subset f(A) of B will be called the image of f. Note that not all books agree with this use of these words. However, we will be consistent with these definitions throughout these notes. Note that the function f above is onto if and only if its range coincides with its image.

Exercises (17)

1. Prove Lemma 2.2.4.

2. Prove Lemma 2.2.5.

3.
Use calculus to argue that f(x) = x³ + x is one-to-one from the reals to the reals.

4. Prove that f from the rationals to the reals given by f(x) = (x + √2)² is one-to-one. This needs the fact that √2 is not rational. Show that the same formula gives a function from the reals to the reals that is not one-to-one.

2.2.4 Inverse functions

With the basic notions from Section 2.2.3 in hand, we can discuss the important concept of "reversing" a function.

Inverse images

If f : A → B is a function, and S ⊆ B, then

f⁻¹(S) = {x ∈ A | f(x) ∈ S}

is called the inverse image of S under f. In words, it is the set of elements in A that f takes into S. Notice that f⁻¹ takes sets (subsets of B) and gives sets in return (subsets of A). Note also that it is possible that f⁻¹(S) can be empty. This can happen if f is not onto. For example, if f(x) = x² from the reals to the reals, then f⁻¹({−1}) is the empty set. Also note that f⁻¹(S) can have more than one element even if S has only one element. This can happen if f is not one-to-one. For example, f⁻¹({1}) = {−1, 1} using the same f as above.

Inverses of one-to-one correspondences

If a function f is a one-to-one correspondence, then f⁻¹ has some special properties.

Lemma 2.2.6 If a function f : A → B is a one-to-one correspondence, then for every b ∈ B, f⁻¹({b}) is a set with one element of A.

The proof will be left as an exercise. Lemma 2.2.6 says that for a bijection f : A → B, we can think of f⁻¹ as a function. Each single element of B has one and only one element of A associated to it by f⁻¹. When this happens, we no longer think of f⁻¹ as taking sets to sets, and we write f⁻¹(b) for b ∈ B instead of f⁻¹({b}). We also call f⁻¹ the inverse function of the bijection f and write f⁻¹ : B → A. This puts us in the unfortunate position of having one notation f⁻¹ that means one thing in one situation and another in a different situation.
There is nothing to be done about it since the ambiguity is thoroughly embedded in mathematical writing.

Lemma 2.2.7 If a function f : A → B is a bijection, then for all a ∈ A we have f⁻¹(f(a)) = a, and for all b ∈ B we have f(f⁻¹(b)) = b.

The proof will be left as an exercise.

We can add to Lemma 2.2.6.

Lemma 2.2.8 If a function f : A → B is a bijection, then so is f⁻¹ : B → A.

The proof will be left as an exercise.

Lastly, we give a converse to Lemma 2.2.7.

Lemma 2.2.9 Let f : A → B and g : B → A be functions so that for all a ∈ A and b ∈ B the equalities g(f(a)) = a and f(g(b)) = b hold. Then f and g are bijections and g = f⁻¹.

The proof will be left as an exercise.

Exercises (18)

The following require very careful reviews of the definitions.

1. Prove Lemma 2.2.6.
2. Prove Lemma 2.2.7.
3. Prove Lemma 2.2.8.
4. Prove Lemma 2.2.9. Give examples to show that both assumptions are necessary. That is, if one only assumes for all a ∈ A that g(f(a)) = a, the functions are not necessarily bijections. And similarly, if one only assumes for all b ∈ B that f(g(b)) = b, the functions are not necessarily bijections. In each case something can be said about each of the functions. What can be said?

2.2.5 Special functions

We introduced homomorphisms in Section 2.1.4. If we wanted to bring them in formally at this point, then this is where the discussion would go. However, it would be better to go into detail after other objects have been more carefully introduced. Homomorphisms are functions, but they have extra restrictions. Thus we give the title "Special functions" to this brief section. Another class of special functions is that of homomorphisms that are also bijections as functions. These turn out to be crucial to our investigations and will be announced with fanfare when the time is appropriate.

2.3 Groups

We used permutations in Section 2.1.1 to motivate one of the properties that groups will have.
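For a bijection between finite sets, the inverse function of Lemma 2.2.6 can be built by swapping each pair (x, f(x)), and the two identities of Lemma 2.2.7 can then be checked directly. A sketch, with our own names:

```python
f = {1: 'b', 2: 'c', 3: 'a'}           # a bijection from {1,2,3} to {a,b,c}

# Swap each pair (x, f(x)) to (f(x), x); this is f inverse as a function.
f_inv = {v: k for k, v in f.items()}

# Lemma 2.2.7: f_inv(f(a)) = a for all a in A, and f(f_inv(b)) = b for all b in B.
print(all(f_inv[f[a]] == a for a in f))      # True
print(all(f[f_inv[b]] == b for b in f_inv))  # True
```

Note that the dict-flipping trick only produces a function when f is one-to-one; if two keys shared a value, one of the swapped pairs would silently overwrite the other.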
Rather than build up more motivation we will jump right to the definition, and then show that permutations fit the extra requirements that we will list.

There are two ways to describe groups. One is as a set with one extra structure, and the other is as a set with three extra structures. Since two of the three structures can be deduced from the third, it is more typical to take the efficient path and describe groups as having only one extra structure attached to the set. We will start with the more efficient path, and will point out the other later.

If we describe a group as consisting of a set with an extra structure, then there is a need to refer to two items: the set and the structure. It turns out that once one is used to groups, the extra structure is often (but not always) not given any symbol. To be clear at the beginning, we will give a symbol for the structure.

2.3.1 The definition

The one structure version

A group is a pair (G, ·) where G is a set and · represents a multiplication that will have to obey some restrictions. To say that · is a multiplication means that if g and h are in G, then g · h also represents an element of G. The element g · h should be determined uniquely by g and h in that if g = g′ and h = h′, then g · h = g′ · h′. We call · a multiplication since we regard g · h as the "product" of g and h. The restrictions on · are as follows.

1. For all f, g, h in G, we have (f · g) · h = f · (g · h).
2. There is an element e ∈ G so that for all g ∈ G the equalities e · g = g and g · e = g hold.
3. For every element g ∈ G there is an element g⁻¹ ∈ G so that g · g⁻¹ = e and g⁻¹ · g = e hold, where e is as described in the previous restriction.

Often the first restriction is called the associative axiom, the second the identity axiom, and the third the inverse axiom. The element e is called "the" identity of the group, and the element g⁻¹ is called "the" inverse of g.
The word "the" is in quotes in two places since we have not yet proven that there is only one element of G that acts as in the second axiom, and that each g has only one element that acts as in the third.

The definition just given is an example of a definition by properties. Anything satisfying these properties is a group. When a definition is given this way, it is important that several examples be brought in to make the definition more real. We will do that as soon as we give the "other" definition.

2.3.2 Operations

The multiplication · in the definition of a group is an operation. It takes two elements of the group, combines them and gives another element of the group in return. The operation · is more specifically called a binary operation since it combines two elements. The requirement that the result of combining two elements be uniquely determined by the elements being combined makes it a function. The domain of the function is all pairs of elements of the group and the range is the group itself. Thus · is a function · : G × G → G. However, we write a · b for the image of (a, b) instead of the usual functional notation ·(a, b).

The fact that we use ordered pairs says that order is important. Many operations that you know, such as +, can ignore order. These are commutative operations. But others, such as −, cannot ignore order and are not commutative. Another non-commutative operation that you know is matrix multiplication.

Operations do not have to be binary. An operation might combine three elements at a time and be called ternary, or "combine" only one element and be called unary. The "inverse" operation taking g to g⁻¹ is unary. Lastly, there are operations that take no inputs at all. These could be called "zeroary" but the word is hard to pronounce and they are called constants instead. That is, the function always gives the same value and needs no inputs to help decide what value to give.
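Because the three axioms are quantified over all elements, a finite candidate group can be checked exhaustively. The sketch below (the function name and test sets are our own) takes a multiplication given as a table and tests associativity, the existence of an identity, and the existence of inverses; we try it on the two-element set {1, −1} under ordinary multiplication.

```python
from itertools import product

def is_group(G, mul):
    # mul is a dict mapping pairs (g, h) to the product g * h.
    # Associative axiom: (f*g)*h == f*(g*h) for all triples.
    if any(mul[mul[f, g], h] != mul[f, mul[g, h]] for f, g, h in product(G, repeat=3)):
        return False
    # Identity axiom: some e with e*g == g == g*e for all g.
    ids = [e for e in G if all(mul[e, g] == g == mul[g, e] for g in G)]
    if not ids:
        return False
    e = ids[0]
    # Inverse axiom: every g has some h with g*h == e == h*g.
    return all(any(mul[g, h] == e == mul[h, g] for h in G) for g in G)

G = {1, -1}
mul = {(a, b): a * b for a, b in product(G, repeat=2)}
print(is_group(G, mul))  # True
```

The set {0, 1} under ordinary multiplication fails the same test, since 0 has no inverse; this is the same obstruction discussed below for (R, ·).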
The three structure version

Now we can define a group as a quadruple (G, ·, ⁻¹, e) consisting of a set with three operations: · which is binary, ⁻¹ which is unary, and e which is a constant. The operations satisfy the following three axioms, where f, g and h represent arbitrary elements of G.

1. (f · g) · h = f · (g · h).
2. e · g = g = g · e.
3. g · g⁻¹ = e = g⁻¹ · g.

It is easy to jump to the conclusion that this version of the definition makes "the" identity unique and "the" inverse of an element g unique. It does make the element that we call the identity unique, but it still does not require that only one element behave as in the second axiom. This turns out to be true, but needs to be proven. This will be done eventually. Similar remarks apply to "the" inverse.

2.3.3 Examples

These examples will be given with one operation.

Abelian examples

If + represents the usual addition, then all of (C, +), (R, +), (Q, +), (Z, +) are groups. These groups satisfy the extra requirement x + y = y + x for all x and y in the group. Such groups are called abelian or commutative groups. It is traditional but not required to use + for the binary operation in an abelian group. Some examples have traditional notation of their own that takes precedence. In all of the examples given so far, 0 is the identity element and this makes −x the inverse of x. Using 0 for the identity of an abelian group and −x for the inverse of an element in an abelian group is also traditional but not required.

The structure (R, ·) where · represents the usual multiplication is not a group. The element 1 is the only possible identity, but 0 then has no inverse. There is no real number x so that 0 · x = 1. However, if we let R⁺ = {x ∈ R | x > 0}, then (R⁺, ·) is an abelian group with identity 1. In fact, we can include all the elements other than the troublesome 0. If we let R* = {x ∈ R | x ≠ 0}, then (R*, ·) is an abelian group with identity 1.
Note that it would be impossibly confusing to insist that the operation in these examples be written + and the identity be written as 0, so these examples are an exception to the tradition mentioned above.

Another abelian group that we will write multiplicatively is the group of all complex numbers of modulus 1. If we let C1 = {z ∈ C | |z| = 1}, then (C1, ·) is an abelian group. It is often referred to as the circle group.

Non-abelian examples

Matrix multiplication is not commutative. There are also identity matrices, but there are many. Also, multiplication is not always defined. We can fix that by picking a particular size. We can let Mn be the set of all n × n matrices with real entries, and let In be the n × n identity matrix. Now we have the problem that not every n × n matrix has an inverse. If we let Mn′ be the set of all n × n matrices with real entries and non-zero determinant, then Mn′ and matrix multiplication give a group that is non-abelian as long as n > 1.

Note that we do not give a symbol for the multiplication since matrix multiplication is typically written without a symbol. That is, AB is the product of the matrix A with the matrix B. In fact, from now on, we will typically not give a symbol for the binary operation in a non-abelian group. That means that a particular example will have to have its multiplication given by words. Even in abelian examples, the operation often has no symbol. The example (R*, ·) given above is usually given as R* with ordinary multiplication and the product of x and y is written as xy.

You should review the proof that matrix multiplication is associative. It is not always easy to prove that a particular example of a group satisfies all the requirements.

2.3.4 The symmetric groups

Definition

The next examples are so important that they get their own section. If X is a set, then a permutation on X is a bijection from X to X. One particular bijection from X to X is the identity function e.
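Two of the claims about Mn′ can be illustrated for 2 × 2 matrices with a few lines of hand-rolled arithmetic (the names and the particular matrices are our own). Closure under multiplication follows from the identity det(AB) = det(A)det(B): if both determinants are non-zero, so is their product. Non-commutativity shows up already in one example.

```python
def mat_mul(A, B):
    # Product of 2x2 matrices given as ((a, b), (c, d)).
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = ((1, 2), (3, 4))   # det(A) = -2, so A is invertible
B = ((0, 1), (1, 0))   # det(B) = -1, so B is invertible

# det(AB) = det(A)det(B): a product of invertible matrices is invertible.
print(det(mat_mul(A, B)) == det(A) * det(B))  # True
# Matrix multiplication is not commutative: here AB != BA.
print(mat_mul(A, B) != mat_mul(B, A))  # True
```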
That is, e(x) = x for every x ∈ X. From this it is trivial to check that for any bijection f : X → X, the compositions ef and fe both equal f. From Lemma 2.2.5, the composition of bijections is a bijection. From Lemma 2.2.8, the inverse of a bijection is a bijection. From Lemma 2.2.7, if f : X → X is a bijection, then the compositions f f⁻¹ and f⁻¹f both equal e. Lastly, functional composition is associative. We see this from

(f(gh))(x) = f((gh)(x)) = f(g(h(x)))

and

((fg)h)(x) = (fg)(h(x)) = f(g(h(x))).

Reading this quickly will not convince you that anything is going on. Read it again while keeping careful track of parentheses.

The set of all permutations on X will be written as SX and we have argued that SX under the operation of functional composition is a group with identity e. It is typically non-abelian, which we will show by looking at particular examples. The group SX is usually called the symmetric group on (or of) X.

Examples and notation

If X = {1, 2, . . . , n}, then SX is usually denoted Sn. Each element of Sn is a permutation of the integers from 1 through n. There are two standard notations for such permutations. We give one now and the other in a later chapter after some of the structures are better understood. If σ ∈ Sn, then to describe σ, we must say what σ(i) is for 1 ≤ i ≤ n. The notation

σ = ( 1     2     3     · · ·  n
      σ(1)  σ(2)  σ(3)  · · ·  σ(n) )        (2.2)

describes σ by having each column give the pair (i, σ(i)) with i on the top line and σ(i) on the bottom line. The notation in (2.2) is called Cauchy notation after one of the earliest mathematicians to investigate the properties of permutations and possibly the first to use the notation. The top line might seem redundant since, the way it is given in (2.2), it is predictable. However, the notation does not require that the top line appear in numerical order, and the notation works for permutations on sets other than {1, 2, . . . , n}.
For example,

( 1 2 3        ( 3 1 2
  3 1 2 )  and   2 3 1 )

represent the same element of S3, and

( a x h
  x h a )

represents a permutation on the set {a, x, h}.

Since Cauchy notation completely describes a permutation, there is enough information in the notation to calculate compositions. Let us calculate with the permutations below from S3:

σ = ( 1 2 3        τ = ( 1 2 3
      2 3 1 ),           3 2 1 ).        (2.3)

Remembering that permutations are functions, we compose from right to left. That is, the permutation on the right is applied first. With this rule, we have

στ = ( 1 2 3   ( 1 2 3   =  ( 1 2 3
       2 3 1 )   3 2 1 )      1 3 2 ),

and

τσ = ( 1 2 3   ( 1 2 3   =  ( 1 2 3
       3 2 1 )   2 3 1 )      2 1 3 ).

To make sure that the information is being interpreted correctly, we do two of the six calculations that go into the two compositions above. We have

(στ)(1) = σ(τ(1)) = σ(3) = 1,

and

(τσ)(1) = τ(σ(1)) = τ(2) = 2.

In particular, note that στ ≠ τσ and S3 is not abelian.

Identity and inverse

The identity in Sn is

( 1 2 3 · · · n
  1 2 3 · · · n ).

If σ is a permutation in Sn and it takes i to σ(i), then σ⁻¹ is required to take σ(i) back to i. Thus for each column (i over σ(i)) in the Cauchy notation for σ, there must be a column (σ(i) over i) in the Cauchy notation for σ⁻¹. Thus if σ is given as in (2.2), then

σ⁻¹ = ( σ(1)  σ(2)  σ(3)  · · ·  σ(n)
        1     2     3     · · ·  n    ).

This is one reason for not requiring that the top line be in numerical order. If you feel compelled to put the top line in numerical order, you may do so. Thus for σ and τ as given in (2.3), we have

σ⁻¹ = ( 2 3 1   =  ( 1 2 3          τ⁻¹ = ( 3 2 1   =  ( 1 2 3
        1 2 3 )      3 1 2 ),              1 2 3 )      3 2 1 ).

Notice that τ⁻¹ = τ.

Exercises (19)

1. Consider the following elements of S4:

σ = ( 1 2 3 4        τ = ( 1 2 3 4        λ = ( 1 2 3 4
      2 3 4 1 ),           2 1 3 4 ),           1 3 4 2 ).

Compute the following. See what patterns you can find.

σ², σ³, σ⁴, σ⁻¹, σ⁻², σ⁻³, στ, τσ, στσ⁻¹, σ²τσ⁻², σ³τσ⁻³, λ², λ³, λ⁻¹, λ⁻², λτ, τλ, λτλ⁻¹, λ²τλ⁻².

2. What is the identity element in C1?
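Cauchy notation translates naturally into a Python dict mapping i to σ(i), which makes the compositions above easy to check (and is a handy way to verify answers to the exercises). The function names below are our own. The sketch redoes the S3 computation from (2.3), composing from right to left.

```python
def compose(s, t):
    # (s t)(i) = s(t(i)): the permutation on the right is applied first.
    return {i: s[t[i]] for i in t}

def inverse(s):
    # Reverse each column (i, s(i)) to (s(i), i).
    return {v: k for k, v in s.items()}

sigma = {1: 2, 2: 3, 3: 1}   # sigma from (2.3)
tau   = {1: 3, 2: 2, 3: 1}   # tau from (2.3)

print(compose(sigma, tau))   # {1: 1, 2: 3, 3: 2} -- the bottom line of (2.3)'s sigma tau
print(compose(tau, sigma))   # {1: 2, 2: 1, 3: 3} -- tau sigma differs, so S3 is not abelian
print(inverse(tau) == tau)   # True: tau is its own inverse
```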
What is the simplest way to give the inverse of any z ∈ C1?

2.4 The integers mod k

The next examples are of extreme importance. They will appear repeatedly, and will lead to other examples that are even more important. To give the examples, we need more tools. The techniques that we use to build the examples and the techniques that we use to verify some of the properties will be used repeatedly in these notes. Thus the techniques are as important to learn as the examples themselves. We start with a discussion of the techniques used in the construction.

2.4.1 Equivalence relations

Relations

If X is a set, then some elements in X might be related to other elements in X. If X is a set of people, then one person might be a cousin of another person. The statement for an x and y in X that "x is a cousin of y" is then either true or false. A relation will always have a "value" that is either true or false. Some relations, such as "is a cousin of", are symmetric. If x is a cousin of y, then y is also a cousin of x. Some relations, such as "is a parent of", are not symmetric. We look at symmetry and two other properties of mathematical relations.

If X is a set, then a binary relation on X is a property that any given pair (x, y) of elements from X can either have or not have. We say binary since we look at pairs of elements, and not triples or other combinations. In these notes we will only be concerned with binary relations.

Examples of relations

Some mathematical relations are famous and have their own symbols. The relation "is less than" has its own symbol < and we write x < y to indicate that x is less than y. Thus 2 < 3 is true, and 3 < 2 is false. Other relations in this family are >, ≤, ≥. These relations can be applied to R, Q, Z or N. There is no reasonable relation < that works well on C.

Another relation that applies to N or Z is "divides" and has its own symbol. We write m|n for "m divides n." We will have more to say about this relation shortly.
A symbol that is often used for an arbitrary binary relation is ∼. Like the relations above, it is written between the two elements. To say that ∼ is a relation on a set X means that for each pair (x, y) of elements of X, either x ∼ y is true or x ∼ y is false. It is possible to make up myriads of examples.

1. On Z, define x ∼ y to mean x + y is even.
2. On Z, define x ∼ y to mean xy is even.
3. On R, define x ∼ y to mean x > y².
4. On C, define x ∼ y to mean xy is real.

The relation "divides"

The relation "divides" is one of the most important relations in these notes. Here we treat it carefully. If m and n are elements of Z, then we write m|n to mean that there exists a k ∈ Z so that n = mk. Of course, we can reword this to say that n is a multiple of m and have exactly the same meaning, but the emphasis of "divides" over "is a multiple of" is traditional. Thus 2|4 is true and 2|5 is false. Note that 0|0 is true. This is in spite of the fact that 0/0 has no sensible value that can be assigned to it. Note that 0|2 is false, and in general 0|n is false for every n ∈ Z with n ≠ 0.

The setting Z greatly affects the nature of "divides." If we take the definition of m|n and replace every appearance of Z by R, then we get a relation on R in which x|y is true for all pairs (x, y) except those where x = 0 and y ≠ 0. We will not be interested in this relation at all except as an example.

Equivalence relations

We investigate three properties of relations. Let ∼ be a binary relation on X.

1. We say that ∼ is reflexive if for all x ∈ X, x ∼ x is true.
2. We say that ∼ is symmetric if for all x and y in X, if x ∼ y, then y ∼ x.
3. We say that ∼ is transitive if for all x, y and z in X, if x ∼ y and y ∼ z, then x ∼ z.

Easy examples are that ≤ is reflexive and < is not. Neither is symmetric. The relation on Z defined by x ∼ y means "x + y is even" is symmetric. Both ≤ and < are transitive. As an exercise, you will show that "divides" is transitive.
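On a finite slice of Z, the three properties can be tested exhaustively, which is a useful sanity check before writing the proofs asked for in the exercises. The sketch below (names ours) checks example 1 above and the "divides" relation.

```python
from itertools import product

def is_reflexive(X, rel):
    return all(rel(x, x) for x in X)

def is_symmetric(X, rel):
    return all(rel(y, x) for x, y in product(X, repeat=2) if rel(x, y))

def is_transitive(X, rel):
    return all(rel(x, z) for x, y, z in product(X, repeat=3)
               if rel(x, y) and rel(y, z))

X = range(-5, 6)
sum_even = lambda x, y: (x + y) % 2 == 0            # example 1: x + y is even
divides  = lambda m, n: n % m == 0 if m != 0 else n == 0   # m|n, with 0|0 true

print(is_reflexive(X, sum_even), is_symmetric(X, sum_even),
      is_transitive(X, sum_even))      # True True True
print(is_symmetric(X, divides))        # False: 1|2 is true but 2|1 is false
```

Of course, passing on a finite slice is evidence, not proof; the exercises ask for the proofs.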
Convince yourself that "is a cousin of" is not transitive.

A relation ∼ on X that is reflexive, symmetric and transitive is called an equivalence relation. The relation = is the most obvious example of an equivalence relation.

An important example

The following example will be our most important example for the time being. It is important enough to give it a special symbol. Pick a k in Z with k ≠ 0. We define a relation ∼k on Z by saying for x, y ∈ Z,

x ∼k y means k|(y − x).        (2.4)

The fact that this is an equivalence relation will be left as a set of exercises.

Exercises (20)

1. Prove that "divides" on Z is transitive and reflexive. Show by example that it is not symmetric.
2. Prove that if k|x and k|y in Z, then k|(x + y), k|(−x), and k|(x − y) all hold.
3. Prove that if k|x and a ∈ Z, then k|(ax).
4. Let X be a set of sets. (Yes, that is allowed. For example, X might be all of the subsets of a set A.) Prove that = on X is an equivalence relation. The first step is to realize that there is something to prove. Remember that = on sets has a definition. Using the definition of equality of sets in Section 2.2.1, prove that = is reflexive, symmetric and transitive.
5. Prove that ∼k on Z defined in (2.4) is an equivalence relation. This breaks into three proofs which are quite easy, but still need to be written down carefully.
6. Give five different elements t in Z so that t ∼3 0. Give five different elements t in Z so that t ∼3 1. Give five different elements t in Z so that t ∼3 −1.
7. Prove that ∼−k is the same as ∼k. That is, x ∼k y if and only if x ∼−k y.

2.4.2 Equivalence classes

Equivalence relations are used to build equivalence classes. Equivalence classes break up a set in a very specific way. We first write down the kind of break up that interests us, then define equivalence classes that come from an equivalence relation, and then show that the equivalence classes give us the break up with the desired properties.
Partitions

A partition of a set X is a collection P of subsets of X with the following three properties.

1. Each set in the collection P is a non-empty subset of X. That is, every S in P has S ⊆ X and S ≠ ∅.
2. The union of the sets in the collection P is all of X. This is the same as saying that for every x ∈ X, there is an S in P with x ∈ S.
3. The collection P consists of pairwise disjoint sets. That is, for any two sets S and T from the collection P, either S = T or S and T are disjoint.

Equivalence classes

If ∼ is an equivalence relation on a set X, then for each x ∈ X, we define

[x] = {y ∈ X | x ∼ y}.        (2.5)

We refer to [x] as the equivalence class of x. Sometimes we say "containing x" instead of "of x" for emphasis, and sometimes we add "in X" or "under ∼" or both if the extra clarity is needed.

Given X and ∼, the set [x] is completely determined by x. This makes x look special to [x]. This is not the case. The next lemma shows that any element of [x] determines [x].

Lemma 2.4.1 If ∼ is an equivalence relation on X, if x is in X, and if z is in [x], then [z] = [x].

Proof. If z ∈ [x], then x ∼ z, and by symmetry z ∼ x. Using x ∼ z, we have that if t ∈ [z], then z ∼ t, and x ∼ z with z ∼ t implies by transitivity that x ∼ t, so t ∈ [x]. This shows that [z] ⊆ [x]. Using z ∼ x, an argument like the previous sentence that reverses the roles of z and x shows that [x] ⊆ [z].

Lemma 2.4.1 justifies our saying that any element of an equivalence class represents or is a representative of that class.

From the definition (2.5) it is clear that [x] is a subset of X. The next proposition shows that the collection of equivalence classes forms a partition of X. We write out the proof in full to illustrate one technique of proof that must be learned. The third item in the definition of a partition says that an "or" is true.
The standard way to prove that an "or" is true is to assume that one of the possibilities is false, and use that to prove that the other possibility must then be true. Since there are two possibilities in the third item of the definition, we can choose which to assume is false. The two choices may not result in the same amount of work. We choose the one that we think is the easier to use.

Proposition 2.4.2 If ∼ is an equivalence relation on a set X, then the collection of equivalence classes forms a partition of X.

Proof. We have already noted that each [x] is a subset of X. By reflexivity, for each x, we have x ∼ x, so x ∈ [x]. Thus each [x] is non-empty. This establishes the first item in the definition of a partition.

The observation that for each x ∈ X we have x ∈ [x] also shows that the union of the equivalence classes is all of X, and we have the second item.

To prove the third item, we assume that [x] and [y] are not disjoint. We want to prove that [x] = [y]. Since [x] ∩ [y] ≠ ∅, there is an element in [x] ∩ [y]. We let z be such an element. (Note that we must use a letter not used yet since we must use the "there is" with no other restrictions on what we bring in.) From z ∈ [x], Lemma 2.4.1 says that [z] = [x]. From z ∈ [y] we get [z] = [y], so [x] = [y].

We will use equivalence relations and equivalence classes in the next section to build our examples.

Exercises (21)

1. We look at the relation ∼k defined in (2.4). What are the equivalence classes in Z under ∼2? How many equivalence classes are there in Z under ∼k?

2.4.3 The groups

We consider the relation ∼k on Z defined in (2.4). We make a group, denoted Zk, out of the equivalence classes of ∼k on Z. That is, each element of Zk will be a single equivalence class, and the set of elements of Zk will be the set of equivalence classes. To make a group we need to define the operations.
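Proposition 2.4.2 can be watched in action on a finite slice of Z. The sketch below (names ours) computes the classes of ∼3 and checks the three partition properties directly; it is also a quick way to explore the exercise about the classes of ∼k.

```python
def equivalence_classes(X, rel):
    # Collect the distinct classes [x] = {y in X | x ~ y} for a finite X.
    classes = []
    for x in X:
        cls = frozenset(y for y in X if rel(x, y))
        if cls not in classes:
            classes.append(cls)
    return classes

X = range(-6, 7)
sim3 = lambda x, y: (y - x) % 3 == 0   # the relation ~3 from (2.4)

P = equivalence_classes(X, sim3)
print(len(P))  # 3 -- one class for each of the representatives 0, 1, 2

# Partition properties: non-empty, pairwise disjoint, union is all of X.
print(all(S for S in P))                            # True
print(all(S == T or not (S & T) for S in P for T in P))  # True
print(set().union(*P) == set(X))                    # True
```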
This group will be abelian and we will use + to denote the binary operation and − to denote the unary operation. There will be a problem with the definitions. This kind of problem occurs often, and the technique for getting rid of the problem is standard and must be learned.

For x ∈ Z, we will write [x]k to denote the equivalence class of x in Z under ∼k. This example is important enough to get its own notation. Since the elements of Zk will be classes [x]k, we need to know how to add them together and how to negate them. For [x]k and [y]k in Zk, we define

[x]k + [y]k = [x + y]k        (2.6)

and

−[x]k = [−x]k.        (2.7)

Well definedness

Observe that in (2.6) the "result" [x + y]k of the definition contains a calculation that uses elements (representatives) of the equivalence classes [x]k and [y]k. The problem is that there are many representatives of [x]k and [y]k. Thus many different calculations can be done in an attempt to determine [x]k + [y]k, and we need to know that these calculations always give the same equivalence class.

Let us look at some examples. Let k = 5 and consider [1]5 + [3]5. If we use 1 from [1]5 and 3 from [3]5, then we get [1]5 + [3]5 = [1 + 3]5 = [4]5. Note that 1 ∼5 6 and 3 ∼5 8, so [1]5 = [6]5 and [3]5 = [8]5. So we can legitimately use 6 from [1]5 and 8 from [3]5 to get [1]5 + [3]5 = [6 + 8]5 = [14]5. But 14 ∼5 4, so [14]5 = [4]5 and we get the same answer. The point is that [4]5 = [14]5 even though 4 ≠ 14. This raises hope that we will always get the same answer, but it is not a proof that we do.

The problem that we are seeing in the definition (2.6) is called a well definedness problem. We will discuss how to recognize such problems shortly. If it can be proven that one gets the same answer no matter which representatives are used in the calculation, then we can say that the operation + as defined in (2.6) is well defined. Well definedness problems often come up when defining operations on sets of equivalence classes.
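The k = 5 experiment above can be run for every pair of representatives in a window of Z at once. The sketch below (names ours) identifies a class [x]5 by the remainder of its representatives and checks that every choice of representatives of [1]5 and [3]5 sums into the same class [4]5; again this is evidence for well definedness, not the proof.

```python
k = 5
# All representatives of [1]5 and of [3]5 in a small window of Z.
reps_1 = [x for x in range(-20, 21) if (x - 1) % k == 0]
reps_3 = [x for x in range(-20, 21) if (x - 3) % k == 0]

# Identify a class by the common remainder of its representatives;
# every choice of representatives must give a sum in the same class.
sums = {(a + b) % k for a in reps_1 for b in reps_3}
print(sums)  # {4} -- one class only, namely [4]5
```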
More generally (operations are really functions with an appropriate domain), well definedness problems can come up when defining functions on sets of equivalence classes. We now deal with (2.6).

Lemma 2.4.3 The operation defined in (2.6) is well defined.

Proof. We must show that changing the representatives used in the right side of the equal sign in (2.6) does not change the result. That is, the same equivalence class is obtained. To that end, we let x′ ∈ [x]k and y′ ∈ [y]k be other representatives of [x]k and [y]k, respectively. Our goal is to show that [x′ + y′]k, which results from using x′ and y′, is the same as [x + y]k, the result of using x and y. Thus we wish to show (x + y) ∼k (x′ + y′). But this requires showing that k|((x′ + y′) − (x + y)), or equivalently k|((x′ − x) + (y′ − y)). But we know that x ∼k x′ and y ∼k y′ hold, so k|(x′ − x) and k|(y′ − y). The result follows from a problem in Exercise Set (20).

The outline of the proof above is standard and should be learned. To show well definedness, show that if different representatives are chosen, then the result does not change. If the result is an equivalence class, then one shows that the results of the calculation are related.

Lemma 2.4.4 The unary operation defined by (2.7) is well defined.

The proof will be left as an exercise.

We now define [x]k − [y]k = [x]k + (−[y]k) in Zk. It is wrong to think that any definition that mentions equivalence classes must have a well definedness problem. This definition does not. From Lemma 2.4.4, we know that −[y]k is well defined. Now Lemma 2.4.3 tells us that the sum of the two classes [x]k and −[y]k is well defined. So the definition given is well defined.

Had we written [x]k − [y]k = [x − y]k, then we would have had a well definedness problem. The difference is that the second definition tells how to calculate the answer in terms of representatives.
You might wonder why two definitions of the same thing have different behaviors. The answer is that it takes a proof that they do define the same thing. If such a proof is found, then the second definition must be well defined since the first definition is. This is all covered further in exercises.

Proposition 2.4.5 The pair (Zk, +) with + as defined in (2.6), and with − as defined in (2.7), is an abelian group with [0]k as the identity element.

The proof will be left as an exercise.

The group Zk will be referred to as the integers modulo k or as the integers mod k. From a problem in Exercise Set (20), we know that ∼k is the same relation as ∼−k, and so the two relations have the same equivalence classes. Thus there is no difference between Zk and Z−k. From now on, we only look at Zk with k > 0.

From a problem in Exercise Set (21), we know that Zk has k elements. Since this is finite, we can write down a table that gives the results of the operation for all pairs of elements in Zk. The table below gives the "addition table" for Z4. We have been lazy and have omitted the [ ]4 from the entries. Thus the entry 3 really refers to [3]4, and so forth.

+ | 0 1 2 3
--+--------
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

Leaving out the brackets and subscript will often be done to denote elements of Zk as long as it is clear what is going on.

Order of group and element

We give some definitions primarily so that we can give some problems to work on. The terms defined will take on greater importance later.

A group (G, ·) is called finite if G is a finite set. If (G, ·) is a finite group, then the order of G is the number of elements of G. If G is not finite, then we can say that the order of G is infinite. We write |G| to denote the order of G.

If (G, ·) is a group and g ∈ G, then (using multiplicative terminology) the order of g is the least number of copies of g that have to be multiplied together to give the identity element.
If no such number exists, then we say that g is of infinite order. At the moment there is no obvious connection between the two uses of the word "order" but that will be cleared up eventually. If (G, +) is an abelian group, then additive language can be used, and the order of a g ∈ G will be the least number of copies of g that have to be added together to give the identity element.

It is a reasonable guess that every element in a finite group has a finite order, but we will not see a proof of this until later.

We note that the addition table for Z4 shows that the order of 0 is 1. (As mentioned, we are leaving out the brackets that denote the equivalence classes.) That is, we only need to add up one copy of 0 to get 0. On the other hand, 1 ≠ 0, 1 + 1 = 2 ≠ 0, 1 + 1 + 1 = 3 ≠ 0 and 1 + 1 + 1 + 1 = 0. Thus 1 has order 4 in Z4.

We now have enough material to give some problems. Before we list the problems, we mention one easy but important result.

Proposition 2.4.6 There is a group of every finite order.

Proof. Given a finite size k, we know that Zk has k elements.

Exercises (22)

1. Prove Lemma 2.4.4.
2. Prove that [x]k − [y]k = [x]k + (−[y]k) and [x]k − [y]k = [x − y]k define the same operation. That is, prove for all x and y that [x]k + (−[y]k) = [x − y]k. As mentioned, this proves that the second definition is well defined. Give a completely independent proof that the second definition is well defined.
3. Prove Proposition 2.4.5. You may use facts that you know about the integers (such as commutativity, associativity, etc. of various operations).
4. Calculate the orders of 2 and 3 in Z4.
5. Write out the addition table for Z5 and calculate the orders of all the elements in Z5.
6. Write out the addition table for Z6 and calculate the orders of all the elements in Z6.

2.4.4 Groups that act and groups that exist

Elements of the symmetric groups Sn move things around. We will learn eventually to say that Sn "acts" on the set {1, 2, . .
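The order computations asked for in the exercises can be checked mechanically by adding a class to itself until the identity [0]k appears. A sketch (the function name is ours), identifying a class in Zk with its remainder:

```python
def order_in_Zk(g, k):
    # Least n >= 1 so that n copies of [g]k added together give [0]k.
    total, n = g % k, 1
    while total != 0:
        total = (total + g) % k
        n += 1
    return n

# Orders of 0, 1, 2, 3 in Z4; compare with the addition table above.
print([order_in_Zk(g, 4) for g in range(4)])  # [1, 4, 2, 4]
```

Note the loop always terminates for k > 0, consistent with the guess above that every element of a finite group has finite order.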
. , n} in that each element of Sn permutes the elements of {1, 2, . . . , n} and that certain rules are followed. If we recall that matrices are linear transformations, then Mn′, the group of n × n matrices with real entries and non-zero determinant (which are thus invertible), "acts" on the vector space Rn. Once the rules for actions are written down, it will be seen that this action also follows those rules.

It is less obvious that any of the other examples "act" on anything. These examples include such familiar groups as (Z, +), (R*, ·), etc., and perhaps less familiar groups such as (Zk, +). It could be said that these groups just exist in their own right and do not "act" on anything.

It will be seen later that any group can be made to act on a set. Not every question about groups requires looking at such an action, and not every application of groups emphasizes such an action. However, many questions and applications do. The view that groups act will become more important to us as the course goes on.

2.5 Rings

Each group has one binary operation. Rings have two. The best example to keep in mind as you read this section is the integers with addition and multiplication. More complicated examples will show up shortly.

2.5.1 Definitions

We will not give two versions of the definition of a ring. Based on your experience with groups, you can supply a second version with no trouble. We will give the one with less structure spelled out.

A ring is a triple (R, +, ·) where R is a set and + and · are binary operations on R. The following conditions must be met.

1. The set R and the addition form a commutative group. The identity element is usually denoted 0.
2. The multiplication is associative in that a(bc) = (ab)c for all a, b and c in R.
3. Multiplication distributes over addition in that a(b + c) = ab + ac and (b + c)a = ba + ca for all a, b and c in R.

Some comments are needed. The multiplication need not be commutative.
This is why two distributive laws are needed. The multiplication need not have an identity. These gaps can be filled by adding more words.

If R has an element 1 so that for all a ∈ R, we have a1 = a = 1a, then we say that R is a ring with identity or ring with 1 or ring with unit. (Some books choose not to deal with rings without a multiplicative identity and so their definition of a ring coincides with our definition of a ring with identity.) The even integers form a ring without identity.

If a ring R satisfies ab = ba for all a and b in R, then R is called a commutative ring. A commutative ring with identity fails to be a field only in that it may lack multiplicative inverses. The integers form a commutative ring with identity. Two-by-two matrices with real entries form a non-commutative ring with identity.

There are more words that can be attached to rings. We will deal with some later. The small number of assumptions about the multiplication in a ring allows for lots of variations, and this leads to many special terms.

2.5.2 Examples

The first important example (Z, +, ·) has already been mentioned. It is referred to as the ring of integers to emphasize that both operations are being considered.

The second important examples are the rings of polynomials. Since the coefficients can come from various classes of numbers, there are various rings of polynomials. For us the most important sources of coefficients will be the rational numbers Q, the real numbers R, and the complex numbers C.

Of course, Q, R and C are also examples of rings. However, they satisfy many more properties than are required of a ring, and they will show up again when we discuss fields.

The ring Zk

The elements of Zk are the equivalence classes [x]k for x ∈ Z. We add classes by [x]k + [y]k = [x + y]k having proven that this addition is well defined. We can try to multiply classes by [x]k [y]k = [xy]k. We immediately have a well definedness problem.
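The well-definedness worry can be made concrete with a small computation (a sketch of ours, not part of the notes): pick different representatives of the same classes in Zk and check that the products always land in the same class.

```python
# Sketch: check that [x][y] = [xy] in Z_k does not depend on the
# representatives chosen. The function name is ours, for illustration.

def same_class(a, b, k):
    """True when a and b represent the same class in Z_k."""
    return (a - b) % k == 0

k = 4
for x in range(k):
    for y in range(k):
        # x + 7k and y - 3k are other representatives of [x] and [y].
        assert same_class((x + 7 * k) * (y - 3 * k), x * y, k)
print("multiplication on Z_%d is representative-independent" % k)
```

This is evidence, not a proof; the proof is exactly what Lemma 2.5.1 below asks for.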
Lemma 2.5.1 Setting [x]k [y]k = [xy]k for elements of Zk gives a well defined binary operation on Zk.

The proof will be left as an exercise.

Proposition 2.4.5 says that (Zk, +) is an abelian group. Now that we have a second binary operation, we might have a ring. To show that we have a ring, we have to show that the second and third requirements for a ring are met. In fact, they are and we have the following.

Proposition 2.5.2 The triple (Zk, +, ·) with + as defined in (2.6), and with · the multiplication discussed in Lemma 2.5.1, is a commutative ring with 1.

The proof will be left as an exercise.

We gave the addition table for Z4 after Proposition 2.4.5. Below is the multiplication table. As before, we simplify the table by writing, for example, 3 instead of [3]4.

·   0   1   2   3
0   0   0   0   0
1   0   1   2   3
2   0   2   0   2
3   0   3   2   1

Note that the elements 1 and 3 in Z4 have multiplicative inverses, but 2 does not. The multiplicative inverse of 1 is 1 since 1 · 1 = 1, and the multiplicative inverse of 3 is 3 since 3 · 3 = 1. However, no x ∈ Z4 has 2 · x = 1.

Exercises (23)

1. Prove Lemma 2.5.1.

2. Prove Proposition 2.5.2.

3. Write out the multiplication tables for Z3, Z5 and Z6. In each determine which elements have multiplicative inverses.

2.6 Fields

Fields are rings but with extra restrictions. Thus every field is a ring, but not every ring is a field. Since every field is a ring, we can shorten the definition of a field by referring to the definition of a ring and some of the named restrictions on rings that have been discussed.

2.6.1 Definitions

A field F is a commutative ring with 1 with operations called addition (usually written with the symbol + as in x + y) and multiplication (usually written with no symbol as in xy) so that if 0 is the additive identity, then the following also hold.

1. Every x ∈ F with x ≠ 0 has a multiplicative inverse x⁻¹.

2. 1 ≠ 0.

The second item is not required in all books.
Demanding 1 ≠ 0 keeps the one element ring {0} from being a field.

If x ≠ 0, we will argue later that x⁻¹ ≠ 0. You can try that now, but if you are not careful, you will skip important steps. Accepting this fact for now, we see that if F∗ = F − {0}, then the multiplication on F∗ is associative, commutative, has an identity and has inverses. This means that the multiplication makes F∗ into another abelian group. (Using just + on F gives the first abelian group.) The two groups are linked by the distributive law, which holds because we are assuming that F is a ring.

If a definition is desired that makes no mention of rings, then a field is a triple (F, +, ·) where · is usually not written, where F is a set and + and · are binary operations. There are elements 0 and 1 in F with 1 ≠ 0 so that for all x, y and z in F the following hold.

1. x + y = y + x.
2. (x + y) + z = x + (y + z).
3. x + 0 = x.
4. There is −x ∈ F so that x + (−x) = 0.
5. xy = yx.
6. (xy)z = x(yz).
7. x1 = x.
8. If x ≠ 0, there is x⁻¹ ∈ F so that xx⁻¹ = 1.
9. x(y + z) = xy + xz.

2.6.2 Examples

The three standard examples have been mentioned before: Q, R and C with the usual addition and multiplication.

We will give two types of examples before leaving this chapter. They look quite different, but are more related than they look. The view that unifies them will come a lot later. One set of examples will need a fair amount of material about the integers, and we will take the time to review that material in this chapter before discussing those examples. The examples that need a little less work will be covered now.

Adding a square root to Q

The field R has no number in it whose square is −1. We build the field C by adding a number to R that has the property that its square is −1. The field Q has no number in it whose square is 2. We will build a field that we will call Q[√2] by adding a number to Q that has the property that its square is 2.
(For those not familiar with the proof, we will show at the end of this section that there is no rational number r that has r² = 2.) There is a real number √2 whose square is 2, so we do not have to look far for the number that we want. Recall that √2 refers only to a positive real number. The negative real number whose square is 2 is −√2.

We claim that the set of all numbers r + s√2 with r and s both rational is closed under the four operations of addition, negation, multiplication and (for r and s not both zero) inversion. In fact, we will argue that it forms a perfectly good field. We will use Q[√2] to denote this collection of numbers.

We know that 0 + 0√2 = 0 is an additive identity and 1 + 0√2 = 1 is a multiplicative identity. We also know that the addition and multiplication operations in R satisfy all the requirements of a field including both commutativities, associativities, and also the distributive law. This means that Q[√2] will have these properties. So if our closure claims are correct, then we will know that Q[√2] forms a field.

We have (r + s√2) + (t + u√2) = (r + t) + (s + u)√2. From this it is clear that −r − s√2 is an additive inverse for r + s√2. Multiplication is handled by

(r + s√2)(t + u√2) = rt + (st + ru)√2 + 2su = (rt + 2su) + (st + ru)√2.

This leaves inversion.

We used two techniques for multiplicative inversion for complex numbers. We do the more difficult one here, and leave the simpler as an exercise. Given r + s√2 ≠ 0, we want to find t and u so that (r + s√2)(t + u√2) = 1. This is equivalent to asking that

(rt + 2su) + (st + ru)√2 = (1) + (0)√2.

If we can find t and u so that rt + 2su = 1 and st + ru = 0, then we have a solution. But this is just two linear equations in the two unknowns t and u with r and s in the role of constants. We have

rt + 2su = 1,    st + ru = 0,
rst + 2s²u = s,  rst + r²u = 0,
u(2s² − r²) = s,

so
u = s/(2s² − r²) = −s/(r² − 2s²).

Now either doing something similar, or by plugging the above value for u into st + ru = 0, we get

t = r/(r² − 2s²).

This demonstrates that

(r + s√2)( r/(r² − 2s²) − (s/(r² − 2s²))√2 ) = 1 + 0√2 = 1,

so we should declare

(r + s√2)⁻¹ = r/(r² − 2s²) − (s/(r² − 2s²))√2.    (2.8)

Note that this makes no sense if r² − 2s² = 0. But this only happens if (r/s)² = 2, which would make 2 the square of a rational number since r and s are rational. (If s = 0, then r ≠ 0 since r + s√2 ≠ 0, so r² − 2s² = r² ≠ 0.) Since this cannot happen, we never have r² − 2s² = 0.

Exercises (24)

1. Fill in the details in the calculations above for t.

2. Find (r + s√2)⁻¹ by the technique of rationalizing the denominator.

3. Use (2.8) to get (3 + 2√2)⁻¹ as an element of Q[√2] and use direct multiplication to verify that when multiplied by 3 + 2√2 it gives 1.

4. Let Q[√3] be all r + s√3 with r and s rational. Find formulas for addition, multiplication, negation and inversion and argue that Q[√3] forms a field.

5. (Harder.) Let Q[∛2] be all r + s∛2 + t∛4 with all of r, s and t rational. (Note that ∛4 = (∛2)².) Find formulas for addition, multiplication, negation and inversion and argue that Q[∛2] is a field. One technique for finding a formula for inversion will work (with a tremendous amount of work involved) and the other will probably be impossible.

2.6.3 The irrationality of √2

The proof below is extremely standard and can be found in this form in thousands of books. It is a proof by contradiction. That is, it assumes that what is to be proven is false, and derives an impossible situation—that some fact would have to be both true and false. Thus it is impossible that what is to be proven can be false, and it must therefore be true.

Getting used to proof by contradiction has some negative effects. Proof by contradiction often becomes an addiction. Once used to the technique, some students start using it for everything—even for statements that have simple straightforward proofs.
Logically, a proof by contradiction is as good as any other, but if used when not needed, it can hide what is really going on. A hard rule to follow, but a worthwhile rule nonetheless, is to never use proof by contradiction until a serious attempt to find a direct proof has been made.

Proposition 2.6.1 For no rational number r is r² = 2.

Proof. A rational number is of the form m/n for integers m and n with n ≠ 0. We assume that the fraction m/n is in reduced terms. In particular, we can assume that not both m and n are even. We show the impossibility of (m/n)² = 2 by assuming (m/n)² = 2 and showing that m and n have to both be even. Since we are assuming that they are not both even, we will have a contradiction.

If (m/n)² = 2, then m² = 2n², which makes m² even. Since odd numbers have odd squares (the square of 2k + 1 is 4k² + 4k + 1), m has to be even. (This is a mini proof by contradiction by itself.) If m is even, it is of the form 2q for some integer q. Now m² = 4q². So 4q² = 2n² and n² = 2q². This makes n² even, and the argument we just gave for the evenness of m also gives that n is even. This is the promised contradiction.

2.7 Properties of the ring of integers

From the problems in Exercise Set (23), we get the hint that Zp might be a field as long as p is prime. This turns out to be the case and we will show this to be true. This gives another collection of examples of fields.

We need to understand more about the integers. In this section we will derive properties of integers that we will exploit in Section 2.8 to show that Zp is a field when p is prime. These fields are related to the examples (such as Q[√2]) above, but the relationship is far from obvious. It will be made obvious much later, but we want to give hints about the nature of the relationship now.

2.7.1 An outline

What will emerge in this section and the next will be an outline.
It will run through some familiar aspects of the integers, such as the division algorithm and the existence and form of greatest common divisors, and end with properties of prime numbers and the application in Section 2.8 to finding multiplicative inverses in certain settings. As has been seen above, the hardest aspect in building a field is often the construction of multiplicative inverses.

The importance of the outline is that it applies to more than just the ring of integers and the fields that can be built from integers. Later we will see that the same outline applies to rings of polynomials and fields that can be built from polynomials. It is at that point that we will see the connection between the examples of fields such as Q[√2] and the examples that we will build in Section 2.8.

The outline will cover the following topics.

1. Well ordering and induction. This only applies to the non-negative integers, and only applies indirectly to polynomials, but it is the start of all our constructs.

2. The division algorithm. Induction reveals properties of quotients and remainders, and these properties drive all that follows.

3. Greatest common divisors. Induction and the division algorithm prove that greatest common divisors exist and prove that they have a certain form.

4. Primes I. The outline branches here. This is the long branch and is not needed to construct examples of fields. It covers several crucial facts about primes. This branch will be done in this section and its parallels in rings of polynomials will be important later.

(a) Factorization into primes. Every integer is a product of primes. This needs nothing but induction.

(b) Euclid’s first theorem about primes. This title is not universally used, but it will do. The theorem states that for a, b and p in Z, if p is a prime and p|(ab), then either p|a or p|b. This uses the facts that we establish about greatest common divisors.
(c) Uniqueness of prime factorization. It is easy to show that every integer factors into primes, but it is harder to show that it can only be done in one way. It is also hard to explain exactly what that last sentence means. This uses induction and Euclid’s first theorem.

5. Primes II—the short branch. This will be covered in Section 2.8.

(a) Building multiplicative inverses. This will show that Zp is a field when p is prime. The facts we have learned about greatest common divisors will be used here.

2.7.2 Well ordering and induction

The natural numbers are the non-negative integers and are denoted N. We have N = {0, 1, 2, . . .}. The important property of the natural numbers that we wish to discuss is the following.

Well ordering

Proposition 2.7.1 For every non-empty subset S of N, there is a least element of S. That is, there is an element m ∈ S so that for every n ∈ S, we have m ≤ n.

The property given in the proposition is called well ordering. In the next few pages, the well ordering property of N will be used several times. Some comments are in order.

The well ordering of N is given as a proposition since it can be proven by a standard induction proof. It is hoped that this has been seen in a previous course, but if not, an exercise at the end of the section will guide you through a proof.

The assumption that S be non-empty is obviously needed (how can m be found in S if S has no elements?), but it is easy to forget to check this condition when invoking well ordering.

The least element must be a member of S. This is why m ≤ n is given as the last part of the statement instead of m < n.

The integers are not well ordered. The set Z is a perfectly good non-empty subset of Z and Z has no least element.

The non-negative reals are not well ordered. The non-negative reals have a least element, but not every non-empty subset does. For example, the positive reals form a non-empty subset of the non-negative reals, and the positive reals have no least element.
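Well ordering is also what justifies a very common computational idiom: to find the least element of a non-empty subset of N, search upward from 0. A small sketch (the function name and the bound are ours, purely for illustration):

```python
def least_element(predicate, bound=10**6):
    """Return the least n in N with predicate(n) True.
    Well ordering guarantees such an n exists whenever the set
    {n in N : predicate(n)} is non-empty; the bound is only a
    safety net for this sketch, since a computer cannot search
    all of N."""
    n = 0
    while n <= bound:
        if predicate(n):
            return n
        n += 1
    raise ValueError("no element found below bound")

# The least natural number whose square exceeds 50.
print(least_element(lambda n: n * n > 50))  # 8
```

Note how the non-emptiness assumption shows up here: if the set is empty, the search never terminates (or hits the artificial bound), which mirrors the remark above about forgetting to check that S is non-empty.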
Induction

We can approximately describe induction as

if 0 works, and each number works if the previous number works, then all numbers work.

The part to focus on in that approximation is the part that says each number works if the previous number works. This part corresponds to proving a statement is true for k + 1 under the assumption that it is true for k.

We want to replace the approximation by a statement that is stronger in that it gives more hypotheses to work with when trying to prove something by induction. The stronger approximate statement would read

if 0 works and each number works if all smaller numbers work, then all numbers work.

However, it turns out that the part if 0 works does not need to be stated explicitly. Read on. The formal statement is given below.

Proposition 2.7.2 (Strong induction) If S(n) is a statement that varies with n ∈ N, then S(n) is true for all n ∈ N if the following holds: for each n ≥ 0 it can be proven that S(n) is true assuming that S(j) is true for all 0 ≤ j < n.

The fact that there is no separate assumption that S(0) be true is really hidden in the wording. The statement asks that S(0) be provable from “S(j) for all 0 ≤ j < 0,” and the set {j ∈ N | j < 0} is empty. So the statement asks that S(0) be provable from nothing. That is, it asks that S(0) be true. If it bothers you that this assumption is hidden in the wording, then you can just add it as a redundant requirement.

Proof. Let F = {n ∈ N | S(n) is false}. That is, F contains all numbers n making S(n) false. If F is empty, then no n ∈ N makes S(n) false, S(n) is true for all n ∈ N and we are done. So we assume F is not empty. Then it has a least element k. This means that S(k) is false and any j ∈ N with j < k has j ∉ F, so S(j) is true. But then k fits the statement and S(k) must be true. Thus F cannot be non-empty.

Some comments are in order. This is called strong induction since it is easier to use than ordinary induction.
Most of the applications in this outline will use well ordering directly. But one application below will use well ordering’s consequence, strong induction, and it will be pointed out that ordinary induction would be extremely hard to use in that particular situation.

As is typical with inductions, we do not have to start with 0. There is a version for any start value s in N which says that S(n) is true for all n ∈ N with n ≥ s if for each n ≥ s it can be proven that S(n) is true assuming that S(j) is true for all s ≤ j < n.

2.7.3 The division algorithm

The main point of this section is the following proposition.

Proposition 2.7.3 (The division algorithm) Let m and d be in Z with d > 0. Then there are unique elements q and r in Z so that m = dq + r and 0 ≤ r < d.

Before we give the proof, we need some comments.

The value q is usually called the quotient of the division of m by d, and r is called the remainder of the division. The usual name for d is the divisor and the usual name for m is best forgotten.

The word “unique” appears in the statement of the proposition. The proper use of this word always involves conditions. If there is a condition around, then an object might be the only object that satisfies that condition. If it is, then it is said to be the “unique” object that satisfies the condition. Nothing is ever unique “by itself” except in colloquial speech. (E.g., “He is really unique.”)

Uniqueness proofs take a certain shape. The shape will resemble proofs we have seen involving one-to-one functions. This is not a coincidence, and we will remark on this after giving the proof.

Proof. We concentrate on the possible values of r. We want m = dq + r, so r = m − dq for some q ∈ Z. For this reason we let

A = {m − dq | q ∈ Z}.

This is clearly non-empty, but it is not a subset of N. Let

B = {a ∈ A | a ≥ 0} = {m − dq | q ∈ Z and m − dq ≥ 0}.

This is clearly a subset of N, but it is not clearly non-empty.
We want to apply well ordering to B and need B to be non-empty to do so. If m ≥ 0, then q = 0 gives m − dq = m − d0 = m − 0 = m ≥ 0. If m < 0, then q = m gives m − dq = m − d(m) = m(1 − d). But d > 0 means d ≥ 1, so 1 − d ≤ 0. Since m < 0, this gives m(1 − d) ≥ 0, so m − dq ≥ 0. So there is always a non-negative value of m − dq for some value of q and B is not empty.

By well ordering, there is a least element of B. Let r be this element. Since r ∈ B, we know that r ≥ 0 and r = m − dq for some q, so m = dq + r. We need to show that 0 ≤ r < d. Since we already know r ≥ 0, we need only show that r < d.

We prove this by contradiction. If r ≥ d, then r − d ≥ 0. But r − d = (m − dq) − d = m − d(q + 1), and m − d(q + 1) ≥ 0. This means r − d = m − d(q + 1) is in B. However, d > 0 implies that r − d < r. So r − d is in B and r − d < r. This contradicts the fact that r is a least element of B. So r ≥ d is not possible and r < d.

We have shown the “there are” part of the statement. We now have to deal with the uniqueness. To prove uniqueness, we assume “others” that satisfy the conditions and prove that the “others” are the same as the ones already found. Specifically, we assume q′ and r′ so that m = dq′ + r′ and 0 ≤ r′ < d, and we try to prove q = q′ and r = r′.

With m = dq + r and m = dq′ + r′, we have dq + r = dq′ + r′. This makes d(q − q′) = r′ − r. If r = r′, then d(q − q′) = 0 and d ≠ 0 implies q − q′ = 0 and q = q′. So we are done if r = r′.

If r ≠ r′, one must be greater than the other. Assume r′ > r. (If r > r′, then use d(q′ − q) = r − r′ and repeat the argument we are about to give while reversing the roles of the primed and unprimed letters.) We have 0 ≤ r < r′ < d, so 0 < r′ − r < d. With d(q − q′) = r′ − r, we have d(q − q′) > 0, and d > 0 implies q − q′ > 0. So q − q′ ≥ 1. This makes d(q − q′) ≥ d. This contradicts d(q − q′) = r′ − r < d. So r = r′. This completes the proof.
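For d > 0, the quotient and remainder in the proposition coincide with what Python's floor division produces, so the statement can be spot-checked numerically. This is a sketch of ours, not part of the notes; the interesting case is negative m, where the remainder still comes out non-negative.

```python
def division_algorithm(m, d):
    """Return (q, r) with m = d*q + r and 0 <= r < d, for d > 0.
    Python's divmod uses floor division, which gives exactly this
    normalization of the remainder, even when m is negative."""
    assert d > 0
    q, r = divmod(m, d)
    # Re-check the two defining conditions from Proposition 2.7.3.
    assert m == d * q + r and 0 <= r < d
    return q, r

print(division_algorithm(17, 5))   # (3, 2)
print(division_algorithm(-7, 3))   # (-3, 2): the remainder stays non-negative
```

Many languages (C, Java, and others) truncate toward zero instead, giving `-7 / 3 == -2` with remainder `-1`; the proposition's normalization 0 ≤ r < d is the floor-division convention.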
The outline discussed in the proof for proving uniqueness is always followed in uniqueness proofs. It should be learned thoroughly.

In proving that a function f : X → Y is one-to-one, it is shown that an element y ∈ Y that is in the image of f has a “unique” x ∈ X so that f(x) = y. The approach to uniqueness in the proof above is built into the definition of one-to-one. One assumes that there is another element x′ so that f(x′) = y and one must prove that x = x′. In the definition y is not mentioned, and so the definition reads “if f(x) = f(x′), then x = x′.” It is just a shorthand for a claim of uniqueness.

Note the hypothesis d > 0 in the statement of the division algorithm. The following is the version with d ≠ 0.

Corollary 2.7.4 Let m and d be in Z with d ≠ 0. Then there are unique elements q and r in Z so that m = dq + r and 0 ≤ r < |d|.

The proof is left as an exercise.

The power of the division algorithm for us is that it allows us to discuss divisibility with some certainty. Let m and d be in Z with d ≠ 0. We want to discuss the truth of d|m. We concentrate on the remainder r of the division of m by d. By the division algorithm it is unique. Now if d|m, then there is some q ∈ Z so that m = dq = dq + 0. Since 0 ≤ 0 < |d|, we must have r = 0. (Explain how the uniqueness of the remainder is being used here.) On the other hand, if the remainder r is 0, then m = dq + 0 = dq and d|m. So we have proven the following.

Corollary 2.7.5 If m and d are in Z with d ≠ 0, then d|m if and only if the remainder of the division of m by d is zero.

This is fundamental enough to be used often and without referring back to the corollary.

2.7.4 Greatest common divisors

As promised in the outline, we will make use of the division algorithm in this section.

If m and n are in Z, then a common divisor of m and n is an integer d that divides both m and n. That is, d|m and d|n are both true.
Obviously, a greatest common divisor is a common divisor that is greatest in some sense. An obvious choice for greatest is “largest.” This is used in a few books, but is less useful than another, more widely used choice for the meaning of greatest. We will use that other choice in these notes. In spite of the fact that the other choice has its advantages, it also has some negative (pun definitely intended) aspects. These negative aspects will appear shortly.

For us a greatest common divisor for m and n in Z is a common divisor g for m and n so that if h is any common divisor of m and n, then h|g.

Note that if g and h are both positive, then h|g implies that h ≤ g. So if we restrict to positive values, the definition we use has the power of the definition that interprets greatest as “largest.”

The following guarantees the existence of greatest common divisors and says something about their form.

Proposition 2.7.6 (Greatest common divisors) Let m and n be in Z with at least one of them not equal to zero. Then there is a unique positive greatest common divisor g of m and n. Further, the only other greatest common divisor of m and n is −g. Lastly, g is the smallest positive integer so that there are integers s and t so that g = ms + nt.

The appearance of the word “positive” in the uniqueness part of the statement, and the assumption that at least one of m or n is not zero, will be discussed after the proof.

Proof. Let

B = {ms + nt | s ∈ Z, t ∈ Z, ms + nt > 0}.

Clearly, B ⊆ N. We wish to use well ordering, so we need to argue that B is not empty. This is left as an exercise. Applying well ordering, there is a least element g of B. If g is shown to be a greatest common divisor of m and n, then we will have shown the last sentence in the statement of the proposition. We know g has the form g = ms + nt for some s and t in Z. We first have to show that g divides both m and n. We will show g|m.
We assume g does not divide m and will derive a contradiction. We have m = gq + r with q and r in Z and 0 ≤ r < g. Since we assume g does not divide m, we have 0 < r. Now

r = m − gq = m − (ms + nt)q = m(1 − sq) − n(tq)

is an element of B since r > 0 and both 1 − sq and −tq are in Z. But r < g contradicts the choice of g as the least element of B. So g|m. A similar proof will show g|n and we omit that proof.

Now let h be a common divisor of m and n. Since g = ms + nt, we have that h|g. (This follows from problems in Exercise Set (20).) This makes g a greatest common divisor.

If g divides both m and n, then so does −g. Also if h is another common divisor of m and n, then we know that h|g so h|(−g). This makes −g a greatest common divisor.

Since g ∈ B, we have g > 0. We have shown the existence part of the statement, and need to show the uniqueness claim and the claim that −g is the only other greatest common divisor.

Assume that g′ is another greatest common divisor. Then both g′ and g are common divisors and greatest common divisors. Using g as a greatest common divisor and g′ as a common divisor, we have g′|g. Reversing the roles gives g|g′. So g = hg′ and g′ = kg for integers h and k. Substituting gives g = (hk)g, or 1 = hk since g ≠ 0. But the only integer pairs that multiply to 1 are 1 × 1 and −1 × −1. Thus k is 1 or −1. Since g′ = kg we have g′ = g or g′ = −g. Thus the only greatest common divisors of m and n are g and −g, and only g is positive. This completes the proof.

The assumption that at least one of m or n is not zero is necessary only for the method of proof and for the word “positive” in the statement. All integers are divisors of 0 and the only integer divisible by all other integers is 0. So calling 0 the greatest common divisor of 0 and 0 makes sense. However, we will never have a use for this.

The positive quantity g guaranteed by the proposition, the unique positive greatest common divisor of m and n, is denoted (m, n).
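Proposition 2.7.6 is not just an existence statement: the standard extended Euclidean algorithm actually produces integers s and t with g = ms + nt. A minimal sketch under our own naming (the recursion is the classical one; it is not how the proposition itself proceeds, which uses well ordering):

```python
def extended_gcd(m, n):
    """Return (g, s, t) with g = m*s + n*t and g the positive
    greatest common divisor (m, n), assuming m and n are not
    both zero."""
    if n == 0:
        # (m, 0) = |m|, witnessed by s = +/-1, t = 0.
        return (abs(m), 1 if m >= 0 else -1, 0)
    q, r = divmod(m, n)          # m = n*q + r
    g, s, t = extended_gcd(n, r)
    # g = n*s + r*t = n*s + (m - n*q)*t = m*t + n*(s - q*t)
    return (g, t, s - q * t)

g, s, t = extended_gcd(12, 8)
print(g)                         # 4, and 12*s + 8*t == 4
assert 12 * s + 8 * t == g == 4
```

The recursion terminates because the remainders strictly decrease, which is the division algorithm at work; each step preserves the Bézout identity g = ms + nt.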
This notation has a long standing tradition in spite of the fact that it competes with the notation for an ordered pair. Usually context determines what the notation means, and for the rest, words will have to make it specific.

To give some numbers to illustrate the notation, we consider m = 12 and n = 8. The full set of common divisors of 12 and 8 is {−4, −2, −1, 1, 2, 4}. The value of (12, 8) is 4. The other greatest common divisor of 12 and 8 is −4 and is denoted −(12, 8).

This situation is similar to square roots. There are two numbers, −2 and 2, whose square is 4, but √4 is defined to be 2, and the other number whose square is 4 is −√4.

The existence of two greatest common divisors may seem like a disadvantage compared to a definition where greatest means largest. If we were to use largest for greatest, then 4 would be the only possible greatest common divisor. However, we remarked that the steps in this outline would be applied later to rings other than the integers. In those settings there is no good way to interpret largest, and the definition that we are using will be the only one available.

2.7.5 Factorization into primes

This is one of the branches in the outline. It needs only strong induction.

In this discussion, we take negative integers into account. This adds to the complications, but it prepares us to deal with the complications that arise when we consider rings other than Z.

A unit in a ring R with identity 1 is a u ∈ R that has a multiplicative inverse u⁻¹. Since rings are not necessarily commutative, this means that we assume both uu⁻¹ = 1 and u⁻¹u = 1. In Z, the only units are 1 and −1.

In Z, a prime is some p ∈ Z that is not a unit and so that if p = ab with a and b in Z, then at least one of a or b is a unit. Basically, this is the usual definition of a prime integer but with negative numbers taken into account.
Integers that are not primes include (among others) 0, 1 and −1: we have 0 = 3 · 0 and neither 3 nor 0 is a unit, and both 1 and −1 are units. However, all of 2, −2, 3, −3, 5, −5 are primes.

Unfortunately, when this definition is carried to rings other than Z, the word changes. What is called prime in Z is called irreducible in rings of polynomials. This is given here only as psychological preparation, and will not be discussed again until much later.

The point of this section is the following.

Theorem 2.7.7 (Fundamental theorem of arithmetic) Every integer n with n > 1 is either prime or a product p1 p2 · · · pk where each pi is a prime.

Proof. The only real work in this proof is an inequality so obvious that it seems silly to prove it. The overall proof is by strong induction on n starting at 2. We are proving an “or” and we assume that n is not a prime. Thus n = ab where neither a nor b is ±1. Since n > 1 it is positive, and (by negating both factors if needed) we can assume that both a and b are positive. Thus a > 1 and b > 1.

We claim that a < n and b < n. If a ≥ n, then b > 1 and a ≥ n combine to say that n = ab > n · 1 = n. This is not possible, so a < n. Similarly, b < n.

The strong inductive hypothesis says that we can assume the truth of what we are proving for all 2 ≤ j < n, so the statement we are proving is assumed true for both a and b. Thus either a = q1 or a = q1 q2 · · · qs, and either b = r1 or b = r1 r2 · · · rt, where all qi and ri are primes. Since n = ab, we have our result.

This would be insanely hard to prove by ordinary induction. The relevance of n − 1 to the discussion when n is not prime is nil. If one were to start with ordinary induction, then the easiest proof of the fundamental theorem of arithmetic would be to first prove the validity of strong induction and then give the above proof.

2.7.6 Euclid’s first theorem about primes

This continues the branch in the outline started by the fundamental theorem of arithmetic.
The main result in this section is a stunning display of the power of greatest common divisors, and it also gives us an opportunity to introduce another concept.

If m and n are in Z with at least one of them not equal to zero, then we say that m and n are relatively prime if (m, n) = 1. The power of this is twofold. First, this happens a lot if one of m or n is prime. This is covered in the lemma below. Second, (m, n) = 1 allows us to say that there are integers s and t so that ms + nt = 1.

Lemma 2.7.8 Let n and p be in Z with p prime. If p does not divide n, then (n, p) = 1.

The proof is left as an exercise. Next we have Euclid’s theorem.

Theorem 2.7.9 Let p, a and b be in Z with p prime. If p|(ab), then either p|a or p|b.

Proof. Assume p does not divide a. Then (p, a) = 1 and there are integers s and t so that ps + at = 1. Multiplying by b gives psb + abt = b. Now p|(psb) and p|(abt), so p divides the sum psb + abt and p|b.

The conclusion of Theorem 2.7.9 is the basis for the definition of prime in other rings. This will be discussed at the time it is needed.

We give an easy generalization of Theorem 2.7.9, mostly to show that it is possible to write out a careful and rigorous argument for a result so obvious that it seems hard to know what to write down for a proof.

Corollary 2.7.10 Let p and ai, 1 ≤ i ≤ k, be in Z with p prime. If p|(a1 a2 · · · ak), then p|ai for some i with 1 ≤ i ≤ k.

Proof. We induct on k starting with k = 1. If k = 1 there is nothing to prove, and if k ≥ 2, then Theorem 2.7.9 says that either p|(a1 a2 · · · ak−1) or p|ak. If p does not divide ak, then the inductive hypothesis applied to the other possibility says that p|ai for some i with 1 ≤ i ≤ k − 1.

2.7.7 Uniqueness of prime factorization

This ends the branch of the outline that started with the fundamental theorem of arithmetic. The main claim here is that there is only one way to factor an integer into primes.
This is absolutely false if interpreted literally. We have 12 = 2 · 2 · 3 = 3 · 2 · 2 = (−2) · (−2) · 3 = (−2) · 2 · (−3), which means we have to take into account at least changes of order and changes of sign. It turns out that this is all we have to take into account. The following lemma takes care of signs. It is based on the fact that there are only two units in Z. Lemma 2.7.11 Let p and q be primes in Z. If p|q, then p = q or p = −q. The proof is left as an exercise. Another way to state the conclusion is to say that |p| = |q|, and yet another way is to say that q = up where u ∈ {−1, 1}. Now we can prove the main result of this branch of the outline. Theorem 2.7.12 (Uniqueness of factorization of integers) Let m ∈ Z be neither 0 nor a unit. If m = p1 p2 · · · pj and m = q1 q2 · · · qk where all the pi and qi are primes, then j = k and the subscripts of the qi can be permuted so that for each i with 1 ≤ i ≤ j we have qi = pi or qi = −pi. The permuting of the subscripts takes into account that one can change the order of multiplication and not change the result. We will have more to say about the ordering problem after the proof. Proof. We induct on the smaller of j and k. Since p1|m, we have that p1|(q1 q2 · · · qk), and there is an i with 1 ≤ i ≤ k so that p1|qi. By permuting the subscripts we can assume that i = 1 so that p1|q1. We have q1 = up1 for some u ∈ {−1, 1}. Note that u−1 = u, so we also have p1 = uq1 as well. We have m = up1 q2 q3 · · · qk and we get two expressions for m/p1. We have m/p1 = p2 p3 · · · pj and m/p1 = (uq2)q3 · · · qk. We know that m/p1 is not 0, and if m/p1 is a unit, then m = ±p1 = ±uq1. In this case j = k = 1 and no permutation of the subscripts is needed. If m/p1 is not a unit, then this situation is covered by the inductive hypothesis since the number of primes in the two expressions (recall that uq2 is a prime) is j − 1 and k − 1, respectively.
Applying the inductive assumption gives j − 1 = k − 1, implying j = k, and a permutation of the subscripts of the qi, 2 ≤ i ≤ j, gives |pi| = |qi| for each i with 2 ≤ i ≤ j. This completes the proof. There is a way to avoid permuting the subscripts which we have not taken advantage of. We could have demanded that in the two expressions for m we have |p1| ≤ |p2| ≤ · · · ≤ |pj| and |q1| ≤ |q2| ≤ · · · ≤ |qk|. With only a bit more effort, we could have proven that j = k and that for each i with 1 ≤ i ≤ j we have |pi| = |qi|. We did not do this since when this theorem is proven for rings other than Z, it is significantly more difficult to duplicate this extra hypothesis. Exercises (25) 1. This exercise will prove that N is well ordered. In doing so we will assume that for every m ∈ N, there is no element x ∈ N so that m < x < m + 1. This fundamental fact can either be assumed as we do here, or proven from even more fundamental assumptions about N. The proof that N is well ordered will follow when the statement “if n ∈ S ⊆ N, then S has a least element” is proven by induction on n. First check that the statement in quotes holds when n = 0. Then make the inductive assumption that the statement in quotes holds for n = k, and assume that k + 1 ∈ S ⊆ N. The induction will be finished if we can show that S has a least element. If k ∈ S, then S has a least element by the inductive assumption. If k ∉ S, then let S′ = S ∪ {k}. By the inductive assumption S′ has a least element m. Now finish the proof by considering the two cases m = k and m ≠ k. 2. Prove that the set of positive real numbers has no least element. 3. Prove Corollary 2.7.4 by first proving: Let m and d be in Z with d < 0. Then there are unique elements q and r in N so that m = dq + r and 0 ≤ r < −d. Hint: do not go through the whole proof of the division algorithm, just use it. Then combine with the division algorithm to conclude the corollary. 4.
In the proof of the proposition on greatest common divisors, show that B is not empty. This is easier than the corresponding fact in the proof of the division algorithm proposition. 5. In the proof of the proposition on greatest common divisors, fill in the details on why h|g. 6. Prove Lemma 2.7.8. 7. Prove Lemma 2.7.11. 2.8 The fields Zp We return to the task of finding examples of fields. All the examples of fields that we have seen so far have had infinitely many elements. These examples will have only finitely many elements. This finishes the outline given in Section 2.7.1 and is the only step in the “other” branch that comes after greatest common divisors. As mentioned before, the exercises in Exercise Set (23) hint that the rings Zp might actually be fields if p is a prime. Here we verify that. The verification uses what we know about greatest common divisors. Proposition 2.8.1 Let p ∈ Z be a prime. Then Zp is a field. Proof. From Proposition 2.5.2, we know that Zp is a commutative ring with 1. A check shows that every requirement of a field is met except for the existence of multiplicative inverses. So we must show that every [i]p ∈ Zp with [i]p ≠ [0]p has some [j]p ∈ Zp so that [i]p[j]p = [1]p. Since [i]p ≠ [0]p, we know that p does not divide i − 0 = i so (p, i) = 1 and there are integers s and t so that ps + it = 1. From this we have it = 1 − ps so [i]p[t]p = [it]p = [1 − ps]p = [1]p since 1 and 1 − ps differ by a multiple of p. Thus [t]p is the inverse that we sought for [i]p. We see that there is a class of finite fields. Just as with groups, we use the word order to refer to the number of elements. We now know that for each prime p, there is a field of order p. It turns out that finite fields are well understood. We will eventually see that there is a field of order n if and only if n = p^k for some prime p and positive integer k. The exercise below is large.
This comment is here to assure you that the exercise is stated correctly and means what it says. Exercises (26) 1. Learn the outline given in Section 2.7.1. Also learn the details of the proofs of the individual steps in the outline starting with Step 2. Step 1 is important from a logical point of view in that all that comes after is based on Step 1. However, Step 1 is the least algebraic and knowing the details of the proof of the main result, that N is well ordered, is not crucial. The propositions and theorems to learn are six in number: 2.7.3, 2.7.6, 2.7.7, 2.7.9, 2.7.12, and 2.8.1. 2.9 Homomorphisms Homomorphisms were discussed very briefly in Section 2.1.4. The discussion centered around consequences of permuting roots of a polynomial. It was felt that if the roots moved, then other numbers should move as well. If r1 moved to r2, and r2 moved to r1 (i.e., f(r1) = r2 and f(r2) = r1), then r1 + r2 should move to itself and r1 − r2 should move to the negative of itself. That is, f(r1 + r2) = f(r1) + f(r2) = r2 + r1 and f(r1 − r2) = f(r1) − f(r2) = r2 − r1. The logic of these consequences is the basis for the idea of a homomorphism. If there are operations around, then a homomorphism should cooperate well with the operations. Operations we have encountered have been addition, multiplication, negation and inversion. We leave out the taking of roots since they involve multiple answers. Not all objects have all possible operations. If a function f : G → H is to be a homomorphism between groups (with operation written multiplicatively), then there are as many as three operations to discuss: multiplication (of arity 2), inversion (of arity 1), and the constant identity (of arity 0). If the identity is 1, then we should require f(ab) = f(a)f(b), f(a−1) = (f(a))−1, f(1) = 1.
Notice that all operations used to the left of the equal signs are taking place in G and all operations used to the right of the equal signs are taking place in H. If f : R → S is to be a homomorphism between rings with 1, then we would want to require f(a + b) = f(a) + f(b), f(−a) = −f(a), f(ab) = f(a)f(b), f(0) = 0, f(1) = 1. If we are working with fields where inversion is present, we would add the requirement that f(a−1) = (f(a))−1. It turns out that not all of these requirements need be stated. Some follow from others. We will see the details in the next chapter where we look at homomorphisms in more detail. For now we will just look at examples. The linear transformations of linear algebra should be kept in mind. The operations relevant to vector spaces are addition, negation and multiplication by scalars. A linear transformation T is required to cooperate with these operations and this requirement is summarized by the requirement T(ru + sv) = rT(u) + sT(v) for vectors u and v, and scalars r and s. The requirement T(0) = 0 need not be stated since it is a provable consequence of the requirement above. 2.9.1 Complex conjugation The properties of complex conjugation listed in (1.32) are most of what we need to argue that complex conjugation is a homomorphism of fields from C to itself. The symbol for complex conjugation does not make it look like a function. However it is a function and we temporarily give it another symbol to make it look like one. We define f(z) to be the conjugate of z. The equality from (1.32) saying that the conjugate of z + w is the sum of the conjugates of z and w translates into f(z + w) = f(z) + f(w) and says that complex conjugation cooperates in the proper manner with respect to addition. Three other properties listed in (1.32) translate into f(−z) = −f(z), f(zw) = f(z)f(w) and f(z−1) = (f(z))−1 and show complex conjugation cooperates with negation, multiplication and inversion. All we need to add is that the conjugates of 1 and 0 are 1 and 0 themselves to give f(1) = 1 and f(0) = 0.
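These identities can be spot-checked numerically with Python's built-in complex type (a sanity check only; the general proofs are the ones in (1.32)):

```python
def f(z):
    """Complex conjugation written as a function, as in the text."""
    return z.conjugate()

z, w = 3 + 4j, 1 - 2j
assert f(z + w) == f(z) + f(w)            # cooperates with addition
assert f(z * w) == f(z) * f(w)            # cooperates with multiplication
assert f(-z) == -f(z)                     # cooperates with negation
assert abs(f(1 / z) - 1 / f(z)) < 1e-12   # cooperates with inversion (up to rounding)
assert f(1) == 1 and f(0) == 0            # fixes 1 and 0
```

The inversion check uses a tolerance because complex division is done in floating point; the other checks happen to be exact for these small integer inputs.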
Thus complex conjugation is a homomorphism. 2.9.2 The projection from Z to Zk For k > 0 in Z, we have the ring Zk. There is a natural function π (later discussion will call it a projection) from Z to Zk defined by π(i) = [i]k. That is, i is carried by π to the equivalence class containing i. The definitions [x]k + [y]k = [x + y]k and −[x]k = [−x]k from (2.6) and (2.7) make π a homomorphism from the group (Z, +) to the group (Zk, +). If we also bring in the multiplication [x]k[y]k = [xy]k discussed in Lemma 2.5.1, then we get that π is also a homomorphism from the ring (Z, +, ·) to the ring (Zk, +, ·). In the next chapter, we will define restricted classes of homomorphisms. Exercises (27) 1. This uses the field Q[√2] built in Section 2.6.2. Define f : Q[√2] → Q[√2] by f(r + s√2) = r − s√2. Show that f is a homomorphism. This is very much like showing the facts in (1.32), but with some minor differences. Chapter 3 Theories 3.1 Introduction What is a theory? A theory in a subject is a collection of arguments, conclusions and tools that answer questions in that subject. Often the arguments, conclusions and tools are based on a small collection of assumptions, axioms or basic principles that are considered the starting points of the theory. In this course, we will look at several theories. Some of the theories are connected to the new objects of study. Thus there is a theory of groups, a theory of rings, a theory of fields. (The names are usually given in a shorter form: group theory, ring theory, field theory.) The starting points of these theories are the definitions of the objects themselves. Other theories are harder to assign starting points to. The theory of Galois (or Galois theory) ties together aspects of group theory and field theory. It was invented to answer questions about roots of polynomials, but its applications have spread well beyond such questions. Galois theory is not usually discussed with a starting set of axioms or basic assumptions.
Axiomatic systems An axiomatic system is a theory with a stateable collection of assumptions or definitions that form a starting point for the theory. It is a point of pride in mathematics that there are many such axiomatic systems, and that it can be claimed that all of mathematics itself can be laid out as an axiomatic system whose base is a combination of logic and set theory. This chapter will take a very quick and introductory look at the axiomatic systems consisting of group theory, ring theory and field theory. This is done for several reasons. One reason is simply to introduce and emphasize axiomatic systems. Not every mathematical subject is developed this way (some subjects are loose gatherings of topics that are related enough to form an area of study), but many are. Even if a subject does not have an axiomatic development, it often lives in a larger subject that does. Another reason is that other than the axiomatic system for geometry developed by the Greeks, axiomatic systems did not form a major part of mathematical development until the 1800s. Thus the rise of axiomatic systems more or less coincides with the rise of what we call modern mathematics. A third reason is that the theories we consider here (groups, rings and fields) have enough of an overlap at the beginning that there is some savings in considering all three at the same time. Caveats There are some cautions that need to be stated. Not all mathematicians are in love with the axiomatic approach to mathematical subjects. It is felt that overemphasis on axiomatics leads to “axiomatic tinkering” that is removed from the true beauty of mathematics and from its relationship to the real world. Even if this criticism is accepted as valid, the fear that “axiomatic tinkering” will come to dominate mathematics does not take into account human variability. Hundreds of years of experience have shown that no one approach to mathematics is likely to crowd out other approaches.
It must also be said that there is no date for the start of modern mathematics. Nor is there a definition that separates modern mathematics from what came before. Mathematical development is much too gradual and continuous. Even such major events as Galois’ solution to the solvability of polynomials followed a period of development with contributions by many individuals over a long period of time. Beginnings The starting points of any axiomatic system (e.g., group theory, ring theory, field theory) are the definitions. After the definitions are recorded, there is nothing known about the theory except the definitions. Thus any result that comes immediately after the definitions must follow from the definitions and nothing else. Once results start to accumulate in a theory, then further results can make use of them. Ultimately however, all results derive from the definitions, either directly or through a chain of other results that also derive from the definitions. In this chapter, we will repeat the definitions for clarity, and then give some of the earliest results in each theory. We do this to emphasize and exploit some of the similarities between the theories, but also to exhibit some of the differences. 3.2 Groups 3.2.1 The definition In Section 2.3.1 we gave two equivalent definitions of a group. Below we use only the “smaller” definition—the one that starts with the least structure. A group is a pair (G, ·) where G is a set and · is a binary operation (function from G × G to G) usually referred to as the multiplication. We will omit the notation for the multiplication and write the product of a and b as ab. The following requirements must be met. 1. For all a, b and c in G, we have a(bc) = (ab)c. 2. There is an element 1 ∈ G so that for all a ∈ G, we have 1a = a1 = a. 3. For each a ∈ G there is an element a−1 ∈ G so that aa−1 = a−1a = 1.
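The three requirements can be checked mechanically for a small example such as (Z5, +), with the operation computed mod 5 (a toy verification by brute force, not part of the development):

```python
G = range(5)                 # the elements of Z_5, written without brackets
def op(a, b):
    return (a + b) % 5       # addition mod 5

# 1. associativity
assert all(op(a, op(b, c)) == op(op(a, b), c) for a in G for b in G for c in G)
# 2. an identity element (here 0)
assert all(op(0, a) == a == op(a, 0) for a in G)
# 3. every element has an inverse (here -a mod 5)
assert all(op(a, (-a) % 5) == 0 == op((-a) % 5, a) for a in G)
```

Brute force works here only because the group is finite and tiny; the point of the theory is to reason about groups where no such enumeration is possible.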
Abelian groups One class of groups is so important that it is worth defining them immediately after giving the definition of a group. If a and b are in a group G, we say that a commutes with b if ab = ba. If every pair of elements in a group commutes, then we say that the group is commutative or that the group is abelian. It is sometimes said that the multiplication is commutative, but I have never heard the word “abelian” used that way. The word “abelian” always seems to be used to refer to the group. As mentioned in Section 2.3.3, it is common but not required to use + for the operation and 0 for the identity in an abelian group. 3.2.2 First results Uniqueness There are many results that follow immediately from the existence of inverses. The following lemma is what most are based on. Lemma 3.2.1 If a and b are two elements in a group G, then there exists a unique element x ∈ G that satisfies ax = b. Proof. To show something exists, one only has to exhibit it. The element x = a−1b satisfies ax = a(a−1b) = (aa−1)b = 1b = b. Of course we skipped how we found this element. One finds this value of x by writing ax = b and multiplying both sides on the left by a−1 to get a−1(ax) = a−1b. One has to say “both sides on the left” instead of just “both sides” since the multiplication in a group is not always commutative. Knowing p = q does not mean pr = rq, and in fact pr = rq might be false. Now that a−1(ax) = a−1b is known, the left side simplifies to (a−1a)x = 1x = x. Thus we get x = a−1b. The point is that the calculation that we just did to get x = a−1b does not prove that ax = b if x = a−1b. That was proven in the first paragraph of the proof. However, the calculation that derives x = a−1b does have merit. It shows the uniqueness of x. We proved that if ax = b, then x = a−1b must be true. Thus this is the only value of x that makes ax = b true.
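Both halves of the lemma, existence and uniqueness, can be confirmed exhaustively in a group small enough to enumerate, such as the nonzero residues mod 7 under multiplication mod 7 (an illustration only; the names `op` and `inv` are ours):

```python
# The group: nonzero residues mod 7 under multiplication mod 7.
G = [1, 2, 3, 4, 5, 6]
def op(a, b):
    return (a * b) % 7
def inv(a):
    return next(x for x in G if op(a, x) == 1)   # inverses exist in a group

# For every a and b, x = a^{-1} b is the one and only solution of ax = b.
for a in G:
    for b in G:
        assert [x for x in G if op(a, x) == b] == [op(inv(a), b)]
```

The inner assertion checks that the full list of solutions of ax = b has exactly one entry, and that this entry is the a−1b produced in the proof.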
Lemma 3.2.1 is important, but equally important are the ideas of the proof—that inverses always exist in a group and that inverses cancel. The lemma will be quoted often in the rest of these notes, but equally often the techniques of “multiply both sides on the right by the inverse” or “multiply both sides on the left by the inverse” will be invoked without bothering to refer back to the lemma. You should get used to looking for opportunities to use the techniques wherever needed. We get two important corollaries of Lemma 3.2.1. Corollary 3.2.2 In a group G with identity 1, if ax = a, then x = 1. Proof. Apply Lemma 3.2.1 with b = a noting that a1 = a. Corollary 3.2.3 In a group G with identity 1, if ax = 1, then x = a−1. Proof. Apply Lemma 3.2.1 with b = 1 noting that aa−1 = 1. Corollary 3.2.2 is usually interpreted as saying that there is only one element in a group that acts as the identity. Corollary 3.2.3 is usually interpreted as saying that for each element a in a group, there is only one element that acts as the inverse of a. These observations add information to the “larger” definition of a group. If a group is defined as a set with three operations (G, ·, −1, 1) with · binary, −1 unary and 1 a constant satisfying the usual requirements, then the two corollaries above say that the operation · completely determines the other two. There is only one way to assign inverses and there is only one element that can serve as the identity. Corollary 3.2.4 If 1 is the identity of a group G, then 1−1 = 1. Proof. Use 1 · 1 = 1 in Corollary 3.2.3. Corollary 3.2.5 In a group G, for every a ∈ G, we have (a−1)−1 = a. Proof. If 1 is the identity of G, then for a in G, we have a−1a = 1. Use Corollary 3.2.3 with x = a. The last proof could have been done with different letters to make the use of Corollary 3.2.3 more transparent. If the proof bothers you, change the letters. It is important that you become comfortable with the proof.
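The uniqueness statements can be verified exhaustively in a small group, say the nonzero residues mod 7 under multiplication (a sketch; the general statements of course follow from the proofs above, not from any finite check):

```python
G = [1, 2, 3, 4, 5, 6]          # nonzero residues mod 7 under multiplication
def op(a, b):
    return (a * b) % 7
def inv(a):
    return next(x for x in G if op(a, x) == 1)

# Corollary 3.2.2: exactly one element acts as the identity.
assert [e for e in G if all(op(e, a) == a == op(a, e) for a in G)] == [1]
# Corollary 3.2.3: each a has exactly one inverse.
for a in G:
    assert len([x for x in G if op(a, x) == 1]) == 1
# Corollary 3.2.5: inverting twice returns a.
for a in G:
    assert inv(inv(a)) == a
```
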
Corollary 3.2.6 In a group G, for every a and b, we have (ab)−1 = b−1a−1. The proof is left as an exercise. Corollary 3.2.7 If a1, a2, . . . , an are in a group G, then we have (a1 a2 · · · an)−1 = (an)−1(an−1)−1 · · · (a2)−1(a1)−1. The proof is left as an exercise. 3.2.3 Subgroups If G is a group, then a subset S of G is called a subgroup if using the operation of G on the elements of S makes S a group. If any element of S acts as the identity element on S, then it will act as the identity on at least one element of G, and by Corollary 3.2.2 it must be the identity of G. Similarly, Corollary 3.2.3 says that each element of S must have its inverse from G also in S. Lastly, if a and b are in S, then certainly ab (as computed in G) must be in S. Once these requirements are satisfied, then the associative law will be satisfied (since it holds in G), and S will be a group. For example, if (Z, +) is the group, then the even integers form a subgroup. Recall that 0 is the identity in (Z, +). The following lemma is given as a more efficient way to check that a subset of a group forms a subgroup. Lemma 3.2.8 If S is a non-empty subset of a group G, then S forms a subgroup if for every a and b in S the element a−1b is also in S. The proof is left as an exercise. The mention of a−1 in the lemma, and the requirement that the subset be non-empty, often has the consequence that using the lemma is sometimes no more efficient than showing that a subset is a subgroup directly from the definitions. However, there are situations where using the lemma is easier, so it is worth stating and proving. Intersections The following is a template for many almost identical lemmas. Lemma 3.2.9 Let G be a group and let C be a collection of subgroups of G. Then the intersection of all the subgroups in C is a subgroup of G. Proof. Let A be the intersection of all the subgroups in C. We will use Lemma 3.2.8. It is just barely more efficient to do so than to not do so.
We note that A is not empty since 1 must be an element of every subgroup in the collection C, and thus also an element of A. Now let a and b be in A. By the definition of intersection, we must have that a is in every subgroup in the collection C. Similarly, b is in every subgroup in the collection C. So a−1 is in every subgroup in the collection C as is a−1b. But this puts a−1b in the intersection A. By Lemma 3.2.8, A is a subgroup of G. Note that the collection C is not necessarily the collection of all subgroups of G. In fact, C might have very few subgroups in it. Obvious subgroups If G is a group with identity 1, then {1} is a subgroup of G. As trivial as it is, it satisfies all of the requirements of a subgroup. As trivial as it is, it is given the name trivial subgroup or simply identity subgroup. For any group G, the group G itself is also a subgroup of G. There is less agreement on a name for this group, but it can be referred to as the full subgroup of G or the whole group. The phrase proper subgroup of G in some texts refers to a subgroup that is not the whole group, and in others refers to a subgroup that is not the whole group and is not the trivial subgroup. We will not rely on one word to do all the work and if we want to refer to a subgroup of G that is not G and not {1}, then we will call it a proper, non-trivial subgroup. As uninteresting as the whole group and the trivial subgroup might seem, they are still subgroups and must be included in any list of all the subgroups of a group. Generating subgroups We give another template for many almost identical lemmas. If C is a collection of subgroups in a group G, then a subgroup H in the collection C is said to be the “smallest” subgroup in the collection C if for every subgroup K in the collection C we have H ⊆ K. In words, H is one of the subgroups in C and it is contained in every subgroup in C.
Note that a collection C of subgroups of G cannot have two different smallest subgroups. If A and B are both the smallest subgroups in the collection C, then A ⊆ B and B ⊆ A would both be true, and we would have A = B. In the next lemma, note the use of the word “subset” as opposed to “subgroup” in the statement. Lemma 3.2.10 Let S be a subset of a group G. Then in the collection of all subgroups of G that contain S there is a smallest subgroup. Proof. Let C be the collection of all the subgroups of G that contain S. This is not an empty collection since G is in C. Let A be the intersection of all the subgroups in the collection C. By Lemma 3.2.9, we know that A is a group. Since S is in every subgroup in the collection C, it is in the intersection A. So S ⊆ A. This makes A one of the subgroups in C. To show that A is the smallest subgroup in C, we let B be another subgroup in the collection C. That is, B is one of the subgroups we are intersecting to create A. But every element of A must be in every one of the subgroups being intersected. In particular, every element of A must be in B. This makes A ⊆ B, which is the last item needed to be proven. If G is a group and S is a subset of G, then the smallest subgroup of G that contains S (whose existence is guaranteed by Lemma 3.2.10) is called the subgroup of G generated by S. It is denoted ⟨S⟩. From Lemma 3.2.10 we know that ⟨S⟩ exists, but we have no idea yet what is in it. This will be discussed later. We will become fascinated with subgroups when we get deeper into group theory in later chapters. 3.2.4 Homomorphisms We gave an informal definition of a homomorphism in Section 2.9. Before we give a formal definition, we give a lemma that motivates the smallness of the definition that we give. Lemma 3.2.11 Let f : G → H be a function from the group G (with identity 1G) to the group H (with identity 1H).
If for all a and b in G we have f(ab) = f(a)f(b), then f(1G) = 1H and for all a ∈ G we have f(a−1) = (f(a))−1. Proof. Since 1G1G = 1G, we have f(1G)f(1G) = f(1G) and Corollary 3.2.2 says that f(1G) = 1H. Since aa−1 = 1G, we have f(a)f(a−1) = f(1G) = 1H and Corollary 3.2.3 says that f(a−1) = (f(a))−1. In Section 2.9 we wanted a homomorphism to be a function that cooperates with all operations on a group. Lemma 3.2.11 says that to get all the cooperation we require, we only need to demand that the function cooperate with the multiplication. This is not a surprise in view of the two corollaries quoted in the proof above which say that the multiplication determines the other two operations. This leads to the following definition. A function f : G → H between groups is said to be a homomorphism if for every a and b in G, we have f(ab) = f(a)f(b). It is worth discussing in more detail what is going on in a homomorphism. If f : G → H is a homomorphism between groups and three elements a, b and c in G are related by having c = ab, then the three corresponding elements f(a), f(b) and f(c) in H must also be related by having f(c) = f(a)f(b). An important fact about homomorphisms is the following. Lemma 3.2.12 If f : G → H and h : H → K are homomorphisms between groups, then hf : G → K is a homomorphism. The proof is left as an exercise. 3.2.5 Subgroups associated to a homomorphism The image of a homomorphism Let h : G → H be a homomorphism between groups. The image h(G) of the homomorphism is simply the image of the function. It is a subset of H. We have the following. Lemma 3.2.13 Let h : G → H be a homomorphism between groups. Then the image h(G) of the homomorphism is a subgroup of H. Proof. Let 1G and 1H be, respectively, the identity of G and the identity of H. Since 1H = h(1G) is in h(G), we know that h(G) is not empty. If a and b are in h(G), there are x and y in G so that h(x) = a and h(y) = b.
Now a−1b = (h(x))−1h(y) = h(x−1)h(y) = h(x−1y) and since x−1y is in G, we have a−1b is in h(G). By Lemma 3.2.8, h(G) is a subgroup of H. The kernel of a homomorphism Let h : G → H be a homomorphism between groups with 1H the identity of H. The kernel of h, denoted Ker(h), is defined by Ker(h) = {x ∈ G | h(x) = 1H}. In words, the kernel of h is all the elements of G that map to the identity in H. It is a subset of G. Lemma 3.2.14 Let h : G → H be a homomorphism between groups. Then Ker(h) is a subgroup of G. Proof. Let K = Ker(h). We have h(1G) = 1H so 1G is in K. For a ∈ K we have h(a−1) = (h(a))−1 = (1H)−1 = 1H, so a−1 is in K. For a and b in K, we have h(ab) = h(a)h(b) = 1H1H = 1H. A proof using Lemma 3.2.8 would be no shorter. The proof of Lemma 3.2.14 uses three properties of identities to get the three properties needed for a subgroup. In words, we used the fact that the image of an identity is an identity, the inverse of the identity is the identity, and the product of two identities is an identity. There is another property of the identity element that we have not exploited that is not necessarily shared by other elements. If 1 is the identity of a group G, then for all a ∈ G, we have a1a−1 = 1. If a group is commutative, then aba−1 = b always holds, but without commutativity we cannot guarantee this. The extra property of the identity leads to the following. Lemma 3.2.15 Let h : G → H be a homomorphism between groups. Then for every a ∈ Ker(h) and for every b ∈ G, we have bab−1 ∈ Ker(h). The proof is left as an exercise. The result of Lemma 3.2.15 is turned into a definition. If N is a subgroup of a group G, then we say that N is normal in G and write N ⊳ G if for every a ∈ N and b ∈ G, we have bab−1 ∈ N. Not all subgroups of all groups are normal. This will be seen later in the notes, or in a problem below if you are impatient. The point is that kernels of homomorphisms are somewhat special.
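A concrete homomorphism makes these subgroups visible. In the spirit of Section 2.9.2, take the projection h : Z12 → Z4 given by h([x]12) = [x]4, which is well defined because 4 divides 12 (a worked example of ours, with classes written as their representatives 0 through 11):

```python
G = range(12)                      # Z_12 under addition mod 12
def h(x):
    return x % 4                   # lands in Z_4; well defined since 4 | 12

# image: all of Z_4, so h is onto
assert {h(x) for x in G} == {0, 1, 2, 3}
# kernel: the classes of Z_12 that map to 0
kernel = {x for x in G if h(x) == 0}
assert kernel == {0, 4, 8}
# the kernel is normal; additively, b a b^{-1} reads as b + a - b
assert all((b + a - b) % 12 in kernel for a in kernel for b in G)
```

The normality check is unexciting here because the group is abelian, matching Lemma 3.2.16 below: b + a − b is just a.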
There are parallels to Lemmas 3.2.9 and 3.2.10 for normal subgroups. We will not have need for such lemmas and do not state them separately here. However, we give their statement and proof as an optional exercise. We end this section with a very easy fact. However, it is important enough to state separately as a lemma. Lemma 3.2.16 In an abelian group, every subgroup is normal. The proof is left as an exercise. 3.2.6 Homomorphisms that are one-to-one and onto This section discusses when two groups have “essentially the same structure.” We argue here that this notion is captured by homomorphisms that are one-to-one and onto. First we give a lemma. Lemma 3.2.17 Let h : G → H be a homomorphism between groups that is a one-to-one correspondence. Then the inverse function h−1 : H → G is a homomorphism. Proof. Let x and y be in H. Then a = h−1(x) is the unique element of G for which h(a) = x and b = h−1(y) is the unique element of G for which h(b) = y. Since h is a homomorphism, h(ab) = h(a)h(b) = xy and we have that h−1(xy) = ab = h−1(x)h−1(y). This is what is needed to show that h−1 is a homomorphism. We say that a homomorphism h : G → H between groups is an isomorphism if it is also a one-to-one correspondence. We also say that G and H are isomorphic. Note that saying that two groups are isomorphic gives less information than specifying an isomorphism between them. Saying that two groups are isomorphic says that an isomorphism between them exists, but does not say exactly what that isomorphism might be. We expand on earlier remarks that we made about homomorphisms. If f : G → H is a function and a, b and c are elements of G, then we can consider the corresponding elements f(a), f(b) and f(c) in H. We have noted earlier that if f is a homomorphism, then c = ab implies f(c) = f(a)f(b). Now we can add the remark that if f is an isomorphism, then c = ab if and only if f(c) = f(a)f(b).
It is this equivalence that allows us to claim that an isomorphism between two groups shows that the two groups are “essentially the same.” We can give more specifics. Lemma 3.2.18 Let h : G → H be an isomorphism between groups. Then the following hold. 1. For a and b in G, we have b = a−1 if and only if h(b) = (h(a))−1. 2. For a in G, the order of a (as defined near the end of Section 2.4.3) in G equals the order of h(a) in H. 3. For a subset S of G, we have that S is a subgroup of G if and only if h(S) is a subgroup of H. The proof is left as an exercise. 3.2.7 The group of automorphisms of a group If G is a group, then i : G → G where i(x) = x for every x ∈ G is clearly a homomorphism. Since it is also one-to-one and onto, it is an isomorphism. This particular homomorphism/isomorphism is called the identity isomorphism. There can be more isomorphisms from a group to itself. Consider the group Z5 under addition. We will build an isomorphism h from Z5 to itself that is not the identity. The elements of Z5 (written without the brackets) are 0, 1, 2, 3, and 4. Since 0 is the identity element, we must have h(0) = 0. If we next consider 1, we can try to have h(1) something other than 1. Let us try h(1) = 2. Now 2 = 1 + 1, so we must have h(2) = h(1 + 1) = h(1) + h(1) = 2 + 2 = 4. With 3 = 2 + 1, we get h(3) = h(2 + 1) = h(2) + h(1) = 4 + 2 = 1. Lastly, h(4) = h(3 + 1) = h(3) + h(1) = 1 + 2 = 3. We have defined a function h : Z5 → Z5 that can be represented by the following permutation on {0, 1, 2, 3, 4}:

0 1 2 3 4
0 2 4 1 3

Since h is one-to-one and onto, it will be an isomorphism from Z5 to Z5 if it is a homomorphism. We give the following argument that h is a homomorphism. If a and b are two elements of Z5, then a is the sum of a ones and b is the sum of b ones. That means that a + b is the sum of a + b ones. Now h(a) is the sum of a twos, h(b) is the sum of b twos and h(a + b) is the sum of a + b twos. But h(a) + h(b) is also the sum of a + b twos.
The argument above makes h a homomorphism and thus an isomorphism. We call an isomorphism from a group to itself an automorphism. The example shows that there can be automorphisms of a group other than the identity. We get structure from the set of all automorphisms of a group. The set of all automorphisms of a group G is denoted Aut(G). Note that since each automorphism of G is a one-to-one correspondence from G to G, it is also a permutation of G. Thus Aut(G) is a subset of SG, the group of all permutations of G. It turns out that Aut(G) is actually a subgroup of SG. Lemma 3.2.19 Let G be a group and let f and h be in Aut(G). Then f h and f⁻¹ are in Aut(G). It follows that Aut(G) is a group with composition as the group operation. Proof. From Lemma 2.2.5, we know that f h is a one-to-one correspondence. From Lemma 3.2.12, we know that f h is a homomorphism. Thus it is an isomorphism and since it goes from G to itself, it is an automorphism of G. From Lemma 2.2.8, we know that f⁻¹ is a one-to-one correspondence. From Lemma 3.2.17, we know that f⁻¹ is a homomorphism. Thus it is an automorphism of G. We already know that the identity from G to itself is an automorphism. This gives all the facts we need to claim that Aut(G) is a subgroup of SG. The group Aut(G) is called the automorphism group of G. The importance of automorphisms Automorphism groups will be very important to us in our study. We will not be too concerned with automorphism groups of groups. But we will be very concerned with other automorphism groups, and automorphism groups of groups are a good place to start. Automorphisms of rings and fields will also be defined. We will spend most of our time looking at automorphisms of fields. Automorphisms of an object such as a group, ring or field can be thought of as symmetries of that object. The full automorphism group can be thought of as the full group of symmetries of the object.
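Lemma 3.2.19 can be illustrated on the automorphism h of Z5 built earlier, where h(x) = 2x mod 5. The sketch below (Python, an illustration and not part of the notes) checks that the composite h∘h and the inverse function of h are again automorphisms:

```python
from itertools import product

n = 5
h = {x: (2 * x) % n for x in range(n)}   # the automorphism of (Z5, +) from the text
hh = {x: h[h[x]] for x in range(n)}      # the composite h followed by h
hinv = {v: k for k, v in h.items()}      # the inverse function of h

def is_automorphism(f):
    # one-to-one and onto, and f(a + b) = f(a) + f(b) for all a, b in Z5
    bijective = sorted(f.values()) == list(range(n))
    homomorphism = all(f[(a + b) % n] == (f[a] + f[b]) % n
                       for a, b in product(range(n), repeat=2))
    return bijective and homomorphism

assert is_automorphism(h)
assert is_automorphism(hh)    # composites stay in Aut(Z5) ...
assert is_automorphism(hinv)  # ... and so do inverses, as Lemma 3.2.19 asserts
```

In this example h∘h is multiplication by 4 and the inverse of h is multiplication by 3, since 2 · 3 = 6 ≡ 1 mod 5.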
One of the themes of mathematics is that symmetries of an object reveal properties of the object. While that makes an attractive sentence, it takes a lot of work to back it up with examples. In order to derive information from symmetries, you have to know what the symmetries are. In order to know what the symmetries of an object (such as a group, ring or field) are, you have to know some structure of the object. Thus we do not start with symmetries, but instead start with a preliminary study of the structure of the object. That preliminary information is used to say something about the symmetries, and then finally new information about the object can be extracted from the symmetries. This summarizes Galois theory in a very general way. At this point we leave groups and take up similar considerations for both rings and fields. There will be similarities that we will exploit. This will usually take the form of having you do proofs that are similar to proofs given above. We will also point out certain differences. Exercises (28) 1. Prove Lemma 3.2.6. The hint is to absorb the idea of the proof of the previous two corollaries. 2. Give an inductive proof of Lemma 3.2.7. 3. Prove Lemma 3.2.8. Why is it necessary to assume that S is not empty in the hypothesis? 4. Prove Lemma 3.2.9 without using Lemma 3.2.8 and compare the length of your proof to the proof given. 5. We have mentioned that the even numbers E form a subgroup of (Z, +). Prove that T, the set of multiples of 3, forms a subgroup of (Z, +). What is E ∩ T? 6. Consider the group (Z, +). Let f : Z → Z be the doubling map. That is, f(n) = 2n. Show that f is a homomorphism. Show that g : Z → Z defined by g(n) = n + 1 is not a homomorphism. 7. Prove Lemma 3.2.12. 8. If h : G → H is a homomorphism between groups, and S is a subgroup of G, then prove that h(S) is a subgroup of H. 9. Prove Lemma 3.2.15. 10. Consider the group (Z, +), k ≠ 0 in Z and the group Zk.
Recall the homomorphism π : Z → Zk. What is the kernel of π? 11. (optional) State and prove lemmas that serve as parallels to Lemmas 3.2.9 and 3.2.10 for normal subgroups. The main purpose of this exercise is to review the proofs of Lemmas 3.2.9 and 3.2.10 and to exhibit their flexibility. 12. This produces a subgroup that is not normal. Consider the elements σ and τ of S4 as given in the first problem of Exercise Set (19). Show that A = {1, τ} forms a two element subgroup of S4. Argue that your calculation of στσ⁻¹ shows that A is not a normal subgroup of S4. This assumes that your calculation of στσ⁻¹ is correct. 13. Prove Lemma 3.2.16. 14. Prove Lemma 3.2.18. 15. There are four elements in Aut(Z5) where Z5 is shorthand for the group (Z5, +). Find them and write them out as permutations. Write out the multiplication table of Aut(Z5). Find an isomorphism from Aut(Z5) to the group Z4. 16. There are four elements in Aut(Z8) where Z8 is shorthand for the group (Z8, +). Find them and write them out as permutations. Write out the multiplication table of Aut(Z8). In spite of the fact that Aut(Z8) and the group Z4 both have four elements, prove that Aut(Z8) and Z4 cannot be isomorphic. 17. Axiomatic tinkering. There are other proofs than the one we give for the uniqueness of the identity. Our proof of Corollary 3.2.2 ultimately depends on the existence of inverses, and we will later need a proof that does not depend on inverses. See if you can find such a proof. It also turns out that less than the full identity axiom is needed. Instead of assuming there is a "two sided identity," namely an element 1 so that 1a = a1 = a for all a, one could assume that there is a "right identity" r so that ar = a for all a, or one could assume that there is a "left identity" l so that la = a for all a. Prove that if there is both a right identity and a left identity, then they are equal and are therefore a two sided identity.
Conclude that once there is at least one left identity and at least one right identity, then there is a two sided identity and all identities are the same. 3.3 Rings If you take an abelian group (with operation + and identity 0) and add a new operation called multiplication, then with two extra laws you get a ring. But nothing else is demanded. There need not be a multiplicative identity, and there need not be inverses. We will say much less about rings than we did about groups. This bias will continue throughout these notes. In fact the study of rings forms a very large subject, but you will have to take other courses to learn more about them. 3.3.1 The definition We repeat the definition from Section 2.5.1. A ring is a triple (R, +, ·) where + and · are binary operations (functions from R × R to R). The operation + is usually referred to as the addition and the operation · is usually referred to as the multiplication. We will omit the notation for the multiplication and write the product of a and b as ab. The following additional requirements must be met. 1. The pair (R, +) forms an abelian group. 2. For all a, b and c in R, we have a(bc) = (ab)c. 3. For all a, b and c in R, we have a(b + c) = ab + ac and (a + b)c = ac + bc. The additive identity is usually written as 0, and the additive inverse of a ∈ R is usually written as −a. Elements of R are not forbidden to have multiplicative inverses. However, before there can be multiplicative inverses, there must be a multiplicative identity. A multiplicative identity is not required to exist. If no multiplicative identity exists, then there are no multiplicative inverses. For a positive integer n, the strictly upper triangular n × n matrices form a ring with no multiplicative identity. As mentioned before, a ring with unit, or ring with 1, or ring with identity is a ring with a multiplicative identity. That is, there is an element 1 in the ring so that for all a in the ring, we have 1a = a1 = a.
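The claim that the strictly upper triangular matrices form a ring with no multiplicative identity can be made concrete in the 3 × 3 case. The sketch below (Python, an illustration only and not part of the notes) verifies closure under multiplication and shows why no element of this ring can act as an identity:

```python
# Strictly upper triangular 3x3 integer matrices: closed under multiplication,
# but no element of the set acts as a multiplicative identity.
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def strict_upper(a, b, c):
    # the matrix [[0, a, b], [0, 0, c], [0, 0, 0]]
    return [[0, a, b], [0, 0, c], [0, 0, 0]]

def is_strict_upper(A):
    return all(A[i][j] == 0 for i in range(3) for j in range(3) if j <= i)

A, B = strict_upper(1, 2, 3), strict_upper(4, 5, 6)
assert is_strict_upper(mul(A, B))   # products stay in the set

# A product of two strictly upper triangular matrices has zeros on the first
# superdiagonal as well, so E*A can never recover A's entry a = 1.  We confirm
# this on a finite sample of candidates E (the general argument is the comment).
for a in range(-2, 3):
    for b in range(-2, 3):
        for c in range(-2, 3):
            E = strict_upper(a, b, c)
            assert mul(E, A) != A
```

The loop over E is only a finite sample, but the comment records the general reason: multiplying two strictly upper triangular matrices pushes the nonzero entries further from the diagonal.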
A commutative ring or abelian ring is a ring where the multiplication is commutative. That is, ab = ba for all a and b in the ring. 3.3.2 First results To make sure we understand what we are allowed to do, the proof of the next lemma will be written out carefully. Lemma 3.3.1 In a ring R there is only one element that can act as the additive identity. Proof. This follows from Corollary 3.2.2 since (R, +) is a group. As you can see the proof is short. This is because of what was proven before. However, (R, ·) is not a group, so the next lemma needs a proof. Lemma 3.3.2 In a ring with identity, there is only one element that can act as the multiplicative identity. The proof is left as an exercise. We gather several facts together in the next lemma. Lemma 3.3.3 Assume that R is a ring with 0 as the additive identity. 1. For every a ∈ R there is only one element that can act as the additive inverse for a. 2. −0 = 0. 3. For every a ∈ R, we have −(−a) = a. The proof is left as an exercise. The next lemmas use all parts of the structure of a ring. They have no counterparts in the theory of groups. Lemma 3.3.4 If R is a ring with 0 as the additive identity, then for every a ∈ R we have 0a = a0 = 0. The proof is left as an exercise. Lemma 3.3.5 If R is a ring, then for every a and b in R, we have (−a)b = a(−b) = −(ab) and (−a)(−b) = ab. The proof is left as an exercise. 3.3.3 Subrings In keeping with our brief treatment of rings, we will have little to say about subrings other than their definition. If R is a ring and S is a subset of R, then S is a subring of R if the two operations on R make S a ring when the operations are restricted to S. In particular, if 0 is the additive identity for R, then 0 must be in S, and if a and b are in S, then all of a + b, −a and ab must be in S. The following are examples of subrings. 1. The even integers in the ring Z. 2.
Upper triangular n × n matrices in the ring of all n × n matrices, as well as the strictly upper triangular n × n matrices in the ring of all matrices. 3. Polynomials P(x) with P(1) = 0 in the ring of all polynomials with real coefficients. The verification that these are subrings is left as an exercise. There are also parallels to Lemmas 3.2.9 and 3.2.10 for subrings. These are left as optional exercises. Once the parallel lemmas are in place, it is possible to define what is meant by a subring generated by a certain set. A word of caution is needed with terminology. If all rings in a discussion are rings with identity, then it is usually assumed that a subring will also have an identity. If this is the case, then the even integers would not form a subring of Z, nor would the strictly upper triangular n × n matrices form a subring of the ring of all n × n matrices. However, the upper triangular n × n matrices would. 3.3.4 Homomorphisms Let f : R → S be a function between rings. If we assume that f(a + b) = f(a) + f(b) for all a and b in R, then we know that f(0) = 0 and f(−a) = −f(a) for all a ∈ R from the fact that the additive structures on R and S form groups. The only new part of the ring structure is the multiplication. So we make the following definition. A function h : R → S between rings is a homomorphism of rings if for all a and b in R, we have h(a + b) = h(a) + h(b) and h(ab) = h(a)h(b). As with groups, we have the following. Lemma 3.3.6 If f : R → S and h : S → T are ring homomorphisms, then hf : R → T is a ring homomorphism. The proof is left as an exercise. 3.3.5 Subrings associated to homomorphisms Images Images of ring homomorphisms behave like images of groups. Lemma 3.3.7 Let h : R → S be a ring homomorphism. Then the image h(R) is a subring of S. The proof is left as an exercise. Kernels The kernel of a ring homomorphism h : R → S is defined as Ker(h) = {a ∈ R | h(a) = 0S}.
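For a concrete kernel, consider the reduction homomorphism from Z onto Z6 (the π of an earlier exercise, with k = 6). The sketch below (Python, an illustration only and not part of the notes) computes the kernel on a window of integers and previews the absorption property that is about to be discussed:

```python
# The kernel of the reduction homomorphism h: Z -> Z6, computed on [-30, 30].
h = lambda n: n % 6
kernel = [k for k in range(-30, 31) if h(k) == 0]
assert kernel == list(range(-30, 31, 6))   # exactly the multiples of 6

# Absorption: r*k is again in the kernel for EVERY integer r, not just for
# r in the kernel.  This is the extra property beyond being a subgroup.
for r in range(-10, 11):
    for k in kernel:
        assert h(r * k) == 0
```

The window [−30, 30] is only a sample of Z, but the pattern is clear: a multiple of 6 times any integer is a multiple of 6.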
In the group setting, the kernel turns out to be a special subgroup. This follows from the special behavior of the identity element in the group. In the ring setting, the kernel also turns out to be a special subring. This follows from the special behavior of the additive identity of the ring. Note that normality of the kernel when looking only at the group structure is not a surprise in the ring setting because the additive group of a ring is abelian and because Lemma 3.2.16 says that all subgroups of abelian groups are normal. Kernels of ring homomorphisms have a property that goes even beyond normality. The extra property follows from the extra property of the additive identity given by Lemma 3.3.4. The statement that the kernel is a special subring implies that it is a subring in the first place. This has problems if all rings in a discussion (including subrings) are assumed to have a multiplicative identity. In a problem below, we will point out that kernels rarely have a multiplicative identity. Lemma 3.3.8 Let h : R → S be a ring homomorphism. Then for every a ∈ R and k ∈ Ker(h), both ak and ka are in Ker(h). The proof is left as an exercise. The result of Lemma 3.3.8 is turned into a definition. If K is a subring of a ring R, we say that K is a (two-sided) ideal of R if for every a ∈ R and k ∈ K, we have that both ak and ka are in K. We say that K absorbs all elements of R on both sides by multiplication. Finding a subring of a ring that is not an ideal is left as an exercise. There are also parallels to Lemmas 3.2.9 and 3.2.10 for ideals as well as the notion of an ideal generated by a certain set. This is left as an optional exercise. 3.3.6 Isomorphisms and automorphisms A function f : R → S between rings is a ring isomorphism if it is a ring homomorphism and a one-to-one correspondence. As with groups, we get the following about ring isomorphisms. Lemma 3.3.9 Let h : R → S be a ring isomorphism. Then h⁻¹ : S → R is also a ring isomorphism.
The proof is left as an exercise. An automorphism of a ring R is a ring isomorphism from R to itself. The set of all automorphisms of a ring R is denoted by Aut(R). As with groups, we have the following. Lemma 3.3.10 If R is a ring, then Aut(R) is a group with composition as the operation. The proof is left as an exercise. It is harder to find automorphisms of rings that are not the identity than it is to find automorphisms of a group that are not the identity. For the group (Z5, +), we found four elements in Aut(Z5). However, (Z5, +, ·) is also a ring. We leave it as an exercise to show that there is only one automorphism of this ring. Essentially, with the extra structure of the multiplication, the ring (Z5, +, ·) has fewer symmetries than the group (Z5, +). In keeping with our brief treatment of rings, we will not supply a non-identity automorphism of a ring. All fields are rings, and we will exhibit a field with a non-identity automorphism. This will supply the missing example. Exercises (29) 1. Prove Lemma 3.3.2. Since multiplicative inverses do not exist, a simple quote of Corollary 3.2.2 is not allowed. If you have done a previous problem, you have already done this. If not, a hint is to assume that there are two multiplicative identities p and q. 2. Prove Lemma 3.3.3. Do not forget previously proven facts. 3. Prove Lemma 3.3.4. This needs some care. The element 0 is an additive identity and thus part of the "additive part" of the ring. The expression 0a involves multiplication. From the definition of a ring to this lemma, there is only one fact that combines addition and multiplication. It is therefore impossible to prove the conclusion without using this fact. Secondly, there are very few facts from the definition to this point that have as a conclusion that something equals 0. It is therefore impossible to prove the conclusion without using a second fact.
Part of this exercise is to hunt down two facts as just described that combine to give a proof of the conclusion. 4. Prove Lemma 3.3.5. Comments similar to those about the proof of Lemma 3.3.4 apply. However, here the focus is not on the conclusion that something is zero, but on the conclusion that something is an additive inverse. 5. Prove that the examples in Section 3.3.3 are in fact subrings of the given rings. 6. (optional) State and prove parallels to Lemmas 3.2.9 and 3.2.10 for subrings. Define what is meant by a subring generated by a certain set. 7. Prove Lemma 3.3.6. 8. Prove Lemma 3.3.7. Note that multiplicative inverses are not present and not relevant. 9. Prove Lemma 3.3.8. The comments in the paragraphs before the statement of the lemma form a hint, but the hint should not be necessary. 10. Find an example of a subring that is not an ideal. 11. (optional) State and prove parallels to Lemmas 3.2.9 and 3.2.10 for ideals. Define what is meant by an ideal generated by a certain set. 12. Show that if R is a ring with 1 and I is an ideal in R with 1 ∈ I, then I = R. This says that most ideals cannot be regarded as subrings if it is assumed that all subrings contain the multiplicative identity. 13. Prove Lemma 3.3.9. 14. Prove Lemma 3.3.10. The only trick here is to remember all that needs to be checked. 15. Let R = (Z5, +, ·) be the ring of integers modulo 5. Show that the only automorphism of R is the identity. 3.4 Fields Fields will occupy us significantly more than rings. 3.4.1 The definition As covered in Section 2.6.1, a field is a commutative ring with 1 so that all non-zero elements have multiplicative inverses. As mentioned in Section 2.6.1, we will also assume 1 ≠ 0. A full set of laws is written out in Section 2.6.1. 3.4.2 First results The following are mostly based on results from groups and rings. The only new item is the last. Lemma 3.4.1 If F is a field, then all of the following hold. 1.
There is only one element in F that acts as the additive identity. 2. There is only one element in F that acts as the multiplicative identity. 3. For each x ∈ F, there is only one element that acts as the additive inverse of x. 4. For each x ∈ F with x ≠ 0, there is only one element that acts as the multiplicative inverse of x. 5. −0 = 0 and 1⁻¹ = 1. 6. For every x ∈ F, we have 0x = 0. 7. For every x and y in F, we have (−x)y = x(−y) = −(xy) and (−x)(−y) = xy. 8. For every x ∈ F with x ≠ 0, we have x⁻¹ ≠ 0. The proof is left as an exercise. The last item in Lemma 3.4.1 allows us to describe a field in another way. With F* = F − {0}, a triple (F, +, ·) is a field if (F, +) is an abelian group with identity 0, if (F*, ·) is an abelian group with identity 1 so that 1 ≠ 0, and if the distributive law holds. Thus a field is two abelian groups with different identities connected by a distributive law. 3.4.3 Field extensions Subfields The definition of subfield is no surprise. A subset F of a field E is a subfield of E if the addition and multiplication operations of E restricted to F make F a field. Extensions There is, however, a difference in attitude. Instead of focusing on the fact that in the definition above, F is a smaller part of E, the focus is usually on the fact that E is gotten by making F larger. Thus if F is a subfield of E, we will also (and more frequently) say that E is an extension of F or an extension field of F. Thus C is an extension of R. Later, we will make sense of the statement that "C is an extension of R by the addition of the element i." We can also say of the example Q[√2] of Section 2.6.2 that it is an extension of Q by the addition of √2. The point of view of extension rather than subfield is supported by the construction of roots of a polynomial. The field Q contains the coefficients of x² + x + 1. However, to build the roots of x² + x + 1, one must pass to a larger field that also contains √−3. That is, one must build an extension of Q by adding √−3 in order to get a field that accommodates the roots. The cubic x³ − 9x + 8 considered in (1.22) also has its coefficients in Q. But to accommodate the roots, two extensions must be formed. First an extension of Q must be formed by the addition of √−11. Then an extension of this field must be formed by the addition of all cube roots of −4 + √−11. (We do not have to add the cube roots of −4 − √−11. If u represents a typical cube root of −4 + √−11, v represents a typical cube root of −4 − √−11, and q is the coefficient −9 of x, then we know that uv = −q/3 = 3. Thus if a field contains u and 3, it must contain 3/u = v, and so if it contains all the cube roots of −4 + √−11 it will also contain all the cube roots of −4 − √−11.) In order to make sense and use out of these considerations, we will have to study the structure of these extensions. This will be done in due time. For now we make the following observation. Dimension and degree Lemma 3.4.2 If E is an extension field of F, then E can be regarded as a vector space with F forming the field of scalars. Proof. The axioms of a vector space are listed in Section 1.4.2, where z, y and x were taken to be in C, but here can be taken from E, and r and s were taken to be from R, but here can be taken from F. The eight requirements listed in Section 1.4.2 are simply special cases of the requirements of a field. They hold since F ⊆ E and they all hold in E. It was observed in Section 1.4.2 that C forms a vector space of dimension 2 over R and that {1, i} forms a basis. Since every element of Q[√2] can be written uniquely as a + b√2 with a and b in Q, we can also say that Q[√2] forms a vector space of dimension 2 over Q and that {1, √2} forms a basis.
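Before moving on, the relation uv = −q/3 = 3 claimed above for the cubic x³ − 9x + 8 can be checked numerically. The sketch below (Python, an illustration and not part of the notes; taking principal cube roots pairs each u with its matching v) also confirms that u + v is a root of the cubic:

```python
import cmath

# Numerical check of uv = -q/3 = 3 for x^3 - 9x + 8, where q = -9 is the
# coefficient of x.
w = cmath.sqrt(-11)            # sqrt(-11), a purely imaginary number
u = (-4 + w) ** (1 / 3)        # a cube root of -4 + sqrt(-11)
v = (-4 - w) ** (1 / 3)        # the matching cube root of -4 - sqrt(-11)

assert abs(u * v - 3) < 1e-9   # uv = 3, so v = 3/u as the text argues

# With this pairing, u + v is a (real) root of x^3 - 9x + 8.
x = u + v
assert abs(x ** 3 - 9 * x + 8) < 1e-9
```

Here |−4 + √−11| = √27, so the principal cube roots have product 27^(1/3) = 3 exactly; the tolerance only absorbs floating point error.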
In general, if E is an extension field of F, then the dimension of E as a vector space over the field of scalars F is called the degree of E over F and is denoted [E : F]. This value will later be seen to be a very important measure of the extension. For those who have done Problem 5 in Exercise Set (24), it should be clear that [Q[∛2] : Q] = 3 and that {1, ∛2, ∛4} forms a basis. Generating extensions The parallels to Lemmas 3.2.9 and 3.2.10 for subrings and ideals were treated in optional exercises since they are not as important to our particular goals. However, we will use heavily the parallels for subfields and extensions. We give them here in the form we will need. Lemma 3.4.3 Let E be a field and let C be a collection of subfields of E. Then the intersection of all the subfields in C is a subfield of E. The proof is left as an exercise. The next lemma is worded slightly differently from Lemma 3.2.10. Rather than building a subfield from scratch, the lemma below builds an extension of a smaller subfield. Lemma 3.4.4 Let F ⊆ E be an extension of fields and let S be a subset of E. Then in the collection of subfields of E that contain both F and S, there is a smallest subfield. The proof is left as an exercise. With F, E and S as in the statement of Lemma 3.4.4, let K be the smallest subfield of E containing F and S as guaranteed by the conclusion. We can refer to K as the extension of F by S in E, but it is more usually referred to as the extension of F obtained by adjoining S in E. Of course, any elements of S that are already in F will have no effect on the outcome. A great deal of the time S will have only one element. The notation for the extension of F obtained by adjoining a set of elements S is F(S), and if S is a finite set {a1, a2, . . . , an}, then the extension is denoted F(a1, a2, . . . , an). Note that we can add elements one at a time or all at the same time.
This raises the question as to the importance of order and grouping. It makes no difference, and we leave the proof of the next lemma as an exercise. Note that the expression on the left represents adding the elements one at a time, and the expression on the right represents adding them all at once. Lemma 3.4.5 Let F ⊆ E be an extension of fields, and let {a1, a2, . . . , an} be a subset of E. Then F(a1)(a2) · · · (an) = F(a1, a2, . . . , an). Much of the work in studying roots of polynomials will involve studying the structure of extensions of the type shown in Lemma 3.4.5. 3.4.4 Homomorphisms and isomorphisms Homomorphisms of fields are more restricted than homomorphisms of groups and rings. The restrictions come from the large amount of structure carried by a field. These restrictions make sensible the grouping of homomorphisms and isomorphisms in a single topic. A function f : F → K between fields is a homomorphism of fields if for every a and b in F, we have both f(a + b) = f(a) + f(b) and f(ab) = f(a)f(b). Thus the definition has the same appearance as the definition of a ring homomorphism. A field homomorphism that is a one-to-one correspondence is called an isomorphism, and fields that have an isomorphism between them are said to be isomorphic. A discussion of image and kernel will not take place that parallels the discussion for groups and rings. The next lemma is the reason. Lemma 3.4.6 Let h : F → K be a homomorphism of fields. Then either h(F) = {0} or h is one-to-one. Proof. We will assume h is not one-to-one. So there are a and b in F with a ≠ b and h(a) = h(b). This means that a − b ≠ 0 and h(a − b) = h(a) − h(b) = 0. Thus we have found a non-zero element c = a − b in the "kernel" of h. Since c ≠ 0 there is a multiplicative inverse c⁻¹ for c. We now use the expression cc⁻¹ for 1 with great effect. Let x ∈ F. Then x = 1x = cc⁻¹x.
So h(x) = h(cc⁻¹x) = h(c)h(c⁻¹x) = 0h(c⁻¹x) = 0, and the proof is complete. Lemma 3.4.6 explains the wording of the next lemma. Lemma 3.4.7 Let h : F → K be a field homomorphism. If h(F) ≠ {0}, then h(F) is a subfield of K and h is an isomorphism from F to the subfield h(F) of K. This is left as an exercise. Some comments are in order. Since we require 0 ≠ 1 in a field, the image of a field homomorphism is not a field if the image is just {0}. Having the image equal to {0} is not a situation that will be of great interest to us. The "kernel" of a field homomorphism h : F → K would then be of only two types: {0} or all of F. The situation where the "kernel" is all of F is not interesting, and when the "kernel" is {0}, the "kernel" contains little data. Thus the "kernel" of a field homomorphism will not be discussed. Thus the only field homomorphisms h : F → K of interest to us will have h a one-to-one function. We call such homomorphisms embeddings, or sometimes homomorphic embeddings for emphasis. As stated in Lemma 3.4.7, an embedding of fields is an isomorphism onto the image of the embedding. We have the following important parallel to similar statements about group and ring homomorphisms. Lemma 3.4.8 If f : F → K and h : K → L are field homomorphisms, then hf : F → L is a field homomorphism. The proof is left as an exercise. 3.4.5 Automorphisms Automorphisms of fields will be of special importance to us. As mentioned in Section 3.2.7, automorphisms are symmetries of an object. One of the major ideas of Galois theory is that information about a field can be extracted from a knowledge of its symmetries. As well as an interest in the structure of single fields, we are also interested in the structure of field extensions. Thus we will define automorphisms of a field extension. When we do, we will connect the definition to some of the observations and comments made in Section 1.5.3.
An automorphism of a field is an isomorphism from the field to itself. For a field F, we will use Aut(F) to denote the set of all automorphisms of F. For a field extension F ⊆ E, we will define

Aut(E/F) = {φ ∈ Aut(E) | ∀x ∈ F, φ(x) = x}. (3.1)

In words, Aut(E/F) is the set of automorphisms of E that act as the identity on all elements of F. We say that F is fixed or that all the elements of F are fixed by all the elements in Aut(E/F). Before we discuss this further, we make note of the following. Lemma 3.4.9 If F ⊆ E is an extension of fields, then Aut(E) is a group and Aut(E/F) is a subgroup of Aut(E). The proof is left as an exercise. Given the truth of Lemma 3.4.9, we can refer to Aut(E) as the automorphism group of E. The group Aut(E/F) is referred to as the automorphism group of E over F. We think of Aut(E/F) as the symmetries of the extension F ⊆ E. Examples We have seen in Section 2.9.1 that complex conjugation is a field homomorphism from C to C. We know that the conjugate of z̄ is z, so complex conjugation is its own inverse. Therefore it is one-to-one and onto and thus an automorphism of C and an element of Aut(C). Since complex conjugation is not the identity function, it is a non-identity element of these two groups. Another fact about complex conjugation is that z̄ = z if and only if z is real. Thus complex conjugation is a non-identity element of Aut(C/R). It is also a non-identity element of Aut(C/Q). Another field we have looked at is Q[√2]. In Exercise Set (27) we saw that f(r + s√2) = r − s√2 is a homomorphism from Q[√2] to itself. It is clearly not the identity and it is also its own inverse. Thus it is one-to-one and onto and a non-identity element of Aut(Q[√2]). The elements left fixed by f are of the form r + 0√2. Since r runs over all the elements of Q, we see that Q is fixed by f. Thus f is a non-identity element in Aut(Q[√2]/Q).
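The map f(r + s√2) = r − s√2 can be checked by computing in coordinates: represent r + s√2 by the pair (r, s), so that the product rule reads (r, s)(t, u) = (rt + 2su, ru + st). The sketch below (Python, an illustration and not part of the notes) verifies the multiplicative property on a grid of sample elements with integer coordinates:

```python
# Elements of Q[sqrt(2)] as pairs (r, s) standing for r + s*sqrt(2).
def mul(x, y):
    r, s = x
    t, u = y
    # (r + s*sqrt(2))(t + u*sqrt(2)) = (rt + 2su) + (ru + st)*sqrt(2)
    return (r * t + 2 * s * u, r * u + s * t)

def f(x):
    # f(r + s*sqrt(2)) = r - s*sqrt(2)
    r, s = x
    return (x[0], -x[1])

# f is clearly additive; check f(xy) = f(x)f(y) on a grid of samples.
samples = [(r, s) for r in range(-3, 4) for s in range(-3, 4)]
for x in samples:
    for y in samples:
        assert f(mul(x, y)) == mul(f(x), f(y))

# f is its own inverse, so it is one-to-one and onto.
assert all(f(f(x)) == x for x in samples)
```

The grid is only a sample, but the identity f(xy) = f(x)f(y) is a polynomial identity in r, s, t, u, so agreeing on enough integer points is in fact conclusive.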
Let

α = cos(2π/5) + i sin(2π/5)

be the fifth root of 1 in the first quadrant of the complex plane. Note that the other fifth roots of 1 are α², α³, α⁴ and α⁵ = 1. We claim that Q(α) consists of all

r + sα + tα² + uα³ + vα⁴ (3.2)

where all of r, s, t, u and v are in Q. We will do some of the work to verify this claim in exercises, but we will not do all of it. The full claim will be verified very much later. An easy exercise that you will be asked to do is to show that sums, products and negatives of the numbers given in (3.2) are other numbers given in (3.2). A much harder exercise that you will not be asked to do (involving solving a system of five linear equations with five unknowns) is to show that multiplicative inverses of the non-zero numbers in (3.2) are also numbers in (3.2). Thus the numbers in (3.2) form a field. But the numbers in (3.2) are sums and products of numbers in Q(α). So the set of all the numbers in (3.2) is a subset of Q(α) and thus a subfield of Q(α). But Q(α) is the smallest field in (say) C that contains Q and α. So Q(α) is exactly the set of numbers in (3.2). It turns out that for each j in {1, 2, 3, 4}, sending α to α^j leads to an element of Aut(Q(α)/Q). It also turns out that these are all the elements in Aut(Q(α)/Q). Proving these statements (including the fact that the multiplicative inverses of non-zero numbers in (3.2) are also in (3.2)) will be very much easier when more is known about the structure of fields and field extensions. Exercises (30) 1. Prove Lemma 3.4.1. 2. Prove Lemma 3.4.3. 3. Prove Lemma 3.4.4. 4. Prove Lemma 3.4.5. Hint: prove a slightly different lemma by induction. 5. Prove Lemma 3.4.7. The only part needing proof is the fact that h(F) is a subfield of K. 6. Prove Lemma 3.4.8. 7. Prove Lemma 3.4.9. 8. Check that sums, products and negatives of numbers in (3.2) are also numbers in (3.2). Check that sending α to α² creates a non-identity automorphism in Aut(Q(α)/Q).
If this has not been enough work, you can also verify that sending α to α^j for j = 3 and j = 4 each creates an automorphism in Aut(Q(α)/Q). Even if you do none of these verifications, show that the set of isomorphisms that one gets by sending α to α^j for j ∈ {1, 2, 3, 4} forms a group isomorphic to Z4. (Compare this to Problem 15 in Exercise Set (28). There, with Z5 representing the group (Z5, +), the automorphism group Aut(Z5) was shown to be isomorphic to Z4.) 3.5 On leaving Part I We have most of the ingredients in place, and hints of the outline. In studying roots of polynomials, fields will be discussed. In general, there will be a field F that contains the coefficients of the polynomial, and a field E that contains the roots of the polynomial. We know easy formulas that give the coefficients of the polynomial in terms of the roots. For example, if r1, r2, r3 are the roots of x³ + ax² + bx + c, then a = −(r1 + r2 + r3), b = r1r2 + r2r3 + r3r1, and c = −r1r2r3. Thus a field containing r1, r2 and r3 must also contain a, b and c. In general, the field E contains the field F and we have an extension. If there are formulas that give the roots in terms of the coefficients, then there may be intermediate values involving the taking of n-th roots for various n. These intermediate values will force the existence of fields that are intermediate to F and E and that form smaller extensions of F than E. A study of how these smaller extensions fit into the extension F ⊆ E will tell us what has to happen for formulas that give the roots in terms of the coefficients to exist. Galois theory extracts essential information from the automorphism groups of the various extensions. Thus the groups most important to us in this larger outline are automorphism groups. We see that different types of objects play very different roles in our outline. Fields contain the numbers we calculate with.
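The root-coefficient formulas quoted above, a = −(r1 + r2 + r3), b = r1r2 + r2r3 + r3r1 and c = −r1r2r3, can be sanity-checked by comparing both sides as functions. Two cubics that agree at eleven points are equal, so the sketch below (Python, an illustration and not part of the notes) evaluates on an integer grid:

```python
# Compare (x - r1)(x - r2)(x - r3) with x^3 + a x^2 + b x + c, where a, b, c
# are built from the roots by the formulas in the text.
def f(x, r1, r2, r3):
    return (x - r1) * (x - r2) * (x - r3)

def g(x, r1, r2, r3):
    a = -(r1 + r2 + r3)
    b = r1 * r2 + r2 * r3 + r3 * r1
    c = -(r1 * r2 * r3)
    return x ** 3 + a * x ** 2 + b * x + c

# Agreement at 11 points forces equality of two degree-3 polynomials.
for roots in [(1, 2, 3), (0, -1, 5), (2, 2, -4)]:
    for x in range(-5, 6):
        assert f(x, *roots) == g(x, *roots)
```

The sample triples include a repeated root, since the formulas make no assumption that the roots are distinct.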
Groups are groups of symmetries (automorphisms) of the fields of numbers. Rings show up because the integers form a ring, and polynomials with coefficients from various fields also form rings. Groups, rings and fields are all algebraic objects. They all have various numbers and types of operations, and laws that dictate how the operations behave. But the differences in number of operations and the laws that they follow lead to enough differences in behavior to assign them different roles in a mathematical topic.

In the notes to come we will launch three investigations: the first into the general theory of groups, the second into the somewhat less general theory of polynomials (instead of the more general theory of rings), and the third into the general theory of fields. None of these investigations will be terribly deep, nor terribly broad, but they will be sufficient to give a good introduction to each theory and will also be sufficient for the needs of the outline. The last part will be the Galois theory that ties all of the above together and answers the question about which polynomials have formulas that give the roots of the polynomial in terms of the coefficients.

Part II Group Theory

Introduction and organization

As previously mentioned, group theory is a collection of arguments, results and techniques that help answer questions about groups. Therefore it is a collection of subtopics that are held together by the fact that they are all about groups. As a result, the theory can look rather disorganized. We attempt to give some organizational structure to the topics that we will cover. We will cover a tiny fraction of the topics in group theory, and within each topic, we will cover a small part of that topic. Our selection will be guided mostly by what we need from group theory to apply to the question of roots of polynomials. We can use our reasons for our choice of topics to help organize them into a few logically connected areas.
Actions

Groups arise in Galois theory as automorphism groups of fields and field extensions. Specifically, a group will arise as some Aut(E/F) for an extension F ⊆ E of fields. Since automorphisms are homomorphisms that are one-to-one correspondences, our groups are basically collections of certain kinds of permutations. Since permutations are thought of as moving things around, each element of Aut(E/F) causes movement. Mathematicians say that each element of Aut(E/F) "acts" on the field E. This brings us to the first organizational concept: group actions.

This splits into two parts. The more elementary topic is that of groups "acting" as a collection of permutations. Among other things, we will show that any group, even one not defined as acting as a collection of permutations of a set, is isomorphic to a group that is a collection of permutations of a set. The slightly more sophisticated topic, that of a general group action, will come next. Following that will be a discussion of two major topics: subgroups and homomorphic images.

Subgroups and homomorphic images

Formulas for roots of polynomials bring new numbers into a discussion in a certain order. First a square root might be brought in. After that, a cube root might be brought in, and so forth. Assuming that each n-th root requires moving to a larger field, we get a sequence of field extensions F0 ⊆ F1 ⊆ · · · ⊆ Fn giving the opportunity to look at the automorphism groups of many different field extensions. Under certain assumptions, the resulting groups will be related to each other by either of the two most important relations among different groups: the relation "is a subgroup of" and the relation "is a homomorphic image of." We will study each of these relations. These relations are related since every homomorphism has a subgroup that goes with it—the normal subgroup that is the kernel of the homomorphism. Thus the two topics will interact a good deal.
Iterations

The sequence of field extensions associated to formulas for the roots of a polynomial also motivates the last topic we will consider in group theory. The chain of extensions F0 ⊆ F1 ⊆ · · · ⊆ Fn mentioned above gives rise to a corresponding chain of subgroups, and under the right circumstances to a chain of homomorphic images. Thus the relations of "is a subgroup of" and "is a homomorphic image of" will have to be studied in connected chains. This takes the form of a topic in group theory, suggestively called solvable groups, that will be of direct importance in Galois' theory of solvability of polynomials.

Simple groups

As might be guessed, solvable groups are associated to solvable polynomials. To obtain the result that certain polynomials are not solvable, one must then have examples of groups that are not solvable. The quickest way to such examples is via groups that are called simple. In spite of their name, the proof of their simplicity is often not that simple. Our last topic will be the demonstration that the right kinds of simple groups exist.

Chapter 4 Group actions I: groups of permutations

4.1 Consequences of Lemma 3.2.1

For us a permutation group is a subgroup of a symmetric group. That is, G is a permutation group if for some X, we have G ⊆ SX. Recall that SX is the symmetric group on the set X, the group of all permutations of X. We start by proving Cayley's theorem: that all groups are isomorphic to permutation groups. This means that limiting a discussion to permutation groups is not much of a limitation. We remind the reader of Lemma 3.2.1.

Lemma (3.2.1) If a and b are two elements in a group G, then there exists a unique element x ∈ G that satisfies ax = b.

This immediately leads to the following.

Lemma 4.1.1 Let G be a group and let a be in G. Define la : G → G by la(x) = ax. Then la is a one-to-one correspondence from G to G.

Proof. To show onto, for an element b ∈ G, we need to find an x with la(x) = ax = b.
But this exists by Lemma 3.2.1. To show one-to-one, we assume that la(x) = la(y) or ax = ay. That x = y follows by either quoting the uniqueness part of Lemma 3.2.1 or duplicating the proof by multiplying both sides on the left by a^-1.

We can put the result just proven in words by saying that for each a ∈ G, the function "multiply on the left by a" is a permutation of G. So each a ∈ G gives us a permutation of the elements of G. This is the first step in proving Cayley's theorem.

To have an isomorphism, one needs a function that is a homomorphism, is one-to-one, and is onto. We have the function. Each a ∈ G gives a permutation la of G. So we let our function be defined by f(a) = la. This leads to odd notation such as f(a)(x) = la(x) = ax. It is important to get used to this jumble of letters and parentheses. Note that f is a function from G to SG, the symmetric group on G or the group of all permutations of G.

Arguing that f is one-to-one is easy if you keep track of what everything is and use the right attack. If f(a) = f(b), then la = lb and we have an equality between two permutations. But permutations are functions and the statement la = lb means that la(x) = lb(x) for every possible x ∈ G. We could pick a general x to work with, or we could pick a favorite one. It turns out that using x = 1 is enough. Note that we are not proving something is true for every x ∈ G; we are using the fact that something is true for every x ∈ G. Now we have la(1) = lb(1) or a1 = b1 or a = b. So the function f is one-to-one.

Showing that f is onto a subgroup of SG is easy if f is a homomorphism. We know that the image of a homomorphism is a subgroup of the range and so f will be onto the subgroup of SG that is the image of f. So we need to show that f is a homomorphism. To prove that f is a homomorphism, we need to show that for all a and b in G, we have f(ab) = f(a)f(b).
This means we have to look at lab and compare it to la lb. These are permutations on G and as such are functions from G to G. To show that they are the same function we need to show that for all x ∈ G, we have lab(x) = (la lb)(x). Since we are proving that something is true for all x, we have to let x be an arbitrary element of G. We have lab(x) = (ab)x. We have (la lb)(x) = la(lb(x)) = la(bx) = a(bx). Thus what we want follows from the associative law of groups. We have proven the following.

Theorem 4.1.2 (Cayley) Every group G is isomorphic to a subgroup of SG.

Some comments are needed. The proof of Cayley's theorem (which precedes the statement) requires carefully keeping track of exactly what everything is. Some items in the proof are elements of G, some are functions from G to G, and lastly the function f is a function from G to the set of permutations on G. You should get used to the fact that this multiplicity of types of objects will happen often and that it is normal to need a lot of time and attention to keep them all straight.

Since one of the objects in Cayley's theorem is a function from a set (group, actually) to a set of functions, the notation tends to build in complexity. Since this is the first time functions to a set of functions were used, we used extra symbols to slow things down. In checking that f(ab) = f(a)f(b), we replaced f(ab) by the function lab that f(ab) was equal to. Similarly f(a)f(b) was replaced by la lb. To write down what needed to be checked, we wrote that we needed lab(x) = (la lb)(x) to be true for all x ∈ G. We could have written that we needed f(ab)(x) = (f(a)f(b))(x) to be true for all x ∈ G. Such notation will be used in the future and you should start getting used to it. You should also accept that it takes time to read a statement such as f(ab)(x) = (f(a)f(b))(x) and dig through the definitions to extract its meaning.

Cayley's theorem does not give efficient views of a group.
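The ingredients of the proof can be watched concretely on a small example. The sketch below uses the group Z4 = ({0, 1, 2, 3}, + mod 4), which is our choice of example; the names f and l follow the text.

```python
# f(a) = l_a, where l_a is "multiply (here: add) on the left by a",
# recorded as the tuple of images of 0, 1, 2, 3.
G = range(4)

def l(a):
    return tuple((a + x) % 4 for x in G)

f = {a: l(a) for a in G}

# f is one-to-one: distinct group elements give distinct permutations.
assert len(set(f.values())) == 4

# f is a homomorphism: l_{ab} = l_a l_b, with l_b applied first.
for a in G:
    for b in G:
        assert tuple(f[a][f[b][x]] for x in G) == f[(a + b) % 4]
```

Both assertions succeed for the same reasons given in the proof: uniqueness in Lemma 3.2.1 and the associative law.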
The group S3 has six elements and it already is a group of permutations. (The number of elements of Sn is discussed thoroughly in the next section. For now, accept that the number of elements in Sn is n!.) Cayley's theorem says that if G = S3, then G is isomorphic to a subgroup of SG. The subgroup that is isomorphic to G must have 6 elements, but SG itself has 6! or 720 elements. Showing G as a group of permutations of 3 elements is much more efficient than demonstrating it as a six element subgroup of a 720 element group.

Cayley never gave a full proof of Cayley's theorem. He constructed the function f, showed that each f(a) is a permutation of G and showed that f is one-to-one. He may have shown that the image is a group, but he did not show that f is a homomorphism. However, at the time, no one realized that this was a necessary step.

The last comment about Cayley's theorem is that it says that facts proved about all groups of permutations apply to all groups in general. Therefore a study of groups of permutations is a worthwhile study.

Exercises (31)

1. Something goes wrong if you try to state and prove Cayley's theorem with the function f(a) = ra where ra : G → G is defined by ra(x) = xa. A lemma like Lemma 4.1.1 holds for ra, but the proof of Cayley's theorem has a problem. State and prove the lemma like 4.1.1 for ra, and find the problem with the corresponding Cayley's theorem. You can try to find a better statement for the corresponding Cayley's theorem now, or wait until later when we come up with a fix.

4.2 Examples

Cayley's theorem says that every group is isomorphic to a permutation group. To emphasize the set being permuted, we will sometimes say that a subgroup G of SX is a permutation group on X. Obviously one example is G = SX for some X. We will give other examples. For the finite symmetric groups Sn, the following is an important point to make. Recall that the order of a group G, written |G|, is the number of elements of G.
Lemma 4.2.1 The order of Sn is n!.

The following is a reasonably rigorous argument. An element α of Sn is determined by what it does to each element of {1, 2, . . . , n}. There are n choices for α(1). Whatever is chosen for α(1) is not available for α(2), and there are n − 1 choices for α(2). Continuing, there is one less choice for each i as i increases by 1, until there is only one choice left for α(n). The total number of choices is n(n − 1)(n − 2) · · · (2)(1) = n!.

The proof above can be made much more rigorous if more rigor is demanded. A more rigorous proof is left as an exercise.

4.2.1 Dihedral groups

These are important examples because they are not very complicated but are complicated enough (they are not abelian) to illustrate various features of groups. Let P be a regular polygon. That is, P has the lengths of all of its sides equal, and the size of all of its angles equal. If the polygon has n sides, it has n angles and is called the regular n-gon. Below we show the regular n-gons for n = 3, 4, 5, 6.

[Figure: the regular n-gons for n = 3, 4, 5, 6.]

For a given n ≥ 3, the dihedral group D2n is the group of all permutations of the vertices of the regular n-gon that preserve the structure of the n-gon itself. That is, rotating and flipping the n-gon is allowed, but no twisting, crumpling or stretching is allowed. The group D2n is often described as the group of symmetries of the regular n-gon. The notation D2n will be explained shortly. We illustrate D2n for n = 3 and n = 4.

In the figure below, the vertices are labeled 1, 2 and 3. This is not traditional (A, B and C would be more traditional), but using numbers for the vertices lets us compare D6 to S3.

[Figure: triangle with vertices labeled 1, 2 and 3.]

A quick check of the six permutations of {1, 2, 3} reveals that all permutations of the vertices preserve the structure of the triangle shown. Some of the permutations involve flipping the triangle over. This is allowed.
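Both the count in Lemma 4.2.1 and the claim that every permutation of {1, 2, 3} preserves the triangle can be checked by brute force. The following is a sketch of such a check, not a replacement for the proofs.

```python
from itertools import permutations
from math import factorial

# Lemma 4.2.1 for small n: list every permutation of {1, ..., n} and count.
for n in range(1, 7):
    assert sum(1 for _ in permutations(range(1, n + 1))) == factorial(n)

# The triangle: every pair of vertices is joined by an edge, so every
# vertex permutation carries the edge set to itself.
edges = {frozenset(p) for p in [(1, 2), (1, 3), (2, 3)]}
for q in permutations((1, 2, 3)):       # q[i-1] is the image of vertex i
    image = {frozenset((q[i - 1], q[j - 1])) for i, j in [(1, 2), (1, 3), (2, 3)]}
    assert image == edges
```

The triangle check succeeds for all 3! = 6 permutations, matching the statement in the text.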
The permutation

    ( 1 2 3 / 1 3 2 )

is one that requires a flip.

In the figure below, the vertices are labeled 1, 2, 3 and 4.

[Figure (4.1): square with vertices labeled 2 and 1 across the top and 3 and 4 across the bottom.]

There are 24 permutations of {1, 2, 3, 4}, but not all corresponding permutations of the vertices preserve the structure of the square shown. There are four places to take vertex 1. Once vertex 1 is in place, there are only two ways to place the rest of the square. One way has the "original" side up and the other way has the "hidden" side up and involves a "flip." We illustrate the two possibilities with vertex 1 carried to the lower left corner.

[Figure: two placements of the square with vertex 1 at the lower left, one unflipped and one flipped.]

We see that of the 24 possible permutations in S4 only 8 correspond to vertex permutations that preserve the structure of the square. Arguing as we did above about corners shows that for each n ≥ 3, |D2n| = 2n. This explains the notation. The group D2n is usually called the dihedral group of order 2n. Some books use Dn for what we call D2n. There are enough books that use each notation that it is hard to say which is more popular.

A word on vocabulary

It is tempting to say that D6 equals S3. That makes sense if the vertices are the numbers 1, 2, and 3. If you would rather say that the vertices are labeled 1, 2 and 3, then D6 is isomorphic to S3 rather than equal to S3. We are only permuting vertices. Other books talk about moving the entire n-gon. There is not much difference between moving the entire n-gon and moving its vertices. Once you know where the vertices go, you know where the entire n-gon has gone. The point is that if you view the motion as that of the entire n-gon, then it makes no sense at all to say that D6 = S3, and no book says that. We define the D2n as permutations of the vertices so that we can refer to D2n as a subgroup of a group of permutations of a finite set (the set of vertices). Later, we can change our view and think of D2n as a group of motions of the entire n-gon.
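The count of 8 square symmetries out of 24 permutations can be verified directly: a vertex permutation preserves the square exactly when it carries edges to edges. A sketch follows; the edge list is read off figure (4.1) and is our assumption about its layout.

```python
from itertools import permutations

# Edges of the square with vertices 1, 2, 3, 4 in cyclic order.
edge_pairs = [(1, 2), (2, 3), (3, 4), (4, 1)]
edges = {frozenset(e) for e in edge_pairs}

def preserves(p):                    # p[i-1] is the image of vertex i
    image = {frozenset((p[i - 1], p[j - 1])) for i, j in edge_pairs}
    return image == edges

symmetries = [p for p in permutations((1, 2, 3, 4)) if preserves(p)]
assert len(symmetries) == 8          # |D8| = 8 of the 4! = 24 permutations
```

The 8 survivors are the four rotations and four flips, in agreement with |D2n| = 2n for n = 4.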
Then D2n becomes a subgroup of a group of permutations of an infinite set (the set of points in the n-gon). This other view will allow us to raise questions that we cannot raise in our setting.

The elements and the multiplication

We can give a name to each element of D2n. The smallest is D6, so we start there. Using Cauchy notation for permutations (written here on one line, with the bottom row after the slash), we have the following.

    e = ( 1 2 3 / 1 2 3 )    γ = ( 1 2 3 / 2 3 1 )
    α = ( 1 2 3 / 1 3 2 )    δ = ( 1 2 3 / 3 1 2 )
    β = ( 1 2 3 / 2 1 3 )    ǫ = ( 1 2 3 / 3 2 1 )

It is somewhat of a tradition to use e for the identity if using 1 would be confusing. We are already using 1 as a label of a vertex. We will compute products such as βγ as if they are functions. That is, γ will be applied first and β second. For this particular product, we get

    βγ = ( 1 2 3 / 2 1 3 )( 1 2 3 / 2 3 1 ) = ( 1 2 3 / 1 3 2 ) = α.

After doing 35 more such calculations, we get the following multiplication table for D6.

    ·  | e  α  β  γ  δ  ǫ
    e  | e  α  β  γ  δ  ǫ
    α  | α  e  δ  ǫ  β  γ
    β  | β  γ  e  α  ǫ  δ
    γ  | γ  β  ǫ  δ  e  α
    δ  | δ  ǫ  α  e  γ  β
    ǫ  | ǫ  δ  γ  β  α  e

The table doesn't show much, but it does show that two elements rarely commute. For example, we have βγ = α, but γβ = ǫ. Note that every row has every element of the group entered once and only once. This is guaranteed by Lemma 4.1.1. Each column also has every element of the group entered once and only once. This is guaranteed by a lemma like Lemma 4.1.1 but for ra : G → G defined by ra(x) = xa.

There is not much that can be pulled from a multiplication table other than the answers to products that can be figured out quickly anyway. Later, we will illustrate an important point with a multiplication table. But other than that, we will spend little time building them.

Orders of elements

We can use the table (or direct computation) to compute orders of elements. In the paragraphs before Exercise Set (22), we defined the order of an element x of a group to be the smallest number of copies of x that need to be multiplied
together to get the identity. In a non-abelian setting, it is more easily defined as the least positive integer n so that x^n = 1. We write o(x) for the order of x. For D6, we get the following orders.

    o(e) = 1, o(α) = 2, o(γ) = 3, o(δ) = 3, o(β) = 2, o(ǫ) = 2.

If you did Problem 6 in Exercise Set (22), you found that the orders of the elements of Z6 are 1, 6, 3, 2, 3, 6. This gives two reasons why Z6 and D6 cannot be isomorphic. First, one is abelian and the other is not. Second, the list of orders of the elements of the two groups is not the same. In the next section, we set up machinery to pick out subgroups of the dihedral groups and other groups of permutations.

4.2.2 Stabilizers

These notions will see applications often. Let G be a permutation group on X and let x be in X. Then Gx will denote the subset of G defined by

    Gx = {h ∈ G | h(x) = x}.

In words, Gx is the set of permutations in G that leave x fixed. We call Gx the stabilizer of x in G. It is also sometimes called the fixed group of x in G.

Lemma 4.2.2 If G is a permutation group on X and x ∈ X, then Gx is a subgroup of G.

The proof is left as an exercise. The easy proof of the following should be written out carefully.

Lemma 4.2.3 Let G = SX for a set X and let x ∈ X. Then there is an isomorphism from Gx to SY where Y = X − {x}.

The proof is left as an exercise.

If G = SX, if H = Gx and if y ∈ X with y ≠ x, then Hy contains all permutations in H that fix y. But a permutation is in H if and only if it fixes x. So the permutations in Hy are exactly those that fix both x and y. This can get cumbersome after a while, so we invent notation. If G ⊆ SX and A ⊆ X, then GA is defined by

    GA = {h ∈ G | ∀a ∈ A, h(a) = a}.

In words, GA is the set of permutations in G that leave every element in A fixed. We call GA the pointwise stabilizer of A in G. The reason for the two word name will be clear shortly.
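The D6 computations above (products, orders, and a stabilizer) can all be reproduced mechanically. Below is a minimal sketch; the tuple encoding and helper names are ours.

```python
# The six elements of D6 in Cauchy notation, stored as tuples with
# perm[i-1] the image of vertex i.
e, al = (1, 2, 3), (1, 3, 2)        # e, alpha
be, ga = (2, 1, 3), (2, 3, 1)       # beta, gamma
de, ep = (3, 1, 2), (3, 2, 1)       # delta, epsilon
D6 = [e, al, be, ga, de, ep]

def compose(f, h):                  # (f h)(x) = f(h(x)): h applied first
    return tuple(f[h[i] - 1] for i in range(3))

assert compose(be, ga) == al        # beta gamma = alpha
assert compose(ga, be) == ep        # gamma beta = epsilon: no commuting

def order(x):                       # least n > 0 with x^n = identity
    p, n = x, 1
    while p != e:
        p, n = compose(p, x), n + 1
    return n

assert [order(x) for x in D6] == [1, 2, 2, 3, 3, 2]

# The stabilizer of vertex 1 in D6: exactly the identity and the flip alpha.
G1 = [h for h in D6 if h[0] == 1]
assert G1 == [e, al]
```

The order list 1, 2, 2, 3, 3, 2 matches the table-based computation, and the stabilizer computation is a direct instance of the definition of Gx just given.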
It is easily provable that GA is a subgroup of G, but it is more important to point out the easiest reason why.

Lemma 4.2.4 Let G be a permutation group on X and let A ⊆ X. Then

    GA = ⋂_{x∈A} Gx.

The proof is left as an exercise. It is immediate from Lemmas 4.2.4 and 3.2.9 that GA is a subgroup of G.

Lastly, if G is a permutation group on X and A ⊆ X, we look at

    StG(A) = {h ∈ G | h(A) = A}.

We call this the stabilizer of A in G and it is the set of permutations in G that carry A to A. Note carefully the difference between the stabilizer of A in G and the pointwise stabilizer of A in G. Of course, we have the following lemma.

Lemma 4.2.5 If G is a permutation group on X and A ⊆ X, then StG(A) is a subgroup of G.

The proof is left as an exercise. Note that the group Gx can also be written StG({x}) or more briefly as StG(x). We will refer to these constructions shortly. We turn now to more concrete examples.

Dihedral groups revisited

Let us look at subgroups of D12. Since D12 already has a subscript which complicates expressions, let us set G = D12 so that we can refer to subgroups such as StG(A) and GA for various A. The group G is the group of symmetries of the following figure.

[Figure (4.2): regular hexagon with vertices labeled 1 through 6.]

Note that G = D12 has 12 elements. The full symmetric group S6 has 720 elements. This is one reason that the dihedral groups are more practical to work with than the symmetric groups. We will look at some stabilizers in G. The reader should check that G1 = StG(1) consists exactly of the identity and the element

    ( 1 2 3 4 5 6 / 1 6 5 4 3 2 ).

The subgroup G2 consists exactly of the identity and the element

    ( 1 2 3 4 5 6 / 3 2 1 6 5 4 ).

The exercises will discuss several other stabilizer subgroups.

Field automorphisms

We have mentioned that if F ⊆ E is an extension of fields, then we will be interested in Aut(E/F). This is a subgroup of Aut(E).
Since automorphisms must be one-to-one correspondences, the group Aut(E) is already a permutation group. Now Aut(E/F) is defined as the set of automorphisms φ of E so that for all x ∈ F, φ(x) = x. This translates into the statement that Aut(E/F) is the pointwise stabilizer of F in Aut(E).

Exercises (32)

1. Write out a rigorous proof of Lemma 4.2.1. An inductive proof is recommended.
2. Write out the elements of D8. Without writing out the full multiplication table for D8, figure out the orders of the elements of D8. Without writing out the full multiplication table for Z8, figure out the orders of the elements of Z8.
3. Find two elements of D8 that do not commute.
4. Prove Lemma 4.2.2.
5. Prove Lemma 4.2.3.
6. Prove Lemma 4.2.5.
7. Let G be a permutation group on X and let A ⊆ X. Let B = X − A. Show that StG(A) = StG(B). Notice that you are asked to prove an equality, not just an isomorphism.
8. Let G = Sn and let A = {1, 2, . . . , m} with 1 ≤ m < n. What is |GA|? What is |StG(A)|?
9. Let G be a permutation group on X and let a ∈ A ⊆ B ⊆ X. Prove that the following subgroup relations always hold.
   (a) GA ⊆ Ga.
   (b) GA ⊆ StG(A).
   (c) GB ⊆ GA.
10. These questions refer to G = D12. Review the definitions of stabilizers while doing the problems. In the following "what is" means "what elements are in."
   (a) What is G3? G4? G5? G6?
   (b) What is G{1,2}? What is G{2,3}? What is G{1,4}?
   (c) What is StG({1, 2})? What is StG({2, 3})? What is StG({1, 4})?
   (d) For what p and q in {1, 2, 3, 4, 5, 6} is G{p,q} equal to G1?
   (e) For what p and q in {1, 2, 3, 4, 5, 6} is StG({p, q}) equal to G1?
   (f) Show that StG({1, 3, 5}) is isomorphic to D6.
11. Let G be a permutation group on X and let a ∈ A ⊆ B ⊆ X. Find examples that show that the following containments are sometimes false. Note that there are very few permutation groups that you know at this point.
You should be able to find examples among the Sn and the D2n.
   (a) Ga ⊆ GA.
   (b) StG(A) ⊆ GA.
   (c) GA ⊆ GB.
   (d) StG(A) ⊆ StG(B).
   (e) StG(B) ⊆ StG(A).

4.3 Conjugation

4.3.1 Definition and basics

Definition

Let G be a group and let a and b be in G. Then the conjugate of a by b is the element bab^-1. Some texts will give this as b^-1 ab, but that would involve a different convention as to how permutations compose. You should keep in mind that when changing books, the formula for conjugation might change. We will often write a^b as a shorthand for bab^-1.

The resemblance of a conjugation to the criterion for normality given after Lemma 3.2.15 is not an accident. The definition of normality can be reworded to say that a subgroup N of a group G is normal in G if for every a ∈ N and b ∈ G, the conjugate of a by b is in N.

If c = bab^-1, then we say that a is conjugate to c (by b), giving us the relation "is conjugate to" on G. Note that "is conjugate to" is an existence statement. The element a is conjugate to c means that there exists a b so that a^b = c. The conjugacy operation and the conjugacy relation are extremely important and will come up often.

Basics

Before looking at conjugacy in permutation groups, we give some general facts. The first is extremely trivial, but so important that we make it a lemma.

Lemma 4.3.1 If a and b are in a group G, then a^b = a if and only if a and b commute.

Proof. If bab^-1 = a, then multiplying both sides on the right by b shows ba = ab. If ba = ab, then multiplying both sides on the right by b^-1 shows a^b = bab^-1 = a.

The three proofs below, left as exercises, are in order of increasing depth of definition. The last requires extreme care in keeping track of exactly what is what.

Lemma 4.3.2 In a group G, the conjugacy relation is an equivalence relation.

The proof is left as an exercise.

Lemma 4.3.3 If G is a group and b ∈ G, then the function cb : G → G defined by cb(x) = bxb^-1 = x^b is an automorphism.
The proof is left as an exercise.

Lemma 4.3.4 If G is a group, then f : G → Aut(G) defined by f(b) = cb with cb as in Lemma 4.3.3 is a homomorphism.

The proof is left as an exercise.

Some comments are in order. It is vital to work through the proofs of Lemmas 4.3.3 and 4.3.4. Much of how conjugation cooperates with products (how does g^f behave if g is replaced by a product gh or if f is replaced by a product fh) is revealed by working through the proofs. Similarly, the details of the proofs reveal how conjugation cooperates with the taking of inverses.

From Lemma 4.3.2, we know that a group G is partitioned into equivalence classes under the relation "is conjugate to." The classes are called conjugacy classes and for a ∈ G, the conjugacy class of a is the set of all x ∈ G so that a is conjugate to x. Note that since "is conjugate to" is symmetric, you do not have to remember whether a is conjugate to a^b or a^b is conjugate to a. They are both true.

The next comment turns out to be a hint for the proof of Lemma 4.3.4, but it is too important to leave out. The fact that f(b) = cb is a homomorphism implies that conjugation by b and conjugation by b^-1 are inverse automorphisms. So if b conjugates a to x, then b^-1 will conjugate x to a. Of course, this is clear by plugging into the definition of conjugation and simplifying.

To state one last lemma, we bring in more notation. If S is a subset (we do not need a subgroup for this) of a group G and b ∈ G, then

    S^b = {x^b | x ∈ S}.

That is, S^b contains all the conjugates by b of elements in S. We call S^b the conjugate of S by b. Usually, we will look at S^b when S is in fact a subgroup of G. When this happens, we get the following.

Lemma 4.3.5 If G is a group, if H ⊆ G is a subgroup and b ∈ G, then H^b is a subgroup of G and the restriction of cb as defined in Lemma 4.3.3 to H is an isomorphism from H to H^b.
That H^b is a subgroup follows from Lemma 3.2.13 applied to the restriction of cb to H. The restriction is a homomorphism since the only requirement for being a homomorphism is that certain equalities hold. The image is H^b. The rest follows immediately from Lemma 4.3.3 and the definition of H^b.

One of the next goals is to completely understand the conjugacy relation in the groups Sn.

4.3.2 Conjugation of permutations

The calculation that starts it all

Let G be a permutation group on X, let a and b be in G and let x be in X. We calculate

    (a^b)(b(x)) = (bab^-1)(b(x)) = (ba)(b^-1(b(x))) = (ba)(x) = b(a(x)).    (4.3)

The calculation in (4.3) has been referred to as "the fundamental triviality" of permutation groups. Its consequences are enormous. We start with a discussion of what (4.3) tells us immediately.

The element x of X is carried to a(x) by a. The calculation in (4.3) shows that b(x) is carried to b(a(x)) by a^b. We can think of the pair of elements x and a(x) as "witnessing" the action of a on x. We are just looking at the pair consisting of an element of X and the element that a carries it to. The calculation (4.3) says that the two elements x and a(x) that witness the action of a on x are carried by b to the two elements b(x) and b(a(x)) that witness the action of a^b on b(x). This is summarized by the following diagram that might help keep track of things.

       x ────────── a ──────────→ a(x)
       │                            │
       b                            b
       ↓                            ↓
     b(x) ─── a^b = bab^-1 ────→ b(a(x))

The two ways of going from the lower left corner to the lower right corner are to either go straight across with bab^-1 or first up (opposite the direction of the arrow representing b) via b^-1, then over via a and then down by b. Since the composition of these three permutations is written from right to left we get bab^-1 for this second path. Thus the two ways of going from the lower left corner to the lower right corner are consistently labeled.
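The fundamental triviality (4.3) can be spot-checked exhaustively in a small symmetric group. A sketch, with permutations stored 0-indexed:

```python
from itertools import permutations

# Check in S4 that for every a, b and every x, the conjugate
# a^b = b a b^-1 sends b(x) to b(a(x)).
def compose(f, h):                   # h applied first, then f
    return tuple(f[h[i]] for i in range(4))

def inverse(f):
    inv = [0] * 4
    for i, j in enumerate(f):
        inv[j] = i
    return tuple(inv)

S4 = list(permutations(range(4)))
for a in S4:
    for b in S4:
        conj = compose(compose(b, a), inverse(b))
        assert all(conj[b[x]] == b[a[x]] for x in range(4))
```

All 24 × 24 pairs pass, which is exactly the content of the commuting square above.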
That b carries the two elements x and a(x) to the two elements b(x) and b(a(x)) repeats the main point of the calculation (4.3).

Actual computations

Let σ be in Sn. We recall the Cauchy notation for σ which reads

    σ = ( 1 2 3 · · · n / σ(1) σ(2) σ(3) · · · σ(n) ).    (2.2)

Let τ also be in Sn. We wish to use (4.3) to guide us in writing down the Cauchy notation for σ^τ. Each column ( i / σ(i) ) in (2.2) gives a pair that "witnesses" the action of σ on i. Following the dictates of (4.3), the corresponding pair for σ^τ is ( τ(i) / τ(σ(i)) ). This leads us to write down the Cauchy notation for the full permutation σ^τ as

    σ^τ = ( τ(1) τ(2) τ(3) · · · τ(n) / τ(σ(1)) τ(σ(2)) τ(σ(3)) · · · τ(σ(n)) ).    (4.4)

A verbal description of how to obtain (4.4) from (2.2) is quite easy: apply τ to every entry in both lines of the Cauchy notation for σ to obtain the Cauchy notation for σ^τ. This will probably result in the first line not being in numerical order. This is a problem only if this bothers you. If you insist on putting the first line in numerical order, you can certainly do so if you move both entries in a column together.

An example is called for. With G = D12, we wrote out (just under (4.2)) the non-identity element of G1 and also of G2. Let

    α = ( 1 2 3 4 5 6 / 1 6 5 4 3 2 )

be the non-identity element of G1 and let

    β = ( 1 2 3 4 5 6 / 3 2 1 6 5 4 )

be the non-identity element of G2. Consider also the permutation

    γ = ( 1 2 3 4 5 6 / 2 3 4 5 6 1 ).

Following the instructions that we have worked out for conjugating permutations we see that

    α^γ = ( 2 3 4 5 6 1 / 2 1 6 5 4 3 ) = ( 1 2 3 4 5 6 / 3 2 1 6 5 4 ) = β.

The fact that α and β come from G1 and G2 is totally irrelevant to the calculation of α^γ itself. We will comment later on why we introduced α and β this way. For now we will only ask that you calculate α^δ where

    δ = ( 1 2 3 4 5 6 / 2 1 6 5 4 3 ).

If you are wondering about the relevance of this, we point out that γ(1) = δ(1) = 2.
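The relabeling recipe (4.4) can be checked against the direct product γαγ^-1 on the example just computed. A sketch, with our own function names:

```python
# Permutations stored as tuples: sigma[i-1] is the image of i.
def compose(f, h):                   # h applied first, then f
    return tuple(f[h[i] - 1] for i in range(len(f)))

def inverse(f):
    inv = [0] * len(f)
    for i, j in enumerate(f):
        inv[j - 1] = i + 1
    return tuple(inv)

def conjugate_by_relabel(sigma, tau):
    # Recipe (4.4): the conjugate sends tau(i) to tau(sigma(i)).
    out = [0] * len(sigma)
    for i in range(1, len(sigma) + 1):
        out[tau[i - 1] - 1] = tau[sigma[i - 1] - 1]
    return tuple(out)

alpha = (1, 6, 5, 4, 3, 2)
beta  = (3, 2, 1, 6, 5, 4)
gamma = (2, 3, 4, 5, 6, 1)

direct = compose(compose(gamma, alpha), inverse(gamma))   # gamma alpha gamma^-1
assert direct == conjugate_by_relabel(alpha, gamma) == beta
```

Both routes give β, confirming that applying τ to both rows of the Cauchy notation is the same as the product τστ^-1.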
It should also be mentioned that α^γ = γαγ^-1 can be calculated quite nicely from the techniques of multiplication and inversion given in Section 2.3.4. However, the techniques for calculating conjugations of permutations given here have simplicity and also illustrate important aspects of what is really happening in a conjugation.

4.3.3 Conjugation of stabilizers

We give further consequences of (4.3). We start with the most trivial observation.

Conjugation of stabilizers of single elements

Assume that G is a permutation group on X, that x ∈ X and that a ∈ G is in Gx. That is, a(x) = x. Then for any b ∈ G, we have that b carries the two elements x and a(x) = x to the two elements b(x) and b(a(x)) = b(x), so that a^b carries the first b(x) to the second b(x). That is, a^b is in Gb(x). Of course, this could have been verified by repeating the calculation in (4.3) under the assumption that a(x) = x. We did say that this observation is the more trivial.

We know that cb : G → G defined by cb(a) = a^b is an automorphism. In particular it is one-to-one. So we have shown that restricting cb to Gx sends it in a one-to-one fashion into Gb(x). We now turn to c_{b^-1}, conjugation by b^-1. Identical arguments show that c_{b^-1} takes Gb(x) in a one-to-one fashion into Gx since b^-1 takes b(x) to x. From Lemma 4.3.4, we know that c_{b^-1} and cb are inverse functions. Thus from Lemma 2.2.9, we know that cb restricted to Gx is a bijection from Gx to Gb(x). We have enough information to state the following summary.

Lemma 4.3.6 If G is a permutation group on X with x ∈ X and b ∈ G, then (Gx)^b = Gb(x) and cb restricted to Gx is an isomorphism from Gx to Gb(x).

The last conclusion is a direct application of the first conclusion and Lemma 4.3.5.

Conjugation of general stabilizers

Lemma 4.3.7 If G is a permutation group on X with A ⊆ X and b ∈ G, then (StG(A))^b = StG(b(A)) and cb restricted to StG(A) is an isomorphism from StG(A) to StG(b(A)).

Proof.
The first claim is an equality of groups which comes down to an equality of sets. The second claim (about the isomorphism) follows from the equality of the first claim and Lemma 4.3.5. We thus focus on the equality and start with one containment.

An element of (St_G(A))^b is some h^b where h ∈ St_G(A). We want h^b to be in St_G(b(A)), so any element of b(A) should be carried to something in b(A) by h^b. But an element of b(A) is of the form b(a) for some a ∈ A. We know that h(a) is in A, and the fundamental triviality says that h^b carries b(a) to b(h(a)), which must be in b(A). This proves h^b is in St_G(b(A)). We now have (St_G(A))^b ⊆ St_G(b(A)).

The reverse containment is handled as in the last two paragraphs of our proof of Lemma 4.3.6. We have shown that the bijection c_b carries St_G(A) into St_G(b(A)). With an identical proof, we get that c_{b^{-1}} carries St_G(b(A)) into St_G(A). From Lemma 4.3.4, we know that c_b and c_{b^{-1}} are inverse bijections, and from Lemma 2.2.9, we know that c_b restricted to St_G(A) is a bijection to St_G(b(A)).

Conjugation of pointwise stabilizers

We round out the discussion with the following predictable lemma.

Lemma 4.3.8 If G is a permutation group on X with A ⊆ X and b ∈ G, then (G_A)^b = G_{b(A)} and c_b restricted to G_A is an isomorphism from G_A to G_{b(A)}.

The details of showing that this follows from Lemma 4.3.6 and Lemma 4.2.4 are left to the reader.

We note that if this is applied to the example of Aut(E/F) when F ⊆ E is an extension of fields, then we get that if φ is an automorphism of E, then (Aut(E/F))^φ = Aut(E/φ(F)) and that φ restricted to Aut(E/F) is an isomorphism from Aut(E/F) to Aut(E/φ(F)).

4.3.4 Conjugation and cycle structure

We introduce a second notation for elements of S_n. It has the advantage of needing less writing.

Cycles

The main fact that the discussion starts from is the following.

Lemma 4.3.9 Let g be an element of S_n and let i be in {1, 2, . . . , n}. Then there is a smallest positive integer s so that g^s(i) = i.
Proof. There are infinitely many positive integers and only finitely many elements of {1, 2, . . . , n}. If we consider the sequence i = g^0(i), g^1(i), g^2(i), . . ., then there is (by well ordering) a first place g^s(i) where the value is a value that has already occurred in the sequence. Since the previous appearance could be i = g^0(i) itself, we have some t ≥ 0 and s > t so that g^t(i) = g^s(i). Further, s is the smallest integer for which this is true.

We claim that g^s(i) = i. If not, then t > 0 and t − 1 ≥ 0. Then g^t(i) = g^s(i) = g(g^{s−1}(i)) and g^t(i) = g(g^{t−1}(i)). We know g^{s−1}(i) ≠ g^{t−1}(i) because g^s(i) is the first value that repeats an earlier value. But g^{t−1}(i) ≠ g^{s−1}(i) together with g(g^{t−1}(i)) = g^t(i) = g^s(i) = g(g^{s−1}(i)) violates the fact that g is a permutation and must be one-to-one. Thus t cannot be greater than 0 and g^s(i) = g^0(i) = i.

The power of Lemma 4.3.9 is that given g ∈ S_n, every element of {1, 2, . . . , n} belongs to some "cycle" associated to g. We illustrate this first with an example, and give careful definitions and demonstrations second. Consider

    g = ( 1 2 3 4 5 6
          5 6 4 1 3 2 ) ∈ S_6.     (4.5)

Under powers of g we have

    1 → 5 → 3 → 4 → 1   and   2 → 6 → 2.     (4.6)

That is, g^0(1) = 1, g^1(1) = 5, . . . , g^4(1) = 1. Similarly g^2(2) = 2. We want to consider g as breaking {1, 2, 3, 4, 5, 6} into two "cycles" once we have made the right definitions. To help with this we have the following.

Lemma 4.3.10 Let g be in S_n. For i and j in {1, 2, . . . , n}, define i ∼_g j to mean that for some integer t ≥ 0, we have g^t(i) = j. Then ∼_g is an equivalence relation.

Proof. g^0(i) = i, so ∼_g is reflexive. If i ∼_g j, then if i = j, we certainly have j ∼_g i, so assume j ≠ i. We have g^t(i) = j for some t > 0. By well ordering, we can assume that t is the smallest value with these properties. By Lemma 4.3.9, there is a smallest s > 0 so that g^s(i) = i.
If s < t, then t = qs + r with 0 ≤ r < s < t, and j = g^t(i) = g^{qs+r}(i) = g^r(g^{qs}(i)) = g^r(i) since g^s(i) = i. But r < t contradicts the choice of t. Since i ≠ j, we have s ≠ t, so s > t and s = t + d for some d > 0. Now i = g^s(i) = g^d(g^t(i)) = g^d(j) and j ∼_g i. This proves that ∼_g is symmetric. Lastly, if i ∼_g j and j ∼_g k, we have g^s(i) = j and g^t(j) = k with s and t at least 0. Now g^{s+t}(i) = k and i ∼_g k, showing that ∼_g is transitive.

For g ∈ S_n, we call the equivalence classes in {1, 2, . . . , n} under ∼_g the cycles of g. Note that from any i in a class, we get all the other elements of the cycle containing i by applying powers of g to i.

Cycle notation

For g ∈ S_n, the cycle notation for g is the form

    g = (a_1 a_2 . . . a_{k_1})(a_{k_1+1} a_{k_1+2} . . . a_{k_1+k_2}) · · · (a_{k_1+k_2+···+k_{p−1}+1} . . . a_{k_1+k_2+···+k_p})     (4.7)

where p is the number of cycles under ∼_g. In each parenthesized group, g(a_j) = a_{j+1}, except that if a_i is the first element of a parenthesized group and a_j is the last entry in the same group, then g(a_j) = a_i.

The barrage of notation in (4.7) is too cumbersome to digest and an example is needed. The element g given in (4.5) is written in cycle form as g = (1 5 3 4)(2 6). Note the resemblance to (4.6).

There are many ways to write an element in cycle form. The order of the cycles is irrelevant, and while the order inside a given cycle is determined by the behavior of g, the starting value for the cycle is not important. Thus we can also write g as g = (6 2)(3 4 1 5). The content of our discussion leads to the following.

Lemma 4.3.11 For each g ∈ S_n, g can be written in cycle notation.

Practicalities

Writing an element of S_n in cycle notation is easy. Let g be in S_n. Take your favorite element of {1, 2, . . . , n} (say 1). Then write down (1 g(1) g^2(1) · · · g^{k_1−1}(1)) where k_1 is the least positive integer for which g^{k_1}(1) = 1. This writes down one cycle of g. Specifically, it is the cycle containing 1.
If there are elements of {1, 2, . . . , n} that are not in the cycle containing 1, then start a new cycle with an element i not yet recorded. If picking one at random bothers you, then pick the smallest. Write down the cycle containing i as (i g(i) · · · g^{k_2−1}(i)) where k_2 is the smallest positive integer for which g^{k_2}(i) = i. If there are elements of {1, 2, . . . , n} that are not in the union of the cycles that have been written down so far, then pick one of them (smallest, if you wish) and keep going.

The procedure above is exactly how the example g = (1 5 3 4)(2 6) was written down. If we use this process on the elements α, β, γ and δ that follow (4.4), we get

    α = (1)(2 6)(3 5)(4),
    β = (1 3)(2)(4 6)(5),
    γ = (1 2 3 4 5 6),
    δ = (1 2)(3 6)(4 5).

Composition using cycle notation

Composing permutations is still composition of functions, even if the notation is cycle notation. If the result is desired in Cauchy notation, then one just considers the elements of {1, 2, . . . , n} in order. If the result is desired in cycle notation, then a different order must be used.

To compute αβ, we note that (αβ)(1) = 5. So we write down (1 5, and the next calculation we do is (αβ)(5) = 3. Our cycle of αβ now extends to (1 5 3. Next (αβ)(3) = 1, so our first cycle of αβ is (1 5 3). We go on to (αβ)(2) = 6 and so forth. The final result is αβ = (1 5 3)(2 6 4). The composition βα computes as βα = (1 3 5)(2 4 6).
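The cycle-writing procedure and composition just described can be sketched in code (the helper names and dict encoding are my own, not the text's; the composition fg applies g first, then f, as in the text):

```python
# Sketch (hypothetical helpers): permutations on {1,...,n} as dicts.
def to_cycles(g):
    """Write g in cycle notation: repeatedly start a cycle at the
    smallest element not yet recorded and apply g until it returns."""
    seen, cycles = set(), []
    for i in sorted(g):
        if i not in seen:
            cycle, j = [], i
            while j not in seen:
                seen.add(j)
                cycle.append(j)
                j = g[j]
            cycles.append(tuple(cycle))
    return cycles

def compose(f, g):
    """(fg)(i) = f(g(i)): apply g first, then f."""
    return {i: f[g[i]] for i in g}

alpha = {1: 1, 2: 6, 3: 5, 4: 4, 5: 3, 6: 2}
beta  = {1: 3, 2: 2, 3: 1, 4: 6, 5: 5, 6: 4}
print(to_cycles(compose(alpha, beta)))  # [(1, 5, 3), (2, 6, 4)]
```

The printed result agrees with the hand computation αβ = (1 5 3)(2 6 4) above, with 1-cycles omitted because none occur.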
We get α = (2 6)(3 5) and β = (1 3)(4 6). But γ = (1 2 3 4 5 6), αβ = (1 5 3)(2 6 4), βα = (1 3 5)(2 4 6) and δ = (1 2)(3 6)(4 5) are no shorter. However, if we write p = (1 5 3) and q = (2 6 4), then the statement αβ = pq becomes a true statement. This gives a convenient statement once we make a definition.

Disjoint cycles

We say that σ ∈ S_n is a cycle if it has only one cycle of length greater than 1. Thus p and q are cycles. We can add to this terminology and say that σ ∈ S_n is an n-cycle if it is a cycle and its cycle of length greater than 1 has length exactly n. Thus p and q are both 3-cycles. This leaves the problem of what to say about the identity. We will avoid the problem by simply calling it the identity.

What we have shown is that αβ is a product of two 3-cycles. But there is more information here than we are pointing out. If you let r = (1 2 3 4) and s = (4 5 6), then rs = (1 2 3 4 5 6) = γ, which is already a cycle. The point is that the set of elements {1, 5, 3} most relevant to p and the set of elements {2, 6, 4} most relevant to q are disjoint. For r and s they are not. We make two more definitions.

If σ is in S_n, the support of σ is the set {i ∈ {1, 2, . . . , n} | σ(i) ≠ i}. That is, the support of σ consists of the elements that σ actually moves. The support of p is the set {1, 5, 3} and the support of q is {2, 6, 4}.
Our observation is that the supports of p and q are disjoint. We say that two cycles are disjoint if their supports are disjoint. The main point of Lemma 4.3.11 is that every element of S_n is a product of cycles that are pairwise disjoint. The usual way to say this leaves out the word "pairwise" since it is understood that pairwise is meant. The translation of Lemma 4.3.11 into this terminology is as follows.

Lemma 4.3.12 Every non-identity element of S_n can be written as a product of disjoint cycles.

We omit the identity element from the statement for the reasons mentioned above.

Cycle structure

There is a certain level of uniqueness that goes with Lemma 4.3.12. It follows from the fact that the cycles of a g ∈ S_n are the equivalence classes of ∼_g, which are completely determined by g. Beyond that, there is a lot of freedom in writing out cycles of an element. Recall the example that we gave where g = (1 5 3 4)(2 6) = (6 2)(3 4 1 5) gives two (among many other) ways to write g as a product of disjoint cycles.

We will concentrate on the fact that the cycles of g ∈ S_n are equivalence classes determined by ∼_g. From this it is clear that the sizes of the cycles are completely determined by g. Since there can be several cycles of each size (recall the example α = (2 6)(3 5)), we cannot just talk about the set of cycle sizes. We must say how many cycles there are of each size. For a given g ∈ S_n, we will call the cycle structure of g the number of k-cycles that g has for each k. Of course, for most k this number is 0. We even include the cycles of length 1 for completeness. In the examples we have been using, α and β both have two 2-cycles and two 1-cycles, γ has one 6-cycle, and δ has three 2-cycles. Both αβ and βα have two 3-cycles. We will see later that this is not a coincidence.

We are now ready to apply cycle structure to conjugation.

4.3.5 Permutations that are conjugate in S_n

In this section we will learn exactly which pairs of elements in S_n are conjugate and which are not. Further, if two elements a and b of S_n are conjugate, we will be able to figure out an element h ∈ S_n so that a^h = b. This is rather special to S_n. Other groups do not behave this well. The key lemma that follows is a direct application of the fundamental triviality.

Lemma 4.3.13 Let a and b be in S_n. If

    (i_1 i_2 · · · i_k)

is a k-cycle of a, then

    (b(i_1) b(i_2) · · · b(i_k))

is a k-cycle of a^b.

Proof.
To say that (i_1 i_2 · · · i_k) is a cycle of a means that a(i_j) = i_{j+1} for 1 ≤ j < k and a(i_k) = i_1. By (4.3), we have a^b(b(i_j)) = b(i_{j+1}) for 1 ≤ j < k and a^b(b(i_k)) = b(i_1). But this information says exactly that (b(i_1) b(i_2) · · · b(i_k)) is a cycle of a^b. Since b is one-to-one, there are k different elements in the cycle and it is a k-cycle.

Corollary 4.3.14 Let a and b be in S_n. Then a and a^b have the same cycle structure.

Proof. From Lemma 4.3.13, b takes k-cycles of a to k-cycles of a^b. Since b is one-to-one, two k-cycles of a cannot be mapped by b to one k-cycle of a^b. Thus a^b has at least as many k-cycles as a. Now b^{-1} conjugates a^b to a, so b^{-1} takes (in a one-to-one fashion) k-cycles of a^b to k-cycles of a. Thus a has at least as many k-cycles as a^b, and we have that a and a^b have the same number of k-cycles. Since this applies to any k, we have the claimed result.

There is a converse to the corollary. Assume σ and τ are in S_n and have the same cycle structure. It is easy to create an h ∈ S_n so that σ^h = τ. It is easiest to do by using the cycle notation for σ and τ to create Cauchy notation for h.

First write out the cycle notation for σ on one line. Then on the line below, write out cycle notation for τ so that for each k every k-cycle of τ is directly below a k-cycle of σ. The fact that σ and τ have the same cycle structure guarantees that this can be done. Now erase the parentheses from the cycle structures, put large parentheses around the pair of lines, and h has been given in Cauchy notation.

The reason that this all works is (4.3) and Lemma 4.3.13. If h is built by the instructions above, then for each k-cycle of σ, its image under h is a k-cycle of σ^h. But h was built to have this be exactly a k-cycle of τ. Thus we get σ^h = τ on a cycle by cycle basis.

We illustrate this with α = (1)(2 6)(3 5)(4) and β = (1 3)(2)(4 6)(5) from our examples above. The first step is to write

    α = (1)(2 6)(3 5)(4)
    β = (2)(1 3)(4 6)(5).
Note that we have rearranged the cycles of β so that each 1-cycle of β is under a 1-cycle of α and so that each 2-cycle of β is under a 2-cycle of α. Now we change parentheses to get

    h = ( 1 2 6 3 5 4
          2 1 3 4 6 5 ) = ( 1 2 3 4 5 6
                            2 1 4 5 6 3 ) = (1 2)(3 4 5 6).

Note that there are several ways to line up the cycles of β under the cycles of α. Not only that, a k-cycle can be written down in k different ways depending on which element of the cycle is listed first. This means that there may be many elements that will work as the conjugator. This explains why we did not discover either of the elements in S_6, namely γ and δ, that we already knew conjugated α to β.

We have done all the work to prove the following.

Theorem 4.3.15 Let α and β be in S_n. Then α and β are conjugate in S_n if and only if they have the same cycle structure.

Note the phrase "in S_n" in the statement. We qualify the relation "conjugate to" to include where the conjugator must come from. This only becomes relevant if there is a group and subgroup in the discussion. If f and g are in a subgroup H of a group G, then to say that f and g are conjugate in H means that there is an h ∈ H so that f^h = g. To say that f and g are conjugate in G means that the conjugator h is only required to come from G. Since there are more elements in G that might act as conjugators, it is possible for two elements to be conjugate in G and not in H.

We can illustrate this with elements of D_12. By consulting the figure below

    [Figure: a regular hexagon with its vertices labeled 1 through 6.]

you can check that

    σ = (1 4)(2 5)(3 6),
    τ = (1 4)(2 3)(5 6)

are both elements of D_12. According to Theorem 4.3.15, σ and τ are conjugate in S_6. However, there are 720 elements in S_6 and only 12 in D_12. We could conjugate σ by all 12 elements of D_12, but that would be rather dull. We will argue that σ and τ are not conjugate in D_12 by combining what we know about conjugating permutations and what we know about hexagons.
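The lining-up recipe can be sketched as follows (the helper names and the list-of-cycles encoding are mine, not the text's); it recovers a conjugator h with α^h = β:

```python
# Sketch of the lining-up recipe: given sigma and tau with the same
# cycle structure, pair up cycles of equal length and read h off the
# two rows.  Cycles are tuples; 1-cycles are listed explicitly.
def make_conjugator(sigma_cycles, tau_cycles):
    """Equal-length cycles must sit in matching positions; h sends each
    entry of a cycle of sigma to the entry directly below it."""
    h = {}
    for sc, tc in zip(sigma_cycles, tau_cycles):
        for s, t in zip(sc, tc):
            h[s] = t
    return h

def conjugate(a, b):          # a^b = b a b^{-1}, as in (4.4)
    return {b[i]: b[a[i]] for i in a}

alpha = {1: 1, 2: 6, 3: 5, 4: 4, 5: 3, 6: 2}
beta  = {1: 3, 2: 2, 3: 1, 4: 6, 5: 5, 6: 4}
h = make_conjugator([(1,), (2, 6), (3, 5), (4,)],
                    [(2,), (1, 3), (4, 6), (5,)])
print(conjugate(alpha, h) == beta)  # True
```

The h produced here is the element (1 2)(3 4 5 6) obtained by hand; reordering the cycle lists, or rotating the entries inside a cycle, produces the other valid conjugators.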
The three 2-cycles in σ all consist of pairs that are on opposite extremes of the hexagon. They are the pairs (1 4), (2 5) and (3 6). If a conjugator h were to take the three 2-cycles of σ to the three 2-cycles of τ, at least one of the pairs just listed would have to be taken to the pair (2 3) of τ. However, this could not come from a permutation of the vertices of the hexagon that preserves the structure of the hexagon. Thus no conjugator can come from D_12. So σ and τ are in D_12, are conjugate in S_6, but not conjugate in D_12.

4.3.6 One more example

We move to a somewhat more complex example of a permutation group. We consider the group G of symmetries of the cube. A cube with labeled vertices is shown below.

    [Figure (4.8): a cube with its vertices labeled 1 through 8.]

There are 48 elements of G. We see this by noting that the vertex 1 can go to 8 different places. Once there, its three neighbors (vertices connected to it by an edge) can be permuted at will. To see this last point, note that if vertex 1 is viewed as sitting at the origin of three-space, then vertices 2, 4, 5 sit on the major axes. The vertices 2, 4, 5 can be permuted in any way while keeping 1 fixed, and the cube comes back to itself. Note that some of these permutations involve reflections as well as possibly rotations. Keeping 1 and 4 fixed and switching 2 and 5 involves a reflection of the cube in a plane through edge 1,4 and tilted 45 degrees up from the horizontal. Since there are 6 permutations for each position that 1 is sent to, there are a total of 48 symmetries of the cube in (4.8).

We can look at stabilizers. To make things more interesting, we can think of not only moving the vertices, but moving the entire cube. Thus we can let x be the center point of the top face (the face 2,3,7,6) and discuss G_x. This turns out to be the same as the stabilizer of the top face. We can let L be a line segment from the midpoint of edge 2,6 to the midpoint of edge 3,7.
Now we can ask about the stabilizer of L. These and other questions will be left as exercises.

4.3.7 Overview

All of (4.3), Lemma 4.3.6, Lemma 4.3.7 and Corollary 4.3.14 are variants of the same theme. The calculation (4.3) says that if σ and τ are permutations on X, then the behavior of σ at a given location is carried by τ to the behavior of σ^τ at the image of that location under τ. This is exploited in Corollary 4.3.14, where it is interpreted in cycle notation. Lemma 4.3.6 and Lemma 4.3.7 interpret the calculation (4.3) in the setting of a group of permutations instead of a single permutation. The lemmas say that the behavior of a group H of permutations at a given location is carried by a conjugator τ to the behavior of the group H^τ at the image of that location under τ.

This summary is as important to know as the ability to do the calculations that lie behind the summary. If this summary is well understood, then you will have an easy time with several of the problems below. The point of view just discussed will reappear often in these notes.

Exercises (33)

1. Prove Lemma 4.3.2.

2. Prove Lemma 4.3.3.

3. Prove Lemma 4.3.4.

4. In proving Lemma 4.3.4, you have to do the calculations that "fix" Cayley's theorem about multiplying on the right. If you have not already done so in Exercise Set (31), prove that setting r_a(x) = xa^{-1} for each a in a group G and f(a) = r_a gives an isomorphism from G to a subgroup of S_G.

5. There are four elements α, β, γ and δ defined after (4.4). Pick out several pairs of these four elements and conjugate one by the other. Make sure that you try at least one conjugation of an element with itself and see if the result complies with Lemma 4.3.1.

6. Let G = D_12. If you calculated α^δ, you should have gotten β, which is also α^γ. Given that α is the non-identity element of G_1, explain why α^γ = α^δ must be true.

7. We still let G = D_12.
In Problem 10 in Exercise Set (32), you hopefully wrote out the elements of G_{{1,4}}. Explain how you can immediately write out the elements of G_{{2,5}}. Explain how you can immediately get the elements of St_G({2, 5}) from the elements of St_G({1, 4}).

8. In D_8 there are four elements that do not "flip" the square. The numbering of the corners of the square in (4.1) is arranged to increase in the counterclockwise direction. The four elements that do not "flip" the square are those that preserve the fact that the numbering increases in the counterclockwise direction. We can call these elements the rotations of the square. Show that these four elements form a subgroup of D_8 and that it is normal in D_8.

9. Give the details of the proof of Lemma 4.3.8 that were hinted at following the statement of the lemma.

10. For the elements α, β, γ and δ of D_12 given after (4.4), write each of βγ, γβ, βδ, δβ, γδ and δγ in cycle notation. Do the same for γ^2, γ^3, γ^4 and γ^5.

11. What are the cycle structures of γ^i for each i with 1 ≤ i ≤ 6?

12. For the elements α and β used in Section 4.3.5, how many different elements h ∈ S_6 are there that conjugate α to β?

13. Both αβ and βα are products of two disjoint 3-cycles. Thus they have the same cycle structure and must be conjugate. Find an h that conjugates αβ to βα. Now take elements a and b in any group. Prove that ab and ba are conjugate. Hint: this last fact has nothing to do with permutations. Conclude that in S_n, even if two elements f and g do not commute, then at least fg and gf have the same cycle structure.

14. The following refer to the group G of symmetries of the cube shown in (4.8). How many elements are in each of the following? If this is too easy, list the elements.

(a) G_1.
(b) St_G({1, 7}).
(c) St_G({1, 6}).
(d) St_G({1, 5}).
(e) St_G({1, 2, 3, 4}).
(f) St_G({1, 3, 6, 8}).
(g) St_G({x, y}) where x is the center point of the top face and y is the center point of the bottom face.
(h) St_G({x, y}) where x is the center point of the edge 2,6 and y is the center point of the edge 1,5.
(i) Find an element that conjugates St_G({1, 6}) to St_G({3, 8}). Verify by writing out the elements of each and checking. You should find the conjugator before you write out the elements of the stabilizers.
(j) Find an element that conjugates St_G({1, 5}) to St_G({3, 4}). Verify by writing out the elements of each and checking. You should find the conjugator before you write out the elements of the stabilizers.

15. If you got the stabilizer of {1, 3, 6, 8} right in the group of symmetries of the cube in (4.8), then you got 24 elements. Since {1, 3, 6, 8} has 4 elements and S_4 has 24 elements, this should tell you something. What?

Chapter 5

Group actions II: general actions

In this chapter, we move from permutation groups to a more general notion known as group actions. Lemma 4.3.4 is good motivation for what we discuss in this chapter. We repeat the lemma for the convenience of the reader.

Lemma (4.3.4) If G is a group, then f : G → Aut(G) defined by f(b) = c_b, with c_b as in Lemma 4.3.3, is a homomorphism.

This can be compared to Cayley's theorem. Cayley's theorem says that any group is isomorphic to a subgroup of a symmetric group. Lemma 4.3.4 says that every group has a homomorphism to a subgroup of its own automorphism group. The similarities and differences are worth noting.

Since elements of Aut(G) are bijections from G to G (admittedly of a special nature), they are permutations of the elements of G. Thus the target of the function f in Lemma 4.3.4 is a group of permutations. This is similar to Cayley's theorem. Further, the function f in Lemma 4.3.4 is a homomorphism, so we have that f(ab) = f(a)f(b), or c_{ab} = c_a c_b.
Thus, as in Cayley's theorem, the multiplication of elements of G is reflected in the composition of the permutations that f carries the elements to.

But f is not guaranteed to be one-to-one. In fact there are strong reasons to expect it not to be one-to-one. The permutation corresponding to a is c_a, conjugation by a. Thus c_a(x) = axa^{-1}. We know that this will equal x if a and x commute. It is possible that a commutes with every x. In fact, this must happen in an abelian group. If a commutes with every x ∈ G, then we will have c_a(x) = axa^{-1} = x for every x ∈ G. This means that c_a is the identity permutation. But if 1 is the identity in G, then c_1 is also the identity permutation. If a ≠ 1, then we will have two elements in G carried to the same permutation.

We can gather a lot of this together by looking at the kernel of f. This is all a ∈ G so that f(a) = c_a is the identity in Aut(G). As just observed, c_a is the identity exactly when a commutes with every x ∈ G. This is given a name, and we have that for any group G, the center of G, written Z(G), is defined as

    Z(G) = {a ∈ G | ∀x ∈ G (ax = xa)}.

The letter Z is used since the German word Zentrum, for center, was first used for this construct.

We don't have to give an exercise to show that Z(G) is a subgroup of G. The work in the proof of Lemma 4.3.4 that shows that f is a homomorphism shows that its kernel Z(G) must be a subgroup of G. In fact Z(G) must be a normal subgroup of G. In spite of the fact that we don't have to give the exercise, we will give it anyway.

We could reject the function f of Lemma 4.3.4 as a failed attempt at a Cayley type theorem, but conjugation is too important a concept. Instead we invent a new concept called a group action and declare that conjugation is one of its prime examples.

5.1 Definition and examples

5.1.1 The definition

If G is a group and X is a set, then an action of G on X is a homomorphism θ : G → S_X.
Various words are sometimes added to actions, and we will illustrate that with the example provided by Lemma 4.3.4. Since θ is a homomorphism, it has a kernel. The kernel of θ is the kernel of the action. We will look at the kernel of an action again when new notation is introduced.

The homomorphism f of Lemma 4.3.4 is an example of a group action. The group is G and the set that G acts on is also G. All of the permutations in the image of f are automorphisms of G, so we could say that the action is by automorphisms. However, each permutation is calculated by taking a conjugation, so we could also say that the action is by conjugation. The latter is more specific, and the action supplied by Lemma 4.3.4 is usually introduced by the phrase "let G act on itself by conjugation."

Alternate definition and notation

This definition is perfectly fine, but it is not the most typical way that it is presented. The way group actions are usually presented involves more detail and less notation. We will explain.

If θ : G → S_X is an action of G on X, then for each a ∈ G we have a permutation θ(a) of X. Thus for each x ∈ X, we have θ(a)(x) as an element of X. If b is another element of G and y another element of X, then we have θ(b)(y) as an element of X. Thus for every pair (g, z) in G × X, we get an element θ(g)(z) in X. Thus θ gives a function from G × X to X.

We cannot let any function from G × X to X be an action. We would be forgetting the multiplication on G. So we would have to add restrictions that would make it an action. The following lemma tells what we would need.

Lemma 5.1.1 Let G be a group and X be a set. A function φ : G × X → X gives an action θ : G → S_X defined by θ(g)(x) = φ(g, x) if and only if the following hold.

1. For all g and h in G and x ∈ X, we have φ(gh, x) = φ(g, φ(h, x)).

2. If 1 is the identity in G, then for all x ∈ X, we have φ(1, x) = x.

We will not ask you to prove this lemma. It is too ugly.
We will pretty it up first and then ask you to prove it. The multiplication in a group is often written without a symbol. We will reword Lemma 5.1.1 to omit the function symbol φ. An action is now going to take a pair (g, x) in G × X and return an element of X that is called gx. It is thought of as the element of X that g takes x to (after g is turned into a permutation by the action). The expression gx is referred to as the result of the action of g on x. We "know" that gx is not a multiplication because g comes from a group G and x comes from a set X. Unfortunately, we have the example of G acting on itself by conjugation to prove that we have to be careful with this notation in certain circumstances. Now Lemma 5.1.1 looks like the following.

Lemma 5.1.2 Let G be a group, X be a set, and let a function from G × X to X be given where the image of (g, x) is written gx. Then this function gives an action θ : G → S_X defined by θ(g)(x) = gx if and only if the following hold.

1. For all g and h in G and x ∈ X, we have (gh)x = g(hx).

2. If 1 is the identity in G, then for all x ∈ X, we have 1x = x.

The proof is left as an exercise. It is an important exercise to do. Once proven, one has an alternate definition of an action of a group G on a set X as a function from G × X to X, with the image of (g, x) written as gx, that satisfies conditions 1 and 2 of Lemma 5.1.2.

One of the steps in the proof of Lemma 5.1.2 will show the following fact, which is important enough to state separately.

Fact: If G acts on X with g ∈ G and x and y in X, then gx = y if and only if g^{-1}y = x.

In doing the proof, you should keep careful track of what each of gh, (gh)x, hx and g(hx) belongs to, and check that all the expressions and the equalities make sense. Lastly, note that 1x is not multiplication of x by 1 and that 1x = x does not follow from the identity axiom for groups.
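For finite examples, the two conditions in Lemma 5.1.2 can be checked by brute force. Here is a minimal sketch (the function names and the tuple encoding of permutations are my choices, not the text's) verifying them for the natural action of S_3 on {1, 2, 3}:

```python
# Sketch: brute-force check of conditions 1 and 2 of Lemma 5.1.2 for a
# finite "action" given as a function act(g, x).  Group elements are
# permutation tuples (g(1),...,g(n)), composed with g applied first.
from itertools import permutations

def compose(f, g):
    return tuple(f[g[i] - 1] for i in range(len(g)))

def is_action(G, X, act, identity):
    cond1 = all(act(compose(g, h), x) == act(g, act(h, x))
                for g in G for h in G for x in X)
    cond2 = all(act(identity, x) == x for x in X)
    return cond1 and cond2

S3 = list(permutations((1, 2, 3)))
print(is_action(S3, {1, 2, 3}, lambda g, x: g[x - 1], (1, 2, 3)))  # True
```

The same checker can be pointed at any finite group and candidate action; a function that forgets the multiplication on G will fail condition 1.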
If an action of G on X uses the notation gx for the action of g ∈ G on x ∈ X, then the kernel of the action becomes the set {g ∈ G | ∀x ∈ X, gx = x}.

5.1.2 Examples

Every permutation group G on a set X gives an action of G on X. The kernel of such an action is trivial. In this situation the only element of G that acts as the identity is the identity itself. All examples of permutation groups from Chapter 4 are therefore examples of group actions. Below we give a few examples of actions that are newer.

Action of a group on itself by conjugation

We already have the example of G acting on itself by conjugation. Since the group is acting on itself, it is inadvisable to use the shorthand notation of Lemma 5.1.2 for the action. Thus it is better to use a^b for c_b(a), which is the result of b acting on a.

The notation a^b has unfortunate aspects. The first requirement in Lemma 5.1.2 says that c_{ab}(g) must equal (c_a c_b)(g) = c_a(c_b(g)). This turns into g^{ab} = (g^b)^a. Most would prefer to see the incorrect g^{ab} = (g^a)^b. In fact, the better looking equality holds if a different definition is made for conjugation. However, this would then require that permutations be composed the opposite way that functions are usually composed. Since reversing the way functions compose is easy to get used to, many books make these changes. We will stick with our definition of conjugation and stick with the way that we compose permutations. So g^{ab} = (g^b)^a will remain as something we will have to live with.

Action of a group on its subgroups by conjugation

Let G be a group and let S be the set of subgroups of G. Let g be in G. In the paragraph before Lemma 4.3.5, we defined A^g for any subset A of G, and Lemma 4.3.5 says that A^g is a subgroup if A is a subgroup. So conjugation takes subgroups to subgroups. It needs to be checked that this forms an action. Most of the work is taken up by Lemma 4.3.4.
The rest of the details are left to the reader. We will wait until we have more practice listing all the subgroups of a group before giving problems based on this example.

The action by conjugation on a normal subgroup

Let G be a group and let N ⊳ G. If b is in G and n ∈ N, then n^b is in N. Thus each element of G permutes the elements of N. Since the action of a group on itself by conjugation is truly an action, all the equalities that must hold to make the action of G on N a true action really do hold. Thus Lemma 4.3.4 shows that this is an action. This is a special case of a form of restriction that will be taken up shortly.

The action of the line on the complex plane by rotations

Let t ∈ R and z ∈ C be given. Define tz = z(cos t + i sin t). We should have introduced e^{it} for cos t + i sin t, but this would have taken time to justify. Basically, t rotates the complex plane through the angle t. Details about this action are left as an exercise.

Restrictions of actions to subgroups

Let G act on X and let H be a subgroup of G. Then H acts on X by looking only at gx for those g ∈ H. The requirements in Lemma 5.1.2 are all equalities. These cannot be violated by restricting to a subgroup, so the restriction is also an action.

Restricting an action to an invariant subset

Let G act on X and let Y be a subset of X. We cannot just look at gx for those x ∈ Y and expect to get an action on Y, since gx might not be in Y even if x is in Y. So this topic needs a condition. For G, X and Y as above, we say that Y is invariant under the action of G if gy is in Y for every g ∈ G and y ∈ Y. Now if Y is invariant under the action of G, then looking only at gx for those x ∈ Y gives an action of G on Y. Once again, the equalities required in Lemma 5.1.2 cannot fail since they hold for the action of G on X. The action of G by conjugation on a normal subgroup of G is of this type.

Exercises (34)

1.
Show that for any group G, Z(G) is a subgroup of G and in fact a normal subgroup of G.

2. Show that the center of D_6 is trivial, but that the center of D_8 is not. What are the elements of Z(D_8)?

3. Prove Lemma 5.1.2. One aspect needs care. In the direction where you prove that 1 and 2 imply that θ is an action, remember to prove that each θ(g) is a permutation on X.

4. Prove that the "action" of a group on its set of subgroups by conjugation is truly an action.

5. You showed that the four rotations in D_8 form a normal subgroup of D_8. Call this subgroup R. What is the kernel of the action of D_8 on R under conjugation? What is the kernel of the action of D_8 on itself by conjugation?

6. Show that tz = z(cos t + i sin t) is an action of R on C and find the kernel of the action.

5.2 Stabilizers

Everything that was said about stabilizers for permutation groups applies to group actions. Such sweeping statements are bad pedagogy. They invite the student to review a large amount of material with a slight change of definition and check that everything goes through as well with the new definitions as it did with the old. Very few students accept such an invitation. Given this, we will be very selective in what we ask the student to review. Also given this, we have not said everything that needs to be said about group actions/permutation groups, and have reserved a few concepts to be introduced only after group actions are defined. That will start in the next section. Here we review some definitions and a few lemmas.

Let a group G act on a set X with the result of g ∈ G acting on x ∈ X written gx. Let x ∈ X and A ⊆ X. The stabilizer of x in G is G_x = {g ∈ G | gx = x}. The pointwise stabilizer of A in G is G_A = {g ∈ G | ∀x ∈ A, gx = x}. The stabilizer of A in G is St_G(A) = {g ∈ G | ∀x ∈ A, gx ∈ A}.
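The three flavors of stabilizer can be computed by brute force on a small example. The sketch below (in Python; the choice of D_8 as permutations of a square's vertices, and the particular generators, are ours, not fixed by the text) computes G_x, G_A and St_G(A) for the dihedral group of order 8 acting on the vertex set {0, 1, 2, 3}:

```python
# D_8 acting on the vertices {0,1,2,3} of a square; a permutation p sends i to p[i]
def compose(p, q):
    # (p∘q)(i) = p(q(i)); q acts first, as with composition of functions
    return tuple(p[i] for i in q)

r = (1, 2, 3, 0)          # rotation by a quarter turn
s = (0, 3, 2, 1)          # the reflection fixing vertices 0 and 2
G = {r, s}
while True:               # naive closure under products to build the whole group
    new = {compose(a, b) for a in G for b in G} - G
    if not new:
        break
    G |= new

A = {0, 2}
stab_x    = {g for g in G if g[0] == 0}                  # G_x for x = 0
pointwise = {g for g in G if all(g[a] == a for a in A)}  # G_A
setwise   = {g for g in G if all(g[a] in A for a in A)}  # St_G(A)
```

Here |G| = 8, the stabilizer of the vertex 0 and the pointwise stabilizer of {0, 2} each have 2 elements, and the setwise stabilizer of {0, 2} has 4; all three are closed under composition, as Lemma 5.2.1 predicts.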
Comparing these definitions to those in Chapter 4 shows that the only difference typographically is the replacement of g(x) by gx in the conditions. Note that the kernel of the action of a group G on a set X is G_X.

Lemma 5.2.1 Let G act on X with x ∈ X and A ⊆ X. Then all of G_x, G_A and St_G(A) are subgroups of G.

The proof is left as an exercise.

Lemma 5.2.2 Let G act on X with b ∈ G, x ∈ X and A ⊆ X. Then with c_b conjugation by b, we have that the appropriate restriction of c_b gives isomorphisms as follows:

1. from G_x to G_{bx},
2. from G_A to G_{bA}, and
3. from St_G(A) to St_G(bA).

The shift from permutation groups to general actions is not without consequences. Calculations of conjugacy in permutation groups on finite sets (such as subgroups of S_n) are easily carried out. In general actions, such niceties as Cauchy notation or cycle notation that uniquely identify elements are not generally available. In spite of this, we will be able to work with the concepts.

Special stabilizers

Conjugation is such an important action that some stabilizers have their own names. To discuss this, we need to look at the action of G on itself by conjugation. The abbreviated notation for an action (g takes f to gf) would be too confusing for this action. It looks exactly like left multiplication by g. So we write that g takes f to f^g or, more specifically, to gfg^{-1}. We have already seen that Z(G), the center of G, is the kernel of the action of G on itself by conjugation. We look at other subgroups that can be defined by this action, and in all that follows in this section, we are considering the action of G on itself by conjugation. If g ∈ G, then the centralizer of g in G is G_g with respect to this action. It is usually defined separately as {h ∈ G | hg = gh}, and it is usually denoted C_G(g).
With the machinery we have developed, we know that the centralizer of g in G is a subgroup of G and that conjugation by f ∈ G takes the centralizer of g in G to the centralizer of g^f in G. These facts are quite easy to show directly from the definitions, but it is nice to know that they also follow from more general considerations.

If H is a subgroup of G, then the stabilizer of H under the action of G on itself by conjugation is called the normalizer of H in G. It is often denoted N_G(H). It is also a subgroup of G, and the reader can supply a typical statement about the effect of conjugation.

We can make two remarks about the normalizer. The first is that if we look instead at the action of G on the set of subgroups of G by conjugation, then N_G(H) becomes the stabilizer of a single element, namely the subgroup H. The second is that H ⊳ N_G(H), and in fact N_G(H) is the largest subgroup of G in which H is normal. That is, if K is a subgroup of G containing H and H ⊳ K, then K ⊆ N_G(H). This is a triviality from the definitions, but we leave it as an exercise to check the definitions.

Exercises (35)

1. Prove Lemma 5.2.1.

2. Prove at least one of the isomorphisms in Lemma 5.2.2.

3. Prove, without quoting any lemmas, the statement made above that "conjugation by f ∈ G takes the centralizer of g in G to the centralizer of g^f in G." The claim is that it is an easy calculation. However, it is hard to write down the easy calculation correctly. Be careful.

4. Let H be a subgroup of G. Prove that H ⊳ N_G(H), and that if K is a subgroup of G containing H and H ⊳ K, then K ⊆ N_G(H).

5.3 Orbits and fixed points

We now come to concepts that were not introduced when permutation groups were discussed. They could have been introduced with permutation groups, but they were delayed to this point to have some exercises done that are not imitations of previous exercises. We give some useful notation.
If G acts on X, if S ⊆ G and A ⊆ X are subsets, and if g is in G and x is in X, then we can write down various subsets of X. We define gA = {ga | a ∈ A}, Sx = {sx | s ∈ S}, and SA = {sa | s ∈ S, a ∈ A}. Basically, each looks at all possible combinations hinted at by each of the notations gA, Sx and SA. We look at one of these combinations when S is all of G.

Orbits

Let the group G act on the set X and let x be in X. The orbit of x under the action of G is the set O_G(x) = Gx = {gx | g ∈ G}. Either notation O_G(x) or Gx will do. We will tend to use the first more often. In words, the orbit of x is the set of all the elements that x is taken to under the action of G. Another way to describe the orbit is O_G(x) = {y ∈ X | ∃g ∈ G, y = gx}. The second description gives a better indication of how to show that something is in an orbit, and it makes a better connection to the discussion that follows.

To say something about the nature of orbits, we define a relation. Given an action of G on X with x and y in X, we say that x ∼_G y to mean that there is a g ∈ G so that gx = y. Note that the second definition of O_G(x) makes it clear that x ∼_G y if and only if y ∈ O_G(x). So if we prove that ∼_G is an equivalence relation, we will have shown that O_G(x) is an equivalence class.

Lemma 5.3.1 If G acts on X and ∼_G is defined as above, then ∼_G is an equivalence relation and each O_G(x) is an equivalence class under ∼_G.

The proof will be left as an exercise. There is some resemblance of Lemma 5.3.1 to Lemma 4.3.10, but the proof of Lemma 5.3.1 is even easier. In addition, Lemma 4.3.10 depends on Lemma 4.3.9 and there is no need for a parallel to Lemma 4.3.9 here.

Since orbits are equivalence classes, we now know that the orbits under the action of G partition X. This leads to a number of counting exercises since now the number of elements of X is known to be the sum of the sizes of the orbits. This will be used shortly.
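The partition of X into orbits can be watched concretely. In the Python sketch below (the permutation and the set X = {0, ..., 5} are our choices), the cyclic group generated by a single permutation with cycles (0 1 2)(3 4) and fixed point 5 should have orbits {0, 1, 2}, {3, 4} and {5}:

```python
# the cyclic group generated by one permutation of X = {0,...,5};
# p sends i to p[i], and its cycles are (0 1 2)(3 4) with 5 fixed
p = (1, 2, 0, 4, 3, 5)
e = tuple(range(6))

G = [e]
g = p
while g != e:             # collect the powers of p until we return to the identity
    G.append(g)
    g = tuple(p[i] for i in g)

# the orbit of x is O_G(x) = {gx | g in G}; collect the distinct orbits
orbits = {frozenset(g[x] for g in G) for x in range(6)}
```

The three orbits are pairwise disjoint and their sizes 3, 2 and 1 add up to |X| = 6, exactly the partition statement above.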
When you write out the proof of Lemma 5.3.1, you will see that Lemma 5.3.1 depends on the fact that G is a group. We do not get such good behavior from arbitrary subsets of G, and for a subset S ⊆ G, we do not call Sx an orbit and we do not write O_S(x). However, if H is a subgroup of G, then it is a group in its own right and we can talk about the orbit of x under the subgroup H, denote it O_H(x), and have it defined as Hx = {hx | h ∈ H}. You can check that O_H(x) ⊆ O_G(x) holds in this situation.

Lemma 5.3.2 Let G act on X with x ∈ X, H a subgroup of G, and g ∈ G. Then O_{H^g}(gx) = gO_H(x).

The proof is left as an exercise.

Fixed points

A special situation arises when an orbit of an action of a group G on a set X has only one element in it. The element of such an orbit is taken only to itself under the action of G. As such it is called a fixed point of G. The set of all fixed points of G is called the fixed set of G and is defined as Fix(G) = {x ∈ X | ∀g ∈ G, gx = x}. This is easy to confuse with the definition of a stabilizer. Note that stabilizers are contained in G and fixed sets are contained in X. For a subgroup H of G, we define Fix(H) = {x ∈ X | ∀h ∈ H, hx = x}. We have that Fix(G) is the union of all the orbits of G of size one. This together with Lemma 5.3.2 makes the proof of the following quite easy.

Lemma 5.3.3 Let G act on X with H a subgroup of G and b ∈ G. Then Fix(H^b) = bFix(H).

The proof is left as an exercise.

Invariant sets

Slightly looser than a fixed point is an invariant set. The following was previously mentioned, but we repeat it here with slightly different notation. The notion is easy to confuse with a stabilizer. If G acts on X and A ⊆ X, then we say that A is invariant under the action of G if GA ⊆ A. This rather brief definition when expanded into words says that a set A is invariant under the action if all its elements are carried into A by all elements of G.
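Lemma 5.3.3 just above can be tested on the square. In this sketch (Python; the vertex labeling and the choice of elements of D_8 are ours), H is generated by the reflection fixing vertices 0 and 2, and b is the quarter-turn rotation; conjugating H by b should move its fixed set from {0, 2} to bFix(H) = {1, 3}:

```python
def compose(p, q):
    return tuple(p[i] for i in q)            # q acts first

def inverse(p):
    q = [0] * len(p)
    for i, j in enumerate(p):
        q[j] = i
    return tuple(q)

def conj(h, b):
    # h^b = b h b^{-1}, the convention used in the text
    return compose(b, compose(h, inverse(b)))

def fix(H, X):
    # Fix(H) = {x in X | hx = x for all h in H}
    return {x for x in X if all(h[x] == x for h in H)}

e = (0, 1, 2, 3)
s = (0, 3, 2, 1)                             # reflection fixing vertices 0 and 2
b = (1, 2, 3, 0)                             # quarter-turn rotation
H = {e, s}
Hb = {conj(h, b) for h in H}                 # the conjugate subgroup H^b
```

Here fix(H, range(4)) is {0, 2}, fix(Hb, range(4)) is {1, 3}, and the latter is exactly {b[x] for x in fix(H, range(4))}, as Lemma 5.3.3 asserts.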
When A ⊆ X is invariant under the action of G on X, then we can restrict the action of G to only act on A. If A is not invariant, then trying to restrict the action of G to A would make no sense since there would be elements of A that G would carry outside of A. The restriction of the action of G on itself by conjugation to the action of G on a normal subgroup is an example of this.

Invariant sets and stabilizers

The notion of invariant sets cooperates with the notion of stabilizers. We have seen an example of this with the normalizer of a subgroup. Let G act on X with A ⊆ X. Then we can look at the subgroup H = St_G(A) of G. The restriction of the action to H is another action, and A is invariant under the action of H on X even if it is not invariant under the action of all of G on X. Further, H is the largest subgroup of G for which this is true. The verification of these remarks is left as an exercise.

5.4 Cosets and counting arguments

We come to the first important results in elementary group theory. They all involve counting and they give the first restrictions on how groups can be built. The main result, Lagrange's theorem, gives the possible sizes of subgroups of groups with finitely many elements. Lagrange's theorem can be proven very early after groups are defined. We have chosen to delay its proof so that we could first introduce the notion of a group action. Now that we have introduced that notion, we can make use of it.

Lagrange's theorem is only about groups with finitely many elements. A group with finitely many elements is called a finite group. Recall that the number of elements of a group G is called the order of G and is written |G|.

5.4.1 Cosets

The key lemma behind Lagrange's theorem is the following. It applies to all groups, finite or not.

Lemma 5.4.1 Let H be a subgroup of a group G and let H act on G by left multiplication in that h ∈ H takes g ∈ G to hg.
Then for any g ∈ G, the function sending h to hg is a bijection from H to the orbit Hg of g.

Proof. The function is onto by the definition of an orbit. To show the function is one-to-one, we consider h and h′ in H and assume hg = h′g. Now right multiplication by g^{-1} shows that h = h′.

The orbit Hg is usually called a right coset of H in G. The word "right" is used since the element g that determines which orbit (coset) we are talking about is on the right of H. Note that g ∈ Hg since 1 ∈ H and g = 1g. There are also left cosets. The notation gH refers to the set {gh | h ∈ H} and is called a left coset of H. Note that g ∈ gH since 1 ∈ H and g = g1.

We would like to know that the left cosets also partition G. This can be proven directly, but it would be nice to use orbits. Unfortunately, multiplication on the right is not an action of H on G. This was mentioned in Exercise Set (31). However, there is a useful right multiplication that is an action. Since the abbreviated notation will be confusing here, we will define an action of H on G in which h ∈ H takes g ∈ G to an element of G that we will write as h(g), given by the formula h(g) = gh^{-1}. If you did Problem 4 in Exercise Set (33), then you know how to prove the following lemma.

Lemma 5.4.2 If H is a subgroup of G, then defining h(g) = gh^{-1} for h ∈ H and g ∈ G defines an action of H on G.

The proof is left as an exercise. The relevance of Lemma 5.4.2 to right cosets is the following.

Lemma 5.4.3 Let H be a subgroup of G and let g ∈ G. Then the left coset gH equals {gh^{-1} | h ∈ H}.

The proof is left as an exercise. Lastly, we have the following parallel to Lemma 5.4.1.

Lemma 5.4.4 If H is a subgroup of G and g is in G, then the function sending h to gh is a bijection from H to gH.

The proof is identical with trivial changes to the proof of the last sentence in Lemma 5.4.1.
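The coset lemmas are easy to watch in action in S3. A brute-force sketch in Python (the subgroup generated by a single transposition is our choice of H):

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[i] for i in q)            # q acts first

G = list(permutations(range(3)))             # S3, as permutation tuples
H = [(0, 1, 2), (1, 0, 2)]                   # subgroup generated by the transposition 0 <-> 1

# right cosets Hg and left cosets gH, collected as sets of sets
right = {frozenset(compose(h, g) for h in H) for g in G}
left  = {frozenset(compose(g, h) for h in H) for g in G}
```

Both collections partition S3 into three blocks of size |H| = 2, giving |G| = 3·|H| as Lemma 5.4.1 promises; for this particular subgroup the left cosets are not the same collection as the right cosets.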
From Lemma 5.4.3, we know that left cosets are orbits of an action and thus form a partition of G just as the right cosets do.

This raises several questions. The first question is whether the left cosets are the same as the right cosets. We will see that sometimes they are and sometimes they are not. The difference between the two situations will be studied. The second question is why the two kinds of cosets are needed. They both partition the full group into subsets that are all the same size as each other and the same size as the subgroup. It turns out that both kinds of cosets can be useful. We turn to the first application of cosets.

5.4.2 Lagrange's theorem

For Lagrange's theorem, it does not matter which kind of cosets we use. For no particular reason, we will use right cosets. Assume that G is a finite group and that H is a subgroup of G. We know that the right cosets partition G and are all the same size. In particular, they are the same size as H. Since G is a finite group, H is finite as well. If k is the number of different right cosets of H, we must also have that k is a finite number. Thus |G| = k|H|. We have proven the following.

Theorem 5.4.5 (Lagrange) If G is a finite group and H is a subgroup of G, then |H| divides |G|.

We cannot resist giving an immediate consequence of Lagrange's theorem.

Corollary 5.4.6 If a group G has prime order, then the only subgroups of G are the trivial subgroup and G itself.

Proof. Let p = |G|. The only positive integers dividing p are 1 and p. The only subgroup of order 1 is {1}, and the only subgroup with p elements in a group with p elements is the whole group.

Another consequence of Lagrange's theorem is the following.

Corollary 5.4.7 If a group G has prime order, then it is abelian.

Proof. Assume false. Let g ∈ G be such that it does not commute with h ∈ G. We know that g cannot be the identity.
Now C_G(g) has at least the identity and g, but it does not have the element h. So it is a subgroup of G that is not trivial and not the whole group. This is impossible by Corollary 5.4.6.

There is another proof of Corollary 5.4.7 based on a better understanding of how subgroups are built. We will see this in the next chapter.

5.4.3 The index of a subgroup

Lagrange's theorem can be put in another form. If H is a subgroup of G, then the index of H in G is the number of right cosets of H in G. It is also the number of left cosets of H in G since the left cosets also partition G and are also the same size as H. The notation for the index of H in G is [G : H]. Lagrange's theorem can now be stated as follows.

Theorem 5.4.8 If H is a subgroup of G, then |G| = |H|[G : H].

Proof. This is true by the proof of the original form of Lagrange's theorem if |G| is finite. If G is infinite, then at least one of |H| or [G : H] is infinite, and we can regard the equality as being correct.

There is a certain amount of arithmetic that can be done with indexes. Recall that the definition and the notation regarding the index of a subgroup take into account both the subgroup and the group that it is contained in. If we have a chain of subgroups H ⊆ K ⊆ G, then we have three indexes that we can look at: [G : H], [G : K] and [K : H]. They are related as follows.

Lemma 5.4.9 If H ⊆ K ⊆ G is a chain of subgroups of the group G, then [G : H] = [G : K][K : H].

The proof is left as an exercise.

5.4.4 Sizes of orbits

Let G act on X and let x be in X. We want to know the size of O_G(x). The answer will be that it is equal to an index of a certain subgroup. Thus we will want to set up a one-to-one correspondence between cosets of the subgroup and the orbit. It turns out that it will be left cosets that we have to work with.

Theorem 5.4.10 Let G act on X and take x ∈ X. Then the number of elements of O_G(x) is [G : G_x].

Proof.
We will claim a one-to-one correspondence between the set of left cosets of G_x in G and O_G(x). The question is what the one-to-one correspondence is based on. The answer is that each left coset is exactly the set of elements of G that take x to a specific element of O_G(x). Here are the details.

We claim that for g ∈ G, we have h ∈ gG_x if and only if hx = gx. To show one direction, we take h ∈ gG_x. That is, h = gf for some f ∈ G_x. This means that hx = (gf)x = g(fx) = gx. To show the other direction, we assume that hx = gx. This means that (g^{-1}h)x = (g^{-1}g)x = x and g^{-1}h belongs to G_x. Say g^{-1}h = f with f ∈ G_x. That means h = gf and h ∈ gG_x.

Now define ρ from the set of left cosets of G_x in G to O_G(x) by setting ρ(gG_x) = gx. This needs to be checked for well definedness since there are many g that give the same left coset. If gG_x = hG_x, then g and h are in the same left coset and gx = hx. The function ρ is one-to-one by the "if" part of the previous paragraph. If ρ(gG_x) = ρ(hG_x), then gx = hx and g and h are in the same left coset, so gG_x = hG_x. The function ρ is onto since every element y in O_G(x) is gx for some g ∈ G and this makes ρ(gG_x) = gx = y. This completes the proof.

Corollary 5.4.11 Let a finite group G act on X with x ∈ X. Then the number of elements in O_G(x) divides |G|.

This is immediate from Theorem 5.4.10.

5.4.5 Cauchy's theorem

We come to the first intricate theorem of group theory. Lagrange's theorem is of vast importance, but it is rather elementary to prove. The proof of Cauchy's theorem, first published in 1844, has had over 150 years to be polished and simplified, and it is still an effort to prove. The following proof [3] appeared first in 1959.

Theorem 5.4.12 (Cauchy) Let a prime p divide the order of a finite group G. Then there is an element of G of order p.

Proof. Let n = |G| and let 1 be the identity element of G. Let G^p = G × G × · · · × G with p factors.
Thus each element of G^p is a p-tuple of the form (g_0, g_1, . . . , g_{p-1}). We start the subscripts with 0 to have the subscripts cooperate with the elements of Z_p. The number of elements of G^p is n^p. Let A = {(g_0, g_1, . . . , g_{p-1}) | g_0 g_1 · · · g_{p-1} = 1}. The number of elements |A| of A is n^{p-1}. This is seen since the first p − 1 entries of a p-tuple in A can be any of the elements of G. The last entry has exactly one choice, namely (g_0 g_1 · · · g_{p-2})^{-1}.

We let Z_p act on A by rotating the entries in each p-tuple of A. That is, for [k]_p ∈ Z_p, we have k(g_0, g_1, . . . , g_{p-1}) = (g_{[k]_p}, g_{[k+1]_p}, . . . , g_{[k+p-1]_p}). First we note that this is an action on A. If a product of elements is 1, then any rotation of the product is 1. This is seen by induction if we show this for rotation by one position. If g_0 g_1 · · · g_{p-1} = 1, then g_0^{-1}(g_0 g_1 · · · g_{p-1})g_0 = g_0^{-1} 1 g_0 = 1. But g_0^{-1}(g_0 g_1 · · · g_{p-1})g_0 = g_1 g_2 · · · g_{p-1} g_0, which is a single rotation of the product. Thus each element of Z_p takes elements of A to elements of A. That it is an action is seen by noting that rotating first j positions and then k positions is the same as rotating j + k positions. Also note that rotating by 0 positions leaves the p-tuple fixed. We then quote Lemma 5.1.2 to conclude that we have an action.

From Corollary 5.4.11 applied to the acting group Z_p, whose order is the prime p, we know that the size of each orbit of the action is either 1 or p. An orbit of size 1 is a fixed point of the action. But an orbit of size 1 is a p-tuple that remains the same under all possible rotations. Thus an orbit of size 1 is a p-tuple whose entries are all the same. There is at least one orbit of size 1. This has the p-tuple (1, 1, . . . , 1). Thus the number of orbits of size 1 is not zero.

Let k be the number of orbits of size 1 and let l be the number of orbits of size p. Since the orbits of the action partition A, we must have that the sum of the sizes equals the number of elements of A.
Thus n^{p-1} = |A| = k(1) + l(p). Since p divides both |A| (because p divides n) and lp, it must divide k(1) = k. Since k is not zero, it must be at least p. Thus there is at least one other p-tuple in A in which all the entries are the same besides (1, 1, . . . , 1). This tuple is of the form (g, g, . . . , g) for some g ≠ 1 in G. But to be in A, we must have gg · · · g = 1 or g^p = 1.

We must argue that p is the order of g. The order of g is not 1, and if it is some d other than p, then 1 < d < p by the definition of order. Then by the division algorithm, p = dq + r with 0 ≤ r < d. Now 1 = g^p = g^{dq+r} = g^{dq} g^r = (g^d)^q g^r = 1^q g^r = g^r, and r is a smaller power than d that makes g^r = 1. If r > 0, then this contradicts the statement that d is the order of g, so r = 0 and d|p. But this is not possible unless d = p. So p is the order of g. This completes the proof.

Exercises (36)

1. Prove Lemma 5.3.1.

2. Prove Lemma 5.3.2.

3. Prove Lemma 5.3.3.

4. Let G act on itself by conjugation. Show that for this action Fix(G) equals Z(G).

5. Let G act on X with A ⊆ X and let H = St_G(A). Show that A is invariant under the action of H on X, and that if A is invariant under the action of a subgroup K of G on X, then K ⊆ H.

6. Prove Lemma 5.4.2 if you have not already done the work in Problem 4 in Exercise Set (33). If you have done that problem, then verify that the details of the solution prove the lemma.

7. Prove Lemma 5.4.3.

8. Prove Lemma 5.4.9. As a hint, note that the number of elements in a cross product A × B of sets is the product of the sizes of the two sets A and B.

Chapter 6

Subgroups

6.1 Subgroup generated by a set of elements

We add information from Section 3.2.3. There we showed (Lemma 3.2.10) that if S is a subset of a group G, then there is a smallest subgroup of G that contains S. We used this lemma to define the subgroup of G generated by S to be this smallest subgroup.
However, we never said exactly what is in the group generated by S. In this section we will fill in this gap in our information. To save words, we let ⟨S⟩ denote the subgroup generated by S, where S is a subset of a group G. We will build on the fact that ⟨S⟩ must contain S by definition. Before we look at what else ⟨S⟩ must contain, we discuss strategy.

6.1.1 Strategy

We outline the strategy very explicitly since similar strategies apply in many other situations.

The start

We want to know what the elements of ⟨S⟩ are where S is a subset of a group G. We turn this around and pick a collection C of elements of G and ask if the collection C is equal to ⟨S⟩. Of course ⟨S⟩ is a subgroup of G, so our collection C must form a subgroup of G. Also ⟨S⟩ contains S. So our collection C must also contain S. But if our collection C is a subgroup of G and contains S, then we have

⟨S⟩ ⊆ C (6.1)

by the definition of ⟨S⟩. So all we need to prove is that C ⊆ ⟨S⟩.

The middle

To discuss how to show C ⊆ ⟨S⟩, we look at how ⟨S⟩ is formed in Lemma 3.2.10. We get ⟨S⟩ by intersecting all subgroups of G that contain S. So we need to build C so that every element of C is in every subgroup of G that contains S. Looking at this negatively, we say that we put nothing in C unless it has to be in every subgroup of G that contains S. Looking at it positively, we say that we will throw everything that we can think of into C that must be in every subgroup of G that contains S.

The end

We combine these observations. To build C, we start with S. This certainly has only elements that are in every subgroup of G that contains S. Then we throw in everything that we can think of that must be in any subgroup of G that contains S. For example, the squares of elements of S, the inverses of every element of S, the products of pairs of elements of S, and so forth. This will keep C ⊆ ⟨S⟩. If it turns out that the set C that we have created is a group, then (6.1) will give us ⟨S⟩ ⊆ C.
In summary:

1. Start with S.
2. Build C by adding to S all elements of G that must be in any subgroup of G that contains S.
3. Show that C is a group.
4. Conclude that C = ⟨S⟩.

6.1.2 The strategy applied

Let G be a group and let S be a subset of G. We use S^{-1} to denote the set of all the inverses of elements in S. More specifically, S^{-1} = {g^{-1} | g ∈ S}. We note that every element of S^{-1} must be in any subgroup of G that contains S. This statement also applies to S ∪ S^{-1}.

We now let C be all finite products of elements from S ∪ S^{-1}. This sentence needs clarification. We know what a product of two elements is. Since the multiplication is associative, we also know what a product of n elements is for n > 2. We next discuss a product of one element, and a product of zero elements. We define a product of one element x to be x itself. To discuss a product of zero elements, we note that if u is a product of m elements and v is a product of n elements, then uv is clearly a product of m + n elements. It would be nice if we defined a product of zero elements so that if z is a product of zero elements and u is a product of m elements, then zu is a product of 0 + m = m elements. This is easy to accomplish by letting z be the identity.

We turn the discussion of the last paragraph into a definition and say that a product of zero elements is the identity in G. We are now ready to prove the following.

Proposition 6.1.1 Let G be a group and let S be a subset of G. Let C consist of all finite products of elements of S ∪ S^{-1}. Then C = ⟨S⟩.

Proof. We will discuss what we are doing as the proof goes along. A subgroup of G needs three things. It needs to have the identity of G, it needs to be closed under the taking of inverses, and it needs to be closed under products. We start with a discussion of the identity. There is a bit of hidden efficiency in our setup since it works even if S is empty.
If S is empty, then our convention that a product of zero elements is the identity puts the identity in C. If S is empty, then so is S^{-1}, and there are no other elements to take products of. So we end up with only the identity in C. This is the smallest subgroup of G and clearly contains S since S is empty. If S is not empty, then it has some element whose inverse is in S^{-1}. Now the product of the element and its inverse gives the identity, and so our convention that the product of zero elements is the identity is not really needed in this case.

Next we consider inverses. Since a subgroup of G is closed under the taking of inverses, we know that S ∪ S^{-1} is in any subgroup of G that contains S. Further, the inverse of an inverse is the original element, so S ∪ S^{-1} is closed under the taking of inverses.

Next we consider products. Finite products of elements from S ∪ S^{-1} must also be in any subgroup of G that contains S since subgroups are closed under finite products. So we have C ⊆ ⟨S⟩. Also we know that if u and v are finite products of elements of S ∪ S^{-1}, then uv is also a finite product of elements of S ∪ S^{-1}. So we have created a set closed under finite products. As mentioned above, we also know that the identity is present. But by adding new elements, we may have ruined closure with respect to the taking of inverses since there are now more elements to invert. It turns out that we have not, and we now prove this.

If u is in C, then u = a_1 a_2 · · · a_n where all the a_i are in S ∪ S^{-1}. But now u^{-1} = a_n^{-1} a_{n-1}^{-1} · · · a_2^{-1} a_1^{-1}. For any i with 1 ≤ i ≤ n, we know that a_i is in S or S^{-1}, so either a_i = s or a_i = s^{-1} for some s ∈ S. But then a_i^{-1} = s^{-1} or a_i^{-1} = s for some s in S, and a_i^{-1} is in S ∪ S^{-1}. This makes u^{-1} a product of elements of S ∪ S^{-1}, and u^{-1} is in C. Thus C is a subgroup of G. As mentioned in the description of the strategy, this tells us that ⟨S⟩ ⊆ C. We have been careful to keep C ⊆ ⟨S⟩, so C = ⟨S⟩.
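Proposition 6.1.1 is effectively an algorithm: form S ∪ S^{-1} and close under products. A sketch in Python (the group S3 and the particular generating sets are our choices):

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[i] for i in q)            # q acts first

def inverse(p):
    q = [0] * len(p)
    for i, j in enumerate(p):
        q[j] = i
    return tuple(q)

def generated(S, n):
    # <S> inside the symmetric group on n letters:
    # all finite products of elements of S ∪ S^{-1}
    # (the empty product, the identity, is put in at the start)
    gens = set(S) | {inverse(s) for s in S}
    C = {tuple(range(n))}
    while True:
        new = {compose(c, g) for c in C for g in gens} - C
        if not new:
            return C
        C |= new

S = [(1, 0, 2), (1, 2, 0)]                   # a transposition and a 3-cycle in S3
```

Here generated(S, 3) returns all six elements of S3, while the 3-cycle alone generates only its three powers; both subgroup sizes divide 6, as Lagrange's theorem requires.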
The order that was used to build C in Proposition 6.1.1 is important. In the proposition, inverses are added before products. If we reverse this, we do not get a good result. Start with a set S in a group G. Throw in the identity as the first step. Then throw in all finite products as the second step. At this point, the collection is closed under products as shown in the proof of Proposition 6.1.1. Lastly, throw in the inverses of all that has been created as the third step. Since inverses were done last, it is clear that the result is closed under the taking of inverses. However, the third step has ruined the good results of the second step. The result is not necessarily closed under the taking of products. This will be explored in an exercise.

6.1.3 Generators

As mentioned in Section 3.2.3, the smallest subgroup H of a group G that contains a subset S of G is called the subgroup generated by S, and we can refer to S as a set of generators for H. If the subgroup H is the full group G, then of course we can say that G is generated by S and that S is a set of generators for G.

There is no unique set of generators for a group. If G is a group, then G is a subset of G and generates G since G is the smallest subgroup of G that contains G. However, smaller generating sets might work as well. Consider Z_k, the group of integers modulo some integer k > 1, and consider the set {1} (leaving out the brackets [ ]_k for brevity). Any subgroup of Z_k containing {1} must contain 1 + 1, as well as 1 + 1 + 1, and so forth. Eventually, we see that the subgroup must also contain a sum of k ones, which is the same as 0 in Z_k. Thus we see that any subgroup of Z_k that contains {1} must contain all of Z_k, so Z_k is generated by the single element 1.

This example leads to a definition. A group generated by one element is said to be cyclic. We have seen that Z_k is a cyclic group. It will be shown in an exercise that Z is also a cyclic group.
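The argument that 1 generates Z_k can be run by machine, and the machine can also report which other elements generate. A hedged sketch in Python: the subgroup generated by a in Z_k is just the set of multiples of a taken mod k, and a well-known fact (not proved in the text) is that a generates Z_k exactly when gcd(a, k) = 1.

```python
from math import gcd

def generated_in_Zk(a, k):
    # the subgroup of Z_k generated by a: 0, a, a + a, ... taken mod k
    return {(a * i) % k for i in range(k)}

# in Z_6, the element 1 generates, and so does 5, but no other element does
gens_of_Z6 = [a for a in range(6) if generated_in_Zk(a, 6) == set(range(6))]

# in Z_5, with a prime modulus, every nonzero element generates
gens_of_Z5 = [a for a in range(5) if generated_in_Zk(a, 5) == set(range(5))]
```

The prime case is forced by Lagrange's theorem: a nonzero element generates a subgroup of size bigger than 1 dividing 5, which must be all of Z_5.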
In a cyclic group generated by {a}, we often say that the group is generated by a.

Exercises (37)

1. Consider G = Z × Z. This consists of all ordered pairs (a, b) with both a and b from Z under the operation (a, b) + (c, d) = (a + c, b + d). Consider the set S = {(1, 0), (0, 1)}. Create a subset C of G as follows. Let C_1 be all finite sums (as opposed to products, since this abelian group is written additively) of elements of S. Let C = C_1 ∪ C_1^{-1}. In other words, we throw in all sums before we throw in inverses, in contrast to the order used in Proposition 6.1.1.

(a) Describe the elements in C.
(b) Determine whether C is a group or not.
(c) Determine what goes wrong with the proof of Proposition 6.1.1 if the order used in this exercise is used.

2. Show that Z is generated by 1.

3. Let G be a cyclic group generated by a.

(a) If the order of a is a finite number k, then show that G is isomorphic to Z_k.
(b) If the order of a is infinite, then show that G is isomorphic to Z. Hint: exhibit the isomorphism explicitly.

4. Let p > 1 be a prime integer. Use Lagrange's theorem to show that Z_p is generated by any element in Z_p that is not [0]_p.

5. Let k > 1 be an integer that is not a prime. Show that there are non-zero elements of Z_k that do not generate Z_k.

6. Can you find two elements a and b of Z_6 so that {a, b} generates Z_6, but neither a nor b by themselves generates Z_6?

7. What is the smallest generating set that you can find for D_12?

8. What is the smallest generating set that you can find for S_4?

9. List all the subgroups of D_16. This rather formidable looking problem is not all that bad. Consider what some simple combinations of elements generate, and constantly take into account Lagrange's theorem.

Chapter 7

Quotients and homomorphic images

In this chapter we take the process that creates the group Z_k from the group Z and apply it to arbitrary groups.
In order to give an outline of what will be done, we review how Zk is built and its relationship to Z.
7.1 The outline
7.1.1 On the groups Z and Zk
The construction To build Zk, we go through several steps. First we put the relation ∼k on Z and show that it is an equivalence relation. We then declare that the new group will have the equivalence classes of ∼k as its elements. Second we define a binary operation (written additively since it turns out to be abelian and it comes from the addition on Z) on the set of equivalence classes of ∼k as [a]k + [b]k = [a + b]k and show that this operation is well defined. Third, we make the easy observation that the binary operation has all the properties needed to declare that we have a group. This turns out to be easy since the binary operation on the equivalence classes is defined in terms of the operation on Z.
The homomorphism Z → Zk The fact that the binary operation on Zk is defined using the binary operation on Z makes the function π : Z → Zk defined by π(a) = [a]k a surjective homomorphism.
Construction and homomorphism revisited The relevant equivalence relation on Z is defined by a ∼k b if and only if k | (b − a). Another way to say this is to say that b − a is a multiple of k. A third way to say this is that b − a is in the set of multiples of k in Z. The multiples of k make an appearance in a discussion of the surjective homomorphism π. The multiples of k are exactly the elements in the kernel of π, since π(a) = [0]k happens exactly when [a]k = [0]k, or a − 0 is a multiple of k. Note that this means that the multiples of k in Z form a subgroup of Z and that this subgroup has to be normal. Of course, the fact that the multiples of k in Z form a subgroup is extremely easy to argue without building Zk, and the fact that the subgroup turns out to be normal is immediate from the fact that Z is abelian and all of its subgroups are normal.
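The two claims reviewed above (that [a]k + [b]k = [a + b]k is well defined, and that π(a) = [a]k respects addition) can be checked by brute force over a finite window of representatives. This sketch is illustrative only; the modulus k = 6 and the window are arbitrary choices.

```python
k = 6                                     # any modulus works; 6 is arbitrary
window = range(-12, 13)                   # a finite window of representatives

def related(a, b):
    return (b - a) % k == 0               # a ~_k b  iff  k | (b - a)

# [a]_k + [b]_k = [a + b]_k is independent of the representatives chosen
for a in window:
    for a2 in window:
        if related(a, a2):
            for b in window:
                for b2 in window:
                    if related(b, b2):
                        assert related(a + b, a2 + b2)

# naming the class [a]_k by the representative a % k, the projection
# pi respects addition: pi(a + b) agrees with pi(a) + pi(b) computed in Z_k
pi = lambda a: a % k
assert all(pi(a + b) == (pi(a) + pi(b)) % k for a in window for b in window)
```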
If we let M be the set of multiples of k in Z, then we have the following. 1. The equivalence relation ∼k could have been defined by saying a ∼k b if and only if b − a is in M. 2. After Zk is defined from ∼k, the kernel of π : Z → Zk turns out to be M.
7.1.2 The new outline
We wish to apply our experience with Z and Zk to more general situations. We wish to start with an arbitrary group G (which might not be abelian) and form a new group by putting an equivalence relation on G and defining a binary operation on the set of equivalence classes. If the new group is called Q and the operation on Q is based on the operation on G, then there should be a reasonable homomorphism from G to Q. If the construction of Q from G closely imitates the construction of Zk from Z, then the kernel of this homomorphism should be related to the equivalence relation that we put on G. But the kernel of a homomorphism is always normal, so the equivalence relation should probably be based on a normal subgroup of G. Next we note that we should be writing operations multiplicatively since G might not be abelian. Thus we should be looking at ba−1 instead of b − a. We have arrived at the following outline.
1. Start with a group G and a normal subgroup N of G.
2. Define a relation ∼N on G by saying a ∼N b if and only if ba−1 ∈ N.
3. Prove that ∼N is an equivalence relation. Let [a]N denote the equivalence class of a under this relation.
4. Define a binary operation on the equivalence classes by [a]N [b]N = [ab]N and prove that this is well defined.
5. Prove that all the requirements in the definition of a group are satisfied by this operation on the set of equivalence classes. Let Q denote the resulting group structure on the set of equivalence classes.
6. Define h : G → Q by h(a) = [a]N. That this is a homomorphism is immediate from the definition of the binary operation on Q.
7. Show that N is the kernel of h.
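The early steps of the outline can be run mechanically on a small finite group. The sketch below is an illustration only: it uses the group of all permutations of three symbols and its subgroup of even permutations (neither is part of the text's development here), checks that ∼N is an equivalence relation, and checks that the class of the identity, which is the kernel of h(a) = [a]N, is N itself.

```python
from itertools import permutations

def mul(p, q):                                   # (pq)(x) = p(q(x))
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    q = [0] * 3
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

G = list(permutations(range(3)))                 # a small non-abelian group
sign = lambda p: sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3)) % 2
N = {p for p in G if sign(p) == 0}               # the even permutations: normal in G

rel = lambda a, b: mul(b, inv(a)) in N           # step 2: a ~_N b iff ba^-1 in N

# step 3: ~_N is reflexive, symmetric, and transitive
assert all(rel(a, a) for a in G)
assert all(rel(b, a) for a in G for b in G if rel(a, b))
assert all(rel(a, c) for a in G for b in G for c in G if rel(a, b) and rel(b, c))

# the classes partition G into |G| / |N| pieces
cls = lambda a: frozenset(b for b in G if rel(a, b))
assert len({cls(a) for a in G}) == len(G) // len(N)

# step 7: the class of the identity, the kernel of h(a) = [a]_N, is N itself
assert cls((0, 1, 2)) == frozenset(N)
```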
Some details in the outline above are not usually given in the way that we have described them. It turns out that the equivalence classes [a]N have a much more familiar description. Before working through the outline, we first take a look at the equivalence classes.
Exercises (38)
The following is technically optional. It will be shown by other means in the next section. However, it is extremely easy and should be done anyway. Note that normality is not needed.
1. Let G be a group with a subgroup H. Define the relation ∼H on G by declaring a ∼H b to mean that ba−1 is in H. Prove that ∼H is an equivalence relation.
7.2 Cosets
We make our first observations about an arbitrary subgroup and then specialize to normal subgroups. We do these two steps to emphasize the difference in behavior between normal subgroups and more general subgroups.
7.2.1 Identifying the equivalence classes with cosets
Let G be a group and let H be a subgroup of G. Define the relation ∼H on G by declaring a ∼H b to mean that ab−1 ∈ H. We could go through the work to show that this is an equivalence relation (which you may have done already in an exercise above), but it turns out that we have already done that work. Since this might be hard to recognize, we show that ∼H could have been given a different definition. This will allow us to argue that it is an equivalence relation and it will give us a second view (actually a third) on the relation. The following lemma refers to orbits of an action as discussed in Section 5.3, and the view of cosets as orbits as discussed in Section 5.4.1.
Lemma 7.2.1 Let G be a group and let H be a subgroup of G. Then for elements a and b in G, we have a ∼H b if and only if a is in the orbit Hb of b under the action of H on G by left multiplication.
Proof. If a ∼H b, then ab−1 = h for some h ∈ H. This means that a = hb, putting a in the orbit Hb of the action.
Conversely, if a is in the orbit of b under the action of H on G by left multiplication, then a = hb for some h ∈ H, making ab−1 = h ∈ H.
In Section 5.3 we saw from Lemma 5.3.1 that being in the same orbit of an action is an equivalence relation and that orbits are equivalence classes. Thus we have one step of the outline finished. Not only that, we have a name for the equivalence classes. The equivalence classes are simply the right cosets of H in G. So far, normality has not been essential in the outline. However, the next steps will need normality. The definition of the binary operation on the equivalence classes will not be well defined unless the subgroup in question is normal. We next look at the important difference that normality makes.
7.2.2 Cosets of normal subgroups
An example We will look at an example that will run through this discussion for a while. We will look at D16, the dihedral group of order 16. This is the group of symmetries of the octagon shown below.
[Figure: a regular octagon with its vertices labeled 1 through 8.]
This example was picked because it is large enough to exhibit certain behaviors. We will find two subgroups N and H of the same size, with N normal in D16 and H not. Further, N will be normal in spite of the fact that some of its elements do not commute with some elements of D16. The fact that N and H are the same size is not crucial, just pleasing. With the vertices labeled as shown above, we can use cycle notation to describe all 16 elements of D16. First we have the identity e = (1)(2)(3)(4)(5)(6)(7)(8). Next we have the rotation r = (1 2 3 4 5 6 7 8). Another 6 elements are the various powers of r. For example r3 = (1 4 7 2 5 8 3 6) and r4 = (1 5)(2 6)(3 7)(4 8). Then we have 8 reflections. Four of them are across lines through two opposite vertices, and four of them are across lines through the midpoints of two opposite edges.
The four reflections across lines through vertices are v = (3)(2 4)(1 5)(6 8)(7), h = (1)(2 8)(3 7)(4 6)(5), d1 = (2)(1 3)(4 8)(5 7)(6), d2 = (4)(5 3)(6 2)(7 1)(8). The four reflections across lines through midpoints of edges are m1 = (1 2)(8 3)(7 4)(6 5), m2 = (2 3)(1 4)(8 5)(7 6), m3 = (3 4)(2 5)(1 6)(8 7), m4 = (4 5)(3 6)(2 7)(1 8).
We will look at two subgroups N = {1, r2, r4, r6} and H = {1, v, h, r4}. The subgroup N is normal in D16. This can be shown directly since conjugations are easy to calculate. The calculations will show that not all elements of N commute with all elements of D16. The elements 1 and r4 do, but there are elements of D16 that do not commute with r2 and r6. The subgroup H is not normal in D16. Conjugating v by r gives d1, which is not in the subgroup H.
We can look at the relations ∼N and ∼H. The equivalence classes of these relations are the right cosets of N and H, respectively. Since D16 has 16 elements and each of N and H has 4, there are 4 cosets of each. The right cosets of N are N 1 = {1, r2, r4, r6}, N r = {r, r3, r5, r7}, N v = {v, d2, h, d1}, N m1 = {m1, m2, m3, m4}. The calculations leading to N 1 and N r are trivial. To compute N v and N m1, we note that each of these cosets is filled with a flip followed by a rotation. Thus each element in these cosets reverses the circular order of the vertex labels. Next it is noted that all of the “order reversing” elements in D16 do different things to 1. For example, v(1) = 5, h(1) = 1, d1(1) = 3 and so forth. So seeing where 1 goes determines the element. For example r2 v(1) = 7, so r2 v = d2. The right cosets of H are H1 = {1, v, h, r4}, Hr = {r, m2, m4, r5}, Hr2 = {r2, d1, d2, r6}, Hr3 = {r3, m1, m3, r7}, and are again computed by seeing whether a flip has taken place and by following vertex 1. An important difference between the normal N and the non-normal H shows up when we look at left cosets.
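These right-coset computations can be verified by brute force. In the sketch below (an illustrative check, not part of the text), vertices are stored 0-indexed, every reflection is the map i ↦ (c − i) mod 8 for a suitable constant c, and the product pq means "apply q, then p", matching "a flip followed by a rotation".

```python
n = 8

def mul(p, q):
    """The product pq as functions on 0-indexed vertices: apply q, then p."""
    return tuple(p[q[i]] for i in range(n))

def refl(c):
    """The reflection i -> (c - i) mod 8; every flip of the octagon has this form."""
    return tuple((c - i) % n for i in range(n))

one = tuple(range(n))
r = tuple((i + 1) % n for i in range(n))             # r = (1 2 3 4 5 6 7 8)
v, h, d1, d2 = refl(4), refl(0), refl(2), refl(6)    # reflections through vertices
m1, m2, m3, m4 = refl(1), refl(3), refl(5), refl(7)  # reflections through edge midpoints
r2 = mul(r, r); r3 = mul(r2, r); r4 = mul(r2, r2)
r5 = mul(r4, r); r6 = mul(r4, r2); r7 = mul(r6, r)

N = {one, r2, r4, r6}
H = {one, v, h, r4}
coset = lambda S, x: frozenset(mul(s, x) for s in S)   # the right coset Sx

assert coset(N, v) == {v, d2, h, d1}
assert coset(N, m1) == {m1, m2, m3, m4}
assert coset(H, r) == {r, m2, m4, r5}
assert coset(H, r2) == {r2, d1, d2, r6}
assert coset(H, r3) == {r3, m1, m3, r7}
```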
The left cosets of N are 1N = {1, r2, r4, r6}, rN = {r, r3, r5, r7}, vN = {v, d1, h, d2}, m1 N = {m1, m4, m3, m2}, and the left cosets of H are 1H = {1, v, h, r4}, rH = {r, m3, m1, r5}, r2 H = {r2, d2, d1, r6}, r3 H = {r3, m4, m2, r7}.
Observations We note that every left coset xN of N equals the right coset N x. However, there are left cosets of H that equal no right coset of H. In particular, rH and r3 H equal no right coset of H. Note that some left cosets of H do equal the corresponding right coset. This is no surprise since 1H = H1 has to be true. We also have r2 H = Hr2. These observations lead to the next lemma, whose proof is left to the reader.
Lemma 7.2.2 Let G be a group and let J be a subgroup of G. Then the following are equivalent. 1. J is normal in G. 2. For every g ∈ G, we have gJ = Jg. 3. For every g ∈ G and j ∈ J, there is an element k ∈ J so that gj = kg. 4. For every g ∈ G and j ∈ J, there is an element k ∈ J so that jg = gk.
Exercises (39)
1. Prove that the H given above is a subgroup of D16. This exercise should also ask to show that N is a subgroup, but that would be too easy. You should check that N is normal in D16, however.
2. Find an element g of D16 so that gr2 g−1 ≠ r2.
3. Verify that the conjugate of v by r is d1, that is, that r−1 vr = d1.
4. The order that the elements are listed in N v and vN is based on the order that the elements are listed in N. Verify that when every element of N, in the order given above, is multiplied on the right by v, we get the elements of N v in the order given above, and that when every element of N, in the order given above, is multiplied on the left by v, we get the elements of vN in the order given above.
5. More generally, verify all the calculations of the right and left cosets of N and H. If you do not take into account the remarks made above about how the calculations can be done, you will end up doing much more work than necessary.
6. Prove Lemma 7.2.2.
Note that to prove that 1 through 4 are equivalent to each other, it suffices to prove 1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 1. Any other cyclic order of 1 through 4 will do. Also note that once 3 (say) is known, then it can be exploited in 4 (say) by seeing that every g ∈ G is the inverse of some other element of G.
7.3 The construction
We are now ready to work through the outline.
7.3.1 The multiplication
Let G be a group and let N be a normal subgroup of G. Define ∼N on G by saying that a ∼N b means that ba−1 is in N. We know that this is an equivalence relation and we know that for each a ∈ G the equivalence class of a ∈ G under this relation is just the right coset N a of N. Because of Lemma 7.2.2, we know that right cosets of N in G are also left cosets of N in G. So referring to right versus left cosets of N is not all that important. However, we will usually refer to right cosets to be specific.
The definition Define a binary multiplication on the set of equivalence classes (i.e., the set of right cosets of N in G) by declaring (N a)(N b) = N (ab). (7.1)
Well definedness When we write N a, we are identifying an entire set by mentioning one element a of that set. There are other elements in that set that could have been used equally well. For example, if we consider D16 and the normal subgroup N = {1, r2, r4, r6}, the cosets N v, N h, N d1 and N d2 are all the same. So the definition in (7.1) gives the result of multiplying two cosets by a formula that uses representatives of the cosets being multiplied. Thus we have to prove that the result is independent of the representatives chosen. The following use of Lemma 7.2.2 is typical, is used often, and should be learned.
Lemma 7.3.1 For a group G with normal subgroup N, the multiplication on the cosets of N in G defined by (7.1) is well defined.
Proof. Let a, b, x and y be in G with a ∼N x and b ∼N y. We want to show that (ab) ∼N (xy). We know that xa−1 = m ∈ N and yb−1 = n ∈ N.
Now (xy)(ab)−1 = xyb−1 a−1 = xna−1 = n′ xa−1 = n′ m, where n′ is the element of N guaranteed by Lemma 7.2.2 to satisfy xn = n′ x. Since n′ and m are in N, we have n′ m ∈ N and we have shown what we needed to show.
The main feature of the proof of Lemma 7.3.1 is that even though x and n do not commute, we can “pass x over n” by replacing n with another element n′ of N.
Group properties Let Q denote the set of right cosets of N in G. The argument that Q with the multiplication as defined by (7.1) forms a group is identical to the corresponding argument for Zk. Since the product in (7.1) is based on the product in G, we get an identity for Q by noting (N 1)(N a) = N (1a) = N a = N (a1) = (N a)(N 1). (7.2) Inverses follow from (N a)(N a−1) = N (aa−1) = N 1 = N (a−1 a) = (N a−1)(N a), and associativity from (N a)((N b)(N c)) = (N a)(N (bc)) = N (a(bc)) = N ((ab)c) = (N (ab))(N c) = ((N a)(N b))(N c). We have shown that Q with the multiplication defined in (7.1) is a group.
Notation and terminology Standard notation for this construction is to denote the set of right cosets of N in G endowed with the multiplication defined in (7.1) by G/N. The group G/N is referred to as the quotient of G by N. The notation G/N is often read out loud as “G mod N” or “G modulo N.” The action of creating G/N from G is often referred to as “modding out by N.”
7.3.2 The projection homomorphism
Just as we have a homomorphism from Z to Zk, we have a homomorphism from G to G/N. If we define π : G → G/N by π(a) = N a, then (7.1) immediately gives that this is a homomorphism. It is clearly a surjection. We call π the projection or the quotient homomorphism from G to G/N. The nature of π is that it takes each element of G to the right coset of N that contains it. Thus all the elements in one coset are carried by π to one element of G/N, and elements in different cosets are carried to different elements of G/N.
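Normality is genuinely needed in Lemma 7.3.1, and this can be seen by brute force in D16. The sketch below (illustrative only; it rebuilds the running example with 0-indexed vertices) checks that the representative-based product of cosets is well defined for the normal subgroup N but fails for the non-normal subgroup H.

```python
n = 8

def mul(p, q):
    """The product pq as functions on 0-indexed vertices: apply q, then p."""
    return tuple(p[q[i]] for i in range(n))

one = tuple(range(n))
r = tuple((i + 1) % n for i in range(n))   # the rotation (1 2 3 4 5 6 7 8)
v = tuple((4 - i) % n for i in range(n))   # the reflection fixing vertices 3 and 7
h = tuple((-i) % n for i in range(n))      # the reflection fixing vertices 1 and 5
r2 = mul(r, r); r4 = mul(r2, r2); r6 = mul(r4, r2)

D16 = {one}                                # close {r, v} up under products
changed = True
while changed:
    changed = False
    for a in list(D16):
        for g in (r, v):
            if mul(a, g) not in D16:
                D16.add(mul(a, g))
                changed = True

N = {one, r2, r4, r6}                      # normal in D16
H = {one, v, h, r4}                        # not normal in D16

def product_well_defined(S):
    """Is (Sa)(Sb) = S(ab) independent of the representatives a and b?"""
    coset = {x: frozenset(mul(s, x) for s in S) for x in D16}
    for a in D16:
        for b in D16:
            for a2 in D16:
                for b2 in D16:
                    if coset[a] == coset[a2] and coset[b] == coset[b2]:
                        if coset[mul(a, b)] != coset[mul(a2, b2)]:
                            return False
    return True

assert product_well_defined(N)             # normal subgroup: well defined
assert not product_well_defined(H)         # non-normal subgroup: not well defined
```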
The kernel of π is the coset N 1. But N 1 = N. Thus N is the kernel of π. From these observations, we see that N measures the extent to which π is not one-to-one. Not only are |N| elements carried to the identity of G/N, but also, since the cosets of N are all the same size as N, we see that exactly |N| elements are carried to any one element of G/N. These observations are made more specific in the proposition below.
Proposition 7.3.2 Let G be a group and let N be a normal subgroup of G. Then π : G → G/N defined by π(a) = N a is a surjective homomorphism and for each a ∈ G, we have π−1 (N a) = N a.
Proof. All provisions except the last equality have been argued in the paragraphs above. The last equality makes two different uses of N a. To the left of the equal sign, N a represents an element of G/N. To the right of the equal sign, it is a subset of G. Now g ∈ G is in π−1 (N a) if and only if π(g) = N a. But π(g) = N g, so N g = N a and g = 1g ∈ N g = N a puts g in N a. So π−1 (N a) ⊆ N a. If g ∈ N a, then N g ∩ N a is not empty, so N a = N g since the right cosets of N partition G. This gives N a = N g = π(g), so g ∈ π−1 (N a) and N a ⊆ π−1 (N a).
We will see in the next section that this is the structure of any surjective homomorphism, and in fact every surjective homomorphism is essentially a quotient homomorphism.
7.3.3 The first isomorphism theorem
There are several theorems in group theory known as isomorphism theorems. Often they are numbered and given names like “first isomorphism theorem,” “second isomorphism theorem” and so on up to a third. Unfortunately, not all books number the isomorphism theorems the same way. We do not need them all, and we will present two of them—one here and one in a later section. We will also present a theorem usually called the correspondence theorem, but sometimes called (not by us) the “fourth isomorphism theorem.” Most books call the next theorem the first isomorphism theorem.
Theorem 7.3.3 (First Isomorphism Theorem) Let h : G → H be a surjective homomorphism and let K be the kernel of h. Then j : G/K → H defined by j(Ka) = h(a) is a well defined isomorphism from G/K to H.
Proof. We start with the well definedness question. If Ka = Kb, then ba−1 ∈ K. So h(ba−1) = 1 in H. This means 1 = h(ba−1) = h(b)h(a−1) = h(b)(h(a))−1, which implies that h(b) = h(a). The calculation j((Ka)(Kb)) = j(K(ab)) = h(ab) = h(a)h(b) = j(Ka)j(Kb) shows that j is a homomorphism. Since every c ∈ H is h(a) for some a ∈ G, we get c = h(a) = j(Ka), showing that j is onto. Lastly, if j(Ka) = j(Kb), then h(a) = h(b). From this 1 = h(b)(h(a))−1 = h(ba−1), so ba−1 is in K. This means that Ka = Kb and we have shown that j is one-to-one and thus an isomorphism.
Note that Proposition 7.3.2 says that every normal subgroup leads to a quotient and a surjective homomorphism onto the quotient. On the other hand, the First Isomorphism Theorem says that every surjective homomorphism has its image isomorphic in a natural way to the quotient of the domain by the kernel. Thus we have a close correspondence between (group, normal subgroup, quotient) and (group, kernel, homomorphic image). Often it is easier to “recognize” a quotient G/N of a group G with normal subgroup N by noticing that there is a surjective homomorphism with domain G having N as the kernel. We will give examples after we discuss some easy ways normal subgroups can occur.
7.3.4 Abelian groups and products
The easiest way to get normal subgroups is to have things commute. As mentioned in Lemma 3.2.16, every subgroup of an abelian group is normal. So it follows that if G is an abelian group and H is a subgroup of G, then we can form G/H. However, we do not have to go all the way to abelian groups to get easy access to normal subgroups. Consider two groups A and B. We know that the set A × B is defined as A × B = {(a, b) | a ∈ A, b ∈ B}.
Given that A and B are both groups, we can put a group structure on the set A × B by defining the multiplication as (a, b)(c, d) = (ac, bd). (7.3) That is, multiplication is done separately on each coordinate, with the left coordinate behaving as A behaves and the right coordinate behaving as B behaves. We will leave as an exercise that this puts a group structure on A × B. We will also leave as an exercise that A × B is abelian if both A and B are abelian. There are two important subgroups of A × B. They are A × {1} = {(a, 1) | a ∈ A}, and {1} × B = {(1, b) | b ∈ B}. The fact that these are subgroups and other facts about them are given in the next lemma, whose proof is left as an exercise.
Lemma 7.3.4 If A and B are groups, then (7.3) makes A × B into a group. If we let A′ = A × {1} and B′ = {1} × B, then the following are true. 1. For all a′ ∈ A′ and b′ ∈ B′, we have a′ b′ = b′ a′. 2. A′ and B′ are normal subgroups of A × B. 3. The quotient (A × B)/A′ is isomorphic to B and the quotient (A × B)/B′ is isomorphic to A.
Note that in the first item, both a′ and b′ denote ordered pairs.
7.3.5 Examples
The quotient of a group G by a normal subgroup N involves three groups: G, N and G/N. Obviously the first two determine the third. However, that has to be said carefully to be completely correct. It may also be guessed that knowing any two of G, N and G/N determines the third. This is also wrong. We will present very simple examples. Two of the examples involve G = Z4 × Z2. We know that G is abelian, so all its subgroups are normal. Using Z4 = {0, 1, 2, 3} and Z2 = {0, 1}, we consider the following subgroups: A = {(0, 0), (0, 1)}, B = {(0, 0), (2, 0)}, C = {(0, 0), (1, 0), (2, 0), (3, 0)}, D = {(0, 0), (2, 0), (0, 1), (2, 1)}. We also look at Z8. (We could look at Z4, but we have already defined G and a relevant subgroup and this cooperates well with Z8, so we will use it.) Using Z8 = {0, 1, 2, 3, 4, 5, 6, 7}, we let E = {0, 2, 4, 6}.
1. In exercises, you will show that A and B are both of order 2 and isomorphic to each other, that G/A is isomorphic to Z4, and that G/B is isomorphic to the Klein four group. Thus knowing what G is isomorphic to and what the normal subgroup is isomorphic to does not determine what the quotient is isomorphic to.
2. In exercises, you will show that C and D are both of order 4 and not isomorphic to each other, and that G/C is isomorphic to G/D. Thus knowing what G is isomorphic to and what the quotient is isomorphic to does not determine what the normal subgroup is isomorphic to.
3. In exercises, you will show that C and E are isomorphic to each other, that Z8/E and G/C are isomorphic to each other, and that Z8 and G are not isomorphic to each other. Thus knowing what the quotient is isomorphic to and what the normal subgroup is isomorphic to does not determine what the full group is isomorphic to.
We return to a statement in the first paragraph of this section. We said that G and N determine G/N. The first example above shows that to determine G/N you need to know the particular normal subgroup N of G, and not just what it is isomorphic to.
7.3.6 The correspondence theorem
If N is a normal subgroup of a group G, it is valid to say that (among other consequences) all of the information in N has been “collapsed” to the trivial subgroup in G/N. But there is much of the structure of G that survives the transition from G to G/N. In particular, structures of G that contain N survive in a very complete way. The correspondence theorem makes this precise. The bridge from G to G/N is the projection homomorphism π : G → G/N. The following lemma should probably have been introduced earlier.
Lemma 7.3.5 If h : G → H is a homomorphism of groups, and J ⊆ G is a subgroup of G, then the restriction of h to J is a homomorphism, and h(J) is a subgroup of H.
Proof.
There are only two requirements for something (say f) to be a homomorphism: that f be a function and that f(ab) = f(a)f(b) for all a and b in the domain. But the restriction of a function to a smaller domain is still a function, and the equality still holds since the restriction uses the same values as the original. That h(J) is a subgroup of H follows from Lemma 3.2.13.
Applying this to π : G → G/N, we see that π carries each subgroup of G to a subgroup of G/N. This gives a function from subgroups of G to subgroups of G/N. This function deserves its own notation, and even though it looks more confusing to have two notations for what looks like almost the same thing, it will be less confusing later. So for a subgroup H of G, we define π̄(H) to be π(H), and we have a function π̄ from the set of subgroups of G to the set of subgroups of G/N. This function turns out to be onto, but not necessarily one-to-one. For example, every subgroup of N (including N itself and {1}) is taken to the trivial subgroup in G/N. However, π̄ does give a one-to-one correspondence (hence the name of the theorem) between the set of subgroups of G that contain N and the set of subgroups of G/N. If we focus only on normal subgroups, then we get a similar result: π̄ gives a one-to-one correspondence between the set of normal subgroups of G that contain N and the set of normal subgroups of G/N.
Much of the argument that goes into the proof of the Correspondence Theorem is about the behavior of functions. The next lemma separates out the ideas needed so that they don’t clutter up the proof of the theorem.
Lemma 7.3.6 If f : X → Y is a function between sets, if A ⊆ X and B ⊆ Y, then 1. f(f−1(B)) ⊆ B always holds, 2. B ⊆ f(f−1(B)) holds if and only if B is contained in the image of f, 3. A ⊆ f−1(f(A)) always holds, and 4. f−1(f(A)) ⊆ A holds if and only if for every x ∈ A, we have f−1(f(x)) ⊆ A.
Proof. We leave the proof as a set of exercises. We are now ready for:
Theorem 7.3.7 (Correspondence Theorem) Let N be a normal subgroup of a group G, let S be the set of subgroups of G that contain N, and let T be the set of subgroups of G/N. Let π̄ : S → T be derived from the projection homomorphism π : G → G/N as described above. Then the following are true. 1. The function π̄ : S → T is a one-to-one correspondence. 2. If H ∈ S has H ⊳ G, then π̄(H) ⊳ G/N, so that if S′ is the set of normal subgroups in S and T′ is the set of normal subgroups in T, then π̄ also gives a function from S′ to T′. 3. The function π̄ : S′ → T′ is a one-to-one correspondence.
Proof. For the first conclusion, we use the fact that a function with an inverse is a one-to-one correspondence. Thus we need an inverse to the function π̄ : S → T. For an element A of T (a subgroup of G/N), consider π−1(A) = {g ∈ G | π(g) ∈ A}. We claim that π−1 gives a function from T to S. That is, we claim that π−1(A) is a subgroup of G that contains N. That π−1(A) contains N follows from the fact that A contains the identity of G/N and N is the kernel of π. To check that π−1(A) is a subgroup of G, we check for identity, inverses and products. Since π(1) is the identity in G/N, it is in A, so 1 ∈ π−1(A). If π(x) ∈ A, then π(x−1) = (π(x))−1 ∈ A, so every x ∈ π−1(A) has its inverse in π−1(A). Lastly, if π(x) ∈ A and π(y) ∈ A, then π(xy) = π(x)π(y) ∈ A, so every x and y in π−1(A) has xy ∈ π−1(A). So π−1(A) is a subgroup of G.
We want to show that π̄(π−1(A)) = A for all A ∈ T and that π−1(π̄(H)) = H for all H ∈ S. Here we will use the fact that π̄(H) = π(H) for any H in S. The equality π̄(π−1(A)) = A becomes π(π−1(A)) = A, which is true by the first two conclusions of Lemma 7.3.6 because π is onto. The equality π−1(π̄(H)) = H becomes π−1(π(H)) = H, and the containment H ⊆ π−1(π(H)) comes from conclusion 3 of Lemma 7.3.6. For the reverse containment, we need the fourth conclusion of Lemma 7.3.6.
For this we need that for every g ∈ H we have π−1(π(g)) ⊆ H. But π(g) = N g and Proposition 7.3.2 gives that π−1(N g) = N g, a coset of N. But N ⊆ H and g ∈ H imply that N g ⊆ H, which is what we need. This establishes that π̄ is a one-to-one correspondence. The last two provisions are left as exercises.
The key fact in the last part of the argument given above is that if N ⊆ H ⊆ G with G a group and N and H subgroups, then any coset of N that has an element of H lies entirely in H. Another way to state this is that no coset of N has an element in H and an element outside of H, and yet another way to say this is that H is a union of cosets of N.
7.3.7 Another isomorphism theorem
The next theorem is called the “second isomorphism theorem” by many books and the “third isomorphism theorem” by about as many books. We will simply call it the Other Isomorphism Theorem. If N and H are normal subgroups of G with N contained in H, then the nature of the definition of normality makes it clear that N is normal in H. Now the Correspondence Theorem says that H/N is normal in G/N. Thus we can look at the quotient (G/N)/(H/N). The elements of this quotient of quotients are clumsy to write down. They are right cosets of H/N by elements of G/N, which in turn are right cosets of N by elements of G. So a typical element would have to be written (H/N)(N x) where x ∈ G. The Other Isomorphism Theorem has this to say about this quotient of quotients.
Theorem 7.3.8 (Other Isomorphism Theorem) Let N and H be normal subgroups of G with N ⊆ H. Then taking Hx to (H/N)(N x) gives a well defined isomorphism G/H → (G/N)/(H/N).
Proof. Well definedness is easy. If x and y are in the same right coset of H, then xy−1 is in H. Now (N x)(N y)−1 = (N x)(N y−1) = N (xy−1) ∈ H/N since xy−1 ∈ H. This means that N x and N y are in the same coset of H/N in G/N.
That this is a homomorphism and a one-to-one correspondence is left as an exercise.
Exercises (40)
1. Prove that if A and B are groups, then the multiplication defined in (7.3) makes A × B a group.
2. Prove that if A and B are abelian groups, then the multiplication (7.3) on A × B is abelian.
3. Prove Lemma 7.3.4. The first sentence is a previous exercise. The first conclusion needs careful attention to what elements are. The last conclusion is best done using the First Isomorphism Theorem. Since that theorem needs a homomorphism, the main task in the last conclusion is to figure out what the right homomorphisms are.
4. Prove that the three examples in Section 7.3.5 have the properties claimed and that each example shows the “does not determine” claim stated for that example.
5. Prove Lemma 7.3.6.
6. Prove the second and third conclusions of the Correspondence Theorem. Be careful to cover all that needs to be proven in the third conclusion and be careful not to do too much work by ignoring things that have already been proven.
7. Finish the proof of the Other Isomorphism Theorem.
8. Let N be the normal subgroup of D16 described in Section 7.2.2. Write out the multiplication table for D16/N. What known group is D16/N isomorphic to? How can this be used to determine all the subgroups of D16 that contain N?
9. Consider the following two permutations in S8. σ = (1 2 3 4)(5 6 7 8), τ = (1 5)(2 6)(3 7)(4 8). These can be viewed as symmetries of the cube shown in (4.8). They generate a group G of 8 elements. The elements of G are 8 of the 16 elements that stabilize the set {x, y}, where x is the center of the square 1234 and y is the center of the square 5678. (Can you find an element in the stabilizer of {x, y} that is not in G?) (a) Show that G is abelian and is isomorphic to Z4 × Z2. Think about how little you can get away with showing. Otherwise you will end up computing an 8 × 8 multiplication table and that is way too much work.
Since G is abelian, all subgroups are normal. (b) The identity and τ form an order two subgroup K. What group is G/K isomorphic to? (c) The identity and σ2 form an order two subgroup N. What group is G/N isomorphic to?
10. Prove that if H is a subgroup of G and [G : H] = 2, then H is a normal subgroup of G.
11. The purpose of this problem is to show that a normal subgroup of a normal subgroup is not always normal. Consider certain subgroups of D8 by referring to the figure in (4.1). Let v = (1 2)(3 4) and let h = (2 3)(1 4). Let H = {1, v} and let J = {1, v, h, vh}. Show that H is a normal subgroup of J, that J is a normal subgroup of D8, and that H is not a normal subgroup of D8.
Chapter 8 Classes of groups
We became interested in groups because groups generalize groups of permutations. We became interested in groups of permutations because interesting permutations (of the roots) came up when looking at solutions to polynomials. Galois’ main observation was that the permutation groups that came up with polynomials whose roots were “expressible by radicals” were nicer than arbitrary groups. The groups that arise this way are now called “solvable groups” because of their association with solutions to polynomials. This chapter introduces solvable groups. The property that defines solvable groups does not resemble in any way a solution to a polynomial equation. It will take several other chapters to make the connection between the group property and solutions to polynomials.
8.1 Abelian groups
We start with a very short section on abelian groups. We do this for several reasons. First, it is a bit of a warm up exercise before getting into the full topic of solvable groups since some of the behavior of abelian groups is also found among solvable groups. Second, the definition of solvable groups makes reference to abelian groups.
And third, because of the reference to abelian groups in the definition of solvable groups, some facts about solvable groups are based on corresponding facts about abelian groups. 8.1.1 Subgroups of abelian groups Lemma 8.1.1 A subgroup of an abelian group is abelian. Proof. The only argument that needs to be made is based on the definition of a subgroup. If H is a subgroup of G, then the multiplication used for H is just the multiplication used for G. That is, if a and b are in H, then ab as computed in H is the same as ab as computed in G. So if G is abelian, then ab = ba as computed in G. So ab = ba as computed in H. 8.1.2 Quotients of abelian groups Lemma 8.1.2 A quotient of an abelian group is abelian. Proof. The only argument that needs to be made is based on the definition of the multiplication in a quotient of a group. If N is a normal subgroup of a group G, then the multiplication on G/N is defined by (Na)(Nb) = N(ab). Now if G is abelian, then ab = ba, so (Na)(Nb) = N(ab) = N(ba) = (Nb)(Na). 8.2 Solvable groups We will define solvable groups, and prove statements that are parallel to Lemmas 8.1.1 and 8.1.2 and that use these lemmas in their proofs. Specifically, we will show that subgroups and quotients of solvable groups are solvable. The reader should not expect it to be clear yet why this class of groups is called solvable nor why the class is at all useful. 8.2.1 The definition Let G be a group. We say that G is solvable if there is a finite sequence of subgroups {1} = G0 ⊆ G1 ⊆ G2 ⊆ · · · ⊆ Gn−1 ⊆ Gn = G with the property that for each i with 0 ≤ i < n we have Gi ⊳ Gi+1 and Gi+1/Gi is abelian. The definition needs some discussion. Recall from Problem 11 in Exercise set (40) that a normal subgroup of a normal subgroup might not be normal in the entire group. So the definition of solvable does not require that each Gi be normal in G.
It only requires that each group in the sequence be normal in the next larger subgroup in the sequence. Note that G1 must be abelian: G1/G0 must be abelian, G0 = {1}, and G1/G0 is isomorphic to G1. If G is abelian, then G is solvable by taking G0 = {1} and G1 = G. If G has finite order, then G is solvable if and only if G satisfies a definition that reads exactly the same as the definition of solvable except that we replace the requirement that Gi+1/Gi be abelian by the stronger requirement that Gi+1/Gi be cyclic. We will argue this after we give the parallels to Lemmas 8.1.1 and 8.1.2. 8.2.2 Subgroups of solvable groups Lemma 8.2.1 A subgroup of a solvable group is solvable. Proof. Let 1 = G0 ⊳ G1 ⊳ · · · ⊳ Gn = G be as in the definition of a solvable group and let H ⊆ G be a subgroup. Let Hi = H ∩ Gi. A conjugate of an element of Hi by an element of Hi+1 must be in H since both elements are in H, and it must be in Gi since Gi ⊳ Gi+1. Thus Hi ⊳ Hi+1. We now define a homomorphism from Hi+1/Hi to Gi+1/Gi. In the following, a and b are elements of Hi+1. Sending Hi a to Gi a is well defined since Hi a = Hi b implies ba⁻¹ ∈ Hi ⊆ Gi, which implies Gi a = Gi b. It is a homomorphism by the way cosets are multiplied. If Gi a = Gi b, then ba⁻¹ is in Gi. But a and b are in H, so ba⁻¹ is in H ∩ Gi = Hi and Hi a = Hi b. So the homomorphism is one-to-one. This makes Hi+1/Hi isomorphic to a subgroup of the abelian group Gi+1/Gi, which by Lemma 8.1.1 must be abelian. We have shown that H is solvable. 8.2.3 Quotients of solvable groups Lemma 8.2.2 A quotient of a solvable group is solvable. Proof. If G is a group and N ⊳ G, then the projection homomorphism π : G → G/N makes the quotient a homomorphic image. Thus it suffices to prove that a homomorphic image of a solvable group is solvable. Now assume that 1 = G0 ⊳ G1 ⊳ · · · ⊳ Gn = G so that each Gi+1/Gi is abelian, and assume that h : G → H is a surjective homomorphism.
For each i, let Hi = h(Gi). If h(a) ∈ Hi and h(b) ∈ Hi+1 with a ∈ Gi and b ∈ Gi+1, then (h(b))(h(a))(h(b))⁻¹ = h(bab⁻¹) = h(c) for some c ∈ Gi. Thus h(c) ∈ Hi and Hi ⊳ Hi+1. Define j : Gi+1/Gi → Hi+1/Hi by j(Gi a) = Hi h(a). If Gi a = Gi b, then ba⁻¹ ∈ Gi. But this gives (h(b))(h(a))⁻¹ = h(ba⁻¹) ∈ Hi, so Hi h(a) = Hi h(b) and j(Gi a) = j(Gi b), and j is well defined. It is a homomorphism by the way we define the product of cosets. It is onto since h : Gi+1 → Hi+1 is onto. Since we assume Gi+1/Gi is abelian, Lemma 8.1.2 shows that Hi+1/Hi is abelian. We have shown that H satisfies the definition of a solvable group. 8.2.4 Finite solvable groups We will discuss the following statement that appears to be stronger than the definition of solvable. We will refer to it as condition (∗). We say that a group G satisfies (∗) if there is a finite sequence of subgroups {1} = G0 ⊆ G1 ⊆ G2 ⊆ · · · ⊆ Gn−1 ⊆ Gn = G with the property that for each i with 0 ≤ i < n we have Gi ⊳ Gi+1 and Gi+1/Gi is cyclic. Clearly, if G satisfies (∗), then G is solvable, since every cyclic group is abelian. In parallel to the fact that every abelian group is solvable, we have that every cyclic group satisfies (∗). Finite abelian groups We will show first that every finite abelian group satisfies (∗), and then we will show that every finite solvable group satisfies (∗). Since every abelian group is solvable, this two-step process seems redundant, but we will use the first result in proving the second, much in the way that Lemmas 8.1.1 and 8.1.2 were used in proving Lemmas 8.2.1 and 8.2.2. Lemma 8.2.3 If a group G is finite and abelian, then it satisfies condition (∗). Proof. The proof is inductive and heavily based on the Correspondence Theorem and the Other Isomorphism Theorem. We induct on the order of the group. A group of order 1 satisfies (∗).
We now look at an abelian group G of order n and assume that any group of order less than n satisfies (∗). If G is cyclic, then there is nothing to show. So assume that there is an element x of G that is not the identity, but that does not generate all of G. Let N be the subgroup generated by x. It is not all of G, it is not trivial, and it is normal in G since G is abelian. We consider G/N. Since |G/N| = |G|/|N| and |N| > 1, we know that |G/N| < |G| = n. Also, by Lemma 8.1.2, G/N is abelian. So G/N satisfies (∗) by the inductive hypothesis, and there is a sequence H0 ⊆ H1 ⊆ · · · ⊆ Hk = G/N with H0 the trivial subgroup, all subgroups normal since G/N is abelian, and each Hi+1/Hi cyclic. Letting π : G → G/N be the quotient homomorphism, the Correspondence Theorem gives that each π⁻¹(Hi) is a subgroup of G that contains N. Let Ni = π⁻¹(Hi) for each i. Since Hi ⊆ G/N, we have that Hi is the set of cosets of N in G that lie in Ni, and we can write Hi = Ni/N. We have N0 = N. We also have Nk = π⁻¹(Hk) = π⁻¹(G/N) = G. Thus we have the sequence

{1} ⊆ N = N0 ⊆ N1 ⊆ · · · ⊆ Nk = G.   (8.1)

Since G is abelian, all subgroups are normal. Now the Other Isomorphism Theorem says that for 0 ≤ i < k we have

Ni+1/Ni ≃ (Ni+1/N)/(Ni/N) = Hi+1/Hi,

which is cyclic. (Here ≃ means “isomorphic to.”) This covers all the “successive quotients” in (8.1) except the first, N/{1}. But N was chosen cyclic, so even the first quotient is cyclic. This finishes showing that G satisfies (∗). Finite solvable groups Proposition 8.2.4 If a group G is finite and solvable, then it satisfies condition (∗). The proof will be left as an exercise. The proof should be modeled on the proof of Lemma 8.2.3. If G is a finite solvable group, then the definition of solvable will give a sequence of subgroups whose successive quotients are all abelian. What is wanted is a new sequence of subgroups whose successive quotients are all cyclic.
There is no reason to expect that the new sequence will equal the original sequence given by the definition of solvable, nor even that the new sequence will have the same number of subgroups as the original sequence. However, when Lemma 8.2.3 and the Correspondence Theorem are used to prove Proposition 8.2.4, it will be seen that there is a strong relationship between the original sequence and the new sequence. The proof should end up proving the following more specific statement. Theorem 8.2.5 If a group G has a finite sequence of subgroups Gi, 0 ≤ i ≤ n, satisfying the requirements in the definition of solvable, then it has a finite sequence of groups G′j, 0 ≤ j ≤ m, satisfying the requirements of condition (∗) so that the sequence Gi is included in the sequence G′j in that for each i, there is a j so that Gi = G′j. In other words, to get the sequence of the G′j from the sequence of the Gi, one “inserts” extra groups between the successive Gi. Exercises (41) 1. Prove Theorem 8.2.5. Of course, this will prove Proposition 8.2.4. 2. Prove that all the dihedral groups are solvable. There are infinitely many dihedral groups, and you cannot write out infinitely many proofs. But if you start with the smallest, D6, you should see a general argument that works for all of them. 3. Prove the following “converse” to Lemmas 8.2.1 and 8.2.2: if N ⊳ G, and if both N and G/N are solvable, then G is solvable. A hint is to use the Correspondence Theorem. 4. This is a bit harder. Show that S4 is solvable. To find normal subgroups, you should keep in mind that all permutations that share the same cycle structure are conjugate in S4 (Section 4.3.5). Chapter 9 Permutation groups If we accept the fact that solvable groups go with solvable polynomial equations, then the existence of non-solvable groups becomes interesting. This chapter does some calculations with permutation groups that, among other things, find examples of non-solvable groups.
If a group is to be non-solvable, it must be non-abelian. For a non-abelian group to be solvable, it has to have at least one normal subgroup that is not trivial and not the whole group. So the easiest way to find a non-solvable group is to find a non-abelian group whose only normal subgroups are the trivial group and the whole group. This leads to a definition. A group G is said to be simple if its only normal subgroups are the trivial subgroup and G itself. Note that for a prime p, the group Zp is simple because Lagrange’s theorem says that the only subgroups of Zp (normal or not) are the trivial subgroup and Zp itself. However, each Zp is abelian and thus solvable. The main purpose of this chapter is to show that the full permutation groups Sn are not solvable when n ≥ 5. We will do some direct calculations on permutations that will give this, and one or two other facts about the groups Sn that will be needed later. 9.1 Odd and even permutations A transposition in Sn has exactly one cycle of length 2 and all other cycles of length one. That is, a transposition simply switches two numbers in {1, 2, . . . , n} and leaves all other numbers fixed. A transposition looks like (a b) in cycle notation. Every permutation in Sn can be written as a product of transpositions. This is not hard to show. It is a standard exercise in almost any introductory programming class, and it is left as an exercise here. Given a permutation σ ∈ Sn, there may be many ways to write it as a product of transpositions. For example

( 1 2 3 4 )
( 3 4 1 2 ) = (1 3)(2 4) = (2 3)(3 4)(1 2)(2 3),

so even the number of transpositions used can vary. We claim that for a given permutation σ ∈ Sn, either all ways of writing σ as a product of transpositions use an even number of transpositions, or all ways of writing σ as a product of transpositions use an odd number of transpositions.
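Both factorizations in the example can be verified mechanically. The short sketch below is ours (the helper names are not from the text); it composes cycle factors right to left, so the rightmost transposition acts first.

```python
# Check that both products of transpositions equal the permutation
# 1 -> 3, 2 -> 4, 3 -> 1, 4 -> 2 of the example.

def transposition(a, b, n=4):
    # The permutation of {1,...,n} that swaps a and b.
    p = {i: i for i in range(1, n + 1)}
    p[a], p[b] = b, a
    return p

def product(perms):
    # Rightmost factor acts first: result = p1 p2 ... pk.
    result = {i: i for i in range(1, 5)}
    for p in perms:
        result = {x: result[p[x]] for x in p}
    return result

target = {1: 3, 2: 4, 3: 1, 4: 2}

two = product([transposition(1, 3), transposition(2, 4)])
four = product([transposition(2, 3), transposition(3, 4),
                transposition(1, 2), transposition(2, 3)])

assert two == target and four == target  # both factorizations check out
```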
Note that in the example above, both ways shown of writing out the given permutation used an even number of transpositions. We will prove our claim by relating the number of transpositions to a number calculated directly from the permutation. 9.1.1 Crossing number of a permutation Recall the general form of the Cauchy notation for a permutation σ ∈ Sn as introduced in Section 2.3.4:

σ = ( 1    2    3    · · ·  n    )
    ( σ(1) σ(2) σ(3) · · ·  σ(n) ).   (2.2)

For such a σ, we let the crossing number for σ be the number of pairs i and j with i < j for which σ(i) > σ(j). That is, it is the number of pairs in {1, 2, . . . , n} whose order has been “switched” by σ. In (2.2), it is the number of pairs in the bottom line that are out of order. In the example

( 1 2 3 4 )
( 3 4 1 2 ),

the bottom line reads (3 4 1 2). There are six pairs to consider in the bottom line: (3, 4), (3, 1), (3, 2), (4, 1), (4, 2), (1, 2). Of these six pairs, the ones that are out of order are (3, 1), (3, 2), (4, 1), and (4, 2), while (3, 4) and (1, 2) are in order. Thus the crossing number of this permutation is 4. We are not really concerned with the crossing number itself, but only with whether it is even or odd. The evenness or oddness of an integer is called its parity, and an integer is said to have even parity if it is even and odd parity if it is odd. However, the use of “even parity” and “odd parity” only allows one to say in two words what used to be said in one word, so we will not use it much. If σ ∈ Sn has even crossing number, then σ is said to have even parity (or more simply σ is said to be even), and σ is said to have odd parity (or be odd) otherwise. Our main result for this part will be that if σ can be written as a product of k transpositions, then the parity of k must agree with the parity of σ. We will prove this in several steps. First we observe that the identity has crossing number zero (no pair is reversed) and thus the identity permutation is even. Next we look at special transpositions.
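The crossing number just defined is easy to compute mechanically. A sketch (the function name is ours): given the bottom row of the Cauchy notation, count the pairs that are out of order.

```python
def crossing_number(bottom):
    # bottom[i - 1] is sigma(i): the bottom row of the Cauchy notation.
    n = len(bottom)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if bottom[i] > bottom[j])

assert crossing_number([3, 4, 1, 2]) == 4   # the worked example above
assert crossing_number([1, 2, 3, 4]) == 0   # the identity is even
assert crossing_number([2, 1, 3, 4]) == 1   # a transposition: odd
```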
We call a transposition (i j) an adjacent transposition if j = i + 1. Thus adjacent transpositions switch two numbers that are consecutive. Lemma 9.1.1 If σ is in Sn and τ ∈ Sn is an adjacent transposition, then the parity of τσ is the opposite of the parity of σ. Proof. If τ = (i i + 1), then the numbers in positions p and q have their order unchanged when τ acts if neither p nor q is in {i, i + 1}. Next one observes that if one of them (say p) is in {i, i + 1} and the other is not, then the values in positions p and i before τ is done become the values in positions p and i + 1 after τ is done, and the values in positions p and i + 1 become the values in positions p and i. Thus the number of pairs out of order for such a p and q does not change. Lastly, when both p and q are in {i, i + 1}, then the pair of numbers in those positions either changes from in order to out of order or the reverse. Thus the application of τ changes the crossing number by exactly one. This proves the claim. Corollary 9.1.2 If σ ∈ Sn is the product of k adjacent transpositions, then the parity of k equals the parity of σ. Proof. The identity is even, and each time we multiply by an adjacent transposition, the parity reverses. From Lemma 9.1.1, the parity of the resulting product must be the parity of the number of adjacent transpositions multiplied. Next we relate arbitrary transpositions to adjacent transpositions. Lemma 9.1.3 If (a b) is a transposition in Sn, then (a b) can be written as a product of an odd number of adjacent transpositions. The proof of this is left as an exercise. From Lemma 9.1.3 and Corollary 9.1.2, we get the following two consequences. The first is immediate. Corollary 9.1.4 Every transposition in Sn is odd. Corollary 9.1.5 If σ ∈ Sn is the product of k transpositions, then the parity of k equals the parity of σ. Proof. Each transposition can be replaced by an odd number of adjacent transpositions.
The number of adjacent transpositions in the resulting (larger) product is the sum of these odd numbers, which is odd if the number of original transpositions is odd and even otherwise. A proof similar to that of Corollary 9.1.5 gives the following. Lemma 9.1.6 If σ1 and σ2 are in Sn, then the parity of σ1σ2 is the sum of the parities of σ1 and σ2. The proof is left as an exercise. By the “sum” of two parities, we mean the parity that results if two integers are added with the given parities. Note that we can think of the elements of Z2 as representing all the parities that we need. The element 0 represents even parity, and the element 1 represents odd parity. Adding parities is now represented by addition in Z2. With this view of parities, Lemma 9.1.6 says that taking σ ∈ Sn to 0 if σ is even and taking σ to 1 if σ is odd gives a homomorphism from Sn to Z2. It is onto since we know that there are elements of Sn of odd parity. The homomorphism can be referred to as the parity homomorphism from Sn to Z2. Exercises (42) 1. Show that every permutation can be written as a product of transpositions. This is easier if you have had to write a sort routine in a programming class. 2. Show that every transposition can be written as a product of an odd number of adjacent transpositions. That is, prove Lemma 9.1.3. 3. Prove Lemma 9.1.6. 4. Let σ be a single cycle in Sn with k elements in the cycle. Show that σ is even if k is odd and odd if k is even. A direct proof would be nice, and an inductive proof would be nicer. 9.2 The alternating groups Let An denote the collection of even permutations in Sn. It follows from Lemma 9.1.6 that products of elements in An are in An, and from Corollary 9.1.5 that the inverse of an element in An is in An. Thus An is a subgroup of Sn. An alternate way to argue this is to notice that An is the kernel of the parity homomorphism from Sn to Z2.
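The homomorphism property can be checked exhaustively on a small symmetric group. The sketch below uses our own helper names (it illustrates the statement, it is not a proof): parity is computed from the crossing number, Lemma 9.1.6 is confirmed on all of S4, and the kernel there is seen to have index 2.

```python
from itertools import permutations

def parity(bottom):
    # 0 for even, 1 for odd: the parity of the crossing number.
    n = len(bottom)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if bottom[i] > bottom[j]) % 2

def compose(f, g):
    # Bottom rows on {1,...,n}; (f g)(x) = f(g(x)).
    return tuple(f[g[i] - 1] for i in range(len(g)))

S4 = list(permutations((1, 2, 3, 4)))

# parity(fg) = parity(f) + parity(g) in Z_2, for every f and g in S4.
assert all(parity(compose(f, g)) == (parity(f) + parity(g)) % 2
           for f in S4 for g in S4)

# The kernel A4 (the even permutations) has index 2 in S4.
A4 = [p for p in S4 if parity(p) == 0]
assert len(S4) == 24 and len(A4) == 12
```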
That observation also gives that An is a normal subgroup of Sn. However, since Z2 has order 2 and the parity homomorphism is onto, we get that [Sn : An] = 2, and we know that any subgroup of index two has to be normal. The groups An are called the alternating groups. One of their claims to fame is that for n ≥ 5, the group An is non-abelian and simple. We will show that A5 is simple. It is all we will need later. The techniques that we use for A5 can be generalized to all An with n ≥ 5, but it will be easier to work with just A5 since it is easy to describe all types of elements that will be encountered. 9.2.1 The A5 menagerie Assume that N is a non-trivial normal subgroup of A5. If we show that N has to be all of A5, then we will have shown that A5 is simple. The assumption that N is not trivial says that there is a non-identity element in N. There are only three kinds of non-identity elements in A5: 1. a product of two “non-overlapping” transpositions (a b)(c d), 2. a three-cycle (a b c), and 3. a five-cycle (a b c d e). All other non-identity cycle structures possible with five objects being permuted are odd. Showing that A5 is simple now proceeds in two steps. We let N be a non-trivial normal subgroup of A5 and we let g be a non-identity element of N. The first step is to show that N must contain a three-cycle. Of course, if g is already a three-cycle, there is nothing to show. If g is not a three-cycle, then we must use the structure of g and the normality of N to build another element of N that is a three-cycle. The second step is to show that if N has a three-cycle, then N must have all elements of A5. 9.2.2 Getting a three-cycle Let g be a non-identity element in a normal subgroup N of A5. We know that g has one of the cycle structures given in Section 9.2.1. If g is a three-cycle, then N certainly contains a three-cycle. If g = (a b)(c d), then we can conjugate g by h = (d e). We can calculate that g^h = (a b)(c e).
Since N is normal, g^h must be in N. One now calculates that (g^h)(g⁻¹) = (a)(b)(c d e), which is a three-cycle. If g = (a b c d e), then we can conjugate g by h = (d e). We can calculate that g^h = (a b c e d). Since N is normal, g^h must be in N. One now calculates that (g^h)(g⁻¹) = (a d e)(b)(c), which is a three-cycle. We have shown that every non-trivial normal subgroup N of A5 has a three-cycle. 9.2.3 Getting all of A5 Three-cycles We first show that a non-trivial normal subgroup N of A5 contains all three-cycles. From the previous section, we can assume that there is a three-cycle (a b c) in N. If (p q r) is another three-cycle in S5, then we know from Theorem 4.3.15 that there is a τ ∈ S5 so that (a b c)^τ = (p q r). But N ⊳ A5 can only be exploited if τ ∈ A5. There are two cases. In the case that τ is even, τ ∈ A5 and N ⊳ A5 imply that (p q r) = (a b c)^τ is in N. In the case that τ is odd, write (p q r) as (p q r)(s)(t). Let λ = (s t)τ. Then λ is even and is in A5. Further, since (s t) is its own inverse, we have (a b c)^λ = λ(a b c)λ⁻¹ = (s t)τ(a b c)τ⁻¹(s t) = (s t)(p q r)(s t) = (p q r). Now with λ ∈ A5 and N ⊳ A5, we get that (p q r) must also be in N. This proves that N contains all three-cycles. Arbitrary elements Now we wish to show that N contains all elements of A5. An arbitrary element of A5 is in one of the three forms discussed in Section 9.2.1. We already know that all elements of the second form (three-cycles) are in N, so we need to account for the other two. If g ∈ A5 is of the form (a b)(c d), then (a b c)(b c d) = (a b)(c d) = g and g is in N. If g is of the form (a b c d e), then (a b c)(c d e) = (a b c d e) = g and g is also in N. This shows that N contains all elements of A5. We have given most of the proof of the following. Theorem 9.2.1 If N is a non-trivial normal subgroup of A5, then N is all of A5. In particular, A5 is simple and not solvable. Proof.
The only item not already proven is the fact that A5 is not solvable. From the definition of solvable, we only need to show that A5 is not abelian. We leave this as an exercise. This has the following consequences. Corollary 9.2.2 The symmetric group S5 is not solvable. Proof. This follows from the fact (Lemma 8.2.1) that subgroups of solvable groups are solvable and the fact that the non-solvable A5 is a subgroup of S5. Corollary 9.2.3 The symmetric group Sn is not solvable for n ≥ 5. Proof. If Sn is the group of permutations of {1, 2, . . . , n}, then the subgroup of Sn that keeps each element of {6, 7, . . . , n} fixed is isomorphic to S5, which is not solvable. Now Lemma 8.2.1 says that Sn is not solvable. Exercises (43) 1. Find two elements of A5 that do not commute. Explain the comment in the proof of Theorem 9.2.1 that this is all that is needed to show that A5 is not solvable. 9.3 Showing a subgroup is all of Sn Here we give an argument that resembles the proof that A5 is simple, but that gives a different result. It will be needed to justify an example that will occur much later in the course. First we need a definition. If a group G acts on a set X, we say that the action is transitive (or that G acts transitively on X) if there is only one orbit of the action (which then must necessarily be all of X). Another way to say the same thing is that for every x and y in X, there is a g ∈ G so that g(x) = y. That is, you can take any element of X to any other element of X. We will apply this to the action of Sn on {1, 2, . . . , n}. If H is a subgroup of Sn, then it is also a group of permutations of {1, 2, . . . , n} and thus acts on {1, 2, . . . , n}. Clearly Sn acts transitively on {1, 2, . . . , n}, and for a subgroup H of Sn, the action of H on {1, 2, . . . , n} will either be transitive or not. Clearly, the action of the trivial subgroup of Sn on {1, 2, . . . , n} is not transitive. Our goal is the following.
Proposition 9.3.1 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5} and that contains at least one transposition, then H = S5. We will give the argument as a sequence of lemmas that build up information about H. We will be hindered by the fact that the transitivity of the action tells only so much about the action and no more. For example, if a and b are two elements of {1, 2, 3, 4, 5}, and c is a third, then we know that there is a σ ∈ H so that σ(a) = b. But we have no way of knowing what σ(c) is for this particular σ. In other words, the transitivity lets us dictate what happens to one element of {1, 2, 3, 4, 5}, but does not let us dictate what happens to two. The proofs of the lemmas will be somewhat wordy, and to be more efficient we introduce some terminology. If σ and τ are each single cycles, then we say that they overlap to mean that the set of elements involved in the cycle of σ and the set of elements involved in the cycle of τ have non-empty intersection. For example, if σ = (1 3 5) and τ = (2 5), then they overlap. But if σ = (1 3 5) and τ = (2 4), then they do not overlap. Even further, we can discuss by how much they overlap. So we can say that σ = (1 3 5) and τ = (2 3 5) overlap in two elements, but σ = (1 3 5) and τ = (2 4 5) overlap in only one element. Lemma 9.3.2 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5} and that contains at least one transposition, then H contains a three-cycle. Proof. Let τ = (a b) be the transposition that is guaranteed to be in H and let c be neither a nor b. There is a σ ∈ H so that σ(a) = c. We consider the transposition τ^σ, which we know must be in H since both τ and σ are in H. We know that τ^σ transposes at least the element c of {1, 2, 3, 4, 5}, which is neither a nor b. But we do not know what the other element transposed by τ^σ is. Thus the two-cycle τ^σ either does not overlap the two-cycle τ or they overlap in a single element.
We thus must consider two cases. In the first case, we assume that the overlap is in a single element. We leave it as an exercise to show that the product of two transpositions that overlap in a single element is a three-cycle. Thus we are done in this case. In the second case, we have τ = (a b) and τ^σ = (c d) where {a, b} and {c, d} are disjoint. But then there is a fifth element e that is not in {a, b, c, d}. There is a λ ∈ H so that λ(a) = e. Now τ^λ is a transposition of the form (e ?) where the unknown value “?” must be in {a, b, c, d} since there are only 5 values being permuted by S5. Thus τ^λ must overlap exactly one of τ or τ^σ in a single element. As in the previous case, we have two transpositions that we can multiply to give us a three-cycle. The next two lemmas will be given rather breezy proofs. You will be asked to justify the steps in an exercise. Lemma 9.3.3 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5} and that contains at least one transposition, then H contains a four-cycle. Proof. We know there is a three-cycle σ in H. Using the transitivity of H, we can conjugate σ to another three-cycle that overlaps σ in either one or two elements. If the overlap is two elements, then there is a fifth element that is fixed by both three-cycles. Using the transitivity of H, we can find a transposition that moves this fifth element. The transposition must overlap one of the two three-cycles in one element. The product of the three-cycle and the transposition that overlaps in a single element is a four-cycle. If the overlap of the two three-cycles is one element, then using the transitivity of H, we can find a transposition that moves the common element. The transposition must overlap one of the two three-cycles in one element. The product of the three-cycle and the transposition that overlaps in a single element is a four-cycle.
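The overlap facts used repeatedly in these proofs can be spot-checked by direct computation. The sketch below is ours (helper names are not from the text); products are composed right to left, with the rightmost factor acting first.

```python
# Spot-checks, not proofs: products of overlapping cycles in S5.

def from_cycle(cycle, n=5):
    # The permutation of {1,...,n} given by one cycle.
    p = {i: i for i in range(1, n + 1)}
    for i, x in enumerate(cycle):
        p[x] = cycle[(i + 1) % len(cycle)]
    return p

def compose(f, g):
    # (f g)(x) = f(g(x)): g acts first.
    return {x: f[g[x]] for x in g}

def cycle_lengths(p):
    # Lengths of the non-trivial cycles of p, sorted.
    seen, lengths = set(), []
    for start in p:
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = p[x]
                length += 1
            if length > 1:
                lengths.append(length)
    return sorted(lengths)

# Two transpositions overlapping in one element give a three-cycle:
assert cycle_lengths(compose(from_cycle((1, 2)), from_cycle((2, 3)))) == [3]

# A three-cycle times a transposition overlapping it in one element gives
# a four-cycle, and a four-cycle likewise yields a five-cycle:
assert cycle_lengths(compose(from_cycle((1, 2, 3)), from_cycle((3, 4)))) == [4]
assert cycle_lengths(compose(from_cycle((1, 2, 3, 4)), from_cycle((4, 5)))) == [5]
```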
Lemma 9.3.4 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5} and that contains at least one transposition, then H contains a five-cycle. Proof. We know there is a four-cycle σ in H. Using the transitivity of H, we can find a transposition that moves the element fixed by σ. The transposition must overlap the four-cycle in one element. The product of σ and the transposition that overlaps in a single element is a five-cycle. Lemma 9.3.5 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5} and that contains at least one transposition, then H contains all transpositions. Proof. Once again we sketch the proof and leave details as exercises. We know there is a four-cycle in H that we choose to write as (b c d e). We can find a transposition (a ?) where the unknown element is in {b, c, d, e}. Now conjugating the transposition by powers of the four-cycle gives all transpositions in which one of the elements being transposed is a. This works as long as a is the fixed element of the four-cycle. But now we can conjugate the four-cycle by powers of the five-cycle that must be in H to change the fixed element of the four-cycle to anything we want. Thus we can get all transpositions in S5 to be in H. From a problem in Exercise Set (42), we know that every permutation can be written as a product of transpositions. This fact and Lemma 9.3.5 prove Proposition 9.3.1. Exercises (44) In these exercises, it will be important to remember that there are several ways to write down the same cycle. In particular, there are two ways to write down a transposition: (a b) and (b a) denote the same transposition. There are three ways to write down the same three-cycle, four ways to write down the same four-cycle, and so forth. 1. Find a non-trivial subgroup of S3 that does not act transitively on {1, 2, 3}. Find a subgroup of S3 that is not all of S3 but does act transitively on {1, 2, 3}. 2.
Show that if σ and τ are two-cycles that overlap in a single element, then their product is a three-cycle. 3. This exercise should not be attempted until you feel that you are an expert on the proof of Lemma 9.3.2. Fill in the details in the proofs of Lemmas 9.3.3 and 9.3.4. In Lemma 9.3.3, how do we get the three-cycle that overlaps σ in one or two elements? In the two cases, how do we argue that we can find a transposition that overlaps one of the three-cycles in a single element? Show that the product of an n-cycle and a two-cycle that overlap in one element is an (n + 1)-cycle. 4. This exercise should also not be attempted until you feel that you are an expert on the proof of Lemma 9.3.2. Fill in the details in the proof of Lemma 9.3.5. Be as thorough as in the previous exercise. 5. Prove that if a subgroup H of S5 contains a transposition and a five-cycle, then H = S5. Part III Field theory Chapter 10 Field basics 10.1 Introductory remarks In this part (Part III) we will cover Galois Theory. This theory tells, for a given polynomial, when its roots can be expressed in terms of its coefficients and some basic constants using only the five operations of addition, subtraction, multiplication, division, and the extraction of n-th roots. Fields are structures in which one can do the first four of these operations, and much of our effort will be a study of fields. If a root of a polynomial is calculated from the coefficients of the polynomial by a string of the five operations, and the coefficients and basic constants are all contained in a given field, then the four operations of addition, subtraction, multiplication and division stay within that field. However, each time the taking of an n-th root is required, a larger field might have to be brought in so that the calculation can continue. For this reason, the study of pairs of fields, smaller fields contained in larger fields, becomes important.
Since the larger field is thought of as being built from the smaller field by including new numbers, the larger field is called an extension of the smaller. It is important to know when an arbitrary extension can be accomplished by the inclusion of new numbers that are n-th roots of numbers from the smaller field. And because the calculation of a root of a polynomial might involve the taking of several different n-th roots (with perhaps different values of n), it is important to know when an extension field can be obtained from a smaller field by a sequence of such inclusions. It is Galois’ main contribution that he recognized that this could be detected by looking at certain groups of symmetries (automorphisms) of the extension field. Thus we have the first themes of our study: we will study certain symmetries of certain pairs of fields, where the pair consists of a smaller field contained in an extension field. Later we will study what the symmetries reveal about the extension. In order to derive information from symmetries, you have to know what the symmetries are. In order to know what the symmetries of an object (such as a field) are, you have to know some structure of the object. Thus we do not start with symmetries, but instead start with a study of the structures of fields and field extensions. We will learn about the structures in two steps. First, we will look at some basic facts that arise from the definitions and that apply to all fields. That will be the subject of this chapter. Then we will see how certain specific fields and field extensions are constructed. The details of the construction will reveal the symmetries. The construction will be studied in a later chapter. In between this chapter and the chapter on the construction of extensions will be a chapter on polynomials. Galois Theory never strays very far from discussions of polynomials.
The subject is motivated by a search for roots of polynomials, and the promised constructions are based heavily on the properties of polynomials and sets of polynomials. Indeed, sets of polynomials are valid objects of study in an algebra course since polynomials can be added, subtracted and multiplied (but not divided) to give new polynomials, and thus form algebraic structures (rings) in their own right. We will see that the algebraic structures of sets of polynomials have very familiar properties that will be easy to exploit.

10.2 Review

The definition of a field, some examples, and some properties of fields that can be derived immediately from the definitions are to be found in Chapter 2 (Sections 2.6 and 2.8) and in Chapter 3 (Section 3.4). Important items from these sections include the following.

1. Examples of fields include the rationals Q, the real numbers R, the complex numbers C, and Q[√2], which consists of all numbers of the form a + b√2 where a and b are rationals. See Section 2.6.2 for the details of the addition, subtraction, multiplication and division of elements of Q[√2].

2. Each Zp with p a prime integer is a field. See Section 2.8.

3. The identities of a field are unique, as are additive and multiplicative inverses. Further, for all x and y in a field, we have 0x = 0, (−x)y = −(xy) and (−x)(−y) = xy. See Lemma 3.4.1.

4. If F ⊆ E is an extension of fields, then E is a vector space over F. The dimension of this vector space is called the degree of E over F and is denoted [E : F]. See Section 3.4.3.

5. The intersection of subfields is a subfield. From this it follows that if F ⊆ E is an extension of fields and S is a subset of E, then there is a smallest subfield of E that contains F and S. This subfield is called the extension of F by S in E. See Section 3.4.3.

6. A homomorphism between fields either takes all elements to 0 or is one-to-one.
In the latter case, the image of the homomorphism is a subfield of the range. See Section 3.4.4.

7. For a field E, the set Aut(E) of isomorphisms from E to itself is a group under composition. If F ⊆ E is an extension of fields, then we let Aut(E/F) = {h ∈ Aut(E) | h(x) = x for all x ∈ F}. With this definition Aut(E/F) is a subgroup of Aut(E). See Section 3.4.5.

In the rest of this chapter we add to this list a few more basic facts and concepts concerning fields.

Exercises (45)

1. This should probably have been included in Lemma 3.4.1. Prove that if xy = 0 in a field F, then either x = 0 or y = 0. Conclude that if a ≠ 0 and b ≠ 0 in a field F, then ab ≠ 0.

10.3 Fixed fields of automorphisms

Let E be a field, let θ be an automorphism of E and let Γ be a subgroup of Aut(E). The fixed field Fix(θ) of θ is the set {x ∈ E | θ(x) = x}, and the fixed field Fix(Γ) of Γ is

{x ∈ E | θ(x) = x for every θ ∈ Γ} = ⋂_{θ∈Γ} Fix(θ).

Exercises (46)

1. If E is a field, θ ∈ Aut(E) and Γ is a subgroup of Aut(E), then Fix(θ) and Fix(Γ) are subfields of E.

2. If E is an extension of the field F and Γ = Aut(E/F), then F ⊆ Fix(Γ).

In the second exercise above, you can try to prove that F = Fix(Γ), but you should fail, because the statement is false in general. Later we will see when equality occurs.

10.4 Automorphisms and polynomials

In the introductory remarks to this chapter, we pointed out that we are interested in situations in which a field contains the coefficients of a polynomial while the roots are contained not in that field but in a larger field. For several reasons, this situation cooperates beautifully with the automorphisms discussed in Section 3.4.5.

Polynomials are put together from various ingredients using addition and multiplication. Automorphisms cooperate well with addition and multiplication, and so cooperate well with the structure of a polynomial. Further, a root of a polynomial makes the value of a polynomial zero.
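Before the formal details, this cooperation can be previewed numerically. The sketch is ours, using E = C, F = R, the automorphism of complex conjugation (studied in Exercises (47) below), and our own sample polynomial x³ − 2x + 4 = (x + 2)(x² − 2x + 2), whose roots are −2 and 1 ± i.

```python
# A numerical preview: complex conjugation fixes the real coefficients of
# P(x) = x**3 - 2*x + 4 and sends each of its roots to another root.
# The polynomial is our own sample; its roots are -2 and 1 + i and 1 - i.

def P(x):
    return x**3 - 2*x + 4

roots = [-2.0, 1 + 1j, 1 - 1j]
for r in roots:
    assert abs(P(r)) < 1e-9                               # r is a root
    assert abs(P(r.conjugate())) < 1e-9                   # so is its conjugate
    assert any(abs(r.conjugate() - s) < 1e-9 for s in roots)
```

The real root is fixed by conjugation, and the two complex roots are swapped: conjugation permutes the roots, as the proposition below makes precise.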
An automorphism must take zero to zero, so we get good cooperation between automorphisms and roots. We now give details to these comments.

Let P(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 be a polynomial. Assume further that all the a_i are elements of a field F. Lastly assume that there is a field extension E of F so that some r ∈ E is a root of P. That is, P(r) = 0 in E. Let θ be an automorphism in Aut(E/F). The reason for choosing θ in Aut(E/F) will become clear immediately. Because of the properties of automorphisms, we have

θ(P(x)) = θ(a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0)
= θ(a_n x^n) + θ(a_{n−1} x^{n−1}) + · · · + θ(a_1 x) + θ(a_0)
= θ(a_n)θ(x^n) + θ(a_{n−1})θ(x^{n−1}) + · · · + θ(a_1)θ(x) + θ(a_0)
= θ(a_n)(θ(x))^n + θ(a_{n−1})(θ(x))^{n−1} + · · · + θ(a_1)θ(x) + θ(a_0)
= a_n(θ(x))^n + a_{n−1}(θ(x))^{n−1} + · · · + a_1 θ(x) + a_0
= P(θ(x))

for any x ∈ E. The next to last equality holds because θ fixes all elements of F and all the a_i are in F. Now we note that P(r) = 0, so θ(P(r)) = θ(0) = 0. But θ(P(r)) = P(θ(r)), so P(θ(r)) = 0. This means that θ(r) is a root of P(x). We have shown the following.

Proposition 10.4.1 Let F ⊆ E be fields, let P(x) be a polynomial with coefficients in F, let r ∈ E be a root of P(x), and let θ be in Aut(E/F). Then θ(r) is also a root of P(x).

We now have our first serious restriction on automorphisms. Under the right conditions, automorphisms have to take roots of a polynomial to roots of the same polynomial. In the next chapter we will see that there are a limited number of roots for a given polynomial, so that this is a real restriction. Later we will see conditions that imply that given two roots of a polynomial there must actually be an automorphism that carries one root to the other root.

Exercises (47)

1. We consider the reals R and the complex numbers C, regarding C as an extension of R. We also consider complex conjugation z ↦ z̄ taking C to C, which is defined by sending a + bi to a − bi for real a and b.
The notation z̄ is not convenient for us, since it does not give complex conjugation a symbol that looks like a function symbol, so we define κ : C → C by κ(z) = z̄ to fill that role.

(a) Prove that κ is in Aut(C/R).

(b) Prove that R is the fixed field of κ.

(c) Prove that if θ is any element of Aut(C/R), then θ(i) is either i or −i.

(d) Prove that Aut(C/R) has exactly two elements.

(e) Prove that if P(x) is a polynomial with real coefficients, and r is a root of P(x) in C, then r̄ is also a root of P(x).

Later, another exercise will use the exercise above to conclude that every polynomial P(x) with real coefficients factors into a product of polynomials with real coefficients in which each factor is of degree one or two.

10.5 On the degree of an extension

Recall (Section 3.4.3) that the degree [E : F] of an extension F ⊆ E of fields is the dimension of E as a vector space over F.

10.5.1 Comparing degree with index

The notation [E : F] for the degree of the extension invites misunderstandings since it resembles the notation used for the index of a subgroup in a group. If H is a subgroup of G, then [G : H] is the index of H in G. Both [G : H] and [E : F] relate to the sizes of the objects if they are finite, but not in the same way. Let us use |A| for the number of elements in a set A. For groups, we have

[G : H] = |G|/|H|, or equivalently |G| = |H| [G : H].

For a field extension F ⊆ E, let us work out the relation between [E : F], |E| and |F| under the assumption that all these numbers are finite. As with groups, we refer to |E| as the order of the field E. Let d = [E : F]. Since d is the dimension of the vector space E over F, there is a basis (x_1, x_2, . . . , x_d) of elements of E for this vector space. We know that every element of E is a unique linear combination of the form a_1 x_1 + a_2 x_2 + · · · + a_d x_d where all the a_i come from F.
Since this linear combination for each element of E is unique, different linear combinations give different elements of E. Thus the function that takes each such linear combination to its value in E is one-to-one. But every element of E is such a linear combination. Thus this function is also onto, and we have a one-to-one correspondence between the set of linear combinations above and the elements of E. We now count the number of such linear combinations. There are |F| choices for a_1, there are |F| choices for a_2 and so forth. Thus there are |F|^d choices all together and we have

|E| = |F|^d = |F|^[E:F].

We see that the relationship between [E : F], and |E| and |F| is very different from the relationship between the corresponding quantities for groups.

Finite extensions of Zp

For each prime p, we know that Zp is a field. Thus there is a field of order p for each prime p. If E is a finite field extension of Zp, then for d = [E : Zp], we have that |E| = p^d. Thus potentially, we have a field of order p^d for various primes p and positive integers d. It turns out that these are the only possible orders for finite fields. This will be shown shortly. It also turns out that all such orders of fields do exist. This will not be proven but will be discussed in a later chapter.

10.5.2 Properties of the degree

Lemma 10.5.1 Let G ⊆ F ⊆ E be a sequence of field extensions so that both [E : F] = m and [F : G] = n are finite. Let (x_1, x_2, . . . , x_m) be a basis for E over F and let (y_1, y_2, . . . , y_n) be a basis for F over G. Then the set B = {x_i y_j | 1 ≤ i ≤ m, 1 ≤ j ≤ n} is a basis for E over G.

Note that this proves that when [E : F] and [F : G] are both finite, then [E : G] is finite and [E : G] = [E : F][F : G].

Proof. We must show that the elements of B are linearly independent over G and span E. The span is the easier to deal with. Let e be in E.
We can get e as the linear combination

e = f_1 x_1 + f_2 x_2 + · · · + f_m x_m = Σ_{i=1}^{m} f_i x_i

with all the f_i ∈ F because (x_1, x_2, . . . , x_m) is a basis for E over F. But every f_i is a linear combination

f_i = g_{i1} y_1 + g_{i2} y_2 + · · · + g_{in} y_n = Σ_{j=1}^{n} g_{ij} y_j

because (y_1, y_2, . . . , y_n) is a basis for F over G. Putting all this together lets us write e as

e = Σ_{i=1}^{m} (Σ_{j=1}^{n} g_{ij} y_j) x_i = Σ_{i=1}^{m} Σ_{j=1}^{n} (g_{ij} y_j) x_i = Σ_{i=1}^{m} Σ_{j=1}^{n} g_{ij} (x_i y_j).

The first equality is gotten by plugging in the expression for each f_i in terms of the y_j. The second equality is the distributive law used m times and the third equality is the commutative law. This shows that e is a linear combination of the elements x_i y_j using coefficients from G. Thus the set B spans all of E.

To show linear independence, we start with a linear combination that gives zero. That is, we assume that there are g_{ij} ∈ G so that the sum of all the g_{ij} x_i y_j gives zero. If we gather all the expressions with the same i together we can write this sum as in the last line of the calculation above. We then get the above calculation in reverse. Specifically

0 = Σ_{i=1}^{m} Σ_{j=1}^{n} g_{ij} (x_i y_j) = Σ_{i=1}^{m} Σ_{j=1}^{n} (g_{ij} y_j) x_i = Σ_{i=1}^{m} (Σ_{j=1}^{n} g_{ij} y_j) x_i.

But the x_i are linearly independent. So each coefficient of x_i must be zero in the last expression, giving

Σ_{j=1}^{n} g_{ij} y_j = 0

for each i. But the y_j are linearly independent. So each coefficient g_{ij} must be zero. This proves the linear independence of the x_i y_j.

The proof of the next lemma is left as an exercise.

Lemma 10.5.2 Let F ⊆ E be a field extension. Then F = E if and only if [E : F] = 1.

Exercises (48)

1. Prove Lemma 10.5.2.

2. What is [C : R]? If it is finite, what is a basis?

3. Let T = Q[√2] as in Section 2.6.2. What is [T : Q]? What is a basis? How can you argue that your answers are correct?

4. Let T = Q[³√2] from Problem 5 of Exercise set (24). What is [T : Q]? What is a basis?
How can you argue that your answers are correct?

10.6 The characteristic of a field

10.6.1 Definition and properties

There is an important number associated to each field. Let F be a field and look at the structure of the abelian group formed by F and the addition operation. In this group, the element 1 has an order c that is either a positive integer or infinite. We will shortly give a set of exercises that will supply important information about the number c. Observe that since the group formed by F under addition is written additively, we are considering sums of copies of 1 rather than products of copies of 1. We are asking for the smallest number c so that the sum

1 + 1 + · · · + 1

(in which there are exactly c ones) is equal to zero in F. We could write c1 for this sum, but that looks too much like multiplication in F in spite of the fact that the c really comes from Z and the 1 comes from F. We invent a notation for this situation. Let n be a positive integer. We write n(1) for the sum of n ones in F. While it still looks like multiplication, it also looks like a function, which is a more accurate way to think about it. We expand the meaning of this notation to say that n(x) is the sum of n copies of x for any x ∈ F. That is,

n(x) = x + x + · · · + x

in which there are exactly n copies of x.

Exercises (49)

1. The function n defined above is NOT a multiplicative homomorphism. Show that n(xy) = xn(y) = yn(x) for any x and y in F. In particular n(x) = xn(1) for any x ∈ F.

2. The function n defined above IS an additive homomorphism. Show that n(x + y) = n(x) + n(y) for any x and y in F.

3. Show that for positive integers m and n and for x in a field F, that (mn)(x) = m(n(x)).

4. For a field F let c be the smallest positive integer so that c(1) = 0. If c is not prime, so that c = ab with a and b positive integers both greater than 1, show that either a(1) = 0 or b(1) = 0 in F. Show that this implies that c must be a prime.
The number c as defined here is used in the rest of the exercises in this set.

5. If c is finite, then show that c(x) = 0 for every x ∈ F.

6. If there is a positive integer n so that n(x) = 0 in F for some x ≠ 0 in F, then n(1) = 0 in F and the order of 1 in F is finite and no larger than n.

7. Argue from the previous exercises that if c is finite, then it is the order of every non-zero element of F under addition.

8. If c is finite and n(x) = 0 for some x ≠ 0 in F, then c|n.

If F is a field, and the order of 1 in the group formed by F under addition is finite, then we call this order the characteristic of F. If this order is infinite, then we say that the characteristic of F is zero. Motivation for this last twist in the definition is that the characteristic of F ends up being the smallest positive integer n so that n(x) = 0 for all x ∈ F if such a positive integer exists, or the only integer n (namely 0) so that n(x) = 0 for all x ∈ F if no such positive integer exists. We see from the exercises that the characteristic of a field is always either a prime or zero, and if it is non-zero, it is the order of every non-zero element of the field under addition.

The behavior of fields of non-zero characteristic is rather different from that of fields of characteristic zero. Because we are interested in the solution of polynomials with real or complex coefficients, and because the reals and complex numbers are fields of characteristic zero, we will spend more time with fields of characteristic zero. However, fields of non-zero characteristic are interesting and we will spend a little time with them.

10.6.2 A minimal field of each characteristic

The field Q has characteristic 0, and for each prime p, the field Zp has characteristic p. These are not just typical examples. They are the most fundamental examples. The next theorem supports this claim.
Theorem 10.6.1 A field F has characteristic zero if and only if it has a subfield isomorphic to Q. A field F has characteristic p for a prime p if and only if it has a subfield isomorphic to Zp.

The proof of Theorem 10.6.1 in the case of characteristic zero is lengthy and we will not give it here. It will be left as a fairly long but straightforward project for the curious. For non-zero characteristic, we outline the argument and leave the details as an exercise.

If F has a subfield isomorphic to Zp, then any non-zero element of this subfield has order p. Thus the characteristic of F must be p.

If F has non-zero characteristic p (which we know must be a prime), then we are trying to find a subfield of F isomorphic to Zp. We accomplish this by finding a homomorphism from Zp to F that takes [1]_p ∈ Zp to 1 ∈ F. From our basic facts about field homomorphisms (Lemma 3.4.6), we know that this will be one-to-one and its image will be the subfield that we want. We will be strict with notation to make sure we are being careful with details. For [n]_p ∈ Zp, let h([n]_p) = n(1) where n(1) is the sum of n copies of 1 ∈ F. Since this is a definition based on a formula that uses a representative of an equivalence class, an exercise must be done to show that it is well defined. The function h clearly takes [1]_p to 1 ∈ F. What remains to show is that h is a field homomorphism. This is left as an easy exercise.

Exercises (50)

1. Finish the proof of Theorem 10.6.1 in the case of non-zero characteristic. That is, show that h is well defined and a field homomorphism.

2. (Optional) Prove Theorem 10.6.1 in the case of characteristic zero. This must be done in stages. First show that there is a ring homomorphism from Z to F that takes 1 ∈ Z to 1 ∈ F. This starts as an imitation of the proof of Theorem 10.6.1 in the case of characteristic p to get the nonnegative integers into F.
Then one must decide how to get the negative integers into F, and lastly how to get the rational numbers into F. Each stage must be checked for well-definedness (if necessary) and the properties of a homomorphism. The one-to-one property can be ignored until the end, when all that is needed is to show that there is at least one element of Q that does not go to zero.

10.6.3 Consequences of Theorem 10.6.1

Let F be a finite field. The order of 1 in F under addition must be finite. So F has characteristic p for some prime p. From Theorem 10.6.1, we know that F is an extension of Zp. Since |F| is finite, it must have a finite basis over Zp and thus have some finite dimension d = [F : Zp]. From the discussion in Section 10.5.1 we get |F| = p^d. We have shown the following.

Theorem 10.6.2 The order of any finite field is a power of a prime number.

This gives one half of the facts mentioned at the end of Section 10.5.1. The other half says that every power of a prime is the order of some finite field. This will be discussed to some extent in a later chapter.

Chapter 11 Polynomials

11.1 Motivation: the construction of Zp from Z

To construct the field Zp from Z when p is prime, we first build an equivalence relation on Z, and then build a structure on the resulting set of equivalence classes. Building the operations of addition, multiplication and negation is easy, and the proof that they are well defined is straightforward. The required laws (commutative, associative, distributive) follow from the fact that these laws hold in Z. The only subtle part of the construction is finding multiplicative inverses. The argument that multiplicative inverses exist is lengthy and is discussed in outline form in Section 2.7.1 with details given in Sections 2.7.3, 2.7.4 and 2.8.
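The subtle step, finding inverses, can be sketched computationally. This is a standard extended-Euclidean-algorithm sketch of ours, not code from the text: for 0 < a < p with p prime, we have gcd(a, p) = 1, and writing ua + vp = 1 shows that [u]_p is the multiplicative inverse of [a]_p.

```python
# Finding the inverse of [a]_p in Z_p via the extended Euclidean algorithm:
# from u*a + v*p = gcd(a, p) = 1 we read off [u]_p = [a]_p^(-1).

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

def inverse_mod(a, p):
    g, u, _ = ext_gcd(a, p)
    assert g == 1          # guaranteed when p is prime and p does not divide a
    return u % p

assert (3 * inverse_mod(3, 7)) % 7 == 1   # [3]^(-1) exists in Z_7
```

The same outline, with the division algorithm for polynomials in place of division with remainder in Z, is what will produce inverses in the field extensions constructed in the next chapter.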
The outline visits many fundamental properties of the integers: the existence of a division algorithm, the existence of greatest common divisors, the notion of a prime integer, and the notion of relatively prime integers. One of the goals of this chapter is to prove that these properties and results concerning the integers are also shared by collections of polynomials. This will be used in the next chapter, where we show how certain field extensions can be constructed from collections of polynomials in much the same way that Zp is constructed from Z. In particular, finding multiplicative inverses will follow the same outline that was used to find multiplicative inverses in Zp.

The outline in Section 2.7.1 also leads to other results about the integers, and the details are given in Sections 2.7.5, 2.7.6 and 2.7.7. These results are the Fundamental Theorem of Arithmetic, Euclid's Theorem about primes, and the uniqueness of factorization of integers into primes. These results will get us to the second goal of this chapter, which is to prove basic facts about roots of polynomials. Deriving these facts will use results about polynomials that parallel the results just mentioned about the integers.

The facts that we gather from the details of the constructions of the extensions will allow us to say that certain automorphisms must exist. The facts that we gather about roots of polynomials will combine with Proposition 10.4.1, which says that certain automorphisms must take roots to roots. This will establish restrictions on how many automorphisms can exist. Between the two we will later get a complete understanding of certain groups of automorphisms.

11.2 Rings

The integers have all the properties of a field except for multiplicative inverses. Various structures that have two operations and a distributive law of one over the other, but that lack some of the properties of a field, are called rings.
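Exercise (45).1 showed that a field has no zero divisors; a general ring can have them. A small brute-force contrast of ours between Z5 (a field) and Z6 (a ring that is not a field):

```python
# Brute-force search for zero divisors in Z_n: pairs of nonzero classes
# whose product is zero.

def zero_divisors(n):
    return [(a, b) for a in range(1, n) for b in range(1, n)
            if (a * b) % n == 0]

assert zero_divisors(5) == []          # Z_5 is a field: ab = 0 forces a or b = 0
assert (2, 3) in zero_divisors(6)      # but 2 * 3 = 6 = 0 in Z_6
```

This distinction reappears below under the name "integral domain".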
11.2.1 Ring definitions

The following duplicates the definition in Section 3.3. Sections 3.3.2 and 3.3.4 should be reviewed. Lemma 3.3.4 is particularly important to remember.

A ring is a set with two operations, usually called addition and multiplication, that satisfy some of the axioms of a field, but not all. Specifically a set R with an addition (written a + b), a negation (taking a to −a), a multiplication (written ab), and an element 0 is a ring if it satisfies the following.

1. The set R and the addition form a commutative group with identity element 0.

2. The multiplication is associative in that a(bc) = (ab)c for all a, b and c in R.

3. Multiplication distributes over addition in that a(b + c) = ab + ac and (b + c)a = ba + ca for all a, b and c in R.

Some comments are needed. The multiplication need not be commutative. This is why two distributive laws are needed. The multiplication need not have an identity. These gaps can be filled by adding more words. If R has an element 1 so that for all a ∈ R, we have a1 = a = 1a, then we say that R is a ring with identity¹ or ring with 1 or ring with unit. The even integers form a ring without identity. If a ring R satisfies ab = ba for all a and b in R, then R is called a commutative ring. A commutative ring with identity fails to be a field only in that it lacks multiplicative inverses.

The integers form a commutative ring with identity. It turns out that polynomials (whose coefficients come from a field) also form a commutative ring with identity. Since this is the only ring other than the integers that we will deal with, we will assume from now on that all our rings are commutative with identity. There is terminology for rings with various combinations of properties, but we will leave such terminology alone so as not to introduce too many new words. As noted, the integers with the usual addition and multiplication form a commutative ring with identity.
However, the integers also have special properties not covered by these terms. In Z, if a ≠ 0 and b ≠ 0, then ab ≠ 0. A commutative ring R for which a ≠ 0 and b ≠ 0 always implies that ab ≠ 0 is called an integral domain. Thus Z is an integral domain with identity. Even further, Z has a division algorithm. There is a definition associated with this fact, but it is stated in much greater generality than we need and so we will skip the technical definition. Instead we will just prove the specific fact that polynomials have a division algorithm and work with that.

Recall that a ring homomorphism is a function h : R1 → R2, where R1 and R2 are rings, so that h(a + b) = h(a) + h(b) and h(ab) = h(a)h(b) for all a and b in R1. As usual, it is easy to prove that such an h preserves 0 and 1 and additive inverses. In comparison with field homomorphisms, which by Lemma 3.4.6 can only be trivial or one-to-one, we will eventually see that ring homomorphisms are much more flexible. Since the main difference between rings and fields is the lack of multiplicative inverses in rings, you should check to see how multiplicative inverses are crucial to the proof of Lemma 3.4.6.

We next turn to polynomials and show that they share many properties with the integers.

¹Some books choose not to deal with rings without a multiplicative identity, and so their definition of a ring coincides with our definition of a ring with identity.

Exercises (51)

1. Show that an integral domain has the cancellative property. That is, if pq = pr and p ≠ 0, then q = r. Hint: consider pq − pr. Show also that a commutative ring with the cancellative property is an integral domain. Thus for commutative rings, being an integral domain is equivalent to having the cancellative property.

11.3 Polynomials

11.3.1 Introductory remarks on polynomials

Polynomials can be added, negated and multiplied to give other polynomials. Thus it seems likely that we can make a ring out of polynomials.
However, polynomials are more complicated than integers, and the discussion of polynomials is correspondingly more complicated. We must first deal with the question of what a polynomial is. A typical polynomial looks like

P(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x^1 + a_0 x^0.

This is an expression of a very recognizable form. Usually, the letter x is assumed to represent some variable and the a_i are assumed to represent some constants. This makes a polynomial a function whose value depends on the variable x. Thus we can think of a polynomial in two ways: an expression of a certain form, or a function given by a formula in that form. If we regard a polynomial as a function, then it is clear that two expressions can give the same function. Both 0x^2 + 3x + 2 and 3x + 2 specify the same function, but they look different. The first form may not look all that necessary, but it is useful when giving a simple formula that tells how to add 3x + 2 to 5x^2 − x − 7. One can also wonder if there are other ways that different looking polynomials can specify the same function. We will avoid such questions for now and start by treating polynomials as expressions, ignoring until later the fact that they specify functions. We now get down to specifics.

11.3.2 Polynomial basics

Definition of a polynomial

We know that polynomials have constants (coefficients) and variables. We need to specify where the coefficients come from. To make a ring, we only need to add, negate and multiply, and it is easy to see that we only need to add, negate and multiply the coefficients to do that. Thus we can make a ring of polynomials if we choose to take the coefficients from a ring. However, we want more than just a ring. We want a ring that imitates the properties of the integers. One of the properties is that a division algorithm exists. We will see that the easiest way to get a good division algorithm for polynomials is to insist that the coefficients come from a field.
This restriction fits with our intended uses of polynomials, so it will never be seen as confining. We will see shortly how this restriction gets used. So to define a polynomial, we start with a field F. We give a first definition by saying that a polynomial P(x) over F is an expression of the form

P(x) = a_n x^n + a_{n−1} x^{n−1} + · · · + a_1 x^1 + a_0 x^0 (11.1)

where all the a_i come from F. We call this a first definition since we will give a second definition shortly that is better suited to our purposes. Usually, we replace x^0 by 1 in (11.1), but that breaks the pattern of the decreasing exponents. Exploiting that pattern lets us sometimes write (11.1) in the convenient form

P(x) = Σ_{i=0}^{n} a_i x^i. (11.2)

We must deal with the fact that 0x^2 + 3x + 2 deserves to be thought of as the same polynomial as 3x + 2. There are two ways to handle this. One is to modify the definition so that a polynomial is an expression of the form

P(x) = Σ_{i=0}^{∞} a_i x^i (11.3)

with the extra provision that all but finitely many of the a_i are zero. The second is to simply declare that certain expressions as given in (11.1) and (11.2) are equivalent. The second option would create a need to check that our defined equivalence is an equivalence relation, and would then lead to future well-definedness problems that would have to be checked.

We thus give our second and final definition to say that a polynomial over F is an expression as given in (11.3) so that each a_i is in F and where there is a positive integer M for which i > M implies that a_i = 0. Note that M need not be the smallest such positive integer, so the condition does not mean that a_M is not zero.

Alternate notations

However, expressions such as (11.1) and (11.2) are both familiar and useful. So we will continue to use them and take each to mean the same as (11.3) where all a_i = 0 for i > n. Again, this does not mean that a_n ≠ 0.
We can go farther and say that any omitted term in a polynomial P implies that the omitted coefficient is zero. So x^4 + 2x is the polynomial Σ_{i=0}^{∞} a_i x^i in which a_4 = 1, a_1 = 2 and all other a_i are zero. We can go even further still and allow a_0 x^0 to be represented by a_0. Combining these simplifications in notation lets us use an element c of the field F to represent a polynomial over F in which the coefficient of x^0 is c and all other coefficients are zero. Viewed as a function, the polynomial c is just the constant function to c. We call such a polynomial (one whose only non-zero coefficient is that of x^0) a constant polynomial, since it is a constant when viewed as a function of x. Two very useful polynomials that use this notation are the constant polynomial 0, in which all the coefficients are zero, and the constant polynomial 1, in which the coefficient of x^0 is 1 and all other coefficients are zero. The polynomials 0 and 1 are special cases of monomials. A monomial is a polynomial in which at most one coefficient is non-zero. We say "at most" instead of "exactly" so that we include 0 among the monomials. Having every c in F represent not only an element of F, but also a polynomial over F, makes ambiguous statements possible. However, this tends not to be a serious problem.

Polynomial operations

The sum, negative and product are now easy to define. If

P(x) = Σ_{i=0}^{∞} a_i x^i and Q(x) = Σ_{i=0}^{∞} b_i x^i

are polynomials, then

(P + Q)(x) = Σ_{i=0}^{∞} (a_i + b_i) x^i,

(−P)(x) = Σ_{i=0}^{∞} (−a_i) x^i, and

(P Q)(x) = Σ_{i=0}^{∞} (Σ_{j=0}^{i} a_j b_{i−j}) x^i. (11.4)

Note that the statement that P(x) and Q(x) are polynomials carries with it the assumption that all but finitely many of their coefficients are zero. It must then be proven that this holds for our definitions of P + Q, −P and P Q. That is, it must be proven that these are polynomials. In fact we have the following, whose proof is left as an exercise.
Lemma 11.3.1 If P(x) and Q(x) are polynomials over a field F, then so are (P + Q)(x), −P(x) and (P Q)(x).

Note that the definition of P Q in (11.4) agrees with the usual instructions that one is taught about multiplying polynomials. When the term a_j x^j from P(x) is multiplied by the term b_{i−j} x^{i−j} from Q(x), then a_j b_{i−j} x^i is contributed to (P Q)(x) and is a summand of the ultimate expression involving x^i in (P Q)(x). However, the definitions carry more power than just this illustration of familiarity. We give two more lemmas about the nature of the definitions in (11.4).

Lemma 11.3.2 The polynomials over a field F form a commutative ring with identity. The polynomial 0 is the additive identity and the polynomial 1 is the multiplicative identity.

The proof of Lemma 11.3.2 is tedious, with the biggest culprits being the proofs of associativity of multiplication and the distributivity of multiplication over addition. We will accept the truth of this lemma for now and leave its proof as an optional exercise. We will use F[x] to denote the set of all polynomials over the field F with the ring structure guaranteed by Lemma 11.3.2.

The next lemma discusses polynomials as functions. If P(x) = Σ_{i=0}^{∞} a_i x^i is a polynomial over a field F, then for a given z ∈ F, we can discuss the meaning of P(z). Taking P(z) to mean Σ_{i=0}^{∞} a_i z^i has us adding up elements of F, since each a_i is in F and z is in F. However, we are formally adding infinitely many values of F. But since all but finitely many of the a_i are zero, Lemma 3.4.1 says that all but finitely many of the a_i z^i are zero. Thus we are only adding up finitely many non-zero elements of F and the result is a specific element of F. Thus for each z ∈ F, we get an element P(z) in F, and we see that P(x) is a function from F to F.

Let us look at some small examples. Let P(x) = 2x + 4 and Q(x) = 6x − 5. We have P(3) = 10 and Q(3) = 13, so P(3)Q(3) = 130.
But we can also write $P(3) = 2 \cdot 3 + 4$ and $Q(3) = 6 \cdot 3 - 5$, so
$$P(3)Q(3) = 12 \cdot 3^2 + 14 \cdot 3 - 20 = 108 + 42 - 20 = 130.$$
Of course in the latter calculation we were just using $3$ instead of $x$ in the multiplication $P(x)Q(x) = (2x+4)(6x-5) = 12x^2 + 14x - 20$. Essentially, we gave an example that shows that the rules for multiplying polynomials work no matter what value is given to the variable. This also works with addition and negation.

This can all be formalized into a very simple statement. Take one value $z \in F$. Now each polynomial $P(x)$ over $F$ gives a value $P(z)$. This gives a function from $F[x]$ to $F$. We refer to this function as the evaluation function at $z$. Let us denote this function as $v_z$ (think of "value at $z$"), so that we have $v_z(P(x)) = P(z)$. Note that taking $z$ to be a different element of $F$ gives a different function. As an example, $v_0$ takes each polynomial $P(x)$ to the element $a_0$ in $F$ where $a_0$ is the coefficient of $x^0$ in $P(x)$. What can you say about $v_1$?

The next lemma uses the notion of a ring homomorphism. Not every ring is a field, but every field is a ring. So the notion of a ring homomorphism from a ring to a field makes sense.

Lemma 11.3.3 Let $F$ be a field and let $z$ be a given element of $F$. Let $v_z : F[x] \to F$ be the evaluation function at $z$ from the ring $F[x]$ of all polynomials over $F$ to $F$. Then $v_z$ is a ring homomorphism.

As with Lemma 11.3.2, the proof of Lemma 11.3.3 can be painful. Discussion of its proof will occur later. Because of Lemma 11.3.3, the evaluation function $v_z$ is usually called the evaluation homomorphism.

The evaluation homomorphism at $0$ is rather special.

Lemma 11.3.4 Let $F$ be a field and let $C$ be the collection of constant polynomials in $F[x]$. Then $C$ is a subring of $F[x]$ and the evaluation homomorphism $v_0$ at $0$ restricted to $C$ is an isomorphism from $C$ to $F$.

Essentially, Lemma 11.3.4 lets us view the constant polynomials as a copy of $F$ living inside $F[x]$.

Exercises (52)

1. Prove Lemma 11.3.1.
2. (optional) Prove Lemma 11.3.2. This breaks into many checks. Some (such as commutativity of addition and multiplication, and the associativity of addition) are quite easy. More difficult are the associativity of multiplication and the distributive law, but these are less terrible than you might think. They require careful manipulation of summation indexes.

3. Prove Lemma 11.3.4.

11.3.3 Degree

In the polynomial $P(x) = \sum_{i=0}^{\infty} a_i x^i$, each $a_i x^i$ is a term of $P(x)$. The degree of this term is $i$. If $P(x) \ne 0$, then the degree of $P(x)$ is the largest $i$ so that $a_i \ne 0$. That is, the degree of $P(x)$ is the highest degree of a term of $P(x)$ with non-zero coefficient. We write $\deg(P(x))$ for the degree of $P(x)$. We have the following very simple observation about degrees and multiplication.

Lemma 11.3.5 If $P(x) \ne 0$ and $Q(x) \ne 0$ are polynomials over a field $F$, then $\deg((PQ)(x)) = \deg(P(x)) + \deg(Q(x))$.

Proof. Let $P(x) = \sum_{i=0}^{\infty} a_i x^i$, let $Q(x) = \sum_{i=0}^{\infty} b_i x^i$, let $m = \deg(P(x))$, and let $n = \deg(Q(x))$. Then the coefficient of $x^{m+n}$ in $(PQ)(x)$ is
$$\sum_{j=0}^{m+n} a_j b_{m+n-j}.$$
One of the terms in this sum is $a_m b_n$, which is not zero since $a_m \ne 0$ and $b_n \ne 0$. All other terms in this sum either have $j > m$, implying that $a_j = 0$, or $j < m$, implying that $m + n - j > n$ so $b_{m+n-j} = 0$. Thus the coefficient of $x^{m+n}$ in $(PQ)(x)$ is the non-zero quantity $a_m b_n$ added to a bunch of zeros.

Now for $k > m + n$, we have that the coefficient of $x^k$ in $(PQ)(x)$ is
$$\sum_{j=0}^{k} a_j b_{k-j}.$$
For $j > m$, we have $a_j = 0$. For $j \le m$, we have $k - j \ge k - m > m + n - m = n$ and $b_{k-j} = 0$. Thus for every $x^k$ with $k > m+n$, the coefficient of $x^k$ in $(PQ)(x)$ is zero.

The degree of the polynomial $0$ is a problem. Since $0P = 0$ for any polynomial, we want the degree of $0$ plus any other degree to always be the degree of $0$ if we wish to have degree cooperate with Lemma 11.3.5 even for the polynomial $0$.
We do this by declaring that the degree of $0$ is $-\infty$ and inventing the rule that $-\infty + m = -\infty$ for any integer $m$ and $(-\infty) + (-\infty) = -\infty$. (Some books simply leave the degree of $0$ undefined. Our convention has other problems which we will ignore.) We do not allow polynomials of degree $+\infty$, so we never have to deal with the sum of $-\infty$ and $+\infty$. We can now extend Lemma 11.3.5.

Proposition 11.3.6 If $P(x)$ and $Q(x)$ are polynomials over a field $F$, then $\deg((PQ)(x)) = \deg(P(x)) + \deg(Q(x))$.

We introduce terms that arise naturally at this point. If a polynomial $P(x)$ is not zero, then its leading coefficient is the coefficient of $x^d$ where $d = \deg(P(x))$. If $P(x) = 0$, then we say that its leading coefficient is zero. Note that the zero polynomial is the only polynomial with leading coefficient equal to zero. We say that a polynomial is monic if its leading coefficient is $1$. We can use degree to redefine constant polynomials as those polynomials of degree no more than zero. Polynomials of degree exactly one are called linear polynomials.

We declare that $-\infty < m$ for any integer $m$. This cooperates with induction arguments. Note that with the exception of the polynomial $0$, all degrees of polynomials are non-negative. Thus the set $D$ of degrees of polynomials consists of the non-negative integers and $-\infty$. If $S$ is a non-empty subset of $D$, then either $-\infty \in S$ or $-\infty \notin S$. In the first case, $-\infty$ is the least element of $S$, and in the second case, $S$ has a least element since it is a non-empty subset of the non-negative integers. We will use this in the next section.

Exercises (53)

1. Show that if polynomials $P(x)$ and $Q(x)$ have degrees $m$ and $n$ respectively, then the degree of $(P+Q)(x)$ is no larger than $\max\{m, n\}$.

11.4 The division algorithm for polynomials

Theorem 11.4.1 Let $P(x) \ne 0$ and $S(x)$ be polynomials over a field $F$. Then there are unique polynomials $Q(x)$ and $R(x)$ so that $S(x) = (PQ)(x) + R(x)$ and $\deg(R(x)) < \deg(P(x))$.

Proof.
What follows is an extremely slight modification of the proof of the division algorithm for integers (Proposition 2.7.3). Let $d = \deg(P(x))$. Note that $d \ge 0$. Let $A$ be the set of degrees of all polynomials of the form $S(x) - (PQ)(x)$ as $Q(x)$ runs over all polynomials over $F$. The set $A$ is certainly non-empty and so has a least element $r$ (which might be $-\infty$). For this $r$, let $Q(x)$ be some polynomial over $F$ so that $\deg(S(x) - (PQ)(x)) = r$. We do not yet know that there is only one such $Q(x)$. Let $R(x) = S(x) - (PQ)(x)$. We have $\deg(R(x)) = r$ and we wish to show that $r < d$.

Assume that $r \ge \deg(P(x))$. This is the point at which the details of the proof vary from the details of the proof of Proposition 2.7.3. Let $a_r$ be the coefficient of $x^r$ in $R(x)$ and let $b_d$ be the coefficient of $x^d$ in $P(x)$. We know that neither $a_r$ nor $b_d$ is zero. Since $r \ge d$, we have that the monomial $M(x) = (a_r/b_d)x^{r-d}$ is a polynomial over $F$. We consider the polynomial
$$D(x) = R(x) - (PM)(x) = (S(x) - (PQ)(x)) - (PM)(x) = S(x) - (P(Q+M))(x).$$
The degree of $(PM)(x)$ is $r$, as is the degree of $R(x)$, so the degree of $D(x)$ is no more than $r$. The coefficient of $x^r$ in $D(x)$ is zero, so the degree of $D(x)$ is less than $r$. However, writing $D(x)$ as $S(x) - (P(Q+M))(x)$ shows that $\deg(D(x))$ is in $A$. This contradicts the fact that $r$ was chosen to be the least element of $A$, and we have shown that we must have $r < d$.

Now suppose that $(PQ_1)(x) + R_1(x) = S(x) = (PQ_2)(x) + R_2(x)$ with $\deg(R_1(x)) < d$ and $\deg(R_2(x)) < d$. But we have $(P(Q_1 - Q_2))(x) = (R_2 - R_1)(x)$, and the degree of the left side is at least $d$ if $(Q_1 - Q_2)(x) \ne 0$, while the degree of the right side is strictly less than $d$. Thus $(Q_1 - Q_2)(x) = 0$ and $Q_1(x) = Q_2(x)$. Now $(R_2 - R_1)(x) = 0$ and $R_2 = R_1$.

Exercises (54)

1. Where in the proof of Theorem 11.4.1 is the fact that $F$ is a field used?

2.
(optional) There is a more complicated statement of a division algorithm for polynomials with coefficients in the integers. Find and prove such a statement. This is an example of dealing with the ring of polynomials whose coefficients come from a ring.

11.4.1 Roots and linear factors

The division algorithm gives us the usual correspondence between roots of a polynomial and its linear factors.

Lemma 11.4.2 Let $P(x)$ be a polynomial over a field $F$. Then $r \in F$ is a root of $P(x)$ if and only if $x - r$ divides $P(x)$.

Proof. If $x - r$ divides $P(x)$, then $P(x) = (x-r)A(x)$ and $P(r) = 0$. Now assume $P(r) = 0$. The degree of $x - r$ is $1$. From Theorem 11.4.1 we know there are unique $Q(x)$ and $R(x)$ so that $\deg(R(x)) < 1$ and $P(x) = (x-r)Q(x) + R(x)$. Since $\deg(R(x))$ is either $0$ or $-\infty$, we know that $R(x)$ is a constant $c$. Thus $P(x) = (x-r)Q(x) + c$, and $0 = P(r) = c$.

11.5 Greatest common divisors and consequences

11.5.1 Divisors and units

In Section 2.7.4, we defined "a" greatest common divisor of two integers $m$ and $n$, not both zero, to be a common divisor $g$ of $m$ and $n$ so that every other common divisor of $m$ and $n$ divides $g$. We then observed that most pairs of integers have two greatest common divisors, one positive and one negative. For example, both $-6$ and $6$ are greatest common divisors of $12$ and $18$. We then adopted the convention that the notation $(m, n)$ would refer to the non-negative greatest common divisor of $m$ and $n$.

We face a similar but larger problem with polynomials. The problem in the integers stems from the fact that both $-1$ and $1$ have multiplicative inverses in $\mathbf{Z}$. In $F[x]$, all the non-zero constant polynomials have multiplicative inverses. (Check it out.)

Consider common divisors in $\mathbf{R}[x]$ of $x^2 - 1$ and $x^2 + x - 2$. One sees quickly that $x - 1$ divides both since $(x^2-1)/(x-1) = x+1$ and $(x^2+x-2)/(x-1) = x+2$.
But $2x - 2$ also divides both $x^2 - 1$ and $x^2 + x - 2$ since $(x^2-1)/(2x-2) = \frac{1}{2}x + \frac{1}{2}$ and $(x^2+x-2)/(2x-2) = \frac{1}{2}x + 1$. In fact so does $3x - 3$, as does $.1x - .1$, and so on. However, these common divisors all divide each other, so they are all candidates for greatest common divisor. While there are usually two choices for a greatest common divisor in the integers, there are infinitely many candidates in the world of polynomials.

We choose one of several ways to get around this and take time to discuss the solution before we prove that greatest common divisors exist. We first make a definition. In a ring $R$ with $1$, we say that $u \in R$ is a unit if $u$ has a multiplicative inverse. Note that the element $1$ has to exist in $R$ in order to have this discussion. Note also that if $u$ is a unit, so is $u^{-1}$. Note also that $1$ is a unit in any ring with identity.

This use of the word "unit" seems to overlap with the phrase "ring with unit." However, if a ring has a unit with the meaning just defined, then it must have an element $1$ in order for the word unit to make sense. So "ring with unit" still means that an identity must exist.

We need other notions that we used with integers. Let $a$ and $b$ be elements of a ring $R$. We say $a$ divides $b$ if there is a $q \in R$ so that $b = aq$. As usual, we write $a|b$ if $a$ divides $b$. We imitate the definition of greatest common divisor and say that if $a$ and $b$ are in $R$, not both zero, then a greatest common divisor $g$ of $a$ and $b$ is a common divisor of $a$ and $b$ so that if $h$ is any common divisor of $a$ and $b$, then $h|g$.

The following lemma ties together all the notions that we need.

Lemma 11.5.1 Let $R$ be an integral domain with $1$.

1. The set of units in $R$ forms a group under multiplication.

2. If $a \in R$ and $u$ is a unit in $R$, then $u|a$.

3. If $a|b$ in $R$, and $u$ is a unit in $R$, then $(au)|b$.

4. If $a|b$ and $b|a$ in $R$ are not both zero, then neither is zero and for some units $u$ and $v$ in $R$ with $uv = 1$, we have $a = bu$ and $b = va$.

5.
If $g$ is a greatest common divisor of $a$ and $b$ in $R$ and $u$ is a unit in $R$, then $gu$ is a greatest common divisor of $a$ and $b$ in $R$.

6. If $g_1$ and $g_2$ are two greatest common divisors of $a$ and $b$ in $R$, then for some units $u$ and $v$ in $R$ with $uv = 1$, we have $g_1 = ug_2$ and $g_2 = vg_1$.

7. If $F$ is a field, then the units of $F[x]$ are exactly the non-zero constant polynomials.

8. If $P(x)$ is a non-zero polynomial in $F[x]$ for a field $F$, then there is a unique unit $u$ in $F[x]$ and a unique monic polynomial $Q(x)$ so that $P(x) = uQ(x)$.

9. If $P(x)$ and $Q(x)$ have a greatest common divisor in $F[x]$ for a field $F$, then they have a unique monic greatest common divisor.

Proof. We give the proofs of a couple of the items and leave the rest as exercises.

To prove 4, we note that $a|b$ and $b|a$ implies that $b = av$ for some $v$ (not yet known to be a unit) and $a = bu$ for some $u$ (not yet known to be a unit). This means that if one is zero, then both are zero. But since we assume they are not both zero, neither is zero. Substituting $bu$ for $a$ in $b = av$ gives $b = (bu)v$. Since $b = b1$, we get $b1 = buv$, and cancellativity (from the fact that $R$ is an integral domain) says $uv = 1$, giving the last point needed.

To prove 7, we note that $0$ is never a unit. If $P(x)$ and $Q(x)$ satisfy $(PQ)(x) = 1$, then the degree of $PQ$ is zero. But the degree of $PQ$ is the sum of the degrees of $P$ and $Q$, neither of which is $-\infty$. So the degrees of $P$ and $Q$ are both zero and both are non-zero constants.

A consequence of 4 and 7 is that if two non-zero polynomials in $F[x]$ are mutually divisible, then each is a non-zero constant times the other.

The main point of Lemma 11.5.1 is Item 9. If polynomials $A(x)$ and $B(x)$ over a field $F$ have a greatest common divisor, then they have a unique monic greatest common divisor, and we will reserve the notation $(A, B)$ for that unique monic greatest common divisor of $A(x)$ and $B(x)$.

Exercises (55)

1. Prove the rest of Lemma 11.5.1.
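Before moving on to greatest common divisors, it may help to see the division algorithm of Theorem 11.4.1 carried out concretely. The following is a minimal Python sketch for polynomials over $\mathbf{Q}$, represented as coefficient lists $[a_0, a_1, \ldots]$; it is an illustration, not part of the text, and the names are invented for this example.

```python
from fractions import Fraction

# Division algorithm (Theorem 11.4.1) for polynomials over Q, with a
# polynomial represented as its coefficient list [a_0, a_1, ...].
# Illustrative sketch; the names are not from the text.

def degree(P):
    # index of the last non-zero coefficient; None plays the role of -infinity
    for i in range(len(P) - 1, -1, -1):
        if P[i] != 0:
            return i
    return None

def poly_divmod(S, P):
    dP = degree(P)
    assert dP is not None, "the divisor P must be non-zero"
    R = [Fraction(c) for c in S]        # remainder, initially S
    Q = [Fraction(0)] * max(len(S), 1)  # quotient being built
    while (dR := degree(R)) is not None and dR >= dP:
        # cancel the leading term of R with the monomial (a_r/b_d) x^(r-d),
        # exactly as in the proof
        m = R[dR] / P[dP]
        Q[dR - dP] += m
        for j in range(dP + 1):
            R[dR - dP + j] -= m * P[j]
    return Q, R

# (x^3 - 1) = (x - 1)(x^2 + x + 1) + 0
Q, R = poly_divmod([-1, 0, 0, 1], [-1, 1])
```

Dividing $x^3 - 1$ by $x - 1$ this way returns the quotient $x^2 + x + 1$ with zero remainder, matching the correspondence between roots and linear factors in Lemma 11.4.2.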
We now have a meaning for $(A, B)$ when $A(x)$ and $B(x)$ are polynomials over a field $F$, but we do not yet know that $(A, B)$ always exists. We prove this in the next section.

11.5.2 GCD of polynomials

Theorem 11.5.2 Let $P(x)$ and $Q(x)$ be polynomials over a field $F$ so that at least one of $P(x)$ and $Q(x)$ is not zero. Then there is a unique monic greatest common divisor of $P(x)$ and $Q(x)$.

Proof. Repeating our opening sentence to the proof of Theorem 11.4.1, what follows is an extremely slight modification of the proof of the corresponding result for integers (Proposition 2.7.6). The modifications are even smaller than those needed for the proof of Theorem 11.4.1. We only have to show that a greatest common divisor exists. The existence and uniqueness of a monic greatest common divisor will follow from this and from Lemma 11.5.1.

Let $A$ be the set of degrees of all polynomials of the form $(MP + NQ)(x)$ where $M(x)$ and $N(x)$ are polynomials over $F$. Let $B$ be the subset of $A$ consisting of all degrees not equal to $-\infty$. We know that $B$ is not empty since at least one of $P(x)$ and $Q(x)$ is not zero. Let $g$ be the least value in $B$ and let $G(x) = (MP + NQ)(x)$ for some $M(x)$ and $N(x)$ for which $\deg(G(x)) = g$. We claim that $G(x)$ is a greatest common divisor of $P(x)$ and $Q(x)$.

We know that there are unique polynomials $S(x)$ and $R(x)$ so that $P(x) = (SG)(x) + R(x)$ and $\deg(R(x)) < g$. Note that
$$R(x) = P(x) - (SG)(x) = P(x) - S(x)(MP + NQ)(x) = ((1 - SM)P - (SN)Q)(x),$$
so $\deg(R(x))$ is in $A$. If $\deg(R(x)) \ne -\infty$, then $\deg(R(x))$ is also in $B$. But $\deg(R(x)) < g$, which is least in $B$, so $\deg(R(x))$ must be $-\infty$ and $R(x) = 0$. Thus $G(x)|P(x)$. Similarly $G(x)|Q(x)$ and we have that $G(x)$ is a common divisor of $P(x)$ and $Q(x)$.

If $H(x)$ is a common divisor of $P(x)$ and $Q(x)$, then $H(x)$ must divide $G(x) = (MP + NQ)(x)$. This makes $G(x)$ a greatest common divisor.

Note that at least one greatest common divisor of $P(x)$ and $Q(x)$ is of the form $(MP + NQ)(x)$.
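The proof above shows that a greatest common divisor exists but does not compute one. In practice, $(P, Q)$ can be found by repeated use of the division algorithm — the Euclidean algorithm, which is not the method of the proof but is the standard constructive route, just as for integers. A sketch over $\mathbf{Q}$, with polynomials as coefficient lists and invented names:

```python
from fractions import Fraction

# Monic greatest common divisor (P, Q) over Q via the Euclidean
# algorithm; polynomials are coefficient lists [a_0, a_1, ...].
# Illustrative sketch; this is not the method used in the proof above.

def trim(P):
    # drop trailing zero coefficients; the zero polynomial becomes []
    while P and P[-1] == 0:
        P = P[:-1]
    return P

def poly_rem(S, P):
    # remainder of S on division by P (Theorem 11.4.1)
    R = [Fraction(c) for c in trim(S)]
    P = [Fraction(c) for c in trim(P)]
    while len(R) >= len(P) > 0:
        m = R[-1] / P[-1]
        off = len(R) - len(P)
        for j in range(len(P)):
            R[off + j] -= m * P[j]
        R = trim(R)
    return R

def poly_gcd(A, B):
    A, B = trim(list(A)), trim(list(B))
    assert A or B, "at least one of A and B must be non-zero"
    while B:
        A, B = B, poly_rem(A, B)
    # divide by the leading coefficient to get the monic representative
    return [Fraction(c) / A[-1] for c in A]

# (x^2 - 1, x^2 + x - 2) = x - 1, the example of Section 11.5.1
G = poly_gcd([-1, 0, 1], [-2, 1, 1])
```

The final normalization step is exactly Item 8 of Lemma 11.5.1: dividing by the leading coefficient picks out the unique monic representative among the infinitely many unit multiples.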
Further, all greatest common divisors can be obtained from any other by multiplying by a unit. So any greatest common divisor of $P(x)$ and $Q(x)$ can be written $u(MP + NQ)(x)$. But this is $((uM)P + (uN)Q)(x)$. So all greatest common divisors of $P(x)$ and $Q(x)$ can be put in the form $(MP + NQ)(x)$. In particular, this is true of the monic polynomial $(P, Q)$.

11.5.3 Irreducible polynomials

One fact about the integers is Euclid's Theorem (Theorem 2.7.9). In the integers, if $p$ is a prime and $p|(ab)$, then either $p|a$ or $p|b$. We need a notion in a more general ring that imitates the notion of a prime in the integers. There are two words that are used. The word "prime" is used for the behavior referred to above: if $p|(ab)$, then $p|a$ or $p|b$. The word "irreducible" is used for the behavior that we usually think of when we think of prime integers.

In a ring $R$, we say that $a \in R$ is irreducible if $a$ is not a unit and whenever $a$ factors as $a = bc$, then one of $b$ or $c$ is a unit. That is, $a$ is not a unit and is not a product of two non-units. For polynomials, this means that an irreducible polynomial is not a constant and cannot be factored into non-constant factors. We say that a non-unit $p$ in a ring $R$ is prime if whenever $p|(ab)$ with $a$ and $b$ in $R$, then $p|a$ or $p|b$.

For some rings "irreducible" and "prime" are not identical. For the rings $F[x]$ with $F$ a field, we will see shortly that there is no real difference and we will not discuss the matter further. However, the word "irreducible" is traditionally used with polynomials more frequently than "prime" and we will do so here as well. The definition of "prime" is included for the curious.

As a companion to these definitions, we can define two elements $a$ and $b$ of a cancellative ring with an identity to be relatively prime if there is a greatest common divisor that is a unit. Note that if a unit $u$ is a greatest common divisor of relatively prime $a$ and $b$, then so is $uu^{-1} = 1$.
Applying this to polynomials over a field $F$, if $P(x)$ and $Q(x)$ are relatively prime, then for some $M(x)$ and $N(x)$, we have $(MP + NQ)(x) = 1$, and we will have $(P, Q) = 1$ since we insist that the greatest common divisor of two polynomials be monic.

Exercises (56)

1. Prove that every polynomial of degree one over a field is irreducible. Hint: consider degrees.

We are now in a position to prove that "irreducible" implies "prime" in $F[x]$ for a field $F$. It will be seen that units add an extra step here and there.

Theorem 11.5.3 Let $P(x)$, $A(x)$ and $B(x)$ be polynomials over a field $F$, with $P(x)$ irreducible. If $P(x)|(AB)(x)$, then either $P(x)|A(x)$ or $P(x)|B(x)$.

Proof. We assume that $P(x)$ does not divide $A(x)$. If $(P, A)$ is not $1$, then it is some $G(x)$ that is not a unit. But $G(x)|P(x)$ implies that $P(x) = (GC)(x)$ for some $C(x)$, which must then be a constant polynomial which we may as well write as $c$, giving $P(x) = cG(x)$. But $G(x)$ also divides $A(x)$, and $P(x)$ is a unit times $G(x)$, so $P(x)$ divides $A(x)$, contrary to our assumption. So we must have $(P, A) = 1$.

The rest of the argument follows the proof of Theorem 2.7.9. Now for some $M(x)$ and $N(x)$, we have $1 = (PM + NA)(x)$. Multiplying by $B(x)$ gives $B(x) = (BPM)(x) + (NAB)(x)$. Since $P(x)$ divides both summands on the right, it divides $B(x)$.

11.6 Uniqueness of factorization

We can now prove a parallel to the fundamental theorem of arithmetic. Details will be left as exercises.

Exercises (57)

1. Prove that every polynomial over a field $F$ that is not a unit is a product of irreducible polynomials. Hint: if false, then let $P(x)$ be the polynomial of least degree for which the statement is false.

2. Prove that every polynomial $P(x)$ over a field $F$ that is not a unit is a constant times a product of monic, irreducible polynomials, and that this constant is the leading coefficient of $P(x)$ and is thus unique.

3.
Prove that if a polynomial $P(x)$ over a field $F$ is not a unit, if $P(x)$ is a constant times a product of monic, irreducible polynomials in two ways, and if $A(x)$ is an irreducible factor of the first factorization, then $A(x)$ equals one of the factors in the second factorization.

4. Prove that if $P(x)$ is a polynomial over a field $F$ and is not a unit, then any two factorizations of $P(x)$ into a constant times a product of monic, irreducible polynomials can be made the same by permuting the factors of one of the factorizations. Hint: induct on the number of irreducible factors.

5. Prove that if $P(x)$ is a polynomial over a field $F$, and $A(x)$ is a monic, irreducible polynomial over $F$ that divides $P(x)$, then $A(x)$ is one of the factors of any factorization of $P(x)$ into a constant times a product of monic, irreducible polynomials.

The problems above prove the following version of the Fundamental Theorem of Arithmetic for polynomials over a field.

Theorem 11.6.1 If $P(x)$ is a polynomial over a field $F$ and is not a unit, then $P(x)$ factors as a constant times a product of monic, irreducible polynomials over $F$. Further, this factorization is unique up to a permutation of the factors.

At this point, we have finally filled in some missing details from Section 1.2.6. There it was claimed that if $r_1$ and $r_2$ are roots of the monic quadratic $x^2 + bx + c$, then it must be true that $r_1 + r_2 = -b$ and $r_1 r_2 = c$. This follows from Lemma 11.4.2 and Theorem 11.6.1. The lemma says that if a monic quadratic has roots $r_1$ and $r_2$, then $(x - r_1)$ and $(x - r_2)$ are factors of the polynomial, and the theorem says that the polynomial factors into linear factors in only one way. Thus the polynomial must equal $(x - r_1)(x - r_2)$ and the claims follow.

11.7 Roots of polynomials

11.7.1 Counting roots

We combine information from Lemma 11.4.2 and Theorem 11.6.1. Let $P(x)$ be a polynomial over a field $F$. There may be a root of $P(x)$ in $F$ or there may not.
The polynomial $x^2 - 2$ is a polynomial over $\mathbf{Q}$ and also over $\mathbf{R}$. There is a root of $x^2 - 2$ in $\mathbf{R}$, but not in $\mathbf{Q}$. Thus the number of roots of $P(x)$ in $F$ is not completely known just by knowing $P(x)$. One must know $F$ as well. But there are facts about the number of roots that we can record, and we do that here.

Proposition 11.7.1 Let $P(x)$ be a non-zero polynomial over a field $F$, and let $d = \deg(P(x))$. Then there are no more than $d$ different roots of $P(x)$ in $F$.

Proof. If there are $n$ different roots of $P(x)$ in $F$, then let them be denoted $r_1, r_2, \ldots, r_n$. From Lemma 11.4.2, we know that each $x - r_i$, $1 \le i \le n$, divides $P(x)$. But each $x - r_i$ is monic and irreducible, so by Theorem 11.6.1 each $x - r_i$ is a factor in any factorization of $P(x)$ into a constant times a product of monic, irreducible polynomials. Since the $x - r_i$ are all different for $1 \le i \le n$, the degree of $P(x)$ must be at least $n$, and $d \ge n$.

11.7.2 Polynomials as functions

We have been treating polynomials as expressions. They are also functions. If $P(x)$ is a polynomial over a field $F$ and a value from $F$ is assigned to the variable $x$, then a value for $P(x)$ can be calculated by applying the operations of the field to the expression $P(x)$. This makes $P(x)$ a function from $F$ to $F$. A question arises as to whether different looking polynomials have to give different functions. The answer is that often they do, but not always.

Proposition 11.7.2 Let $P(x)$ and $Q(x)$ be polynomials over a field $F$ with $P(x) = \sum_{i=0}^{\infty} a_i x^i$ and $Q(x) = \sum_{i=0}^{\infty} b_i x^i$. If there are more elements of $F$ than the larger of $\deg(P(x))$ and $\deg(Q(x))$, and if $P(x) = Q(x)$ for every $x$ in $F$, then $a_i = b_i$ for $0 \le i < \infty$.

The conclusion can be thought of as saying that if two polynomials are equal as functions and the field is large enough, then the two polynomials are equal as expressions.

Proof. The assumptions say that $(P - Q)(x) = 0$ for every $x$ in $F$.
But the degree of $(P-Q)(x)$ is no larger than $\max\{\deg(P(x)), \deg(Q(x))\}$, so there are more roots of $(P-Q)(x)$ than $\deg((P-Q)(x))$. From Proposition 11.7.1, we must have that $(P-Q)(x)$ is the zero polynomial. That is, all its coefficients are zero.

Corollary 11.7.3 Let $P(x)$ and $Q(x)$ be polynomials over a field $F$ with $P(x) = \sum_{i=0}^{\infty} a_i x^i$ and $Q(x) = \sum_{i=0}^{\infty} b_i x^i$. If the characteristic of $F$ is zero, and if $P(x) = Q(x)$ for every $x$ in $F$, then $a_i = b_i$ for $0 \le i < \infty$.

Proof. A field with characteristic zero must have infinitely many elements.

11.7.3 Automorphisms and roots

We give a first hint of restrictions that exist on automorphisms. If $P(x)$ is a polynomial over a field $F$, if $E$ is an extension field of $F$, and if $r \in E$ is a root of $P(x)$, then by Proposition 10.4.1 an element $\theta$ of $\mathrm{Aut}(E/F)$ must carry $r$ to a root of $P(x)$. But $P(x)$ is a polynomial over $E$ as well as over $F$, and there are no more than $d = \deg(P(x))$ roots of $P(x)$ in $E$. Thus there are only $d$ places that $\theta$ can carry $r$. This is stated formally in the following.

Lemma 11.7.4 Let $P(x)$ be a polynomial of degree $d$ over a field $F$, and let $E$ be a field extension of $F$ containing a root $r$ of $P(x)$. Then $\{\theta(r) \mid \theta \in \mathrm{Aut}(E/F)\}$ is contained in the set of roots of $P(x)$ in $E$ and thus has no more than $d$ elements.

Later we will learn more detailed information about automorphism groups of field extensions.

11.8 Derivatives and multiplicities of roots

Counting roots becomes inaccurate if there are multiple roots. We derive a criterion for telling whether there are multiple roots.

11.8.1 The derivative

If $P(x) = \sum_{i=0}^{\infty} a_i x^i$ is a polynomial over a field $F$, then we know from calculus that the derivative of $P(x)$ is given by
$$P'(x) = \sum_{i=1}^{\infty} i(a_i) x^{i-1} = \sum_{i=0}^{\infty} (i+1)(a_{i+1}) x^i. \tag{11.5}$$
Note that $i(a_i)$ involves an element of $\mathbf{Z}$, namely $i$, times an element of $F$, and is defined as the sum of $i$ copies of $a_i$ in a manner identical to the definition of $m(x)$ from Section 10.6.1. The first of the two formulas is the more familiar. The two are seen to be equal when the terms of similar powers of $x$ are matched between the two formulas.

In calculus, (11.5) is derived as the result of a limit process. Here, we just take (11.5) as the definition of $P'(x)$, and limits are completely eliminated from the discussion. We can describe (11.5) as an algebraic definition of the derivative.

Before we give an exercise that shows that the algebraic definition of the derivative behaves in familiar ways, we give one more fact about the function $m(x)$. In the expression $(mn)(xy)$, where $m$ and $n$ are non-negative integers and $x$ and $y$ are field elements, we can use two facts from Exercise Set (49). We can write $(mn)(xy) = m(n(xy)) = m(xn(y)) = m(x)n(y)$, where the first equality follows from $(mn)(x) = m(n(x))$, and the other two equalities from different applications of $m(xy) = xm(y)$.

Exercises (58)

1. Using (11.5) as a definition, prove the product rule $(PQ)'(x) = (P'Q + Q'P)(x)$. You will need various facts about the functions $m(x)$.

11.8.2 Multiplicities of roots

We use the derivative to detect roots with multiplicities. For a polynomial $P(x)$ over a field $F$, we say that $r \in F$ is a root of $P(x)$ with multiplicity $m$ if $(x-r)^m$ divides $P(x)$ but $(x-r)^{m+1}$ does not divide $P(x)$. That is, $m$ is the largest integer for which $(x-r)^m$ divides $P(x)$.

Exercises (59)

1. Prove that if $P(x)$ is a non-zero polynomial of degree $d$ over a field $F$ and $r_1, r_2, \ldots, r_n$ are the different roots in $F$ of $P(x)$, and if for each $i$ the multiplicity of $r_i$ is $m_i$, then $m_1 + m_2 + \cdots + m_n \le d$.

2.
Prove that if $P(x)$ is a non-zero polynomial over a field $F$ of characteristic zero and $r \in F$ is a root of $P(x)$, then the multiplicity of $r$ is greater than $1$ if and only if $r$ is also a root of $P'(x)$.

The two exercises above explain our interest in multiplicities and derivatives. From the first exercise, we know that if one or more roots of $P(x)$ has multiplicity greater than $1$, then there must be strictly fewer roots than the degree of $P(x)$. This combines with Lemma 11.7.4 to say that the number of places an automorphism can take a root of $P(x)$ is strictly smaller than the degree of $P(x)$. We will see that this is a less than desirable situation, since there is less going on in a relevant automorphism group than hinted at by the degree of $P(x)$. The second exercise explains our interest in the derivative as defined in (11.5), since derivatives help determine when a root of a polynomial has multiplicity greater than $1$.

11.9 Factoring polynomials over the reals

We look at a very special case of polynomials. We look at $\mathbf{R}[x]$. Experience shows that the non-real roots of a polynomial $P(x) \in \mathbf{R}[x]$ come in complex conjugate pairs, and Exercise 1(e) of Exercise Set (47) proves that this is always the case. It turns out (what is called the Fundamental Theorem of Algebra) that any polynomial in $\mathbf{C}[x]$ (not just $\mathbf{R}[x]$) has all of its roots in $\mathbf{C}$. So this applies as well to $P(x)$.

We now bring in uniqueness of factorization. We know that if $r_1, r_2, \ldots, r_d$ are the roots of $P(x)$ (here $d$ is the degree of $P(x)$), then with $c$ the leading coefficient of $P(x)$, we have
$$P(x) = c(x - r_1)(x - r_2) \cdots (x - r_d).$$
We can arrange the product so that any non-real roots appear next to their complex conjugate "twin." Now if $(x - r_i)(x - r_{i+1})$ is a pair of such "twins," then their product is really $(x - r_i)(x - \bar{r}_i)$ and multiplies to
$$x^2 - (r_i + \bar{r}_i)x + r_i \bar{r}_i.$$
But for any complex number $z$, both $z + \bar{z}$ and $z\bar{z}$ are real. (Why?)
So the quadratic above is in $\mathbf{R}[x]$. For any real (non-non-real?) root $r_j$, the factor $(x - r_j)$ is also in $\mathbf{R}[x]$. Thus we have shown (accepting the Fundamental Theorem of Algebra as true) that any $P(x) \in \mathbf{R}[x]$ factors into degree one and degree two factors over $\mathbf{R}[x]$.

Chapter 12

A construction procedure for field extensions

In this chapter we give one method for building field extensions. There is another method that we will not cover that gives a very different kind of extension. We start with generalities that apply to all extensions, and then give definitions that describe the two basic types of extensions. Then we settle down to the construction of the particular type that we need.

12.1 Smallest extensions

We recall some basic results from Section 3.4.3. The first is standard and its argument follows the same outline as for groups and other algebraic structures.

Lemma 3.4.3 Let $E$ be a field and let $C$ be a collection of subfields of $E$. Then the intersection of all the subfields in $C$ is a subfield of $E$.

From this, another standard technique gives the following.

Lemma 3.4.4 Let $F \subseteq E$ be an extension of fields and let $S$ be a subset of $E$. Then in the collection of subfields of $E$ that contain both $F$ and $S$, there is a smallest subfield.

The smallest extension given by Lemma 3.4.4 is denoted $F(S)$, or is denoted $F(a_1, a_2, \ldots, a_n)$ if $S = \{a_1, a_2, \ldots, a_n\}$ is a finite set. In particular, when $S = \{a\}$ has only one element, we write $F(a)$ for the extension. We think of $F(a)$ as created from $F$ by "adding $a$" to $F$. The next lemma shows that $F(a_1, a_2, \ldots, a_n)$ can be created by adding one element at a time. Note that $F(a_1)$ is a field in its own right, so $F(a_1)(a_2)$ refers to the field obtained from $F$ by first adding $a_1$ to obtain $F(a_1)$ and then adding $a_2$ to $F(a_1)$ to obtain $F(a_1, a_2)$. The lemma says that this is the field $F(a_1, a_2)$, which is the smallest field containing $F$ and $\{a_1, a_2\}$.

Lemma 3.4.5 Let $F \subseteq E$ be an extension of fields, and let $\{a_1, a_2, \ldots, a_n\}$ be a subset of $E$. Then $F(a_1)(a_2) \cdots (a_n) = F(a_1, a_2, \ldots, a_n)$.

We will refer to $F(a_1, a_2, \ldots, a_n)$ as the extension of $F$ (in $E$) by $\{a_1, a_2, \ldots, a_n\}$. In the special case when $S$ is the single element $\{a\}$ in $E$, we will refer to $F(a)$ as the extension of $F$ (in $E$) by $a$.

It is nice to know that these smallest fields exist, but it is even nicer to know what is in them. In the setting of groups, we have a description of what is in a group generated by a certain set of elements. However, fields are somewhat more complicated than groups, and so the descriptions of what is in them are correspondingly more complicated. The promised construction of a field extension will give a description of $F(a)$ when $a$ has a certain property. The construction that we will not cover would describe what happens when $a$ does not have that property. We now discuss the property that distinguishes between the two cases.

12.2 Algebraic and transcendental elements

Let $F \subseteq E$ be an extension of fields, and let $\alpha$ be an element of $E$. If there is a non-zero polynomial $P$ over $F$ so that $\alpha$ is a root of $P$, then we say that $\alpha$ is algebraic over $F$. Thus $\sqrt{2}$ is algebraic over $\mathbf{Q}$ since $\sqrt{2}$ is a root of $x^2 - 2$. Note that every element $\alpha$ of $F$ is algebraic over $F$ since $\alpha$ is a root of $x - \alpha$.

We say that $\alpha \in E$ is transcendental over $F$ if it is not algebraic over $F$. That is, for every non-zero polynomial $P$ with coefficients in $F$, we have $P(\alpha) \ne 0$.

We will give a construction for $F(\alpha)$ when $\alpha$ is algebraic over $F$. This will be all we need since we are interested in roots of polynomials. Elements that are not roots of polynomials will have no use in our discussions. However, transcendental elements do exist and form the basis for discussions that are outside the scope of these notes.
It can be hard to prove that a specific element is transcendental. Both e (the base of the natural logarithm) and π are transcendental over Q. The proofs of these facts are beyond the scope of these notes and involve a certain amount of analysis as well as algebra. Note that e and π are both algebraic over R. Thus it is not correct to simply say that e and π are transcendental "period." For certain extensions, it can be easy to prove that transcendental elements must exist. This is quite different from showing that a specific element is transcendental. This will be given as an exercise later.

Exercises (60)

1. Prove that if α is algebraic over F, then so is −α. Conclude that if α is transcendental over F, then so is −α.

2. Prove that if α is transcendental over F, then so is α + α. Show by example that if α and β are transcendental over F, then α + β might not be transcendental over F.

3. Prove that if α is transcendental over F and β ∈ F, then α + β is transcendental over F. If in addition β ≠ 0, then αβ is transcendental over F. Conclude that if α is algebraic over F and β ∈ F, then α + β is algebraic over F, and if in addition β ≠ 0, then αβ is algebraic over F.

4. Prove that if α is algebraic over F, then α⁻¹ is algebraic over F. Conclude that if α is transcendental over F, then α⁻¹ is transcendental over F. Hint: if P(x) is a non-zero polynomial for which P(α) = 0, look at P(α⁻¹) but do not expect it to be zero. See what transformations it goes through when you clear fractions. Then start again with P(α), which you know is zero, and see what transformations you can make on it.

If α and β are algebraic over F, you might wonder about α + β and αβ. These are both algebraic over F, but we won't have the tools to prove that until later. For now you can think about why it is not obvious.

Next we give the promised construction. This will lead to a description of F(α) when α is algebraic over F.
12.3 Extension by an algebraic element

Let F ⊆ E be an extension of fields and let α ∈ E be algebraic. We would like to understand the structure of F(α). The structure of F(α) will come from the structure of F[x], the ring of polynomials over F.

From Lemma 11.3.3, we know that vα : F[x] → E defined by vα(P(x)) = P(α) is a ring homomorphism. Since vα(P(x)) = P(α) is a combination of products and sums of elements of F and powers of α, we get that P(α) is in F(α), the smallest subfield of E that contains all of F and contains α. Thus we can say that vα goes from F[x] to F(α). Later we will see that it is onto. Now we investigate the extent to which it is not one-to-one.

The fact that α is algebraic over F tells us that there is some non-zero polynomial Q(x) in F[x] for which Q(α) = 0. That is, α is a root of Q(x). If we concentrate on the additive parts of F[x] and of F(α), then this says that Q(x) is a non-zero element of the kernel of vα. The kernel of a group homomorphism captures the failure of the homomorphism to be one-to-one (field homomorphisms are so restrictive that failure to be one-to-one has drastic consequences, but F[x] is not a field and vα is not a field homomorphism), and we can say that vα(M(x)) = vα(N(x)) for M(x) and N(x) in F[x] if and only if M(x) − N(x) is in the kernel of vα. It thus becomes important to understand the kernel of vα. To do so, we bring in the multiplicative structure of F[x].

Since α ∈ E is algebraic over F, it is a root of some non-zero polynomial in F[x]. There must be a smallest degree d so that there is a non-zero polynomial in F[x] of degree d with α as a root. We call d the degree of α over F. A non-zero polynomial P(x) ∈ F[x] with α as a root whose degree is the degree of α is called a minimal polynomial for α over F.
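The kernel of vα can be explored numerically. The sketch below is ours, not part of the notes, and uses floating point, so "equals zero" means "approximately zero." It takes F = Q, E = R and α = √2: polynomials that vanish at α, such as x² − 2 and its multiples, are in the kernel of vα, while a degree one polynomial over Q with root √2 would force √2 into Q, which is why √2 has degree 2 over Q.

```python
import math

alpha = math.sqrt(2)

def ev(coeffs, t):
    """Evaluate a polynomial given low-degree-first coefficients: this is v_alpha."""
    return sum(c * t**i for i, c in enumerate(coeffs))

minimal = [-2, 0, 1]       # P(x) = x^2 - 2, a minimal polynomial for sqrt(2) over Q
print(ev(minimal, alpha))  # approximately 0: P(x) is in the kernel of v_alpha

# Multiples of P(x) are also in the kernel, e.g.
# (x^2 - 2)(3x + 5) = -10 - 6x + 5x^2 + 3x^3.
multiple = [-10, -6, 5, 3]
print(ev(multiple, alpha))  # approximately 0

# A degree one polynomial a + bx over Q with root sqrt(2) would give
# sqrt(2) = -a/b, a rational number, so none exists; for instance:
print(ev([1, 1], alpha))    # 1 + sqrt(2), not 0
```

The exercises that follow turn this observation into a theorem: the kernel consists exactly of the multiples of a minimal polynomial.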
The next exercises show that the kernel of vα is completely determined by a minimal polynomial for α over F and that all such minimal polynomials are closely related.

Exercises (61)

1. Let F ⊆ E be an extension of fields, let α ∈ E be algebraic over F and let P(x) ∈ F[x] be minimal for α. Show that P(x) is irreducible over F.

2. Let F ⊆ E be an extension of fields, and let α ∈ E be a root of P(x) ∈ F[x] where P(x) is irreducible over F. Show that P(x) is a minimal polynomial for α.

3. In the setting of Problem 1, show that Q(x) ∈ F[x] has Q(α) = 0 (that is, Q(x) is in the kernel of vα) if and only if P(x) divides Q(x).

4. In the setting of Problem 1, show that given two minimal polynomials for α, each is a constant times the other, and show that there is a unique monic, minimal polynomial for α.

The exercises above show that in the setting given, M(α) = N(α) for M(x) and N(x) in F[x] if and only if M(x) − N(x) is a multiple of some minimal polynomial (equivalently, all minimal polynomials) for α over F.

12.3.1 The construction

Let F ⊆ E be an extension of fields and let α ∈ E be algebraic over F with minimal polynomial P(x) ∈ F[x] for α. The exercise set above motivates the following definition in imitation of the construction of Zp from Z. For M(x) and N(x), we define M(x) ∼P N(x) to mean that N(x) − M(x) is a multiple of P(x).

Exercises (62)

The following problems refer to the items described in the previous two paragraphs. The arguments are similar to arguments about the construction of Zp from Z.

1. Show that ∼P is an equivalence relation.

2. For M(x) ∈ F[x], write [M(x)]P to denote the equivalence class of M(x) under ∼P. Prove that setting [M(x)]P + [N(x)]P = [(M + N)(x)]P, −[M(x)]P = [−M(x)]P, and [M(x)]P [N(x)]P = [(MN)(x)]P gives well defined operations on equivalence classes. We will denote the set of equivalence classes with the operations above by F[x]/P(x).

3.
Argue that the multiplication defined in Problem 2 makes F[x]/P(x) a commutative ring with identity. That is, the relevant laws (commutative, associative, identity, distributive, etc.) hold.

4. Use the irreducibility of P(x) to prove that there are multiplicative inverses for every non-zero element in F[x]/P(x). This argument mirrors the proof that Zp has multiplicative inverses when p is a prime. You should look up that argument in Section 2.8.

The previous exercise set shows that F[x]/P(x) with the operations in Problem 2 forms a field. We can now prove the following.

Theorem 12.3.1 Let F ⊆ E be an extension of fields. Let α ∈ E be algebraic over F and let P(x) ∈ F[x] be a minimal polynomial for α. Let vα : F[x] → E be the evaluation homomorphism. Then v̄α : F[x]/P(x) → E defined by v̄α([M(x)]P) = vα(M(x)) = M(α) is a well defined field homomorphism which takes [x]P to α and whose image is exactly F(α). Thus v̄α : F[x]/P(x) → F(α) is a field isomorphism taking [x]P to α.

Proof. The last line follows from the previous lines since a field homomorphism either takes everything to zero or is one-to-one. The homomorphism cannot take everything to zero since if k ∈ F, then the constant polynomial k has v̄α([k]) = vα(k) = k. Thus the image contains at least F. A one-to-one homomorphism is an isomorphism onto its image.

Thus we must show that v̄α : F[x]/P(x) → E is a well defined field homomorphism with the claimed image. We start with the well definedness question. If M(x) ∼P N(x), then M(x) − N(x) is divisible by P(x) and M(x) − N(x) = P(x)Q(x). Now M(α) − N(α) = P(α)Q(α) = 0Q(α) = 0, so M(α) = N(α).

That v̄α is a homomorphism follows from the fact that vα is a homomorphism, and from the definitions in Problem 2 of Exercise set (62). The monomial x has vα(x) = α, so v̄α([x]P) = vα(x) = α, giving one of the facts to be proven and putting α in the image of v̄α.
Each k ∈ F also can be thought of as a constant polynomial so that v̄α([k]P) = vα(k) = k(α) = k, which puts F in the image of v̄α. The image of v̄α is a subfield of E containing both F and α, and so contains F(α). But each [M(x)]P for M(x) ∈ F[x] maps to M(α) under v̄α. As argued in the first few paragraphs of Section 12.3, we have M(α) ∈ F(α). So the image of v̄α is contained in F(α) and so must equal F(α). This completes the proof.

Theorem 12.3.1 says that in the setting of the theorem, F(α) is like F[x]/P(x). But we would like to say more about what F[x]/P(x) is like on its own merits, just using the fact that P(x) is irreducible over F. We can say what F[x]/P(x) is like, and we will do so twice. It turns out to be very easy when P(x) is a minimal polynomial for some α that is algebraic in some extension of F. It is a bit more involved when it is not known that P(x) has such an α. It turns out that every P(x) has such an α, but that argument needs either a result from complex analysis in the setting of subfields of C, or a prior understanding of F[x]/P(x).

12.3.2 The structure of F[x]/P(x)

In the discussion that follows, we will assume that P(x) is a minimal polynomial for some α that is algebraic in some extension E of F. From Theorem 12.3.1, we know that v̄α : F[x]/P(x) → F(α) is an isomorphism. In particular, the inverse θ of v̄α is an isomorphism from F(α) to F[x]/P(x). We will use the two isomorphisms v̄α and θ to say something about F[x]/P(x).

In exercises that come right after the analysis of F[x]/P(x) using the two isomorphisms, you will not assume that P(x) is a minimal polynomial for some α that is algebraic in some extension of F. Thus you will not have the isomorphism v̄α and its inverse θ available. In spite of this, you will be asked to prove statements that correspond to the facts that we will extract from the two isomorphisms.
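None of what follows appears in the notes, but the construction is easy to experiment with on a computer. The sketch below (exact rational arithmetic from Python's standard library; the helper names are ours) represents a class [M(x)]P by the remainder of M(x) on division by P(x), for the concrete choice F = Q and P(x) = x² − 2. The last lines check that [x]P squares to the class of the constant 2, so [x]P behaves like √2 just as Theorem 12.3.1 promises, and compute an inverse by the extended Euclidean algorithm, mirroring the Zp argument from Exercise 4 of set (62).

```python
from fractions import Fraction

# P(x) = x^2 - 2, irreducible over Q; coefficient lists are low degree first.
P = [Fraction(-2), Fraction(0), Fraction(1)]

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out

def psub(a, b):
    out = [Fraction(0)] * max(len(a), len(b))
    for i, u in enumerate(a): out[i] += u
    for i, u in enumerate(b): out[i] -= u
    return out

def pdivmod(a, b):
    """Division algorithm in Q[x]: a = q*b + r with deg r < deg b."""
    a = a[:]
    q = [Fraction(0)] * max(1, len(a) - len(b) + 1)
    while len(a) >= len(b):
        c = a[-1] / b[-1]
        q[len(a) - len(b)] = c
        for i, bc in enumerate(b):
            a[len(a) - len(b) + i] -= c * bc
        while a and a[-1] == 0:
            a.pop()
    return q, (a or [Fraction(0)])

def mod(m):
    """The class [m]_P, represented by the remainder of m on division by P."""
    return pdivmod(m, P)[1]

def mul(m, n):
    return mod(pmul(m, n))

def inv(m):
    """Inverse of [m]_P via the extended Euclidean algorithm (P irreducible)."""
    r0, r1 = P, m
    s0, s1 = [Fraction(0)], [Fraction(1)]
    while any(r1):
        q, r = pdivmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, psub(s0, pmul(q, s1))
    return mod([c / r0[0] for c in s0])  # r0 is a non-zero constant gcd

x = [Fraction(0), Fraction(1)]          # the class [x]_P, playing the role of sqrt(2)
print(mul(x, x))                        # the class of the constant 2
print(inv([Fraction(1), Fraction(1)]))  # [x+1]_P inverts to [x-1]_P: 1/(1+sqrt 2) = sqrt 2 - 1
```

The inverse computed at the end matches the familiar rationalization 1/(1 + √2) = √2 − 1.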
Assuming that E, α, v̄α and θ exist, we know for every element k ∈ F that k also represents the constant polynomial that we can refer to as k in F[x]. Further we have vα(k) = k, so v̄α([k]P) = k. From this we know that v̄α is one-to-one from the classes of constant polynomials to F ⊆ F(α), that different constant polynomials lie in different classes mod P in F[x]/P(x), and that the classes of constant polynomials form a subfield of F[x]/P(x) isomorphic to F. Also, θ(k) = [k]P for each k ∈ F. It is cumbersome to keep writing [k]P for "the class mod P of the constant polynomial k," so we will simply denote it by k. With this shorthand, we have θ(k) = k. With the shorthand that regards k ∈ F as also representing the class [k]P, we have inserted F in F[x]/P(x) as a subfield. Thus we can think of F[x]/P(x) as an extension of F.

We know that the monomial x has vα(x) = α, so v̄α([x]P) = α and θ(α) = [x]P. Let P(x) = Σ ci x^i with each ci in F. Since P(α) = 0, we have Σ ci α^i = 0. Applying θ to this equality, we get Σ ci ([x]P)^i = 0, since our shorthand says that θ(ci) = ci for each i. Thus in F[x]/P(x), the element [x]P is a root of the polynomial P(x).

If there is a subfield K in F[x]/P(x) that contains F and contains [x]P, then it will have an isomorphic image v̄α(K) in F(α) that contains F and contains α. But our definition of F(α) says that v̄α(K) would then have to equal F(α). Thus K = F[x]/P(x), and F[x]/P(x) is the smallest subfield of itself that contains F and [x]P.

We now consider the degree [F[x]/P(x) : F], where we regard F as a subfield of F[x]/P(x) using the shorthand discussed above. If d is the degree of P(x), we will show that [F[x]/P(x) : F] = d by showing that the d elements of {1, [x]P, [x²]P, . . . , [x^(d−1)]P} form a basis for F[x]/P(x).
If we bring these elements to F(α) by the isomorphism v̄α, then we are looking at 1, α, α², . . . , α^(d−1). If these are linearly dependent, then the linear dependence would give a non-zero polynomial of degree no more than d − 1 with α as a root. But this would violate the fact that P(x) is a minimal polynomial for α.

Every element in F[x]/P(x) is of the form [M(x)]P for some polynomial M(x) over F. By the division algorithm, M(x) = (PQ)(x) + R(x) with the degree of R(x) smaller than d, the degree of P(x). Since M(x) − R(x) is a multiple of P(x), we have [M(x)]P = [R(x)]P, and every element of F[x]/P(x) is represented by the class of some polynomial of degree less than d. It is tempting to jump to the correct conclusion that [R(x)]P is a linear combination of the elements of {1, [x]P, [x²]P, . . . , [x^(d−1)]P}. However, this conclusion requires a bit more than just a jump. We will take advantage of the fact that [R(x)]P is carried by the isomorphism v̄α to R(α), which is a linear combination of the elements of {1, α, α², . . . , α^(d−1)}. Now this linear combination is carried back by θ to a linear combination of the elements of {1, [x]P, [x²]P, . . . , [x^(d−1)]P} and we are done. We have shown the following lemma.

Lemma 12.3.2 Let F ⊆ E be an extension of fields, and let α ∈ E be algebraic over F with minimal polynomial P(x) of degree d. Then the following hold.

1. Sending k ∈ F to [k]P in F[x]/P(x) is an isomorphism from F into F[x]/P(x) whose image is not zero. Using this, we regard F as a subfield of F[x]/P(x) for the rest.

2. The element [x]P in F[x]/P(x) is a root of the polynomial P(x).

3. The smallest subfield of F[x]/P(x) that contains F and [x]P is F[x]/P(x) itself.

4. The degree [F[x]/P(x) : F] equals the degree d of P(x), and a basis for F[x]/P(x) over F is {1, [x]P, [x²]P, . . . , [x^(d−1)]P}.
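The division-algorithm argument above is easy to watch in action. The following sketch (ours, not from the notes; exact rationals from Python's standard library) takes F = Q and P(x) = x³ − 2, so that d = 3 and [x]P plays the role of the real cube root of 2. Reducing the monomial x^n mod P(x) expresses the class ([x]P)^n = [x^n]P in coordinates with respect to the basis {1, [x]P, [x²]P}.

```python
from fractions import Fraction

# P(x) = x^3 - 2, irreducible over Q; coefficient lists are low degree first.
P = [Fraction(-2), Fraction(0), Fraction(0), Fraction(1)]

def reduce_mod_P(m):
    """Remainder of m on division by P: coordinates in the basis {1, [x], [x^2]}."""
    m = m[:]
    while len(m) >= len(P):
        c = m[-1] / P[-1]
        shift = len(m) - len(P)
        for i, pc in enumerate(P):
            m[shift + i] -= c * pc
        while m and m[-1] == 0:
            m.pop()
    return m + [Fraction(0)] * (3 - len(m))  # pad to coordinates (c0, c1, c2)

def xpow(n):
    """The class ([x]_P)^n = [x^n]_P, reduced to degree below 3."""
    return reduce_mod_P([Fraction(0)] * n + [Fraction(1)])

for n in range(5):
    print(n, xpow(n))
# [x]^3 reduces to the constant class 2, and [x]^4 to 2*[x],
# exactly as the powers of the cube root of 2 behave.
```

The coordinates printed for n = 3 and n = 4 are (2, 0, 0) and (0, 2, 0), matching ∛2³ = 2 and ∛2⁴ = 2·∛2.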
Corollary 12.3.3 Let F ⊆ E be an extension of fields, and let α ∈ E be algebraic over F with minimal polynomial P(x) of degree d. Then [F(α) : F] = d, and a basis for F(α) over F is {1, α, α², . . . , α^(d−1)}.

There is a result corresponding to Lemma 12.3.2 with a different hypothesis. Its proof will be an exercise. Note that the conclusions are identical to those of Lemma 12.3.2.

Proposition 12.3.4 Let F be a field and let P(x) be a non-constant polynomial of degree d that is irreducible over F. Then the following hold.

1. Sending k ∈ F to [k]P in F[x]/P(x) is an isomorphism from F into F[x]/P(x) whose image is not zero. Using this, we regard F as a subfield of F[x]/P(x) for the rest.

2. The element [x]P in F[x]/P(x) is a root of the polynomial P(x).

3. The smallest subfield of F[x]/P(x) that contains F and [x]P is F[x]/P(x) itself.

4. The degree [F[x]/P(x) : F] equals the degree d of P(x), and a basis for F[x]/P(x) over F is {1, [x]P, [x²]P, . . . , [x^(d−1)]P}.

Exercises (63)

The steps below will prove Proposition 12.3.4. Note that from Exercise set (62) we already know that F[x]/P(x) is a field.

1. Prove Conclusion 1 of the proposition. There are several ways to approach this. Some steps may be saved if Lemma 3.4.6 is taken into account.

2. This is the start of a proof of Conclusion 2 of the proposition. Let (x) represent the polynomial in which all coefficients are equal to zero except that the coefficient of x¹ is one. Similarly, let (x^n) represent the polynomial in which all coefficients are equal to zero except that the coefficient of x^n is one. Prove that (x)^n = (x^n). Use this to conclude that in F[x]/P(x) the equality ([x]P)^n = [x^n]P holds.

3. This will help with Conclusion 2 of the proposition as well as Conclusion 4. For a polynomial M(x) = Σ ai x^i in F[x], and k ∈ F, define (kM)(x) to be the polynomial Σ (k ai) x^i.
Prove that this multiplication of a polynomial times a "scalar," together with the addition and negation in F[x] (but ignoring the multiplication in F[x]), makes F[x] a vector space over F. There are many things to check. See Section 1.4.2 for the list.

4. This will also help with Conclusions 2 and 4. If M(x) is a polynomial in F[x], then in F[x]/P(x), we have M([x]P) = [M(x)]P. This includes a small amount of calculation and a very careful review of definitions.

5. Prove Conclusion 2 of Proposition 12.3.4.

6. Prove Conclusion 4 of Proposition 12.3.4. That is right: do Conclusion 4 before Conclusion 3. You should decide what the basis is, and prove that it is the basis. A review of definitions will help with linear independence.

7. Prove Conclusion 3 of Proposition 12.3.4. This asks what has to be in a subfield of F[x]/P(x) that contains F and [x]P.

The point of Proposition 12.3.4 is that even if we don't have an extension having a root of some P(x), we can build one that does. In the proposition, we need to assume that P(x) is irreducible, but later we will see how to handle any polynomial.

Note that the constant polynomials in F[x] form a subfield of the ring F[x], and that in turn the constant polynomials are exactly the units of the ring F[x] together with the zero polynomial. That this forms a field is an accident, as the next exercise shows.

Exercises (64)

1. Find an n so that the ring Zn (which is not a field unless n is prime) does not have a subfield consisting exactly of the units and the zero element.

12.3.3 A result about automorphisms

From Proposition 12.3.4, we know a great deal about the structure of F[x]/P(x). We should be able to say something about its automorphisms. Specifically, we will say something about the automorphisms that fix F. To keep the notation simpler, we will look at an extension F ⊆ E where α ∈ E is algebraic over F with a minimal polynomial P(x).
We know that the extension F ⊆ F(α) has the same structure as F ⊆ F[x]/P(x), and F(α) is less complicated to write than F[x]/P(x). It is possible that F(α) has other roots of P(x). If d is the degree of P(x), we know that there cannot be more than d roots of P(x) in F(α). Let the roots of P(x) in F(α) be {α1, α2, . . . , αn} with n ≤ d and with α1 = α. Note that if Q(x) is another minimal polynomial for α, it is a non-zero constant times P(x) and has exactly the same roots as P(x).

Consider an αi for i ≠ 1, and consider F(αi). This is a subfield of F(α) since αi is in F(α). Since P(x) is minimal for α, it is irreducible over F. Since αi is a root of P(x) and P(x) is irreducible over F, P(x) is a minimal polynomial for αi. By Lemma 12.3.2, we know that [F(αi) : F] = d and also that [F(α) : F] = d. But F ⊆ F(αi) ⊆ F(α), so [F(α) : F] = [F(α) : F(αi)][F(αi) : F]. But this implies that [F(α) : F(αi)] = 1 and F(αi) = F(α). Thus all the F(αi) are equal.

Now for each i there is an isomorphism φi = v̄αi from F[x]/P(x) to F(αi) that takes each constant polynomial k to k in F(αi) and takes [x]P to αi. Now φi ◦ (φ1)⁻¹ takes F(α) isomorphically to F(αi) so that each k ∈ F is taken to itself and so that α is taken to αi. Since F(αi) = F(α), this is an automorphism of F(α) that fixes F and that takes α to αi. We have just discovered an element θi = φi ◦ (φ1)⁻¹ of Aut(F(α)/F). We know that all the θi are different since they do different things to α.

We now argue that Aut(F(α)/F) consists entirely of the θi. Let σ be in Aut(F(α)/F). By Proposition 10.4.1, we know that σ(α) is a root of P(x). Thus σ(α) is some αi, and then σ and θi agree on α. We now need a quick exercise.

Exercises (65)

1. If E is a field and ρ and θ are in Aut(E), then {y ∈ E | ρ(y) = θ(y)} is a subfield of E.
By the exercise, the set of elements on which σ and θi agree is a subfield of F (α). But this subfield contains F and contains α. Thus it contains F (α) and σ and θi are the same element in Aut(F (α)/F ). We have one more observation to make before we are ready to state a result. So far we have shown that if F (α) contains n different roots of a minimal polynomial P (x) for α, then there are n elements in the group Aut(F (α)/F ) and that each root of P (x) is the image of α under some element of Aut(F (α)/F ). Further, an element of Aut(F (α)/F ) is completely determined by what it does on α. Recall that we also showed that if αi is another root of P (x), then F (αi ) = F (α), so that we can let αi play the role of α. This lets us repeat for αi all that has been said about α. All these observations combine to give the following extremely important result about automorphism groups. We call it a proposition since later we will generalize it to larger extensions and will call the generalization a theorem. Proposition 12.3.5 Let F ⊂ E be an extension of fields, and let α ∈ E be algebraic over F with minimal polynomial P (x) ∈ F [x] of degree d. Then the number of elements in Aut(F (α)/F ) is exactly the number of roots of P (x) in F (α) and is no larger than d. Further given any two (not necessarily different) roots αi and αj of P (x) in F (α), there is an element of Aut(F (α)/F ) that carries αi to αj . Lastly, if αi is a root of P (x) in F (α), then each automorphism in Aut(F (α)/F ) is determined completely by what it does on αi . If we restrict the action of the group Aut(F (α)/F ) to the set of all roots R = {α1 , α2 , . . . , αn } of P (x) in F (α), then we note that the orbit of any of the αi is all of R. Recall (Section 9.3) that when a group acts on a set so that there is only one orbit (i.e., given any two elements of the set, there is an element of the group taking one to the other), then it is said that the action of the group on the set is transitive. 
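The case F = Q with P(x) = x² + x + 1, worked out in the next section, can be spot-checked numerically. The sketch below is ours, not the notes': it represents elements of Q(ω) as pairs (a, b) standing for a + bω, where ω = −1/2 + i√3/2, defines the candidate map σ(a + bω) = a + bω², and checks on many samples that σ preserves sums and products, as an automorphism fixing Q and carrying one root of P(x) to the other must. The key identity is ω² = −1 − ω, which is what makes σ expressible with rational coefficients.

```python
import math
import random

w = complex(-0.5, math.sqrt(3) / 2)      # omega, a root of x^2 + x + 1
assert abs(w**2 + w + 1) < 1e-12         # omega is a root
assert abs((w**2)**2 + w**2 + 1) < 1e-12 # and so is omega^2

def elem(a, b):
    """The element a + b*omega of Q(omega), realized as a complex number."""
    return a + b * w

def sigma(a, b):
    """Candidate automorphism: fix Q, send omega to omega^2 = -1 - omega."""
    # a + b*omega  |->  a + b*omega^2 = (a - b) + (-b)*omega
    return a - b, -b

random.seed(0)
for _ in range(100):
    a, b, c, d = (random.randint(-5, 5) for _ in range(4))
    su, sv = elem(*sigma(a, b)), elem(*sigma(c, d))
    # sigma respects addition:
    assert abs(elem(*sigma(a + c, b + d)) - (su + sv)) < 1e-9
    # multiplication rule in Q(omega), using omega^2 = -1 - omega:
    pa, pb = a * c - b * d, a * d + b * c - b * d
    assert abs(elem(a, b) * elem(c, d) - elem(pa, pb)) < 1e-9
    # sigma respects multiplication:
    assert abs(elem(*sigma(pa, pb)) - su * sv) < 1e-9
print("sigma behaves as an automorphism on all samples")
```

A numerical check like this is only evidence, not a proof; the proof is exactly the content of Proposition 12.3.5 applied to the two roots ω and ω².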
Thus Proposition 12.3.5 says, assuming its hypotheses, that the action of Aut(F(α)/F) on the roots of P(x) in F(α) is transitive.

12.3.4 Examples

We give a few examples to show the power of Proposition 12.3.5. They also illustrate some of the need for the care that went into the wording of the proposition. We work out one fully in the text and then give others as exercises.

Cube roots of 1

We will use Q as the base field for our extensions. The polynomial P(x) = x³ − 1 is not irreducible over Q since it factors as (x − 1)(x² + x + 1). The left factor has a root of 1 and the right factor has as roots ω = −1/2 + i√3/2 and ω² = −1/2 − i√3/2. The right factor is irreducible over Q (the factorization x² + x + 1 = (x − ω)(x − ω²) uses coefficients outside of Q) and so is a minimal polynomial for ω and ω². Since ω² is in any field that contains ω, we know that it is in Q(ω). From Proposition 12.3.5, we know that Aut(Q(ω)/Q) has exactly two elements—one that takes ω to itself and one that takes ω to ω². Since the identity must be an element in Aut(Q(ω)/Q), the first must be the identity. What must be the image of ω² under the non-identity element of Aut(Q(ω)/Q)?

Fifth roots of 1

The fifth roots of 1 are evenly spaced around the unit circle in C and one of them is 1. The separation of 2π/5 or 72° puts one of the four non-real roots in each of the four quadrants. The non-real root in the first quadrant we will denote by α, and the other three non-real roots will be α², α³ and α⁴. These are all roots of x⁵ − 1, but x⁵ − 1 factors as (x − 1)(x⁴ + x³ + x² + x + 1). The fifth root 1 of 1 is the root of x − 1 and the four non-real roots are roots of the right factor. The exercises below will address the irreducibility of the right factor and then will address Aut(Q(α)/Q).

Exercises (66)

1. Prove that there is no degree one factor of x⁴ + x³ + x² + x + 1 over Q.

2.
Prove that there is no degree two factor of x⁴ + x³ + x² + x + 1 over Q. (Hint: What must the roots of such a degree two factor be, and what would that say about the coefficients in the factor? Use part (e) of the problem in Exercise set (47) and its consequences discussed in Section 11.9.) Conclude that x⁴ + x³ + x² + x + 1 is irreducible over Q.

3. Argue that Aut(Q(α)/Q) has exactly four elements. Give each element of Aut(Q(α)/Q) a name and write out the multiplication table using these names for Aut(Q(α)/Q). What is this group isomorphic to?

4. (This has nothing to do with automorphisms.) Show that √5 is an element of Q(α).

Cube roots of 2

The cube roots of 2 are evenly spaced around the circle of radius ∛2 centered at the origin, with one root being the real root ∛2 that is the "usual" cube root of 2. The other two are at angles ±120° from the real root and can be represented as β1 = ω∛2 and β2 = ω²∛2, where ω and ω² are the two non-real cube roots of 1.

Exercises (67)

1. Prove that x³ − 2 is irreducible over Q and is thus a minimal polynomial over Q for each of the cube roots of 2.

2. How many roots of x³ − 2 are in Q(∛2)?

3. How many elements does Aut(Q(∛2)/Q) have?

4. How many roots of x³ − 2 does Q(β1) have?

5. How many elements does Aut(Q(β1)/Q) have?

Sixth roots of 1

The sixth roots of 1 are evenly spaced around the unit circle in C, with one being 1, one being −1 and the other four not real. Since they are spaced 60° apart, there is one non-real sixth root of 1 in each of the four quadrants. Let γ be the non-real root in the first quadrant. The sixth roots of 1 are all roots of x⁶ − 1. However, the polynomial x⁶ − 1 factors into (x − 1)(x⁵ + x⁴ + x³ + x² + x + 1).

Exercises (68)

1. Show that x⁵ + x⁴ + x³ + x² + x + 1 factors into three factors in Q[x] of degrees 2, 2, and 1, respectively. Show that each of these factors is irreducible.

2. How many roots of x⁶ − 1 are in Q(γ)?

3.
What can you say about Aut(Q(γ)/Q)?

Project (optional)

1. Existence of finite fields. To build finite extensions of Zp, one follows the same outline. The key is to find polynomials with coefficients in Zp of desired degrees that do not factor (called irreducible in the next chapter) into smaller polynomials with coefficients in Zp. These are easier to show exist than to find. The fact that they exist comes from the fact that there are more polynomials of a given degree than polynomials that result from multiplying polynomials of smaller degree. It follows from this that a finite field exists of size p^d for each prime p and each integer d ≥ 1. One can attempt direct calculations of this as a project. To see a particularly elegant proof that irreducible polynomials of all degrees always exist, see [1], Section 16.9, Pages 368–371. The narrative there will require reading some of the pages that come before that section. Chapter 16 of the book [1] has a lot of interesting information on finite fields, some of which duplicates what has been covered here, and much that has not.

Chapter 13

Multiple extensions

Consider the cubic polynomial P(x) = x³ + 3x − 2. Its derivative P′(x) = 3x² + 3 is positive for all real numbers, so P(x) is strictly increasing on all of R. Thus P(x) has only one real root. The real root is α = ∛(1 + √2) + ∛(1 − √2), which can be verified either by evaluating P(α) directly or by going through the formula

x = ∛(−r/2 + √((r/2)² + (p/3)³)) + ∛(−r/2 − √((r/2)² + (p/3)³))

for the solutions to cubic polynomial equations of the form x³ + px + r = 0. The coefficients of P(x) were chosen deliberately to cooperate reasonably well with the formula.

We are studying whether polynomials can have their roots calculated from their coefficients by the five processes of addition, subtraction, multiplication, division and the taking of n-th roots.
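This is a good place for a quick numerical sanity check (ours, not part of the notes). The snippet below evaluates the claimed real root α = ∛(1 + √2) + ∛(1 − √2) of P(x) = x³ + 3x − 2 in floating point. Note that ∛(1 − √2) is the real cube root of a negative number, so it is computed as −|1 − √2|^(1/3).

```python
import math

# P(x) = x^3 + 3x - 2, so p = 3 and r = -2 in x^3 + p*x + r = 0.
def P(x):
    return x**3 + 3 * x - 2

def real_cbrt(t):
    """Real cube root, valid for negative t as well."""
    return math.copysign(abs(t) ** (1 / 3), t)

# Inside of the square root: (r/2)^2 + (p/3)^3 = 1 + 1 = 2.
s = math.sqrt((-2 / 2) ** 2 + (3 / 3) ** 3)
alpha = real_cbrt(1 + s) + real_cbrt(1 - s)   # -r/2 = 1

print(alpha)      # approximately 0.59607
print(P(alpha))   # approximately 0
```

This only confirms the formula numerically; the algebraic verification is the direct evaluation of P(α) described above.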
The first four of the five operations are field operations, so we can summarize the process by saying that we want to find the roots of a polynomial by the processes of field operations and the taking of n-th roots.

If we take the example just given and see how the allowed processes can be arranged, one step at a time, to take us from the coefficients to the roots, we would start with Q, the smallest field that contains the coefficients. Calculations that we can make before the taking of any n-th roots would include calculating the expression

(r/2)² + (p/3)³                                                    (13.1)

that lies inside the square roots. These involve only field operations and stay within the field containing the coefficients. In the example, the expression (13.1) evaluates to 2.

Then a square root must be taken. This goes outside the field containing the coefficients and brings us to the field Q(√2). Adding −r/2 to the square root (or its negative) is another field operation and stays within Q(√2), but the taking of the cube roots is a problem. Note that if we get one of the cube roots in a field, we will also have the other. This is because

∛(1 + √2) · ∛(1 − √2) = ∛((1 + √2)(1 − √2)) = ∛(1 − 2) = ∛(−1) = −1,

so that ∛(1 − √2) = −(∛(1 + √2))⁻¹.

It is possible to believe that the cube roots exist in Q(√2), but in fact they do not. At the end of this chapter we will see how to argue that P(x) is irreducible over Q. This means that Q(α) is of degree 3 over Q. If P(x) is reducible over Q(√2), then at least one factor of P(x) in Q(√2)[x] must be degree one and give the real (because Q(√2) lies in R) root α. Since [Q(√2) : Q] = 2, this would put Q(α) inside a field of degree 2 over Q. This is not possible. Thus P(x) is also irreducible over Q(√2) and the cube roots cannot be in Q(√2). Thus at least one more extension must be created. Specifically, we must create the extension Q(√2)(∛(1 + √2)).
As noted above, the field Q(√2)(∛(1 + √2)) will contain both of the cube roots that are needed to build α. So we see that if we are to follow the processes that we want to use to arrive at roots, we must look at extensions that are more complicated than the extensions by single elements that we studied in the previous chapter. This chapter proposes to do just that.

In fact, we can arrive at a field containing α just by forming Q(α). This will be a degree 3 extension of Q by Lemma 12.3.2. But building this extension does not follow the allowable processes in a step by step fashion. Adding the element α to Q all at once combines several takings of n-th roots.

Note that neither Q(α) nor Q(√2)(∛(1 + √2)) contains the other roots of P(x), since the other roots must be non-real complex numbers. One theme in this chapter will be that looking at a smallest extension that contains all the roots of a polynomial will reveal much about the polynomial and its roots. As this example shows, such extensions will often take more than one step.

Lastly, this chapter will end with a discussion of one way to show that a polynomial is irreducible. In particular it will apply to the polynomial P(x) = x³ + 3x − 2, and it will apply to other important cases.

13.1 Multiple extensions

Recall the following.

Lemma 3.4.5 Let F ⊆ E be an extension of fields, and let {a1, a2, . . . , an} be a subset of E. Then F(a1)(a2) · · · (an) = F(a1, a2, . . . , an).

To illustrate what this means for an extension done in two steps, we take an extension of fields F ⊆ E and we let α and β be elements of E. Then Lemma 3.4.5 implies that F(α, β) = F(α)(β) = F(β)(α). Of the three fields being compared, the first is the smallest subfield of E containing F and α and β, the second is the smallest subfield of E containing F(α) and β, where F(α) is the smallest subfield of E containing F and α.
The third is the smallest subfield of E containing F(β) and α, where F(β) is the smallest subfield of E containing F and β.

There is terminology to separate out those extensions that can be done in one step. If F ⊆ E is an extension of fields, then we say that the extension is simple if there is an element α ∈ E so that E = F(α). In this situation, α is called a primitive element for the extension. We shortly give conditions under which primitive elements always exist. The wording of the definition of simple has been carefully chosen. Note that an extension F(α, β) appears not to be a simple extension of F, but it might be if there is an element γ ∈ F(α, β) for which F(γ) = F(α, β).

Exercises (69)

1. Is Q(√2, √3) a simple extension of Q? Hint: find a lot of elements in the extension.

The previous chapter analyzed the structure of simple extensions and had a lot to say about the automorphisms of a simple extension. We wish to do the same here for multiple extensions. However, we will limit ourselves to multiple extensions by algebraic elements. We next make some general (and fairly powerful) observations about such extensions.

Notation We will often be more interested in extensions of fields by specific elements than in some larger field that might contain even more elements. Thus we will often let F(α) be an extension of F by α without saying that α comes from a specific field originally containing both F and α. If this is hard to accept, then we can say that letting F(α) be an extension of F by α is shorthand for saying that F ⊆ E is an extension of fields and that α ∈ E is an element for which F(α) = E.

13.2 Algebraic extensions

Let F ⊆ E be an extension of fields. We say that E is algebraic over F if every element of E is algebraic over F. We need to see that algebraic extensions exist. The next lemma discusses finite dimensional extensions. The next definition should have been given earlier.
If F ⊆ E is an extension of fields, we say that E is a finite extension of F if the degree [E : F] is finite. Thus we save one word by referring to finite dimensional extensions as finite extensions. Note that saying that E is a finite extension of F says nothing about the number of elements of E.

Lemma 13.2.1 Let E be a finite extension of the field F. Then E is algebraic over F. Further, if α is an element of E and [E : F] = d, then α has a minimal polynomial of degree at most d.

Proof. Proving the second sentence will prove the entire lemma. Consider the set S = {1 = α⁰, α¹, α², . . . , α^d}. If two of these (say α^i and α^j with 0 ≤ i < j ≤ d) are equal, then α is a root of x^j − x^i, which has degree at most d. If all α^i and α^j with 0 ≤ i < j ≤ d are different, then S has d + 1 elements, which must be linearly dependent over F. A linear dependence will be a non-zero polynomial of degree no more than d with α as a root.

Lemma 13.2.1 could have been proven back in Chapter 3, just after the definition of the degree of an extension in Section 3.4.3. It says that finite extensions are algebraic. We can combine this with Proposition 12.3.4 from Chapter 12, which says that an extension by an algebraic element is a finite extension, and with facts about degrees that were established in Section 10.5.2, to get a suite of lemmas about algebraic extensions.

Lemma 13.2.2 The extension F(α) of the field F is algebraic over F if α is algebraic over F.

We know that α is algebraic over F. The point is that so is every other element of F(α).

Proof. From Proposition 12.3.4, the degree [F(α) : F] is finite, and from Lemma 13.2.1, we get that F(α) is algebraic over F.

Lemma 13.2.2 is really a special case of the next lemma.

Lemma 13.2.3 Let F ⊆ E be an extension of fields and assume that E = F(α_1, α_2, . . . , α_n). Let F_0 = F and for each i with 1 ≤ i ≤ n, let F_i = F(α_1, α_2, . . . , α_i).
Assume for each i with 1 ≤ i ≤ n that α_i is algebraic over F_{i−1}. Then E is algebraic over F.

Proof. Proposition 12.3.4 says that each F_i with 1 ≤ i ≤ n is a finite extension of F_{i−1}. The result follows from Lemma 13.2.1 and the fact that degrees multiply (Lemma 10.5.1).

Note that a consequence of Lemma 13.2.3 is that all of the α_i in the statement of the lemma turn out to be algebraic over F.

Corollary 13.2.4 The extension F(S) of the field F by a finite set S of elements is algebraic if each element of S is algebraic over F.

Proof. This follows from Lemma 3.4.5 and the fact that an element that is algebraic over F is algebraic over any extension of F.

Corollary 13.2.4 can be generalized to extensions by infinite sets of algebraic elements, but first we need another of its consequences.

Lemma 13.2.5 Let F ⊆ E be an extension of fields, and let A be the set of all elements of E that are algebraic over F. Then A is a field.

Of course A will then be an algebraic extension of F.

Proof. If α and β are in A, then they are in F(α, β), which must then be an algebraic extension of F by Corollary 13.2.4. But F(α, β) contains α + β, αβ, −α and (when α ≠ 0) α⁻¹. These quantities must then be algebraic, which puts them in A. Thus A is closed under the four basic operations of a field and is a field.

Lemma 13.2.6 Let F(S) be an extension of F by a (not necessarily finite) set of elements all of which are algebraic over F. Then F(S) is algebraic over F.

Proof. By Lemma 13.2.5, there is a subfield A of F(S) consisting of all the elements of F(S) that are algebraic over F. The subfield A contains F and S and so contains F(S).

Lemma 13.2.7 If F ⊆ K ⊆ E are field extensions, if K is algebraic over F and E is algebraic over K, then E is algebraic over F. If F = F_1 ⊆ F_2 ⊆ F_3 ⊆ · · · ⊆ F_n = E are field extensions and for each i with 1 ≤ i < n the extension F_i ⊆ F_{i+1} is algebraic, then E is algebraic over F.

Proof.
The second sentence follows by induction from the first. To prove the first sentence, let α be in E. It is algebraic over K and so there is a non-zero polynomial P(x) ∈ K[x] with P(α) = 0. If {c_1, c_2, . . . , c_k} is the set of non-zero coefficients of P(x), then it follows from Lemma 13.2.3 that F(c_1, c_2, . . . , c_k, α) is algebraic over F, so that α is algebraic over F.

13.3 Automorphisms

We know about the automorphisms of simple algebraic extensions. Here we say something about automorphisms of multiple algebraic extensions. We do not try to handle the most general situations, and so confine ourselves to extensions that are obtained by finitely many successive simple extensions.

There are two ways to work on this. One way is to prove that a succession of finitely many simple extensions is, in fact, a simple extension in disguise. Exercise set (69) gives one example. This works in many situations, and later we will see that it works in the situations that we are interested in. A second way is to build on our knowledge of automorphisms of simple extensions and build up facts about successive simple extensions in a step-by-step manner. We will take this approach here, but first we point out a complication that will occur.

Consider field extensions F ⊆ F(α) ⊆ F(α)(β) = F(α, β) = E where α is algebraic over F and β is algebraic over F(α). It follows from Lemma 13.2.3 that β is also algebraic over F, but our emphasis will be on the fact that β is algebraic over F(α). Let P(x) be a minimal polynomial for α over F, and let Q(x) be a minimal polynomial for β over F(α). That is, P(x) ∈ F[x] and Q(x) ∈ F(α)[x]. Let the degree of P(x) be p, and let the degree of Q(x) be q.

Let θ be an automorphism in Aut(E/F). Since θ fixes F, the field containing the coefficients of P(x), we know that θ(α) is another root α′ of P(x).
From our discussion in Section 12.3.3 about automorphisms of simple, algebraic extensions, we know that F(α′) = F(α) and that the restriction of θ to F(α) gives an isomorphism from F(α) to F(α′) = F(α) and is thus an automorphism in Aut(F(α)/F).

If we now look at what θ does to β, the situation is more complicated. The field F(α) that contains the coefficients of Q(x) need not be fixed elementwise by θ.¹ Thus the discussion of Section 12.3.3 needs to be expanded so that instead of insisting that the field containing the coefficients of the minimal polynomial is kept fixed elementwise, it is instead moved by an automorphism. In fact, it is no harder to deal with the situation in which the field containing the coefficients is moved by an isomorphism, and this is what we will do.

13.3.1 Relativizing Proposition 12.3.5

Certain mathematical results come in two versions. One version (called the absolute version) will discuss facts about a given structure. The second version (called the relative version) will discuss facts about a given structure while taking into account a second (similar) structure. The relative version often proves more useful as a building block that can be combined with itself or other results in complex situations.

Our studies are already partly relative. We have not just investigated Aut(E), the automorphisms of the field E; we have investigated Aut(E/F), the automorphisms of the field E that keep the subfield F fixed. However, now we go farther and allow F to move. As mentioned above, it is just as easy to allow isomorphisms of F and E as it is to allow automorphisms of F and E, and we will also have a need for this extra flexibility.

We have to look at what the setting will be. We start with a field extension F ⊆ E, an element α of E that is algebraic over F, and a minimal polynomial P(x) ∈ F[x] for α. Since we are only interested in F(α), we may as well assume E = F(α).
¹If one tries to get around this by replacing Q(x) by a polynomial that is minimal for β over F instead of over F(α), then the unknown interaction between α and β becomes a problem. In a vague sense, the polynomial Q(x) that is minimal for β over F(α) contains the required information about the interaction of α and β.

We then take a field F′ together with an isomorphism φ : F → F′ and an extension E′ of F′. We wonder if there is an isomorphism θ that extends φ from E = F(α) onto a subfield of E′. The statement that θ extends φ means that the restriction of θ to F equals φ.

In the former situation F′ = F and φ was the identity. In that case we knew that θ(α) had to be another root of P(x). We can say something similar in this situation under the assumption that the extension θ exists.

Let P(x) = Σ a_i x^i (sum over i ≥ 0) and let φ(P)(x) denote the polynomial Σ φ(a_i) x^i. This puts φ(P)(x) in F′[x]. If we now assume that an extension θ of φ to F(α) exists, then we have

0 = θ(0) = θ(P(α)) = θ( Σ a_i α^i ) = Σ θ(a_i)(θ(α))^i = Σ φ(a_i)(θ(α))^i = φ(P)(θ(α))

where the next to last equality holds because θ is an extension of φ and the equality before that one holds because θ is an isomorphism. Thus θ(α) is a root of φ(P)(x).

Next we argue that φ(P)(x) is irreducible over F′. This follows from a lemma that should have been done in the chapter on polynomials. Here it is.

Lemma 13.3.1 Let F and F′ be fields and let φ : F → F′ be an isomorphism. Then taking each P(x) ∈ F[x] to φ(P)(x) ∈ F′[x] is an isomorphism from F[x] to F′[x].

Exercises (70)

1. Prove Lemma 13.3.1. The formulas in (11.4) make the calculations easy.

2. Use Lemma 13.3.1 to prove that in the setting of the lemma, if P(x) ∈ F[x] is irreducible over F, then φ(P)(x) ∈ F′[x] is irreducible over F′.

We are now in a position to prove the following.

Proposition 13.3.2 Let F and F′ be fields and let φ : F → F′ be an isomorphism.
Let P(x) ∈ F[x] be irreducible over F. Then φ : F[x]/P(x) → F′[x]/φ(P)(x) defined by φ([H(x)]) = [φ(H)(x)] is a well defined isomorphism.

Proof. For well definedness, we let H(x) and J(x) be in the same class. This means that (H − J)(x) = P(x)M(x) for some M(x) ∈ F[x]. From Lemma 13.3.1, we have (φ(H) − φ(J))(x) = φ(P)(x)φ(M)(x) and [φ(H)(x)] = [φ(J)(x)]. The function φ is a homomorphism since φ on F[x] is a homomorphism, and F[x] contains the representatives of the classes in F[x]/P(x). It is one-to-one since it is a homomorphism of fields. It is onto since φ is onto on the coefficients.

Exercises (71)

1. Explain the last sentence in the proof above.

We can now give a relative version of Proposition 12.3.5.

Proposition 13.3.3 Let F ⊆ F(α) be fields with α algebraic over F with minimal polynomial P(x) of degree d. Let F′ ⊆ E be fields and let φ : F → F′ be an isomorphism. Let A be the set of all homomorphisms θ : F(α) → E that are extensions of φ in that θ(x) = φ(x) for all x ∈ F. Let B be the set of roots of φ(P)(x) in E. Then θ(α) is in B for each θ ∈ A, and sending θ ∈ A to θ(α) in B is a one-to-one correspondence from A to B. In particular, A has no more than d elements.

Proof. That θ(α) is in B for every θ in A is shown in the calculation before Lemma 13.3.1. We know that if two homomorphisms agree on F and on α, then they agree on all of F(α). Thus taking θ ∈ A to θ(α) ∈ B is one-to-one. To show that this is onto, consider some β ∈ B. We have isomorphisms

F(α) → F[x]/P(x) → F′[x]/φ(P)(x) → F′(β)

by Lemma 12.3.2 and Proposition 13.3.2. For k ∈ F, the isomorphisms take k first to [k], then to [φ(k)], then to φ(k). The element α is taken first to [x], then to [x] (since φ(1) = 1), then to β. Thus the composition of the isomorphisms above is an element of A that takes α to β.
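In the absolute case (φ the identity on Q), Proposition 13.3.3 can be illustrated numerically. The sketch below assumes the third-party SymPy library and takes F(α) = Q(∛2) with minimal polynomial x³ − 2: extensions of the identity to homomorphisms Q(∛2) → E correspond to roots of x³ − 2 in E.

```python
from sympy import symbols, solve, real_roots

x = symbols('x')
P = x**3 - 2  # minimal polynomial of the real cube root of 2 over Q

# Homomorphisms of Q(cbrt(2)) into E fixing Q correspond to roots of P in E.
complex_roots = solve(P, x)   # roots of P in C
print(len(complex_roots))     # 3: three homomorphisms of Q(cbrt(2)) into C
print(len(real_roots(P)))     # 1: only one such homomorphism lands inside R
```

With E = C there are three extensions; with E = R there is only one, the inclusion.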
13.3.2 Applying the relative proposition

The next theorem relies heavily on the fact that if F ⊆ K ⊆ E are finite extensions of fields, then [E : F] = [E : K][K : F].

Theorem 13.3.4 Let F ⊆ E be an extension of fields with [E : F] = d < ∞. Let F′ ⊆ E′ be an extension of fields and let φ : F → F′ be an isomorphism. Then there are no more than d different homomorphisms θ : E → E′ that extend φ.

Proof. We will induct on d. The theorem is true if d = 1. Our inductive hypothesis will be that the theorem is true for all extensions of F of degree less than d.

Let K be a field with F ⊆ K ⊆ E and with k = [K : F] as large as possible but still strictly less than d. We can find such a K since there are only finitely many possibilities for [K : F] with F ⊆ K ⊆ E. By our inductive hypothesis, there are no more than k different homomorphisms θ : K → E′ that extend φ.

Since K is not all of E, there is an element α in E − K. We have F ⊆ K ⊆ K(α) ⊆ E. Since α ∉ K, we have [K(α) : K] ≠ 1 so that [K(α) : F] > [K : F]. But the way that we picked K forces K(α) = E. Since all the extensions are finite, α is algebraic over K and has a minimal polynomial P(x) ∈ K[x] of some degree q. Since F ⊆ K ⊆ K(α) = E, we have

d = [E : F] = [K(α) : F] = [K(α) : K][K : F] = qk

where we know that [K(α) : K] = q from Corollary 12.3.3.

Let θ : E = K(α) → E′ be an extension of φ. Note that the restriction θ|K of θ to K is an extension of φ to a homomorphism of K into E′. Now θ is an extension of θ|K. Thus each extension θ : E → E′ of φ can be thought of as obtained in two steps: first extend to a homomorphism with domain K, and then extend that homomorphism to one with domain E. However, there are only k different extensions of φ to K, and for each extension ρ to K of φ, Proposition 13.3.3 applied to K, ρ and K(α) gives that there are no more than q extensions of ρ to all of K(α) = E.
Thus there are no more than kq = d possible extensions of φ to all of K(α) = E.

Corollary 13.3.5 Let F ⊆ E be an extension of fields with [E : F] = d < ∞. Then Aut(E/F) has no more than d elements.

Proof. Apply Theorem 13.3.4 where F′ = F, E′ = E and where the isomorphism from F to F′ = F is the identity.

Example We add to one of the examples in Section 12.3.4 to illustrate the need for the generality in Proposition 13.3.3.

Consider Q(∛2, ω) where ω = −1/2 + i√3/2 is the cube root of 1 in the second quadrant of the complex plane. We can figure out [Q(∛2, ω) : Q] in stages by looking at Q ⊆ Q(∛2) ⊆ Q(∛2, ω). We know that [Q(∛2) : Q] = 3 since x³ − 2 is irreducible over Q. Also, Q(∛2) ⊆ R, so x² + x + 1 is irreducible over Q(∛2) since it is irreducible over R. Thus [Q(∛2, ω) : Q(∛2)] = 2. This gives

[Q(∛2, ω) : Q] = [Q(∛2, ω) : Q(∛2)][Q(∛2) : Q] = (2)(3) = 6.

We ask what are the extensions of the identity on Q to homomorphisms of Q(∛2, ω) into C. Such an extension will be determined by what it does on ∛2 and on ω. (Why?) The extension must take ∛2 to some root of x³ − 2 and must take ω to some root of x² + x + 1.

Now the three roots of x³ − 2 in C are ∛2, ω∛2 and ω²∛2. These are all in Q(∛2, ω). Also both ω and ω² are in Q(∛2, ω). So any extension of the identity on Q to a homomorphism of Q(∛2, ω) into C must have its image in Q(∛2, ω). (Why?) But the image will have degree 6 over Q, and so must be all of Q(∛2, ω). Thus the extensions we seek are just elements of Aut(Q(∛2, ω)/Q).

If we follow the outline in the proof of Theorem 13.3.4, we must pick a field K containing Q and strictly contained in Q(∛2, ω) of largest degree over Q. Since the degrees involved must be factors of 6, the largest such factor is 3. Thus Q(∛2) can be used for K.

The three roots of x³ − 2 in C are all in Q(∛2, ω), but only ∛2 is in Q(∛2).
By Proposition 12.3.5, the only automorphism of Q(∛2) that fixes Q is the identity. However, by Proposition 13.3.3, there are three homomorphisms of Q(∛2) into Q(∛2, ω) that extend the identity on Q: one for each root of x³ − 2. The images of Q(∛2) under the three homomorphisms are Q(∛2) itself, Q(ω∛2), and Q(ω²∛2).

Now the second part of the argument in the proof of Theorem 13.3.4 looks at each extension to Q(∛2) and extends further to Q(∛2, ω) using the full generality of Proposition 13.3.3. The full generality is needed since two of the homomorphisms that we are extending are not only not the identity on Q(∛2), they are not even automorphisms of Q(∛2). They are homomorphisms (isomorphisms onto their images) with domain Q(∛2).

We know that there are two such extensions in each case. One will be the identity which takes ω to itself, and the other will be complex conjugation that takes ω to ω². We thus end up with a total of six extensions of the identity on Q. As noted above, these are all elements of Aut(Q(∛2, ω)/Q). The following table gives the action of the six automorphisms on ∛2 and on ω. Each numbered column shows what happens to the two values in the extreme left column.

          1      2      3      4      5      6
  ∛2     ∛2     ∛2    ω∛2    ω∛2   ω²∛2   ω²∛2
  ω       ω     ω²     ω      ω²     ω      ω²

More interesting is what each automorphism does to the three roots of x³ − 2. Using the information in the table above, we show what each of the numbered elements of Aut(Q(∛2, ω)/Q) does to the roots.

          1      2      3      4      5      6
  ∛2     ∛2     ∛2    ω∛2    ω∛2   ω²∛2   ω²∛2
  ω∛2   ω∛2   ω²∛2   ω²∛2    ∛2     ∛2    ω∛2
  ω²∛2  ω²∛2   ω∛2    ∛2    ω²∛2   ω∛2    ∛2

For example, the image of ω²∛2 under automorphism 4 is calculated as (ω²)²(ω∛2) = ω⁵∛2 = ω²∛2 since automorphism 4 takes ω to ω² and ∛2 to ω∛2.
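The table can be spot-checked with floating point arithmetic. The sketch below is plain Python using approximate numerics rather than field theory: it builds each of the six maps from its values on ∛2 and ω and records the induced permutation of the three roots of x³ − 2.

```python
import cmath
from itertools import permutations

c = 2 ** (1 / 3)                   # the real cube root of 2
w = cmath.exp(2j * cmath.pi / 3)   # omega, a primitive cube root of 1
roots = [c, w * c, w ** 2 * c]     # the three roots of x**3 - 2

perms = set()
for a in range(3):                 # cbrt(2) may go to w**a * cbrt(2)
    for b in (1, 2):               # omega may go to w**b
        # the image of the root w**j * cbrt(2) is (w**b)**j * (w**a * cbrt(2))
        images = [w ** (j * b + a) * c for j in range(3)]
        # identify each image with the nearest exact root
        perms.add(tuple(min(range(3), key=lambda k: abs(z - roots[k]))
                        for z in images))

print(sorted(perms))  # all six permutations of the three roots appear
```

This confirms numerically the claim made below that all six permutations of the roots occur.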
It is easier to see what is happening to the roots if we let A = ∛2, B = ω∛2 and C = ω²∛2. Then the table above turns into the following.

       1   2   3   4   5   6
  A    A   A   B   B   C   C
  B    B   C   C   A   A   B
  C    C   B   A   C   B   A

It is now easy to see that the elements in Aut(Q(∛2, ω)/Q) give all possible permutations of the three roots of x³ − 2.

13.4 Splitting fields

The field Q(∛2, ω) is the smallest subfield of C containing all the roots of x³ − 2. This is because, as a field of characteristic 0, any subfield of C must contain Q, and because a field containing both ∛2 and ω∛2 must also contain ω∛2(∛2)⁻¹ = ω. Thus Q(∛2, ω) is the smallest subfield of C in which x³ − 2 factors into a product of three terms of degree one. This phenomenon is turned into a definition.

Let F ⊆ E be an extension of fields. Let P(x) be in F[x]. We say that P(x) splits over E if P(x) factors as a product of degree one factors in E[x]. If P(x) splits over E, then a splitting field for P(x) in E over F is the smallest subfield of E containing F over which P(x) splits. If P(x) splits over E, and if the roots of P(x) in E are r_1, r_2, . . . , r_k, then the splitting field for P(x) in E over F is exactly F(r_1, r_2, . . . , r_k). If E itself is the smallest subfield of E containing F over which P(x) splits, then we simply say that E is a splitting field for P(x) over F.

We need to say a few words about what is happening. If P(x) is in F[x] and F ⊆ E, then P(x) is also in E[x] since its coefficients are in E as well as F. If P(x) splits over E, then P(x) is a product of degree one polynomials, each of which uses coefficients from E. Thus if d is the degree of P(x), then we have that

P(x) = A_1(x)A_2(x) · · · A_d(x),

where each A_i(x) is a degree one element of E[x], is a true statement about multiplication in E[x].
However, the result of the multiplication is a polynomial P(x) which happens to have its coefficients in a smaller field F so that it happens to lie in F[x].

Exercises (72)

1. For each of the following polynomials P(x), determine the splitting field S_P for P(x) in C over Q and determine the degree [S_P : Q]. If you are ambitious, you can also determine the automorphism group Aut(S_P/Q). Hint: review the exercises in Section 12.3.4.

(a) P(x) = x³ − 1.
(b) P(x) = x⁴ − 1.
(c) P(x) = x⁵ − 1.
(d) P(x) = x⁶ − 1.

Next we will show that splitting fields always exist (although not necessarily inside a given field) and that all splitting fields for a given P(x) over a given F are isomorphic in a strong way.

13.4.1 Existence

If F is a field, and if P(x) is in F[x], then a splitting field for P(x) exists if P(x) splits over some extension E of F. Thus we work to break P(x) into linear factors. Finding linear factors is the same as finding roots, and we have the tools to build roots where needed.

Proposition 13.4.1 Let F be a field and let P(x) be in F[x]. Then there is an extension E of F over which P(x) splits.

Proof. Let d be the degree of P(x). If d = 1, then P(x) already splits over F and we can let E = F. We then induct on d in the sense that we assume that any polynomial of degree less than d over any field splits over some extension of that field.

We claim that if there is an extension E of F (even allowing E = F) that has a root r of P(x), then we are done in one extra step, because then P(x) factors as (x − r)Q(x). This will make the degree of Q(x) equal to d − 1, and our inductive hypothesis will then give an extension E′ of E over which Q(x) splits. Thus P(x) will split over E′, which is also an extension of F.

So we assume that F itself has no roots of P(x) and we seek an extension E of F having at least one root of P(x). Now let A(x) be an irreducible factor of P(x) over F.
Perhaps A(x) = P(x), but it does not matter whether this equality holds or not. By Proposition 12.3.4 the field F[x]/A(x) has an element (namely [x]) that is a root of A(x) and thus of P(x). Further, the classes of constant polynomials in F[x]/A(x) form a subfield of F[x]/A(x) that is isomorphic to F. Thus if we regard F as directly contained in F[x]/A(x) by regarding each class of a constant polynomial as being that constant element in F itself, we then have an extension of F that has a root of A(x) and thus of P(x).

13.4.2 Uniqueness

If E and E′ are two extensions of F that are splitting fields for one polynomial P(x) ∈ F[x], then we want to show that they are isomorphic. We have the following, which is stated in relative form to help with the induction.

Proposition 13.4.2 Let F and F′ be fields and let φ : F → F′ be an isomorphism. Let P(x) be in F[x]. Let E be a splitting field for P(x) over F and let E′ be a splitting field for φ(P)(x) over F′. Then there is an isomorphism θ : E → E′ that extends φ.

Proof. Let d be the degree of P(x). If d = 1, then E = F and E′ = F′ and we are done by letting θ = φ. So we assume d > 1 and we inductively assume the truth of the proposition in all situations where the degree of the polynomial is less than d.

Let A(x) be an irreducible factor of P(x) over F so that P(x) = A(x)B(x). By Lemma 13.3.1 and Problem 2 in Exercise set (70), we know that φ(A)(x) is an irreducible factor of φ(P)(x) and φ(P)(x) = φ(A)(x)φ(B)(x). Let α be a root of A(x) in E (which must exist since P(x) splits over E) and let α′ be a root of φ(A)(x) in E′. By Proposition 13.3.3 (the relative version of Proposition 12.3.5), there is an extension ρ of φ that is an isomorphism from F(α) to F′(α′) that carries α to α′.

Now A(x) factors over F(α) as (x − α)C(x) for some (not necessarily irreducible) C(x) ∈ F(α)[x]. By Lemma 13.3.1, ρ(x − α)ρ(C)(x) is a factorization of ρ(A)(x). But ρ(x − α) = (x − α′).
So P(x) = (x − α)C(x)B(x) and ρ(P)(x) = (x − α′)ρ(C)(x)ρ(B)(x). We know that D(x) = C(x)B(x) and ρ(D)(x) = ρ(C)(x)ρ(B)(x) have degree d − 1. We now try to apply the inductive hypothesis to the isomorphism ρ from the field F(α) to F′(α′) and the polynomials D(x) and ρ(D)(x).

By hypothesis P(x) factors into degree one factors over E. By uniqueness of factorization, (x − α) is one of those factors and D(x) must be the product of all the factors of P(x) with one copy of (x − α) removed. Similarly ρ(D)(x) is the product of all the factors of ρ(P)(x) with one copy of (x − α′) removed. Thus D(x) splits over E and ρ(D)(x) splits over E′. If D(x) were to split over a field K containing F(α) that is smaller than E, then P(x) would split over K as well, since all factors of D(x) would be in K[x] and the extra factor (x − α) is in F(α)[x], which is also in K[x]. Then E would not be the smallest subfield of E containing F over which P(x) splits. We conclude that E is a splitting field for D(x) over F(α) as well. An identical argument shows that E′ is a splitting field for ρ(D)(x) over F′(α′).

With F(α) and F′(α′) playing the role of F and F′, with ρ playing the role of φ, with D(x) playing the role of P(x), and with E and E′ playing their original roles, we have reproduced the hypotheses of the statement we are proving but with a polynomial of degree d − 1 instead of d. Our inductive hypothesis says that there is an isomorphism θ : E → E′ that extends ρ. But ρ extends φ, so θ also extends φ. Thus θ is the isomorphism that we were looking for.

13.4.3 An application

Theorem 13.4.3 Let F and F′ be two finite fields with the same number of elements. Then F and F′ are isomorphic.

Proof. We know that for some prime p and some natural number n the number of elements of F is p^n. We also know that Z_p is a subfield of both F and F′. Let q = p^n.
Then the multiplicative group F* has q − 1 elements and thus every element of F* has order dividing q − 1. From this we know that x^{q−1} = 1 for every x ∈ F*. Multiplying by one more copy of x gives that every x in F* satisfies x^q = x. However, this is also true of 0. So every x ∈ F is a root of the polynomial x^q − x. But there are exactly q elements of F and every one is a root of the degree q polynomial x^q − x. Since x^q − x can have no more than q roots, it splits over F. Since every element of F is needed for the splitting, F is the smallest subfield of F over which x^q − x splits. This makes F a splitting field for x^q − x over Z_p. Similarly F′ is a splitting field for x^q − x over Z_p. The result now follows from the uniqueness of splitting fields.

A strategy

We wish to understand a polynomial P(x) in some F[x]. In particular we want to understand its roots. A splitting field contains all its roots, and its structure (up to an isomorphism) depends only on P(x) and F. Thus the splitting field should have much information about the relationship between F, P(x) and the roots of P(x), and should not depend on choices made in the construction of the splitting field.

Galois' approach is to study Γ = Aut(E/F) where E is some splitting field for P(x) over F. There is no reason at the outset to expect that there will be enough information in Γ to say much about how F, P(x) and its roots relate, but it turns out that there is. However, there are some extra conditions that must be met.

Every one of the automorphisms in Γ fixes every element of F, so no internal structure of F is picked up by Γ. Since it is the relationship between F (which contains the coefficients) and E (which contains the roots) that is to be understood, this does not seem to be a problem. However, if Γ fixes more than F, then there are parts of E not contained in F whose structure is being ignored by Γ.
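The fact driving Theorem 13.4.3, that every element of a field with q elements is a root of x^q − x, is easy to test by machine. The sketch below (plain Python) checks it for Z_7, and then for a field with 9 elements built, as an illustrative choice, as Z_3 with a square root of −1 adjoined (x² + 1 is irreducible over Z_3).

```python
# Z_p: every element satisfies x**p = x, so x**p - x splits over Z_p.
p = 7
assert all(pow(a, p, p) == a for a in range(p))

# A field with q = 9 elements: Z_3[t]/(t**2 + 1), where t**2 = -1.
# Elements are pairs (a, b) standing for a + b*t with a, b in Z_3.
def mul(u, v):
    a, b = u
    c, d = v
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)

def power(u, n):
    result = (1, 0)  # the multiplicative identity 1 + 0*t
    for _ in range(n):
        result = mul(result, u)
    return result

q = 9
elements = [(a, b) for a in range(3) for b in range(3)]
assert all(power(u, q) == u for u in elements)
print("x**q = x holds for every element, in both fields")
```

The same check would pass in any field of q elements, which is exactly why any two of them are splitting fields for x^q − x over Z_p.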
To get the maximum amount of information from Γ, we would like F to be the only set of elements fixed by Γ. In the next section, we will explore what it takes to make F the fixed field of Aut(E/F).

13.5 Fixed fields

From Corollary 13.3.5, we know that if an extension F ⊆ E of fields has degree [E : F] = d < ∞, then Γ = Aut(E/F) has no more than d elements. But it might have fewer. In this section, we will learn that if Γ has fewer than d elements, then the fixed field of Γ will not be F, but a field that is strictly larger than F. Thus there will be parts of the structure of the extension that are not being distinguished from F by Γ. This turns out to be an important loss, and in the next chapter we will see what conditions are needed to guarantee that for a finite extension F ⊆ E, the number of elements in Aut(E/F) is exactly [E : F].

In order to discuss the size of fixed fields, we need an important notion of independence of automorphisms. This will be introduced first, and then applied to fixed fields.

13.5.1 Independence of automorphisms

Field automorphisms are linear transformations, and if that were all that was required of a field automorphism, then field automorphisms might form a vector space. But sums and “scalar multiples” of field automorphisms are not field automorphisms. Simply look at what happens to 1 under such operations. In spite of this, we can make a definition that imitates aspects of linear algebra.

Let φ_1, φ_2, . . . , φ_k be different automorphisms of a field F. We say that these automorphisms are linearly dependent if there are elements a_i, 1 ≤ i ≤ k, not all zero, so that

a_1 φ_1(α) + a_2 φ_2(α) + · · · + a_k φ_k(α) = 0    (13.2)

for every element α ∈ F. We say that these automorphisms are linearly independent if they are not linearly dependent.
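As a tiny concrete instance of the definition (a sketch assuming the third-party SymPy library): the identity and the conjugation automorphism a + b√2 ↦ a − b√2 of Q(√2) admit no dependence, since requiring (13.2) at just the two elements 1 and √2 already forces both coefficients to be zero.

```python
from sympy import symbols, sqrt, solve

a1, a2 = symbols('a1 a2')

# phi1 = identity, phi2 = conjugation on Q(sqrt(2)).  A dependence
# a1*phi1(alpha) + a2*phi2(alpha) = 0 must hold for every alpha,
# so in particular at alpha = 1 and at alpha = sqrt(2).
equations = [
    a1 * 1 + a2 * 1,                 # alpha = 1: phi1(1) = phi2(1) = 1
    a1 * sqrt(2) + a2 * (-sqrt(2)),  # alpha = sqrt(2): phi2 negates it
]
print(solve(equations, [a1, a2]))  # {a1: 0, a2: 0}: only the trivial dependence
```

So these two (distinct) automorphisms are linearly independent, as Proposition 13.5.1 below guarantees in general.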
It turns out that field automorphisms are so restrictive that all that is needed to make a finite set of field automorphisms linearly independent is to make sure that they are all different.

Proposition 13.5.1 If φ_i, 1 ≤ i ≤ k, are automorphisms of a field F with no two of them being the same, then they are linearly independent.

Proof. We proceed by contradiction and assume that they are linearly dependent. If the automorphisms φ_i, 1 ≤ i ≤ k, are linearly dependent, then a dependence such as (13.2) must exist with some number of the a_i not equal to zero. If we only keep the terms with a_i not zero, and discard the others, then we have a linear dependence on a subset of the φ_i. If we choose a minimal non-empty subset of the φ_i that is linearly dependent, then the sum like (13.2) for that subset will have all of its a_i non-zero. We will prove that a strictly smaller subset must have a linear dependence, contradicting the fact that we have chosen a minimal subset. Note that a subset of size one cannot be linearly dependent, since the statement that a_i ≠ 0 and a_i φ_i(α) = 0 for all α ∈ F implies that φ_i is the zero homomorphism and cannot be an automorphism.

For simplicity of notation, we assume that a minimal linearly dependent subset of the φ_i is the set of φ_i with 1 ≤ i ≤ j, so that (13.2) takes the form

a_1 φ_1(α) + a_2 φ_2(α) + · · · + a_j φ_j(α) = 0.    (13.3)

We would like to subtract from (13.3) a similar but not identical sum so that at least one term cancels, but not all terms cancel. We can do this by exploiting the properties of field automorphisms.

We note that φ_j(βα) = φ_j(β)φ_j(α). So one way to get the last term to read as a_j φ_j(β)φ_j(α) is to replace every appearance of α in (13.3) by βα. Since βα is another element of F, the resulting sum will still be zero by the definition of linear dependence.
Another way to get the last term to read as a_j φ_j(β)φ_j(α) is to simply multiply all terms in (13.3) by φ_j(β). Again the resulting sum will still be zero. The first modification makes the first term read as a_1 φ_1(β)φ_1(α), and the second modification makes the first term read a_1 φ_j(β)φ_1(α). To keep these terms different, we only need that φ_1(β) ≠ φ_j(β). But this can be arranged by the right choice of β, since φ_1 and φ_j are not the same automorphism and must differ on some element of F. The difference of the two modifications will be a linear dependence on a strictly smaller subset, since at least one of the coefficients (the first) will not be zero and at least one (the last) will be zero.

13.5.2 Sizes of fixed fields

The following argument uses a technique that will come up again.

Proposition 13.5.2 Let Γ ⊆ Aut(E) be a subgroup with finitely many elements and let F be the fixed field of Γ. Then [E : F] is the order of Γ.

Proof. Let Γ = {φ_1, . . . , φ_k} and let d = [E : F]. From Corollary 13.3.5, we know that k ≤ d. We assume that k < d and arrive at a contradiction.

We let (β_1, . . . , β_d) be a basis for E as a vector space over F. Consider the equation

φ_i(β_1)x_1 + φ_i(β_2)x_2 + · · · + φ_i(β_d)x_d = 0.    (D_i)

Since φ_i for a fixed i is an automorphism of the vector space E that fixes the field of “scalars” F, the elements φ_i(β_j), 1 ≤ j ≤ d, must be linearly independent over F, and the only solutions for the x_j in F for (D_i) are all zero. But the φ_i(β_j) are not linearly independent over E and there are non-zero solutions in E. Further, if we consider the system of linear equations (D_i) for 1 ≤ i ≤ k, then with our assumption k < d, we have fewer equations than unknowns, and there is at least one solution where not all the x_j are zero. Let us renumber so that at least x_1 ≠ 0. Note that we can multiply all the x_j by any non-zero element of E and still have a solution to the system of equations for which x_1 ≠ 0.
Since we can multiply by any element of E, we can find a solution to the system of the (Di) where x1 is whatever fixed non-zero element of E that we please. We now define for each j with 1 ≤ j ≤ d

aj = φ1(xj) + φ2(xj) + · · · + φk(xj).

Since the φi are linearly independent, and we can arrange to have x1 any non-zero element of E that we wish, we can choose x1 so that a1 ≠ 0. We now come to the technique referred to before the statement of the proposition. We use the fact that the φi are all the elements of a group. We know that multiplication (composition) on the left by any one group element permutes the elements of a group. Thus for any i and j we have that φi(aj) is the same sum as the one giving aj except that the order of the summands is permuted. Thus for each i and j, we have that φi(aj) = aj. Since each aj is fixed by all the φi, each aj is in F, the fixed field of Γ. Since the βj, 1 ≤ j ≤ d, are linearly independent over F, since the aj are all in F, and since at least a1 ≠ 0, we must have

∑_{j=1}^{d} aj βj ≠ 0. (13.4)

We calculate this sum using the definition of the aj and get

∑_{j=1}^{d} aj βj = ∑_{j=1}^{d} ( ∑_{i=1}^{k} φi(xj) ) βj
 = ∑_{j=1}^{d} ∑_{i=1}^{k} (φi(xj) βj)
 = ∑_{j=1}^{d} ∑_{i=1}^{k} φi( xj φi^{-1}(βj) )
 = ∑_{i=1}^{k} ∑_{j=1}^{d} φi( xj φi^{-1}(βj) )
 = ∑_{i=1}^{k} φi( ∑_{j=1}^{d} xj φi^{-1}(βj) ).

But for each i we have

∑_{j=1}^{d} xj φi^{-1}(βj) = ∑_{j=1}^{d} (φi^{-1}(βj) xj) = ∑_{j=1}^{d} (φt(βj) xj)

for whatever φt ∈ Γ equals φi^{-1}. This is one of the sums (Di) for i = t and is equal to zero. This gives

∑_{j=1}^{d} aj βj = 0

which contradicts (13.4).

Combining Corollary 13.3.5 and Proposition 13.5.2

In Proposition 13.5.2, we can let Γ = Aut(E/F) and G be the fixed field of Aut(E/F). We have F ⊆ G ⊆ E.

Corollary 13.5.3 If F ⊆ E is a field extension and Aut(E/F) is finite, then letting G be the fixed field of Aut(E/F) gives [E : G] = |Aut(E/F)| ≤ [E : F].

Proof.
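The averaging trick in the proof, summing φ1(x) + · · · + φk(x) over all of Γ to land in the fixed field, can be seen in the smallest example. A sketch (Python, not from the text; the pair encoding of Q(√2) is our own device):

```python
from fractions import Fraction

# Elements of Q(√2) as pairs (p, q) meaning p + q·√2.
# Γ = {id, σ} where σ is the automorphism √2 ↦ −√2; its fixed field is Q.
def sigma(x):
    return (x[0], -x[1])

x = (Fraction(3), Fraction(5))                    # the element 3 + 5√2
s = (x[0] + sigma(x)[0], x[1] + sigma(x)[1])      # id(x) + σ(x), the sum over Γ
assert s == (6, 0)   # the √2-part cancels: the sum is 6, an element of Q
```

Applying σ to the sum only reorders its two summands, so the sum is fixed by every element of Γ, exactly the mechanism used to show each aj lies in F.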
The equality comes from Proposition 13.5.2 and the inequality comes from Corollary 13.3.5.

From this we get the following.

Corollary 13.5.4 Let F ⊆ E be a field extension of finite degree, and let G be the fixed field of Aut(E/F). Then G = F if and only if |Aut(E/F)| = [E : F].

Proof. From Corollary 13.5.3, we have |Aut(E/F)| = [E : F] if and only if [E : G] = [E : F]. From the multiplicative properties of degree (Lemma 10.5.1), we have [E : G] = [E : F] if and only if [G : F] = 1, and from Lemma 10.5.2 this happens if and only if G = F.

The strategy revisited

We are interested in roots of a polynomial P(x) over a field F, and we know that we will find them in some splitting field E for P(x) over F. The automorphism group Γ = Aut(E/F) will explore the relationship between E and F best if F is the fixed field of Γ. We know that F must be contained in the fixed field of Γ, but from Proposition 13.5.2, we will only get equality if the order of Γ equals [E : F].

13.6 A criterion for irreducibility

We give a celebrated criterion that implies irreducibility of a polynomial over Q. In particular it will imply that the polynomial used as an example at the beginning of the chapter is irreducible. We will also show that it implies that a certain class of polynomials is irreducible, and we will use this fact in an important way in the next chapter. The criterion is easy to state, but its proof requires that some concepts be introduced first.

13.6.1 Primitive polynomials and content

The roots of a polynomial do not change when the polynomial is multiplied by a non-zero constant. If we are given a polynomial over Q, we can multiply by a large enough integer (the product of all the denominators of the coefficients, say) so that the resulting polynomial has the same roots as the original, but all the coefficients are integers.
We will refer to a polynomial with integer coefficients as being “over Z” and in Z[x] even though we have not yet discussed polynomials over rings. Given a polynomial over Z, we can alter it further by dividing by the greatest common divisor (here we mean positive greatest common divisor) of all the coefficients. The result is a polynomial over Z with the same roots and whose greatest common divisor of the coefficients is 1. We use this as a definition and say that a non-zero polynomial is primitive if it is over Z and if the greatest common divisor of all its coefficients is 1. Our observations before the definition have argued that a polynomial over Q has the same roots as some primitive polynomial. We can make this observation stronger and more specific.

Lemma 13.6.1 Let P(x) be a non-zero polynomial in Q[x]. Then there is a unique c ∈ Q and a unique primitive polynomial A(x) so that P(x) = cA(x).

Proof. That there are such a c and A(x) has already been argued. We assume that P(x) = dB(x) where d ∈ Q and B(x) is primitive. Then c = m/n and d = p/q with all of m, n, p, q in Z and n, q non-zero. Now cA(x) = dB(x), so mqA(x) = npB(x). Since the greatest common divisor of the coefficients of A(x) is 1, the greatest common divisor of the coefficients of mqA(x) must be mq. And since the greatest common divisor of the coefficients of B(x) is 1, the greatest common divisor of the coefficients of npB(x) must be np. But the two polynomials are the same, so mq = np and c = d. Now A(x) = B(x).

The rational number c in Lemma 13.6.1 is called the content of the polynomial P(x) ∈ Q[x]. An important property of primitive polynomials is that they are closed under multiplication.

Lemma 13.6.2 If P(x) and Q(x) are both primitive, then so is (PQ)(x).

Proof. If (PQ)(x) is not primitive, then some integer greater than 1 divides all its coefficients. In particular, some prime integer p divides all the coefficients. We will show that no such prime exists.
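Lemma 13.6.1 is effective: the content and primitive part can be computed directly. A sketch in Python (the function name is ours; the content is taken positive, matching the positive gcd convention):

```python
from fractions import Fraction
from math import gcd

def content_and_primitive(coeffs):
    """Split a non-zero polynomial over Q, given as a list of coefficients
    (constant term first), into its content c and primitive part A
    with P = c·A, in the spirit of Lemma 13.6.1."""
    coeffs = [Fraction(c) for c in coeffs]
    n = 1                                   # lcm of all denominators
    for c in coeffs:
        n = n * c.denominator // gcd(n, c.denominator)
    ints = [int(c * n) for c in coeffs]     # clear denominators
    g = 0                                   # positive gcd of the integer coefficients
    for a in ints:
        g = gcd(g, a)
    return Fraction(g, n), [a // g for a in ints]

# P(x) = 3/2 + (9/4)x + 3x^2 = (3/4)·(2 + 3x + 4x^2)
c, A = content_and_primitive([Fraction(3, 2), Fraction(9, 4), Fraction(3)])
assert c == Fraction(3, 4) and A == [2, 3, 4]
```

The returned list A has integer coefficients with gcd 1, so it is primitive, and multiplying back by c recovers the original coefficients.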
We have

(PQ)(x) = ∑_{i=0}^{∞} ( ∑_{j=0}^{i} aj b_{i−j} ) x^i

where the ai are the coefficients of P(x) and the bi are the coefficients of Q(x). Since both P(x) and Q(x) are primitive, we know that p does not divide all the ai and does not divide all the bi. Let s be the smallest index so that p does not divide as and let t be the smallest index so that p does not divide bt. So p|ai when i < s and p|bj when j < t.

The coefficient of x^{s+t} in (PQ)(x) is

∑_{j=0}^{s+t} aj b_{s+t−j}. (13.5)

When j < s we know p|aj b_{s+t−j} since p|aj. When j > s, then s + t − j < t and p|aj b_{s+t−j} since p|b_{s+t−j}. The sum (13.5) is divisible by p and the only term not yet mentioned is as bt. So p|as bt. Since p is prime, either p|as or p|bt, which contradicts our choice of s and t.

The previous two lemmas combine to give the following non-obvious result.

Lemma 13.6.3 If P(x) ∈ Z[x] factors into two polynomials of positive degree in Q[x], then it factors into two polynomials of positive degree in Z[x].

Proof. Suppose P(x) ∈ Z[x] factors as A(x)B(x) with A(x) and B(x) in Q[x] each of positive degree. Letting c be the content of A(x) and d be the content of B(x), we have A(x) = cC(x) and B(x) = dD(x) for primitive polynomials C(x) and D(x). Now P(x) = A(x)B(x) = cd(CD)(x) where (CD)(x) is primitive by Lemma 13.6.2. But this makes cd the content of P(x). The coefficients of (CD)(x) have greatest common divisor 1, so if cd is not an integer, then some coefficient of cd(CD)(x) would not be an integer. Since all coefficients of P(x) are integers, cd is an integer. Writing P(x) as (cdC(x))D(x) gives the required factorization of P(x).

13.6.2 The Eisenstein Irreducibility Criterion

We now get to the criterion for irreducibility. The criterion in the theorem below is known as the Eisenstein Irreducibility Criterion. Note that it is not an if and only if criterion.
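Lemma 13.6.2 (Gauss's lemma on primitivity) is easy to spot-check numerically. A small sketch (helper names are ours) that multiplies two primitive polynomials and confirms the product is primitive:

```python
from math import gcd

def poly_mul(P, Q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    R = [0] * (len(P) + len(Q) - 1)
    for i, a in enumerate(P):
        for j, b in enumerate(Q):
            R[i + j] += a * b
    return R

def is_primitive(P):
    """True when the gcd of the integer coefficients is 1."""
    g = 0
    for a in P:
        g = gcd(g, a)
    return g == 1

P = [2, 3]        # 2 + 3x, primitive
Q = [5, 0, 7]     # 5 + 7x^2, primitive
assert is_primitive(P) and is_primitive(Q)
assert is_primitive(poly_mul(P, Q))   # the product is primitive too
```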
Any polynomial in Z[x] that satisfies the criterion must be irreducible over Q, but there are polynomials in Z[x] that are irreducible over Q that do not satisfy the criterion. In fact an important set of examples will not satisfy the criterion but will be provably irreducible over Q by a trick that allows the criterion to be used indirectly.

Theorem 13.6.4 Let

P(x) = ∑_{i=0}^{∞} ai x^i

be a polynomial in Z[x] of degree d. If there is a prime p that divides every ai except ad and if p^2 does not divide a0, then P(x) is irreducible over Q.

Note that it is a requirement of the hypotheses that p does not divide ad.

Proof. We assume that P(x) is reducible and that P(x) = Q(x)R(x) where Q(x) and R(x) have positive degree. By Lemma 13.6.3, we can assume that Q(x) and R(x) have integer coefficients. Let

Q(x) = ∑_{i=0}^{∞} bi x^i and R(x) = ∑_{i=0}^{∞} ci x^i.

We know that a0 = b0 c0 and that p divides a0 but p^2 does not. Thus p divides exactly one of b0 and c0 but not both. Let us assume that p divides b0 and does not divide c0. Our goal is to prove that p divides every bi. This will say that Q(x) = pS(x) for some S(x) ∈ Z[x], making P(x) = pS(x)R(x), and p will divide all coefficients of P(x) including ad. But this will contradict a hypothesis.

We prove that p divides all bi by induction. We know that p|b0. Assume that p|bi for all i < k. We want to prove that p|bk. Let the degree of Q(x) be q. If k > q, then we have nothing to prove since bk = 0 if k > q. So we assume k ≤ q. Since Q(x) and R(x) have positive degree, we know that q < d so k < d. We know that

ak = b0 ck + b1 c_{k−1} + · · · + bk c0.

By our inductive hypotheses, we know that p divides every bi with i < k. Thus p divides all terms in the sum except possibly the last. But the hypotheses of the theorem and the fact that k < d say that p divides the sum, which is ak. So p divides the last term. But p does not divide c0 so p|bk. This completes the proof.
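The hypotheses of Theorem 13.6.4 are entirely mechanical to check. A sketch (the function name is ours; coefficients are listed constant term first):

```python
def eisenstein_applies(coeffs, p):
    """Check the hypotheses of the Eisenstein Criterion (Theorem 13.6.4)
    for a polynomial in Z[x], given constant term first, and a prime p."""
    *lower, lead = coeffs
    if lead % p == 0:                    # p must not divide the leading coefficient
        return False
    if any(a % p != 0 for a in lower):   # p divides every other coefficient
        return False
    return lower[0] % (p * p) != 0       # p^2 does not divide the constant term

# x^3 - 2 is irreducible over Q by the criterion with p = 2.
assert eisenstein_applies([-2, 0, 0, 1], 2)
# x^2 - 1 = (x - 1)(x + 1) fails the criterion for p = 2, as it must.
assert not eisenstein_applies([-1, 0, 1], 2)
```

Remember the criterion is one-directional: a `False` here proves nothing about reducibility.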
13.6.3 Applications of the irreducibility criterion

The example at the beginning of the chapter

The example at the beginning of this chapter is P(x) = x^3 + 3x − 2. This does not satisfy the hypotheses of the Eisenstein Criterion. But

P(x + 2) = (x + 2)^3 + 3(x + 2) − 2 = x^3 + 3x^2·2 + 3x·2^2 + 2^3 + 3x + 6 − 2 = x^3 + 6x^2 + 15x + 12

does, using the prime 3. So H(x) = P(x + 2) is irreducible over Q. Now if P(x) = H(x − 2) were reducible as P(x) = A(x)B(x), then H(x) = P(x + 2) would be reducible as H(x) = A(x + 2)B(x + 2). Since H(x) is irreducible over Q, so is P(x) as promised at the beginning of the chapter. This trick is based on the fact that if A(x) and B(x) are in Q[x], then so are A(x + 2) and B(x + 2). We will not bother to formalize this procedure into a lemma.

Roots of one

We worked in a previous problem to show that x^4 + x^3 + x^2 + x + 1 is irreducible over Q. This comes up as a factor of x^5 − 1. For any prime n, the polynomial x^n − 1 factors as x − 1 times

P(x) = ∑_{i=0}^{n−1} x^i. (13.6)

We have seen that for some n, such as n = 6, the polynomial in (13.6) is reducible over Q. However for n a prime, it is irreducible, as we will show. We use a procedure similar to the previous example. We will show that P(x + 1) satisfies the Eisenstein Criterion, which will then imply that P(x) is irreducible. We let n = p, a prime, so that the notation makes clear what we are working with. We note that (x − 1)P(x) = x^p − 1, so that we can write

P(x) = (x^p − 1)/(x − 1).

Thus we can write

P(x + 1) = ((x + 1)^p − 1)/((x + 1) − 1) = ((x + 1)^p − 1)/x.

Now

(x + 1)^p = ∑_{i=0}^{p} C(p, i) x^i,

where C(p, i) denotes the binomial coefficient. We know that C(p, 0) = C(p, p) = 1 and that for 1 ≤ i ≤ p − 1

C(p, i) = p(p − 1)(p − 2) · · · (p − i + 1) / (1 · 2 · 3 · · · i).

With 1 ≤ i ≤ p − 1, no term in the denominator divides p, so the result is divisible by p. Also C(p, p − 1) = p. Since the constant term of (x + 1)^p is 1, we know that (x + 1)^p − 1 has constant term 0 and is divisible by x.
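The shift trick is easy to carry out by machine. A sketch (the helper name is ours) that computes the coefficients of P(x + h) via the binomial theorem and reproduces the expansion above:

```python
from math import comb

def shift(coeffs, h):
    """Coefficients of P(x + h) from those of P(x) (constant term first),
    expanding each term a·(x + h)^k by the binomial theorem."""
    out = [0] * len(coeffs)
    for k, a in enumerate(coeffs):
        for i in range(k + 1):
            out[i] += a * comb(k, i) * h ** (k - i)
    return out

P = [-2, 3, 0, 1]                       # x^3 + 3x - 2
assert shift(P, 2) == [12, 15, 6, 1]    # x^3 + 6x^2 + 15x + 12
```

The shifted coefficients 12, 15, 6 are all divisible by 3, the leading coefficient 1 is not, and 9 does not divide 12, so Eisenstein applies with p = 3 exactly as in the text.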
If we take the binomial coefficient C(p, k) to be 0 when k > p, we can write

P(x + 1) = ((x + 1)^p − 1)/x = ∑_{i=0}^{p−1} C(p, i + 1) x^i

which has degree p − 1, leading coefficient 1, constant term p and all other coefficients divisible by p. Thus Eisenstein's Criterion applies and P(x + 1) is irreducible over Q, and so is P(x).

Chapter 14

Galois theory basics

Let F ⊆ E be a finite extension of fields. We know that [E : F] is always no smaller than the order of Aut(E/F), and we know that F is always contained in the fixed field of Aut(E/F). From Corollary 13.5.3 we know that the following are equivalent:

1. [E : F] equals the order of Aut(E/F).
2. F is the fixed field of Aut(E/F).

In this chapter, we will study the situation in which the above equivalent items hold. The relevant definition is the following. We call an extension F ⊆ E of fields a Galois extension if [E : F] is finite and equals the order of Aut(E/F). Equivalently, we can call an extension F ⊆ E of fields a Galois extension if [E : F] is finite and F equals the fixed field of Aut(E/F). We start by investigating which extensions are Galois extensions.

14.1 Separability

Let F ⊆ E be an extension of fields, and let Γ = Aut(E/F). We know that F ⊆ Fix(Γ). If F is strictly smaller than Fix(Γ), then there are elements in E − F that Γ cannot distinguish from F. Specifically, there are elements in E − F that no element of Γ can budge. One might say that there are elements of E − F that cannot be “separated” from F by elements of Γ. We will let the reader judge if this bit of background justifies the choice of words used in the following discussion. This choice of words is quite standard.

We now assume that [E : F] is finite. We know from the previous chapter that F = Fix(Γ) if and only if |Γ| = [E : F]. It will always be the case that |Γ| ≤ [E : F], so we are only in trouble if there are fewer automorphisms in Aut(E/F) than demanded by the degree of the extension.
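The coefficient pattern for P(x + 1) can be confirmed directly with `math.comb`. A quick check for p = 5 (our own verification, not from the text):

```python
from math import comb

p = 5
# Coefficients of P(x + 1) = ((x + 1)^p − 1)/x, constant term first: C(p, i + 1).
coeffs = [comb(p, i + 1) for i in range(p)]
assert coeffs == [5, 10, 10, 5, 1]
assert coeffs[-1] == 1                               # leading coefficient 1
assert coeffs[0] == p and coeffs[0] % (p * p) != 0   # constant term p, not divisible by p^2
assert all(c % p == 0 for c in coeffs[:-1])          # lower coefficients divisible by p
```

This is exactly the Eisenstein pattern for the prime 5, so x^4 + x^3 + x^2 + x + 1 is irreducible over Q.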
From the details of the proofs of Propositions 12.3.5, 13.3.3 and Theorem 13.3.4, we know that we get automorphisms from the roots of the minimal polynomials that are involved. One way to guarantee that the roots we need are available is to work with splitting fields. Properties of splitting fields will be dealt with later in this chapter. However, the number of automorphisms that we seek is determined by the degree of the extension, and this degree is tied to the degrees of the minimal polynomials involved. Thus we are in trouble if the number of roots of a relevant minimal polynomial is smaller than the degree of the polynomial. This will occur if the polynomial has multiple roots. In discussions of multiple roots, one talks about counting multiple roots multiple times. But counting a root more than once will not create more than one automorphism taking a given element to that particular root. So having multiple roots will stand in the way of “separating” elements in E − F from F.

This leads to our first definition. If F is a field and P(x) is in F[x], then we say that P(x) is separable over F if P(x) has no multiple roots in any splitting field for P(x) over F. Shortly, we will connect this notion of “separable” to our use of the word “separating.”

Lemma 14.1.1 Let F be a field of characteristic zero and let P(x) ∈ F[x] be irreducible. Then P(x) is separable over F.

Proof. Let α be a root of P(x). Since P(x) is irreducible, it is minimal for α over F. We know that α is a multiple root of

P(x) = ∑_{i=0}^{∞} ai x^i

if and only if α is also a root of

P′(x) = ∑_{i=1}^{∞} i ai x^{i−1}.

Since P(x) is not zero and has a root, it has degree at least 1. Also each i > 0 that has ai ≠ 0 gives i ai ≠ 0 since F has characteristic zero. So P′(x) will be a non-zero polynomial with α as a root whose degree is smaller than the degree of P(x). This contradicts the fact that P(x) is a minimal polynomial for α over F.
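The root-counting test in the proof (α is a multiple root of P exactly when it is also a root of P′) gives a practical criterion: over Q, P has a multiple root in a splitting field exactly when gcd(P, P′) is non-constant. A sketch with exact rational arithmetic (helper names are ours):

```python
from fractions import Fraction

def derivative(P):
    """Derivative of a polynomial given as a coefficient list (constant first)."""
    return [Fraction(i) * a for i, a in enumerate(P)][1:]

def poly_mod(A, B):
    """Remainder of A divided by B over Q (coefficient lists, constant first)."""
    A = A[:]
    while len(A) >= len(B) and any(A):
        f = A[-1] / B[-1]
        for i in range(1, len(B) + 1):   # subtract f·B aligned at the top of A
            A[-i] -= f * B[-i]
        while A and A[-1] == 0:
            A.pop()
    return A

def has_multiple_root(P):
    """True iff gcd(P, P') is non-constant, i.e. P has a repeated root."""
    A = [Fraction(c) for c in P]
    B = derivative(A)
    while B:                              # Euclid's algorithm on polynomials
        A, B = B, poly_mod(A, B)
    return len(A) > 1                     # non-constant gcd means a shared root

assert not has_multiple_root([-2, 0, 1])   # x^2 - 2: separable
assert has_multiple_root([1, -2, 1])       # (x - 1)^2: the root 1 is multiple
```

This is the standard square-free test; for irreducible P over a characteristic-zero field the lemma says the answer is always "no multiple root."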
If F ⊆ E is an extension of fields, then α ∈ E is said to be separable if its minimal polynomial over F is separable. The extension E of F is said to be separable if every element of E is separable over F.

Corollary 14.1.2 A finite extension of a field of characteristic zero is separable.

Separability gives the next important result, which deserves its own section. It will not see use in this chapter but will in the next chapter.

14.2 The primitive element theorem

Recall from Section 13.1 that an extension F ⊆ E of fields is said to be simple if there is an element α ∈ E so that E = F(α), and that if such an element exists, it is called a primitive element for the extension. In our setting, finite extensions are all simple. That is, primitive elements always exist. The key lemma is the following, where the use of separability is prominent.

Lemma 14.2.1 Let F ⊆ E be an extension of fields, and let α and β in E be algebraic over F. If the minimal polynomial for α over F is separable, then there are only finitely many t ∈ F for which F(α + tβ) is not all of F(α, β).

Proof. We will avoid t = 0 since this is only one value to avoid and thus will not affect the conclusion. Note that if α is in F(α + tβ), then so is β since β can be gotten from α + tβ and α by field operations. So we are done if we show that there are only finitely many t ∈ F for which α ∉ F(α + tβ). Let P(x) be a minimal polynomial for α over F and let Q(x) be a minimal polynomial for β over F. By Proposition 13.4.1, we can extend E so that P(x) splits in the extension, so we can just assume that P(x) splits in E. We assume that α ∉ F(α + tβ), and we let P1(x) be a minimal polynomial for α over F(α + tβ). It will have degree bigger than 1 and it will divide P(x). Since P(x) has no multiple roots, neither will P1(x) and there will be a root α′ of P1(x) different from α. Note that α′ will also be a root of P(x).
By Proposition 13.3.3, there is an isomorphism φ from F(α + tβ)(α) to F(α + tβ)(α′) that is the identity on F(α + tβ). The element β′ = φ(β) must be some root of a minimal polynomial for β over F(α + tβ) and thus a root of the minimal polynomial Q(x) for β over F. Noting that t ∈ F implies φ(t) = t, we have

α + tβ = φ(α + tβ) = φ(α) + tφ(β) = α′ + tβ′.

Now α ≠ α′ together with α + tβ = α′ + tβ′ implies that β ≠ β′ and we can solve for t as

t = (α − α′)/(β′ − β). (14.1)

Thus the assumption that α ∉ F(α + tβ) forces t to be one of the finitely many elements of F of the form (14.1) obtained by letting α′ range over the roots of P(x) different from α and letting β′ range over the roots of Q(x) different from β. This completes the proof.

This now gives the following important result.

Theorem 14.2.2 (Primitive Element Theorem) A finite separable extension F ⊆ E of infinite fields is simple.

Proof. Since [E : F] is finite, there are finitely many elements α1, α2, . . . , αk in E so that E = F(α1, α2, . . . , αk). We assume that these elements have been chosen so that k is as small as possible. If k = 1, we are done. If k > 1, Lemma 14.2.1 says that we can find a t ∈ F so that setting γ = α1 + tα2 gives F(γ) = F(α1, α2) and E = F(γ, α3, . . . , αk), which contradicts our choice of k.

Corollary 14.2.3 A finite extension F ⊆ E of fields of characteristic zero is simple.

Proof. A field of characteristic zero is infinite and any finite extension of it is separable.

Theorem 14.2.2 holds in greater generality than what we have given. In particular it also holds for finite fields, so the most general statement would be that a finite separable extension of any field is simple.
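The classic instance of the theorem is Q(√2, √3) = Q(√2 + √3), which is the choice t = 1 in Lemma 14.2.1. A quick numerical sanity check (our own, not from the text) that √2 is recoverable from γ = √2 + √3 by field operations, using the identity γ^3 = 11√2 + 9√3:

```python
# γ = √2 + √3 generates Q(√2, √3): since γ^3 = 11√2 + 9√3,
# the combination (γ^3 − 9γ)/2 collapses to √2, and then √3 = γ − √2.
a, b = 2 ** 0.5, 3 ** 0.5
gamma = a + b
sqrt2 = (gamma ** 3 - 9 * gamma) / 2
sqrt3 = gamma - sqrt2
assert abs(sqrt2 - a) < 1e-9
assert abs(sqrt3 - b) < 1e-9
```

So both generators lie in Q(γ), and the two-generator extension is simple, exactly as the proof of Theorem 14.2.2 arranges.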
Our ultimate goal is to work in fields of characteristic zero (in particular, subfields of C), so we will not discuss the more general version.1

Theorem 14.2.2 could have been used to give a shorter proof of Proposition 13.5.2, but we would have had to assume separability to use it. This is not a big restriction, as Lemma 14.1.1 shows, but the technique that we did use in the proof of Proposition 13.5.2, independence of automorphisms, will be needed later.

14.3 Galois extensions

Two ideas have been put forward to avoid having too few automorphisms. One is to ensure that all roots of a relevant polynomial are present. This is accomplished by looking at splitting fields. The other is to avoid polynomials with multiple roots. This concept comes under the name “separable.” In this section we explore what happens when the two ideas are combined. The main result is that one obtains our goal, a Galois extension. However, we get much more. When the number of automorphisms equals the number predicted by the degree, then a large collection of techniques becomes available. These lead to a large number of results about Galois extensions, as well as a large number of properties that turn out to be equivalent to an extension being Galois. Ultimately these results combine to give the first theorem, the Fundamental Theorem of Galois Theory, that makes serious ties between the automorphism group of a Galois extension and the internal structure of the extension. The statement and proof of the fundamental theorem is at the end of this chapter. In the chapter that follows, we will start with the fundamental theorem and use it to deduce facts about polynomials and their roots.

1 If you have done the project at the end of Chapter 11, which shows that the multiplicative group F* for a finite field F is cyclic, then you will have done all the work for the more general version of Theorem 14.2.2.
Normal extensions

The first property that we will explore in connection with Galois extensions is called normality. Normality seems like a strengthening of the notion of a splitting field. Galois extensions will always turn out to be splitting fields, but it turns out that Galois extensions also have this apparently stronger property. Conversely, normality combined with separability implies Galois (for finite extensions). However, splitting fields that are separable are also Galois (again, for finite extensions) and thus normal. We have put off the definition long enough, so it is given in the next paragraph.

An extension F ⊆ E of fields is called a normal extension if every irreducible polynomial in F[x] that has a root in E also splits in E. In other words, if an irreducible polynomial over F has one of its roots in E, then it has all of its roots in E. Another rewording is that F ⊆ E is normal if all minimal polynomials in F[x] for elements of E that are algebraic over F split in E. The reason for the word “normal” will be made clear after the statement of the Fundamental Theorem of Galois Theory is given.

14.3.1 Finite, separable, normal extensions

In this section we show that being finite, separable and normal is equivalent to being Galois. We break the two directions into two arguments since each takes a bit of doing and each uses techniques worth learning. Here is the first direction.

Proposition 14.3.1 If F ⊆ E is a Galois extension of fields, then it is finite, separable and normal.

The following argument uses a technique that gets used often. In this situation it has the strange effect of proving the two items that require work simultaneously.

Proof. The extension is finite by the definition of a Galois extension. Thus it is algebraic. Any element of E has a minimal polynomial P(x) over F that must be irreducible over F, and if a polynomial over F is irreducible over F and has a root in E, then it is a minimal polynomial for that root.
So if we look at elements α ∈ E and irreducible polynomials P(x) ∈ F[x] for which α is a root of P(x), then we know we are looking at all elements of E and all irreducible polynomials over F with a root in E. Since multiplying a polynomial by a constant does not change its roots, we can assume that P(x) is monic. For such an α and P(x), let A = {α1, α2, . . . , αk} be all the roots of P(x) in E. We take α1 = α. If θ is in Aut(E/F), then each θ(αi) must also be a root of P(x) and must be some αj. Since θ is one-to-one and A is finite, the action of θ on A is to permute the elements of A.

We now bring in the advertised technique. We now consider

Q(x) = (x − α1)(x − α2) · · · (x − αk).

The action of θ on Q(x) is to permute the factors of Q(x). Thus θ(Q)(x) = Q(x) and θ fixes all the coefficients of Q(x). Since this was shown for any θ ∈ Aut(E/F), we know that the coefficients of Q(x) lie in the fixed field of Aut(E/F), which is F by hypothesis. If we extend E to a field K that is a splitting field for P(x), we see that Q(x) is a product of factors of P(x) in K. Thus Q(x)|P(x). But α = α1 is a root of Q(x) and P(x) is minimal for α in E over F. Thus P(x)|Q(x). This means that P(x) and Q(x) are constant multiples of each other. Since we assume P(x) is monic, and Q(x) is clearly monic, they are equal. We have shown that P(x) splits in E, and since all the αi are different, we have shown that P(x) is separable. This is what was to be proven.

The work to prove the converse to Proposition 14.3.1 is contained in a key lemma that can be used to show that in the presence of separability not only is normality a powerful property, but being a splitting field is just as powerful a property.
Lemma 14.3.2 Let F ⊆ E be an extension of fields and assume that there is a finite set A of elements in E that are separable and algebraic so that E = F(A) and so that each element of A has a minimal polynomial over F that splits in E. Then F ⊆ E is a Galois extension.

Proof. It may be that there are elements in A that are not needed to make E = F(A) true. We can throw the unneeded elements out and end up with a new subset that we still call A for which E = F(A) is still true, all other hypotheses hold, and no proper subset S ⊆ A makes E = F(S) true. If α1, α2, . . . , αk are the elements of A, then we can define Fi = F(α1, α2, . . . , αi) for 0 ≤ i ≤ k (where F0 = F) and we have a succession of extensions

F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk = E.

Our assumption that A cannot be made smaller implies, for each i with 1 ≤ i ≤ k, that αi ∉ Fi−1. Since each αi is algebraic over F, it is algebraic over Fi−1. So each [Fi : Fi−1] is finite and [E : F] = [Fk : F0] is finite. For each i with 1 ≤ i ≤ k, let Pi(x) be a minimal polynomial for αi over F and let Qi(x) be a minimal polynomial for αi over Fi−1.

We will use this data to count the number of elements in Aut(E/F). It is in this lemma that we deliver on our promise that enough assumptions about the presence of roots and the non-duplication of roots gives the number of automorphisms that is predicted by the degree. We will build automorphisms by building homomorphisms of the Fi into E. These will not necessarily be automorphisms of the Fi, but they will fit together to give automorphisms of E. We will use Proposition 13.3.3 to count the homomorphisms. To help with the count, we let ni be the number of homomorphisms from Fi into E that restrict to the identity on F. Note that nk will be the number of elements in Aut(E/F). This is because each homomorphism θ from Fk = E into E that is the identity on F will have [θ(E) : F] = [E : F].
Thus [E : θ(E)] = 1 and θ(E) = E.

We proceed by induction. Our goal is to prove that ni = [Fi : F] for each i with 0 ≤ i ≤ k. It is true for i = 0 since there is only one homomorphism from F0 = F into E that is the identity on F, and [F : F] = 1. We assume that ni−1 = [Fi−1 : F] for some i ≥ 1 and work to prove the corresponding statement for ni.

Let θ be one of the ni−1 homomorphisms from Fi−1 into E that is the identity on F. We know from Proposition 13.3.3 that the number of extensions of θ to a homomorphism from Fi into E is the number of roots of θ(Qi)(x) that are in E. Let di be the degree of Qi(x). It is also the degree of θ(Qi)(x). We know that Qi(x) divides Pi(x) and that Pi(x) splits into a product of linear factors over E and that no factor is repeated. We also know that θ(Qi)(x) divides θ(Pi)(x). But the coefficients of Pi(x) are in F and θ is the identity on F. Thus θ(Pi)(x) = Pi(x) and θ(Qi)(x) divides Pi(x). Thus over E, θ(Qi)(x) factors into a product of some of the linear factors of Pi(x), none of which are repeated. From this and Proposition 13.3.3 we get that the number of extensions of θ to Fi is exactly di. Note that every extension is the identity on F since it is extending a homomorphism that is already the identity on F.

Note that extensions of different homomorphisms from Fi−1 into E must be different since they disagree on Fi−1. Also, for a given homomorphism θ from Fi−1 into E, the di extensions to Fi are different from one another (specifically, they disagree on αi by the extra provisions of Proposition 13.3.3, if you want details). Thus each of the ni−1 homomorphisms from Fi−1 into E that are the identity on F gives di different extensions to Fi and there are ni−1 di such extensions in total. But ni−1 = [Fi−1 : F] by the inductive assumption and di = [Fi : Fi−1] by Lemma 12.3.2, so there are [Fi : Fi−1][Fi−1 : F] = [Fi : F] extensions and we have shown ni = [Fi : F].
By induction, we get to nk = [E : F] and, as mentioned, this shows that [E : F] equals the order of Aut(E/F). This proves that F ⊆ E is Galois.

Proposition 14.3.3 A finite, separable, normal extension of fields is a Galois extension.

Proof. Let F ⊆ E be an extension satisfying the hypotheses. There is a finite sequence α1, α2, . . . , αk so that F(α1, α2, . . . , αk) = E. To get such a sequence, we start with F. If F is not all of E, we let α1 be any element in E − F. If F(α1) is not all of E, we let α2 be in E − F(α1). Inductively, if F(α1, . . . , αi) is not all of E, we take αi+1 to be any element of E − F(α1, . . . , αi). This process must stop since [E : F] is finite. Note that each αi is algebraic over F since F ⊆ E is a finite extension.

Now the hypotheses in the statement show that we have all the hypotheses of Lemma 14.3.2 and the result follows from that lemma.

14.3.2 Splitting fields

We need to make some remarks about splitting fields. The construction of splitting fields in Proposition 13.4.1 needs no assumption of irreducibility. This allows us to split any finite set of polynomials by being able to split any one polynomial. Given a finite set of polynomials, just multiply all of them together and build a splitting field for the product. It turns out that infinite sets of polynomials can be split, but we will have no need here of such power. So the assumption below that we are looking at a splitting field of a single polynomial is no real restriction for us.

We start with a lemma that gives a brief introduction to the power of the assumption that a field is a splitting field.

Lemma 14.3.4 Let F ⊆ E ⊆ K be an extension of fields, and assume that E is a splitting field for a polynomial P(x) in F[x]. Then any homomorphism from E into K that is the identity on F is an automorphism of E and thus an element of Aut(E/F).

Proof. Let A = {α1, α2, . . . , αk} be the roots of P(x).
Then E = F(α1, α2, . . . , αk). An immediate consequence is that the extension F ⊆ E is finite. Let θ be a homomorphism from E into K that is the identity on F. Since θ is the identity on F, we must have that each θ(αi) is a root of P(x). Thus each θ(αi) is an element of A. Thus θ takes every element of A into A and thus into E. We know that θ(E) ∩ E is a field. But it contains F and all of A. Thus it contains E = F(α1, α2, . . . , αk). So E ⊆ θ(E). We know that [E : F] is finite and must equal [θ(E) : F]. So [θ(E) : E] = 1 and E = θ(E).

We now give the proposition that shows that “splitting” implies “normal” in the presence of “separable.” It uses the same key lemma that Proposition 14.3.3 does.

Proposition 14.3.5 Let F ⊆ E be a separable extension of fields, and assume that E is a splitting field for a polynomial in F[x]. Then the extension is a Galois extension.

Remark. The power of the result might be hidden. Proposition 14.3.5 says that a separable extension that is a splitting field for one polynomial splits all irreducible polynomials over F that have at least one root in the extension. Further, there is a great deal that can be said about the automorphism group of the extension.

Proof. As in the proof of Lemma 14.3.4, we know that the extension is finite. Let P(x) be the polynomial for which E is the splitting field, and let A = {α1, . . . , αk} be the set of roots of P(x). We have that E = F(A). Each αi has a minimal polynomial Pi(x) over F which must divide P(x). The separability assumption makes each Pi(x) separable, and the fact that P(x) splits in E says that each Pi(x) splits in E. We now have all the hypotheses of Lemma 14.3.2 and the conclusion of that lemma is that F ⊆ E is a Galois extension.
The converse to Proposition 14.3.5 is easy to get from Proposition 14.3.1 and the remarks that we made at the beginning of this section: a splitting field for finitely many polynomials is a splitting field for one polynomial by the trick of multiplying all the polynomials together.

14.3.3 Characterizations of Galois extensions

The theorem below summarizes what we know from our definitions, propositions and lemmas.

Theorem 14.3.6 The following are equivalent for an extension F ⊆ E of fields.
1. The extension is Galois.
2. The degree [E : F ] is finite and equals the order of Aut(E/F ).
3. The degree [E : F ] is finite and the fixed field of Aut(E/F ) is F .
4. The extension is finite, separable and normal.
5. The extension is a separable extension and E is a splitting field for some polynomial in F [x].

The statements simplify if we assume the fields have characteristic zero, since characteristic zero implies separable.

Theorem 14.3.7 The following are equivalent for an extension F ⊆ E of fields of characteristic zero.
1. The extension is Galois.
2. The degree [E : F ] is finite and equals the order of Aut(E/F ).
3. The degree [E : F ] is finite and the fixed field of Aut(E/F ) is F .
4. The extension is finite and normal.
5. E is a splitting field for some polynomial in F [x].

14.4 The fundamental theorem of Galois Theory

We are now in a position to reveal our true motives. We are interested in seeing when a polynomial P (x) has its roots expressible in terms of its coefficients by a sequence of five operations, four of which are the field operations and the fifth being the taking of n-th roots. This translates into asking whether there is a chain of field extensions F1 ⊆ F2 ⊆ · · · ⊆ Fk where F1 contains the coefficients, Fk contains the roots and each extension Fi ⊆ Fi+1 , 1 ≤ i < k, is of the form Fi+1 = Fi (αi ) where αi is a root of a polynomial of the form xn − bi for some bi ∈ Fi . Such a setup has a name.
We say that the extension F1 ⊆ Fk is a radical extension if a sequence of extensions as described connecting F1 and Fk exists. A radical extension of one step (a single Fi ⊆ Fi+1 , for example) would be described as “adding an n-th root” of an element. Given an extension F ⊆ E of fields, we refer to a field K with F ⊆ K ⊆ E as a field that is intermediate to (or between) F and E. Thus our interest is first in finding fields intermediate to F1 and Fk , and second in understanding the relationships between various pairs of the intermediate fields.

The Galois group of a Galois extension

The Fundamental Theorem of Galois Theory takes care of the first and a bit of the second. For Galois extensions, it tells exactly what the intermediate fields are and gives some information about their relationships. The theorem gives this information in terms of the group Aut(E/F ) and its subgroups. This makes the group Aut(E/F ) so important that it is given a name. For a Galois extension F ⊆ E of fields, we call Aut(E/F ) the Galois group of the extension and denote it as Gal(E/F ). The full analysis of the second question (“What is the relation between two of the intermediate fields?”) will come from a more detailed examination of the structure of the Galois group than is given by the fundamental theorem. This will occupy us in the next chapter.

14.4.1 Some permutation facts

One part of the proof of the fundamental theorem relies on key facts about conjugation in groups of permutations. It has been a while since this was seen, so we review it here. We also review the notation since different notations are used in different books. If h and g are in a group G, we write hg for ghg −1 . We call hg the conjugate of h by g. If H is a subgroup of G, we write H g for {hg | h ∈ H}. We call H g the conjugate of H by g. If G acts on a set S by permutations, then each g ∈ G is a one-to-one correspondence from S to itself.
For a subgroup H of G, we write Fix(H) for the set {s ∈ S | h(s) = s for all h ∈ H}. One basic fact that we want is the following.

Lemma 14.4.1 In the setting just described, Fix(H g ) = g(Fix(H)). In particular, if H ⊳ G, then Fix(H) = g(Fix(H)) for all g ∈ G.

Proof. This is just Lemma 5.3.3 with an extra observation added at the end.

14.4.2 The Fundamental Theorem

Theorem 14.4.2 (Fundamental Theorem of Galois Theory) Let F ⊆ E be a Galois extension of fields. Then for every field K intermediate to F and E, the extension K ⊆ E is a Galois extension and Gal(E/K) is a subgroup of Gal(E/F ). Further, sending K to Gal(E/K) gives a one-to-one correspondence between the fields intermediate to F and E and the subgroups of Gal(E/F ). This one-to-one correspondence has the following properties:
1. If F ⊆ K ⊆ L ⊆ E, then Gal(E/L) ⊆ Gal(E/K).
2. A field K intermediate to F and E is a Galois extension of F if and only if Gal(E/K) is a normal subgroup of Gal(E/F ). When this occurs, then
(a) θ(K) = K for all θ ∈ Gal(E/F ),
(b) π : Gal(E/F ) → Gal(K/F ) defined by π(θ) = θ|K for each θ ∈ Gal(E/F ) is an onto homomorphism of groups, and
(c) Gal(K/F ) is isomorphic to the quotient group Gal(E/F )/Gal(E/K).

We make some comments before giving the proof. Item 1 in the conclusion says that the one-to-one correspondence K ↔ Gal(E/K) is containment reversing. Larger intermediate fields have smaller Galois groups. This even applies to the extremes. The field E is intermediate to F and E, is the largest such intermediate field, and its Galois group Gal(E/E) is the smallest possible: the trivial group. The field F is the smallest field intermediate to F and E, and its Galois group Gal(E/F ) is the largest subgroup of Gal(E/F ): the whole group.
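Returning for a moment to the permutation facts: Lemma 14.4.1 is easy to test by machine in a small case. The sketch below (our own illustration, not from the text) takes G = S3 acting on {0, 1, 2}, lets H be the subgroup generated by the transposition (0 1), and verifies Fix(H g ) = g(Fix(H)) for every g.

```python
from itertools import permutations

# Permutations of {0, 1, 2} as tuples p with p[i] the image of i.
def compose(p, q):            # the permutation "p after q"
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0, 0, 0]
    for i in range(3):
        inv[p[i]] = i
    return tuple(inv)

def fix(H):                   # Fix(H): points fixed by every element of H
    return {s for s in range(3) if all(h[s] == s for h in H)}

H = {(0, 1, 2), (1, 0, 2)}    # subgroup generated by the transposition (0 1)
assert fix(H) == {2}

# Fix(H^g) = g(Fix(H)) for every g in S3
for g in permutations(range(3)):
    Hg = {compose(compose(g, h), inverse(g)) for h in H}   # H^g = {g h g^{-1}}
    assert fix(Hg) == {g[s] for s in fix(H)}
```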
Item 2 in the conclusion says that the one-to-one correspondence K ↔ Gal(E/K) also serves as a one-to-one correspondence between the Galois extensions F ⊆ K with K intermediate to F and E and the normal subgroups of Gal(E/F ). We can now justify the word normal as applied to extensions. The finiteness of the extension F ⊆ K comes from the setting, and for fields of characteristic zero, separability is immediate as well. By Item 4 of Theorem 14.3.7, the only property needed to get that F ⊆ K is Galois for fields of characteristic zero is that the extension be normal. The fundamental theorem says that this happens if and only if the corresponding Galois group is normal. This explains why normal subgroups and normal extensions have the same name.

The statement of the Fundamental Theorem of Galois Theory makes a lot of promises, and so the proof is correspondingly long. However, no one of the conclusions is very difficult to prove. It will be found that most of the work has been done in Section 14.3. In the proof we sometimes write Aut( / ) and other times write Gal( / ). Our rule is that we use Aut when it is not known if the extension is Galois, and we use Gal when we know that the extension is Galois. This is not exactly standard terminology.

Proof. If K is intermediate to F and E, then we must first show that K ⊆ E is a Galois extension. From Theorem 14.3.6, we have a choice of approach. We will use Item 4 from that theorem and show that K ⊆ E is finite, separable and normal. But F ⊆ E is finite, separable and normal. Finiteness of K ⊆ E is immediate. If α is in E, then its minimal polynomial Q(x) over K must divide its minimal polynomial P (x) over F . But P (x) splits in E and has no repeated roots. Thus Q(x) is a product of some of the linear factors that make up P (x) and so Q(x) splits in E and has no repeated roots. We know that Gal(E/K) is a subgroup of Aut(E).
Since every element of Gal(E/K) is the identity on K and F ⊆ K, it is also the identity on F . Thus every element of Gal(E/K) is an element of Gal(E/F ). We now have a function g from I, the set of fields intermediate to F and E, to S, the set of subgroups of Gal(E/F ). It is defined as g(K) = Gal(E/K). To show that g is one-to-one and onto, we build an inverse. Let f : S → I be defined by f (H) = Fix(H) for each subgroup H of Gal(E/F ). We must show that gf and f g are identity functions. For K intermediate to F and E, the field f g(K) is the fixed field of Gal(E/K). But the basic property of Galois extensions is that this fixed field is exactly K. So the work in showing that f g is the identity is found in the proof that K ⊆ E is Galois. For a subgroup H of Gal(E/F ), the field f (H) is the fixed field L of H in E. If m is the order of H, then Proposition 13.5.2 says that [E : L] = m. Since every element of H fixes F , we must have F ⊆ L, so L is intermediate to F and E and it is known that L ⊆ E is Galois. Now gf (H) is Gal(E/L), the group of all automorphisms of E that fix L. This must include H since all elements of H fix L, and we have H ⊆ gf (H). But for the Galois extension L ⊆ E, we must have that m = [E : L] equals the order of Gal(E/L). Thus H and gf (H) have the same order and must be equal. We have now shown the claims in the first paragraph of the statement of the theorem.

If F ⊆ K ⊆ L ⊆ E, then every element of Gal(E/L) fixes L, which contains K. Thus every element of Gal(E/L) fixes K and must be in Gal(E/K). This proves Item 1.

For Item 2, there are several things to prove: both directions of the if and only if, and Items (a)–(c). Assume that K is intermediate to F and E and that F ⊆ K is a Galois extension. Then it is a splitting field for some polynomial P (x) in F [x]. Let θ be in Gal(E/F ). The restriction of θ to K is a homomorphism from K into E that is the identity on F .
From Lemma 14.3.4, we know that this restriction has image K and is an element of Aut(K/F ). This gives Item (a). Thus π(θ) = θ|K is a function from Gal(E/F ) to Gal(K/F ). Since K is taken to itself by each automorphism in Gal(E/F ), composing two restrictions gives the same result as restricting the composition of two elements of Gal(E/F ). Written out, this says θ|K ρ|K = (θρ)|K and gives that the function π is a homomorphism. We want to prove that π is onto. The kernel of π consists of those elements of Gal(E/F ) whose restriction to K is the identity on K. But this is just a description of Gal(E/K). Thus Gal(E/K) is normal in Gal(E/F ) and we have one direction of the “if and only if.”

For Item (c), we note that the first isomorphism theorem for group homomorphisms says that the image of a group homomorphism is isomorphic to the domain of the homomorphism modulo the kernel. In our situation, this says that the image of π is isomorphic to Gal(E/F )/Gal(E/K). Thus the order of the image of π is

|Gal(E/F )| / |Gal(E/K)| = [E : F ] / [E : K]

by Item 2 of Theorem 14.3.6. But F ⊆ K ⊆ E says that [E : F ] = [E : K][K : F ], so

[E : F ] / [E : K] = [K : F ] = |Gal(K/F )|

and the image of π must be all of Gal(K/F ). Thus π is onto and Gal(K/F ) is isomorphic to Gal(E/F )/Gal(E/K).

For the other direction of the “if and only if” we assume that Gal(E/K) is normal in Gal(E/F ) and we work to prove that F ⊆ K is Galois. To shorten the notation, let Γ = Gal(E/K). We know from the first part of this proof that K ⊆ E is Galois, so K is the fixed field of Γ = Gal(E/K). From Lemma 14.4.1, we know that θ(K) = K for every θ in Gal(E/F ). (This proves Item (a) again, but from a different hypothesis.) At this point, we can choose any of a number of arguments to show that F ⊆ K is Galois. We choose Item 3 of Theorem 14.3.6, and show that F ⊆ K is Galois because F is the fixed field of Aut(K/F ).
We know that F is the fixed field of Gal(E/F ). If a ∈ K − F , then there is a θ ∈ Gal(E/F ) with θ(a) ≠ a. But θ(K) = K and so θ|K is an element of Aut(K/F ) that does not fix a. Thus no element of K − F is in the fixed field of Aut(K/F ). But F is clearly in the fixed field of Aut(K/F ). So F is the fixed field of Aut(K/F ) and F ⊆ K is Galois.

Chapter 15 Galois theory in C

We now apply the results in the previous chapter to subfields of C, the field of complex numbers. This setting is so important that it has a name of its own. We will refer to subfields of C as number fields. So the title of the chapter could have been “Galois theory of number fields.” However, the terminology is not completely standard1 and the chapter title we use is more specific. The setting has numerous advantages. One important fact is that polynomials in C[x] split in C. This is called the Fundamental Theorem of Algebra and will be proven here if we have time. Its proof needs at least some analysis (calculus). Another advantage is that C has characteristic zero, so extensions are automatically separable. A third advantage is that we know a lot about multiplying and adding in C. In particular, we know a lot about the n-th roots of complex numbers (with n-th roots of 1 being the most important special case), which will turn out to be very important in our analysis. The goal of this chapter is to use Galois theory to give information about roots of polynomials. In particular, we will determine when a polynomial P (x) in Q[x] has roots that can be calculated from the coefficients of P (x) using the four field operations and the taking of n-th roots. This will be translated into asking when the extension Q ⊆ K (where K is a splitting field for P (x)) has a very specific structure. The triumph of Galois theory is to tie this specific structure to properties of the group Gal(K/Q).
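Before moving on, the bookkeeping promised by the Fundamental Theorem can be checked in the standard small example Q ⊆ Q(√2, √3). The sketch below takes as given the well-known fact that the Galois group here is the Klein four-group, acting by sign changes on √2 and √3; the encoding of automorphisms as sign vectors and of intermediate fields as sets of fixed basis elements is our own.

```python
from itertools import product

# Gal(Q(sqrt2, sqrt3)/Q) acts on the basis {1, sqrt2, sqrt3, sqrt6} by sign
# changes: the pair (s2, s3) sends sqrt2 -> s2*sqrt2 and sqrt3 -> s3*sqrt3.
def signs(g):
    s2, s3 = g
    return (1, s2, s3, s2 * s3)       # sqrt6 = sqrt2*sqrt3 picks up both signs

G = list(product([1, -1], repeat=2))  # Klein four-group, order 4 = [E : Q]

def fixed_basis(H):                    # basis vectors fixed by every element of H
    return frozenset(i for i in range(4)
                     if all(signs(h)[i] == 1 for h in H))

e = (1, 1)
subgroups = [                          # all 5 subgroups of the Klein four-group
    {e},
    {e, (-1, 1)},
    {e, (1, -1)},
    {e, (-1, -1)},
    set(G),
]

fields = [fixed_basis(H) for H in subgroups]
# distinct subgroups give distinct fixed fields: the correspondence is one-to-one
assert len(set(fields)) == 5
# degree bookkeeping: dim Fix(H) = [E : Q] / |H|
assert all(len(Fb) * len(H) == 4 for H, Fb in zip(subgroups, fields))
```

The three order-two subgroups correspond to the three quadratic intermediate fields Q(√3), Q(√2) and Q(√6), exactly as the theorem predicts.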
The Fundamental Theorem of Galois Theory and the interplay between our knowledge of n-th roots in C and the automorphisms they lead to will give the analysis we seek.

The Fundamental Theorem of Algebra

To make sure we get the statement recorded and to indicate its importance, we record the statement here. A proof will be supplied much later (and perhaps not before the end of the semester).

1 The term “number field” is often used to refer to a finite extension of Q.

Theorem 15.0.3 (Fundamental Theorem of Algebra) Every polynomial in C[x] splits in C[x].

15.1 Radical extensions

We wish to investigate polynomials over a number field F . Typically we will want to understand some P (x) over Q, but number fields other than Q could be used as well. We repeat some of a previous discussion. If P (x) ∈ F [x] is given, then we know its coefficients. If its roots can be found from the coefficients by the five operations that we have discussed (the four field operations and the taking of n-th roots), then we say that P (x) is solvable by radicals. There is a structure that we will associate to the concept of solvability by radicals. Consider the following structure, where all the Fi are number fields. We have a sequence of field extensions F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk where Fi for 1 ≤ i ≤ k is of the form Fi−1 (αi ) and αi ∈ Fi is a root of a polynomial of the form xni − βi−1 for some element βi−1 ∈ Fi−1 . In other words, we get from one field to the next larger field by adding an element that is a root of some element in the smaller field. If such a structure exists, then we say that Fk is a radical extension of F0 . We will refer to the collection of the ni as the exponents of the structure. This is not standard terminology, but it will be convenient for us. Also, it is sometimes required that the structure have other nice properties. For example, it is sometimes required that each extension Fi ⊆ Fi+1 be normal.
We will not make that requirement now, but will add it later and show that it can be achieved. We will also show that other properties can be added.

Lemma 15.1.1 If P (x) is a polynomial over a number field F , then P (x) is solvable by radicals if and only if the splitting field E ⊆ C for P (x) over F is contained in a radical extension Fk of F .

Proof. If P (x) is solvable by radicals, then there is a set of intermediate calculations that lead to the roots. Since there are finitely many roots, there are finitely many intermediate calculations. We start with F0 = F , and we consider the list of intermediate values. We eliminate all intermediate values that are already in F . We take an intermediate value α1 that is not in F but that is a root of an element of F . We form F1 = F (α1 ), and then eliminate from the list of intermediate values that remains all values that are now in F1 . We continue in this way until we have a field Fk that contains all the roots. Since Fk contains all the roots of P (x), it must contain a splitting field for P (x) over F . By the way Fk is constructed, it is a radical extension of F .

Conversely, assume that there is a radical extension Fk of F that contains a splitting field E for P (x) over F . We will be done when we show that every element of Fk can be obtained from the elements of F by the four field operations and the taking of n-th roots. This is done quite simply by induction on k. There is nothing to show if k = 0. If k ≥ 1, let Fi , 0 ≤ i ≤ k, with F0 = F , and elements αi , βi and ni be as in the definition above of a radical extension. We have that Fk = Fk−1 (αk ) and that αk is a root of xnk − βk−1 with βk−1 ∈ Fk−1 . From Corollary 12.3.3, we know that every element in Fk is a linear combination of powers of αk with coefficients from Fk−1 .
By the inductive assumption, every element of Fk−1 can be obtained from elements of F using the five allowed operations, and so we can get all elements of Fk from elements of F by these operations, the taking of nk -th roots (specifically of βk−1 ), and some further field operations. Since E ⊆ Fk and E contains all roots of P (x), this applies to the roots of P (x).

15.1.1 An outline

Lemma 15.1.1 gives enough of a picture for us to describe how things will go. We have fields F ⊆ E ⊆ Fk . The radical extension Fk of F might not be Galois, but the extension F ⊆ E is Galois. If Fk were Galois over F , then we would have

Gal(E/F ) ≃ Gal(Fk /F )/Gal(Fk /E)

making Gal(E/F ) a quotient group of Gal(Fk /F ). It turns out that we can extend the radical extension of Lemma 15.1.1 even farther to make it Galois and still keep it a radical extension. It further turns out that automorphism groups of Galois, radical extensions are solvable groups. This is why solvable groups are called solvable. Recall that quotients of solvable groups are solvable. This gives us our first major result. If P (x) is a polynomial over a number field F that is solvable by radicals, and E is the splitting field in C for P (x) over F , then Gal(E/F ) is a solvable group. We will use this result to show that a specific polynomial in Q[x] (it will have degree 5 since all polynomials of degree no more than 4 are solvable by radicals) is not solvable by radicals by computing the Galois group of its splitting field and showing that it is not a solvable group. Thus we will have illustrated one direction of the statement “P (x) ∈ F [x] is solvable by radicals if and only if a splitting field E for P (x) over F has that Gal(E/F ) is a solvable group.” It is the easier direction and the more flamboyant. The other direction is harder and more interesting. It says that if the Galois group is known to have a certain structure (namely, being solvable), then the extension must have a certain structure.
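For a quadratic, the whole chain described in Lemma 15.1.1 is visible at once: the familiar quadratic formula builds the roots from the coefficients by field operations plus a single square root, so the splitting field sits inside a one-step radical extension. A numerical sketch of this (floating-point numbers standing in for the field Q(√5), and the polynomial x2 − x − 1 as our choice of example):

```python
# x^2 - x - 1 over F = Q is solvable by radicals: its splitting field is
# Q(sqrt(5)), reached in one radical step (adding a square root).
a, b, c = 1, -1, -1                  # coefficients, all in F

disc = b * b - 4 * a * c             # 5, still an element of F
sqrt_disc = disc ** 0.5              # the one radical step: adjoin sqrt(5)

# the roots now come from field operations on F together with sqrt(5)
roots = [(-b + sqrt_disc) / (2 * a), (-b - sqrt_disc) / (2 * a)]
for r in roots:
    assert abs(a * r * r + b * r + c) < 1e-12
```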
That this direction turns out to be true calls for two comments. First, this shows that not only is the solvability of the Galois group necessary for solvability of the polynomial by radicals (and thus something that prevents solvability of the polynomial by radicals when it is absent), but it is also sufficient. That is, it captures exactly the solvability of the polynomial. Second, it is a powerful example where niceness of the structure of the group of symmetries implies niceness of the structure having those symmetries. This is why it is the harder direction. One must use properties of the group of symmetries to prove that the splitting field of the polynomial is contained in an extension that can be built step by step by extensions that are determined by polynomials of the form xn − β.

15.2 Improving radical extensions

This section will be an example of making a situation fit a technique. Galois theory works well on Galois extensions. The result of Lemma 15.1.1 shows that a polynomial P (x) is solvable by radicals if its splitting field is contained in a radical extension. The discussion above shows that it would be nice to know that the radical extension is Galois. It would be even nicer to know more about the intermediate fields in the radical extension. In particular, getting the automorphism group to be solvable will require knowing that the intermediate extensions are normal. We will show that all this can be achieved, and more. That is, we will start with the conclusion of Lemma 15.1.1 and show that the conclusion can be strengthened to the point where Galois theory can make some contributions. One task of this section is to show that any radical extension can be made Galois. But there will be another task. We will also show that the “internal structure” of the radical extension can be arranged to be particularly simple.
This will help when we work to compute the Galois group of the extension. The terminology that we use for the following is not standard. Let F ⊆ K be an extension of fields. We say that K is an improved radical extension of F if F ⊆ K is Galois, and if there are fields Fi , elements βi , primes pi , and an integer n so that F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fn = K with βi ∈ Fi , 0 ≤ i < n, and with each Fi , 1 ≤ i ≤ n, a splitting field over Fi−1 of xpi − βi−1 . The definition needs some comments. First, if the definition of improved radical extension is compared carefully to the definition of a radical extension, then it is not immediately obvious that an improved radical extension is a radical extension. This is cleared up in the next lemma. Second, the requirement that F ⊆ K be Galois has already been given motivation by the discussion in the previous section. Third, the requirement that the exponents pi used in the polynomials all be prime is more of a convenience than a necessity. The fact that the exponents are primes will make certain arguments slightly easier.

Lemma 15.2.1 If F ⊆ K is an improved radical extension of fields, then it is a radical extension of fields.

Proof. The item missing from the definition of a (plain) radical extension is that each intermediate extension be by a single element that is a root of a polynomial of the right form. But if Fi is a splitting field over Fi−1 for the polynomial xpi − βi−1 , then Fi = Fi−1 (α1 , α2 , . . . , αk ) where α1 , α2 , . . . , αk are all the roots of xpi − βi−1 . If we add one root at a time to Fi−1 and give the intermediate fields names, then we have the required structure of a radical extension between Fi−1 and Fi since each αj is a root of xpi − βi−1 . This can be done between each pair Fi−1 ⊆ Fi . Thus by adding extra intermediate fields, we get that F ⊆ K is a radical extension.

Not every radical extension is an improved radical extension.
But every radical extension can be made bigger so that it becomes an improved radical extension. We tackle the improvements one at a time. First we get the exponents in the polynomials to be primes.

15.2.1 The first improvement

If F ⊆ F (α) is an extension of fields, and α is a root of xn − β with β ∈ F , then αn = β. But if n is not a prime, then n = pq for some prime p and some q > 1, and αn = (αp )q = β and we can introduce a field intermediate to F and F (α) to get F ⊆ F (γ) ⊆ F (α) where γ = αp ∈ F (α). Now α is a root of xp − γ over F (γ), and γ is a root of xq − β over F . Since q < n, we have all the ingredients needed to give an inductive proof of the following.

Lemma 15.2.2 Let F ⊆ F (α) be an extension where α is a root of xn − β for β ∈ F . Then there are fields Fi , elements αi and βi , primes pi and a natural number k so that F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk = F (α) where for 1 ≤ i ≤ k, Fi = Fi−1 (αi ), and αi is a root of xpi − βi−1 with βi−1 ∈ Fi−1 .

By using Lemma 15.2.2 at each stage of the definition of a radical extension, we get the following corollary.

Corollary 15.2.3 Let E be a radical extension of a number field F . Then there are fields Fi , elements αi and βi , primes pi and a natural number k so that F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk = E where for 1 ≤ i ≤ k, Fi = Fi−1 (αi ), and αi is a root of xpi − βi−1 with βi−1 ∈ Fi−1 .

15.2.2 The second improvement

In the proof of the following, we will refer to splitting fields for certain polynomials. We know that splitting fields can always be constructed, but here we can take advantage of the fact that we work in C, and a splitting field for any polynomial always exists in C.

Proposition 15.2.4 If F ⊆ K is a radical extension of a number field, then there is an improved radical extension F ⊆ E that has K ⊆ E.

Proof. We first apply Corollary 15.2.3 so that we can assume that all the exponents of the structure of the radical extension F ⊆ K are primes.
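Before continuing the proof, the content of Lemma 15.2.2 can be illustrated numerically: extracting one n-th root is the same as extracting a chain of prime-th roots, one for each prime factor of n. The helper prime_factors below and the sample values n = 12 and β = 7 are our own choices.

```python
# Lemma 15.2.2 numerically: an n-th root extraction can be replaced by a
# chain of prime-th root extractions, one per prime factor of n.
def prime_factors(n):
    out, p = [], 2
    while n > 1:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    return out

n, beta = 12, 7.0
value = beta
for p in prime_factors(n):        # 12 = 2 * 2 * 3: three prime-exponent steps
    value = value ** (1.0 / p)    # each step adds one prime-th root

assert abs(value - beta ** (1.0 / n)) < 1e-12
```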
We next list, in the order that they are added, all the elements α1 , α2 , . . . , αk that are added to F to obtain K. We will induct on k. We let Fi be the extension of F by α1 through αi . There is nothing to do when k = 0, and so we assume that there is an improved radical extension F ⊆ G that contains Fk−1 . We will add αk to Fk−1 and then enough extra so that the resulting extension E satisfies the conclusion of the proposition. Note that αk is a root of xpk − βk−1 where βk−1 is from Fk−1 and is thus in G. We will obtain E by building a succession of extensions

G = S0 ⊆ S1 ⊆ S2 ⊆ · · · ⊆ St = E     (15.1)

where each Si is a splitting field over Si−1 of a polynomial of the form xpk − ci−1 where ci−1 is some element in Si−1 . To create a splitting field, one must add all roots of a polynomial. This can be done in succession so that if (say) r1 , r2 , . . . , rn are the roots of xpk − c0 , then S1 is the end of a succession of extensions S0 ⊆ S0 (r1 ) ⊆ S0 (r1 , r2 ) ⊆ · · · ⊆ S0 (r1 , r2 , . . . , rn ) = S1 . Note that each intermediate extension is of the allowed type: the addition of a root of xpk − c0 . The fact that c0 is the same for all of the extensions is not a problem. Thus we will take it as established that extending by building a splitting field of a polynomial of the form xpk − c for some c stays within the definition of a radical extension and keeps the exponents used within the set of primes. Note also that since we are building splitting fields, we satisfy one of the requirements of an improved radical extension.

We now work on getting the last extension (splitting field) St to be Galois over F . The Fundamental Theorem of Galois Theory tells us what we must do. The splitting field S1 already contains all roots of xpk − βk−1 . The Galois extension E of F that we seek will contain the Galois extension G of F . There will be a surjective homomorphism π : Gal(E/F ) → Gal(G/F ) given by restriction.
That is, if θ is in Gal(E/F ), then π(θ) = θ|G will be in Gal(G/F ), and every element in Gal(G/F ) arises this way.

Let ρ be some element of Gal(G/F ) and let σ ∈ Gal(E/F ) be such that its restriction to G is ρ. If R is the set of all roots of xpk − βk−1 , then the elements of σ(R) will be in E and will all be roots of σ(xpk − βk−1 ). This equals xpk − ρ(βk−1 ) since βk−1 is in G and σ restricted to G is ρ. (We can see that all the roots of xpk − ρ(βk−1 ) will be in σ(R) by applying σ −1 to the set of roots of xpk − ρ(βk−1 ) and noting that they will all be in R.) So we must not only add the roots of xpk − βk−1 to G, we must add all roots of xpk − ρ(βk−1 ) for every ρ ∈ Gal(G/F ). This gives us our succession of splitting fields (15.1). The number t of splitting field extensions in (15.1) will be the number of elements of Gal(G/F ). This means that St = E is a splitting field over G for the polynomial

P (x) = (xpk − ρ1 (βk−1 ))(xpk − ρ2 (βk−1 )) · · · (xpk − ρt (βk−1 ))

where ρ1 , ρ2 , . . . , ρt are all the elements (including the identity) in Gal(G/F ). This makes E Galois over G, but we would like E to be Galois over F . We will get this in two steps. First, by the inductive hypothesis, G is Galois over F , so it is a splitting field of some Q(x) over F . In the next paragraph, we will show that all the coefficients of P (x) are in F so that P (x) ∈ F [x]. Thus Q(x)P (x) will be in F [x] and E will be a splitting field of Q(x)P (x) over F and will be Galois over F .

We now bring in a technique that was used in the proof of Proposition 14.3.1. All the factors of P (x) use coefficients from G (namely 1 and the ρi (βk−1 )) and so we can act on P (x) by any element of Gal(G/F ). If ρ is such an element, then the factors of ρ(P (x)) will be the factors of P (x) with −ρi (βk−1 ) replaced by −ρρi (βk−1 ) in each factor. But multiplying all elements of Gal(G/F ) on the left by one element of Gal(G/F ) simply permutes the elements of Gal(G/F ).
Thus the factors of ρ(P (x)) are exactly those of P (x) except for a change of order. Thus ρ(P (x)) = P (x). This puts all the coefficients of P (x) in the fixed field of ρ. Since this applies to any element of Gal(G/F ), we have that the coefficients of P (x) are in the fixed field of Gal(G/F ), which must be F since F ⊆ G is Galois. Thus P (x) is in F [x]. This makes E Galois over F . As mentioned above, all other requirements for an improved, radical extension have been obtained.

15.3 The Galois group of an improved, radical extension

Recall that an improved, radical extension F ⊆ K has fields Fi , elements βi , primes pi and an integer n so that F ⊆ K is Galois and so that F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fn = K with βi ∈ Fi , 0 ≤ i < n, and with each Fi , 1 ≤ i ≤ n, a splitting field over Fi−1 of xpi − βi−1 . This gives us various extensions to look at, some of which are Galois. Since F ⊆ K is Galois, the Fundamental Theorem of Galois Theory says that each Fi ⊆ K is Galois and we have a succession of groups

{1} = Gn ⊆ Gn−1 ⊆ Gn−2 ⊆ · · · ⊆ G1 ⊆ G0 = Gal(K/F )

where each Gi = Gal(K/Fi ). We do not know that each Fi is Galois over F , so we do not know that the Gi are normal in Gal(K/F ). But each Fi is a splitting field for a polynomial over Fi−1 , so each Fi−1 ⊆ Fi is Galois. This means that Gal(K/Fi ) is normal in Gal(K/Fi−1 ). In terms of the Gi , this means that each Gi is normal in Gi−1 . Thus the sequence of groups above can be rewritten as

{1} = Gn ⊳ Gn−1 ⊳ Gn−2 ⊳ · · · ⊳ G1 ⊳ G0 = Gal(K/F ).     (15.2)

Recall that normality is not transitive, and that this does not imply that all the Gi are normal in Gal(K/F ). We are interested in analyzing each

Gi−1 /Gi = Gal(K/Fi−1 )/Gal(K/Fi ).

Since each extension in Fi−1 ⊆ Fi ⊆ K is Galois, the Fundamental Theorem of Galois Theory says that the quotient above is isomorphic to Gal(Fi /Fi−1 ). But this is the Galois group of a splitting field for a polynomial of the form xpi − βi−1 .
We are interested in this in the setting of number fields, which means that we are looking at an extension obtained by adding all the complex pi -th roots of a single element βi−1 . Since we know much about such roots, we are in a good position to analyze the corresponding Galois group. To simplify the notation, let us consider number fields G ⊆ G′ where G′ is the splitting field over G in C for xp − β with β ∈ G and p a prime. We want to say something about Gal(G′ /G). The case p = 1 is either ruled out because 1 is not a prime, or is ruled out because it is trivial. Thus we assume p > 1. We know that the p-th roots of β in C are of the form γ i α, 0 ≤ i < p, where γ is the p-th root of 1 making the smallest positive angle with respect to the positive real axis, and α is one p-th root of β. Since all of these roots are in G′ and there is more than one root (using p > 1), their ratios are in G′ as well. Thus all the γ i , 0 ≤ i < p, are in G′ , and we see that G′ contains Gp , the splitting field over G in C for xp − 1. Once all the γ i are in Gp , to get all the roots of xp − β, we only need to add α. Thus we have the sequence of extensions G ⊆ Gp ⊆ Gp (α) = G′ . The extension G ⊆ Gp (α) = G′ is Galois, as is G ⊆ Gp since Gp is a splitting field of a polynomial. Thus Gal(Gp (α)/Gp ) is normal in Gal(G′ /G). Thus we can insert into (15.2) a normal subgroup between each pair Gi ⊳ Gi−1 to obtain a sequence twice as long in which half the extensions consist of adding all roots of the polynomial xp − 1 for some prime p, and the remaining extensions consist of adding one root (and thus all roots) of a polynomial of the form xp − β for some prime p and some β to a field containing β and all the p-th roots of 1. We look at the Galois group of each of the two types.

If we consider G ⊆ Gp where Gp is the splitting field over G in C of xp − 1, then Gp = G(γ) with γ as described above.
This is because all p-th roots of 1 are of the form γ^i for some i. Thus every automorphism in Gal(G_p/G) is completely determined by where it sends γ. Since roots of x^p − 1 must be taken to roots of x^p − 1, an automorphism in Gal(G_p/G) must take γ to some γ^i. If θ(γ) = γ^i and φ(γ) = γ^j for θ and φ in Gal(G_p/G), then θ(φ(γ)) = θ(γ^j) = (γ^i)^j = γ^{ij}. Similarly φ(θ(γ)) = γ^{ij}. Since elements of Gal(G_p/G) are determined by what they do to γ, we have θ ◦ φ = φ ◦ θ, and we have shown that Gal(G_p/G) is abelian.²

If we consider G_p ⊆ G_p(α) where G_p contains γ, the p-th root of 1 described above (and thus all p-th roots of 1), and α is one root of x^p − β for β ∈ G_p, then G_p(α) is a splitting field for x^p − β over G_p. Every automorphism in Gal(G_p(α)/G_p) is completely determined by where it sends α. Since α must be taken to a root of x^p − β, an automorphism in Gal(G_p(α)/G_p) must take α to αγ^i for some i. If θ(α) = αγ^i and φ(α) = αγ^j for θ and φ in Gal(G_p(α)/G_p), then θ(φ(α)) = θ(αγ^j) = θ(α)θ(γ^j) = αγ^i γ^j = αγ^{i+j}, since γ ∈ G_p implies that θ fixes γ. Similarly φ(θ(α)) = αγ^{i+j}. Since elements of Gal(G_p(α)/G_p) are determined by what they do to α, we have θ ◦ φ = φ ◦ θ, and we have shown that Gal(G_p(α)/G_p) is abelian.

Thus we have shown that an improved, radical extension has a Galois group with a sequence of subgroups, each normal in the next, so that the successive quotient groups are abelian. But this is the definition of a solvable group. We have shown the following.

Proposition 15.3.1 The Galois group of an improved, radical extension of number fields is solvable.

If we combine Lemma 15.1.1, the comments in Section 15.1.1, Proposition 15.2.4 and Proposition 15.3.1, we have the following.

Proposition 15.3.2 Let P(x) be a polynomial over a number field F and let K be the splitting field in C over F of P(x). If P(x) is solvable by radicals, then the Galois group Gal(K/F) is a quotient of a solvable group.
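The two commutation computations above can be illustrated numerically. The following is a minimal sketch of my own (not part of the text's formal argument): it takes p = 5, γ = e^(2πi/5), and models the automorphisms γ ↦ γ^2 and γ ↦ γ^3 by the maps z ↦ z^2 and z ↦ z^3 on powers of γ. These maps are of course not field automorphisms of C; they merely track where γ is sent.

```python
import cmath

# Sketch: for gamma a p-th root of 1, composing gamma -> gamma^i with
# gamma -> gamma^j gives gamma -> gamma^(ij) in either order, because
# integer multiplication commutes.  Here p = 5, i = 2, j = 3.
p = 5
gamma = cmath.exp(2j * cmath.pi / p)   # smallest positive angle, as in the text

def theta(z):          # tracks the automorphism determined by gamma -> gamma^2
    return z ** 2

def phi(z):            # tracks the automorphism determined by gamma -> gamma^3
    return z ** 3

# Both orders of composition send gamma to gamma^6 = gamma^(6 mod 5) = gamma.
lhs = theta(phi(gamma))
rhs = phi(theta(gamma))
assert abs(lhs - rhs) < 1e-12
assert abs(lhs - gamma ** (6 % p)) < 1e-12
```

The point of the sketch is exactly the one in the text: since an automorphism is determined by the exponent i in γ ↦ γ^i, composition of automorphisms corresponds to multiplication of exponents mod p, which is commutative.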
² We have not used the fact that p is prime. In fact, we are leaving out a lot of commentary. In general, not every γ ↦ γ^i is valid. For example, γ ↦ γ^p = 1 is not valid. Further, if γ is in G, then it must be fixed by any automorphism. And even if γ is not in G, γ ↦ γ^i cannot be valid if γ^i is in G. So there are values of i that cannot be used. This does not interfere with our argument. If p is a prime, then the roots of x^p − 1 form a cyclic group of order p under multiplication. Any root of x^p − 1 other than 1 is a generator of this group, since the only subgroup bigger than {1} is the whole group. Thus if any root of x^p − 1 other than 1 is in G, they all are. So if one root of x^p − 1 is not in G, then all roots of x^p − 1 other than 1 are not in G, and all values of i other than multiples of p can be used for an automorphism determined by γ ↦ γ^i. Thus we get a more complete picture when p is a prime.

But we know from Lemma 8.2.2 that a quotient of a solvable group is solvable. Thus we get the following key result.

Theorem 15.3.3 Let P(x) be a polynomial over a number field F and let K be the splitting field in C over F of P(x). If P(x) is solvable by radicals, then the Galois group Gal(K/F) is a solvable group.

15.4 An example

Consider the polynomial P(x) = 4x^5 − 10x^2 + 5. This is irreducible over Q by the Eisenstein criterion with the prime 5. It has odd degree, so it has at least one real root. We will show that it has exactly three real roots. We have

P′(x) = 20x^4 − 20x = 20(x^4 − x) = 20x(x^3 − 1) = 20x(x − 1)(x^2 + x + 1).

We recognize x^2 + x + 1 as the polynomial whose roots are the nonreal complex cube roots of 1, so the real roots of P′(x) are x = 0 and x = 1. We have P(0) = 5 and P(1) = 4 − 10 + 5 = −1. So P(x) crosses the x-axis three times and only three times. The other two roots of P(x) are complex, and by the last part of Exercise Set (47) they form a complex conjugate pair.
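The sign analysis above is easy to confirm numerically. In the following sketch (my own check; the bracketing endpoints −1, 0, 1, 2 are my choices, not from the text), the sign changes of P on (−1, 0), (0, 1) and (1, 2) locate the three real roots by bisection.

```python
def P(x):
    """The quintic from the example: P(x) = 4x^5 - 10x^2 + 5."""
    return 4 * x**5 - 10 * x**2 + 5

def bisect(f, a, b, steps=60):
    """Locate a root of f in [a, b] by bisection, assuming f(a)*f(b) < 0."""
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

assert P(0) == 5 and P(1) == -1            # the critical values computed above
assert P(-1) < 0 and P(2) > 0              # so there are sign changes in all
roots = [bisect(P, a, b) for a, b in [(-1, 0), (0, 1), (1, 2)]]  # three intervals
```

Since P has only two real critical points, these three roots are all of its real roots, exactly as argued in the text.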
Let r1, r2, r3 be the three real roots of P(x), and let c1, c2 be the complex roots. The splitting field E for P(x) over Q is Q(r1, r2, r3, c1, c2). We know that complex conjugation is an automorphism of C fixing Q. Since E is a splitting field for a polynomial over Q, Lemma 14.3.4 says that the image of E under complex conjugation is E itself, and complex conjugation is an element of Gal(E/Q). Its action on {r1, r2, r3, c1, c2} is to fix each ri and to switch c1 and c2. Since P(x) is irreducible (this is the only place where we use irreducibility), we know that any one root (r1, say) of P(x) can be taken to any other root of P(x). (We never actually recorded this fact explicitly, but it follows from Propositions 13.3.3 and 13.4.2: any Q(a) for a root a of P(x) is isomorphic to Q(r1) by the first, and the isomorphism extends to E by the second, since E is a splitting field of some factor of P(x) over Q(r1).) We now argue that this makes Gal(E/Q) isomorphic to the symmetric group on 5 objects. The statement that any root of P(x) can be taken to any other root says that Gal(E/Q) acts transitively on the roots of P(x). Recall that if a group G acts on a set S, we say it acts transitively if, given any two elements x and y of S, there is a g ∈ G so that g(x) = y. We have already shown that complex conjugation is one of the elements of Gal(E/Q) and that it acts as a single transposition of the roots of P(x). From Proposition 9.3.1, we know that the action of Gal(E/Q) on the roots of P(x) gives all elements of S5. Since E = Q(r1, r2, r3, c1, c2), any element of Gal(E/Q) is determined completely by what it does on {r1, r2, r3, c1, c2}. Since we have shown that any permutation of {r1, r2, r3, c1, c2} can be accomplished by an automorphism in Gal(E/Q), we know that Gal(E/Q) is isomorphic to S5. From Corollary 9.2.3, we know that S5 is not solvable. So we have shown that the group Gal(E/Q) is not a solvable group.
Thus P(x) is not solvable by radicals. Thus no radical extension of Q in C contains the roots of P(x), and the roots of P(x) cannot be built from elements of Q by the five operations of addition, subtraction, multiplication, division and the taking of n-th roots for various positive integers n. This does not mean that computers cannot calculate the roots of P(x) to some degree of approximation. To fifteen significant figures, the five roots of P(x) are

r1 = −0.668329831433218,
r2 = 0.788731352638918,
r3 = 1.16412943244799,
c1 = −0.642265476826847 + 1.27455269235095i,
c2 = −0.642265476826847 − 1.27455269235095i.
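As a sanity check (mine, not part of the text), one can substitute the approximations listed above back into P(x) using complex arithmetic and confirm that each gives a value very close to 0.

```python
def P(x):
    """The quintic from the example; works for real and complex x alike."""
    return 4 * x**5 - 10 * x**2 + 5

# The fifteen-significant-figure approximations quoted in the text.
approx_roots = [
    -0.668329831433218,                        # r1
    0.788731352638918,                         # r2
    1.16412943244799,                          # r3
    complex(-0.642265476826847,  1.27455269235095),   # c1
    complex(-0.642265476826847, -1.27455269235095),   # c2
]

for r in approx_roots:
    assert abs(P(r)) < 1e-9    # each listed value satisfies P(r) ~ 0
```

A further consistency check: since the coefficient of x^4 in P(x) is 0, the five roots must sum to 0, and indeed the listed values do so to within rounding error.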