
MODERN ALGEBRA
NOTES FOR
MATH 401-402
at
Binghamton University
by
Matthew G. Brin
First Edition
Department of Mathematical Sciences
Binghamton University
State University of New York
© 2011
Matthew G. Brin
All rights reserved.
Preface
This book is designed for a two-semester undergraduate course in modern algebra. It makes no pretense that it can be used for a graduate course.
The assumption is made that students are just barely familiar with rigorous proofs and that proofs by induction are things that they have seen and done a few times, but not mastered. It is assumed that the students have seen
sets to the extent of being familiar with unions and intersections, but little
else. It is assumed that students have had some very basic linear algebra. The
topics needed from linear algebra are the notions used in bases—span and linear
independence—and basic facts about solutions to systems of homogeneous linear
equations. It will be assumed that the students are familiar with basic algebraic
manipulations. Lastly, it is not assumed that the students know anything about
complex numbers, although it will be hoped that they have heard of them.
We do not follow a foundational point of view. That is, we do not start with
a blank slate and then add structures with axioms, one structure at a time,
stopping after each structure is introduced to develop its theory as much as
is needed before moving on to the next structure. In spite of this, the book
is self-contained. Ultimately it proves from scratch enough about the usual core subjects (groups, rings and fields) to develop Galois theory to the
extent of proving the existence of specific non-solvable fifth degree polynomial
equations. We do this without referring to any outside facts except one—the
intermediate value theorem from calculus. This is used in the proof of the
fundamental theorem of algebra. [Note: There is not really time to do the
Fundamental Theorem of Algebra. As of now, the FTA is taken as a black box.]
The pace is very slow. The ideal goal is to have the students learn absolutely
everything. This is not realistic, but some students will come close. The rest
should try to learn as much of the theory as they can.
Contents

Preface

I  Preliminaries

1  Tools
   1.1  Why we look at a bit of history
   1.2  The quadratic
      1.2.1  Classical solutions
      1.2.2  Before the Greeks
      1.2.3  After the Greeks
      1.2.4  More algebra from geometry
      1.2.5  Solvability by radicals
      1.2.6  On the roots and coefficients of the quadratic
   1.3  The cubic
      1.3.1  Mixing the old and the new
      1.3.2  Reducing the cubic
      1.3.3  Solving the reduced cubic
   1.4  Complex arithmetic
      1.4.1  Complex numbers and the basic operations
      1.4.2  Complex numbers as a vector space
      1.4.3  Complex numbers in Cartesian and polar coordinates
      1.4.4  Complex conjugation
      1.4.5  Powers and roots of complex numbers
      1.4.6  Roots of 1
   1.5  The cubic revisited
      1.5.1  Picking out the solutions from the formula
      1.5.2  Symmetry and Asymmetry
      1.5.3  The symmetric and the asymmetric
   1.6  The quartic (optional)
      1.6.1  Reduction
      1.6.2  The resolvent

2  Objects of study
   2.1  First looks
      2.1.1  Groups
      2.1.2  Fields
      2.1.3  Rings
      2.1.4  Homomorphisms
      2.1.5  And more
   2.2  Functions
      2.2.1  Sets
      2.2.2  Functions
      2.2.3  Function vocabulary
      2.2.4  Inverse functions
      2.2.5  Special functions
   2.3  Groups
      2.3.1  The definition
      2.3.2  Operations
      2.3.3  Examples
      2.3.4  The symmetric groups
   2.4  The integers mod k
      2.4.1  Equivalence relations
      2.4.2  Equivalence classes
      2.4.3  The groups
      2.4.4  Groups that act and groups that exist
   2.5  Rings
      2.5.1  Definitions
      2.5.2  Examples
   2.6  Fields
      2.6.1  Definitions
      2.6.2  Examples
      2.6.3  The irrationality of √2
   2.7  Properties of the ring of integers
      2.7.1  An outline
      2.7.2  Well ordering and induction
      2.7.3  The division algorithm
      2.7.4  Greatest common divisors
      2.7.5  Factorization into primes
      2.7.6  Euclid’s first theorem about primes
      2.7.7  Uniqueness of prime factorization
   2.8  The fields Zp
   2.9  Homomorphisms
      2.9.1  Complex conjugation
      2.9.2  The projection from Z to Zk

3  Theories
   3.1  Introduction
   3.2  Groups
      3.2.1  The definition
      3.2.2  First results
      3.2.3  Subgroups
      3.2.4  Homomorphisms
      3.2.5  Subgroups associated to a homomorphism
      3.2.6  Homomorphisms that are one-to-one and onto
      3.2.7  The group of automorphisms of a group
   3.3  Rings
      3.3.1  The definition
      3.3.2  First results
      3.3.3  Subrings
      3.3.4  Homomorphisms
      3.3.5  Subrings associated to homomorphisms
      3.3.6  Isomorphisms and automorphisms
   3.4  Fields
      3.4.1  The definition
      3.4.2  First results
      3.4.3  Field extensions
      3.4.4  Homomorphisms and isomorphisms
      3.4.5  Automorphisms
   3.5  On leaving Part I

II  Group Theory

4  Group actions I: groups of permutations
   4.1  Consequences of Lemma 3.2.1
   4.2  Examples
      4.2.1  Dihedral groups
      4.2.2  Stabilizers
   4.3  Conjugation
      4.3.1  Definition and basics
      4.3.2  Conjugation of permutations
      4.3.3  Conjugation of stabilizers
      4.3.4  Conjugation and cycle structure
      4.3.5  Permutations that are conjugate in Sn
      4.3.6  One more example
      4.3.7  Overview

5  Group actions II: general actions
   5.1  Definition and examples
      5.1.1  The definition
      5.1.2  Examples
   5.2  Stabilizers
   5.3  Orbits and fixed points
   5.4  Cosets and counting arguments
      5.4.1  Cosets
      5.4.2  Lagrange’s theorem
      5.4.3  The index of a subgroup
      5.4.4  Sizes of orbits
      5.4.5  Cauchy’s theorem

6  Subgroups
   6.1  Subgroup generated by a set of elements
      6.1.1  Strategy
      6.1.2  The strategy applied
      6.1.3  Generators

7  Quotients and homomorphic images
   7.1  The outline
      7.1.1  On the groups Z and Zk
      7.1.2  The new outline
   7.2  Cosets
      7.2.1  Identifying the equivalence classes with cosets
      7.2.2  Cosets of normal subgroups
   7.3  The construction
      7.3.1  The multiplication
      7.3.2  The projection homomorphism
      7.3.3  The first isomorphism theorem
      7.3.4  Abelian groups and products
      7.3.5  Examples
      7.3.6  The correspondence theorem
      7.3.7  Another isomorphism theorem

8  Classes of groups
   8.1  Abelian groups
      8.1.1  Subgroups of abelian groups
      8.1.2  Quotients of abelian groups
   8.2  Solvable groups
      8.2.1  The definition
      8.2.2  Subgroups of solvable groups
      8.2.3  Quotients of solvable groups
      8.2.4  Finite solvable groups

9  Permutation groups
   9.1  Odd and even permutations
      9.1.1  Crossing number of a permutation
   9.2  The alternating groups
      9.2.1  The A5 menagerie
      9.2.2  Getting a three-cycle
      9.2.3  Getting all of A5
   9.3  Showing a subgroup is all of Sn

III  Field theory

10  Field basics
   10.1  Introductory remarks
   10.2  Review
   10.3  Fixed fields of automorphisms
   10.4  Automorphisms and polynomials
   10.5  On the degree of an extension
      10.5.1  Comparing degree with index
      10.5.2  Properties of the degree
   10.6  The characteristic of a field
      10.6.1  Definition and properties
      10.6.2  A minimal field of each characteristic
      10.6.3  Consequences of Theorem 10.6.1

11  Polynomials
   11.1  Motivation: the construction of Zp from Z
   11.2  Rings
      11.2.1  Ring definitions
   11.3  Polynomials
      11.3.1  Introductory remarks on polynomials
      11.3.2  Polynomial basics
      11.3.3  Degree
   11.4  The division algorithm for polynomials
      11.4.1  Roots and linear factors
   11.5  Greatest common divisors and consequences
      11.5.1  Divisors and units
      11.5.2  GCD of polynomials
      11.5.3  Irreducible polynomials
   11.6  Uniqueness of factorization
   11.7  Roots of polynomials
      11.7.1  Counting roots
      11.7.2  Polynomials as functions
      11.7.3  Automorphisms and roots
   11.8  Derivatives and multiplicities of roots
      11.8.1  The derivative
      11.8.2  Multiplicities of roots
   11.9  Factoring polynomials over the reals

12  Constructing field extensions
   12.1  Smallest extensions
   12.2  Algebraic and transcendental elements
   12.3  Extension by an algebraic element
      12.3.1  The construction
      12.3.2  The structure of F[x]/P(x)
      12.3.3  A result about automorphisms
      12.3.4  Examples

13  Multiple extensions
   13.1  Multiple extensions
   13.2  Algebraic extensions
   13.3  Automorphisms
      13.3.1  Relativizing Proposition 12.3.5
      13.3.2  Applying the relative proposition
   13.4  Splitting fields
      13.4.1  Existence
      13.4.2  Uniqueness
      13.4.3  An application
   13.5  Fixed fields
      13.5.1  Independence of automorphisms
      13.5.2  Sizes of fixed fields
   13.6  A criterion for irreducibility
      13.6.1  Primitive polynomials and content
      13.6.2  The Eisenstein Irreducibility Criterion
      13.6.3  Applications of the irreducibility criterion

14  Galois theory basics
   14.1  Separability
   14.2  The primitive element theorem
   14.3  Galois extensions
      14.3.1  Finite, separable, normal extensions
      14.3.2  Splitting fields
      14.3.3  Characterizations of Galois extensions
   14.4  The fundamental theorem of Galois Theory
      14.4.1  Some permutation facts
      14.4.2  The Fundamental Theorem

15  Galois theory in C
   15.1  Radical extensions
      15.1.1  An outline
   15.2  Improving radical extensions
      15.2.1  The first improvement
      15.2.2  The second improvement
   15.3  On the improved extension
   15.4  An example
Part I
Preliminaries

Chapter 1
Tools

1.1 Why we look at a bit of history
Polynomial equations
Modern algebra grew out of classical algebra. Classical algebra is learned today
in middle and high school, and it developed irregularly over thousands of years.
The earliest recorded evidence that people thought about algebraic questions at
all dates from around 3000 BC.
By 1600 AD most of the tools currently taught in middle and high school
were in place. They were well understood and in common practice within a few
decades of 1600, and it is no accident that calculus, a triumphant marriage of
geometry and algebra, was invented when it was (around 1665 AD).
Standard topics of classical algebra include solutions to polynomial equations
such as the quadratic equation
ax^2 + bx + c = 0,
the cubic equation
ax^3 + bx^2 + cx + d = 0
and the quartic (also called the biquadratic) equation
ax^4 + bx^3 + cx^2 + dx + e = 0.
Solutions to the quadratic equation were known before 1000 BC, and solutions to the cubic and quartic equations were first published in 1545 AD.
The quintic equation
ax^5 + bx^4 + cx^3 + dx^2 + ex + f = 0
resisted all attacks until the early 1800s when it was finally understood that
there was no formula of the expected type that gave the solutions. (We define
“expected type” in Section 1.2.5.) In addition, techniques were developed by
Galois in 1832 and refined by others over the next 100 years that detected which
particular quintics (with coefficients replaced by specific numbers) had solutions
in the expected form and which did not. In fact, Galois’ techniques apply to
polynomial equations of any degree, not just to degree five.
A bit on what will be covered
These notes will cover those aspects of modern algebra (whose development
started around 1800) sufficient to describe the theory of Galois and some of its
applications to the solutions of polynomials. This gives names to the topics in
these notes, but gives no details. Still giving no details, we point out that the
advances of the early 1800s were based on the introduction of new tools. The
absence of these tools between 1600 and 1800 prevented any real progress on
the quintic.
We have arrived at the theme (given by the title) of this chapter: tools. In
fact, the teaching of new tools is a major goal of these notes.
Our fascination with tools is one reason for learning a bit of mathematical
history. By seeing how something familiar (solving the quadratic) was done
centuries ago, we can see how the absence of certain tools affects the difficulty
of a problem.
We will also look at the less familiar cubic and quartic. We do this for several
reasons. First, they were done several decades before the full development of
classical algebra and we can again point out some difficulties that a lack of
tools created. Second, we will get hints on the nature of the new tools that
were developed around 1800 and will be able to summarize the topics that will
follow. Third, we will have opportunities to cover some basic material (complex
numbers, for example) that will be used throughout the rest of these notes.
Comments
The problem of deciding which polynomial equations have solutions in an expected form may not seem all that important. In fact, while Galois’ techniques
make the decision possible in theory, they do not make the decision easy in
practice. However, the tools developed by Galois are much more important
than their application to this particular problem. An advanced course on the
theory will cover many other topics. Still, the problem is important in that its
resolution led to an important part of mathematics.
The arrangement of the material in these notes was chosen for several reasons. The first is that the material constitutes a reasonable introduction to
modern algebra. The second is that the ultimate topic of these notes, Galois
theory, is one of the truly astounding developments of modern mathematics.
The third is that the linking of polynomial equations to the Galois theory shows
how modern algebra grew out of and connects to classical algebra.
In spite of the discussion of the last few paragraphs, these notes do not follow
a historical development. History will be mentioned, but will not be followed
blindly. Ideas will be placed where they make the most sense, and not always
in the order in which they were discovered. Even in this chapter, where most of
the historical references will occur, the history given is so incomplete as to be
almost non-existent.
Modern algebra has developed far beyond Galois theory. If you wish to learn
more, you will have to take more advanced courses—perhaps in graduate school.
In later versions of these notes, extra topics may be included.
1.2 The quadratic
We now look at a familiar topic from classical algebra: solving quadratic equations.
We know how to solve the equation ax^2 + bx + c = 0. In fact we know how to solve ax^2 + bx = c and ax^2 = bx + c since we can rearrange the second and third equations to look like the first. When equations such as this were first considered, negative quantities were not accepted. The first form would not have been considered since positive quantities cannot sum to zero, and the second and third forms would have been considered different kinds of problems. Since we want to see how these equations were solved some thousands of years ago, we have to consider forms other than the one that we consider standard. This causes temporary problems in terminology. The quantity ax^2 + bx + c is a polynomial of degree 2 in the variable x, and we have the convenient term “root” of the polynomial, which refers to any value that makes the polynomial equal to zero when that value is substituted for x. So saying we are looking for roots of the polynomial ax^2 + bx + c is the same as saying we are looking for solutions to the equation ax^2 + bx + c = 0.
The word “root” does not cooperate with ax^2 + bx = c. So we have to use
the word “solution” when we consider such an equation. Much of this course
will involve a search for roots of polynomials, and so the word “root” will be a
large part of our language. However, in this chapter, it will have to share space
with the word “solution.”
To make sure that we know what we are talking about, we will quickly review
some terms. They will be repeated more carefully later.
Some definitions
A polynomial in one variable (say x) is a linear combination of powers of x where
we will have more to say about the coefficients and where they come from later.
An example is
3x^5 − 7x^3 + 2x − 5.
The summands are the terms and the degree of a term is the power of the variable in the term. In the example, −7x^3 is a term and it has degree 3. The constant term (−5 in the example) has degree 0 since x^0 (usually omitted when writing out the polynomial) is part of the term. The degree of the polynomial is
the highest degree of a term with a non-zero coefficient. That non-zero coefficient in that term of highest degree in the polynomial is the leading coefficient.
The polynomial above has degree 5 and leading coefficient 3.
As mentioned above, a root of a polynomial with variable x is a value of
x that makes the value of the whole polynomial equal to zero. The quadratic
formula gives roots of polynomials of degree 2. We now take a look at that
formula.
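These definitions translate directly into computation. As an illustration (not from the notes; the coefficient-list representation is our own convention), a short Python sketch:

```python
# A polynomial is stored as a list of coefficients, lowest degree first:
# 3x^5 - 7x^3 + 2x - 5  becomes  [-5, 2, 0, -7, 0, 3]
p = [-5, 2, 0, -7, 0, 3]

def degree(coeffs):
    """The highest power of x having a non-zero coefficient."""
    return max(i for i, c in enumerate(coeffs) if c != 0)

def leading_coefficient(coeffs):
    """The coefficient of the term of highest degree."""
    return coeffs[degree(coeffs)]

def evaluate(coeffs, x):
    """The value of the polynomial at x; a root makes this value zero."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

print(degree(p), leading_coefficient(p))  # 5 3
print(evaluate([-40, 6, 1], 4))           # 0, so x = 4 is a root of x^2 + 6x - 40
```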
1.2.1 Classical solutions
We start with the more familiar: the way the formulas for the roots of quadratic
polynomials have been derived since the 1600s.
If three numbers a, b and c are given with a ≠ 0, then the (as many as two) values x1 and x2 for x that make

    ax^2 + bx + c = 0                                              (1.1)

true are given by the formulas

    x1 = (−b + √(b^2 − 4ac)) / (2a),
    x2 = (−b − √(b^2 − 4ac)) / (2a).                               (1.2)

We require a ≠ 0 since having a = 0 would give us a polynomial that is not of degree 2. Also, the formulas in (1.2) would make no sense if a = 0.

The validity of (1.2) can be checked by substituting x1 and x2 into (1.1) and simplifying. For example, direct multiplication computes (x1)^2 as

    (x1)^2 = (b^2 − 2b√(b^2 − 4ac) + (b^2 − 4ac)) / (4a^2)
           = (b^2 − b√(b^2 − 4ac) − 2ac) / (2a^2)

so that

    a(x1)^2 + bx1 + c = (b^2 − b√(b^2 − 4ac) − 2ac) / (2a) + (−b^2 + b√(b^2 − 4ac)) / (2a) + c
                      = (b^2 − b√(b^2 − 4ac) − 2ac − b^2 + b√(b^2 − 4ac) + 2ac) / (2a)
                      = 0.
More importantly, we can explain how the formulas in (1.2) are derived. The standard method involves completing the square. If we subtract c from both sides of (1.1), and then divide both sides of the result by a, we get

    x^2 + (b/a)x = −c/a.                                           (1.3)

The technique of completing the square has us add (b/2a)^2 to both sides of (1.3) to get

    x^2 + (b/a)x + b^2/(4a^2) = b^2/(4a^2) − c/a

which simplifies to

    (x + b/(2a))^2 = (b^2 − 4ac) / (4a^2).                         (1.4)

The right side of (1.4) has (as many as two) square roots and we can write

    x + b/(2a) = ±√((b^2 − 4ac) / (4a^2))

from which we get

    x = −b/(2a) ± √((b^2 − 4ac) / (4a^2)) = (−b ± √(b^2 − 4ac)) / (2a).
We can also justify the steps for completing the square. Using the distributive laws three times and the commutative law once, we can write

    (x + k)^2 = (x + k)(x + k)
              = (x + k)x + (x + k)k
              = x^2 + kx + xk + k^2
              = x^2 + 2kx + k^2.

This tells us that if we are given an expression x^2 + hx, we can think of h as 2k and k as h/2. So if we add the square of h/2 to x^2 + hx, we get the square of (x + h/2).
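The formulas in (1.2) can also be checked numerically. The following sketch is ours, not part of the notes; it uses Python's cmath so the square root is defined even when b^2 − 4ac is negative:

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0, computed by the formulas in (1.2)."""
    if a == 0:
        raise ValueError("need a != 0 for a degree 2 polynomial")
    d = cmath.sqrt(b * b - 4 * a * c)  # square root of b^2 - 4ac
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

x1, x2 = quadratic_roots(1, 6, -40)  # x^2 + 6x - 40 = 0
print(x1, x2)  # (4+0j) (-10+0j)
# Substituting back reproduces 0, as the computation above showed:
print(x1 * x1 + 6 * x1 - 40)  # 0j
```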
We can make a partial list of the tools we used in our analysis of (1.1). We
have already noted our use of such laws of arithmetic as the distributive and
commutative laws, and we did not bother to note our use of other laws such
as the associative law. Our calculation of (x + k)^2 was used implicitly in our calculation of (x1)^2. We used laws of algebra such as “equals added to equals
give equal results,” we have used laws involving the addition of fractions, we
have used our knowledge of square roots, and most importantly, we have used
our skill with the manipulation of symbols. The manipulated symbols represent
both numerical quantities as well as operations such as addition, subtraction,
multiplication, division and the taking of square roots.
In other words, we have used standard tools of classical algebra. In the next
two sections we will give some of the details of how quadratics were handled
before most of these tools were worked out.
1.2.2 Before the Greeks
The oldest mathematical texts we know now date from about 2000 BC to 1000
BC (give or take several hundred years) depending on the geographical area.
These come from Egypt, Mesopotamia, India and China. Some of these texts
include solutions to the quadratic equation.
There are commonalities among these texts. Most include tables of calculations, and all include solved problems with specific numbers. A problem might
read “A square and six of its sides equals 40.” In modern notation this would read x^2 + 6x = 40. The solution would read “Take 40 and the square of half of six to get 49. Take the square root of 49 to get 7. Now remove from 7 half of six to get 4.” In modern notation this would read

    √((6/2)^2 + 40) − 6/2 = √49 − 3 = 4.
The other root −10 would be ignored because negative quantities would not
have been accepted.
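The verbal recipe can be followed step by step in code. A sketch (ours) of the general problem x^2 + hx = c, with h and c positive as the ancient texts required:

```python
import math

def ancient_recipe(h, c):
    """Solve x^2 + h*x = c by the steps as the ancient text words them."""
    half = h / 2              # "half of six" gives 3
    square = c + half ** 2    # "40 and the square of half of six" gives 49
    root = math.sqrt(square)  # "the square root of 49" gives 7
    return root - half        # "remove from 7 half of six" gives 4

print(ancient_recipe(6, 40))  # 4.0
```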
There would be no proof of the correctness of the procedure, although with
the wisdom of hindsight one can see in some cases from the wording and arrangement of steps that a geometric justification lay behind the solution. In the
next section, we will look at later writings where the geometric justification is
written out.
A more common version of a quadratic problem might give the area and
perimeter of a rectangle. This is equivalent to the information xy = a and
x + y = b which reduces by substitution (a modern technique) to the quadratic
x(b − x) = a or x^2 − bx + a = 0. Once again, a specific problem would be
given with specific numbers for a and b and a solution would simply tell how to
manipulate a and b to get the answer. We will look again at this form of the
problem before we leave the topic of quadratic equations.
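To make the reduction concrete, here is a sketch (ours; the specific numbers are illustrative) that recovers the sides of a rectangle from xy = a and x + y = b by solving x^2 − bx + a = 0:

```python
import math

def sides(a, b):
    """Sides x, y with x*y = a and x + y = b, via the quadratic x^2 - b*x + a = 0."""
    disc = b * b - 4 * a
    if disc < 0:
        raise ValueError("no such rectangle: the discriminant is negative")
    r = math.sqrt(disc)
    return (b - r) / 2, (b + r) / 2  # the two roots give the two sides

print(sides(40, 13))  # (5.0, 8.0): area 40 with x + y = 13 gives a 5-by-8 rectangle
```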
As mentioned, the absence of negative numbers would have an equation such as x^2 = 6x + 40 considered as a different type of problem from x^2 + 6x = 40. Thus
the order of operations and the wording of the solution would read differently
for each of the two equations.
In spite of the restrictions described above, some texts had mathematics that
would be regarded today as quite advanced and calculations that were extremely
exact. One text from before 1600 BC has an approximation to the square root
of two that is off by only 0.0000006.[1]
1.2.3 After the Greeks
Around 500 BC, the first Greek texts on mathematics start to appear. The
Greek texts brought in significant changes. The most obvious is that not only
were proofs included, they were emphasized. As part of their structure of a
mathematical world built on proofs, they developed the axiomatic system. That
is, a small number of assumed truths were listed upon which all succeeding
arguments and conclusions were based.
[1] The study of ancient mathematical texts is a swiftly developing and changing field. Some areas produce hundreds of source texts in a variety of notations and languages, and interpretations of these texts change with time as they are more thoroughly studied, and as more information gets integrated from knowledge of the history and culture of the relevant times. I have tried to limit myself to statements that are not controversial, and have stayed away from questions as to depth of understanding, flow of ideas over time and geography, credit, originality, or worth of any of the mathematics contained in the ancient texts.
1.2. THE QUADRATIC
However, this system was mostly applied to geometry and number theory,
and much less so to algebra. Algebraic manipulations were still done on a verbal
basis, with no rules for the manipulation of symbols representing quantities, and
no symbols for arithmetic operations.
To see the first recorded attempt to bring basic rules into algebra, we shift
our attention to the Middle East. Located geographically between all the areas
that we have mentioned so far, and with regimes that from time to time strongly
supported the development and preservation of the sciences, the Middle East
(centered around Baghdad) brought together many of the contributions from
different locations, and made many contributions of its own.
We are not giving a detailed history, and so we look at just one book. The
book appeared around 830 AD and its title has been variously translated over
the years. It has been given the title The Compendious Book on Calculation by
Completion and Balancing and also the shorter title The Algebra of Mohammed
Ben Musa [4]. The author’s full name has been given as Abu Abdallah Muhammad ibn Musa al-Khwarizmi where the last part “al-Khwarizmi” is generally
taken to be where the author was from. In spite of this, the author is usually
referred to as “al-Khwarizmi.” It is thought that he was born in Persia (now
Iran) and lived his adult life in Baghdad (now in Iraq).
The importance of al-Khwarizmi’s book is based on the fact that it is the
earliest known text to combine all of the following features about an algebraic
topic (the solution of quadratic equations):
1. a systematic listing of all forms of the problem,
2. a small set of operations from which a reduction to one of the forms could
be obtained, and
3. for each form, a solution and a proof that the solution is correct.
Second, the book introduced these notions to the west (that is, to Europe).²
Since the standard of proof of the time was geometric proof in the Greek
tradition, the proofs are geometric. We discuss the above numbered items.
The six forms
The forms of the quadratic listed in [4] are:

    ax² = bx,
    ax² = c,
    bx = c,
    ax² + bx = c,        (1.5)
    ax² + c = bx,
    ax² = bx + c.
Given that all the numbers are to be positive, it is seen that this covers all
possibilities. As noted, without negative quantities, the form ax² + bx + c = 0 is
² Another book by al-Khwarizmi, on arithmetic, introduced to Europe the numbers we
call arabic numbers. The name “arabic numbers” comes from the fact that Europe learned
of them from al-Khwarizmi’s book in spite of the fact that the book itself states that the
numbers come from India.
CHAPTER 1. TOOLS
not possible. Also note that even the case where a = 0 is covered and that the
only restrictions are that the unknown must appear and that there be at least
two terms.
The operations
The operations for reduction to one of the forms are of two types.
The first operation, called al-Jabr, is the moving of a subtracted quantity
to the other side of the equation as an added quantity by adding the quantity
to both sides. Al-Jabr means “the completion,” usually of something defective.
Thus x² − 3x = 2 becomes x² = 3x + 2 by this operation. The object being completed is x², which is less than complete before the operation because of the subtraction of the quantity 3x. A corruption of “al-Jabr” gives us the word algebra.³
The second operation, called al-Mukabalah, (English spellings vary greatly)
means “the balancing” or “the reckoning.” It is used to cancel like terms that
might be on opposite sides of the equality. Thus 3x² + 6 = x² + 12 is balanced by two applications of al-Mukabalah to become 2x² = 6.
Using the operations of al-Jabr and al-Mukabalah, al-Khwarizmi argued
that any combination of squares, first powers and constants could be reduced
to one of the six forms of the quadratic. It was taken for granted that like
quantities on the same side could be combined. Recall that all equations were
expressed in words and it was not questioned that “six squares and two squares”
is the same as “eight squares.”
The solutions and proofs
The three equations in (1.5) with only two terms are trivial and we will not
discuss those. We give the treatment of one of the remaining three equations
in (1.5) as was done in [4], and leave the other two as exercises. It is likely
that the solutions and arguments below were known long before 830 AD. The
importance of al-Khwarizmi’s book is its completeness, its building from a small
set of rules, and its inclusion of justifications for all parts of the solutions.
To explain the solution of x² + px = q, the following figure is given in [4].

[Figure (1.6): a large square made up of the square C in the center, a rectangle B on each of the four sides of C, and a small square A at each of the four corners.]        (1.6)

³ A corruption of the name “al-Khwarizmi” gives us the word algorithm.
1.2. THE QUADRATIC
11
The following explanation is given of the figure. We will use modern symbols. Al-Khwarizmi used words.

The square C has the unknown x as the length of its sides.

The rectangles B have x as the length of one side and p/4 as the length of the other side. Thus the sum of the areas of the four rectangles labeled B is px.

The small squares A have p/4 as the length of their sides. The sum of the areas of the four squares labeled A is p²/4.

It is given that x² + px = q, but x² + px is the combined area of C and all the rectangles labeled B. Thus the area of the entire figure is q + p²/4.

The side of the full figure is x + p/2. Thus we know that

    x + p/2 = √(q + p²/4).
Finally we get

    x = √(q + p²/4) − p/2 = (−p + √(p² + 4q))/2

which agrees with what we know from the quadratic formula applied to

    x² + px − q = 0.
Note that since p is a positive number, −p is negative and we would end up with a negative solution (not allowed) unless a larger positive number is added to it. Since √(p² + 4q) > p (recall that q is also positive), we have a positive solution. Note further that we cannot use the negative square root since then a negative solution would result. Thus there is only one solution given in [4] for this case.
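Stated as a modern recipe, the figure says that the positive root of x² + px = q is √(q + p²/4) − p/2. A short Python sketch of the rule follows, tried on the classical example x² + 10x = 39 (not discussed above; its positive root is 3).

```python
import math

def alkhwarizmi_root(p, q):
    """Positive root of x^2 + p x = q (p, q > 0), following the geometric
    rule: complete the square to (x + p/2)^2 = q + p^2/4."""
    return math.sqrt(q + p * p / 4) - p / 2

x = alkhwarizmi_root(10, 39)   # the classical example x^2 + 10x = 39
print(x)                       # 3.0
print(x * x + 10 * x)          # 39.0, confirming the root
```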
Exercises (1)
The two problems in this exercise set cover the last two forms in (1.5).
In the following all quantities are to be assumed positive.
1. The following figure in [4] is used to analyze x² + q = px.

[Figure (1.7): a diagram with points labeled a, b, c, d, e, f, g, h, i, j, k, l; the relevant lengths are described below.]
The square cdhg has sides whose lengths are the unknown x.
The length ad is p.
The point b is halfway from a to d.
The lengths lk, lg, and lf are all equal.
What represents q?
There are two positive solutions to x2 + q = px. Why? Show that the
diagram justifies the smaller of the two.
In order to have a real solution, the fact q < (p/2)² must be built into the figure. Show how this is the case.
The book [4] gives no diagram to justify the larger solution, but points out
one length in the diagram above that is the larger solution. Which is it?
Can you draw a diagram that justifies the larger solution?
2. The following figure in [4] is used to analyze x² = px + q.

[Figure (1.8): a diagram with points labeled a, b, c, d, e, f, g, i, j, k, l, m; the relevant lengths are described below.]
The square abcd has sides whose lengths are the unknown x.
The length bi is p.
The point l is halfway between b and i.
The lengths of gi and il are equal.
The lengths of gf and ic are equal.
What represents q?
There is only one positive solution to x2 = px + q. Why? Show that the
diagram justifies the solution.
1.2.4 More algebra from geometry
In [4], the following alternate figure is given to justify the case x² + px = q.

[Figure: a square of side x + p/2, made up of the square A with side x, the rectangles B and C each with sides x and p/2, and a small square with side p/2 in the lower left corner.]
In the figure, areas C and B sum to px, and area A is x². One can think of starting with the small square on the lower left with sides p/2, and expanding the square A until areas A, B and C add up to q. This makes

    (x + p/2)² = q + p²/4

which agrees with the solution given before for x² + px = q.
More interesting about the figure above is that it is a pictorial representation of the square of a binomial. If we relabel the sides, with x and y in place of x and p/2, then we immediately get that

    (x + y)² = x² + 2xy + y².        (1.9)
This illustrates the power of geometric figures to capture what we know from
simple algebraic manipulations. However, it must be remembered that without
algebraic notation all algebraic discussion must be done with words. In [4], the
explanations that accompany (1.7) and (1.8) occupy two pages each.
Exercises (2)
The following exercises about the quadratic use modern techniques. The first
two exercises are relevant to techniques that will be used in our discussion of
the cubic and quartic.
1. The formula (1.4) makes it appear that we are solving for the quantity x + b/(2a). If we let y = x + b/(2a), then x = y − b/(2a). What happens when we substitute this value for x into (1.1)? Assuming you did the calculation right and saw the right phenomenon, can you explain why this particular substitution works and another (such as x = y − b) does not?
2. If you solve the equation obtained by the substitution above for y, how would you recover x, the solution to the original equation?
3. The standard solutions to ax2 + bx + c = 0 given by the quadratic formula
(1.2) assume a ≠ 0. This is sensible since if a = 0, then the equation
is not quadratic. However, there is a form of the solution that allows
for a = 0. What happens when you “rationalize the numerator” of the
solutions given by (1.2)? (This is exactly the operation done in Calculus
I to derive the derivative of the square root function from the definition
of the derivative.) Do this for both solutions in (1.2) since they have
important differences. What happens when a = 0? What happens when
c = 0?
1.2.5 Solvability by radicals
The form of the solutions in (1.2) indicates what we meant in Section 1.1 by “expected type.” In (1.2), the coefficients of (1.1) and a few constants (specifically
2 and 4) are combined using the operations of addition, subtraction, multiplication, division and the taking of square roots.
In Section 1.3 we consider the cubic, and will also have to take cube roots.
All of these observations lead to the following definition. A polynomial is
said to be solvable by radicals if its roots can be expressed as formulas in the
coefficients of the polynomial and a number of constants using the operations of
addition, subtraction, multiplication, division and the taking of n-th roots for
various positive integers n.
We will discuss what constants are allowed later in these notes. For now,
we will just say that the same constants (and thus the same formula) must be
used no matter what the coefficients are. It would not be reasonable to allow
the constants (and the formula) to change with the polynomial.
The fifth operation, that of taking n-th roots, has a very different status
from the four operations of addition, subtraction, multiplication and division.
This theme will unfold slowly. Here we will point out that a more accurate
description of the fifth operation is that it extracts solutions to equations of the form xⁿ = c. When c ≠ 0 and n = 2, we know that there are two solutions. For arbitrary n, we will see shortly that there are n solutions whenever c ≠ 0.
On the other hand, the operation of addition always gives a unique result.
This is also true of subtraction, multiplication and division. (We always refuse
to divide by zero, so we never have to deal with the fact that zero divided by
zero might be said to give infinitely many results.)
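Although complex numbers are only developed in Section 1.4, the claim about xⁿ = c can already be previewed numerically. The Python sketch below uses the built-in cmath module for complex arithmetic; nth_roots is an invented helper name.

```python
import cmath

def nth_roots(c, n):
    """All n complex solutions of x**n = c (c != 0), via the polar form
    of c. This previews material from Section 1.4."""
    r = abs(c) ** (1.0 / n)        # real n-th root of the modulus of c
    theta = cmath.phase(c)         # the argument (angle) of c
    return [r * cmath.exp(1j * (theta + 2 * cmath.pi * k) / n)
            for k in range(n)]

roots = nth_roots(8, 3)            # the three complex cube roots of 8
print(len(roots))                  # 3
for z in roots:
    print(abs(z ** 3 - 8) < 1e-9)  # each satisfies x**3 = 8, up to rounding
```

Only one of the three roots (namely 2) is real; the other two come in a conjugate pair.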
1.2.6 On the roots and coefficients of the quadratic
Let us make some observations using the techniques available since 1600 AD.
It is somewhat simpler to deal with ax² + bx + c = 0 if a = 1. If we have a true quadratic, then a ≠ 0 and we can divide both sides by a. So it is not
a serious restriction to require a = 1. In general, a polynomial with leading
coefficient equal to 1 is called a monic polynomial. We have just argued that
finding roots for any polynomial can be done if we know how to find roots
for monic polynomials. We will see later that monic polynomials have other
conveniences.
Since the 1600s, we have known that if r₁ and r₂ are the roots of x² + bx + c, then

    x² + bx + c = (x − r₁)(x − r₂) = x² − (r₁ + r₂)x + r₁r₂.        (1.10)

So we must have r₁ + r₂ = −b and r₁r₂ = c.⁴ This brings us back to one topic discussed in Section 1.2.2.
⁴ The first equality in (1.10) and the use of the word “must” assume facts about polynomials
that we have not yet proven. These facts will be proven in due time (Section 11.6), and when
we do we will point out that the discussion here has been fully justified.
If we are told that two unknowns have a known sum and a known product, then the information gleaned from (1.10) lets us write down a quadratic whose roots are the two unknowns. Specifically, if r₁r₂ = p and r₁ + r₂ = s, then r₁ and r₂ are the roots of x² − sx + p. This will be exploited in the following section on cubics.
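As a Python sketch (the sample values s = 5 and p = 6 are made up; cmath.sqrt is used so that negative quantities under the square root are allowed):

```python
import cmath

def from_sum_and_product(s, p):
    """Two numbers with sum s and product p: the roots of x^2 - sx + p = 0,
    by the quadratic formula. cmath.sqrt permits negative discriminants."""
    d = cmath.sqrt(s * s - 4 * p)
    return (s + d) / 2, (s - d) / 2

r1, r2 = from_sum_and_product(5, 6)
print(r1, r2)            # 3 and 2, as complex numbers with zero imaginary part
print(r1 + r2, r1 * r2)  # recovers the sum 5 and the product 6
```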
The fact that the coefficient of x is the negative of the sum of the roots and the constant term is the product of the roots can be verified directly (assuming a = 1) from the solutions given in (1.2).
There is a two-way relationship between the coefficients of a polynomial and the roots of the polynomial. In one direction, the coefficients clearly control the roots since the roots must make the value of the polynomial equal to zero. Understanding this direction is the goal of finding the roots. The formulas in
(1.2) give this direction for the quadratic, and we will see formulas (or at least
how to get them) for the cubic and quartic.
The other direction is not mysterious at all. We see that for the quadratic
it is quite simple. One coefficient is obtained by simple addition and negation,
and the other by multiplication. We put off discussing this further until we have
analyzed the cubic. At that point, we will have more information available to
discuss.
Exercises (3)
1. Verify that the roots as given in (1.2) (when a = 1) multiply to the constant term and add to the negative of the coefficient of x in (1.1).
2. Solve each of the following for x and y using the technique discussed above.
Using any other technique avoids learning the point of this discussion. Of
course, you should be willing to write down answers that are negative
and that involve the square roots of negative numbers. Square roots of
negative numbers will be discussed more thoroughly in a later section.
(a) s = 10, p = 2.
(b) s = 2, p = 10.
(c) s = 10, p = −2.
(d) s = −10, p = 2.
(e) s = −10, p = −2.

1.3 The cubic
The full solution to the cubic equation appeared in Girolamo Cardano’s book
[2] in 1545. Solutions of very special cases with carefully arranged coefficients
had been worked out long before. A non-numerical, graphical solution to the
cubic by the Persian poet and mathematician Omar Khayyám⁵ (which involved
drawing a circle and a hyperbola and measuring a coordinate of an intersection
point) dates from the eleventh century.
As with the quadratic, non-acceptance of negative numbers required that
the general cubic be treated in a number of cases. Regarding cases with only
two terms as trivial, and cases with no cube or no constant term as reducible
to lower degree problems, one is left with thirteen different cases.
    x³ + ax² = c,
    x³ + bx = c,
    x³ + c = bx,
    x³ + c = ax²,
    x³ = bx + c,
    x³ = ax² + c,
    x³ + ax² + bx = c,        (1.11)
    x³ + ax² + c = bx,
    x³ + bx + c = ax²,
    x³ + ax² = bx + c,
    x³ + bx = ax² + c,
    x³ + c = ax² + bx,
    x³ = ax² + bx + c.
Cardano learned of the solution to one or two of the thirteen cases from
another mathematician, worked out the remaining cases himself and published
the full account in [2].⁶
In [2], each of the thirteen cases in (1.11) is given its own chapter, most
running to two or three pages, and a few, with extra examples and alternate
approaches, running to quite a few more.
1.3.1 Mixing the old and the new
We will not try to give an exact account of how cubic equations were solved in
[2]. First, the translations available have already modernized the arguments to
some extent. Second, the arguments will take too long to work out completely
in two different ways. However, we will give one idea of the restrictions under
which Cardano worked.
Section 1.2.4 showed how the square of a binomial could be understood geometrically. Relevant to the solution of the cubic will be the cube of a binomial.
The following figure illustrates how (x + y)3 can be understood.
⁵ There is speculation that the eleventh-century Persian poet Omar Khayyám and the eleventh-century Persian mathematician Omar Khayyám were two different people.
⁶ The publication of [2] started a bitter, ten-year feud between Cardano and Niccolò Tartaglia, who had given Cardano the solution to a special case under a promise that Cardano would not publish the solution until Tartaglia published his solutions first. The feud went on in spite of the fact that Cardano named Tartaglia in [2] as the source of Cardano’s original knowledge for one of the cases. Tartaglia was not the first to solve a special case. Some thirty years earlier the case x³ + bx = c had been worked out by Scipione del Ferro but never published. It is said that Cardano published [2] after learning about del Ferro’s results because he then knew that Tartaglia was not the first to solve a special case, and because, after some five years, Tartaglia had still not published his own solutions. The feud illustrates the importance that mathematics had in certain circles of society in the middle of the Renaissance.
[Figure (1.12): a cube with edges of length x + y; each edge is divided into segments of lengths x and y, cutting the cube into eight rectangular pieces.]        (1.12)
The full cube shown is made up of eight pieces. There is a small cube of volume y³ in the front, upper, right corner, and a larger cube of volume x³ in the back, lower, left corner. There are three “bars” of volume xy² that touch the small cube of volume y³. There are three “slabs” of volume x²y that touch the larger cube of volume x³. The full figure has volume (x + y)³, so we see that

    (x + y)³ = x³ + 3x²y + 3xy² + y³.        (1.13)
We will shortly make use of this equality.
Later we will use

    (x + y)³ = x³ + 3xy(x + y) + y³        (1.14)

which we can obtain from (1.13) very easily with our knowledge of the distributive law. However, in 1545 the equality (1.14) was obtained by modifying the figure (1.12). If each “bar” in (1.12) is combined with one of the three “slabs” in (1.12), then the following figure results.
[Figure (1.15): the cube of figure (1.12) redrawn so that each “bar” has been joined to a “slab,” leaving the cubes of volumes x³ and y³ and three larger slabs.]        (1.15)
The larger “slabs” in (1.15) now have volume xy(x + y) each, and the full figure
verifies (1.14).
This is as far as we will go in trying to reproduce the efforts of the sixteenth
century in dealing with algebraic quantities geometrically. We will use (1.13)
and (1.14) in the arguments that follow, but will use all the algebra that we
know now in their application. In particular, we will not shy away from negative
numbers, or from square roots of negative numbers.
1.3.2 Reducing the cubic
The general form of the cubic equation is

    ax³ + bx² + cx + d = 0.

If a = 0, then we really do not have a cubic. So we assume a ≠ 0. This lets us divide both sides by a, which gives us a monic cubic with the same solutions. So from now on we work with equations of the form

    x³ + px² + qx + r = 0.        (1.16)
We now modify a technique introduced in the problems in Section 1.2.4.
If we add the right constant to x and call the result y, we might end up with
a simpler equation. We can figure out exactly what to add. If we let y = x + k,
then x = y − k. We can substitute this in for x in (1.16) and then apply (1.13)
and (1.9). This gives
    x³ + px² + qx + r
      = (y − k)³ + p(y − k)² + q(y − k) + r
      = (y³ − 3y²k + 3yk² − k³) + p(y² − 2yk + k²) + q(y − k) + r
      = y³ + (p − 3k)y² + (q + 3k² − 2pk)y + (r − k³ + pk² − qk).
Thus we see that if p = 3k, or k = p/3, then the equation in y will have no term with y². If we set k = p/3, then the last expression above becomes

    y³ + (p − 3(p/3))y² + (q + 3(p/3)² − 2p(p/3))y + (r − (p/3)³ + p(p/3)² − q(p/3))
      = y³ + (q + p²/3 − 2p²/3)y + (r − p³/27 + p³/9 − q(p/3))
      = y³ + (q − p²/3)y + (r + 2p³/27 − q(p/3))
      = y³ + (q − 3(p/3)²)y + (r + 2(p/3)³ − q(p/3)).

To use this calculation, one starts with the problem given by (1.16). The problem (1.16) is then replaced by the problem

    y³ + (q − 3(p/3)²)y + (r + 2(p/3)³ − q(p/3)) = 0.        (1.17)

If a solution y is found for (1.17), then x = y − k = y − p/3 is a solution to (1.16).
1.3. THE CUBIC
19
The form of the equation in (1.17) is so bad that there is little reason to try to remember it. It is much more useful to remember that when presented with (1.16) the substitution x = y − p/3 will remove the square term. For example (to pick an example where the numbers are not so bad), given

    x³ − 6x² + 4x − 5 = 0

one puts in x = y − (−6)/3 = y + 2 and gets

    (y + 2)³ − 6(y + 2)² + 4(y + 2) − 5
      = (y³ + 6y² + 12y + 8) − 6(y² + 4y + 4) + 4y + 8 − 5
      = y³ + 12y + 8 − 24y − 24 + 4y + 8 − 5
      = y³ − 8y − 13 = 0.
If a solution y is found for the above, then x = y + 2 is a solution to the original.
Note that carrying along the terms for y 2 verifies the fact that they vanish in
the end, but if one is confident in the technique, then they can be ignored.
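The bookkeeping can also be checked mechanically. Here is a small Python sketch of the coefficients in (1.17) (depress_cubic is an invented name), applied to the example just worked:

```python
def depress_cubic(p, q, r):
    """Coefficients (q', r') so that x^3 + p x^2 + q x + r = 0 becomes
    y^3 + q' y + r' = 0 under the substitution x = y - p/3, as in (1.17)."""
    q1 = q - 3 * (p / 3) ** 2              # q - p^2/3
    r1 = r + 2 * (p / 3) ** 3 - q * p / 3  # r + 2p^3/27 - qp/3
    return q1, r1

# The worked example above: x^3 - 6x^2 + 4x - 5 = 0.
print(depress_cubic(-6, 4, -5))   # (-8.0, -13.0), i.e. y^3 - 8y - 13 = 0
```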
Exercises (4)
1. In each of the following, write down a cubic in y having no y 2 term so
that the solutions of the original and the solutions of the equation in y
differ by a single constant. State the relationship between the solutions
in x and the solutions in y. You should do at least one of the problems
“the hard way.” That is, do not use (1.17). You can even try “the harder
way.” That is, pretend you do not know in advance what the constant
difference between x and y is.
(a) x3 + 9x2 − 2x + 3 = 0.
(b) 2x3 − 12x2 + 6x + 3 = 0.
(c) x3 + x2 + 1 = 0.
1.3.3 Solving the reduced cubic
We will not completely solve the cubic here. We will get a nice formula for the
solution, but we will have trouble interpreting it. The full interpretation of the
formula will require learning more about complex numbers. We will derive the
formula and then take a break to learn about complex numbers. Then we will
return to give the last word on the solution of the cubic in Section 1.5.
The solution to the (reduced) cubic is based on two key steps. Just as the reduction above can be boiled down to one idea (replace x by y − p/3), and
the solution to the quadratic can be boiled down to one idea (complete the
square), the derivation of the formula for the cubic can be boiled down to a
small number (two) of ideas. These ideas and how to implement them should
be learned, rather than trying to memorize the whole derivation.
The two ideas come from recognizing two resemblances. The first is the
following.
The reduced cubic can be written as

    x³ + qx + r = 0.

The equation we derived in (1.14) can be written as

    (x + y)³ − 3xy(x + y) − (x³ + y³) = 0.

Since x is in use in the cubic, we rewrite the above as

    (u + v)³ − 3uv(u + v) − (u³ + v³) = 0

to avoid having x used in a different way in two different places. Now if x = u + v, if −3uv = q and if −(u³ + v³) = r, then the equation above resembles the reduced cubic. Then if we can use −3uv = q and −(u³ + v³) = r to solve for u and v, then we get x as u + v. But we can solve for u and v. This requires the second idea.
From −3uv = q, we get

    uv = −q/3, or u³v³ = −(q/3)³.        (1.18)

From −(u³ + v³) = r, we get

    u³ + v³ = −r.

But this means that we know the sum and product of the two unknowns, u³ and v³. From Section 1.2.6, we know that u³ and v³ are roots of a quadratic. The specific quadratic is

    z² + rz − (q/3)³ = 0.        (1.19)
The solutions to (1.19) are

    z₁ = (−r + √(r² + 4(q/3)³))/2 = −(r/2) + √((r/2)² + (q/3)³),

    z₂ = (−r − √(r² + 4(q/3)³))/2 = −(r/2) − √((r/2)² + (q/3)³).
The unknown u will be the cube root of one of them (z₁, say), and the other unknown v will be the cube root of the other. Since x = u + v, we get

    x = ∛(−r/2 + √((r/2)² + (q/3)³)) + ∛(−r/2 − √((r/2)² + (q/3)³)).        (1.20)
This needs some interpretation. The fact that there are two values whose square is

    (r/2)² + (q/3)³

is taken into account by using one value for z₁ and the other for z₂. The formula (1.20) makes it clear what to do with these two values.
However, we will see in the next section that there are three possible values
for each of the cube roots u and v. If all three values are used for both u and
v, then it might seem that there are nine possible values for x = u + v. That
this is not the case, and how to pick out the right combinations will be covered
after a thorough discussion of complex numbers.
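In the case where the quantity (r/2)² + (q/3)³ under the square root in (1.20) is nonnegative, the formula can be carried out entirely in real arithmetic. A Python sketch follows (cardano_real is an invented name; the case of a negative discriminant is deferred until complex numbers are available):

```python
import math

def cardano_real(q, r):
    """One real root of y^3 + q y + r = 0 from formula (1.20), in the case
    (r/2)**2 + (q/3)**3 >= 0, where all intermediate values are real."""
    disc = (r / 2) ** 2 + (q / 3) ** 3
    if disc < 0:
        raise ValueError("negative discriminant: complex cube roots needed")
    s = math.sqrt(disc)
    # math.copysign gives the real cube root even for a negative argument
    u = math.copysign(abs(-r / 2 + s) ** (1 / 3), -r / 2 + s)
    v = math.copysign(abs(-r / 2 - s) ** (1 / 3), -r / 2 - s)
    return u + v

x = cardano_real(3, -4)   # y^3 + 3y - 4 = 0, whose only real root is 1
print(x)                  # approximately 1
```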
Two examples
There are other difficulties hidden in (1.20). The following examples will illustrate
this. We will also refer to these examples later when we return to the topic of
the relationships between the roots and the coefficients.
Substituting 1 for x shows that x = 1 is a solution for each of the following.
    x³ + 3x − 4 = 0.        (1.21)

    x³ − 9x + 8 = 0.        (1.22)

If we apply (1.20) to (1.21), we get

    x = ∛(2 + √5) + ∛(2 − √5)        (1.23)
which hardly looks like 1.
The formula in (1.23), while strange, is at least trying to give a real number. The square root of 5 is a specific real number which is slightly larger than two. Thus 2 + √5 is positive and 2 − √5 is negative. A positive real number has a positive real cube root and a negative real number has a negative real cube root, so there is a real value that (1.23) can be specifying for x.
There is no other real solution to (1.21). The left side of (1.21) has derivative equal to 3x² + 3, which is positive for all real x. Thus x³ + 3x − 4 is a strictly increasing function of x, and it crosses the x-axis exactly once. Thus if our derivation of (1.20) is correct and complete, then (1.23) must be equal to 1.
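This is easy to check numerically. Using real cube roots, as in the argument above, a few lines of Python give a value indistinguishable from 1:

```python
# Numerical check of (1.23): cube root of 2 + sqrt(5) plus
# the (negative) real cube root of 2 - sqrt(5).
s = 5 ** 0.5                  # the square root of 5
u = (2 + s) ** (1 / 3)        # real cube root of the positive number 2 + sqrt(5)
v = -(-(2 - s)) ** (1 / 3)    # real cube root of the negative number 2 - sqrt(5)
print(u + v)                  # 1.0 up to rounding error
```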
The equation (1.22) has more problems. It is of the form f(x) = 0 where f(x) = x³ − 9x + 8. We have f′(x) = 3x² − 9, which is zero at x = ±√3. Now f(−√3) = 8 + 6√3, which is positive, and f(√3) = 8 − 6√3, which can be checked and found negative. (√3 is approximately 1.732.) It follows from standard curve sketching arguments that f(x) crosses the x-axis three times and (1.22) has three real solutions. But (1.20) gives

    x = ∛(−4 + √−11) + ∛(−4 − √−11).        (1.24)
The expression (1.24) does not look as though it gives any real solutions, let
alone three real solutions of which one is x = 1.
In [2], Cardano struggled with expressions such as (1.24) and attempted to
make sense of them. He was willing to manipulate such expressions formally
(as you will in the next exercise) in spite of the fact that he had no confidence
that they had any meaning. With the later development (and slow acceptance)
of complex numbers, expressions such as (1.24) became easy to handle and
interpret. The next tool that we will look at will be complex numbers and their
rules of arithmetic.
Exercises (5)
1. Plug the solutions (1.23) and (1.24) into their respective equations and
show that they are indeed solutions. This is just a brute force calculation.
Use (1.13) to cube the solutions. There will be only one tricky part to the
simplification.
2. [XXXXXXXXXXxx need some cubics to work on.]
1.4 Complex arithmetic
As seen from the examples in the last section, we need to understand square
and cube roots thoroughly in order to make sense out of (1.20). This is done
by understanding complex arithmetic thoroughly. We refer to the topic as complex arithmetic as opposed to complex analysis since we are only going to add,
subtract, multiply, divide and take roots. We will not differentiate, integrate
or take limits. Students that have taken complex analysis will find what we do
familiar, but the emphasis will be very different. A prior knowledge of complex
analysis is not at all necessary.
1.4.1 Complex numbers and the basic operations
There is no real number whose square is −1. So we invent a new number i
with the property that i2 = −1. Almost everything that follows is forced if
we want to end up with numbers that are as well behaved as real numbers. A
desire to have laws of arithmetic such as the commutative law, associative law,
distributive law and so forth dictate what happens next.
A complex number is an expression of the form a + bi where a and b are real
numbers. We say that a + bi = c + di if and only if a = c and b = d. The set of
all complex numbers is denoted C.
We now look at operations. The “calculation”
(a + bi) + (c + di) = (a + c) + (bi + di) = (a + c) + (b + d)i
is motivated by the laws of arithmetic mentioned above. The word “calculation”
is in quotes since we really do not yet have a definition for the sum of two
complex numbers. But the desire to cooperate with the laws of arithmetic leads
us to use the result as a definition. Thus we say that
(a + bi) + (c + di) = (a + c) + (b + d)i
is a definition. Note that our definition does in fact take two complex numbers
and give us a third complex number since both (a + c) and (b + d) are real
numbers.
The “calculation”

    (a + bi)(c + di) = ac + adi + bci + bdi²
                     = ac + adi + bci + bd(−1)
                     = (ac − bd) + (ad + bc)i

leads to the definition

    (a + bi)(c + di) = (ac − bd) + (ad + bc)i.
Now that we have definitions for addition and multiplication, we can see
how the definitions behave. Direct calculation (no longer in quotes since we
have definitions)
(0 + 0i) + (c + di) = (0 + c) + (0 + d)i = c + di
and
(1 + 0i)(c + di) = (1c − 0d) + (1d + 0c)i = c + di
show that 0 + 0i is an identity for addition and 1 + 0i is an identity for multiplication.
The check
(a + bi) + ((−a) + (−b)i) = (a + (−a)) + (b + (−b))i = 0 + 0i
shows that (−a) + (−b)i is an additive inverse for a + bi. So we set
−(a + bi) = (−a) + (−b)i.
Now that we have additive inverses, we can subtract by adding the additive
inverse. That is if z and w are complex numbers (and yes, it is legitimate to
use a single letter to represent a complex number), then z − w = z + (−w).
We note that (0+0i)(c+di) = 0+0i by direct calculation from the definition.
Multiplicative inverses will allow us to divide in the same way that additive inverses allow us to subtract. But multiplicative inverses are slightly more
complicated. Given a + bi, we want c + di so that
(a + bi)(c + di) = (ac − bd) + (ad + bc)i = 1 + 0i.
(1.25)
There are three ways to find c+ di, easy, medium and hard. It will turn out that
hard is the most useful for the rest of the course, but here we will do medium
and easy.
The calculation above leads to the medium technique directly. We note that
a + bi cannot be 0 + 0i since no value of c + di will make (0 + 0i)(c + di) = 1 + 0i.
We now use the information in (1.25) to write down two equations:

    ac − bd = 1,
    ad + bc = 0.        (1.26)

These are two linear equations in the two unknowns c and d. The values a, b, 0 and 1 are given. Straightforward linear algebra gives the solutions

    c = a/(a² + b²),
    d = −b/(a² + b²),        (1.27)

so that we can write

    (a + bi)⁻¹ = a/(a² + b²) − (b/(a² + b²))i.
The easy technique is to know a standard trick. We want to put

    1/(a + bi)

in the form of a complex number. We make the denominator real by doing

    1/(a + bi) = (1/(a + bi)) · 1 = (1/(a + bi)) · ((a − bi)/(a − bi))
               = (a − bi)/(a² + b²) = a/(a² + b²) − (b/(a² + b²))i.        (1.28)
All of the calculations above are fine but the last equality is not yet justified.
We are trying to get multiplicative inverses so that we can divide. The last
equality assumes that we already know how to divide by a real number. That
means the calculation in (1.28) should be preceded by a definition of division
by a real number. This is easy to do, and so the outline using (1.28) to get
multiplicative inverses is sometimes used.
We leave the hard technique for much later.
Note that the assumption a + bi ≠ 0 + 0i shows up in the formula. If we do
not make the assumption, then dividing by a² + b² might be dividing by zero;
if we do make the assumption, then dividing by a² + b² is never dividing by
zero, since a and b are real and a² + b² cannot be zero if at least one of a or b is
not zero.
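As a numerical sanity check, the medium formula can be tried out with Python's built-in complex type. The helper name `inverse` below is ours, not from the text:

```python
def inverse(a, b):
    """Multiplicative inverse of a + bi from (1.27): c = a/(a^2+b^2), d = -b/(a^2+b^2)."""
    if a == 0 and b == 0:
        raise ZeroDivisionError("0 + 0i has no multiplicative inverse")
    denom = a * a + b * b        # real, and nonzero when (a, b) != (0, 0)
    return complex(a / denom, -b / denom)

z = complex(3, 4)                # 3 + 4i
print(z * inverse(3, 4))         # 1 + 0i, up to rounding
```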
Lastly, we mention some relaxations of rules when it comes to writing down
complex numbers. We often write bi for 0 + bi, and we often write a for a + 0i. This
means that 0 + 0i is often written as 0, and 1 + 0i is often written as 1.
Exercises (6)
1. Prove whichever of the following were not already done in class: commutativity and associativity of addition and multiplication, and distributivity of multiplication over addition, in complex arithmetic.
2. Do the linear algebra that gives the solutions (1.27) to (1.26).
1.4.2 Complex numbers as a vector space
This minor section is here only because the concept becomes very important
later, and it is worth seeing more than once.
If r and s are real numbers, then they are also complex numbers by thinking
of r as r + 0i and s as s + 0i. If r and s are real numbers and z, y and w are
complex numbers, then we know the following facts.
1. z + w = w + z.
2. (z + y) + w = z + (y + w).
3. z + 0 = z.
4. z + (−z) = 0.
5. (rs)(z) = r(sz).
6. (r + s)z = rz + sz.
7. r(z + y) = rz + ry.
8. 1z = z.
These facts make C a vector space with the real numbers as scalars. The
definition of a complex number says that every complex number is uniquely
represented as a + bi with a and b real. Since a + bi = a·1 + b·i, this can be
reinterpreted to say that every complex number is a unique linear combination
of the two complex numbers 1 and i. This makes the set {1, i} a basis for C
with the real numbers as scalars. We will have more to say about this later.
1.4.3 Complex numbers in Cartesian and polar coordinates
Cartesian coordinates
Since each complex number is specified by a pair of real numbers, it is possible
to plot the complex numbers in the Cartesian plane. The point (a, b) in the
plane will correspond to the complex number a + bi and vice-versa. This makes
all points on the x-axis correspond to complex numbers of the form a + 0i, which
are regarded as real numbers. For that reason, the x-axis is called the real axis.
The y-axis corresponds to complex numbers of the form 0 + bi, which are
called imaginary numbers, so the y-axis is called the imaginary axis.
The addition rule (a + bi) + (c + di) = (a + c) + (b + d)i says that when viewed
as points in the plane, complex numbers are added coordinate by coordinate,
just as 2-dimensional vectors are. This continues the observations made in the
previous section. Similarly, negation, where −(a + bi) = (−a) + (−b)i, is also done
coordinate by coordinate.
In conclusion, the Cartesian representation of complex numbers as points in
the plane where the a and b in a + bi are x and y coordinates cooperates well
with addition, negation and thus subtraction.
But multiplication and division do not do as well in the Cartesian representation. The x and y coordinates get mixed badly under multiplication and
division.
Polar coordinates
It turns out that polar coordinates cooperate beautifully with multiplication
and division. Since n-th roots are a multiplicative concept, polar coordinates
cooperate beautifully with the taking of n-th roots as well. This is our main
reason for looking at polar coordinates.
To do polar coordinates we need distance of a point to the origin, and an
angle that the line to the origin makes with respect to the positive x-axis. In
the following figure
[Figure (1.29): the point z = a + bi plotted in the plane; r is the distance from z to the origin, θ is the angle the segment from the origin to z makes with the positive x-axis, and a and b are the horizontal and vertical legs of the resulting right triangle.]
the Cartesian coordinates of z are (a, b) and the polar coordinates are (r, θ).
The relationships that we will need between the various quantities are as follows:

a = r cos(θ),    b = r sin(θ),    r = √(a² + b²).
In particular, if the polar coordinates (r, θ) are known, then the Cartesian coordinates are (r cos(θ), r sin(θ)) making
z = r cos(θ) + r sin(θ)i = r(cos(θ) + i sin(θ)).
(1.30)
Putting the i before the sin(θ) in (1.30) is traditional.
Let us multiply two complex numbers written in the form (1.30). For z =
r(cos(θ) + i sin(θ)) and w = s(cos(φ) + i sin(φ)), we get
zw = r(cos(θ) + i sin(θ)) · s(cos(φ) + i sin(φ))
   = rs((cos(θ) cos(φ) − sin(θ) sin(φ)) + i(cos(θ) sin(φ) + sin(θ) cos(φ)))
   = rs(cos(θ + φ) + i sin(θ + φ))
(1.31)
where the last equal sign follows from two of the standard trigonometry identities. This result is extremely important and needs some discussion to bring out
its admirable qualities.
Modulus and argument
In (1.29), the length r is called the modulus of the complex number z. It is
simply the distance from z to the origin. The modulus of z is usually denoted
|z| which explains why it is sometimes called the absolute value of z. Note that
|z| ≥ 0 for any complex number z and that |z| is zero if and only if z = 0.
The angle θ in (1.29) is called the argument of the complex number z and is
denoted Arg(z).
The expression (1.30) expresses a complex number z in terms of its modulus
and argument. It expresses z as a real number |z| times a complex number
(cos(θ) + i sin(θ)) which has modulus 1 since
√(cos²(θ) + sin²(θ)) = 1.
In (1.31), we multiply z, which has modulus r and argument θ, times w,
which has modulus s and argument φ. We see that the result has modulus rs
and argument θ + φ. We can turn this into an easily stated rule: when complex
numbers are multiplied, the moduli are multiplied and the arguments are added.
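This rule is easy to test numerically with Python's cmath module; the sample moduli and arguments below are arbitrary choices of ours:

```python
import cmath

# sample polar data: z has modulus 2 and argument 0.7, w has modulus 3 and argument 1.1
r, theta = 2.0, 0.7
s, phi = 3.0, 1.1

z = cmath.rect(r, theta)    # r(cos θ + i sin θ)
w = cmath.rect(s, phi)

product = z * w
print(abs(product))          # rs = 6.0, up to rounding: moduli multiply
print(cmath.phase(product))  # θ + φ = 1.8, up to rounding: arguments add
```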
Exercises (7)
1.4.4 Complex conjugation
In (1.28), we saw that (a + bi)(a − bi) = a² + b², which is the square of the
modulus of a + bi. The complex number a − bi is called the complex conjugate
of the complex number a + bi. If z is a complex number, then its complex
conjugate is written as z̄.
There are many nice properties of the complex conjugate. To write some of
them down efficiently, we adopt some standard notation. If z = a + bi, then a
is the real part of z, written Re(z), and b (a real number) is the imaginary part
of z, written Im(z). This makes z = Re(z) + Im(z)i.
We can now record several facts about complex conjugates. The first has
already been noted.
z z̄ = |z|²,
(z + z̄)/2 = Re(z),
(z − z̄)/(2i) = Im(z),
the conjugate of z + w is z̄ + w̄,
the conjugate of −z is −z̄,
the conjugate of zw is z̄ · w̄,
the conjugate of 1/z is 1/z̄,
the conjugate of z̄ is z.
(1.32)
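All of these facts can be spot-checked with Python's built-in complex type, whose conjugate method computes z̄ (the sample values z and w are ours):

```python
z = complex(2, 5)
w = complex(-1, 3)

# z z̄ = |z|^2 (here 2^2 + 5^2 = 29)
assert z * z.conjugate() == 29
# (z + z̄)/2 = Re(z) and (z − z̄)/(2i) = Im(z)
assert (z + z.conjugate()) / 2 == z.real
assert (z - z.conjugate()) / (2j) == z.imag
# conjugation respects sums, negation and products
assert (z + w).conjugate() == z.conjugate() + w.conjugate()
assert (-z).conjugate() == -z.conjugate()
assert (z * w).conjugate() == z.conjugate() * w.conjugate()
# conjugation respects inverses, and conjugating twice gives z back
assert abs((1 / z).conjugate() - 1 / z.conjugate()) < 1e-15
assert z.conjugate().conjugate() == z
print("all facts in (1.32) check out for this sample")
```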
If we plot two complex numbers that are the conjugates of each other,
[Figure: the conjugate points z and z̄ plotted in the plane, mirror images of each other across the real axis.]
then from the Cartesian view one is obtained from the other by reflection about
the real axis. From the polar view, we can say that they have the same modulus,
and that the arguments are the negatives of each other.
Exercises (8)
1. Prove all the facts in (1.32).
2. Prove that a complex number z is real if and only if z = z̄. What is the
shortest proof you can give?
1.4.5 Powers and roots of complex numbers
Let the complex number z have modulus r and argument θ, so that z = r(cos(θ) + i sin(θ)). Then zⁿ, which is just z · z · · · z with n copies of z, has modulus
r · r · · · r with n copies of r and has argument θ + θ + · · · + θ with n copies of
θ. That is, zⁿ has modulus rⁿ and argument nθ. In particular,

zⁿ = rⁿ(cos(nθ) + i sin(nθ)).
This completely analyzes powers of complex numbers when given in polar form.
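A quick numerical check of this analysis, using Python's cmath module (the sample modulus, argument and exponent are our own choices):

```python
import cmath

r, theta, n = 1.3, 0.5, 7     # sample modulus, argument and exponent
z = cmath.rect(r, theta)

direct = z ** n                          # repeated multiplication
polar = cmath.rect(r ** n, n * theta)    # r^n (cos nθ + i sin nθ)

print(abs(direct - polar))   # differs only by floating point rounding
```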
Now that powers are understood, we can look at roots. If we look for the
n-th root of this same z, then we want some complex number w with modulus
s and argument φ so that sⁿ = r and nφ = θ. That is, s = ⁿ√r and φ = θ/n.
The modulus is straightforward. It is supposed to be a non-negative real
number, and there is only one non-negative real number ⁿ√r that can be the
n-th root of the non-negative real number r.
The argument is less straightforward. Saying that θ is the argument of z
specifies θ as an angle, but not as a real number. All of θ, θ + 2π, θ + 4π, θ − 2π
specify the same angle. Since they specify the same angle, there is no reason to
use any one over the other when specifying the argument of z. But when they
are divided by n, they can end up specifying different angles. For example

θ/3,    (θ + 2π)/3,    (θ + 4π)/3
all specify different angles, but when multiplied by 3, they all become the same
angle as θ. However
(θ + 6π)/3 = θ/3 + 2π
specifies the same angle as θ/3 which is one of the angles above.
We can analyze the situation completely as follows. If z has argument θ,
then anything of the form θ + k(2π) represents the same angle. If we divide k
by n, we get a quotient q and remainder⁷ m so that k = qn + m and we can
require that the remainder satisfy 0 ≤ m < n. Now
(θ + k(2π))/n = (θ + (qn + m)(2π))/n = θ/n + q(2π) + (m/n)(2π),

which represents the same angle as θ/n + (m/n)(2π). So we see that the angle we get
for the n-th root depends only on the remainder and not the quotient.
Now if we use two different values of k with two different remainders (m1
and m2, say), then the two angles that result from this will differ by

(θ/n + (m1/n)(2π)) − (θ/n + (m2/n)(2π)) = ((m1 − m2)/n)(2π),

which is less than 2π since m1 − m2 is less than n, and thus does not represent
the angle 0. This makes

θ/n + (m1/n)(2π)   and   θ/n + (m2/n)(2π)

two different angles.
So the angles of the n-th root do not depend on the quotient, but depend
completely on the remainder. This lets us list all the n-th roots.
If z ≠ 0 has modulus r and argument θ, then all the n-th roots of z are of
the form

ⁿ√r (cos(θ/n + (m/n)(2π)) + i sin(θ/n + (m/n)(2π))),    m ∈ {0, 1, . . . , n − 1}.
If z = 0, then the modulus is zero and all roots have modulus zero. But the
only complex number with modulus zero is 0 itself. So all n-th roots of 0 equal
0.
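The displayed formula translates directly into a short Python sketch; the function name `nth_roots` is ours:

```python
import cmath

def nth_roots(z, n):
    """All n-th roots of z: modulus is the real n-th root of |z|,
    arguments are theta/n + (m/n)(2*pi) for m = 0, 1, ..., n-1."""
    if z == 0:
        return [0j]          # all n-th roots of 0 equal 0
    r, theta = abs(z), cmath.phase(z)
    return [cmath.rect(r ** (1 / n), theta / n + (m / n) * 2 * cmath.pi)
            for m in range(n)]

for w in nth_roots(-1 + 0j, 4):          # the four fourth roots of -1
    print(w, abs(w ** 4 + 1) < 1e-9)
```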
Examples
1. Let us find cube roots of z = 1 = 1 + 0i. The modulus is 1 and the argument
is zero. The cube roots all have modulus ∛1 = 1 and the arguments of the cube
roots are

0 + (0/3)(2π) = 0,    0 + (1/3)(2π) = 2π/3,    0 + (2/3)(2π) = 4π/3.

⁷The use of quotients and remainders will be extremely important later in these notes.
The three cube roots are pictured in the xy-plane below.
[Figure: the three cube roots 1, ω and ω² of 1, plotted on the unit circle in the xy-plane.]
They are

1 = 1(cos(0) + i sin(0)) = 1 + 0i,
ω = 1(cos(2π/3) + i sin(2π/3)) = −1/2 + i(√3/2),
ω² = 1(cos(4π/3) + i sin(4π/3)) = −1/2 − i(√3/2).
The fact that the third root ω² is the square of the root ω follows from the way
complex multiplication behaves when expressed in polar coordinates.
The use of ω (the Greek letter omega) and ω² for the two non-real
cube roots of 1 is standard, and they will be used this way for the rest of these notes.
2. Let us find fourth roots of 2ω = −1 + i√3. The modulus is 2 and the
argument is 2π/3. The fourth roots all have modulus ⁴√2 and the arguments
are

2π/12 + (0/4)(2π) = π/6,    2π/12 + (1/4)(2π) = 2π/3,
2π/12 + (2/4)(2π) = 7π/6,    2π/12 + (3/4)(2π) = 5π/3.

The four fourth roots are shown below.

[Figure: the four fourth roots a, b, c and d of 2ω, evenly spaced on the circle of radius ⁴√2, shown together with 2ω.]
They are

a = ⁴√2 (cos(π/6) + i sin(π/6)) = ⁴√2 (√3/2 + i(1/2)),
b = ⁴√2 (cos(2π/3) + i sin(2π/3)) = ⁴√2 (−1/2 + i(√3/2)),
c = ⁴√2 (cos(7π/6) + i sin(7π/6)) = ⁴√2 (−√3/2 − i(1/2)),
d = ⁴√2 (cos(5π/3) + i sin(5π/3)) = ⁴√2 (1/2 − i(√3/2)).
Notice that a², a³ are not other fourth roots of 2ω. (See this by computing
the modulus and argument of a² and a³ using the principle stated at the end of
Section 1.4.3.) This differs from the behavior of ω and ω², which are both cube
roots of 1. However, there are nice relationships between a, b, c and d which we
will explore in the next section.
Exercises (9)
1. This exercise will compute ω and ω² algebraically. The cube roots of 1
should be solutions to x³ = 1 or to the equivalent equation x³ − 1 = 0.
However, x = 1 is clearly one solution, so x − 1 should be a factor of
x³ − 1. What is the other factor? If you don’t already know the answer,
you should do the long division (x³ − 1)/(x − 1) to find out. Then you
should memorize the answer since it is important. The other factor is
a quadratic. Solve the quadratic. If all is done correctly, your answers
should be ω and ω².
2. Verify that ω² = ω̄ and ω̄² = ω. Find all complex numbers z so that
z² = z̄.
3. (a) Find all cube roots of i.
(b) Find all cube roots of 2.
(c) Find all sixth roots of −1.
(d) Find all fourth roots of −1.
1.4.6 Roots of 1
The n-th roots of 1 (also called roots of unity) occupy a special place in these
discussions. They will turn out to be important not only now, but much later
in the notes.
The modulus of 1 is 1, all powers of 1 are 1, and the positive real n-th root
of 1 is 1. Thus all n-th roots of 1 lie on the unit circle (the circle of radius
one with center at the origin).
Since the argument of 1 is zero, the n-th roots of 1 are spaced evenly around
the unit circle with angle exactly 2π/n between them, starting at 1. If α is the
root with argument exactly 2π/n, then the various powers of α give all the n-th
roots of 1. Here are the twelve 12-th roots of 1.
[Figure: the twelve 12-th roots of 1, evenly spaced around the unit circle in the xy-plane, with 1, α, α⁵, α⁸ and α¹¹ labeled.]

We have labeled α⁵, α⁸ and α¹¹ for no particular purpose other than as illustrative examples.
Application to n-th roots of arbitrary complex numbers
Let z be a complex number (other than zero) and let w be an n-th root of z.
We know that w ≠ 0 since z ≠ 0. Let β be an n-th root of 1. Then

(βw)ⁿ = βⁿwⁿ = 1 · z = z

shows that βw is another n-th root of z.
Further, if y is another n-th root of z, then γ = y/w satisfies

γⁿ = (y/w)ⁿ = yⁿ/wⁿ = z/z = 1,
which shows that γ is an n-th root of 1. Since y = γw, our two arguments have
shown that multiplying an n-th root of z by an n-th root of 1 gives another n-th
root of z, and that every n-th root of z can be obtained from one single n-th
root of z by multiplying by the various n-th roots of 1.
Let α be the n-th root of 1 with argument exactly equal to 2π/n. Let w be
one n-th root of z. Then all the n-th roots of z form precisely the set
{w = 1w, αw, α²w, . . . , αⁿ⁻¹w} = {αⁱw | i = 0, 1, 2, . . . , n − 1}
(1.33)
of n complex numbers.
We can review these observations from the polar view of complex multiplication. If w is an n-th root of z and γ an n-th root of 1, then the modulus of
γ is 1. That makes the modulus of γw the same as the modulus of w and the
correct modulus to be an n-th root of z. The argument of γw will differ from
that of w by a multiple of 2π/n and thus be a correct argument for an n-th root
of z.
To summarize: if z has modulus r and argument θ, then one n-th root w of
z has modulus ⁿ√r and has argument θ/n. Now we form all the n-th roots of z
as specified by (1.33).
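The recipe in (1.33) can be sketched in Python as follows (the names alpha, w and roots are ours); here we take z = 64 and n = 6:

```python
import cmath

n = 6
z = 64 + 0j

alpha = cmath.rect(1.0, 2 * cmath.pi / n)               # n-th root of 1 with argument 2π/n
w = cmath.rect(abs(z) ** (1 / n), cmath.phase(z) / n)   # one particular n-th root of z

roots = [alpha ** i * w for i in range(n)]               # the set in (1.33)
for y in roots:
    print(y, abs(y ** n - z) < 1e-8)
```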
Exercises (10)
1. What are all the 6-th roots of 64? (You are supposed to know that 2⁶ = 64.)
2. What are all the 6-th roots of −64?
3. What are all the cube roots of −27i?
1.5 The cubic revisited

1.5.1 Picking out the solutions from the formula
In Section 1.3, we reduced an arbitrary cubic to the form x³ + qx + r = 0 and
found that the solutions are given by

x = ∛(−r/2 + √((r/2)² + (q/3)³)) + ∛(−r/2 − √((r/2)² + (q/3)³)).
(1.20)
We know that each of the cube roots can take on three values (unless the value
is zero). But the two cube roots are not independent.
The two cube roots correspond to the values u and v chosen in Section 1.3.3
so that x = u + v. The values u and v satisfied various equalities, of which the
most relevant, (1.18), is that uv = −q/3. This means that

v = −q/(3u)

and the second cube root in (1.20) is specified once the first cube root is known.
In particular, if the first cube root is real, and q is real, the second cube root
must be real.
We know how to get all cube roots of a number once we have one cube root
of the number. In a very similar manner, we can get all roots of x³ + qx + r
once we have one root. The equality uv = −q/3 is the key.
Assume that x = u + v is one root of x³ + qx + r where u is one cube root
of

−r/2 + √((r/2)² + (q/3)³),
(1.34)

and v is one cube root of

−r/2 − √((r/2)² + (q/3)³).
(1.35)
Then the other cube roots of (1.34) are ωu and ω²u, and the other cube roots of
(1.35) are ωv and ω²v, where ω and ω² are the non-real cube roots of 1. Given
that uv = −q/3 and that any two cube roots used to make x also have to multiply
to −q/3, we must have that the three roots given by (1.20) are

x1 = u + v,
x2 = ωu + ω²v,
x3 = ω²u + ωv
(1.36)

since these are the only combinations where the two parts have the same product
as uv.
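As a sketch of how (1.20) and (1.36) fit together, here is a hypothetical Python helper (the name `depressed_cubic_roots` is ours) that builds the three roots from one choice of u and the pairing uv = −q/3:

```python
import cmath

def depressed_cubic_roots(q, r):
    """The three roots of x^3 + qx + r = 0 via (1.20) and (1.36)."""
    disc = (r / 2) ** 2 + (q / 3) ** 3
    u = (-r / 2 + cmath.sqrt(disc)) ** (1 / 3)        # one cube root of (1.34)
    if u != 0:
        v = -q / (3 * u)                              # forced by uv = -q/3
    else:
        v = (-r / 2 - cmath.sqrt(disc)) ** (1 / 3)    # one cube root of (1.35)
    w = complex(-0.5, 3 ** 0.5 / 2)                   # ω, a non-real cube root of 1
    return [u + v, w * u + w ** 2 * v, w ** 2 * u + w * v]

for x in depressed_cubic_roots(3, -4):                # x^3 + 3x - 4 = 0
    print(x, abs(x ** 3 + 3 * x - 4) < 1e-9)
```

For x³ + 3x − 4 one of the three printed roots should be (numerically) the real root 1, matching the example below.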
We can apply this to the examples worked out in Section 1.3.3. We saw that
the formula (1.20) applied to x³ + 3x − 4 = 0 gave

x = ∛(2 + √5) + ∛(2 − √5).
(1.23)

Since all the numbers in (1.23) are real, there are real values of
the two cube roots. These must go together, since uv = −q/3 and q = 3 is real.
If we take the cube roots in (1.23) as representing real numbers, then the other
two solutions to x³ + 3x − 4 = 0 are

ω ∛(2 + √5) + ω² ∛(2 − √5)   and   ω² ∛(2 + √5) + ω ∛(2 − √5).
The formula (1.20) applied to x³ − 9x + 8 = 0 gave

x = ∛(−4 + √(−11)) + ∛(−4 − √(−11)).
(1.24)

The two numbers inside the cube root signs, α = −4 + i√11 and ᾱ =
−4 − i√11, are both not real and are complex conjugates of each other. So they
have the same modulus and their arguments are the negatives of each other.
If we take the (real) cube root of the modulus and 1/3 of the two arguments,
then we get two complex numbers that are cube roots of α and ᾱ and that
are also complex conjugates of each other. If we call these β and β̄, then their
product is real and must be −q/3 = 9/3 = 3. We can verify that by noting that
αᾱ = |α|² = 16 + 11 = 27 and the cube root of 27 is 3. So x = β + β̄ must be
one solution. This is real by one of the facts from (1.32).
The other two solutions are ωβ + ω²β̄ and ω²β + ωβ̄. From the comments
in Section 1.3.3, we know that all the roots of x³ − 9x + 8 = 0 are supposed to
be real. But this can be verified from a key observation from Exercise Set (9),
that ω² = ω̄. Now

ω²β̄ = ω̄β̄,

which is the conjugate of ωβ, so the two parts of the second solution are complex conjugates of each other
and the sum is real. A similar calculation shows that the third solution is real.
Recall that one of the solutions must be 1. We leave it to the reader to
decide which. Recall that one solution is less than −√3, another is greater than
√3. Also note that the argument of −4 + i√11 is just slightly larger than 3π/4.
(It is almost exactly 140 degrees.)
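The claim that all three solutions are real can be checked numerically; the Python below follows the text's construction of β as a cube root of α (the variable names are ours):

```python
import cmath

# α = -4 + i√11 from (1.24); β is built as in the text:
# real cube root of the modulus, one third of the argument
alpha = complex(-4, 11 ** 0.5)
beta = cmath.rect(abs(alpha) ** (1 / 3), cmath.phase(alpha) / 3)

w = complex(-0.5, 3 ** 0.5 / 2)   # ω

roots = [beta + beta.conjugate(),                      # β + β̄
         w * beta + (w * beta).conjugate(),            # ωβ + ω²β̄
         w ** 2 * beta + (w ** 2 * beta).conjugate()]  # ω²β + ωβ̄

for x in roots:
    print(x.real, abs(x.real ** 3 - 9 * x.real + 8) < 1e-9)
```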
Exercises (11)
1. Which of the solutions to x³ − 9x + 8 = 0 must give x = 1? Hint: the
relative positions of the three solutions on the line and the facts mentioned
are all that is needed.
2. Combine all the steps from Sections 1.3.2 and 1.3.3 to solve
2x3 + 12x2 + 18x + 12 = 0.
The numbers were chosen to come out not completely horrendous, but not
exactly nice.
1.5.2 Symmetry and Asymmetry
This continues the discussion started in Section 1.2.6. There we looked at the
relationship between the roots and coefficients of the quadratic. Here we will make
the same study for the cubic. As might be expected from the more complicated
situation, we will have more to say. In fact, the complexities are high enough
to give hints about what will come later in the notes. We also continue the
discussion started in Section 1.2.5 on the differences between the four operations
of addition, subtraction, multiplication and division, and the fifth operation
consisting of the taking of n-th roots.
Symmetry
If we are given a monic cubic equation
x³ + px² + qx + r = 0
(1.37)
and we know that the three roots⁸ are r1, r2 and r3, then the equation (1.37)
must⁹ be the same as
0 = (x − r1)(x − r2)(x − r3)
  = x³ − (r1 + r2 + r3)x² + (r1r2 + r2r3 + r3r1)x − r1r2r3.
From this we get
p = −(r1 + r2 + r3 ),
q = r1 r2 + r2 r3 + r3 r1 ,
r = −r1 r2 r3 .
(1.38)
These formulas are only slightly more complicated than the ones we get for
the quadratic from (1.10). Also, the formulas in (1.38) share with the corresponding formulas for the quadratic the property discussed in the next paragraph.
⁸Later we will need to justify our claim that there are three roots.
⁹The word “must” has the same qualifications that we mentioned in Section 1.2.6.
There are no instructions how to assign the three roots to the symbols r1 ,
r2 and r3 . In fact there are 6 ways of doing so, corresponding to the six ways of
ordering three things. The formulas in (1.38) come out the same no matter how
the values of the roots are assigned to the three symbols. For p, this comes down
to the commutativity of addition and for r it comes down to the commutativity
of multiplication. For q the reason is a bit more complicated, but it is easy to
see that it is true.
The word that is attached to these observations is “symmetric.” To discuss
this more precisely, we make some definitions. The discussion can be held from
two points of view.
The first uses orderings. When we assign the three roots of the cubic to the
symbols r1 , r2 and r3 , we are giving an ordering to the roots. The one assigned
to r1 is the first, the one assigned to r2 is the second and the one assigned to r3
is the third. There are six possible ways to do this with three values. There are
three choices for which of the three is the first, two choices remain for which is
to be second, and there is then only one choice left of which is to be the third.
The product of 3, 2, and 1 is six. To make a different assignment of the roots
to r1 , r2 and r3 is to choose a different ordering of the roots.
The statement that the formulas in (1.38) are symmetric means that the
values for p, q and r come out the same no matter which ordering is used for
the roots.
We can use orderings to introduce the second point of view which discusses
permutations. Permutations will have a major role in these notes.
If f : A → A is a function from a set to itself that is one-to-one and onto,
then f is said to be a permutation of the set A. If A has three elements (for
example, it is the set of roots of a cubic), then there are six permutations of A.
This is easy to see by referring back to orderings. If an ordering is picked for the
three elements, then we can talk about the first element, the second element,
and the third. There are three places a permutation can send the first, there are
only two places the permutation can send the second (this uses the one-to-one
aspect), and there is then only one place to send the last. Again, there are 6
possibilities.
We can write out all 6 permutations easily. If we simply list, in order, where
r1 , r2 and r3 are taken, then a list such as r3 , r1 , r2 describes the permutation
that takes r1 to r3 , takes r2 to r1 , and takes r3 to r2 . With this convention, the
6 permutations of r1 , r2 and r3 are
r1 , r2 , r3 ,
r1 , r3 , r2 ,
r2 , r1 , r3 ,
r2 , r3 , r1 ,
(1.39)
r3 , r1 , r2 ,
r3 , r2 , r1 .
Note that the first permutation “does nothing.” It takes each of r1 , r2 and r3
to itself. However, this is a valid one-to-one and onto function from {r1 , r2 , r3 }
to itself, and this is a valid permutation. This permutation is called the identity
permutation.
From the point of view of permutations, the statement that the formulas
in (1.38) are symmetric means that the values of the right hand sides of the
equalities in (1.38) do not change if a permutation is applied to the values of
r1 , r2 and r3 .
Recall that in Section 1.2.6, we pointed out that if r1 and r2 are the roots
of x² + bx + c, then b = −(r1 + r2) and c = r1r2. Note that the values of b and
c do not change if a permutation is applied to the values of r1 and r2.
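The symmetry of (1.38) is easy to confirm numerically: a small Python sketch (the function `coeffs` and the sample roots are ours) runs through all six orderings:

```python
from itertools import permutations

r1, r2, r3 = 2.0, -1.5, 0.25   # arbitrary sample roots (dyadic, so arithmetic is exact)

def coeffs(a, b, c):
    """p, q, r computed from the roots as in (1.38)."""
    return (-(a + b + c), a * b + b * c + c * a, -a * b * c)

base = coeffs(r1, r2, r3)
for perm in permutations((r1, r2, r3)):   # all 6 orderings of the roots
    assert coeffs(*perm) == base
print("p, q and r are unchanged by all 6 permutations of the roots")
```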
Asymmetry
As interesting as the symmetries in (1.38) are, they get more interesting when
compared to formulas that are less symmetric.
Both the quadratic and cubic are solvable by radicals. From the coefficients
and constants, one can get to the roots by the five operations of addition,
subtraction, multiplication, division and the taking of n-th roots. If we look
at the sequence of steps involved in these calculations, we get a sequence of
intermediate values on the way to the roots. Thus for one root of x² + bx + c,
the intermediate values might be listed as b² first, then b² − 4c, then √(b² − 4c),
then −b + √(b² − 4c), and finally (1/2)(−b + √(b² − 4c)).
Since the coefficients can be computed from the roots, each intermediate
value can be computed from the roots as well by plugging in r1 r2 for c and
−(r1 + r2 ) for b. However, we can get some intermediate values more easily by
other means.
Let us look at the first value √(b² − 4c) where a square root occurs. From

r1 = (1/2)(−b + √(b² − 4c))   and   r2 = (1/2)(−b − √(b² − 4c)),

we get

√(b² − 4c) = r1 − r2.
(1.40)
The formula is very simple, but symmetry is lost. If we switch r1 with r2, we get
−√(b² − 4c), which is the other number whose square is b² − 4c. Thus permuting
the roots moves the value r1 − r2 among the solutions (in y) of

y² − (b² − 4c) = 0.
(1.41)
The intermediate values, such as b² or b² − 4c, involving no square roots must
have symmetric formulas in terms of r1 and r2. This is because both b and c
do not change when r1 and r2 are permuted, so neither will b² or b² − 4c. We
have b² = (r1 + r2)² from the formula for b, and b² − 4c = (r1 − r2)² from the
formula for √(b² − 4c).
For the cubic x³ + qx + r, the intermediate values that involve n-th roots
are

√((r/2)² + (q/3)³)
(1.42)
as well as
∛(−r/2 + √((r/2)² + (q/3)³))   and   ∛(−r/2 − √((r/2)² + (q/3)³)).
(1.43)
The two quantities in (1.43) have already been given convenient names. One
is u and the other is v. Further, u³ and v³ are the same except for the sign
in front of the quantity (1.42). Thus we get that (1.42) is equal to (1/2)(u³ − v³).
So if we have u and v expressed in terms of r1, r2 and r3, then we have (1.42)
expressed in terms of r1, r2 and r3 as well.
The three roots in terms of u and v are given by

r1 = u + v,
r2 = ωu + ω²v,
r3 = ω²u + ωv.
(1.36)
Since there are two unknowns to solve for (u and v), we only need the first two
of the equations in (1.36). This is not surprising since we are assuming that the
coefficient of x2 is zero. This means that 0 = −(r1 + r2 + r3 ) and we can get
the third root from the first two. In spite of this, we get nicer expressions for
u and v if we use all three equations. This requires one important observation
and one trick.
The observation is that the numbers ω and ω² are the roots of the quadratic
that is left when x³ − 1 has x − 1 factored out. See Exercise Set (9). That is,
they are the two roots of x² + x + 1. From this we get that ω² + ω + 1 = 0.
Note that if we plug the expressions in (1.36) into r1 + r2 + r3, then the fact
ω² + ω + 1 = 0 immediately verifies that r1 + r2 + r3 = 0.
The trick is to multiply r2 by ω² and r3 by ω. This gives

r1 = u + v,
ω²r2 = u + ωv,
ωr3 = u + ω²v.
Now if the three expressions are added, we get
r1 + ω²r2 + ωr3 = 3u + (1 + ω + ω²)v = 3u + 0v = 3u

so that

u = (1/3)(r1 + ω²r2 + ωr3).
(1.44)

In an almost identical manner, we get

v = (1/3)(r1 + ωr2 + ω²r3).
(1.45)
So the formulas for the important intermediate values u and v are fairly
simple, but not symmetric. The lack of symmetry will be explored in exercises.
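Here is a numerical check that (1.44) and (1.45) really recover u and v from the roots in (1.36); the sample values of u and v are our own choices:

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)   # ω
u, v = 1.25, -0.5                  # sample values for u and v

# the three roots as in (1.36)
r1 = u + v
r2 = w * u + w ** 2 * v
r3 = w ** 2 * u + w * v

# recover u and v from the roots by (1.44) and (1.45)
u_rec = (r1 + w ** 2 * r2 + w * r3) / 3
v_rec = (r1 + w * r2 + w ** 2 * r3) / 3

print(abs(u_rec - u), abs(v_rec - v))   # both only rounding error away from 0
```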
To find the value of (1.42) we must cube u and v and take the difference.
While lengthy, this is straightforward. The reader can show

(a + b + c)³ = a³ + b³ + c³ + 3(a²b + a²c + ab² + b²c + ac² + bc²) + 6abc. (1.46)

If this is used to evaluate u³, we get

(1/27)(r1³ + r2³ + r3³ + 3(r1²r2ω² + r1²r3ω + r1r2²ω + r2²r3ω² + r1r3²ω² + r2r3²ω) + 6r1r2r3).
Similarly, for v³ we get

(1/27)(r1³ + r2³ + r3³ + 3(r1²r2ω + r1²r3ω² + r1r2²ω² + r2²r3ω + r1r3²ω + r2r3²ω²) + 6r1r2r3).
Now u³ − v³ calculates as

(1/27) · 3(ω² − ω)(r1²r2 − r1²r3 − r1r2² + r2²r3 + r1r3² − r2r3²).
But ω² − ω = −i√3. So we have

√((r/2)² + (q/3)³) = (1/2)(u³ − v³)
  = (−i3√3/54)(r1²r2 − r1²r3 − r1r2² + r2²r3 + r1r3² − r2r3²)
  = (i√3/18)(−r1²r2 + r1²r3 + r1r2² − r2²r3 − r1r3² + r2r3²).

Direct computation shows that

(r1 − r2)(r2 − r3)(r3 − r1)
  = r1r2r3 − r1²r2 − r1r3² + r1²r3 − r2²r3 + r1r2² + r2r3² − r1r2r3
  = −r1²r2 + r1²r3 + r1r2² − r2²r3 − r1r3² + r2r3².

Combining the last two calculations gives

√((r/2)² + (q/3)³) = (i√3/18)(r1 − r2)(r2 − r3)(r3 − r1).
(1.47)
(1.47)
The formulas (1.44), (1.45) and (1.47) share properties with (1.40). They
are all polynomials in the roots (combinations of products and powers, but with
no divisions by the roots or taking of n-th roots) that are fairly simple, but
not symmetric. The effects of permuting the roots in (1.40) are obvious, and
in (1.47) they are almost as obvious. The effects in (1.47) as well as (1.44) and
(1.45) will be covered in the next exercises.
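Formula (1.47) can be spot-checked numerically. In the Python sketch below (the sample roots are ours), the two sides can differ by a sign because cmath.sqrt picks one of the two square roots:

```python
import cmath

r1, r2, r3 = 1.0, 2.0, -3.0         # roots chosen to sum to 0, so the cubic has p = 0
q = r1 * r2 + r2 * r3 + r3 * r1     # coefficients via (1.38)
r = -r1 * r2 * r3

lhs = cmath.sqrt((r / 2) ** 2 + (q / 3) ** 3)
rhs = (1j * 3 ** 0.5 / 18) * (r1 - r2) * (r2 - r3) * (r3 - r1)

# the two sides agree up to the choice of square root (a possible sign)
print(lhs, rhs)
assert min(abs(lhs - rhs), abs(lhs + rhs)) < 1e-9
```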
After the exercises we will discuss the implications of all of the observations.
Exercises (12)
1. Derive (1.45) in a manner similar to the derivation of (1.44).
2. Derive (1.46). You can do this in two ways. One is to just work out the
cube and get 27 terms that have to be gathered. Another is to use (1.13)
on (a + (b + c))³, which will then need another use of (1.13) for the (b + c)³
which will appear.
3. Verify all the calculations after (1.46) that lead to (1.47).
4. Verify the calculations of u³, v³, u³ − v³ and verify the claim that ω² − ω = −i√3.
5. This will study the effect of permuting the values of r1 , r2 and r3 by
considering some specific permutations. The first is easy: what happens
to u and v if r1 is kept the same while r2 is switched with r3 ? Next, what
happens to u and v if r1 , r2 and r3 are rotated by sending r1 to r2 , sending
r2 to r3 and sending r3 to r1 ? Lastly, what happens to u and v if r1 and
r2 are switched while r3 is kept the same?
6. Verify that all the permutations of r1 , r2 and r3 leave the values of the
right sides of the equations in (1.38) the same.
7. Assume r1 ≠ r2 and verify that one permutation of r1 and r2 leaves the
right side of (1.40) the same, and one does not.
8. Assume that r1 , r2 and r3 are all different and verify that the only permutation of r1 , r2 and r3 that leaves the right side of (1.44) the same is
the identity permutation. Do the same for (1.45).
9. Assume that r1 , r2 and r3 are all different and verify that any permutation
of r1 , r2 and r3 can only either leave (1.47) the same, or introduce a minus
sign. Continue with the assumption and determine which permutations
leave (1.47) the same and which introduce a minus sign.
10. This problem addresses the fact that the discussion in this section assumed
p = 0 in x³ + px² + qx + r = 0. Without the assumption p = 0, the
techniques of Section 1.3.2 would be brought in to reduce the cubic to one
with p = 0. The reduction to a monic cubic need not be considered since
the roots do not change under that reduction. But the reduction to p = 0
changes the roots. The roots of the original are obtained from the roots
of the reduced cubic by subtracting p/3 from the roots of the reduced cubic.
(See the remarks after (1.17).) This has the effect of subtracting p/3 from
each line of (1.36). Show that this has no effect on the formulas (1.44)
and (1.45) for u and v and the formula (1.47) for the intermediate square
root.
1.5.3 The symmetric and the asymmetric
Effects of permuting the roots
If the problems in the previous section were done correctly, they would reveal
that the effect of permuting the roots of the cubic is to move u and v among all
the cube roots of
−r/2 + √((r/2)² + (q/3)³)   and   −r/2 − √((r/2)² + (q/3)³).
There is another way to express this. The cube roots of the expressions
above are the solutions of the following equations with unknown y:

y³ − (−r/2 + √((r/2)² + (q/3)³)) = 0
(1.48)

and

y³ − (−r/2 − √((r/2)² + (q/3)³)) = 0.
(1.49)
A clever way to combine two equations where one side is zero is to multiply
them. Any solution to the product must be a solution to one of the originals.10
Thus when the roots of the cubic are permuted, the values of u and v move
among the solutions to

    (y^3 + r/2 − √((r/2)^2 + (q/3)^3)) (y^3 + r/2 + √((r/2)^2 + (q/3)^3)) = 0.    (1.50)
This simplifies to

    (y^3 + r/2)^2 − ((r/2)^2 + (q/3)^3) = 0

which then simplifies even further to

    y^6 + ry^3 − (q/3)^3 = 0.        (1.51)
Of course, this is just the quadratic (1.19) with the variable z in (1.19)
replaced by y^3, and we have come full circle.
In summary, the values of u and v move among the solutions of (1.51) when
the roots of x^3 + qx + r are permuted. We have already noted that the values
obtained from (1.40) move among the roots of y^2 − (b^2 − 4c) when the roots of
x^2 + bx + c are permuted, and the previous problems show that the values of
(1.47) move among the roots of

    y^2 − ((r/2)^2 + (q/3)^3)        (1.52)

when the roots of x^3 + qx + r are permuted.
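This relationship can be checked numerically. The sketch below uses sample values of q and r (assumed for illustration, not taken from the text), computes the two expressions from (1.44) and (1.45), lists all six of their cube roots, and verifies that each one satisfies (1.51).

```python
import cmath

# Sample coefficients for a depressed cubic x^3 + qx + r (assumed values,
# chosen only for illustration).
q, r = -6.0, 4.0

disc = (r / 2)**2 + (q / 3)**3          # the quantity under the root in (1.47)
u3 = -r / 2 + cmath.sqrt(disc)          # right side of (1.44)
v3 = -r / 2 - cmath.sqrt(disc)          # right side of (1.45)

# All six cube roots of the two expressions; permuting the roots of the
# cubic moves u and v among these values.
omega = cmath.exp(2j * cmath.pi / 3)    # a primitive cube root of unity
candidates = [w**(1 / 3) * omega**k for w in (u3, v3) for k in range(3)]

# Each candidate satisfies (1.51): y^6 + r y^3 - (q/3)^3 = 0.
for y in candidates:
    assert abs(y**6 + r * y**3 - (q / 3)**3) < 1e-9
print("all six values satisfy the quadratic in y^3")
```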
10 This assumes that if ab = 0, then one of a or b is 0. This property will be explored more
carefully later in the notes.
Our observations so far
We can now add to comments that were started in Sections 1.2.5 and 1.2.6.
Those sections discussed the requirements on formulas giving roots in terms of
coefficients, and the nature of the formulas giving the coefficients in terms of
the roots.
We have observed the following.
1. If a polynomial is solvable by radicals, then there is a chain of intermediate
values in going from the coefficients to the roots, where each new value in the
chain is obtained from previous values by one of five allowable operations:
addition, subtraction, multiplication, division, and the taking of n-th roots
for various n.
2. The formulas giving the coefficients from the roots are simple (they are
polynomials in the roots) and symmetric. Permutations of the roots do
not change the values of the formulas.
3. The formulas giving the intermediate values from the roots are also polynomials in the roots and are symmetric if the intermediate values are only
computed from the coefficients using the four operations of addition, subtraction, multiplication and division. Permutations of the roots do not
change these values.
4. The formulas for the intermediate values that involve the taking of n-th
roots are also polynomials in the roots but are not symmetric. Permutations of the roots do change the values of these formulas.
5. For intermediate values that change with permutations of the roots, the
values are constrained to move among the roots of other polynomials.
6. The number of permutations that change an intermediate value and the
number of permutations that leave that value unchanged can depend on
the particular value.
Attempts to explain the observations
In 1832, Galois was able to create a unified system that took all of these observations into account. He was then able to study the system in enough detail
to tell the difference between polynomials that were “solvable” and those that
were not.
In the creation of his system he had the help of earlier work of mathematicians who were going off in a wrong direction. We have observed that important
intermediate values in the computation of the roots of the quadratic and cubic
are themselves roots of other polynomials such as (1.41) for the quadratic, and
(1.51) and (1.52) for the cubic. These polynomials are easier to solve than the
original, and so the original is ultimately solvable. There are similar polynomials for the quartic. These polynomials whose roots give intermediate values
are called resolvents, and an intense search for resolvents for the fifth degree
polynomial had been under way for some time by 1800.
The resolvents for the cubic and quartic were obtained in 1545 by tricky
algebraic manipulation that was peculiar to each degree. Up to 1800, it was
hoped that a unified technique could be found that would build resolvents for
any degree equation that would help solve the equation. Even though this
attempt was doomed to failure it still generated some interesting mathematics.
The formulas for the intermediate values (1.44), (1.45) and (1.47) exhibit
certain symmetries even if they are not completely symmetric. Further, when
the resolvents are written out in terms of the roots, other symmetries appear.
This is not surprising since the resolvents use combinations of the coefficients
from the original equation and the coefficients have symmetric expressions in
terms of the roots.
This led to a separate study of symmetries and permutations. The goal of
the study—building useful resolvents—was not to be realized, but the study
itself turned out to be more important than the unrealized goal. Two names,
Lagrange and Cauchy, associated to the study of permutations will appear again
in these notes.
Galois was able to take what was known about permutations and create
new objects to study. He focused on the interaction between the permutations
that left certain values unchanged, and the values that were unchanged by the
permutations. He referred to the collection of permutations that left certain
values unchanged as a “group” of permutations. The name stuck and groups
became important objects of study. Galois did not name the collections of values
that were left unchanged by certain permutations, but they evolved into another
set of objects called fields. These new objects and their interactions will be a
major focus of these notes.
A look at things to come
This chapter has almost run its course. From this point, new objects, new
behaviors, new rules, and new techniques will be introduced in fairly rapid
succession. Each of these will be a new tool.
This chapter had the task of emphasizing the importance of tools, but it was
not the job of this chapter to introduce them all. That will be the task of the
rest of these notes.
We have named much of what will occupy us. New objects, such as groups
and fields will be defined and studied. These will not be the only new objects,
but are the only ones we can list now. We will also study more familiar objects
such as polynomials.
The introduction of new objects is such an important concept that it deserves
its own chapter, and will be the subject of Chapter 2. In that chapter other
objects will be introduced, and some that are already familiar (such as the
integers, real numbers, complex numbers and polynomials) will be reviewed as
examples of the objects of study or as ingredients from which more examples
can be constructed.
The last section of this chapter will discuss the quartic equation. It will show
more of what we have already learned, so it will be labeled as optional. The
effects of permuting the roots will be more complicated. This is not surprising.
There are six ways to permute three roots. Quartic equations have four roots
and there are twenty-four ways to permute them.
1.6 The quartic (optional)
This section introduces no new phenomena, but it gives richer examples of the
phenomena that we have already observed. It is arranged as a series of exercises
and so may be viewed as a larger project.
1.6.1 Reduction
Exercises (13)
1. Find a way to reduce the general quartic ax^4 + bx^3 + cx^2 + dx + e = 0 to
x^4 + qx^2 + rx + s = 0. Familiarity with the corresponding reduction of
the cubic will make this easy.
1.6.2 The resolvent
We will introduce the idea given in [2] that lets us solve a quartic if we can solve a
cubic. Once the idea is in place, getting the cubic is reasonably straightforward.
The form x^4 + qx^2 + rx + s has the powers 4, 2, 1, and 0 of x. The powers
4, 2 and 0 form a quadratic in the variable x^2 and it is possible to complete
the square for the sum of those powers alone. The powers 2, 1 and 0 form a
quadratic in the variable x and it is possible to complete the square for the sum of
those powers alone. However, the two sets of powers are mixed. The idea is to
complete the square of both sets at the same time. If this is done so that there
is no extra constant term remaining, then an equation of the form

    (x^2 + k)^2 + t(x + j)^2 = 0

results, which can be turned into (x^2 + k)^2 = −t(x + j)^2, which can be solved
easily for x.

The problem is to complete the square of the group of powers 4, 2 and 0
simultaneously with the group of powers 2, 1 and 0 so there is no constant term
remaining. The trick is to figure out how much of the x^2 term should be in
one of the groups so that the remaining part of the x^2 cooperates with the other
group.
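Before working the exercises, the idea can be tested numerically. In this sketch the values of q, r and the auxiliary quantity z are assumed sample values; s is then chosen so that the leftover constant vanishes, and we check that completing both squares reproduces the quartic exactly.

```python
# Assumed sample values for q, r and the auxiliary quantity z.
q, r, z = 3.0, 2.0, 5.0

# Choose s so that the constant left over after completing both squares
# vanishes: s = z^2/4 + r^2/(4(q - z)).
s = z**2 / 4 + r**2 / (4 * (q - z))

# This choice of s is equivalent to z solving the cubic derived in the
# exercise below: z^3 - q z^2 - 4 s z + (4 s q - r^2) = 0.
assert abs(z**3 - q*z**2 - 4*s*z + (4*s*q - r**2)) < 1e-9

def quartic(x):
    return x**4 + q*x**2 + r*x + s

def two_squares(x):
    # (x^2 + z/2)^2 comes from the powers 4, 2, 0, and
    # (q - z)(x + r/(2(q - z)))^2 from the powers 2, 1, 0,
    # with no constant term left over.
    return (x**2 + z/2)**2 + (q - z) * (x + r / (2 * (q - z)))**2

for x0 in (0.3, -1.7, 2.0 + 1.0j):
    assert abs(quartic(x0) - two_squares(x0)) < 1e-9
print("both squares together reproduce the quartic exactly")
```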
Exercises (14)
1. In x^4 + qx^2 + rx + s = 0, break qx^2 into zx^2 + (q − z)x^2. Complete the square
of x^4 + zx^2 and complete the square of (q − z)x^2 + rx and show that this can
be done so there is no constant term remaining in x^4 + qx^2 + rx + s = 0
if z is a solution of the cubic equation

    z^3 − qz^2 − 4sz + (4sq − r^2) = 0.        (1.53)
Do not attempt to solve this cubic. It is too painful. We will refer to it
as the resolvent of x4 + qx2 + rx + s = 0.
2. Assume that values of z can be found by solving the cubic (1.53), and
show that the solutions to the quartic equation x^4 + qx^2 + rx + s = 0 are
the solutions to the two quadratics

    x^2 + √(z − q) x + z/2 + r√(z − q)/(2(q − z)) = 0,
                                                            (1.54)
    x^2 − √(z − q) x + z/2 − r√(z − q)/(2(q − z)) = 0.

These can be simplified somewhat, but it is not worth it.
3. Let r1 and r2 be the roots of the first quadratic in (1.54), and let r3 and
r4 be the roots of the second quadratic in (1.54). Show that
z = r1 r2 + r3 r4 .
4. List the 24 permutations of r1 , r2 , r3 and r4 . This is easy if you adopt
the convention used in (1.39).
5. Convince yourself that, of the 24 permutations of r1, r2, r3 and r4, there
are 8 that do not change the value of z, and write them out. Note
that 8 is one third of 24. Note also that we expect 3 solutions to (1.53).
We will see later that this is not a coincidence.
6. Show that the resolvent of x^4 − 23x^2 + 18x + 40 = 0 is z^3 + 23z^2 − 160z −
4004 = 0. Show that the solutions of the resolvent equation are −22, 13
and −14. Do not solve the resolvent equation. Just plug in the numbers.
A calculator will help.
7. Find the solutions of x^4 − 23x^2 + 18x + 40 = 0 by finding the solutions
to the two quadratics in (1.54). Let r1 and r2 be the solutions to one of
the quadratics in (1.54), and let r3 and r4 be the solutions to the other
quadratic in (1.54). What value do you get for r1 r2 + r3 r4? Compare this
with the numbers in Problem 6 above.

8. Evaluate r1 r2 + r3 r4 with the numbers from the previous problem after
applying all 24 permutations to r1, r2, r3 and r4. Compare the results
with the numbers in Problem 6 above.
9. Do problems 6 through 8 with x^4 − 15x^2 − 10x + 24 = 0. You must figure
out the resolvent yourself. The resolvent should have solutions 10, −14, −11.
10. Same with x^4 − 25x^2 − 60x − 36 = 0. The resolvent should have solutions
0, −9, −16.

11. Same with x^4 − 17x^2 − 36x − 20 = 0. The resolvent should have solutions
−1, −8, −8.
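Problems 6 through 8 can be checked by machine. The sketch below spoils part of Problems 7 and 8: it starts from the roots of x^4 − 23x^2 + 18x + 40 = 0 (each easily verified by substitution) and tabulates r1 r2 + r3 r4 over all 24 permutations.

```python
from itertools import permutations
from collections import Counter

# Roots of x^4 - 23x^2 + 18x + 40 = 0; each checks out by substitution.
roots = [-1, 2, 4, -5]
for x in roots:
    assert x**4 - 23*x**2 + 18*x + 40 == 0

def resolvent(z):                        # from Problem 6
    return z**3 + 23*z**2 - 160*z - 4004

# Tabulate z = r1 r2 + r3 r4 over all 24 permutations of the roots.
values = Counter(p[0]*p[1] + p[2]*p[3] for p in permutations(roots))

assert set(values) == {-22, 13, -14}                 # the resolvent's solutions
assert all(resolvent(z) == 0 for z in values)
assert all(count == 8 for count in values.values())  # 8 permutations per value
print(dict(values))
```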
Chapter 2
Objects of study
The shift from classical to modern mathematics came with a shift in the objects
studied. Classical calculus might study individual functions from the reals to
the reals, while modern calculus (or analysis) would study the set of all functions
from the reals to the reals as a single object. While this might seem interesting,
it is not obvious that it is useful. It turns out that this kind of shift is extremely
useful once the properties of the new objects are sufficiently understood.
In this chapter we will introduce new mathematical objects that form the
core of modern algebra. They will be motivated to various degrees by the
discussions in the previous chapter.
This chapter will do no more than give the definitions of the new objects
and give a few examples. Some of the examples are very familiar and we will
have the opportunity to review their basic properties. We will also build new
examples and will study the techniques that go into their construction.
The chapter following this one will explore some very elementary properties
of several of the new objects. Later chapters will focus on single objects for
deeper study.
2.1 First looks
Here, we give extremely brief and non-rigorous introductions to some of the new
objects. Later in this chapter, we will give full definitions.
2.1.1 Groups
For Galois, a “group” of permutations consisted of all permutations of the roots
of a polynomial that kept certain values, given as formulas in the roots, unchanged. Since permutations are functions from the roots to themselves, we
can compose them. If two permutations are given names such as f and g, then
we can form the composition f g (where g is applied first).
If there is a formula in the roots whose value is unchanged when we apply f
and also when we apply g, then we can ask what happens when we apply f g. A
reasonable guess is that the value is unchanged by f g as well, and we will verify
this carefully later when the proper definitions are in place.
Thus a “group” of permutations as defined by Galois is closed under composition:
if two permutations are in the group, then so is their composition.
These and other observations were eventually gathered together to form a
definition of an abstract group. The full definition will be given in Section 2.3.
While there are many examples of groups, one of the best examples is the set
of all permutations of a given set.
As consequences of the definitions, groups have restrictions on their internal
structures. These restrictions will provide useful information in our investigations of solutions of polynomial equations.
To give a simple example of such a restriction, recall that there are 6
possible permutations of the three roots of a cubic. We have seen that there
are values that are unchanged by three of the six permutations and changed by
the remaining three. However, we will eventually see that there can be no value
that is unchanged by four of the six permutations and changed by the remaining
two.
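The counting claim and the closure property can be illustrated with a short computation. The sketch below uses assumed sample root values, counts which of the 6 permutations fix a fully symmetric expression and which fix the only partly symmetric product (r1 − r2)(r2 − r3)(r1 − r3), and checks that the fixing permutations are closed under composition.

```python
from itertools import permutations

# Sample root values (assumed; chosen so no accidental coincidences occur).
r = (1.0, 2.0, 4.0)

def value(expr, p):
    """Evaluate expr after applying permutation p to the roots."""
    return expr(r[p[0]], r[p[1]], r[p[2]])

sym = lambda a, b, c: a + b + c                    # symmetric in the roots
alt = lambda a, b, c: (a - b) * (b - c) * (a - c)  # only partly symmetric

perms = list(permutations(range(3)))
fix_sym = [p for p in perms if value(sym, p) == sym(*r)]
fix_alt = [p for p in perms if value(alt, p) == alt(*r)]

assert len(fix_sym) == 6      # all six permutations fix the symmetric value
assert len(fix_alt) == 3      # exactly three fix the partly symmetric one

# Galois's key observation: the permutations fixing a value are closed
# under composition, so they form a "group" of permutations.
def compose(f, g):            # f applied after g
    return tuple(f[g[i]] for i in range(3))

assert all(compose(f, g) in fix_alt for f in fix_alt for g in fix_alt)
print("the permutations fixing the value are closed under composition")
```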
2.1.2 Fields
Several systems of numbers have come up in previous discussions, and often the
same discussion had more than one system. While the coefficients used in the
equation x^2 + x + 1 = 0 are all real, the solutions are not. Thus to discuss the
polynomial x^2 + x + 1, the real numbers suffice, but to discuss its roots, a jump
to a larger system of numbers is required.
Both the real numbers and the complex numbers are self contained when the
four operations of addition, subtraction, multiplication and division are applied.
Given any two real numbers and one of these operations, the result (except for
dividing by zero) is another real number. The same can be said about the
complex numbers. But the taking of square roots does not cooperate well with
all real numbers. The square root of −1 is not real.
A similar discussion could be had about rational numbers. Sums, products,
etc. of rational numbers are all rational, but the square root of 2 is not rational.
With the right definitions in place, the set of all numbers left unchanged by
groups of permutations of roots of polynomials has similar properties. Sums,
products, etc. of numbers from the set give other numbers from the set, but this
does not always work with n-th roots.
Galois understood the importance of such collections of numbers and the
fact that the taking of n-th roots would often require moving to a larger system
of numbers.
Eventually, systems of numbers preserved by sums, products, etc. and satisfying the usual laws (associative, distributive) came to be called fields. (Several
names in several languages were used previously, with the English word “field”
used for the first time in 1893 by E. H. Moore.) The best examples for now are
the rational numbers, the real numbers, and the complex numbers.
2.1.3 Rings
Polynomials are central to our study. Polynomials share many properties with
the integers. One can add, subtract, and multiply polynomials to get other
polynomials, just as one can add, subtract, and multiply integers to get other
integers. But dividing an integer by an integer does not always give an integer,
and similarly dividing a polynomial by a polynomial does not always give a
polynomial.
Systems with addition, subtraction, multiplication, but not necessarily division are called rings. The most fundamental example of a ring is the ring of
integers, and next in importance are rings of polynomials.
There are even more parallels between integers and polynomials. Primes can
be discussed in both settings, as well as uniqueness of factorization, greatest
common divisors, and so forth.
We will investigate rings in much less depth than groups and fields. This is
only because of time constraints and the limited nature of our goals, and not
because rings have any lesser status as mathematical objects.
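The parallel between integers and polynomials can be sketched with polynomials stored as coefficient lists (an illustration only, not a full polynomial ring implementation): sums and products of polynomials are again polynomials, but quotients in general are not.

```python
from itertools import zip_longest

# Polynomials as coefficient lists, lowest degree first.
def poly_add(p, q):
    return [a + b for a, b in zip_longest(p, q, fillvalue=0)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

p = [1, 1]        # 1 + x
q = [-1, 1]       # -1 + x

assert poly_add(p, q) == [0, 2]        # 2x: still a polynomial
assert poly_mul(p, q) == [-1, 0, 1]    # x^2 - 1: still a polynomial

# But (1 + x) divided by (x^2 - 1) is not a polynomial, just as dividing
# an integer by an integer need not give an integer.
print("sums and products of polynomials are polynomials")
```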
2.1.4 Homomorphisms
We have discussed several times the effect of permuting roots. If a formula such
as r1 + r2 is given, we can give its value a name (such as s for sum) so that we
can ask “what happens to s if we apply the permutation that switches r1 and
r2?” If this permutation is f, so that f (r1) = r2 and f (r2) = r1, then we are
really asking if there is a reasonable way to assign a value to the notation f (s).
And if d = r1 − r2 , then we are also asking if there is a reasonable way to assign
a value to f (d).
To be consistent with what we have been saying before, we should say that
s is not changed and d is negated. That is f (s) = r2 + r1 = f (r1 ) + f (r2 ) and
f (d) = r2 − r1 = f (r1 ) − f (r2 ).
Similarly, if p = r1 r2 , then f (p) should be f (p) = f (r1 )f (r2 ) = r2 r1 .
At this point, we have invented the homomorphism. There are homomorphisms of groups, homomorphisms of rings, and homomorphisms of fields. Since
we have been talking mostly about fields, we will say that a homomorphism from
one field to another is a function f from the first field to the second so that for
each two elements x and y of the first field, we have f (x + y) = f (x) + f (y) and
f (xy) = f (x)f (y). You might ask about subtraction and division, but that will
come in due time.
Structures like groups, rings and fields interact with other groups, rings and
fields through homomorphisms. The behavior of homomorphisms is important
enough to declare homomorphisms as separate objects of study.
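As a concrete illustration of the two rules, complex conjugation is a function from C to itself that respects both addition and multiplication. A quick spot check on a few assumed sample values:

```python
# Complex conjugation as a sample homomorphism from the field C to itself.
def f(z):
    return z.conjugate()

samples = [1 + 2j, -3.5 + 0.25j, 4 - 1j, 0j]
for x in samples:
    for y in samples:
        assert f(x + y) == f(x) + f(y)   # f(x + y) = f(x) + f(y)
        assert f(x * y) == f(x) * f(y)   # f(xy) = f(x)f(y)
print("conjugation respects sums and products")
```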
2.1.5 And more
Other objects will be defined. It would be difficult to indicate what they are
just now, and we have listed most of the important ones. In later chapters when
we have more machinery available, we will define more.
2.2 Functions
Functions are too basic to put off for long. Homomorphisms are functions,
permutations are functions, and we will have more uses of functions than these.
2.2.1 Sets
Notation
Functions need sets. We assume the reader is familiar with the very basics of
sets and with set operations such as union, intersection and cross product. We
need notations for sets. The reader should review basic notations for sets
such as:
1. The list: A = {1, 2, 4} is a set in which the only elements are 1, 2 and 4.
2. The implied list: B = {2, 4, 6, . . .} is a set whose only elements are the
positive, even integers.
3. Set builder: C = {x a real number | x > 7} is a set whose only elements
are those real numbers for which x > 7 is true.
We write x ∈ A to mean that x is an element of A. So in the examples
above, the set A has x ∈ A true only when x is one of 1, 2 or 4.
We reserve certain letters for certain sets. We use N for the set of nonnegative integers. In other notation N = {0, 1, 2, 3, . . .}. The elements of N are
called the natural numbers. We use Z for the set of integers, Q for the set of
rational numbers, R for the set of real numbers, and C for the set of complex
numbers.
We can denote B and C above in different ways. For example
B = {2n + 2 | n ∈ N},
C = {x ∈ R | x > 7}.
Note that in set builder notation, {x | test on x} is the set of all x that pass
the test. The notation {x | test on x} is read out loud as “the set of all x such
that ‘test on x’ is true.” This needs to be kept in mind when we discuss union
and intersection.
The empty set
There is a special set denoted ∅ and called the empty set which has no elements.
That is, a ∈ ∅ is always false.
Union and intersection
If A and B are two sets then
A ∪ B = {x | x ∈ A or x ∈ B},
A ∩ B = {x | x ∈ A and x ∈ B},
(2.1)
where A ∪ B is called the union of A and B, and A ∩ B is called the intersection
of A and B.
Note that the word “or” used in the definition of A ∪ B is always interpreted
so that in a sentence “P or Q” is considered to be true if either one of P or Q
is true or both P and Q are true and is only false if both P and Q are false.
This convention means that we are using the inclusive or. In order to invoke
the exclusive or, which does not allow both P and Q to be true at the same
time for the sentence to be true, we would have to use extra words and say “P
or Q, but not both.”
The right sides of the equalities in (2.1) are in set builder notation. So for
A ∪ B, the sentence “x ∈ A or x ∈ B” is a test. If x passes this test, it is an
element of A ∪ B. This should correspond to the definition of union that you
have seen before. Similarly, if x passes the test “x ∈ A and x ∈ B,” then x is
an element of A ∩ B.
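The defining tests in (2.1) can be spelled out directly in code. A small sketch over a finite universe of sample values:

```python
# Union and intersection rebuilt from the membership tests in (2.1).
A = {1, 2, 4}
B = {2, 4, 6, 8}
universe = range(10)

union = {x for x in universe if x in A or x in B}    # inclusive or
inter = {x for x in universe if x in A and x in B}

assert union == A | B == {1, 2, 4, 6, 8}
assert inter == A & B == {2, 4}
print(sorted(union), sorted(inter))
```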
Cross product
Cross products are sets of ordered pairs. An ordered pair (x, y) has two elements,
like a set with two elements, but unlike a set with two elements, one element
x is considered to be the first element and the other y is considered to be the
second. Thus for ordered pairs, (x, y) = (a, b) if and only if x = a and y = b,
but for two element sets, {x, y} = {a, b} if and only if either x = a and y = b,
or x = b and y = a.
If A and B are two sets, then the cross product of A and B, written A × B
is defined as
A × B = {(a, b) | a ∈ A, and b ∈ B}.
For example if A = {1, 2} and B = {a, b, c}, then
A × B = {(1, a), (2, a), (1, b), (2, b), (1, c), (2, c)}.
Another example is that the Cartesian plane, which consists of all (x, y)
where x and y are real, can be described as R × R.
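A sketch of the cross product for the example above, using Python's itertools.product; note that order matters in the pairs:

```python
from itertools import product

A = {1, 2}
B = {'a', 'b', 'c'}
AxB = set(product(A, B))     # the set of all ordered pairs (a, b)

assert len(AxB) == len(A) * len(B) == 6
assert (1, 'a') in AxB
assert ('a', 1) not in AxB   # pairs are ordered: A x B differs from B x A
print(sorted(AxB))
```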
Subsets and equality
If A and B are sets, then we say that A is a subset of B and write A ⊆ B if for
every a ∈ A, we have a ∈ B. That is, every element of A is an element of B.
If A and B are sets, then we say A = B if both A ⊆ B and B ⊆ A are true.
The definition of subset used the phrase “for every” in an essential way. This
phrase occurs frequently enough in mathematical statements to have a symbol
of its own. When ∀x is written followed by a condition on x, then for the entire
statement to be true, the condition on x must be met for every possible x. The
phrase “for every possible x” is a bit vague so it is very common to write ∀x ∈ A
where A is some set, so that the meaning is changed to “for every possible x in
the set A.” At this point, we desperately need an example.
The definition of A ⊆ B can now be written ∀x ∈ A, (x ∈ B). According to
the rules of ∀ given above, this is true if x ∈ B is true for every possible x ∈ A.
Of course, this is just a repetition of the definition of A ⊆ B that we gave in
words three paragraphs back.
The importance of the ∀ symbol is that it emphasizes what must be proven
if a statement with “for all” in it is to be proven. One standard way to prove a
statement starting with ∀x ∈ A, is to prove the condition that follows holds for
an unspecified element of A that we call x. If the proof is successful, then we
can claim that the proof would work with x taking on any value in A. This is
often worded in the proof by saying “let x be an arbitrary element of A.” This is
so standard that often this sentence is abbreviated to “let x be in A.” Of course
to be arbitrary, no further restrictions can be placed on x. To follow “let x be
an arbitrary element of A,” with “assume x = 7,” would not result in a proof
of anything that holds for all x.
We illustrate this with a proof of the following.
Lemma 2.2.1 If A ⊆ B and B ⊆ C, then A ⊆ C.
Proof. We must prove ∀x ∈ A, (x ∈ C). Let x be in A. Since A ⊆ B, we know
that x ∈ B. Since B ⊆ C, we know that x ∈ C.
We could end the proof with an explanation that since our proof worked for
an unrestricted element of A, symbolized by x, we have proven what is needed
for every element of A. However, this final argument is so well known that it is
usually left out. If you are more comfortable saying “since x ∈ A was arbitrary,
we have proven ∀x ∈ A, (x ∈ C),” then you can do so. Also if you are more
comfortable, you may say “let x ∈ A be arbitrary,” instead of the shorter “let
x be in A.” However, we will not include such extra words in these notes.
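The ∀-style subset test, and the transitivity just proved, can be mirrored in a short computation (small example sets invented for the sketch):

```python
# Small finite examples (invented for the sketch).
A = {1, 2}
B = {1, 2, 3}
C = {0, 1, 2, 3, 4}

def subset(X, Y):
    return all(x in Y for x in X)   # the test "for every x in X, x is in Y"

assert subset(A, B) and subset(B, C)
assert subset(A, C)                 # the conclusion of Lemma 2.2.1
print("A is a subset of C, as the lemma predicts")
```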
Lemma 2.2.2 If A = B and B = C, then A = C.
Proof. We must prove A ⊆ C and C ⊆ A. But A = B and B = C mean that
A ⊆ B, B ⊆ A, B ⊆ C and C ⊆ B are all true. By two applications of
the previous lemma, A ⊆ C and C ⊆ A are both true, and we are done.
Note that the proof of this lemma did not need to refer to the definition of
subset. This is because we had a useful fact about the subset relation already
proven that we could make use of. One of the greater difficulties in a math
course is to keep track of all the facts that have been proven and to make use
of them at opportune times. A standard trap that students fall into is to try to
prove everything from the definitions.
Disjoint sets
If X and Y are two sets, we say that X and Y are disjoint if X ∩ Y = ∅. That
is, there is no element that is simultaneously in both sets.
If P is a collection of sets (yes, a set of sets), then the collection is said to be
of pairwise disjoint sets if for any two sets X and Y from P that are not equal,
we have that X and Y are disjoint. That is, two different sets in P cannot
overlap.
The collection consisting of the three sets {1, 3}, {2, 5} and {4, 6} is a collection
of pairwise disjoint sets. No two of the sets that are different have any
element in common.
The collection consisting of the three sets {1, 3}, {2, 5} and {1, 6} is not a
collection of pairwise disjoint sets. The second set has nothing in common with
the first and third, but the first and third sets have 1 in common. Note that in
this example, if we take the intersection of all three of the sets in the collection
we get the empty set. So intersecting all the sets in a collection to see if you get
the empty set is not a valid test to see if the collection is of pairwise disjoint
sets.
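The pairwise test, and the failure of the intersect-everything test, can both be sketched in code:

```python
from itertools import combinations
from functools import reduce

def pairwise_disjoint(sets):
    """True when no two distinct sets in the collection share an element."""
    return all(X.isdisjoint(Y) for X, Y in combinations(sets, 2))

good = [{1, 3}, {2, 5}, {4, 6}]
bad = [{1, 3}, {2, 5}, {1, 6}]   # the first and third share the element 1

assert pairwise_disjoint(good)
assert not pairwise_disjoint(bad)

# Intersecting the whole collection is not a valid test: it is empty for
# 'bad' as well, even though 'bad' is not pairwise disjoint.
assert reduce(set.intersection, bad) == set()
print("pairwise test and whole-intersection test disagree on 'bad'")
```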
Exercises (15)
If A and B are sets, prove:
1. A ⊆ (A ∪ B)
2. (A ∩ B) ⊆ A.
3. (A ∩ B) ⊆ (A ∪ B).
4. If A ⊆ B and C ⊆ D, then (A × C) ⊆ (B × D).
5. Let A and B be two sets. Argue that showing ∀x ∈ A, (x ∉ B) is enough to
show that A and B are disjoint. If this seems too skimpy, then review the
notion of contrapositive from previous courses and recall that a statement
and its contrapositive always have the same truth value. Verify that the
contrapositive of “if x ∈ A, then x ∉ B” is “if x ∈ B, then x ∉ A.” Use
this to convince yourself that if ∀x ∈ A, (x ∉ B) has been shown, then
showing ∀x ∈ B, (x ∉ A) adds nothing new.
2.2.2 Functions
If f : A → B is a function from the set A to the set B, then for every a ∈ A, the
expression f (a) represents an element of B. There are two rules that seem too
self evident to need stating. They nevertheless need to be invoked whenever it
is claimed that a function has been specified.

The first rule is that f (a) has to specify an element of B for every a ∈ A. The
second rule is that if a = a′ and a and a′ are elements of A, then f (a) = f (a′ ).
That is, f (a) and f (a′ ) denote the same element of B. If this is thought to be
obvious, note that f (a) and f (a′ ) are not identical as printed on the page, so it
is necessary to point out that there is a reason that they are equal.
Consider the following “definition” of a function from the rationals to the
rationals. We declare that f (m/n) = 1/m. Now 2/3 = 4/6 as elements of Q, but
f (2/3) = 1/2 and f (4/6) = 1/4. Note that 1/2 ≠ 1/4 as elements of Q. Further, f (0/3) = 1/0
and 1/0 does not represent an element of Q. So our “definition” violates both
rules of a function.
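The failure can be seen concretely with exact rational arithmetic (Python's fractions module). In this sketch f takes the chosen representation m/n as two separate arguments, since the whole point is that its output depends on the representation:

```python
from fractions import Fraction

# The flawed "definition" f(m/n) = 1/m, applied to a chosen representation.
def f(m, n):
    return Fraction(1, m)    # raises ZeroDivisionError when m = 0

assert Fraction(2, 3) == Fraction(4, 6)   # one rational number, two names
assert f(2, 3) == Fraction(1, 2)
assert f(4, 6) == Fraction(1, 4)
assert f(2, 3) != f(4, 6)                 # equal inputs, unequal outputs

try:
    f(0, 3)                               # f(0/3) would be 1/0
except ZeroDivisionError:
    print("1/0 does not represent a rational number")
```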
Our emphasis has been on notation. However, functions are not just viewed
as grammatical constructs. One usually thinks of a function f : A → B as
“taking” or “sending” each element of A to its corresponding element f (a) in
B. The element f (a) in B is often referred to as the image of a (under f ).
Exercises (16)

1. Which rule(s) in the definition of a function are violated by the following
definition for f : Q → Q? Set f (m/n) = 1/n.

2. Which rule(s) in the definition of a function are violated by the following
definition for f : Q → Q? Set f (m/n) = n/m.

3. Which rule(s) in the definition of a function are violated by the following
definition for f : Q → Q? Set f (m/n) = 2m/2n.

4. Which rule(s) in the definition of a function are violated by the following
definition for f : Q → Q? Set f (m/n) = n/n.
2.2.3 Function vocabulary
If f : A → B is a function, then A is called the domain of f and B is called the
range of f .
If U ⊆ A is a subset of A, then we would like to define f (U ), the image of
U , to be the set of all elements in B that are the images of elements of U . We
could attempt to use the notation f (U ) = {f (a) | a ∈ U } for this. However, this
does not fit exactly the conventions we have for set builder notation. Rather
than try to stretch the notation for set builder (and admittedly, some books
do), we will introduce a symbol for “there exists” for two reasons. It will give
notation for f (U ) that doesn’t stretch the conventions of set builder notation,
and it helps outline proofs.
We write ∃x followed by a condition on x to mean that there is some x that
makes the condition true. As with ∀, we often restrict where the x can come
from that makes the condition true by writing ∃x ∈ A followed by the condition
on x to mean that there is some x in the set A that makes the condition true.
Now given f : A → B and U ⊆ A, we can define the image of U as
f (U ) = {x ∈ B | ∃a ∈ U, f (a) = x}.
This fits with our previous use of set builder notation.
Onto functions
We can use ∃ to define a condition on functions. We say that a function f :
A → B is onto or a surjection if ∀x ∈ B, ∃a ∈ A, f (a) = x. In words, every x in
B is the image of some a in A.
We illustrate the use of ∃ in proofs by proving the following lemma. The
lemma discusses the composition of functions which should be familiar from
calculus.
Lemma 2.2.3 If f : A → B and g : B → C are both onto, then so is their
composition gf : A → C.
Proof. Let c be in C. Since g is onto, we can let b ∈ B be such that g(b) = c.
Since f is onto, we can let a ∈ A be such that f (a) = b. Now gf (a) = g(f (a)) =
g(b) = c. Since we have found an a ∈ A for which gf (a) = c, we know that
∃a ∈ A, gf (a) = c is true.
Comments In the paragraphs before Lemma 2.2.1, we discussed the procedure
for proving a conclusion having a ∀, and the procedure was illustrated in the
proof of that lemma. Lemma 2.2.3 illustrates how to use a ∃ that comes from
the hypothesis, and how to prove a conclusion having a ∃.
The ∃ in the conclusion is easy. To prove that “there exists an x with a
certain property” one only needs to find some x with the property. However,
the correct use of ∃ when it is assumed is more subtle.
In Lemma 2.2.3, both f and g are assumed to be onto, so the definition of
being onto gets to be used twice. Each time we know something exists with a
certain property, so twice we are allowed to bring a value into the proof with
that property. The rule that must be followed here is that there must be no
other restriction on the value that is brought in.
As an example, if we know that some function f from the reals to the reals is
onto, then there is an x so that f (x) = 13. We are not allowed to say “let x be
such that f (x) = 13 and also assume that x = 7.” The extra restriction is not
legal.
In the proof of Lemma 2.2.3, we used the letter b the first time we invoked
the definition of onto, and the letter a the second time. It would not have been
legal to use the same letter both times, since the second time the letter was
used we would have been introducing the extra restriction “and assume the new
value is the same as the one we introduced earlier.” Unfortunately, this is not a
very compelling example, since had we used the same letter twice we also would
have ended up with the odd looking statement f (b) = b. We will see more
compelling examples later.
Note that we have not covered how to use a ∀ that is assumed to be true.
This is quite easy. If “∀x ∈ S some statement” is assumed to be true, and you
know p is in S, then you can write down that the statement must be true about
p.
One-to-one functions
A function f : A → B is one-to-one or an injection if ∀x, y ∈ A, if f (x) = f (y),
then x = y. That is, if two elements in A have the same image in B, then they
must have been the same element in the first place.
Lemma 2.2.4 If f : A → B and g : B → C are both one-to-one, then so is
their composition gf : A → C.
The proof will be left as an exercise.
One-to-one correspondences
A function f : A → B is a one-to-one correspondence or a bijection if it is both
one-to-one and onto.
Lemma 2.2.5 If f : A → B and g : B → C are both one-to-one correspondences, then so is their composition gf : A → C.
The proof will be left as an exercise.
Domain, range and image
If f : A → B is a function, then A is the domain of f and B is the range of
f . The subset f (A) of B will be called the image of f . Note that not all books
agree with this use of these words. However, we will be consistent with these
definitions throughout these notes.
Note that the function f above is onto if and only if its range coincides with
its image.
Exercises (17)
1. Prove Lemma 2.2.4.
2. Prove Lemma 2.2.5.
3. Use calculus to argue that f (x) = x3 + x is one-to-one from the reals to
the reals.
4. Prove that f from the rationals to the reals given by f (x) = (x + √2)²
is one-to-one. This needs the fact that √2 is not rational. Show that
the same formula gives a function from the reals to the reals that is not
one-to-one.
2.2.4 Inverse functions
With the basic notions from Section 2.2.3 in hand, we can discuss the important
concept of “reversing” a function.
Inverse images
If f : A → B is a function, and S ⊆ B, then
f −1 (S) = {x ∈ A | f (x) ∈ S}
is called the inverse image of S under f . In words, it is the set of elements in
A that f takes into S.
Notice that f −1 takes sets (subsets of B) and gives sets in return (subsets
of A). Note also that it is possible that f −1 (S) can be empty. This can happen
if f is not onto. For example, if f (x) = x2 from the reals to the reals, then
f −1 ({−1}) is the empty set.
Also note that f −1 (S) can have more than one element even if S has only
one element. This can happen if f is not one-to-one. For example f −1 ({1}) =
{−1, 1} using the same f as above.
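Both observations are easy to check mechanically. A Python sketch (using a finite sample of integers as a stand-in for the reals, which we obviously cannot search in full):

```python
# Inverse image of a set S under f: all inputs that f sends into S.
def inverse_image(f, domain, S):
    return {x for x in domain if f(x) in S}

square = lambda x: x * x
sample = range(-3, 4)                       # finite stand-in for the reals
print(inverse_image(square, sample, {-1}))  # set(): empty, f is not onto
print(inverse_image(square, sample, {1}))   # {-1, 1}: f is not one-to-one
```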
Inverses of one-to-one correspondences
If a function f is a one-to-one correspondence, then f −1 has some special properties.
Lemma 2.2.6 If a function f : A → B is a one-to-one correspondence, then
for every b ∈ B, f −1 ({b}) is a set with one element of A.
The proof will be left as an exercise.
Lemma 2.2.6 says that for a bijection f : A → B, we can think of f −1 as a
function. Each single element of B has one and only one element of A associated
to it by f −1 . When this happens, we no longer think of f −1 as taking sets to
sets and write f −1 (b) for b ∈ B instead of f −1 ({b}). We also call f −1 the inverse
function of the bijection f and write f −1 : B → A.
This puts us in the unfortunate position of having one notation f −1 that
means one thing in one situation and another in a different situation. There
is nothing to be done about it since the ambiguity is thoroughly embedded in
mathematical writing.
Lemma 2.2.7 If a function f : A → B is a bijection, then for all a ∈ A we
have f −1 (f (a)) = a and for all b ∈ B we have f (f −1 (b)) = b.
The proof will be left as an exercise.
We can add to Lemma 2.2.6.
Lemma 2.2.8 If a function f : A → B is a bijection, then so is f −1 : B → A.
The proof will be left as an exercise.
Lastly, we give a converse to Lemma 2.2.7.
Lemma 2.2.9 Let f : A → B and g : B → A be functions so that for all a ∈ A
and b ∈ B the equalities g(f (a)) = a and f (g(b)) = b hold. Then f and g are
bijections and g = f −1 .
The proof will be left as an exercise.
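For finite sets, the two conditions of Lemma 2.2.9 can be verified directly. A Python sketch with an invented three-element bijection, where the candidate inverse is built by swapping each pair (a, f (a)):

```python
# A bijection f : A -> B as a dictionary, and its candidate inverse g
# obtained by reversing every pair (a, f(a)).
f = {1: 'a', 2: 'b', 3: 'c'}
g = {b: a for a, b in f.items()}

# The two conditions of Lemma 2.2.9: g(f(a)) = a and f(g(b)) = b.
assert all(g[f[a]] == a for a in f)
assert all(f[g[b]] == b for b in g)
print(g)   # {'a': 1, 'b': 2, 'c': 3}
```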
Exercises (18)
The following require very careful reviews of the definitions.
1. Prove Lemma 2.2.6.
2. Prove Lemma 2.2.7.
3. Prove Lemma 2.2.8.
4. Prove Lemma 2.2.9. Give examples to show that both assumptions are
necessary. That is, if one only assumes for all a ∈ A that g(f (a)) = a, the
functions are not necessarily bijections. And similarly, if one only assumes
for all b ∈ B that f (g(b)) = b, the functions are not necessarily bijections.
In each case something can be said about each of the functions. What can
be said?
2.2.5 Special functions
We introduced homomorphisms in Section 2.1.4. If we wanted to bring them
in formally, then this is where the discussion would go. However, it would be
better to go into detail after other objects have been more carefully
introduced.
Homomorphisms are functions, but they have extra restrictions. Thus we
give the title “Special functions” to this brief section. Another class of special
functions is that of homomorphisms that are also bijections as functions. These
turn out to be crucial to our investigations and will be announced with fanfare
when the time is appropriate.
2.3 Groups
We used permutations in Section 2.1.1 to motivate one of the properties that
groups will have. Rather than build up more motivation we will jump right to
the definition, and then show that permutations fit the extra requirements that
we will list.
There are two ways to describe groups. One is as a set with one extra
structure, and the other is as a set with three extra structures. Since two of the
three structures can be deduced from the third, it is more typical to take the
efficient path and describe groups as having only one extra structure attached
to the set. We will start with the more efficient path, and will point out the
other later.
If we describe a group as consisting of a set with an extra structure, then
there is a need to refer to two items: the set and the structure. It turns out
that once one is used to groups, the extra structure is often (but not always)
not given any symbol. To be clear at the beginning, we will give a symbol for
the structure.
2.3.1 The definition
The one structure version
A group is a pair (G, ·) where G is a set and · represents a multiplication that
will have to obey some restrictions. To say that · is a multiplication means that
if g and h are in G, then g · h also represents an element of G. The element g · h
should be determined uniquely by g and h in that if g = g ′ and h = h′ , then
g · h = g ′ · h′ . We call · a multiplication since we regard g · h as the “product”
of g and h.
The restrictions on · are as follows.
1. For all f, g, h in G, we have (f · g) · h = f · (g · h).
2. There is an element e ∈ G so that for all g ∈ G the equalities e · g = g and
g · e = g hold.
3. For every element g ∈ G there is an element g −1 ∈ G so that g · g −1 = e
and g −1 · g = e hold, where e is as described in the previous restriction.
Often the first restriction is called the associative axiom, the second the
identity axiom, and the third the inverse axiom. The element e is called “the”
identity of the group, and the element g −1 is called “the” inverse of g. The word
“the” is in quotes in two places since we have not yet proven that there is only
one element of G that acts as in the second axiom, and that each g has only
one element that acts as in the third.
The definition just given is an example of a definition by properties. Anything satisfying these properties is a group. When a definition is given this way,
it is important that several examples be brought in to make the definition more
real. We will do that as soon as we give the “other” definition.
2.3.2 Operations
The multiplication · in the definition of a group is an operation. It takes two
elements of the group, combines them and gives another element of the group
in return. The operation · is more specifically called a binary operation since
it combines two elements. The requirement that the result of combining two
elements be uniquely determined by the elements being combined makes it a
function. The domain of the function is all pairs of elements of the group and
the range is the group itself. Thus · is a function · : G × G → G. However, we
write a · b for the image of (a, b) instead of the usual functional notation ·(a, b).
The fact that we use ordered pairs says that order is important. Many
operations that you know, such as +, can ignore order. These are commutative
operations. But others, such as −, cannot ignore order and are not commutative.
Another non-commutative operation that you know is matrix multiplication.
Operations do not have to be binary. An operation might combine three
elements at a time and be called ternary, or “combine” only one element and
be called unary. The “inverse” operation taking g to g −1 is unary. Lastly, there
are operations that take no inputs at all. These could be called “zeroary” but
the word is hard to pronounce and they are called constants instead. That is,
the function always gives the same value and needs no inputs to help decide
what value to give.
The three structure version
Now we can define a group as a quadruple (G, ·, −1 , e) consisting of a set with
three operations, · which is binary, −1 which is unary, and e which is a constant.
The operations satisfy the following three axioms where f , g and h represent
arbitrary elements of G.
1. (f · g) · h = f · (g · h).
2. e · g = g = g · e.
3. g · g −1 = e = g −1 · g.
It is easy to jump to the conclusion that this version of the definition makes
“the” identity unique and “the” inverse of an element g unique. It does make
the element that we call the identity unique, but it still does not require that
only one element behave as in the second axiom. This turns out to be true,
but needs to be proven. This will be done eventually. Similar remarks apply to
“the” inverse.
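For a finite set, either version of the definition can be checked by brute force. The Python sketch below (illustrative only; the two-element example and the helper name is_group are inventions of the sketch) tests the three axioms for a set with a binary operation:

```python
# Brute-force check of the three group axioms for a finite set G
# with binary operation op.  Here G = {1, -1} under ordinary multiplication.
def is_group(G, op):
    # op must actually land back in G (closure)
    if any(op(g, h) not in G for g in G for h in G):
        return False
    # 1. associativity
    if any(op(op(f, g), h) != op(f, op(g, h))
           for f in G for g in G for h in G):
        return False
    # 2. an identity element e
    ids = [e for e in G if all(op(e, g) == g == op(g, e) for g in G)]
    if not ids:
        return False
    e = ids[0]
    # 3. every g has an inverse
    return all(any(op(g, h) == e == op(h, g) for h in G) for g in G)

print(is_group({1, -1}, lambda g, h: g * h))   # True
```

Replacing the set by {1, 2} (still under multiplication) fails already at closure, since 2 · 2 = 4 is not in the set.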
2.3.3 Examples
These examples will be given with one operation.
Abelian examples
If + represents the usual addition, then all of (C, +), (R, +), (Q, +), (Z, +) are
groups. These groups satisfy the extra requirement x + y = y + x for all x and
y in the group. Such groups are called abelian or commutative groups. It is traditional but not required to use + for the binary operation in an abelian group.
Some examples have traditional notation of their own that takes precedence.
In all of the examples given so far, 0 is the identity element and this makes
−x the inverse of x. Using 0 for the identity of an abelian group and −x for the
inverse of an element in an abelian group is also traditional but not required.
The structure (R, ·) where · represents the usual multiplication is not a
group. The element 1 is the only possible identity, but 0 then has no inverse.
There is no real number x so that 0 · x = 1.
However, if we let R+ = {x ∈ R | x > 0}, then (R+ , ·) is an abelian
group with identity 1. In fact, we can include all the elements other than the
troublesome 0. If we let R∗ = {x ∈ R | x ≠ 0}, then (R∗ , ·) is an abelian group
with identity 1. Note that it would be impossibly confusing to insist that the
operation in these examples be written + and the identity be written as 0, so
these examples are an exception to the tradition mentioned above.
Another abelian group that we will write multiplicatively is the group of all
complex numbers of modulus 1. If we let
C1 = {z ∈ C | |z| = 1},
then (C1 , ·) is an abelian group. It is often referred to as the circle group.
Non-abelian examples
Matrix multiplication is not commutative. There are also identity matrices, but
there are many. Also multiplication is not always defined. We can fix that by
picking a particular size. We can let Mn be the set of all n × n matrices with
real entries, and let In be the n × n identity matrix. Now we have the problem
that not every n × n matrix has an inverse. If we let Mn′ be the set of all
n × n matrices with real entries and non-zero determinant, then Mn′ and matrix
multiplication gives a group that is non-abelian as long as n > 1.
Note that we do not give a symbol for the multiplication since matrix multiplication is typically written without a symbol. That is, AB is the product of
the matrix A with the matrix B.
In fact, from now on, we will typically not give a symbol for the binary
operation in a non-abelian group. That means that a particular example will
have to have its multiplication given by words. Even in abelian examples, the
operation often has no symbol. The example (R∗ , ·) given above is usually given
as R∗ with ordinary multiplication and the product of x and y is written as xy.
You should review the proof that matrix multiplication is associative. It is
not always easy to prove that a particular example of a group satisfies all the
requirements.
2.3.4 The symmetric groups
Definition
The next examples are so important that they get their own section.
If X is a set, then a permutation on X is a bijection from X to X. One
particular bijection from X to X is the identity function e. That is e(x) = x for
every x ∈ X. From this it is trivial to check that for any bijection f : X → X,
the compositions ef and f e both equal f .
From Lemma 2.2.5, the composition of bijections is a bijection. From Lemma
2.2.8, the inverse of a bijection is a bijection. From Lemma 2.2.7, if f : X → X
is a bijection, then the compositions f f −1 and f −1 f both equal e.
Lastly, functional composition is associative. We see this from (f (gh))(x) =
f ((gh)(x)) = f (g(h(x))) and ((f g)h)(x) = (f g)(h(x)) = f (g(h(x))). Reading
this quickly will not convince you that anything is going on. Read it again while
keeping careful track of parentheses.
The set of all permutations on X will be written as SX and we have argued
that SX under the operation of functional composition is a group with identity e.
It is typically non-abelian which we will show by looking at particular examples.
The group SX is usually called the symmetric group on (or of) X.
Examples and notation
If X = {1, 2, · · · , n}, then SX is usually denoted Sn . Each element of Sn is a
permutation of the integers from 1 through n. There are two standard notations
for such permutations. We give one now and the other in a later chapter after
some of the structures are better understood.
If σ ∈ Sn , then to describe σ, we must say what σ(i) is for 1 ≤ i ≤ n. The
notation

    σ = ( 1     2     3    ···   n    )                         (2.2)
        ( σ(1)  σ(2)  σ(3)  ···  σ(n) )

describes σ by having each column give the pair (i, σ(i)) with i on the top line
and σ(i) on the bottom line. The notation in (2.2) is called Cauchy notation after
one of the earliest mathematicians to investigate the properties of permutations
and possibly the first to use the notation.
The top line might seem redundant since the way it is given in (2.2), it is
predictable. However, the notation does not require that the top line appear
in numerical order and the notation works for permutations on sets other than
{1, 2, · · · , n}. For example

    ( 1 2 3 )        ( 3 1 2 )
    ( 3 1 2 )  and   ( 2 3 1 )

represent the same element of S3 and

    ( a h x )
    ( x h a )

represents a permutation on the set {a, x, h}.
Since Cauchy notation completely describes a permutation, there is enough
information in the notation to calculate compositions. Let us calculate with the
permutations below from S3 .

    σ = ( 1 2 3 ),   τ = ( 1 2 3 ).                             (2.3)
        ( 2 3 1 )        ( 3 2 1 )

Remembering that permutations are functions, we compose from right to
left. That is, the permutation on the right is applied first. With this rule, we
have

    στ = ( 1 2 3 )( 1 2 3 ) = ( 1 2 3 ),
         ( 2 3 1 )( 3 2 1 )   ( 1 3 2 )

and

    τσ = ( 1 2 3 )( 1 2 3 ) = ( 1 2 3 ).
         ( 3 2 1 )( 2 3 1 )   ( 2 1 3 )

To ensure that the information is being interpreted correctly, we do two
of the six calculations that go into the two compositions above. We have
(στ )(1) = σ(τ (1)) = σ(3) = 1,
and
(τ σ)(1) = τ (σ(1)) = τ (2) = 2.
In particular, note that στ ≠ τσ and S3 is not abelian.
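The same calculations can be done mechanically. In the Python sketch below (an illustration, not part of the notes), a permutation is stored as the bottom line of its Cauchy notation, as a tuple whose entry at position i − 1 is σ(i):

```python
# Bottom lines of the Cauchy notation for sigma and tau from (2.3):
# sigma sends 1, 2, 3 to 2, 3, 1 and tau sends 1, 2, 3 to 3, 2, 1.
sigma = (2, 3, 1)
tau = (3, 2, 1)

def compose(s, t):
    # Return the composition s after t: apply t first, then s.
    return tuple(s[t[i] - 1] for i in range(len(t)))

print(compose(sigma, tau))  # (1, 3, 2)
print(compose(tau, sigma))  # (2, 1, 3): different, so S3 is not abelian
```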
Identity and inverse
The identity in Sn is

    ( 1 2 3 ··· n )
    ( 1 2 3 ··· n ).

If σ is a permutation in Sn and it takes i to σ(i), then σ −1 is required to
take σ(i) back to i. Thus for each column in the Cauchy notation for σ, with i
on top and σ(i) on the bottom, there must be a column in the Cauchy notation
for σ −1 with σ(i) on top and i on the bottom. Thus if σ is given as in (2.2), then

    σ −1 = ( σ(1)  σ(2)  σ(3)  ···  σ(n) )
           (  1     2     3    ···   n   ).

This is one reason for not requiring that the top line be in numerical order. If
you feel compelled to put the top line in numerical order, you may do so. Thus
for σ and τ as given in (2.3), we have

    σ −1 = ( 2 3 1 ) = ( 1 2 3 ),   τ −1 = ( 3 2 1 ) = ( 1 2 3 ).
           ( 1 2 3 )   ( 3 1 2 )           ( 1 2 3 )   ( 3 2 1 )

Notice that τ −1 = τ .
Exercises (19)
1. Consider the following elements of S4 .

    σ = ( 1 2 3 4 ),   τ = ( 1 2 3 4 ),   λ = ( 1 2 3 4 ).
        ( 2 3 4 1 )        ( 2 1 3 4 )        ( 1 3 4 2 )

Compute the following. See what patterns you can find.
    σ², σ³, σ⁴,
    σ⁻¹, σ⁻², σ⁻³,
    στ, τσ,
    στσ⁻¹, σ²τσ⁻², σ³τσ⁻³,
    λ², λ³, λ⁻¹, λ⁻²,
    λτ, τλ,
    λτλ⁻¹, λ²τλ⁻².
2. What is the identity element in C1 ? What is the simplest way to give the
inverse of any z ∈ C1 ?
2.4 The integers mod k
The next examples are of extreme importance. They will appear repeatedly,
and will lead to other examples that are even more important.
To give the examples, we need more tools. The techniques that we use
to build the examples and the techniques that we use to verify some of the
properties will be used repeatedly in these notes. Thus the techniques are as
important to learn as the examples themselves. We start with a discussion of
the techniques used in the construction.
2.4.1 Equivalence relations
Relations
If X is a set, then some elements in X might be related to other elements in X.
If X is a set of people, then one person might be a cousin of another person.
The statement for an x and y in X that “x is a cousin of y” is then either true
or false. A relation will always have a “value” that is either true or false.
Some relations, such as “is a cousin of” are symmetric. If x is a cousin of
y, then y is also a cousin of x. Some relations, such as “is a parent of” are not
symmetric. We look at symmetry and two other properties of mathematical
relations.
If X is a set, then a binary relation on X is a property that any given pair
(x, y) of elements from X can either have or not have. We say binary since we
look at pairs of elements, and not triples, or other combinations. In these notes
we will only be concerned with binary relations.
Examples of relations
Some mathematical relations are famous and have their own symbols. The
relation “is less than” has its own symbol < and we write x < y to indicate that
x is less than y. Thus 2 < 3 is true, and 3 < 2 is false. Other relations in this
family are >, ≤, ≥. These relations can be applied to R, Q, Z or N. There is
no reasonable relation < that works well on C.
Another relation that applies to N or Z is “divides” and has its own symbol.
We write m|n for “m divides n.” We will have more to say about this relation
shortly.
A symbol that is often used for an arbitrary binary relation is ∼. Like the
relations above, it is written between the two elements. To say that ∼ is a
relation on a set X means that for each pair (x, y) of elements of X, either
x ∼ y is true or x ∼ y is false.
It is possible to make up myriads of examples.
1. On Z, define x ∼ y to mean x + y is even.
2. On Z, define x ∼ y to mean xy is even.
3. On R, define x ∼ y to mean x > y 2 .
4. On C, define x ∼ y to mean xy is real.
The relation “divides”
The relation “divides” is one of the most important relations in these notes. Here
we treat it carefully.
If m and n are elements of Z, then we write m|n to mean that there exists a
k ∈ Z so that n = mk. Of course, we can reword this to say that n is a multiple
of m and have exactly the same meaning, but the emphasis of “divides” over
“is a multiple of” is traditional.
Thus 2|4 is true and 2|5 is false. Note that 0|0 is true. This is in spite of
the fact that 0/0 has no sensible value that can be assigned to it. Note that 0|2
is false, and in general 0|n is false for every n ∈ Z with n ≠ 0.
The setting Z greatly affects the nature of “divides.” If we take the definition
of m|n and replace every appearance of Z by R, then we get a relation on R in
which x|y is true for all pairs (x, y) except those where x = 0 and y ≠ 0. We
will not be interested in this relation at all except as an example.
Equivalence relations
We investigate three properties of relations.
Let ∼ be a binary relation on X.
1. We say that ∼ is reflexive if for all x ∈ X, x ∼ x is true.
2. We say that ∼ is symmetric if for all x and y in X, if x ∼ y, then y ∼ x.
3. We say that ∼ is transitive if for all x, y and z in X, if x ∼ y and y ∼ z,
then x ∼ z.
Easy examples are that ≤ is reflexive and < is not. Neither is symmetric.
The relation on Z defined by x ∼ y means “x + y is even” is symmetric. Both
≤ and < are transitive. As an exercise, you will show that divides is transitive.
Convince yourself that “is a cousin of” is not transitive.
A relation ∼ on X that is reflexive, symmetric and transitive is called an
equivalence relation. The relation = is the most obvious example of an equivalence relation.
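On a finite set, all three properties can be checked exhaustively. A Python sketch (the test relations are small invented examples):

```python
# Test reflexivity, symmetry and transitivity of a relation rel(x, y)
# on a finite set X by checking every required instance.
def is_equivalence(X, rel):
    reflexive = all(rel(x, x) for x in X)
    symmetric = all(rel(y, x) for x in X for y in X if rel(x, y))
    transitive = all(rel(x, z) for x in X for y in X for z in X
                     if rel(x, y) and rel(y, z))
    return reflexive and symmetric and transitive

X = range(-5, 6)
print(is_equivalence(X, lambda x, y: (x + y) % 2 == 0))  # True
print(is_equivalence(X, lambda x, y: x <= y))            # False: not symmetric
```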
An important example
The following example will be our most important example for the time being.
It is important enough to give it a special symbol. Pick a k in Z with k ≠ 0.
We define a relation ∼k on Z by saying

    for x, y ∈ Z, x ∼k y means k|(y − x).                       (2.4)

The fact that this is an equivalence relation will be left as a set of exercises.
Exercises (20)
1. Prove that “divides” on Z is transitive and reflexive. Show by example
that it is not symmetric.
2. Prove that if k|x and k|y in Z, then k|(x + y), k|(−x), and k|(x − y) all
hold.
3. Prove that if k|x and a ∈ Z, then k|(ax).
4. Let X be a set of sets. (Yes, that is allowed. For example, X might be all
of the subsets of a set A.) Prove that = on X is an equivalence relation.
The first step is to realize that there is something to prove. Remember
that = on sets has a definition. Using the definition of equality of sets in
Section 2.2.1 prove that = is reflexive, symmetric and transitive.
5. Prove that ∼k on Z defined in (2.4) is an equivalence relation. This breaks
into three proofs which are quite easy, but still need to be written down
carefully.
6. Give five different elements t in Z so that t ∼3 0. Give five different
elements t in Z so that t ∼3 1. Give five different elements t in Z so that
t ∼3 −1.
7. Prove that ∼−k is the same as ∼k . That is, x ∼k y if and only if x ∼−k y.
2.4.2
Equivalence classes
Equivalence relations are used to build equivalence classes. Equivalence classes
break up a set in a very specific way. We first write down the kind of break up
that interests us, then define equivalence classes that come from an equivalence
relation, and then show that the equivalence classes give us the break up with
the desired properties.
Partitions
A partition of a set X is a collection P of subsets of X with the following three
properties.
1. Each set in the collection P is a non-empty subset of X. That is, every S
in P has S ⊆ X and S ≠ ∅.
2. The union of the sets in the collection P is all of X. This is the same as
saying that for every x ∈ X, there is an S in P with x ∈ S.
3. The collection P is of pairwise disjoint sets. That is, for any two sets S
and T from the collection P, either S = T or S and T are disjoint.
Equivalence classes
If ∼ is an equivalence relation on a set X, then for each x ∈ X, we define

    [x] = {y ∈ X | x ∼ y}.                                      (2.5)

We refer to [x] as the equivalence class of x. Sometimes we say “containing x”
instead of “of x” for emphasis, and sometimes we add “in X” or “under ∼” or
both if the extra clarity is needed.
Given X and ∼, the set [x] is completely determined by x. This makes x
look special to [x]. This is not the case. The next lemma shows that any element
of [x] determines [x].
Lemma 2.4.1 If ∼ is an equivalence relation on X, if x is in X, and if z is in
[x], then [z] = [x].
Proof. If z ∈ [x], then x ∼ z, and by symmetry z ∼ x. Using x ∼ z, we have that
if t ∈ [z], then z ∼ t and x ∼ z implies that x ∼ t so t ∈ [x]. This shows that
[z] ⊆ [x]. Using z ∼ x, an argument like the previous sentence that reverses the
roles of z and x shows that [x] ⊆ [z].
Lemma 2.4.1 justifies our saying that any element of an equivalence class
represents or is a representative of that class.
From the definition (2.5) it is clear that [x] is a subset of X. The next
proposition shows that the collection of equivalence classes forms a partition of
X. We write out the proof in full to illustrate one technique of proof that must
be learned.
The third item in the definition of a partition says that an “or” is true.
The standard way to prove that an “or” is true is to assume that one of the
possibilities is false, and use that to prove that the other possibility must then
be true. Since there are two possibilities in the third item of the definition, we
can choose which to assume is false. The two choices may not result in the same
amount of work. We choose the one that we think is the easier to use.
Proposition 2.4.2 If ∼ is an equivalence relation on a set X, then the collection of equivalence classes forms a partition of X.
Proof. We have already noted that each [x] is a subset of X. By reflexivity, for
each x, we have x ∼ x so x ∈ [x]. Thus each [x] is non-empty. This establishes
the first item in the definition of a partition.
The observation that for each x ∈ X we have x ∈ [x] also shows that the
union of the equivalence classes is all of X and we have the second item.
To prove the third item, we assume that [x] and [y] are not disjoint. We
want to prove that [x] = [y]. Since [x] ∩ [y] ≠ ∅ there is an element in [x] ∩ [y].
We let z be such an element. (Note that we must use a letter not used yet since
we must use the “there is” with no other restrictions on what we bring in.)
From z ∈ [x], Lemma 2.4.1 says that [z] = [x]. From z ∈ [y] we get [z] = [y], so
[x] = [y].
We will use equivalence relations and equivalence classes in the next section
to build our examples.
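The partition of Proposition 2.4.2 can be watched forming for the relation ∼k of (2.4). A Python sketch over a finite window of Z (the true classes are infinite, so restricting to a window is an assumption of the sketch):

```python
# Equivalence classes of ~k on a finite window of Z, where
# x ~k y means k divides y - x.
def classes(window, k):
    result = []
    for x in window:
        cls = frozenset(y for y in window if (y - x) % k == 0)
        if cls not in result:
            result.append(cls)
    return result

window = range(0, 12)
for cls in classes(window, 3):
    print(sorted(cls))
# [0, 3, 6, 9]
# [1, 4, 7, 10]
# [2, 5, 8, 11]
```

The three classes are non-empty, pairwise disjoint, and their union is the whole window, exactly as the proposition requires.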
Exercises (21)
1. We look at the relation ∼k defined in (2.4). What are the equivalence
classes in Z under ∼2 ? How many equivalence classes are there in Z
under ∼k ?
2.4.3 The groups
We consider the relation ∼k on Z defined in (2.4). We make a group, denoted
Zk , out of the equivalence classes of ∼k on Z. That is, each element of Zk will
be a single equivalence class, and the set of elements of Zk will be the set of
equivalence classes.
To make a group we need to define the operations. This group will be
abelian and we will use + to denote the binary operation and − to denote the
unary operation. There will be a problem with the definitions. This kind of problem
occurs often and the technique for getting rid of the problem is standard and
must be learned.
For x ∈ Z, we will write [x]k to denote the equivalence class of x in Z
under ∼k . This example is important enough to get its own notation. Since the
elements of Zk will be classes [x]k , we need to know how to add them together
and how to negate them.
For [x]k and [y]k in Zk , we define

    [x]k + [y]k = [x + y]k                                      (2.6)

and

    −[x]k = [−x]k .                                             (2.7)

Well definedness
Observe that in (2.6) the “result” [x+y]k of the definition contains a calculation
that uses elements (representatives) of the equivalence classes [x]k and [y]k . The
problem is that there are many representatives of [x]k and [y]k . Thus many
different calculations can be done in an attempt to determine [x]k + [y]k and
we need to know that these calculations always give the same equivalence class.
Let us look at some examples.
Let k = 5 and consider [1]5 + [3]5 . If we use 1 from [1]5 and 3 from [3]5 , then
we get [1]5 + [3]5 = [1 + 3]5 = [4]5 . Note that 1 ∼5 6 and 3 ∼5 8 so [1]5 = [6]5
and [3]5 = [8]5 . So we can legitimately use 6 from [1]5 and 8 from [3]5 to get
[1]5 + [3]5 = [6 + 8]5 = [14]5 . But 14 ∼5 4 so [14]5 = [4]5 and we get the same
answer. The point is that [4]5 = [14]5 even though 4 ≠ 14.
This raises hope that we will always get the same answer, but it is not a
proof that we do.
The problem that we are seeing in the definition (2.6) is called a well definedness problem. We will discuss how to recognize such problems shortly. If it can
be proven that one gets the same answer no matter which representatives are
used in the calculation, then we can say that the operation + as defined in (2.6)
is well defined.
Well definedness problems often come up when defining operations on sets
of equivalence classes. More generally (operations are really functions with
an appropriate domain) well definedness problems can come up when defining
functions on sets of equivalence classes.
We now deal with (2.6).
Lemma 2.4.3 The operation defined in (2.6) is well defined.
Proof. We must show that changing the representatives used in the right side of
the equal sign in (2.6) does not change the result. That is, the same equivalence
class is obtained.
To that end, we let x′ ∈ [x]k and y ′ ∈ [y]k be other representatives of [x] and
[y], respectively. Our goal is to show that [x′ + y ′ ]k which results from using x′
and y ′ is the same as [x + y]k , the result of using x and y. Thus we wish to
show (x + y) ∼k (x′ + y ′ ). But this requires showing that
k|((x′ + y ′ ) − (x + y))
or equivalently k|((x′ − x) + (y ′ − y)).
But we know that x ∼ x′ and y ∼ y ′ hold so k|(x′ − x) and k|(y ′ − y). The
result follows from a problem in Exercise Set (20).
The outline of the proof above is standard and should be learned. To show
well definedness, show that if different representatives are chosen, then the result
does not change. If the result is an equivalence class, then one shows that the
results of the calculation are related.
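The outline can be supplemented with a brute-force illustration. The Python sketch below checks, for k = 5 and a small range of representative shifts, that every choice of representatives produces the same class; this only illustrates the lemma and does not replace the proof:

```python
# Check that [x] + [y] = [x + y] (mod k) does not depend on which
# representatives x' ~ x and y' ~ y are used in the calculation.
k = 5
def same_class(a, b):
    return (b - a) % k == 0

# Replace x by x + m*k and y by y + n*k for a small range of m, n.
ok = all(same_class(x + y, (x + m * k) + (y + n * k))
         for x in range(k) for y in range(k)
         for m in range(-2, 3) for n in range(-2, 3))
print(ok)   # True: every choice of representatives lands in the same class
```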
Lemma 2.4.4 The unary operation defined by (2.7) is well defined.
The proof will be left as an exercise.
We now define [x]k − [y]k = [x]k + (−[y]k ) in Zk . It is wrong to think that
any definition that mentions equivalence classes must have a well definedness
problem. This definition does not. From Lemma 2.4.4, we know that −[y]k is
well defined. Now Lemma 2.4.3 tells us that the sum of the two classes [x]k and
−[y]k is well defined. So the definition given is well defined.
Had we written [x]k − [y]k = [x − y]k , then we would have had a well
definedness problem. The difference is that the second definition tells how to
calculate the answer in terms of representatives.
You might wonder why two definitions of the same thing have different behaviors. The answer is that it takes a proof that they do define the same thing.
If such a proof is found, then the second definition must be well defined since
the first definition is. This all is covered further in exercises.
Proposition 2.4.5 The pair (Zk , +) with + as defined in (2.6), with − as
defined in (2.7) is an abelian group with [0]k as the identity element.
The proof will be left as an exercise.
The group Zk will be referred to as the integers modulo k or as the integers
mod k.
From a problem in Exercise Set (20), we know that ∼k is the same relation
as ∼−k and so the two relations have the same equivalence classes. Thus there
is no difference between Zk and Z−k . From now on, we only look at Zk with
k > 0.
From a problem in Exercise Set (21), we know that Zk has k elements. Since
this is finite, we can write down a table that gives the results of the operation
for all pairs of elements in Zk . The table below gives the “addition table” for
Z4 . We have been lazy and have omitted [ ]4 from the entries. Thus the entry
3 really refers to [3]4 , and so forth.
 + | 0 1 2 3
---+--------
 0 | 0 1 2 3
 1 | 1 2 3 0
 2 | 2 3 0 1
 3 | 3 0 1 2
Leaving out the brackets and subscript will often be done to denote elements of
Zk as long as it is clear what is going on.
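As a computational aside (not part of the notes), tables like the one above can be generated mechanically. The sketch below builds the addition table of Zk using the representatives 0, 1, . . . , k − 1; the function name is our own.

```python
def addition_table(k):
    """Return the addition table of Z_k as a list of rows, using the
    representatives 0, 1, ..., k-1 for the equivalence classes."""
    return [[(a + b) % k for b in range(k)] for a in range(k)]

table = addition_table(4)
# This reproduces the table for Z_4 given above; for example the row
# for the class 3 reads 3 0 1 2.
```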
Order of group and element
We give some definitions primarily so that we can give some problems to work
on. The terms defined will take on greater importance later.
A group (G, ·) is called finite if G is a finite set. If (G, ·) is a finite group,
then the order of G is the number of elements of G. If G is not finite, then we
can say that the order of G is infinite. We write |G| to denote the order of G.
If (G, ·) is a group and g ∈ G, then (using multiplicative terminology), the
order of g is the least number of copies of g that have to be multiplied together
to give the identity element. If no such number exists, then we say that g is of
infinite order. At the moment there is no obvious connection between the two
uses of the word “order” but that will be cleared up eventually.
If (G, +) is an abelian group, then additive language can be used, and the
order of a g ∈ G will be the least number of copies of g that have to be added
together to give the identity element.
It is a reasonable guess that every element in a finite group has a finite order,
but we will not see a proof of this until later.
We note that the addition table for Z4 shows that the order of 0 is 1. (As
mentioned, we are leaving out the brackets that denote the equivalence classes.)
That is, we only need to add up one copy of 0 to get 0. On the other hand
1 ≠ 0, 1 + 1 = 2 ≠ 0, 1 + 1 + 1 = 3 ≠ 0 and 1 + 1 + 1 + 1 = 0. Thus 1 has order
4 in Z4 . We now have enough material to give some problems.
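The computation just done by hand can be automated. In the sketch below (a helper of our own devising, not from the notes), copies of g are added in Zk until the identity appears, and the count is the additive order.

```python
def additive_order(g, k):
    """Order of [g]_k in (Z_k, +): the least n >= 1 such that
    n copies of g sum to 0 modulo k."""
    total, n = g % k, 1
    while total != 0:
        total = (total + g) % k
        n += 1
    return n

# In Z_4 the order of 0 is 1 and the order of 1 is 4, as computed above.
```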
Before we list the problems, we mention one easy but important result.
Proposition 2.4.6 There is a group of every finite order.
Proof. Given a positive integer k, we know that Zk has k elements, so Zk is a group of order k.
Exercises (22)
1. Prove Lemma 2.4.4.
2. Prove that [x]k − [y]k = [x]k + (−[y]k ) and [x]k − [y]k = [x − y]k define the
same operation. That is, prove for all x and y that [x]k +(−[y]k ) = [x−y]k .
As mentioned, this proves that the second definition is well defined. Give
a completely independent proof that the second definition is well defined.
3. Prove Proposition 2.4.5. You may use facts that you know about the
integers (such as commutativity, associativity, etc. of various operations).
4. Calculate the orders of 2 and 3 in Z4 .
5. Write out the addition table for Z5 and calculate the orders of all the
elements in Z5 .
6. Write out the addition table for Z6 and calculate the orders of all the
elements in Z6 .
2.4.4
Groups that act and groups that exist
Elements of the symmetric groups Sn move things around. We will learn eventually to say that Sn “acts” on the set {1, 2, . . . , n} in that each element of Sn
permutes the elements of {1, 2, . . . , n} and that certain rules are followed.
If we recall that matrices are linear transformations, then Mn′ , the group
of n × n matrices with real entries and non-zero determinant (which are thus
invertible), “acts” on the vector space Rn . Once the rules for actions are written
down, it will be seen that this action also follows those rules.
It is less obvious that any of the other examples “act” on anything. These
examples include such familiar groups as (Z, +), (R∗ , ·), etc., and perhaps less
familiar groups such as (Zk , +). It could be said that these groups just exist in
their own right and do not “act” on anything.
It will be seen later that any group can be made to act on a set. Not
every question about groups requires looking at such an action, and not every
application of groups emphasizes such an action. However, many questions and
applications do. The view that groups act will become more important to us as
the course goes on.
2.5
Rings
Each group has one binary operation. Rings have two. The best example to keep
in mind as you read this section is the integers with addition and multiplication.
More complicated examples will show up shortly.
2.5.1
Definitions
We will not give two versions of the definition of a ring. Based on your experience
with groups, you can supply a second version with no trouble. We will give the
one with less structure spelled out.
A ring is a triple (R, +, ·) where R is a set and + and · are binary operations
on R. The following conditions must be met.
1. The set R and the addition form a commutative group. The identity
element is usually denoted 0.
2. The multiplication is associative in that a(bc) = (ab)c for all a, b and c in
R.
3. Multiplication distributes over addition in that a(b + c) = ab + ac and
(b + c)a = ba + ca for all a, b and c in R.
Some comments are needed. The multiplication need not be commutative.
This is why two distributive laws are needed. The multiplication need not have
an identity. These gaps can be filled by adding more words.
If R has an element 1 so that for all a ∈ R, we have a1 = a = 1a, then
we say that R is a ring with identity or ring with 1 or ring with unit. (Some
books choose not to deal with rings without a multiplicative identity and so
their definition of a ring coincides with our definition of a ring with identity.)
The even integers form a ring without identity.
If a ring R satisfies ab = ba for all a and b in R, then R is called a commutative ring. A commutative ring with identity can fail to be a field only by lacking multiplicative inverses. The integers form a commutative ring with identity. Two-by-two matrices with real entries form a non-commutative ring with identity.
There are more words that can be attached to rings. We will deal with some
later. The small number of assumptions about the multiplication in a ring allows
for lots of variations and this leads to many special terms.
2.5.2
Examples
The first important example (Z, +, ·) has already been mentioned. It is referred
to as the ring of integers to emphasize that both operations are being considered.
The second important examples are the rings of polynomials. Since the coefficients can come from various classes of numbers, there are various rings of
polynomials. For us the most important sources of coefficients will be the rational numbers Q, the real numbers R, and the complex numbers C.
Of course, Q, R and C are also examples of rings. However, they satisfy
many more properties than required for a ring, and they will show up again
when we discuss fields.
The ring Zk
The elements of Zk are the equivalence classes [x]k for x ∈ Z. We add classes
by [x]k + [y]k = [x + y]k having proven that this addition is well defined. We
can try to multiply classes by [x]k [y]k = [xy]k . We immediately face a well
definedness question.
Lemma 2.5.1 Setting [x]k [y]k = [xy]k for elements of Zk gives a well defined
binary operation on Zk .
The proof will be left as an exercise.
Proposition 2.4.5 says that (Zk , +) is an abelian group. Now that we have
a second binary operation, we might have a ring. To show that we have a ring,
we have to show that the second and third requirements for a ring are met. In
fact, they are and we have the following.
Proposition 2.5.2 The triple (Zk , +, ·) with + as defined in (2.6), and with ·
the multiplication discussed in Lemma 2.5.1 is a commutative ring with 1.
The proof will be left as an exercise.
We gave the addition table for Z4 after Proposition 2.4.5. Below is the
multiplication table. As before, we simplify the table by writing, for example,
3 instead of [3]4 .
 · | 0 1 2 3
---+--------
 0 | 0 0 0 0
 1 | 0 1 2 3
 2 | 0 2 0 2
 3 | 0 3 2 1
Note that the elements 1 and 3 in Z4 have multiplicative inverses, but 2 does
not. The multiplicative inverse of 1 is 1 since 1 · 1 = 1, and the multiplicative
inverse of 3 is 3 since 3 · 3 = 1. However, no x ∈ Z4 has 2 · x = 1.
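As an aside, a brute-force search (our own illustration, not an efficient method) confirms which classes in Zk have multiplicative inverses.

```python
def units(k):
    """Return {a: inverse of a} for the classes in Z_k that have
    multiplicative inverses, found by trying all products."""
    inv = {}
    for a in range(k):
        for b in range(k):
            if (a * b) % k == 1:
                inv[a] = b
    return inv

# In Z_4 only 1 and 3 are invertible; 2 * x is always 0 or 2 mod 4.
```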
Exercises (23)
1. Prove Lemma 2.5.1.
2. Prove Proposition 2.5.2.
3. Write out the multiplication tables for Z3 , Z5 and Z6 . In each determine
which elements have multiplicative inverses.
2.6
Fields
Fields are rings but with extra restrictions. Thus every field is a ring, but not
every ring is a field. Since every field is a ring, we can shorten the definition of
a field by referring to the definition of a ring and some of the named restrictions
on rings that have been discussed.
2.6.1
Definitions
A field F is a commutative ring with 1 with operations called addition (usually
written with the symbol + as in x + y) and multiplication (usually written with
no symbol as in xy) so that if 0 is the additive identity, then the following also
hold.
1. Every x ∈ F with x ≠ 0 has a multiplicative inverse x⁻¹.
2. 1 ≠ 0.
The second item is not required in all books. Demanding 1 ≠ 0 keeps the
one element ring {0} from being a field.
If x ≠ 0, we will argue later that x⁻¹ ≠ 0. You can try that now, but if you
are not careful, you will skip important steps. Accepting this fact for now, we see
that if F ∗ = F − {0}, then the multiplication on F ∗ is associative, commutative,
has an identity and has inverses. This means that the multiplication makes F ∗
into another abelian group. (Using just + on F gives the first abelian group.)
The two groups are linked by the distributive law that holds because we are
assuming that F is a ring.
If a definition is desired that makes no mention of rings, then a field is a
triple (F, +, ·) where · is usually not written, where F is a set and + and · are
binary operations. There are elements 0 and 1 in F with 1 ≠ 0 so that for all
x, y and z in F the following hold.
1. x + y = y + x.
2. (x + y) + z = x + (y + z).
3. x + 0 = x.
4. There is −x ∈ F so that x + (−x) = 0.
5. xy = yx.
6. (xy)z = x(yz).
7. x1 = x.
8. If x ≠ 0, there is x⁻¹ ∈ F so that xx⁻¹ = 1.
9. x(y + z) = xy + xz.
2.6.2
Examples
The three standard examples have been mentioned before: Q, R and C with
the usual addition and multiplication.
We will give two types of examples before leaving this chapter. They look
quite different, but are more related than they look. The view that unifies them
will come a lot later. One set of examples will need a fair amount of material
about the integers, and we will take the time to review that material in this
chapter before discussing those examples. The examples that need a little less
work will be covered now.
Adding a square root to Q
The field R has no number in it whose square is −1. We build the field C by
adding a number to R that has the property that its square is −1.
The field Q has no number in it whose square is 2. We will build a field
that we will call Q[√2] by adding a number to Q that has the property that its
square is 2. (For those not familiar with the proof, we will show at the end of
this section that there is no rational number r that has r² = 2.)
There is a real number √2 whose square is 2, so we do not have to look
far for the number that we want. Recall that √2 refers only to a positive real
number. The negative real number whose square is 2 is −√2.
√
We claim that the set of all numbers r + s 2 with r and s both rational is
closed under the four operations of addition, negation, multiplication and (for
r and s not both zero) inversion.
In fact, we will argue that it forms a perfectly
√
good field. We will use√Q[ 2] to denote this collection of numbers.
√
We know that 0 + 0 2 = 0 is an additive identity and 1 + 0 2 = 1 is a multiplicative identity. We also know that the addition and multiplication operations
in R satisfy all the requirements of a field including both commutativities,
as√
sociativities, and also the distributive law. This means that Q[ 2] will have
these
√ properties. So if our closure claims are correct, then we will know that
Q[ 2] forms a field.
We have (r + s√2) + (t + u√2) = (r + t) + (s + u)√2. From this it is clear
that −r − s√2 is an additive inverse for r + s√2. Multiplication is handled by

(r + s√2)(t + u√2) = rt + (st + ru)√2 + 2su = (rt + 2su) + (st + ru)√2.
This leaves inversion.
We used two techniques for multiplicative inversion for complex numbers.
We do the more difficult here, and leave the simpler as an exercise. Given
r + s√2 ≠ 0, we want to find t and u so that (r + s√2)(t + u√2) = 1. This is
equivalent to asking that

(rt + 2su) + (st + ru)√2 = (1) + (0)√2.

If we can find t and u so that rt + 2su = 1 and st + ru = 0, then we have a
solution. But this is just two linear equations in the two unknowns t and u with
r and s in the role of constants.
We have

rt + 2su = 1,
st + ru = 0.

Multiplying the first equation by s and the second by r gives

rst + 2s²u = s,
rst + r²u = 0,

and subtracting the second from the first gives

u(2s² − r²) = s,

so

u = s/(2s² − r²) = −s/(r² − 2s²).

Now either doing something similar, or by plugging the above value for u into
st + ru = 0, we get

t = r/(r² − 2s²).
This demonstrates that

(r + s√2)(r/(r² − 2s²) − (s/(r² − 2s²))√2) = 1 + 0√2 = 1,

so we should declare

(r + s√2)⁻¹ = r/(r² − 2s²) − (s/(r² − 2s²))√2.    (2.8)
Note that this makes no sense if r² − 2s² = 0. But this only happens if
(r/s)² = 2,
which would make 2 the square of a rational number since r and s are rational.
Since this cannot happen, we never have r² − 2s² = 0.
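Formula (2.8) can be checked with exact rational arithmetic. In the sketch below (an illustration of ours using Python's Fraction type), an element r + s√2 is stored as the pair (r, s), multiplication follows the rule derived above, and we verify that the element 3 + 2√2 from Exercise 3 below times its claimed inverse gives 1 + 0√2.

```python
from fractions import Fraction

def multiply(p, q):
    """(r + s*sqrt2)(t + u*sqrt2) = (rt + 2su) + (st + ru)*sqrt2, as pairs."""
    r, s = p
    t, u = q
    return (r * t + 2 * s * u, s * t + r * u)

def inverse(p):
    """Formula (2.8): (r + s*sqrt2)^(-1) stored as a pair of rationals."""
    r, s = p
    d = r * r - 2 * s * s   # never zero for rational r, s not both zero
    return (r / d, -s / d)

x = (Fraction(3), Fraction(2))     # the element 3 + 2*sqrt2
product = multiply(x, inverse(x))  # should represent 1 + 0*sqrt2
```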
Exercises (24)
1. Fill in the details in the calculations above for t.
2. Find (r + s√2)⁻¹ by the technique of rationalizing the denominator.
3. Use (2.8) to get (3 + 2√2)⁻¹ as an element of Q[√2] and use direct multiplication to verify that when multiplied by 3 + 2√2 it gives 1.
4. Let Q[√3] be all r + s√3 with r and s rational. Find formulas for addition,
multiplication, negation and inversion and argue that Q[√3] forms a field.
5. (Harder.) Let Q[∛2] be all r + s∛2 + t∛4 with all of r, s and t rational.
(Note that ∛4 = (∛2)².) Find formulas for addition, multiplication,
negation and inversion and argue that Q[∛2] is a field. One technique for
finding a formula for inversion will work (with a tremendous amount of
work involved) and the other will probably be impossible.
2.6.3
The irrationality of √2
The proof below is extremely standard and can be found in this form in thousands of books. It is a proof by contradiction. That is, it assumes that what is
to be proven is false, and derives an impossible situation: that some fact would
have to be both true and false. Thus it is impossible that what is to be
proven can be false, and it must therefore be true.
Getting used to proof by contradiction has some negative effects. Proof by
contradiction often becomes an addiction. Once used to the technique, some
students start using it for everything, even for statements that have simple
straightforward proofs.
Logically, a proof by contradiction is as good as any other, but if used when
not needed, it can hide what is really going on. A hard rule to follow, but a
worthwhile rule nonetheless is to never use proof by contradiction until a serious
attempt to find a direct proof has been made.
Proposition 2.6.1 For no rational number r is r² = 2.
Proof. A rational number is of the form m/n for integers m and n with n ≠ 0. We
assume that the fraction m/n is in lowest terms. In particular, we can assume
that not both m and n are even. We show the impossibility of (m/n)2 = 2 by
assuming (m/n)2 = 2 and showing that m and n have to both be even. Since
we are assuming that they are not both even, we will have a contradiction.
If (m/n)² = 2, then m² = 2n² which makes m² even. Since odd numbers
have odd squares (the square of 2k + 1 is 4k² + 4k + 1), m has to be even. (This
is a mini proof by contradiction by itself.) If m is even, it is of the form 2q for
some integer q. Now m² = 4q². So 4q² = 2n² and n² = 2q². This makes n²
even, and the argument we just gave for the evenness of m also gives that n is
even. This is the promised contradiction.
2.7
Properties of the ring of integers
From the problems in Exercise Set (23), we get the hint that Zp might be a field
as long as p is prime. This turns out to be the case and we will show this to be
true. This gives another collection of examples of fields.
We need to understand more about the integers. In this section we will
derive properties of integers that we will exploit in Section 2.8 to show that Zp
is a field when p is prime. These fields are related to the examples (such as
Q[√2]) above, but the relationship is far from obvious. It will be made obvious
much later but we want to give hints about the nature of the relationship now.
2.7.1
An outline
What will emerge in this section and the next will be an outline. It will run
through some familiar aspects of the integers, such as the division algorithm,
the existence and form of greatest common divisors, and end with properties
about prime numbers and the application in Section 2.8 to finding multiplicative
inverses in certain settings. As has been seen above, the hardest aspect in
building a field is often the construction of multiplicative inverses.
The importance of the outline is that it applies to more than just the ring of
integers and the fields that can be built from integers. Later we will see that the
same outline applies to rings of polynomials and fields that can be built from
polynomials. It is at that point that we will see the connection between the
examples of fields such as Q[√2] and the examples that we will build in Section
2.8.
The outline will cover the following topics.
1. Well ordering and induction. This only applies to the non-negative integers, and only applies indirectly to polynomials, but it is the start of all
our constructs.
2. The division algorithm. Induction reveals properties of quotients and remainders, and these properties drive all that follows.
3. Greatest common divisors. Induction and the division algorithm prove
that greatest common divisors exist and prove that they have a certain
form.
4. Primes I. The outline branches here. This is the long branch and is not
needed to construct examples of fields. It covers several crucial facts about
primes. This branch will be done in this section and its parallels in rings
of polynomials will be important later.
(a) Factorization into primes. Every integer is a product of primes. This
needs nothing but induction.
(b) Euclid’s first theorem about primes. This title is not universally used,
but it will do. The theorem states that for a, b and p in Z, if p is a
prime and p|(ab), then either p|a or p|b. This uses the facts that we
establish about greatest common divisors.
(c) Uniqueness of prime factorization. It is easy to show that every
integer factors into primes, but it is harder to show that it can only
be done in one way. It is also hard to explain exactly what that last
sentence means. This uses induction and Euclid’s first theorem.
5. Primes II—the short branch. This will be covered in Section 2.8.
(a) Building multiplicative inverses. This will show that Zp is a field
when p is prime. The facts we have learned about greatest common
divisors will be used here.
2.7.2
Well ordering and induction
The natural numbers are the non-negative integers and are denoted N. We have
N = {0, 1, 2, . . .}. The important property of the natural numbers that we wish
to discuss is the following.
Well ordering
Proposition 2.7.1 For every non-empty subset S of N, there is a least element
of S. That is, there is an element m ∈ S so that for every n ∈ S, we have m ≤ n.
The property given in the proposition is called well ordering. In the next
few pages, the well ordering property of N will be used several times.
Some comments are in order.
The well ordering of N is given as a proposition since it can be proven by
a standard induction proof. It is hoped that this has been seen in a previous
course, but if not, an exercise at the end of the section will guide you through
a proof.
The assumption that S be non-empty is obviously needed (how can m be
found in S if S has no elements?), but it is easy to forget to check this condition
when invoking well ordering.
The least element must be a member of S. This is why m ≤ n is given as
the last part of the statement instead of m < n.
The integers are not well ordered. The set Z is a perfectly good non-empty
subset of Z and Z has no least element.
The non-negative reals are not well ordered. The non-negative reals have a
least element, but not every non-empty subset does. For example, the positive
reals form a non-empty subset of the non-negative reals, and the positive reals
have no least element.
Induction
We can approximately describe induction as if 0 works, and each number works
if the previous number works, then all numbers work.
The part to focus on in that approximation is the part that says each number works if the previous number works. This part corresponds to proving a
statement is true for k + 1 under the assumption that it is true for k.
We want to replace the approximation by a statement that is stronger in
that it gives more hypothesis to work with when trying to prove something by
induction. The stronger approximate statement would read if 0 works and each
number works if all smaller numbers work, then all numbers work. However, it
turns out that the part if 0 works does not need to be stated explicitly. Read
on.
The formal statement is given below.
Proposition 2.7.2 (Strong induction) If S(n) is a statement that varies
with n ∈ N, then S(n) is true for all n ∈ N if the following holds: for each n ≥ 0
it can be proven that S(n) is true assuming that S(j) is true for all 0 ≤ j < n.
The fact that there is no separate assumption that S(0) be true is really
hidden in the wording. The statement asks that S(0) be provable from ∀j ∈
N<0 , S(j) where N<0 = {j ∈ N|j < 0} is empty. So the statement asks that
S(0) be provable from nothing. That is, it asks that S(0) be true. If it bothers
you that this assumption is hidden in the wording, then you can just add it as
a redundant requirement.
Proof. Let
F = {n ∈ N | S(n) is false}.
That is, F contains all numbers n making S(n) false. If F is empty, then no
n ∈ N makes S(n) false, S(n) is true for all n ∈ N and we are done.
So we assume F is not empty. Then it has a least element k. This means
that S(k) is false and any j ∈ N with j < k has j ∉ F and S(j) is true. But
then k fits the statement and S(k) must be true. Thus F cannot be non-empty.
then k fits the statement and S(k) must be true. Thus F cannot be non-empty.
Some comments are in order. This is called strong induction since it is
easier to use than ordinary induction. Most of the applications in this outline
will use well ordering directly. But one application below will use well ordering’s
consequence, strong induction, and it will be pointed out that ordinary induction
would be extremely hard to use in that particular situation.
As is typical with inductions, we do not have to start with 0. There is a
version for any start value s in N and it will say that S(n) is true for all n ∈ N
with n ≥ s if for each n ≥ s it can be proven that S(n) is true assuming that
S(j) is true for all s ≤ j < n.
2.7.3
The division algorithm
The main point of this section is the following proposition.
Proposition 2.7.3 (The division algorithm) Let m and d be in Z with d >
0. Then there are unique elements q and r in Z so that m = dq + r and
0 ≤ r < d.
Before we give the proof, we need some comments. The value q is usually
called the quotient of division of m by d and r is called the remainder of the
division. The usual name for d is the divisor and the usual name for m is best
forgotten.
The word “unique” appears in the statement of the proposition. The proper
use of this word always involves conditions. If there is a condition around, then
an object might be the only object that satisfies that condition. If it is, then it
is said to be the “unique” object that satisfies the condition. Nothing is ever
unique “by itself” except in colloquial speech. (E.g., “He is really unique.”)
Uniqueness proofs take a certain shape. The shape will resemble proofs we
have seen involving one-to-one functions. This is not a coincidence, and we will
remark on this after giving the proof.
Proof. We concentrate on the possible values of r. We want m = dq + r, so
r = m − dq for some q ∈ Z. For this reason we let
A = {m − dq | q ∈ Z}.
This is clearly non-empty, but it is not a subset of N. Let
B = {a ∈ A | a ≥ 0} = {m − dq | q ∈ Z and m − dq ≥ 0}.
This is clearly a subset of N, but it is not clearly non-empty. We want to apply
well ordering to B and need B to be non-empty to do so.
If m ≥ 0, then q = 0 gives m − dq = m − d0 = m − 0 = m ≥ 0.
If m < 0, then q = m gives m − dq = m − d(m) = m(1 − d). But d > 0
means d ≥ 1 so 1 − d ≤ 0. If m < 0, then m(1 − d) ≥ 0 so m − dq ≥ 0.
So there is always a non-negative value of m − dq for some value of q and B
is not empty.
By well ordering, there is a least element of B. Let r be this element. Since
r ∈ B, we know that r ≥ 0 and r = m − dq for some q, so m = dq + r. We need
to show that 0 ≤ r < d. Since we already know r ≥ 0, we need only show that
r < d.
We prove this by contradiction. If r ≥ d, then r − d ≥ 0. But r − d =
(m−dq)−d = m−d(q +1) and m−d(q +1) ≥ 0. This means r−d = m−d(q +1)
is in B. However, d > 0 implies that r − d < r. So r − d is in B and r − d < r.
This contradicts the fact that r is a least element of B. So r ≥ d is not possible
and r < d.
We have shown the “there are” part of the statement. We now have to deal
with the uniqueness.
To prove uniqueness, we assume “others” that satisfy the conditions and
prove that the “others” are the same as the ones already found. Specifically,
we assume q′ and r′ so that m = dq′ + r′ and 0 ≤ r′ < d, and we try to prove
q = q ′ and r = r′ .
With m = dq + r and m = dq ′ + r′ , we have dq + r = dq ′ + r′ . This makes
d(q − q′) = r′ − r. If r = r′, then d(q − q′) = 0 and d ≠ 0 implies q − q′ = 0 and
q = q ′ . So we are done if r = r′ .
If r ≠ r′, one must be greater than the other. Assume r′ > r. (If r > r′,
then use d(q ′ − q) = r − r′ and repeat the argument we are about to give while
reversing the roles of the primed and unprimed letters.)
We have 0 ≤ r < r′ < d, so 0 < r′ − r < d. With d(q − q′) = r′ − r, we
have d(q − q ′ ) > 0 and d > 0 implies q − q ′ > 0. So q − q ′ ≥ 1. This makes
d(q − q ′ ) ≥ d. This contradicts d(q − q ′ ) = r′ − r < d. So r = r′ . This completes
the proof.
The outline discussed in the proof for proving uniqueness is always followed
in uniqueness proofs. It should be learned thoroughly.
In proving that a function f : X → Y is one-to-one, it is shown that an
element y ∈ Y that is in the image of f has a “unique” x ∈ X so that f (x) = y.
The approach in the proof above to uniqueness is built into the definition of
one-to-one. One assumes that there is another element x′ so that f (x′ ) = y and
one must prove that x = x′ . In the definition y is not mentioned, and so the
definition reads “if f (x) = f (x′ ), then x = x′ .” It is just a shorthand for a claim
of uniqueness.
Note the hypothesis d > 0 in the statement of the division algorithm. The
following is the version with d 6= 0.
Corollary 2.7.4 Let m and d be in Z with d ≠ 0. Then there are unique
elements q and r in Z so that m = dq + r and 0 ≤ r < |d|.
The proof is left as an exercise.
The power of the division algorithm for us is that it allows us to discuss
divisibility with some certainty.
Let m and d be in Z with d ≠ 0. We want to discuss the truth of d|m.
We concentrate on the remainder r of the division of m by d. By the division
algorithm it is unique. Now if d|m, then there is some q ∈ Z so that m = dq =
dq + 0. Since 0 < |d|, we must have r = 0. (Explain how the uniqueness of the
remainder is being used here.) On the other hand, if the remainder r is 0, then
m = dq + 0 = dq and d|m. So we have proven the following.
Corollary 2.7.5 If m and d are in Z with d ≠ 0, then d|m if and only if the
remainder of the division of m by d is zero.
This is fundamental enough to be used often and without referring back to
the corollary.
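As a computational aside, Python's built-in divmod produces exactly the q and r of Proposition 2.7.3 when d > 0, including for negative m, because the quotient is floored. The wrapper name below is our own.

```python
def division_algorithm(m, d):
    """Return the unique (q, r) with m = d*q + r and 0 <= r < d, for d > 0.
    Python's divmod floors the quotient, which gives exactly this r."""
    assert d > 0
    q, r = divmod(m, d)
    return q, r

q, r = division_algorithm(-7, 3)   # -7 = 3*(-3) + 2, so q = -3 and r = 2
```

Note that truncating the quotient toward zero (as some languages do) would give a negative remainder for negative m, which is not the r of the proposition.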
2.7.4
Greatest common divisors
As promised in the outline, we will make use of the division algorithm in this
section.
If m and n are in Z, then a common divisor of m and n is an integer d that
divides both m and n. That is, d|m and d|n are both true.
Obviously, a greatest common divisor is a common divisor that is greatest
in some sense. An obvious choice for greatest is “largest.” This is used in a few
books, but is less useful than another, more widely used choice for the meaning
of greatest. We will use that other choice in these notes. In spite of the fact that
the other choice has its advantages, it also has some negative (pun definitely
intended) aspects. These negative aspects will appear shortly.
For us a greatest common divisor for m and n in Z is a common divisor g
for m and n so that if h is any common divisor of m and n, then h|g.
Note that if g and h are both positive, then h|g implies that h ≤ g. So if we
restrict to positive values, the definition we use has the power of the definition
that interprets greatest as “largest.”
The following guarantees the existence of greatest common divisors and says
something about their form.
Proposition 2.7.6 (Greatest common divisors) Let m and n be in Z with
at least one of them not equal to zero. Then there is a unique positive greatest
common divisor g of m and n. Further, the only other greatest common divisor
of m and n is −g. Lastly, g is the smallest positive integer so that there are
integers s and t so that g = ms + nt.
The appearance of the word “positive” in the uniqueness part of the statement, and the assumption that at least one of m or n is not zero will be discussed
after the proof.
Proof. Let
B = {ms + nt|s ∈ Z, t ∈ Z, ms + nt > 0}.
Clearly, B ⊆ N. We wish to use well ordering, so we need to argue that B is
not empty. This is left as an exercise.
Applying well ordering, there is a least element g of B. If g is shown to be a
greatest common divisor of m and n, then we will have shown the last sentence
in the statement of the proposition. We know g has the form g = ms + nt for
some s and t in Z.
We first have to show that g divides both m and n. We will show g|m.
We assume g does not divide m and will derive a contradiction. We have
m = gq + r with q and r in Z and 0 ≤ r < g. Since we assume g does not divide
m, we have 0 < r. Now
r = m − gq = m − (ms + nt)q = m(1 − sq) − n(tq)
is an element of B since r > 0 and both 1 − sq and tq are in Z. But r < g
contradicts the choice of g as the least element of B. So g|m. A similar proof
will show g|n and we omit that proof.
Now let h be a common divisor of m and n. Since g = ms + nt, we have that
h|g. (This follows from problems in Exercise Set (20).) This makes g a greatest
common divisor.
If g divides both m and n, then so does −g. Also if h is another common
divisor of m and n, then we know that h|g so h|(−g). This makes −g a greatest
common divisor.
Since g ∈ B, we have g > 0. We have shown the existence part of the
statement, and need to show the uniqueness claim and the claim that −g is the
only other greatest common divisor.
Assume that g ′ is another greatest common divisor. Then both g ′ and g are
common divisors and greatest common divisors. Using g as a greatest common
divisor and g ′ as a common divisor, we have g ′ |g. Reversing the roles gives g|g ′ .
So g = hg′ and g′ = kg. Substituting gives g = (hk)g or 1 = hk since g ≠ 0.
But the only integer pairs that multiply to 1 are 1 × 1 and −1 × −1. Thus k
is 1 or −1. Since g ′ = kg we have g ′ = g or g ′ = −g. Thus the only greatest
common divisors of m and n are g and −g and only g is positive. This completes
the proof.
The assumption that at least one of m or n is not zero is necessary only for
the method of proof and the word “positive” in the statement. All integers are
divisors of 0 and the only integer divisible by all other integers is 0. So calling
0 the greatest common divisor of 0 and 0 makes sense. However we will never
have a use for this.
The positive quantity g guaranteed by the proposition, the unique positive
greatest common divisor of m and n, is denoted (m, n). This notation has a
long standing tradition in spite of the fact that it competes with the notation
for an ordered pair. Usually context determines what the notation means, and
for the rest, words will have to make it specific.
To give some numbers to illustrate the notation, we consider m = 12 and
n = 8. The full set of common divisors of 12 and 8 is {−4, −2, −1, 1, 2, 4}. The
value of (12, 8) is 4. The other greatest common divisor of 12 and 8 is −4 and
is denoted −(12, 8).
This situation is similar to square roots. There are two numbers, −2 and 2, whose square is 4, but √4 is defined to be 2, and the other number whose square is 4 is −√4.
The existence of two greatest common divisors may seem like a disadvantage
over a definition where greatest means largest. If we were to use largest for
greatest, then 4 would be the only possible greatest common divisor. However
we remarked that the steps in this outline would be applied later to rings
other than the integers. In those settings there is no good way to interpret
largest, and the definition that we are using will be the only one available.
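The proof above uses well ordering and is not constructive, but the integers s and t with g = ms + nt can be computed by the extended Euclidean algorithm. The following Python sketch is our illustration only (the name ext_gcd is ours, not from the text); it assumes, as in the proposition, that at least one of m and n is nonzero.

```python
def ext_gcd(m, n):
    """Return (g, s, t) with g = m*s + n*t and g the positive gcd of m and n.

    Assumes at least one of m, n is nonzero, as in the proposition.
    """
    # Invariants maintained throughout: old_r = m*old_s + n*old_t
    # and r = m*s + n*t.
    old_r, r = m, n
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalize to the positive greatest common divisor
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

g, s, t = ext_gcd(12, 8)
assert g == 4 and 12 * s + 8 * t == 4
```

For m = 12 and n = 8 this returns g = 4 together with one choice of s and t witnessing g = ms + nt.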
2.7.5 Factorization into primes
This is one of the branches in the outline. It needs only strong induction.
In this discussion, we take negative integers into account. This adds to the
complications, but it prepares us to deal with the complications that arise when
we consider rings other than Z.
A unit in a ring R with identity 1 is a u ∈ R that has a multiplicative inverse
u−1 . Since rings are not necessarily commutative this means that we assume
both uu−1 = 1 and u−1 u = 1. In Z, the only units are 1 and −1.
2.7. PROPERTIES OF THE RING OF INTEGERS
85
In Z, a prime is some p ∈ Z that is not a unit and so that if p = ab with a
and b in Z, then at least one of a or b is a unit.
Basically, this is the usual definition of a prime integer but with negative
numbers taken into account.
Integers that are not primes include (among others) 0, 1, −1 since 0 = 3 · 0
and neither 3 nor 0 is a unit, and both 1 and −1 are units. However all of 2,
−2, 3, −3, 5, −5 are primes.
Unfortunately, when this definition is carried to rings other than Z, the word
changes. What is called prime in Z is called irreducible in rings of polynomials.
This is given here only as psychological preparation, and will not be discussed
again until much later.
The point of this section is the following.
Theorem 2.7.7 (Fundamental theorem of arithmetic) Every integer n with
n > 1 is either prime or a product p1 p2 · · · pk where each pi is a prime.
Proof. The only real work in this proof is an inequality so obvious that it seems
silly to prove it. The overall proof is by strong induction on n starting at 2.
We are proving an “or” and we assume that n is not a prime. Thus n = ab
where neither a nor b is ±1. Since n > 1 it is positive and (by negating if
needed) we can assume that both a and b are positive. Thus a > 1 and b > 1.
We claim that a < n and b < n. If a ≥ n, then b > 1 and a ≥ n combine to
say that n = ab > n · 1 = n. This is not possible, so a < n. Similarly, b < n.
The strong inductive hypothesis says that we can assume the truth of what
we are proving for all 2 ≤ j < n, so the statement we are proving is assumed
true for both a and b. Thus either a = q1 or a = q1 q2 · · · qs and either b = r1 or
b = r1 r2 · · · rt where all qi and ri are primes. Since n = ab, we have our result.
This would be insanely hard to prove by ordinary induction. The relevance
of n − 1 to the discussion when n is not prime is nil. If one were to start
with ordinary induction, then the easiest proof of the fundamental theorem of
arithmetic would be to first prove the validity of strong induction and then give
the above proof.
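The structure of the strong induction translates directly into a recursive procedure: if n is not prime, write n = ab with 1 < a, b < n and recurse on a and b. A Python sketch of this (the helper names are ours, not from the text):

```python
def smallest_factor(n):
    """Smallest d with 1 < d <= n dividing n (d == n exactly when n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def prime_factors(n):
    """Mirror the strong induction: n > 1 is prime, or n = a*b with a, b < n."""
    a = smallest_factor(n)
    if a == n:          # base case: n itself is prime
        return [n]
    # inductive step: the statement is assumed for the smaller values a and n/a
    return prime_factors(a) + prime_factors(n // a)

assert prime_factors(12) == [2, 2, 3]
```

The recursion terminates for the same reason the induction works: both a and n // a are strictly smaller than n.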
2.7.6 Euclid’s first theorem about primes
This continues the branch in the outline started by the fundamental theorem of
arithmetic. The main result in this section is a stunning display of the power
of greatest common divisors, and it also gives us an opportunity to introduce
another concept.
If m and n are in Z with at least one of them not equal to zero, then we say
that m and n are relatively prime if (m, n) = 1. The power of this is twofold.
First, this happens a lot if one of m or n is prime. This is covered in the lemma
below. The second is that (m, n) = 1 allows us to say that there are integers s
and t so that ms + nt = 1.
Lemma 2.7.8 Let n and p be in Z with p prime. If p does not divide n, then
(n, p) = 1.
The proof is left as an exercise.
Next we have Euclid’s theorem.
Theorem 2.7.9 Let p, a and b be in Z with p prime. If p|(ab), then either p|a
or p|b.
Proof. Assume p does not divide a. Then (p, a) = 1 and there are integers s and
t so that ps + at = 1. Multiplying by b gives psb + abt = b. Now p|(psb) and
p|(abt) so p divides the sum psb + abt and p|b.
The conclusion of Theorem 2.7.9 is the basis for the definition of prime in
other rings. This will be discussed at the time it is needed.
We give an easy generalization of Theorem 2.7.9 mostly to show that it is
possible to write out a careful and rigorous argument for a result so obvious
that it seems hard to know what to write down for a proof.
Corollary 2.7.10 Let p and ai , 1 ≤ i ≤ k, be in Z with p prime. If p|(a1 a2 · · · ak ),
then p|ai for some i with 1 ≤ i ≤ k.
Proof. We induct on k starting with k = 1. If k = 1 there is nothing to prove
and if k ≥ 2, then Theorem 2.7.9 says that either p|(a1 a2 · · · ak−1 ) or p|ak . If p
does not divide ak , then the inductive hypothesis applied to the other possibility
says that p|ai for some i with 1 ≤ i ≤ k − 1.
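The hypothesis that p is prime is essential in Theorem 2.7.9 and its corollary; for a composite divisor the conclusion can fail. A brute-force Python check of both phenomena (our illustration only):

```python
def divides(d, n):
    """True when d divides n in Z."""
    return n % d == 0

# Euclid's property holds for the prime 5 on every pair from a small range...
assert all(divides(5, a) or divides(5, b)
           for a in range(1, 50) for b in range(1, 50)
           if divides(5, a * b))

# ...but fails for the composite 6: 6 | 2*3 while 6 divides neither factor.
assert divides(6, 2 * 3) and not divides(6, 2) and not divides(6, 3)
```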
2.7.7 Uniqueness of prime factorization
This ends the branch of the outline that started with the fundamental theorem
of arithmetic.
The main claim here is that there is only one way to factor an integer into
primes. This is absolutely false if interpreted literally. We have
12 = 2 · 2 · 3 = 3 · 2 · 2 = (−2) · (−2) · 3 = (−2) · 3 · (−2)
which means we have to take into account at least changes of order and changes
of sign. It turns out that this is all we have to take into account. The following
lemma takes care of signs. It is based on the fact that there are only two units
in Z.
Lemma 2.7.11 Let p and q be primes in Z. If p|q, then p = q or p = −q.
The proof is left as an exercise.
Another way to state the conclusion is to say that |p| = |q|, and yet another
way is to say that q = up where u ∈ {−1, 1}.
Now we can prove the main result of this branch of the outline.
Theorem 2.7.12 (Uniqueness of factorization of integers) Let m ∈ Z be
neither 0 nor a unit. If m = p1 p2 · · · pj and m = q1 q2 · · · qk where all the pi and
qi are primes, then j = k and the subscripts of the qi can be permuted so that
for each i with 1 ≤ i ≤ j we have qi = pi or qi = −pi .
The permuting of the subscripts takes into account that one can change the
order of multiplication and not change the result. We will have more to say
about the ordering problem after the proof.
Proof. We induct on the smaller of j and k.
Since p1 |m, we have that p1 |(q1 q2 · · · qk ), and there is an i with 1 ≤ i ≤ k
so that p1 |qi . By permuting the subscripts we can assume that i = 1 so that
p1 |q1 . We have q1 = up1 for some u ∈ {−1, 1}. Note that u−1 = u so we also
have p1 = uq1 as well.
We have m = up1 q2 q3 · · · qk and we get two expressions for m/p1 . We have
m/p1 = p2 p3 · · · pj and m/p1 = (uq2 )q3 · · · qk .
We know that m/p1 is not 0, and if m/p1 is a unit, then m = ±p1 = ±uq1 .
In this case j = k = 1 and no permutation of the subscripts is needed.
If m/p1 is not a unit, then this situation is covered by the inductive hypothesis since the number of primes in the two expressions (recall that uq2 is a
prime) is j − 1 and k − 1, respectively. Applying the inductive assumption gives
j − 1 = k − 1, implying j = k, and a permutation of the subscripts of the qi ,
2 ≤ i ≤ j, has |pi | = |qi | for each i with 2 ≤ i ≤ j. This completes the proof.
There is a way to avoid permuting the subscripts which we have not taken
advantage of. We could have demanded that in the two expressions for m that
|p1 | ≤ |p2 | ≤ · · · ≤ |pj | and |q1 | ≤ |q2 | ≤ · · · ≤ |qk |. With only a bit more effort,
we could have proven that j = k and that for each i with 1 ≤ i ≤ j we have |pi | = |qi |.
We did not do this since when this theorem is proven for rings other than Z, it
is significantly more difficult to duplicate this extra hypothesis.
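Stated computationally, the theorem says that two prime factorizations of the same nonzero non-unit agree once absolute values are taken and the factors are sorted. A small Python check (the helper name is ours, not from the text):

```python
from math import prod

def same_up_to_order_and_sign(ps, qs):
    """Theorem 2.7.12's conclusion: equal length, matching up to order and sign."""
    return sorted(abs(p) for p in ps) == sorted(abs(q) for q in qs)

ps = [2, 2, 3]          # a prime factorization of 12
qs = [-3, 2, -2]        # another one, with a different order and signs
assert prod(ps) == prod(qs) == 12
assert same_up_to_order_and_sign(ps, qs)
```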
Exercises (25)
1. This exercise will prove that N is well ordered. In doing so we will assume
that for every m ∈ N, there is no element x ∈ N so that m < x < m + 1.
This fundamental fact can either be assumed as we do here, or proven
from even more fundamental assumptions about N. The proof that N is
well ordered will follow when the statement “if n ∈ S ⊆ N, then S has a
least element” is proven by induction on n. First check that the statement
in quotes holds when n = 0. Then make the inductive assumption that
the statement in quotes holds for n = k, and assume that k + 1 ∈ S ⊆ N.
The induction will be finished if we can show that S has a least element.
If k ∈ S, then S has a least element by the inductive assumption. If k ∉ S,
then let S ′ = S ∪ {k}. By the inductive assumption S ′ has a least element
m. Now finish the proof by considering the two cases m = k and m ≠ k.
2. Prove that the set of positive real numbers has no least element.
3. Prove Corollary 2.7.4 by first proving: “Let m and d be in Z with d < 0.
Then there are unique elements q and r in Z so that m = dq + r and
0 ≤ r < −d.” Hint: do not go through the whole proof of the division
algorithm, just use it. Then combine with the division algorithm to conclude
the corollary.
4. In the proof of the proposition on greatest common divisors, show that B
is not empty. This is easier than the corresponding fact in the proof of
the division algorithm proposition.
5. In the proof of the proposition on greatest common divisors, fill in the
details on why h|g.
6. Prove Lemma 2.7.8.
7. Prove Lemma 2.7.11.
2.8 The fields Zp
We return to the task of finding examples of fields. All the examples of fields
that we have seen so far have had infinitely many elements. These examples
will have only finitely many elements.
This finishes the outline given in Section 2.7.1 and is the only step in the
“other” branch that comes after greatest common divisors.
As mentioned before, the exercises in Exercise Set (23) hint that the rings
Zp might actually be fields if p is a prime. Here we verify that. The verification
uses what we know about greatest common divisors.
Proposition 2.8.1 Let p ∈ Z be a prime. Then Zp is a field.
Proof. From Proposition 2.5.2, we know that Zp is a commutative ring with 1.
A check shows that every requirement of a field is met except for the existence
of multiplicative inverses. So we must show that every [i]p ∈ Zp with [i]p ≠ [0]p
has some [j]p ∈ Zp so that [i]p [j]p = [1]p .
Since [i]p ≠ [0]p , we know that p does not divide i − 0 = i so (p, i) = 1 and
there are integers s and t so that ps + it = 1. From this we have it = 1 − ps
so [i]p [t]p = [it]p = [1 − ps]p = [1]p since 1 and 1 − ps differ by a multiple of p.
Thus [t]p is the inverse that we sought for [i]p .
We see that there is a class of finite fields. Just as with groups, we use the
word order to refer to the number of elements. We now know that for each
prime p, there is a field of order p.
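The proof is constructive: from ps + it = 1 one can read off the inverse of [i]p . The Python sketch below (our illustration, not from the text) instead finds inverses by brute force, and also shows the contrast with a composite modulus:

```python
def inverse_mod(i, p):
    """Return j with (i*j) % p == 1, or None if no such j exists."""
    for j in range(1, p):
        if (i * j) % p == 1:
            return j
    return None

# Every nonzero class in Z_p has a multiplicative inverse when p is prime...
for p in (2, 3, 5, 7, 11):
    assert all(inverse_mod(i, p) is not None for i in range(1, p))

# ...but not when the modulus is composite: [2]_6 has no inverse in Z_6.
assert inverse_mod(2, 6) is None
```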
It turns out that finite fields are well understood. We will eventually see that
there is a field of order n if and only if n = pk for some prime p and positive
integer k.
The exercise below is large. This comment is here to assure you that the
exercise is stated correctly and means what it says.
Exercises (26)
1. Learn the outline given in Section 2.7.1. Also learn the details of the
proofs of the individual steps in the outline starting with Step 2. Step 1 is
important from a logical point of view in that all that comes after is based
on Step 1. However, Step 1 is the least algebraic and knowing the details
of the proof of the main result, that N is well ordered, is not crucial. The
propositions and theorems to learn are six in number: 2.7.3, 2.7.6, 2.7.7,
2.7.9, 2.7.12, and 2.8.1.
2.9 Homomorphisms
Homomorphisms were discussed very briefly in Section 2.1.4. The discussion
centered around consequences of permuting roots of a polynomial. It was felt
that if the roots moved, then other numbers should move as well. If r1 moved
to r2 , and r2 moved to r1 (i.e., f (r1 ) = r2 and f (r2 ) = r1 ), then r1 + r2
should move to itself and r1 − r2 should move to the negative of itself. That is,
f (r1 + r2 ) = f (r1 ) + f (r2 ) = r2 + r1 and f (r1 − r2 ) = f (r1 ) − f (r2 ) = r2 − r1 .
The logic of these consequences is the basis for the idea of a homomorphism. If there are operations around, then a homomorphism should cooperate
well with the operations. Operations we have encountered have been addition,
multiplication, negation and inversion. We leave out the taking of roots since
they involve multiple answers.
Not all objects have all possible operations. If a function f : G → H is to
be a homomorphism between groups (with operation written multiplicatively),
then there are as many as three operations to discuss: multiplication (of arity
2), inversion (of arity 1), and the constant identity (of arity 0). If the identity
is 1, then we should require
f (ab) = f (a)f (b),
f (a−1 ) = (f (a))−1 ,
f (1) = 1.
Notice that all operations used to the left of the equal signs are taking place in
G and all operations used to the right of the equal signs are taking place in H.
If f : R → S is to be a homomorphism between rings with 1, then we would
want to require
f (a + b) = f (a) + f (b),
f (−a) = −f (a),
f (ab) = f (a)f (b),
f (0) = 0,
f (1) = 1.
If we are working with fields where inversion is present, we would add the
requirement that f (a−1 ) = (f (a))−1 .
It turns out that not all of these requirements need be stated. Some follow
from others. We will see the details in the next chapter where we look at
homomorphisms in more detail. For now we will just look at examples.
The linear transformations of linear algebra should be kept in mind. The
operations relevant to vector spaces are addition, negation and multiplication by
scalars. A linear transformation T is required to cooperate with these operations
and this requirement is summarized by the requirement
T (ru + sv) = rT (u) + sT (v)
for vectors u and v, and scalars r and s. The requirement T (0) = 0 need not
be stated since it is a provable consequence of the requirement above.
2.9.1 Complex conjugation
The properties of complex conjugation listed in (1.32) are most of what we
need to argue that complex conjugation is a homomorphism of fields from C to
itself. The symbol for complex conjugation does not make it look like a function.
However it is a function and we temporarily give it another symbol to make it
look like one. We define f (z) = z.
The equality z + w = z+w from (1.32) translates into f (z+w) = f (z)+f (w)
and says that complex conjugation cooperates in the proper manner with respect to addition. Three other properties listed in (1.32) translate into f (−z) =
−f (z), f (zw) = f (z)f (w) and f (z −1 ) = (f (z))−1 and show complex conjugation cooperates with negation, multiplication and inversion. All we need to add
is 1 = 1 and 0 = 0 to give f (1) = 1 and f (0) = 0. Thus complex conjugation is
a homomorphism.
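Python’s built-in complex numbers allow a numerical spot check of these identities, with z.conjugate() playing the role of f. This is only an illustration; floating point makes the inversion check approximate:

```python
# Spot-check that conjugation respects each field operation on a few samples.
samples = [1 + 2j, -3 + 0.5j, 0.25 - 4j]

def f(z):
    return z.conjugate()

for z in samples:
    for w in samples:
        assert f(z + w) == f(z) + f(w)   # addition
        assert f(z * w) == f(z) * f(w)   # multiplication
    assert f(-z) == -f(z)                # negation
    assert abs(f(1 / z) - 1 / f(z)) < 1e-12  # inversion, up to rounding

assert f(1) == 1 and f(0) == 0
```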
2.9.2 The projection from Z to Zk
For k > 0 in Z, we have the ring Zk . There is a natural function π (later
discussion will call it a projection) from Z to Zk defined by
π(i) = [i]k .
That is, i is carried by π to the equivalence class containing i. The definitions [x]k + [y]k = [x + y]k and −[x]k = [−x]k from (2.6) and (2.7) make π a
homomorphism from the group (Z, +) to the group (Zk , +).
If we also bring in the multiplication [x]k [y]k = [xy]k discussed in Lemma
2.5.1, then we get that π is also a homomorphism from the ring (Z, +, ·) to the
ring (Zk , +, ·).
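If each class [i]k is represented by its remainder in {0, . . . , k − 1}, then π is computed by i % k, and the homomorphism properties can be checked directly. A Python sketch (the choice of representatives is ours):

```python
def pi(i, k):
    """Send i to the canonical representative of its class [i]_k."""
    return i % k  # Python's % returns a value in 0..k-1 for k > 0

k = 5
for x in range(-20, 21):
    for y in range(-20, 21):
        # pi respects addition and multiplication: [x]+[y]=[x+y], [x][y]=[xy]
        assert (pi(x, k) + pi(y, k)) % k == pi(x + y, k)
        assert (pi(x, k) * pi(y, k)) % k == pi(x * y, k)
    assert (-pi(x, k)) % k == pi(-x, k)   # and negation: -[x] = [-x]
```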
In the next chapter, we will define restricted classes of homomorphisms.
Exercises (27)
1. This uses the field Q[√2] built in Section 2.6.2. Define f : Q[√2] → Q[√2]
by f (r + s√2) = r − s√2. Show that f is a homomorphism. This is very
much like showing the facts in (1.32), but with some minor differences.
Chapter 3
Theories
3.1 Introduction
What is a theory?
A theory in a subject is a collection of arguments, conclusions and tools that
answer questions in that subject. Often the arguments, conclusions and tools
are based on a small collection of assumptions, axioms or basic principles that
are considered the starting points of the theory. In this course, we will look at
several theories.
Some of the theories are connected to the new objects of study. Thus there is
a theory of groups, a theory of rings, a theory of fields. (The names are usually
given in a shorter form: group theory, ring theory, field theory.) The starting
points of these theories are the definitions of the objects themselves.
Other theories are harder to assign starting points. The theory of Galois
(or Galois theory) ties together aspects of group theory and field theory. It was
invented to answer questions about roots of polynomials, but its applications
have spread well beyond such questions. Galois theory is not usually discussed
with a starting set of axioms or basic assumptions.
Axiomatic systems
An axiomatic system is a theory with a stateable collection of assumptions or
definitions that form a starting point for the theory. It is a point of pride in
mathematics that there are many such axiomatic systems, and that it can be
claimed that all of mathematics itself can be laid out as an axiomatic system
whose base is a combination of logic and set theory.
This chapter will take a very quick and introductory look at the axiomatic
systems consisting of group theory, ring theory and field theory. This is done
for several reasons.
One reason is simply to introduce and emphasize axiomatic systems. Not
every mathematical subject is developed this way (some subjects are loose gatherings of topics that are related enough to form an area of study), but many
are. Even if a subject does not have an axiomatic development, it often lives in
a larger subject that does.
Another reason is that other than the axiomatic system for geometry developed by the Greeks, axiomatic systems did not form a major part of mathematical development until the 1800s. Thus the rise of axiomatic systems more or
less coincides with the rise of what we call modern mathematics.
A third reason is that the theories we consider here (groups, rings and fields)
have enough of an overlap at the beginning that there is some savings in considering all three at the same time.
Caveats
There are some cautions that need to be stated.
Not all mathematicians are in love with the axiomatic approach to mathematical subjects. It is felt that overemphasis on axiomatics leads to “axiomatic
tinkering” that is removed from the true beauty of mathematics and from its
relationship to the real world. Even if this criticism is accepted as valid, the
fear that “axiomatic tinkering” will come to dominate mathematics does not
take into account human variability. Hundreds of years of experience has shown
that no one approach to mathematics is likely to crowd out other approaches.
It must also be said that there is no date for the start of modern mathematics.
Nor is there a definition that separates modern mathematics from what came
before. Mathematical development is much too gradual and continuous. Even
such major events as Galois’ solution to the solvability of polynomials followed
a period of development with contributions by many individuals over a long
period of time.
Beginnings
The starting points of any axiomatic system (e.g., group theory, ring theory,
field theory) are the definitions. After the definitions are recorded, there is
nothing known about the theory except the definitions. Thus any result that
comes immediately after the definitions must follow from the definitions and
nothing else.
Once results start to accumulate in a theory, then further results can make
use of them. Ultimately however, all results derive from the definitions, either
directly or through a chain of other results that also derive from the definitions.
In this chapter, we will repeat the definitions for clarity, and then give some
of the earliest results in each theory. We do this to emphasize and exploit
some of the similarities between the theories, but also to exhibit some of the
differences.
3.2 Groups
3.2.1 The definition
In Section 2.3.1 we gave two equivalent definitions of a group. Below we use
only the “smaller” definition—the one that starts with the least structure.
A group is a pair (G, ·) where G is a set and · is a binary operation (function
from G × G to G) usually referred to as the multiplication. We will omit the
notation for the multiplication and write the product of a and b as ab. The
following requirements must be met.
1. For all a, b and c in G, we have a(bc) = (ab)c.
2. There is an element 1 ∈ G so that for all a ∈ G, we have 1a = a1 = a.
3. For each a ∈ G there is an element a−1 ∈ G so that aa−1 = a−1 a = 1.
Abelian groups
One class of groups is so important that it is worth defining them immediately
after giving the definition of a group. If a and b are in a group G, we say that
a commutes with b if ab = ba. If every pair of elements in a group commutes,
then we say that the group is commutative or that the group is abelian. It is
sometimes said that the multiplication is commutative, but I have never heard
the word “abelian” used that way. The word “abelian” always seems to be used
to refer to the group.
As mentioned in Section 2.3.3, it is common but not required to use + for
the operation and 0 for the identity in an abelian group.
3.2.2 First results
Uniqueness
There are many results that follow immediately from the existence of inverses.
The following lemma is what most are based on.
Lemma 3.2.1 If a and b are two elements in a group G, then there exists a
unique element x ∈ G that satisfies ax = b.
Proof. To show something exists, one only has to exhibit it. The element x =
a−1 b satisfies ax = a(a−1 b) = (aa−1 )b = 1b = b.
Of course we skipped how we found this element. One finds this value of
x by writing ax = b and multiplying both sides on the left by a−1 to get
a−1 (ax) = a−1 b. One has to say “both sides on the left” instead of just “both
sides” since the multiplication in a group is not always commutative. Knowing
p = q does not mean pr = rq, and in fact pr = rq might be false.
Now that a−1 (ax) = a−1 b is known, the left side simplifies to (a−1 a)x =
1x = x. Thus we get x = a−1 b. The point is that the calculation that we just
did to get x = a−1 b does not prove that ax = b if x = a−1 b. That was proven in
the first paragraph of the proof. However, the calculation that derives x = a−1 b
does have merit. It shows the uniqueness of x. We proved that if ax = b, then
x = a−1 b must be true. Thus this is the only value of x that makes ax = b true.
Lemma 3.2.1 is important, but equally important are the ideas of the proof—
that inverses always exist in a group and that inverses cancel. The lemma will
be quoted often in the rest of these notes, but equally often the techniques of
“multiply both sides on the right by the inverse” or “multiply both sides on
the left by the inverse” will be invoked without bothering to refer back to the
lemma. You should get used to looking for opportunities to use the techniques
wherever needed.
We get two important corollaries of Lemma 3.2.1.
Corollary 3.2.2 In a group G with identity 1, if ax = a, then x = 1.
Proof. Apply Lemma 3.2.1 with b = a noting that a1 = a.
Corollary 3.2.3 In a group G with identity 1, if ax = 1, then x = a−1 .
Proof. Apply Lemma 3.2.1 with b = 1 noting that aa−1 = 1.
Corollary 3.2.2 is usually interpreted as saying that there is only one element
in a group that acts as the identity.
Corollary 3.2.3 is usually interpreted as saying that for each element a in a
group, there is only one element that acts as the inverse of a.
These observations add information to the “larger” definition of a group.
If a group is defined as a set with three operations (G, ·, −1 , 1) with · binary,
−1 unary and 1 a constant satisfying the usual requirements, then the two
corollaries above say that the operation · completely determines the other two.
There is only one way to assign inverses and there is only one element that can
serve as the identity.
Corollary 3.2.4 If 1 is the identity of a group G, then 1−1 = 1.
Proof. Use 1 · 1 = 1 in Corollary 3.2.3.
Corollary 3.2.5 In a group G, for every a ∈ G, we have (a−1 )−1 = a.
Proof. If 1 is the identity of G, then for a in G, we have a−1 a = 1. Use Corollary
3.2.3 with x = a.
The last proof could have been done with different letters to make the use of
Corollary 3.2.3 more transparent. If the proof bothers you, change the letters.
It is important that you become comfortable with the proof.
Corollary 3.2.6 In a group G, for every a and b, we have (ab)−1 = b−1 a−1 .
The proof is left as an exercise.
Corollary 3.2.7 If a1 , a2 , . . . , an are in a group G, then we have
(a1 a2 · · · an )−1 = (an )−1 (an−1 )−1 · · · (a2 )−1 (a1 )−1 .
The proof is left as an exercise.
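The reversal of order in these two corollaries matters only when the group is non-abelian. The following Python sketch checks Corollary 3.2.6 in the non-abelian group of permutations of {0, 1, 2}, represented as tuples (the representation and helper names are ours, not from the text):

```python
from itertools import permutations

def compose(p, q):
    """(p q)(i) = p(q(i)) for permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """The inverse permutation: if p sends i to j, inverse(p) sends j to i."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

S3 = list(permutations(range(3)))

for a in S3:
    for b in S3:
        # Corollary 3.2.6: (ab)^{-1} = b^{-1} a^{-1}
        assert inverse(compose(a, b)) == compose(inverse(b), inverse(a))

# The naive order a^{-1} b^{-1} really does fail for some pairs in S3.
assert any(inverse(compose(a, b)) != compose(inverse(a), inverse(b))
           for a in S3 for b in S3)
```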
3.2.3 Subgroups
If G is a group, then a subset S of G is called a subgroup if using the operation
of G on the elements of S makes S a group. If any element of S acts as the
identity element on S, then it will act as the identity on at least one element
of G, and by Corollary 3.2.2 it must be the identity of G. Similarly, Corollary
3.2.3 says that each element of S must have its inverse from G also in S. Lastly,
if a and b are in S, then certainly ab (as computed in G) must be in S. Once
these requirements are satisfied, then the associative law will be satisfied (since
it holds in G), and S will be a group.
For example, if (Z, +) is the group, then the even integers form a subgroup.
Recall that 0 is the identity in (Z, +). The following lemma is given as a more
efficient way to check that a subset of a group forms a subgroup.
Lemma 3.2.8 If S is a non-empty subset of a group G, then S forms a subgroup
if for every a and b in S the element a−1 b is also in S.
The proof is left as an exercise.
The mention of a−1 in the lemma, and the requirement that the subset
be non-empty, often have the consequence that using the lemma is sometimes
no more efficient than showing that a subset is a subgroup directly from the
definitions. However, there are situations where using the lemma is easier, so it
is worth stating and proving.
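For a small finite group the criterion of Lemma 3.2.8 can be tested exhaustively. A Python sketch in the additive group (Z12 , +), where a−1 b becomes (−a + b) mod 12 (the setup is our illustration, not from the text):

```python
n = 12  # work in the additive group (Z_12, +)

def is_subgroup(S):
    """Lemma 3.2.8's criterion: S nonempty and closed under a^{-1} b."""
    return bool(S) and all((-a + b) % n in S for a in S for b in S)

assert is_subgroup({0, 3, 6, 9})        # multiples of 3
assert is_subgroup({0, 6})              # multiples of 6
assert not is_subgroup({0, 3, 6})       # fails: (-6 + 3) % 12 = 9 is missing
assert not is_subgroup(set())           # the subset must be nonempty
```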
Intersections
The following is a template for many almost identical lemmas.
Lemma 3.2.9 Let G be a group and let C be a collection of subgroups of G.
Then the intersection of all the subgroups in C is a subgroup of G.
Proof. Let A be the intersection of all the subgroups in C. We will use Lemma
3.2.8. It is just barely more efficient to do so than to not do so.
We note that A is not empty since 1 must be an element of every subgroup
in the collection C, and thus also an element of A.
Now let a and b be in A. By the definition of intersection, we must have
that a is in every subgroup in the collection C. Similarly, b is in every subgroup
in the collection C. So a−1 is in every subgroup in the collection C as is a−1 b.
But this puts a−1 b in the intersection A. By Lemma 3.2.8, A is a subgroup of G.
Note that the collection C is not necessarily the collection of all subgroups
of G. In fact, C might have very few subgroups in it.
Obvious subgroups
If G is a group with identity 1, then {1} is a subgroup of G. As trivial as it is,
it satisfies all of the requirements of a subgroup. As trivial as it is, it is given
the name trivial subgroup or simply identity subgroup.
For any group G, the group G itself is also a subgroup of G. There is less
agreement on a name for this group, but it can be referred to as the full subgroup
of G or whole group.
The phrase proper subgroup of G in some texts refers to a subgroup that is
not the whole group, and in others refers to a subgroup that is not the whole
group and is not the trivial subgroup. We will not rely on one word to do all
the work and if we want to refer to a subgroup of G that is not G and not {1},
then we will call it a proper, non-trivial subgroup.
As uninteresting as the whole group and the trivial subgroup might seem,
they are still subgroups and must be included in any list of all the subgroups of
a group.
Generating subgroups
We give another template for many almost identical lemmas.
If C is a collection of subgroups in a group G, then a subgroup H in the
collection C is said to be the “smallest” subgroup in the collection C if for every
subgroup K in the collection C we have H ⊆ K. In words, H is one of the
subgroups in C and it is contained in every subgroup in C.
Note that a collection C of subgroups of G cannot have two different smallest
subgroups. If A and B are both the smallest subgroups in the collection C, then
A ⊆ B and B ⊆ A would both be true, and we would have A = B.
In the next lemma, note the use of the word “subset” as opposed to “subgroup” in the statement.
Lemma 3.2.10 Let S be a subset of a group G. Then in the collection of all
subgroups of G that contain S there is a smallest subgroup.
Proof. Let C be the collection of all the subgroups of G that contain S. This
is not an empty collection since G is in C. Let A be the intersection of all the
subgroups in the collection C. By Lemma 3.2.9, we know that A is a group.
Since S is in every subgroup in the collection C, it is in the intersection A. So
S ⊆ A. This makes A one of the subgroups in C.
To show that A is the smallest subgroup in C, we let B be another subgroup
in the collection C. That is, B is one of the subgroups we are intersecting to
create A. But every element of A must be in every one of the subgroups being
intersected. In particular, every element of A must be in B. This makes A ⊆ B
which is the last item needed to be proven.
If G is a group and S is a subset of G, then the smallest subgroup of G
that contains S (whose existence is guaranteed by Lemma 3.2.10) is called the
subgroup of G generated by S. It is denoted hSi. From Lemma 3.2.10 we know
that hSi exists, but we have no idea yet what is in it. This will be discussed
later.
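Lemma 3.2.10 describes hSi from above, as an intersection. Computationally it is easier to build hSi from below: start with S and the identity and close under the operation and inverses until nothing new appears. A Python sketch in the additive group (Z12 , +) (our illustration; both points of view give the same subgroup):

```python
n = 12  # again the additive group (Z_12, +)

def generated(S):
    """Close S under the group operation and inverses; the result is <S>."""
    H = set(S) | {0}            # the identity lies in every subgroup
    while True:
        new = {(a + b) % n for a in H for b in H} | {(-a) % n for a in H}
        if new <= H:            # nothing new appeared: H is closed
            return H
        H |= new

assert generated({4}) == {0, 4, 8}
assert generated({8}) == {0, 4, 8}           # 8 generates the same subgroup
assert generated({3, 4}) == set(range(12))   # gcd(3,4)=1 generates everything
```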
We will become fascinated with subgroups when we get deeper into group
theory in later chapters.
3.2.4 Homomorphisms
We gave an informal definition of a homomorphism in Section 2.9. Before we
give a formal definition, we give a lemma that motivates the smallness of the
definition that we give.
Lemma 3.2.11 Let f : G → H be a function from the group G (with identity
1G ) to the group H (with identity 1H ). If for all a and b in G we have f (ab) =
f (a)f (b), then f (1G ) = 1H and for all a ∈ G we have f (a−1 ) = (f (a))−1 .
Proof. Since 1G 1G = 1G , we have f (1G )f (1G ) = f (1G ) and Corollary 3.2.2 says
that f (1G ) = 1H .
Since aa−1 = 1G , we have f (a)f (a−1 ) = f (1G ) = 1H and Corollary 3.2.3
says that f (a−1 ) = (f (a))−1 .
In Section 2.9 we wanted a homomorphism to be a function that cooperates
with all operations on a group. Lemma 3.2.11 says that to get all the cooperation we require, we only need to demand that the function cooperate with
the multiplication. This is not a surprise in view of the two corollaries quoted
in the proof above which say that the multiplication determines the other two
operations. This leads to the following definition.
A function f : G → H between groups is said to be a homomorphism if for
every a and b in G, we have f (ab) = f (a)f (b).
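As a quick illustration (a sketch of mine, using the maps from a later exercise on (Z, +)), the defining equation can be spot-checked on a finite sample; a finite check cannot prove a map is a homomorphism, but it can refute one:

```python
# Sketch (mine): spot-checking f(a + b) = f(a) + f(b) on (Z, +).
# A finite sample cannot prove the property, but it can refute it.
f = lambda m: 2 * m        # the doubling map
g = lambda m: m + 1        # the shift map

sample = range(-5, 6)
print(all(f(a + b) == f(a) + f(b) for a in sample for b in sample))  # True
print(all(g(a + b) == g(a) + g(b) for a in sample for b in sample))  # False
```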
It is worth discussing in more detail what is going on in a homomorphism.
If f : G → H is a homomorphism between groups, then if three elements a, b
and c in G are related by having c = ab, then the three corresponding elements
f (a), f (b) and f (c) in H must also be related by having f (c) = f (a)f (b).
An important fact about homomorphisms is the following.
Lemma 3.2.12 If f : G → H and h : H → K are homomorphisms between
groups, then hf : G → K is a homomorphism.
The proof is left as an exercise.
3.2.5
Subgroups associated to a homomorphism
The image of a homomorphism
Let h : G → H be a homomorphism between groups. The image h(G) of the
homomorphism is simply the image of the function. It is a subset of H. We
have the following.
Lemma 3.2.13 Let h : G → H be a homomorphism between groups. Then the
image h(G) of the homomorphism is a subgroup of H.
Proof. Let 1G and 1H be, respectively, the identity of G and the identity of H.
Since 1H = h(1G ) is in h(G), we know that h(G) is not empty.
If a and b are in h(G), there are x and y in G so that h(x) = a and h(y) = b.
Now
a−1 b = (h(x))−1 h(y) = h(x−1 )h(y) = h(x−1 y)
and since x−1 y is in G, we have a−1 b is in h(G). By Lemma 3.2.8, h(G) is a
subgroup of H.
The kernel of a homomorphism
Let h : G → H be a homomorphism between groups with 1H the identity of H.
The kernel of h, denoted Ker(h) is defined by
Ker(h) = {x ∈ G | h(x) = 1H }.
In words, the kernel of h is all the elements of G that map to the identity in H.
It is a subset of G.
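As a finite illustration (my own example, not from the notes), take the reduction map from Z12 to Z4, which is a homomorphism of additive groups:

```python
# Sketch (my own finite example): the reduction homomorphism
# h : Z12 -> Z4 given by h(x) = x mod 4 (both groups additive).
h = lambda x: x % 4
ker = {x for x in range(12) if h(x) == 0}
print(sorted(ker))  # [0, 4, 8], the multiples of 4 in Z12
```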
Lemma 3.2.14 Let h : G → H be a homomorphism between groups. Then
Ker(h) is a subgroup of G.
Proof. Let K = Ker(h). We have h(1G ) = 1H so 1G is in K.
For a ∈ K we have h(a−1 ) = (h(a))−1 = (1H )−1 = 1H , so a−1 is in K.
For a and b in K, we have h(ab) = h(a)h(b) = 1H 1H = 1H .
A proof using Lemma 3.2.8 would be no shorter.
The proof of Lemma 3.2.14 uses three properties of identities to get the three
properties needed for a subgroup. In words, we used the fact that the image
of an identity is an identity, the inverse of the identity is the identity, and the
product of two identities is an identity.
There is another property of the identity element that we have not exploited
that is not necessarily shared by other elements. If 1 is the identity of a group
G, then for all a ∈ G, we have a1a−1 = 1. If a group is commutative, then
aba−1 = b always holds, but without commutativity we cannot guarantee this.
The extra property of the identity leads to the following.
Lemma 3.2.15 Let h : G → H be a homomorphism between groups. Then for
every a ∈ Ker(h) and for every b ∈ G, we have bab−1 ∈ Ker(h).
The proof is left as an exercise.
The result of Lemma 3.2.15 is turned into a definition. If N is a subgroup
of a group G, then we say that N is normal in G and write N ⊳ G if for every
a ∈ N and b ∈ G, we have bab−1 ∈ N .
Not all subgroups of all groups are normal. This will be seen later in the
notes, or in a problem below if you are impatient. The point is that kernels of
homomorphisms are somewhat special.
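To see concretely that not every subgroup is normal, here is a Python sketch (my own small example in S3, smaller than the S4 example in the exercises) testing the condition bab−1 ∈ N directly:

```python
from itertools import permutations

# Sketch (my own example in S3, not the S4 one from the exercises):
# permutations are tuples p with p[i] the image of i.
def comp(p, q):                       # (p o q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    r = [0] * len(p)
    for i, v in enumerate(p):
        r[v] = i
    return tuple(r)

G = list(permutations(range(3)))
N = [(0, 1, 2), (1, 0, 2)]            # identity and the transposition 0<->1
normal = all(comp(comp(b, a), inverse(b)) in N for a in N for b in G)
print(normal)  # False: conjugation moves the transposition out of N
```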
There are parallels to Lemmas 3.2.9 and 3.2.10 for normal subgroups. We
will not have need for such lemmas and do not state them separately here.
However, we give their statement and proof as an optional exercise.
We end this section with a very easy fact. However, it is important enough
to state separately as a lemma.
Lemma 3.2.16 In an abelian group, every subgroup is normal.
The proof is left as an exercise.
3.2.6
Homomorphisms that are one-to-one and onto
This section discusses when two groups have “essentially the same structure.”
We argue here that this notion is captured by homomorphisms that are one-to-one and onto. First we give a lemma.
Lemma 3.2.17 Let h : G → H be a homomorphism between groups that is
a one-to-one correspondence. Then the inverse function h−1 : H → G is a
homomorphism.
Proof. Let x and y be in H. Then a = h−1 (x) is the unique element of
G for which h(a) = x and b = h−1 (y) is the unique element of G for which
h(b) = y. Since h is a homomorphism, h(ab) = h(a)h(b) = xy and we have that
h−1 (xy) = ab = h−1 (x)h−1 (y). This is what is needed to show that h−1 is a
homomorphism.
We say that a homomorphism h : G → H between groups is an isomorphism if it is also a one-to-one correspondence. We also say that G and H are
isomorphic.
Note that saying that two groups are isomorphic gives less information than
specifying an isomorphism between them. Saying that two groups are isomorphic says that an isomorphism between them exists, but does not say exactly
what that isomorphism might be.
We expand on earlier remarks that we made about homomorphisms. If
f : G → H is a function and a, b and c are elements of G, then we can consider
the corresponding elements f (a), f (b) and f (c) in H. We have noted earlier that
if f is a homomorphism, then c = ab implies f (c) = f (a)f (b). Now we can add
the remark that if f is an isomorphism, then c = ab if and only if f (c) = f (a)f (b).
It is this equivalence that allows us to claim that an isomorphism between
two groups shows that the two groups are “essentially the same.” We can give
more specifics.
Lemma 3.2.18 Let h : G → H be an isomorphism between groups. Then the
following hold.
1. For a and b in G, we have b = a−1 if and only if h(b) = (h(a))−1 .
2. For a in G, the order of a (as defined near the end of Section 2.4.3) in G
equals the order of h(a) in H.
3. For a subset S of G, we have that S is a subgroup of G if and only if h(S)
is a subgroup of H.
The proof is left as an exercise.
3.2.7
The group of automorphisms of a group
If G is a group, then i : G → G where i(x) = x for every x ∈ G is clearly a
homomorphism. Since it is also one-to-one and onto, it is an isomorphism. This
particular homomorphism/isomorphism is called the identity isomorphism.
There can be more isomorphisms from a group to itself. Consider the group
Z5 under addition. We will build an isomorphism h from Z5 to itself that is not
the identity.
The elements of Z5 (written without the brackets) are 0, 1, 2, 3, and 4.
Since 0 is the identity element, we must have h(0) = 0. If we next consider 1,
we can try to have h(1) something other than 1. Let us try h(1) = 2. Now
2 = 1 + 1, so we must have h(2) = h(1 + 1) = h(1) + h(1) = 2 + 2 = 4.
With 3 = 2 + 1, we get h(3) = h(2 + 1) = h(2) + h(1) = 4 + 2 = 1. Lastly,
h(4) = h(3 + 1) = h(3) + h(1) = 1 + 2 = 3.
We have defined a function h : Z5 → Z5 that can be represented by the
following permutation on {0, 1, 2, 3, 4}:
0 1 2 3 4
0 2 4 1 3
Since h is one-to-one and onto, it will be an isomorphism from Z5 to Z5 if it
is a homomorphism. We give the following argument that h is a homomorphism.
If a and b are two elements of Z5 , then a is the sum of a ones and b is the
sum of b ones. That means that a + b is the sum of a + b ones. Now h(a) is the
sum of a twos, h(b) is the sum of b twos and h(a + b) is the sum of a + b twos.
But h(a) + h(b) is also the sum of a + b twos. This makes h a homomorphism
and thus an isomorphism.
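The verification above can also be done by brute force. A small Python sketch (mine) checks both that h is a homomorphism and that it is a permutation of Z5:

```python
# Sketch (mine): brute-force check that h(x) = 2x mod 5 is an
# automorphism of (Z5, +): a homomorphism and a permutation.
h = lambda x: (2 * x) % 5
is_hom = all(h((a + b) % 5) == (h(a) + h(b)) % 5
             for a in range(5) for b in range(5))
is_bijection = sorted(h(x) for x in range(5)) == list(range(5))
print(is_hom and is_bijection)        # True
print([h(x) for x in range(5)])       # [0, 2, 4, 1, 3]
```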
We call an isomorphism from a group to itself an automorphism. The example shows that there can be automorphisms of a group other than the identity.
We get structure from the set of all automorphisms of a group. The set
of all automorphisms of a group G is denoted Aut(G). Note that since each
automorphism of G is a one-to-one correspondence from G to G, it is also a
permutation of G. Thus Aut(G) is a subset of SG , the group of all permutations
of G. It turns out that Aut(G) is actually a subgroup of SG .
Lemma 3.2.19 Let G be a group and let f and h be in Aut(G). Then f h and
f −1 are in Aut(G). It follows that Aut(G) is a group with composition as the
group operation.
Proof. From Lemma 2.2.5, we know that f h is a one-to-one correspondence.
From Lemma 3.2.12, we know that f h is a homomorphism. Thus it is an
isomorphism and since it goes from G to itself, it is an automorphism of G.
From Lemma 2.2.8, we know that f −1 is a one-to-one correspondence. From
Lemma 3.2.17, we know that f −1 is a homomorphism. Thus it is an automorphism of G.
We already know that the identity from G to itself is an automorphism. This
gives all the facts we need to claim that Aut(G) is a subgroup of SG .
The group Aut(G) is called the automorphism group of G.
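For small groups, Aut(G) can be enumerated by brute force. The following sketch (mine; it only confirms numerically what a later exercise asks you to prove) searches all permutations of Z5 for additive automorphisms:

```python
from itertools import permutations

# Sketch (mine): enumerate Aut(Z5, +) by brute force over all
# permutations p of {0,...,4}, keeping those with p(a+b) = p(a)+p(b).
n = 5
auts = [p for p in permutations(range(n))
        if all(p[(a + b) % n] == (p[a] + p[b]) % n
               for a in range(n) for b in range(n))]
print(len(auts))  # 4: multiplication by 1, 2, 3, or 4
```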
The importance of automorphisms
Automorphism groups will be very important to us in our study.
We will not be too concerned with automorphism groups of groups. But
we will be very concerned with other automorphism groups, and automorphism
groups of groups is a good place to start. Automorphisms of rings and fields
will also be defined. We will spend most of our time looking at automorphisms
of fields.
Automorphisms of an object such as a group, ring or field can be thought of
as symmetries of that object. The full automorphism group can be thought of
as the full group of symmetries of the object.
One of the themes of mathematics is that symmetries of an object reveal
properties of the object. While that makes an attractive sentence, it takes a lot
of work to back it up with examples.
In order to derive information from symmetries, you have to know what the
symmetries are. In order to know what the symmetries of an object (such as a
group, ring or field) are, you have to know some structure of the object. Thus we
do not start with symmetries, but instead start with a preliminary study of the
structure of the object. That preliminary information is used to say something
about the symmetries, and then finally new information about the object can
be extracted from the symmetries. This summarizes Galois theory in a very
general way.
At this point we leave groups and take up similar considerations for both
rings and fields. There will be similarities that we will exploit. This will usually
take the form of having you do proofs that are similar to proofs given above.
We will also point out certain differences.
Exercises (28)
1. Prove Lemma 3.2.6. The hint is to absorb the idea of the proof of the
previous two corollaries.
2. Give an inductive proof of Lemma 3.2.7.
3. Prove Lemma 3.2.8. Why is it necessary to assume that S is not empty
in the hypothesis?
4. Prove Lemma 3.2.9 without using Lemma 3.2.8 and compare the length
of your proof to the proof given.
5. We have mentioned that the even numbers E form a subgroup of (Z, +).
Prove that T , the set of multiples of 3, form a subgroup of (Z, +). What
is E ∩ T ?
6. Consider the group (Z, +). Let f : Z → Z be the doubling map. That
is f (n) = 2n. Show that f is a homomorphism. Show that g : Z → Z
defined by g(n) = n + 1 is not a homomorphism.
7. Prove Lemma 3.2.12.
8. If h : G → H is a homomorphism between groups, and S is a subgroup of
G, then prove that h(S) is a subgroup of H.
9. Prove Lemma 3.2.15.
10. Consider the group (Z, +), k ≠ 0 in Z and the group Zk . Recall the
homomorphism π : Z → Zk . What is the kernel of π?
11. (optional) State and prove lemmas that serve as parallels to Lemmas 3.2.9
and 3.2.10 for normal subgroups. The main purpose of this exercise is
to review the proofs of Lemmas 3.2.9 and 3.2.10 and to exhibit their
flexibility.
12. This produces a subgroup that is not normal. Consider the elements σ
and τ of S4 as given in the first problem of Exercise Set (19). Show
that A = {1, τ } forms a two element subgroup of S4 . Argue that your
calculation of στ σ −1 shows that A is not a normal subgroup of S4 . This
assumes that your calculation of στ σ −1 is correct.
13. Prove Lemma 3.2.16.
14. Prove Lemma 3.2.18.
15. There are four elements in Aut(Z5 ) where Z5 is shorthand for the group
(Z5 , +). Find them and write them out as permutations. Write out the
multiplication table of Aut(Z5 ). Find an isomorphism from Aut(Z5 ) to
the group Z4 .
16. There are four elements in Aut(Z8 ) where Z8 is shorthand for the group
(Z8 , +). Find them and write them out as permutations. Write out the
multiplication table of Aut(Z8 ). In spite of the fact that Aut(Z8 ) and the
group Z4 both have four elements, prove that Aut(Z8 ) and Z4 cannot be
isomorphic.
17. Axiomatic tinkering. There are other proofs than the one we give for the
uniqueness of the identity. Our proof of Corollary 3.2.2 ultimately depends
on the existence of inverses, and we will later need a proof that does not
depend on inverses. See if you can find such a proof. It also turns out
that less than the full identity axiom is needed. Instead of assuming there
is a “two sided identity,” namely an element 1 so that 1a = a1 = a for all
a, one could assume that there is a “right identity” r so that ar = a for
all a, or one could assume that there is a “left identity” l so that la = a
for all a. Prove that if there is both a right identity and a left identity,
then they are equal and are therefore a two sided identity. Conclude that
once there is at least one left identity and at least one right identity, then
there is a two sided identity and all identities are the same.
3.3
Rings
If you take an abelian group (with operation + and identity 0) and add a new
operation called multiplication, then with two extra laws you get a ring. But
nothing else is demanded. There need not be a multiplicative identity, and there
need not be inverses.
We will say much less about rings than we did about groups. This bias will
continue throughout these notes. In fact the study of rings forms a very large
subject, but you will have to take other courses to learn more about them.
3.3.1
The definition
We repeat the definition from Section 2.5.1.
A ring is a triple (R, +, ·) where + and · are binary operations (functions
from R × R to R). The operation + is usually referred to as the addition and
the operation · is usually referred to as the multiplication. We will omit the
notation for the multiplication and write the product of a and b as ab. The
following additional requirements must be met.
1. The pair (R, +) forms an abelian group.
2. For all a, b and c in R, we have a(bc) = (ab)c.
3. For all a, b and c in R, we have a(b + c) = ab + ac and (a + b)c = ac + bc.
The additive identity is usually written as 0, and the additive inverse of a ∈ R
is usually written as −a. Elements of R are not forbidden to have multiplicative
inverses. However before there can be multiplicative inverses, there must be a
multiplicative identity. A multiplicative identity is not required to exist. If no
multiplicative identity exists, then there are no multiplicative inverses. For a
positive integer n, the strictly upper triangular n × n matrices form a ring with
no multiplicative identity.
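For n = 2 this claim can be checked by hand or machine: the product of any two strictly upper triangular 2 × 2 matrices is the zero matrix, so nothing can act as a multiplicative identity. A sketch (mine):

```python
# Sketch (mine): 2x2 strictly upper triangular integer matrices,
# stored as nested tuples; any product of two of them is zero.
def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

A = ((0, 3), (0, 0))
B = ((0, 7), (0, 0))
print(mul(A, B))  # ((0, 0), (0, 0)): so no matrix can act as an identity
```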
As mentioned before, a ring with unit, or ring with 1, or ring with identity is
a ring with a multiplicative identity. That is, there is an element 1 in the ring
so that for all a in the ring, we have 1a = a1 = a.
A commutative ring or abelian ring is a ring where the multiplication is
commutative. That is ab = ba for all a and b in the ring.
3.3.2
First results
To make sure we understand what we are allowed to do, the proof of the next
lemma will be written out carefully.
Lemma 3.3.1 In a ring R there is only one element that can act as the additive
identity.
Proof. This follows from Corollary 3.2.2 since (R, +) is a group.
As you can see the proof is short. This is because of what was proven before.
However, (R, ·) is not a group, so the next lemma needs a proof.
Lemma 3.3.2 In a ring with identity, there is only one element that can act
as the identity.
The proof is left as an exercise.
We gather several facts together in the next lemma.
Lemma 3.3.3 Assume that R is a ring with 0 as the additive identity.
1. For every a ∈ R there is only one element that can act as the additive
inverse for a.
2. −0 = 0.
3. For every a ∈ R, we have −(−a) = a.
The proof is left as an exercise.
The next lemmas use all parts of the structure of a ring. They have no
counterparts in the theory of groups.
Lemma 3.3.4 If R is a ring with 0 as the additive identity, then for every
a ∈ R we have 0a = a0 = 0.
The proof is left as an exercise.
Lemma 3.3.5 If R is a ring, then for every a and b in R, we have (−a)b =
a(−b) = −(ab) and (−a)(−b) = ab.
The proof is left as an exercise.
3.3.3
Subrings
In keeping with our brief treatment of rings, we will have little to say about
subrings other than their definition.
If R is a ring and S is a subset of R, then S is a subring of R if the two
operations on R make S a ring when the operations are restricted to S. In
particular, if 0 is the additive identity for R, then 0 must be in S, and if a and
b are in S, then all of a + b, −a and ab must be in S.
The following are examples of subrings.
1. The even integers in the ring Z.
2. Upper triangular n × n matrices in the ring of all n × n matrices, as well
as the strictly upper triangular n × n matrices in the ring of all matrices.
3. Polynomials P (x) with P (1) = 0 in the ring of all polynomials with real
coefficients.
The verifications that these are subrings are left as exercises.
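As a numerical spot-check of example 3 (a sketch of mine, not a proof), note that evaluation at x = 1 turns sums and products of polynomials into sums and products of their values, so polynomials vanishing at 1 stay closed under both operations:

```python
# Sketch (mine, a spot-check rather than a proof): polynomials as
# coefficient lists; evaluation at x = 1 is just the coefficient sum.
def padd(p, q):
    m = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(m)]

def pmul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

ev1 = lambda p: sum(p)

p, q = [-1, 1], [-2, 1, 1]        # x - 1 and x^2 + x - 2, both vanish at 1
print(ev1(padd(p, q)), ev1(pmul(p, q)))  # 0 0
```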
There are also parallels to Lemmas 3.2.9 and 3.2.10 for subrings. These are
left as optional exercises. Once the parallel lemmas are in place, it is possible
to define what is meant by a subring generated by a certain set.
A word of caution is needed with terminology. If all rings in a discussion are
rings with identity, then it is usually assumed that a subring will also have an
identity. If this is the case, then the even integers would not form a subring of
Z, nor would the strictly upper triangular n × n matrices form a subring of the
ring of all n × n matrices. However, the upper triangular n × n matrices would.
3.3.4
Homomorphisms
Let f : R → S be a function between rings. If we assume that f (a + b) =
f (a) + f (b) for all a and b in R, then we know that f (0) = 0 and f (−a) = −f (a)
for all a ∈ R from the fact that the additive structure on R and S form groups.
The only new part of the ring structure is the multiplication. So we make the
following definition.
A function h : R → S between rings is a homomorphism of rings, if for all a
and b in R, we have f (a + b) = f (a) + f (b) and f (ab) = f (a)f (b).
As with groups, we have the following.
Lemma 3.3.6 If f : R → S and h : S → T are ring homomorphisms, then
hf : R → T is a ring homomorphism.
The proof is left as an exercise.
3.3.5
Subrings associated to homomorphisms
Images
Images of ring homomorphisms behave like images of groups.
Lemma 3.3.7 Let h : R → S be a ring homomorphism. Then the image h(R)
is a subring of S.
The proof is left as an exercise.
Kernels
The kernel of a ring homomorphism h : R → S is defined as
Ker(h) = {a ∈ R | h(a) = 0S }.
In the group setting, the kernel turns out to be a special subgroup. This
follows from the special behavior of the identity element in the group. In the
ring setting, the kernel also turns out to be a special subring. This follows from
the special behavior of the additive identity of the ring. Note that normality
of the kernel when looking only at the group structure is not a surprise in
the ring setting because the additive group of a ring is abelian and because
Lemma 3.2.16 says that all subgroups of abelian groups are normal. Kernels of
ring homomorphisms have a property that goes even beyond normality. The
extra property follows from the extra property of the additive identity given by
Lemma 3.3.4.
The statement that the kernel is a special subring implies that it is a subring
in the first place. This has problems if all rings in a discussion (including
subrings) are assumed to have a multiplicative identity. In a problem below, we
will point out that kernels rarely have a multiplicative identity.
Lemma 3.3.8 Let h : R → S be a ring homomorphism. Then for every a ∈ R
and k ∈ Ker(h), both ak and ka are in Ker(h).
The proof is left as an exercise.
The result of Lemma 3.3.8 is turned into a definition. If K is a subring of a
ring R, we say that K is a (two-sided) ideal of R if for every a ∈ R and k ∈ K,
we have that both ak and ka are in K. We say that K absorbs all elements of
R on both sides by multiplication.
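A finite illustration (mine): the kernel {0, 4, 8} of the reduction ring homomorphism from Z12 to Z4 absorbs multiplication by every element of Z12, as Lemma 3.3.8 predicts:

```python
# Sketch (mine): the kernel {0, 4, 8} of the reduction h : Z12 -> Z4
# absorbs multiplication by every element of Z12 (Lemma 3.3.8 in miniature).
K = {0, 4, 8}
print(all((a * k) % 12 in K for a in range(12) for k in K))  # True
```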
Finding a subring of a ring that is not an ideal is left as an exercise.
There are also parallels to Lemmas 3.2.9 and 3.2.10 for ideals as well as the
notion of an ideal generated by a certain set. This is left as an optional exercise.
3.3.6
Isomorphisms and automorphisms
A function f : R → S between rings is a ring isomorphism if it is a ring
homomorphism and a one-to-one correspondence. As with groups, we get the
following about ring isomorphisms.
Lemma 3.3.9 Let h : R → S be a ring isomorphism. Then h−1 : S → R is
also a ring isomorphism.
The proof is left as an exercise.
An automorphism of a ring R is a ring isomorphism from R to itself. The
set of all automorphisms of a ring R is denoted by Aut(R). As with groups,
we have the following.
Lemma 3.3.10 If R is a ring, then Aut(R) is a group with composition as the
operation.
The proof is left as an exercise.
It is harder to find automorphisms of rings that are not the identity than it
is to find automorphisms of a group that are not the identity. For the group
(Z5 , +), we found four elements in Aut(Z5 ). However (Z5 , +, ·) is also a ring.
We leave it as an exercise to show that there is only one automorphism of
this ring. Essentially, with the extra structure of the multiplication, the ring
(Z5 , +, ·) has fewer symmetries than the group (Z5 , +).
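A brute-force confirmation (numerical only, a sketch of mine; the exercise below still asks for a proof) of how the multiplicative condition kills all but the identity:

```python
from itertools import permutations

# Sketch (numerical confirmation only; the exercise still asks for a proof):
# automorphisms of the ring (Z5, +, .) must respect both operations.
n = 5
ring_auts = [p for p in permutations(range(n))
             if all(p[(a + b) % n] == (p[a] + p[b]) % n and
                    p[(a * b) % n] == (p[a] * p[b]) % n
                    for a in range(n) for b in range(n))]
print(len(ring_auts))  # 1: only the identity survives
```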
In keeping with our brief treatment of rings, we will not supply a non-identity
automorphism of a ring. All fields are rings, and we will exhibit a field with a
non-identity automorphism. This will supply the missing example.
Exercises (29)
1. Prove Lemma 3.3.2. Since multiplicative inverses do not exist, a simple
quote of Corollary 3.2.2 is not allowed. If you have done a previous problem,
you have already done this. If not, a hint is to assume that there are two
multiplicative identities p and q.
2. Prove Lemma 3.3.3. Do not forget previously proven facts.
3. Prove Lemma 3.3.4. This needs some care. The element 0 is an additive
identity and thus part of the “additive part” of the ring. The expression 0a
involves multiplication. From the definition of a ring to this lemma, there
is only one fact that combines addition and multiplication. It is therefore
impossible to prove the conclusion without using this fact. Secondly, there
are very few facts from the definition to this point that have as a conclusion
that something equals 0. It is therefore impossible to prove the conclusion
without using a second fact. Part of this exercise is to hunt down two
facts as just described that combine to give a proof of the conclusion.
4. Prove Lemma 3.3.5. Comments similar to those about the proof of Lemma
3.3.4 apply. However here the focus is not on the conclusion that something is zero, but on the conclusion that something is an additive inverse.
5. Prove that the examples in Section 3.3.3 are in fact subrings of the given
rings.
6. (optional) State and prove parallels to Lemmas 3.2.9 and 3.2.10 for subrings. Define what is meant by a subring generated by a certain set.
7. Prove Lemma 3.3.6.
8. Prove Lemma 3.3.7. Note that multiplicative inverses are not present and
not relevant.
9. Prove Lemma 3.3.8. The comments in the paragraphs before the statement
of the lemma form a hint, but the hint should not be necessary.
10. Find an example of a subring that is not an ideal.
11. (optional) State and prove parallels to Lemmas 3.2.9 and 3.2.10 for ideals.
Define what is meant by an ideal generated by a certain set.
12. Show that if R is a ring with 1 and I is an ideal in R with 1 ∈ I, then
I = R. This says that most ideals cannot be regarded as subrings if it is
assumed that all subrings contain the multiplicative identity.
13. Prove Lemma 3.3.9.
14. Prove Lemma 3.3.10. The only trick here is to remember all that needs
to be checked.
15. Let R = (Z5 , +, ·) be the ring of integers modulo 5. Show that the only
automorphism of R is the identity.
3.4
Fields
Fields will occupy us significantly more than rings.
3.4.1
The definition
As covered in Section 2.6.1, a field is a commutative ring with 1 so that all
non-zero elements have multiplicative inverses. As mentioned in Section 2.6.1,
we will also assume 1 ≠ 0. A full set of laws is written out in Section 2.6.1.
3.4.2
First results
The following are mostly based on results from groups and rings. The only new
item is the last.
Lemma 3.4.1 If F is a field, then all of the following hold.
1. There is only one element in F that acts as the additive identity.
2. There is only one element in F that acts as the multiplicative identity.
3. For each x ∈ F , there is only one element that acts as the additive inverse
of x.
4. For each x ∈ F with x ≠ 0, there is only one element that acts as the
multiplicative inverse of x.
5. −0 = 0 and 1−1 = 1.
6. For every x ∈ F , we have 0x = 0.
7. For every x and y in F , we have (−x)y = x(−y) = −(xy) and (−x)(−y) =
xy.
8. For every x ∈ F with x ≠ 0, we have x−1 ≠ 0.
The proof is left as an exercise.
The last item in Lemma 3.4.1 allows us to describe a field in another way.
With F ∗ = F − {0}, a triple (F, +, ·) is a field if (F, +) is an abelian group
with identity 0, if (F ∗ , ·) is an abelian group with identity 1 so that 1 ≠ 0, and
if the distributive law holds. Thus a field is two abelian groups with different
identities connected by a distributive law.
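For a concrete instance (a sketch of mine), take Z5 under addition and multiplication mod 5: every nonzero element has a multiplicative inverse, so the nonzero elements do form a group:

```python
# Sketch (mine): in Z5, pair every nonzero element with its
# multiplicative inverse; this table is why (Z5 - {0}, .) is a group.
n = 5
invs = {a: b for a in range(1, n) for b in range(1, n) if (a * b) % n == 1}
print(invs)  # {1: 1, 2: 3, 3: 2, 4: 4}
```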
3.4.3
Field extensions
Subfields
The definition of subfield is no surprise. A subset F of a field E is a subfield of
E if the addition and multiplication operations of E restricted to F make F a
field.
Extensions
There is, however, a difference in attitude. Instead of focusing on the fact that
in the definition above, F is a smaller part of E, the focus is usually on the fact
that E is gotten by making F larger. Thus if F is a subfield of E, we will also
(and more frequently) say that E is an extension of F or an extension field of
F.
Thus C is an extension of R. Later, we will make sense of the statement
that “C is an extension of R by the addition of the element i.” We can also
say of the example Q[√2] of Section 2.6.2 that it is an extension of Q by the
addition of √2.
The point of view of extension rather than subfield is supported by the
construction of roots of a polynomial. The field Q contains the coefficients of
x2 + x + 1. However, to build the roots of x2 + x + 1, one must pass to a larger
field that also contains √−3. That is, one must build an extension of Q by
adding √−3 in order to get a field that accommodates the roots.
The cubic x3 − 9x + 8 considered in (1.22) also has its coefficients in Q. But
to accommodate the roots, two extensions must be formed. First an extension of
Q must be formed by the addition of √−11. Then an extension of this field
must be formed by the addition of all cube roots of −4 + √−11. (We do not
have to add the cube roots of −4 − √−11. If u represents a typical cube root
of −4 + √−11, v represents a typical cube root of −4 − √−11, and q is the
coefficient −9 of x, then we know that uv = −q/3 = 3. Thus if a field contains
u and 3, it must contain 3/u = v, and so if it contains all the cube roots of
−4 + √−11 it will also contain all the cube roots of −4 − √−11.)
In order to make sense of, and use, these considerations, we will have to
study the structure of these extensions. This will be done in due time. For now
we make the following observation.
Dimension and degree
Lemma 3.4.2 If E is an extension field of F , then E can be regarded as a
vector space with F forming the field of scalars.
Proof. The axioms of a vector space are listed in Section 1.4.2, where z, y and
x were taken to be in C, but here can be taken from E, and r and s were taken
to be from R, but here can be taken from F . The eight requirements listed in
Section 1.4.2 are simply special cases of the requirements of a field. They hold
since F ⊆ E and they all hold in E.
It was observed in Section 1.4.2 that C forms a vector space of dimension
2 over R and that {1, i} forms a basis. Since every element of Q[√2] can be
written uniquely as a + b√2 with a and b in Q, we can also say that Q[√2]
forms a vector space of dimension 2 over Q and that {1, √2} forms a basis.
In general if E is an extension field of F , then the dimension of E as a vector
space over the field of scalars F is called the degree of E over F and is denoted
[E : F ]. This value will later be seen to be a very important measure of the
extension.
For those who have done Problem 5 in Exercise Set (24), it should be clear
that [Q[∛2] : Q] = 3 and that {1, ∛2, ∛4} forms a basis.
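The closure of Q[√2] under inversion can be checked with exact rational arithmetic. In this sketch (mine; pairs (a, b) stand for a + b√2, and the function name is an invention) the familiar rationalization formula shows the inverse has the same form, which is why the 2-dimensional vector space is in fact a field:

```python
from fractions import Fraction as F

# Sketch (mine): elements of Q[sqrt(2)] stored as pairs (a, b) = a + b*sqrt(2).
# Rationalizing, 1/(a + b*sqrt(2)) = (a - b*sqrt(2)) / (a^2 - 2*b^2),
# which is again of the form c + d*sqrt(2) with c, d rational.
def field_inverse(a, b):
    d = a * a - 2 * b * b     # nonzero for (a, b) != (0, 0): sqrt(2) is irrational
    return (a / d, -b / d)

print(field_inverse(F(1), F(1)))  # the pair (-1, 1): 1/(1 + sqrt(2)) = -1 + sqrt(2)
```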
Generating extensions
The parallels to Lemmas 3.2.9 and 3.2.10 for subrings and ideals were treated
in optional exercises since they are not as important to our particular goals.
However we will use heavily the parallels for subfields and extensions. We give
them here in the form we will need.
Lemma 3.4.3 Let E be a field and let C be a collection of subfields of E. Then
the intersection of all the subfields in C is a subfield of E.
The proof is left as an exercise.
The next lemma is worded slightly differently from Lemma 3.2.10. Rather
than building a subfield from scratch, the lemma below builds an extension of
a smaller subfield.
Lemma 3.4.4 Let F ⊆ E be an extension of fields and let S be a subset of E.
Then in the collection of subfields of E that contain both F and S, there is a
smallest subfield.
The proof is left as an exercise.
With F , E and S as in the statement of Lemma 3.4.4, let K be the smallest
subfield of E containing F and S as guaranteed by the conclusion. We can refer
to K as the extension of F by S in E, but it is more usually referred to as the
extension of F obtained by adjoining S in E. Of course, any elements of S that
are already in F will have no effect on the outcome. A great deal of the time S
will have only one element.
The notation for the extension of F obtained by adjoining a set of elements
S is F (S), and if S is a finite set {a1 , a2 , . . . , an }, then the extension is denoted
F (a1 , a2 , . . . , an ).
Note that we can add elements one at a time or all at the same time. This
raises the question as to the importance of order and grouping. It makes no
difference and we leave the proof of the next lemma as an exercise. Note that
the expression on the left represents adding the elements one at a time, and the
expression on the right represents adding them all at once.
Lemma 3.4.5 Let F ⊆ E be an extension of fields, and let {a1 , a2 , . . . , an } be
a subset of E. Then
F (a1 )(a2 ) · · · (an ) = F (a1 , a2 , . . . , an ).
Much of the work in studying roots of polynomials will involve studying the
structure of extensions of the type shown in Lemma 3.4.5.
3.4.4
Homomorphisms and isomorphisms
Homomorphisms of fields are more restricted than homomorphisms of groups
and rings. The restrictions come from the large amount of structure carried by
a field. These restrictions make sensible the grouping of homomorphisms and
isomorphisms in a single topic.
A function f : F → K between fields is a homomorphism of fields if for
every a and b in F , we have both f (a + b) = f (a) + f (b) and f (ab) = f (a)f (b).
Thus the definition has the same appearance as the definition of a ring homomorphism. A field homomorphism that is a one-to-one correspondence is called
an isomorphism, and fields that have an isomorphism between them are said to
be isomorphic.
We will not give a discussion of image and kernel that parallels the discussion for groups and rings. The next lemma is the reason.
Lemma 3.4.6 Let h : F → K be a homomorphism of fields. Then either
h(F ) = {0} or h is one-to-one.
Proof. We will assume h is not one-to-one. So there are a and b in F with a ≠ b
and h(a) = h(b). This means that a − b ≠ 0 and h(a − b) = h(a) − h(b) = 0.
Thus we have found a non-zero element c = a − b in the “kernel” of h. Since
c ≠ 0 there is a multiplicative inverse c^{-1} for c. We now use the expression cc^{-1}
for 1 with great effect.
Let x ∈ F . Then x = 1x = cc^{-1}x. So
h(x) = h(cc^{-1}x) = h(c)h(c^{-1}x) = 0 · h(c^{-1}x) = 0,
and the proof is complete.
Lemma 3.4.6 explains the wording of the next lemma.
Lemma 3.4.7 Let h : F → K be a field homomorphism. If h(F ) ≠ {0}, then
h(F ) is a subfield of K and h is an isomorphism from F to the subfield h(F ) of
K.
This is left as an exercise.
Some comments are in order.
Since we require 0 ≠ 1 in a field, the image of a field homomorphism is not
a field if the image is just {0}. Having the image equal to {0} is not a situation
that will be of great interest to us.
The “kernel” of a field homomorphism h : F → K can then be of only
two types: {0} or all of F . The situation where the “kernel” is all of F is
not interesting, and when the “kernel” is {0}, the “kernel” contains little data.
Thus the “kernel” of a field homomorphism will not be discussed.
Thus the only field homomorphisms h : F → K of interest to us will have h
a one-to-one function. We call such homomorphisms embeddings, or sometimes
homomorphic embeddings for emphasis. As
stated in Lemma 3.4.7, an embedding of fields is an isomorphism onto the image
of the embedding.
We have the following important parallel to similar statements about group
and ring homomorphisms.
Lemma 3.4.8 If f : F → K and h : K → L are field homomorphisms, then
hf : F → L is a field homomorphism.
The proof is left as an exercise.
3.4.5 Automorphisms
Automorphisms of fields will be of special importance to us. As mentioned in
Section 3.2.7, automorphisms are symmetries of an object. One of the major
ideas of Galois theory is that information about a field can be extracted from a
knowledge of its symmetries.
As well as an interest in the structure of single fields, we are also interested
in the structure of field extensions. Thus we will define automorphisms of a
field extension. When we do, we will connect the definition to some of the
observations and comments made in Section 1.5.3.
An automorphism of a field is an isomorphism from the field to itself. For a
field F , we will use Aut(F ) to denote the set of all automorphisms of F . For a
field extension F ⊆ E, we will define
Aut(E/F ) = {φ ∈ Aut(E) | ∀x ∈ F, φ(x) = x}.    (3.1)
In words, Aut(E/F ) is the set of automorphisms of E that act as the identity
on all elements of F . We say that F is fixed or that all the elements of F are
fixed by all the elements in Aut(E/F ).
Before we discuss this further, we make note of the following.
Lemma 3.4.9 If F ⊆ E is an extension of fields, then Aut(E) is a group and
Aut(E/F ) is a subgroup of Aut(E).
The proof is left as an exercise.
Given the truth of Lemma 3.4.9, we can refer to Aut(E) as the automorphism
group of E. The group Aut(E/F ) is referred to as the automorphism group of
E over F . We think of Aut(E/F ) as the symmetries of the extension F ⊆ E.
Examples
We have seen in Section 2.9.1 that complex conjugation is a field homomorphism
from C to C. Conjugating a number twice returns the original number, so complex
conjugation is its own inverse. Therefore it is one-to-one and onto and thus an
automorphism of C and an element of Aut(C). Since complex conjugation is not
the identity function, it is a non-identity element of Aut(C). Another fact about
complex conjugation is that a number equals its own conjugate if and only if it is
real. Thus complex conjugation is a non-identity element of Aut(C/R). It is also
a non-identity element of Aut(C/Q).
Another field we have looked at is Q[√2]. In Exercise Set (27) we saw that
f (r + s√2) = r − s√2 is a homomorphism from Q[√2] to itself. It is clearly not
the identity and it is also its own inverse. Thus it is one-to-one and onto and
a non-identity element of Aut(Q[√2]). The elements left fixed by f are of the
form r + 0√2. Since r runs over all the elements of Q, we see that Q is fixed
by f . Thus f is a non-identity element in Aut(Q[√2]/Q).
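Although these notes do no computation by machine, the homomorphism properties of f can be checked with a short Python sketch. The representation of r + s√2 as a pair of rationals and the helper names add, mul and f below are ours, not part of the text.

```python
from fractions import Fraction as F

# Represent r + s*sqrt(2), with r and s rational, as the pair (r, s).
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (r1 + s1*sqrt(2))(r2 + s2*sqrt(2)) = (r1 r2 + 2 s1 s2) + (r1 s2 + s1 r2)*sqrt(2)
    (r1, s1), (r2, s2) = x, y
    return (r1 * r2 + 2 * s1 * s2, r1 * s2 + s1 * r2)

def f(x):
    # f(r + s*sqrt(2)) = r - s*sqrt(2)
    return (x[0], -x[1])

x, y = (F(1), F(2)), (F(3, 2), F(-1))
assert f(add(x, y)) == add(f(x), f(y))   # f preserves sums
assert f(mul(x, y)) == mul(f(x), f(y))   # f preserves products
assert f(f(x)) == x                      # f is its own inverse
```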
Let
α = cos(2π/5) + i sin(2π/5)
be the fifth root of 1 in the first quadrant of the complex plane. Note that the
other fifth roots of 1 are α^2, α^3, α^4 and α^5 = 1. We claim that Q(α) consists
of all
r + sα + tα^2 + uα^3 + vα^4    (3.2)
where all of r, s, t, u and v are in Q. We will do some of the work to verify this
claim in exercises, but we will not do all of it. The full claim will be verified
very much later.
An easy exercise that you will be asked to do is show that sums, products
and negatives of the numbers given in (3.2) are other numbers given in (3.2).
A much harder exercise that you will not be asked to do (involving solving a
system of five linear equations with five unknowns) is to show that multiplicative
inverses of the non-zero numbers in (3.2) are also numbers in (3.2). Thus the
numbers in (3.2) form a field.
But the numbers in (3.2) are sums and products of numbers in Q(α). So the
set of all the numbers in (3.2) is a subset of Q(α) and thus a subfield of Q(α).
But Q(α) is the smallest field in (say) C that contains Q and α. So Q(α) is
exactly the set of numbers in (3.2).
It turns out that for each j in {1, 2, 3, 4}, sending α to α^j leads to an element
of Aut(Q(α)/Q). It also turns out that these are all the elements in
Aut(Q(α)/Q). Proving these statements (including the fact that the multiplicative
inverses of non-zero numbers in (3.2) are also in (3.2)) will be very
much easier when more is known about the structure of fields and field extensions.
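A quick numeric sanity check of the basic facts about α is possible; the sketch below is our own illustration and not part of the development.

```python
import cmath

# alpha = cos(2*pi/5) + i sin(2*pi/5), the fifth root of 1 in the first quadrant.
alpha = cmath.exp(2j * cmath.pi / 5)

# alpha is a fifth root of 1:
assert abs(alpha ** 5 - 1) < 1e-12

# The five fifth roots of 1 sum to 0, so alpha^4 = -(1 + alpha + alpha^2 + alpha^3);
# a hint that higher powers of alpha add nothing new to the form (3.2).
assert abs(sum(alpha ** j for j in range(5))) < 1e-12
```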
Exercises (30)
1. Prove Lemma 3.4.1.
2. Prove Lemma 3.4.3.
3. Prove Lemma 3.4.4.
4. Prove Lemma 3.4.5. Hint: prove a slightly different lemma by induction.
5. Prove Lemma 3.4.7. The only part needing proof is the fact that h(F ) is
a subfield of K.
6. Prove Lemma 3.4.8.
7. Prove Lemma 3.4.9.
8. Check that sums, products and negatives of numbers in (3.2) are also
numbers in (3.2). Check that sending α to α^2 creates a non-identity
automorphism in Aut(Q(α)/Q). If this has not been enough work, you
can also verify that sending α to α^j for j = 2 and j = 3 each creates an
automorphism in Aut(Q(α)/Q). Even if you do none of these verifications,
show that the set of isomorphisms that one gets by sending α to α^j for
j ∈ {1, 2, 3, 4} is isomorphic to Z4 . (Compare this to Problem 15 in
Exercise Set (28). There, with Z5 representing the group (Z5 , +), the
automorphism group Aut(Z5 ) was shown to be isomorphic to Z4 .)
3.5 On leaving Part I
We have most of the ingredients in place, and hints of the outline.
In studying roots of polynomials, fields will be discussed. In general, there
will be a field F that contains the coefficients of the polynomial, and a field E
that contains the roots of the polynomial.
We know easy formulas that give the coefficients of the polynomial in terms
of the roots. For example, if r1 , r2 , r3 are the roots of x3 + ax2 + bx + c, then
a = −(r1 + r2 + r3 ), b = r1 r2 + r2 r3 + r3 r1 , and c = −r1 r2 r3 . Thus a field
containing r1 , r2 and r3 must also contain a, b and c. In general, the field E
contains the field F and we have an extension.
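The coefficient formulas above can be checked quickly on a concrete example. In this sketch the roots 1, 2, 3 are our own arbitrary choice.

```python
# Check the formulas with the roots r1, r2, r3 = 1, 2, 3, for which
# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6.
r1, r2, r3 = 1, 2, 3
a = -(r1 + r2 + r3)              # coefficient of x^2
b = r1 * r2 + r2 * r3 + r3 * r1  # coefficient of x
c = -r1 * r2 * r3                # constant term
assert (a, b, c) == (-6, 11, -6)

# Each root satisfies x^3 + a x^2 + b x + c = 0:
for r in (r1, r2, r3):
    assert r ** 3 + a * r ** 2 + b * r + c == 0
```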
If there are formulas that give the roots in terms of the coefficients, then
there may be intermediate values involving the taking of n-th roots for various n.
These intermediate values will force the existence of fields that are intermediate
to F and E and that form smaller extensions of F than E. A study of how
these smaller extensions fit into the extension F ⊆ E will tell us what has to
happen for formulas that give the roots in terms of the coefficients to exist.
Galois theory extracts essential information from the automorphism groups
of the various extensions. Thus the groups most important to us in this larger
outline are automorphism groups.
We see that different types of objects play very different roles in our outline.
Fields contain the numbers we calculate with. Groups are groups of symmetries
(automorphisms) of the fields of numbers. Rings show up because the integers
form a ring, and polynomials with coefficients from various fields also form rings.
Groups, rings and fields are all algebraic objects. They all have various
numbers and types of operations, and laws that dictate how the operations
behave. But the differences in number of operations and the laws that they
follow lead to enough differences in behavior to assign them different roles in a
mathematical topic.
In the notes to come we will launch three investigations. The first into the
general theory of groups, the second into the somewhat less general theory of
polynomials (instead of the more general theory of rings), the third into the
general theory of fields. None of these investigations will be terribly deep, nor
terribly broad, but will be sufficient to give a good introduction to each theory
and will also be sufficient for the needs of the outline.
The last part will be the Galois theory that ties all of the above together
and answers the question about which polynomials have formulas that give the
roots of the polynomial in terms of the coefficients.
Part II
Group Theory
Introduction and organization
As previously mentioned, group theory is a collection of arguments, results and
techniques that help answer questions about groups. Therefore it is a collection
of subtopics that are held together by the fact that they are all about groups.
As a result, the theory can look rather disorganized.
We attempt to give some organizational structure to the topics that we will
cover. We will cover a tiny fraction of the topics in group theory, and within
each topic, we will cover a small part of that topic. Our selection will be guided
mostly by what we need from group theory to apply to the question of roots of
polynomials. We can use our reasons for our choice of topics to help organize
them into a few logically connected areas.
Actions
Groups arise in Galois theory as automorphism groups of fields and field extensions. Specifically a group will arise as some Aut(E/F ) for an extension
F ⊆ E of fields. Since automorphisms are homomorphisms that are one-to-one
correspondences, our groups are basically collections of certain kinds of permutations. Since permutations are thought of as moving things around, each
element of Aut(E/F ) causes movement. Mathematicians say that each element
of Aut(E/F ) “acts” on the field E. This brings us to the first organizational
concept: group actions.
This splits into two parts. The more elementary topic is that of groups
“acting” as a collection of permutations. Among other things, we will show
that any group, even one not defined as acting as a collection of permutations
of a set, is isomorphic to a group that is a collection of permutations of a set.
The slightly more sophisticated topic, that of a general group action, will
come next. Following that will be a discussion of two major topics: subgroups
and homomorphic images.
Subgroups and homomorphic images
Formulas for roots of polynomials bring new numbers into a discussion in a
certain order. First a square root might be brought in. After that, a cube
root might be brought in, and so forth. Assuming that each n-th root requires
moving to a larger field, we get a sequence of field extensions
F0 ⊆ F1 ⊆ · · · ⊆ Fn
giving the opportunity to look at the automorphism groups of many different
field extensions. Under certain assumptions, the resulting groups will be related
to each other by either of the two most important relations among different
groups: the relation of “is a subgroup of” and the relation “is a homomorphic
image of.”
We will study each of these relations. These relations are related since every
homomorphism has a subgroup that goes with it—the normal subgroup that is
the kernel of the homomorphism. Thus the two topics will interact a good deal.
Iterations
The sequence of field extensions associated to formulas for the roots of a polynomial also motivates the last topic we will consider in group theory. The chain
of extensions
F0 ⊆ F1 ⊆ · · · ⊆ Fn
mentioned above, gives rise to a corresponding chain of subgroups, and under
the right circumstances to a chain of homomorphic images. Thus the relations
of “is a subgroup of” and “is a homomorphic image of” will have to be studied in
connected chains. This takes the form of a topic in group theory, suggestively
called solvable groups, that will be of direct importance in Galois’ theory of
solvability of polynomials.
Simple groups
As might be guessed, solvable groups are associated to solvable polynomials. To
obtain the result that certain polynomials are not solvable, one must then have
examples of groups that are not solvable. The quickest way to such examples
is via groups that are called simple. In spite of their name, the proof of their
simplicity is often not that simple. Our last topic will be the demonstration
that the right kinds of simple groups exist.
Chapter 4
Group actions I: groups of
permutations
4.1 Consequences of Lemma 3.2.1
For us a permutation group is a subgroup of a symmetric group. That is, G is
a permutation group if for some X, we have G ⊆ SX . Recall that SX is the
symmetric group on the set X, the group of all permutations of X.
We start by proving Cayley’s theorem: that all groups are isomorphic to
permutation groups. This means that limiting a discussion to permutation
groups is not much of a limitation.
We remind the reader of Lemma 3.2.1.
Lemma (3.2.1) If a and b are two elements in a group G, then there exists a
unique element x ∈ G that satisfies ax = b.
This immediately leads to the following.
Lemma 4.1.1 Let G be a group and let a be in G. Define la : G → G by
la (x) = ax. Then la is a one-to-one correspondence from G to G.
Proof. To show onto, for an element b ∈ G, we need to find an x with la (x) =
ax = b. But this exists by Lemma 3.2.1.
To show one-to-one, we assume that la (x) = la (y) or ax = ay. That x = y
follows by either quoting the uniqueness part of Lemma 3.2.1 or duplicating the
proof by multiplying both sides on the left by a^{-1}.
We can put the result just proven in words by saying that for each a ∈ G,
the function “multiply on the left by a” is a permutation of G. So each a ∈ G
gives us a permutation of the elements of G. This is the first step in proving
Cayley’s theorem.
To have an isomorphism, one needs a function that is a homomorphism, is
one-to-one, and is onto. We have the function. Each a ∈ G gives a permutation
la of G. So we let our function be defined by f (a) = la . This leads to odd
notation such as f (a)(x) = la (x) = ax. It is important to get used to this
jumble of letters and parentheses. Note that f is a function from G to SG , the
symmetric group on G or the group of all permutations of G.
Arguing that f is one-to-one is easy if you keep track of what everything is
and use the right attack. If f (a) = f (b), then la = lb and we have an equality
between two permutations. But permutations are functions and the statement
la = lb means that la (x) = lb (x) for every possible x ∈ G. We could pick a
general x to work with, or we could pick a favorite one. It turns out that using
x = 1 is enough. Note that we are not proving something is true for every
x ∈ G, we are using the fact that something is true for every x ∈ G.
Now we have la (1) = lb (1) or a1 = b1 or a = b. So the function f is
one-to-one.
Showing that f is onto a subgroup of SG is easy if f is a homomorphism.
We know that the image of a homomorphism is a subgroup of the range and so
f will be onto the subgroup of SG that is the image of f . So we need to show
that f is a homomorphism.
To prove that f is a homomorphism, we need to show that for all a and b in
G, we have f (ab) = f (a)f (b). This means we have to look at lab and compare
it to la lb . These are permutations on G and as such functions from G to G. To
show that they are the same function we need to show that for all x ∈ G, we
have lab (x) = (la lb )(x). Since we are proving that something is true for all x,
we have to let x be an arbitrary element of G.
We have lab (x) = (ab)x. We have
(la lb )(x) = la (lb (x)) = la (bx) = a(bx).
Thus what we want follows from the associative law of groups.
We have proven the following.
Theorem 4.1.2 (Cayley) Every group G is isomorphic to a subgroup of SG .
Some comments are needed. The proof of Cayley’s theorem (which precedes
the statement) requires carefully keeping track of exactly what everything is.
Some items in the proof are elements of G, some are functions from G to G, and
lastly the function f is a function from G to the set of permutations on G. You
should get used to the fact that this multiplicity of types of objects will happen
often and that it is normal to need a lot of time and attention to keep them all
straight.
Since one of the objects in Cayley’s theorem is a function from a set (group,
actually) to a set of functions, the notation tends to build in complexity. Since
this is the first time a function to a set of functions was used, we used extra
symbols to slow things down. In checking that f (ab) = f (a)f (b), we replaced
f (ab) by the function lab that f (ab) was equal to. Similarly f (a)f (b) was replaced by la lb . To write down what needed to be checked, we wrote that we
needed lab (x) = (la lb )(x) to be true for all x ∈ G. We could have written that
we needed f (ab)(x) = (f (a)f (b))(x) to be true for all x ∈ G. Such notation will
be used in the future and you should start getting used to it. You should also
accept that it takes time to read a statement such as f (ab)(x) = (f (a)f (b))(x)
and dig through the definitions to extract its meaning.
Cayley’s theorem does not give efficient views of a group. The group S3
has six elements and it already is a group of permutations. (The number of
elements of Sn is discussed thoroughly in the next section. For now, accept that
the number of elements in Sn is n!.) Cayley’s theorem says that if G = S3 ,
then G is isomorphic to a subgroup of SG . The subgroup that is isomorphic to
G must have 6 elements, but SG itself has 6! or 720 elements. Showing G as a
group of permutations of 3 elements is much more efficient than demonstrating
it as a six element subgroup of a 720 element group.
Cayley never gave a full proof of Cayley’s theorem. He constructed the
function f , showed that each f (a) is a permutation of G and showed that f is
one-to-one. He may have shown that the image is a group, but he did not show
that f is a homomorphism. However, at the time, no one realized that this was
a necessary step.
The last comment about Cayley’s theorem is that it says that facts proved
about all groups of permutations apply to all groups in general. Therefore a
study of groups of permutations is a worthwhile study.
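The construction in the proof of Cayley's theorem can be carried out concretely. The following Python sketch does this for the cyclic group (Z4, +); the names l and compose are ours.

```python
# The left-regular representation of (Z4, +): f(a) = l_a where l_a(x) = a + x.
n = 4
Z = list(range(n))

def l(a):
    # the permutation "add a on the left", recorded as the tuple of images
    return tuple((a + x) % n for x in Z)

def compose(p, q):
    # (p q)(x) = p(q(x)), with q applied first, as in the text
    return tuple(p[q[x]] for x in Z)

# f is one-to-one: distinct elements give distinct permutations ...
assert len({l(a) for a in Z}) == n

# ... and f is a homomorphism: l_{a+b} = l_a l_b.
for a in Z:
    for b in Z:
        assert l((a + b) % n) == compose(l(a), l(b))
```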
Exercises (31)
1. Something goes wrong if you try to state and prove Cayley’s theorem with
the function f (a) = ra , where ra : G → G is defined by ra (x) = xa. A lemma
like Lemma 4.1.1 holds for ra , but the proof of Cayley’s theorem has a
problem. State and prove the lemma like 4.1.1 for ra , and find the problem
with the corresponding Cayley’s theorem. You can try to find a better
statement for the corresponding Cayley’s theorem now, or wait until later
when we come up with a fix.
4.2 Examples
Cayley’s theorem says that every group is isomorphic to a permutation group.
To emphasize the set being permuted, we will sometimes say that a subgroup
G of SX is a permutation group on X.
Obviously one example is G = SX for some X. We will give other examples.
For the finite symmetric groups Sn , the following is an important point to make.
Recall that the order of a group G, written |G|, is the number of elements of G.
Lemma 4.2.1 The order of Sn is n!.
The following is a reasonably rigorous argument. An element α of Sn is
determined by what it does to each element of {1, 2, . . . , n}. There are n choices
for α(1). Whatever is chosen for α(1) is not available for α(2), and there are
n−1 choices for α(2). Continuing, there is one less choice for each i as i increases
by 1, until there is only one choice left for α(n). The total number of choices is
n(n − 1)(n − 2) · · · (2)(1) = n!.
The proof above can be made much more rigorous if more rigor is demanded.
A more rigorous proof is left as an exercise.
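The count |Sn| = n! can also be confirmed by brute force for small n, as in this sketch using the Python standard library.

```python
from itertools import permutations
from math import factorial

# |S_n| = n!: count the permutations of an n-element set directly.
for n in range(1, 7):
    assert len(list(permutations(range(n)))) == factorial(n)
```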
4.2.1 Dihedral groups
These are important examples because they are not very complicated but are
complicated enough (they are not abelian) to illustrate various features of groups.
Let P be a regular polygon. That is, P has the lengths of all of its sides
equal, and the size of all of its angles equal. If the polygon has n sides, it has
n angles and is called the regular n-gon. Below we show the regular n-gons for
n = 3, 4, 5, 6.
[Figures: the regular n-gons for n = 3, 4, 5, 6.]
For a given n ≥ 3, the dihedral group D2n is the group of all permutations of
the vertices of the regular n-gon that preserve the structure of the n-gon itself.
That is, rotating and flipping the n-gon is allowed, but no twisting, crumpling
or stretching is allowed. The group D2n is often described as the group of
symmetries of the regular n-gon. The notation D2n will be explained shortly.
We illustrate D2n for n = 3 and n = 4.
In the figure below, the vertices are labeled 1, 2 and 3. This is not traditional
(A, B and C would be more traditional), but using numbers for the vertices lets
us compare D6 to S3 .
[Figure: equilateral triangle with vertices labeled 1, 2 and 3.]
A quick check of the six permutations of {1, 2, 3} reveals that all permutations
of the vertices preserve the structure of the triangle shown. Some of the permutations
involve flipping the triangle over. This is allowed. The permutation
1 2 3
1 3 2
is one that requires a flip.
In the figure below, the vertices are labeled 1, 2, 3 and 4.
[Figure (4.1): square with vertices labeled 1, 2, 3 and 4.]
There are 24 permutations of {1, 2, 3, 4}, but not all corresponding permutations
of the vertices preserve the structure of the square shown. There are four places
to take vertex 1. Once vertex 1 is in place, there are only two ways to place the
rest of the square. One way has the “original” side up and the other way
has the “hidden” side up and involves a “flip.” We illustrate the two possibilities
with vertex 1 carried to the lower left corner.
[Figures: the two placements of the square with vertex 1 at the lower left corner.]
We see that of the 24 possible permutations in S4 only 8 correspond to vertex
permutations that preserve the structure of the square.
Arguing as we did above about corners shows that for each n ≥ 3, |D2n | =
2n. This explains the notation. The group D2n is usually called the dihedral
group of order 2n. Some books use Dn for what we call D2n . There are enough
books that use each notation that it is hard to say which is more popular.
A word on vocabulary
It is tempting to say that D6 equals S3 . That makes sense if the vertices are
the numbers 1, 2, and 3. If you would rather say that the vertices are labeled
1, 2 and 3, then D6 is isomorphic to S3 rather than equal to S3 .
We are only permuting vertices. Other books talk about moving the entire n-gon. There is not much difference between moving the entire n-gon and moving
its vertices. Once you know where the vertices go, you know where the entire
n-gon has gone. The point is that if you view the motion as that of the entire
n-gon, then it makes no sense at all to say that D6 = S3 , and no book says that.
We define the D2n as permutations of the vertices so that we can refer to
D2n as a subgroup of a group of permutations of a finite set (the set of vertices).
Later, we can change our view and think of D2n as a group of motions of the
entire n-gon. Then D2n becomes a subgroup of a group of permutations of an
infinite set (the set of points in the n-gon). This other view will allow us to
raise questions that we cannot raise in our setting.
The elements and the multiplication
We can give a name to each element of D2n . The smallest is D6 , so we start
there. Using Cauchy notation for permutations, we have the following.
e =
1 2 3
1 2 3

α =
1 2 3
1 3 2

β =
1 2 3
2 1 3

γ =
1 2 3
2 3 1

δ =
1 2 3
3 1 2

ǫ =
1 2 3
3 2 1
It is somewhat of a tradition to use e for the identity if using 1 would be
confusing. We are already using 1 as a label of a vertex.
We will compute products such as βγ as if they are functions. That is, γ
will be applied first and β second. For this particular product, we get
βγ =
1 2 3   1 2 3
2 1 3   2 3 1
=
1 2 3
1 3 2
= α.
After doing 35 more such calculations, we get the following multiplication table
for D6 .
· e α β γ δ ǫ
e e α β γ δ ǫ
α α e δ ǫ β γ
β β γ e α ǫ δ
γ γ β ǫ δ e α
δ δ ǫ α e γ β
ǫ ǫ δ γ β α e
The table doesn’t show much, but it does show that two elements rarely
commute. For example, we have βγ = α, but γβ = ǫ.
Note that every row has every element of the group entered once and only
once. This is guaranteed by Lemma 4.1.1. Each column also has every element
of the group entered once and only once. This is guaranteed by a lemma like
Lemma 4.1.1 but for ra : G → G defined by ra (x) = xa.
There is not much that can be pulled from a multiplication table other than
the answers to products that can be figured out quickly anyway. Later, we will
illustrate an important point with a multiplication table. But other than that,
we will spend little time building them.
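Products such as βγ, and the whole table, can be checked by machine. In the sketch below each element of D6 is recorded as the tuple of images of the vertices 1, 2, 3, matching the Cauchy arrays above; the variable names are ours.

```python
# Each element of D6 as the tuple of images of the vertices 1, 2, 3.
e     = (1, 2, 3)
alpha = (1, 3, 2)
beta  = (2, 1, 3)
gamma = (2, 3, 1)
delta = (3, 1, 2)
eps   = (3, 2, 1)
D6 = [e, alpha, beta, gamma, delta, eps]

def mul(p, q):
    # (p q)(x) = p(q(x)): q is applied first, as in the text
    return tuple(p[q[i] - 1] for i in range(3))

assert mul(beta, gamma) == alpha   # beta * gamma = alpha
assert mul(gamma, beta) == eps     # gamma * beta = epsilon: they do not commute

# Each row (and column) of the table contains every element exactly once:
for p in D6:
    assert sorted(mul(p, q) for q in D6) == sorted(D6)
    assert sorted(mul(q, p) for q in D6) == sorted(D6)
```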
Orders of elements
We can use the table (or direct computation) to compute orders of elements. In
the paragraphs before Exercise Set (22), we defined the order of an element x
of a group to be the smallest number of copies of x that need to be multiplied
together to get the identity. In a non-abelian setting, it is more easily defined
as the least positive integer n so that x^n = 1. We write o(x) for the order of x.
For D6 , we get the following orders.
o(e) = 1,   o(α) = 2,   o(β) = 2,
o(γ) = 3,   o(δ) = 3,   o(ǫ) = 2.
If you did Problem 6 in Exercise Set (22), you found that the orders of the
elements of Z6 are 1, 6, 3, 2, 3, 6. This gives two reasons why Z6 and D6 cannot
be isomorphic. First, one is abelian and the other is not. Second, the list of
orders of the elements of the two groups is not the same.
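The two lists of orders can be recomputed mechanically; the helper name order and the tuple encoding of D6 below are ours.

```python
from math import gcd

E = (1, 2, 3)
D6 = [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]

def order(p):
    # least n >= 1 with p^n equal to the identity, composing p with itself
    q, n = p, 1
    while q != E:
        q = tuple(p[q[i] - 1] for i in range(3))
        n += 1
    return n

# Orders 1, 2, 2, 2, 3, 3, as listed for D6 above:
assert sorted(order(p) for p in D6) == [1, 2, 2, 2, 3, 3]

# In Z6 the order of x is 6/gcd(x, 6); the list of orders differs,
# so D6 and Z6 cannot be isomorphic.
assert sorted(6 // gcd(x, 6) for x in range(6)) == [1, 2, 3, 3, 6, 6]
```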
In the next section, we set up machinery to pick out subgroups of the dihedral
groups and other groups of permutations.
4.2.2 Stabilizers
These notions will see applications often.
Let G be a permutation group on X and let x be in X. Then Gx will denote
the subset of SX defined by
Gx = {h ∈ G | h(x) = x}.
In words, Gx is the set of permutations in G that leave x fixed. We call Gx the
stabilizer of x in G. It is also sometimes called the fixed group of x in G.
Lemma 4.2.2 If G is a permutation group on X and x ∈ X, then Gx is a
subgroup of G.
The proof is left as an exercise.
The easy proof of the following should be written out carefully.
Lemma 4.2.3 Let G = SX for a set X and let x ∈ X. Then there is an
isomorphism from Gx to SY where Y = X − {x}.
The proof is left as an exercise.
If G = SX , if H = Gx and if y ∈ X with y ≠ x, then Hy contains all
permutations in H that fix y. But a permutation is in H if and only if it fixes
x. So the permutations in Hy are exactly those that fix both x and y. This can
get cumbersome after a while, so we invent notation.
If G ⊆ SX and A ⊆ X, then GA is defined by
GA = {h ∈ G | ∀a ∈ A, h(a) = a}.
In words, GA is the set of permutations in G that leave every element in A
fixed. We call GA the pointwise stabilizer of A in G. The reason for the two
word name will be clear shortly. It is easily provable that GA is a subgroup of
G, but it is more important to point out the easiest reason why.
Lemma 4.2.4 Let G be a permutation group on X and let A ⊆ X. Then
GA = ∩_{x∈A} Gx .
The proof is left as an exercise.
It is immediate from Lemmas 4.2.4 and 3.2.9 that GA is a subgroup of G.
Lastly, if G is a permutation group on X and A ⊆ X, we look at
StG (A) = {h ∈ G | h(A) = A}.
We call this the stabilizer of A in G and it is the permutations in G that carry
A to A. Note carefully the difference between the stabilizer of A in G and the
pointwise stabilizer of A in G. Of course, we have the following lemma.
Lemma 4.2.5 If G is a permutation group on X and A ⊆ X, then StG (A) is
a subgroup of G.
The proof is left as an exercise.
Note that the group Gx can also be written StG ({x}) or more briefly as
StG (x).
We will refer to these constructions shortly. We turn now to more concrete
examples.
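The difference between the pointwise stabilizer and the stabilizer shows up already in S4. The sketch below computes both for A = {1, 2}; the function names are ours.

```python
from itertools import permutations

X = (1, 2, 3, 4)
G = [dict(zip(X, p)) for p in permutations(X)]   # G = S_X

def pointwise_stab(A):
    # G_A: permutations fixing every element of A
    return [h for h in G if all(h[a] == a for a in A)]

def setwise_stab(A):
    # St_G(A): permutations carrying the set A to A
    return [h for h in G if {h[a] for a in A} == set(A)]

A = {1, 2}
assert len(pointwise_stab(A)) == 2   # only the two permutations of {3, 4}
assert len(setwise_stab(A)) == 4     # these may also swap 1 and 2
# The pointwise stabilizer is contained in the stabilizer:
assert all(h in setwise_stab(A) for h in pointwise_stab(A))
```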
Dihedral groups revisited
Let us look at subgroups of D12 . Since D12 already has a subscript, which complicates expressions, let us set G = D12 so that we can refer to subgroups such
as StG (A) and GA for various A.
The group G is the group of symmetries of the following figure.
[Figure (4.2): regular hexagon with vertices labeled 1 through 6.]
Note that G = D12 has 12 elements. The full symmetric group S6 has 720
elements. This is one reason that the dihedral groups are more practical to work
with than the symmetric groups.
We will look at some stabilizers in G.
The reader should check that G1 = StG (1) consists exactly of the identity
and the element
1 2 3 4 5 6
1 6 5 4 3 2 .
The subgroup G2 consists exactly of the identity and the element
1 2 3 4 5 6
3 2 1 6 5 4 .
The exercises will discuss several other stabilizer subgroups.
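The reader's check of G1 and G2 can be confirmed by listing D12 explicitly as rotations and reflections of the hexagon. This encoding of the group is our own sketch, not part of the text.

```python
# D12 as permutations of the hexagon's vertices 1..6: six rotations and six
# reflections, each recorded as the tuple of images of 1, ..., 6.
rotations   = [tuple((i + k) % 6 + 1 for i in range(6)) for k in range(6)]
reflections = [tuple((m - i) % 6 + 1 for i in range(6)) for m in range(6)]
D12 = rotations + reflections
assert len(set(D12)) == 12

def stab(v):
    # G_v: the elements of D12 that fix vertex v
    return {g for g in D12 if g[v - 1] == v}

identity = (1, 2, 3, 4, 5, 6)
assert stab(1) == {identity, (1, 6, 5, 4, 3, 2)}
assert stab(2) == {identity, (3, 2, 1, 6, 5, 4)}
```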
Field automorphisms
We have mentioned that if F ⊆ E is an extension of fields, then we will be
interested in Aut(E/F ). This is a subgroup of Aut(E). Since automorphisms
must be one-to-one correspondences, the group Aut(E) is already a permutation
group. Now the definition of Aut(E/F ) as the automorphisms φ of E so that
for all x ∈ F , φ(x) = x. This translates into the statement that Aut(E/F ) is
the pointwise stabilizer of F in Aut(E).
Exercises (32)
1. Write out a rigorous proof of Lemma 4.2.1. An inductive proof is recommended.
2. Write out the elements of D8 . Without writing out the full multiplication
table for D8 , figure out the orders of the elements of D8 . Without writing
out the full multiplication table for Z8 , figure out the orders of the elements
of Z8 .
3. Find two elements of D8 that do not commute.
4. Prove Lemma 4.2.2.
5. Prove Lemma 4.2.3.
6. Prove Lemma 4.2.5.
7. Let G be a permutation group on X and let A ⊆ X. Let B = X −A. Show
that StG (A) = StG (B). Notice that you are asked to prove an equality,
not just an isomorphism.
8. Let G = Sn and let A = {1, 2, . . . , m} with 1 ≤ m < n. What is |GA |?
What is |StG (A)|?
9. Let G be a permutation group on X and let a ∈ A ⊆ B ⊆ X. Prove that
the following subgroup relations always hold.
(a) GA ⊆ Ga .
(b) GA ⊆ StG (A).
(c) GB ⊆ GA .
10. These questions refer to G = D12 . Review the definitions of stabilizers
while doing the problems. In the following “what is” means “what elements are in.”
(a) What is G3 ? G4 ? G5 ? G6 ?
(b) What is G{1,2} ? What is G{2,3} ? What is G{1,4} ?
(c) What is StG ({1, 2})? What is StG ({2, 3})? What is StG ({1, 4})?
(d) For what p and q in {1, 2, 3, 4, 5, 6} is G{p,q} equal to G1 ?
(e) For what p and q in {1, 2, 3, 4, 5, 6} is StG ({p, q}) equal to G1 ?
(f) Show that StG ({1, 3, 5}) is isomorphic to D6 .
11. Let G be a permutation group on X and let a ∈ A ⊆ B ⊆ X. Find
examples that show that the following containments are sometimes false.
Note that there are very few permutation groups that you know at this
point. You should be able to find examples among the Sn and the D2n .
(a) Ga ⊆ GA .
(b) StG (A) ⊆ GA .
(c) GA ⊆ GB .
(d) StG (A) ⊆ StG (B).
(e) StG (B) ⊆ StG (A).
4.3
Conjugation
4.3.1
Definition and basics
Definition
Let G be a group and let a and b be in G. Then the conjugate of a by b is
the element bab−1 . Some texts will give this as b−1 ab, but that would involve a
different convention as to how permutations compose. You should keep in mind
that when changing books, the formula for conjugation might change.
We will often write ab as a shorthand for bab−1 .
The resemblance of a conjugation to the criterion for normality given after
Lemma 3.2.15 is not an accident. The definition of normality can be reworded
to say that a subgroup N of a group G is normal in G if for every a ∈ N and
b ∈ G, the conjugate of a by b is in N .
If c = bab−1 , then we say that a is conjugate to c (by b) giving us the relation
“is conjugate to” on G. Note that “is conjugate to” is an existence statement.
The element a is conjugate to c means that there exists a b so that ab = c.
The conjugacy operation and the conjugacy relation are extremely important
and will come up often.
Basics
Before looking at conjugacy in permutation groups, we give some general facts.
The first is extremely trivial, but so important that we make it a lemma.
Lemma 4.3.1 If a and b are in a group G, then ab = a if and only if a and b
commute.
Proof. If bab−1 = a, then multiplying both sides on the right by b shows ba = ab.
If ba = ab, then multiplying both sides on the right by b−1 shows ab = bab−1 = a.
The three proofs below, left as exercises, are in order of increasing depth of
definition. The last requires extreme care in keeping track of exactly what is
what.
Lemma 4.3.2 In a group G, the conjugacy relation is an equivalence relation.
The proof is left as an exercise.
Lemma 4.3.3 If G is a group and b ∈ G, then the function cb : G → G defined
by cb (x) = bxb−1 = xb is an automorphism.
The proof is left as an exercise.
Lemma 4.3.4 If G is a group, then f : G → Aut(G) defined by f (b) = cb with
cb as in Lemma 4.3.3 is a homomorphism.
The proof is left as an exercise.
Some comments are in order.
It is vital to work through the proofs of Lemmas 4.3.3 and 4.3.4. Much of
how conjugation cooperates with products (how does g f behave if g is replaced
by a product gh or if f is replaced by a product f h) is revealed by working
through the proofs. Similarly, the details of the proofs reveal how conjugation
cooperates with the taking of inverses.
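Readers who like to experiment can check both lemmas mechanically in a small group. The Python sketch below (our addition, not part of the original text; the tuple encoding and the helper names compose, inverse and conj are our choices) verifies the statements of Lemmas 4.3.3 and 4.3.4 in S3.

```python
from itertools import permutations

# A permutation of {1, ..., n} is stored as a tuple p with p[i-1] = p(i).
def compose(a, b):
    # Right-to-left composition, as in these notes: (a b)(i) = a(b(i)).
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inverse(b):
    inv = [0] * len(b)
    for i, j in enumerate(b, start=1):
        inv[j - 1] = i
    return tuple(inv)

def conj(a, b):
    # The conjugate of a by b, written ab in these notes: b a b^{-1}.
    return compose(compose(b, a), inverse(b))

S3 = list(permutations((1, 2, 3)))

for b in S3:
    # Lemma 4.3.3: conjugation by b respects products.
    assert all(conj(compose(x, y), b) == compose(conj(x, b), conj(y, b))
               for x in S3 for y in S3)
for a in S3:
    for b in S3:
        # Lemma 4.3.4: b -> c_b is a homomorphism, i.e. c_{ab} = c_a c_b.
        assert all(conj(x, compose(a, b)) == conj(conj(x, b), a) for x in S3)
print("checked")
```

Working through why the second batch of assertions passes is essentially the proof of Lemma 4.3.4: c_{ab}(x) = (ab)x(ab)−1 = a(bxb−1)a−1.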
From Lemma 4.3.2, we know that a group G is partitioned into equivalence
classes under the relation “is conjugate to.” The classes are called conjugacy
classes and for a ∈ G, the conjugacy class of a is the set of all x ∈ G so that a
is conjugate to x. Note that since “is conjugate to” is symmetric, you do not
have to remember whether a is conjugate to ab or ab is conjugate to a. They
are both true.
The next comment turns out to be a hint for the proof of Lemma 4.3.4, but
it is too important to leave out. The fact that f (b) = cb is a homomorphism
implies that conjugation by b and conjugation by b−1 are inverse automorphisms.
So if b conjugates a to x, then b−1 will conjugate x to a. Of course, this is clear
by plugging into the definition of conjugation and simplifying.
To state one last lemma, we bring in more notation. If S is a subset (we do
not need subgroup for this) of a group G and b ∈ G, then S b = {xb | x ∈ S}.
That is, S b contains all the conjugates by b of elements in S. We call S b the
conjugate of S by b. Usually, we will look at S b when S is in fact a subgroup of
G. When this happens, we get the following.
Lemma 4.3.5 If G is a group, if H ⊆ G is a subgroup and b ∈ G, then H b is
a subgroup of G and the restriction of cb as defined in Lemma 4.3.3 to H is an
isomorphism from H to H b .
That H b is a subgroup follows from Lemma 3.2.13 applied to the restriction
of cb to H. The restriction is a homomorphism since the only requirement for
being a homomorphism is that certain equalities hold. The image is H b .
The rest follows immediately from Lemma 4.3.3 and the definition of H b .
One of the next goals is to completely understand the conjugacy relation in
the groups Sn .
4.3.2
Conjugation of permutations
The calculation that starts it all
Let G be a permutation group on X and let a and b be in G and let x be in X.
We calculate
(ab )(b(x)) = (bab−1 )(b(x)) = (ba)(b−1 (b(x))) = (ba)(x) = b(a(x)).
(4.3)
The calculation in (4.3) has been referred to as “the fundamental triviality” of
permutation groups. Its consequences are enormous. We start with a discussion
of what (4.3) tells us immediately.
The element x of X is carried to a(x) by a. The calculation in (4.3) shows
that b(x) is carried to b(a(x)) by ab .
We can think of the pair of elements x and a(x) as “witnessing” the action
of a on x. We are just looking at the pair consisting of an element of X and the
element that a carries it to. The calculation (4.3) says that the two elements x
and a(x) that witness the action of a on x are carried by b to the two elements
b(x) and b(a(x)) that witness the action of ab on b(x).
This is summarized by the following diagram that might help keep track of
things.
            a
     x ----------> a(x)
     |              |
    b|              |b
     v              v
    b(x) --------> b(a(x))
        ab = bab−1
The two ways of going from the lower left corner to the lower right corner are
to either go straight across with bab−1 or first up (opposite the direction of
the arrow representing b) via b−1 , then over via a and then down by b. Since
the composition of these three permutations is written from right to left we get
bab−1 for this second path. Thus the two ways of going from the lower left
corner to the lower right corner are consistently labeled. That b carries the two
elements x and a(x) to the two elements b(x) and b(a(x)) repeats the main point
of the calculation (4.3).
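The calculation (4.3) can also be confirmed by exhaustive machine check in a small case. The Python sketch below (our addition; the choice of S4, the tuple encoding and the helper names are ours) verifies (ab )(b(x)) = b(a(x)) for every a and b in S4 and every x.

```python
from itertools import permutations

# Permutations of {1, ..., n} as tuples with p[i-1] = p(i).
def compose(a, b):
    # Right-to-left composition: (a b)(i) = a(b(i)).
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inverse(b):
    inv = [0] * len(b)
    for i, j in enumerate(b, start=1):
        inv[j - 1] = i
    return tuple(inv)

# Check (4.3): (ab)(b(x)) = b(a(x)) for all a, b in S4 and all x in {1,2,3,4}.
for a in permutations((1, 2, 3, 4)):
    for b in permutations((1, 2, 3, 4)):
        ab = compose(compose(b, a), inverse(b))   # ab = b a b^{-1}
        assert all(ab[b[x - 1] - 1] == b[a[x - 1] - 1] for x in (1, 2, 3, 4))
print("checked")
```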
Actual computations
Let σ be in Sn . We recall the Cauchy notation for σ, which reads

    σ = ( 1     2     3    · · ·   n
          σ(1)  σ(2)  σ(3) · · ·  σ(n) ).        (2.2)
Let τ also be in Sn . We wish to use (4.3) to guide us in writing down the
Cauchy notation for σ τ .
Each column

    ( i
      σ(i) )

in (2.2) gives a pair that “witnesses” the action of σ on i. Following the
dictates of (4.3), the corresponding pair for σ τ is

    ( τ (i)
      τ (σ(i)) ).
This leads us to write down the Cauchy notation for the full permutation σ τ as

    σ τ = ( τ (1)     τ (2)     τ (3)    · · ·  τ (n)
            τ (σ(1))  τ (σ(2))  τ (σ(3)) · · ·  τ (σ(n)) ).        (4.4)
A verbal description of how to obtain (4.4) from (2.2) is quite easy: apply τ
to every entry in both lines of the Cauchy notation for σ to obtain the Cauchy
notation for σ τ . This will probably result in the first line not being in numerical
order. This is a problem only if this bothers you. If you insist on putting the
first line in numerical order, you can certainly do so if you move both entries in
a column together. An example is called for.
With G = D12 , we wrote out (just under (4.2)) the non-identity element of
G1 and also of G2 . Let

    α = ( 1 2 3 4 5 6
          1 6 5 4 3 2 )

be the non-identity element of G1 and let

    β = ( 1 2 3 4 5 6
          3 2 1 6 5 4 )

be the non-identity element of G2 . Consider also the permutation

    γ = ( 1 2 3 4 5 6
          2 3 4 5 6 1 ).
Following the instructions that we have worked out for conjugating permutations, we see that

    αγ = ( 2 3 4 5 6 1     ( 1 2 3 4 5 6
           2 1 6 5 4 3 ) =   3 2 1 6 5 4 ) = β.
The fact that α and β come from G1 and G2 is totally irrelevant to the
calculation of αγ itself. We will comment later on why we introduced α and β
this way. For now we will only ask that you calculate αδ where

    δ = ( 1 2 3 4 5 6
          2 1 6 5 4 3 ).
If you are wondering about the relevance of this, we point out that γ(1) =
δ(1) = 2.
It should also be mentioned that αγ = γαγ −1 can be calculated quite nicely
from the techniques of multiplication and inversion given in Section 2.3.4. However, the techniques for calculating conjugations of permutations given here are
simpler and also illustrate important aspects of what is really happening in a
conjugation.
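The hand calculation above can also be double checked by machine. The following Python sketch (our addition; the tuple encoding and helper names are ours) computes γαγ −1 directly from the definition and recovers β.

```python
# α, β and γ from the Cauchy notations above, as tuples with p[i-1] = p(i).
alpha = (1, 6, 5, 4, 3, 2)
beta  = (3, 2, 1, 6, 5, 4)
gamma = (2, 3, 4, 5, 6, 1)

def compose(a, b):
    # Right-to-left composition: (a b)(i) = a(b(i)).
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inverse(b):
    inv = [0] * len(b)
    for i, j in enumerate(b, start=1):
        inv[j - 1] = i
    return tuple(inv)

# The conjugate of alpha by gamma should come out to beta.
print(compose(compose(gamma, alpha), inverse(gamma)))  # (3, 2, 1, 6, 5, 4)
```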
4.3.3
Conjugation of stabilizers
We give further consequences of (4.3). We start with the most trivial observation.
Conjugation of stabilizers of single elements
Assume that G is a permutation group on X, that x ∈ X and that a ∈ G is
in Gx . That is, a(x) = x. Then for any b ∈ G, we have that b carries the two
elements x and a(x) = x to two elements b(x) and b(a(x)) = b(x) so that ab
carries the first b(x) to the second b(x). That is, ab is in Gb(x) .
Of course, this could have been verified by repeating the calculation in (4.3)
under the assumption that a(x) = x. We did say that this observation is the
most trivial.
We know that cb : G → G defined by cb (a) = ab is an automorphism. In
particular it is one-to-one. So we have shown that restricting cb to Gx sends it
in a one-to-one fashion to Gb(x) .
We now turn to cb−1 . Identical arguments show that cb−1 takes Gb(x) in a one-to-one fashion to Gx since b−1 takes b(x) to x. From Lemma 4.3.4, we know
that cb−1 and cb are inverse functions. Thus from Lemma 2.2.9, we know that
cb restricted to Gx is a bijection from Gx to Gb(x) . We have enough information
to state the following summary.
Lemma 4.3.6 If G is a permutation group on X with x ∈ X and b ∈ G, then
(Gx )b = Gb(x) and cb restricted to Gx is an isomorphism from Gx to Gb(x) .
The last conclusion is a direct application of the first conclusion and Lemma
4.3.5.
Conjugation of general stabilizers
Lemma 4.3.7 If G is a permutation group on X with A ⊆ X and b ∈ G, then
(StG (A))b = StG (b(A)) and cb restricted to StG (A) is an isomorphism from
StG (A) to StG (b(A)).
Proof. The first claim is an equality of groups which comes down to an equality
of sets. The second claim (about the isomorphism) follows from the equality of
the first claim and Lemma 4.3.5. We thus focus on the equality and start with
one containment.
An element of (StG (A))b is some hb where h ∈ StG (A). We want hb to be
in StG (b(A)) so any element of b(A) should be carried to something in b(A) by
hb . But an element of b(A) is of the form b(a) for some a ∈ A. We know that
h(a) is in A and the fundamental triviality says that hb carries b(a) to b(h(a))
which must be in b(A). This proves hb is in StG (b(A)).
We now have (StG (A))b ⊆ StG (b(A)). The reverse containment is handled
as in the last two paragraphs of our proof of Lemma 4.3.6. We have shown that
the bijection cb carries StG (A) into StG (b(A)). With identical proof, we get that
cb−1 carries StG (b(A)) into StG (A). From Lemma 4.3.4, we know that cb and
cb−1 are inverse bijections, and from Lemma 2.2.9, we know that cb restricted
to StG (A) is a bijection to StG (b(A)).
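Lemma 4.3.7 can be tested exhaustively in a small case. The sketch below (our addition; the choice of G = S4 , the set A = {1, 2} and the helper names are ours) checks the equality (StG (A))b = StG (b(A)) for every b in S4 .

```python
from itertools import permutations

def compose(a, b):
    # Right-to-left composition of tuples with p[i-1] = p(i).
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inverse(b):
    inv = [0] * len(b)
    for i, j in enumerate(b, start=1):
        inv[j - 1] = i
    return tuple(inv)

G = list(permutations((1, 2, 3, 4)))      # G = S4 acting on X = {1, 2, 3, 4}

def setwise_stab(A):
    # St_G(A): the elements of G that carry the set A onto itself.
    return {g for g in G if {g[a - 1] for a in A} == A}

A = {1, 2}
for b in G:
    bA = {b[a - 1] for a in A}
    conjugates = {compose(compose(b, h), inverse(b)) for h in setwise_stab(A)}
    assert conjugates == setwise_stab(bA)  # (St_G(A))^b = St_G(b(A))
print("checked")
```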
Conjugation of pointwise stabilizers
We round out the discussion with the following predictable lemma.
Lemma 4.3.8 If G is a permutation group on X with A ⊆ X and b ∈ G, then
(GA )b = Gb(A) and cb restricted to GA is an isomorphism from GA to Gb(A) .
The details of showing that this follows from Lemma 4.3.6 and Lemma 4.2.4
are left to the reader.
We note that if this is applied to the example of Aut(E/F ) when F ⊆
E is an extension of fields, then we get that if φ is an automorphism of E,
then (Aut(E/F ))φ = Aut(E/φ(F )) and that φ restricted to Aut(E/F ) is an
isomorphism from Aut(E/F ) to Aut(E/φ(F )).
4.3.4
Conjugation and cycle structure
We introduce a second notation for elements of Sn . It has the advantage of
needing less writing.
Cycles
The main fact that the discussion starts from is the following.
Lemma 4.3.9 Let g be an element of Sn and let i be in {1, 2, . . . , n}. Then
there is a smallest positive integer s so that g s (i) = i.
Proof. There are infinitely many positive integers and only finitely many elements of {1, 2, . . . , n}. If we consider the sequence i = g 0 (i), g 1 (i), g 2 (i), . . .,
then there is (by well ordering) a first place g s (i) where the value is a value that
has already occurred in the sequence. Since the previous appearance could be
i = g 0 (i) itself, we have some t ≥ 0 and s > t so that g t (i) = g s (i). Further, s
is the smallest integer for which this is true. We claim that g s (i) = i.
If not, then t > 0 and t − 1 ≥ 0. Then g t (i) = g s (i) = g(g s−1 (i)) and
g t (i) = g(g t−1 (i)). We know g s−1 (i) ≠ g t−1 (i) because g s (i) is the first value
that repeats an earlier value. But g t−1 (i) ≠ g s−1 (i) and g(g t−1 (i)) = g t (i) =
g s (i) = g(g s−1 (i)) violates the fact that g is a permutation and must be one-to-one. Thus t cannot be greater than 0 and g s (i) = g 0 (i) = i.
The power of Lemma 4.3.9 is that given g ∈ Sn every element of {1, 2, . . . , n}
belongs to some “cycle” associated to g. We illustrate this first with an example,
and give careful definitions and demonstrations second.
Consider

    g = ( 1 2 3 4 5 6
          5 6 4 1 3 2 ) ∈ S6 .        (4.5)
Under powers of g we have

    1 → 5 → 3 → 4 → 1    and    2 → 6 → 2.        (4.6)

That is, g 0 (1) = 1, g 1 (1) = 5, . . . , g 4 (1) = 1. Similarly g 2 (2) = 2. We want to
consider g as breaking {1, 2, 3, 4, 5, 6} into two “cycles” once we have made the
right definitions. To help with this we have the following.
Lemma 4.3.10 Let g be in Sn . For i and j in {1, 2, . . . , n}, define i ∼g j to mean
that for some integer t ≥ 0, we have g t (i) = j. Then ∼g is an equivalence relation.
Proof. g 0 (i) = i so ∼g is reflexive.
If i ∼g j, then if i = j, we certainly have j ∼g i so assume j ≠ i. We
have g t (i) = j for some t > 0. By well ordering, we can assume that t is
the smallest value with these properties. By Lemma 4.3.9, there is a smallest
s > 0 so that g s (i) = i. If s < t, then t = qs + r with r < s < t and
j = g t (i) = g qs+r (i) = g r (g qs (i)) = g r (i) since g s (i) = i. But r < t contradicts
the choice of t. Since i ≠ j, we have s ≠ t, so s > t and s = t + d for some
d > 0. Now i = g s (i) = g d (g t (i)) = g d (j) and j ∼g i. This proves that ∼g is
symmetric.
Lastly, if i ∼g j and j ∼g k, we have g s (i) = j and g t (j) = k with s and t at
least 0. Now g s+t (i) = k and i ∼g k showing that ∼g is transitive.
For g ∈ Sn , we call the equivalence classes in {1, 2, . . . , n} under ∼g the
cycles of g. Note that from any i in a class, we get all the other elements of the
cycle containing i by applying powers of g to i.
Cycle notation
For g ∈ Sn , the cycle notation for g is the form
    g = (a1 a2 . . . ak1 )(ak1 +1 ak1 +2 . . . ak1 +k2 ) · · ·
            (ak1 +k2 +···+kp−1 +1 . . . ak1 +k2 +···+kp )        (4.7)
where p is the number of cycles under ∼g . In each parenthesized group g(aj ) =
aj+1 except that if ai is the first element of a parenthesized group and aj is the
last entry in the same group, then g(aj ) = ai .
The barrage of notation in (4.7) is too cumbersome to digest and an example
is needed. The element g given in (4.5) is written in cycle form as
g = (1 5 3 4)(2 6).
Note the resemblance to (4.6).
There are many ways to write an element in cycle form. The order of the
cycles is irrelevant, and while the order inside a given cycle is determined by
the behavior of g, the starting value for the cycle is not important. Thus we
can also write g as
g = (6 2)(3 4 1 5).
The content of our discussion leads to the following.
Lemma 4.3.11 For each g ∈ Sn , g can be written in cycle notation.
Practicalities
Writing an element of Sn in cycle notation is easy. Let g be in Sn . Take your
favorite element of {1, 2, . . . , n} (say 1). Then write down
(1 g(1) g 2 (1) · · · g k1 −1 (1))
where k1 is the least positive integer for which g k1 (1) = 1. This writes down
one cycle of g. Specifically, it is the cycle containing 1. If there are elements
of {1, 2, . . . , n} that are not in the cycle containing 1, then start a new cycle
with an element i not yet recorded. If picking one at random bothers you, then
pick the smallest. Write down the cycle containing i as (i g(i) · · · g k2 −1 (i))
where k2 is the smallest positive integer where g k2 (i) = i. If there are elements
of {1, 2, . . . , n} that are not in the union of the cycles that have been written
down so far, then pick one of them (smallest, if you wish) and keep going.
The procedure above is exactly how the example g = (1 5 3 4)(2 6) was
written down.
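The procedure just described translates directly into a short program. The following Python sketch (our addition; the tuple encoding and the function name cycles are ours) finds the cycles of the element g of (4.5).

```python
def cycles(p):
    # The disjoint cycles of p (a tuple with p[i-1] = p(i)), starting each
    # new cycle at the smallest element not yet recorded, as described above.
    seen, result = set(), []
    for i in range(1, len(p) + 1):
        if i in seen:
            continue
        cycle, j = [], i
        while j not in seen:
            seen.add(j)
            cycle.append(j)
            j = p[j - 1]   # keep applying p until the cycle closes up
        result.append(tuple(cycle))
    return result

g = (5, 6, 4, 1, 3, 2)       # the element g of (4.5)
print(cycles(g))             # [(1, 5, 3, 4), (2, 6)]
```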
If we use this process on the elements α, β, γ and δ that follow (4.4), we get
α = (1)(2 6)(3 5)(4),
β = (1 3)(2)(4 6)(5),
γ = (1 2 3 4 5 6),
δ = (1 2)(3 6)(4 5).
Composition using cycle notation
Composing permutations is still composition of functions, even if the notation is
cycle notation. If the result is desired in Cauchy notation, then one just considers
the elements of {1, 2, . . . , n} in order. If the result is desired in cycle notation,
then a different order must be used. To compute αβ, we note that (αβ)(1) = 5.
So we write down (1 5 and the next calculation we do is (αβ)(5) = 3. Our cycle
of αβ now extends to (1 5 3. Next (αβ)(3) = 1, so our first cycle of αβ is (1 5 3).
We go on to (αβ)(2) = 6 and so forth. The final result is
αβ = (1 5 3)(2 6 4).
The composition βα computes as
βα = (1 3 5)(2 4 6).
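The cycle-by-cycle bookkeeping just described can be automated. In the Python sketch below (our addition; the helper names are ours) we rebuild α and β from their cycles, compose in both orders, and read off the cycles of the results.

```python
def from_cycles(cycs, n):
    # The element of Sn (as a tuple with p[i-1] = p(i)) with the given
    # cycles; unmentioned elements are fixed.
    p = list(range(1, n + 1))
    for c in cycs:
        for k, x in enumerate(c):
            p[x - 1] = c[(k + 1) % len(c)]
    return tuple(p)

def cycles(p):
    # Disjoint cycles of p, including cycles of length 1.
    seen, result = set(), []
    for i in range(1, len(p) + 1):
        if i not in seen:
            c, j = [], i
            while j not in seen:
                seen.add(j)
                c.append(j)
                j = p[j - 1]
            result.append(tuple(c))
    return result

def compose(a, b):
    # Right-to-left composition: (a b)(i) = a(b(i)).
    return tuple(a[b[i] - 1] for i in range(len(b)))

alpha = from_cycles([(2, 6), (3, 5)], 6)
beta  = from_cycles([(1, 3), (4, 6)], 6)
print(cycles(compose(alpha, beta)))   # [(1, 5, 3), (2, 6, 4)]
print(cycles(compose(beta, alpha)))   # [(1, 3, 5), (2, 4, 6)]
```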
Efficiencies
We wrote above that α = (1)(2 6)(3 5)(4). If it is known that α is an element of
S6 , then we can leave out the cycles of length 1 if it is agreed that any element
of {1, 2, 3, 4, 5, 6} that is not mentioned is in a cycle of length 1. Then the cycle
notation for α simplifies to α = (2 6)(3 5). If it is not mentioned that α ∈ S6 ,
then a reader might interpret the simpler notation to mean that α is in S5 . So
mentioning the group is important.
With this convention, other elements discussed above simplify. We get α =
(2 6)(3 5) and β = (1 3)(4 6). But γ = (1 2 3 4 5 6), αβ = (1 5 3)(2 6 4),
βα = (1 3 5)(2 4 6) and δ = (1 2)(3 6)(4 5) are no shorter.
However, if we write p = (1 5 3) and q = (2 6 4), then the statement αβ = pq
becomes a true statement. This gives a convenient statement once we make a
definition.
Disjoint cycles
We say that σ ∈ Sn is a cycle if it has only one cycle of length greater than 1.
Thus p and q are cycles. We can add to this terminology and say that σ ∈ Sn
is a k-cycle if it is a cycle and its cycle of length greater than 1 has length
exactly k. Thus p and q are both 3-cycles.
This leaves the problem of what to say about the identity. We will avoid the
problem by simply calling it the identity.
What we have shown is that αβ is a product of two 3-cycles. But there is
more information here than we are pointing out. If you let r = (1 2 3 4) and
s = (4 5 6), then
rs = (1 2 3 4 5 6) = γ
which is already a cycle. The point is that the set of elements {1, 5, 3} most
relevant to p and the set of elements {2, 6, 4} most relevant to q are disjoint.
For r and s they are not.
We make two more definitions.
If σ is in Sn , the support of σ is the set {i ∈ {1, 2, . . . , n} | σ(i) ≠ i}. That
is, the support of σ consists of the elements that σ actually moves. The support
of p is the set {1, 5, 3} and the support of q is {2, 6, 4}.
Our observation is that the supports of p and q are disjoint. We say that two
cycles are disjoint if their supports are disjoint. The main point of Lemma 4.3.11
is that every element of Sn is a product of cycles that are pairwise disjoint. The
usual way to say this leaves out the word “pairwise” since it is understood that
pairwise is meant. The translation of Lemma 4.3.11 into this terminology is as
follows.
Lemma 4.3.12 Every non-identity element of Sn can be written as a product
of disjoint cycles.
We omit the identity element from the statement for the reasons mentioned
above.
Cycle structure
There is a certain level of uniqueness that goes with Lemma 4.3.12. It follows
from the fact that the cycles of a g ∈ Sn are the equivalence classes of ∼g
which are completely determined by g. Beyond that, there is a lot of freedom
in writing out cycles of an element. Recall the example that we gave where
g = (1 5 3 4)(2 6) = (6 2)(3 4 1 5)
gives two (among many other) ways to write g as a product of disjoint cycles.
We will concentrate on the fact that the cycles of g ∈ Sn are equivalence
classes determined by ∼g . From this it is clear that the sizes of the cycles are
completely determined by g. Since there can be several cycles of each size (recall
the example α = (2 6)(3 5)), we cannot just talk about the set of cycle sizes.
We must say how many cycles there are of each size. For a given g ∈ Sn , we
will call the cycle structure of g the number of k-cycles that g has for each k.
Of course, for most k this number is 0. We even include the cycles of length 1
for completeness.
In the examples we have been using, α and β both have two 2-cycles, and
two 1-cycles, γ has one 6-cycle, δ has three 2-cycles. Both αβ and βα have two
3-cycles. We will see later that this is not a coincidence.
We are now ready to apply cycle structure to conjugation.
4.3.5
Permutations that are conjugate in Sn
In this section we will learn exactly which pairs of elements in Sn are conjugate
and which are not. Further, if two elements a and b of Sn are
conjugate, we will be able to figure out an element h ∈ Sn so that ah = b.
This is rather special to Sn . Other groups do not behave this well.
The key lemma that follows is a direct application of the fundamental triviality.
Lemma 4.3.13 Let a and b be in Sn . If

    (i1 i2 · · · ik )

is a k-cycle of a, then

    (b(i1 ) b(i2 ) · · · b(ik ))

is a k-cycle of ab .
Proof. To say that (i1 i2 · · · ik ) is a cycle of a means that a(ij ) = ij+1 for
1 ≤ j < k and a(ik ) = i1 . By (4.3), we have ab (b(ij )) = b(ij+1 ) for 1 ≤ j < k
and ab (b(ik )) = b(i1 ). But this information says exactly that
(b(i1 ) b(i2 ) · · · b(ik ))
is a cycle of ab . Since b is one-to-one, there are k different elements in the cycle
and it is a k-cycle.
Corollary 4.3.14 Let a and b be in Sn . Then a and ab have the same cycle
structure.
Proof. From Lemma 4.3.13, b takes k-cycles of a to k-cycles of ab . Since b is
one-to-one, two k-cycles of a cannot be mapped by b to one k-cycle of ab . Thus
ab has at least as many k-cycles as a. Now b−1 conjugates ab to a, so b−1 takes
(in a one-to-one fashion) k-cycles of ab to k-cycles of a. Thus a has at least
as many k-cycles as ab and we have that a and ab have the same number of
k-cycles. Since this applies to any k, we have the claimed result.
There is a converse to the corollary. Assume σ and τ are in Sn and have the
same cycle structure. It is easy to create an h ∈ Sn so that σ h = τ . It is easiest
to do this by using the cycle notation for σ and τ to create Cauchy notation for h.
First write out the cycle notation for σ on one line. Then on the line below,
write out cycle notation for τ so that for each k every k-cycle of τ is directly
below a k-cycle of σ. The fact that σ and τ have the same cycle structure
guarantees that this can be done. Now erase the parentheses from the cycle
structures, put large parentheses around the pair of lines and h has been given
in Cauchy notation.
The reason that this all works is (4.3) and Lemma 4.3.13. If h is built by the
instructions above, then for each k-cycle of σ, its image under h is a k-cycle of
σ h . But h was built to have this be exactly a k-cycle of τ . Thus we get σ h = τ
on a cycle by cycle basis.
We illustrate this with α = (1)(2 6)(3 5)(4) and β = (1 3)(2)(4 6)(5) from
our examples above. The first step is to write
α = (1)(2 6)(3 5)(4)
β = (2)(1 3)(4 6)(5).
Note that we have rearranged the cycles of β so that each 1-cycle of β is under
a 1-cycle of α and so that each 2-cycle of β is under a 2-cycle of α. Now we
change parentheses to get

    h = ( 1 2 6 3 5 4     ( 1 2 3 4 5 6
          2 1 3 4 6 5 ) =   2 1 4 5 6 3 ) = (1 2)(3 4 5 6).
Note that there are several ways to line the cycles of β under the cycles of
α. Not only that, a k-cycle can be written down in k different ways depending
on which element of the cycle is listed first. This means that there may be
many elements that will work as the conjugator. This explains why we did not
discover either of the elements in S6 , namely γ and δ, that we already knew
conjugated α to β.
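The recipe for building a conjugator h can be written as a program. The Python sketch below (our addition; the function name conjugator and the matching-by-length strategy are our choices, and the recipe allows many other valid outputs) builds one h and checks that it conjugates α to β.

```python
def cycles(p):
    # Disjoint cycles of the tuple p, 1-cycles included.
    seen, result = set(), []
    for i in range(1, len(p) + 1):
        if i not in seen:
            c, j = [], i
            while j not in seen:
                seen.add(j)
                c.append(j)
                j = p[j - 1]
            result.append(tuple(c))
    return result

def compose(a, b):
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inverse(b):
    inv = [0] * len(b)
    for i, j in enumerate(b, start=1):
        inv[j - 1] = i
    return tuple(inv)

def conjugator(sigma, tau):
    # An h with h sigma h^{-1} = tau, built by lining up a cycle of tau
    # under each cycle of sigma of the same length.  Assumes sigma and
    # tau have the same cycle structure.
    by_length = {}
    for c in cycles(tau):
        by_length.setdefault(len(c), []).append(c)
    h = [0] * len(sigma)
    for c in cycles(sigma):
        target = by_length[len(c)].pop()
        for x, y in zip(c, target):
            h[x - 1] = y      # h carries this cycle of sigma onto the target
    return tuple(h)

alpha = (1, 6, 5, 4, 3, 2)    # (2 6)(3 5) in S6
beta  = (3, 2, 1, 6, 5, 4)    # (1 3)(4 6) in S6
h = conjugator(alpha, beta)
assert compose(compose(h, alpha), inverse(h)) == beta
print(h)
```

Because of the freedom in lining up cycles, a different matching strategy produces a different, equally valid h.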
We have done all the work to prove the following.
Theorem 4.3.15 Let α and β be in Sn . Then α and β are conjugate in Sn if
and only if they have the same cycle structure.
Note the phrase “in Sn ” in the statement. We qualify the relation “conjugate
to” to include where the conjugator must come from. This only becomes relevant
if there is a group and subgroup in the discussion. If f and g are in a subgroup
H of a group G, then to say that f and g are conjugate in H means that there
is an h ∈ H so that f h = g. To say that f and g are conjugate in G means
that the conjugator h is only required to come from G. Since there are more
elements in G that might act as conjugators, it is possible for two elements to
be conjugate in G and not in H.
We can illustrate this with elements of D12 . By consulting the figure below

    [Figure: a regular hexagon with its vertices labeled 1 through 6 in order around the hexagon.]

you can check that
σ = (1 4)(2 5)(3 6),
τ = (1 4)(2 3)(5 6)
are both elements of D12 . According to Theorem 4.3.15, σ and τ are conjugate
in S6 . However, there are 720 elements in S6 and only 12 in D12 . We could
conjugate σ by all 12 elements of D12 , but that would be rather dull. We will
argue that σ and τ are not conjugate in D12 by combining what we know about
conjugating permutations and what we know about hexagons.
The three 2-cycles in σ all consist of pairs that are on opposite extremes of
the hexagon. They are the pairs (1 4), (2 5) and (3 6). If a conjugator h were
to take the three 2-cycles of σ to the three 2-cycles of τ , at least one of the pairs
just listed would have to be taken to the pair (2 3) of τ . However, the pairs in
σ are pairs of opposite vertices, the pair (2 3) is a pair of adjacent vertices, and
no permutation of the vertices of the hexagon that preserves the structure of
the hexagon can take an opposite pair to an adjacent pair. Thus no conjugator
can come from D12 . So σ and τ
are in D12 , are conjugate in S6 , but not conjugate in D12 .
4.3.6
One more example
We move to a somewhat more complex example of a permutation group.
We consider the group G of symmetries of the cube. A cube with labeled
vertices is shown below.

    [Figure: a cube with vertices labeled 1 through 8. Vertex 1 is adjacent to vertices 2, 4 and 5, and the top face has vertices 2, 3, 7 and 6.]        (4.8)
There are 48 elements of G. We see this by noting that the vertex 1 can go
to 8 different places. Once there, its three neighbors (vertices connected to it
by an edge) can be permuted at will.
To see this last point, note that if vertex 1 is viewed as sitting at the origin
of three-space, then vertices 2, 4, 5 sit on the major axes. The vertices 2, 4,
5 can be permuted in any way while keeping 1 fixed and the cube comes back
to itself. Note that some of these permutations involve reflections as well as
possibly rotations. Keeping 1 and 4 fixed and switching 2 and 5 involves a
reflection of the cube in a plane through edge 1,4 and tilted 45 degrees up from
the horizontal.
Since there are 6 permutations for each position that 1 is sent to, there are
a total of 48 symmetries of the cube in (4.8).
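The count of 48 can be confirmed by brute force. In the Python sketch below (our addition) we encode the cube of (4.8) by its edges, as read off from the discussion above (vertex 1 adjacent to 2, 4 and 5; top face 2, 3, 7, 6); this labeling is our reading of the figure, so treat it as an assumption. A symmetry is then a permutation of the vertices that carries edges to edges.

```python
from itertools import permutations

# Edge set of the labeled cube, inferred from the surrounding text.
edges = {frozenset(e) for e in
         [(1, 2), (1, 4), (1, 5), (2, 3), (2, 6), (3, 4),
          (3, 7), (4, 8), (5, 6), (5, 8), (6, 7), (7, 8)]}

# A symmetry of the cube is exactly a vertex permutation taking edges
# to edges (rotations and reflections both qualify).
symmetries = [p for p in permutations(range(1, 9))
              if all(frozenset((p[a - 1], p[b - 1])) in edges
                     for a, b in map(tuple, edges))]

print(len(symmetries))  # 48
```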
We can look at stabilizers. To make things more interesting, we can think
of not only moving the vertices, but moving the entire cube. Thus we can let
x be the center point of the top face (the face 2,3,7,6) and discuss Gx . This
turns out to be the same as the stabilizer of the top face. We can let L be a
line segment from the midpoint of edge 2,6 to the midpoint of 3,7. Now we can
ask about the stabilizer of L.
These and other questions will be left as exercises.
4.3.7
Overview
All of (4.3), Lemma 4.3.6, Lemma 4.3.7 and Corollary 4.3.14 are variants of the
same theme.
The calculation (4.3) says that if σ and τ are permutations on X, then the
behavior of σ in a given location is carried by τ to the behavior of σ τ at the
image of that location under τ . This is exploited in Corollary 4.3.14 where it is
interpreted in cycle notation.
Lemma 4.3.6 and Lemma 4.3.7 interpret the calculation (4.3) in the setting
of a group of permutations instead of a single permutation. The lemmas say
that the behavior of a group H of permutations in a given location is carried by
a conjugator τ to the behavior of the group H τ at the image of that location
under τ .
This summary is as important to know as the ability to do the calculations
that lie behind the summary. If this summary is well understood then you will
have an easy time with several of the problems below.
The point of view just discussed will reappear often in these notes.
Exercises (33)
1. Prove Lemma 4.3.2.
2. Prove Lemma 4.3.3.
3. Prove Lemma 4.3.4.
4. In proving Lemma 4.3.4, you have to do the calculations that “fix” Cayley’s theorem about multiplying on the right. If you have not already done
so in Exercise Set (31), prove that setting ra (x) = xa−1 for each a in a
group G and f (a) = ra gives an isomorphism from G to a subgroup of SG .
5. There are four elements α, β, γ and δ defined after (4.4). Pick out several
pairs of these four elements and conjugate one by the other. Make sure
that you try at least one conjugation of an element with itself and see if
the result complies with Lemma 4.3.1.
6. Let G = D12 . If you calculated αδ , you should have gotten β which is also
αγ . Given that α is the non-identity element of G1 , explain why αγ = αδ
must be true.
7. We still let G = D12 . In Problem 10 in Exercise Set (32), you hopefully
wrote out the elements of G{1,4} . Explain how you can immediately write
out the elements of G{2,5} . Explain how you can immediately get the
elements of StG ({2, 5}) from the elements of StG ({1, 4}).
8. In D8 there are four elements that do not “flip” the square. The numbering of the corners of the square in (4.1) is arranged to increase in
the counterclockwise direction. The four elements that do not “flip” the
square are those that preserve the fact that the numbering increases in
the counterclockwise direction. We can call these elements the rotations
of the square. Show that these four elements form a subgroup of D8 and
that it is normal in D8 .
9. Give the details of the proof of Lemma 4.3.8 that were hinted at following
the statement of the lemma.
10. For the elements α, β, γ and δ of D12 given after (4.4), write each of βγ,
γβ, βδ, δβ, γδ and δγ in cycle notation. Do the same for γ 2 , γ 3 , γ 4 and
γ 5 .
11. What are the cycle structures of γ i for each i with 1 ≤ i ≤ 6?
12. For the elements α and β used in Section 4.3.5, how many different elements h ∈ S6 are there that conjugate α to β?
13. Both αβ and βα are products of two disjoint 3-cycles. Thus they have the
same cycle structure and must be conjugate. Find an h that conjugates
αβ to βα. Now take elements a and b in any group. Prove that ab and ba
are conjugate. Hint: this last fact has nothing to do with permutations.
Conclude that in Sn even if two elements f and g do not commute, then
at least f g and gf have the same cycle structure.
14. The following refer to the group G of symmetries of the cube shown in
(4.8). How many elements are in each of the following? If this is too easy,
list the elements.
(a) G1 .
(b) StG ({1, 7}).
(c) StG ({1, 6}).
(d) StG ({1, 5}).
(e) StG ({1, 2, 3, 4}).
(f) StG ({1, 3, 6, 8}).
(g) StG ({x, y}) where x is the center point of the top face and y is the
center point of the bottom face.
(h) StG ({x, y}) where x is the center point of the edge 2,6 and y is the
center point of the edge 1,5.
(i) Find an element that conjugates StG ({1, 6}) to StG ({3, 8}). Verify
by writing out the elements of each and checking. You should find
the conjugator before you write out the elements of the stabilizers.
(j) Find an element that conjugates StG ({1, 5}) to StG ({3, 4}). Verify
by writing out the elements of each and checking. You should find
the conjugator before you write out the elements of the stabilizers.
15. If you got the stabilizer of {1, 3, 6, 8} right in the group of symmetries
of the cube in (4.8), then you got 24 elements. Since {1, 3, 6, 8} has 4
elements and S4 has 24 elements, this should tell you something. What?
Chapter 5
Group actions II: general
actions
In this chapter, we move from permutation groups to a more general notion
known as group actions. Lemma 4.3.4 is good motivation for what we discuss
in this chapter. We repeat the lemma for the convenience of the reader.
Lemma (4.3.4) If G is a group, then f : G → Aut(G) defined by f (b) = cb
with cb as in Lemma 4.3.3 is a homomorphism.
This can be compared to Cayley’s theorem. Cayley’s theorem says that any
group is isomorphic to a subgroup of a symmetric group. Lemma 4.3.4 says
that every group has a homomorphism to a subgroup of its own automorphism
group. The similarities and differences are worth noting.
Since elements of Aut(G) are bijections from G to G (admittedly of a special
nature), they are permutations of the elements of G. Thus the target of the
function f in Lemma 4.3.4 is a group of permutations. This is similar to Cayley’s
theorem.
Further the function f in Lemma 4.3.4 is a homomorphism so we have that
f (ab) = f (a)f (b) or cab = ca cb . Thus, as in Cayley’s theorem, the multiplication
of elements of G is reflected in the composition of the permutations that f carries
the elements to.
But f is not guaranteed to be one-to-one. In fact there are strong reasons
to expect it not to be one-to-one.
The permutation corresponding to a is ca , conjugation by a. Thus ca (x) =
axa^{-1} . We know that this will equal x if a and x commute. It is possible that
a commutes with every x. In fact, this must happen in an abelian group. If a
commutes with every x ∈ G, then we will have ca (x) = axa^{-1} = x for every
x ∈ G. This means that ca is the identity permutation. But if 1 is the identity
in G, then c1 is also the identity permutation. If a ≠ 1, then we will have two
elements in G carried to the same permutation.
We can gather a lot of this together by looking at the kernel of f . This is
all a ∈ G so that f (a) = ca is the identity in Aut(G). As just observed, ca is
the identity exactly when a commutes with every x ∈ G. This is given a name:
for any group G, the center of G, written Z(G), is defined as
Z(G) = {a ∈ G | ∀x ∈ G(ax = xa)}.
The letter Z is used since the German word Zentrum, meaning center, was first
used for this construct.
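For readers who like to experiment, the center is easy to compute by brute force. The sketch below is ours, not part of the text: it writes permutations as tuples (p[i] is the image of i), builds S3 and the symmetry group D8 of a square with vertices labeled 0, 1, 2, 3 in cyclic order, and checks which elements commute with everything.

```python
from itertools import permutations

def compose(p, q):
    # (p ∘ q)(i) = p(q(i)); permutations are tuples with p[i] the image of i
    return tuple(p[q[i]] for i in range(len(q)))

def center(group):
    # Z(G) = {a in G | ax = xa for all x in G}
    return [a for a in group if all(compose(a, x) == compose(x, a) for x in group)]

def generate(gens):
    # close a set of permutations under composition; for a finite set this
    # yields the subgroup the set generates
    group = {tuple(range(len(gens[0])))} | set(gens)
    while True:
        new = {compose(a, b) for a in group for b in group} - group
        if not new:
            return group
        group |= new

S3 = list(permutations(range(3)))   # all 6 permutations of {0, 1, 2}
r = (1, 2, 3, 0)                    # quarter turn of the square's vertices
s = (3, 2, 1, 0)                    # a reflection
D8 = generate([r, s])               # the 8 symmetries of the square

print(len(center(S3)))   # 1: the center of S3 is trivial
print(len(center(D8)))   # 2: the identity and the half turn
```

The output anticipates Exercise 2 below: the center of D8 is not trivial.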
We don’t have to give an exercise to show that Z(G) is a subgroup of G. The
work in the proof of Lemma 4.3.4 that shows that f is a homomorphism shows
that its kernel Z(G) must be a subgroup of G. In fact Z(G) must be a normal
subgroup of G. In spite of the fact that we don’t have to give the exercise, we
will give it anyway.
We could reject the function f of Lemma 4.3.4 as a failed attempt at a
Cayley type theorem, but conjugation is too important a concept. Instead we
invent a new concept called a group action and declare that conjugation is one
of its prime examples.
5.1
Definition and examples
5.1.1
The definition
If G is a group and X is a set, then an action of G on X is a homomorphism
θ : G → SX . Various words are sometimes added to actions, and we will
illustrate that with the example provided by Lemma 4.3.4.
Since θ is a homomorphism, it has a kernel. The kernel of θ is the kernel of
the action. We will look at the kernel of an action again when new notation is
introduced.
The homomorphism f of Lemma 4.3.4 is an example of a group action.
The group is G and the set that G acts on is also G. All of the permutations
in the image of f are automorphisms of G so we could say that the action
is by automorphisms. However, each permutation is calculated by taking a
conjugation, so we could also say that the action is by conjugation. The latter
is more specific and the action supplied by Lemma 4.3.4 is usually introduced
by the phrase “let G act on itself by conjugation.”
Alternate definition and notation
This definition is perfectly fine, but it is not the most typical way that it is
presented. The way group actions are usually presented involves more detail
and less notation. We will explain.
If θ : G → SX is an action of G on X, then for each a ∈ G we have a
permutation θ(a) of X. Thus for each x ∈ X, we have θ(a)(x) as an element
of X. If b is another element of G and y another element of X, then we have
θ(b)(y) as an element of X. Thus for every pair (g, z) in G × X, we get an
element θ(g)(z) in X. Thus θ is a function from G × X to X.
We cannot let any function from G × X to X be an action. We would be
forgetting the multiplication on G. So we would have to add restrictions that
would make it an action. The following lemma tells what we would need.
Lemma 5.1.1 Let G be a group and X be a set. A function φ : G × X → X
gives an action θ : G → SX defined by θ(g)(x) = φ(g, x) if and only if the
following hold.
1. For all g and h in G and x ∈ X, we have φ(gh, x) = φ(g, φ(h, x)).
2. If 1 is the identity in G, then for all x ∈ X, we have φ(1, x) = x.
We will not ask you to prove this lemma. It is too ugly. We will pretty it
up first and then ask you to prove it.
The multiplication in a group is often written without a symbol. We will
reword Lemma 5.1.1 to omit the function symbol φ. An action is now going to
take a pair (g, x) in G × X and return an element of X that is called gx. It
is thought of as the element of X that g takes x to (after g is turned into a
permutation by the action). The expression gx is referred to as the result of the
action of g on x. We “know” that gx is not a multiplication because g comes
from a group G and x comes from a set X. Unfortunately, we have the example
of G acting on itself by conjugation to prove that we have to be careful with
this notation in certain circumstances.
Now Lemma 5.1.1 looks like the following.
Lemma 5.1.2 Let G be a group, X be a set and let a function from G × X to
X be given where the image of (g, x) is written gx. Then this function gives an
action θ : G → SX defined by θ(g)(x) = gx if and only if the following hold.
1. For all g and h in G and x ∈ X, we have (gh)x = g(hx).
2. If 1 is the identity in G, then for all x ∈ X, we have 1x = x.
The proof is left as an exercise. It is an important exercise to do. Once
proven, one has an alternate definition of the action of a group G on a set X
as a function from G × X to X with image of (g, x) written as gx that satisfies
conditions 1 and 2 of Lemma 5.1.2.
One of the steps in the proof of Lemma 5.1.2 will show the following fact
which is important enough to state separately.
Fact: If G acts on X with g ∈ G and x and y in X, then gx = y if and only if
g^{-1} y = x.
In doing the proof, you should keep careful track of what sets each of gh, (gh)x,
hx and g(hx) belongs to and that all the expressions and the equalities make
sense. Lastly, note that 1x is not multiplication of x by 1 and that 1x = x does
not follow from the identity axiom for groups.
If an action of G on X uses the notation gx for the result of g ∈ G acting on x ∈ X,
then the kernel of the action becomes the set {g ∈ G | ∀x ∈ X, gx = x}.
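The two conditions of Lemma 5.1.2, and the kernel, can be checked mechanically on a small example. The following sketch (our illustration, with permutations as tuples) verifies both conditions exhaustively for S3 acting on X = {0, 1, 2} in the natural way.

```python
from itertools import permutations

# G = S3 acting on X = {0, 1, 2} in the natural way: gx = g[x]
G = list(permutations(range(3)))
X = range(3)
identity = (0, 1, 2)

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def act(g, x):
    return g[x]

# Condition 1 of Lemma 5.1.2: (gh)x = g(hx) for all g, h and x
ok1 = all(act(compose(g, h), x) == act(g, act(h, x))
          for g in G for h in G for x in X)
# Condition 2: 1x = x for all x
ok2 = all(act(identity, x) == x for x in X)
print(ok1 and ok2)   # True

# the kernel of the action: {g in G | gx = x for all x in X}
kernel = [g for g in G if all(act(g, x) == x for x in X)]
print(kernel)        # only the identity: [(0, 1, 2)]
```

Here the kernel is trivial, as it must be for any permutation group acting on its own set.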
5.1.2
Examples
Every permutation group G on a set X gives an action of G on X. The kernel
of such an action is trivial. In this situation the only element of G that acts as
the identity is the identity itself.
All examples of permutation groups from Chapter 4 are therefore examples
of group actions.
Below we give a few examples of actions that are newer.
Action of a group on itself by conjugation
We already have the example of G acting on itself by conjugation. Since the
group is acting on itself, it is inadvisable to use the shorthand notation of Lemma
5.1.2 for the action. Thus it is better to use a^b for cb (a), which is the result of b
acting on a.
The notation a^b has unfortunate aspects. The first requirement in Lemma
5.1.2 says that cab (g) must equal (ca cb )(g) = ca (cb (g)). This turns into g^{ab} =
(g^b)^a . Most would prefer to see the incorrect g^{ab} = (g^a)^b . In fact, the better
looking equality holds if a different definition is made for conjugation. However,
this would then require that permutations be composed the opposite way that
functions are usually composed. Since reversing the way functions compose is
easy to get used to, many books make these changes. We will stick with our
definition of conjugation and stick with the way that we compose permutations.
So g^{ab} = (g^b)^a will remain as something we will have to live with.
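The equality g^{ab} = (g^b)^a, and the failure of the better-looking alternative, can be confirmed by exhaustive computation in S3. This sketch is ours; it uses the convention of the text, cb (a) = bab^{-1}.

```python
from itertools import permutations

G = list(permutations(range(3)))   # S3, a nonabelian test group

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0, 0, 0]
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

def conj(b, g):
    # c_b(g) = b g b^{-1}, written g^b in the text
    return compose(compose(b, g), inverse(b))

# g^{ab} = (g^b)^a holds for every a, b, g ...
assert all(conj(compose(a, b), g) == conj(a, conj(b, g))
           for a in G for b in G for g in G)

# ... while the better-looking g^{ab} = (g^a)^b fails for some a, b, g
assert any(conj(compose(a, b), g) != conj(b, conj(a, g))
           for a in G for b in G for g in G)
print("checked all 216 triples")
```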
Action of a group on its subgroups by conjugation
Let G be a group and let S be the set of subgroups of G. Let g be in G. In
the paragraph before Lemma 4.3.5, we defined A^g for any subset A of G, and
Lemma 4.3.5 says that A^g is a subgroup if A is a subgroup. So conjugation
takes subgroups to subgroups. It needs to be checked that this forms an action.
Most of the work is taken up by Lemma 4.3.4. The rest of the details are left
to the reader.
We will wait until we have more practice listing all the subgroups of a group
before giving problems based on this example.
The action by conjugation on a normal subgroup
Let G be a group and let N ⊳ G. If b is in G and n ∈ N , then n^b is in N . Thus
each element of G permutes the elements of N . Since the action of a group on
itself by conjugation is truly an action, all the equalities that must hold to make
the action of G on N a true action really do hold. Thus Lemma 4.3.4 shows
that this is an action. This is a special case of a form of restriction that will be
taken up shortly.
The action of the line on the complex plane by rotations
Let t ∈ R and z ∈ C be given. Define tz = z(cos t + i sin t). We should
have introduced e^{it} for cos t + i sin t, but this would have taken time to justify.
Basically, t rotates the complex plane through the angle t. Details about this
action are left as an exercise.
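Some of those details can be checked numerically (up to floating-point rounding). The sketch below, which is ours and not part of the exercise, verifies the two conditions of Lemma 5.1.2; note that the group here is (R, +), so the "product" in condition 1 is addition of angles.

```python
import math

def act(t, z):
    # t . z = z(cos t + i sin t): rotate z about the origin through angle t
    return z * complex(math.cos(t), math.sin(t))

z = 1 + 2j
s, t = 0.7, 1.9

# Condition 1: the group operation of (R, +) is addition, so (s + t)z = s(tz)
assert abs(act(s + t, z) - act(s, act(t, z))) < 1e-12
# Condition 2: the identity of (R, +) is 0, and 0 . z = z
assert act(0.0, z) == z
print("rotation action conditions hold (up to rounding)")
```

The kernel of this action is the set of angles that rotate every z to itself, namely the integer multiples of 2π.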
Restrictions of actions to subgroups
Let G act on X and let H be a subgroup of G. Then H acts on X by looking
only at gx for those g ∈ H. The requirements in Lemma 5.1.2 are all equalities.
These cannot be violated by restricting to a subgroup, so the restriction is also
an action.
Restricting an action to an invariant subset
Let G act on X and let Y be a subset of X. We cannot just look at gx for those
x ∈ Y and expect to get an action on Y since gx might not be in Y even if x is
in Y . So this topic needs a condition.
For G, X and Y as above, we say that Y is invariant under the action of G
if gy is in Y for every g ∈ G and y ∈ Y . Now if Y is invariant under the action
of G, then looking only at gx for those x ∈ Y gives an action of G on Y . Once
again, the equalities required in Lemma 5.1.2 cannot fail since they hold for the
action of G on X.
The action of G by conjugation on a normal subgroup of G is of this type.
Exercises (34)
1. Show that for any group G that Z(G) is a subgroup of G and in fact a
normal subgroup of G.
2. Show that the center of D6 is trivial, but that the center of D8 is not.
What are the elements of Z(D8 )?
3. Prove Lemma 5.1.2. One aspect needs care. In the direction where you
prove that 1 and 2 imply that θ is an action, remember to prove that each
θ(g) is a permutation on X.
4. Prove that the “action” of a group on its set of subgroups by conjugation
is truly an action.
5. You showed that the four rotations in D8 form a normal subgroup of
D8 . Call this subgroup R. What is the kernel of the action of D8 on R
under conjugation? What is the kernel of the action of D8 on itself by
conjugation?
6. Show that tz = z(cos t + i sin t) is an action of R on C and find the kernel
of the action.
5.2
Stabilizers
Everything that was said about stabilizers for permutation groups applies to
group actions.
Such sweeping statements are bad pedagogy. It invites the student to review
a large amount of material with a slight change of definition and check that
everything goes through as well with the new definitions as it did with the old.
Very few students accept such an invitation.
Given this, we will be very selective in what we ask the student to review.
Also given this, we have not said everything that needs to be said about group
actions/permutation groups, and have reserved a few concepts to be introduced
only after group actions are defined. That will start in the next section. Here
we review some definitions and a few lemmas.
Let a group G act on a set X with the result of g ∈ G acting on x ∈ X
written gx.
Let x ∈ X and A ⊆ X. The stabilizer of x in G is
Gx = {g ∈ G | gx = x}.
The pointwise stabilizer of A in G is
GA = {g ∈ G | ∀x ∈ A, gx = x}.
The stabilizer of A in G is
StG (A) = {g ∈ G | ∀x ∈ A, gx ∈ A}.
Comparing these definitions to those in Chapter 4 shows that the only difference
typographically is the replacement of g(x) by gx in the conditions.
Note that the kernel of the action of a group G on a set X is GX .
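The three kinds of stabilizer are easy to compare on a small example. The following sketch (ours; vertices of a square labeled 0, 1, 2, 3 in cyclic order, symmetries written as tuples) computes all three for the edge A = {0, 1}.

```python
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    # close a finite set of permutations under composition
    group = {tuple(range(len(gens[0])))} | set(gens)
    while True:
        new = {compose(a, b) for a in group for b in group} - group
        if not new:
            return group
        group |= new

# D8 = symmetries of a square with vertices X = {0, 1, 2, 3} in cyclic order
D8 = generate([(1, 2, 3, 0), (3, 2, 1, 0)])
A = {0, 1}   # an edge of the square

G_x  = [g for g in D8 if g[0] == 0]                   # stabilizer of the point 0
G_A  = [g for g in D8 if all(g[x] == x for x in A)]   # pointwise stabilizer of A
St_A = [g for g in D8 if all(g[x] in A for x in A)]   # stabilizer of A

print(len(G_x), len(G_A), len(St_A))   # 2 1 2
```

The vertex 0 is stabilized by the identity and one diagonal reflection; only the identity fixes both 0 and 1; and the stabilizer of the edge {0, 1} also contains the reflection that swaps its endpoints.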
Lemma 5.2.1 Let G act on X with x ∈ X and A ⊆ X. Then all of Gx , GA
and StG (A) are subgroups of G.
The proof is left as an exercise.
Lemma 5.2.2 Let G act on X with b ∈ G, x ∈ X and A ⊆ X. Then with
cb conjugation by b, we have that the appropriate restriction of cb gives isomorphisms as follows:
1. from Gx to Gbx ,
2. from GA to GbA , and
3. from StG (A) to StG (bA).
The shift from permutation groups to general actions is not without consequences. Calculations of conjugacy in permutation groups on finite sets (such
as subgroups of Sn ) are easily carried out. In general actions, such niceties
as Cauchy notation or cycle notation that uniquely identify elements are not
generally available. In spite of this, we will be able to work with the concepts.
Special stabilizers
Conjugation is such an important action that some stabilizers have their own
names. To discuss this, we need to look at the action of G on itself by conjugation. The abbreviated notation for an action (g takes f to gf ) would be too
confusing for this action. It looks exactly like left multiplication by g. So we
write that g takes f to f^g , or more specifically, to gf g^{-1} .
We have already seen that Z(G), the center of G, is the kernel of the action
of G on itself by conjugation. We look at other subgroups that can be defined
by this action, and in all that follows in this section, we are considering the
action of G on itself by conjugation.
If g ∈ G, then the centralizer of g in G is the stabilizer Gg with respect to this action. It
is usually defined separately as {h ∈ G | hg = gh}, and it is usually denoted
CG (g). With the machinery we have developed, we know that the centralizer of
g in G is a subgroup of G and that conjugation by f ∈ G takes the centralizer
of g in G to the centralizer of g^f in G. These facts are quite easy to show
directly from the definitions, but it is nice to know that they also follow from
more general considerations.
If H is a subgroup of G, then the stabilizer of H under the action of G on itself
by conjugation is called the normalizer of H in G. It is often denoted NG (H).
It is also a subgroup of G and the reader can supply a typical statement about
the effect of conjugation.
We can make two remarks about the normalizer. The first is that if we look
instead at the action of G on the set of subgroups of G by conjugation, then
NG (H) becomes the stabilizer of a single element, namely the subgroup H.
The second is that H ⊳ NG (H), and in fact NG (H) is the largest subgroup
of G in which H is normal. That is, if K is a subgroup of G containing H and
H ⊳ K, then K ⊆ NG (H). This is a triviality from the definitions, but we leave
it as an exercise to check the definitions.
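Both special stabilizers can be computed directly in S3. This sketch is ours: it finds the centralizer of a transposition and the normalizers of two subgroups, one self-normalizing and one normal.

```python
from itertools import permutations

G = list(permutations(range(3)))    # S3

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    q = [0, 0, 0]
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

def conj(b, g):
    # g^b = b g b^{-1}
    return compose(compose(b, g), inverse(b))

def centralizer(g):
    return [h for h in G if compose(h, g) == compose(g, h)]

def normalizer(H):
    # elements whose conjugation carries H onto H
    return [g for g in G if {conj(g, h) for h in H} == set(H)]

t  = (1, 0, 2)                              # a transposition
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]      # the 3-cycles plus 1: a normal subgroup

print(len(centralizer(t)))                  # 2: just 1 and t
print(len(normalizer([(0, 1, 2), t])))      # 2: {1, t} is its own normalizer
print(len(normalizer(A3)))                  # 6: A3 is normal, so its normalizer is all of G
```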
Exercises (35)
1. Prove Lemma 5.2.1.
2. Prove at least one of the isomorphisms in Lemma 5.2.2.
3. Prove, without quoting any lemmas, the statement made above that “conjugation by f ∈ G takes the centralizer of g in G to the centralizer of g^f
in G.” The claim is that it is an easy calculation. However, it is hard to
write down the easy calculation correctly. Be careful.
4. Let H be a subgroup of G. Prove that H ⊳ NG (H), and that if K is a
subgroup of G containing H and H ⊳ K, then K ⊆ NG (H).
5.3
Orbits and fixed points
We now come to concepts that were not introduced when permutation groups
were discussed. They could have been introduced with permutation groups,
but they were delayed to this point to have some exercises done that are not
imitations of previous exercises.
We give some useful notation. If G acts on X, if S ⊆ G and A ⊆ X are subsets,
and if g is in G and x is in X, then we can write down various subsets of X. We
define
gA = {ga | a ∈ A},
Sx = {sx | s ∈ S}, and
SA = {sa | s ∈ S, a ∈ A}.
Basically, each consists of all possible combinations hinted at by the notations
gA, Sx and SA. We look at one of these combinations when S is all of G.
Orbits
Let the group G act on the set X and let x be in X. The orbit of x under the
action of G is the set
OG (x) = Gx = {gx | g ∈ G}.
Either notation OG (x) or Gx will do. We will tend to use the first more often.
In words, the orbit of x is the set of all the elements that x is taken to under
the action of G.
Another way to describe the orbit is
OG (x) = {y ∈ X | ∃g ∈ G, y = gx}.
The second description gives a better indication of how to show that something
is in an orbit and it makes a better connection to the discussion that follows.
To say something about the nature of orbits, we define a relation. Given an
action of G on X with x and y in X, we say that x ∼G y to mean that there is
a g ∈ G so that gx = y. Note that the second definition of OG (x) makes it clear
that x ∼G y if and only if y ∈ OG (x). So if we prove that ∼G is an equivalence
relation, we will have shown that OG (x) is an equivalence class.
Lemma 5.3.1 If G acts on X and ∼G is defined as above, then ∼G is an
equivalence relation and each OG (x) is an equivalence class under ∼G .
The proof will be left as an exercise.
There is some resemblance of Lemma 5.3.1 to Lemma 4.3.10, but the proof
of Lemma 5.3.1 is even easier. In addition, Lemma 4.3.10 depends on Lemma
4.3.9 and there is no need for a parallel to Lemma 4.3.9 here.
Since orbits are equivalence classes, we now know that the orbits under the
action of G partition X. This leads to a number of counting exercises since now
the number of elements of X is known to be the sum of the sizes of the orbits.
This will be used shortly.
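The partition into orbits can be computed by repeatedly picking an element not yet accounted for and forming its orbit. The sketch below is ours; it uses the subgroup H = {1, (0 1)} of S3 acting on {0, 1, 2}.

```python
def orbits(group, X, act):
    # the orbits of the action partition X (Lemma 5.3.1)
    remaining, parts = set(X), []
    while remaining:
        x = next(iter(remaining))
        orb = {act(g, x) for g in group}   # O_G(x) = {gx | g in G}
        parts.append(orb)
        remaining -= orb                   # orbits are equivalence classes
    return parts

# the subgroup H = {1, (0 1)} of S3 acting on X = {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]
parts = orbits(H, range(3), lambda g, x: g[x])

print(sorted(sorted(p) for p in parts))   # [[0, 1], [2]]
print(sum(len(p) for p in parts))         # 3: orbit sizes sum to |X|
```

The second printed line illustrates the counting principle just mentioned: the size of X is the sum of the sizes of the orbits.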
When you write out the proof of Lemma 5.3.1, you will see that Lemma
5.3.1 depends on the fact that G is a group. We do not get such good behavior
from arbitrary subsets of G and for a subset S ⊆ G, we do not call Sx an orbit
and we do not write OS (x). However, if S is a subgroup of G, then it is a group
in its own right and we can talk about the orbit of x under the subgroup H,
denote it OH (x), and have it defined as Hx = {hx | h ∈ H}. You can check
that OH (x) ⊆ OG (x) holds in this situation.
Lemma 5.3.2 Let G act on X with x ∈ X, H a subgroup of G, and g ∈ G.
Then OH^g (gx) = gOH (x).
The proof is left as an exercise.
Fixed points
A special situation arises when an orbit of an action of a group G on a set X
has only one element in it. The element of such an orbit is taken only to itself
under the action of G. As such it is called a fixed point of G. The set of all fixed
points of G is called the fixed set of G and is defined as
Fix(G) = {x ∈ X | ∀g ∈ G, gx = x}.
This is easy to confuse with the definition of a stabilizer. Note that stabilizers
are contained in G and fixed sets are contained in X.
For a subgroup H of G, we define Fix(H) = {x ∈ X | ∀h ∈ H, hx = x}.
We have that Fix(G) is the union of all the orbits of G of size one. This
together with Lemma 5.3.2 makes the proof of the following quite easy.
Lemma 5.3.3 Let G act on X with H a subgroup of G and b ∈ G. Then
Fix(H^b ) = bFix(H).
The proof is left as an exercise.
Invariant sets
Slightly looser than a fixed point is an invariant set. The following was previously mentioned, but we repeat it here with slightly different notation. The
notion is easy to confuse with a stabilizer.
If G acts on X and A ⊆ X, then we say that A is invariant under the action
of G if GA ⊆ A. This rather brief definition when expanded into words says
that a set A is invariant under the action if all its elements are carried into A
by all elements of G.
When A ⊆ X is invariant under the action of G on X, then we can restrict
the action of G to only act on A. If A is not invariant, then trying to restrict
the action of G to A would make no sense since there would be elements of A
that G would carry outside of A. The restriction of the action of G on itself by
conjugation to the action of G on a normal subgroup is an example of this.
Invariant sets and stabilizers
The notion of invariant sets cooperates with the notion of stabilizers. We have
seen an example of this with the normalizer of a subgroup.
Let G act on X with A ⊆ X. Then we can look at the subgroup H = StG (A)
of G. The restriction of the action to H is another action and A is invariant
under the action of H on X even if it is not invariant under the action of all
of G on X. Further H is the largest subgroup of G for which this is true. The
verification of these remarks is left as an exercise.
5.4
Cosets and counting arguments
We come to the first important results in elementary group theory. They all
involve counting and they give the first restrictions on how groups can be built.
The main result, Lagrange’s theorem, gives the possible sizes of subgroups of
groups with finitely many elements. Lagrange’s theorem can be proven very
early after groups are defined. We have chosen to delay its proof so that we
could first introduce the notion of a group action. Now that we have introduced
that notion, we can make use of it.
Lagrange’s theorem is only about groups with finitely many elements. A
group with finitely many elements is called a finite group. Recall that the
number of elements of a group G is called the order of G and is written |G|.
5.4.1
Cosets
The key lemma behind Lagrange’s theorem is the following. It applies to all
groups, finite or not.
Lemma 5.4.1 Let H be a subgroup of a group G and let H act on G by left
multiplication in that h ∈ H takes g ∈ G to hg. Then for any g ∈ G, the
function sending h to hg is a bijection from H to the orbit Hg of g.
Proof. The function is onto by the definition of an orbit. To show the function
is one-to-one, we consider h and h′ in H and assume hg = h′ g. Now right
multiplication by g^{-1} shows that h = h′ .
The orbit Hg is usually called a right coset of H in G. The word “right”
is used, since the element g that determines which orbit (coset) we are talking
about is on the right of H. Note that g ∈ Hg since 1 ∈ H and g = 1g.
There are also left cosets. The notation gH refers to the set {gh | h ∈ H}
and is called a left coset of H. Note that g ∈ gH since 1 ∈ H and g = g1.
We would like to know that the left cosets also partition G. This can be
proven directly, but it would be nice to use orbits. Unfortunately multiplication
on the right is not an action of H on G. This was mentioned in Exercise Set
(31). However, there is a useful right multiplication that is an action. Since the
abbreviated notation will be confusing here, we will define an action of H on G
in which h ∈ H takes g ∈ G to an element of G that we will write as h(g), which
is given by the formula h(g) = gh^{-1} .
If you did Problem 4 in Exercise Set (33), then you know how to prove the
following lemma.
Lemma 5.4.2 If H is a subgroup of G, then defining h(g) = gh^{-1} for h ∈ H
and g ∈ G defines an action of H on G.
The proof is left as an exercise.
The relevance of Lemma 5.4.2 to right cosets is the following.
Lemma 5.4.3 Let H be a subgroup of G and let g ∈ G. Then the left coset gH
equals {gh^{-1} | h ∈ H}.
The proof is left as an exercise.
Lastly, we have the following parallel to Lemma 5.4.1.
Lemma 5.4.4 If H is a subgroup of G and g is in G, then the function sending
h to gh is a bijection from H to gH.
The proof is identical, with trivial changes, to the proof of the last sentence
of Lemma 5.4.1.
From Lemma 5.4.3, we know that left cosets are orbits of an action and thus
form a partition of G just as the right cosets do. This raises several questions.
The first question is whether the left cosets are the same as the right cosets.
We will see that the answer is sometimes they are and sometimes not. The
difference between the two situations will be studied. The second question is
why the two kinds of cosets are needed. They both partition the full group
into subsets that are all the same size as each other and the same size as the
subgroup. It turns out that both kinds of cosets can be useful.
We turn to the first application of cosets.
5.4.2
Lagrange’s theorem
For Lagrange’s theorem, it does not matter which kinds of cosets we use. For
no particular reason, we will use right cosets.
Assume that G is a finite group and that H is a subgroup of G.
We know that the right cosets partition G and are all the same size. In
particular they are the same size as H. Since G is a finite group, H is finite as
well. If k is the number of different right cosets of H, we must also have that k
is a finite number. Thus |G| = k|H|. We have proven the following.
Theorem 5.4.5 (Lagrange) If G is a finite group and H is a subgroup of G,
then |H| divides |G|.
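As a sanity check, Lagrange's theorem can be verified exhaustively for a small group by testing every subset. This brute-force sketch is ours; it finds all subgroup orders of S3. (For a finite nonempty subset, closure under the product already forces inverses and the identity, but we test for the identity anyway.)

```python
from itertools import combinations, permutations

G = list(permutations(range(3)))   # S3, |G| = 6
e = (0, 1, 2)

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def is_subgroup(S):
    # a finite subset containing 1 and closed under the product is a subgroup
    return e in S and all(compose(a, b) in S for a in S for b in S)

orders = {len(S) for k in range(1, 7)
          for S in combinations(G, k) if is_subgroup(set(S))}

print(sorted(orders))              # [1, 2, 3, 6] -- every order divides 6
assert all(6 % n == 0 for n in orders)
```

Note that the divisor 6 itself occurs but nothing forces every divisor to occur in general; for S3 all of them happen to.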
We cannot resist giving an immediate consequence of Lagrange’s theorem.
Corollary 5.4.6 If a group G has prime order, then the only subgroups of G
are the trivial subgroup and G itself.
Proof. Let p = |G|. The only positive integers dividing p are 1 and p. The only
subgroup of order 1 is {1} and the only subgroup with p elements in a group
with p elements is the whole group.
Another consequence of Lagrange’s theorem is the following.
Corollary 5.4.7 If a group G has prime order, then it is abelian.
Proof. Assume false. Let g and h be elements of G that do not commute.
We know that g cannot be the identity. Now CG (g) has at least the identity
and g, but it does not have the element h. So it is a subgroup of G that is not
trivial and not the whole group. This is impossible by Corollary 5.4.6.
There is another proof of Corollary 5.4.7 based on a better understanding of
how subgroups are built. We will see this in the next chapter.
5.4.3
The index of a subgroup
Lagrange’s theorem can be put in another form. If H is a subgroup of G then
the index of H in G is the number of right cosets of H in G. It is also the
number of left cosets of H in G since the left cosets also partition G and are
also the same size as H.
The notation for the index of H in G is [G : H]. Lagrange’s theorem can
now be stated as follows.
Theorem 5.4.8 If H is a subgroup of G, then |G| = |H|[G : H].
Proof. This is true by the proof of the original form of Lagrange’s theorem if
|G| is finite. If G is infinite then at least one of |H| or [G : H] is infinite, and
we can regard the equality as being correct.
There is a certain amount of arithmetic that can be done with indexes.
Recall that the definition and the notation regarding the index of a subgroup
takes into account both the subgroup and the group that it is contained in. If
we have a chain of subgroups H ⊆ K ⊆ G, then we have three indexes that we
can look at: [G : H], [G : K] and [K : H]. They are related as follows.
Lemma 5.4.9 If H ⊆ K ⊆ G is a chain of subgroups of the group G, then
[G : H] = [G : K][K : H].
The proof is left as an exercise.
5.4.4
Sizes of orbits
Let G act on X and let x be in X. We want to know the size of OG (x). The
answer will be that it is equal to an index of a certain subgroup. Thus we will
want to set up a one-to-one correspondence between cosets of the subgroup and
the orbit. It turns out that it will be left cosets that we have to work with.
Theorem 5.4.10 Let G act on X and take x ∈ X. Then the number of elements of OG (x) is [G : Gx ].
Proof. We will claim a one-to-one correspondence between the set of left cosets
of Gx in G and OG (x). The question is what the one-to-one correspondence is
based on. The answer is that each left coset is exactly the set of elements of G
that take x to a specific element of OG (x). Here are the details.
We claim that for g ∈ G, we have h ∈ gGx if and only if hx = gx. To show
one direction, we take h ∈ gGx . That is, h = gf for some f ∈ Gx . This means
that hx = (gf )x = g(f x) = gx. To show the other direction, we assume that
hx = gx. This means that (g^{-1} h)x = g^{-1} (hx) = g^{-1} (gx) = (g^{-1} g)x = x, so g^{-1} h belongs to Gx .
Say g^{-1} h = f with f ∈ Gx . That means h = gf and h ∈ gGx .
Now define ρ from the set of left cosets of Gx in G to OG (x) by setting
ρ(gGx ) = gx. This needs to be checked for well definedness since there are
many g that give the same left coset. If gGx = hGx , then g and h are in the
same left coset and gx = hx. The function ρ is one-to-one, by the “if” part of
the previous paragraph. If ρ(gGx ) = ρ(hGx ), then gx = hx and g and h are in
the same left coset, so gGx = hGx . The function ρ is onto since every element y
in OG (x) is gx for some g ∈ G and this makes ρ(gGx ) = gx = y. This completes
the proof.
Corollary 5.4.11 Let a finite group G act on X with x ∈ X. Then the number
of elements in OG (x) divides |G|.
This is immediate from Theorem 5.4.10.
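Theorem 5.4.10 is easy to see in action for the symmetries of a square. The sketch below (ours; vertices labeled 0, 1, 2, 3 in cyclic order) compares the size of an orbit with the index of the corresponding stabilizer.

```python
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def generate(gens):
    # close a finite set of permutations under composition
    group = {tuple(range(len(gens[0])))} | set(gens)
    while True:
        new = {compose(a, b) for a in group for b in group} - group
        if not new:
            return group
        group |= new

# D8 acting on the vertices {0, 1, 2, 3} of a square
D8 = generate([(1, 2, 3, 0), (3, 2, 1, 0)])

x = 0
orbit = {g[x] for g in D8}            # O_G(x)
stab  = [g for g in D8 if g[x] == x]  # G_x

# |O_G(x)| = [G : G_x] = |G| / |G_x|
print(len(orbit), len(D8) // len(stab))   # 4 4
```

The orbit of a vertex is all four vertices, the stabilizer has two elements, and 8/2 = 4 as the theorem predicts.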
5.4.5
Cauchy’s theorem
We come to the first intricate theorem of group theory. Lagrange’s theorem is
of vast importance, but it is rather elementary to prove. The proof of Cauchy’s
theorem, first published in 1844, has had over 150 years to be polished and
simplified, and it is still an effort to prove. The following proof [3] appeared
first in 1959.
Theorem 5.4.12 (Cauchy) Let a prime p divide the order of a finite group
G. Then there is an element of G of order p.
Proof. Let n = |G| and let 1 be the identity element of G. Let G^p = G ×
G × · · · × G with p factors. Thus each element of G^p is a p-tuple of the form
158
CHAPTER 5. GROUP ACTIONS II: GENERAL ACTIONS
(g0 , g1 , . . . , gp−1 ). We start the subscripts with 0 to have the subscripts cooperate with the elements of Zp . The number of elements of G^p is n^p .
Let A = {(g0 , g1 , . . . , gp−1 ) | g0 g1 · · · gp−1 = 1}. The number of elements
|A| of A is n^{p−1} . This is seen since the first p − 1 entries of a p-tuple in A can
be any of the elements of G. The last entry has exactly one choice, namely
(g0 g1 · · · gp−2 )^{-1} .
We let Zp act on A by rotating the entries in each p-tuple of A. That is, for
[k]p ∈ Zp , we have
k(g0 , g1 , . . . , gp−1 ) = (g[k]p , g[k+1]p , . . . , g[k+p−1]p ).
First we note that this is an action on A. If a product of elements is 1, then
any rotation of the product is 1. This is seen by induction if we show this for
rotation by one position. If g0 g1 · · · gp−1 = 1, then
g0^{-1} (g0 g1 · · · gp−1 )g0 = g0^{-1} 1g0 = 1.
But
g0^{-1} (g0 g1 · · · gp−1 )g0 = g1 g2 · · · gp−1 g0
which is a single rotation of the product. Thus each element of Zp takes elements
of A to elements of A. That it is an action is seen by noting that rotating first
j positions and then k positions is the same as rotating j + k positions. Also
note that rotating by 0 positions leaves the p-tuple fixed. We then quote Lemma
5.1.2 to conclude that we have an action.
From Corollary 5.4.11, we know that the size of each orbit of the action is
either 1 or p. An orbit of size 1 is a fixed point of the action. But an orbit of
size 1 is a p-tuple that remains the same under all possible rotations. Thus an
orbit of size 1 is a p-tuple whose entries are all the same.
There is at least one orbit of size 1: the orbit of the p-tuple (1, 1, . . . , 1). Thus
the number of orbits of size 1 is not zero.
Let k be the number of orbits of size 1 and let l be the number of orbits of
size p. Since the orbits of the action partition A, we must have that the sum of
the sizes equals the number of elements of A. Thus
n^{p−1} = |A| = k(1) + l(p).
Since p divides n, it divides |A| = n^{p−1}; it also divides lp, so it must divide
k(1) = k. Since k is not zero, it must be at least p. Thus there is at least one
other p-tuple in A in which all the entries are the same besides (1, 1, . . . , 1).
This tuple is of the form (g, g, . . . , g) for some g ≠ 1 in G. But to be in A we
must have gg · · · g = 1 or g^p = 1.
We must argue that p is the order of g. The order of g is not 1, and if it is
some d other than p, then 1 < d < p by the definition of order. Then by the
division algorithm, p = dq + r with 0 ≤ r < d. Now
1 = g^p = g^{dq+r} = g^{dq} g^r = (g^d)^q g^r = 1^q g^r = g^r
and r is a smaller power than d that makes g^r = 1. If r > 0, then this contradicts
the statement that d is the order of g, so r = 0 and d|p. But since p is prime
and 1 < d < p, this is not possible. So p is the order of g. This completes the
proof.
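The counting in this proof can be checked by machine for small cyclic groups. The sketch below is our own illustration, not part of the text: it takes G = Zn written additively, builds the set A of p-tuples whose entries sum to the identity, confirms |A| = n^{p−1}, and counts the constant tuples, which are exactly the orbits of size 1.

```python
from itertools import product

def cauchy_count(n, p):
    """For G = Z_n written additively, count the constant p-tuples in
    A = {(g_0, ..., g_{p-1}) : g_0 + ... + g_{p-1} = 0 (mod n)}."""
    A = [t for t in product(range(n), repeat=p) if sum(t) % n == 0]
    assert len(A) == n ** (p - 1)                 # |A| = n^{p-1}
    return sum(1 for t in A if len(set(t)) == 1)  # orbits of size 1

# p = 3 divides |Z_6| = 6: the constant tuples (g, g, g) with 3g = 0
# are given by g in {0, 2, 4}, so the count is 3, a multiple of p,
# and g = 2 is an element of order 3.
print(cauchy_count(6, 3), cauchy_count(6, 2))  # 3 2
```

When p does not divide n the count need not be a multiple of p; for example, `cauchy_count(4, 3)` is 1, since only the identity tuple is constant there.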
Exercises (36)
1. Prove Lemma 5.3.1.
2. Prove Lemma 5.3.2.
3. Prove Lemma 5.3.3.
4. Let G act on itself by conjugation. Show that for this action Fix(G) equals
Z(G).
5. Let G act on X with A ⊆ X and let H = StG (A). Show that A is invariant
under the action of H on X, and that if A is invariant under the action of
a subgroup K of G on X, then K ⊆ H.
6. Prove Lemma 5.4.2 if you have not already done the work in Problem 4
in Exercise Set (33). If you have done that problem, then verify that the
details of the solution prove the lemma.
7. Prove Lemma 5.4.3.
8. Prove Lemma 5.4.9. As a hint, note that the number of elements in a
cross product A × B of sets is the product of the sizes of the two sets A
and B.
Chapter 6
Subgroups
6.1
Subgroup generated by a set of elements
We add information from Section 3.2.3. There we showed (Lemma 3.2.10) that
if S is a subset of a group G, then there is a smallest subgroup of G that
contains S. We used this lemma to define the subgroup of G generated by S to
be this smallest subgroup. However, we never said exactly what is in the group
generated by S. In this section we will fill in this gap in our information.
To save words, we let ⟨S⟩ denote the subgroup generated by S, where S is
a subset of a group G. We will build on the fact that ⟨S⟩ must contain S by
definition. Before we look at what else ⟨S⟩ must contain, we discuss strategy.
6.1.1
Strategy
We outline the strategy very explicitly since similar strategies apply in many
other situations.
The start
We want to know what the elements of ⟨S⟩ are where S is a subset of a group
G. We turn this around and pick a collection C of elements of G and ask if the
collection C is equal to ⟨S⟩. Of course ⟨S⟩ is a subgroup of G, so our collection
C must form a subgroup of G. Also ⟨S⟩ contains S. So our collection C must
also contain S. But if our collection C is a subgroup of G and contains S, then
we have
⟨S⟩ ⊆ C
(6.1)
by the definition of ⟨S⟩. So all we need to prove is that C ⊆ ⟨S⟩.
The middle
To discuss how to show C ⊆ ⟨S⟩, we look at how ⟨S⟩ is formed in Lemma 3.2.10.
We get ⟨S⟩ by intersecting all subgroups of G that contain S. So we need to build
C so that every element of C is in every subgroup of G that contains S.
Looking at this negatively, we say that we put nothing in C unless it has
to be in every subgroup of G that contains S. Looking at it positively, we say
that we will throw everything that we can think of into C that must be in every
subgroup of G that contains S.
The end
We combine these observations. To build C, we start with S. This certainly
has only elements that are in every subgroup of G that contains S. Then we
throw in everything that we can think of that must be in any subgroup of G
that contains S. For example, the squares of elements of S, the inverses of every
element of S, the products of pairs of elements of S, and so forth. This will
keep C ⊆ ⟨S⟩. If it turns out that the set C that we have created is a group,
then (6.1) will give us ⟨S⟩ ⊆ C.
In summary:
1. Start with S.
2. Build C by adding to S all elements of G that must be in any subgroup
of G that contains S.
3. Show that C is a group.
4. Conclude that C = ⟨S⟩.
6.1.2
The strategy applied
Let G be a group and let S be a subset of G. We use S^{−1} to denote the set of
all the inverses of elements in S. More specifically,
S^{−1} = {g^{−1} | g ∈ S}.
We note that every element of S^{−1} must be in any subgroup of G that contains
S. This statement also applies to S ∪ S^{−1}.
We now let C be all finite products of elements from S ∪ S^{−1}. This sentence
needs clarification.
We know what a product of two elements is. Since the multiplication is
associative, we also know what a product of n elements is for n > 2. We next
discuss a product of one element, and a product of zero elements.
We define a product of one element x to be x itself.
To discuss a product of zero elements, we note that if u is a product of m
elements and v is a product of n elements, then uv is clearly a product of m + n
elements. It would be nice if we defined a product of zero elements so that if
z is a product of zero elements and u is a product of m elements, then zu is a
product of 0 + m = m elements. This is easy to accomplish by letting z be the
identity.
We turn the discussion of the last paragraph into a definition and say that
a product of zero elements is the identity in G.
We are now ready to prove the following.
Proposition 6.1.1 Let G be a group and let S be a subset of G. Let C consist
of all finite products of elements of S ∪ S^{−1}. Then C = ⟨S⟩.
Proof. We will discuss what we are doing as the proof goes along.
A subgroup of G needs three things. It needs to have the identity of G, it
needs to be closed under the taking of inverses, and it needs to be closed under
products. We start with a discussion of the identity.
There is a bit of hidden efficiency in our setup since it works even if S is
empty. If S is empty, then our convention that a product of zero elements is
the identity puts the identity in C. If S is empty, then so is S^{−1}, and there are
no other elements to take products of. So we end up with only the identity in
C. This is the smallest subgroup of G and clearly contains S since S is empty.
If S is not empty, then it has some element whose inverse is in S^{−1}. Now the
product of the element and its inverse gives the identity and so our convention
that the product of zero elements is the identity is not really needed in this case.
Next we consider inverses. Since a subgroup of G is closed under the taking
of inverses, we know that S ∪ S^{−1} is in any subgroup of G that contains S.
Further the inverse of an inverse is the original element, so S ∪ S^{−1} is closed
under the taking of inverses.
Next we consider products. Finite products of elements from S ∪ S^{−1} must
also be in any subgroup of G that contains S since subgroups are closed under
finite products. So we have C ⊆ ⟨S⟩. Also we know that if u and v are finite
products of elements of S ∪ S^{−1}, then uv is also a finite product of elements of
S ∪ S^{−1}. So we have created a set closed under finite products. As mentioned
above, we also know that the identity is present. But by adding new elements,
we may have ruined closure with respect to the taking of inverses since there
are now more elements to invert. It turns out that we have not and we now
prove this.
If u is in C, then u = a_1 a_2 · · · a_n where all the a_i are in S ∪ S^{−1}. But now
u^{−1} = a_n^{−1} a_{n−1}^{−1} · · · a_2^{−1} a_1^{−1}.
For any i with 1 ≤ i ≤ n, we know that a_i is in S or S^{−1}, so either a_i = s or
a_i = s^{−1} for some s ∈ S. But then a_i^{−1} = s^{−1} or a_i^{−1} = s for some s in S and
a_i^{−1} is in S ∪ S^{−1}. This makes u^{−1} a product of elements of S ∪ S^{−1} and u^{−1}
is in C. Thus C is a subgroup of G.
As mentioned in the description of the strategy, this tells us that ⟨S⟩ ⊆ C.
We have been careful to keep C ⊆ ⟨S⟩, so C = ⟨S⟩.
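Proposition 6.1.1 translates directly into a closure algorithm when the group is finite. The sketch below is our own (the function and variable names are not from the text): it computes ⟨S⟩ for a group given concretely by its operation, inverse map, and identity, starting from S ∪ S^{−1} together with the identity (the empty product) and repeatedly throwing in products.

```python
def generated_subgroup(op, inv, identity, S):
    """Close S ∪ S^{-1} under finite products inside a finite group
    given by its operation op, inverse map inv, and identity element.
    The identity is the empty product, so it goes in from the start."""
    closure = {identity} | set(S) | {inv(s) for s in S}
    frontier = set(closure)
    while frontier:
        # Every element of <S> is (generator) * (shorter product), so
        # multiplying the current closure by the newest elements on the
        # right eventually produces everything.
        frontier = {op(a, b) for a in closure for b in frontier} - closure
        closure |= frontier
    return closure

# Z_12 written additively: the subgroup generated by {8} is {0, 4, 8}
def add(a, b):
    return (a + b) % 12

def neg(a):
    return (-a) % 12

print(sorted(generated_subgroup(add, neg, 0, {8})))  # [0, 4, 8]
```

With S empty the function returns just the identity, matching the convention about products of zero elements, and with S = {1} it returns all of Z12.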
The order that was used to build C in Proposition 6.1.1 is important. In the
proposition, inverses are added before products. If we reverse this we do not get
a good result. Start with a set S in a group G. Throw in the identity as the first
step. Then throw in all finite products as the second step. At this point, the
collection is closed under products as shown in the proof of Proposition 6.1.1.
Lastly, throw in the inverses of all that has been created as the third step. Since
inverses were done last, it is clear that the result is closed under the taking of
inverses. However, the third step has ruined the good results of the second step.
The result is not necessarily closed under the taking of products. This will be
explored in an exercise.
6.1.3
Generators
As mentioned in Section 3.2.3, the smallest subgroup H of a group G that
contains a subset S of G is called the subgroup generated by S, and we can refer
to S as a set of generators for H. If the subgroup H is the full group G, then
of course we can say that G is generated by S and that S is a set of generators
for G.
There is no unique set of generators for a group. If G is a group, then G is a
subset of G and generates G since G is the smallest subgroup of G that contains
G. However smaller generating sets might work as well.
Consider Zk , the group of integers modulo some integer k > 1 and consider
the set {1} (leaving out [ ]k for brevity). Any subgroup of Zk containing {1},
must contain 1 + 1, as well as 1 + 1 + 1, and so forth. Eventually, we see that
the subgroup must also contain a sum of k ones, which is the same as 0 in Zk .
Thus we see that any subgroup of Zk that contains {1} must contain all of Zk ,
so Zk is generated by the single element 1.
This example leads to a definition. A group generated by one element is said
to be cyclic. We have seen that Zk is a cyclic group. It will be shown in an
exercise that Z is also a cyclic group.
In a cyclic group generated by {a}, we often say that the group is generated
by a.
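The argument that 1 generates Zk can also be run by machine. The helper below is our own illustration, not from the text: it repeatedly adds a candidate generator g, exactly as the sums 1, 1 + 1, 1 + 1 + 1, . . . were formed above, and checks whether all of Zk is visited.

```python
def generates(k, g):
    """Test whether g generates Z_k by repeatedly adding g, just as
    1, 1 + 1, 1 + 1 + 1, ... were formed in the argument above."""
    seen, x = set(), 0
    while x not in seen:
        seen.add(x)
        x = (x + g) % k
    return len(seen) == k

# 1 always generates Z_k.  In Z_12 the generators turn out to be the
# classes of integers sharing no factor with 12.
print([g for g in range(12) if generates(12, g)])  # [1, 5, 7, 11]
```

This previews the exercises below: in Z12, which is not of prime order, most non-zero elements fail to generate.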
Exercises (37)
1. Consider G = Z × Z. This consists of all ordered pairs (a, b) with both a
and b from Z under the operation
(a, b) + (c, d) = (a + c, b + d).
Consider the set S = {(1, 0), (0, 1)}. Create a subset C of G as follows.
Let C1 be all finite sums (as opposed to products since this abelian group
is written additively) of elements of S. Let C = C1 ∪ C1^{−1}. In other words,
we throw in all sums before we throw in inverses in contrast to the order
used in Proposition 6.1.1.
(a) Describe the elements in C.
(b) Determine whether C is a group or not.
(c) Determine what goes wrong with the proof of Proposition 6.1.1 if the
order used in this exercise is used.
2. Show that Z is generated by 1.
3. Let G be a cyclic group generated by a.
(a) If the order of a is a finite number k, then show that G is isomorphic
to Zk .
(b) If the order of a is infinite, then show that G is isomorphic to Z.
Hint: exhibit the isomorphism explicitly.
4. Let p > 1 be a prime integer. Use Lagrange’s theorem to show that Zp is
generated by any element in Zp that is not [0]p .
5. Let k > 1 be an integer that is not a prime. Show that there are non-zero
elements of Zk that do not generate Zk .
6. Can you find two elements a and b of Z6 so that {a, b} generates Z6 , but
neither a nor b by themselves generates Z6 ?
7. What is the smallest generating set that you can find for D12 ?
8. What is the smallest generating set that you can find for S4 ?
9. List all the subgroups of D16 . This rather formidable looking problem is
not all that bad. Consider what some simple combinations of elements
generate, and constantly take into account Lagrange’s theorem.
Chapter 7
Quotients and homomorphic images
In this chapter we take the process that creates the group Zk from the group
Z and apply it to arbitrary groups. In order to give an outline of what will be
done, we review how Zk is built and its relationship to Z.
7.1
The outline
7.1.1
On the groups Z and Zk
The construction
To build Zk , we go through several steps.
First we put the relation ∼k on Z and show that it is an equivalence relation.
We then declare that the new group will have the equivalence classes of ∼k as
its elements.
Second we define a binary operation (written additively since it turns out
to be abelian and it comes from the addition on Z) on the set of equivalence
classes of ∼k as [a]k + [b]k = [a + b]k and show that this operation is well defined.
Third, we make the easy observation that the binary operation has all the
properties needed to declare that we have a group. This turns out to be easy
since the binary operation on the equivalence classes is defined in terms of the
operation on Z.
The homomorphism Z → Zk
The fact that the binary operation on Zk is defined using the binary operation
on Z makes the function π : Z → Zk defined by π(a) = [a]k a surjective
homomorphism.
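The claim that π is a homomorphism can be checked numerically. In the sketch below (our own representative-based encoding, with the class [a]k represented by a mod k) we verify π(a + b) = π(a) + π(b) for k = 5 over a range of integers.

```python
k = 5

def pi(a):
    """The projection Z → Z_k, a ↦ [a]_k, with the class [a]_k
    represented by its canonical representative a mod k."""
    return a % k

# pi is a homomorphism: pi(a + b) = pi(a) + pi(b) computed in Z_k
for a in range(-20, 20):
    for b in range(-20, 20):
        assert pi(a + b) == (pi(a) + pi(b)) % k
print("pi respects addition on the tested range")
```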
Construction and homomorphism revisited
The relevant equivalence relation on Z is defined by a ∼k b if and only if k|(b−a).
Another way to say this is to say that b − a is a multiple of k. A third way to
say this is that b − a is in the set of multiples of k in Z.
The multiples of k make an appearance in a discussion of the surjective
homomorphism π. The multiples of k are exactly the elements in the kernel of
π since π(a) = [0]k happens exactly when [a]k = [0]k or a − 0 is a multiple of k.
Note that this means that the multiples of k in Z form a subgroup of Z and
that this subgroup has to be normal. Of course, the fact that the multiples of k
in Z form a subgroup is extremely easy to argue without building Zk , and the
fact that the subgroup turns out to be normal is immediate from the fact that
Z is abelian and all of its subgroups are normal.
If we let M be the set of multiples of k in Z, then we have the following.
1. The equivalence relation ∼k could have been defined by saying a ∼k b if
and only if b − a is in M .
2. After Zk is defined from ∼k , the kernel of π : Z → Zk turns out to be M .
7.1.2
The new outline
We wish to apply our experience with Z and Zk to more general situations.
We wish to start with an arbitrary group G (which might not be abelian) and
form a new group by putting an equivalence relation on G and define a binary
operation on the set of equivalence classes.
If the new group is called Q and the operation on Q is based on the operation
on G, then there should be a reasonable homomorphism from G to Q. If the
construction of Q from G closely imitates the construction of Zk from Z, then
the kernel of this homomorphism should be related to the equivalence relation
that we put on G. But the kernel of a homomorphism is always normal, so the
equivalence relation should probably be based on a normal subgroup of G.
Next we note that we should be writing operations multiplicatively since G
might not be abelian. Thus we should be looking at ba−1 instead of b − a.
We have arrived at the following outline.
1. Start with a group G and a normal subgroup N of G.
2. Define a relation ∼N on G by saying a ∼N b if and only if ba−1 ∈ N .
3. Prove that ∼N is an equivalence relation. Let [a]N denote the equivalence
class of a under this relation.
4. Define a binary operation on the equivalence classes by [a]N [b]N = [ab]N
and prove that this is well defined.
5. Prove that all the requirements in the definition of a group are satisfied by
this operation on the set of equivalence classes. Let Q denote the resulting
group structure on the set of equivalence classes.
6. Define h : G → Q by h(a) = [a]N . That this is a homomorphism is
immediate from the definition of the binary operation on Q.
7. Show that N is the kernel of h.
Some details in the outline above are not usually given in the way that we
have described them. It turns out that the equivalence classes [a]N have a much
more familiar description. Before working through the outline, we first take a
look at the equivalence classes.
Exercises (38)
The following is technically optional. It will be shown by other means in the
next section. However, it is extremely easy and should be done anyway. Note
that normality is not needed.
1. Let G be a group with a subgroup H. Define the relation ∼H on G
by declaring a ∼H b to mean that ba−1 is in H. Prove that ∼H is an
equivalence relation.
7.2
Cosets
We make our first observations about an arbitrary subgroup and then specialize
to normal subgroups. We do these two steps to emphasize the difference in
behavior between normal subgroups and more general subgroups.
7.2.1
Identifying the equivalence classes with cosets
Let G be a group and let H be a subgroup of G. Define the relation ∼H on G
by declaring a ∼H b to mean that ab−1 ∈ H. (Since H is closed under the taking
of inverses, ab−1 ∈ H exactly when ba−1 = (ab−1 )−1 ∈ H, so this is the same
relation as in Exercise Set (38).) We could go through the work
to show that this is an equivalence relation (which you may have done already
in an exercise above), but it turns out that we have already done that work.
Since this might be hard to recognize, we show that ∼H could have been given a
different definition. This will allow us to argue that it is an equivalence relation
and it will give us a second view (actually a third) on the relation.
The following lemma refers to orbits of an action as discussed in Section 5.3,
and the view of cosets as orbits as discussed in Section 5.4.1.
Lemma 7.2.1 Let G be a group and let H be a subgroup of G. Then for
elements a and b in G, we have a ∼H b if and only if a is in the orbit Hb
of b under the action of H on G by left multiplication.
Proof. If a ∼H b, then ab−1 = h for some h ∈ H. This means that a = hb
putting a in the orbit Hb of the action. Conversely, if a is in the orbit of b
under the action of H on G by left multiplication, then a = hb for some h ∈ H
making ab−1 = h ∈ H.
In Section 5.3 we saw from Lemma 5.3.1 that being in the same orbit of an
action is an equivalence relation and that orbits are equivalence classes. Thus
we have one step of the outline finished. Not only that, we have a name for the
equivalence classes. The equivalence classes are simply the right cosets of H in
G.
So far, normality has not been essential in the outline. However, the next
steps will need normality. The definition of the binary operation on the equivalence classes will not be well defined unless the subgroup in question is normal.
We next look at the important difference that normality makes.
7.2.2
Cosets of normal subgroups
An example
We will look at an example that will run through this discussion for a while. We
will look at D16 , the dihedral group of order 16. This is the group of symmetries
of the octagon shown below.
[Figure: a regular octagon with its vertices labeled 1 through 8, proceeding counterclockwise with vertex 1 at the right.]
This example was picked because it is large enough to exhibit certain behaviors.
We will find two subgroups N and H of the same size, with N normal in G and
H not. Further N will be normal in spite of the fact that some of its elements
do not commute with some elements of D16 . The fact that N and H are the
same size is not crucial, just pleasing.
With the vertices labeled as shown above, we can use cycle notation to
describe all 16 elements of D16 .
First we have the identity e = (1)(2)(3)(4)(5)(6)(7)(8).
Next we have the rotation r = (1 2 3 4 5 6 7 8). Another 6 elements
are the various powers of r. For example r3 = (1 4 7 2 5 8 3 6) and r4 =
(1 5)(2 6)(3 7)(4 8).
Then we have 8 reflections. Four of them are across lines through two opposite vertices, and four of them are across lines through the midpoints of two
opposite edges. The four reflections across lines through vertices are
v = (3)(2 4)(1 5)(6 8)(7),
h = (1)(2 8)(3 7)(4 6)(5),
d1 = (2)(1 3)(4 8)(5 7)(6),
d2 = (4)(5 3)(6 2)(7 1)(8).
The four reflections across lines through midpoints of edges are
m1 = (1 2)(8 3)(7 4)(6 5),
m2 = (2 3)(1 4)(8 5)(7 6),
m3 = (3 4)(2 5)(1 6)(8 7),
m4 = (4 5)(3 6)(2 7)(1 8).
We will look at two subgroups N = {1, r2 , r4 , r6 } and H = {1, v, h, r4 }.
The subgroup N is normal in D16 . This can be shown directly since conjugations are easy to calculate. The calculations will show that not all elements
of N commute with all elements of D16 . The elements 1 and r4 do, but there
are elements of D16 that do not commute with r2 and r6 .
The subgroup H is not normal in D16 . Conjugating v by r gives d1 which
is not in the subgroup H.
We can look at the relations ∼N and ∼H . The equivalence classes of these
relations are the right cosets of N and H, respectively. Since D16 has 16 elements
and each of N and H have 4, there are 4 cosets of each. The right cosets of N
are
N 1 = {1, r2 , r4 , r6 },
N r = {r, r3 , r5 , r7 },
N v = {v, d2 , h, d1 },
N m1 = {m1 , m2 , m3 , m4 }.
The calculations leading to N 1 and N r are trivial. To compute N v and N m1 ,
we note that each of these cosets is filled with a flip followed by a rotation. Thus
each element in these cosets reverses the circular order of the vertex labels. Next
it is noted that all of the “order reversing” elements in D16 do different things
to 1. For example, v(1) = 5, h(1) = 1, d1 (1) = 3 and so forth. So seeing where
1 goes determines the element. For example r2 v(1) = 7 so r2 v = d2 .
The right cosets of H are
H1 = {1, v, h, r4 },
Hr = {r, m2 , m4 , r5 },
Hr2 = {r2 , d1 , d2 , r6 },
Hr3 = {r3 , m1 , m3 , r7 },
and are again computed by seeing whether a flip has taken place and by following
vertex 1.
An important difference between the normal N and the non-normal H shows
up when we look at left cosets. The left cosets of N are
1N = {1, r2 , r4 , r6 },
rN = {r, r3 , r5 , r7 },
vN = {v, d1 , h, d2 },
m1 N = {m1 , m4 , m3 , m2 },
and the left cosets of H are
1H = {1, v, h, r4 },
rH = {r, m3 , m1 , r5 },
r2 H = {r2 , d2 , d1 , r6 },
r3 H = {r3 , m4 , m2 , r7 }.
Observations
We note that every left coset xN of N equals the right coset N x. However,
there are left cosets of H that equal no right coset of H. In particular rH and
r3 H equal no right coset of H. Note that some left cosets of H do equal the
corresponding right coset. This is no surprise since 1H = H1 has to be true.
We also have r2 H = Hr2 .
These observations lead to the next lemma whose proof is left to the reader.
Lemma 7.2.2 Let G be a group and let J be a subgroup of G. Then the following are equivalent.
1. J is normal in G.
2. For every g ∈ G, we have gJ = Jg.
3. For every g ∈ G and j ∈ J, there is an element k ∈ J so that gj = kg.
4. For every g ∈ G and j ∈ J, there is an element k ∈ J so that jg = gk.
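These observations can be verified by machine. The sketch below is our own encoding (vertices are relabeled 0 through 7, so vertex i here is vertex i + 1 in the text, and the helper names are ours): it generates D16 from r and v as permutations and then tests condition 2 of Lemma 7.2.2 for N and H directly.

```python
def compose(q, p):                        # (q ∘ p)(i) = q[p[i]]
    return tuple(q[p[i]] for i in range(8))

def rpow(k):                              # the rotation r^k
    return tuple((i + k) % 8 for i in range(8))

v = tuple((4 - i) % 8 for i in range(8))  # reflection fixing vertices 2, 6
h = tuple((-i) % 8 for i in range(8))     # reflection fixing vertices 0, 4

# generate all of D16 from r and v by closing under composition
G, frontier = set(), {rpow(1), v}
while frontier:
    G |= frontier
    frontier = {compose(a, b) for a in G for b in G} - G
assert len(G) == 16

e = rpow(0)
N = {e, rpow(2), rpow(4), rpow(6)}
H = {e, v, h, rpow(4)}

def is_normal(J):      # condition 2 of Lemma 7.2.2: gJ = Jg for every g
    return all({compose(g, j) for j in J} == {compose(j, g) for j in J}
               for g in G)

print(is_normal(N), is_normal(H))         # True False
```

Running the check with g = r shows exactly the failure seen above: rH and Hr are different sets, while rN and N r agree.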
Exercises (39)
1. Prove that the H given above is a subgroup of D16 . This exercise should
also ask to show that N is a subgroup, but that would be too easy. You
should check that N is normal in D16 , however.
2. Find an element g of D16 so that gr2 g −1 ≠ r2 .
3. Verify that v^r = d1 .
4. The order that the elements are listed in N v and vN is based on the order
that the elements are listed in N . Verify that when every element of N in
the order given above is multiplied on the right by v, we get the elements
in the order given above and that when every element in N in the order
given above is multiplied on the left by v, we get the elements in the order
given above.
5. More generally, verify all the calculations of the right and left cosets of N
and H. If you do not take into account the remarks made above about
how the calculations can be done, you will end up doing much more work
than necessary.
6. Prove Lemma 7.2.2. Note that to prove that 1 through 4 are equivalent
to each other it suffices to prove 1 ⇒ 2 ⇒ 3 ⇒ 4 ⇒ 1. Any other cyclic
order of 1 through 4 will do. Also note that once 3 (say) is known, then
it can be exploited in 4 (say) by seeing that every g ∈ G is the inverse of
some other element of G.
7.3
The construction
We are now ready to work through the outline.
7.3.1
The multiplication
Let G be a group and let N be a normal subgroup of G. Define ∼N on G by
saying that a ∼N b means that ba−1 is in N . We know that this is an equivalence
relation and we know that for each a ∈ G the equivalence class of a ∈ G under
this relation is just the right coset N a of N .
Because of Lemma 7.2.2, we know that right cosets of N in G are also left
cosets of N in G. So referring to right versus left cosets of N is not all that
important. However, we will usually refer to right cosets to be specific.
The definition
Define a binary multiplication on the set of equivalence classes (i.e., the set of
right cosets of N in G) by declaring
(N a)(N b) = N (ab).
(7.1)
Well definedness
When we write N a, we are identifying an entire set by mentioning one element a of that set. There are other elements in that set that could have been
used equally well. For example, if we consider D16 and the normal subgroup
N = {1, r2 , r4 , r6 }, the cosets N v, N h, N d1 and N d2 are all the same. So the
definition in (7.1) gives the result of multiplying two cosets by a formula that
uses representatives of the cosets being multiplied. Thus we have to prove that
the result is independent of the representatives chosen.
The following use of Lemma 7.2.2 is typical, is used often, and should be
learned.
Lemma 7.3.1 For a group G with normal subgroup N , the multiplication on
the cosets of N in G defined by (7.1) is well defined.
Proof. Let a, b, x and y be in G with a ∼N x and b ∼N y. We want to show
that (ab) ∼N (xy).
We know that xa−1 = m ∈ N and yb−1 = n ∈ N . Now
(xy)(ab)−1 = xyb−1 a−1 = xna−1 = n′ xa−1 = n′ m
where n′ is the element of N guaranteed by Lemma 7.2.2 to satisfy xn = n′ x.
Since n′ and m are in N , we have n′ m ∈ N and we have shown what we needed
to show.
The main feature of the proof of Lemma 7.3.1 is that even though x and n
do not commute, we can “pass x over n” by replacing n with another element
n′ of N .
Group properties
Let Q denote the set of right cosets of N in G. The argument that Q with the
multiplication as defined by (7.1) forms a group is identical to the corresponding
argument for Zk . Since the product in (7.1) is based on the product in G, we
get an identity for Q by noting
(N 1)(N a) = N (1a) = N a = N (a1) = (N a)(N 1).
(7.2)
Inverses follow from
(N a)(N a−1 ) = N (aa−1 ) = N 1 = N (a−1 a) = (N a−1 )(N a),
and associativity from
(N a)((N b)(N c)) = (N a)(N (bc)) = N (a(bc))
= N ((ab)c) = (N (ab))(N c) = ((N a)(N b))(N c).
We have shown that Q with the multiplication defined in (7.1) is a group.
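The whole construction can be run by machine on a small example. The sketch below uses our own modelling choices, not anything from the text: G is S3, the group of permutations of three symbols, N is its normal subgroup A3 of even permutations, and each coset is represented as a frozenset. It then checks the well definedness proved in Lemma 7.3.1 by brute force over all choices of representatives.

```python
from itertools import permutations

S3 = list(permutations(range(3)))              # G = S_3
def compose(q, p):                             # (q ∘ p)(i) = q[p[i]]
    return tuple(q[p[i]] for i in range(3))

def sign(p):                                   # parity: +1 even, -1 odd
    return (-1) ** sum(p[i] > p[j]
                       for i in range(3) for j in range(i + 1, 3))

N = frozenset(p for p in S3 if sign(p) == 1)   # A_3, normal in S_3
def coset(a):                                  # the right coset N a
    return frozenset(compose(n, a) for n in N)

Q = {coset(a) for a in S3}                     # the elements of S_3 / A_3
assert len(Q) == 2

# (N a)(N b) = N (ab) is well defined: the answer is unchanged when the
# representatives a and b are replaced by any x in N a and y in N b
for a in S3:
    for b in S3:
        for x in coset(a):
            for y in coset(b):
                assert coset(compose(a, b)) == coset(compose(x, y))
print("multiplication on S_3 / A_3 is well defined")
```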
Notation and terminology
Standard notation for this construction is to denote the set of right cosets of N
in G endowed with the multiplication defined in (7.1) by G/N . The group G/N
is referred to as the quotient of G by N . The notation G/N is often read out
loud as “G mod N ” or “G modulo N .” The action of creating G/N from G is
often referred to as “modding out by N .”
7.3.2
The projection homomorphism
Just as we have a homomorphism from Z to Zk , we have a homomorphism from
G to G/N . If we define π : G → G/N by π(a) = N a, then (7.1) immediately
gives that this is a homomorphism. It is clearly a surjection. We call π the
projection or the quotient homomorphism from G to G/N .
The nature of π is that it takes each element of G to the right coset of N that
contains it. Thus all the elements in one coset are carried by π to one element of
G/N , and elements in different cosets are carried to different elements of G/N .
The kernel of π is the coset N 1. But N 1 = N . Thus N is the kernel of π.
From these observations, we see that N measures the extent to which π is
not one-to-one. Not only are |N | elements carried to the identity of G/N , but
also since the cosets of N are all the same size as N , we see that exactly |N |
elements are carried to any one element of G/N .
These observations are made more specific in the proposition below.
Proposition 7.3.2 Let G be a group and let N be a normal subgroup of G.
Then π : G → G/N defined by π(a) = N a is a surjective homomorphism and
for each a ∈ G, we have π −1 (N a) = N a.
Proof. All provisions except the last equality have been argued in the paragraphs
above. The last equality makes two different uses of N a. To the left of the equal
sign N a represents an element of G/N . To the right of the equal sign it is a
subset of G.
Now g ∈ G is in π −1 (N a) if π(g) = N a. But π(g) = N g, so N g = N a and
g = 1g ∈ N g = N a puts g in N a. So π −1 (N a) ⊆ N a.
If g ∈ N a, then N g∩N a is not empty so N a = N g since the right cosets of N
partition G. This gives N a = N g = π(g) so g ∈ π −1 (N a) and N a ⊆ π −1 (N a).
We will see in the next section that this is the structure of any surjective
homomorphism, and in fact every surjective homomorphism is essentially a quotient homomorphism.
7.3.3
The first isomorphism theorem
There are several theorems in group theory known as isomorphism theorems.
Often they are numbered and given names like “first isomorphism theorem,”
“second isomorphism theorem,” and so on up to a third. Unfortunately not all books
number the isomorphism theorems the same way. We do not need them all, and
we will present two of them—one here and one in a later section. We will also
present a theorem usually called the correspondence theorem, but sometimes
called (not by us) the “fourth isomorphism theorem.”
Most books call the next theorem the first isomorphism theorem.
Theorem 7.3.3 (First Isomorphism Theorem) Let h : G → H be a surjective homomorphism and let K be the kernel of h. Then j : G/K → H defined
by j(Ka) = h(a) is a well defined isomorphism from G/K to H.
Proof. We start with the well definedness question. If Ka = Kb, then ba−1 ∈ K.
So h(ba−1 ) = 1 in H. This means 1 = h(ba−1 ) = h(b)h(a−1 ) = h(b)(h(a))−1
which implies that h(b) = h(a).
The calculation j((Ka)(Kb)) = j(K(ab)) = h(ab) = h(a)h(b) = j(Ka)j(Kb)
shows that j is a homomorphism.
Since every c ∈ H is h(a) for some a ∈ G, we get c = h(a) = j(Ka) showing
that j is onto.
Lastly, if j(Ka) = j(Kb), then h(a) = h(b). From this 1 = h(b)(h(a))−1 =
h(ba−1 ) so ba−1 is in K. This means that Ka = Kb and we have shown that j
is one-to-one and thus an isomorphism.
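As a concrete instance of the theorem (the example is ours, not from the text), take h : Z12 → Z4 with h(a) = a mod 4. It is a surjective homomorphism with kernel K = {0, 4, 8}, and the sketch below checks that j(Ka) = h(a) is a well-defined isomorphism from Z12 /K to Z4.

```python
def h(a):                  # a surjective homomorphism Z_12 → Z_4
    return a % 4

K = {a for a in range(12) if h(a) == 0}
assert K == {0, 4, 8}      # the kernel of h

def coset(a):              # the coset Ka in Z_12
    return frozenset((k + a) % 12 for k in K)

cosets = {coset(a) for a in range(12)}
j = {coset(a): h(a) for a in range(12)}    # j(Ka) = h(a)

# j is well defined and one-to-one: the 4 cosets map onto the 4
# elements of Z_4 with no clashes
assert len(cosets) == 4 and sorted(j.values()) == [0, 1, 2, 3]

# j is a homomorphism: j((Ka)(Kb)) = j(Ka) + j(Kb) in Z_4
for a in range(12):
    for b in range(12):
        assert j[coset((a + b) % 12)] == (j[coset(a)] + j[coset(b)]) % 4
```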
Note that Proposition 7.3.2 says that every normal subgroup leads to a
quotient and a surjective homomorphism onto the quotient. On the other hand,
the First Isomorphism Theorem says that every surjective homomorphism has
its image isomorphic in a natural way to the quotient of the domain by the
kernel. Thus we have a close correspondence between (group, normal subgroup,
quotient) and (group, kernel, homomorphic image).
Often it is easier to “recognize” a quotient G/N of a group G with normal
subgroup N by noticing that there is a surjective homomorphism with domain
G having N as the kernel. We will give examples after we discuss some easy
ways normal subgroups can occur.
7.3.4
Abelian groups and products
The easiest way to get normal subgroups is to have things commute. As mentioned in Lemma 3.2.16, every subgroup of an abelian group is normal. So it
follows that if G is an abelian group and H is a subgroup of G, then we can form
G/H.
However, we do not have to go all the way to abelian groups to get easy
access to normal subgroups. Consider two groups A and B. We know that the
set A × B is defined as
A × B = {(a, b) | a ∈ A, b ∈ B}.
Given that A and B are both groups, we can put a group structure on the set
A × B by defining the multiplication as
(a, b)(c, d) = (ac, bd).
(7.3)
That is, multiplication is done separately on each coordinate with the left coordinate behaving as A behaves and the right coordinate behaving as B behaves.
We will leave as an exercise that this puts a group structure on A × B. We
will also leave as an exercise that A × B is abelian if both A and B are abelian.
There are two important subgroups of A × B. They are
A × {1} = {(a, 1) | a ∈ A},
and
{1} × B = {(1, b) | b ∈ B}.
The fact that these are subgroups and other facts about them are given in the
next lemma whose proof is left as an exercise.
Lemma 7.3.4 If A and B are groups, then (7.3) makes A × B into a group.
If we let A′ = A × {1} and B ′ = {1} × B, then the following are true.
1. For all a′ ∈ A′ and b′ ∈ B ′ , we have a′ b′ = b′ a′ .
2. A′ and B ′ are normal subgroups of A × B.
3. The quotient (A × B)/A′ is isomorphic to B and the quotient (A × B)/B ′
is isomorphic to A.
Note that in the first item, both a′ and b′ denote ordered pairs.
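The conclusions of Lemma 7.3.4 can be spot-checked on a small non-abelian example. The sketch below (our code, taking A = S3 and B = Z2) verifies that A′ and B′ commute elementwise, that both are normal in A × B, and that A′ has exactly |B| = 2 cosets:

```python
from itertools import permutations

# A = S3 (as tuples giving images of 0, 1, 2), B = Z2; build A x B.
S3 = list(permutations(range(3)))
def pmul(p, q):            # (p * q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(3))
def pinv(p):
    inv = [0, 0, 0]
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = [(a, b) for a in S3 for b in range(2)]
def mult(x, y):             # multiplication (7.3), coordinatewise
    return (pmul(x[0], y[0]), (x[1] + y[1]) % 2)
def inv(x):
    return (pinv(x[0]), (-x[1]) % 2)

id3 = (0, 1, 2)
Aprime = [(a, 0) for a in S3]          # A' = A x {1}
Bprime = [(id3, b) for b in range(2)]  # B' = {1} x B

# 1. Elements of A' and B' commute with each other.
assert all(mult(x, y) == mult(y, x) for x in Aprime for y in Bprime)
# 2. A' and B' are normal in A x B.
def is_normal(H):
    return all(mult(mult(g, h), inv(g)) in H for g in G for h in H)
assert is_normal(Aprime) and is_normal(Bprime)
# 3. Each coset of A' is determined by the second coordinate,
#    so (A x B)/A' has |B| = 2 elements.
cosets = {frozenset(mult(x, h) for h in Aprime) for x in G}
assert len(cosets) == 2
```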
7.3.5  Examples
The quotient of a group G by a normal subgroup N involves three groups: G,
N and G/N . Obviously the first two determine the third. However, that has to
be said carefully to be completely correct. It may also be guessed that knowing
any two of G, N and G/N up to isomorphism determines the third. This guess
is wrong. We will present very simple examples.
Two of the examples involve G = Z4 × Z2 . We know that G is abelian, so all
its subgroups are normal. Using Z4 = {0, 1, 2, 3} and Z2 = {0, 1}, we consider
the following subgroups:
A = {(0, 0), (0, 1)},
B = {(0, 0), (2, 0)},
C = {(0, 0), (1, 0), (2, 0), (3, 0)},
D = {(0, 0), (2, 0), (0, 1), (2, 1)}.
We also look at Z8 . (We could look at Z4 , but we have already defined G
and a relevant subgroup and this cooperates well with Z8 , so we will use it.)
Using Z8 = {0, 1, 2, 3, 4, 5, 6, 7}, we let E = {0, 2, 4, 6}.
1. In exercises, you will show that A and B both are of order 2 and isomorphic
to each other, that G/A is isomorphic to Z4 , and that G/B is isomorphic
to the Klein four group. Thus knowing what G is isomorphic to and
what the normal subgroup is isomorphic to does not determine what the
quotient is isomorphic to.
2. In exercises, you will show that C and D are both of order 4 and not
isomorphic to each other, and that G/C is isomorphic to G/D. Thus
knowing what G is isomorphic to and what the quotient is isomorphic to
does not determine what the normal subgroup is isomorphic to.
3. In exercises, you will show that C and E are isomorphic to each other, that
Z8 /E and G/C are isomorphic to each other, and that Z8 and G are not
isomorphic to each other. Thus knowing what the quotient is isomorphic
to and what the normal subgroup is isomorphic to does not determine
what the full group is isomorphic to.
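The claims about A and B in the first example can be verified mechanically. The sketch below (ours) computes the quotients of G = Z4 × Z2 by A and by B as sets of cosets and distinguishes them by the largest order of a coset:

```python
# Z4 x Z2 in additive notation; quotients by the subgroups A and B
# from the text, distinguished by their maximal element order.
G = [(a, b) for a in range(4) for b in range(2)]
def add(x, y):
    return ((x[0] + y[0]) % 4, (x[1] + y[1]) % 2)

def coset(x, H):
    return frozenset(add(x, h) for h in H)

def quotient(H):
    return {coset(x, H) for x in G}

def coset_order(x, H):
    """Order of the coset Hx in G/H."""
    n, y = 1, x
    while coset(y, H) != coset((0, 0), H):
        y, n = add(y, x), n + 1
    return n

A = [(0, 0), (0, 1)]
B = [(0, 0), (2, 0)]

# Both quotients have 4 elements ...
assert len(quotient(A)) == 4 and len(quotient(B)) == 4
# ... but G/A has an element of order 4 (so G/A is cyclic, i.e. Z4),
assert max(coset_order(x, A) for x in G) == 4
# while every element of G/B has order at most 2 (the Klein four group).
assert max(coset_order(x, B) for x in G) == 2
```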
We return to a statement in the first paragraph of this section. We said that
G and N determine G/N . The first example above shows that to determine
G/N you need to know the particular normal subgroup of G, and not just what
it is isomorphic to.
7.3.6  The correspondence theorem
If N is a normal subgroup of a group G, it is valid to say that (among other
consequences) all of the information in N has been “collapsed” to the trivial
subgroup in G/N . But there is much of the structure of G that survives the
transition from G to G/N . In particular structures of G that contain N survive
in a very complete way. The correspondence theorem makes this precise.
The bridge from G to G/N is the projection homomorphism π : G → G/N .
The following lemma should probably have been introduced earlier.
Lemma 7.3.5 If h : G → H is a homomorphism of groups, and J ⊆ G is a
subgroup of G, then the restriction of h to J is a homomorphism, and h(J) is
a subgroup of H.
Proof. There are only two requirements for an item (say f ) to be a homomorphism:
that f be a function and that f (ab) = f (a)f (b) for all a and b in the domain.
But the restriction of a function to a smaller domain is still a function, and
the equality still holds since the restriction uses the same values as the original.
That h(J) is a subgroup of H follows from Lemma 3.2.13.
Applying this to π : G → G/N , we see that π carries each subgroup of G to
a subgroup of G/N . This gives a function from subgroups of G to subgroups of
G/N . This function deserves its own notation, and even though it looks more
confusing to have two notations for what looks like almost the same thing, it
will be less confusing later. So for a subgroup H of G, we define π̄(H) to be
π(H) and we have a function π̄ from the set of subgroups of G to the set of
subgroups of G/N .
This function turns out to be onto, but not necessarily one-to-one. For
example, every subgroup of N (including N itself and {1}) is taken to the trivial
subgroup in G/N . However, π̄ does give a one-to-one correspondence (hence the
name of the theorem) between the set of subgroups of G that contain N and
the set of subgroups of G/N . If we focus only on normal subgroups, then we get
a similar result: π̄ gives a one-to-one correspondence between the set of normal
subgroups of G that contain N and the set of normal subgroups of G/N .
Much of the argument that goes into the proof of the Correspondence Theorem is about the behavior of functions. The next lemma separates out the ideas
needed so that they don’t clutter up the proof of the theorem.
Lemma 7.3.6 If f : X → Y is a function between sets, if A ⊆ X and B ⊆ Y ,
then
1. f (f −1 (B)) ⊆ B always holds,
2. B ⊆ f (f −1 (B)) holds if and only if B is contained in the image of f ,
3. A ⊆ f −1 (f (A)) always holds, and
4. f −1 (f (A)) ⊆ A holds if and only if for every x ∈ A, we have
f −1 (f (x)) ⊆ A.
Proof. We leave the proof as a set of exercises.
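For readers who want a concrete instance before doing the exercises, here is a small sketch (ours) exhibiting all four conclusions for one non-injective, non-surjective function:

```python
# A concrete non-injective, non-surjective f : X -> Y to illustrate
# the four conclusions of Lemma 7.3.6.
X = {0, 1, 2, 3}
Y = {'a', 'b', 'c'}
f = {0: 'a', 1: 'a', 2: 'b', 3: 'b'}   # 'c' is not in the image

def image(S):
    return {f[x] for x in S}
def preimage(T):
    return {x for x in X if f[x] in T}

A = {0, 2}          # f^{-1}(f(A)) = {0, 1, 2, 3} properly contains A
B = {'a', 'c'}      # 'c' is outside the image, so B is not recovered

assert image(preimage(B)) <= B                # conclusion 1
assert not (B <= image(preimage(B)))          # B is not in the image of f
assert A <= preimage(image(A))                # conclusion 3
assert preimage(image(A)) != A                # conclusion 4's condition fails
```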
We are now ready for:
Theorem 7.3.7 (Correspondence Theorem) Let N be a normal subgroup
of a group G, let S be the set of subgroups of G that contain N , and let T be
the set of subgroups of G/N . Let π̄ : S → T be derived from the projection
homomorphism π : G → G/N as described above. Then the following are true.
1. The function π̄ : S → T is a one-to-one correspondence.
2. If H ∈ S has H ⊳ G, then π̄(H) ⊳ G/N , so that if S ′ is the set of normal
subgroups in S and T ′ is the set of normal subgroups in T , then π̄ also
gives a function from S ′ to T ′ .
3. The function π̄ : S ′ → T ′ is a one-to-one correspondence.
Proof. For the first conclusion, we use the fact that a function with an inverse is a
one-to-one correspondence. Thus we need an inverse to the function π̄ : S → T .
For an element A of T (a subgroup of G/N ), consider
π −1 (A) = {g ∈ G | π(g) ∈ A}.
We claim that π −1 gives a function from T to S. That is, we claim that
π −1 (A) is a subgroup of G that contains N .
That π −1 (A) contains N follows from the fact that A contains the identity
of G/N and N is the kernel of π.
To check that π −1 (A) is a subgroup of G, we check for identity, inverses
and products. Since π(1) is the identity in G/N it is in A, so 1 ∈ π −1 (A). If
π(x) ∈ A, then π(x−1 ) = (π(x))−1 ∈ A so every x ∈ π −1 (A) has its inverse in
π −1 (A). Lastly, if π(x) ∈ A and π(y) ∈ A, then π(xy) = π(x)π(y) ∈ A, so every
x and y in π −1 (A) has xy ∈ π −1 (A). So π −1 (A) is a subgroup of G.
We want to show that π̄(π −1 (A)) = A for all A ∈ T and that π −1 (π̄(H)) = H
for all H ∈ S. Here we will use the fact that π̄(H) = π(H) for any H in S.
The equality π̄(π −1 (A)) = A becomes π(π −1 (A)) = A which is true by the
first two conclusions of Lemma 7.3.6 because π is onto.
The equality π −1 (π̄(H)) = H becomes π −1 (π(H)) = H, and the containment H ⊆ π −1 (π(H)) comes from conclusion 3 of Lemma 7.3.6. For the reverse
containment, we need the fourth conclusion of Lemma 7.3.6. For this we need
that for every g ∈ H we have π −1 (π(g)) ⊆ H. But π(g) = N g and Proposition
7.3.2 gives that π −1 (N g) = N g, a coset of N . But N ⊆ H and g ∈ H imply
that N g ⊆ H which is what we need.
This establishes that π̄ is a one-to-one correspondence.
The last two conclusions are left as exercises.
The key fact in the last part of the argument given above is that if N ⊆
H ⊆ G with G a group and N and H subgroups, then any coset of N that has
an element of H lies entirely in H. Another way to state this is that no coset
of N has an element in H and an element outside of H, and yet another way to
say this is that H is a union of cosets of N .
7.3.7  Another isomorphism theorem
The next theorem is called the "second isomorphism theorem" by many books
and the "third isomorphism theorem" by about as many books. We will simply
call it the Other Isomorphism Theorem.
If N and H are normal subgroups of G with N contained in H, then the
nature of the definition of normality makes it clear that N is normal in H. Now
the Correspondence Theorem says that H/N is normal in G/N . Thus we can
look at the quotient
(G/N )/(H/N ).
The elements of this quotient of quotients are clumsy to write down. They
are right cosets of H/N by elements of G/N which in turn are right cosets of N
by elements of G. So a typical element would have to be written (H/N )(N x)
where x ∈ G.
The Other Isomorphism Theorem has this to say about this quotient of
quotients.
Theorem 7.3.8 (Other Isomorphism Theorem) Let N and H be normal
subgroups of G with N ⊆ H. Then taking Hx to (H/N )(N x) gives a well defined
isomorphism
G/H → (G/N )/(H/N ).
Proof. Well definedness is easy. If Hx = Hy, then xy −1 is in H. Now
(N x)(N y)−1 = (N x)(N y −1 ) = N (xy −1 ) ∈ H/N since xy −1 ∈ H. This means
that N x and N y are in the same coset of H/N in G/N , so (H/N )(N x) =
(H/N )(N y).
That this is a homomorphism and a one-to-one correspondence is left as an
exercise.
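The statement can be sanity-checked numerically. The sketch below (ours) takes G = Z8, N = {0, 4} and H = {0, 2, 4, 6}, and confirms that G/H and (G/N )/(H/N ) have the same size and that the map Hx ↦ (H/N )(N x) is well defined:

```python
# G = Z8, N = {0, 4} <= H = {0, 2, 4, 6}; compare G/H with (G/N)/(H/N).
G, N, H = list(range(8)), {0, 4}, {0, 2, 4, 6}

def coset(x, K, mod=8):
    return frozenset((x + k) % mod for k in K)

G_mod_H = {coset(x, H) for x in G}
G_mod_N = {coset(x, N) for x in G}       # elements are the cosets Nx
H_mod_N = {coset(h, N) for h in H}       # a subgroup of G/N

# Cosets of H/N in G/N: (H/N)(Nx) = { N(h + x) : h in H }.
def big_coset(x):
    return frozenset(coset(h + x, N) for h in H)
quot_of_quot = {big_coset(x) for x in G}

assert len(G_mod_H) == len(quot_of_quot) == 2
# The map Hx -> (H/N)(Nx) is well defined: same H-coset, same image.
assert all(big_coset(x) == big_coset(y)
           for x in G for y in G if coset(x, H) == coset(y, H))
```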
Exercises (40)
1. Prove that if A and B are groups, then the multiplication defined in (7.3)
makes A × B a group.
2. Prove that if A and B are abelian groups, then the multiplication (7.3)
on A × B is abelian.
3. Prove Lemma 7.3.4. The first sentence is a previous exercise. The first
conclusion needs careful attention to what elements are. The last conclusion is best done using the First Isomorphism Theorem. Since that
theorem needs a homomorphism, the main task in the last conclusion is
to figure out what the right homomorphisms are.
4. Prove that the three examples in Section 7.3.5 have the properties claimed
and that each example shows the “does not determine” claim stated for
that example.
5. Prove Lemma 7.3.6.
6. Prove the second and third conclusions of the Correspondence Theorem.
Be careful to cover all that needs to be proven in the third conclusion and
be careful not to do too much work by ignoring things that have already
been proven.
7. Finish the proof of the Other Isomorphism Theorem.
8. Let N be the normal subgroup of D16 described in Section 7.2.2. Write
out the multiplication table for D16 /N . What known group is D16 /N
isomorphic to? How can this be used to determine all the subgroups of
D16 that contain N ?
9. Consider the following two permutations in S8 .
σ = (1 2 3 4)(5 6 7 8),
τ = (1 5)(2 6)(3 7)(4 8).
These can be viewed as symmetries of the cube shown in (4.8). They
generate a group G of 8 elements. The elements of G are 8 of the 16
elements that stabilize the set {x, y} where x is the center of the square
1234 and y is the center of the square 5678. (Can you find an element in
the stabilizer of {x, y} that is not in G?)
(a) Show that G is abelian and is isomorphic to Z4 × Z2 . Think about
how little you can get away with showing. Otherwise you will end
up computing an 8 × 8 multiplication table and that is way too much
work. Since G is abelian, all subgroups are normal.
(b) The identity and τ form an order two subgroup K. What group is
G/K isomorphic to?
(c) The identity and σ 2 form an order two subgroup N . What group is
G/N isomorphic to?
10. Prove that if H is a subgroup of G and [G : H] = 2, then H is a normal
subgroup of G.
11. The purpose of this problem is to show that a normal subgroup of a normal
subgroup is not always normal. Consider certain subgroups of D8 by
referring to the figure in (4.1). Let v = (1 2)(3 4) and let h = (2 3)(1 4).
Let H = {1, v} and let J = {1, v, h, vh}. Show that H is a normal
subgroup of J, that J is a normal subgroup of D8 , and that H is not a
normal subgroup of D8 .
Chapter 8
Classes of groups
We became interested in groups because groups generalize groups of permutations. We became interested in groups of permutations because interesting
permutations (of the roots) came up when looking at solutions to polynomials.
Galois’ main observation was that the permutation groups that came up with
polynomials whose roots were “expressible by radicals” were nicer than arbitrary groups. The groups that arise this way are now called “solvable groups”
because of their association with solutions to polynomials. This chapter introduces solvable groups.
The property that defines solvable groups does not resemble in any way a
solution to a polynomial equation. It will take several other chapters to make
the connection between the group property and solutions to polynomials.
8.1  Abelian groups
We start with a very short section on abelian groups. We do this for several
reasons. First, it is a bit of a warm up exercise before getting into the full
topic of solvable groups since some of the behavior of abelian groups is also
found among solvable groups. Second, the definition of solvable groups makes
reference to abelian groups. And third, because of the reference to abelian
groups in the definition of solvable groups, some facts about solvable groups are
based on corresponding facts about abelian groups.
8.1.1  Subgroups of abelian groups
Lemma 8.1.1 A subgroup of an abelian group is abelian.
Proof. The only argument that needs to be made is based on the definition of
a subgroup. If H is a subgroup of G, then the multiplication used for H is just
the multiplication used for G. That is if a, b are in H, then ab as computed
in H is the same as ab as computed in G. So if G is abelian then ab = ba as
computed in G. So ab = ba as computed in H.
8.1.2  Quotients of abelian groups
Lemma 8.1.2 A quotient of an abelian group is abelian.
Proof. The only argument that needs to be made is based on the definition of
the multiplication in a quotient of a group. If N is a normal subgroup of a group
G, then the multiplication on G/N is defined by (N a)(N b) = N (ab). Now if G
is abelian, then ab = ba, so (N a)(N b) = N (ab) = N (ba) = (N b)(N a).
8.2  Solvable groups
We will define solvable groups, and prove statements that are parallel to Lemmas 8.1.1 and 8.1.2 and that use these lemmas in their proofs. Specifically, we
will show that subgroups and quotients of solvable groups are solvable. The
reader should not expect it to be clear why this class of groups is called solvable
nor why the class is at all useful.
8.2.1  The definition
Let G be a group. We say that G is solvable if there is a finite sequence of
subgroups
{1} = G0 ⊆ G1 ⊆ G2 ⊆ · · · ⊆ Gn−1 ⊆ Gn = G
with the property that for each i with 0 ≤ i < n we have Gi ⊳ Gi+1 and Gi+1 /Gi
is abelian.
The definition needs some discussion.
Recall from Problem 11 in Exercise set (40) that a normal subgroup of a
normal subgroup might not be normal in the entire group. So the definition
of solvable does not require that each Gi be normal in G. It only requires that
each group in the sequence of groups be normal in the next larger subgroup in
the sequence.
Note that G1 must be abelian. This is because G1 /G0 must be abelian,
while G0 = {1} makes G1 /G0 isomorphic to G1 .
If G is abelian, then G is solvable by taking G0 = {1} and G1 = G.
If G has finite order, then G is solvable if and only if G satisfies a definition that reads exactly the same as the definition of solvable except that we
replace the requirement that Gi+1 /Gi be abelian by the stronger requirement
that Gi+1 /Gi be cyclic. We will argue this after we give the parallels to Lemmas
8.1.1 and 8.1.2.
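As a concrete instance of the definition, the sketch below (ours) checks that S3 is solvable using the chain {1} ⊆ A3 ⊆ S3:

```python
from itertools import permutations

# S3 is solvable via the chain {1} <= A3 <= S3.
def pmul(p, q):
    return tuple(p[q[i]] for i in range(3))
def pinv(p):
    inv = [0] * 3
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)
def parity(p):
    return sum(1 for i in range(3) for j in range(i + 1, 3)
               if p[i] > p[j]) % 2

S3 = list(permutations(range(3)))
A3 = [p for p in S3 if parity(p) == 0]

# A3 is normal in S3 ...
assert all(pmul(pmul(g, h), pinv(g)) in A3 for g in S3 for h in A3)
# ... A3 itself is abelian (so A3/{1} is abelian) ...
assert all(pmul(a, b) == pmul(b, a) for a in A3 for b in A3)
# ... and S3/A3 has order 2, hence is abelian.
cosets = {frozenset(pmul(g, h) for h in A3) for g in S3}
assert len(cosets) == 2
```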
8.2.2  Subgroups of solvable groups
Lemma 8.2.1 A subgroup of a solvable group is solvable.
Proof. Let
1 = G0 ⊳ G1 ⊳ · · · ⊳ Gn = G
be as in the definition of a solvable group and let H ⊆ G be a subgroup. Let
Hi = H ∩ Gi . A conjugate of an element of Hi by an element of Hi+1 must be
in H since both elements are in H and it must be in Gi since Gi ⊳ Gi+1 . Thus
Hi ⊳ Hi+1 .
We now define a homomorphism from Hi+1 /Hi to Gi+1 /Gi . In the following
a and b are elements of Hi+1 . Sending Hi a to Gi a is well defined since Hi a = Hi b
implies ba−1 ∈ Hi ⊆ Gi which implies Gi a = Gi b. It is a homomorphism by
the way cosets are multiplied. If Gi a = Gi b, then ba−1 is in Gi . But a and b
are in H, so ba−1 is in H ∩ Gi = Hi and Hi a = Hi b. So the homomorphism is
one-to-one.
This makes Hi+1 /Hi isomorphic to a subgroup of the abelian group Gi+1 /Gi
which by Lemma 8.1.1 must be abelian. We have shown that H is solvable.
8.2.3  Quotients of solvable groups
Lemma 8.2.2 A quotient of a solvable group is solvable.
Proof. If G is a group and N ⊳ G, then the projection homomorphism π : G →
G/N makes the quotient a homomorphic image. Thus it suffices to prove that
the homomorphic image of a solvable group is solvable.
Now assume that
1 = G0 ⊳ G1 ⊳ · · · ⊳ Gn = G
so that each Gi+1 /Gi is abelian, and assume that h : G → H is a surjective
homomorphism. For each i, let Hi = h(Gi ). If h(a) ∈ Hi and h(b) ∈ Hi+1 with
a ∈ Gi and b ∈ Gi+1 , then (h(b))(h(a))(h(b))−1 = h(bab−1 ) = h(c) for some
c ∈ Gi . Thus h(c) ∈ Hi and Hi ⊳ Hi+1 .
Define j : Gi+1 /Gi → Hi+1 /Hi by j(Gi a) = Hi h(a). If Gi a = Gi b, then
ba−1 ∈ Gi . But this gives (h(b))(h(a))−1 = h(ba−1 ) ∈ Hi , so Hi h(a) = Hi h(b)
and j(Gi a) = j(Gi b) and j is well defined. It is a homomorphism by the way we
define the product of cosets. It is onto since h : Gi+1 → Hi+1 is onto. Since we
assume Gi+1 /Gi is abelian, Lemma 8.1.2 shows that Hi+1 /Hi is abelian. We
have shown that H satisfies the definition of a solvable group.
8.2.4  Finite solvable groups
We will discuss the following statement that appears to be stronger than the
definition of solvable. We will refer to it as condition (∗).
We say that a group G satisfies (∗) if there is a finite sequence of subgroups
{1} = G0 ⊆ G1 ⊆ G2 ⊆ · · · ⊆ Gn−1 ⊆ Gn = G
with the property that for each i with 0 ≤ i < n we have Gi ⊳ Gi+1 and Gi+1 /Gi
is cyclic.
Clearly, we have that if G satisfies (∗), then G is solvable since every cyclic
group is abelian.
In parallel to the fact that every abelian group is solvable, we have that
every cyclic group satisfies (∗).
Finite abelian groups
We will show first that every finite abelian group satisfies (∗), and then we will
show that every finite solvable group satisfies (∗). Since every abelian group is
solvable, this two step process seems redundant, but we will use the first result
in proving the second much in the way that Lemmas 8.1.1 and 8.1.2 were used
in proving Lemmas 8.2.1 and 8.2.2.
Lemma 8.2.3 If a group G is finite and abelian, then it satisfies condition (∗).
Proof. The proof is inductive and heavily based on the Correspondence Theorem
and the Other Isomorphism Theorem.
We induct on the order of the group. A group of order 1 satisfies (∗). We
now look at an abelian group G of order n and assume that a group of order
less than n satisfies (∗).
If G is cyclic, then there is nothing to show. So assume that there is an
element x of G that is not the identity, but that does not generate all of G. Let
N be the subgroup generated by x. It is not all of G, it is not trivial, and it is
normal in G since G is abelian. We consider G/N .
Since |G/N | = |G|/|N | and |N | > 1, we know that |G/N | < |G| = n. Also by
Lemma 8.1.2, G/N is abelian. So G/N satisfies (∗) by the inductive hypothesis
and there is a sequence
H0 ⊆ H1 ⊆ · · · ⊆ Hk = G/N
with H0 the trivial subgroup, each subgroup normal in the next (automatic
since G/N is abelian), and each Hi+1 /Hi cyclic.
Letting π : G → G/N be the quotient homomorphism, the Correspondence
Theorem gives that each π −1 (Hi ) is a subgroup of G that contains N . Let
Ni = π −1 (Hi ) for each i. Since Hi ⊆ G/N , we have that Hi is the set of cosets
of N in G that lie in Ni and we can write Hi = Ni /N .
We have N0 = N . We also have Nk = π −1 (Hk ) = π −1 (G/N ) = G.
Thus we have the sequence
{1} ⊆ N = N0 ⊆ N1 ⊆ · · · ⊆ Nk = G.
Since G is abelian, all subgroups are normal.
Now the Other Isomorphism Theorem says that for 0 ≤ i < k we have
Ni+1 /Ni ≃ (Ni+1 /N )/(Ni /N ) = Hi+1 /Hi        (8.1)
which is cyclic. (Here ≃ means "isomorphic to.")
This covers all the "successive quotients" in the sequence above except the first, N/{1}. But
N was chosen cyclic so even the first quotient is cyclic. This finishes showing
that G satisfies (∗).
Finite solvable groups
Proposition 8.2.4 If a group G is finite and solvable, then it satisfies condition
(∗).
The proof will be left as an exercise. The proof should be modeled on the
proof of Lemma 8.2.3.
If G is a finite solvable group, then the definition of solvable will give a sequence of subgroups whose successive quotients are all abelian. What is wanted
is a new sequence of subgroups whose successive quotients are all cyclic. There
is no reason to expect that the new sequence will equal the original sequence
given by the definition of solvable, nor even that the new sequence will have
the same number of subgroups as the original sequence. However, when Lemma
8.2.3 and the Correspondence Theorem are used to prove Proposition 8.2.4, it
will be seen that there is a strong relationship between the original sequence
and the new sequence. The proof should end up proving the following more
specific statement.
Theorem 8.2.5 If a group G has a finite sequence of subgroups Gi , 0 ≤ i ≤
n, satisfying the requirements in the definition of solvable, then it has a finite
sequence of groups G′j , 0 ≤ j ≤ m, satisfying the requirements of condition (∗)
so that the sequence Gi is included in the sequence G′j in that for each i, there
is a j so that Gi = G′j .
In other words, to get the sequence of the G′j from the sequence of the Gi ,
one “inserts” extra groups between the successive Gi .
Exercises (41)
1. Prove Theorem 8.2.5. Of course, this will prove Proposition 8.2.4.
2. Prove that all the dihedral groups are solvable. There are infinitely many
dihedral groups, and you cannot write out infinitely many proofs. But if
you start with the smallest, D6 , you should see a general argument that
works for all of them.
3. Prove the following “converse” to Lemmas 8.2.1 and 8.2.2. If N ⊳ G, and
if both N and G/N are solvable, then G is solvable. A hint is to use the
Correspondence Theorem.
4. This is a bit harder. Show that S4 is solvable. To find normal subgroups,
you should keep in mind that all permutations that share the same cycle
structure are conjugate in S4 (Section 4.3.5).
Chapter 9
Permutation groups
If we accept the fact that solvable groups go with solvable polynomial equations,
then the existence of non-solvable groups becomes interesting. This chapter
does some calculations with permutation groups that, among other things, finds
examples of non-solvable groups.
If a group is to be non-solvable, it must be non-abelian. For a non-abelian
group to be solvable, it has to have at least one normal subgroup that is not
trivial and not the whole group. So the easiest way to find a non-solvable group
is to find a non-abelian group whose only normal subgroups are the trivial group
and the whole group. This leads to a definition.
A group G is said to be simple if its only normal subgroups are the trivial
subgroup and G itself.
Note that for a prime p, the group Zp is simple because Lagrange’s theorem
says that the only subgroups of Zp (normal or not) are the trivial subgroup and
Zp itself. However, each Zp is abelian and thus solvable.
The main purpose of this chapter is to show that the full permutation groups
Sn are not solvable when n ≥ 5. We will do some direct calculations on permutations that will give this, and one or two other facts about the groups Sn that
will be needed later.
9.1  Odd and even permutations
A transposition in Sn has exactly one cycle of length 2 and all other cycles are of
length one. That is, a transposition simply switches two numbers in {1, 2, . . . , n}
and leaves all other numbers fixed. A transposition looks like (a b) in cycle
notation.
Every permutation in Sn can be written as a product of transpositions.
This is not hard to show. It is a standard exercise to do this in almost any
introductory computer class, and it is left as an exercise here.
Given a permutation σ ∈ Sn , there may be many ways to write it as a
product of transpositions. For example
( 1 2 3 4
  3 4 1 2 ) = (1 3)(2 4) = (2 3)(3 4)(1 2)(2 3),
so even the number of transpositions used can vary.
We claim that for a given permutation σ ∈ Sn either all ways of writing σ as
a product of transpositions use an even number of transpositions, or all ways of
writing σ as a product of transpositions use an odd number of transpositions.
Note that in the example above, both ways shown of writing out the given
permutation used an even number of transpositions.
We will prove our claim by relating the number of transpositions to a number
calculated directly from the permutation.
9.1.1  Crossing number of a permutation
Recall the general form of the Cauchy notation for a permutation σ ∈ Sn as
introduced in Section 2.3.4.
σ = (  1      2      3    · · ·    n
      σ(1)  σ(2)  σ(3)  · · ·  σ(n) ).        (2.2)
For such a σ, we let the crossing number for σ be the number of i and j with
i < j for which σ(i) > σ(j). That is, it is the number of pairs in {1, 2, . . . , n}
whose order has been “switched” by σ. In (2.2), it is the number of pairs in the
bottom line that are out of order.
In the example
( 1 2 3 4
  3 4 1 2 ),
the bottom line reads (3 4 1 2). There are six pairs to consider in the bottom line:
(3 4) (3 1) (3 2) (4 1) (4 2) (1 2)
Of these six pairs, the ones that are out of order are (3, 1), (3, 2), (4, 1), and
(4, 2), while (3, 4) and (1, 2) are in order. Thus the crossing number of this
permutation is 4.
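The crossing number is easy to compute from the bottom line of the Cauchy notation. A short sketch (ours), confirming the value 4 for the example above:

```python
def crossing_number(bottom):
    """Number of pairs i < j with sigma(i) > sigma(j), read off the
    bottom line of the Cauchy notation."""
    n = len(bottom)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if bottom[i] > bottom[j])

# The permutation with bottom line (3 4 1 2) has crossing number 4.
assert crossing_number([3, 4, 1, 2]) == 4
```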
We are not really concerned with the crossing number, but only in whether
it is even or odd. The evenness or oddness of an integer is called its parity and
an integer is said to have even parity if it is even, and odd parity if it is odd.
However, the use of “even parity” and “odd parity” only allows one to say in
two words what used to be said in one word, so we will not use it much.
If σ ∈ Sn has even crossing number, then σ is said to have even parity (or
more simply σ is said to be even) and to have odd parity (or be odd) otherwise.
Our main result for this part will be that if σ can be written as a product of
k transpositions, then the parity of k must agree with the parity of σ. We will
prove this in several steps.
First we observe that the identity has crossing number zero (no pair is reversed) and thus the identity permutation is even.
Next we look at special transpositions. We call a transposition (i j) an
adjacent transposition if j = i + 1. Thus adjacent transpositions switch two
numbers that are consecutive.
Lemma 9.1.1 If σ is in Sn and τ ∈ Sn is an adjacent transposition, then the
parity of τ σ is the opposite of the parity of σ.
Proof. If τ = (i i + 1), then the numbers in positions p and q have their order
unchanged when τ acts if neither p nor q is in {i, i + 1}. Next one observes that
if one of p and q is in {i, i + 1} and the other (say p) is not, then the values in
positions p and i before τ is done become the values in positions p and i + 1
after τ is done, and the values in positions p and i + 1 become the values in
positions p and i. Thus the number of pairs out of order for such a p and q does not change.
Lastly, when both p and q are in {i, i + 1}, then the pair of numbers in those
positions either changes from in order to out of order or the reverse. Thus the
application of τ changes the crossing number by exactly one. This proves the
claim.
Corollary 9.1.2 If σ ∈ Sn is the product of k adjacent transpositions, then the
parity of k equals the parity of σ.
Proof. The identity is even, and each time we multiply by an adjacent transposition, the parity reverses. From Lemma 9.1.1, the parity of the resulting product
must be the parity of the number of adjacent transpositions multiplied.
Next we relate arbitrary transpositions to adjacent transpositions.
Lemma 9.1.3 If (a b) is a transposition in Sn , then (a b) can be written as a
product of an odd number of adjacent transpositions.
The proof of this is left as an exercise. From Lemma 9.1.3 and Corollary
9.1.2, we get the following two consequences. The first is immediate.
Corollary 9.1.4 Every transposition in Sn is odd.
Corollary 9.1.5 If σ ∈ Sn is the product of k transpositions, then the parity
of k equals the parity of σ.
Proof. Each transposition can be replaced by an odd number of adjacent transpositions. The number of adjacent transpositions in the resulting (larger) product
is the sum of these odd numbers which is odd if the number of original transpositions is odd and even otherwise.
A proof similar to that of Corollary 9.1.5 gives the following.
Lemma 9.1.6 If σ1 and σ2 are in Sn , then the parity of σ1 σ2 is the sum of the
parities of σ1 and σ2 .
The proof is left as an exercise. By the “sum” of two parities, we mean the
parity that results if two integers are added with the given parities. Note that
we can think of the elements of Z2 as representing all the parities that we need.
The element 0 represents even parity, and the element 1 represents odd parity.
Adding parities is now represented by addition in Z2 .
With this view of parities, Lemma 9.1.6 says that taking σ ∈ Sn to 0 if
σ is even and taking σ to 1 if σ is odd, gives a homomorphism from Sn to
Z2 . It is onto since we know that there are elements of Sn of odd parity. The
homomorphism can be referred to as the parity homomorphism from Sn to Z2 .
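The parity homomorphism can be verified by brute force in a small case. The sketch below (ours) checks the homomorphism property over all products in S4:

```python
from itertools import permutations

def parity(p):
    """Parity (0 = even, 1 = odd) via the crossing number."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if p[i] > p[j]) % 2

def pmul(p, q):
    return tuple(p[q[i]] for i in range(len(p)))

S4 = list(permutations(range(4)))
# Lemma 9.1.6 / the parity homomorphism: the parity of a product is
# the sum of the parities in Z2, checked for all 24 * 24 products.
assert all(parity(pmul(s, t)) == (parity(s) + parity(t)) % 2
           for s in S4 for t in S4)
# The homomorphism is onto: transpositions are odd.
assert parity((1, 0, 2, 3)) == 1
```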
Exercises (42)
1. Show that every permutation can be written as a product of transpositions.
This is easier if you have had to write a sort routine in a programming
class.
2. Show that every transposition can be written as a product of an odd
number of adjacent transpositions. That is, prove Lemma 9.1.3.
3. Prove Lemma 9.1.6.
4. Let σ be a single cycle in Sn with k elements in the cycle. Show that σ is
even if k is odd and odd if k is even. A direct proof would be nice and an
inductive proof would be nicer.
9.2  The alternating groups
Let An denote the collection of even permutations in Sn . It follows from Lemma
9.1.6 that products of elements in An are in An , and from Corollary 9.1.5 that
the inverse of an element in An is in An . Thus An is a subgroup of Sn . An
alternate way to argue this is to notice that An is the kernel of the parity
homomorphism from Sn to Z2 . That observation also gives that An is a normal
subgroup of Sn . However, since Z2 has order 2 and the parity homomorphism
is onto, we get that [Sn : An ] = 2 and we know that any subgroup of index two
has to be normal.
The groups An are called the alternating groups. One of their claims to fame
is that for n ≥ 5, the group An is non-abelian and simple. We will show that
A5 is simple. It is all we need later. The techniques that we use for A5 can be
generalized to all An with n ≥ 5, but it will be easier to work with just A5 since
it is easy to describe all types of elements that will be encountered.
9.2.1  The A5 menagerie
Assume that N is a non-trivial normal subgroup of A5 . If we show that N has
to be all of A5 , then we will have shown that A5 is simple.
The assumption that N is not trivial says that there is a non-identity element
in N . There are only three kinds of non-identity elements in A5 :
1. a product of two “non-overlapping” transpositions (a b)(c d),
2. a three-cycle (a b c), and
3. a five-cycle (a b c d e).
All other non-identity cycle structures possible with five objects being permuted
are odd.
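This census of A5 can be confirmed by enumeration. The sketch below (ours) tallies cycle types over the 60 even permutations of five symbols:

```python
from itertools import permutations

def parity(p):
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if p[i] > p[j]) % 2

def cycle_type(p):
    """Sorted tuple of cycle lengths of a permutation of {0,...,4}."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            k, x = 0, start
            while x not in seen:
                seen.add(x)
                x, k = p[x], k + 1
            lengths.append(k)
    return tuple(sorted(lengths))

A5 = [p for p in permutations(range(5)) if parity(p) == 0]
counts = {}
for p in A5:
    t = cycle_type(p)
    counts[t] = counts.get(t, 0) + 1

assert len(A5) == 60
assert counts == {(1, 1, 1, 1, 1): 1,   # the identity
                  (1, 2, 2): 15,        # (a b)(c d)
                  (1, 1, 3): 20,        # (a b c)
                  (5,): 24}             # (a b c d e)
```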
Showing that A5 is simple now proceeds in two steps. We let N be a nontrivial normal subgroup of A5 and we let g be a non-identity element of N .
The first step is to show that N must contain a three-cycle. Of course if g
is already a three-cycle, there is nothing to show. If g is not a three-cycle, then
we must use the structure of g and the normality of N to build another element
of N that is a three-cycle.
The second step is to show that if N has a three-cycle, then N must have
all elements of A5 .
9.2.2
Getting a three-cycle
Let g be a non-identity element in a normal subgroup N of A5 . We know that
g has one of the cycle structures given in Section 9.2.1.
If g is a three-cycle, then N certainly contains a three-cycle.
If g = (a b)(c d), then we can conjugate g by h = (d e). We can calculate
that g^h = (a b)(c e). Since N is normal, g^h must be in N. One now calculates
that
(g^h)(g^{-1}) = (a)(b)(c d e),
which is a three-cycle.
If g = (a b c d e), then we can conjugate g by h = (d e). We can calculate
that g^h = (a b c e d). Since N is normal, g^h must be in N. One now calculates
that
(g^h)(g^{-1}) = (a d e)(b)(c),
which is a three-cycle.
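Both conjugation computations can be verified mechanically. A small Python sketch (illustrative only; permutations on {1, ..., 5} are stored as dicts, composition is right to left, and g^h = h g h^{-1} as in the text):

```python
def from_cycles(cycles, n=5):
    """Permutation of {1,...,n} (as a dict) given by disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for u, v in zip(cyc, cyc[1:] + cyc[:1]):
            perm[u] = v
    return perm

def compose(f, g):
    """(f g)(x) = f(g(x)): the right-hand factor acts first."""
    return {x: f[g[x]] for x in g}

def inverse(f):
    return {v: k for k, v in f.items()}

def conj(g, h):
    """g^h = h g h^{-1}, matching the text's convention."""
    return compose(compose(h, g), inverse(h))

a, b, c, d, e = 1, 2, 3, 4, 5

g = from_cycles([[a, b], [c, d]])            # g = (a b)(c d)
h = from_cycles([[d, e]])                    # h = (d e)
assert conj(g, h) == from_cycles([[a, b], [c, e]])
assert compose(conj(g, h), inverse(g)) == from_cycles([[c, d, e]])

g = from_cycles([[a, b, c, d, e]])           # g = (a b c d e)
assert conj(g, h) == from_cycles([[a, b, c, e, d]])
assert compose(conj(g, h), inverse(g)) == from_cycles([[a, d, e]])
```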
We have shown that every non-trivial normal subgroup N of A5 has a three-cycle.
9.2.3
Getting all of A5
Three-cycles
We first show that a non-trivial normal subgroup N of A5 contains all three-cycles.
From the previous section, we can assume that there is a three-cycle
(a b c) in N. If (p q r) is another three-cycle in S5, then we know from Theorem
4.3.15 that there is a τ ∈ S5 so that (a b c)^τ = (p q r). But N ⊳ A5 can only be
exploited if τ ∈ A5. There are two cases.
In the case that τ is even, τ ∈ A5 and N ⊳ A5 imply that (p q r) =
(a b c)^τ is in N.
194
CHAPTER 9. PERMUTATION GROUPS
In the case that τ is odd, write (p q r) as (p q r)(s)(t), where s and t are the
two elements of {1, 2, 3, 4, 5} not appearing in (p q r). Let λ = (s t)τ.
Then λ is even and is in A5. Further, since (s t) is its own inverse, we have
(a b c)^λ = λ(a b c)λ^{-1} = (s t)τ(a b c)τ^{-1}(s t) = (s t)(p q r)(s t) = (p q r).
Now with λ ∈ A5 and N ⊳ A5, we get that (p q r) must also be in N. This
proves that N contains all three-cycles.
Arbitrary elements
Now we wish to show that N contains all elements of A5. An arbitrary element
in A5 is in one of the three forms discussed in Section 9.2.1. We already know
that all elements in the second form (three-cycles) are in N, so we need to
account for the other two.
If g ∈ A5 is of the form (a b)(c d), then
(a b c)(b c d) = (a b)(c d) = g,
and g, being a product of two three-cycles in N, is in N.
If g is of the form (a b c d e), then
(a b c)(c d e) = (a b c d e) = g
and g is also in N . This shows that N contains all elements in A5 .
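The two products above are easy to check by machine; the computation below (an illustrative sketch) composes right to left, with the rightmost cycle applied first, which matches the products as written:

```python
def from_cycles(cycles, n=5):
    """Permutation of {1,...,n} (as a dict) given by disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for u, v in zip(cyc, cyc[1:] + cyc[:1]):
            perm[u] = v
    return perm

def compose(f, g):
    # (f g)(x) = f(g(x)): the right-hand factor acts first
    return {x: f[g[x]] for x in g}

a, b, c, d, e = 1, 2, 3, 4, 5

# (a b c)(b c d) = (a b)(c d)
assert compose(from_cycles([[a, b, c]]), from_cycles([[b, c, d]])) == \
    from_cycles([[a, b], [c, d]])

# (a b c)(c d e) = (a b c d e)
assert compose(from_cycles([[a, b, c]]), from_cycles([[c, d, e]])) == \
    from_cycles([[a, b, c, d, e]])
```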
We have given most of the proof of the following.
Theorem 9.2.1 If N is a non-trivial normal subgroup of A5 , then N is all of
A5 . In particular, A5 is simple and not solvable.
Proof. The only item not already proven is the fact that A5 is not solvable.
From the definition of solvable, we only need to show that A5 is not abelian.
We leave this as an exercise.
This has the following consequences.
Corollary 9.2.2 The symmetric group S5 is not solvable.
Proof. This follows from the fact (Lemma 8.2.1) that subgroups of solvable
groups are solvable and that the non-solvable A5 is a subgroup of S5 .
Corollary 9.2.3 The symmetric group Sn is not solvable for n ≥ 5.
Proof. If Sn is the group of permutations of {1, 2, . . . , n}, then the subgroup of
Sn that keeps each element of {6, 7, . . . , n} fixed is isomorphic to S5 which is
not solvable. Now Lemma 8.2.1 says that Sn is not solvable.
Exercises (43)
1. Find two elements of A5 that do not commute. Explain the comment in
the proof of Theorem 9.2.1 that this is all that is needed to show that A5
is not solvable.
9.3
Showing a subgroup is all of Sn
Here we give an argument that resembles the proof that A5 is simple, but that
gives a different result. It will be needed to justify an example that will occur
much later in the course.
First we need a definition. If a group G acts on a set X, we say that the
action is transitive (or that G acts transitively on X) if there is only one orbit
of the action (which then must necessarily be all of X). Another way to say the
same thing is that for every x and y in X, there is a g ∈ G so that g(x) = y.
That is, you can take any element of X to any other element of X.
We will apply this to the action of Sn on {1, 2, . . . , n}. If H is a subgroup
of Sn , then it is also a group of permutations on {1, 2, . . . , n} and thus acts on
{1, 2, . . . , n}.
Clearly Sn acts transitively on {1, 2, . . . , n}, and for a subgroup H of Sn , the
action of H on {1, 2, . . . , n} will either be transitive or not. Clearly, the action
of the trivial subgroup of Sn on {1, 2, . . . , n} is not transitive.
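These definitions can be illustrated computationally in the small case of S3 (a sketch; permutations are tuples p with p[i] the image of i, composed right to left):

```python
def compose(f, g):
    """(f g)(i) = f(g(i)); permutations of {0,...,n-1} as tuples."""
    return tuple(f[g[i]] for i in range(len(g)))

def generate(gens):
    """Close a finite set of permutations under composition, giving the
    subgroup they generate."""
    group, frontier = set(gens), list(gens)
    while frontier:
        p = frontier.pop()
        for q in list(group):
            for r in (compose(p, q), compose(q, p)):
                if r not in group:
                    group.add(r)
                    frontier.append(r)
    return group

def is_transitive(group, n):
    # a subgroup acts transitively iff the orbit of one point is everything
    return {g[0] for g in group} == set(range(n))

three_cycle = (1, 2, 0)   # (1 2 3) written on {0, 1, 2}
swap = (1, 0, 2)          # (1 2)

assert is_transitive(generate([three_cycle]), 3)       # a transitive proper subgroup
assert not is_transitive(generate([swap]), 3)          # not transitive
```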
Our goal is the following.
Proposition 9.3.1 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5}
and that contains at least one transposition, then H = S5 .
We will give the argument as a sequence of lemmas that build up information
about H. We will be hindered by the fact that the transitivity of the action
tells only so much about the action and no more. For example if a and b are two
elements of {1, 2, 3, 4, 5}, and c is a third, then we know that there is a σ ∈ H so
that σ(a) = b. But we have no way of knowing what σ(c) is for this particular
σ. In other words, the transitivity lets us dictate what happens to one element
of {1, 2, 3, 4, 5}, but does not let us dictate what happens to two.
The proofs of the lemmas will be somewhat wordy, so to make them more
efficient we introduce some terminology.
If σ and τ are each single cycles, then we say that they overlap to mean that
the set of elements involved in the cycle of σ and the set of elements involved
in the cycle of τ have non-empty intersection. For example if σ = (1 3 5) and
τ = (2 5), then they overlap. But if σ = (1 3 5) and τ = (2 4), then they do
not overlap.
Even further, we can discuss by how much they overlap. So we can say that
σ = (1 3 5) and τ = (2 3 5) overlap in two elements, but σ = (1 3 5) and
τ = (2 4 5) only overlap in one element.
Lemma 9.3.2 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5}
and that contains at least one transposition, then H contains a three-cycle.
Proof. Let τ = (a b) be the transposition that is guaranteed to be in H and
let c be neither a nor b. There is a σ ∈ H so that σ(a) = c. We consider the
transposition τ^σ, which must be in H since both τ and σ are in H. We
know that τ^σ transposes the element c, which is neither a nor b. But we do not
know what the other element transposed by τ^σ is. Thus the two-cycle τ^σ
either does not overlap the two-cycle τ, or they overlap in a single element. We
must consider the two cases.
In the first case, we assume that the overlap is in a single element. We leave
as an exercise to show that the product of two transpositions that overlap in a
single element is a three-cycle. Thus we are done in this case.
In the second case, we have τ = (a b) and τ^σ = (c d) where {a, b} and {c, d}
are disjoint. But then there is a fifth element e that is not in {a, b, c, d}. There
is a λ ∈ H so that λ(a) = e. Now τ^λ is a transposition of the form (e ?) where
the unknown value “?” must be in {a, b, c, d} since there are only 5 values being
permuted by S5. Thus τ^λ must overlap exactly one of τ or τ^σ, and in a single
element. As in the previous case, we have two transpositions that we can
multiply to give us a three-cycle.
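The fact invoked twice above, and stated as an exercise below, can be checked exhaustively over S5 (an illustrative sketch; permutations are tuples on {0, ..., 4}):

```python
from itertools import combinations

def transposition(u, v, n=5):
    perm = list(range(n))
    perm[u], perm[v] = perm[v], perm[u]
    return tuple(perm)

def compose(f, g):
    # (f g)(i) = f(g(i))
    return tuple(f[g[i]] for i in range(len(g)))

def cycle_lengths(perm):
    """Sorted list of the cycle lengths > 1 of a permutation."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        x, length = start, 0
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        if length > 1:
            lengths.append(length)
    return sorted(lengths)

# two transpositions overlapping in exactly one element multiply to a 3-cycle
transpositions = [transposition(u, v) for u, v in combinations(range(5), 2)]
for t1 in transpositions:
    for t2 in transpositions:
        moved1 = {i for i in range(5) if t1[i] != i}
        moved2 = {i for i in range(5) if t2[i] != i}
        if len(moved1 & moved2) == 1:
            assert cycle_lengths(compose(t1, t2)) == [3]
```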
The next two lemmas will be given rather breezy proofs. You will be asked
to justify the steps in an exercise.
Lemma 9.3.3 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5}
and that contains at least one transposition, then H contains a four-cycle.
Proof. We know there is a three-cycle σ in H. Using the transitivity of H, we
can conjugate σ to another three-cycle that overlaps σ in either one or two
elements.
If the overlap is two elements, then there is a fifth element that is fixed by
both three-cycles. Using the transitivity of H, we can find a transposition that
moves this fifth element. The transposition must overlap one of the two
three-cycles in one element. The product of the three-cycle and the transposition
that overlaps in a single element is a four-cycle.
If the overlap of the two three-cycles is one element, then using the transitivity
of H, we can find a transposition that moves the common element. The
transposition must overlap one of the two three-cycles in one element. The
product of the three-cycle and the transposition that overlaps in a single element
is a four-cycle.
Lemma 9.3.4 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5}
and that contains at least one transposition, then H contains a five-cycle.
Proof. We know there is a four-cycle σ in H. Using the transitivity of H, we can
find a transposition that moves the element fixed by σ. The transposition must
overlap the four-cycle in one element. The product of σ and the transposition
that overlaps in a single element is a five-cycle.
Lemma 9.3.5 If H is a subgroup of S5 that acts transitively on {1, 2, 3, 4, 5}
and that contains at least one transposition, then H contains all transpositions.
Proof. Once again we sketch the proof and leave details as exercises.
We know there is a four-cycle in H that we choose to write as (b c d e), so
that a is the element it fixes. We can find a transposition (a ?) in H where the
unknown element is in {b, c, d, e}. Now conjugating this transposition by powers
of the four-cycle gives all transpositions in which one of the elements being
transposed is a.
This works as long as a is the fixed element of the four-cycle. But now we
can conjugate the four-cycle by powers of the five-cycle that must be in H to
change the fixed element of the four-cycle to anything we want. Thus we can
get all transpositions in S5 to be in H.
From a problem in Exercise Set (43), we know that every permutation can
be written as a product of transpositions. This fact and Lemma 9.3.5 prove
Proposition 9.3.1.
Exercises (44)
In these exercises, it will be important to remember that there are several ways
to write down the same cycle. In particular there are two ways to write down
a transposition: (a b) and (b a) denote the same transposition. There are three
ways to write down the same three-cycle, four ways to write down the same
four-cycle, and so forth.
1. Find a non-trivial subgroup of S3 that does not act transitively on {1, 2, 3}.
Find a subgroup of S3 that is not all of S3 that does act transitively on
{1, 2, 3}.
2. Show that if σ and τ are two-cycles that overlap in a single element, then
their product is a three-cycle.
3. This exercise should not be attempted until you feel that you are an expert
on the proof of Lemma 9.3.2. Fill in the details in the proof of Lemmas
9.3.3 and 9.3.4. In Lemma 9.3.3, how do we get the three-cycle that
overlaps σ in one or two elements? In the two cases, how do we argue that we
can find a transposition that overlaps one of the three cycles in a single
element? Show that the product of an n-cycle and a two-cycle that overlap
in one element is an (n + 1)-cycle.
4. This exercise should also not be attempted until you feel that you are an
expert on the proof of Lemma 9.3.2. Fill in the details in the proof of
Lemma 9.3.5. Be as thorough as the previous exercise.
5. Prove that if a subgroup H of S5 contains a transposition and a five-cycle,
then H = S5 .
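Exercise 5 (and Proposition 9.3.1 behind it) lends itself to a brute-force check: closing {(1 2), (1 2 3 4 5)} under composition yields all 120 elements of S5. A sketch (permutations are tuples on {0, ..., 4}; since the generated group is finite, closure under products alone already contains inverses and the identity):

```python
def compose(f, g):
    """(f g)(i) = f(g(i)); permutations of {0,...,4} as tuples."""
    return tuple(f[g[i]] for i in range(len(g)))

def generate(gens):
    """Close a finite set of permutations under composition, giving the
    subgroup they generate."""
    group, frontier = set(gens), list(gens)
    while frontier:
        p = frontier.pop()
        for q in list(group):
            for r in (compose(p, q), compose(q, p)):
                if r not in group:
                    group.add(r)
                    frontier.append(r)
    return group

transposition = (1, 0, 2, 3, 4)   # (1 2) written on {0,...,4}
five_cycle = (1, 2, 3, 4, 0)      # (1 2 3 4 5) written on {0,...,4}

assert len(generate([transposition, five_cycle])) == 120   # all of S5
```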
Part III
Field theory
Chapter 10
Field basics
10.1
Introductory remarks
In this part (Part III) we will cover Galois Theory. This theory tells, for a
given polynomial, when its roots can be expressed in terms of its coefficients
and some basic constants using only the five operations of addition, subtraction,
multiplication, division, and the extraction of n-th roots. Fields are structures
in which one can do the first four of these operations, and much of our effort
will be a study of fields.
If a root of a polynomial is calculated from the coefficients of the polynomial
by a string of the five operations, and the coefficients and basic constants are
all contained in a given field, then the four operations of addition, subtraction,
multiplication and division stay within that field. However, each time the taking
of an n-th root is required, a larger field might have to be brought in so that the
calculation can continue. For this reason, the study of pairs of fields, smaller
fields contained in larger fields, becomes important. Since the larger field is
thought of as being built from the smaller field by including new numbers, the
larger field is called an extension of the smaller.
It is important to know when an arbitrary extension can be accomplished by
the inclusion of new numbers that are n-th roots of numbers from the smaller
field. And because the calculation of a root of a polynomial might involve the
taking of several different n-th roots (with perhaps different values of n), it
is important to know when an extension field can be obtained from a smaller
field by a sequence of such inclusions. It is Galois’ main contribution that he
recognized that this could be detected by looking at certain groups of symmetries
(automorphisms) of the extension field.
Thus we have the first themes of our study: we will study certain symmetries
of certain pairs of fields, where the pair consists of a smaller field contained in
an extension field. Later we will study what the symmetries reveal about the
extension.
In order to derive information from symmetries, you have to know what the
symmetries are. In order to know what the symmetries of an object (such as a
field) are, you have to know some structure of the object. Thus we do not start
with symmetries, but instead start with a study of the structures of fields and
field extensions.
We will learn about the structures in two steps. First, we will look at some
basic facts that arise from the definitions and that apply to all fields. That will
be the subject of this chapter. Then we will see how certain specific fields and
field extensions are constructed. The details of the construction will reveal the
symmetries. The construction will be studied in a later chapter.
In between this chapter and the chapter on the construction of extensions
will be a chapter on polynomials. Galois Theory never strays very far from
discussions of polynomials. The subject is motivated by a search for roots of
polynomials, and the promised constructions are based heavily on the properties of polynomials and sets of polynomials. Indeed sets of polynomials are
valid objects of study in an algebra course since polynomials can be added, subtracted and multiplied (but not divided) to give new polynomials, and thus form
algebraic structures (rings) in their own right. We will see that the algebraic
structures of sets of polynomials have very familiar properties that will be easy
to exploit.
10.2
Review
The definition of a field, some examples, and some properties of fields that
can be derived immediately from the definitions are to be found in Chapter 2
(Sections 2.6 and 2.8) and in Chapter 3 (Section 3.4).
Important items from these sections include the following.
1. Examples of fields include the rationals Q, the real numbers R, the complex
numbers C, and Q[√2], which consists of all numbers of the form
a + b√2 where a and b are rationals. See Section 2.6.2 for the details
of the addition, subtraction, multiplication and division of elements of
Q[√2].
2. Each Zp with p a prime integer is a field. See Section 2.8.
3. The identities of a field are unique as are additive and multiplicative inverses. Further, for all x and y in a field, we have 0x = 0, (−x)y = −(xy)
and (−x)(−y) = xy. See Lemma 3.4.1.
4. If F ⊆ E is an extension of fields, then E is a vector space over F . The
dimension of this vector space is called the degree of E over F and is
denoted [E : F ]. See Section 3.4.3.
5. The intersection of subfields is a subfield. From this it follows that if
F ⊆ E is an extension of fields and S is a subset of E, then there is a
smallest subfield of E that contains F and S. This subfield is called the
extension of F by S in E. See Section 3.4.3.
6. A homomorphism between fields either takes all elements to 0 or is one-to-one. In the latter case, the image of the homomorphism is a subfield of
the range. See Section 3.4.4.
7. For a field E, the set Aut(E) of isomorphisms from E to itself is a group
under composition. If F ⊆ E is an extension of fields, then we let
Aut(E/F ) = {h ∈ Aut(E) | h(x) = x for all x ∈ F }.
With this definition Aut(E/F ) is a subgroup of Aut(E). See Section 3.4.5.
In the rest of this chapter we add to this list a few more basic facts and
concepts concerning fields.
Exercises (45)
1. This should probably have been included in Lemma 3.4.1. Prove that if
xy = 0 in a field F , then either x = 0 or y = 0. Conclude that if a ≠ 0
and b ≠ 0 in a field F , then ab ≠ 0.
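This exercise can be previewed numerically: in the fields Zp there are no zero divisors, while in Z6 (which is not a field) there are. An illustrative sketch:

```python
def zero_divisors(n):
    """Pairs of non-zero classes in Z_n whose product is zero."""
    return [(x, y) for x in range(1, n) for y in range(1, n)
            if (x * y) % n == 0]

assert zero_divisors(5) == []          # Z5 is a field: no zero divisors
assert zero_divisors(7) == []          # likewise Z7
assert (2, 3) in zero_divisors(6)      # 2 * 3 = 0 in Z6, which is not a field
```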
10.3
Fixed fields of automorphisms
Let E be a field, let θ be an automorphism of E and let Γ be a subgroup of
Aut(E). The fixed field Fix(θ) of θ is the set {x ∈ E | θ(x) = x}, and the fixed
field Fix(Γ) of Γ is
Fix(Γ) = {x ∈ E | θ(x) = x for every θ ∈ Γ} = ∩_{θ ∈ Γ} Fix(θ).
Exercises (46)
1. If E is a field, θ ∈ Aut(E) and Γ is a subgroup of Aut(E), then Fix(θ)
and Fix(Γ) are subfields of E.
2. If E is an extension of the field F and Γ = Aut(E/F ), then F ⊆ Fix(Γ).
In the second exercise above, you can try to prove that F = Fix(Γ), but you
should hopefully fail because it is false. Later we will see when equality occurs.
10.4
Automorphisms and polynomials
In the introductory remarks to this chapter, we point out that we are interested
in situations in which a field contains the coefficients of a polynomial, but the
roots are not contained in that field, but in a larger field. For several reasons, this
situation cooperates beautifully with the automorphisms discussed in Section
3.4.5.
Polynomials are put together from various ingredients using addition and
multiplication. Automorphisms cooperate well with addition and multiplication,
and so cooperate well with the structure of a polynomial. Further, a root of a
polynomial makes the value of a polynomial zero. An automorphism must take
zero to zero, so we get good cooperation between automorphisms and roots. We
now give details to these comments.
Let P (x) = an x^n + an−1 x^{n−1} + · · · + a1 x + a0 be a polynomial. Assume
further that all the ai are elements of a field F . Lastly assume that there is a
field extension E of F so that some r ∈ E is a root of P . That is P (r) = 0
in E. Let θ be an automorphism in Aut(E/F ). The reason for choosing θ in
Aut(E/F ) will be clear immediately.
Because of the properties of automorphisms, we have
θ(P (x)) = θ(an x^n + an−1 x^{n−1} + · · · + a1 x + a0)
= θ(an x^n) + θ(an−1 x^{n−1}) + · · · + θ(a1 x) + θ(a0)
= θ(an)θ(x^n) + θ(an−1)θ(x^{n−1}) + · · · + θ(a1)θ(x) + θ(a0)
= θ(an)(θ(x))^n + θ(an−1)(θ(x))^{n−1} + · · · + θ(a1)θ(x) + θ(a0)
= an(θ(x))^n + an−1(θ(x))^{n−1} + · · · + a1 θ(x) + a0
= P (θ(x))
for any x ∈ E. The next to last equality holds because θ fixes all elements of F
and all the ai are in F .
Now we note that P (r) = 0 so θ(P (r)) = θ(0) = 0. But θ(P (r)) = P (θ(r))
so P (θ(r)) = 0. This means that θ(r) is a root of P (x). We have shown the
following.
Proposition 10.4.1 Let F ⊆ E be fields, let P (x) be a polynomial with coefficients in F , let r ∈ E be a root of P (x), and let θ be in Aut(E/F ). Then θ(r)
is also a root of P (x).
We now have our first serious restriction on automorphisms. Under the
right conditions, automorphisms have to take roots of a polynomial to roots of
the same polynomial. In the next chapter we will see that there are a limited
number of roots for a given polynomial, so that this is a real restriction. Later
we will see conditions that imply that given two roots of a polynomial there
must actually be an automorphism that carries one root to the other root.
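Proposition 10.4.1 can be illustrated numerically with F = R, E = C, and θ = complex conjugation. The polynomial below is a hypothetical example chosen for convenience, not one from the text; its roots are −2 and 1 ± i, and conjugation indeed carries the root 1 + i to the root 1 − i:

```python
coeffs = [1, 0, -2, 4]               # P(x) = x^3 - 2x + 4, real coefficients

def P(x):
    # Horner evaluation of the polynomial at x
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

r = 1 + 1j                           # one complex root of P
assert abs(P(r)) < 1e-9              # r really is a root
assert abs(P(r.conjugate())) < 1e-9  # theta(r) = conjugate of r is again a root
```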
Exercises (47)
1. We consider the reals R and the complex numbers C, regarding C as an
extension of R. We also consider complex conjugation z ↦ z̄ taking C to
C, defined by the rule that the conjugate of a + bi is a − bi for real a and b.
The notation z̄ is not convenient for us, since it does not give complex
conjugation a symbol that looks like a function symbol, so we define
κ : C → C by κ(z) = z̄ to fill that role.
(a) Prove that κ is in Aut(C/R).
(b) Prove that R is the fixed field of κ.
(c) Prove that if θ is any element of Aut(C/R), then θ(i) is either i or
−i.
(d) Prove that Aut(C/R) has exactly two elements.
(e) Prove that if P (x) is a polynomial with real coefficients, and r is a
root of P (x) in C, then the conjugate r̄ = κ(r) is also a root of P (x).
Later, another exercise will use the exercise above to conclude that every
polynomial P (x) with real coefficients factors into a product of polynomials
with real coefficients in which each factor is of degree one or two.
10.5
On the degree of an extension
Recall (Section 3.4.3) that the degree [E : F ] of an extension F ⊆ E of fields is
the dimension of E as a vector space over F .
10.5.1
Comparing degree with index
The notation [E : F ] for the degree of the extension invites misunderstandings
since it resembles the notation used for the index of a subgroup in a group. If
H is a subgroup of G then [G : H] is the index of H in G. Both [G : H] and
[E : F ] relate to the sizes of the objects if they are finite, but not in the same
way.
Let us use |A| for the number of elements in a set A. For groups, we have
[G : H] = |G| / |H|, or equivalently |G| = |H| [G : H].
For a field extension F ⊆ E, let us work out the relation between [E : F ],
|E| and |F | under the assumption that all these numbers are finite. As with
groups, we refer to |E| as the order of the field E.
Let d = [E : F ]. Since d is the dimension of the vector space E over F , there
is a basis (x1 , x2 , . . . , xd ) of elements of E for this vector space. We know that
every element of E is a unique linear combination of the form
a1 x1 + a2 x2 + · · · + ad xd
where all the ai come from F . Since this linear combination for each element
of E is unique, different linear combinations give different elements of E. Thus
the function that takes each such linear combination to its value in E is one-to-one. But every element of E is such a linear combination. Thus this function
is also onto, and we have a one-to-one correspondence between the set of linear
combinations above and the elements of E.
We now count the number of such linear combinations. There are |F | choices
for a1 , there are |F | choices for a2 and so forth. Thus there are |F |^d choices
all together and we have
|E| = |F |^d = |F |^{[E:F ]}.
We see that the relationship between [E : F ], and |E| and |F | is very different
from the relationship between the corresponding quantities for groups.
Finite extensions of Zp
For each prime p, we know that Zp is a field. Thus there is a field of order p for
each prime p. If E is a finite field extension of Zp , then for d = [E : Zp ], we have
that |E| = p^d . Thus potentially, we have a field of order p^d for various primes p
and positive integers d. It turns out that these are the only possible orders for
finite fields. This will be shown shortly. It also turns out that all such orders of
fields do exist. This will not be proven but will be discussed in a later chapter.
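One small example beyond the Zp themselves can be exhibited directly (an illustrative sketch, not the later chapter's general construction): the 9-element set of expressions a + bx with a, b ∈ Z3, multiplied using the rule x^2 = −1. This assumes x^2 + 1 has no root in Z3, which is easy to check (the squares mod 3 are 0 and 1), and the code verifies that every non-zero element has a multiplicative inverse:

```python
p = 3
elements = [(u, v) for u in range(p) for v in range(p)]   # u + v*x

def add(s, t):
    return ((s[0] + t[0]) % p, (s[1] + t[1]) % p)

def mul(s, t):
    # (u + vx)(w + zx) = uw + (uz + vw)x + vz*x^2, with x^2 = -1
    u, v = s
    w, z = t
    return ((u * w - v * z) % p, (u * z + v * w) % p)

zero, one = (0, 0), (1, 0)

# every non-zero element has a multiplicative inverse, so this is a field
for s in elements:
    if s != zero:
        assert any(mul(s, t) == one for t in elements)

assert len(elements) == p ** 2        # a field of order 3^2 = 9
```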
10.5.2
Properties of the degree
Lemma 10.5.1 Let G ⊆ F ⊆ E be a sequence of field extensions so that both
[E : F ] = m and [F : G] = n are finite. Let (x1 , x2 , . . . , xm ) be a basis for E
over F and let (y1 , y2 , . . . , yn ) be a basis for F over G. Then the set
B = {xi yj | 1 ≤ i ≤ m, 1 ≤ j ≤ n}
is a basis for E over G.
Note that this proves that when [E : F ] and [F : G] are both finite, then
[E : G] is finite and [E : G] = [E : F ][F : G].
Proof. We must show that the elements of B are linearly independent over G
and span E. The span is the easier to deal with.
Let e be in E. We can get e as the linear combination
e = f1 x1 + f2 x2 + · · · + fm xm = Σ_{i=1}^{m} fi xi
with all the fi ∈ F because (x1 , x2 , . . . , xm ) is a basis for E over F . But every
fi is a linear combination
fi = gi1 y1 + gi2 y2 + · · · + gin yn = Σ_{j=1}^{n} gij yj
because (y1 , y2 , . . . , yn ) is a basis for F over G. Putting all this together lets us
write e as
e = Σ_{i=1}^{m} ( Σ_{j=1}^{n} gij yj ) xi
  = Σ_{i=1}^{m} Σ_{j=1}^{n} (gij yj) xi
  = Σ_{i=1}^{m} Σ_{j=1}^{n} gij (xi yj).
The first equality is gotten by plugging in the expression for each fi in terms of
the yj . The second equality is the distributive law used m times and the third
equality is the commutative law.
This shows that e is a linear combination of the elements xi yj using coefficients from G. Thus the set B spans all of E.
To show linear independence, we start with a linear combination that gives
zero. That is, we assume that there are gij ∈ G so that the sum of all the
gij xi yj gives zero. If we gather all the expressions with the same i together we
can write this sum as in the last line of the calculation above. We then get the
above calculation in reverse. Specifically
0 = Σ_{i=1}^{m} Σ_{j=1}^{n} gij (xi yj)
  = Σ_{i=1}^{m} Σ_{j=1}^{n} (gij yj) xi
  = Σ_{i=1}^{m} ( Σ_{j=1}^{n} gij yj ) xi.
But the xi are linearly independent. So each coefficient of xi must be zero in
the last expression giving
Σ_{j=1}^{n} gij yj = 0
for each i. But the yj are linearly independent. So each coefficient gij must be
zero. This proves the linear independence of the xi yj .
The proof of the next lemma is left as an exercise.
Lemma 10.5.2 Let F ⊆ E be a field extension. Then F = E if and only if
[E : F ] = 1.
Exercises (48)
1. Prove Lemma 10.5.2.
2. What is [C : R]? If it is finite, what is a basis?
3. Let T = Q[√2] as in Section 2.6.2. What is [T : Q]? What is a basis?
How can you argue that your answers are correct?
4. Let T = Q[∛2] from Problem 5 of Exercise set (24). What is [T : Q]?
What is a basis? How can you argue that your answers are correct?
10.6
The characteristic of a field
10.6.1
Definition and properties
There is an important number associated to each field. Let F be a field and look
at the structure of the abelian group formed by F and the addition operation.
In this group, the element 1 has an order c that is either a positive integer
or infinite. We will shortly give a set of exercises that will supply important
information about the number c.
Observe that since the group formed by F under addition is written additively,
we are considering sums of copies of 1 rather than products of copies of 1. We
are asking for the smallest number c so that the sum
1 + 1 + ···+ 1
(in which there are exactly c ones) is equal to zero in F . We could write c1 for
this sum, but that looks too much like multiplication in F in spite of the fact
that the c really comes from Z and the 1 comes from F .
We invent a notation for this situation. Let n be a positive integer. We
write n(1) for the sum of n ones in F . While it still looks like multiplication, it
also looks like a function which is a more accurate way to think about it. We
expand the meaning of this notation to say that n(x) is the sum of n copies of
x for any x ∈ F . That is
n(x) = x + x + · · · + x
in which there are exactly n copies of x.
Exercises (49)
1. The function n defined above is NOT a multiplicative homomorphism.
Show that
n(xy) = xn(y) = yn(x)
for any x and y in F . In particular n(x) = xn(1) for any x ∈ F .
2. The function n defined above IS an additive homomorphism. Show that
n(x + y) = n(x) + n(y)
for any x and y in F .
3. Show that for positive integers m and n and for x in a field F , that
(mn)(x) = m(n(x)).
4. For a field F let c be the smallest positive integer so that c(1) = 0. If c is
not prime so that c = ab with a and b positive integers both greater than
1, show that either a(1) = 0 or b(1) = 0 in F . Show that this implies that
c must be a prime. The number c as defined here is used in the rest of the
exercises in this set.
5. If c is finite, then show that c(x) = 0 for every x ∈ F .
6. If there is a positive integer n so that n(x) = 0 in F for some x ≠ 0 in F ,
then n(1) = 0 in F and the order of 1 in F is finite and no larger than n.
7. Argue from the previous exercises that if c is finite, then it is the order of
every non-zero element of F under addition.
8. If c is finite and n(x) = 0 for some x ≠ 0 in F , then c|n.
If F is a field, and the order of 1 in the group formed by F under addition
is finite, then we call this order the characteristic of F . If this order is infinite,
then we say that the characteristic of F is zero. Motivation for this last twist in
the definition is that the characteristic of F ends up being the smallest positive
integer n so that n(x) = 0 for all x ∈ F if such a positive integer exists, or the
only integer n (namely 0) so that n(x) = 0 for all x ∈ F if no such positive
integer exists.
We see from the exercises, that the characteristic of a field is always either
a prime or is zero, and if non-zero, it is the order of every non-zero element of
the field under addition.
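These conclusions can be sanity-checked in the fields Zp, where the "sum of k copies" map k(x) is just multiplication by k reduced mod p. An illustrative sketch:

```python
def n_of(k, x, m):
    """k(x) = x + x + ... + x (k copies), computed in Z_m."""
    return (k * x) % m

def characteristic(m):
    """The smallest c >= 1 with c(1) = 0 in Z_m."""
    c = 1
    while n_of(c, 1, m) != 0:
        c += 1
    return c

for p in (2, 3, 5, 7, 11):
    assert characteristic(p) == p          # char of the field Z_p is the prime p
    for x in range(1, p):
        # c is the additive order of every non-zero element of Z_p
        assert n_of(p, x, p) == 0
        assert all(n_of(k, x, p) != 0 for k in range(1, p))
```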
The behavior of fields of non-zero characteristic is rather different from that
of fields of characteristic zero. Because we are interested in the solution of
polynomials with real or complex coefficients, and because the reals and complex
numbers are fields of characteristic zero, we will spend more time with fields of
characteristic zero. However, fields of non-zero characteristic are interesting and
we will spend a little time with them.
10.6.2
A minimal field of each characteristic
The field Q has characteristic 0, and for each prime p, the field Zp has characteristic p. These are not just typical examples. They are the most fundamental
examples. The next theorem supports this claim.
Theorem 10.6.1 A field F has characteristic zero if and only if it has a subfield
isomorphic to Q. A field F has characteristic p for a prime p if and only if it
has a subfield isomorphic to Zp .
The proof of Theorem 10.6.1 in the case of characteristic zero is lengthy and
we will not give it here. It will be left as a fairly long but straightforward project
for the curious. For non-zero characteristic, we outline the argument and leave
the details as an exercise.
If F has a subfield isomorphic to Zp , then any non-zero element of this
subfield has order p. Thus the characteristic of F must be p.
If F has non-zero characteristic p (which we know must be a prime), then
we are trying to find a subfield of F isomorphic to Zp . We accomplish this by
finding a homomorphism from Zp to F that takes [1]p ∈ Zp to 1 ∈ F . From our
basic facts about field homomorphisms (Lemma 3.4.6), we know that this will
be one-to-one and its image will be the subfield that we want.
We will be strict with notation to make sure we are being careful with details.
For [n]p ∈ Zp , let h([n]p ) = n(1) where n(1) is the sum of n copies of 1 ∈ F .
Since this is a definition based on a formula that uses a representative of an
equivalence class, an exercise must be done to show that it is well defined.
The function h clearly takes [1]p to 1 ∈ F . What remains to show is that h
is a field homomorphism. This is left as an easy exercise.
Exercises (50)
1. Finish the proof of Theorem 10.6.1 in the case of non-zero characteristic.
That is, show that h is well defined and a field homomorphism.
2. (Optional) Prove Theorem 10.6.1 in the case of characteristic zero. This
must be done in stages. First show that there is a ring homomorphism
from Z to F that takes 1 ∈ Z to 1 ∈ F . This starts as an imitation of
the proof of Theorem 10.6.1 in the case of characteristic p to get the non-negative integers into F . Then one must decide how to get the negative
integers into F , and lastly how to get the rational numbers into F . Each
stage must be checked for well definedness (if necessary) and the properties
of a homomorphism. The one-to-one property can be ignored until the end
when all that is needed is to show that there is at least one element of Q
that does not go to zero.
10.6.3 Consequences of Theorem 10.6.1
Let F be a finite field. The order of 1 in F under addition must be finite. So F
has characteristic p for some prime p. From Theorem 10.6.1, we know that F
is an extension of Zp . Since |F | is finite, it must have a finite basis over Zp and
thus have some finite dimension d = [F : Zp ]. From the discussion in Section
10.5.1 we get |F | = pd . We have shown the following.
Theorem 10.6.2 The order of any finite field is a power of a prime number.
This gives one half of the facts mentioned at the end of Section 10.5.1. The
other half says that every power of a prime is the order of some finite field. This
will be discussed to some extent in a later chapter.
Chapter 11
Polynomials
11.1 Motivation: the construction of Zp from Z
To construct the field Zp from Z when p is prime, we first build an equivalence
relation on Z, and then build a structure on the resulting set of equivalence
classes. Building the operations of addition, multiplication and negation is easy,
and the proof that they are well defined is straightforward. The required laws,
commutative, associative, distributive, follow from the fact that these laws hold
in Z. The only subtle part of the construction is finding multiplicative inverses.
The argument that multiplicative inverses exist is lengthy and is discussed in
outline form in Section 2.7.1 with details given in Sections 2.7.3, 2.7.4 and 2.8.
The outline visits many fundamental properties of the integers: the existence of
a division algorithm, the existence of greatest common divisors, the notion of a
prime integer, and the notion of relatively prime integers.
One of the goals of this chapter is to prove that these properties and results
concerning the integers are also shared by collections of polynomials. This will
be used in the next chapter where we show how certain field extensions can be
constructed from collections of polynomials in much the same way that Zp is
constructed from Z. In particular, finding multiplicative inverses will follow the
same outline that was used to find multiplicative inverses in Zp .
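For comparison when we reach the polynomial case, here is a brief sketch of how that outline produces inverses in Zp in practice: the extended Euclidean algorithm writes the greatest common divisor as an integer combination, and reading the combination mod p yields the inverse. The helper names are ours, and the code assumes p is prime.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(a, p):
    """Multiplicative inverse of [a]_p in Z_p; requires gcd(a, p) == 1."""
    g, x, _ = extended_gcd(a % p, p)
    assert g == 1, "a must be relatively prime to p"
    return x % p

# In Z_7 every non-zero class has an inverse.
for a in range(1, 7):
    assert (a * inverse_mod(a, 7)) % 7 == 1
```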
The outline in Section 2.7.1 also leads to other results about the integers,
and the details are given in Sections 2.7.5, 2.7.6 and 2.7.7. These results are
the Fundamental Theorem of Arithmetic, Euclid's Theorem about primes, and
the uniqueness of factorization of integers into primes. These results will get
us to the second goal of this chapter which is to prove basic facts about roots
of polynomials. Deriving these facts will use results about polynomials that
parallel the results just mentioned about the integers.
The facts that we gather from the details of the constructions of the extensions will allow us to say that certain automorphisms must exist. The facts
that we gather about roots of polynomials will combine with Proposition 10.4.1
which says that certain automorphisms must take roots to roots. This will establish restrictions on how many automorphisms can exist. Between the two we
will later get a complete understanding of certain groups of automorphisms.
11.2 Rings
The integers have all the properties of a field except for multiplicative inverses.
Various structures that have two operations and a distributive law of one over
the other, but that lack some of the properties of a field are called rings.
11.2.1 Ring definitions
The following duplicates the definition in Section 3.3. Sections 3.3.2 and 3.3.4
should be reviewed. Lemma 3.3.4 is particularly important to remember.
A ring is a set with two operations, usually called addition and multiplication, that satisfy some of the axioms of a field, but not all. Specifically, a set R
with an addition (written a + b), a negation (taking a to −a), a multiplication
(written ab), and an element 0 is a ring if it satisfies the following.
1. The set R and the addition form a commutative group with identity element 0.
2. The multiplication is associative in that a(bc) = (ab)c for all a, b and c in
R.
3. Multiplication distributes over addition in that a(b + c) = ab + ac and
(b + c)a = ba + ca for all a, b and c in R.
Some comments are needed. The multiplication need not be commutative.
This is why two distributive laws are needed. The multiplication need not have
an identity. These gaps can be filled by adding more words.
If R has an element 1 so that for all a ∈ R, we have a1 = a = 1a, then we
say that R is a ring with identity¹ or ring with 1 or ring with unit. The even
integers form a ring without identity.
If a ring R satisfies ab = ba for all a and b in R, then R is called a commutative
ring. A commutative ring with identity fails to be a field only in that it lacks
multiplicative inverses. The integers form a commutative ring with identity.
It turns out that polynomials (whose coefficients come from a field) also
form a commutative ring with identity. Since this is the only ring other than
the integers that we will deal with, we will assume from now on that all our
rings are commutative with identity.
There is terminology for rings with various combinations of properties, but
we will leave such terminology alone so as not to introduce too many new words.
As noted, the integers with the usual addition and multiplication form a
commutative ring with identity. However, the integers also have special properties not covered by these terms. In Z, if a ≠ 0 and b ≠ 0, then ab ≠ 0. A
¹ Some books choose not to deal with rings without a multiplicative identity and so their
definition of a ring coincides with our definition of a ring with identity.
commutative ring R for which a ≠ 0 and b ≠ 0 always implies that ab ≠ 0 is
called an integral domain. Thus Z is an integral domain with identity.
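A brute-force illustration of the definition (ours, not the text's): Z5, with p prime, has no zero divisors, while Z6 does, so Z6 is not an integral domain.

```python
def zero_divisors(n):
    """Non-zero pairs (a, b) in Z_n with ab = 0, i.e. failures of the
    integral domain property."""
    return [(a, b) for a in range(1, n) for b in range(1, n)
            if (a * b) % n == 0]

assert zero_divisors(5) == []       # Z_5 is an integral domain (5 is prime)
assert (2, 3) in zero_divisors(6)   # in Z_6, 2 and 3 are non-zero but 2*3 = 0
```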
Even further, Z has a division algorithm. There is a definition associated
with this fact, but it is stated in much greater generality than we need and so
we will skip the technical definition. Instead we will just prove the specific fact
that polynomials have a division algorithm and work with that.
Recall that a ring homomorphism is a function h : R1 → R2 , where R1
and R2 are rings, so that h(a + b) = h(a) + h(b) and h(ab) = h(a)h(b) for
all a and b in R1 . As usual, it is easy to prove that such an h preserves 0
and 1 and additive inverses. In comparison with field homomorphisms, which
by Lemma 3.4.6 can only be trivial or one-to-one, we will eventually see that
ring homomorphisms are much more flexible. Since the main difference between
rings and fields is the lack of multiplicative inverses in rings, you should check
to see how multiplicative inverses are crucial to the proof of Lemma 3.4.6.
We next turn to polynomials and show that they share many properties with
the integers.
Exercises (51)
1. Show that an integral domain has the cancellative property. That is, if
pq = pr and p 6= 0, then q = r. Hint: consider pq − pr. Show also that
a commutative ring with the cancellative property is an integral domain.
Thus for commutative rings, being an integral domain is equivalent to
having the cancellative property.
11.3 Polynomials
11.3.1 Introductory remarks on polynomials
Polynomials can be added, negated and multiplied to give other polynomials.
Thus it seems likely that we can make a ring out of polynomials. However, polynomials are more complicated than integers, and the discussion of polynomials
is correspondingly more complicated.
We must first deal with the question of what a polynomial is. A typical
polynomial looks like
\[ P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x^1 + a_0 x^0. \]
This is an expression of a very recognizable form. Usually, the letter x is assumed
to represent some variable and the ai are assumed to represent some constants.
This makes a polynomial a function whose value depends on the variable x.
Thus we can think of a polynomial in two ways: an expression of a certain
form, or a function given by a formula in that form.
If we regard a polynomial as a function, then it is clear that two expressions
can give the same function. Both 0x^2 + 3x + 2 and 3x + 2 specify the same
function, but they look different. The first form may not look all that necessary,
but it is useful when giving a simple formula that tells how to add 3x + 2 to
5x^2 − x − 7.
One can also wonder if there are other ways that different looking polynomials can specify the same function. We will avoid such questions for now and
start by treating polynomials as expressions and ignore until later the fact that
they specify functions.
We now get down to specifics.
11.3.2 Polynomial basics
Definition of a polynomial
We know that polynomials have constants (coefficients) and variables. We need
to specify where the coefficients come from. To make a ring, we only need to add,
negate and multiply and it is easy to see that we only need to add, negate and
multiply the coefficients to do that. Thus we can make a ring of polynomials if
we choose to take the coefficients from a ring. However, we want more than just
a ring. We want a ring that imitates the properties of the integers. One of the
properties is that a division algorithm exists. We will see that the easiest way
to get a good division algorithm for polynomials is to insist that the coefficients
come from a field. This restriction fits with our intended uses of polynomials,
so it will never be seen as confining. We will see shortly how this restriction
gets used.
So to define a polynomial, we start with a field F . We give a first definition
by saying that a polynomial P (x) over F is an expression of the form
\[ P(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x^1 + a_0 x^0 \tag{11.1} \]
where all the ai come from F . We call this a first definition since we will give
a second definition shortly that is better suited to our purposes.
Usually, we replace x^0 by 1 in (11.1), but that breaks the pattern of the
decreasing exponents. Exploiting that pattern lets us sometimes write (11.1) in
the convenient form
\[ P(x) = \sum_{i=0}^{n} a_i x^i. \tag{11.2} \]
We must deal with the fact that 0x^2 + 3x + 2 deserves to be thought of as the same polynomial as 3x + 2. There are two ways to handle this. One is to modify the definition so that a polynomial is an expression of the form
\[ P(x) = \sum_{i=0}^{\infty} a_i x^i \tag{11.3} \]
with the extra provision that all but finitely many of the ai are zero. The second is to simply declare that certain expressions as given in (11.1) and (11.2) are equivalent. The second option would create a need to check that our defined equivalence is an equivalence relation, and would then lead to future well-definedness problems that would have to be checked.
We thus give our second and final definition to say that a polynomial over
F is an expression as given in (11.3) so that each ai is in F and where there is
a positive integer M for which i > M implies that ai = 0. Note that M need
not be the smallest such positive integer, so the condition does not mean that
aM is not zero.
Alternate notations
However, expressions such as (11.1) and (11.2) are both familiar and useful. So
we will continue to use them and take each to mean the same as (11.3) where
all ai = 0 for i > n. Again, this does not mean that a_n ≠ 0.
We can go farther and say that any omitted term in a polynomial implies that the omitted coefficient is zero. So x^4 + 2x is the polynomial \(\sum_{i=0}^{\infty} a_i x^i\) in which a_4 = 1, a_1 = 2 and all other a_i are zero.
We can go even further still and allow a_0 x^0 to be represented by a_0.
Combining these simplifications in notation lets us use an element c of the
field F to represent a polynomial over F in which the coefficient of x^0 is c and all other coefficients are zero. Viewed as a function, the polynomial c is just the constant function to c. We call such a polynomial (one whose only non-zero coefficient is that of x^0) a constant polynomial since it is a constant when viewed as a function of x.
Two very useful polynomials that use this notation are the constant polynomial 0 in which all the coefficients are zero, and the constant polynomial 1 in
which the coefficient of x^0 is 1 and all other coefficients are zero.
The polynomials 0 and 1 are special cases of monomials. A monomial is a polynomial in which at most one coefficient is non-zero. We say “at most” instead of “exactly” so that we include 0 among the monomials.
Having every c in F represent not only an element of F , but also a polynomial
over F makes ambiguous statements possible. However, this tends not to be a
serious problem.
Polynomial operations
The sum, negative and product are now easy to define. If
\[ P(x) = \sum_{i=0}^{\infty} a_i x^i \qquad\text{and}\qquad Q(x) = \sum_{i=0}^{\infty} b_i x^i \]
are polynomials, then
\[ (P + Q)(x) = \sum_{i=0}^{\infty} (a_i + b_i) x^i, \qquad (-P)(x) = \sum_{i=0}^{\infty} (-a_i) x^i, \]
and
\[ (P Q)(x) = \sum_{i=0}^{\infty} \sum_{j=0}^{i} (a_j b_{i-j}) x^i. \tag{11.4} \]
Note that the statement that P (x) and Q(x) are polynomials carries with it the assumption that all but finitely many of their coefficients are zero. It must
the assumption that all but finitely many of their coefficients is zero. It must
then be proven that this holds for our definitions of P + Q, −P and P Q. That
is, it must be proven that these are polynomials. In fact we have the following
whose proof is left as an exercise.
Lemma 11.3.1 If P (x) and Q(x) are polynomials over a field F , then so are
(P + Q)(x), −P (x) and (P Q)(x).
Note that the definition of P Q in (11.4) agrees with the usual instructions
that one is taught about multiplying polynomials. When the term a_j x^j from P (x) is multiplied by the term b_{i−j} x^{i−j} from Q(x), then a_j b_{i−j} x^i is contributed to (P Q)(x) and is a summand of the ultimate expression involving x^i in (P Q)(x).
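The formulas in (11.4) translate directly into code. The following sketch is an illustration of ours, representing a polynomial by a finite list of coefficients a0, a1, . . . in increasing degree (the helper names are our own choices):

```python
def poly_add(P, Q):
    """(P + Q): add coefficients place by place."""
    n = max(len(P), len(Q))
    P = P + [0] * (n - len(P))
    Q = Q + [0] * (n - len(Q))
    return [a + b for a, b in zip(P, Q)]

def poly_neg(P):
    """(-P): negate every coefficient."""
    return [-a for a in P]

def poly_mul(P, Q):
    """(PQ): the coefficient of x^i is the sum over j of a_j * b_{i-j},
    exactly as in (11.4)."""
    R = [0] * (len(P) + len(Q) - 1)
    for i, a in enumerate(P):
        for j, b in enumerate(Q):
            R[i + j] += a * b
    return R

# (2x + 4)(6x - 5) = 12x^2 + 14x - 20, coefficients listed lowest degree first
assert poly_mul([4, 2], [-5, 6]) == [-20, 14, 12]
assert poly_add([1, 2], [3]) == [4, 2]
assert poly_neg([1, -2]) == [-1, 2]
```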
However, the definitions carry more power than just this illustration of familiarity. We give two more lemmas about the nature of the definitions in (11.4).
Lemma 11.3.2 The polynomials over a field F form a commutative ring with
identity. The polynomial 0 is the additive identity and the polynomial 1 is the
multiplicative identity.
The proof of Lemma 11.3.2 is tedious with the biggest culprits being the
proofs of associativity of multiplication and the distributivity of multiplication
over addition. We will accept the truth of this lemma for now and leave its
proof as an optional exercise.
We will use F [x] to denote the set of all polynomials over the field F with
the ring structure guaranteed by Lemma 11.3.2.
The next lemma discusses polynomials as functions. If \(P(x) = \sum_{i=0}^{\infty} a_i x^i\) is a polynomial over a field F, then for a given z ∈ F, we can discuss the meaning of P(z). Taking P(z) to mean \(\sum_{i=0}^{\infty} a_i z^i\) has us adding up elements of F since each a_i is in F and z is in F. However, we are formally adding infinitely many values of F. But since all but finitely many of the a_i are zero, Lemma 3.4.1 says that all but finitely many of the a_i z^i are zero. Thus we are only adding up finitely many non-zero elements of F and the result is a specific element of F. Thus for each z ∈ F, we get an element P(z) in F, and we see that P(x) is a function from F to F.
Let us look at some small examples. Let
P (x) = 2x + 4, and Q(x) = 6x − 5.
We have
P (3) = 10, and Q(3) = 13, so P (3)Q(3) = 130.
But we can also write
P (3) = 2 · 3 + 4, and Q(3) = 6 · 3 − 5
so
P (3)Q(3) = 12 · 3^2 + 14 · 3 − 20 = 108 + 42 − 20 = 130.
Of course in the latter calculation we were just using 3 instead of x in the
multiplication
P (x)Q(x) = (2x + 4)(6x − 5) = 12x^2 + 14x − 20.
Essentially, we gave an example that shows that the rules for multiplying polynomials work no matter what value is given to the variable. This also works with
addition and negation. This can all be formalized into a very simple statement.
Take one value z ∈ F . Now each polynomial P (x) over F gives a value P (z).
This gives a function from F [x] to F . We refer to this function as the evaluation
function at z. Let us denote this function as vz (think of “value at z”), so that
we have vz (P (x)) = P (z). Note that taking z to be a different element of F
gives a different function. As an example, v0 takes each polynomial P (x) to the
element a_0 in F where a_0 is the coefficient of x^0 in P (x). What can you say
about v1 ?
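The evaluation function is easy to experiment with. A minimal sketch of ours, using the convention that a polynomial is a list of coefficients in increasing degree:

```python
def evaluate(P, z):
    """v_z(P): the finite sum of a_i * z^i over the coefficients of P."""
    return sum(a * z**i for i, a in enumerate(P))

P = [4, 2]    # P(x) = 2x + 4
Q = [-5, 6]   # Q(x) = 6x - 5

assert evaluate(P, 3) == 10 and evaluate(Q, 3) == 13
# v_3 respects multiplication: evaluating 12x^2 + 14x - 20 at 3 gives the
# product of the values, 130.
assert evaluate([-20, 14, 12], 3) == evaluate(P, 3) * evaluate(Q, 3) == 130
# v_0 picks out the coefficient of x^0; v_1 sums all the coefficients.
assert evaluate(P, 0) == 4 and evaluate(P, 1) == 6
```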
The next lemma uses the notion of a ring homomorphism. Not every ring is
a field, but every field is a ring. So the notion of a ring homomorphism from a
ring to a field makes sense.
Lemma 11.3.3 Let F be a field and let z be a given element of F . Let vz :
F [x] → F be the evaluation function at z from the ring F [x] of all polynomials
over F to F . Then vz is a ring homomorphism.
As with Lemma 11.3.2, the proof of Lemma 11.3.3 can be painful. Discussion
of its proof will occur later. Because of Lemma 11.3.3, the evaluation function
vz is usually called the evaluation homomorphism.
The evaluation homomorphism at 0 is rather special.
Lemma 11.3.4 Let F be a field and let C be the collection of constant polynomials in F [x]. Then C is a subring of F [x] and the evaluation homomorphism v0
at 0 restricted to C is an isomorphism from C to F .
Essentially, Lemma 11.3.4 lets us view the constant polynomials as a copy
of F living inside F [x].
Exercises (52)
1. Prove Lemma 11.3.1.
2. (optional) Prove Lemma 11.3.2. This breaks into many checks. Some
(such as commutativity of addition and multiplication, and the associativity of addition) are quite easy. More difficult are the associativity of
multiplication and the distributive law, but these are less terrible than you
might think. They require careful manipulation of summation indexes.
3. Prove Lemma 11.3.4.
11.3.3 Degree
In the polynomial \(P(x) = \sum_{i=0}^{\infty} a_i x^i\), each a_i x^i is a term of P(x). The degree of this term is i. If P(x) ≠ 0, then the degree of P(x) is the largest i so that a_i ≠ 0. That is, the degree of P(x) is the highest degree of a term of P(x) with non-zero coefficient. We write deg(P(x)) for the degree of P(x).
We have the following very simple observation about degrees and multiplication.
Lemma 11.3.5 If P (x) ≠ 0 and Q(x) ≠ 0 are polynomials over a field F , then
deg((P Q)(x)) = deg(P (x)) + deg(Q(x)).
Proof. Let \(P(x) = \sum_{i=0}^{\infty} a_i x^i\), let \(Q(x) = \sum_{i=0}^{\infty} b_i x^i\), let m = deg(P(x)), and let n = deg(Q(x)). Then the coefficient of x^{m+n} in (P Q)(x) is
\[ \sum_{j=0}^{m+n} a_j b_{m+n-j}. \]
One of the terms in this sum is a_m b_n which is not zero since a_m ≠ 0 and b_n ≠ 0. All other terms in this sum either have j > m implying that a_j = 0, or j < m implying that m + n − j > n so b_{m+n−j} = 0. Thus the coefficient of x^{m+n} in (P Q)(x) is the non-zero quantity a_m b_n added to a bunch of zeros.
Now for k > m + n, we have that the coefficient of x^k in (P Q)(x) is
\[ \sum_{j=0}^{k} a_j b_{k-j}. \]
For j > m, we have a_j = 0. For j ≤ m, we have k − j ≥ k − m > m + n − m = n and b_{k−j} = 0. Thus for every x^k with k > m + n, the coefficient of x^k in (P Q)(x) is zero.
The degree of the polynomial 0 is a problem. Since 0P = 0 for any polynomial, we want the degree of 0 plus any other degree to always be the degree of 0
if we wish to have degree cooperate with Lemma 11.3.5 even for the polynomial
0. We do this by declaring² that the degree of 0 is −∞ and invent the rule that
−∞ + m = −∞ for any integer m and −∞ + −∞ = −∞. We do not allow
polynomials of degree +∞, so we never have to deal with the sum of −∞ and
+∞.
We can now extend Lemma 11.3.5.
Proposition 11.3.6 If P (x) and Q(x) are polynomials over a field F , then
deg((P Q)(x)) = deg(P (x)) + deg(Q(x)).
² Some books simply leave the degree of 0 undefined. Our convention has other problems
which we will ignore.
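The −∞ convention is convenient in computations too, since floating-point -inf already obeys the invented rule. A small sketch of ours, using the coefficient-list convention (lowest degree first):

```python
NEG_INF = float('-inf')  # our stand-in for the degree of the zero polynomial

def degree(P):
    """Degree of the coefficient list [a0, a1, ...]; -infinity if all zero."""
    d = NEG_INF
    for i, a in enumerate(P):
        if a != 0:
            d = i
    return d

assert degree([0, 0]) == NEG_INF
assert degree([2, 3, 0]) == 1          # trailing zero coefficients ignored
# deg(PQ) = deg(P) + deg(Q): (2x + 4)(6x - 5) = 12x^2 + 14x - 20
assert degree([-20, 14, 12]) == degree([4, 2]) + degree([-5, 6])
# the rule -inf + m = -inf comes for free with floats
assert NEG_INF + 5 == NEG_INF
```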
We introduce terms that arise naturally at this point. If a polynomial P (x) is
not zero, then its leading coefficient is the coefficient of x^d where d = deg(P (x)). If P (x) = 0, then we say that its leading coefficient is zero. Note that the zero
polynomial is the only polynomial with leading coefficient equal to zero.
We say that a polynomial is monic if its leading coefficient is 1.
We can use degree to redefine constant polynomials as those polynomials of
degree no more than zero. Polynomials of degree exactly one are called linear
polynomials.
We declare that −∞ < m for any integer m. This cooperates with induction
arguments. Note that with the exception of the polynomial 0, all degrees of
polynomials are non-negative. Thus the set D of degrees of polynomials has the
non-negative integers and −∞. If S is a non-empty subset of D, then either −∞ ∈ S or −∞ ∉ S. In the first case, −∞ is the least element of S and in the second case, S has a least element since it is a non-empty subset of the non-negative integers. We will use this in the next section.
Exercises (53)
1. Show that if polynomials P (x) and Q(x) have degrees m and n respectively, then the degree of (P + Q)(x) is no larger than max{m, n}.
11.4 The division algorithm for polynomials
Theorem 11.4.1 Let P (x) ≠ 0 and S(x) be polynomials over a field F . Then
there are unique polynomials Q(x) and R(x) so that S(x) = (P Q)(x) + R(x)
and deg(R(x)) < deg(P (x)).
Proof. What follows is an extremely slight modification of the proof of the division algorithm for integers (Proposition 2.7.3).
Let d = deg(P (x)). Note that d ≥ 0.
Let A be the set of degrees of all polynomials of the form S(x) − (P Q)(x)
as Q(x) runs over all polynomials over F . The set A is certainly non-empty
and so has a least element r (which might be −∞). For this r let Q(x) be
some polynomial over F so that deg(S(x) − (P Q)(x)) = r. We do not yet
know that there is only one such Q(x). Let R(x) = S(x) − (P Q)(x). We have
deg(R(x)) = r and we wish to show that r < d.
Assume that r ≥ deg(P (x)). This is the point at which the details of the
proof vary from the details of the proof of Proposition 2.7.3. Let a_r be the coefficient of x^r in R(x) and let b_d be the coefficient of x^d in P (x). We know that neither a_r nor b_d is zero. Since r ≥ d, we have that the monomial M(x) = (a_r/b_d) x^{r−d} is a polynomial over F . We consider the polynomial
D(x) = R(x) − (P M )(x)
= (S(x) − (P Q)(x)) − (P M )(x)
= S(x) − (P (Q + M ))(x).
The degree of (P M )(x) is r as is the degree of R(x), so the degree of D(x) is
no more than r. The coefficient of x^r in D(x) is zero, so the degree of D(x) is
less than r.
However, writing D(x) as S(x) − (P (Q + M ))(x) shows that deg(D(x)) is
in A. This contradicts the fact that r was chosen to be the least element of A
and we have shown that we must have r < d.
Now suppose that (P Q1 )(x) + R1 (x) = S(x) = (P Q2 )(x) + R2 (x) with
deg(R1 (x)) < d and deg(R2 (x)) < d. But we have
(P (Q1 − Q2 ))(x) = (R2 − R1 )(x)
and the degree of the left side is at least d if (Q1 − Q2 )(x) ≠ 0 and the degree of
the right side is strictly less than d. Thus (Q1 − Q2 )(x) = 0 and Q1 (x) = Q2 (x).
Now (R2 − R1 )(x) = 0 and R2 = R1 .
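The proof's repeated subtraction of multiples of P (x) is exactly the long division one learns by hand. Here is a sketch of the algorithm as code (our own helper, with rational coefficients standing in for a general field F; lists are lowest degree first):

```python
from fractions import Fraction

def poly_divmod(S, P):
    """Return (Q, R) with S = P*Q + R and deg(R) < deg(P); P must be non-zero."""
    P = [Fraction(a) for a in P]
    R = [Fraction(a) for a in S]
    d = max(i for i, a in enumerate(P) if a != 0)   # deg(P)
    Q = [Fraction(0)] * max(len(R) - d, 1)
    while True:
        top = max((i for i, a in enumerate(R) if a != 0), default=-1)
        if top < d:
            break
        # subtract (a_top / b_d) x^(top - d) * P, killing the top term of R
        m = R[top] / P[d]
        Q[top - d] += m
        for i in range(d + 1):
            R[top - d + i] -= m * P[i]
    return Q, R

# x^2 + x - 2 divided by x - 1 gives quotient x + 2 and remainder 0.
Q, R = poly_divmod([-2, 1, 1], [-1, 1])
assert Q[:2] == [2, 1] and all(a == 0 for a in R)
```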
Exercises (54)
1. Where in the proof of Theorem 11.4.1 is the fact that F is a field used?
2. (optional) There is a more complicated statement of a division algorithm
for polynomials with coefficients in the integers. Find and prove such a
statement. This is an example of dealing with the ring of polynomials
whose coefficients come from a ring.
11.4.1 Roots and linear factors
The division algorithm gives us the usual correspondence between roots of a
polynomial and its linear factors.
Lemma 11.4.2 Let P (x) be a polynomial over a field F . Then r ∈ F is a root
of P (x) if and only if x − r divides P (x).
Proof. If x − r divides P (x), then P (x) = (x − r)A(x) and P (r) = 0.
Now assume P (r) = 0. The degree of x − r is 1. From Theorem 11.4.1
we know there are unique Q(x) and R(x) so that deg(R(x)) < 1 and P (x) =
(x − r)Q(x) + R(x). Since deg(R(x)) is either 0 or −∞, we know that R(x) is
a constant c. Thus P (x) = (x − r)Q(x) + c, and 0 = P (r) = c.
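The correspondence can be computed by synthetic division: dividing by x − r leaves a constant remainder, and the proof shows that remainder is P (r). A sketch of ours illustrating the lemma (coefficient lists, lowest degree first):

```python
def divide_by_linear(P, r):
    """Write P(x) = (x - r) Q(x) + c.
    Returns (Q, c); c equals P(r), so r is a root iff c == 0."""
    n = len(P) - 1
    Q = [0] * n
    b = P[n]
    for i in range(n - 1, -1, -1):   # classic synthetic division recurrence
        Q[i] = b
        b = P[i] + r * b
    return Q, b

# P(x) = x^2 + x - 2 has root 1, and indeed P(x) = (x - 1)(x + 2).
Q, c = divide_by_linear([-2, 1, 1], 1)
assert c == 0 and Q == [2, 1]
# 2 is not a root: the remainder is P(2) = 4.
_, c = divide_by_linear([-2, 1, 1], 2)
assert c == 4
```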
11.5 Greatest common divisors and consequences
11.5.1 Divisors and units
In Section 2.7.4, we defined “a” greatest common divisor of two integers m and
n, not both zero, to be a common divisor g of m and n so that every other
common divisor of m and n divides g. We then observed that most pairs of
integers have two greatest common divisors, one positive and one negative. For
example, both −6 and 6 are greatest common divisors of 12 and 18. We then
adopted the convention that the notation (m, n) would refer to the non-negative
greatest common divisor of m and n.
We face a similar but larger problem with polynomials. The problem in the
integers stems from the fact that both −1 and 1 have multiplicative inverses in
Z. In F [x], all the non-zero constant polynomials have multiplicative inverses.
(Check it out.)
Consider common divisors in R[x] of x^2 − 1 and x^2 + x − 2. One sees quickly that x − 1 divides both since (x^2 − 1)/(x − 1) = x + 1 and (x^2 + x − 2)/(x − 1) = x + 2. But 2x − 2 also divides both x^2 − 1 and x^2 + x − 2 since (x^2 − 1)/(2x − 2) = (1/2)x + (1/2) and (x^2 + x − 2)/(2x − 2) = (1/2)x + 1. In fact so does 3x − 3, as does 0.1x − 0.1,
and so on. However, these common divisors all divide each other so they are all
candidates for greatest common divisor. While there are usually two choices for
a greatest common divisor in the integers there are infinitely many candidates
in the world of polynomials. We choose one of several ways to get around this
and take time to discuss the solution before we prove that greatest common
divisors exist.
We first make a definition. In a ring R with 1, we say that u ∈ R is a unit
if u has a multiplicative inverse. Note that the element 1 has to exist in R in
order to have this discussion. Note also that if u is a unit, so is u−1 . Note also
that 1 is a unit in any ring with identity.
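As a concrete illustration (ours, not the text's): computing the units of Zn by brute force shows they are exactly the classes relatively prime to n, and that the inverse of a unit is again a unit.

```python
from math import gcd

def units(n):
    """The units of the ring Z_n: classes with a multiplicative inverse."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]

# The units of Z_n are exactly the classes relatively prime to n.
for n in [6, 8, 12]:
    assert units(n) == [a for a in range(1, n) if gcd(a, n) == 1]

# If u is a unit, so is u^{-1}.
u_list = units(12)
for u in u_list:
    inv = next(b for b in range(1, 12) if (u * b) % 12 == 1)
    assert inv in u_list
```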
This use of the word “unit” seems to overlap with the phrase “ring with
unit.” However, if a ring has a unit with the meaning just defined, then it must
have an element 1 in order for the word unit to make sense. So “ring with unit”
still means that an identity must exist.
We need other notions that we used with integers. Let a and b be elements
of a ring R. We say a divides b if there is a q ∈ R so that b = aq. As usual, we
write a|b if a divides b. We imitate the definition of greatest common divisor
and say that if a and b are in R, not both zero, then a greatest common divisor
g of a and b is a common divisor of a and b so that if h is any common divisor
of a and b, then h|g.
The following lemma ties together all the notions that we need.
Lemma 11.5.1 Let R be an integral domain with 1.
1. The set of units in R forms a group under multiplication.
2. If a ∈ R and u is a unit in R, then u|a.
3. If a|b in R, and u is a unit in R, then (au)|b.
4. If a|b and b|a in R are not both zero, then neither is zero and for some
units u and v in R with uv = 1, we have a = bu and b = va.
5. If g is a greatest common divisor of a and b in R and u is a unit in R,
then gu is a greatest common divisor of a and b in R.
6. If g1 and g2 are two greatest common divisors of a and b in R, then for
some units u and v in R with uv = 1, we have g1 = ug2 and g2 = vg1 .
7. If F is a field, then the units of F [x] are exactly the non-zero constant
polynomials.
8. If P (x) is a non-zero polynomial in F [x] for a field F , then there is a
unique unit u in F [x] and unique monic polynomial Q(x) so that P (x) =
uQ(x).
9. If P (x) and Q(x) have a greatest common divisor in F [x] for a field F ,
then they have a unique monic greatest common divisor.
Proof. We give the proofs of a couple of the items and leave the rest as exercises.
To prove 4, we note that a|b and b|a implies that b = av for some v (not yet
known to be a unit) and a = bu for some u (not yet known to be a unit). This
means that if one is zero, then both are zero. But since we assume they are not
both zero, neither is zero. Substituting bu for a in b = av gives b = (bu)v. Since
b = b1, we get b1 = buv and cancellativity (from the fact that R is an integral
domain) says uv = 1 giving the last point needed.
To prove 7, we note that 0 is never a unit. If P (x) and Q(x) satisfy (P Q)(x) =
1, then the degree of P Q is zero. But the degree of P Q is the sum of the degrees
of P and Q, neither of which is −∞. So the degrees of P and Q are both zero
and both are non-zero constants.
A consequence of 4 and 7 is that if two non-zero polynomials in F [x] are
mutually divisible, then each is a non-zero constant times the other.
The main point of Lemma 11.5.1 is Item 9. If polynomials A(x) and B(x)
over a field F have a greatest common divisor, then they have a unique monic,
greatest common divisor and we will reserve the notation (A, B) for that unique
monic, greatest common divisor of A(x) and B(x).
Exercises (55)
1. Prove the rest of Lemma 11.5.1.
We now have a meaning for (A, B) when A(x) and B(x) are polynomials
over a field F , but we do not yet know that (A, B) always exists. We prove this
in the next section.
11.5.2 GCD of polynomials
Theorem 11.5.2 Let P (x) and Q(x) be polynomials over a field F so that at
least one of P (x) and Q(x) is not zero. Then there is a unique monic, greatest
common divisor of P (x) and Q(x).
Proof. Repeating our opening sentence to the proof of Theorem 11.4.1, what
follows is an extremely slight modification of the proof of the corresponding
result for integers (Proposition 2.7.6). The modifications are even smaller than
than those needed for the proof of Theorem 11.4.1.
We only have to show that a greatest common divisor exists. The existence
and uniqueness of a monic, greatest common divisor will follow from this and
from Lemma 11.5.1.
Let A be the set of degrees of all polynomials of the form (M P + N Q)(x)
where M (x) and N (x) are polynomials over F . Let B be the subset of A
consisting of all degrees not equal to −∞. We know that B is not empty since
at least one of P (x) and Q(x) is not zero.
Let g be the least value in B and let G(x) = (M P + N Q)(x) for some M (x)
and N (x) for which deg(G(x)) = g. We claim that G(x) is a greatest common
divisor of P (x) and Q(x).
We know that there are unique polynomials S(x) and R(x) so that P (x) = (SG)(x) + R(x) and deg(R(x)) < g. Note that
R(x) = P (x) − (SG)(x) = P (x) − S(x)(M P + N Q)(x) = ((1 − SM )P + (−SN )Q)(x)
so deg(R(x)) is in A. If deg(R(x)) ≠ −∞, then deg(R(x)) is also in B. But
deg(R(x)) < g which is least in B, so deg(R(x)) must be −∞ and R(x) = 0.
Thus G(x)|P (x). Similarly G(x)|Q(x) and we have that G(x) is a common
divisor of P (x) and Q(x).
If H(x) is a common divisor of P (x) and Q(x), then H(x) must divide
G(x) = (M P + N Q)(x). This makes G(x) a greatest common divisor.
Note that at least one greatest common divisor of P (x) and Q(x) is of the
form (M P + N Q)(x). Further, all greatest common divisors can be obtained
from any other by multiplying by a unit. So any greatest common divisor of P (x)
and Q(x) can be written u(M P + N Q)(x). But this is ((uM )P + (uN )Q)(x).
So all greatest common divisors of P (x) and Q(x) can be put in the form
(M P + N Q)(x). In particular, this is true of the monic polynomial (P, Q).
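The existence proof above is non-constructive, but in practice (P, Q) can be computed by the Euclidean algorithm: repeated use of the division algorithm, exactly as for integers. A sketch of ours, with rational coefficients standing in for the field F (lists lowest degree first):

```python
from fractions import Fraction

def poly_gcd(A, B):
    """Monic greatest common divisor of two polynomials over the rationals,
    computed by the Euclidean algorithm."""
    A = [Fraction(a) for a in A]
    B = [Fraction(b) for b in B]

    def deg(P):
        return max((i for i, a in enumerate(P) if a != 0), default=-1)

    def rem(S, P):
        """Remainder of S on division by the non-zero polynomial P."""
        S, d = S[:], deg(P)
        while deg(S) >= d:
            t = deg(S)
            m = S[t] / P[d]
            for i in range(d + 1):
                S[t - d + i] -= m * P[i]
        return S

    while deg(B) >= 0:                 # i.e. while B is not zero
        A, B = B, rem(A, B)
    lead = A[deg(A)]
    return [a / lead for a in A[:deg(A) + 1]]   # normalize to monic

# gcd(x^2 - 1, x^2 + x - 2) = x - 1; the divisor 2x - 2 works too, but the
# monic convention picks out x - 1.
assert poly_gcd([-1, 0, 1], [-2, 1, 1]) == [-1, 1]
```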
11.5.3 Irreducible polynomials
One fact about the integers is Euclid’s Theorem (Theorem 2.7.9). In the integers, if p is a prime and p|(ab), then either p|a or p|b. We need a notion in a
more general ring that imitates the notion of a prime in the integers. There are
two words that are used. The word “prime” is used for the behavior referred to
above: if p|(ab), then p|a or p|b. The word “irreducible” is used for the behavior
that we usually think of when we think of prime integers.
In a ring R, we say that a ∈ R is irreducible if a is not a unit and whenever
a factors as a = bc, then one of b or c is a unit. That is, a is not a unit and is
not a product of two non-units. For polynomials, this means that an irreducible
polynomial is not a constant and cannot be factored into non-constant factors.
We say that a non-unit p in a ring R is prime if whenever p|(ab) with a
and b in R, then p|a or p|b. For some rings “irreducible” and “prime” are not
identical. For the rings F [x] with F a field, we will see shortly that there is
no real difference and we will not discuss the matter further. However, the
226
CHAPTER 11. POLYNOMIALS
word “irreducible” is traditionally used with polynomials more frequently than
“prime” and we will do so here as well. The definition of “prime” is included
for the curious.
As a companion to these definitions we can define two elements a and b of
a cancellative ring with an identity to be relatively prime if there is a greatest
common divisor that is a unit. Note that if a unit u is a greatest common divisor
of relatively prime a and b, then so is uu−1 = 1. Applying this to polynomials
over a field F , if P (x) and Q(x) are relatively prime, then for some M (x) and
N (x), we have (M P + N Q)(x) = 1, and we will have (P, Q) = 1 since we insist
that the greatest common divisor of two polynomials be monic.
Exercises (56)
1. Every polynomial of degree one over a field is irreducible. Hint: consider
degrees.
We are now in a position to prove that “irreducible” implies “prime” in F [x]
for a field F . It will be seen that units add an extra step here and there.
Theorem 11.5.3 Let P (x), A(x) and B(x) be polynomials over a field F , with
P (x) irreducible. If P (x)|(AB)(x), then either P (x)|A(x) or P (x)|B(x).
Proof. We assume that P (x) does not divide A(x). If (P, A) is not 1, then it
is some G(x) that is not a unit. But G(x)|P (x) implies that P (x) = (GC)(x)
for some C(x) which must then be a constant polynomial which we may as well
write as c, giving P (x) = cG(x). But G(x) also divides A(x), and P (x) is a unit
times G(x) so P (x) divides A(x). So we must have (P, A) = 1.
The rest of the argument follows the proof of Theorem 2.7.9.
Now for some M (x) and N (x), we have 1 = (P M + N A)(x). Multiplying by
B(x) gives B(x) = (BP M )(x)+(N AB)(x). Since P (x) divides both summands
on the right it divides B(x).
11.6
Uniqueness of factorization
We can now prove a parallel to the fundamental theorem of arithmetic. Details
will be left as exercises.
Exercises (57)
1. Prove that every polynomial over a field F that is not a unit is a product
of irreducible polynomials. Hint: if false, then let P (x) be the polynomial
of least degree for which the statement is false.
2. Prove that every polynomial P (x) over a field F that is not a unit is a
constant times a product of monic, irreducible polynomials, and that this
constant is the leading coefficient of P (x) and is thus unique.
3. Prove that if P (x) is a polynomial over a field F and is not a unit, if P (x) is
a constant times a product of monic, irreducible polynomials in two ways,
and if A(x) is an irreducible factor of the first factorization, then A(x)
equals one of the factors in the second factorization.
4. Prove that if P (x) is a polynomial over a field F and is not a unit, then
any two factorizations of P (x) into a constant times a product of monic,
irreducible polynomials can be made the same by permuting the factors
of one of the factorizations. Hint: induct on the number of irreducible
factors.
5. Prove that if P (x) is a polynomial over a field F , and A(x) is a monic,
irreducible polynomial over F that divides P (x), then A(x) is one of the
factors of any factorization of P (x) into a constant times a product of
monic, irreducible polynomials.
The problems above prove the following version of the Fundamental Theorem
of Arithmetic for polynomials over a field.
Theorem 11.6.1 If P (x) is a polynomial over a field F and is not a unit, then
P (x) factors as a constant times a product of monic, irreducible polynomials
over F . Further, this factorization is unique up to a permutation of the factors.
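Theorem 11.6.1 can be seen in action by actually computing such a factorization. The sketch below, with our own helper names (not the text's), works over GF(5) and factors by trial division with monic polynomials of increasing degree, so each factor found is automatically irreducible.

```python
# Factoring into a leading coefficient times monic irreducibles over GF(5)
# by trial division.  Coefficient lists store the constant term first.
from itertools import product

P_MOD = 5

def trim(a):
    a = [c % P_MOD for c in a]
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def scale(a, k):
    return trim([c * k for c in a])

def divmod_poly(a, b):
    # division algorithm: a = q*b + r with deg(r) < deg(b)
    a, b = trim(a), trim(b)
    inv_lead = pow(b[-1], -1, P_MOD)
    q = [0] * max(1, len(a) - len(b) + 1)
    r = a[:]
    while len(r) >= len(b):
        k = (r[-1] * inv_lead) % P_MOD
        d = len(r) - len(b)
        q[d] = k
        shifted = [0] * d + b
        r = trim([x - k * y for x, y in zip(r, shifted)])
    return trim(q), r

def factor_monic(p):
    p = trim(p)
    lead = p[-1]
    p = scale(p, pow(lead, -1, P_MOD))   # make p monic, remember the constant
    factors, d = [], 1
    while len(p) > 1:
        for coeffs in product(range(P_MOD), repeat=d):
            cand = list(coeffs) + [1]    # a monic candidate of degree d
            q, r = divmod_poly(p, cand)
            if not r:                    # smallest-degree divisor: irreducible
                factors.append(cand)
                p = q
                break
        else:
            d += 1                       # no divisor of this degree remains
    return lead, factors

# x^2 - 1 = (x + 1)(x + 4) over GF(5): monic irreducible factors x + 1, x + 4
lead, fs = factor_monic([4, 0, 1])
assert lead == 1 and sorted(fs) == [[1, 1], [4, 1]]
```

Running the same function on 3x² + 3 = 3(x + 2)(x + 3) recovers the leading coefficient 3 and the two monic factors, matching the uniqueness statement of the theorem.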
At this point, we have finally filled in some missing details from Section
1.2.6. There it was claimed that if r1 and r2 are roots of the monic quadratic
x2 + bx + c, then it must be true that r1 + r2 = −b and r1 r2 = c. This
follows from Lemma 11.4.2 and Theorem 11.6.1. The lemma says that if a
monic quadratic has roots r1 and r2 , then (x − r1 ) and (x − r2 ) are factors of
the polynomial and the theorem says that the polynomial factors into linear
factors in only one way. Thus the polynomial must equal (x − r1 )(x − r2 ) and
the claims follow.
11.7
Roots of polynomials
11.7.1
Counting roots
We combine information from Lemma 11.4.2 and Theorem 11.6.1.
Let P (x) be a polynomial over a field F . There may be a root of P (x) in F
or there may not. The polynomial x2 − 2 is a polynomial over Q and also over
R. There is a root of x2 − 2 in R, but not in Q. Thus the number of roots of
P (x) in F is not completely known just by knowing P (x). One must know F
as well. But there are facts about the number of roots that we can record, and
we do that here.
Proposition 11.7.1 Let P (x) be a non-zero polynomial over a field F , and let
d = deg(P (x)). Then there are no more than d different roots of P (x) in F .
Proof. If there are n different roots of P (x) in F , then let them be denoted r1 , r2 ,
. . . , rn . From Lemma 11.4.2, we know that each x − ri , 1 ≤ i ≤ n, divides P (x).
But each x − ri is monic and irreducible so by Theorem 11.6.1 each x − ri is
a factor in any factorization of P (x) into a constant times a product of monic,
irreducible polynomials. Since the x − ri are all different for 1 ≤ i ≤ n, the
degree of P (x) must be at least n, and d ≥ n.
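Proposition 11.7.1 is easy to test exhaustively over a small field, since GF(11) has only eleven candidate roots. This sketch (helper names are ours, not the text's) also illustrates the point made with x2 − 2: the number of roots depends on the field.

```python
# Checking #roots <= degree over the small field GF(11).
# Coefficient lists store the constant term first.
import random

P_MOD = 11

def eval_poly(coeffs, x):
    # Horner's rule, reduced mod 11 at each step
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P_MOD
    return acc

def roots_in_field(coeffs):
    return [x for x in range(P_MOD) if eval_poly(coeffs, x) == 0]

# the number of roots depends on the field: x^2 - 2 has none in GF(11),
# while x^2 - 3 has exactly two
assert roots_in_field([9, 0, 1]) == []
assert roots_in_field([8, 0, 1]) == [5, 6]

# spot-check the bound of Proposition 11.7.1 on random non-zero polynomials
random.seed(0)
for _ in range(200):
    deg = random.randint(1, 5)
    coeffs = [random.randrange(P_MOD) for _ in range(deg)] + \
             [random.randrange(1, P_MOD)]
    assert len(roots_in_field(coeffs)) <= deg
```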
11.7.2
Polynomials as functions
We have been treating polynomials as expressions. They are also functions. If
P (x) is a polynomial over a field F , then if a value from F is assigned to the
variable x, then a value for P (x) can be calculated by applying the operations
of the field to the expression P (x). This makes P (x) a function from F to F . A
question arises as to whether different looking polynomials have to give different
functions. The answer is that often they do, but not always.
Proposition 11.7.2 Let P (x) and Q(x) be polynomials over a field F with
P (x) = Σ_{i=0}^∞ a_i x^i and Q(x) = Σ_{i=0}^∞ b_i x^i . If there are more elements of F than
the larger of deg(P (x)) and deg(Q(x)), and if P (x) = Q(x) for every x in F ,
then a_i = b_i for 0 ≤ i < ∞.
The conclusion can be thought of as saying that if two polynomials are equal
as functions and the field is large enough, then the two polynomials are equal
as expressions.
Proof. The assumptions say that (P − Q)(x) = 0 for every x in F . But the
degree of (P − Q)(x) is no larger than max{deg(P (x)), deg(Q(x))} so there are
more roots to (P − Q)(x) than deg((P − Q)(x)). From Proposition 11.7.1, we
must have that (P − Q)(x) is the zero polynomial. That is, all its coefficients
are zero.
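The hypothesis that F has more elements than the degrees cannot be dropped. A standard example (a sketch of ours, not from the text): over GF(5), the polynomial x⁵ − x is the zero function even though it is not the zero polynomial; its degree is not smaller than the number of field elements, so Proposition 11.7.2 does not apply.

```python
# x^5 - x over GF(5) is the zero *function* (Fermat's little theorem)
# but not the zero *polynomial*.  Helper names are ours.
p = 5

def eval_poly(coeffs, x, mod):
    # Horner evaluation, constant term first
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % mod
    return acc

# x^5 - x as a coefficient list: -1 at x^1, 1 at x^5
frob = [0, p - 1] + [0] * (p - 2) + [1]

assert all(eval_poly(frob, a, p) == 0 for a in range(p))  # zero as a function
assert any(c != 0 for c in frob)                          # non-zero as an expression
```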
Corollary 11.7.3 Let P (x) and Q(x) be polynomials over a field F with
P (x) = Σ_{i=0}^∞ a_i x^i and Q(x) = Σ_{i=0}^∞ b_i x^i . If the characteristic of F is zero, and if
P (x) = Q(x) for every x in F , then a_i = b_i for 0 ≤ i < ∞.
Proof. A field with characteristic zero must have infinitely many elements.
11.7.3
Automorphisms and roots
We give a first hint of restrictions that exist on automorphisms. If P (x) is a
polynomial over a field F , if E is an extension field of F , and if r ∈ E is a root
of P (x), then by Proposition 10.4.1 an element θ of Aut(E/F ) must carry r to a
root of P (x). But P (x) is a polynomial over E as well as over F , and there are
no more than d = deg(P (x)) roots of P (x) in E. Thus there are only d places
that θ can carry r. This is stated formally in the following.
Lemma 11.7.4 Let P (x) be a polynomial of degree d over a field F , and let E be
a field extension of F containing a root r of P (x). Then {θ(r) | θ ∈ Aut(E/F )}
is contained in the set of roots of P (x) in E and thus has no more than d
elements.
Later we will learn more detailed information about automorphism groups
of field extensions.
11.8
Derivatives and multiplicities of roots
Counting roots becomes inaccurate if there are multiple roots. We derive a
criterion for telling whether there are multiple roots.
11.8.1
The derivative
If P (x) = Σ_{i=0}^∞ a_i x^i is a polynomial over a field F , then we know from calculus
that the derivative of P (x) is given by

P ′ (x) = Σ_{i=1}^∞ i(a_i ) x^{i−1} = Σ_{i=0}^∞ (i + 1)(a_{i+1} ) x^i .    (11.5)
Note that i(ai ) involves an element of Z, namely i, times an element of F
and is defined as the sum of i copies of ai in a manner identical to the definition of
m(x) from Section 10.6.1. The first of the two formulas is the more familiar. The
two are seen equal when the terms of similar powers of x are matched between
the two formulas. In calculus (11.5) is derived as the result of a limit process.
Here, we just take (11.5) as the definition of P ′ (x), and limits are completely
eliminated from the discussion. We can describe (11.5) as an algebraic definition
of the derivative.
Before we give an exercise that shows that the algebraic definition of the
derivative behaves in familiar ways, we give one more fact about the function
m(x). In the expression (mn)(xy) where m and n are non-negative integers and
x and y are field elements, we can use two facts from Exercise Set (49). We can
write
(mn)(xy) = m(n(xy)) = m(xn(y)) = m(x)n(y)
where the first equality follows from (mn)(x) = m(n(x)), and the other two
equalities from different applications of m(xy) = xm(y).
Exercises (58)
1. Using (11.5) as a definition, prove the product rule
(P Q)′ (x) = (P ′ Q + Q′ P )(x).
You will need various facts about the functions m(x).
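Since (11.5) is a purely algebraic definition, the product rule of the exercise can also be checked by direct computation. A sketch over GF(3) with our own helper names; note that the derivative of x³ vanishes in characteristic 3, something the limit definition from calculus never produces.

```python
# Checking the product rule for the algebraic derivative (11.5) over GF(3).
# Coefficient lists store the constant term first.
import random

P_MOD = 3

def trim(a):
    a = [c % P_MOD for c in a]
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) +
                 (b[i] if i < len(b) else 0) for i in range(n)])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def deriv(a):
    # (11.5): the coefficient of x^(i-1) is i(a_i), the sum of i copies of a_i
    return trim([i * a[i] for i in range(1, len(a))])

# in characteristic 3 the derivative of x^3 is 3x^2 = 0
assert deriv([0, 0, 0, 1]) == []

# (PQ)' = P'Q + PQ' on random polynomials
random.seed(1)
for _ in range(100):
    P = [random.randrange(P_MOD) for _ in range(random.randint(1, 6))]
    Q = [random.randrange(P_MOD) for _ in range(random.randint(1, 6))]
    assert deriv(mul(P, Q)) == add(mul(deriv(P), Q), mul(P, deriv(Q)))
```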
11.8.2
Multiplicities of roots
We use the derivative to detect roots with multiplicities. For a polynomial P (x)
over a field F , we say that r ∈ F is a root of P (x) with multiplicity m if (x− r)m
divides P (x) but (x − r)m+1 does not divide P (x). That is, m is the largest
integer for which (x − r)m divides P (x).
Exercises (59)
1. Prove that if P (x) is a non-zero polynomial of degree d over a field F and
r1 , r2 , . . . , rn are the different roots in F of P (x), and if for each i, the
multiplicity of ri is mi , then m1 + m2 + · · · + mn ≤ d.
2. Prove that if P (x) is a non-zero polynomial over a field F of characteristic
zero and r ∈ F is a root of P (x), then the multiplicity of r is greater than
1 if and only if r is also a root of P ′ (x).
The two exercises above explain our interest in multiplicities and derivatives.
From the first exercise, we know that if one or more roots of P (x) has multiplicity greater than 1, then there must be strictly fewer roots than the degree of
P (x). This combines with Lemma 11.7.4 to say that the number of places an
automorphism can take a root of P (x) is strictly smaller than the degree of P (x).
We will see that this is a less than desirable situation since there is less going
on in a relevant automorphism group than hinted at by the degree of P (x).
The second exercise explains our interest in the derivative as defined in
(11.5), since derivatives help determine when a root of a polynomial has multiplicity greater than 1.
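The detection criterion of the second exercise can be seen on a concrete polynomial. A sketch of ours over Q using exact integer arithmetic: P (x) = (x − 1)²(x − 2) has a double root at 1 and a simple root at 2, and the formal derivative distinguishes them.

```python
# A double root is also a root of the derivative; a simple root is not.
def eval_int(coeffs, x):
    # integer Horner evaluation, constant term first
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

# P(x) = (x - 1)^2 (x - 2) = x^3 - 4x^2 + 5x - 2
P = [-2, 5, -4, 1]
dP = [i * P[i] for i in range(1, len(P))]   # formal derivative: 3x^2 - 8x + 5

assert eval_int(P, 1) == 0 and eval_int(dP, 1) == 0   # multiplicity 2 at x = 1
assert eval_int(P, 2) == 0 and eval_int(dP, 2) != 0   # multiplicity 1 at x = 2
```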
11.9
Factoring polynomials over the reals
We look at a very special case of polynomials. We look at R[x]. Experience
shows that the non-real roots of a polynomial P (x) ∈ R[x] come in complex
conjugate pairs, and Exercise 1(e) of Exercise Set (47) proves that this is always
the case. It turns out (what is called the Fundamental Theorem of Algebra)
that any polynomial in C[x] (not just R[x]) has all of its roots in C. So this
applies as well to P (x).
We now bring in uniqueness of factorization. We know that if r1 , r2 , . . . ,
rd are the roots of P (x) (here d is the degree of P (x)), then with c the leading
coefficient of P (x), we have
P (x) = c(x − r1 )(x − r2 ) · · · (x − rd ).
We can arrange the product so that any non-real roots appear next to their
complex conjugate “twin.” Now if (x − ri )(x − ri+1 ) is a pair of such “twins,”
then their product is really (x − ri )(x − r̄i ) and multiplies to
x2 − (ri + r̄i )x + ri r̄i .
But for any complex number z, both z + z̄ and z z̄ are real. (Why?) So the
quadratic above is in R[x].
For any real (non-non-real?) root rj , the factor (x − rj ) is also in R[x]. Thus
we have shown (accepting the Fundamental Theorem of Algebra as true) that
any P (x) ∈ R[x] factors into degree one and degree two factors over R[x].
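The conjugate-pair computation above can be carried out numerically. A sketch using Python's built-in complex numbers with a hypothetical non-real root z = 3 + 2i (our choice, not from the text): the pair of conjugate linear factors multiplies to a quadratic with real coefficients.

```python
# Multiplying a conjugate pair of linear factors gives a real quadratic.
z = 3 + 2j
b = -(z + z.conjugate())   # -(z + zbar), the coefficient of x
c = z * z.conjugate()      # z * zbar, the constant term

# both are real, so x^2 + bx + c lies in R[x]
assert b.imag == 0 and c.imag == 0
assert b == -6 and c == 13

# z and its conjugate are both roots of that quadratic
for w in (z, z.conjugate()):
    assert w * w + b * w + c == 0
```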
Chapter 12
A construction procedure
for field extensions
In this chapter we give one method for building field extensions. There is another
method that we will not cover that gives a very different kind of extension. We
start with generalities that apply to all extensions, and then give definitions
that describe the two basic types of extensions. Then we settle down to the
construction of the particular type that we need.
12.1
Smallest extensions
We recall some basic results from Section 3.4.3. The first is standard and its
argument follows the same outline as for groups and other algebraic structures.
Lemma 3.4.3 Let E be a field and let C be a collection of subfields of E. Then
the intersection of all the subfields in C is a subfield of E.
From this, another standard technique gives the following.
Lemma 3.4.4 Let F ⊆ E be an extension of fields and let S be a subset of E.
Then in the collection of subfields of E that contain both F and S, there is a
smallest subfield.
The smallest extension given by Lemma 3.4.4 is denoted F (S), or is denoted
F (a1 , a2 , . . . , an ) if S = {a1 , a2 , . . . , an } is a finite set. In particular, when
S = {a} has only one element, we write F (a) for the extension. We think
of F (a) as created from F by “adding a” to F . The next lemma shows that
F (a1 , a2 , . . . , an ) can be created by adding one element at a time. Note that
F (a1 ) is a field in its own right, so F (a1 )(a2 ) refers to the field obtained from
F by first adding a1 to obtain F (a1 ) and then adding a2 to F (a1 ) to obtain
F (a1 , a2 ). The lemma says that this is the field F (a1 , a2 ) which is the smallest
field containing F and {a1 , a2 }.
Lemma 3.4.5 Let F ⊆ E be an extension of fields, and let {a1 , a2 , . . . , an } be
a subset of E. Then
F (a1 )(a2 ) · · · (an ) = F (a1 , a2 , . . . , an ).
We will refer to F (a1 , a2 , . . . , an ) as the extension of F (in E) by {a1 , a2 , . . . , an }.
In the special case when S is the single element {a} in E, we will refer to F (a)
as the extension of F (in E) by a.
It is nice to know that these smallest fields exist, but it is even nicer to know
what is in them. In the setting of groups, we have a description of what is in
a group generated by a certain set of elements. However, fields are somewhat
more complicated than groups and so the descriptions of what is in them are correspondingly more complicated. The promised construction of a field extension
will give a description of F (a) when a has a certain property. The construction
that we will not cover would describe what happens when a does not have that
property. We now discuss the property that distinguishes between the two cases.
12.2
Algebraic and transcendental elements
Let F ⊆ E be an extension of fields, and let α be an element of E. If there is
a non-zero polynomial P over F so that α is a root of P , then we say that α
is algebraic over F . Thus √2 is algebraic over Q since √2 is a root of x2 − 2.
Note that every element α of F is algebraic over F since α is a root of x − α.
We say that α ∈ E is transcendental over F if it is not algebraic over F . That
is, for every non-zero polynomial P with coefficients in F , we have P (α) ≠ 0.
We will give a construction for F (α) when α is algebraic over F . This will
be all we need since we are interested in roots of polynomials. Elements that
are not roots of polynomials will have no use in our discussions. However,
transcendental elements do exist and form the basis for discussions that are
outside the scope of these notes.
It can be hard to prove that a specific element is transcendental. Both e (the
base of the natural logarithm) and π are transcendental over Q. The proofs of
these facts are beyond the scope of these notes and involve a certain amount of
analysis as well as algebra. Note that e and π are both algebraic over R. Thus
it is not correct to simply say that e and π are transcendental “period.”
For certain extensions, it can be easy to prove that transcendental elements
must exist. This is quite different from showing that a specific element is transcendental. This will be given as an exercise later.
Exercises (60)
1. Prove that if α is algebraic over F , then so is −α. Conclude that if α is
transcendental over F , then so is −α.
2. Prove that if α is transcendental over F , then so is α + α. Show by
example that if α and β are transcendental over F , then α + β might not
be transcendental over F .
3. Prove that if α is transcendental over F and β ∈ F , then α + β is transcendental over F . If in addition β ≠ 0, then αβ is transcendental over F .
Conclude that if α is algebraic over F and β ∈ F , then α + β is algebraic
over F , and if in addition β ≠ 0, then αβ is algebraic over F .
4. Prove that if α is algebraic over F , then α−1 is algebraic over F . Conclude
that if α is transcendental over F , then α−1 is transcendental over F . Hint:
if P (x) is a non-zero polynomial for which P (α) = 0, look at P (α−1 ) but do
not expect it to be zero. See what transformations it goes through when
you clear fractions. Then start again with P (α) which you know is zero
and see what transformations you can make on it.
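The hint in Exercise 4 amounts to reversing the coefficient list: multiplying P (α⁻¹) by α^d clears all fractions and evaluates the reversed polynomial at α⁻¹. A sketch of ours over GF(11), where the arithmetic is exact.

```python
# Clearing fractions in P(1/alpha) turns P into the polynomial with the
# coefficient list reversed.  Helper names are ours; constant term first.
P_MOD = 11

def eval_poly(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P_MOD
    return acc

P = [8, 0, 1]     # x^2 - 3 over GF(11); alpha = 5 is a root since 5^2 = 25 ≡ 3
alpha = 5
assert eval_poly(P, alpha) == 0

alpha_inv = pow(alpha, -1, P_MOD)   # 9, since 5 * 9 = 45 ≡ 1 (mod 11)
reversed_P = list(reversed(P))      # corresponds to x^2 * P(1/x)
assert eval_poly(reversed_P, alpha_inv) == 0
```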
If α and β are algebraic over F , you might wonder about α + β and αβ.
These are both algebraic over F , but we won’t have the tools to prove that until
later. For now you can think about why it is not obvious.
Next we give the promised construction. This will lead to a description of
F (α) when α is algebraic over F .
12.3
Extension by an algebraic element
Let F ⊆ E be an extension of fields and let α ∈ E be algebraic. We would like
to understand the structure of F (α).
The structure of F (α) will come from the structure of F [x], the ring of
polynomials over F . From Lemma 11.3.3, we know that vα : F [x] → E defined
by vα (P (x)) = P (α) is a ring homomorphism. Since vα (P (x)) = P (α) is a
combination of products and sums of elements of F and powers of α, we get
that P (α) is in F (α), the smallest subfield of E that contains all of F and
contains α. Thus we can say that vα goes from F [x] to F (α). Later we will see
that it is onto. Now we investigate the extent to which it is not one-to-one.
The fact that α is algebraic over F tells us that there is some non-zero
polynomial Q(x) in F [x] for which Q(α) = 0. That is, α is a root of Q(x). If we
concentrate on the additive parts of F [x] and of F (α), then this says that Q(x)
is a non-zero element of the kernel of vα . The kernel of a group homomorphism
captures the failure of the homomorphism to be one-to-one,¹ and we can say that
vα (M (x)) = vα (N (x)) for M (x) and N (x) in F [x] if and only if M (x) − N (x)
is in the kernel of vα . It thus becomes important to understand the kernel of
vα . To do so, we bring in the multiplicative structure of F [x].
Since α ∈ E is algebraic over F , it is a root of some non-zero polynomial in
F [x]. There must be a smallest degree d so that there is a non-zero polynomial
in F [x] of degree d with α as a root. We call d the degree of α over F .
¹ Field homomorphisms are so restrictive that failure to be one-to-one has drastic consequences, but F [x] is not a field and vα is not a field homomorphism.
A non-zero polynomial P (x) ∈ F [x] with P (α) = 0 whose degree is the degree of α
over F is called a minimal polynomial for α over F . The next exercises show that the
kernel of vα is completely determined by a minimal polynomial for α over F
and that all such minimal polynomials are closely related.
Exercises (61)
1. Let F ⊆ E be an extension of fields, let α ∈ E be algebraic over F and
let P (x) ∈ F [x] be minimal for α. Show that P (x) is irreducible over F .
2. Let F ⊆ E be an extension of fields, and let α ∈ E be a root of P (x) ∈ F [x]
where P (x) is irreducible over F . Show that P (x) is a minimal polynomial
for α.
3. In the setting of problem 1, show that Q(x) ∈ F [x] has Q(α) = 0 (that is,
Q(x) is in the kernel of vα ) if and only if P (x) divides Q(x).
4. In the setting of Problem 1 show that given two minimal polynomials for
α that each is a constant times the other, and show that there is a unique
monic, minimal polynomial for α.
The exercises above show that in the setting given, M (α) = N (α) for M (x)
and N (x) in F [x] if and only if M (x) − N (x) is a multiple of some minimal
polynomial (equivalently, all minimal polynomials) for α over F .
12.3.1
The construction
Let F ⊆ E be an extension of fields and let α ∈ E be algebraic over F with
minimal polynomial P (x) ∈ F [x] for α. The exercise set above motivates the
following definition in imitation of the construction of Zp from Z.
For M (x) and N (x), we define M (x) ∼P N (x) to mean that N (x) − M (x)
is a multiple of P (x).
Exercises (62)
The following problems refer to the items described in the previous two paragraphs. The arguments are similar to arguments about the construction of Zp
from Z.
1. Show that ∼P is an equivalence relation.
2. For M (x) ∈ F [x], write [M (x)]P to denote the equivalence class of M (x)
under ∼P . Prove that setting
[M (x)]P + [N (x)]P = [(M + N )(x)]P ,
−[M (x)]P = [−M (x)]P , and
[M (x)]P [N (x)]P = [(M N )(x)]P
give well defined operations on equivalence classes. We will denote the set
of equivalence classes with the operations above by F [x]/P (x).
3. Argue that the multiplication defined in Problem 2 makes F [x]/P (x) a
commutative ring with identity. That is, the relevant laws (commutative,
associative, identity, distributive, etc.) hold.
4. Use the irreducibility of P (x) to prove that there are multiplicative inverses
for every non-zero element in F [x]/P (x). This argument mirrors the proof
that Zp has multiplicative inverses when p is a prime. You should look up
that argument in Section 2.8.
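The quotient construction in the exercises above can be carried out concretely. Taking F = GF(2) and P (x) = x² + x + 1 (irreducible over GF(2)), the quotient F [x]/P (x) is a field with four elements. The sketch below (helper names ours, not the text's) checks that every non-zero class has an inverse and that [x]P is a root of P (x).

```python
# F[x]/P(x) for F = GF(2), P(x) = x^2 + x + 1: a field of four elements.
# Coefficient lists store the constant term first.
P_MOD = 2
MIN_POLY = [1, 1, 1]   # x^2 + x + 1

def trim(a):
    a = [c % P_MOD for c in a]
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def reduce_mod(a):
    # remainder on division by MIN_POLY: the canonical class representative
    r = trim(a)
    while len(r) >= len(MIN_POLY):
        d = len(r) - len(MIN_POLY)
        shifted = [0] * d + MIN_POLY
        k = r[-1]
        r = trim([x - k * y for x, y in zip(r, shifted)])
    return r

def rep(a):
    # pad the remainder to a fixed-length tuple (a_0, a_1)
    r = reduce_mod(a)
    return tuple(r + [0] * (2 - len(r)))

elements = [(a0, a1) for a0 in range(2) for a1 in range(2)]
one = (1, 0)

# [x]_P is a root of P(x) in the quotient: P([x]_P) = [P(x)]_P = 0
assert rep(MIN_POLY) == (0, 0)

# every non-zero class has a multiplicative inverse, so the quotient is a field
for e in elements:
    if e != (0, 0):
        assert any(rep(mul(list(e), list(f))) == one for f in elements)
```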
The previous exercise set shows that F [x]/P (x) with the operations in Problem 2 forms a field. We can now prove the following.
Theorem 12.3.1 Let F ⊆ E be an extension of fields. Let α ∈ E be algebraic
over F and let P (x) ∈ F [x] be a minimal polynomial for α. Let vα : F [x] → E
be the evaluation homomorphism. Then
v α : F [x]/P (x) → E
defined by v α ([M (x)]P ) = vα (M (x)) = M (α) is a well defined field homomorphism which takes [x]P to α and whose image is exactly F (α).
Thus v α : F [x]/P (x) → F (α) is a field isomorphism taking [x]P to α.
Proof. The last line follows from the previous lines since a field homomorphism
either takes everything to zero or is one-to-one. The homomorphism cannot take
everything to zero since if k ∈ F , then the constant polynomial k has v α ([k]) =
vα (k) = k. Thus the image contains at least F . A one-to-one homomorphism
is an isomorphism onto its image.
Thus we must show that v α : F [x]/P (x) → E is a well defined field homomorphism with the claimed image.
We start with the well definedness question. If M (x) ∼P N (x), then M (x)−
N (x) is divisible by P (x) and M (x) − N (x) = P (x)Q(x). Now M (α) − N (α) =
P (α)Q(α) = 0Q(α) = 0 so M (α) = N (α).
That v α is a homomorphism follows from the fact that vα is a homomorphism, and from the definitions in Problem 2 of Exercise set (62).
The monomial x has vα (x) = α, so v α ([x]P ) = vα (x) = α giving one of the
facts to be proven and putting α in the image of v α .
Each k ∈ F also can be thought of as a constant polynomial so that
v α ([k]P ) = vα (k) = k(α) = k which puts F in the image of v α .
Since the image of v α is a subfield of E containing both F and α, it
contains F (α).
But each [M (x)]P for M (x) ∈ F [x] maps to M (α) under v α . As argued in
the first few paragraphs of Section 12.3, we have M (α) ∈ F (α). So the image
of v α is contained in F (α) and so must equal F (α). This completes the proof.
Theorem 12.3.1 says that in the setting of the theorem, F (α) is like F [x]/P (x).
But we would like to say more about what F [x]/P (x) is like on its own merits,
just using the fact that P (x) is irreducible over F . We can say what F [x]/P (x)
is like, and we will do so twice. It turns out to be very easy when P (x) is a
minimal polynomial for some α that is algebraic in some extension of F . It is a
bit more involved when it is not known that P (x) has such an α. It turns out
that every P (x) has such an α, but that argument needs either a result from
complex analysis in the setting of subfields of C, or a prior understanding of
F [x]/P (x).
12.3.2
The structure of F [x]/P (x)
In the discussion that follows, we will assume that P (x) is a minimal polynomial
for some α that is algebraic in some extension E of F . From Theorem 12.3.1,
we know that
v α : F [x]/P (x) → F (α)
is an isomorphism. In particular, the inverse θ of v α is an isomorphism from
F (α) to F [x]/P (x). We will use the two isomorphisms v α and θ to say something
about F [x]/P (x).
In exercises that come right after the analysis of F [x]/P (x) using the two
isomorphisms, you will not assume that P (x) is a minimal polynomial for
some α that is algebraic in some extension of F . Thus you will not have the
isomorphism v α and its inverse θ available. In spite of this, you will be asked
to prove statements that correspond to the facts that we will extract from the
two isomorphisms.
Assuming that E, α, v α and θ exist, we know for every element k ∈ F that
k also represents the constant polynomial that we can refer to as k in F [x].
Further we have vα (k) = k, so v α ([k]P ) = k. From this we know that v α is
one-to-one from the classes of constant polynomials to F ⊆ F (α), that different
constant polynomials lie in different classes mod P in F [x]/P (x), and that the
classes of constant polynomials form a subfield of F [x]/P (x) isomorphic to F .
Also, θ(k) = [k]P for each k ∈ F . It is cumbersome to keep writing [k]P , for
“the class mod P of the constant polynomial k,” so we will simply denote it by
k. With this shorthand, we have θ(k) = k.
With the shorthand that regards k ∈ F as also representing the class [k]P ,
we have inserted F in F [x]/P (x) as a subfield. Thus we can think of F [x]/P (x)
as an extension of F .
We know that the monomial x has vα (x) = α, so v α ([x]P ) = α and θ(α) =
[x]P . Let P (x) = Σ_{i=0}^∞ c_i x^i where each c_i is in F . Since P (α) = 0, we have
Σ_{i=0}^∞ c_i α^i = 0. Applying θ to this equality, we get Σ_{i=0}^∞ c_i ([x]P )^i = 0 since our
shorthand says that θ(c_i ) = c_i for each i. Thus in F [x]/P (x), the element [x]P
is a root of the polynomial P (x).
If there is a subfield K in F [x]/P (x) that contains F and contains [x]P , then
it will have an isomorphic image v α (K) in F (α) that contains F and contains α.
But our definition of F (α) says that v α (K) would then have to equal F (α). Thus
K = F [x]/P (x), and F [x]/P (x) is the smallest subfield of itself that contains
F and [x]P .
We now consider the degree [F [x]/P (x) : F ] where we regard F as a subfield
of F [x]/P (x) using the shorthand discussed above. If d is the degree of P (x),
we will show that [F [x]/P (x) : F ] = d by showing that the d elements of
{1, [x]P , [x2 ]P , . . . , [xd−1 ]P } form a basis for F [x]/P (x).
If we bring these elements to F (α) by the isomorphism v α , then we are
looking at 1, α, α2 , . . . , αd−1 . If these are linearly dependent, then the linear
dependence would have a factor that is a non-zero polynomial of degree
no more than d − 1 with α as a root. (To find such a factor one would only have
to factor out a power of α from the linear dependence.) But this would violate
the fact that P (x) is a minimal polynomial for α.
Every element in F [x]/P (x) is of the form [M (x)]P for some polynomial
M (x) over F . By the division algorithm, M (x) = (P Q)(x) + R(x) with the
degree of R(x) smaller than d, the degree of P (x). Since M (x) − R(x) is a
multiple of P (x), we have [M (x)]P = [R(x)]P and every element of F [x]/P (x)
is represented by the class of some polynomial of degree less than d.
It is tempting to jump to the correct conclusion that [R(x)]P is a linear combination of the elements of {1, [x]P , [x2 ]P , . . . , [xd−1 ]P }. However, this conclusion
requires a bit more than just a jump. We will take advantage of the fact that
[R(x)]P is carried by the isomorphism v α to R(α) which is a linear combination
of the elements of {1, α, α2 , . . . , αd−1 }. Now this linear combination is carried
back by θ to a linear combination of the elements of {1, [x]P , [x2 ]P , . . . , [xd−1 ]P }
and we are done.
We have shown the following lemma.
Lemma 12.3.2 Let F ⊆ E be an extension of fields, and let α ∈ E be algebraic
over F with minimal polynomial P (x). Then the following hold.
1. Sending k ∈ F to [k]P in F [x]/P (x) is an isomorphism from F into
F [x]/P (x) whose image is not zero. Using this, we regard F as a subfield
of F [x]/P (x) for the rest.
2. The element [x]P in F [x]/P (x) is a root of the polynomial P (x).
3. The smallest subfield of F [x]/P (x) that contains F and [x]P is F [x]/P (x)
itself.
4. The degree [F [x]/P (x) : F ] equals the degree of P (x), and a basis for
F [x]/P (x) over F is {1, [x]P , [x2 ]P , . . . , [xd−1 ]P }.
Corollary 12.3.3 Let F ⊆ E be an extension of fields, and let α ∈ E be
algebraic over F with minimal polynomial P (x) of degree d. Then [F (α) : F ] =
d, and a basis for F (α) over F is {1, α, α2 , . . . , αd−1 }.
There is a result corresponding to Lemma 12.3.2 with a different hypothesis.
Its proof will be an exercise. Note that the conclusions are identical to those of
Lemma 12.3.2.
Proposition 12.3.4 Let F be a field and let P (x) be a non-constant polynomial
that is irreducible over F . Then the following hold.
1. Sending k ∈ F to [k]P in F [x]/P (x) is an isomorphism from F into
F [x]/P (x) whose image is not zero. Using this, we regard F as a subfield
of F [x]/P (x) for the rest.
2. The element [x]P in F [x]/P (x) is a root of the polynomial P (x).
3. The smallest subfield of F [x]/P (x) that contains F and [x]P is F [x]/P (x)
itself.
4. The degree [F [x]/P (x) : F ] equals the degree of P (x), and a basis for
F [x]/P (x) over F is {1, [x]P , [x2 ]P , . . . , [xd−1 ]P }.
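Conclusions 2–4 can be seen concretely for F = GF(2) and P (x) = x³ + x + 1, where F [x]/P (x) is a field with 2³ = 8 elements and basis {1, [x]P , [x2 ]P }. In this sketch (helper names ours, not the text's) the powers of [x]P run through all seven non-zero classes.

```python
# F[x]/P(x) for F = GF(2), P(x) = x^3 + x + 1: a field of 2^3 = 8 elements.
P_MOD = 2
MIN_POLY = [1, 1, 0, 1]   # x^3 + x + 1, irreducible over GF(2)

def trim(a):
    a = [c % P_MOD for c in a]
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def reduce_mod(a):
    # the division algorithm reduces every class to a remainder of degree < 3
    r = trim(a)
    while len(r) >= len(MIN_POLY):
        d = len(r) - len(MIN_POLY)
        shifted = [0] * d + MIN_POLY
        k = r[-1]
        r = trim([x - k * y for x, y in zip(r, shifted)])
    return r

def rep(a):
    # coordinates in the basis {1, [x]_P, [x^2]_P}
    r = reduce_mod(a)
    return tuple(r + [0] * (3 - len(r)))

cur, seen = [1], set()
for _ in range(7):
    seen.add(rep(cur))
    cur = reduce_mod(mul(cur, [0, 1]))   # multiply by the class [x]_P

assert len(seen) == 7                    # seven distinct non-zero classes
assert rep(cur) == (1, 0, 0)             # [x]_P to the 7th power is 1
```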
Exercises (63)
The steps below will prove Proposition 12.3.4. Note that from Exercise set (62)
we already know that F [x]/P (x) is a field.
1. Prove Conclusion 1 of the proposition. There are several ways to approach
this. Some steps may be saved if Lemma 3.4.6 is taken into account.
2. This is the start of a proof of Conclusion 2 of the proposition. Let (x)
represent the polynomial in which all coefficients are equal to zero except
that the coefficient of x1 is one. Similarly, let (xn ) represent the polynomial in which all coefficients are equal to zero except that the coefficient of
xn is one. Prove that (x)n = (xn ). Use this to conclude that in F [x]/P (x)
the equality ([x]P )n = [xn ]P holds.
3. This will help with Conclusion 2 of the proposition as well as Conclusion 4.
For a polynomial M (x) = Σ_{i=0}^∞ a_i x^i in F [x], and k ∈ F , define (kM )(x) to
be the polynomial Σ_{i=0}^∞ (ka_i ) x^i . Prove that this multiplication of polynomial
times “scalar” and the addition and negation in F [x] (but ignoring the
multiplication in F [x]), makes F [x] a vector space over F . There are
many things to check. See Section 1.4.2 for the list.
4. This will also help with Conclusions 2 and 4. If M (x) is a polynomial in
F [x], then in F [x]/P (x), we have M ([x]P ) = [M (x)]P . This includes a
small amount of calculation and a very careful review of definitions.
5. Prove Conclusion 2 of Proposition 12.3.4.
6. Prove Conclusion 4 of Proposition 12.3.4. That is right. Do 4 before 3.
You should decide what the basis is, and prove that it is the basis. A
review of definitions will help with linear independence.
7. Prove Conclusion 3 of Proposition 12.3.4. This asks what has to be in a
subfield of F [x]/P (x) that contains F and [x]P .
The point of Proposition 12.3.4 is that even if we don’t have an extension
having a root of some P (x), we can build one that does. In the proposition, we
need to assume that P (x) is irreducible, but later we will see how to handle any
polynomial.
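The quotient construction can be carried out by hand in a small case. Below is a minimal computational sketch (an illustration added here, not part of the text's development), representing classes in Q[x]/(x² − 2) by their remainders a + b[x] with rational a and b; it checks Conclusion 2, that the class [x] is a root of P (x) = x² − 2.

```python
from fractions import Fraction

# Classes in Q[x]/(x^2 - 2) are represented by their remainders a + b*[x],
# stored as pairs (a, b) of rationals.  Multiplication reduces [x]^2 to 2.

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    a, b = u
    c, d = v
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2, and x^2 ≡ 2 mod (x^2 - 2)
    return (a * c + 2 * b * d, a * d + b * c)

x_cls = (Fraction(0), Fraction(1))   # the class [x]
# [x]^2 equals the class of the constant 2, so [x] is a root of x^2 - 2
print(mul(x_cls, x_cls))             # (Fraction(2, 1), Fraction(0, 1))
```

The same two functions, with the reduction rule changed, give arithmetic in F [x]/P (x) for any quadratic P (x).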
Note that the constant polynomials in F [x] form a subfield of the ring F [x],
and that in turn the constant polynomials are exactly the units of the ring F [x]
together with the zero polynomial. That this forms a field is an accident as the
next exercise shows.
Exercises (64)
1. Find an n so that the ring Zn (which is not a field unless n is prime) does
not have a subfield consisting exactly of the units and the zero element.
12.3.3
A result about automorphisms
From Proposition 12.3.4, we know a great deal about the structure of F [x]/P (x).
We should be able to say something about its automorphisms. Specifically, we
will say something about the automorphisms that fix F .
To keep the notation simpler we will look at an extension F ⊆ E where
α ∈ E is algebraic over F with a minimal polynomial P (x). We know that the
extension F ⊆ F (α) has the same structure as F ⊆ F [x]/P (x) and F (α) is less
complicated to write than F [x]/P (x).
It is possible that F (α) has other roots of P (x). If d is the degree of P (x),
we know that there cannot be more than d roots of P (x) in F (α). Let the roots
of P (x) in F (α) be {α1 , α2 , . . . , αn } with n ≤ d and with α1 = α. Note that if
Q(x) is another minimal polynomial for α, it is a non-zero constant times P (x)
and has exactly the same roots as P (x).
Consider an αi for i ≠ 1, and consider F (αi ). This is a subfield of F (α)
since αi is in F (α).
Since P (x) is minimal for α it is irreducible over F . Since αi is a root of
P (x) and P (x) is irreducible over F , P (x) is a minimal polynomial for αi . By
Lemma 12.3.2, we know that [F (αi ) : F ] = d and also that [F (α) : F ] = d. But
F ⊆ F (αi ) ⊆ F (α) so
[F (α) : F ] = [F (α) : F (αi )][F (αi ) : F ].
But this implies that [F (α) : F (αi )] = 1 and F (αi ) = F (α). Thus all the F (αi )
are equal.
Now for each i there is an isomorphism φi = vαi from F [x]/P (x) to F (αi )
that takes each constant polynomial k to k in F (αi ) and takes [x]P to αi . Now
φi ◦ (φ1 )−1 takes F (α) isomorphically to F (αi ) so that each k ∈ F is taken to
itself and so that α is taken to αi . Since F (αi ) = F (α), this is an automorphism
of F (α) that fixes F and that takes α to αi . We have just discovered an element
θi = φi ◦ (φ1 )−1 of Aut(F (α)/F ). We know that all the θi are different since
they do different things to α.
We now argue that Aut(F (α)/F ) consists entirely of the θi . Let σ be in
Aut(F (α)/F ). By Proposition 10.4.1, we know that σ(α) is a root of P (x).
Thus σ(α) is one of the αi and σ and θi agree on α. We now need a quick
exercise.
Exercises (65)
1. If E is a field and ρ and θ are in Aut(E), then {y ∈ E | ρ(y) = θ(y)} is a
subfield of E.
By the exercise, the set of elements on which σ and θi agree is a subfield of
F (α). But this subfield contains F and contains α. Thus it contains F (α) and
σ and θi are the same element in Aut(F (α)/F ).
We have one more observation to make before we are ready to state a result.
So far we have shown that if F (α) contains n different roots of a minimal
polynomial P (x) for α, then there are n elements in the group Aut(F (α)/F ) and
that each root of P (x) is the image of α under some element of Aut(F (α)/F ).
Further, an element of Aut(F (α)/F ) is completely determined by what it does
on α. Recall that we also showed that if αi is another root of P (x), then
F (αi ) = F (α), so that we can let αi play the role of α. This lets us repeat for
αi all that has been said about α.
All these observations combine to give the following extremely important
result about automorphism groups. We call it a proposition since later we will
generalize it to larger extensions and will call the generalization a theorem.
Proposition 12.3.5 Let F ⊂ E be an extension of fields, and let α ∈ E be
algebraic over F with minimal polynomial P (x) ∈ F [x] of degree d. Then the
number of elements in Aut(F (α)/F ) is exactly the number of roots of P (x) in
F (α) and is no larger than d. Further, given any two (not necessarily different)
roots αi and αj of P (x) in F (α), there is an element of Aut(F (α)/F ) that
carries αi to αj . Lastly, if αi is a root of P (x) in F (α), then each automorphism
in Aut(F (α)/F ) is determined completely by what it does on αi .
If we restrict the action of the group Aut(F (α)/F ) to the set of all roots
R = {α1 , α2 , . . . , αn } of P (x) in F (α), then we note that the orbit of any of the
αi is all of R. Recall (Section 9.3) that when a group acts on a set so that there
is only one orbit (i.e., given any two elements of the set, there is an element of
the group taking one to the other), then it is said that the action of the group
on the set is transitive. Thus Proposition 12.3.5 says, assuming its hypotheses,
that the action of Aut(F (α)/F ) on the roots of P (x) in F (α) is transitive.
12.3.4 Examples
We give a few examples to show the power of Proposition 12.3.5. They also illustrate some of the need for the care that went into the wording of the proposition.
We work out one fully in the text and then give others as exercises.
Cube roots of 1
We will use Q as the base field for our extensions. The polynomial P (x) = x3 −1
is not irreducible over Q since it factors as (x − 1)(x2 + x + 1). The left factor
has a root of 1 and the right factor has as roots ω = −1/2 + i√3/2 and
ω² = −1/2 − i√3/2. The right factor is irreducible over Q (the factorization
x2 + x + 1 = (x − ω)(x − ω²) uses coefficients outside of Q) and so is a minimal
polynomial for ω and ω².
Since ω² is in any field that contains ω, we know that it is in Q(ω). From
Proposition 12.3.5, we know that Aut(Q(ω)/Q) has exactly two elements—one
that takes ω to itself and one that takes ω to ω². Since the identity must be
an element in Aut(Q(ω)/Q), the first must be the identity. What must be the
image of ω² under the non-identity element of Aut(Q(ω)/Q)?
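A quick floating-point check (an added illustration, not a proof) confirms the picture: ω and ω² are the two roots of x² + x + 1, and complex conjugation, which fixes every rational, takes ω to ω². So conjugation restricts to the non-identity element of Aut(Q(ω)/Q).

```python
# omega = -1/2 + i*sqrt(3)/2, the non-real cube root of 1 in the second quadrant
omega = complex(-0.5, 3 ** 0.5 / 2)

def f(z):
    # the irreducible right factor x^2 + x + 1
    return z * z + z + 1

print(abs(f(omega)) < 1e-12)                        # True: omega is a root
print(abs(f(omega ** 2)) < 1e-12)                   # True: omega^2 is a root
print(abs(omega.conjugate() - omega ** 2) < 1e-12)  # True: conjugation sends omega to omega^2
```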
Fifth roots of 1
The fifth roots of 1 are evenly spaced around the unit circle in C and one of
them is 1. The separation of 2π/5, or 72°, puts one of the four non-real roots
in each of the four quadrants. The non-real root in the first quadrant we will
denote by α and the other three non-real roots will be α2 , α3 and α4 . These
are all roots of x5 − 1 but x5 − 1 factors as (x − 1)(x4 + x3 + x2 + x + 1). The
fifth root 1 of 1 is the root of x − 1 and the four non-real roots are roots of
the right factor. The exercises below will address the irreducibility of the right
factor and then will address Aut(Q(α)/Q).
Exercises (66)
1. Prove that there is no degree one factor of x4 + x3 + x2 + x + 1 over Q.
2. Prove that there is no degree two factor of x4 + x3 + x2 + x + 1 over
Q. (Hint: What must the roots of such a degree two factor be and what
would that say about the coefficients in the factor. Use part (e) of the
problem in Exercise set (47) and its consequences discussed in Section
11.9.) Conclude that x4 + x3 + x2 + x + 1 is irreducible over Q.
3. Argue that Aut(Q(α)/Q) has exactly four elements. Give each element of
Aut(Q(α)/Q) a name and write out the multiplication table using these
names for Aut(Q(α)/Q). What is this group isomorphic to?
4. (This has nothing to do with automorphisms.) Show that √5 is an element
of Q(α).
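For Exercise 4, a floating-point computation (a hint, not a proof) suggests where to look: the combination α + α⁴ is real, and 2(α + α⁴) + 1 agrees with √5, so √5 is expressible in powers of α.

```python
import cmath

# alpha = e^{2 pi i / 5}, the fifth root of 1 in the first quadrant
alpha = cmath.exp(2j * cmath.pi / 5)

s = alpha + alpha ** 4   # equals 2*cos(72 degrees), a real number
print(abs(s.imag) < 1e-12)                       # True: alpha^4 is conj(alpha)
print(abs(2 * s.real + 1 - 5 ** 0.5) < 1e-12)    # True: 2(alpha + alpha^4) + 1 = sqrt(5)
```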
Cube roots of 2
The cube roots of 2 are evenly spaced around the circle of radius ∛2 centered at
the origin, with one root being the real root ∛2 that is the “usual” cube root of
2. The other two are at angles ±120° from the real root and can be represented
as β1 = ω∛2 and β2 = ω²∛2, where ω and ω² are the two non-real cube roots
of 1.
Exercises (67)
1. Prove that x3 − 2 is irreducible over Q and is thus a minimal polynomial
over Q for each of the cube roots of 2.
2. How many roots of x3 − 2 are in Q(∛2)?
3. How many elements does Aut(Q(∛2)/Q) have?
4. How many roots of x3 − 2 does Q(β1 ) have?
5. How many elements does Aut(Q(β1 )/Q) have?
Sixth roots of 1
The sixth roots of 1 are evenly spaced around the unit circle in C with one being
1, one being −1 and the other four not real. Since they are spaced 60° apart,
there is one non-real sixth root of 1 in each of the four quadrants. Let γ be the
non-real root in the first quadrant. The sixth roots of 1 are all roots of x6 − 1.
However, the polynomial x6 − 1 factors into (x − 1)(x5 + x4 + x3 + x2 + x + 1).
Exercises (68)
1. Show that x5 + x4 + x3 + x2 + x + 1 factors into three factors in Q[x]
of degrees 2, 2, and 1, respectively. Show that each of these factors is
irreducible.
2. How many roots of x6 − 1 are in Q(γ)?
3. What can you say about Aut(Q(γ)/Q)?
Project (optional) 1. Existence of finite fields
To build finite extensions of Zp , one follows the same outline. The key is to find
polynomials with coefficients in Zp of desired degrees that do not factor (called
irreducible in the next chapter) into smaller polynomials with coefficients in Zp .
These are easier to show exist than to find. The fact that they exist comes from
the fact that there are more polynomials of a given degree than polynomials
that result from multiplying polynomials of smaller degree. It follows from this
that a finite field exists of size pd for each prime p and each integer d ≥ 1.
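The counting claim can be checked by brute force in tiny cases. The sketch below (an added illustration; for degrees 2 and 3 only, having no root in Zp is equivalent to being irreducible) finds the irreducible monic cubics over Z2.

```python
from itertools import product

def has_no_root(coeffs, p):
    # coeffs = (c0, c1, ..., 1), a monic polynomial over Z_p.
    # For degree 2 or 3, irreducible over Z_p means exactly: no root in Z_p.
    def value(a):
        return sum(c * a ** i for i, c in enumerate(coeffs)) % p
    return all(value(a) != 0 for a in range(p))

p, d = 2, 3
monic_cubics = [lower + (1,) for lower in product(range(p), repeat=d)]
irreducible = [f for f in monic_cubics if has_no_root(f, p)]
print(len(irreducible))   # 2: x^3 + x + 1 and x^3 + x^2 + 1
print(irreducible)
```

Either of the two polynomials found can then be used to build the field Z2[x]/P (x) with 2³ = 8 elements.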
One can attempt direct calculations of this as a project. To see a particularly
elegant proof that irreducible polynomials of all degrees always exist, see [1],
Section 16.9, Pages 368–371. The narrative there will require reading some of
the pages that come before that section.
Chapter 16 of the book [1] has a lot of interesting information on finite fields,
some of which duplicates what has been covered here, and much that has not.
Chapter 13
Multiple extensions
Consider the cubic polynomial P (x) = x3 + 3x − 2. Its derivative P ′ (x) =
3x2 + 3 is positive for all real numbers, so P (x) is strictly increasing on all of R.
Thus P (x) has only one real root. The real root is α = ∛(1 + √2) + ∛(1 − √2),
which can be verified either by evaluating P (α) directly or by going through the
formula
formula
s
 s

r r 2
3
2
3
r
r
3 −r
p   3 −r
p 
x=
+
+
+
−
+
2
2
3
2
2
3
for the solutions to cubic polynomial equations of the form x3 + px + r = 0. The
coefficients of P (x) were chosen deliberately to cooperate reasonably well with
the formula.
We are studying whether polynomials can have their roots calculated from
their coefficients by the five processes of addition, subtraction, multiplication,
division and the taking of n-th roots. The first four of the five operations are
field operations, so we can summarize the process by saying that we want to
find the roots of a polynomial by the processes of field operations and the taking
of n-th roots.
If we take the example just given and see how the allowed processes can be
arranged, one step at a time, to take us from the coefficients to the roots, we
would start with Q, the smallest field that contains the coefficients. Calculations
that we can make before taking of any n-th roots would include calculating the
expression
(r/2)² + (p/3)³ (13.1)
that lies inside the square roots. These involve only field operations and stay
within the field containing the coefficients. In the example, the expression (13.1)
evaluates to 2.
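The arithmetic of this example can be double-checked numerically (floating point only; an added sanity check, not part of the development): with p = 3 and r = −2, expression (13.1) evaluates to 2, and the value α built from the formula is a root of P (x).

```python
def real_cbrt(t):
    # real cube root; Python's ** returns a complex value for negative bases
    return t ** (1 / 3) if t >= 0 else -((-t) ** (1 / 3))

p, r = 3, -2                      # P(x) = x^3 + p*x + r = x^3 + 3x - 2
inner = (r / 2) ** 2 + (p / 3) ** 3
print(inner)                      # 2.0, the value of expression (13.1)

alpha = real_cbrt(-r / 2 + inner ** 0.5) + real_cbrt(-r / 2 - inner ** 0.5)
print(abs(alpha ** 3 + 3 * alpha - 2) < 1e-12)   # True: alpha is the real root
```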
Then a square root must be taken. This goes outside the field containing
the coefficients and brings us to the field Q(√2). Adding −r/2 to the square
root (or its negative) is another field operation and stays within Q(√2), but the
taking of the cube roots is a problem.
Note that if we get one of the cube roots in a field, we will also have the
other. This is because
∛(1 + √2) · ∛(1 − √2) = ∛((1 + √2)(1 − √2)) = ∛(1 − 2) = ∛(−1) = −1
so that ∛(1 − √2) = −(∛(1 + √2))⁻¹.
It is possible to believe that the cube roots exist in Q(√2), but in fact they do
not. At the end of this chapter we will see how to argue that P (x) is irreducible
over Q. This means that Q(α) is of degree 3 over Q. If P (x) is reducible over
Q(√2), then at least one factor of P (x) in Q(√2)[x] must be degree one and
give the real (because Q(√2) lies in R) root α. Since [Q(√2) : Q] = 2, this
would put Q(α) inside a field of degree 2 over Q. This is not possible.
Thus P (x) is also irreducible over Q(√2) and the cube roots cannot be
in Q(√2). Thus at least one more extension must be created. Specifically,
we must create the extension Q(√2)(∛(1 + √2)). As noted above, the field
Q(√2)(∛(1 + √2)) will contain both of the cube roots that are needed to build
α.
So we see that if we are to follow the processes that we want to use to
arrive at roots, we must look at extensions that are more complicated than the
extensions by single elements that we studied in the previous chapter. This
chapter proposes to do just that.
In fact, we can arrive at a field containing α just by forming Q(α). This
will be a degree 3 extension of Q by Lemma 12.3.2. But building this extension
does not follow the allowable processes in a step by step fashion. Adding the
element α to Q all at once combines several operations, including the taking of
n-th roots.
Note that neither Q(α) nor Q(√2)(∛(1 + √2)) contains the other roots of
P (x) since the other roots must be non-real complex numbers. One theme in
this chapter will be that looking at a smallest extension that contains all the
roots of a polynomial will reveal much about the polynomial and its roots. As
this example shows, such extensions will often take more than one step.
Lastly, this chapter will end with a discussion of one way to show that a
polynomial is irreducible. In particular it will apply to the polynomial P (x) =
x3 + 3x − 2, and it will apply to other important cases.
13.1 Multiple extensions
Recall the following.
Lemma 3.4.5 Let F ⊆ E be an extension of fields, and let {a1 , a2 , . . . , an } be
a subset of E. Then
F (a1 )(a2 ) · · · (an ) = F (a1 , a2 , . . . , an ).
To illustrate what this means for an extension done in two steps, we take
an extension of fields F ⊂ E and we let α and β be elements of E. Then
Lemma 3.4.5 implies that F (α, β) = F (α)(β) = F (β)(α).
Of the three fields being compared, the first is the smallest subfield of E
containing F and α and β, the second is the smallest subfield of E containing
F (α) and β where F (α) is the smallest subfield of E containing F and α. The
third is the smallest subfield of E containing F (β) and α, where F (β) is the
smallest subfield of E containing F and β.
There is terminology to separate out those extensions that can be done in
one step. If F ⊆ E is an extension of fields, then we say that the extension is
simple if there is an element α ∈ E so that E = F (α). In this situation, α is
called a primitive element for the extension. We shortly give conditions under
which primitive elements always exist.
The wording of the definition of simple has been carefully chosen. Note that
an extension F (α, β) appears not to be a simple extension of F , but it might
be if there is an element γ ∈ F (α, β) for which F (γ) = F (α, β).
Exercises (69)
1. Is Q(√2, √3) a simple extension of Q? Hint: find a lot of elements in the
extension.
The previous chapter analyzed the structure of simple extensions and had a
lot to say about the automorphisms of a simple extension. We wish to do the
same here for multiple extensions. However, we will limit ourselves to multiple extensions by algebraic elements. We next make some general (and fairly
powerful) observations about such extensions.
Notation
We will often be more interested in extensions of fields by specific elements than
in some larger field that might contain even more elements. Thus we will often
let F (α) be an extension of F by α without saying that α comes from a specific
field originally containing both F and α. If this is hard to accept, then we
can say that letting F (α) be an extension of F by α is shorthand for saying
that F ⊆ E is an extension of fields and that α ∈ E is an element for which
F (α) = E.
13.2 Algebraic extensions
Let F ⊆ E be an extension of fields. We say that E is algebraic over F if every
element of E is algebraic over F . We need to see that algebraic extensions exist.
The next lemma discusses finite dimensional extensions. The next definition
should have been given earlier. If F ⊆ E is an extension of fields, we say that E
is a finite extension of F if the degree [E : F ] is finite. Thus we save one word by
referring to finite dimensional extensions as finite extensions. Note that saying
that E is a finite extension of F says nothing about the number of elements of
E.
Lemma 13.2.1 Let E be a finite extension of the field F . Then E is algebraic
over F . Further, if α is an element of E and [E : F ] = d, then α has a minimal
polynomial of degree at most d.
Proof. Proving the second sentence will prove the entire lemma. Consider the set
S = {1 = α0 , α1 , α2 , . . . , αd }. If two of these (say αi and αj with 0 ≤ i < j ≤ d)
are equal, then α is a root of xj − xi which has degree at most d. If all αi and
αj with 0 ≤ i < j ≤ d are different, then S has d + 1 elements which must be
linearly dependent over F . A linear dependence will be a non-zero polynomial
of degree no more than d with α as a root.
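The proof is constructive in spirit: in a finite extension, the powers of α must become linearly dependent. As an added illustration, take α = √2 + √3 in Q(√2, √3), an extension of degree 4 over Q; the dependence x⁴ − 10x² + 1 is stated here without derivation and checked only in floating point.

```python
# alpha = sqrt(2) + sqrt(3) lies in a degree 4 extension of Q, so the five
# powers {1, alpha, alpha^2, alpha^3, alpha^4} are linearly dependent over Q.
# One such dependence gives the polynomial x^4 - 10 x^2 + 1.
alpha = 2 ** 0.5 + 3 ** 0.5
print(abs(alpha ** 4 - 10 * alpha ** 2 + 1) < 1e-9)   # True: alpha is a root
```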
Lemma 13.2.1 could have been proven back in Chapter 3, just after the
definition of the degree of an extension in Section 3.4.3. It says that finite
extensions are algebraic. We can combine this with Proposition 12.3.4 from
Chapter 12 which says that an extension by an algebraic element is a
finite extension, and with facts about degrees that were established in Section
10.5.2 to get a suite of lemmas about algebraic extensions.
Lemma 13.2.2 The extension F (α) of the field F is algebraic over F if
α is algebraic over F .
We know that α is algebraic over F . The point is that so is every other
element of F (α).
Proof. From Proposition 12.3.4, the degree [F (α) : F ] is finite, and from Lemma
13.2.1, we get that F (α) is algebraic over F .
Lemma 13.2.2 is really a special case of the next lemma.
Lemma 13.2.3 Let F ⊆ E be an extension of fields and assume that E =
F (α1 , α2 , . . . , αn ), let F0 = F and for each i with 1 ≤ i ≤ n, let Fi =
F (α1 , α2 , . . . , αi ). Assume for each i with 1 ≤ i ≤ n that αi is algebraic over
Fi−1 . Then E is algebraic over F .
Proof. Proposition 12.3.4 says that each Fi with 1 ≤ i ≤ n is a finite extension of
Fi−1 . The result follows from Lemma 13.2.1 and the fact that degrees multiply
(Lemma 10.5.1).
Note that a consequence of Lemma 13.2.3 is that all of the αi in the statement
of the Lemma turn out to be algebraic over F .
Corollary 13.2.4 The extension F (S) of the field F by a finite set S of elements
is algebraic if each element of S is algebraic over F .
Proof. This follows from Lemma 3.4.5 and the fact that an element that is
algebraic over F is algebraic over any extension of F .
Corollary 13.2.4 can be generalized to extensions by infinite sets of algebraic
elements, but first we need another of its consequences.
Lemma 13.2.5 Let F ⊆ E be an extension of fields, and let A be the set of all
elements of E that are algebraic over F . Then A is a field.
Of course A will then be an algebraic extension of F .
Proof. If α and β are in A, then they are in F (α, β) which must then be an
algebraic extension of F by Corollary 13.2.4. But F (α, β) contains α + β, αβ,
−α and (when α ≠ 0) α−1 . These four quantities must then be algebraic which
puts them in A. Thus A is closed under the four basic operations of a field and
is a field.
Lemma 13.2.6 Let F (S) be an extension of F by a (not necessarily finite) set
of elements all of which are algebraic over F . Then F (S) is algebraic over F .
Proof. By Lemma 13.2.5, there is a subfield A of F (S) consisting of all the
elements of F (S) that are algebraic over F . The subfield A contains F and S
and so contains F (S).
Lemma 13.2.7 If F ⊆ K ⊆ E are field extensions, if K is algebraic over F
and E is algebraic over K, then E is algebraic over F .
If F = F1 ⊆ F2 ⊆ F3 ⊆ · · · ⊆ Fn = E are field extensions and for each i
with 1 ≤ i < n the extension Fi ⊆ Fi+1 is algebraic, then E is algebraic over F .
Proof. The second sentence follows by induction from the first. To prove the
first sentence, let α be in E. It is algebraic over K and so there is a non-zero
polynomial P (x) ∈ K[x] with P (α) = 0. If {c1 , c2 , . . . , ck } is the set of non-zero
coefficients of P (x), then it follows from Lemma 13.2.3 that F (c1 , c2 , . . . , ck , α)
is algebraic over F , so that α is algebraic over F .
13.3 Automorphisms
We know about the automorphisms of simple algebraic extensions. Here we say
something about automorphisms of multiple algebraic extensions. We do not
try to handle the most general situations, and so confine ourselves to extensions
that are obtained by finitely many successive simple extensions.
There are two ways to work on this. One way is to prove that a succession
of finitely many simple extensions is, in fact, a simple extension in disguise.
Exercise set (69) gives one example. This works in many situations, and later
we will see that it works in the situations that we are interested in.
A second way is to build on our knowledge of automorphisms of simple
extensions and build up facts about successive simple extensions in a step-by-step
manner. We will take this approach here, but first we point out a
complication that will occur.
Consider field extensions
F ⊆ F (α) ⊆ F (α)(β) = F (α, β) = E
where α is algebraic over F and β is algebraic over F (α). It follows from Lemma
13.2.3 that β is also algebraic over F , but our emphasis will be on the fact that
β is algebraic over F (α).
Let P (x) be a minimal polynomial for α over F , and let Q(x) be a minimal
polynomial for β over F (α). That is, P (x) ∈ F [x] and Q(x) ∈ F (α)[x]. Let the
degree of P (x) be p, and let the degree of Q(x) be q.
Let θ be an automorphism in Aut(E/F ). Since θ fixes F , the field containing the coefficients of P (x), we know that θ(α) is another root α′ of P (x).
From our discussion in Section 12.3.3 about automorphisms of simple, algebraic
extensions, we know that F (α′ ) = F (α) and that the restriction of θ to F (α)
gives an isomorphism from F (α) to F (α′ ) = F (α) and is thus an automorphism
in Aut(F (α)/F ).
If we now look at what θ does to β, the situation is more complicated. The
field F (α) that contains the coefficients of Q(x) is not fixed by θ.[1] Thus the
discussion of Section 12.3.3 needs to be expanded so that instead of insisting
that the field containing the coefficients of the minimal polynomial is kept fixed
elementwise, it is instead moved by an automorphism. In fact, it is no harder
to deal with the situation in which the field containing the coefficients is moved
by an isomorphism, and this is what we will do.
13.3.1 Relativizing Proposition 12.3.5
Certain mathematical results come in two versions. One version (called the
absolute version) will discuss facts about a given structure. The second version
(called the relative version) will discuss facts about a given structure while taking
into account a second (similar) structure. The relative version often proves more
useful as a building block that can be combined with itself or other results in
complex situations.
Our studies are already partly relative. We have not just investigated
Aut(E), the automorphisms of the field E, we have investigated Aut(E/F ),
the automorphisms of the field E that keep the subfield F fixed. However, now
we go farther and allow F to move. As mentioned above, it is just as easy to
allow isomorphisms of F and E as it is to allow automorphisms of F and E,
and we will also have a need for this extra flexibility. We have to look at what
the setting will be.
We start with a field extension F ⊆ E, an element α of E that is algebraic over F , and a minimal polynomial P (x) ∈ F [x] for α. Since we are only
interested in F (α), we may as well assume E = F (α).
[1] If one tries to get around this by replacing Q(x) by a polynomial that is minimal for β over
F instead of over F (α), then the unknown interaction between α and β becomes a problem.
In a vague sense, the polynomial Q(x) that is minimal for β over F (α) contains the required
information about the interaction of α and β.
We then take a field F ′ together with an isomorphism φ : F → F ′ and an
extension E ′ of F ′ . We wonder if there is an isomorphism θ that extends φ from
E = F (α) into a subfield of E ′ . The statement that θ extends φ means that the
restriction of θ to F equals φ.
In the former situation F ′ = F and φ was the identity. In that case we knew
that θ(α) had to be another root of P (x). We can say something similar in this
situation under the assumption that the extension θ exists.
Let P (x) = Σ_{i=0}^∞ aᵢxⁱ and let φ(P )(x) denote the polynomial Σ_{i=0}^∞ φ(aᵢ)xⁱ. This
puts φ(P )(x) in F ′ [x]. If we now assume that an extension θ of φ to F (α) exists,
then we have
0 = θ(0) = θ(P (α)) = θ(Σ_{i=0}^∞ aᵢαⁱ) = Σ_{i=0}^∞ θ(aᵢ)(θ(α))ⁱ = Σ_{i=0}^∞ φ(aᵢ)(θ(α))ⁱ = φ(P )(θ(α))
where the next to last equality holds because θ is an extension of φ and the
equality before that one holds because θ is an isomorphism. Thus θ(α) is a root
of φ(P )(x).
Next we argue that φ(P )(x) is irreducible over F ′ . This follows from a lemma
that should have been done in the chapter on polynomials. Here it is.
Lemma 13.3.1 Let F and F ′ be fields and let φ : F → F ′ be an isomorphism.
Then taking each P (x) ∈ F [x] to φ(P )(x) ∈ F ′ [x] is an isomorphism from F [x]
to F ′ [x].
Exercises (70)
1. Prove Lemma 13.3.1. The formulas in (11.4) make the calculations easy.
2. Use Lemma 13.3.1 to prove that in the setting of the lemma if P (x) ∈ F [x]
is irreducible over F , then φ(P )(x) ∈ F ′ [x] is irreducible over F ′ .
We are now in a position to prove the following.
Proposition 13.3.2 Let F and F ′ be fields and let φ : F → F ′ be an isomorphism. Let P (x) ∈ F [x] be irreducible over F . Then φ : F [x]/P (x) →
F ′ [x]/φ(P )(x) defined by φ([H(x)]) = [φ(H)(x)] is a well defined isomorphism.
Proof. For well definedness, we let H(x) and J(x) be in the same class. This
means that (H − J)(x) = P (x)M (x) for some M (x) ∈ F [x]. From Lemma
13.3.1, we have
(φ(H) − φ(J))(x) = φ(P )(x)φ(M )(x)
and [φ(H)(x)] = [φ(J)(x)].
The function φ is a homomorphism since φ on F [x] is a homomorphism, and
F [x] contains the representatives of the classes in F [x]/P (x). It is one-to-one
since it is a homomorphism of fields. It is onto since φ is onto on the coefficients.
Exercises (71)
1. Explain the last sentence in the proof above.
We can now give a relative version of Proposition 12.3.5.
Proposition 13.3.3 Let F ⊆ F (α) be fields with α algebraic over F with minimal polynomial P (x) of degree d. Let F ′ ⊆ E be fields and let φ : F → F ′ be
an isomorphism. Let A be the set of all homomorphisms θ : F (α) → E that are
extensions of φ in that θ(x) = φ(x) for all x ∈ F . Let B be the set of roots of
φ(P )(x) in E. Then θ(α) is in B for each θ ∈ A, and sending θ ∈ A to θ(α) in
B is a one-to-one correspondence from A to B. In particular, A has no more
than d elements.
Proof. That θ(α) is in B for every θ in A is shown in the calculation before
Lemma 13.3.1. We know that if two homomorphisms agree on F and on α,
then they agree on all of F (α). Thus taking θ ∈ A to θ(α) ∈ B is one-to-one.
To show that this is onto, consider some β ∈ B. We have isomorphisms
F (α) → F [x]/P (x) → F ′ [x]/φ(P )(x) → F ′ (β)
by Lemma 12.3.2 and Proposition 13.3.2. For k ∈ F , the isomorphisms take k first to [k],
then to [φ(k)], then to φ(k). The element α is taken first to [x], then to [x]
(since φ(1) = 1), then to β. Thus the composition of the isomorphisms above
is an element of A that takes α to β.
13.3.2 Applying the relative proposition
The next theorem relies heavily on the fact that if F ⊆ K ⊆ E are finite
extensions of fields, then [E : F ] = [E : K][K : F ].
Theorem 13.3.4 Let F ⊆ E be an extension of fields with [E : F ] = d < ∞.
Let F ′ ⊆ E ′ be an extension of fields and let φ : F → F ′ be an isomorphism.
Then there are no more than d different homomorphisms θ : E → E ′ that extend
φ.
Proof. We will induct on d. The theorem is true if d = 1. Our inductive
hypothesis will be that the theorem is true for all extensions of F of degree less
than d.
Let K be a field with F ⊆ K ⊆ E and with k = [K : F ] as large as possible
but still strictly less than d. We can find such a K since there are only finitely
many possibilities for [K : F ] with F ⊆ K ⊆ E. By our inductive hypothesis,
there are no more than k different homomorphisms θ : K → E ′ that extend φ.
Since K is not all of E, there is an element α in E − K. We have
F ⊆ K ⊆ K(α) ⊆ E.
Since α ∉ K, we have [K(α) : K] ≠ 1 so that [K(α) : F ] > [K : F ]. But the way
that we picked K forces K(α) = E.
Since all the extensions are finite, α is algebraic over K and has a minimal
polynomial P (x) ∈ K[x] of some degree q. Since F ⊆ K ⊆ K(α) = E, we have
d = [E : F ] = [K(α) : F ] = [K(α) : K][K : F ] = qk
where we know that [K(α) : K] = q from Corollary 12.3.3.
Let θ : E = K(α) → E ′ be an extension of φ. Note that the restriction θ|K
of θ to K is an extension of φ to a homomorphism of K into E ′ . Now θ is an
extension of θ|K . Thus each extension θ : E → E ′ of φ can be thought of as
obtained in two steps: first extend to a homomorphism with domain K, and
then extend that homomorphism to one with domain E.
However, there are only k different extensions of φ to K, and for each extension ρ of φ to K, Proposition 13.3.3 applied to K, ρ and K(α) gives that
there are no more than q extensions of ρ to all of K(α) = E. Thus there are no
more than kq = d possible extensions of φ to all of K(α) = E.
Corollary 13.3.5 Let F ⊆ E be an extension of fields with [E : F ] = d < ∞.
Then Aut(E/F ) has no more than d elements.
Proof. Apply Theorem 13.3.4 where F ′ = F , E ′ = E and where the isomorphism
from F to F ′ = F is the identity.
Example
We add to one of the examples in Section 12.3.4 to illustrate the need for the
generality in Proposition 13.3.3.
Consider Q(∛2, ω) where ω = −1/2 + i√3/2 is the cube root of 1 in the second
quadrant of the complex plane. We can figure out [Q(∛2, ω) : Q] in stages by
looking at Q ⊆ Q(∛2) ⊆ Q(∛2, ω). We know that [Q(∛2) : Q] = 3 since x3 − 2
is irreducible over Q. Also, Q(∛2) ⊆ R so x2 + x + 1 is irreducible over Q(∛2)
since it is irreducible over R. Thus [Q(∛2, ω) : Q(∛2)] = 2. This gives
[Q(∛2, ω) : Q] = [Q(∛2, ω) : Q(∛2)][Q(∛2) : Q] = (2)(3) = 6.
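The counting in this example can be previewed numerically (floats; an added check, not part of the argument): the three roots of x3 − 2 and the two roots of x2 + x + 1 pair up into 3 · 2 = 6 possible images (θ(∛2), θ(ω)), matching the degree 6 just computed.

```python
# omega and the real cube root of 2, as floating-point complex numbers
omega = complex(-0.5, 3 ** 0.5 / 2)
c = 2 ** (1 / 3)

cube_roots_of_2 = [c, omega * c, omega ** 2 * c]
roots_of_quadratic = [omega, omega ** 2]

print(all(abs(z ** 3 - 2) < 1e-9 for z in cube_roots_of_2))        # True
print(all(abs(z * z + z + 1) < 1e-9 for z in roots_of_quadratic))  # True

# An extension of the identity on Q is determined by where it sends
# cbrt(2) and omega, so there are at most 3 * 2 = 6 of them.
print(len([(a, b) for a in cube_roots_of_2 for b in roots_of_quadratic]))  # 6
```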
We ask what are the extensions of the identity on Q to homomorphisms of
Q(∛2, ω) into C. Such an extension will be determined by what it does on ∛2
and on ω. (Why?) The extension must take ∛2 to some root of x3 − 2 and
must take ω to some root of x2 + x + 1.
Now the three roots of x3 − 2 in C are ∛2, ω∛2 and ω²∛2. These are all in
Q(∛2, ω). Also both ω and ω² are in Q(∛2, ω). So any extension of the identity
on Q to a homomorphism of Q(∛2, ω) into C must have its image in Q(∛2, ω).
(Why?) But the image will have degree 6 over Q, and so must be all of Q(∛2, ω).
Thus the extensions we seek are just elements of Aut(Q(∛2, ω)/Q).
If we follow the outline in the proof of Theorem 13.3.4, we must pick a field K containing Q and strictly contained in Q(∛2, ω) of largest degree over Q. Since the degrees involved must be factors of 6, the largest such factor is 3. Thus Q(∛2) can be used for K.
The three roots of x³ − 2 in C are all in Q(∛2, ω), but only ∛2 is in Q(∛2). By Proposition 12.3.5, the only automorphism of Q(∛2) that fixes Q is the identity. However, by Proposition 13.3.3, there are three homomorphisms of Q(∛2) into Q(∛2, ω) that extend the identity on Q: one for each root of x³ − 2. The images of Q(∛2) under the three homomorphisms are Q(∛2) itself, Q(ω∛2), and Q(ω²∛2).
Now the second part of the argument in the proof of Theorem 13.3.4 looks at each extension to Q(∛2) and extends further to Q(∛2, ω) using the full generality of Proposition 13.3.3. The full generality is needed since two of the homomorphisms that we are extending are not only not the identity on Q(∛2), they are not even automorphisms of Q(∛2). They are homomorphisms (isomorphisms onto their images) with domain Q(∛2).
We know that there are two such extensions in each case. One will take ω to itself, and the other will act like complex conjugation and take ω to ω². We thus end up with a total of six extensions of the identity on Q. As noted above, these are all elements of Aut(Q(∛2, ω)/Q). The following table gives the action of the six automorphisms on ∛2 and on ω. Each numbered column shows what happens to the two values in the extreme left column.
         1      2      3      4      5      6
 ∛2     ∛2     ∛2    ω∛2    ω∛2   ω²∛2   ω²∛2
 ω       ω     ω²     ω     ω²     ω     ω²
More interesting is what each automorphism does to the three roots of x³ − 2. Using the information in the table above, we show what each of the numbered elements of Aut(Q(∛2, ω)/Q) does to the roots.
          1      2      3      4      5      6
 ∛2      ∛2     ∛2    ω∛2    ω∛2   ω²∛2   ω²∛2
 ω∛2    ω∛2   ω²∛2   ω²∛2    ∛2     ∛2    ω∛2
 ω²∛2  ω²∛2   ω∛2     ∛2   ω²∛2   ω∛2     ∛2

For example, the image of ω²∛2 under automorphism 4 is calculated as (ω²)²(ω∛2) = ω⁵∛2 = ω²∛2 since automorphism 4 takes ω to ω² and ∛2 to ω∛2.
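The table can be checked numerically. An automorphism is determined by where it sends ∛2 (some root of x³ − 2) and where it sends ω (either ω or ω²), and it must send the root ω^k ∛2 to (image of ω)^k (image of ∛2). The following Python sketch runs through all six choices with floating point approximations of the roots and confirms that all six permutations of the roots occur:

```python
import cmath
import itertools

w = cmath.exp(2j * cmath.pi / 3)     # the cube root of 1 in the second quadrant
r = 2 ** (1 / 3)                     # the real cube root of 2
roots = [r, w * r, w**2 * r]         # the three roots of x^3 - 2

perms = set()
for r_img in roots:                  # possible images of cbrt(2)
    for w_img in (w, w**2):          # possible images of w
        # the root w**k * r must go to w_img**k * r_img
        images = [w_img**k * r_img for k in range(3)]
        # identify each image with the nearest root (they agree up to rounding)
        perm = tuple(min(range(3), key=lambda j: abs(images[k] - roots[j]))
                     for k in range(3))
        perms.add(perm)

# all six permutations of the three roots occur
assert perms == set(itertools.permutations(range(3)))
```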
It is easier to see what is happening to the roots if we let A = ∛2, B = ω∛2 and C = ω²∛2. Then the table above turns into the following.

       1   2   3   4   5   6
 A     A   A   B   B   C   C
 B     B   C   C   A   A   B
 C     C   B   A   C   B   A

It is now easy to see that the elements in Aut(Q(∛2, ω)/Q) give all possible permutations of the three roots of x³ − 2.

13.4 Splitting fields
The field Q(∛2, ω) is the smallest subfield of C containing all the roots of x³ − 2. This is because, as a field of characteristic 0, any subfield of C must contain Q, and because a field containing both ∛2 and ω∛2 must also contain ω∛2(∛2)⁻¹ = ω. Thus Q(∛2, ω) is the smallest subfield of C in which x³ − 2 factors into a product of three terms of degree one. This phenomenon is turned into a definition.
Let F ⊆ E be an extension of fields. Let P (x) be in F [x]. We say that P (x)
splits over E if P (x) factors as a product of degree one factors in E[x]. If P (x)
splits over E, then a splitting field for P (x) in E over F is the smallest subfield
of E containing F over which P (x) splits. If P (x) splits over E, and if the roots
of P (x) in E are r1 , r2 , . . . , rk , then the splitting field for P (x) in E over F is
exactly F (r1 , r2 , . . . , rk ). If E itself is the smallest subfield of E containing F
over which P (x) splits, then we simply say that E is a splitting field for P (x)
over F .
We need to say a few words about what is happening. If P (x) is in F [x] and
F ⊆ E, then P (x) is also in E[x] since its coefficients are in E as well as F . If
P (x) splits over E, then P (x) is a product of degree one polynomials, each of
which uses coefficients from E. Thus if d is the degree of P (x), then the statement that P (x) = A1 (x)A2 (x) · · · Ad (x), where each Ai (x) is a degree one element of E[x], is a true statement about multiplication in E[x]. However, the result of
the multiplication is a polynomial P (x) which happens to have its coefficients
in a smaller field F so that it happens to lie in F [x].
Exercises (72)
1. For each of the following polynomials P (x), determine the splitting field
SP for P (x) in C over Q and determine the degree [SP : Q]. If you are
ambitious, you can also determine the automorphism group Aut(SP /Q).
Hint: review the exercises in Section 12.3.4.
(a) P (x) = x3 − 1.
(b) P (x) = x4 − 1.
(c) P (x) = x5 − 1.
(d) P (x) = x6 − 1.
Next we will show that splitting fields always exist (although not necessarily
inside a given field) and that all splitting fields for a given P (x) over a given F
are isomorphic in a strong way.
13.4.1 Existence
If F is a field, and if P (x) is in F [x], then a splitting field for P (x) exists if
P (x) splits over some extension E of F . Thus we work to break P (x) into linear
factors. Finding linear factors is the same as finding roots, and we have the
tools to build roots where needed.
Proposition 13.4.1 Let F be a field and let P (x) be in F [x]. Then there is an
extension E of F over which P (x) splits.
Proof. Let d be the degree of P (x). If d = 1, then P (x) already splits over F
and we can let E = F . We then induct on d in the sense that we assume that
any polynomial of degree less than d over any field splits over some extension
of that field.
We claim that if there is an extension E of F (even allowing E = F ) that
has a root r of P (x), then we are done in one extra step because then P (x)
factors into (x − r)Q(x). This will make the degree of Q(x) equal to d − 1, and
our inductive hypothesis will then give an extension E ′ of E over which Q(x)
splits. Thus P (x) will split over E ′ which is also an extension of F .
So we assume that F itself has no roots of P (x) and we seek an extension
E of F having at least one root of P (x). Now let A(x) be an irreducible factor
of P (x) over F . Perhaps A(x) = P (x), but it does not matter whether this
equality holds or not.
By Proposition 12.3.4 the field F [x]/A(x) has an element (namely [x]) that
is a root of P (x). Further, the classes of constant polynomials in F [x]/A(x)
form a subfield of F [x]/A(x) that is isomorphic to F . Thus if we regard F as
directly contained in F [x]/A(x) by regarding each class of a constant polynomial
as being that constant element in F itself, we then have an extension of F that
has a root of A(x) and thus of P (x).
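As a concrete instance of this construction, take F = Z2 and A(x) = x² + x + 1, which is irreducible over Z2 since neither 0 nor 1 is a root. The sketch below (our own representation, with classes stored as coefficient pairs [c0, c1] meaning c0 + c1·x) checks that the class [x] really is a root of A(x) in F [x]/A(x):

```python
p = 2  # the field F = Z_2; A(x) = x^2 + x + 1 is irreducible over it

def mul(f, g):
    """Multiply two classes and reduce mod A(x) = x^2 + x + 1 and mod p."""
    prod = [0, 0, 0]
    for i in range(2):
        for j in range(2):
            prod[i + j] = (prod[i + j] + f[i] * g[j]) % p
    # reduce using x^2 ≡ x + 1 (mod A(x), coefficients mod 2)
    c = prod.pop()
    return [(prod[0] + c) % p, (prod[1] + c) % p]

def add(f, g):
    """Add two classes coefficientwise mod p."""
    return [(a + b) % p for a, b in zip(f, g)]

one = [1, 0]
cls_x = [0, 1]                 # the class [x]
# A([x]) = [x]^2 + [x] + 1 is zero in the quotient, so [x] is a root of A(x)
assert add(add(mul(cls_x, cls_x), cls_x), one) == [0, 0]
```

The same idea works for any Zp and any irreducible A(x): reduction by A(x) rewrites high powers of [x] in terms of lower ones.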
13.4.2 Uniqueness
If E and E ′ are two extensions of F that are splitting fields for one polynomial
P (x) ∈ F [x], then we want to show that they are isomorphic. We have the
following which is stated in relative form to help with the induction.
Proposition 13.4.2 Let F and F ′ be fields and let φ : F → F ′ be an isomorphism. Let P (x) be in F [x]. Let E be a splitting field for P (x) over F and
let E ′ be a splitting field for φ(P )(x) over F ′ . Then there is an isomorphism
θ : E → E ′ that extends φ.
Proof. Let d be the degree of P (x). If d = 1, then E = F and E ′ = F ′ and we
are done by letting θ = φ. So we assume d > 1 and we inductively assume the
truth of the proposition in all situations where the degree of the polynomial is
less than d.
Let A(x) be an irreducible factor of P (x) over F so that P (x) = A(x)B(x).
By Lemma 13.3.1 and Problem 2 in Exercise set (70), we know that φ(A)(x)
is an irreducible factor of φ(P )(x) and φ(P )(x) = φ(A)(x)φ(B)(x). Let α be a
root of A(x) in E (which must exist since P (x) splits over E) and let α′ be a
root of φ(A)(x) in E ′ . By Proposition 13.3.3 (the relative version of Proposition
12.3.5), there is an extension ρ of φ that is an isomorphism from F (α) to F ′ (α′ )
that carries α to α′ .
Now A(x) factors over F (α) as (x − α)C(x) for some (not necessarily irreducible) C(x) ∈ F (α)[x]. By Lemma 13.3.1, ρ(x − α)ρ(C)(x) is a factorization of ρ(A)(x). But ρ(x − α) = (x − α′ ). So P (x) = (x − α)C(x)B(x)
and ρ(P )(x) = (x − α′ )ρ(C)(x)ρ(B)(x). We know that D(x) = C(x)B(x) and
ρ(D)(x) = ρ(C)(x)ρ(B)(x) have degree d − 1.
We now try to apply the inductive hypothesis to the isomorphism ρ from the field F (α) to F ′ (α′ ) and the polynomials D(x) and ρ(D)(x).
By hypothesis P (x) factors into degree one factors over E. By uniqueness
of factorization, (x − α) is one of those factors and D(x) must be the product
of all the factors of P (x) with one copy of (x − α) removed. Similarly ρ(D)(x)
is the product of all the factors of ρ(P )(x) with one copy of (x − α′ ) removed.
Thus D(x) splits over E and ρ(D)(x) splits over E ′ . If D(x) splits over a field
K containing F (α) that is smaller than E, then P (x) would split over K as
well since all factors of D(x) would be in K[x] and the extra factor (x − α) is
in F (α)[x] which is also in K[x]. Thus E would not be the smallest field in E
containing F over which P (x) splits. We conclude that E is a splitting field for
D(x) over F (α) as well.
An identical argument shows that E ′ is a splitting field for ρ(D)(x) over
F ′ (α′ ).
With F (α) and F ′ (α′ ) playing the role of F and F ′ , with ρ playing the role of
φ, with D(x) playing the role of P (x), and with E and E ′ playing their original
roles, we have reproduced the hypotheses of the statement we are proving but
with a polynomial of degree d − 1 instead of d. Our inductive hypothesis says
that there is an isomorphism θ : E → E ′ that extends ρ. But ρ extends φ, so θ
also extends φ. Thus θ is the isomorphism that we were looking for.
13.4.3 An application
Theorem 13.4.3 Let F and F ′ be two finite fields with the same number of
elements. Then F and F ′ are isomorphic.
Proof. We know that for some prime p and some natural number n the number of elements of F is p^n. We also know that Zp is a subfield of both F and F ′ .
Let q = p^n. Then the multiplicative group F ∗ has q − 1 elements and thus every element of F ∗ has order dividing q − 1. From this we know that x^(q−1) = 1 for every x ∈ F ∗ . Multiplying by one more copy of x gives that every x in F ∗ satisfies x^q = x. However, this is also true of 0. So every x ∈ F is a root of the polynomial x^q − x. But there are exactly q elements of F and every one is a root of the degree q polynomial x^q − x. Since x^q − x can have no more than q roots, it splits over F . Since every element of F is needed for the splitting, F is the smallest subfield of F over which x^q − x splits. This makes F a splitting field for x^q − x over Zp .
Similarly F ′ is a splitting field for xq − x over Zp . The result now follows
from the uniqueness of splitting fields.
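The case n = 1 can be checked directly: in Zp every element satisfies x^p = x, which is exactly the splitting statement of the proof with q = p. A quick Python check for one prime:

```python
# Every element of Z_p is a root of x^p - x (here p = 7), so the degree p
# polynomial x^p - x has p roots in Z_p and splits over Z_p.
p = 7
assert all(pow(a, p, p) == a for a in range(p))
```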
A strategy
We wish to understand a polynomial P (x) in some F [x]. In particular we want to understand its roots. A splitting field contains all its roots and its structure (up to isomorphism) depends only on P (x) and F . Thus the splitting field should carry much information about the relationship between F , P (x) and the roots of P (x) and should not depend on choices made in the construction of the splitting field.
Galois’ approach is to study Γ = Aut(E/F ) where E is some splitting field
for P (x) over F . There is no reason at the outset to expect that there will be
enough information in Γ to say much about how F , P (x) and its roots relate,
but it turns out that there is. However, there are some extra conditions that
must be met.
Every one of the automorphisms in Γ fixes every element of F , so no internal structure of F is picked up by Γ. Since it is the relationship between F (which contains the coefficients) and E (which contains the roots) that is to be understood, this does not seem to be a problem.
However, if Γ fixes more than F , then there are parts of E not contained in F whose structure is being ignored by Γ. To get the maximum amount of information from Γ, we would like F to be exactly the set of elements fixed by Γ.
In the next section, we will explore what it takes to make F the fixed field of
Aut(E/F ).
13.5 Fixed fields
From Corollary 13.3.5, we know that if an extension F ⊆ E of fields has degree
[E : F ] = d < ∞, then Γ = Aut(E/F ) has no more than d elements. But
it might have fewer. In this section, we will learn that if Γ has fewer than d elements, then the fixed field of Γ will not be F , but a field that is strictly larger than F . Thus there will be parts of the structure of the extension that are not
being distinguished from F by Γ. This turns out to be an important loss, and
in the next chapter we will see what conditions are needed to guarantee that
for a finite extension F ⊆ E, the number of elements in Aut(E/F ) is exactly
[E : F ].
In order to discuss the size of fixed fields, we need an important notion of
independence of automorphisms. This will be introduced first, and then applied
to fixed fields.
13.5.1 Independence of automorphisms
Field automorphisms are linear transformations, and if that were all that is required of a field automorphism, then field automorphisms might form a vector space. But sums and “scalar multiples” of field automorphisms are not field automorphisms. Simply look at what happens to 1 under such operations. In spite of this, we can make a definition that imitates aspects of linear algebra.
Let φ1 , φ2 , . . . , φk be different automorphisms of a field F . We say that these automorphisms are linearly dependent if there are elements ai ∈ F , 1 ≤ i ≤ k, so that not all the ai are zero and so that

∑_{i=1}^{k} ai φi (α) = 0                    (13.2)
for every element α ∈ F . We say that these automorphisms are linearly independent if they are not linearly dependent. It turns out that field automorphisms
are so restrictive that all that is needed to make a finite set of field automorphisms linearly independent is to make sure that they are all different.
Proposition 13.5.1 If φi , 1 ≤ i ≤ k are automorphisms of a field F with no
two of them being the same, then they are linearly independent.
Proof. We proceed by contradiction and assume that they are linearly dependent.
If the automorphisms φi , 1 ≤ i ≤ k are linearly dependent, then a dependence such as (13.2) must exist with some number of the ai not equal to zero.
If we only keep the terms with ai not zero, and discard the others, then we have
a linear dependence on a subset of the φi . If we choose a minimal non-empty
subset of the φi that are linearly dependent, then the sum like (13.2) for that
subset will have that all the ai used in the sum are non-zero. We will prove that
a strictly smaller subset must have a linear dependence, contradicting the fact
that we have chosen a minimal subset. Note that a subset of size one cannot be linearly dependent, since the statements ai ≠ 0 and ai φi (α) = 0 for all α ∈ F imply that φi is the zero homomorphism, which cannot be an automorphism.
For simplicity of notation, we assume that a minimal linearly dependent
subset of the φi is the set of φi with 1 ≤ i ≤ j so that (13.2) takes the form
a1 φ1 (α) + a2 φ2 (α) + · · · + aj φj (α) = 0.                    (13.3)
We would like to subtract from (13.3) a similar but not identical sum so that
at least one term cancels, but not all terms cancel. We can do this by exploiting
the properties of field automorphisms. We note that φj (βα) = φj (β)φj (α).
So one way to get the last term to read as aj φj (β)φj (α) is to replace every
appearance of α in (13.3) by βα. Since βα is another element of F , the resulting
sum will still be zero by the definition of linear dependence.
Another way to get the last term to read as aj φj (β)φj (α) is to simply multiply all terms in (13.3) by φj (β). Again the resulting sum will still be zero.
The first modification makes the first term read as a1 φ1 (β)φ1 (α), and the
second modification makes the first term read a1 φj (β)φ1 (α). To keep these
terms different, we only need that φ1 (β) ≠ φj (β). But this can be arranged by
the right choice of β since φ1 and φj are not the same automorphism and must
differ on some element of F .
The difference of the two modifications will be a linear dependence on a
strictly smaller subset since at least one of the coefficients (the first) will not be
zero and at least one (the last) will be zero.
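A small illustration of the proposition (a sketch using NumPy and the field Q(√2), whose two automorphisms fixing Q are the identity and σ(a + b√2) = a − b√2): a dependence a1·id + a2·σ = 0 would have to hold at every α, and testing only α = 1 and α = √2 already forces a1 = a2 = 0.

```python
import numpy as np

s = 2 ** 0.5
# rows: the dependence a1*id(α) + a2*σ(α) = 0 evaluated at α = 1 and α = √2
M = np.array([[1.0, 1.0],    # id(1) = 1,    σ(1) = 1
              [s, -s]])      # id(√2) = √2,  σ(√2) = -√2
# M @ (a1, a2) = 0 has only the trivial solution since det M = -2√2 ≠ 0,
# so the two automorphisms are linearly independent
assert abs(np.linalg.det(M)) > 1e-9
```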
13.5.2 Sizes of fixed fields
The following argument uses a technique that will come up again.
Proposition 13.5.2 Let Γ ⊆ Aut(E) be a subgroup with finitely many elements
and let F be the fixed field of Γ. Then [E : F ] is the order of Γ.
Proof. Let Γ = {φ1 , . . . , φk } and let d = [E : F ]. From Corollary 13.3.5, we
know that k ≤ d. We assume that k < d and arrive at a contradiction.
We let (β1 , . . . , βd ) be a basis for E as a vector space over F . Consider the
equation
φi (β1 )x1 + φi (β2 )x2 + · · · + φi (βd )xd = 0.                    (Di)
Since φi for a fixed i is an automorphism of the vector space E that fixes the
field of “scalars” F , the elements φi (βj ), 1 ≤ j ≤ d must be linearly independent
over F and the only solutions for the xj in F for (Di) are all zero. But the
φi (βj ) are not linearly independent over E and there are non-zero solutions in
E. Further, if we consider the system of linear equations (Di) for 1 ≤ i ≤ k,
then with our assumption k < d, we have fewer equations than unknowns, and
there is at least one solution where not all the xi are zero.
Let us renumber so that at least x1 ≠ 0. Note that we can multiply all
the xi by any non-zero element of E and still have a solution to the system of
equations for which x1 6= 0. Since we can multiply by any element of E, we can
find a solution to the system of the (Di) where x1 is whatever fixed non-zero
element of E that we please.
We now define, for each j with 1 ≤ j ≤ d,

aj = φ1 (xj ) + φ2 (xj ) + · · · + φk (xj ).
Since the φi are linearly independent, and we can arrange to have x1 any nonzero element of E that we wish, we can choose x1 so that a1 ≠ 0.
We now come to the technique referred to before the statement of the proposition. We use the fact that the φi are all the elements of a group. We know that
multiplication (composition) on the left by any one group element permutes the
elements of a group. Thus for any i and j we have that φi (aj ) is the same sum
as the one giving aj except that the order of the summands is permuted. Thus
for each i and j, we have that φi (aj ) = aj . Since each aj is fixed by all the φi ,
each aj is in F , the fixed field of Γ.
Since the βj , 1 ≤ j ≤ d, are linearly independent over F , since the aj are all in F , and since at least a1 ≠ 0, we must have

∑_{j=1}^{d} aj βj ≠ 0.                    (13.4)

We calculate this sum using the definition of the aj and get

∑_{j=1}^{d} aj βj = ∑_{j=1}^{d} ( ∑_{i=1}^{k} φi (xj ) ) βj
                  = ∑_{j=1}^{d} ∑_{i=1}^{k} φi (xj ) βj
                  = ∑_{j=1}^{d} ∑_{i=1}^{k} φi ( xj φi⁻¹ (βj ) )
                  = ∑_{i=1}^{k} ∑_{j=1}^{d} φi ( xj φi⁻¹ (βj ) )
                  = ∑_{i=1}^{k} φi ( ∑_{j=1}^{d} xj φi⁻¹ (βj ) ).

But for each i we have

∑_{j=1}^{d} xj φi⁻¹ (βj ) = ∑_{j=1}^{d} φi⁻¹ (βj ) xj = ∑_{j=1}^{d} φt (βj ) xj

for whatever φt ∈ Γ equals φi⁻¹ . This is one of the sums (Di) for i = t and is equal to zero. This gives

∑_{j=1}^{d} aj βj = 0

which contradicts (13.4).
Combining Corollary 13.3.5 and Proposition 13.5.2
In Proposition 13.5.2, we can let Γ = Aut(E/F ) and G be the fixed field of
Aut(E/F ). We have F ⊆ G ⊆ E.
Corollary 13.5.3 If F ⊆ E is a field extension and Aut(E/F ) is finite, then
letting G be the fixed field of Aut(E/F ) gives
[E : G] = |Aut(E/F )| ≤ [E : F ].
Proof. The equality comes from Proposition 13.5.2 and the inequality comes
from Corollary 13.3.5.
From this we get the following.
Corollary 13.5.4 Let F ⊆ E be a field extension of finite degree, and let G be
the fixed field of Aut(E/F ). Then G = F if and only if |Aut(E/F )| = [E : F ].
Proof. From Corollary 13.5.3, we have |Aut(E/F )| = [E : F ] if and only if
[E : G] = [E : F ]. From the multiplicative properties of degree (Lemma 10.5.1),
we have [E : G] = [E : F ] if and only if [G : F ] = 1, and from Lemma 10.5.2
this happens if and only if G = F .
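To see that the inequality in Corollary 13.5.3 can be strict, consider Q ⊆ Q(∛2). Since Q(∛2) ⊆ R contains only one root of x³ − 2, every automorphism fixing Q must fix ∛2, so Aut(Q(∛2)/Q) is trivial and its fixed field is Q(∛2), strictly larger than Q; here |Aut(E/F )| = 1 < 3 = [E : F ]. The root count behind this is easy to confirm (a sketch using NumPy):

```python
import numpy as np

# the three complex roots of x^3 - 2
roots = np.roots([1.0, 0.0, 0.0, -2.0])
real_roots = [r for r in roots if abs(r.imag) < 1e-9]
# only one root lies in R, hence in Q(cbrt(2)); an automorphism of Q(cbrt(2))
# must send cbrt(2) to a root of x^3 - 2 inside Q(cbrt(2)), i.e. to itself
assert len(real_roots) == 1
```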
The strategy revisited
We are interested in roots of a polynomial P (x) over a field F , and we know that
we will find them in some splitting field for P (x) over F . The automorphism
group Γ = Aut(E/F ) will explore the relationship between E and F best if F
is the fixed field of Γ. We know that F must be contained in the fixed field of
Γ, but from Proposition 13.5.2, we will only get equality if the order of Γ equals
[E : F ].
13.6 A criterion for irreducibility
We give a celebrated criterion that implies irreducibility of a polynomial over
Q. In particular it will imply that the polynomial used as an example at the
beginning of the chapter is irreducible. We will also show that it implies that a
certain class of polynomials is irreducible and will use this fact in an important
way in the next chapter.
The criterion is easy to state, but its proof requires that some concepts be
introduced first.
13.6.1 Primitive polynomials and content
The roots of a polynomial do not change when the polynomial is multiplied by
a constant. If we are given a polynomial over Q, we can multiply by a large
enough integer (the product of all the denominators of the coefficients, say) so
that the resulting polynomial has the same roots as the original, but all the
coefficients are integers. We will refer to a polynomial with integer coefficients
as being “over Z” and in Z[x] even though we have not yet discussed polynomials
over rings.
Given a polynomial over Z, we can alter it further by dividing by the greatest
common divisor (here we mean positive greatest common divisor) of all the
coefficients. The result is a polynomial over Z with the same roots and whose
greatest common divisor of the coefficients is 1. We use this as a definition and
say that a non-zero polynomial is primitive if it is over Z and if the greatest
common divisor of all its coefficients is 1.
Our observations before the definition have argued that a polynomial over
Q has the same roots as a primitive polynomial. We can make this observation
stronger and more specific.
Lemma 13.6.1 Let P (x) be a non-zero polynomial in Q[x]. Then there is a
unique c ∈ Q and a unique primitive polynomial A(x) so that P (x) = cA(x).
Proof. That there are such a c and A(x) has already been argued. We assume that P (x) = dB(x) where d ∈ Q and B(x) is primitive. Then c = m/n and d = p/q with all of m, n, p, q in Z. Now cA(x) = dB(x) so mqA(x) = npB(x). Since the greatest common divisor of the coefficients of A(x) is 1, the greatest common divisor of the coefficients of mqA(x) must be mq. And since the greatest common divisor of the coefficients of B(x) is 1, the greatest common divisor of the coefficients of npB(x) must be np. But the two polynomials are the same, so mq = np and c = d. Now A(x) = B(x).
The rational number c in Lemma 13.6.1 is called the content of the polynomial P (x) ∈ Q[x].
An important property of primitive polynomials is that they are closed under
multiplication.
Lemma 13.6.2 If P (x) and Q(x) are both primitive, then so is (P Q)(x).
Proof. If (P Q)(x) is not primitive, then some integer greater than 1 divides all
its coefficients. In particular, some prime integer p divides all the coefficients.
We will show that no such prime exists.
We have

(P Q)(x) = ∑_{i=0}^{∞} ( ∑_{j=0}^{i} aj bi−j ) x^i

where ai are the coefficients of P (x) and bi are the coefficients of Q(x).
Since both P (x) and Q(x) are primitive, we know that p does not divide all
the ai and does not divide all the bi . Let s be the smallest so that p does not
divide as and let t be the smallest so that p does not divide bt . So p|ai when
i < s and p|bj when j < t.
The coefficient of x^(s+t) in (P Q)(x) is

∑_{j=0}^{s+t} aj bs+t−j .                    (13.5)
When j < s we know p|aj bs+t−j since p|aj . When j > s, then s + t − j < t and
p|aj bs+t−j since p|bs+t−j . The sum (13.5) is divisible by p and the only term
not yet mentioned is as bt . So p|as bt . Since p is prime, either p|as or p|bt which
contradicts our choice of s and t.
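The lemma is easy to test on examples. A sketch with polynomials stored as integer coefficient lists (the helper names and the sample polynomials are ours):

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """Positive gcd of all coefficients; 1 means the polynomial is primitive."""
    return reduce(gcd, (abs(c) for c in coeffs))

def mul(f, g):
    """Multiply two polynomials given as coefficient lists (low degree first)."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b
    return prod

f = [3, 2, 5]   # 3 + 2x + 5x^2, primitive
g = [4, 9]      # 4 + 9x, primitive
assert content(f) == 1 and content(g) == 1
assert content(mul(f, g)) == 1   # the product is primitive as well
```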
The previous two lemmas combine to give the following non-obvious result.
Lemma 13.6.3 If P (x) ∈ Z[x] factors into two polynomials of positive degree
in Q[x], then it factors into two polynomials of positive degree in Z[x].
Proof. Suppose P (x) ∈ Z[x] factors as A(x)B(x) with A(x) and B(x) in Q[x]
each of positive degree. Letting c be the content of A(x) and d be the content
of B(x), we have A(x) = cC(x) and B(x) = dD(x) for primitive polynomials
C(x) and D(x). Now

P (x) = A(x)B(x) = cd (CD)(x)

where (CD)(x) is primitive by Lemma 13.6.2. But this makes cd the content of P (x). The coefficients of (CD)(x) have greatest common divisor 1, so if cd is not an integer, then some coefficient of cd (CD)(x) would not be an integer.
Since all coefficients of P (x) are integers, cd is an integer. Writing P (x) as (cd C(x))D(x) gives the required factorization of P (x).

13.6.2 The Eisenstein Irreducibility Criterion
We now get to the criterion for irreducibility. The criterion in the theorem below is known as the Eisenstein Irreducibility Criterion. Note that it is not an if and only if criterion. Any polynomial in Z[x] that satisfies the criterion must be irreducible over Q, but there are polynomials in Z[x] that are irreducible over Q that do not satisfy the criterion. In fact an important set of examples will not satisfy the criterion but will be provably irreducible over Q by a trick that allows the criterion to be used indirectly.
Theorem 13.6.4 Let

P (x) = ∑_{i=0}^{∞} ai x^i

be a polynomial in Z[x] of degree d. If there is a prime p that divides every ai except ad and if p² does not divide a0 , then P (x) is irreducible over Q.
Note that it is a requirement of the hypotheses that p does not divide ad .
Proof. We assume that P (x) is reducible and that P (x) = Q(x)R(x) where Q(x)
and R(x) have positive degree. By Lemma 13.6.3, we can assume that Q(x) and
R(x) have integer coefficients. Let

Q(x) = ∑_{i=0}^{∞} bi x^i    and    R(x) = ∑_{i=0}^{∞} ci x^i .
We know that a0 = b0 c0 and that p divides a0 but p² does not. Thus p divides
exactly one of b0 and c0 but not both. Let us assume that p divides b0 and does
not divide c0 . Our goal is to prove that p divides every bi . This will say that
Q(x) = pS(x) for some S(x) ∈ Z[x] making P (x) = pS(x)R(x) and p will divide
all coefficients of P (x) including ad . But this will contradict a hypothesis.
We prove that p divides all bi by induction. We know that p|b0 . Assume
that p|bi for all i < k. We want to prove that p|bk . Let the degree of Q(x) be
q. If k > q, then we have nothing to prove since bk = 0 if k > q. So we assume
k ≤ q. Since Q(x) and R(x) have positive degree, we know that q < d so k < d.
We know that ak = b0 ck + b1 ck−1 + · · · + bk c0 . By our inductive hypotheses,
we know that p divides every bi with i < k. Thus p divides all terms in the sum
except possibly the last. But the hypotheses of the lemma and the fact that
k < d say that p divides the sum which is ak . So p divides the last term. But p
does not divide c0 so p|bk . This completes the proof.
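The criterion is mechanical enough to automate. A sketch (the helper `eisenstein_prime` is our own name; SymPy's `primefactors` is used only to factor a0, since any witnessing prime must divide a0):

```python
from sympy import primefactors

def eisenstein_prime(coeffs):
    """Return a prime witnessing Eisenstein's criterion for the integer
    polynomial a0 + a1 x + ... + ad x^d given as [a0, ..., ad], else None."""
    a0, ad = coeffs[0], coeffs[-1]
    for p in primefactors(a0):
        if ad % p != 0 and a0 % (p * p) != 0 and all(a % p == 0 for a in coeffs[:-1]):
            return p
    return None

assert eisenstein_prime([2, 0, 0, 1]) == 2   # x^3 + 2 is irreducible over Q
assert eisenstein_prime([1, 1, 1]) is None   # the criterion is silent on x^2 + x + 1
```

Note that `None` only means the criterion does not apply; x² + x + 1 is in fact irreducible over Q, illustrating that the criterion is not an if and only if statement.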
13.6.3 Applications of the irreducibility criterion
The example at the beginning of the chapter
The example at the beginning of this chapter is P (x) = x³ + 3x − 2. This does not satisfy the hypotheses of the Eisenstein Criterion. But

P (x + 2) = (x + 2)³ + 3(x + 2) − 2
          = x³ + 3·x²·2 + 3·x·2² + 2³ + 3x + 6 − 2
          = x³ + 6x² + 15x + 12

does, using the prime 3. So H(x) = P (x + 2) is irreducible over Q.
Now if P (x) = H(x − 2) were reducible as P (x) = A(x)B(x), then H(x) =
P (x + 2) would be reducible as H(x) = A(x + 2)B(x + 2). Since H(x) is
irreducible over Q, so is P (x) as promised at the beginning of the chapter.
This trick is based on the fact that if A(x) and B(x) are in Q[x], then so
are A(x + 2) and B(x + 2). We will not bother to formalize this procedure into
a lemma.
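The shift computation, and the fact that shifting back recovers P (x), can be confirmed with SymPy:

```python
from sympy import expand, symbols

x = symbols('x')
P = x**3 + 3*x - 2
H = expand(P.subs(x, x + 2))
# coefficients 6, 15, 12 are divisible by 3, 9 does not divide 12, and 3 does
# not divide the leading 1: Eisenstein applies to H with p = 3
assert H == x**3 + 6*x**2 + 15*x + 12
assert expand(H.subs(x, x - 2)) == P   # shifting back recovers P(x)
```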
Roots of one
We worked in a previous problem to show that x⁴ + x³ + x² + x + 1 is irreducible over Q. This comes up as a factor of x⁵ − 1. For any prime n, the polynomial x^n − 1 factors as x − 1 times

P (x) = ∑_{i=0}^{n−1} x^i .                    (13.6)
We have seen that for some n, such as n = 6, the polynomial in (13.6) is
reducible over Q. However for n a prime, it is irreducible as we will show.
We use a procedure similar to the previous example. We will show that
P (x + 1) satisfies the Eisenstein Criterion which will then imply that P (x) is
irreducible.
We let n = p, a prime, so that the notation makes clear what we are working with. We note that (x − 1)P (x) = x^p − 1, so that we can write

P (x) = (x^p − 1)/(x − 1).

Thus we can write

P (x + 1) = ((x + 1)^p − 1)/((x + 1) − 1) = ((x + 1)^p − 1)/x.
Now

(x + 1)^p = ∑_{i=0}^{p} C(p, i) x^i

where C(p, i) = p!/(i!(p − i)!) denotes the binomial coefficient. We know that C(p, 0) = C(p, p) = 1 and that for 1 ≤ i ≤ p − 1

C(p, i) = ( p(p − 1)(p − 2) · · · (p − i + 1) ) / ( 1 · 2 · 3 · · · i ).

With 1 ≤ i ≤ p − 1, no factor in the denominator is divisible by p, so the result is divisible by p. Also C(p, p − 1) = p. Since the constant term of (x + 1)^p is 1, we know that (x + 1)^p − 1 has constant term 0 and is divisible by x. If we take C(p, k) to be 0 when k > p, we can write

P (x + 1) = ((x + 1)^p − 1)/x = ∑_{i=0}^{∞} C(p, i + 1) x^i

which has degree p − 1, leading term 1, constant term p and all other coefficients divisible by p. Thus Eisenstein’s Criterion applies and P (x + 1) is irreducible over Q and so is P (x).
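For a specific prime, say p = 5, the coefficients of P (x + 1) can be confirmed with SymPy; they are exactly the binomial coefficients C(5, i + 1):

```python
from sympy import Poly, expand, symbols

x = symbols('x')
p = 5
P = sum(x**i for i in range(p))      # 1 + x + x^2 + x^3 + x^4
H = expand(P.subs(x, x + 1))         # equals ((x + 1)^5 - 1)/x
# degree 4, leading coefficient 1, constant term 5, and the middle
# coefficients divisible by 5: Eisenstein applies with the prime 5
assert Poly(H, x).all_coeffs() == [1, 5, 10, 10, 5]
```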
Chapter 14
Galois theory basics
Let F ⊆ E be a finite extension of fields. We know that [E : F ] is always no smaller than the order of Aut(E/F ), and we know that F is always contained in the fixed field of Aut(E/F ). From Corollary 13.5.4 we know that the following are equivalent:
1. [E : F ] equals the order of Aut(E/F ).
2. F is the fixed field of Aut(E/F ).
In this chapter, we will study the situation in which the above equivalent
items hold. The relevant definition is the following. We call an extension F ⊆ E
of fields a Galois extension if [E : F ] is finite and equals the order of Aut(E/F ).
Equivalently, we can call an extension F ⊆ E of fields a Galois extension if
[E : F ] is finite and F equals the fixed field of Aut(E/F ).
We start by investigating which extensions are Galois extensions.
14.1 Separability
Let F ⊆ E be an extension of fields, and let Γ = Aut(E/F ). We know that
F ⊆ Fix(Γ). If F is strictly smaller than Fix(Γ), then there are elements in
E − F that Γ cannot distinguish from F . Specifically, there are elements in
E − F that no element in Γ can budge. One might say that there are elements
of E − F that cannot be “separated” from F by elements of Γ. We will let the
reader judge if this bit of background justifies the choice of words used in the
following discussion. This choice of words is quite standard.
We now assume that [E : F ] is finite. We know from the previous chapter,
that F = Fix(Γ) if and only if |Γ| = [E : F ]. It will always be the case that
|Γ| ≤ [E : F ], so we are only in trouble if there are fewer automorphisms in
Aut(E/F ) than demanded by the degree of the extension.
From the details of the proofs of Propositions 12.3.5, 13.3.3 and Theorem
13.3.4, we know that we get automorphisms from the roots of the minimal
polynomials that are involved. One way to guarantee that the roots we need
are available is to work with splitting fields. Properties of splitting fields will
be dealt with later in this chapter.
However, the number of automorphisms that we seek is determined by the
degree of the extension, and this degree is tied to the degrees of the minimal
polynomials involved. Thus we are in trouble if the number of roots of a relevant
minimal polynomial is smaller than the degree of the polynomial. This will occur
if the polynomial has multiple roots. In discussions of multiple roots, one talks
about counting multiple roots multiple times. But counting a root more than
once will not create more than one automorphism taking a given element to that
particular root. So having multiple roots will stand in the way of “separating”
elements in E − F from F . This leads to our first definition.
If F is a field and P (x) is in F [x], then we say that P (x) is separable over F
if P (x) has no multiple roots in any splitting field for P (x) over F . Shortly, we
will connect this notion of “separable” to our use of the word “separating.”
Lemma 14.1.1 Let F be a field of characteristic zero and let P (x) ∈ F [x] be
irreducible. Then P (x) is separable over F .
Proof. Let α be a root of P (x). Since P (x) is irreducible, it is minimal for α
over F .
We know that α is a multiple root of
$$P(x) = \sum_{i=0}^{\infty} a_i x^i$$
if and only if α is also a root of
$$P'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}.$$
Since P(x) is not zero and has a root, it has degree at least 1. Also, each i > 0
with ai ≠ 0 gives iai ≠ 0 since F has characteristic zero. So if α were a multiple
root, then P′(x) would be a non-zero polynomial with α as a root whose degree
is smaller than the degree of P(x). This contradicts the fact that P(x) is a
minimal polynomial for α over F.
If F ⊆ E is an extension of fields, then α ∈ E is said to be separable over F
if its minimal polynomial over F is separable. The extension E of F is said to
be separable if every element of E is separable over F.
Corollary 14.1.2 A finite extension of a field of characteristic zero is separable.
Separability gives the next important result which deserves its own section.
It will not see use in this chapter but will in the next chapter.
14.2 The primitive element theorem
Recall from Section 13.1 that an extension F ⊆ E of fields is said to be simple
if there is an element α ∈ E so that E = F (α), and that if such an element
exists, it is called a primitive element for the extension. In our setting, finite
extensions are all simple. That is, primitive elements always exist. The key
lemma is the following, where the use of separability is prominent.
Lemma 14.2.1 Let F ⊆ E be an extension of fields, and let α and β in E
be algebraic over F . If the minimal polynomial for α over F is separable, then
there are only finitely many t ∈ F for which F (α + tβ) is not all of F (α, β).
Proof. We will avoid t = 0 since this is only one value to avoid and thus will not
affect the conclusion.
Note that if α is in F (α + tβ), then so is β since β can be gotten from α + tβ
and α by field operations. So we are done if we show that there are only finitely
many t ∈ F for which α ∉ F(α + tβ).
Let P (x) be a minimal polynomial for α over F and let Q(x) be a minimal
polynomial for β over F . By Proposition 13.4.1, we can extend E so that P (x)
splits in the extension, so we can just assume that P (x) splits in E. We assume
that α ∉ F(α + tβ), and we let P1(x) be a minimal polynomial for α over
F (α + tβ). It will have degree bigger than 1 and it will divide P (x). Since P (x)
has no multiple roots, neither will P1 (x) and there will be a root α′ of P1 (x)
different from α. Note that α′ will also be a root of P (x).
By Proposition 13.3.3, there is an isomorphism φ from F(α+tβ)(α) to F(α+tβ)(α′)
that is the identity on F (α + tβ). The element β ′ = φ(β) must be some root
of a minimal polynomial for β over F (α + tβ) and thus a root for the minimal
polynomial Q(x) for β over F .
Noting that t ∈ F implies φ(t) = t, we have
α + tβ = φ(α + tβ) = φ(α) + tφ(β) = α′ + tβ ′ .
Now α ≠ α′ together with α + tβ = α′ + tβ′ implies that β ≠ β′ and we can
solve for t as
$$t = \frac{\alpha - \alpha'}{\beta' - \beta}. \qquad (14.1)$$
Thus the assumption that α ∉ F(α + tβ) forces t to be one of the finitely many
elements of F of the form (14.1) obtained by letting α′ range over the roots of
P (x) different from α and letting β ′ range over the roots of Q(x) different from
β. This completes the proof.
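For a concrete instance of the lemma, take α = √2 and β = √3 over F = Q (a standard example, assumed here rather than taken from the text). The only candidate bad value produced by the formula in the proof is t = (√2 − (−√2))/((−√3) − √3) = −√2/√3, which is not rational, so every nonzero t ∈ Q gives F(α + tβ) = F(α, β). A sympy check that several choices of t give an element of degree 4 = [Q(√2, √3) : Q]:

```python
from sympy import sqrt, Rational, symbols, minimal_polynomial, degree

x = symbols('x')
# alpha + t*beta is primitive exactly when its minimal polynomial
# over Q has degree 4 = [Q(sqrt2, sqrt3) : Q].
for t in (1, -1, 2, Rational(1, 3)):
    m = minimal_polynomial(sqrt(2) + t*sqrt(3), x)
    assert degree(m, x) == 4
```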
This now gives the following important result.
Theorem 14.2.2 (Primitive Element Theorem) A finite separable extension F ⊆ E of infinite fields is simple.
Proof. Since [E : F ] is finite, there are finitely many elements α1 , α2 , . . . , αk in
E so that E = F (α1 , α2 , . . . , αk ). We assume that these elements have been
chosen so that k is as small as possible. If k = 1, we are done. If k > 1,
Lemma 14.2.1 says that we can find a t ∈ F so that setting γ = α1 + tα2 gives
F (γ) = F (α1 , α2 ) and E = F (γ, α3 , . . . , αk ) which contradicts our choice of k.
Corollary 14.2.3 A finite extension F ⊆ E of fields of characteristic zero is
simple.
Proof. A field of characteristic zero is infinite and any finite extension of it is
separable.
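For example (the standard example Q(√2, √3), an assumption of this sketch rather than a computation done in the text), γ = √2 + √3 is a primitive element: its minimal polynomial over Q has degree 4 = [Q(√2, √3) : Q], so Q(γ) = Q(√2, √3). With sympy:

```python
from sympy import sqrt, symbols, minimal_polynomial

x = symbols('x')
gamma = sqrt(2) + sqrt(3)
m = minimal_polynomial(gamma, x)
# gamma^2 = 5 + 2*sqrt(6), so (gamma^2 - 5)^2 = 24, giving x^4 - 10x^2 + 1.
# Its degree 4 matches [Q(sqrt2, sqrt3) : Q], so Q(gamma) = Q(sqrt2, sqrt3).
assert m == x**4 - 10*x**2 + 1
```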
Theorem 14.2.2 holds in greater generality than what we have given. In
particular, it also holds for finite fields, so the most general statement
would be that a finite separable extension of any field is simple. Our ultimate
goal is to work in fields of characteristic zero (in particular, subfields of C), so
we will not discuss the more general version.¹
Theorem 14.2.2 could have been used to give a shorter proof of Proposition
13.5.2, but we would have had to assume separability to use it. This is not a
big restriction, as Lemma 14.1.1 shows, but the technique that we did use in
the proof of Proposition 13.5.2, independence of automorphisms, will be needed
later.
14.3 Galois extensions
Two ideas have been put forward to avoid having too few automorphisms.
One is to ensure that all roots of a relevant polynomial are present. This is
accomplished by looking at splitting fields. The other is to avoid polynomials
with multiple roots. This concept comes under the name “separable.” In this
section we explore what happens when the two ideas are combined. The main
result is that one obtains our goal, a Galois extension.
However, we get much more. When the number of automorphisms equals the
number predicted by the degree, a large collection of techniques becomes
available. These lead to a large number of results about Galois extensions,
as well as a large number of properties that turn out to be equivalent to an
extension being Galois.
Ultimately these results combine to give the first theorem, the Fundamental Theorem
of Galois Theory, that makes serious ties between the automorphism group
of a Galois extension and the internal structure of the extension. The statement
and proof of the fundamental theorem is at the end of this chapter. In the chapter
that follows, we will start with the fundamental theorem and use it to deduce
facts about polynomials and their roots.
¹If you have done the project at the end of Chapter 11, which shows that the multiplicative
group F* for a finite field F is cyclic, then you will have done all the work for the more general
version of Theorem 14.2.2.
Normal extensions
The first property that we will explore in connection with Galois extensions is
called normality. Normality seems like a strengthening of the notion of a splitting field. Galois extensions will always turn out to be splitting fields, but it
turns out that Galois extensions also have this apparently stronger property.
Conversely normality combined with separability implies Galois (for finite extensions). However, splitting fields that are separable are also Galois (again, for
finite extensions) and thus normal. We have put off the definition long enough,
so it is given in the next paragraph.
An extension F ⊆ E of fields is called a normal extension if every irreducible
polynomial in F [x] that has a root in E also splits in E. In other words, if an
irreducible polynomial over F has one of its roots in E, then it has all of its roots
in E. Another rewording is that F ⊆ E is normal if all minimal polynomials in
F [x] for elements of E that are algebraic over F split in E.
The reason for the word “normal” will be made clear after the statement of
the Fundamental Theorem of Galois Theory is given.
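For a concrete non-example (the standard one, assumed here rather than worked in the text), Q(∛2) is not normal over Q: the irreducible polynomial x³ − 2 has the root ∛2 there, but its two complex roots are missing. Factoring over Q(∛2) with sympy exhibits one linear and one irreducible quadratic factor:

```python
from sympy import symbols, factor_list, degree, cbrt

x = symbols('x')
# Factor x^3 - 2 over the field Q(cbrt(2)).
_, factors = factor_list(x**3 - 2, x, extension=cbrt(2))
degrees = sorted(degree(f, x) for f, _ in factors)
# One root lies in Q(cbrt(2)), but the polynomial does not split there,
# so the extension Q ⊆ Q(cbrt(2)) is not normal.
assert degrees == [1, 2]
```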
14.3.1 Finite, separable, normal extensions
In this section we show that being finite, separable and normal is equivalent to
being Galois. We break the two directions into two arguments since each takes a
bit of doing and each uses techniques worth learning. Here is the first direction.
Proposition 14.3.1 If F ⊆ E is a Galois extension of fields, then it is finite,
separable and normal.
The following argument uses a technique that gets used often. In this situation it has the strange effect of proving the two items that require work
simultaneously.
Proof. The extension is finite by the definition of a Galois extension. Thus it is
algebraic. Any element of E has a minimal polynomial P (x) over F that must
be irreducible over F , and if a polynomial over F is irreducible over F and has
a root in E, then it is a minimal polynomial for that root.
So if we look at elements α ∈ E and irreducible polynomials P (x) ∈ F [x]
for which α is a root of P (x), then we know we are looking at all elements of
E and all irreducible polynomials over F with a root in E. Since multiplying a
polynomial by a constant does not change its roots, we can assume that P (x)
is monic.
For such an α and P (x), let A = {α1 , α2 , . . . , αk } be all the roots of P (x) in
E. We take α1 = α. If θ is in Aut(E/F ), then each θ(αi ) must also be a root
of P (x) and must be some αj . Since θ is one-to-one and A is finite, the action
of θ on A is to permute the elements of A.
We now bring in the advertised technique.
We now consider Q(x) = (x − α1)(x − α2) · · · (x − αk). The action of θ on
Q(x) is to permute the factors of Q(x). Thus θ(Q)(x) = Q(x) and θ fixes all
the coefficients of Q(x). Since this was shown for any θ ∈ Aut(E/F ), we know
that the coefficients of Q(x) lie in the fixed field of Aut(E/F ) which is F by
hypothesis.
If we extend E to a field K that is a splitting field for P (x), we see that Q(x)
is a product of factors of P (x) in K. Thus Q(x)|P (x). But α = α1 is a root
of Q(x) and P (x) is minimal for α in E over F . Thus P (x)|Q(x). This means
that P (x) and Q(x) are constants times each other. Since we assume P (x) is
monic, and Q(x) is clearly monic, they are equal.
We have shown that P (x) splits in E, and since all the αi are different, we
have shown that P (x) is separable. This is what was to be proven.
The work to prove the converse to Proposition 14.3.1 is contained in a key
lemma that can be used to show that, in the presence of separability, not only is
normality a powerful property, but being a splitting field is just as powerful
a property.
Lemma 14.3.2 Let F ⊆ E be an extension of fields and assume that there is a
finite set A of elements in E that are separable and algebraic so that E = F (A)
and so that each element of A has a minimal polynomial over F that splits in
E. Then F ⊆ E is a Galois extension.
Proof. It may be that there are elements in A that are not needed to make
E = F(A) true. We can throw the unneeded elements out and end up with a
new subset that we still call A for which E = F(A) is still true, all other
hypotheses hold, and no proper subset S of A makes E = F(S) true.
If α1 , α2 , . . . , αk are the elements of A, then we can define
Fi = F(α1, α2, . . . , αi)
for 0 ≤ i ≤ k (where F0 = F ) and we have a succession of extensions
F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk = E.
Our assumption that A cannot be made smaller implies, for each i with 1 ≤ i ≤
k, that αi ∉ Fi−1.
Since each αi is algebraic over F, it is algebraic over Fi−1. So each [Fi : Fi−1]
is finite and [E : F] = [Fk : F0] is finite.
For each i with 1 ≤ i ≤ k, let Pi (x) be a minimal polynomial for αi over F
and let Qi (x) be a minimal polynomial for αi over Fi−1 . We will use this data to
count the number of elements in Aut(E/F ). It is in this lemma that we deliver
on our promise that enough assumptions about the presence of roots and the
non-duplication of roots gives the number of automorphisms that is predicted
by the degree.
We will build automorphisms by building homomorphisms of the Fi into
E. These will not necessarily be automorphisms of the Fi , but they will fit
together to give automorphisms of E. We will use Proposition 13.3.3 to count
the homomorphisms. To help with the count, we let ni be the number of homomorphisms from Fi into E that restrict to the identity on F . Note that nk will
be the number of elements in Aut(E/F). This is because each homomorphism
θ from Fk = E into E that is the identity on F will have [θ(E) : F ] = [E : F ].
Thus [E : θ(E)] = 1 and θ(E) = E.
We proceed by induction. Our goal is to prove that ni = [Fi : F ] for each
i with 0 ≤ i ≤ k. It is true for i = 0 since there is only one homomorphism
from F0 = F into E that is the identity on F , and [F : F ] = 1. We assume that
ni−1 = [Fi−1 : F ] for some i ≥ 1 and work to prove the corresponding statement
for ni .
Let θ be one of the ni−1 homomorphisms from Fi−1 into E that is the identity
on F . We know from Proposition 13.3.3 that the number of extensions of θ to
a homomorphism from Fi into E is the number of roots of θ(Qi )(x) that are in
E. Let di be the degree of Qi(x). It is also the degree of θ(Qi)(x).
We know that Qi(x) divides Pi(x) and that Pi(x) splits into a product of
linear factors over E and that no factor is repeated. We also know that θ(Qi )(x)
divides θ(Pi )(x). But the coefficients of Pi (x) are in F and θ is the identity on
F . Thus θ(Pi )(x) = Pi (x) and θ(Qi )(x) divides Pi (x). Thus over E, θ(Qi )(x)
factors into a product of some of the linear factors of Pi (x), none of which are
repeated. From this and Proposition 13.3.3 we get that the number of extensions
of θ to Fi is exactly di . Note that every extension is the identity on F since it
is extending a homomorphism that is already the identity on F .
Note that extensions of different homomorphisms from Fi−1 into E must be
different since they disagree on Fi−1. Also, for a given homomorphism θ from
Fi−1 to E, the di extensions to Fi are distinct from one another (specifically,
they disagree on αi, by the extra provisions of Proposition 13.3.3 if you
want details). Thus each of the ni−1 homomorphisms from Fi−1 into E that are
the identity on F give di different extensions to Fi and there are ni−1 di such
extensions in total. But ni−1 = [Fi−1 : F ] by the inductive assumption and
di = [Fi : Fi−1 ] by Lemma 12.3.2, so there are [Fi : Fi−1 ][Fi−1 : F ] = [Fi : F ]
extensions and we have shown ni = [Fi : F ].
By induction, we get to nk = [E : F ] and, as mentioned, this shows that
[E : F] equals the order of Aut(E/F). This proves that F ⊆ E is Galois.

Proposition 14.3.3 A finite, separable, normal extension of fields is a Galois
extension.
Proof. Let F ⊆ E be an extension satisfying the hypotheses. There is a finite
sequence α1 , α2 , . . . , αk so that F (α1 , α2 , . . . , αk ) = E. To get such a sequence,
we start with F . If F is not all of E, we let α1 be any element in E − F . If
F (α1 ) is not all of E, we let α2 be in E − F (α1 ). Inductively if F (α1 , . . . , αi ) is
not all of E, we take αi+1 to be any element of E − F (α1 , . . . , αi ). This process
must stop since [E : F ] is finite. Note that each αi is algebraic over F since
F ⊆ E is a finite extension.
Now the hypotheses in the statement show that we have all the hypotheses
of Lemma 14.3.2, and the result follows from that lemma.
14.3.2 Splitting fields
We need to make some remarks about splitting fields. The construction of
splitting fields in Proposition 13.4.1 needs no assumption of irreducibility. This
allows us to split any finite set of polynomials by being able to split any one
polynomial. Given a finite set of polynomials, just multiply all of them together
and build a splitting field for the product.
It turns out that infinite sets of polynomials can be split, but we will have
no need here of such power. So the assumption below that we are looking at a
splitting field of a single polynomial is no real restriction for us.
We start with a lemma that gives a brief introduction to the power of the
assumption that a field is a splitting field.
Lemma 14.3.4 Let F ⊆ E ⊆ K be an extension of fields, and assume that
E is a splitting field for a polynomial P (x) in F [x]. Then any homomorphism
from E into K that is the identity on F is an automorphism of E and thus an
element of Aut(E/F ).
Proof. Let A = {α1 , α2 , . . . , αk } be the roots of P (x). Then E = F (α1 , α2 , . . . , αk ).
An immediate consequence is that the extension F ⊆ E is finite.
Let θ be a homomorphism from E into K that is the identity on F . Since θ
is the identity on F , we must have that each θ(αi ) is a root of P (x). Thus each
θ(αi ) is an element of A. Thus θ takes every element of A into A and thus into
E.
We know that θ(E) ∩ E is a field. But it contains F and all of A. Thus it
contains E = F (α1 , α2 , . . . , αk ). So E ⊆ θ(E).
We know that [E : F ] is finite and must equal [θ(E) : F ]. So [θ(E) : E] = 1
and E = θ(E).
We now give the proposition that shows that “splitting” implies “normal”
in the presence of “separable.” It uses the same key lemma that Proposition
14.3.3 does.
Proposition 14.3.5 Let F ⊆ E be a separable extension of fields, and assume
that E is a splitting field for a polynomial in F [x]. Then the extension is a
Galois extension.
Remark. The power of the result might be hidden. Proposition 14.3.5 says
that a separable extension that is a splitting field for one polynomial splits all
polynomials that have at least one root in the extension. Further, there is a
great deal that can be said about the automorphism group of the extension.
Proof. As in the proof of Lemma 14.3.4, we know that the extension is finite.
Let P (x) be the polynomial for which E is the splitting field, and let A =
{α1 , . . . , αk } be the set of roots of P (x). We have that E = F (A). Each αi has
a minimal polynomial Pi (x) over F which must divide P (x). The separability
assumption makes each Pi (x) separable, and the fact that P (x) splits in E says
that each Pi (x) splits in E. We now have all the hypotheses of Lemma 14.3.2
and the conclusion of that lemma is that F ⊆ E is a Galois extension.
The converse to Proposition 14.3.5 is easy to get from Proposition 14.3.1
and the remarks that we made at the beginning of this section that a splitting
field for finitely many polynomials is a splitting field for one polynomial by the
trick of multiplying all the polynomials together.
14.3.3 Characterizations of Galois extensions
The theorem below summarizes what we know from our definitions, propositions
and lemmas.
Theorem 14.3.6 The following are equivalent for an extension F ⊆ E of fields.
1. The extension is Galois.
2. The degree [E : F ] is finite and equals the order of Aut(E/F ).
3. The degree [E : F ] is finite and the fixed field of Aut(E/F ) is F .
4. The extension is finite, separable and normal.
5. The extension is a separable extension and E is a splitting field for some
polynomial in F [x].
The statements simplify if we assume the fields have characteristic zero since
characteristic zero implies separable.
Theorem 14.3.7 The following are equivalent for an extension F ⊆ E of fields
of characteristic zero.
1. The extension is Galois.
2. The degree [E : F ] is finite and equals the order of Aut(E/F ).
3. The degree [E : F ] is finite and the fixed field of Aut(E/F ) is F .
4. The extension is finite and normal.
5. E is a splitting field for some polynomial in F [x].
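As an illustration of Theorem 14.3.7 (using the standard example x³ − 2, an assumption of this sketch): its splitting field over Q is E = Q(∛2, ω) with ω a primitive cube root of unity, [E : Q] = 6, and so |Gal(E/Q)| = 6. The degree can be checked via a primitive element such as ∛2 + √−3 (our choice of primitive element):

```python
from sympy import cbrt, sqrt, I, symbols, minimal_polynomial, degree

x = symbols('x')
# gamma generates the splitting field of x^3 - 2 over Q, since
# Q(cbrt2, sqrt(-3)) = Q(cbrt2, omega) where omega = (-1 + sqrt(-3))/2.
gamma = cbrt(2) + sqrt(3)*I
m = minimal_polynomial(gamma, x)
assert degree(m, x) == 6          # [E : Q] = 6 = |Gal(E/Q)|
```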
14.4 The fundamental theorem of Galois Theory
We are now in a position to reveal our true motives.
We are interested in seeing when a polynomial P (x) has its roots expressible
in terms of its coefficients by a sequence of five operations, four of which are
the field operations and the fifth being the taking of n-th roots. This translates
into asking whether there is a chain of field extensions
F1 ⊆ F2 ⊆ · · · ⊆ Fk
where F1 contains the coefficients, Fk contains the roots and each extension
Fi ⊆ Fi+1, 1 ≤ i < k, is of the form Fi+1 = Fi(αi) where αi is a root of a
polynomial of the form x^n − bi for some bi ∈ Fi.
Such a setup has a name. We say that the extension F1 ⊆ Fk is a radical
extension if the sequence of extensions as described connecting F1 and Fk exists.
A radical extension of one step (each Fi ⊆ Fi+1 , for example) would be described
as “adding an n-th root” of an element.
Given an extension F ⊆ E of fields, we refer to a field K with F ⊆ K ⊆ E
as a field that is intermediate to (or between) F and E. Thus our interest is
first in finding fields intermediate to F1 and Fk , and second in understanding
the relationships between various pairs of the intermediate fields.
The Galois group of a Galois extension
The Fundamental Theorem of Galois Theory takes care of the first and a bit
of the second. For Galois extensions, it tells exactly what the intermediate
fields are and gives some information about their relationships. The theorem
gives this information in terms of the group Aut(E/F) and its subgroups. This
makes the group Aut(E/F ) so important that it is given a name. For a Galois
extension F ⊆ E of fields, we call Aut(E/F ) the Galois group of the extension
and denote it as Gal(E/F ).
The full analysis of the second question (“What is the relation between two
of the intermediate fields?”) will come from a more detailed examination of the
structure of the Galois group than is given by the fundamental theorem. This
will occupy us in the next chapter.
14.4.1 Some permutation facts
One part of the proof of the fundamental theorem relies on key facts about
conjugation in groups of permutations. It has been a while since this was seen,
so we review it here. We also review the notation since different notations are
used in different books.
If h and g are in a group G, we write h^g for ghg⁻¹. We call h^g the conjugate
of h by g. If H is a subgroup of G, we write H^g for {h^g | h ∈ H}. We call H^g
the conjugate of H by g.
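A quick check of the conjugation notation using sympy's permutation classes (the specific permutations are our choice; we take g to be an involution so that sympy's left-to-right composition convention and the convention above give the same conjugate):

```python
from sympy.combinatorics import Permutation

# In S_3 acting on {0, 1, 2}: h = (0 1) fixes the point 2,
# and g = (1 2) sends 2 to 1.
h = Permutation([1, 0, 2])        # the transposition (0 1)
g = Permutation([0, 2, 1])        # the transposition (1 2)
hg = g * h * g**(-1)              # the conjugate of h by g
assert hg == Permutation([2, 1, 0])   # the transposition (0 2)
# Conjugation relabels fixed points: h fixes 2, and h^g fixes g(2) = 1.
assert h(2) == 2
assert hg(1) == 1
```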
If G acts on a set S by permutations, then each g ∈ G is a one-to-one
correspondence from S to itself. For a subgroup H of G, we write Fix(H) for
the set {s ∈ S|h(s) = s for all h ∈ H}. One basic fact that we want is the
following.
Lemma 14.4.1 In the setting just described, Fix(H^g) = g(Fix(H)). In particular, if H ⊳ G, then Fix(H) = g(Fix(H)) for all g ∈ G.
Proof. This is just Lemma 5.3.3 with an extra observation added at the end.

14.4.2 The Fundamental Theorem
Theorem 14.4.2 (Fundamental Theorem of Galois Theory) Let F ⊆ E
be a Galois extension of fields. Then for every field K intermediate to F and
E, the extension K ⊆ E is a Galois extension and Gal(E/K) is a subgroup of
Gal(E/F). Further, sending K to Gal(E/K) gives a one-to-one correspondence
between the fields intermediate to F and E and the subgroups of Gal(E/F ).
This one-to-one correspondence has the following properties:
1. If F ⊆ K ⊆ L ⊆ E, then Gal(E/L) ⊆ Gal(E/K).
2. A field K intermediate to F and E is a Galois extension of F if and only
if Gal(E/K) is a normal subgroup of Gal(E/F ). When this occurs, then
(a) θ(K) = K for all θ ∈ Gal(E/F ),
(b) π : Gal(E/F ) → Gal(K/F ) defined by π(θ) = θ|K for each θ ∈
Gal(E/F ) is onto and a homomorphism of groups, and
(c) Gal(K/F) is isomorphic to the quotient group Gal(E/F)/Gal(E/K).
We make some comments before giving the proof.
Item 1 in the conclusion says that the one-to-one correspondence K ↔
Gal(E/K) is containment reversing. Larger intermediate fields have smaller
Galois groups. This even applies to the extremes. The field E is intermediate to F and E; it is the largest such intermediate field, and its Galois group
Gal(E/E) is the smallest possible: the trivial group. The field F is the smallest
field intermediate to F and E and its Galois group Gal(E/F ) is the largest
subgroup of Gal(E/F ).
Item 2 in the conclusion says that the one-to-one correspondence K ↔
Gal(E/K) also serves as a one-to-one correspondence between the Galois extensions F ⊆ K with K intermediate to F and E and the normal subgroups of
Gal(E/F ).
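A small numerical illustration of the correspondence (the standard example E = Q(√2, √3), an assumption of this sketch): Gal(E/Q) ≅ Z/2 × Z/2 has exactly three subgroups of order 2, and they correspond to the three quadratic intermediate fields Q(√2), Q(√3), and Q(√6):

```python
from sympy import sqrt, symbols, minimal_polynomial, degree

x = symbols('x')
E_degree = 4   # [E : Q] for E = Q(sqrt2, sqrt3); |Gal(E/Q)| = 4
# Each order-2 subgroup H has fixed field K with [E : K] = |H| = 2,
# hence [K : Q] = 2. The three quadratic intermediate fields:
for a in (sqrt(2), sqrt(3), sqrt(6)):
    K_degree = degree(minimal_polynomial(a, x), x)
    assert K_degree == 2
    assert E_degree == K_degree * 2    # [E : Q] = [E : K][K : Q]
```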
We can now justify the word normal as applied to extensions. The finiteness
of the extension F ⊆ K comes from the setting, and for fields of characteristic zero, separability is immediate as well. By Item 4 of Theorem 14.3.7,
the only property needed to get that F ⊆ K is Galois for fields of characteristic
zero is that the extension be normal. The fundamental theorem says that this
happens if and only if the corresponding Galois group is normal. This explains
why normal subgroups and normal extensions have the same name.
The statement of the Fundamental Theorem of Galois Theory makes a lot
of promises, and so the proof is correspondingly long. However, no one of the
conclusions is very difficult to prove. It will be found that most of the work has
been done in Section 14.3.
In the proof we sometimes write Aut( / ) and other times write Gal( / ).
Our rule is that we use Aut when it is not known if the extension is Galois,
and we use Gal when we know that the extension is Galois. This is not exactly
standard terminology.
Proof. If K is intermediate to F and E, then we must first show that K ⊆ E
is a Galois extension. From Theorem 14.3.6, we have a choice of approach. We
will use item 4 from that theorem and show that K ⊆ E is finite, separable and
normal. But F ⊆ E is finite, separable and normal. Finiteness of K ⊆ E is
immediate. If α is in E, then its minimal polynomial Q(x) over K must divide
its minimal polynomial Q(x) over K must divide
roots. Thus Q(x) is a product of some of the linear factors that make up P (x)
and so Q(x) splits in E and has no repeated roots.
We know that Gal(E/K) is a subgroup of Aut(E). Since every element of
Gal(E/K) is the identity on K and F ⊆ K, it is also the identity on F . Thus
every element of Gal(E/K) is an element of Gal(E/F ).
We now have a function g from I, the set of fields intermediate to F and
E, to S, the set of subgroups of Gal(E/F). It is defined by g(K) = Gal(E/K).
To show that g is one-to-one and onto, we build an inverse. Let f : S → I be
defined by f(H) = Fix(H) for each subgroup H of Gal(E/F). We must show
that gf and fg are identity functions.
For K intermediate to F and E, the field fg(K) is the fixed field of Gal(E/K).
But the basic property of Galois extensions is that this fixed field is K. So the
work in showing that fg is the identity is found in the proof that K ⊆ E is Galois.
For a subgroup H of Gal(E/F ), the field f (H) is the fixed field L of H in
E. If m is the order of H, then Proposition 13.5.2 says that [E : L] = m. Since
every element of H fixes F , we must have F ⊆ L, so L is intermediate to F and
E and it is known that L ⊆ E is Galois. Now gf (H) is Gal(E/L) the group of
all automorphisms of E that fix L. This must include H since all elements of
H fix L and we have H ⊆ gf (H). But for the Galois extension L ⊆ E, we must
have that m = [E : L] equals the order of Gal(E/L). Thus we have that H and
gf (H) have the same order and must be equal.
We have shown the provisions in the first paragraph of the statement of the
theorem.
If F ⊆ K ⊆ L ⊆ E, then every element of Gal(E/L) fixes L which contains
K. Thus every element of Gal(E/L) fixes K and must be in Gal(E/K). This
proves Item 1.
For Item 2, there are several things to prove: both directions of the if and
only if, and Items (a)–(c).
Assume that K is intermediate to F and E and that F ⊆ K is a Galois
extension. Then it is a splitting field for some polynomial P (x) in F [x].
Let θ be in Gal(E/F ). The restriction of θ to K is a homomorphism from
K into E that is the identity on F . From Lemma 14.3.4, we know that this
restriction has image K and is an element of Aut(K/F ). This gives Item (a).
Thus π(θ) = θ|K is a function from Gal(E/F ) to Gal(K/F ). Since K is taken
to itself by each automorphism in Gal(E/F ), composing two restrictions gives
the same result as restricting the result of a composition of two elements from
Gal(E/F ). Written out, this says θ|K ρ|K = (θρ)|K and gives that the function
π is a homomorphism. We want to prove that π is onto.
The kernel of π consists of those elements of Gal(E/F ) whose restriction
to K is the identity on K. But this is just a description of Gal(E/K). Thus
Gal(E/K) is normal in Gal(E/F ) and we have one direction of the “if and
only if.” For Item (c), we note that the first isomorphism theorem for group
homomorphisms says that the image of a group homomorphism is isomorphic to
the domain of the homomorphism modulo the kernel. In our situation, this says
that the image of π is isomorphic to Gal(E/F)/Gal(E/K). Thus the order of
the image of π is
$$\frac{|\mathrm{Gal}(E/F)|}{|\mathrm{Gal}(E/K)|} = \frac{[E:F]}{[E:K]}$$
by Item 2 of Theorem 14.3.6. But F ⊆ K ⊆ E says that
[E : F ] = [E : K][K : F ]
so
$$\frac{[E:F]}{[E:K]} = [K:F] = |\mathrm{Gal}(K/F)|$$
and the image of π must be all of Gal(K/F). Thus π is onto and Gal(K/F) is
isomorphic to Gal(E/F)/Gal(E/K).
For the other direction of the “if and only if” we assume that Gal(E/K) is
normal in Gal(E/F) and we work to prove that F ⊆ K is Galois. To shorten
the notation, let Γ = Gal(E/K).
We know from the first part of this proof that K ⊆ E is Galois, so K is the
fixed field of Γ = Gal(E/K). From Lemma 14.4.1, we know that θ(K) = K
for every θ in Gal(E/F ). (This proves Item (a) again, but from a different
hypothesis.) At this point, we can choose any of a number of arguments to
show that F ⊆ K is Galois. We choose Item 3 of Theorem 14.3.6, and show
that F ⊆ K is Galois because F is the fixed field of Aut(K/F ).
We know that F is the fixed field of Gal(E/F ). If a ∈ K − F , then there
is a θ ∈ Gal(E/F) with θ(a) ≠ a. But θ(K) = K and so θ|K is an element of
Aut(K/F ) that does not fix a. Thus no element of K − F is in the fixed field of
Aut(K/F ). But F is clearly in the fixed field of Aut(K/F ). So F is the fixed
field of Aut(K/F ) and F ⊆ K is Galois.
Chapter 15
Galois theory in C
We now apply the results in the previous chapter to subfields of C, the field of
complex numbers. This setting is so important that it has a name of its own.
We will refer to subfields of C as number fields. So the title of the chapter could
have been “Galois theory of number fields.” However, the terminology is not
completely standard¹ and the chapter title we use is more specific.
The setting has numerous advantages. One important fact is that polynomials in C[x] split in C. This is called the Fundamental Theorem of Algebra
and will be proven here if we have time. Its proof needs at least some analysis
(calculus). Another advantage is that C has characteristic zero and extensions
are thus automatically separable. A third advantage is that we know a lot about
multiplying and adding in C. In particular, we know a lot about the n-th roots
of complex numbers (with n-th roots of 1 being the most important special case)
which will turn out to be very important in our analysis.
The goal of this chapter is to use Galois theory to give information about
roots of polynomials. In particular, we will determine when a polynomial P (x)
in Q[x] has roots that can be calculated from the coefficients of P (x) using the
four field operations and the taking of n-th roots. This will be translated into
asking when the extension Q ⊆ K (where K is a splitting field for P (x)) has
a very specific structure. The triumph of Galois theory is to tie this specific
structure to properties of the group Gal(K/Q). The Fundamental Theorem of
Galois Theory and the interplay between our knowledge of n-th roots in C and
the automorphisms they lead to will give the analysis we seek.
The Fundamental Theorem of Algebra
To indicate its importance and to make sure we have it on record, we state
the theorem here. A proof will be supplied much later (and perhaps not before
the end of the semester).
¹ The term “number field” is often used to refer to a finite extension of Q.
Theorem 15.0.3 (Fundamental Theorem of Algebra) Every polynomial in
C[x] splits in C[x].
15.1 Radical extensions
We wish to investigate polynomials over a number field F . Typically we will
want to understand some P (x) over Q, but number fields other than Q could
be used as well.
We repeat some of a previous discussion.
If P (x) ∈ F [x] is given then we know its coefficients. If its roots can be
found from the coefficients by the five operations that we have discussed (the
four field operations and the taking of n-th roots), then we say that P (x) is
solvable by radicals.
There is a structure that we will associate to the concept of solvability by
radicals. Consider the following structure where all the Fi are number fields.
We have a sequence of field extensions
F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk
where Fi for 1 ≤ i ≤ k is of the form Fi−1 (αi ) and αi ∈ Fi is a root of a
polynomial of the form xni − βi−1 for some element βi−1 ∈ Fi−1 . In other
words, we get from one field to the next larger field by adding an element that
is a root of some element in the smaller field.
If such a structure exists, then we say that Fk is a radical extension of
F0 . We will refer to the collection of the ni as the exponents of the structure.
This is not standard terminology, but it will be convenient for us. Also, it is
sometimes required that the structure have other nice properties. For example,
it is sometimes required that each extension Fi ⊆ Fi+1 be normal. We will
not make that requirement now, but will add it later and show that it can be
achieved. We will also show that other properties can be added.
Lemma 15.1.1 If P (x) is a polynomial over a number field F , then P (x) is
solvable by radicals if and only if the splitting field E ⊆ C for P (x) over F is
contained in a radical extension Fk of F .
Proof. If P (x) is solvable by radicals, then there is a set of intermediate calculations that lead to the roots. Since there are finitely many roots, there are
finitely many intermediate calculations. We start with F0 = F , and we consider
the list of intermediate values. We eliminate all intermediate values that are
already in F . We take an intermediate value α1 that is not in F but that is a root
of an element of F . We form F1 = F (α1 ), and then eliminate from the list of
intermediate values that remains all values that are now in F1 . We continue in
this way until we have a field Fk that contains all the roots. Since Fk contains
all the roots of P (x), it must contain a splitting field for P (x) over F . By the
way Fk is constructed, it is a radical extension of F .
Conversely, assume that there is a radical extension Fk of F that contains
a splitting field E for P (x) over F . We will be done when we show that every
element of Fk can be obtained from the elements of F by the four field operations, and the taking of n-th roots. This is done quite simply by induction
on k. There is nothing to show if k = 0. If k ≥ 1, let Fi , 0 ≤ i ≤ k, with
F0 = F , and elements αi , βi and ni be as in the definition above of a radical
extension. We have that Fk = Fk−1 (αk ) and that αk is a root of xnk − βk−1
with βk−1 ∈ Fk−1 . From Corollary 12.3.3, we know that every element in Fk
is a linear combination of powers of αk with coefficients from Fk−1 . By the
inductive assumption, every element of Fk−1 can be obtained from elements of
F using the five allowed operations, and so we can get all elements of Fk
from elements of F by these operations, the taking of nk -th roots (specifically
of βk−1 ), and some further field operations. Since E ⊆ Fk and E contains all
roots of P (x), this applies to the roots of P (x).
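As a miniature instance of solvability by radicals, the quadratic formula builds the roots of x2 + bx + c from the coefficients using the four field operations and a single square root; here the radical tower is just F ⊆ F (√(b2 − 4c)). A sketch in Python (illustrative only, not part of the proof):

```python
import cmath

def quadratic_roots(b, c):
    # Roots of x^2 + bx + c: field operations on the coefficients plus one
    # square root (the only "taking of n-th roots" needed, with n = 2).
    disc = cmath.sqrt(b * b - 4 * c)
    return (-b + disc) / 2, (-b - disc) / 2

r, s = quadratic_roots(-3, 2)   # x^2 - 3x + 2 = (x - 1)(x - 2)
assert abs(r - 2) < 1e-12 and abs(s - 1) < 1e-12
```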
15.1.1 An outline
Lemma 15.1.1 gives enough of a picture for us to describe how things will go.
We have fields F ⊆ E ⊆ Fk . The radical extension Fk of F might not be
Galois, but the extension F ⊆ E is Galois. If Fk were Galois over F , then we
would have
Gal(E/F ) ≃ Gal(Fk /F )/Gal(Fk /E),
making Gal(E/F ) a quotient group of Gal(Fk /F ).
It turns out that we can extend the radical extension of Lemma 15.1.1 even
farther to make it Galois and still keep it as a radical extension. It further
turns out that automorphism groups of Galois, radical extensions are solvable
groups. This is, in fact, why solvable groups are called solvable. Recall that quotients
of solvable groups are solvable. This gives us our first major result. If P (x) is
a polynomial over a number field F that is solvable by radicals, and E is the
splitting field in C for P (x) over F , then Gal(E/F ) is a solvable group.
We will use this result to show that a specific polynomial in Q[x] (it will have
degree 5 since all polynomials of degree no more than 4 are solvable by radicals)
is not solvable by radicals by computing the Galois group of its splitting field
and showing that it is not a solvable group.
Thus we will have illustrated one direction of the statement “P (x) ∈ F [x]
is solvable by radicals if and only if a splitting field E for P (x) over F has
that Gal(E/F ) is a solvable group.” It is the easier direction and the more
flamboyant.
The other direction is harder and more interesting. It says that if the Galois
group is known to have a certain structure (namely, being solvable), then the
extension must have a certain structure. That this direction turns out to be
true creates a need for two comments.
First, this shows that not only is the solvability of the Galois group necessary
for solvability of the polynomial by radicals (and thus something that prevents
solvability of the polynomial by radicals when it is absent), but it is also sufficient. That is, it captures exactly the solvability of the polynomial.
Second, it is a powerful example where the niceness of structure of the group
of symmetries implies niceness of the structure having those symmetries. This
is why it is the harder direction. One must use properties of the group of
symmetries to prove that the splitting field of the polynomial is contained in an
extension that can be built step by step by extensions that are determined by
polynomials of the form xn − β.
15.2 Improving radical extensions
This section will be an example of making a situation fit a technique. Galois
theory works well on Galois extensions. The result of Lemma 15.1.1 shows
that a polynomial P (x) is solvable by radicals if its splitting field is contained
in a radical extension. The discussion above shows that it would be nice to
know that the radical extension is Galois. It would be even nicer to know more
about the intermediate fields in the radical extension. In particular, getting the
automorphism group to be solvable will require knowing that the intermediate
extensions are normal.
We will show that all this can be achieved, and more. That is, we will
start with the conclusion of Lemma 15.1.1 and show that the conclusion can be
strengthened to the point where Galois theory can make some contributions.
One task of this section is to show that any radical extension can be made
Galois. But there will be another task. We will also show that the “internal
structure” of the radical extension can be shown to have a particularly simple
structure. This will help when we work to compute the Galois group of the
extension.
The terminology that we use for the following is not standard. Let F ⊆ K
be an extension of fields. We say that K is an improved radical extension of F
if F ⊆ K is Galois, and if there are fields Fi , elements βi , primes pi , and an
integer n so that
F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fn = K
with βi ∈ Fi , 0 ≤ i < n, and with each Fi , 1 ≤ i ≤ n, a splitting field over Fi−1
of xpi − βi−1 .
The definition needs some comments. First, if the definition of improved
radical extension is compared carefully to the definition of a radical extension,
then it is not immediately obvious that an improved radical extension is a radical
extension. This is cleared up in the next lemma.
Second, the requirement that F ⊆ K be Galois has already been given
motivation by the discussion in the previous section.
Third, the requirement that the exponents pi used in the polynomials all be
prime is more of a convenience than a necessity. The fact that the exponents
are primes will make certain arguments slightly easier.
Lemma 15.2.1 If F ⊆ K is an improved radical extension of fields, it is a
radical extension of fields.
Proof. The item missing from the definition of a (plain) radical extension is
that each intermediate extension be by a single element that is a root of a
polynomial of the right form. But if Fi is a splitting field over Fi−1 for the
polynomial xpi − βi−1 , then Fi = Fi−1 (α1 , α2 , . . . , αk ) where α1 , α2 , . . . , αk are
all the roots of xpi − βi−1 . If we add one root at a time to Fi−1 and give
the intermediate fields names, then we have the required structure of a radical
extension between Fi−1 and Fi since each αj is a root of xpi − βi−1 . This can
be done between each pair Fi−1 ⊆ Fi . Thus by adding extra intermediate fields,
we get that F ⊆ K is a radical extension.
Not every radical extension is an improved radical extension. But every
radical extension can be made bigger so that it becomes an improved radical
extension. We tackle the improvements one at a time. First we get the exponents
in the polynomials to be primes.
15.2.1 The first improvement
If F ⊆ F (α) is an extension of fields, and α is a root of xn − β with β ∈ F , then
αn = β. But if n is not a prime, then n = pq for some prime p and some q > 1,
and αn = (αq )p = β and we can introduce a field intermediate to F and F (α)
to get
F ⊆ F (γ) ⊆ F (α)
where γ = αq ∈ F (α). Now γ is a root of xp − β over F , and α is a root
of xq − γ over F (γ). Since q < n, we have all the ingredients needed to give an
inductive proof of the following.
Lemma 15.2.2 Let F ⊆ F (α) be an extension where α is a root of xn − β for
β ∈ F . Then there are fields Fi , elements αi and βi , primes pi and a natural
number k so that
F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk = F (α)
where for 1 ≤ i ≤ k, Fi = Fi−1 (αi ), and αi is a root of xpi − βi−1 with
βi−1 ∈ Fi−1 .
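The inductive step behind Lemma 15.2.2 can be illustrated numerically; the choices n = 6 = 2 · 3 and β = 5 below are ours, made only for the sketch:

```python
# With n = pq = 2*3 and beta = 5: if alpha is a root of x^6 - beta, then
# gamma = alpha^3 is a root of x^2 - beta over F (prime exponent 2), and
# alpha is a root of x^3 - gamma over F(gamma) (prime exponent 3).
beta = 5.0
alpha = beta ** (1 / 6)   # one real sixth root of beta
gamma = alpha ** 3        # equals the square root of beta

assert abs(gamma ** 2 - beta) < 1e-12   # gamma is a root of x^2 - beta
assert abs(alpha ** 3 - gamma) < 1e-12  # alpha is a root of x^3 - gamma
```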
By using Lemma 15.2.2 at each stage of the definition of a radical extension,
we get the following corollary to Lemma 15.2.2.
Corollary 15.2.3 Let E be a radical extension of a number field F . Then there
are fields Fi , elements αi and βi , primes pi and a natural number k so that
F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fk = E
where for 1 ≤ i ≤ k, Fi = Fi−1 (αi ), and αi is a root of xpi − βi−1 with
βi−1 ∈ Fi−1 .
15.2.2 The second improvement
In the proof of the following, we will refer to splitting fields for certain polynomials. We know that splitting fields can always be constructed, but here we
can take advantage of the fact that we work in C, and a splitting field for any
polynomial always exists in C.
Proposition 15.2.4 If F ⊆ K is a radical extension of a number field, then
there is an improved radical extension F ⊆ E that has K ⊆ E.
Proof. We first apply Corollary 15.2.3 so that we can assume that all the exponents of the structure of the radical extension F ⊆ K are primes.
We next list, in the order that they are added, all the elements α1 , α2 , . . . ,
αk that are added to F to obtain K. We will induct on k. We let Fi be the
extension of F by α1 through αi . There is nothing to do when k = 0, and so we
assume that there is an improved radical extension F ⊆ G that contains Fk−1 .
We will add αk to Fk−1 and then enough extra so that the resulting extension E
satisfies the conclusion of the proposition. Note that αk is a root of xpk − βk−1
where βk−1 is from Fk−1 and is thus in G.
We will obtain E by building a succession of extensions
G = S0 ⊆ S1 ⊆ S2 ⊆ · · · ⊆ St = E
(15.1)
where each Si is a splitting field over Si−1 of a polynomial of the form xpk −
ci−1 where ci−1 is some element in Si−1 . To create a splitting field, one must
add all roots of a polynomial. This can be done in succession so that if (say)
r1 , r2 , . . . , rn are the roots of xpk − c0 , then S1 is the end of a succession of
extensions
S0 ⊆ S0 (r1 ) ⊆ S0 (r1 , r2 ) ⊆ · · · ⊆ S0 (r1 , r2 , . . . , rn ) = S1 .
Note that each intermediate extension is of the allowed type: the addition of a
root of xpk − c0 . The fact that c0 is the same for all of the extensions is not a
problem.
Thus we will take it as established that extending by building a splitting
field of a polynomial of the form xpk − c for some c stays within the definition of
a radical extension and keeps the exponents used within the set of primes.
Note also that, since we are building splitting fields, we satisfy one of the
requirements of an improved radical extension.
We now work on getting the last extension (splitting field) St to be Galois
over F . The Fundamental Theorem of Galois Theory tells us what we must do.
The splitting field S1 already contains all roots of xpk − βk−1 . The Galois
extension E of F that we seek will contain the Galois extension G of F . There
will be a surjective homomorphism
π : Gal(E/F ) → Gal(G/F )
given by restriction. That is, if θ is in Gal(E/F ), then π(θ) = θ|G will be in
Gal(G/F ), and every element in Gal(G/F ) arises this way.
Let ρ be some element of Gal(G/F ) and let ρ̄ ∈ Gal(E/F ) be such that its
restriction to G is ρ. If R is the set of all roots of xpk − βk−1 , then ρ̄(R) will be
in E and will all be roots of ρ̄(xpk − βk−1 ). This equals xpk − ρ(βk−1 ) since βk−1
is in G and ρ̄ restricted to G is ρ. (We can see that all the roots of xpk − ρ(βk−1 )
will be in ρ̄(R) by applying ρ̄ −1 to the set of roots of xpk − ρ(βk−1 ) and noting
that they will all be in R.)
So we must not only add roots of xpk − βk−1 to G, we must add all roots of
xpk − ρ(βk−1 ) for every ρ ∈ Gal(G/F ). This gives us our succession of splitting
fields (15.1). The number t of extensions in (15.1) will be the number of
elements of Gal(G/F ).
This means that St = E is a splitting field over G for the polynomial
P (x) = (xpk − ρ1 (βk−1 ))(xpk − ρ2 (βk−1 )) · · · (xpk − ρt (βk−1 ))
where ρ1 , ρ2 , . . . , ρt are all the elements (including the identity) in Gal(G/F ).
This makes E Galois over G, but we would like E to be Galois over F . We will
get this in two steps. First by the inductive hypothesis, G is Galois over F , so
it is a splitting field of some Q(x) over F . In the next paragraph, we will show
that all the coefficients of P (x) are in F so that P (x) ∈ F [x]. Thus Q(x)P (x)
will be in F [x] and E will be a splitting field of Q(x)P (x) over F and will be
Galois over F .
We now bring in a technique that was used in the proof of Proposition 14.3.1.
All the factors of P (x) use coefficients from G (namely 1 and ρi (βk−1 )) and so
we can act on P (x) by any element of Gal(G/F ). If ρ is such an element, then
the factors of ρ(P (x)) will be the factors of P (x) with −ρi (βk−1 ) replaced by
−ρρi (βk−1 ) in each factor. But multiplying all elements of Gal(G/F ) on the left
by one element of Gal(G/F ) simply permutes the elements of Gal(G/F ). Thus
the factors of ρ(P (x)) are exactly those of P (x) except for a change of order.
Thus ρ(P (x)) = P (x). This puts all the coefficients of P (x) in the fixed field of
ρ. Since this applies to any element of Gal(G/F ), we have that the coefficients
of P (x) are in the fixed field of Gal(G/F ) which must be F since F ⊆ G is
Galois. Thus P (x) is in F [x].
This makes E Galois over F . As mentioned above, all other requirements
for an improved radical extension have been obtained.
15.3 The Galois group of an improved radical extension
Recall that an improved radical extension F ⊆ K has fields Fi , elements βi ,
primes pi and an integer n so that F ⊆ K is Galois and so that
F = F0 ⊆ F1 ⊆ F2 ⊆ · · · ⊆ Fn = K
with βi ∈ Fi , 0 ≤ i < n, and with each Fi , 1 ≤ i ≤ n, a splitting field over Fi−1
of xpi − βi−1 . This gives us various extensions to look at, some of which are
Galois.
Since F ⊆ K is Galois, the Fundamental Theorem of Galois Theory says
that each Fi ⊆ K is Galois and we have a succession of groups
{1} = Gn ⊆ Gn−1 ⊆ Gn−2 ⊆ · · · ⊆ G1 ⊆ G0 = Gal(K/F )
where each Gi = Gal(K/Fi ).
We do not know that each Fi is Galois over F , so we do not know that the
Gi are normal in Gal(K/F ). But each Fi is a splitting field for a polynomial
over Fi−1 , so each Fi−1 ⊆ Fi is Galois. This means that Gal(K/Fi ) is normal
in Gal(K/Fi−1 ). In terms of the Gi , this means that each Gi is normal in Gi−1 .
Thus the sequence of groups above can be rewritten as
{1} = Gn ⊳ Gn−1 ⊳ Gn−2 ⊳ · · · ⊳ G1 ⊳ G0 = Gal(K/F ).
(15.2)
Recall that normality is not transitive, and that this does not imply that all the
Gi are normal in Gal(K/F ).
We are interested in analyzing each
Gi−1 /Gi = Gal(K/Fi−1 )/Gal(K/Fi ).
Since each extension in Fi−1 ⊆ Fi ⊆ K is Galois, the Fundamental Theorem of Galois Theory says that the quotient above is isomorphic to Gal(Fi /Fi−1 ).
But this is the Galois group of a splitting field for a polynomial of the form
xpi − βi−1 . We are interested in this in the setting of number fields, which
means that we are looking at an extension obtained by adding all the complex
roots of a single complex number βi−1 . Since we know much about such roots,
we are in a good position to analyze the corresponding Galois group.
To simplify the notation, let us consider number fields G ⊆ G′ where G′ is
the splitting field over G in C for xp − β with β ∈ G and p a prime. We want
to say something about Gal(G′ /G).
The case p = 1 is either ruled out because 1 is not a prime, or is ruled out
because it is trivial. Thus we assume p > 1.
We know that the p-th roots of β in C are of the form γ i α, 0 ≤ i < p,
where γ is the p-th root of 1 making the smallest positive angle with respect to
the positive real axis, and α is one p-th root of β. Since all of these roots are
in G′ and there is more than one root (using p > 1), their ratios are in G′ as
well. Thus all the γ i , 0 ≤ i < p, are in G′ , and we see that G′ contains Gp , the
splitting field over G in C for xp − 1. Once all the γ i are in Gp , to get all the
roots of xp − β, we only need to add α. Thus we have the sequence of extensions
G ⊆ Gp ⊆ Gp (α) = G′ .
The extension G ⊆ Gp (α) = G′ is Galois, as is G ⊆ Gp since Gp is a splitting
field of a polynomial. Thus Gal(Gp (α)/Gp ) is normal in Gal(G′ /G). Thus we
can insert into (15.2), a normal subgroup between each pair Gi ⊳ Gi−1 to obtain
a sequence twice as long in which half the extensions consist of adding all roots
of the polynomial xp − 1 for some prime p, and the remaining extensions consist
of adding one root (and thus all roots) of a polynomial of the form xp − β for
some prime p and some β to a field containing β and all the p-th roots of 1.
We look at the Galois group of each of the two types.
If we consider G ⊆ Gp where Gp is the splitting field over G in C of xp − 1,
then Gp = G(γ) with γ as described above. This is because all p-th roots of
1 are of the form γ i for some i. Thus all automorphisms in Gal(Gp /G) are
completely determined by where γ goes.
Since roots of xp − 1 have to be taken to roots of xp − 1, an automorphism
in Gal(Gp /G) must take γ to some γ i . If θ(γ) = γ i and φ(γ) = γ j , for θ and
φ in Gal(Gp /G), then θ(φ(γ)) = θ(γ j ) = (γ i )j = γ ij . Similarly φ(θ(γ)) = γ ij .
Since elements of Gal(Gp /G) are determined by what they do on γ, we have
θ ◦ φ = φ ◦ θ and we have shown that Gal(Gp /G) is abelian.2
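The composition rule just computed, γ ↦ γ ij , can be checked with actual p-th roots of unity in C; a numerical sketch (p = 7 is an arbitrary choice, and this illustrates the commutativity rather than proving it):

```python
import cmath
import math

p = 7
gamma = cmath.exp(2j * math.pi / p)   # primitive p-th root of 1

def aut(i):
    # The map determined by gamma -> gamma^i, applied to a power of gamma.
    return lambda z, i=i: z ** i

# Composing gamma -> gamma^i with gamma -> gamma^j sends gamma to gamma^(ij);
# since ij = ji, any two of these maps commute on gamma.
for i in range(1, p):
    for j in range(1, p):
        assert abs(aut(i)(aut(j)(gamma)) - aut(j)(aut(i)(gamma))) < 1e-9
```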
If we consider Gp ⊆ Gp (α) where Gp contains the p-th root γ of 1 as described
above (and thus all p-th roots of 1), and α is one root of xp − β for β ∈
Gp , then Gp (α) is a splitting field for xp − β over Gp . All automorphisms in
Gal(Gp (α)/Gp ) are completely determined by where α goes.
Since α must be taken to a root of xp −β, an automorphism in Gal(Gp (α)/Gp )
must take α to some αγ i for some i. If θ(α) = αγ i and φ(α) = αγ j for θ and φ
in Gal(Gp (α)/Gp ), then
θ(φ(α)) = θ(αγ j ) = θ(α)θ(γ j ) = αγ i γ j = αγ i+j
since γ ∈ Gp implies that θ fixes γ. Similarly φ(θ(α)) = αγ i+j . Since elements
of Gal(Gp (α)/Gp ) are determined by what they do on α, we have θ ◦ φ = φ ◦ θ
and we have shown that Gal(Gp (α)/Gp ) is abelian.
Thus we have shown that an improved radical extension has a Galois group
with a sequence of subgroups, each normal in the next so that the successive
quotient groups are abelian. But this is the definition of a solvable group. We
have shown the following.
Proposition 15.3.1 The Galois group of an improved radical extension of
number fields is solvable.
If we combine Lemma 15.1.1, the comments in Section 15.1.1, Proposition
15.2.4 and Proposition 15.3.1, we have the following.
Proposition 15.3.2 Let P (x) be a polynomial over a number field F and let
K be the splitting field in C over F of P (x). If P (x) is solvable by radicals,
then the Galois group Gal(K/F ) is a quotient of a solvable group.
² We have not used the fact that p is prime. In fact, we are leaving out a lot of commentary.
In general, not every γ ↦ γ i is valid. For example, γ ↦ γ p = 1 is not valid. But also, if γ is
in G, then it must be fixed by any automorphism. And even if γ is not in G, γ ↦ γ i cannot
be valid if γ i is in G. So there are values of i that cannot be used. This does not interfere
with our argument. If p is a prime, then the roots of xp − 1 form a cyclic group of order p
under multiplication. Any root of xp − 1 that is not 1 is a generator of this group since the
only subgroup bigger than {1} is the whole group. Thus if any root of xp − 1 other than 1 is
in G, they all are. So if one root of xp − 1 is not in G, then all roots of xp − 1 other than 1
are not in G, and all values of i other than multiples of p can be used for an automorphism
determined by γ ↦ γ i . Thus we get a more complete picture when p is a prime.
But we know from Lemma 8.2.2 that a quotient of a solvable group is solvable. Thus we get the following key result.
Theorem 15.3.3 Let P (x) be a polynomial over a number field F and let K
be the splitting field in C over F of P (x). If P (x) is solvable by radicals, then
the Galois group Gal(K/F ) is a solvable group.
15.4 An example
Consider the polynomial P (x) = 4x5 − 10x2 + 5. This is irreducible over Q
using the Eisenstein criterion with the prime 5. It has odd degree, so it has at
least one real root. We will show that it has three real roots.
We have
P ′ (x) = 20x4 − 20x = 20(x4 − x) = 20x(x3 − 1) = 20x(x − 1)(x2 + x + 1).
We recognize x2 + x + 1 as the polynomial whose roots are the two non-real
cube roots of 1, so the real roots of P ′ (x) are x = 0 and x = 1. We have P (0) = 5
and P (1) = 4 − 10 + 5 = −1. So P (x) crosses the x-axis three times and only
three times. The other two roots of P (x) are complex and, by the last part of
Exercise Set (47), they form a complex conjugate pair.
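Both claims above, irreducibility via Eisenstein at the prime 5 and the three sign changes of P , can be checked mechanically; a sketch (the helper names `eisenstein` and `P` are ours):

```python
# Coefficients of P(x) = 4x^5 - 10x^2 + 5, highest degree first.
coeffs = [4, 0, 0, -10, 0, 5]

def eisenstein(coeffs, p):
    # p divides every coefficient except the leading one, and p^2 does not
    # divide the constant term.
    lead, const, middle = coeffs[0], coeffs[-1], coeffs[1:-1]
    return lead % p != 0 and const % (p * p) != 0 and all(c % p == 0 for c in middle)

def P(x):
    return 4 * x ** 5 - 10 * x ** 2 + 5

assert eisenstein(coeffs, 5)
values = [P(x) for x in (-1, 0, 1, 2)]          # [-9, 5, -1, 93]
sign_changes = sum(values[i] * values[i + 1] < 0 for i in range(3))
assert sign_changes == 3                        # at least three real roots
```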
Let r1 , r2 , r3 be the three real roots of P (x), and let c1 , c2 be the complex
roots. The splitting field E for P (x) over Q is Q(r1 , r2 , r3 , c1 , c2 ). We know
that complex conjugation is an automorphism of C fixing Q. Since E is a
splitting field for a polynomial over Q, Lemma 14.3.4 says that the image of E
under complex conjugation is E itself, and complex conjugation is an element
of Gal(E/Q). Its action on {r1 , r2 , r3 , c1 , c2 } is to fix each ri and to switch c1
and c2 .
Since P (x) is irreducible (this is the only place that we use irreducibility),
we know that any one root (r1 , say) of P (x) can be taken to any other root
of P (x). (We actually never recorded this fact explicitly, but it follows from
Propositions 13.3.3 and 13.4.2. Any Q(a) for a root a of P (x) is isomorphic to
Q(r1 ) by the first and the isomorphism extends to E by the second since E is
a splitting field of some factor of P (x) over Q(r1 ).) We now argue that this
makes Gal(E/Q) isomorphic to the symmetric group on 5 objects.
The statement that any root of P (x) can be taken to any other root says
that Gal(E/Q) acts transitively on the roots of P (x). Recall that if a group G
acts on a set S, we say it acts transitively if given any two elements x and y of S,
there is a g ∈ G so that g(x) = y. We have already shown complex conjugation
is one of the elements of Gal(E/Q) and that this is a single transposition of the
roots of P (x). From Proposition 9.3.1, we know that the action of Gal(E/Q)
on the roots of P (x) realizes every element of S5 .
Since E = Q(r1 , r2 , r3 , c1 , c2 ), any element of Gal(E/Q) is determined completely by what it does on {r1 , r2 , r3 , c1 , c2 }. Since we have shown that any
permutation of {r1 , r2 , r3 , c1 , c2 } can be accomplished by an automorphism in
Gal(E/Q), we know that Gal(E/Q) is isomorphic to S5 .
From Corollary 9.2.3, we know that S5 is not solvable. So we have shown
that the group Gal(E/Q) is not a solvable group. Thus P (x) is not solvable by
radicals. Thus no radical extension of Q in C contains the roots of P (x), and
the roots of P (x) cannot be built from elements of Q by the five operations of
addition, subtraction, multiplication, division and the taking of n-th roots for
various positive integers n.
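The fact cited from Corollary 9.2.3, that S5 is not solvable, can also be confirmed by brute force: every element of A5 is a commutator of elements of A5 (that is, A5 is perfect), so the derived series of S5 stalls at A5 and never reaches {1}. A sketch in Python (an independent check, not the text's proof):

```python
from itertools import permutations

def compose(p, q):
    # (p o q)(i) = p[q[i]]; permutations of {0,...,4} as tuples of images.
    return tuple(p[q[i]] for i in range(5))

def inverse(p):
    inv = [0] * 5
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_even(p):
    # Even permutations (even number of inversions) make up A5.
    return sum(p[i] > p[j] for i in range(5) for j in range(i + 1, 5)) % 2 == 0

A5 = {p for p in permutations(range(5)) if is_even(p)}
assert len(A5) == 60

commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
               for g in A5 for h in A5}
assert commutators == A5   # A5 is perfect, so no abelian series reaches {1}
```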
This does not mean that computers cannot calculate the roots of P (x) to
some degree of approximation. To fifteen significant figures, the five roots of
P (x) are
r1 = −0.668329831433218,
r2 = 0.788731352638918,
r3 = 1.16412943244799,
c1 = −0.642265476826847 + 1.27455269235095i,
c2 = −0.642265476826847 − 1.27455269235095i.
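These approximations can be sanity checked by substituting them back into P (x); the residuals below should be tiny (the tolerance is our choice):

```python
# Evaluate P(x) = 4x^5 - 10x^2 + 5 at the tabulated approximate roots.
roots5 = [
    -0.668329831433218,
    0.788731352638918,
    1.16412943244799,
    complex(-0.642265476826847, 1.27455269235095),
    complex(-0.642265476826847, -1.27455269235095),
]

def P(x):
    return 4 * x ** 5 - 10 * x ** 2 + 5

for z in roots5:
    assert abs(P(z)) < 1e-9   # each tabulated value is a root to high accuracy
```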
Index
Zk , 70
Zp as field, 88
∃, 54
proof of, 55
using in proof, 55
∀, 52
proof of, 52
using in proof, 55
n-cycle, 138
abelian
group, 60, 93
ring, 104
absolute value
of complex number, 27
action
alternate definition, 147
fixed point, 153
fixed set, 153
invariant set, 149, 153
kernel, 146
of group on set, 146
orbit, 152
transitive, 195, 292
addition
complex numbers, 22
adjacent
transposition, 191
al-Jabr, 10
al-Khwarizmi, 9
al-Mukabalah, 10
algebra
word origin, 10
algebraic
element, 234
extension, 249
algebraic element
minimal polynomial for, 236
algorithm
word origin, 10
alternating
group, 192
Arabic
number, 9
argument
of complex number, 27
automorphism, 100
group
of extension, 113
of field, 113
of group, 101
of a field, 112
of ring, 107
automorphisms
linearly dependent, 261
linearly independent, 261
axiomatic system, 8, 91
axis
imaginary, of complex plane, 25
real, of complex plane, 25
bijection, 56
binary
operation, 59
relation, 64
biquadratic
equation, 3
Cardano, Girolamo, 15
Cauchy notation
for permutation, 62
Cauchy’s theorem, 157
Cayley’s theorem, 121, 122
center
of group, 146
centralizer, 151
characteristic
of field, 209
coefficient
leading, 5, 221
common divisor, 82
commutative
group, 60
ring, 104, 214
commuting elements, 93
completing the square, 7
complex
addition, 22
conjugate, 27
multiplication, 23
multiplicative inverse, 24
negation, 23
number, 22
absolute value, 27
argument, 27
imaginary part, 27
modulus, 27
real part, 27
conjugacy
class, 131
of element, 131
conjugate
complex number, 27
of element, 130
of set, 131
conjugate to, 130
conjugation, 130
constant
polynomial, 217
content
of polynomial, 265
correspondence theorem, 179
coset
left, 154
right, 154
cross product, 51
crossing number
of permutation, 190
cubic
equation, 3
polynomial
reduction, 18
solution, 19
cycle
as permutation, 138
notation for permutation, 136
structure
of permutation, 139
cycles
disjoint, 139
of permutation, 136
cyclic
group, 164
degree
compared to index, 205
of a field extension, 110
of polynomial, 5, 220
of term, 5, 220
del Ferro, Scipione, 16
dihedral group, 124
disjoint
cycles, 139
pairwise, 53
sets, 53
disjoint cycles, 139
divides
in ring, 223
division algorithm, 80
divisor, 81
quotient, 81
remainder, 81
divisor
common, 82
greatest, 83
of division algorithm, 81
domain, 56
of function, 54
Eisenstein irreducibility criterion, 266
empty set, 50
equation
biquadratic, 3
cubic, 3
polynomial, 3
quadratic, 3
quartic, 3
quintic, 3
equivalence
class, 67
representative, 67
relation, 65
Euclid
theorem about primes, 86
evaluation
function, 219
homomorphism, 219
exponents
of radical extension, 284
extension
field, 109
automorphism group, 113
of field, 109, 201
algebraic, 249
by a set, 111, 234
by an element, 234
finite, 249
Galois group of, 278
intermediate, 278
normal, 273
radical, 278, 284
separable, 270
simple, 249
field, 74, 108
automorphism, 112
automorphism group, 113
characteristic, 209
element
algebraic, 234
primitive, 249
separable, 270
transcendental, 234
extension, 109
algebraic, 249
automorphism group, 113
finite, 249
Galois, 269
Galois group of, 278
intermediate, 278
normal, 273
radical, 278, 284
separable, 270
simple, 249
homomorphism, 111
isomorphism, 111
number, 283
splitting, 257
finite
extension
of field, 249
group, 70
finite group, 154
first isomorphism theorem, 175
fixed
by automorphism, 113
fixed field
of automorphism, 203
of group of automorphisms, 203
fixed group
of element, 127
fixed point
under action, 153
fixed set
of action, 153
formula
quadratic, 6
symmetric, 36
full
subgroup, 96
function, 53
bijection, 56
domain, 54, 56
image, 54, 56
injection, 56
inverse, 57
one to one, 56
onto, 55
range, 54
surjection, 55
well defined, 69
Fundamental theorem
of algebra, 230, 283
of arithmetic, 85
of Galois theory, 279
fundamental triviality, 132
Galois, 3, 42
Galois extension
of fields, 269
Galois group
of field extension, 278
Galois theory
fundamental theorem, 279
generators
group, 164
subgroup, 164
greatest common divisor, 83
group, 93
abelian, 60, 93
action, 146
alternating, 192
automorphism
of extension, 113
of field, 113
automorphism group of, 101
axioms, 59
center of, 146
commutative, 60
cyclic, 164
definition, 59
dihedral, 124
finite, 70, 154
generators, 164
of permutations, 47, 121
of symmetries, 124
operation, 59
order of, 70
quotient, 174
simple, 189
solvable, 184
symmetric, 61
homomorphism, 49, 89, 97
evaluation, 219
image, 98
kernel, 98
of fields, 111
of rings, 105, 215
parity, 192
projection, 174
quotient, 174
ideal, 106
identity
isomorphism, 100
permutation, 37
image
inverse, 57
of function, 54, 56
of homomorphism, 98
imaginary
number, 25
imaginary part
complex number, 27
improved
radical extension, 286
inclusive or, 51
index
compared to degree, 205
of subgroup, 156
injection, 56
integers
as a ring, 73
modulo k, 70
integral domain, 215
intermediate
extension of field, 278
intersection
of sets, 51
invariant
set under action, 149, 153
inverse
function, 57
image, 57
multiplicative
complex, 24
irreducible
Eisenstein criterion, 266
in ring, 225
polynomial, 225
isomorphic, 99
fields, 111
isomorphism, 99
identity, 100
of fields, 111
of rings, 106
isomorphism theorem
first, 175
other, 180
kernel
of action, 146
of homomorphism, 98
Khayyám, Omar, 16
Lagrange’s theorem, 156
leading
coefficient, 221
leading coefficient
polynomial, 5
linear
polynomial, 221
linearly
dependent
automorphisms, 261
independent
automorphisms, 261
minimal polynomial, 236
modulus
of complex number, 27
monic
polynomial, 14, 221
monomial, 217
multiplication
complex numbers, 23
multiplicity
of root, 230
natural
numbers, 50
natural numbers, 79
negation
complex numbers, 23
normal
extension of fields, 273
subgroup, 99, 130
normalizer, 151
number
Arabic, 9
complex, 22
imaginary, 25
natural, 50
number field, 283
one to one
function, 56
one-to-one correspondence, 56
onto
function, 55
operation
binary, 59
group, 59
ternary, 59
unary, 59
orbit
under action, 152
order
infinite
of element, 70
of group, 70
of element, 70
of group, 70
ordered pair, 51
other isomorphism theorem, 180
pair
ordered, 51
pairwise disjoint, 53
parity, 190
even, 190
homomorphism, 192
odd, 190
of permutation, 190
partition, 66
permutation, 36, 61
as a cycle, 138
Cauchy notation, 62
crossing number, 190
cycle notation, 136
cycle structure, 139
group, 121
identity, 37
support of, 138
transposition, 189
polynomial, 5, 216
as function, 218
constant, 217
content of, 265
cubic
reduction, 18
solution, 19
degree, 5, 220
equation, 3
solution, 5
irreducible, 225
leading coefficient, 5
linear, 221
monic, 14, 221
primitive, 265
resolvent, 43
root, 5
separable, 270
solvable by radicals, 14, 284
splitting of, 257
term, 5
term of, 220
prime
Euclid’s theorem about, 86
in ring, 225
integer, 85
primitive
polynomial, 265
primitive element, 249
Primitive element theorem, 271
product
cross, 51
projection
homomorphism, 174
proper
subgroup, 96
quadratic
equation, 3
formula, 6
quartic
equation, 3
quintic
equation, 3
quotient
group, 174
homomorphism, 174
of division algorithm, 81
radical
extension, 278, 284
exponents, 284
improved, 286
range
of function, 54
real part
complex number, 27
reduction
cubic polynomial, 18
reflexive
relation, 65
relation
binary, 64
equivalence, 65
reflexive, 65
symmetric, 65
transitive, 65
relatively prime, 226
remainder
of division algorithm, 81
representative
of equivalence class, 67
resolvent
of polynomial, 43
ring, 72, 103, 214
abelian, 104
automorphism, 107
commutative, 72, 104, 214
homomorphism, 105, 215
ideal in, 106
isomorphism, 106
of integers, 73
unit in, 84
with identity, 72, 104, 214
with unit, 104, 214
root
multiplicity, 230
of polynomial, 5
separable
element of field, 270
extension of fields, 270
polynomial, 270
set
disjoint, 53
empty, 50
equality, 51
intersection, 51
notations for, 50
subset, 51
union, 51
simple
extension of fields, 249
group, 189
solution
cubic polynomial, 19
polynomial equation, 5
solvable
by radicals, 14, 284
group, 184
splitting
field, 257
of polynomial, 257
stabilizer
of element, 127, 150
of set, 128, 150
pointwise, 127, 150
strong induction, 80
subfield, 109
subgroup, 95
full, 96
generated by, 164
generated by set, 97
generators, 164
index, 156
normal, 99, 130
proper, 96
trivial, 96
subring, 105
subset
of set, 51
support
of permutation, 138
surjection, 55
symmetric
formula, 36
relation, 65
symmetric group, 61
Tartaglia, Niccolò, 16
term
degree, 5, 220
of polynomial, 5, 220
ternary
operation, 59
transcendental
element, 234
transitive
action, 195, 292
relation, 65
transposition, 189
adjacent, 191
trivial
subgroup, 96
unary
operation, 59
union
of sets, 51
unit
in ring, 84, 223
well defined
operation, 69
well definedness problem, 69
well ordering, 79