An Introduction to Finite Groups
Physics 5040, Spring 2009

1 Definitions
A group (G, ⋄) is a nonempty set G together with a binary operation called multiplication (or a product) and denoted by ⋄ that obeys the following axioms:
(G1) a, b ∈ G implies a ⋄ b ∈ G (closure);
(G2) a, b, c ∈ G implies (a ⋄ b) ⋄ c = a ⋄ (b ⋄ c) (associativity);
(G3) there exists e ∈ G such that a ⋄ e = e ⋄ a = a for all a ∈ G (identity);
(G4) for each a ∈ G, there exists a−1 ∈ G such that a ⋄ a−1 = a−1 ⋄ a = e (inverse).
Furthermore, a group is said to be abelian if it also has the property that
(G5) a ⋄ b = b ⋄ a for all a, b ∈ G (commutativity).
In the case of abelian groups, the group multiplication operation is frequently
denoted by + and called addition. We will generally simplify our notation by
leaving out the group multiplication symbol and assuming that it is understood for
the particular group under discussion.
The number of elements in a group G is called its order and will be denoted by
nG . If this number is finite, then we say that G is a finite group. Otherwise, G is
said to be infinite. It is also a fact that every group of order less than or equal to
5 must be abelian, and hence that the smallest non-abelian group is of order 6.
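These axioms are concrete enough to verify by brute force for small finite groups. The following Python sketch (the helper names is_group and is_abelian are my own, not from the text) checks (G1)–(G4) and commutativity for the integers {0, 1, 2, 3, 4} under addition mod 5:

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the group axioms (G1)-(G4) for a finite set."""
    elems = list(elements)
    # (G1) closure
    if any(op(a, b) not in elements for a, b in product(elems, repeat=2)):
        return False
    # (G2) associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elems, repeat=3)):
        return False
    # (G3) identity: some e with a*e == e*a == a for all a
    idents = [e for e in elems
              if all(op(a, e) == a == op(e, a) for a in elems)]
    if not idents:
        return False
    e = idents[0]
    # (G4) inverses: every a has some b with a*b == b*a == e
    return all(any(op(a, b) == e == op(b, a) for b in elems) for a in elems)

def is_abelian(elements, op):
    """Check (G5): a*b == b*a for all pairs."""
    elems = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elems, repeat=2))

Z5 = set(range(5))
add_mod5 = lambda a, b: (a + b) % 5
```

The same checker rejects, say, {1, 2} under ordinary multiplication, which fails closure.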
While we have defined a group in the usual manner, it should be realized that
there is a certain amount of redundancy in our definition. In particular, it is not
necessary to require that a “right inverse” also be the “left inverse.” To see this,
suppose that for any a ∈ G, we have the right inverse defined by aa−1 = e. Then
multiplying from the left by a−1 yields a−1 aa−1 = a−1 . But a−1 ∈ G so there exists
an (a−1 )−1 ∈ G such that (a−1 )(a−1 )−1 = e. Multiplying our previous expression
from the right by (a−1 )−1 results in a−1 a = e, and hence we see that a−1 is also a
left inverse. Of course, we could have started with a left inverse and shown that
it is also a right inverse.
Similarly, we could have defined a right identity by ae = a for all a ∈ G. We
then observe that a = ae = a(a−1 a) = (aa−1 )a = ea, and hence e is also a left
identity.
It is easy to show that the identity element is unique. To see this, suppose that
there exist e, e′ ∈ G such that for every a ∈ G we have ea = ae = a = e′ a = ae′ .
Since ea = a for every a ∈ G, we have in particular that ee′ = e′ . On the other
hand, since we also have ae′ = a, it follows that ee′ = e. Therefore e′ = ee′ = e so
that e = e′ .
Before showing the uniqueness of the inverse, we first prove an important basic
result. Suppose that ax = ay for a, x, y ∈ G. Let a−1 be a (not necessarily unique)
inverse to a. Then x = ex = (a−1 a)x = a−1 (ax) = a−1 (ay) = (a−1 a)y = ey = y.
In other words, the equation ax = ay means that x = y. This is sometimes called
the (left) cancellation law. As a special case, we see that if aa−1 = e = aa′−1 ,
then this implies a−1 = a′−1 so that the inverse is indeed unique as claimed. This
also shows that
(a−1 )−1 = a
since (a−1 )−1 (a−1 ) = e and aa−1 = e.
Finally, another important result follows by noting that
(ab)(b−1 a−1 ) = a((bb−1 )a−1 ) = a(ea−1 ) = aa−1 = e .
Since the inverse is unique, we then see that
(ab)−1 = b−1 a−1 .
This clearly extends by induction to any finite product of group elements.
Example 1. The set of integers Z = {0, ±1, ±2, . . . } forms an infinite abelian group
where the group multiplication operation is just ordinary addition. It should be
obvious that the (additive) identity element is 0, and the inverse of any number
n is given by −n. However, it is easy to see that Z is not a group under the
operation of ordinary multiplication. Indeed, while Z is both closed and associative
under multiplication, and it also contains the (multiplicative) identity element 1,
no element of Z (other than ±1) has a multiplicative inverse in Z (for example,
2−1 = 1/2 ∉ Z).
On the other hand, if we consider the set Q of all rational numbers, then Q forms
a group under ordinary addition (with identity element 0 and inverse −p/q ∈ Q to
any p/q ∈ Q). Moreover, the nonzero elements of Q also form a group under
ordinary multiplication (with identity element 1 and inverse q/p ∈ Q to any p/q ∈
Q).
Example 2. The cyclic groups of order n are the groups Cn of the form
{e, a, a2 , a3 , . . . , an−1 , an = e}. They are all necessarily abelian, and the simplest
non-cyclic group is of order 4.
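Concretely, Cn can be realized as the n-th complex roots of unity, with a = e^(2πi/n) and ordinary multiplication as the group product. A minimal sketch (equalities hold up to floating-point error):

```python
import cmath

def cyclic_group(n):
    """Return [e, a, a^2, ..., a^(n-1)] with generator a = exp(2*pi*i/n)."""
    a = cmath.exp(2j * cmath.pi / n)
    return [a ** k for k in range(n)]

C6 = cyclic_group(6)
a = C6[1]          # the generator
```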
Example 3. A more complicated (but quite useful) example is given by the set
of all rotations in the xy-plane. Consider the following figure that shows a vector
r = (x, y) making an angle ϕ with the x-axis, and a vector r′ = (x′ , y ′ ) making an
angle θ + ϕ with the x-axis:
[Figure: the vector r = (x, y) at angle ϕ to the x-axis, and the vector r′ = (x′ , y ′ ) at angle θ + ϕ.]
We assume r = ‖r‖ = ‖r′ ‖ so that the vector r′ results from a counterclockwise
rotation by an angle θ with respect to the vector r. From the figure, we see that r′
has components x′ and y ′ given by
x′ = r cos(θ + ϕ) = r cos θ cos ϕ − r sin θ sin ϕ = x cos θ − y sin θ
y ′ = r sin(θ + ϕ) = r sin θ cos ϕ + r cos θ sin ϕ = x sin θ + y cos θ.
Let R(α) denote a counterclockwise rotation by an angle α. It should be clear
that R(0) is just the identity rotation (i.e., no rotation at all), and that the inverse
is given by R(α)−1 = R(−α). With these definitions, it is easy to see that the set of
all rotations in the plane forms an infinite (actually, continuous) abelian group. A
convenient way of describing these rotations is with the matrix
$$R(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}.$$
(As we will see below, such a matrix is said to form a representation of the rotation
group.) We then see that r′ = R(θ)r, which in matrix notation is just
$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$
Using this notation, it is easy to see that R(0) is the identity since
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
and also that R(θ)−1 = R(−θ) because
$$R(\theta)R(-\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = R(-\theta)R(\theta).$$
We remark that while the rotation group in two dimensions is abelian, the
rotation group in three dimensions is not. For example, let Rz (θ) denote a rotation
about the z-axis (in the “right-handed sense”). Then, applied to any vector x̂ lying
along the x-axis, we see that
Ry (90◦ )Rz (45◦ )x̂ ≠ Rz (45◦ )Ry (90◦ )x̂
since in the second case, the result lies along the z-axis, while in the first case it
does not.
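These rotation formulas are easy to check numerically. The sketch below (using NumPy; the 3-D matrices Ry and Rz are written in the standard right-handed convention) verifies that plane rotations compose additively and invert by negating the angle, while the two 3-D rotations from the example fail to commute:

```python
import numpy as np

def R(theta):
    """Counterclockwise rotation by theta in the xy-plane."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

def Rz(t):
    """Right-handed rotation by t about the z-axis."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def Ry(t):
    """Right-handed rotation by t about the y-axis."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

xhat = np.array([1.0, 0.0, 0.0])
```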
Now that we know the inverse is unique, let us take another look at the (left)
cancellation law. We restate it as the following useful result, often called the rearrangement lemma.
Lemma. If a, b, c ∈ G and ca = cb, then a = b.
Proof. Since G contains c−1 by definition, simply multiply the equation from the
left by c−1 .
What this result means is that if a and b are distinct elements of G, then so
are ca and cb. The importance of this comes from observing that if all the group
elements are arranged in a sequence, say {g1 , . . . , gn }, then multiplying from the
left by an element h results in the sequence {hg1 , . . . , hgn } which is the same as the
original sequence except for order. (Of course, the same result holds equally well
for multiplication from the right.)
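A quick illustration of this rearrangement, using the small example group Z5 = {0, 1, 2, 3, 4} under addition mod 5 (my own choice of example):

```python
# Left "multiplication" (here: addition mod 5) by h = 3 permutes the elements:
# the sequence {h g1, ..., h gn} contains the same elements in a new order.
G = [0, 1, 2, 3, 4]
h = 3
shifted = [(h + g) % 5 for g in G]
```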
Since each element hgi is determined by the group multiplication rule, let us
write hgi = ghi where hi is the integer label of this particular element. (For example,
if hg2 = g5 , then h2 = 5.) Since the rearrangement lemma tells us that ghi and
ghj are distinct if i ≠ j, it follows that the numbers (h1 , . . . , hn ) are simply a
permutation of (1, . . . , n). In other words, there is a natural correspondence between
group elements h and the permutation characterized by (h1 , . . . , hn ).
To elaborate on this, let us denote an arbitrary permutation of n objects by
$$p = \begin{pmatrix} 1 & 2 & \cdots & n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}$$
where each entry in the first row is to be replaced by the corresponding entry
below it. The set of all n! permutations of n objects forms a group called the
permutation group or the symmetric group and is denoted by Sn . It is clear
that one permutation followed by another permutation is a third permutation, and
this defines the group multiplication. (By convention, multiplication proceeds from
right to left.) The identity corresponds to no permutation at all, and is represented
by
$$e = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}$$
while the inverse of p is just
$$p^{-1} = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \\ 1 & 2 & \cdots & n \end{pmatrix}.$$

Example 4. Let us give an example that demonstrates the multiplication of permutations. Consider the permutations p, q ∈ S6 defined by
$$p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 6 & 1 & 3 & 4 & 5 \end{pmatrix}, \qquad q = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 1 & 6 & 3 & 4 & 5 \end{pmatrix}.$$
To evaluate the product pq, we first note that q takes 1 → 2 and then p takes 2 → 6,
so the product pq takes 1 → 2 → 6 or simply 1 → 6. Next, q takes 2 → 1 while p
takes 1 → 2 so that pq takes 2 → 2. Continuing in this manner, we see that
$$pq = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 2 & 5 & 1 & 3 & 4 \end{pmatrix}.$$
Notice that in this example, the final answer can be written as a product of two
disjoint permutations (i.e., they permute different sets of numbers):
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 2 & 5 & 1 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 2 & 3 & 1 & 5 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 2 & 5 & 4 & 3 & 6 \end{pmatrix}.$$
Also notice that since the two permutations on the right are disjoint, the order in
which you multiply them doesn’t matter. A simpler notation for permutations is
the so-called cycle notation, which in this case for the result pq would be the
product of a 1-cycle, a 2-cycle and a 3-cycle written as
(164)(35)(2) .
Reading from right to left (the order in which the factors act), the first of these is interpreted as leaving 2 unchanged, the second as
3 → 5 and 5 → 3, and the third as 1 → 6, 6 → 4 and 4 → 1. In other
words, each cycle only includes those numbers in the permutation that are actually
permuted within themselves, starting with one number and following its path until
you return to the starting point. Thus (164) is the same as (641) or (416) (but not
(614) or (146) and so forth).
In order to describe how to decompose a permutation into a product of cycles,
it is easiest to simply give an example. Consider the permutation p ∈ S6 shown
in Example 4. Starting with 1, repeatedly apply p until you get back to 1 again.
Since Sn is finite, this can only take a finite number of steps. In this case, we have
1 → 2 → 6 → 5 → 4 → 3 → 1 so we can write p as the 6-cycle p = (126543).
Since all 6 numbers in p are accounted for, we are done in this case. Now look at
q. Again starting from 1 we have 1 → 2 → 1 so we have the 2-cycle (12). Now
go to the next number not included so far, which in this case is 3. Then we have
3 → 6 → 5 → 4 → 3 which gives us the 4-cycle (3654) and we are done. Therefore
we can write q = (3654)(12). For the product pq we see that 1 → 6 → 4 → 1 so we
have the 3-cycle (164), then 2 → 2 gives (2) and 3 → 5 → 3 gives the 2-cycle (35).
Therefore pq = (164)(2)(35) as we stated above.
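These computations can be automated. In the sketch below a permutation is stored as a dict mapping i to p(i) (my own representation choice, not from the text); compose multiplies right to left, and cycles reproduces the decompositions just derived, with each cycle written starting from its smallest element:

```python
def compose(p, q):
    """Product pq: apply q first, then p (multiplication right to left).
    Permutations are dicts mapping i to p(i)."""
    return {i: p[q[i]] for i in q}

def cycles(p):
    """Disjoint-cycle decomposition, each cycle starting at its smallest element."""
    seen, result = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        result.append(tuple(cyc))
    return result

# p and q from Example 4
p = {1: 2, 2: 6, 3: 1, 4: 3, 5: 4, 6: 5}
q = {1: 2, 2: 1, 3: 6, 4: 3, 5: 4, 6: 5}
pq = compose(p, q)
```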
The multiplication of cycles is also very straightforward if you just think about
how permutations are multiplied (and remember that multiplication proceeds from
right to left). For example, consider the product (12)(23) of 2-cycles in S3 . The
first cycle says 2 → 3, while the second cycle doesn’t do anything to 3. Then the
first says 3 → 2 and now the second also says 2 → 1, so we have 3 → 1. Putting
these together we have the result (231) = (123).
As another example, consider the product (234)(123) of 3-cycles in S4 . We have
1 → 2 followed by 2 → 3 for the term 1 → 3. Then we have 2 → 3 followed by
3 → 4 for the term 2 → 4. Even though the first term also has 3 → 1, this was
already taken into account since there is no 1 in the second term. Thus we are left
with the result (13)(24).
Keep in mind that if you have any doubt, you can always write out the complete
permutations and multiply them. And again I emphasize that the order in which
disjoint cycles are multiplied is irrelevant. In any case, this notation will greatly
simplify some of our examples that use the group Sn , as we now show.
First, a subset H of a group G is said to be a subgroup of G if H also forms a
group under the same multiplication rule as G.
Example 5. Consider the group S3 of order 3! = 6. I leave it as an exercise to
verify that each of the following four subsets forms a subgroup of S3 :
{e, (12)}
{e, (23)}
{e, (31)}
{e, (123), (321)} .
Be sure to note that (123) = (312) ≠ (321).
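As a check on this exercise, here is a small sketch that represents an element of S3 as the tuple (p(1), p(2), p(3)) and tests the subgroup conditions directly (closure, identity, inverses; associativity is inherited from S3):

```python
from itertools import product

# An element of S3 is stored as the tuple (p(1), p(2), p(3));
# e.g. (2, 1, 3) is the transposition (12).
def mult(p, q):
    """Product pq, applying q first (multiplication right to left)."""
    return tuple(p[q[i] - 1] for i in range(3))

def is_subgroup(H):
    """Check closure, identity, and inverses for a subset H of S3."""
    e = (1, 2, 3)
    closed = all(mult(a, b) in H for a, b in product(H, repeat=2))
    inverses = all(any(mult(a, b) == e for b in H) for a in H)
    return e in H and closed and inverses

e = (1, 2, 3)
t12, t23, t31 = (2, 1, 3), (1, 3, 2), (3, 2, 1)   # the transpositions
c123, c321 = (2, 3, 1), (3, 1, 2)                 # the 3-cycles (123), (321)
```

A set such as {e, (12), (23)} fails the closure test, since (12)(23) = (123) lies outside it.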
2 Homomorphisms
Let ϕ : G → G′ be a mapping from a group G to a group G′ . If for every a, b ∈ G
we have
ϕ(ab) = ϕ(a)ϕ(b)
then ϕ is said to be a homomorphism, and the groups G and G′ are said to be
homomorphic. In other words, a homomorphism preserves group multiplication,
but is not in general either surjective or injective. It should also be noted that the
product ab is an element of G while the product ϕ(a)ϕ(b) is an element of G′ .
Example 6. Let G be the (abelian) group of all real numbers under addition, and
let G′ be the group of nonzero real numbers under multiplication. If we define
ϕ : G → G′ by ϕ(x) = 2^x , then
ϕ(x + y) = 2^(x+y) = 2^x 2^y = ϕ(x)ϕ(y)
so that ϕ is indeed a homomorphism.
Example 7. Let G be the group of all real (or complex) numbers under ordinary
addition. For any real (or complex) number a, we define the mapping ϕ of G onto
itself by ϕ(x) = ax. This ϕ is clearly a homomorphism since
ϕ(x + y) = a(x + y) = ax + ay = ϕ(x) + ϕ(y).
However, if b is any other nonzero real (or complex) number, then you can easily
show that the (“non-homogeneous”) mapping ψ(x) = ax+b is not a homomorphism.
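Both claims are easy to spot-check numerically. The sketch below uses the hypothetical choice a = 3, b = 1 for the non-homogeneous map, and exponents chosen so that the floating-point arithmetic is exact:

```python
# Example 6: phi(x) = 2**x turns addition into multiplication.
phi = lambda x: 2.0 ** x

# Example 7 variant: psi(x) = a*x + b with b != 0 is not additive.
a_coef, b_coef = 3.0, 1.0            # hypothetical values for illustration
psi = lambda x: a_coef * x + b_coef

x, y = 2.0, 5.0
```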
Let e be the identity element of G, and let e′ be the identity element of G′ . If
ϕ : G → G′ is a homomorphism, then ϕ(g)e′ = ϕ(g) = ϕ(ge) = ϕ(g)ϕ(e), and we
have the important result
ϕ(e) = e′ .
Using this result, we then see that e′ = ϕ(e) = ϕ(gg −1 ) = ϕ(g)ϕ(g −1 ), and hence
the uniqueness of the inverse tells us that
ϕ(g −1 ) = ϕ(g)−1 .
It is very important to note that in general ϕ(g)−1 ≠ ϕ−1 (g) since if g ∈ G we have
ϕ(g)−1 ∈ G′ while if g ∈ G′ , then ϕ−1 (g) ∈ G.
It is now easy to see that the homomorphic image ϕ(G) of G is necessarily a subgroup of G′. Closure is obvious because ϕ(a)ϕ(b) = ϕ(ab). There is an identity because ϕ(e) = e′ ∈ ϕ(G). There is also an inverse because if ϕ(g) ∈ ϕ(G),
then ϕ(g)−1 = ϕ(g −1 ) ∈ ϕ(G). And associativity holds because [ϕ(a)ϕ(b)]ϕ(c) =
ϕ(ab)ϕ(c) = ϕ(abc) = ϕ(a)ϕ(bc) = ϕ(a)[ϕ(b)ϕ(c)].
In general, there may be many elements g ∈ G that map into the same element
g ′ ∈ G′ under ϕ. It is of particular interest to see what happens if more than one
element of G (besides e) maps into e′ . If k ∈ G is such that ϕ(k) = e′ , then for
any g ∈ G we have ϕ(gk) = ϕ(g)ϕ(k) = ϕ(g)e′ = ϕ(g). Therefore, if gk ≠ g we
see that ϕ could not possibly be a one-to-one mapping. To help us see just when a
homomorphism is one-to-one, we define the kernel of ϕ to be the set
Ker ϕ = {g ∈ G : ϕ(g) = e′ }.
It is easy to see that Ker ϕ is a subgroup of G.
If a homomorphism ϕ : G → G′ is one-to-one (i.e., injective), we say that ϕ is an
isomorphism. If, in addition, ϕ is also onto (i.e., surjective), then we say that G
and G′ are isomorphic. In other words, G and G′ are isomorphic if ϕ is a bijective
homomorphism. (We point out that many authors use the word “isomorphism” to
implicitly mean that ϕ is a bijection.) In particular, an isomorphism of a group
onto itself is called an automorphism.
From the definition, it appears that there is a relationship between the kernel
of a homomorphism and whether or not it is an isomorphism. We now proceed to
show that this is indeed the case. By way of notation, if H is a subset of a group
G, then by Hg we mean the set Hg = {hg ∈ G : h ∈ H}. In the particular case
that H is a subgroup of G, then Hg is called a right coset of G. (And left cosets
are defined in the obvious way.) Recall also that if ϕ : G → G′ and g ′ ∈ G′ then,
by an inverse image of g ′ , we mean any element g ∈ G such that ϕ(g) = g ′ .
Theorem 1. Let ϕ be a homomorphism of a group G onto a group G′ , and let Kϕ
be the kernel of ϕ. Then given any g ′ ∈ G′ , the set of all inverse images of g ′ is
given by Kϕ g where g ∈ G is any particular inverse image of g ′ .
Proof. Consider any k ∈ Kϕ . Then by definition of homomorphism, we must have
ϕ(kg) = ϕ(k)ϕ(g) = e′ g ′ = g ′ .
In other words, if g is any inverse image of g ′ , then so is any kg ∈ Kϕ g. We must
be sure that there is no other element h ∈ G, h ∉ Kϕ g, with the property that
ϕ(h) = g ′ .
To see that this is true, suppose ϕ(h) = g ′ = ϕ(g). Then ϕ(h) = ϕ(g) implies
e′ = ϕ(h)ϕ(g)−1 = ϕ(h)ϕ(g −1 ) = ϕ(hg −1 ).
But this means that hg −1 ∈ Kϕ , and hence hg −1 = k for some k ∈ Kϕ . Therefore
h = kg ∈ Kϕ g and must have already been taken into account.
Corollary. A homomorphism ϕ mapping a group G to a group G′ is an isomorphism if and only if Ker ϕ = {e}.
Proof. Note that if ϕ(G) ≠ G′ , then we may apply Theorem 1 to G and ϕ(G).
In other words, it is trivial that ϕ always maps G onto ϕ(G). Now, if ϕ is an
isomorphism, then it is one-to-one by definition, so that there can be no element
of G other than e that maps into e′ . Conversely, if Ker ϕ = {e} then Theorem 1
shows that any x′ ∈ ϕ(G) ⊂ G′ has exactly one inverse image.
Of course, if ϕ is surjective, then ϕ(G) is just equal to G′ . In other words, we
may think of isomorphic groups as being essentially identical to each other.
Example 8. Let G be any group, and let g ∈ G be fixed. We define the mapping
ϕ : G → G by ϕ(a) = gag −1 , and we claim that ϕ is an automorphism. To see this,
first note that ϕ is indeed a homomorphism since for any a, b ∈ G we have
ϕ(ab) = g(ab)g −1 = g(aeb)g −1 = g(ag −1 gb)g −1 = (gag −1 )(gbg −1 )
= ϕ(a)ϕ(b).
To see that ϕ is surjective, simply note that for any b ∈ G we may define a = g −1 bg
so that ϕ(a) = b. Next, we observe that if ϕ(a) = gag −1 = e, then right-multiplying
by g and left multiplying by g −1 yields
a = (g −1 g)a(g −1 g) = g −1 eg = e
and hence Ker ϕ = {e}. From the corollary to Theorem 1, we now see that ϕ must
be an isomorphism.
Our next important result is called Cayley’s theorem.
Theorem 2. Every group of order n is isomorphic to a subgroup of Sn .
Note that G has n elements while Sn has n! elements.
Proof. Let us define the mapping ϕ : G → Sn by
$$\varphi : a \in G \longrightarrow \varphi(a) := p_a := \begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix} \in S_n$$
where the indices ai are determined by the definition (due to the rearrangement
lemma)
agi = gai .
If we can show that ϕ is a homomorphism, then from ϕ(a) = pa , we see that if pa
is the identity in Sn , then we must have a = e so that Ker ϕ = {e} and ϕ is in fact
an isomorphism.
Let ab = c in G. First note that
gci = cgi = (ab)gi = a(bgi ) = agbi = gabi
so that ci = abi . Now observe that (since (b1 , b2 , . . . , bn ) is just some permutation
of (1, 2, . . . , n))
$$\varphi(a)\varphi(b) = p_a p_b = \begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix} \begin{pmatrix} 1 & 2 & \cdots & n \\ b_1 & b_2 & \cdots & b_n \end{pmatrix}$$
$$= \begin{pmatrix} b_1 & b_2 & \cdots & b_n \\ a_{b_1} & a_{b_2} & \cdots & a_{b_n} \end{pmatrix} \begin{pmatrix} 1 & 2 & \cdots & n \\ b_1 & b_2 & \cdots & b_n \end{pmatrix} = \begin{pmatrix} 1 & 2 & \cdots & n \\ a_{b_1} & a_{b_2} & \cdots & a_{b_n} \end{pmatrix} = p_c = p_{ab} = \varphi(ab) .$$
Thus ϕ is a homomorphism as claimed.
Example 9. We show that the cyclic group C3 = {e, a, b = a2 } of order 3 is
isomorphic to the subgroup of S3 defined by {e, (123), (321)}. (See Examples 2 and
5.)
Let us relabel the elements of C3 as (g1 , g2 , g3 ). Multiplying from the left by
g1 = e leaves the ordered set unchanged, so egi = gei = gi which corresponds to
the identity permutation pe = (1)(2)(3) ∈ S3 . Next, multiplying by g2 = a we
obtain the ordered set (a, b, e) = (g2 , g3 , g1 ) so that (a1 , a2 , a3 ) = (2, 3, 1) which
corresponds to (in cycle notation)
a ∈ C3 → pa = (123) ∈ S3 .
Finally, multiplying by g3 = b = a2 we obtain (b, e, a) = (g3 , g1 , g2 ) so that
(b1 , b2 , b3 ) = (3, 1, 2) which corresponds to
b ∈ C3 → pb = (132) = (321) ∈ S3 .
To verify the homomorphism property, consider, for example
pa pb = (123)(132) = (1)(3)(2) = pe = pab
and
pa pa = (123)(123) = (132) = pb = paa .
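The correspondence in this example can be generated mechanically. The sketch below realizes C3 as Z3 under addition mod 3 (so e, a, b correspond to 0, 1, 2, a labeling choice of mine) and builds each pa = (a1, a2, a3) directly from the defining relation a gi = g_{a_i}:

```python
G = [0, 1, 2]                     # g1, g2, g3 = e, a, b realized as 0, 1, 2
mult = lambda a, g: (a + g) % 3   # the group product in this realization

def cayley(a):
    """Return (a_1, a_2, a_3), where a*g_i = g_{a_i} (1-based indices)."""
    return tuple(G.index(mult(a, g)) + 1 for g in G)

def compose(p, q):
    """Product pq of two such permutation tuples (q acts first)."""
    return tuple(p[q[i] - 1] for i in range(3))
```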
3 Representations
We know that the composition of linear transformations on a vector space V is
associative, but not necessarily commutative (think of matrix multiplication), and
hence composition serves as a group multiplication. A set of nonsingular linear transformations on V that is closed under composition forms a group of linear
transformations, or a group of operators.
If there is a homomorphism ϕ from a group G to a set U (G) of operators on a
vector space V , then U (G) is said to form a representation of G. In other words,
we have ϕ(g) = U (g) for each g ∈ G, and
U (g1 g2 ) = U (g1 )U (g2 ) .
The dimension of the representation is the dimension of V . If the homomorphism is also an isomorphism, then the representation is said to be faithful. A
representation that is not faithful is sometimes said to be degenerate.
Note that by definition the U (g)’s form a group, and hence they must be nonsingular operators. Just as we showed for homomorphisms in general, given any U (g),
there exists U (g)−1 so that U (g) = U (ge) = U (g)U (e) and therefore we always have
U (e) = 1 .
Now we observe that 1 = U (e) = U (gg −1 ) = U (g)U (g −1 ) and thus
U (g −1 ) = U (g)−1 .
In particular, if the representation U (G) is unitary, then
U (g)† = U (g)−1 = U (g −1 ) .
Note also that if U (g) ∈ L(V ) and v ∈ V , then the fact that U (g) is nonsingular
means that U (g)v = 0 if and only if v = 0.
We will restrict our consideration to finite-dimensional representations. Let
{ei } be a basis for V . Then the matrix representation D(g)i j of an operator U (g)
is defined in the usual manner by
U (g)ei = ej D(g)j i .
That these matrices themselves form a representation follows by observing that
U (g1 )U (g2 )ei = U (g1 )ej D(g2 )j i = ek D(g1 )k j D(g2 )j i
= U (g1 g2 )ei = ek D(g1 g2 )k i .
Since the {ek } form a basis, we must have (in terms of matrix multiplication)
D(g1 g2 ) = D(g1 )D(g2 ) .
The group of matrices D(G) is said to form a matrix representation of G.
Example 10. Let V = C, and for each g ∈ G, define U (g) = 1. Since U (g1 )U (g2 ) =
1 · 1 = 1 = U (g1 g2 ), we see that the mapping g → 1 forms a trivial one-dimensional
representation of any group.
Example 11. Let G be a group of matrices. For example, G could be the group
GL(n) consisting of all nonsingular n × n matrices, or the group U (n) of all unitary
n × n matrices. We may define a representation of G by U (g) = det g. That this is
indeed a representation follows from the fact that
det(g1 g2 ) = (det g1 )(det g2 ) .
Thus we have a non-trivial one-dimensional representation of any matrix group.
Example 12. Let G = {R(θ), 0 ≤ θ < 2π} be the group of rotations in the plane
as shown in Example 3.
[Figure: the orthonormal basis vectors e1 , e2 and the rotated basis vectors e′1 , e′2 , related by a rotation through angle θ.]
The vectors ei and e′i are the usual orthonormal basis vectors with ‖ei ‖ = ‖e′i ‖ = 1.
From the geometry of the diagram we see that
e′1 = U (θ)e1 = e1 cos θ + e2 sin θ
e′2 = U (θ)e2 = e1 (− sin θ) + e2 cos θ
so that e′i = ej D(θ)j i and the matrix (D(θ)j i ) is given by
$$(D(\theta)^j{}_i) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$
I leave it as an exercise to verify that
D(θ + φ) = D(θ)D(φ) .
Thus {U (θ)} forms a two-dimensional representation of the rotation group R(θ)
with the matrix realization {D(θ)} with respect to the basis {ei }.
Now let U (G) be a representation of G on V , and let S be any nonsingular
operator on V . Then U ′ (G) = S −1 U (G)S also forms a representation on V because
U ′ (g1 g2 ) = S −1 U (g1 g2 )S = S −1 U (g1 )U (g2 )S = S −1 U (g1 )SS −1 U (g2 )S
= U ′ (g1 )U ′ (g2 ) .
Two representations related by such a similarity transformation are said to be
equivalent representations. Equivalent representations are essentially the same,
and our interest is generally in determining the various inequivalent representations
of a group.
What happens if we have a representation U (G) on V where there happens to
be a subspace W ⊂ V with the property that U (g)W ⊂ W for all g ∈ G? In this
case W is said to be a U (G)-invariant subspace, and we know that the matrix
representation D(G) of U (G) will take the block diagonal form (with respect to the
appropriate basis)
$$D(g) = \begin{pmatrix} D'(g) & B \\ 0 & C \end{pmatrix}.$$
Note that
$$D(g_1)D(g_2) = \begin{pmatrix} D'(g_1) & B_1 \\ 0 & C_1 \end{pmatrix} \begin{pmatrix} D'(g_2) & B_2 \\ 0 & C_2 \end{pmatrix} = \begin{pmatrix} D'(g_1)D'(g_2) & E \\ 0 & F \end{pmatrix}$$
while
$$D(g_1 g_2) = \begin{pmatrix} D'(g_1 g_2) & B_{12} \\ 0 & C_{12} \end{pmatrix},$$
and therefore the matrices D′ (G) also form a representation of U (G) but with
dimension dim W < dim V . If the invariant subspace W ⊂ V does not contain any
nontrivial U (G)-invariant subspace, then W is said to be minimal or proper.
A representation U (G) is said to be irreducible if it does not contain any
nontrivial U (G)-invariant subspace; otherwise, it is reducible. Furthermore, if we
have V = W1 ⊕ · · · ⊕ Wr where each Wi is a minimal U (G)-invariant subspace, then
U (G) is said to be completely reducible or decomposable. In this case, the
matrix representation of U (G) takes the block diagonal form
$$D(G) = \begin{pmatrix} D_1(G) & & \\ & \ddots & \\ & & D_r(G) \end{pmatrix}$$
where each Di (G) is a matrix representation of U (G). Thus we see that restricting
U (G) to any Wi yields a lower-dimensional representation of G. Therefore a representation U (G) is completely reducible if it can be decomposed into a direct sum of
irreducible representations.
We will frequently refer to an irreducible representation as an irrep.
If the group representation space V is a unitary space (i.e., a complex inner
product space) and if the operators U (G) on V are unitary, then we say that U (G)
is a unitary representation of G. Since unitary operators preserve lengths, angles and scalar products, unitary representations are fundamental to the study of
symmetry groups. The following two theorems greatly simplify many of the results
in representation theory.
Theorem 3. Every representation of a finite group on a unitary space is equivalent
to a unitary representation.
Proof. Let D(G) be a representation of G on V . We must find a nonsingular
operator S such that U (g) := SD(g)S −1 is unitary for every g ∈ G. (If you feel so
compelled, you can define S̃ = S −1 so this can be written as U (g) = S̃ −1 D(g)S̃.)
Define the Hermitian operator
$$A = \sum_{g \in G} D(g)^\dagger D(g) .$$
A is positive definite since for any x ∈ V we have
$$\langle x, Ax \rangle = \sum_{g} \langle x, D(g)^\dagger D(g) x \rangle = \sum_{g} \langle D(g)x, D(g)x \rangle = \sum_{g} \| D(g)x \|^2 .$$
This is greater than or equal to 0, and is equal to 0 if and only if x = 0. (Because
each D(g) is nonsingular, D(g)x = 0 if and only if x = 0. Also note that
D(e) = 1, so the sum includes the term ‖D(e)x‖² = ‖x‖² which is 0 only if
x = 0.) In particular, since A is Hermitian it can be diagonalized. This means there
exists a basis of (nonzero) eigenvectors vi with corresponding real eigenvalues λi .
Then
$$\langle v_i, A v_i \rangle = \lambda_i \langle v_i, v_i \rangle = \lambda_i \| v_i \|^2 .$$
Since vi ≠ 0, the result above then shows that ⟨vi , Avi ⟩ = λi ‖vi ‖² > 0, so that each
λi is both real and greater than 0.
Let M be the unitary operator that diagonalizes A so that
$$M^\dagger A M = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}.$$
Defining the “square root” of A by
$$T = \begin{pmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{pmatrix} = T^\dagger$$
we then define the nonsingular operator S by
$$S = M T M^\dagger = S^\dagger .$$
Note that (since T 2 = M † AM )
$$S^2 = M T M^\dagger M T M^\dagger = M T^2 M^\dagger = A .$$
And for any h ∈ G, the rearrangement lemma tells us that
$$D(h)^\dagger A D(h) = \sum_{g} D(h)^\dagger D(g)^\dagger D(g) D(h) = \sum_{g} [D(g)D(h)]^\dagger D(g)D(h) = \sum_{g} D(gh)^\dagger D(gh) = \sum_{g} D(g)^\dagger D(g) = A .$$
Therefore
S 2 = A = D(h)† AD(h) = D(h)† S 2 D(h)
so that
[S −1 D(h)† S][SD(h)S −1 ] = 1 .
Finally, define U (h) = SD(h)S −1 so that U (h)† U (h) = 1 and U (G) is unitary.
(Here we have used the fact that (S −1 )† = (S † )−1 = S −1 .)
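The construction in this proof can be carried out numerically. The sketch below (NumPy; the non-unitary representation of C2 is my own toy example, not from the text) averages D(g)†D(g) over the group, takes the positive square root via the eigendecomposition, and checks that the conjugated representation is unitary:

```python
import numpy as np

# A hypothetical non-unitary representation of C2 = {e, a}: D_a squares to
# the identity (so g -> D(g) is a homomorphism) but is not itself unitary.
D_e = np.eye(2)
D_a = np.array([[1.0,  2.0],
                [0.0, -1.0]])
reps = [D_e, D_a]

# A = sum_g D(g)^dagger D(g) is Hermitian and positive definite.
A = sum(D.conj().T @ D for D in reps)

# S = A^(1/2) from the eigendecomposition A = M diag(lam) M^dagger.
lam, M = np.linalg.eigh(A)
S = M @ np.diag(np.sqrt(lam)) @ M.conj().T
S_inv = np.linalg.inv(S)

U_a = S @ D_a @ S_inv    # the unitarized image of the generator a
```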
Theorem 4. Every reducible representation of a finite group is completely reducible.
Proof. By Theorem 3, we need only consider a unitary representation U (G). (I
leave it as an easy exercise for you to show that if an arbitrary representation D(G)
is reducible, then U (G) = SD(G)S −1 is also reducible.) Let W ⊂ V be a U (G)-invariant subspace. As we have seen, using the Gram-Schmidt process we may write
V = W ⊕ W ⊥ , and we need only show that W ⊥ is also U (G)-invariant. But this is
easy to do, for if x ∈ W and y ∈ W ⊥ , then for any g ∈ G we have
hx, U (g)yi = hU (g)† x, yi = hU (g −1 )x, yi = 0
because the invariance of W means that U (g −1 )x ∈ W .
Although you proved them in the last homework set, I include Schur’s lemmas
and their proofs here for the sake of completeness.
Theorem 5 (Schur’s lemma 1). Let U (G) be an irreducible representation of G
on V . If A ∈ L(V ) is such that AU (g) = U (g)A for all g ∈ G, then A = λ1 where
λ ∈ C.
Proof. Suppose U (g)A = AU (g) for all g ∈ G. Since V is a complex vector space, A has at least one eigenvalue λ ∈ C; let Vλ be the corresponding eigenspace, and let v ∈ Vλ be nonzero, so that Av = λv. Then
A[U (g)v] = U (g)[Av] = λ[U (g)v]
so that U (g)v ∈ Vλ and Vλ is U (G)-invariant. Since U (G) is irreducible we have
either Vλ = {0} or Vλ = V . But v ≠ 0 by definition, and hence we must have
Vλ = V . This means that Av = λv for all v ∈ V which is equivalent to saying that
A = λ1.
Theorem 6 (Schur’s lemma 2). Let U (G) and U ′ (G) be two irreducible representations of G on V and V ′ respectively, and suppose A ∈ L(V ′ , V ) is such that
AU ′ (g) = U (g)A for all g ∈ G. Then either A = 0, or else A is an isomorphism of
V ′ onto V so that A−1 exists and U (G) is equivalent to U ′ (G).
Proof. Let AU ′ (g) = U (g)A for all g ∈ G and for A ∈ L(V ′ , V ). If v ∈ Im A then
there exists v ′ ∈ V ′ such that Av ′ = v. But then
U (g)v = U (g)Av ′ = A[U ′ (g)v ′ ] ∈ Im A
by definition. Since U (G) is irreducible we must have Im A = {0} or V . If Im A =
{0} then A = 0. If Im A = V , look at Ker A. For v ∈ Ker A we have Av = 0 so that
A[U ′ (g)v] = U (g)Av = 0
which implies U ′ (g)v ∈ Ker A. But U ′ (G) is also irreducible, so Ker A is either {0}
or V ′ . If Ker A = V ′ then A = 0 which isn’t possible since Im A = V . Therefore
Ker A = {0} so A is one-to-one and onto. In other words, A−1 exists so that
U ′ (g) = A−1 U (g)A for all g ∈ G and hence U (G) is equivalent to U ′ (G).
Example 13. Let us show that a consequence of Schur’s lemma 1 is that the
irreducible representations of any abelian group must be one-dimensional. To see
this, let U (G) be an irrep of an abelian group G, and let h ∈ G be arbitrary but
fixed. Since G is abelian, we have U (gh) = U (g)U (h) = U (h)U (g) for all g ∈ G,
and hence Theorem 5 tells us we can write U (h) = λh 1. Since this applies to any
h ∈ G, we see that the mapping h ↦ λh is a one-dimensional representation of G.
In other words, if V is one-dimensional and x ∈ V , then
U (g)U (h)x = λh U (g)x = λg λh x
= U (gh)x = λgh x
so that λgh = λg λh .
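For Cn this is explicit: the one-dimensional irreps send the generator a to an n-th root of unity. A sketch for C3 (with λ computed by cmath, and equalities holding up to floating-point error):

```python
import cmath

def irrep(m, k):
    """lambda for the element a^k in the m-th one-dimensional irrep of C3,
    m = 0, 1, 2: a^k -> exp(2*pi*i*m*k/3)."""
    return cmath.exp(2j * cmath.pi * m * k / 3)
```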
Example 14. Schur’s lemmas have extremely important consequences for any
quantum mechanical operator that corresponds to a physical observable that is
invariant under some symmetry transformation group G. The symmetry operators
are mapped into a unitary representation D(g) that acts on a Hilbert space V of
states. In general, this representation is reducible, meaning that we can find a basis for V in which the matrix representation of each D(g) is block diagonal. Then
D(g) = D1 (g)⊕ · · ·⊕ Dr (g) where each Di (g) is a unitary irrep acting on a subspace
of V . In general, each irrep may occur more than once, but we assume that we have
chosen our basis so that the µth irrep is represented by the same unitary matrix
Dµ (g) no matter how many times it occurs in the block diagonal decomposition.
Let us label our orthonormal basis states by
|µ, j, x⟩
where µ labels the irrep (i.e., which blocks (invariant subspaces) correspond to a
particular irrep), j = 1, . . . , nµ labels the basis vectors within each subspace (which
is then of dimension nµ ), and x labels any other physical variables necessary to
describe the state. Note that if a particular irrep occurs only once, then we don’t
need the extra variable x to label its states. However, in general there will be many
physical states that all have the same symmetry properties, and in this case we need
the extra label to distinguish them. The orthonormality of these states means that
⟨µ, j, x|ν, k, y⟩ = δµν δjk δxy .
With respect to this basis of invariant subspaces, the matrix representation of
D(g) is defined in the usual manner by
$$D(g)|\nu, k, y\rangle = \sum_{l} |\nu, l, y\rangle \, D^{\nu}(g)_{lk}$$
and therefore the matrix elements are given by
$$\langle \mu, j, x | D(g) | \nu, k, y \rangle = \delta_{\mu\nu} \, \delta_{xy} \, D^{\mu}(g)_{jk} . \tag{1}$$
This is simply the algebraic description of the block diagonal matrix form of D(g).
Using the completeness relation
$$\sum_{\mu,j,x} |\mu, j, x\rangle \langle \mu, j, x| = I$$
we have the equivalent operator representation
$$D(g) = \sum_{\mu,j,x} |\mu, j, x\rangle \langle \mu, j, x| \, D(g) \sum_{\nu,k,y} |\nu, k, y\rangle \langle \nu, k, y| = \sum_{\mu,j,x} \sum_{\nu,k,y} |\mu, j, x\rangle \, \delta_{\mu\nu} \delta_{xy} D^{\mu}(g)_{jk} \, \langle \nu, k, y| = \sum_{\mu,j,k,x} |\mu, j, x\rangle \, D^{\mu}(g)_{jk} \, \langle \mu, k, x| .$$
Under the symmetry transformation, the states transform as

|ψ⟩ → |ψ′⟩ = D(g)|ψ⟩   and   ⟨ψ| → ⟨ψ′| = ⟨ψ|D(g)† .

Under a symmetry transformation, the matrix elements of an observable O must obey

⟨ψ|O|ψ⟩ = ⟨ψ′|O′|ψ′⟩

and this therefore requires that

O → O′ = D(g)OD(g)† .
If the observable is invariant under the symmetry, then we must have O′ = O, and
this then implies that
[O, D(g)] = 0
for all g ∈ G.
That the symmetry operators commute with the observable puts an important
constraint on the matrix elements ⟨µ, j, x|O|ν, k, y⟩. To see this, we insert complete sets and calculate as follows, using equation (1):

0 = ⟨µ, j, x|[O, D(g)]|ν, k, y⟩
  = ⟨µ, j, x|O Σρ,l,z |ρ, l, z⟩⟨ρ, l, z| D(g)|ν, k, y⟩ − ⟨µ, j, x|D(g) Σρ,l,z |ρ, l, z⟩⟨ρ, l, z| O|ν, k, y⟩
  = Σl ⟨µ, j, x|O|ν, l, y⟩ Dν(g)lk − Σl Dµ(g)jl ⟨µ, l, x|O|ν, k, y⟩ .
This is essentially just the equation [ODν (g)]jk = [Dµ (g)O]jk . Thus, by Schur’s
lemma 2, we conclude that the matrix elements of O vanish unless µ = ν. And by
Schur’s lemma 1, we see that in the case where µ = ν, the matrix elements of O
must be proportional to the identity matrix, i.e., to δjk . However, the symmetry
doesn’t tell us anything about the dependence on the physical parameters x and y,
so we are finally able to write
⟨µ, j, x|O|ν, k, y⟩ = fµ(x, y) δµν δjk
where the function fµ (x, y) is independent of the symmetry variables of the problem,
and only contains the physics. This is a simple example of the famous Wigner-Eckart
theorem.
4 Cosets and Quotient Groups
Because equivalent representations are in a sense identical, it would be nice to find
some way of characterizing representations that is independent of whether or not
they are equivalent. One approach that immediately comes to mind is the trace of
a representation. Since the trace is invariant under similarity transformations, it
will be the same for all equivalent matrices corresponding to a given group element.
Thus, we define the character of g ∈ G in the representation U (G) to be the
number χ(g) = tr U(g). If D(G) is the matrix representation of U(G), then we have

χ(g) = Σi D(g)ii .
Since the character of a group element is the same for all equivalent representations, let us take a closer look at equivalence. This will lead in a natural way to the
concept of class. In the next section we will return to our discussion of characters,
where we will treat them in great detail.
We say that an element a ∈ G is conjugate to b ∈ G if there exists x ∈ G
such that a = xbx−1 . Clearly, if a is conjugate to b, then b is conjugate to a,
and any element is conjugate to itself (just take x = e). Furthermore, if a is
conjugate to b and b is conjugate to c, then a = xbx−1 and b = ycy −1 so that
a = x(ycy −1 )x−1 = (xy)c(xy)−1 so that a is conjugate to c. Thus conjugation is an
equivalence relation. The set of all group elements conjugate to each other is called
a (conjugate) class. The class of an element a will be denoted by [a]. Note that e
is in a class by itself, and that no other class can contain e. Furthermore, [e] = {e} is the only class which is also a subgroup (although a trivial one).
Example 15. Consider the permutation group S3 again. The element (12) is
conjugate to (31) because
(23)(12)(23)−1 = (23)(12)(23) = (23)(231) = (31) .
Similarly, (123) is conjugate to (321) because
(12)(123)(12)−1 = (12)(123)(12) = (12)(13) = (132) = (321) .
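Both conjugations are easy to check by brute force. Here is a short Python sketch (the tuple encoding of permutations and the helper names compose and inverse are my own, not part of the text):

```python
# A permutation p of {1,2,3} is stored as a tuple with p[i-1] = image of i.
def compose(p, q):                 # (p∘q)(i) = p(q(i)): apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p, start=1):
        inv[pi - 1] = i
    return tuple(inv)

e, c12, c23, c31 = (1, 2, 3), (2, 1, 3), (1, 3, 2), (3, 2, 1)
c123, c321 = (2, 3, 1), (3, 1, 2)       # (123) and (321) = (132)

assert compose(compose(c23, c12), inverse(c23)) == c31    # (23)(12)(23)^-1 = (31)
assert compose(compose(c12, c123), inverse(c12)) == c321  # (12)(123)(12)^-1 = (321)
```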
One of the most important properties of equivalence relations is that these classes
are disjoint. To see this, pick any a ∈ G. By letting x vary over all of G in the
expression xax−1 , we can find all elements of G that are conjugate to a, and therefore
determine [a]. Similarly, given any b ∈ G we may do the same thing to find [b]. We
claim that [a] and [b] are either disjoint or identical. Indeed, suppose they contain a common element c. Then we have a = xcx−1 and b = ycy−1 for some x, y ∈ G. But then we can write c = y−1 by, so that a = x(y−1 by)x−1 = (xy−1 )b(xy−1 )−1 . Hence a is conjugate to b, and a and b belong to the same class, i.e., [a] = [b]. Therefore,
equivalence classes are either identical or disjoint. This shows that every group
element is contained in a unique class.
Example 16. We can divide the elements of S3 into three classes:
C1 = {e}
C2 = {(12), (23), (31)}
C3 = {(123), (321)} .
This illustrates a general property of the symmetric groups: permutations with the
same cycle structure belong to the same class. To see that this is true, let us take
a look at conjugate elements in Sn . Consider two elements a, b ∈ Sn given by
a = ( 1  ···  n )      b = ( 1  ···  n ) = ( a1  ···  an )
    ( a1 ··· an )          ( b1 ··· bn )   ( ba1 ··· ban )

(the second form of b simply relabels its columns). Then

bab−1 = ( 1  ···  n )( 1  ···  n )( b1 ··· bn ) = ( b1  ···  bn )
        ( b1 ··· bn )( a1 ··· an )( 1  ···  n )   ( ba1 ··· ban )
So we see that to evaluate bab−1 , we apply b separately to the top and bottom
rows of a. Since a cycle in general leaves some numbers unchanged, we see that
conjugation will leave that number alone. In particular, suppose a1 = 1. Then
b1 → ba1 = b1 and in general, if ak = k then bk → bak = bk also remains unchanged,
and cycle structure is preserved. For example,
(23)(12)(23)−1 = (31)
(123)(12)(123)−1 = (32)
(12)(123)(12)−1 = (132) .
Note that if a is a product of cycles, say a = a1 a2 , then we can always write
bab−1 = ba1 a2 b−1 = [ba1 b−1 ][ba2 b−1 ]
and again we see that the cycle structure of a will be maintained.
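The class structure of S3 can also be confirmed by brute force: conjugating each element by all of G reproduces exactly the three classes of Example 16, each with a single cycle structure. A Python sketch (the encodings and helper names are mine):

```python
from itertools import permutations

def compose(p, q):                 # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p, start=1):
        inv[x - 1] = i
    return tuple(inv)

S3 = list(permutations((1, 2, 3)))

def conj_class(a):                 # [a] = { g a g^-1 : g in G }
    return frozenset(compose(compose(g, a), inverse(g)) for g in S3)

classes = {conj_class(a) for a in S3}
assert sorted(len(c) for c in classes) == [1, 2, 3]   # {e}, two 3-cycles, three 2-cycles

def fixed_points(p):
    return sum(1 for i, x in enumerate(p, start=1) if x == i)

# within each class every element has the same cycle structure
assert all(len({fixed_points(p) for p in c}) == 1 for c in classes)
```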
It is also possible in many cases to give a physical interpretation of the class
structure. For example, consider some group of symmetry transformations on a
symmetrical object. Then we can interpret the relation b = x−1 ax as the result
of first rotating the object by x, then performing the transformation a, and then
rotating back by x−1 . This shows that b must be the same physical type of transformation as a, but performed about a different axis, one that is related to that of
a by the transformation x.
Example 17. Let us consider the symmetry transformations of an equilateral triangle.
[Figure: an equilateral triangle with vertices labeled 1, 2 and 3; the three axes a, b and c each pass through one vertex and the midpoint of the opposite side.]
Here we label the vertices by 1, 2 and 3 so they may be distinguished in a symmetry
operation. The group elements a, b and c represent rotations by π about the axes
shown. The element d is defined to be a clockwise rotation by 2π/3 in the plane
of the triangle, and element f is a counterclockwise rotation by 2π/3. These five
operations together with the identity e define a group of order 6 which is denoted by
D3 (the dihedral group of degree 3). By convention, we assume that the rotation
axes a, b and c remain fixed in space and do not rotate with the object. Then it
is convenient to describe the group multiplication rules by constructing a group
multiplication table as shown below:
    |  e   a   b   c   d   f
----+------------------------
  e |  e   a   b   c   d   f
  a |  a   e   d   f   b   c
  b |  b   f   e   d   c   a
  c |  c   d   f   e   a   b
  d |  d   c   a   b   f   e
  f |  f   b   c   a   e   d
By definition, the table entries are row elements times column elements (in that
order).
Using the multiplication table, you can show that the two rotations by 2π/3 form
a class, the three rotations by π form a class, and obviously the identity element
forms a class of its own. For example, we see that dcd−1 = bd−1 = bf = a or
c = d−1 ad. In physical terms, first d rotates the triangle clockwise by 2π/3 so that
vertex 2 lies on axis a. Next, a rotates about its axis by π so that vertices 1 and
3 are interchanged. Finally, d−1 = f rotates counterclockwise by 2π/3, leaving the
triangle in precisely the same configuration as the single rotation by π about axis
c, which is the same as a but rotated 2π/3 counterclockwise by the transformation
d−1 .
It is clear that this is a non-abelian group. In fact, I leave it to you to verify
that the following set of matrices forms a two-dimensional representation of this
group.
e = [  1    0  ]     a = [  1    0  ]     b = [ −1/2   √3/2 ]
    [  0    1  ]         [  0   −1  ]         [  √3/2   1/2 ]

c = [ −1/2  −√3/2 ]  d = [ −1/2    √3/2 ]  f = [ −1/2  −√3/2 ]
    [ −√3/2   1/2 ]      [ −√3/2  −1/2  ]      [  √3/2  −1/2 ]
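A quick numerical check of this claim is easy to write down. The sketch below (plain Python; the matrix entries are read off from the matrices above, and the helper names are mine) verifies several entries of the multiplication table and closure under products:

```python
import math

s = math.sqrt(3) / 2
e = ((1, 0), (0, 1))
a = ((1, 0), (0, -1))
b = ((-0.5, s), (s, 0.5))
c = ((-0.5, -s), (-s, 0.5))
d = ((-0.5, s), (-s, -0.5))
f = ((-0.5, -s), (s, -0.5))

def mul(X, Y):                        # 2x2 matrix product
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def close(X, Y, tol=1e-12):
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

group = [e, a, b, c, d, f]
assert close(mul(a, b), d)            # ab = d, as in the multiplication table
assert close(mul(d, c), b)            # dc = b
assert close(mul(d, d), f)            # d^2 = f
# closure: every product is again one of the six matrices
assert all(any(close(mul(x, y), z) for z in group) for x in group for y in group)
```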
Now let H be a subgroup of a group G, and let a ∈ G be arbitrary. As we stated
earlier, the set
Ha = {ha : h ∈ H}
is called a right coset of H in G. Note that if a ∈ H, the rearrangement lemma
shows us that Ha = H. Let a, b ∈ G be arbitrary, and suppose that the cosets
Ha and Hb have an element in common. This means that h1 a = h2 b for some
h1 , h2 ∈ H. But then using the fact that H is a subgroup, we see that
a = h1 −1 h1 a = h1 −1 h2 b ∈ Hb.
Since this means that a = hb for some h = h1 −1 h2 ∈ H, we see from the rearrangement lemma that this implies
Ha = Hhb = Hb
and therefore if any two right cosets have an element in common, then they must
in fact be identical. It is easy to see that the set of all right cosets of H in G defines
an equivalence relation that partitions G into disjoint subsets.
It is important to realize that if a ∉ H, then the coset Ha cannot be a subgroup,
because it cannot contain the identity element. Indeed, if ha = e for some h ∈ H,
then a = h−1 e = h−1 ∈ H, a contradiction. Clearly every g ∈ G must lie in some
coset of H (just form Hg for each g ∈ G), and these cosets are either identical or
disjoint. Since each coset contains the same number of elements as H, it follows
that the order of G is a multiple of the order of H, i.e., nH | nG . This proves the
next theorem, known as Lagrange’s theorem.
Theorem 7. If G is a finite group and H is a subgroup of G, then nH is a divisor
of nG .
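The coset counting behind Lagrange's theorem is easy to see concretely. Here is a Python sketch for G = S3 with one particular choice of subgroup, H = {e, (12)} (the encodings are my own):

```python
from itertools import permutations

def compose(p, q):                    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(3))

G = list(permutations((1, 2, 3)))
H = [(1, 2, 3), (2, 1, 3)]            # the subgroup {e, (12)}

cosets = {frozenset(compose(h, g) for h in H) for g in G}   # right cosets Ha

assert len(cosets) == len(G) // len(H) == 3      # Lagrange: nH divides nG
assert sum(len(c) for c in cosets) == len(G)     # the cosets partition G
```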
While we have restricted our discussion to right cosets, it is clear that everything
could be repeated using left cosets defined in the obvious way. It should also be
clear that for a general subgroup H of a group G, we need not have Ha = aH for
any a ∈ G. However, if N is a subgroup of G such that for every n ∈ N and g ∈ G
we have gng −1 ∈ N , then we say that N is a normal (or invariant) subgroup
of G. An equivalent way of phrasing this is to say that N is a normal subgroup of
G if and only if gN g −1 ⊂ N for all g ∈ G (where by gN g −1 we mean the set of
all gng −1 with n ∈ N ). The notation N ⊳ G is sometimes used to denote the fact
that N is a normal subgroup of G.
Since for any n ∈ N we have gng −1 ∈ N for all g ∈ G, we see that the entire
class [n] is contained in N . Thus a normal subgroup N consists of complete classes.
Theorem 8. A subgroup N of G is normal if and only if gN g −1 = N for every
g ∈ G.
Proof. If gN g −1 = N for every g ∈ G, then clearly gN g −1 ⊂ N so that N is
normal. Conversely, suppose that N is normal in G. Then, for each g ∈ G we have
gN g −1 ⊂ N , and hence
g −1 N g = g −1 N (g −1 )−1 ⊂ N.
Using this result, we see that
N = (gg −1 )N (gg −1 ) = g(g −1 N g)g −1 ⊂ gN g −1
and therefore N = gN g −1 . (This also follows from Example 8.)
Example 18. The subgroup N = {e, a2 } is a normal subgroup of the cyclic group
C4 = {e = a4 , a, a2 , a3 }. Noting that a−1 = a3 , (a2 )−1 = a2 and (a3 )−1 = a, we see
that
aN a−1 = a{e, a2 }a3 = {a4 , a6 } = {e, a2 } = N
a2 N (a2 )−1 = a2 {e, a2 }a2 = {a4 , a6 } = {e, a2 } = N
a3 N (a3 )−1 = a3 {e, a2 }a = {a4 , a6 } = {e, a2 } = N .
Be careful to note that Theorem 8 does not say that gng −1 = n for every n ∈ N
and g ∈ G. This will in general not be true. The usefulness of this theorem is that
it allows us to prove the following result.
Theorem 9. A subgroup N of G is normal if and only if every left coset of N in
G is also a right coset of N in G.
Proof. If N is normal, then gN g −1 = N for every g ∈ G, and hence gN = N g.
Conversely, suppose that every left coset gN is also a right coset. We show that
in fact this right coset must be N g. Since N is a subgroup it must contain the
identity element e, and therefore g = ge ∈ gN so that g must also be in whatever
right coset it is that is identical to gN . But we also have eg = g so that g is in the
right coset N g. Then, since any two right cosets with an element in common must
be identical, it follows that gN = N g. Thus, we see that gN g −1 = N gg −1 = N so
that N is normal.
If G is a group and A, B are subsets of G, we define the set
AB = {ab ∈ G : a ∈ A, b ∈ B}.
In particular, if H is a subgroup of G, then HH ⊂ H since H is closed under the
group multiplication operation. But we also have H = He ⊂ HH (since e ∈ H),
and hence HH = H.
Now let N be a normal subgroup of G. By Theorem 9 we then see that
(N a)(N b) = N (aN )b = N (N a)b = N N ab = N ab.
In other words, the product of right cosets of a normal subgroup is again a right
coset. This closure property suggests that there may be a way to construct a group
out of the cosets N a where a is any element of G. We now show that there is
indeed a way to construct such a group. Our method is used frequently throughout
mathematics, and entails forming what is called a quotient structure.
Let G/N denote the collection of all right cosets of N in G. In other words, an
element of G/N is a right coset of N in G. We use the product of subsets as defined
above to define a product on G/N .
Theorem 10. Let N be a normal subgroup of a group G. Then G/N is a group.
Proof. We show that the product in G/N obeys properties (G1)–(G4) in the definition of a group.
(1) If A, B ∈ G/N , then A = N a and B = N b for some a, b ∈ G, and hence
(since ab ∈ G)
AB = N aN b = N ab ∈ G/N.
(2) If A, B, C ∈ G/N , then A = N a, B = N b and C = N c for some a, b, c ∈ G
and hence
(AB)C = (N aN b)N c = (N ab)N c = N (abN )c = N (N ab)c = N (ab)c
= N a(bc) = N a(N bc) = N a(N bN c) = A(BC)
(3) If A = N a ∈ G/N , then
AN = N aN e = N ae = N a = A
and similarly
N A = N eN a = N ea = N a = A.
Thus N = N e ∈ G/N serves as the identity element in G/N .
(4) If N a ∈ G/N , then N a−1 is also in G/N , and we have
N aN a−1 = N aa−1 = N e
as well as
N a−1 N a = N a−1 a = N e.
Therefore N a−1 ∈ G/N is the inverse to any element N a ∈ G/N .
Corollary. If N is a normal subgroup of a finite group G, then nG/N = nG /nN .
Proof. By construction, G/N consists of all the right cosets of N in G, and by
Lagrange’s theorem (Theorem 7) this number is just nG/N = nG /nN .
The group defined in Theorem 10 is called the quotient group (or factor
group) of G by N .
Example 19. Consider the normal subgroup N = {e, a2 } of C4 (see Example 18).
The quotient group C4 /N consists of the distinct cosets of N , which are N e = N
and M := N a = {a, a3 } = aN . (Since N a2 = a2 N = N and N a3 = a3 N = M .) It
is easy to see that C4 /N indeed forms a group {N, M } with identity N , since N M = N N a3 =
N a3 = M, M N = N a3 N = a3 N N = a3 N = M and M M = N so that M −1 = M .
Thus both N and C4 /N are of order 2, and are isomorphic to C2 (they have the
same multiplication table).
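The same computation can be phrased in Python. Writing C4 additively as the integers mod 4 (so that N = {e, a2} becomes {0, 2}) is an assumption of the sketch, not of the text:

```python
# C4 = {e, a, a^2, a^3} written additively as {0, 1, 2, 3} mod 4; N = {e, a^2} -> {0, 2}
C4 = [0, 1, 2, 3]
N = frozenset({0, 2})

def coset(x):                         # the coset N + x
    return frozenset((m + x) % 4 for m in N)

cosets = {coset(x) for x in C4}
assert len(cosets) == 2               # n_{G/N} = nG / nN = 4 / 2

M = coset(1)                          # the coset {a, a^3} -> {1, 3}

def mult(A, B):                       # product of cosets, element by element
    return frozenset((p + q) % 4 for p in A for q in B)

assert mult(N, N) == N and mult(N, M) == M
assert mult(M, M) == N                # so M is its own inverse: G/N is a copy of C2
```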
Example 20. The permutation group S3 has the normal subgroup N =
{e, (123), (321)}. It is also easy to verify that N (12) = N (23) = N (31) =
{(12), (23), (31)} so that the elements of S3 /N are the cosets N e = N and
M = N (ij) = (ij)N = {(12), (23), (31)} (where (ij) stands for any of the three
2-cycles). It is then not hard to see that N M = N (ij)N = (ij)N = M = M N and
N N = M M = N . Therefore S3 /N is of order 2 and is also isomorphic to C2 .
Theorem 11. Let a group G have a non-trivial normal subgroup N . Then any
representation of the quotient group G/N induces a degenerate (i.e., non-faithful)
representation of G. Conversely, let U (G) be a degenerate representation of G.
Then G has at least one normal subgroup N such that U (G) defines a faithful
representation of the quotient group G/N .
Proof. Define a mapping ϕ from G to the cosets of N by ϕ(g) = gN . This is a
homomorphism because ϕ(g1 g2 ) = g1 g2 N = g1 N g2 N = ϕ(g1 )ϕ(g2 ). Since G can
be decomposed into the distinct equivalence classes defined by the cosets of N ,
the fact that N is non-trivial (so it has more than one element) means that nN
elements of G all map into the same coset. Thus ϕ is many-to-one. Now let ψ be
a representation of G/N , and define the mapping U = ψ ◦ ϕ. This is illustrated in
the following commutative diagram:
         U
    G --------> U(G)
     \          ^
    ϕ \        / ψ
       v      /
        G/N -'
Then U is a representation of G because
U (g1 g2 ) = ψ(ϕ(g1 g2 )) = ψ(ϕ(g1 )ϕ(g2 )) = ψ(ϕ(g1 ))ψ(ϕ(g2 )) = U (g1 )U (g2 ) .
But because ϕ is many-to-one, this representation is not faithful.
The proof of the converse is left as an exercise (see the homework problems).
Example 21. Referring to Example 20, we know that S3 has the normal subgroup
N = {e, (123), (321)}, and that the cosets of N are only N itself and M = N (ij).
Furthermore, we showed that S3 /N is isomorphic to the cyclic group C2 = {e, a}.
Now, the group C2 has the rather simple representation ψ(e) = 1, ψ(a) = −1
as you can easily verify. This is then also a representation of S3 /N . Then this
representation induces a representation of S3 via the assignment U : S3 → S3 /N →
{1, −1} as
U(g) = +1   for g = e, (123), (321)
U(g) = −1   for g = (12), (23), (31) .
You should verify that this yields the same multiplication table as S3 . For example,
(12)(23) = (123) which agrees with (−1)(−1) = +1.
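This induced representation is just the familiar sign of a permutation, and the homomorphism property is easy to verify exhaustively. A Python sketch (the fixed-point test for deciding membership in N is my own shortcut):

```python
from itertools import permutations

def compose(p, q):                    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(3))

def U(p):                             # +1 on N = {e, (123), (321)}, -1 on the 2-cycles
    fixed = sum(1 for i, x in enumerate(p, start=1) if x == i)
    return 1 if fixed in (3, 0) else -1

S3 = list(permutations((1, 2, 3)))
for g in S3:
    for h in S3:
        assert U(compose(g, h)) == U(g) * U(h)   # U is a homomorphism
```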
While we won’t go into any further discussion of these topics, a group is said to
be simple if it does not contain any non-trivial invariant subgroup, and is semisimple if it does not contain any abelian invariant subgroup. A consequence of
Theorem 11 is that all (non-trivial) representations of simple groups are necessarily
faithful.
While Theorem 8 shows us that for every g ∈ G we have gN g −1 = N if N
is a normal subgroup of G, it is in fact also true that for every g ∈ G we have
gCg −1 = C if C is merely any class. This is actually obvious, because by definition,
if a ∈ C, then because C is a class, it must contain every gag −1 . (You can also
think of this as a direct consequence of the rearrangement lemma.) In other words,
any class contains every element conjugate to every member of the class, and only
those conjugate elements.
Conversely, suppose C is a subset of G with the property that gCg −1 = C for all
g ∈ G. I claim that C must consist entirely of (complete) classes. Indeed, remove every complete class contained in C from both sides of this relation, and denote any remainder by R; then gRg−1 = R for all g ∈ G. If r ∈ R, then for any g ∈ G we have grg−1 on the left, and this must be contained in R on the right. But this means that R contains the entire class [r] for each r ∈ R. Therefore C must consist of complete classes.
Summarizing this discussion, we have the following result.
Theorem 12. Let C be a subset of a finite group G. Then C consists entirely of
(complete) classes if and only if gCg −1 = C for every g ∈ G.
Generalizing our notation slightly, if Ci and Cj are two classes, we let Ci Cj
denote the set Ci Cj = {xi xj : xi ∈ Ci and xj ∈ Cj }. (Be sure to realize that any
specific term in the product Ci Cj may occur more than once.) Then according to
Theorem 12, for every g ∈ G we have
Ci Cj = gCi g −1 gCj g −1 = gCi Cj g −1 .
But then the converse part of Theorem 12 tells us that Ci Cj must consist of complete
classes. We can express this fact mathematically by writing
Ci Cj = Σk cijk Ck     (2)
where the integers cijk tell us how often the class Ck appears in the product Ci Cj .
Example 22. In Example 17 we found the classes of the group D3 (the symmetry
group of the equilateral triangle). We may label them by
C1 = {e}
C2 = {a, b, c}
C3 = {d, f } .
Using the group multiplication table we find that
C1 C1 = C1
C1 C2 = C2
C1 C3 = C3
C2 C2 = 3C1 + 3C3
C2 C3 = 2C2
C3 C3 = 2C1 + C3 .
For example,
C2 C2 = {a2 , ab, ac, ba, b2, bc, ca, cb, c2 } = {e, d, f, f, e, d, d, f, e} = 3C1 + 3C3 .
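Class products like these can be computed mechanically. The sketch below uses S3 in place of D3 (via the standard isomorphism between the two groups, an assumption of the sketch) and counts how often each element occurs in C2 C2:

```python
from itertools import permutations
from collections import Counter

def compose(p, q):                    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(3))

def fixed(p):
    return sum(1 for i, x in enumerate(p, start=1) if x == i)

G = list(permutations((1, 2, 3)))
C1 = [p for p in G if fixed(p) == 3]              # {e}
C2 = [p for p in G if fixed(p) == 1]              # the three pi-rotations a, b, c
C3 = [p for p in G if fixed(p) == 0]              # the two 2pi/3 rotations d, f

def class_product(Ci, Cj):
    return Counter(compose(x, y) for x in Ci for y in Cj)

prod = class_product(C2, C2)
# C2 C2 = 3C1 + 3C3: each element of C1 and C3 occurs three times, none of C2
assert all(prod[p] == 3 for p in C1 + C3)
assert all(prod[p] == 0 for p in C2)
```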
Let us derive some results that we will find very useful in the next section.
Suppose xi ∈ Ci and xj ∈ Cj . Then

xi xj = (xi xj xi−1) xi ∈ Cj Ci

since xi xj xi−1 ∈ Cj . This shows that Ci Cj ⊂ Cj Ci . Similarly we see that Cj Ci ⊂ Ci Cj , and therefore

Ci Cj = Cj Ci .
This shows that the coefficients in equation (2) have the symmetry property

cijk = cjik .     (3)

If C1 = {e}, then C1 Cj = Cj = Σk c1jk Ck implies that c1jk = δjk . Thus we have

c1jk = cj1k = δjk .     (4)
Now suppose xi , xj ∈ C. By definition of class, this means there exists g ∈ G such that xi = gxj g−1 . Taking the inverse of both sides of this equation shows that xi−1 = gxj−1 g−1 , and hence xi−1 and xj−1 also belong to the same class. Given a class Ci , we denote the class of all inverse elements of Ci by Ci′ . Then if j ≠ i′ , Ci Cj cannot contain C1 = [e]. If we denote the number of elements in Ci by ni , then n1 = 1 and ni = ni′ . Since Ci Ci′ contains C1 = [e] precisely ni times, we then see that

cij1 = ni δji′ .     (5)
Recall that the character of g ∈ G in the µ-representation is defined by

χµ(g) = tr Uµ(g) = Σi Dµ(g)ii      (i = 1, . . . , nµ ).
In particular, note that since the group identity element e is represented by the
identity matrix, we must have
χµ (e) = nµ .
Since the matrix representations of all elements in the same class are related by a
similarity transformation, we see that the character is the same for each member of
a class. Let us denote the character of class Ck in the µ-representation by χµ (Ck ).
We denote the number of elements in Ck by nk .
Now take another look at equation (2). Write Ci = {xi1 , . . . , xini } and Cj = {xj1 , . . . , xjnj }, where the superscript labels the class. Then

Ci Cj = {xi1 xj1 , . . . , xi1 xjnj , . . . , xini xj1 , . . . , xini xjnj }

so that summing together all of these elements yields

xi1 (Σl xjl) + · · · + xini (Σl xjl) = (Σk xik)(Σl xjl)

with k = 1, . . . , ni and l = 1, . . . , nj . If we write each element in terms of an irreducible matrix representation µ, then we can write this sum as Ciµ Cjµ where

Ciµ = Σg∈Ci Dµ(g)

is the sum of all matrices representing the elements in the class Ci in the µ-representation. For the other side of equation (2) we simply sum the irreps of the elements in each Ck to obtain Σk cijk Ckµ , and hence we have

Ciµ Cjµ = Σk cijk Ckµ .
By Theorem 12, any class Ck satisfies gCk g−1 = Ck so that gCk = Ck g for all g ∈ G. Writing this in terms of irreducible matrix representations and summing over the class we have Dµ(g)Ckµ = Ckµ Dµ(g). Since this holds for all g ∈ G, Schur's lemma 1 tells us that Ckµ = λk I. Using this result in the above equation we obtain

λi λj = Σk cijk λk .

To evaluate λk , we first take the trace of Ckµ = λk I to obtain

tr Ckµ = λk nµ

where nµ is the dimension of the µ-representation. On the other hand, we can also take the trace of the defining equation for Ckµ to obtain

tr Ckµ = nk χµ(Ck) .

Equating these two results yields

λk = nk χµ(Ck)/nµ .

Finally, using this formula for λk in the above equation for λi λj we have

ni nj χµ(Ci)χµ(Cj) = nµ Σk cijk nk χµ(Ck) .     (6)
5 Orthogonality Relations
Almost all of the main results in finite group representation theory are based on
the following result, sometimes called the great orthogonality theorem. By
way of notation, we let nG be the order of G, nµ denote the dimensionality of the
µth irreducible representation, and Dµ (g) be the unitary matrix corresponding to
g ∈ G in the µ-representation with respect to an orthonormal basis. Also, to keep
our notation as simple as possible we will usually write D(g)ij , but when summing
over multiple indices, it will be easier to use the summation convention and write
D(g)i j .
Theorem 13. With respect to all the inequivalent, irreducible, unitary representations of a finite group G, we have

Σg∈G Dµ(g)†ki Dν(g)jl = (nG/nµ) δµν δij δkl .
Remark : Since D(g)†ik = D(g)∗ki , this theorem may be written as

Σg∈G [(nµ/nG)^{1/2} Dµ(g)ik]∗ [(nν/nG)^{1/2} Dν(g)jl] = δµν δij δkl .
For fixed (µ, i, k), we regard (nµ /nG )1/2 Dµ (g)ik as an nG -component vector (as g
ranges over G), and looking at the result in this manner, the equation is just the
usual orthonormality relationship for vectors labeled by the three indices (µ, i, k).
Proof. Part (i): Let X be any nµ × nν matrix and define

M = Σg Dµ(g)† XDν(g) .
Then we have (using D(g)† = D(g)−1 and the rearrangement lemma)

Dµ(h)−1 M Dν(h) = Σg [Dµ(g)Dµ(h)]−1 XDν(g)Dν(h)
                = Σg Dµ(gh)−1 XDν(gh)
                = Σg Dµ(g)−1 XDν(g)
                = M .
Since this holds for all h ∈ G, Schur's lemmas tell us that either µ ≠ ν and M = 0, or µ = ν and M = cX I where cX is a constant depending on X.
Part (ii): Let Xlk be one of the nµ nν matrices with matrix elements (Xlk)ij = δil δjk . In other words, Xlk has a 1 in the (l, k)th position and 0's elsewhere. Then

(Mlk)rs = Σg [Dµ(g)†]ri (Xlk)ij Dν(g)js = Σg [Dµ(g)†]rl Dν(g)ks .

According to Part (i), the left-hand side of this equation is zero if µ ≠ ν. This proves the theorem in the case that µ ≠ ν. If we have µ = ν, then Part (i) tells us that the left-hand side must equal ckl δrs where the ckl are constants. To determine them, we take the trace of both sides of this last equation (i.e., set r = s and sum). Since 1 ≤ r, s ≤ nµ , the left-hand side yields nµ ckl (since Σr δrr = nµ ), while from the right-hand side we obtain

Σg [Dµ(g)Dµ(g)†]kl = Σg δkl = nG δkl .

So now ckl = (nG/nµ) δkl so that (Mlk)rs = ckl δrs = (nG/nµ) δkl δrs and we have

Σg [Dµ(g)†]rl Dν(g)ks = (nG/nµ) δµν δkl δrs .
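You can verify the great orthogonality theorem numerically for the two-dimensional irrep of D3 given in Example 17. A Python sketch (the matrices are real, so the dagger is just the transpose; the list encoding is mine):

```python
import math

s = math.sqrt(3) / 2
# the 2x2 matrices e, a, b, c, d, f of Example 17, stored as tuples of rows
reps = [((1, 0), (0, 1)), ((1, 0), (0, -1)),
        ((-0.5, s), (s, 0.5)), ((-0.5, -s), (-s, 0.5)),
        ((-0.5, s), (-s, -0.5)), ((-0.5, -s), (s, -0.5))]
nG, nmu = 6, 2

for i in range(2):
    for j in range(2):
        for k in range(2):
            for l in range(2):
                # sum over g of D(g)†_ki D(g)_jl, with D(g)†_ki = D(g)_ik
                total = sum(D[i][k] * D[j][l] for D in reps)
                expected = (nG / nmu) * (i == j) * (k == l)
                assert abs(total - expected) < 1e-12
```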
One immediate consequence of this result is the following. As we have pointed
out, Theorem 13 may be interpreted as an orthonormality condition on nG -component
vectors labeled by (µ, i, k) where 1 ≤ i, k ≤ nµ . Since for each µ there are (nµ)² possible values of i and k, the total number of vectors is given by Σµ nµ² . But any orthonormal set of vectors is linearly independent, and the number of components of each vector in a linearly independent set can't be less than the number of vectors in the set. (You can't have three linearly independent two-component vectors.) Therefore we must have

Σµ nµ² ≤ nG .
In fact, we will show that equality holds in this relation, and it is this result that
allows us to find all possible inequivalent irreducible representations of finite groups.
However, proving that equality holds here requires a fair amount of work, which we
will have to develop gradually.
Example 23. The simplest non-trivial group is C2 = {e, a}, which is abelian
and hence has only one-dimensional irreducible representations. Furthermore, each
element forms a class by itself. The identity representation assigns the number 1 to
every element of C2 , and hence we have D1 (e) = D1 (a) = 1 where the superscript
1 refers to the identity irrep. Note that Theorem 13 in this case with µ = ν = 1
becomes 1 · 1 + 1 · 1 = 2 = nG as it should. If we have a second inequivalent irrep
D2 (C2 ), then as a (nG = 2)-component vector, it must have components orthogonal
to (1, 1). Up to normalization, the only such vector is (1, −1), and therefore the
only other possible one-dimensional irrep with the correct normalization is e → 1
and a → −1 so that 1 · 1 + (−1) · (−1) = 2 also.
We can summarize these results in a table with the group elements gi labeling
the columns and the inequivalent irreps Dµ (G) labeling the rows (this is not quite
the same as a character table):
       e    a
  1    1    1
  2    1   −1
Next, turn to Theorem 13 and set k = i and l = j. Summing over i and j we then obtain

Σg Σij Dµ(g)†ii Dν(g)jj = (nG/nµ) δµν Σij δij

or

Σg χµ(g)∗ χν(g) = nG δµν .     (7)
This shows that the characters form a set of orthogonal vectors in group-element
space.
Since the characters are the same for all members of the same class, we can
rewrite equation (7) as a sum over classes to obtain
Σk nk χµ(Ck)∗ χν(Ck) = nG δµν     (8)
where nk is the number of elements in the class Ck .
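Equation (8) is easy to test on D3, whose three irreps have characters (1, 1, 1), (1, −1, 1) and (2, 0, −1) on the classes {e}, {a, b, c}, {d, f}. A Python sketch (the row labels are my own):

```python
# D3 character data: classes {e}, {a,b,c}, {d,f} with sizes 1, 3, 2
n = [1, 3, 2]
chars = {
    'trivial': [1,  1,  1],
    'sign':    [1, -1,  1],    # -1 on the three pi-rotations
    '2-dim':   [2,  0, -1],
}
nG = 6
for mu, cm in chars.items():
    for nu, cn in chars.items():
        total = sum(nk * x * y for nk, x, y in zip(n, cm, cn))
        assert total == (nG if mu == nu else 0)   # equation (8)
```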
The important observation to make from this is that the characters of the irreps
form a set of orthogonal vectors in a vector space with coordinate axes labeled
by classes. Since orthogonal vectors are linearly independent, and you can’t have
more linearly independent vectors than the dimensionality of the space, we see
that the number of irreducible representations can not exceed the number of classes.
Most importantly, we will show below (see Theorem 15) that the number of irreps
equals the number of classes, but to do so requires that we introduce the regular
representation, which we do shortly.
Example 24. Taking the result that the number of irreps equals the number nc
of classes, we can form an nc × nc matrix with columns labeled by class i and rows
labeled by irrep µ. (We will have more to say about these below.) In the case of
abelian groups, all irreps are one-dimensional and each element forms a class of its
own, and therefore Dµ (gi ) = χµ (Ci ). Thus, for abelian groups such as C2 , the table
shown in Example 23 is the same as its character table.
An immediate consequence of equation (7) is that we can easily determine the
number of times a given irrep occurs in a reducible representation. To see this,
suppose that the representation U (g) is put into block diagonal form D(g) = D1 (g)⊕
· · · ⊕ Dr (g). Now in general, a given irrep Dµ (g) will occur a number aµ of times in
this decomposition. Since the trace of D(g) is the sum of the traces of the Dµ (g),
we easily see that
χ(g) = Σν aν χν(g) .     (9)
Multiplying both sides of this equation by χµ(g)∗ and summing over g we obtain

Σg χµ(g)∗ χ(g) = Σν aν Σg χµ(g)∗ χν(g) = nG Σν aν δµν = nG aµ .

Therefore, summing over either group elements or classes we have

aµ = (1/nG) Σg χµ(g)∗ χ(g) = (1/nG) Σk nk χµ(Ck)∗ χ(Ck) .     (10)
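As a concrete use of equation (10), take the three-dimensional representation of S3 (≅ D3, an identification assumed in the sketch) by permutation matrices, whose character on the classes is (3, 1, 0); the computation below recovers its decomposition into the trivial and two-dimensional irreps:

```python
n = [1, 3, 2]                      # class sizes of S3: {e}, 2-cycles, 3-cycles
chars = {'trivial': [1, 1, 1], 'sign': [1, -1, 1], '2-dim': [2, 0, -1]}
chi = [3, 1, 0]                    # character of the permutation representation
nG = 6
a = {mu: sum(nk * x * y for nk, x, y in zip(n, cm, chi)) // nG
     for mu, cm in chars.items()}
assert a == {'trivial': 1, 'sign': 0, '2-dim': 1}   # 3-dim rep = trivial + 2-dim
```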
We now turn to a particularly useful representation called the regular representation. This is defined by taking the group elements themselves to be operators
acting on the vector space defined to have the group itself as a basis. Thus the dimension of the regular representation is nG .
To distinguish the regular representation from other representations and save
myself some typing, I will use the Greek letter Γ instead of U to denote the operator,
and Γ(g)ij instead of Dreg (g)ij . Thus we define
Γ(gi)gj := gi gj = Σk gk Γ(gi)kj .
This is just our usual definition for the matrix representation of an operator. To
verify that this indeed defines a representation, we calculate as follows:
Γ(gi gj)gk = gi (gj gk) = gi gr Γ(gj)rk = gs Γ(gi)sr Γ(gj)rk = gs [Γ(gi)Γ(gj)]sk = gs Γ(gi gj)sk
and hence Γ(gi gj )s k = [Γ(gi )Γ(gj )]s k as claimed. This is equivalent to writing the
operator expression
Γ(gi gj )gk = gi (gj gk ) = gi [Γ(gj )gk ] = Γ(gi )[Γ(gj )gk ] = [Γ(gi )Γ(gj )]gk
so that Γ(gi gj ) = Γ(gi )Γ(gj ).
Example 25. Referring to Example 17, we know that ab = d or g2 g3 = g5 . Then
g5 = Γ(g2)g3 = Σk gk Γ(g2)k3
which implies that Γ(g2 )k3 = δk5 .
As demonstrated in this example,

Γ(gi)kj = 1 if gi gj = gk , and 0 otherwise.
It is important to realize this implies that

Γ(gi)kk = 1 if and only if gi = e .

Therefore we have

χreg(g) = nG if g = e, and χreg(g) = 0 if g ≠ e .
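All of this can be checked directly by building the regular representation of S3 in Python (the encodings and helper names are mine):

```python
from itertools import permutations

def compose(p, q):                    # apply q first, then p
    return tuple(p[q[i] - 1] for i in range(3))

G = list(permutations((1, 2, 3)))
idx = {g: i for i, g in enumerate(G)}

def Gamma(g):                         # Gamma(g)_kj = 1 if g g_j = g_k, else 0
    M = [[0] * 6 for _ in range(6)]
    for j, gj in enumerate(G):
        M[idx[compose(g, gj)]][j] = 1
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(6)) for j in range(6)]
            for i in range(6)]

e = (1, 2, 3)
# representation property: Gamma(gh) = Gamma(g) Gamma(h)
for g in G:
    for h in G:
        assert Gamma(compose(g, h)) == matmul(Gamma(g), Gamma(h))
# the regular character: chi_reg(e) = nG and chi_reg(g) = 0 otherwise
assert sum(Gamma(e)[i][i] for i in range(6)) == 6
assert all(sum(Gamma(g)[i][i] for i in range(6)) == 0 for g in G if g != e)
```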
If we now combine this result with equation (10) we find
aµ = (1/nG) Σg χµ(g)∗ χreg(g) = (1/nG) χµ(e) nG = nµ ,
and hence we have proved one of the most important properties of the regular
representation.
Theorem 14. The regular representation contains each irreducible representation a
number of times aµ equal to the dimensionality nµ of the irreducible representation.
Recall that following the proof of Theorem 13 we showed that Σµ nµ² ≤ nG . We are now able to show that in fact equality holds in this relation. If you think about the block diagonal form of the regular representation matrices, the µ-representation with dimension nµ occurs nµ times, and thus takes up nµ² rows (or columns) of each matrix. Since each matrix is of size nG (the dimension of the regular representation), we must have

Σµ nµ² = nG .     (11)
As another application of Theorem 14, we see directly from equation (9) that

χreg(g) = Σµ nµ χµ(g) = nG if g = e, and 0 if g ≠ e .
Now we are finally in a position to prove the last important orthogonality relation. Starting from equation (6)

ni nj χµ(Ci)χµ(Cj) = nµ Σk cijk nk χµ(Ck)

we sum over all irreps µ and use the fact that e = [e] = C1 and e ∉ Ck for k ≠ 1 to obtain

ni nj Σµ χµ(Ci)χµ(Cj) = Σk cijk nk Σµ nµ χµ(Ck) = cij1 n1 nG .

But n1 = 1, and from equation (5) we know that cij1 = ni δji′ so we have

nj Σµ χµ(Ci)χµ(Cj) = nG δji′ .
Since Cj′ is the class of all inverse elements to Cj, if we take each irrep Γµ to be
unitary, then Γ(g−1) = Γ(g)−1 = Γ(g)† so that

    χµ(Cj) = χµ(Cj′)∗.

Using nj = nj′ we can now write

    nj′ Σµ χµ(Ci)χµ(Cj′)∗ = nG δji′.
Now relabel by letting j′ = k or, equivalently, j = k′. Noting that δk′i′ = δki we
arrive at our desired result

    Σµ χµ(Ck)∗ χµ(Ci) = (nG/nk) δki.                (12)
At last we are in a position to prove our assertion that the number of classes
is equal to the number of inequivalent irreducible representations. Start by letting
µ = ν in equation (8):

    Σk nk χµ(Ck)∗ χµ(Ck) = nG.
Suppose there are nc classes and nr irreps. Summing this last equation over all
irreps yields

    Σµ=1..nr Σk=1..nc nk χµ(Ck)∗ χµ(Ck) = nG nr.

Similarly, set k = i in equation (12), multiply by nk, and sum over classes to obtain

    Σµ=1..nr Σk=1..nc nk χµ(Ck)∗ χµ(Ck) = nG nc.
Comparison of these last two equations shows that nr = nc , and hence we have
proved
Theorem 15. For a finite group G, the number of inequivalent irreducible representations is equal to the number of classes.
Example 26. If a group is abelian, then each element forms its own class. Therefore
an abelian group of order nG has nG one-dimensional (irreducible) representations.
But these alone already give Σµ nµ2 = nG, so by equation (11) there can be no other
irreps. This is the same result derived in Example 13 using Schur’s lemma 1.
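As a concrete check of Example 26 (an illustrative sketch, not part of the original notes), for the cyclic group of order n the one-dimensional irreps are the familiar characters χµ(k) = ω^(µk) with ω = e^(2πi/n), and their orthogonality can be verified directly:

```python
import numpy as np

# The n_G = 4 one-dimensional irreps of the abelian group Z_4
# (illustrative choice): chi^mu(k) = omega^(mu*k), omega = exp(2*pi*i/4).
nG = 4
omega = np.exp(2j * np.pi / nG)
table = np.array([[omega ** (mu * k) for k in range(nG)]
                  for mu in range(nG)])

# Row orthogonality: sum_k chi^mu(k)* chi^nu(k) = nG * delta_{mu,nu}
# (each class has a single element, so all n_k = 1)
gram = table.conj() @ table.T
assert np.allclose(gram, nG * np.eye(nG))
```

Since every class is a singleton here, the nG × nG table is square and the nG irreps exhaust equation (11), in agreement with Theorem 15.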
The main use of these orthogonality equations is that they allow us to construct
character tables with entries consisting of the characters of each class within each
irrep. Since the number of irreps is equal to the number of classes, we can construct
an nc × nc “matrix” with rows labeled by the irreps Dµ and columns labeled by
classes nk Ck where we include the number of elements in each class in our label
(but not in the corresponding entry).
Example 27. One consequence of equation (11) (at least for smaller groups) is
that there is often a unique solution to Σµ nµ2 = nG. For example, a group of
order 6 must have two one-dimensional irreps and a single two-dimensional irrep,
because 12 + 12 + 22 = 6 is the only decomposition of 6 as a sum of squares (not
counting the sum of six 12 ’s).
Now, we know that any group always has the trivial one-dimensional representation D1 (g) = 1. Referring to Examples 17 and 22, we see that another one-dimensional representation may be defined by D2 (g) = det g, and we have also
explicitly shown a two-dimensional representation from which you can find the
characters (the trace of each matrix). Thus we have the following character table:
         C1   3C2   2C3
    D1    1    1     1
    D2    1   −1     1
    D3    2    0    −1
Note that the rows are orthogonal according to equation (8), and the columns are
orthogonal according to equation (12).
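Both orthogonality properties of this table (and the dimensionality sum (11)) can be confirmed in a few lines; the following sketch encodes the table exactly as printed above:

```python
import numpy as np

# The character table above: rows D1, D2, D3; columns C1, 3C2, 2C3.
chi = np.array([[1,  1,  1],
                [1, -1,  1],
                [2,  0, -1]], dtype=float)
nk = np.array([1, 3, 2])   # class sizes
nG = nk.sum()              # 6

# Row orthogonality, equation (8):
#   sum_k n_k chi^mu(C_k)* chi^nu(C_k) = nG delta_{mu,nu}
assert np.allclose(chi @ np.diag(nk) @ chi.T, nG * np.eye(3))

# Column orthogonality, equation (12):
#   sum_mu chi^mu(C_k)* chi^mu(C_i) = (nG/n_k) delta_{k,i}
assert np.allclose(chi.T @ chi, np.diag(nG / nk))

# Dimensionality sum, equation (11), with n_mu = chi^mu(C_1)
assert (chi[:, 0] ** 2).sum() == nG
```

Note how the class sizes nk weight the row relation but appear only in the normalization of the column relation, exactly as in equations (8) and (12).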
Even though a character table contains far less information than an entire set
of irrep matrices, it is often enough for the problem at hand. In fact, for the
simple groups usually of interest, it is possible to fill in the table without even
constructing an explicit matrix representation of the group. The general procedure
is the following:
1. Find the number of classes by using the group multiplication table (or physical
considerations if possible).
2. Find the dimensionalities nµ from Σµ nµ2 = nG. In simple cases this usually
has a unique solution.

   Since the identity element is represented by the identity matrix, the character
   (trace) of the identity class gives χµ(e) = nµ, which fixes the first column
   of the table. In addition, since we always have the trivial representation, we
   know that the first row always has χ1 (Ck ) = 1.
3. The rows obey the orthogonality relation (equation (8))

       Σk nk χµ(Ck)∗ χν(Ck) = nG δµν.
4. The columns obey the orthogonality relation (equation (12))

       Σµ χµ(Ck)∗ χµ(Ci) = (nG/nk) δki.
5. Entries within the µth row are related by (equation (6))

       ni nj χµ(Ci)χµ(Cj) = nµ Σk cijk nk χµ(Ck).
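Step 5 can also be checked numerically. The sketch below uses the order-6 group of Example 27; the class multiplication constants cijk entered by hand are the standard ones for that group (stated here as an assumption, not derived in this sketch):

```python
import numpy as np

# Character table of the order-6 group of Example 27 and its class sizes:
# C1 = {e}, C2 = the three order-2 elements, C3 = the two order-3 elements.
chi = np.array([[1,  1,  1],
                [1, -1,  1],
                [2,  0, -1]], dtype=float)
nk = np.array([1, 3, 2])
nmu = chi[:, 0]                 # dimensions n_mu = chi^mu(C1)

# Class multiplication constants c_ijk (assumed, from the class algebra):
c = np.zeros((3, 3, 3))
c[0, 0, 0] = c[0, 1, 1] = c[0, 2, 2] = 1   # C1 Cj = Cj
c[1, 0, 1] = c[2, 0, 2] = 1                # Cj C1 = Cj
c[1, 1, 0], c[1, 1, 2] = 3, 3              # C2 C2 = 3C1 + 3C3
c[1, 2, 1] = c[2, 1, 1] = 2                # C2 C3 = C3 C2 = 2C2
c[2, 2, 0], c[2, 2, 2] = 2, 1              # C3 C3 = 2C1 + C3

# Equation (6): n_i n_j chi^mu(C_i) chi^mu(C_j)
#             = n_mu sum_k c_ijk n_k chi^mu(C_k), for every mu, i, j
for mu in range(3):
    for i in range(3):
        for j in range(3):
            lhs = nk[i] * nk[j] * chi[mu, i] * chi[mu, j]
            rhs = nmu[mu] * sum(c[i, j, k] * nk[k] * chi[mu, k]
                                for k in range(3))
            assert np.isclose(lhs, rhs)
```

In practice this relation is used the other way around: given the cijk from the group multiplication table, it constrains the unknown entries within each row of the character table.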