An Introduction to Finite Groups
Physics 5040, Spring 2009

1 Definitions

A group $(G, \diamond)$ is a nonempty set $G$ together with a binary operation called multiplication (or a product) and denoted by $\diamond$ that obeys the following axioms:

(G1) $a, b \in G$ implies $a \diamond b \in G$ (closure);
(G2) $a, b, c \in G$ implies $(a \diamond b) \diamond c = a \diamond (b \diamond c)$ (associativity);
(G3) There exists $e \in G$ such that $a \diamond e = e \diamond a = a$ for all $a \in G$ (identity);
(G4) For each $a \in G$, there exists $a^{-1} \in G$ such that $a \diamond a^{-1} = a^{-1} \diamond a = e$ (inverse).

Furthermore, a group is said to be abelian if it also has the property that

(G5) $a \diamond b = b \diamond a$ for all $a, b \in G$ (commutativity).

In the case of abelian groups, the group multiplication operation is frequently denoted by $+$ and called addition. We will generally simplify our notation by leaving out the group multiplication symbol and assuming that it is understood for the particular group under discussion.

The number of elements in a group $G$ is called its order and will be denoted by $n_G$. If this number is finite, then we say that $G$ is a finite group. Otherwise, $G$ is said to be infinite. It is also a fact that every group of order less than or equal to 5 must be abelian, and hence that the smallest non-abelian group is of order 6.

While we have defined a group in the usual manner, it should be realized that there is a certain amount of redundancy in our definition. In particular, it is not necessary to require that a "right inverse" also be the "left inverse." To see this, suppose that for any $a \in G$, we have the right inverse defined by $aa^{-1} = e$. Then multiplying from the left by $a^{-1}$ yields $a^{-1}aa^{-1} = a^{-1}$. But $a^{-1} \in G$ so there exists an $(a^{-1})^{-1} \in G$ such that $(a^{-1})(a^{-1})^{-1} = e$. Multiplying our previous expression from the right by $(a^{-1})^{-1}$ results in $a^{-1}a = e$, and hence we see that $a^{-1}$ is also a left inverse. Of course, we could have started with a left inverse and shown that it is also a right inverse.
Similarly, we could have defined a right identity by $ae = a$ for all $a \in G$. We then observe that $a = ae = a(a^{-1}a) = (aa^{-1})a = ea$, and hence $e$ is also a left identity.

It is easy to show that the identity element is unique. To see this, suppose that there exist $e, e' \in G$ such that for every $a \in G$ we have $ea = ae = a = e'a = ae'$. Since $ea = a$ for every $a \in G$, we have in particular that $ee' = e'$. On the other hand, since we also have $ae' = a$, it follows that $ee' = e$. Therefore $e' = ee' = e$ so that $e = e'$.

Before showing the uniqueness of the inverse, we first prove an important basic result. Suppose that $ax = ay$ for $a, x, y \in G$. Let $a^{-1}$ be a (not necessarily unique) inverse to $a$. Then

$$x = ex = (a^{-1}a)x = a^{-1}(ax) = a^{-1}(ay) = (a^{-1}a)y = ey = y.$$

In other words, the equation $ax = ay$ means that $x = y$. This is sometimes called the (left) cancellation law. As a special case, we see that if $aa^{-1} = e = aa'^{-1}$, then this implies $a^{-1} = a'^{-1}$ so that the inverse is indeed unique as claimed. This also shows that $(a^{-1})^{-1} = a$ since $(a^{-1})^{-1}(a^{-1}) = e$ and $aa^{-1} = e$.

Finally, another important result follows by noting that

$$(ab)(b^{-1}a^{-1}) = a((bb^{-1})a^{-1}) = a(ea^{-1}) = aa^{-1} = e.$$

Since the inverse is unique, we then see that $(ab)^{-1} = b^{-1}a^{-1}$. This clearly extends by induction to any finite product of group elements.

Example 1. The set of integers $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$ forms an infinite abelian group where the group multiplication operation is just ordinary addition. It should be obvious that the (additive) identity element is 0, and the inverse of any number $n$ is given by $-n$. However, it is easy to see that $\mathbb{Z}$ is not a group under the operation of ordinary multiplication. Indeed, while $\mathbb{Z}$ is both closed and associative under multiplication, and it also contains the (multiplicative) identity element 1, no element of $\mathbb{Z}$ (other than $\pm 1$) has a multiplicative inverse in $\mathbb{Z}$ (for example, $2^{-1} = 1/2 \notin \mathbb{Z}$).
On the other hand, if we consider the set $\mathbb{Q}$ of all rational numbers, then $\mathbb{Q}$ forms a group under ordinary addition (with identity element 0 and inverse $-p/q \in \mathbb{Q}$ to any $p/q \in \mathbb{Q}$). Moreover, the nonzero elements of $\mathbb{Q}$ also form a group under ordinary multiplication (with identity element 1 and inverse $q/p \in \mathbb{Q}$ to any $p/q \in \mathbb{Q}$).

Example 2. The cyclic groups of order $n$ are the groups $C_n$ of the form $\{e, a, a^2, a^3, \ldots, a^{n-1}\}$ with $a^n = e$. They are all necessarily abelian, and the simplest non-cyclic group is of order 4.

Example 3. A more complicated (but quite useful) example is given by the set of all rotations in the $xy$-plane. Consider the following figure that shows a vector $\mathbf{r} = (x, y)$ making an angle $\phi$ with the $x$-axis, and a vector $\mathbf{r}' = (x', y')$ making an angle $\theta + \phi$ with the $x$-axis:

[Figure: the vectors $\mathbf{r} = (x, y)$ and $\mathbf{r}' = (x', y')$ in the $xy$-plane; $\mathbf{r}$ makes an angle $\phi$ with the $x$-axis and $\mathbf{r}'$ is rotated counterclockwise from $\mathbf{r}$ by $\theta$.]

We assume $r = \|\mathbf{r}\| = \|\mathbf{r}'\|$ so that the vector $\mathbf{r}'$ results from a counterclockwise rotation by an angle $\theta$ with respect to the vector $\mathbf{r}$. From the figure, we see that $\mathbf{r}'$ has components $x'$ and $y'$ given by

$$x' = r\cos(\theta + \phi) = r\cos\theta\cos\phi - r\sin\theta\sin\phi = x\cos\theta - y\sin\theta$$
$$y' = r\sin(\theta + \phi) = r\sin\theta\cos\phi + r\cos\theta\sin\phi = x\sin\theta + y\cos\theta.$$

Let $R(\alpha)$ denote a counterclockwise rotation by an angle $\alpha$. It should be clear that $R(0)$ is just the identity rotation (i.e., no rotation at all), and that the inverse is given by $R(\alpha)^{-1} = R(-\alpha)$. With these definitions, it is easy to see that the set of all rotations in the plane forms an infinite (actually, continuous) abelian group. A convenient way of describing these rotations is with the matrix

$$R(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}.$$

(As we will see below, such a matrix is said to form a representation of the rotation group.) We then see that $\mathbf{r}' = R(\theta)\mathbf{r}$, which in matrix notation is just

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

Using this notation, it is easy to see that $R(0)$ is the identity since

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

and also that $R(\theta)^{-1} = R(-\theta)$ because

$$R(\theta)R(-\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = R(-\theta)R(\theta).$$

We remark that while the rotation group in two dimensions is abelian, the rotation group in three dimensions is not. For example, let $R_z(\theta)$ denote a rotation by $\theta$ about the $z$-axis (in the "right-handed sense"), and similarly for $R_y(\theta)$. Then, applied to the unit vector $\hat{x}$ lying along the $x$-axis, we see that

$$R_y(90^\circ)R_z(45^\circ)\hat{x} \neq R_z(45^\circ)R_y(90^\circ)\hat{x}$$

since in the second case, the result lies along the $z$-axis, while in the first case it does not.

Now that we know the inverse is unique, let us take another look at the (left) cancellation law. We restate it as the following useful result, often called the rearrangement lemma.

Lemma. If $a, b, c \in G$ and $ca = cb$, then $a = b$.

Proof. Since $G$ contains $c^{-1}$ by definition, simply multiply the equation from the left by $c^{-1}$.

What this result means is that if $a$ and $b$ are distinct elements of $G$, then so are $ca$ and $cb$. The importance of this comes from observing that if all the group elements are arranged in a sequence, say $\{g_1, \ldots, g_n\}$, then multiplying from the left by an element $h$ results in the sequence $\{hg_1, \ldots, hg_n\}$ which is the same as the original sequence except for order. (Of course, the same result holds equally well for multiplication from the right.)

Since each element $hg_i$ is determined by the group multiplication rule, let us write $hg_i = g_{h_i}$ where $h_i$ is the integer label of this particular element. (For example, if $hg_2 = g_5$, then $h_2 = 5$.) Since the rearrangement lemma tells us that $g_{h_i}$ and $g_{h_j}$ are distinct if $i \neq j$, it follows that the numbers $(h_1, \ldots, h_n)$ are simply a permutation of $(1, \ldots, n)$. In other words, there is a natural correspondence between group elements $h$ and the permutation characterized by $(h_1, \ldots, h_n)$.
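The labels $(h_1, \ldots, h_n)$ can be computed explicitly for a small group. The sketch below is only an illustration: it assumes the group of integers $\{0, 1, 2, 3\}$ under addition modulo 4 (a choice made here, not taken from the text), and the names `elements`, `mul` and `left_multiplication_labels` are hypothetical helpers.

```python
# Illustrative group (an assumption): {0, 1, 2, 3} under addition modulo 4.
elements = [0, 1, 2, 3]  # g_1, ..., g_4 in a fixed order


def mul(a, b):
    """Group product for this example: addition modulo 4."""
    return (a + b) % 4


def left_multiplication_labels(h):
    """Return (h_1, ..., h_n) defined by h * g_i = g_{h_i}, with 1-based labels."""
    return tuple(elements.index(mul(h, g)) + 1 for g in elements)


labels = {h: left_multiplication_labels(h) for h in elements}

for h, row in labels.items():
    # Rearrangement lemma: each row is a reordering of (1, ..., n).
    assert sorted(row) == list(range(1, len(elements) + 1))
    print(h, row)
```

Each printed row is a rearrangement of (1, 2, 3, 4), and distinct elements give distinct rows, which is the correspondence described above.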
To elaborate on this, let us denote an arbitrary permutation of $n$ objects by

$$p = \begin{pmatrix} 1 & 2 & \cdots & n \\ p_1 & p_2 & \cdots & p_n \end{pmatrix}$$

where each entry in the first row is to be replaced by the corresponding entry below it. The set of all $n!$ permutations of $n$ objects forms a group called the permutation group or the symmetric group and is denoted by $S_n$. It is clear that one permutation followed by another permutation is a third permutation, and this defines the group multiplication. (By convention, multiplication proceeds from right to left.) The identity corresponds to no permutation at all, and is represented by

$$e = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}$$

while the inverse of $p$ is just

$$p^{-1} = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \\ 1 & 2 & \cdots & n \end{pmatrix}.$$

Example 4. Let us give an example that demonstrates the multiplication of permutations. Consider the permutations $p, q \in S_6$ defined by

$$p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 6 & 1 & 3 & 4 & 5 \end{pmatrix} \qquad q = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 1 & 6 & 3 & 4 & 5 \end{pmatrix}.$$

To evaluate the product $pq$, we first note that $q$ takes $1 \to 2$ and then $p$ takes $2 \to 6$, so the product $pq$ takes $1 \to 2 \to 6$ or simply $1 \to 6$. Next, $q$ takes $2 \to 1$ while $p$ takes $1 \to 2$ so that $pq$ takes $2 \to 2$. Continuing in this manner, we see that

$$pq = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 2 & 5 & 1 & 3 & 4 \end{pmatrix}.$$

Notice that in this example, the final answer can be written as a product of two disjoint permutations (i.e., they permute different sets of numbers):

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 2 & 5 & 1 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 2 & 3 & 1 & 5 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 1 & 2 & 5 & 4 & 3 & 6 \end{pmatrix}.$$

Also notice that since the two permutations on the right are disjoint, the order in which you multiply them doesn't matter.

A simpler notation for permutations is the so-called cycle notation, which in this case for the result $pq$ would be the product of a 3-cycle, a 2-cycle and a 1-cycle written as

$$(164)(35)(2).$$

Here the 1-cycle $(2)$ is to be interpreted as leaving 2 unchanged, the 2-cycle $(35)$ as $3 \to 5$ and $5 \to 3$, and the 3-cycle $(164)$ as $1 \to 6$, $6 \to 4$ and $4 \to 1$.
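The product in Example 4 can be checked mechanically. The following is a minimal sketch, assuming permutations are stored as Python dictionaries mapping each number to its image; the helper names `compose` and `cycles` are hypothetical and chosen only for this illustration.

```python
def compose(p, q):
    """Return the product pq: apply q first, then p (right to left)."""
    return {i: p[q[i]] for i in q}


def cycles(perm):
    """Split a permutation (dict i -> perm[i]) into disjoint cycles."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, i = [], start
        # Follow start -> perm[start] -> ... until the path closes up.
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        result.append(tuple(cycle))
    return result


# Bottom rows of p and q from Example 4.
p = dict(zip(range(1, 7), (2, 6, 1, 3, 4, 5)))
q = dict(zip(range(1, 7), (2, 1, 6, 3, 4, 5)))

pq = compose(p, q)
print([pq[i] for i in range(1, 7)])  # bottom row of pq
print(cycles(pq))                    # disjoint cycles of pq
```

The same `cycles` helper applied to `p` and `q` gives their cycle forms, and swapping the arguments of `compose` shows that $qp$ differs from $pq$ in general.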
In other words, each cycle only includes those numbers in the permutation that are actually permuted within themselves, starting with one number and following its path until you return to the starting point. Thus $(164)$ is the same as $(641)$ or $(416)$ (but not $(614)$ or $(146)$ and so forth).

In order to describe how to decompose a permutation into a product of cycles, it is easiest to simply give an example. Consider the permutation $p \in S_6$ shown in Example 4. Starting with 1, repeatedly apply $p$ until you get back to 1 again. Since there are only finitely many numbers being permuted, this can only take a finite number of steps. In this case, we have $1 \to 2 \to 6 \to 5 \to 4 \to 3 \to 1$ so we can write $p$ as the 6-cycle $p = (126543)$. Since all 6 numbers in $p$ are accounted for, we are done in this case.

Now look at $q$. Again starting from 1 we have $1 \to 2 \to 1$ so we have the 2-cycle $(12)$. Now go to the next number not included so far, which in this case is 3. Then we have $3 \to 6 \to 5 \to 4 \to 3$ which gives us the 4-cycle $(3654)$ and we are done. Therefore we can write $q = (3654)(12)$.

For the product $pq$ we see that $1 \to 6 \to 4 \to 1$ so we have the 3-cycle $(164)$, then $2 \to 2$ gives $(2)$ and $3 \to 5 \to 3$ gives the 2-cycle $(35)$. Therefore $pq = (164)(2)(35)$ as we stated above.

The multiplication of cycles is also very straightforward if you just think about how permutations are multiplied (and remember that multiplication proceeds from right to left). For example, consider the product $(12)(23)$ of 2-cycles in $S_3$. The rightmost cycle $(23)$ acts first and says $2 \to 3$, while $(12)$ doesn't do anything to 3, so we have $2 \to 3$. Next, $(23)$ says $3 \to 2$ and then $(12)$ says $2 \to 1$, so we have $3 \to 1$. Finally, $(23)$ leaves 1 alone and $(12)$ says $1 \to 2$. Putting these together we have the result $(231) = (123)$.

As another example, consider the product $(234)(123)$ of 3-cycles in $S_4$. The cycle $(123)$ acts first, so we have $1 \to 2$ followed by $2 \to 3$ for the term $1 \to 3$. Then we have $2 \to 3$ followed by $3 \to 4$ for the term $2 \to 4$. Next, $(123)$ takes $3 \to 1$ and $(234)$ does nothing to 1, so we have $3 \to 1$. Finally, $(123)$ does nothing to 4 and $(234)$ takes $4 \to 2$, so we have $4 \to 2$.
Thus we are left with the result $(13)(24)$. Keep in mind that if you have any doubt, you can always write out the complete permutations and multiply them. And again I emphasize that the order in which disjoint cycles are multiplied is irrelevant. In any case, this notation will greatly simplify some of our examples that use the group $S_n$, as we now show.

First, a subset $H$ of a group $G$ is said to be a subgroup of $G$ if $H$ also forms a group under the same multiplication rule as $G$.

Example 5. Consider the group $S_3$ of order $3! = 6$. I leave it as an exercise to verify that each of the following four subsets forms a subgroup of $S_3$:

$$\{e, (12)\} \qquad \{e, (23)\} \qquad \{e, (31)\} \qquad \{e, (123), (321)\}.$$

Be sure to note that $(123) = (312) \neq (321)$.

2 Homomorphisms

Let $\phi : G \to G'$ be a mapping from a group $G$ to a group $G'$. If for every $a, b \in G$ we have

$$\phi(ab) = \phi(a)\phi(b)$$

then $\phi$ is said to be a homomorphism, and the groups $G$ and $G'$ are said to be homomorphic. In other words, a homomorphism preserves group multiplication, but is not in general either surjective or injective. It should also be noted that the product $ab$ is an element of $G$ while the product $\phi(a)\phi(b)$ is an element of $G'$.

Example 6. Let $G$ be the (abelian) group of all real numbers under addition, and let $G'$ be the group of nonzero real numbers under multiplication. If we define $\phi : G \to G'$ by $\phi(x) = 2^x$, then

$$\phi(x + y) = 2^{x+y} = 2^x 2^y = \phi(x)\phi(y)$$

so that $\phi$ is indeed a homomorphism.

Example 7. Let $G$ be the group of all real (or complex) numbers under ordinary addition. For any real (or complex) number $a$, we define the mapping $\phi$ of $G$ into itself by $\phi(x) = ax$. This $\phi$ is clearly a homomorphism since

$$\phi(x + y) = a(x + y) = ax + ay = \phi(x) + \phi(y).$$

However, if $b$ is any other nonzero real (or complex) number, then you can easily show that the ("non-homogeneous") mapping $\psi(x) = ax + b$ is not a homomorphism.

Let $e$ be the identity element of $G$, and let $e'$ be the identity element of $G'$.
If $\phi : G \to G'$ is a homomorphism, then

$$\phi(g)e' = \phi(g) = \phi(ge) = \phi(g)\phi(e),$$

and we have the important result

$$\phi(e) = e'.$$

Using this result, we then see that

$$e' = \phi(e) = \phi(gg^{-1}) = \phi(g)\phi(g^{-1}),$$

and hence the uniqueness of the inverse tells us that

$$\phi(g^{-1}) = \phi(g)^{-1}.$$

It is very important to note that in general $\phi(g)^{-1} \neq \phi^{-1}(g)$ since if $g \in G$ we have $\phi(g)^{-1} \in G'$ while if $g \in G'$, then $\phi^{-1}(g) \in G$.

It is now easy to see that the homomorphic image of $G$ is necessarily a subgroup of $G'$. Closure is obvious because $\phi(a)\phi(b) = \phi(ab)$. There is an identity because $\phi(e) = e' \in G'$. There is also an inverse because if $\phi(g) \in G'$, then $\phi(g)^{-1} = \phi(g^{-1}) \in G'$. And associativity holds because

$$[\phi(a)\phi(b)]\phi(c) = \phi(ab)\phi(c) = \phi(abc) = \phi(a)\phi(bc) = \phi(a)[\phi(b)\phi(c)].$$

In general, there may be many elements $g \in G$ that map into the same element $g' \in G'$ under $\phi$. It is of particular interest to see what happens if some element of $G$ besides $e$ maps into $e'$. If $k \in G$ is such that $\phi(k) = e'$, then for any $g \in G$ we have $\phi(gk) = \phi(g)\phi(k) = \phi(g)e' = \phi(g)$. Therefore, if $gk \neq g$ (that is, if $k \neq e$) we see that $\phi$ could not possibly be a one-to-one mapping. To help us see just when a homomorphism is one-to-one, we define the kernel of $\phi$ to be the set

$$\operatorname{Ker} \phi = \{g \in G : \phi(g) = e'\}.$$

It is easy to see that $\operatorname{Ker} \phi$ is a subgroup of $G$.

If a homomorphism $\phi : G \to G'$ is one-to-one (i.e., injective), we say that $\phi$ is an isomorphism. If, in addition, $\phi$ is also onto (i.e., surjective), then we say that $G$ and $G'$ are isomorphic. In other words, $G$ and $G'$ are isomorphic if $\phi$ is a bijective homomorphism. (We point out that many authors use the word "isomorphism" to implicitly mean that $\phi$ is a bijection.) In particular, an isomorphism of a group onto itself is called an automorphism.

From the definition, it appears that there is a relationship between the kernel of a homomorphism and whether or not it is an isomorphism. We now proceed to show that this is indeed the case.
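A small finite case makes the link between the kernel and injectivity concrete. The sketch below is an illustration under assumptions not found in the text: it takes $G$ to be the integers modulo 6 under addition, $G'$ the integers modulo 3 under addition, and $\phi(x) = x \bmod 3$; the names `phi`, `kernel` and `injective` are hypothetical.

```python
n, m = 6, 3  # illustrative choice (an assumption): phi maps Z_6 to Z_3


def phi(x):
    """Reduce modulo 3; well defined on Z_6 because 3 divides 6."""
    return x % m


G = range(n)

# Homomorphism property phi(a + b) = phi(a) + phi(b),
# with the sums taken in Z_6 on the left and in Z_3 on the right.
assert all(phi((a + b) % n) == (phi(a) + phi(b)) % m for a in G for b in G)

# Kernel: everything sent to the identity 0 of Z_3.
kernel = [g for g in G if phi(g) == 0]

# phi is injective exactly when the 6 elements have 6 distinct images.
injective = len({phi(g) for g in G}) == n

print(kernel, injective)
```

Here the kernel contains an element other than the identity, and correspondingly the map fails to be one-to-one.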
By way of notation, if $H$ is a subset of a group $G$, then by $Hg$ we mean the set $Hg = \{hg \in G : h \in H\}$. In the particular case that $H$ is a subgroup of $G$, then $Hg$ is called a right coset of $H$ in $G$. (And left cosets are defined in the obvious way.) Recall also that if $\phi : G \to G'$ and $g' \in G'$ then, by an inverse image of $g'$, we mean any element $g \in G$ such that $\phi(g) = g'$.

Theorem 1. Let $\phi$ be a homomorphism of a group $G$ onto a group $G'$, and let $K_\phi$ be the kernel of $\phi$. Then given any $g' \in G'$, the set of all inverse images of $g'$ is given by $K_\phi g$ where $g \in G$ is any particular inverse image of $g'$.

Proof. Consider any $k \in K_\phi$. Then by definition of homomorphism, we must have

$$\phi(kg) = \phi(k)\phi(g) = e'g' = g'.$$

In other words, if $g$ is any inverse image of $g'$, then so is any $kg \in K_\phi g$. We must be sure that there is no other element $h \in G$, $h \notin K_\phi g$ with the property that $\phi(h) = g'$. To see that this is true, suppose $\phi(h) = g' = \phi(g)$. Then $\phi(h) = \phi(g)$ implies

$$e' = \phi(h)\phi(g)^{-1} = \phi(h)\phi(g^{-1}) = \phi(hg^{-1}).$$

But this means that $hg^{-1} \in K_\phi$, and hence $hg^{-1} = k$ for some $k \in K_\phi$. Therefore $h = kg \in K_\phi g$ and must have already been taken into account.

Corollary. A homomorphism $\phi$ mapping a group $G$ to a group $G'$ is an isomorphism if and only if $\operatorname{Ker} \phi = \{e\}$.

Proof. Note that if $\phi(G) \neq G'$, then we may apply Theorem 1 to $G$ and $\phi(G)$. In other words, it is trivial that $\phi$ always maps $G$ onto $\phi(G)$. Now, if $\phi$ is an isomorphism, then it is one-to-one by definition, so that there can be no element of $G$ other than $e$ that maps into $e'$. Conversely, if $\operatorname{Ker} \phi = \{e\}$ then Theorem 1 shows that any $x' \in \phi(G) \subset G'$ has exactly one inverse image.

Of course, if $\phi$ is surjective, then $\phi(G)$ is just equal to $G'$. In other words, we may think of isomorphic groups as being essentially identical to each other.

Example 8. Let $G$ be any group, and let $g \in G$ be fixed. We define the mapping $\phi : G \to G$ by $\phi(a) = gag^{-1}$, and we claim that $\phi$ is an automorphism.
To see this, first note that $\phi$ is indeed a homomorphism since for any $a, b \in G$ we have

$$\phi(ab) = g(ab)g^{-1} = g(aeb)g^{-1} = g(ag^{-1}gb)g^{-1} = (gag^{-1})(gbg^{-1}) = \phi(a)\phi(b).$$

To see that $\phi$ is surjective, simply note that for any $b \in G$ we may define $a = g^{-1}bg$ so that $\phi(a) = b$. Next, we observe that if $\phi(a) = gag^{-1} = e$, then right-multiplying by $g$ and left-multiplying by $g^{-1}$ yields

$$a = (g^{-1}g)a(g^{-1}g) = g^{-1}eg = e$$

and hence $\operatorname{Ker} \phi = \{e\}$. From the corollary to Theorem 1, we now see that $\phi$ must be an isomorphism.

Our next important result is called Cayley's theorem.

Theorem 2. Every group of order $n$ is isomorphic to a subgroup of $S_n$.

Note that $G$ has $n$ elements while $S_n$ has $n!$ elements.

Proof. Let us define the mapping $\phi : G \to S_n$ by

$$\phi : a \in G \longrightarrow \phi(a) := p_a := \begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix} \in S_n$$

where the indices $a_i$ are determined by the definition (due to the rearrangement lemma) $ag_i = g_{a_i}$. If we can show that $\phi$ is a homomorphism, then from $\phi(a) = p_a$, we see that if $p_a$ is the identity in $S_n$, then we must have $a = e$ so that $\operatorname{Ker} \phi = \{e\}$ and $\phi$ is in fact an isomorphism.

Let $ab = c$ in $G$. First note that

$$g_{c_i} = cg_i = (ab)g_i = a(bg_i) = ag_{b_i} = g_{a_{b_i}}$$

so that $c_i = a_{b_i}$. Now observe that (since $(b_1, b_2, \ldots, b_n)$ is just some permutation of $(1, 2, \ldots, n)$)

$$\phi(a)\phi(b) = p_a p_b = \begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix} \begin{pmatrix} 1 & 2 & \cdots & n \\ b_1 & b_2 & \cdots & b_n \end{pmatrix}$$
$$= \begin{pmatrix} b_1 & b_2 & \cdots & b_n \\ a_{b_1} & a_{b_2} & \cdots & a_{b_n} \end{pmatrix} \begin{pmatrix} 1 & 2 & \cdots & n \\ b_1 & b_2 & \cdots & b_n \end{pmatrix} = \begin{pmatrix} 1 & 2 & \cdots & n \\ a_{b_1} & a_{b_2} & \cdots & a_{b_n} \end{pmatrix}$$
$$= p_c = p_{ab} = \phi(ab).$$

Thus $\phi$ is a homomorphism as claimed.

Example 9. We show that the cyclic group $C_3 = \{e, a, b = a^2\}$ of order 3 is isomorphic to the subgroup of $S_3$ defined by $\{e, (123), (321)\}$. (See Examples 2 and 5.) Let us relabel the elements of $C_3$ as $(g_1, g_2, g_3)$. Multiplying from the left by $g_1 = e$ leaves the ordered set unchanged, so $eg_i = g_{e_i} = g_i$ which corresponds to the identity permutation $p_e = (1)(2)(3) \in S_3$.
Next, multiplying by $g_2 = a$ we obtain the ordered set $(a, b, e) = (g_2, g_3, g_1)$ so that $(a_1, a_2, a_3) = (2, 3, 1)$ which corresponds to (in cycle notation)

$$a \in C_3 \to p_a = (123) \in S_3.$$

Finally, multiplying by $g_3 = b = a^2$ we obtain $(b, e, a) = (g_3, g_1, g_2)$ so that $(b_1, b_2, b_3) = (3, 1, 2)$ which corresponds to

$$b \in C_3 \to p_b = (132) = (321) \in S_3.$$

To verify the homomorphism property, consider, for example

$$p_a p_b = (123)(132) = (1)(3)(2) = p_e = p_{ab}$$

and

$$p_a p_a = (123)(123) = (132) = p_b = p_{aa}.$$

3 Representations

We know that the composition of linear transformations on a vector space $V$ is associative, but not necessarily commutative (think of matrix multiplication), and hence it is really just a group multiplication. A set of nonsingular linear transformations on $V$ that is closed with respect to compositions forms a group of linear transformations, or a group of operators.

If there is a homomorphism $\phi$ from a group $G$ to a set $U(G)$ of operators on a vector space $V$, then $U(G)$ is said to form a representation of $G$. In other words, we have $\phi(g) = U(g)$ for each $g \in G$, and

$$U(g_1 g_2) = U(g_1)U(g_2).$$

The dimension of the representation is the dimension of $V$. If the homomorphism is also an isomorphism, then the representation is said to be faithful. A representation that is not faithful is sometimes said to be degenerate.

Note that by definition the $U(g)$'s form a group, and hence they must be nonsingular operators. Just as we showed for homomorphisms in general, given any $U(g)$, there exists $U(g)^{-1}$ so that

$$U(g) = U(ge) = U(g)U(e)$$

and therefore we always have

$$U(e) = 1.$$

Now we observe that

$$1 = U(e) = U(gg^{-1}) = U(g)U(g^{-1})$$

and thus

$$U(g^{-1}) = U(g)^{-1}.$$

In particular, if the representation $U(G)$ is unitary, then

$$U(g)^\dagger = U(g)^{-1} = U(g^{-1}).$$

Note also that if $U(g) \in L(V)$ and $v \in V$, then the fact that $U(g)$ is nonsingular means that $U(g)v = 0$ if and only if $v = 0$.

We will restrict our consideration to finite-dimensional representations.
Let $\{e_i\}$ be a basis for $V$. Then the matrix representation $D(g)^i{}_j$ of an operator $U(g)$ is defined in the usual manner by

$$U(g)e_i = e_j D(g)^j{}_i.$$

That these matrices themselves form a representation follows by observing that

$$U(g_1)U(g_2)e_i = U(g_1)e_j D(g_2)^j{}_i = e_k D(g_1)^k{}_j D(g_2)^j{}_i$$

while at the same time

$$U(g_1)U(g_2)e_i = U(g_1 g_2)e_i = e_k D(g_1 g_2)^k{}_i.$$

Since the $\{e_k\}$ form a basis, we must have (in terms of matrix multiplication)

$$D(g_1 g_2) = D(g_1)D(g_2).$$

The group of matrices $D(G)$ is said to form a matrix representation of $G$.

Example 10. Let $V = \mathbb{C}$, and for each $g \in G$, define $U(g) = 1$. Since

$$U(g_1)U(g_2) = 1 \cdot 1 = 1 = U(g_1 g_2),$$

we see that the mapping $g \to 1$ forms a trivial one-dimensional representation of any group.

Example 11. Let $G$ be a group of matrices. For example, $G$ could be the group $GL(n)$ consisting of all nonsingular $n \times n$ matrices, or the group $U(n)$ of all unitary $n \times n$ matrices. We may define a representation of $G$ by $U(g) = \det g$. That this is indeed a representation follows from the fact that

$$\det(g_1 g_2) = (\det g_1)(\det g_2).$$

Thus we have a non-trivial one-dimensional representation of any matrix group.

Example 12. Let $G = \{R(\theta), 0 \leq \theta < 2\pi\}$ be the group of rotations in the plane as shown in Example 3.

[Figure: the orthonormal basis vectors $e_1, e_2$ along the $x_1$ and $x_2$ axes, and the basis vectors $e'_1, e'_2$ along the $x'_1$ and $x'_2$ axes obtained by a counterclockwise rotation through the angle $\theta$.]

The vectors $e_i$ and $e'_i$ are the usual orthonormal basis vectors with $\|e_i\| = \|e'_i\| = 1$. From the geometry of the diagram we see that

$$e'_1 = U(\theta)e_1 = e_1\cos\theta + e_2\sin\theta$$
$$e'_2 = U(\theta)e_2 = e_1(-\sin\theta) + e_2\cos\theta$$

so that $e'_i = e_j D(\theta)^j{}_i$ and the matrix $(D(\theta)^j{}_i)$ is given by

$$(D(\theta)^j{}_i) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

I leave it as an exercise to verify that

$$D(\theta + \phi) = D(\theta)D(\phi).$$

Thus $\{U(\theta)\}$ forms a two-dimensional representation of the rotation group $R(\theta)$ with the matrix realization $\{D(\theta)\}$ with respect to the basis $\{e_i\}$.

Now let $U(G)$ be a representation of $G$ on $V$, and let $S$ be any nonsingular operator on $V$.
Then $U'(G) = S^{-1}U(G)S$ also forms a representation on $V$ because

$$U'(g_1 g_2) = S^{-1}U(g_1 g_2)S = S^{-1}U(g_1)U(g_2)S = S^{-1}U(g_1)SS^{-1}U(g_2)S = U'(g_1)U'(g_2).$$

Two representations related by such a similarity transformation are said to be equivalent representations. Equivalent representations are essentially the same, and our interest is generally in determining the various inequivalent representations of a group.

What happens if we have a representation $U(G)$ on $V$ where there happens to be a subspace $W \subset V$ with the property that $U(g)W \subset W$ for all $g \in G$? In this case $W$ is said to be a $U(G)$-invariant subspace, and we know that the matrix representation $D(G)$ of $U(G)$ will take the block triangular form (with respect to the appropriate basis)

$$D(g) = \begin{pmatrix} D'(g) & B \\ 0 & C \end{pmatrix}.$$

Note that

$$D(g_1)D(g_2) = \begin{pmatrix} D'(g_1) & B_1 \\ 0 & C_1 \end{pmatrix} \begin{pmatrix} D'(g_2) & B_2 \\ 0 & C_2 \end{pmatrix} = \begin{pmatrix} D'(g_1)D'(g_2) & E \\ 0 & F \end{pmatrix}$$
$$= D(g_1 g_2) = \begin{pmatrix} D'(g_1 g_2) & B_{12} \\ 0 & C_{12} \end{pmatrix}$$

and therefore the matrices $D'(G)$ also form a representation of $U(G)$ but with dimension $\dim W < \dim V$.

If the invariant subspace $W \subset V$ does not contain any nontrivial $U(G)$-invariant subspace, then $W$ is said to be minimal or proper. A representation $U(G)$ is said to be irreducible if it does not contain any nontrivial $U(G)$-invariant subspace; otherwise, it is reducible. Furthermore, if we have $V = W_1 \oplus \cdots \oplus W_r$ where each $W_i$ is a minimal $U(G)$-invariant subspace, then $U(G)$ is said to be completely reducible or decomposable. In this case, the matrix representation of $U(G)$ takes the block diagonal form

$$D(G) = \begin{pmatrix} D_1(G) & & \\ & \ddots & \\ & & D_r(G) \end{pmatrix}$$

where each $D_i(G)$ is a matrix representation of $U(G)$. Thus we see that restricting $U(G)$ to any $W_i$ yields a lower-dimensional representation of $G$. Therefore a representation $U(G)$ is completely reducible if it can be decomposed into a direct sum of irreducible representations. We will frequently refer to an irreducible representation as an irrep.
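Invariant subspaces are easy to exhibit numerically. The sketch below is an illustration that is not taken from the text: it assumes the three-dimensional representation of $S_3$ by permutation matrices, with $D(p)e_i = e_{p(i)}$ and labels $0, 1, 2$ instead of $1, 2, 3$. All function names are hypothetical helpers. It checks the homomorphism property, the invariance of the line $W$ spanned by $(1, 1, 1)$, and the invariance of the complementary subspace of vectors whose components sum to zero.

```python
from itertools import permutations


def perm_matrix(p):
    """Matrix D(p) with entry [j][i] = 1 when p sends i to j (0-based labels)."""
    n = len(p)
    return [[1 if p[i] == j else 0 for i in range(n)] for j in range(n)]


def matmul(A, B):
    """Plain matrix product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def compose(p, q):
    """Product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def apply(M, v):
    """Apply the matrix M to the column vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]


S3 = list(permutations(range(3)))

# Representation property D(pq) = D(p) D(q) for every pair of elements.
is_rep = all(perm_matrix(compose(p, q)) == matmul(perm_matrix(p), perm_matrix(q))
             for p in S3 for q in S3)

# W = span{(1, 1, 1)} is invariant: every D(p) fixes this vector.
w = [1, 1, 1]
w_invariant = all(apply(perm_matrix(p), w) == w for p in S3)

# The sum-zero vectors are mapped to sum-zero vectors, so that subspace is invariant too.
basis_of_complement = ([1, -1, 0], [0, 1, -1])
complement_invariant = all(sum(apply(perm_matrix(p), v)) == 0
                           for p in S3 for v in basis_of_complement)

print(is_rep, w_invariant, complement_invariant)
```

In a basis adapted to these two subspaces, every $D(p)$ would then take a block diagonal form with a $1 \times 1$ block and a $2 \times 2$ block.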
If the group representation space $V$ is a unitary space (i.e., a complex inner product space) and if the operators $U(G)$ on $V$ are unitary, then we say that $U(G)$ is a unitary representation of $G$. Since unitary operators preserve lengths, angles and scalar products, unitary representations are fundamental to the study of symmetry groups. The following two theorems greatly simplify many of the results in representation theory.

Theorem 3. Every representation of a finite group on a unitary space is equivalent to a unitary representation.

Proof. Let $D(G)$ be a representation of $G$ on $V$. We must find a nonsingular operator $S$ such that $U(g) := SD(g)S^{-1}$ is unitary for every $g \in G$. (If you feel so compelled, you can define $\tilde{S} = S^{-1}$ so this can be written as $U(g) = \tilde{S}^{-1}D(g)\tilde{S}$.) Define the Hermitian operator

$$A = \sum_{g \in G} D(g)^\dagger D(g).$$

$A$ is positive definite since for any $x \in V$ we have

$$\langle x, Ax \rangle = \sum_g \langle x, D(g)^\dagger D(g)x \rangle = \sum_g \langle D(g)x, D(g)x \rangle = \sum_g \|D(g)x\|^2.$$

This is greater than or equal to 0, and is equal to 0 if and only if $x = 0$. (Because each $D(g)$ is nonsingular so that $D(g)x = 0$ if and only if $x = 0$. Also note that $D(e) = 1$ so that the sum includes the term $\|D(e)x\|^2 = \|x\|^2$ which is 0 only if $x = 0$.)

In particular, since $A$ is Hermitian it can be diagonalized. This means there exists a basis of (nonzero) eigenvectors $v_i$ with corresponding real eigenvalues $\lambda_i$. Then

$$\langle v_i, Av_i \rangle = \lambda_i \langle v_i, v_i \rangle = \lambda_i \|v_i\|^2.$$

Since $v_i \neq 0$, the result above then shows that $\langle v_i, Av_i \rangle = \lambda_i \|v_i\|^2 > 0$ so that each $\lambda_i$ is both real and greater than 0.

Let $M$ be the unitary operator that diagonalizes $A$ so that

$$M^\dagger AM = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}.$$

Defining the "square root" of $M^\dagger AM$ by

$$T = \begin{pmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{pmatrix} = T^\dagger$$

we then define the nonsingular operator $S$ by

$$S = MTM^\dagger = S^\dagger.$$

Note that (since $T^2 = M^\dagger AM$)

$$S^2 = MTM^\dagger MTM^\dagger = MT^2M^\dagger = A.$$

And for any $h \in G$, the rearrangement lemma tells us that

$$D(h)^\dagger AD(h) = \sum_g D(h)^\dagger D(g)^\dagger D(g)D(h) = \sum_g [D(g)D(h)]^\dagger D(g)D(h) = \sum_g D(gh)^\dagger D(gh) = \sum_g D(g)^\dagger D(g) = A.$$

Therefore

$$S^2 = A = D(h)^\dagger AD(h) = D(h)^\dagger S^2 D(h)$$

so that

$$[S^{-1}D(h)^\dagger S][SD(h)S^{-1}] = 1.$$

Finally, define $U(h) = SD(h)S^{-1}$ so that $U(h)^\dagger U(h) = 1$ and $U(G)$ is unitary. (Where we used the fact that $(S^{-1})^\dagger = (S^\dagger)^{-1} = S^{-1}$.)

Theorem 4. Every reducible representation of a finite group is completely reducible.

Proof. By Theorem 3, we need only consider a unitary representation $U(G)$. (I leave it as an easy exercise for you to show that if an arbitrary representation $D(G)$ is reducible, then $U(G) = SD(G)S^{-1}$ is also reducible.) Let $W \subset V$ be a $U(G)$-invariant subspace. As we have seen, using the Gram-Schmidt process we may write $V = W \oplus W^\perp$, and we need only show that $W^\perp$ is also $U(G)$-invariant. But this is easy to do, for if $x \in W$ and $y \in W^\perp$, then for any $g \in G$ we have

$$\langle x, U(g)y \rangle = \langle U(g)^\dagger x, y \rangle = \langle U(g^{-1})x, y \rangle = 0$$

because the invariance of $W$ means that $U(g^{-1})x \in W$.

Although you proved them in the last homework set, I include Schur's lemmas and their proofs here for the sake of completeness.

Theorem 5 (Schur's lemma 1). Let $U(G)$ be an irreducible representation of $G$ on $V$. If $A \in L(V)$ is such that $AU(g) = U(g)A$ for all $g \in G$, then $A = \lambda 1$ where $\lambda \in \mathbb{C}$.

Proof. Suppose $U(g)A = AU(g)$ for all $g \in G$. Since $V$ is a finite-dimensional complex space, $A$ has at least one eigenvalue $\lambda$; let $V_\lambda$ be the corresponding eigenspace. Let $v \in V_\lambda$ so that $Av = \lambda v$. Then

$$A[U(g)v] = U(g)[Av] = \lambda[U(g)v]$$

so that $U(g)v \in V_\lambda$ and $V_\lambda$ is $U(G)$-invariant. Since $U(G)$ is irreducible we have either $V_\lambda = \{0\}$ or $V_\lambda = V$. But $V_\lambda$ contains a nonzero eigenvector by definition, and hence we must have $V_\lambda = V$. This means that $Av = \lambda v$ for all $v \in V$ which is equivalent to saying that $A = \lambda 1$.

Theorem 6 (Schur's lemma 2). Let $U(G)$ and $U'(G)$ be two irreducible representations of $G$ on $V$ and $V'$ respectively, and suppose $A \in L(V', V)$ is such that $AU'(g) = U(g)A$ for all $g \in G$. Then either $A = 0$, or else $A$ is an isomorphism of $V'$ onto $V$ so that $A^{-1}$ exists and $U(G)$ is equivalent to $U'(G)$.

Proof. Let $AU'(g) = U(g)A$ for all $g \in G$ and for $A \in L(V', V)$. If $v \in \operatorname{Im} A$ then there exists $v' \in V'$ such that $Av' = v$.
But then

$$U(g)v = U(g)Av' = A[U'(g)v'] \in \operatorname{Im} A$$

by definition, so that $\operatorname{Im} A$ is $U(G)$-invariant. Since $U(G)$ is irreducible we must have $\operatorname{Im} A = \{0\}$ or $V$. If $\operatorname{Im} A = \{0\}$ then $A = 0$. If $\operatorname{Im} A = V$, look at $\operatorname{Ker} A$. For $v' \in \operatorname{Ker} A$ we have $Av' = 0$ so that

$$A[U'(g)v'] = U(g)Av' = 0$$

which implies $U'(g)v' \in \operatorname{Ker} A$. But $U'(G)$ is also irreducible, so $\operatorname{Ker} A$ is either $\{0\}$ or $V'$. If $\operatorname{Ker} A = V'$ then $A = 0$ which isn't possible since $\operatorname{Im} A = V$. Therefore $\operatorname{Ker} A = \{0\}$ so $A$ is one-to-one and onto. In other words, $A^{-1}$ exists so that $U'(g) = A^{-1}U(g)A$ for all $g \in G$ and hence $U(G)$ is equivalent to $U'(G)$.

Example 13. Let us show that a consequence of Schur's lemma 1 is that the irreducible representations of any abelian group must be one-dimensional. To see this, let $U(G)$ be an irrep of an abelian group $G$, and let $h \in G$ be arbitrary but fixed. Since $G$ is abelian, we have

$$U(gh) = U(g)U(h) = U(h)U(g)$$

for all $g \in G$, and hence Theorem 5 tells us we can write $U(h) = \lambda_h 1$. Since this applies to any $h \in G$, every subspace of $V$ is invariant under all of the $U(h)$, so irreducibility forces $V$ to be one-dimensional, and we see that the mapping $h \mapsto \lambda_h$ is a one-dimensional representation of $G$. In other words, if $V$ is one-dimensional and $x \in V$, then

$$U(g)U(h)x = \lambda_h U(g)x = \lambda_g \lambda_h x = U(gh)x = \lambda_{gh} x$$

so that $\lambda_{gh} = \lambda_g \lambda_h$.

Example 14. Schur's lemmas have extremely important consequences for any quantum mechanical operator that corresponds to a physical observable that is invariant under some symmetry transformation group $G$. The symmetry operators are mapped into a unitary representation $D(g)$ that acts on a Hilbert space $V$ of states. In general, this representation is reducible, meaning that we can find a basis for $V$ in which the matrix representation of each $D(g)$ is block diagonal. Then $D(g) = D_1(g) \oplus \cdots \oplus D_r(g)$ where each $D_i(g)$ is a unitary irrep acting on a subspace of $V$. In general, each irrep may occur more than once, but we assume that we have chosen our basis so that the $\mu$th irrep is represented by the same unitary matrix $D_\mu(g)$ no matter how many times it occurs in the block diagonal decomposition.
Let us label our orthonormal basis states by $|\mu, j, x\rangle$ where $\mu$ labels the irrep (i.e., which blocks (invariant subspaces) correspond to a particular irrep), $j = 1, \ldots, n_\mu$ labels the basis vectors within each subspace (which is then of dimension $n_\mu$), and $x$ labels any other physical variables necessary to describe the state. Note that if a particular irrep occurs only once, then we don't need the extra variable $x$ to label its states. However, in general there will be many physical states that all have the same symmetry properties, and in this case we need the extra label to distinguish them. The orthonormality of these states means that

$$\langle \mu, j, x | \nu, k, y \rangle = \delta_{\mu\nu}\delta_{jk}\delta_{xy}.$$

With respect to this basis of invariant subspaces, the matrix representation of $D(g)$ is defined in the usual manner by

$$D(g)|\nu, k, y\rangle = \sum_l |\nu, l, y\rangle D_\nu(g)_{lk}$$

and therefore the matrix elements are given by

$$\langle \mu, j, x | D(g) | \nu, k, y \rangle = \delta_{\mu\nu}\delta_{xy} D_\mu(g)_{jk}. \qquad (1)$$

This is simply the algebraic description of the block diagonal matrix form of $D(g)$. Using the completeness relation

$$\sum_{\mu, j, x} |\mu, j, x\rangle\langle \mu, j, x| = I$$

we have the equivalent operator representation

$$D(g) = \sum_{\mu, j, x} \sum_{\nu, k, y} |\mu, j, x\rangle\langle \mu, j, x | D(g) | \nu, k, y \rangle\langle \nu, k, y|$$
$$= \sum_{\mu, j, x} \sum_{\nu, k, y} |\mu, j, x\rangle \delta_{\mu\nu}\delta_{xy} D_\mu(g)_{jk} \langle \nu, k, y|$$
$$= \sum_{\mu, j, k, x} |\mu, j, x\rangle D_\mu(g)_{jk} \langle \mu, k, x|.$$

Under the symmetry transformation, the states transform as

$$|\psi\rangle \to |\psi'\rangle = D(g)|\psi\rangle \qquad \text{and} \qquad \langle \psi| \to \langle \psi'| = \langle \psi|D(g)^\dagger.$$

And under a symmetry transformation, the matrix elements of an observable $O$ must obey

$$\langle \psi|O|\psi\rangle = \langle \psi'|O'|\psi'\rangle$$

and this therefore requires that

$$O \to O' = D(g)OD(g)^\dagger.$$

If the observable is invariant under the symmetry, then we must have $O' = O$, and this then implies that $[O, D(g)] = 0$ for all $g \in G$.

That the symmetry operators commute with the observable puts an important constraint on the matrix elements $\langle \mu, j, x|O|\nu, k, y\rangle$.
To see this, we insert complete sets and calculate as follows, using equation (1):

$$\begin{aligned}
0 &= \langle \mu, j, x|[O, D(g)]|\nu, k, y\rangle \\
&= \sum_{\rho, l, z} \langle \mu, j, x|O|\rho, l, z\rangle\langle \rho, l, z|D(g)|\nu, k, y\rangle - \sum_{\rho, l, z} \langle \mu, j, x|D(g)|\rho, l, z\rangle\langle \rho, l, z|O|\nu, k, y\rangle \\
&= \sum_l \langle \mu, j, x|O|\nu, l, y\rangle D^\nu(g)_{lk} - \sum_l D^\mu(g)_{jl}\langle \mu, l, x|O|\nu, k, y\rangle .
\end{aligned}$$

This is essentially just the equation $[O D^\nu(g)]_{jk} = [D^\mu(g) O]_{jk}$. Thus, by Schur's lemma 2, we conclude that the matrix elements of $O$ vanish unless $\mu = \nu$. And by Schur's lemma 1, we see that in the case where $\mu = \nu$, the matrix elements of $O$ must be proportional to the identity matrix, i.e., to $\delta_{jk}$. However, the symmetry doesn't tell us anything about the dependence on the physical parameters $x$ and $y$, so we are finally able to write

$$\langle \mu, j, x|O|\nu, k, y\rangle = f_\mu(x, y)\,\delta_{\mu\nu}\delta_{jk}$$

where the function $f_\mu(x, y)$ is independent of the symmetry variables of the problem, and only contains the physics. This is a simple example of the famous Wigner-Eckart theorem.

4 Cosets and Quotient Groups

Because equivalent representations are in a sense identical, it would be nice to find some way of characterizing representations that is independent of whether or not they are equivalent. One approach that immediately comes to mind is the trace of a representation. Since the trace is invariant under similarity transformations, it will be the same for all equivalent matrices corresponding to a given group element. Thus, we define the character of $g \in G$ in the representation $U(G)$ to be the number $\chi(g) = \operatorname{tr} U(g)$. If $D(G)$ is the matrix representation of $U(G)$, then we have

$$\chi(g) = \sum_i D(g)^i{}_i .$$

Since the character of a group element is the same for all equivalent representations, let us take a closer look at equivalence. This will lead in a natural way to the concept of class. In the next section we will return to our discussion of characters, where we will treat them in great detail.

We say that an element $a \in G$ is conjugate to $b \in G$ if there exists $x \in G$ such that $a = xbx^{-1}$.
Clearly, if $a$ is conjugate to $b$, then $b$ is conjugate to $a$, and any element is conjugate to itself (just take $x = e$). Furthermore, if $a$ is conjugate to $b$ and $b$ is conjugate to $c$, then $a = xbx^{-1}$ and $b = ycy^{-1}$, so that $a = x(ycy^{-1})x^{-1} = (xy)c(xy)^{-1}$ and $a$ is conjugate to $c$. Thus conjugation is an equivalence relation. The set of all group elements conjugate to each other is called a (conjugate) class. The class of an element $a$ will be denoted by $[a]$. Note that $e$ is in a class by itself, and that no other class can contain $e$. Furthermore, $[e] = \{e\}$ is the only class which is also a subgroup (although a trivial one).

Example 15. Consider the permutation group $S_3$ again. The element $(12)$ is conjugate to $(31)$ because

$$(23)(12)(23)^{-1} = (23)(12)(23) = (23)(231) = (31) .$$

Similarly, $(123)$ is conjugate to $(321)$ because

$$(12)(123)(12)^{-1} = (12)(123)(12) = (12)(13) = (132) = (321) .$$

One of the most important properties of equivalence relations is that their classes are disjoint. To see this, pick any $a \in G$. By letting $x$ vary over all of $G$ in the expression $xax^{-1}$, we can find all elements of $G$ that are conjugate to $a$, and therefore determine $[a]$. Similarly, given any $b \in G$ we may do the same thing to find $[b]$. We claim that $[a]$ and $[b]$ are disjoint unless $a$ and $b$ are conjugate. Indeed, suppose they contain a common element $c$. Then we have $a = xcx^{-1}$ and $b = ycy^{-1}$ for some $x, y \in G$. But then we can write $c = y^{-1}by$, so that $a = x(y^{-1}by)x^{-1} = (xy^{-1})b(xy^{-1})^{-1}$. Hence $a$ is conjugate to $b$, and $a$ and $b$ would have to be in the same class. Therefore, equivalence classes are either identical or disjoint. This shows that every group element is contained in a unique class.

Example 16. We can divide the elements of $S_3$ into three classes:

$$C_1 = \{e\} \qquad C_2 = \{(12), (23), (31)\} \qquad C_3 = \{(123), (321)\} .$$

This illustrates a general property of the symmetric groups: permutations with the same cycle structure belong to the same class. To see that this is true, let us take a look at conjugate elements in $S_n$.
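Before the general argument, the class decomposition in Example 16 can be checked by brute force. The following is only a sketch: it assumes permutations of $\{0, 1, 2\}$ stored as tuples `p` with `p[i]` the image of `i`, and the helper names (`compose`, `inverse`, `cycle_type`, `conjugacy_classes`) are illustrative choices, not notation from these notes.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cycle_type(p):
    """Return the sorted cycle lengths of p, e.g. (1, 2) for a 2-cycle in S3."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))

def conjugacy_classes(group):
    """Partition the group into classes [a] = {x a x^-1 : x in group}."""
    classes, assigned = [], set()
    for a in group:
        if a in assigned:
            continue
        cls = frozenset(compose(compose(x, a), inverse(x)) for x in group)
        classes.append(cls)
        assigned |= cls
    return classes

S3 = list(permutations(range(3)))
classes = conjugacy_classes(S3)
for cls in classes:
    # Each class should carry exactly one cycle type.
    print(sorted(cls), {cycle_type(p) for p in cls})
```

Under these assumptions the loop prints three classes of sizes 1, 3 and 2, and every element of a given class has the same cycle type.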
Consider two elements $a, b \in S_n$ given by

$$a = \begin{pmatrix} 1 & \cdots & n \\ a_1 & \cdots & a_n \end{pmatrix} \qquad b = \begin{pmatrix} 1 & \cdots & n \\ b_1 & \cdots & b_n \end{pmatrix} = \begin{pmatrix} a_1 & \cdots & a_n \\ b_{a_1} & \cdots & b_{a_n} \end{pmatrix} .$$

Then

$$bab^{-1} = \begin{pmatrix} 1 & \cdots & n \\ b_1 & \cdots & b_n \end{pmatrix}\begin{pmatrix} 1 & \cdots & n \\ a_1 & \cdots & a_n \end{pmatrix}\begin{pmatrix} b_1 & \cdots & b_n \\ 1 & \cdots & n \end{pmatrix} = \begin{pmatrix} b_1 & \cdots & b_n \\ b_{a_1} & \cdots & b_{a_n} \end{pmatrix} .$$

So we see that to evaluate $bab^{-1}$, we apply $b$ separately to the top and bottom rows of $a$. A permutation in general leaves some numbers unchanged, and conjugation preserves this property. In particular, suppose $a_1 = 1$. Then $b_1 \to b_{a_1} = b_1$, and in general, if $a_k = k$ then $b_k \to b_{a_k} = b_k$ also remains unchanged, and cycle structure is preserved. For example,

$$(23)(12)(23)^{-1} = (31) \qquad (123)(12)(123)^{-1} = (32) \qquad (12)(123)(12)^{-1} = (132) .$$

Note that if $a$ is a product of cycles, say $a = a_1a_2$, then we can always write $bab^{-1} = ba_1a_2b^{-1} = [ba_1b^{-1}][ba_2b^{-1}]$, and again we see that the cycle structure of $a$ will be maintained.

It is also possible in many cases to give a physical interpretation of the class structure. For example, consider some group of symmetry transformations on a symmetrical object. Then we can interpret the relation $b = x^{-1}ax$ as the result of first rotating the object by $x$, then performing the transformation $a$, and then rotating back by $x^{-1}$. This shows that $b$ must be the same physical type of transformation as $a$, but performed about a different axis, one that is related to that of $a$ by the transformation $x$.

Example 17. Let us consider the symmetry transformations of an equilateral triangle.

[Figure: equilateral triangle with vertices labeled 1, 2, 3 and rotation axes labeled a, b, c.]

Here we label the vertices by 1, 2 and 3 so they may be distinguished in a symmetry operation. The group elements $a$, $b$ and $c$ represent rotations by $\pi$ about the axes shown. The element $d$ is defined to be a clockwise rotation by $2\pi/3$ in the plane of the triangle, and element $f$ is a counterclockwise rotation by $2\pi/3$. These five operations together with the identity $e$ define a group of order 6 which is denoted by $D_3$ (the dihedral group of degree 3).
By convention, we assume that the rotation axes $a$, $b$ and $c$ remain fixed in space and do not rotate with the object. Then it is convenient to describe the group multiplication rules by constructing a group multiplication table as shown below:

$$\begin{array}{c|cccccc}
 & e & a & b & c & d & f \\ \hline
e & e & a & b & c & d & f \\
a & a & e & d & f & b & c \\
b & b & f & e & d & c & a \\
c & c & d & f & e & a & b \\
d & d & c & a & b & f & e \\
f & f & b & c & a & e & d
\end{array}$$

By definition, the table entries are row elements times column elements (in that order). Using the multiplication table, you can show that the two rotations by $2\pi/3$ form a class, the three rotations by $\pi$ form a class, and obviously the identity element forms a class of its own. For example, we see that $dcd^{-1} = bd^{-1} = bf = a$, or $c = d^{-1}ad$. In physical terms, first $d$ rotates the triangle clockwise by $2\pi/3$ so that vertex 2 lies on axis $a$. Next, $a$ rotates about its axis by $\pi$ so that vertices 1 and 3 are interchanged. Finally, $d^{-1} = f$ rotates counterclockwise by $2\pi/3$, leaving the triangle in precisely the same configuration as the single rotation by $\pi$ about axis $c$, which is the same as $a$ but rotated $2\pi/3$ counterclockwise by the transformation $d^{-1}$.

It is clear that this is a non-abelian group. In fact, I leave it to you to verify that the following set of matrices forms a two-dimensional representation of this group.

$$e = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad a = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \qquad b = \begin{bmatrix} -1/2 & \sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{bmatrix}$$

$$c = \begin{bmatrix} -1/2 & -\sqrt{3}/2 \\ -\sqrt{3}/2 & 1/2 \end{bmatrix} \qquad d = \begin{bmatrix} -1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{bmatrix} \qquad f = \begin{bmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{bmatrix}$$

Now let $H$ be a subgroup of a group $G$, and let $a \in G$ be arbitrary. As we stated earlier, the set

$$Ha = \{ha : h \in H\}$$

is called a right coset of $H$ in $G$. Note that if $a \in H$, the rearrangement lemma shows us that $Ha = H$. Let $a, b \in G$ be arbitrary, and suppose that the cosets $Ha$ and $Hb$ have an element in common. This means that $h_1a = h_2b$ for some $h_1, h_2 \in H$. But then using the fact that $H$ is a subgroup, we see that $a = h_1^{-1}h_1a = h_1^{-1}h_2b \in Hb$.
Since this means that $a = hb$ for some $h = h_1^{-1}h_2 \in H$, we see from the rearrangement lemma that this implies $Ha = Hhb = Hb$, and therefore if any two right cosets have an element in common, then they must in fact be identical. It is easy to see that the set of all right cosets of $H$ in $G$ defines an equivalence relation that partitions $G$ into disjoint subsets. It is important to realize that if $a \notin H$, then the coset $Ha$ cannot be a subgroup, because it cannot contain the identity element. Indeed, if $ha = e$ for some $h \in H$, then $a = h^{-1}e = h^{-1} \in H$, a contradiction.

Clearly every $g \in G$ must lie in some coset of $H$ (just form $Hg$ for each $g \in G$), and these cosets are either identical or disjoint. Since each coset contains the same number of elements as $H$, it follows that the order of $G$ is a multiple of the order of $H$, i.e., $n_H \mid n_G$. This proves the next theorem, known as Lagrange's theorem.

Theorem 7. If $G$ is a finite group and $H$ is a subgroup of $G$, then $n_H$ is a divisor of $n_G$.

While we have restricted our discussion to right cosets, it is clear that everything could be repeated using left cosets defined in the obvious way. It should also be clear that for a general subgroup $H$ of a group $G$, we need not have $Ha = aH$ for every $a \in G$. However, if $N$ is a subgroup of $G$ such that for every $n \in N$ and $g \in G$ we have $gng^{-1} \in N$, then we say that $N$ is a normal (or invariant) subgroup of $G$. An equivalent way of phrasing this is to say that $N$ is a normal subgroup of $G$ if and only if $gNg^{-1} \subset N$ for all $g \in G$ (where by $gNg^{-1}$ we mean the set of all $gng^{-1}$ with $n \in N$). The notation $N \lhd G$ is sometimes used to denote the fact that $N$ is a normal subgroup of $G$.

Since for any $n \in N$ we have $gng^{-1} \in N$ for all $g \in G$, we see that the entire class $[n]$ is contained in $N$. Thus a normal subgroup $N$ consists of complete classes.

Theorem 8. A subgroup $N$ of $G$ is normal if and only if $gNg^{-1} = N$ for every $g \in G$.

Proof. If $gNg^{-1} = N$ for every $g \in G$, then clearly $gNg^{-1} \subset N$ so that $N$ is normal.
Conversely, suppose that $N$ is normal in $G$. Then, for each $g \in G$ we have $gNg^{-1} \subset N$, and hence

$$g^{-1}Ng = g^{-1}N(g^{-1})^{-1} \subset N .$$

Using this result, we see that

$$N = (gg^{-1})N(gg^{-1}) = g(g^{-1}Ng)g^{-1} \subset gNg^{-1}$$

and therefore $N = gNg^{-1}$. (This also follows from Example 8.)

Example 18. The subgroup $N = \{e, a^2\}$ is a normal subgroup of the cyclic group $C_4 = \{e = a^4, a, a^2, a^3\}$. Noting that $a^{-1} = a^3$, $(a^2)^{-1} = a^2$ and $(a^3)^{-1} = a$, we see that

$$\begin{aligned}
aNa^{-1} &= a\{e, a^2\}a^3 = \{a^4, a^6\} = \{e, a^2\} = N \\
a^2N(a^2)^{-1} &= a^2\{e, a^2\}a^2 = \{a^4, a^6\} = \{e, a^2\} = N \\
a^3N(a^3)^{-1} &= a^3\{e, a^2\}a = \{a^4, a^6\} = \{e, a^2\} = N .
\end{aligned}$$

Be careful to note that Theorem 8 does not say that $gng^{-1} = n$ for every $n \in N$ and $g \in G$. This will in general not be true. The usefulness of this theorem is that it allows us to prove the following result.

Theorem 9. A subgroup $N$ of $G$ is normal if and only if every left coset of $N$ in $G$ is also a right coset of $N$ in $G$.

Proof. If $N$ is normal, then $gNg^{-1} = N$ for every $g \in G$, and hence $gN = Ng$. Conversely, suppose that every left coset $gN$ is also a right coset. We show that in fact this right coset must be $Ng$. Since $N$ is a subgroup it must contain the identity element $e$, and therefore $g = ge \in gN$, so that $g$ must also be in whatever right coset it is that is identical to $gN$. But we also have $eg = g$, so that $g$ is in the right coset $Ng$. Then, since any two right cosets with an element in common must be identical, it follows that $gN = Ng$. Thus, we see that $gNg^{-1} = Ngg^{-1} = N$, so that $N$ is normal.

If $G$ is a group and $A$, $B$ are subsets of $G$, we define the set

$$AB = \{ab \in G : a \in A,\ b \in B\} .$$

In particular, if $H$ is a subgroup of $G$, then $HH \subset H$ since $H$ is closed under the group multiplication operation. But we also have $H = He \subset HH$ (since $e \in H$), and hence $HH = H$.

Now let $N$ be a normal subgroup of $G$. By Theorem 9 we then see that

$$(Na)(Nb) = N(aN)b = N(Na)b = NNab = Nab .$$

In other words, the product of right cosets of a normal subgroup is again a right coset.
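The role of normality in this identity can be illustrated with a small brute-force check. This is only a sketch, assuming $S_3$ is realized as tuples permuting $\{0, 1, 2\}$; the names `set_product`, `right_coset` and `coset_product_closes` are hypothetical helpers introduced here.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def set_product(A, B):
    """The subset product AB = {ab : a in A, b in B}."""
    return frozenset(compose(a, b) for a in A for b in B)

def right_coset(H, a):
    """The right coset Ha = {ha : h in H}."""
    return set_product(H, [a])

def coset_product_closes(K, G):
    """True if (Ka)(Kb) == K(ab) for every a, b in G."""
    return all(
        set_product(right_coset(K, a), right_coset(K, b))
        == right_coset(K, compose(a, b))
        for a in G
        for b in G
    )

S3 = list(permutations(range(3)))
e = (0, 1, 2)
N = frozenset({e, (1, 2, 0), (2, 0, 1)})  # identity and both 3-cycles (normal)
H = frozenset({e, (1, 0, 2)})             # identity and one 2-cycle (not normal)

print(coset_product_closes(N, S3))  # True
print(coset_product_closes(H, S3))  # False
```

For the normal subgroup the product of two right cosets is always the coset of the product, while for the two-element subgroup some products $(Ha)(Hb)$ contain four elements and so cannot be a coset of $H$.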
This closure property suggests that there may be a way to construct a group out of the cosets $Na$ where $a$ is any element of $G$. We now show that there is indeed a way to construct such a group. Our method is used frequently throughout mathematics, and entails forming what is called a quotient structure.

Let $G/N$ denote the collection of all right cosets of $N$ in $G$. In other words, an element of $G/N$ is a right coset of $N$ in $G$. We use the product of subsets as defined above to define a product on $G/N$.

Theorem 10. Let $N$ be a normal subgroup of a group $G$. Then $G/N$ is a group.

Proof. We show that the product in $G/N$ obeys properties (G1)–(G4) in the definition of a group.

(1) If $A, B \in G/N$, then $A = Na$ and $B = Nb$ for some $a, b \in G$, and hence (since $ab \in G$)

$$AB = NaNb = Nab \in G/N .$$

(2) If $A, B, C \in G/N$, then $A = Na$, $B = Nb$ and $C = Nc$ for some $a, b, c \in G$, and hence

$$\begin{aligned}
(AB)C &= (NaNb)Nc = (Nab)Nc = N(abN)c = N(Nab)c = N(ab)c \\
&= Na(bc) = Na(Nbc) = Na(NbNc) = A(BC) .
\end{aligned}$$

(3) If $A = Na \in G/N$, then

$$AN = NaNe = Nae = Na = A$$

and similarly

$$NA = NeNa = Nea = Na = A .$$

Thus $N = Ne \in G/N$ serves as the identity element in $G/N$.

(4) If $Na \in G/N$, then $Na^{-1}$ is also in $G/N$, and we have

$$NaNa^{-1} = Naa^{-1} = Ne$$

as well as

$$Na^{-1}Na = Na^{-1}a = Ne .$$

Therefore $Na^{-1} \in G/N$ is the inverse to any element $Na \in G/N$.

Corollary. If $N$ is a normal subgroup of a finite group $G$, then $n_{G/N} = n_G/n_N$.

Proof. By construction, $G/N$ consists of all the right cosets of $N$ in $G$, and by Lagrange's theorem (Theorem 7) this number is just $n_{G/N} = n_G/n_N$.

The group defined in Theorem 10 is called the quotient group (or factor group) of $G$ by $N$.

Example 19. Consider the normal subgroup $N = \{e, a^2\}$ of $C_4$ (see Example 18). The quotient group $C_4/N$ consists of the distinct cosets of $N$, which are $Ne = N$ and $M := Na = \{a, a^3\} = aN$. (Since $Na^2 = a^2N = N$ and $Na^3 = a^3N = M$.)
It is easy to see that $C_4/N$ indeed forms a group $\{e = N, M\}$ since

$$NM = NNa^3 = Na^3 = M, \qquad MN = Na^3N = a^3NN = a^3N = M$$

and $MM = N$, so that $M^{-1} = M$. Thus both $N$ and $C_4/N$ are of order 2, and are isomorphic to $C_2$ (they have the same multiplication table).

Example 20. The permutation group $S_3$ has the normal subgroup $N = \{e, (123), (321)\}$. It is also easy to verify that

$$N(12) = N(23) = N(31) = \{(12), (23), (31)\}$$

so that the elements of $S_3/N$ are the cosets $Ne = N$ and $M = N(ij) = (ij)N = \{(12), (23), (31)\}$ (where $(ij)$ stands for any of the three 2-cycles). It is then not hard to see that $NM = N(ij)N = (ij)N = M = MN$ and $NN = MM = N$. Therefore $S_3/N$ is of order 2 and is also isomorphic to $C_2$.

Theorem 11. Let a group $G$ have a non-trivial normal subgroup $N$. Then any representation of the quotient group $G/N$ induces a degenerate (i.e., non-faithful) representation of $G$. Conversely, let $U(G)$ be a degenerate representation of $G$. Then $G$ has at least one normal subgroup $N$ such that $U(G)$ defines a faithful representation of the quotient group $G/N$.

Proof. Define a mapping $\varphi$ from $G$ to the cosets of $N$ by $\varphi(g) = gN$. This is a homomorphism because

$$\varphi(g_1g_2) = g_1g_2N = g_1Ng_2N = \varphi(g_1)\varphi(g_2) .$$

Since $G$ can be decomposed into the distinct equivalence classes defined by the cosets of $N$, the fact that $N$ is non-trivial (so it has more than one element) means that $n_N$ elements of $G$ all map into the same coset. Thus $\varphi$ is many-to-one. Now let $\psi$ be a representation of $G/N$, and define the mapping $U = \psi \circ \varphi$. This is illustrated in the following commutative diagram:

[Commutative diagram: $\varphi : G \to G/N$, $\psi : G/N \to U(G)$, and $U = \psi \circ \varphi : G \to U(G)$.]

Then $U$ is a representation of $G$ because

$$U(g_1g_2) = \psi(\varphi(g_1g_2)) = \psi(\varphi(g_1)\varphi(g_2)) = \psi(\varphi(g_1))\psi(\varphi(g_2)) = U(g_1)U(g_2) .$$

But because $\varphi$ is many-to-one, this representation is not faithful. The proof of the converse is left as an exercise (see the homework problems).

Example 21.
Referring to Example 20, we know that $S_3$ has the normal subgroup $N = \{e, (123), (321)\}$, and that the cosets of $N$ are only $N$ itself and $M = N(ij)$. Furthermore, we showed that $S_3/N$ is isomorphic to the cyclic group $C_2 = \{e, a\}$. Now, the group $C_2$ has the rather simple representation $\psi(e) = 1$, $\psi(a) = -1$, as you can easily verify. This is then also a representation of $S_3/N$. Then this representation induces a representation of $S_3$ via the assignment $U : S_3 \to S_3/N \to \{1, -1\}$ as

$$U(g) = \begin{cases} +1 & \text{for } g = e, (123), (321) \\ -1 & \text{for } g = (12), (23), (31) . \end{cases}$$

You should verify that this assignment is consistent with the multiplication table of $S_3$. For example, $(12)(23) = (123)$, which agrees with $(-1)(-1) = +1$.

While we won't go into any further discussion of these topics, a group is said to be simple if it does not contain any non-trivial invariant subgroup, and is semisimple if it does not contain any non-trivial abelian invariant subgroup. A consequence of Theorem 11 is that all (non-trivial) representations of simple groups are necessarily faithful.

While Theorem 8 shows us that for every $g \in G$ we have $gNg^{-1} = N$ if $N$ is a normal subgroup of $G$, it is in fact also true that for every $g \in G$ we have $gCg^{-1} = C$ if $C$ is merely any class. This is actually obvious, because by definition, if $a \in C$, then because $C$ is a class, it must contain every $gag^{-1}$. (You can also think of this as a direct consequence of the rearrangement lemma.) In other words, any class contains every element conjugate to every member of the class, and only those conjugate elements.

Conversely, suppose $C$ is a subset of $G$ with the property that $gCg^{-1} = C$ for all $g \in G$. I claim that $C$ must consist entirely of (complete) classes. Indeed, subtract every complete class (in common) from both sides of this relation, and denote any remainder by $R$. If $r \in R$, then for any $g \in G$ we have $grg^{-1}$ on the left, and this must be contained in $R$ on the right. But this means that $R$ contains $[r]$ for each $r \in R$. Therefore $C$ must consist of complete classes.
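Both directions of this argument can be tested exhaustively on a small group. The sketch below assumes $S_3$ as tuples permuting $\{0, 1, 2\}$ and checks all $2^6 = 64$ subsets; the function names are illustrative only.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate_set(g, C):
    """The set g C g^-1."""
    g_inv = inverse(g)
    return frozenset(compose(compose(g, c), g_inv) for c in C)

G = list(permutations(range(3)))

# The class of a is {x a x^-1 : x in G}; collecting them removes duplicates.
classes = {frozenset(compose(compose(x, a), inverse(x)) for x in G) for a in G}

def is_invariant(C):
    """True if g C g^-1 == C for every g in G."""
    return all(conjugate_set(g, C) == C for g in G)

def is_union_of_classes(C):
    """True if every class lies entirely inside C or entirely outside it."""
    return all(cls <= C or cls.isdisjoint(C) for cls in classes)

subsets = [frozenset(s) for r in range(len(G) + 1) for s in combinations(G, r)]
agree = all(is_invariant(C) == is_union_of_classes(C) for C in subsets)
invariant_count = sum(is_invariant(C) for C in subsets)

print(agree)            # True
print(invariant_count)  # 8, one subset per choice of which of the 3 classes to include
```

The count of 8 includes the empty subset, which is trivially invariant.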
Summarizing this discussion, we have the following result.

Theorem 12. Let $C$ be a subset of a finite group $G$. Then $C$ consists entirely of (complete) classes if and only if $gCg^{-1} = C$ for every $g \in G$.

Generalizing our notation slightly, if $C_i$ and $C_j$ are two classes, we let $C_iC_j$ denote the set $C_iC_j = \{x_ix_j : x_i \in C_i \text{ and } x_j \in C_j\}$. (Be sure to realize that any specific term in the product $C_iC_j$ may occur more than once.) Then according to Theorem 12, for every $g \in G$ we have

$$C_iC_j = gC_ig^{-1}gC_jg^{-1} = gC_iC_jg^{-1} .$$

But then the converse part of Theorem 12 tells us that $C_iC_j$ must consist of complete classes. We can express this fact mathematically by writing

$$C_iC_j = \sum_k c_{ijk}C_k \qquad (2)$$

where the integers $c_{ijk}$ tell us how often the class $C_k$ appears in the product $C_iC_j$.

Example 22. In Example 17 we found the classes of the group $D_3$ (the symmetry group of the equilateral triangle). We may label them by

$$C_1 = \{e\} \qquad C_2 = \{a, b, c\} \qquad C_3 = \{d, f\} .$$

Using the group multiplication table we find that

$$\begin{aligned}
C_1C_1 &= C_1 & C_1C_2 &= C_2 & C_1C_3 &= C_3 \\
C_2C_2 &= 3C_1 + 3C_3 & C_2C_3 &= 2C_2 & C_3C_3 &= 2C_1 + C_3 .
\end{aligned}$$

For example,

$$C_2C_2 = \{a^2, ab, ac, ba, b^2, bc, ca, cb, c^2\} = \{e, d, f, f, e, d, d, f, e\} = 3C_1 + 3C_3 .$$

Let us derive some results that we will find very useful in the next section. Suppose $x_i \in C_i$ and $x_j \in C_j$. Then $x_ix_j = (x_ix_jx_i^{-1})x_i \in C_jC_i$ since $x_ix_jx_i^{-1} \in C_j$. This shows that $C_iC_j \subset C_jC_i$. Similarly we see that $C_jC_i \subset C_iC_j$, and therefore $C_iC_j = C_jC_i$. This shows that the coefficients in equation (2) have the symmetry property

$$c_{ijk} = c_{jik} . \qquad (3)$$

If $C_1 = \{e\}$, then $C_1C_j = C_j = \sum_k c_{1jk}C_k$ implies that $c_{1jk} = \delta_{jk}$. Thus we have

$$c_{1jk} = c_{j1k} = \delta_{jk} . \qquad (4)$$

Now suppose $x_i, x_j \in C$. By definition of class, this means there exists $g \in G$ such that $x_i = gx_jg^{-1}$. Taking the inverse of both sides of this equation shows that $x_i^{-1} = gx_j^{-1}g^{-1}$, and hence $x_i^{-1}$ and $x_j^{-1}$ also belong to the same class. Given a class $C_i$, we denote the class of all inverse elements of $C_i$ by $C_{i'}$.
Then if $j \neq i'$, $C_iC_j$ cannot contain $C_1 = [e]$. If we denote the number of elements in $C_i$ by $n_i$, then $n_1 = 1$ and $n_i = n_{i'}$. Since $C_iC_{i'}$ contains $C_1 = [e]$ precisely $n_i$ times, we then see that

$$c_{ij1} = n_i\delta_{ji'} . \qquad (5)$$

Recall that the character of $g \in G$ in the $\mu$-representation is defined by

$$\chi^\mu(g) = \operatorname{tr} U^\mu(g) = \sum_{i=1}^{n_\mu} D^\mu(g)_{ii} .$$

In particular, note that since the group identity element $e$ is represented by the identity matrix, we must have $\chi^\mu(e) = n_\mu$. Since the matrix representations of all elements in the same class are related by a similarity transformation, we see that the character is the same for each member of a class. Let us denote the character of class $C_k$ in the $\mu$-representation by $\chi^\mu(C_k)$. We denote the number of elements in $C_k$ by $n_k$.

Now take another look at equation (2). Writing $C_i = \{x^i_1, \dots, x^i_{n_i}\}$ and $C_j = \{x^j_1, \dots, x^j_{n_j}\}$ we have

$$C_iC_j = \{x^i_1x^j_1, \dots, x^i_1x^j_{n_j}, \dots, x^i_{n_i}x^j_1, \dots, x^i_{n_i}x^j_{n_j}\}$$

so that summing together all of these elements yields

$$x^i_1\sum_{l=1}^{n_j} x^j_l + \cdots + x^i_{n_i}\sum_{l=1}^{n_j} x^j_l = \left(\sum_{k=1}^{n_i} x^i_k\right)\left(\sum_{l=1}^{n_j} x^j_l\right) .$$

If we write each element in terms of an irreducible matrix representation $\mu$, then we can write this sum as $C_i^\mu C_j^\mu$, where

$$C_i^\mu = \sum_{g \in C_i} D^\mu(g)$$

is the sum of all matrices representing the elements in the class $C_i$ in the $\mu$-representation. For the other side of equation (2) we simply sum the irreps of the elements in each $C_k$ to obtain $\sum_k c_{ijk}C_k^\mu$, and hence we have

$$C_i^\mu C_j^\mu = \sum_k c_{ijk}C_k^\mu .$$

By Theorem 12, any class $C_k$ satisfies $gC_kg^{-1} = C_k$, so that $gC_k = C_kg$ for all $g \in G$. Writing this in terms of irreducible matrix representations and summing we have $D^\mu(g)C_k^\mu = C_k^\mu D^\mu(g)$. Since this holds for all $g \in G$, Schur's lemma 1 tells us that $C_k^\mu = \lambda_kI$. Using this result in the above equation we obtain

$$\lambda_i\lambda_j = \sum_k c_{ijk}\lambda_k .$$

To evaluate $\lambda_k$, we first take the trace of $C_k^\mu = \lambda_kI$ to obtain

$$\operatorname{tr} C_k^\mu = \lambda_kn_\mu$$

where $n_\mu$ is the dimension of the $\mu$-representation.
On the other hand, we can also take the trace of the defining equation for $C_k^\mu$ to obtain

$$\operatorname{tr} C_k^\mu = n_k\chi^\mu(C_k) .$$

Equating these two results yields

$$\lambda_k = \frac{n_k\chi^\mu(C_k)}{n_\mu} .$$

Finally, using this formula for $\lambda_k$ in the above equation for $\lambda_i\lambda_j$ we have

$$n_in_j\chi^\mu(C_i)\chi^\mu(C_j) = n_\mu\sum_k c_{ijk}n_k\chi^\mu(C_k) . \qquad (6)$$

5 Orthogonality Relations

Almost all of the main results in finite group representation theory are based on the following result, sometimes called the great orthogonality theorem. By way of notation, we let $n_G$ be the order of $G$, $n_\mu$ denote the dimensionality of the $\mu$th irreducible representation, and $D^\mu(g)$ be the unitary matrix corresponding to $g \in G$ in the $\mu$-representation with respect to an orthonormal basis. Also, to keep our notation as simple as possible we will usually write $D(g)_{ij}$, but when summing over multiple indices, it will be easier to use the summation convention and write $D(g)^i{}_j$.

Theorem 13. With respect to all the inequivalent, irreducible, unitary representations of a finite group $G$, we have

$$\sum_{g \in G} D^\mu(g)^\dagger_{ki}D^\nu(g)_{jl} = \frac{n_G}{n_\mu}\delta_{\mu\nu}\delta_{ij}\delta_{kl} .$$

Remark: Since $D(g)^\dagger_{ik} = D(g)^*_{ki}$, this theorem may be written as

$$\sum_{g \in G}\left[\left(\frac{n_\mu}{n_G}\right)^{1/2}D^\mu(g)_{ik}\right]^*\left[\left(\frac{n_\mu}{n_G}\right)^{1/2}D^\nu(g)_{jl}\right] = \delta_{\mu\nu}\delta_{ij}\delta_{kl} .$$

For fixed $(\mu, i, k)$, we regard $(n_\mu/n_G)^{1/2}D^\mu(g)_{ik}$ as an $n_G$-component vector (as $g$ ranges over $G$), and looking at the result in this manner, the equation is just the usual orthonormality relationship for vectors labeled by the three indices $(\mu, i, k)$.

Proof. Part (i): Let $X$ be any $n_\mu \times n_\nu$ matrix and define

$$M = \sum_g D^\mu(g)^\dagger XD^\nu(g) .$$

Then we have (using $D(g)^\dagger = D(g)^{-1}$ and the rearrangement lemma)

$$\begin{aligned}
D^\mu(h)^{-1}MD^\nu(h) &= \sum_g [D^\mu(g)D^\mu(h)]^{-1}XD^\nu(g)D^\nu(h) \\
&= \sum_g D^\mu(gh)^{-1}XD^\nu(gh) \\
&= \sum_g D^\mu(g)^{-1}XD^\nu(g) \\
&= M .
\end{aligned}$$

Since this holds for all $h \in G$, Schur's lemmas tell us that either $\mu \neq \nu$ and $M = 0$, or $\mu = \nu$ and $M = c_XI$, where $c_X$ is a constant depending on $X$.

Part (ii): Let $X_{lk}$ be one of the $n_\mu n_\nu$ matrices with matrix elements $(X_{lk})^i{}_j = \delta_{il}\delta_{jk}$.
In other words, $X_{lk}$ has a 1 in the $(l, k)$th position and 0's elsewhere. Then

$$(M_{lk})^r{}_s = \sum_g [D^\mu(g)^\dagger]^r{}_i(X_{lk})^i{}_jD^\nu(g)^j{}_s = \sum_g [D^\mu(g)^\dagger]^r{}_lD^\nu(g)^k{}_s .$$

According to Part (i), the left-hand side of this equation is zero if $\mu \neq \nu$. This proves the theorem in the case that $\mu \neq \nu$. If we have $\mu = \nu$, then Part (i) tells us that the left-hand side must equal $c_{kl}\delta_{rs}$, where the $c_{kl}$ are constants. To determine them, we take the trace of both sides of this last equation (i.e., set $r = s$ and sum). Since $1 \le r, s \le n_\mu$, the left-hand side yields $n_\mu c_{kl}$ (since $\sum_r \delta_{rr} = n_\mu$), while from the right-hand side we obtain

$$\sum_g [D^\mu(g)D^\mu(g)^\dagger]^k{}_l = \sum_g \delta_{kl} = n_G\delta_{kl} .$$

So now $c_{kl} = (n_G/n_\mu)\delta_{kl}$, so that $(M_{lk})^r{}_s = c_{kl}\delta_{rs} = (n_G/n_\mu)\delta_{kl}\delta_{rs}$ and we have

$$\sum_g [D^\mu(g)^\dagger]^r{}_lD^\nu(g)^k{}_s = \frac{n_G}{n_\mu}\delta_{\mu\nu}\delta_{kl}\delta_{rs} .$$

One immediate consequence of this result is the following. As we have pointed out, Theorem 13 may be interpreted as an orthonormality condition on $n_G$-component vectors labeled by $(\mu, i, k)$ where $1 \le i, k \le n_\mu$. Since for each $\mu$ there are $(n_\mu)^2$ possible values of $i$ and $k$, the total number of vectors is given by $\sum_\mu n_\mu^2$. But any orthonormal set of vectors is linearly independent, and the number of components of each vector in a linearly independent set can't be less than the number of vectors in the set. (You can't have three linearly independent two-component vectors.) Therefore we must have

$$\sum_\mu n_\mu^2 \le n_G .$$

In fact, we will show that equality holds in this relation, and it is this result that allows us to find all possible inequivalent irreducible representations of finite groups. However, proving that equality holds here requires a fair amount of work, which we will have to develop gradually.

Example 23. The simplest non-trivial group is $C_2 = \{e, a\}$, which is abelian and hence has only one-dimensional irreducible representations. Furthermore, each element forms a class by itself.
The identity representation assigns the number 1 to every element of $C_2$, and hence we have $D^1(e) = D^1(a) = 1$, where the superscript 1 refers to the identity irrep. Note that Theorem 13 in this case with $\mu = \nu = 1$ becomes $1 \cdot 1 + 1 \cdot 1 = 2 = n_G$, as it should. If we have a second inequivalent irrep $D^2(C_2)$, then as a ($n_G = 2$)-component vector, it must have components orthogonal to $(1, 1)$. Up to normalization, the only such vector is $(1, -1)$, and therefore the only other possible one-dimensional irrep with the correct normalization is $e \to 1$ and $a \to -1$, so that $1 \cdot 1 + (-1) \cdot (-1) = 2$ also. We can summarize these results in a table with the group elements $g_i$ labeling the columns and the inequivalent irreps $D^\mu(G)$ labeling the rows (this is not quite the same as a character table):

$$\begin{array}{c|rr}
 & e & a \\ \hline
D^1 & 1 & 1 \\
D^2 & 1 & -1
\end{array}$$

Next, turn to Theorem 13 and set $k = i$ and $l = j$. Summing over $i$ and $j$ we then obtain

$$\sum_g\sum_{ij} D^\mu(g)^\dagger_{ii}D^\nu(g)_{jj} = \frac{n_G}{n_\mu}\delta_{\mu\nu}\sum_{ij}\delta_{ij}$$

or

$$\sum_g \chi^\mu(g)^*\chi^\nu(g) = n_G\delta_{\mu\nu} . \qquad (7)$$

This shows that the characters form a set of orthogonal vectors in group-element space. Since the characters are the same for all members of the same class, we can rewrite equation (7) as a sum over classes to obtain

$$\sum_k n_k\chi^\mu(C_k)^*\chi^\nu(C_k) = n_G\delta_{\mu\nu} \qquad (8)$$

where $n_k$ is the number of elements in the class $C_k$. The important observation to make from this is that the characters of the irreps form a set of orthogonal vectors in a vector space with coordinate axes labeled by classes. Since orthogonal vectors are linearly independent, and you can't have more linearly independent vectors than the dimensionality of the space, we see that the number of irreducible representations cannot exceed the number of classes. Most importantly, we will show below (see Theorem 15) that the number of irreps equals the number of classes, but to do so requires that we introduce the regular representation, which we do shortly.

Example 24.
Taking the result that the number of irreps equals the number $n_c$ of classes, we can form an $n_c \times n_c$ matrix with columns labeled by class $i$ and rows labeled by irrep $\mu$. (We will have more to say about these below.) In the case of abelian groups, all irreps are one-dimensional and each element forms a class of its own, and therefore $D^\mu(g_i) = \chi^\mu(C_i)$. Thus, for abelian groups such as $C_2$, the table shown in Example 23 is the same as its character table.

An immediate consequence of equation (7) is that we can easily determine the number of times a given irrep occurs in a reducible representation. To see this, suppose that the representation $U(g)$ is put into block diagonal form $D(g) = D_1(g) \oplus \cdots \oplus D_r(g)$. Now in general, a given irrep $D^\mu(g)$ will occur a number $a_\mu$ of times in this decomposition. Since the trace of $D(g)$ is the sum of the traces of the $D^\mu(g)$, we easily see that

$$\chi(g) = \sum_\nu a_\nu\chi^\nu(g) . \qquad (9)$$

Multiplying both sides of this equation by $\chi^\mu(g)^*$ and summing over $g$ we obtain

$$\sum_g \chi^\mu(g)^*\chi(g) = \sum_\nu a_\nu\sum_g \chi^\mu(g)^*\chi^\nu(g) = n_G\sum_\nu a_\nu\delta_{\mu\nu} = n_Ga_\mu .$$

Therefore, summing over either group elements or classes we have

$$a_\mu = \frac{1}{n_G}\sum_g \chi^\mu(g)^*\chi(g) = \frac{1}{n_G}\sum_k n_k\chi^\mu(C_k)^*\chi(C_k) . \qquad (10)$$

We now turn to a particularly useful representation called the regular representation. This is defined by taking the group elements themselves to be operators acting on the vector space defined to have the group itself as a basis. Thus the dimension of the regular representation is $n_G$. To distinguish the regular representation from other representations and save myself some typing, I will use the Greek letter $\Gamma$ instead of $U$ to denote the operator, and $\Gamma(g)_{ij}$ instead of $D^{\mathrm{reg}}(g)_{ij}$. Thus we define

$$\Gamma(g_i)g_j := g_ig_j = \sum_k g_k\Gamma(g_i)_{kj} .$$

This is just our usual definition for the matrix representation of an operator.
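The definition can be made concrete with a few lines of code. This is a sketch under stated assumptions: the group is taken to be $S_3$, realized as tuples permuting $\{0, 1, 2\}$ in the order produced by `itertools.permutations`, and `regular_matrix`, `matmul` and `trace` are hypothetical helper names. The matrix entry `M[k][j]` is set to 1 exactly when $g\,g_j = g_k$.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))
index = {g: i for i, g in enumerate(G)}
n = len(G)
identity = (0, 1, 2)

def regular_matrix(g):
    """Gamma(g)[k][j] = 1 if g * G[j] == G[k], else 0."""
    M = [[0] * n for _ in range(n)]
    for j, gj in enumerate(G):
        M[index[compose(g, gj)]][j] = 1
    return M

def matmul(A, B):
    """Plain matrix product of two n x n integer matrices."""
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def trace(M):
    return sum(M[i][i] for i in range(n))

Gamma = {g: regular_matrix(g) for g in G}

# Gamma(g h) should equal Gamma(g) Gamma(h) for every pair of elements.
is_homomorphism = all(
    Gamma[compose(g, h)] == matmul(Gamma[g], Gamma[h]) for g in G for h in G
)
chi_reg = {g: trace(Gamma[g]) for g in G}

print(is_homomorphism)    # True
print(chi_reg[identity])  # 6
```

Each `Gamma[g]` is a permutation matrix, and only the identity element has nonzero entries on the diagonal.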
To verify that this indeed defines a representation, we calculate as follows:

$$\begin{aligned}
\Gamma(g_ig_j)g_k &= g_i(g_jg_k) = g_ig_r\Gamma(g_j)^r{}_k = g_s\Gamma(g_i)^s{}_r\Gamma(g_j)^r{}_k \\
&= g_s[\Gamma(g_i)\Gamma(g_j)]^s{}_k = g_s\Gamma(g_ig_j)^s{}_k
\end{aligned}$$

and hence $\Gamma(g_ig_j)^s{}_k = [\Gamma(g_i)\Gamma(g_j)]^s{}_k$ as claimed. This is equivalent to writing the operator expression

$$\Gamma(g_ig_j)g_k = g_i(g_jg_k) = g_i[\Gamma(g_j)g_k] = \Gamma(g_i)[\Gamma(g_j)g_k] = [\Gamma(g_i)\Gamma(g_j)]g_k$$

so that $\Gamma(g_ig_j) = \Gamma(g_i)\Gamma(g_j)$.

Example 25. Referring to Example 17, we know that $ab = d$, or $g_2g_3 = g_5$. Then

$$g_5 = \Gamma(g_2)g_3 = \sum_k g_k\Gamma(g_2)_{k3}$$

which implies that $\Gamma(g_2)_{k3} = \delta_{k5}$.

As demonstrated in this example,

$$\Gamma(g_i)_{kj} = \begin{cases} 1 & \text{if } g_ig_j = g_k \\ 0 & \text{otherwise.} \end{cases}$$

It is important to realize this implies that

$$\Gamma(g_i)_{kk} = 1 \quad \text{if and only if } g_i = e .$$

Therefore we have

$$\chi^{\mathrm{reg}}(g) = \begin{cases} n_G & \text{if } g = e \\ 0 & \text{if } g \neq e . \end{cases}$$

If we now combine this result with equation (10) we find

$$a_\mu = \frac{1}{n_G}\sum_g \chi^\mu(g)^*\chi^{\mathrm{reg}}(g) = \frac{1}{n_G}\chi^\mu(e)n_G = n_\mu$$

and hence we have proved one of the most important properties of the regular representation.

Theorem 14. The regular representation contains each irreducible representation a number of times $a_\mu$ equal to the dimensionality $n_\mu$ of the irreducible representation.

Recall that following the proof of Theorem 13 we showed that $\sum_\mu n_\mu^2 \le n_G$. We are now able to show that in fact equality holds in this relation. If you think about the block diagonal form of the regular representation matrices, the $\mu$-representation with dimension $n_\mu$ occurs $n_\mu$ times, and thus takes up $n_\mu^2$ rows (or columns) of each matrix. Since each matrix is of size $n_G$ (the dimension of the regular representation), we must have

$$\sum_\mu n_\mu^2 = n_G . \qquad (11)$$

As another application of Theorem 14, we see directly from equation (9) that

$$\chi^{\mathrm{reg}}(g) = \sum_\mu n_\mu\chi^\mu(g) = \begin{cases} n_G & \text{if } g = e \\ 0 & \text{if } g \neq e . \end{cases}$$

Now we are finally in a position to prove the last important orthogonality relation.
Starting from equation (6)

$$n_in_j\chi^\mu(C_i)\chi^\mu(C_j) = n_\mu\sum_k c_{ijk}n_k\chi^\mu(C_k)$$

we sum over all irreps $\mu$ and use the fact that $\{e\} = [e] = C_1$ and $e \notin C_k$ for $k \neq 1$ to obtain

$$\sum_\mu n_in_j\chi^\mu(C_i)\chi^\mu(C_j) = \sum_k c_{ijk}n_k\sum_\mu n_\mu\chi^\mu(C_k) = c_{ij1}n_1n_G .$$

But $n_1 = 1$, and from equation (5) we know that $c_{ij1} = n_i\delta_{ji'}$, so we have

$$\sum_\mu n_j\chi^\mu(C_i)\chi^\mu(C_j) = n_G\delta_{ji'} .$$

Since $C_{j'}$ is the class of all inverse elements to $C_j$, if we take each irrep $D^\mu$ to be unitary, then $D^\mu(g^{-1}) = D^\mu(g)^{-1} = D^\mu(g)^\dagger$, so that $\chi^\mu(C_j) = \chi^\mu(C_{j'})^*$. Using $n_j = n_{j'}$ we can now write

$$\sum_\mu n_{j'}\chi^\mu(C_i)\chi^\mu(C_{j'})^* = n_G\delta_{ji'} .$$

Now relabel by letting $j' = k$ or, equivalently, $j = k'$. Noting that $\delta_{k'i'} = \delta_{ki}$ we arrive at our desired result

$$\sum_\mu \chi^\mu(C_k)^*\chi^\mu(C_i) = \frac{n_G}{n_k}\delta_{ki} . \qquad (12)$$

At last we are in a position to prove our assertion that the number of classes is equal to the number of inequivalent irreducible representations. Start by letting $\mu = \nu$ in equation (8):

$$\sum_k n_k\chi^\mu(C_k)^*\chi^\mu(C_k) = n_G .$$

Suppose there are $n_c$ classes and $n_r$ irreps. Summing this last equation over all irreps yields

$$\sum_{\mu=1}^{n_r}\sum_{k=1}^{n_c} n_k\chi^\mu(C_k)^*\chi^\mu(C_k) = n_Gn_r .$$

Similarly, set $k = i$ in equation (12) and sum over classes to obtain

$$\sum_{\mu=1}^{n_r}\sum_{k=1}^{n_c} n_k\chi^\mu(C_k)^*\chi^\mu(C_k) = n_Gn_c .$$

Comparison of these last two equations shows that $n_r = n_c$, and hence we have proved

Theorem 15. For a finite group $G$, the number of inequivalent irreducible representations is equal to the number of classes.

Example 26. If a group is abelian, then each element forms its own class. Therefore, by Theorem 15, an abelian group of order $n_G$ has $n_G$ inequivalent irreducible representations. But equation (11), $\sum_\mu n_\mu^2 = n_G$, then forces every one of them to be one-dimensional, so there can be no other irreps. This is the same result derived in Example 13 using Schur's lemma 1.

The main use of these orthogonality equations is that they allow us to construct character tables with entries consisting of the characters of each class within each irrep.
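Theorem 15 and equation (11) together already restrict the possible irrep dimensions quite strongly. As a rough illustration (the function name `irrep_dimensions` and its search strategy are choices made here, not part of the notes), the following sketch lists every non-decreasing set of dimensions, starting with the trivial one-dimensional irrep, whose squares sum to the group order when the number of irreps equals a given number of classes.

```python
def irrep_dimensions(order, num_classes):
    """All non-decreasing tuples of num_classes positive integers, beginning
    with 1 (the trivial irrep), whose squares sum to the group order."""
    solutions = []

    def extend(partial, remaining):
        if len(partial) == num_classes:
            if remaining == 0:
                solutions.append(tuple(partial))
            return
        d = partial[-1]  # keep dimensions non-decreasing to avoid duplicates
        while d * d <= remaining:
            extend(partial + [d], remaining - d * d)
            d += 1

    extend([1], order - 1)  # the trivial irrep always contributes 1^2
    return solutions

print(irrep_dimensions(6, 3))  # [(1, 1, 2)]
print(irrep_dimensions(6, 6))  # [(1, 1, 1, 1, 1, 1)]
print(irrep_dimensions(2, 2))  # [(1, 1)]
```

For a group of order 6 with three classes the only possibility is two one-dimensional irreps and one two-dimensional irrep, while six classes (the abelian case) forces six one-dimensional irreps.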
Since the number of irreps is equal to the number of classes, we can construct an $n_c \times n_c$ "matrix" with rows labeled by the irreps $D^\mu$ and columns labeled by classes $n_k C_k$ where we include the number of elements in each class in our label (but not in the corresponding entry).

Example 27. One consequence of equation (11) (at least for smaller groups) is that there is often a unique solution to $\sum_\mu n_\mu^2 = n_G$. For example, a non-abelian group of order 6 must have two one-dimensional irreps and a single two-dimensional irrep because $1^2 + 1^2 + 2^2 = 6$ is the only decomposition of 6 as a sum of squares (not counting $\sum 1^2$, the sum of six ones, which corresponds to an abelian group as in Example 26). Now, we know that any group always has the trivial one-dimensional representation $D^1(g) = 1$. Referring to Examples 17 and 22, we see that another one-dimensional representation may be defined by $D^2(g) = \det g$, and we have also explicitly shown a two-dimensional representation from which you can find the characters (the trace of each matrix). Thus we have the following character table:

\[
\begin{array}{c|ccc}
 & C_1 & 3C_2 & 2C_3 \\ \hline
D^1 & 1 & 1 & 1 \\
D^2 & 1 & -1 & 1 \\
D^3 & 2 & 0 & -1
\end{array}
\]

Note that the rows are orthogonal according to equation (8), and the columns are orthogonal according to equation (12).

Even though a character table contains far less information than an entire set of irrep matrices, it is often enough for the problem at hand. In fact, for the small groups usually of interest, it is possible to fill in the table without even constructing an explicit matrix representation of the group. The general procedure is the following:

1. Find the number of classes by using the group multiplication table (or physical considerations if possible).

2. Find the dimensionalities $n_\mu$ from $\sum_\mu n_\mu^2 = n_G$. In simple cases this usually has a unique solution. Since the identity element is represented by the identity matrix, the character (trace) of the identity class gives $\chi^\mu(e) = n_\mu$ which gives the first column of the table.
   In addition, since we always have the trivial representation, we know that the first row always has $\chi^1(C_k) = 1$.

3. The rows obey the orthogonality relation (equation (8))

\[
\sum_k n_k\, \chi^\mu(C_k)^* \chi^\nu(C_k) = n_G\, \delta_{\mu\nu} .
\]

4. The columns obey the orthogonality relation (equation (12))

\[
\sum_\mu \chi^\mu(C_k)^* \chi^\mu(C_i) = \frac{n_G}{n_k}\, \delta_{ki} .
\]

5. Entries within the $\mu$th row are related by (equation (6))

\[
n_i n_j\, \chi^\mu(C_i)\, \chi^\mu(C_j) = n_\mu \sum_k c_{ijk}\, n_k\, \chi^\mu(C_k) .
\]
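Steps 2 through 4 of the procedure above can be sketched numerically for the order-6 character table of Example 27. This is a minimal illustration rather than a general algorithm: the table entries and class sizes are copied from that example, the brute-force search over dimensions is one possible way to carry out step 2, and the variable names are illustrative. All characters in this table are real, so complex conjugation is omitted.

```python
from itertools import combinations_with_replacement

n_G = 6
class_sizes = [1, 3, 2]   # n_k for the classes C1, 3C2, 2C3
table = [
    [1, 1, 1],    # D1 (trivial representation)
    [1, -1, 1],   # D2
    [2, 0, -1],   # D3
]
n_c = len(class_sizes)

# Step 2: all ways to write n_G as a sum of n_c squares of positive integers.
dims = [
    d for d in combinations_with_replacement(range(1, n_G + 1), n_c)
    if sum(x * x for x in d) == n_G
]

# The first column of the table holds the dimensions chi(e) = n_mu.
first_column = sorted(row[0] for row in table)

# Step 3: row orthogonality, sum_k n_k chi_mu(C_k) chi_nu(C_k) = n_G delta_{mu nu}.
rows_ok = all(
    sum(n_k * a * b for n_k, a, b in zip(class_sizes, table[mu], table[nu]))
    == (n_G if mu == nu else 0)
    for mu in range(n_c) for nu in range(n_c)
)

# Step 4: column orthogonality, written as
# n_k * sum_mu chi_mu(C_k) chi_mu(C_i) = n_G delta_{ki} to stay in integers.
columns_ok = all(
    class_sizes[k] * sum(row[k] * row[i] for row in table)
    == (n_G if k == i else 0)
    for k in range(n_c) for i in range(n_c)
)

print("dimension solutions:", dims)
print("rows orthogonal:", rows_ok, "columns orthogonal:", columns_ok)
```

With three classes fixed in advance, the search in step 2 returns a single solution, which matches the first column of the table.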