Group actions I

There is something odd about the experience of learning group theory. At first, one is told that the great virtue of groups is their abstractness: many mathematical structures, from number systems, to sets of permutations, to symmetries, to automorphisms of other algebraic structures, to invariants of geometric objects (these last two are examples you won’t meet for a while) have important properties in common, and these are encapsulated in a small set of axioms that lead to a rich theory with applications throughout mathematics. So far so good: learning to think abstractly is wonderful and mind-expanding, and the definition of a group is one of the best examples of abstraction.

But then one studies group actions (and later group representations). They appear to be doing the reverse of abstraction: we take an abstract group and find a way of thinking of it as a group of symmetries. And that is supposed to help us understand the group better — so much so that group actions are an indispensable part of group theory.

So is abstraction good or bad? Well, both the views above are correct. Abstraction does indeed play a very important clarifying role, by showing us that many apparently different phenomena are basically the same, and isolating the aspects of those phenomena that really matter. However, if a group is defined for us in an abstract way (I’ll say more precisely what I mean by this later), then showing that it is isomorphic to a group of symmetries can make it much easier to answer questions about that group.

In this post, and one or two further ones, I want to discuss what a group action actually is, the orbit-stabilizer theorem and how to remember its proof, and how to use group actions to prove facts about groups.

What is a group action?

There are two ways of defining group actions. I don’t know which one you were given in lectures, but most people give the first and then mention the second more as an aside.

Definition 1. Let G be a group and let X be a set. An action of G on X is a function \phi:G\times X\to X with the following properties: \phi(e,x)=x for every x\in X, and \phi(g,\phi(h,x))=\phi(gh,x) for every g,h\in G and every x\in X.

Definition 2. Let G be a group and let X be a set. Let S(X) be the group of all permutations of X. An action of G on X is a homomorphism \phi:G\to S(X).

I much prefer the second of these, because I find it far more intuitive: an action of a group G is a way of thinking of the elements of G as symmetries of some sort. It’s tempting to say that an action of G is a way of regarding G as a symmetry group, but that’s not quite correct, because we allow different elements of G to give us the same symmetry. For example, here’s an action of the permutation group S_n on the set \{1,2\}. There are two permutations of \{1,2\}, namely the identity and (12): map all even permutations to the identity and all odd permutations to (12).

What do we have to check in order to be sure that that is an action? If \rho and \sigma are elements of S_n and \phi(\rho) and \phi(\sigma) are the associated permutations of \{1,2\}, then we need \phi(\rho\sigma) to equal \phi(\rho)\phi(\sigma). That is an easy consequence of facts about what you get when you multiply even and odd permutations together, which correspond closely to facts about what happens when you multiply the identity and (12) together. (For example, odd times odd is even, and (12)(12)=identity.)
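The check just described can also be done by brute force for small n. Here is a minimal Python sketch, assuming permutations are stored as tuples on 0, ..., n-1 and that parity is computed by counting inversions; the names `phi`, `compose` and `is_action` are illustrative choices and nothing more.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

IDENTITY = {1: 1, 2: 2}   # the identity permutation of {1, 2}
SWAP = {1: 2, 2: 1}       # the permutation (12) of {1, 2}

def phi(p):
    """The map from S_n to S({1,2}) described in the text."""
    return IDENTITY if is_even(p) else SWAP

def compose_maps(f, g):
    """Compose two permutations of {1, 2} stored as dictionaries."""
    return {x: f[g[x]] for x in g}

def is_action(n):
    """Check phi(rho sigma) == phi(rho) phi(sigma) for all rho, sigma in S_n."""
    group = list(permutations(range(n)))
    return all(phi(compose(r, s)) == compose_maps(phi(r), phi(s))
               for r in group for s in group)

print(is_action(4))  # True
```

Of course this proves nothing for general n, but it is a quick way of convincing yourself that you have understood what needs to be checked.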

If \phi is a homomorphism from G to S(X), that means that for each g\in G, \phi(g) is a permutation of X. (Here I am doing something easy but important: making sure I am clear in my mind about what kinds of objects things are. It’s a very good habit to get into.) That means that I will find myself writing slightly odd expressions like \phi(g)(x): since \phi(g) is a permutation of X, which is in turn a kind of function from X to X, I must be able to apply \phi(g) to elements of X.

If one is careful, it can be nice to imagine that the elements of G themselves are “doing the transformation” to X. That is, instead of \phi turning g into a bijection, which in turn does things to elements of X, we allow ourselves (once we have carefully defined what the action is) to write expressions like g(x), and simply understand that this is shorthand for \phi(g)(x). Then the main property that an action has to have is that (gh)(x) is the same as g(h(x)) for every g,h\in G and every x\in X. (The first of these means you multiply g and h and then apply the transformation that corresponds to gh, whereas the second means that you apply the transformation that corresponds to h and then the transformation that corresponds to g.)
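To see how the two definitions fit together, it may help to look at a small computational sketch. It takes S_3 acting on {0,1,2} in the obvious way, written as a two-argument function in the style of Definition 1, and then turns it into a map g -> \phi(g) in the style of Definition 2. All the names here (`act`, `curry`, `not_an_action` and so on) are hypothetical, and `not_an_action` is a deliberately wrong example included to show the checks failing.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3: a tuple g with g[i] the image of i
X = list(range(3))
E = (0, 1, 2)                     # the identity element

def mul(g, h):
    """Group product gh: apply h first, then g."""
    return tuple(g[h[i]] for i in X)

def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def act(g, x):
    """Definition 1 style: a function of two arguments, G x X -> X."""
    return g[x]

def not_an_action(g, x):
    """Hypothetical non-example: applying g^{-1} reverses the order of products."""
    return inverse(g)[x]

def satisfies_definition_1(phi):
    identity_ok = all(phi(E, x) == x for x in X)
    product_ok = all(phi(g, phi(h, x)) == phi(mul(g, h), x)
                     for g in G for h in G for x in X)
    return identity_ok and product_ok

def curry(phi):
    """Definition 2 style: send g to the function x -> phi(g, x)."""
    return lambda g: (lambda x: phi(g, x))

def satisfies_definition_2(hom):
    def table(f):
        return tuple(f(x) for x in X)
    # Each hom(g) must be a permutation of X ...
    bijections = all(sorted(table(hom(g))) == X for g in G)
    # ... and hom(gh) must equal hom(g) composed with hom(h).
    homomorphism = all(table(hom(mul(g, h))) == tuple(hom(g)(hom(h)(x)) for x in X)
                       for g in G for h in G)
    return bijections and homomorphism

print(satisfies_definition_1(act), satisfies_definition_2(curry(act)))
print(satisfies_definition_1(not_an_action), satisfies_definition_2(curry(not_an_action)))
```

The expression `curry(act)(g)(x)` is exactly the slightly odd-looking \phi(g)(x) from above.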

There is one example that is a good one to have in your head, as it gives you a very good idea of what an action is. Let G be the alternating group A_4 and let T be a regular tetrahedron, and label its vertices 1, 2, 3 and 4. For each permutation \sigma in A_4 and each position of T, we can find a rotation that permutes the vertices of T according to \sigma. For example, to achieve the permutation (12)(34) we do a half turn about the line that joins the midpoints of the edges linking 1 to 2 and 3 to 4, and to achieve the permutation (123) we do a rotation through 120 degrees about the line that joins vertex 4 to the middle of the opposite face. (Note that these axes depend on the position of T rather than being fixed. See the discussion in the post on permutations.)

The action just described is a faithful action, which means that different elements of the group correspond to different transformations. (More formally, the homomorphism from G to S(X) is an injection.) However, we can also use this set-up to define a second action of A_4. Let X be the set that consists of the three lines that join midpoints of opposite edges. (That is, X is a set with three elements, each of which is a line.) Then any rotation of the tetrahedron will also permute these three lines, so A_4 acts on X. This action is not faithful: for example, a half turn about one of the lines fixes all three lines (the other two rotate through 180 degrees but they still map to themselves). In a later post, we shall see that this gives us a very clear explanation of an important fact about the group A_4.
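The action on the three lines can be explored with a short computation. The sketch below makes one modelling assumption: each line is represented by the pair of opposite edges whose midpoints it joins, so that a permutation of the vertices moves a line by moving those two edges. The names `LINES`, `move_line` and `induced_permutation` are illustrative.

```python
from itertools import permutations

VERTICES = (1, 2, 3, 4)

def is_even(images):
    """Parity of the permutation i -> images[i-1], by counting inversions."""
    n = len(images)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if images[i] > images[j])
    return inversions % 2 == 0

# A_4 as dictionaries vertex -> image, one for each even permutation.
A4 = [dict(zip(VERTICES, images))
      for images in permutations(VERTICES) if is_even(images)]

def edge(u, v):
    return frozenset({u, v})

# Each line joins the midpoints of a pair of opposite edges.
LINES = [frozenset({edge(1, 2), edge(3, 4)}),
         frozenset({edge(1, 3), edge(2, 4)}),
         frozenset({edge(1, 4), edge(2, 3)})]

def move_line(g, line):
    """Apply the vertex permutation g to both edges that determine a line."""
    return frozenset(frozenset(g[v] for v in e) for e in line)

def induced_permutation(g):
    """The permutation of the three lines (as indices 0, 1, 2) induced by g."""
    return tuple(LINES.index(move_line(g, line)) for line in LINES)

kernel = [g for g in A4 if induced_permutation(g) == (0, 1, 2)]
image = {induced_permutation(g) for g in A4}
print(len(A4), len(kernel), len(image))  # 12 4 3
```

So four of the twelve elements fix all three lines (the identity and the three half turns), which confirms that the action is not faithful.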

Group presentations

I want to end this post by elaborating on what I meant by “defining a group in an abstract way”. You should by now have met the dihedral groups. The dihedral group D_{2n} of order 2n can be defined in two rather different ways. The first way is concrete: it is the group of symmetries of an n-gon. By that I don’t really mean that its elements have to be symmetries of n-gons, but rather that any group that is isomorphic to the group of symmetries of the n-gon counts as an instantiation of the abstract group D_{2n}.

Another way of defining groups is by using generators and relations. This is called giving a presentation of the group. In the case of the group D_{2n} the usual presentation uses two generators, a and b, say, and the relations a^2=b^n=e and aba=b^{-1}. It’s not hard to use these relations to reduce every product you can make out of a, b, a^{-1} and b^{-1} to an element of the form b^j or ab^j with 0\leq j\leq n-1. For example, if n=5 (so we are talking about the symmetry group of the pentagon), and I take the product ab^7a^{-3}b^{-2}ab^{-4}a^2b^{-2}a, then I can do a series of obvious simplifications as follows. First, since a^2=e, I can change every power of a to either e or a. If I do that, I get ab^7ab^{-2}ab^{-6}a. In a similar way, I can change all powers of b so that they are between 0 and 4. That gets me to ab^2ab^3ab^4a. Thirdly, the relation aba=b^{-1} implies that ba=ab^{-1}. That is, I can move an a from the right of a b to the left of that b if I change the b to a b^{-1}. From that it follows that I can do the same with powers of b. For example,
b^3a=bbba=bbab^{-1}=bab^{-1}b^{-1}=ab^{-1}b^{-1}b^{-1}=ab^{-3}.

Going back to the expression ab^2ab^3ab^4a, I can use the above fact to change it to
ab^2ab^3aab^{-4}=ab^2ab^3b^{-4}=ab^2ab^{-1}=ab^2ab^4.
Applying the little fact again, we get
ab^2ab^4=aab^{-2}b^4=b^{-2}b^4=b^2.
So we have a simple algorithm for putting all “words” (that is, expressions made out of a, b, a^{-1} and b^{-1}) into a standard form. With a bit more effort, one can show that no two expressions in standard form are equal. For example, if ab^3=b^4 we would deduce that a=b (by multiplying both sides on the right by b^{-3}), which is false.
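The standard-form algorithm is easy to turn into code. The following Python sketch assumes a word is given as a list of (letter, exponent) pairs and keeps track of a running standard form a^s b^j, using only the facts a^2=e, b^n=e and b^ja=ab^{-j} derived above. The representation and the function names are my own choices for illustration.

```python
def reduce_word(word, n):
    """Reduce a word in a, b to standard form a^s b^j with s in {0,1}, 0 <= j < n.

    `word` is a list of (letter, exponent) pairs, read from left to right.
    """
    s, j = 0, 0                      # the empty word is e = a^0 b^0
    for letter, exponent in word:
        if letter == "a":
            if exponent % 2 == 1:    # a^k is a if k is odd and e if k is even
                # (a^s b^j) a = a^s (a b^{-j}) = a^{s+1} b^{-j}
                s = (s + 1) % 2
                j = (-j) % n
        elif letter == "b":
            j = (j + exponent) % n   # powers of b just add, modulo n
        else:
            raise ValueError(f"unknown generator {letter!r}")
    return s, j

def show(s, j):
    """Format a standard form as a string such as 'e', 'a', 'b^2' or 'ab^3'."""
    b_part = "" if j == 0 else ("b" if j == 1 else f"b^{j}")
    return (("a" if s else "") + b_part) or "e"

# The word ab^7a^{-3}b^{-2}ab^{-4}a^2b^{-2}a from the text, with n = 5.
example = [("a", 1), ("b", 7), ("a", -3), ("b", -2), ("a", 1),
           ("b", -4), ("a", 2), ("b", -2), ("a", 1)]
print(show(*reduce_word(example, 5)))  # b^2
```

Note that this processes the word from left to right rather than in the order used in the worked example, but the two approaches use the same relations and give the same answer b^2.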

Actually, why is it false? How can we be sure that there isn’t some strange way of using the relations a^2=b^5=e and aba=b^{-1} to show that a=b? One quick answer is that we can find a concrete group — the symmetry group of a pentagon — and two elements of that group — one reflection and one rotation — that satisfy the given relations. If those relations implied that a=b then they would enable us to deduce that a reflection of the pentagon was equal to a rotation of the pentagon, which is just plain false.
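That concrete check can be carried out explicitly. In the sketch below, which is just one possible model, the vertices of the pentagon are labelled 0 to 4, b is the rotation that sends i to i+1 mod 5 and a is the reflection that sends i to -i mod 5. These labellings are assumptions made for the sake of the example.

```python
N = 5
E = tuple(range(N))

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(N))

def power(p, k):
    """Return p composed with itself k times (k >= 0)."""
    result = E
    for _ in range(k):
        result = compose(p, result)
    return result

b = tuple((i + 1) % N for i in range(N))      # a rotation of the pentagon
a = tuple((-i) % N for i in range(N))         # a reflection of the pentagon
b_inverse = tuple((i - 1) % N for i in range(N))

# The defining relations hold in this concrete group ...
assert power(a, 2) == E
assert power(b, N) == E
assert compose(a, compose(b, a)) == b_inverse
# ... but a and b are visibly different permutations.
assert a != b

# The ten standard forms a^s b^j give ten different symmetries.
elements = {compose(power(a, s), power(b, j)) for s in (0, 1) for j in range(N)}
print(len(elements))  # 10
```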

That same argument shows that D_{10} has at least 10 elements: the ten standard forms b^j and ab^j correspond to ten distinct symmetries of the pentagon, so no two of them can be equal. Since every word reduces to one of these 10 standard forms, D_{10} also has at most 10 elements, so it has exactly 10 elements. By using the standard-form algorithm we can build up the multiplication table. For example, ab^3ab^2=a^2b^{-3}b^2=b^{-1}=b^4, and so on.

So now we have two ways of thinking about D_{2n}. Either it is the group with generators a and b and relations a^2=b^n=e and aba=b^{-1}, or it is the group of symmetries of an n-gon.

The main point of what I want to say in this section is that there is a danger that you will become too keen on abstraction. Certain facts about D_{2n} are obvious if you think of it as a symmetry group and quite a lot less obvious if you argue directly from a presentation. For example, D_{12} contains (a copy of) D_6 as a subgroup. One proof of that fact consists in arguing as follows. Let a and b be the generators of D_6 and let c and d be the generators of D_{12}. Then the function that takes b^j to d^{2j} and ab^j to cd^{2j} is an isomorphism from D_6 to its image, which is a subgroup of D_{12}. Checking this is slightly fiddly, though not too hard.
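For what it is worth, the fiddly check can be delegated to a computer. The sketch below stores an element a^sb^j of D_{2n} as the pair (s, j) and multiplies standard forms using the rule b^ja=ab^{-j} from the previous section. That encoding, and the names `multiply` and `embed`, are assumptions made for this illustration.

```python
def multiply(x, y, n):
    """Product (a^s b^j)(a^t b^k) in D_{2n}, returned in standard form."""
    s, j = x
    t, k = y
    if t % 2 == 1:
        j = -j                 # moving a past b^j turns it into b^{-j}
    return ((s + t) % 2, (j + k) % n)

def embed(x):
    """The map D_6 -> D_{12} from the text: b^j -> d^{2j}, ab^j -> cd^{2j}."""
    s, j = x
    return (s, (2 * j) % 6)

D6 = [(s, j) for s in (0, 1) for j in range(3)]

is_homomorphism = all(embed(multiply(x, y, 3)) == multiply(embed(x), embed(y), 6)
                      for x in D6 for y in D6)
is_injective = len({embed(x) for x in D6}) == len(D6)
print(is_homomorphism, is_injective)  # True True
```

An injective homomorphism is an isomorphism onto its image, so this is the statement in the text. But notice that the computation gives no hint as to why the result is true.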

How much more transparent, however, is the following argument. D_{12} is the group of symmetries of a regular hexagon. If you join alternate vertices of the hexagon you get an equilateral triangle. All the symmetries of that triangle are given in an obvious way by symmetries of the hexagon, so the symmetry group of the triangle is a subgroup of the symmetry group of the hexagon.
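The geometric argument can also be checked directly. In this sketch the vertices of the hexagon are labelled 0 to 5 in order, so that joining alternate vertices gives the triangle with vertices 0, 2 and 4. The labelling and the variable names are choices made for the example.

```python
from itertools import permutations

N = 6
rotations = [tuple((i + k) % N for i in range(N)) for k in range(N)]
reflections = [tuple((k - i) % N for i in range(N)) for k in range(N)]
hexagon_symmetries = rotations + reflections

TRIANGLE = (0, 2, 4)  # alternate vertices of the hexagon

# Symmetries of the hexagon that map the triangle to itself.
preserving = [g for g in hexagon_symmetries
              if {g[v] for v in TRIANGLE} == set(TRIANGLE)]

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(N))

# They are closed under composition, so they form a subgroup.
closed = all(compose(g, h) in preserving for g in preserving for h in preserving)

# Restricting them to the triangle gives every symmetry of the triangle.
restricted = {tuple(g[v] for v in TRIANGLE) for g in preserving}
print(len(set(hexagon_symmetries)), len(preserving), closed,
      restricted == set(permutations(TRIANGLE)))  # 12 6 True True
```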

I shall have more to say about group actions, but for now I’ll content myself with this message (which I’ve said a few times, but let me say it once more).

Abstraction is great, but don’t get too carried away with it. In particular, if you know that a group is isomorphic to a group of symmetries, that gives you direct access to a lot of information about it. Don’t throw that information away (unless for some reason you like complicated fiddly proofs that don’t tell you why a result is true).

12 Responses to “Group actions I”

  1. Paul Matthews Says:

    I prefer definition 1, because I like to think of groups acting on R^n (eg A_4 acting on R^3 rather than on the 4 points of a tetrahedron).

    • gowers Says:

      I don’t understand what you’re saying here. Can’t one just define an action of A_4 on \mathbb{R}^3 to be a homomorphism from A_4 to a group of transformations of \mathbb{R}^3 (the obvious group to go for being SO(3))? The question of which definition to use seems to be independent of what kinds of actions you want to consider.

    • Paul Matthews Says:

      Well yes, but that involves generalising the idea of permutation to transformations (not how permutations are usually taught), as well as the concepts of the group SO(3) and group homomorphisms.
      To me (and I would guess to most students just learning the subject?) definition 1 is more natural and easier to understand. I suppose it depends on how one’s brain works – unlike cardster below I think in pictures!

  2. Cardster Says:

    I like the second definition. It is far, far clearer and makes sense of the usual one. (However, I am highly non-visual and prefer algebraic explanations.)

  3. Mazoit Says:

    I prefer the infix definition of 1: an action is a product * such that
    g*x=y…

    It is closer to the notation g(x) which you mention quickly after.

    • gowers Says:

      I agree that that’s nice — it sort of captures both ways of thinking about actions and lets you pass freely from one to the other.

  4. schlafly Says:

    I favor introducing groups by defining them as sets of symmetries (i.e., permutations) that are closed under composition and inverse. Learning to think abstractly may be admirable, but I just don’t see how it is better to think of the associative law as a purely algebraic property, and avoid thinking about group actions.

    • Qiaochu Yuan Says:

      The point of defining groups abstractly isn’t to avoid thinking about group actions; it’s to separate an abstract group from a particular concrete realization of it, in the same way that we separate abstract manifolds from a particular embedding of them into Euclidean space. Just as the same group can act in various ways on various objects, the same manifold can be embedded in various ways into Euclidean space, and separating these two aspects of the group out clarifies many aspects of group theory (e.g. which properties are intrinsic to a group and which depend on a choice of action).

    • Anonymous Says:

      A group=hand metaphor once occurred to me, and I still like it. If a group is a hand, then a group acting on something is the hand grasping something. The point of groups is to act, just as the point of hands is to grasp. But it’s also very important for hands to be able to let go so they can one day grasp something else. For the same reason, it’s important for groups to be able to let go of their actions.

  5. Qiaochu Yuan Says:

    Regarding the two definitions of group action, it seems to me that the really crucial point to understand here is the notion of currying, or in other words the natural isomorphism \text{Hom}(X \times Y, Z) \cong \text{Hom}(X, Z^Y) where Z^Y is (in this context) the set of functions Y \to Z. When defining a group action (so Y = Z and X = G is a group) it is probably more intuitive to use the second definition, but

    1) the two definitions are equivalent for sets by currying, and

    2) the second definition behaves poorly once one moves beyond the realm of sets.

    For example, the second definition behaves poorly when one considers topological groups and continuous group actions, since in general it may not be possible to put a nice topology on \text{Aut}(Z) such that continuous actions are precisely continuous group homomorphisms G \to \text{Aut}(Z). The situation is even worse for, for example, algebraic groups and algebraic group actions.

  6. Bryn Says:

    This article has helped to somewhat elucidate group actions for me. The second definition has helped a lot with gaining an intuitive understanding of what is going on, I like it in itself and as an auxiliary to the first definition. You’re a great help, Tim, cheers.

  7. student Says:

    Dear Gowers:

    I have several questions regarding the “group representation” part:

    “It’s not hard to use these relations to reduce every product you can make out of a,b,a^{-1},\ \hbox{and}\ \ b^{-1} to an element of the form b^j or ab^j with 0\leq j\leq n-1.”

    What would be a formal proof of this statement? The argument in this article is based on a few examples. How can one apply the “algorithm” without knowing what a specific word is?

    “One quick answer is that we can find a concrete group — the symmetry group of a pentagon — and two elements of that group — one reflection and one rotation — that satisfy the given relations.”

    To make this argument rigorous, does one actually have to show that this D_{10} satisfies the given relations AND no more?

    “since there are 10 distinct standard forms” gives that D_{10} has at least 10 elements, so where does the “at most 10” part come from? The sentence “That same argument shows that D_{10} has at least 10 elements, and since it has at most 10 elements (since there are 10 distinct standard forms) it has exactly 10 elements.” is confusing.
