## Group actions I

There is something odd about the experience of learning group theory. At first, one is told that the great virtue of groups is their abstractness: many mathematical structures, from number systems, to sets of permutations, to symmetries, to automorphisms of other algebraic structures, to invariants of geometric objects (these last two are examples you won’t meet for a while) have important properties in common, and these are encapsulated in a small set of axioms that lead to a rich theory with applications throughout mathematics. So far so good — understanding about abstraction is wonderful and mind-expanding and the definition of a group is one of the best examples.

But then one studies group actions (and later group representations). They appear to be doing the reverse of abstraction: we take an abstract group and find a way of thinking of it as a group of symmetries. And that is supposed to help us understand the group better — so much so that group actions are an indispensable part of group theory.

So is abstraction good or bad? Well, both the views above are correct. Abstraction does indeed play a very important clarifying role, by showing us that many apparently different phenomena are basically the same, and isolating the aspects of those phenomena that really matter. However, if a group is defined for us in an abstract way (I’ll say more precisely what I mean by this later), then showing that it is isomorphic to a group of symmetries can make it much easier to answer questions about that group.

In this post, and one or two further ones, I want to discuss what a group action actually is, the orbit-stabilizer theorem and how to remember its proof, and how to use group actions to prove facts about groups.

### What is a group action?

There are two ways of defining group actions. I don’t know which one you were given in lectures, but most people give the first and then mention the second more as an aside.

Definition 1. Let $G$ be a group and let $X$ be a set. An action of $G$ on $X$ is a function $\phi:G\times X\to X$ with the following properties: $\phi(e,x)=x$ for every $x\in X$, and $\phi(g,\phi(h,x))=\phi(gh,x)$ for every $g,h\in G$ and every $x\in X$.

Definition 2. Let $G$ be a group and let $X$ be a set. Let $S(X)$ be the group of all permutations of $X$. An action of $G$ on $X$ is a homomorphism $\phi:G\to S(X)$.

I much prefer the second of these, because I find it far more intuitive: an action of a group $G$ is a way of thinking of the elements of $G$ as symmetries of some sort. It’s tempting to say that an action of $G$ is a way of regarding $G$ as a symmetry group, but that’s not quite correct, because we allow different elements of $G$ to give us the same symmetry. For example, here’s an action of the permutation group $S_n$ on the set $\{1,2\}$. There are two permutations of $\{1,2\}$, namely the identity and $(12)$: map all even permutations to the identity and all odd permutations to $(12)$.

What do we have to check in order to be sure that that is an action? If $\rho$ and $\sigma$ are elements of $S_n$ and $\phi(\rho)$ and $\phi(\sigma)$ are the associated permutations of $\{1,2\}$, then we need $\phi(\rho\sigma)$ to equal $\phi(\rho)\phi(\sigma)$. That is an easy consequence of facts about what you get when you multiply even and odd permutations together, which correspond closely to facts about what happens when you multiply the identity and $(12)$ together. (For example, odd times odd is even, and $(12)(12)=$identity.)
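
If you would like to see that check carried out by brute force, here is a minimal sketch in Python. The names `compose`, `is_even` and `phi` are my own choices for illustration, and I write the two-element set as $\{0,1\}$ rather than $\{1,2\}$ because Python counts from zero. It simply tests $\phi(\rho\sigma)=\phi(\rho)\phi(\sigma)$ for every pair of permutations in $S_n$.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

IDENTITY = (0, 1)  # the identity permutation of the two-element set
SWAP = (1, 0)      # the transposition, written (12) in the text

def phi(p):
    """Send even permutations to the identity and odd ones to the swap."""
    return IDENTITY if is_even(p) else SWAP

def is_homomorphism(n):
    """Check phi(rs) == phi(r)phi(s) for all r, s in S_n."""
    group = list(permutations(range(n)))
    return all(
        phi(compose(r, s)) == compose(phi(r), phi(s)) for r in group for s in group
    )
```

Calling `is_homomorphism(4)` runs through all 576 pairs in $S_4$. Of course this is no substitute for the parity argument, which works for every $n$ at once.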

If $\phi$ is a homomorphism from $G$ to $S(X)$, that means that for each $g\in G$, $\phi(g)$ is a permutation of $X$. (Here I am doing something easy but important: making sure I am clear in my mind about what kinds of objects things are. It’s a very good habit to get into.) That means that I will find myself writing slightly odd expressions like $\phi(g)(x)$: since $\phi(g)$ is a permutation of $X$, which is in turn a kind of function from $X$ to $X$, I must be able to apply $\phi(g)$ to elements of $X$.

If one is careful, it can be nice to imagine that the elements of $G$ themselves are “doing the transformation” to $X$. That is, instead of $\phi$ turning $g$ into a bijection, which in turn does things to elements of $X$, we allow ourselves (once we have carefully defined what the action is) to write expressions like $g(x)$, and simply understand that this is shorthand for $\phi(g)(x)$. Then the main property that an action has to have is that $(gh)(x)$ is the same as $g(h(x))$ for every $g,h\in G$ and every $x\in X$. (The first of these means you multiply $g$ and $h$ and then apply the transformation that corresponds to $gh$, whereas the second means that you apply the transformation that corresponds to $h$ and then the transformation that corresponds to $g$.)
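
To see the two definitions side by side, here is a small sketch built on an example of my own choosing (it is not one from the discussion above): the cyclic group of integers mod 6 under addition, acting on $\{0,1,2\}$ by $x\mapsto x+g$ mod 3. The function `phi` is the homomorphism of Definition 2, and `act` is the two-variable function of Definition 1, obtained by writing $g(x)$ for $\phi(g)(x)$.

```python
N = 6          # the group: integers mod 6 under addition (illustrative assumption)
X = range(3)   # the set being acted on

def phi(g):
    """Definition 2: send g to a permutation of X, stored as a tuple."""
    return tuple((x + g) % 3 for x in X)

def act(g, x):
    """Definition 1: the shorthand g(x), which means phi(g)(x)."""
    return phi(g)[x]

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[x]] for x in X)

def satisfies_definition_1():
    identity_ok = all(act(0, x) == x for x in X)
    compatible = all(
        act(g, act(h, x)) == act((g + h) % N, x)
        for g in range(N) for h in range(N) for x in X
    )
    return identity_ok and compatible

def satisfies_definition_2():
    each_is_permutation = all(sorted(phi(g)) == list(X) for g in range(N))
    homomorphism = all(
        phi((g + h) % N) == compose(phi(g), phi(h))
        for g in range(N) for h in range(N)
    )
    return each_is_permutation and homomorphism
```

Both checks pass. Note also that `phi(0) == phi(3)`, so in this example different group elements give the same transformation.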

There is one example that is a good one to have in your head, as it gives you a very good idea of what an action is. Let $G$ be the alternating group $A_4$ and let $T$ be a regular tetrahedron, and label its vertices 1, 2, 3 and 4. For each permutation $\sigma$ in $A_4$ and each position of $T$, we can find a rotation that permutes the vertices of $T$ according to $\sigma$. For example, to achieve the permutation $(12)(34)$ we do a half turn about the line that joins the midpoints of the edges linking 1 to 2 and 3 to 4, and to achieve the permutation $(123)$ we rotate through 120 degrees about the line that joins vertex 4 to the centre of the opposite face. (Note that these axes depend on the position of $T$ rather than being fixed. See the discussion in the post on permutations.)

The action just described is a faithful action, which means that different elements of the group correspond to different transformations. (More formally, the homomorphism from $G$ to $S(X)$ is an injection.) However, we can also use this set-up to define a second action of $A_4$. Let $X$ be the set that consists of the three lines that join midpoints of opposite edges. (That is, $X$ is a set with three elements, each of which is a line.) Then any rotation of the tetrahedron will also permute these three lines, so $A_4$ acts on $X$. This action is not faithful: for example, a half turn about one of the lines fixes all three lines (the other two rotate through 180 degrees but they still map to themselves). In a later post, we shall see that this gives us a very clear explanation of an important fact about the group $A_4$.
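
Here is a hedged sketch of that second action in Python. I am assuming a particular encoding that is not in the text: the vertices are labelled 0 to 3, and each of the three lines is identified with the pair of opposite edges whose midpoints it joins, that is, with a way of splitting the four vertices into two pairs. The names `LINES`, `act_on_line` and `induced_permutation` are mine.

```python
from itertools import permutations

def is_even(p):
    """A permutation is even exactly when it has an even number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

# The twelve even permutations of the vertices 0, 1, 2, 3.
A4 = [p for p in permutations(range(4)) if is_even(p)]

# Each line joins the midpoints of two opposite edges, so encode it as that pair of edges.
LINES = [
    frozenset({frozenset({0, 1}), frozenset({2, 3})}),
    frozenset({frozenset({0, 2}), frozenset({1, 3})}),
    frozenset({frozenset({0, 3}), frozenset({1, 2})}),
]

def act_on_line(p, line):
    """Move both edges of the line by the vertex permutation p."""
    return frozenset(frozenset(p[v] for v in edge) for edge in line)

def induced_permutation(p):
    """The permutation of the three lines (as indices 0, 1, 2) that p gives rise to."""
    return tuple(LINES.index(act_on_line(p, line)) for line in LINES)

# Elements of A4 that fix all three lines.
kernel = [p for p in A4 if induced_permutation(p) == (0, 1, 2)]
```

With this encoding, `kernel` consists of the identity and the three half turns, so four different group elements act as the identity on $X$, which is exactly the failure of faithfulness described above.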

### Group presentations

I want to end this post by elaborating on what I meant by “defining a group in an abstract way”. You should by now have met the dihedral groups. The dihedral group $D_{2n}$ of order $2n$ can be defined in two rather different ways. The first way is concrete: it is the group of symmetries of an $n$-gon. By that I don’t really mean that its elements have to be symmetries of $n$-gons, but rather that any group that is isomorphic to the group of symmetries of the $n$-gon counts as an instantiation of the abstract group $D_{2n}$.

Another way of defining groups is by using generators and relations. This is called giving a presentation of the group. In the case of the group $D_{2n}$ the usual presentation uses two generators, $a$ and $b$, say, and the relations $a^2=b^n=e$ and $aba=b^{-1}$. It’s not hard to use these relations to reduce every product you can make out of $a$, $b$, $a^{-1}$ and $b^{-1}$ to an element of the form $b^j$ or $ab^j$ with $0\leq j\leq n-1$. For example, if $n=5$ (so we are talking about the symmetry group of the pentagon), and I take the product $ab^7a^{-3}b^{-2}ab^{-4}a^2b^{-2}a$, then I can do a series of obvious simplifications as follows. First, since $a^2=e$, I can change every power of $a$ to either $e$ or $a$. If I do that, I get $ab^7ab^{-2}ab^{-6}a$. In a similar way, I can change all powers of $b$ so that they are between 0 and 4. That gets me to $ab^2ab^3ab^4a$. Thirdly, the relation $aba=b^{-1}$ implies that $ba=ab^{-1}$. That is, I can move an $a$ from the right of a $b$ to the left of that $b$ if I change the $b$ to a $b^{-1}$. From that it follows that I can do the same with powers of $b$. For example,
$b^3a=bbba=bbab^{-1}=bab^{-1}b^{-1}=ab^{-1}b^{-1}b^{-1}=ab^{-3}$.

Going back to the expression $ab^2ab^3ab^4a$, I can use the above fact to change it to
$ab^2ab^3aab^{-4}=ab^2ab^3b^{-4}=ab^2ab^{-1}=ab^2ab^4$.
Applying the little fact again, we get
$ab^2ab^4=aab^{-2}b^4=b^{-2}b^4=b^2$.
So we have a simple algorithm for putting all “words” (that is, expressions made out of $a$, $b$, $a^{-1}$ and $b^{-1}$) into a standard form. With a bit more effort, one can show that no two expressions in standard form are equal. For example, if $ab^3=b^4$ we would deduce that $a=b$ (by multiplying both sides on the right by $b^{-3}$), which is false.
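
The algorithm is short enough to write down as code. What follows is a sketch under my own conventions, not a definitive implementation: a word is a list of `(letter, exponent)` pairs read from left to right, and the answer `(i, j)` stands for $a^ib^j$ with $i\in\{0,1\}$ and $0\leq j\leq n-1$. The only facts used are $a^2=b^n=e$ and the rule $b^ja=ab^{-j}$ derived above.

```python
def standard_form(word, n):
    """Reduce a word in a and b to (i, j), meaning a^i b^j in the dihedral group of order 2n."""
    i, j = 0, 0  # the empty word is a^0 b^0
    for letter, exponent in word:
        if letter == "a":
            # Even powers of a are the identity; an odd power behaves like a single a.
            if exponent % 2 == 1:
                # a^i b^j a = a^(i+1) b^(-j), using b^j a = a b^(-j).
                i = 1 - i
                j = (-j) % n
        elif letter == "b":
            # a^i b^j b^k = a^i b^(j+k), with exponents of b taken mod n.
            j = (j + exponent) % n
        else:
            raise ValueError(f"unexpected letter {letter!r}")
    return i, j

# The word a b^7 a^-3 b^-2 a b^-4 a^2 b^-2 a from the discussion above, with n = 5.
example = [("a", 1), ("b", 7), ("a", -3), ("b", -2), ("a", 1),
           ("b", -4), ("a", 2), ("b", -2), ("a", 1)]
```

Here `standard_form(example, 5)` returns `(0, 2)`, that is, $b^2$, in agreement with the calculation by hand.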

Actually, why is it false? How can we be sure that there isn’t some strange way of using the relations $a^2=b^5=e$ and $aba=b^{-1}$ to show that $a=b$? One quick answer is that we can find a concrete group — the symmetry group of a pentagon — and two elements of that group — one reflection and one rotation — that satisfy the given relations. If those relations implied that $a=b$ then they would enable us to deduce that a reflection of the pentagon was equal to a rotation of the pentagon, which is just plain false.
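
That concrete check can be done mechanically. In the sketch below I make some choices that are not in the text: the vertices of the pentagon are labelled 0 to 4 in order, the rotation `B` sends $v$ to $v+1$ mod 5, and the reflection `A` sends $v$ to $-v$ mod 5. The helper names are hypothetical as well.

```python
def compose(p, q):
    """Apply q first, then p (permutations stored as tuples)."""
    return tuple(p[q[v]] for v in range(len(q)))

def power(p, k):
    """The k-fold composite of p with itself."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result

E = (0, 1, 2, 3, 4)                               # identity
B = tuple((v + 1) % 5 for v in range(5))          # rotation by one step
A = tuple((-v) % 5 for v in range(5))             # reflection fixing vertex 0
B_INVERSE = tuple((v - 1) % 5 for v in range(5))  # rotation by one step the other way

# The relations a^2 = b^5 = e and aba = b^(-1).
relations_hold = (
    compose(A, A) == E
    and power(B, 5) == E
    and compose(A, compose(B, A)) == B_INVERSE
)

def generated_group(generators):
    """All permutations obtainable by composing the generators, found by exhaustive search."""
    group = {E}
    frontier = [E]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = compose(g, p)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

pentagon_symmetries = generated_group([A, B])
```

With these choices `relations_hold` is true while `A` and `B` are different permutations, so the relations cannot force $a=b$. The set `pentagon_symmetries` has ten elements.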

That same argument shows that $D_{10}$ has at least 10 elements, and since it has at most 10 elements (since there are 10 distinct standard forms) it has exactly 10 elements. By using the standard-form algorithm we can build up the multiplication table. For example, $ab^3ab^2=a^2b^{-3}b^2=b^4$, and so on.

So now we have two ways of thinking about $D_{2n}$. Either it is the group with generators $a$ and $b$ and relations $a^2=b^n=e$ and $aba=b^{-1}$, or it is the group of symmetries of an $n$-gon.

The main point of what I want to say in this section is that there is a danger that you will become too keen on abstraction. Certain facts about $D_{2n}$ are obvious if you think of it as a symmetry group and quite a lot less obvious if you argue directly from a presentation. For example, $D_{12}$ contains (a copy of) $D_6$ as a subgroup. One proof of that fact consists in arguing as follows. Let $a$ and $b$ be the generators of $D_6$ and let $c$ and $d$ be the generators of $D_{12}$. Then the function that takes $b^j$ to $d^{2j}$ and $ab^j$ to $cd^{2j}$ is an isomorphism from $D_6$ to its image, which is a subgroup of $D_{12}$. Checking this is slightly fiddly, though not too hard.

How much more transparent, however, is the following argument. $D_{12}$ is the group of symmetries of a regular hexagon. If you join alternate vertices of the hexagon you get an equilateral triangle. All the symmetries of that triangle are given in an obvious way by symmetries of the hexagon, so the symmetry group of the triangle is a subgroup of the symmetry group of the hexagon.
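
For what it is worth, the geometric argument is also easy to confirm by machine. The sketch below assumes the hexagon's vertices are labelled 0 to 5 in order, so that the triangle on alternate vertices is $\{0,2,4\}$; the variable names are my own.

```python
from itertools import permutations

HEXAGON = range(6)
TRIANGLE = (0, 2, 4)  # alternate vertices of the hexagon

# The twelve symmetries of the hexagon, as permutations of its vertices.
rotations = [tuple((v + k) % 6 for v in HEXAGON) for k in range(6)]
reflections = [tuple((k - v) % 6 for v in HEXAGON) for k in range(6)]
D12 = rotations + reflections

# Symmetries of the hexagon that map the triangle to itself.
preserving = [p for p in D12 if {p[v] for v in TRIANGLE} == set(TRIANGLE)]

# What those symmetries do to the triangle's vertices.
restrictions = {tuple(p[v] for v in TRIANGLE) for p in preserving}

# Every rearrangement of the three vertices is a symmetry of an equilateral triangle.
all_triangle_symmetries = set(permutations(TRIANGLE))
```

Six of the twelve symmetries preserve the triangle, and `restrictions == all_triangle_symmetries`, so between them they give every symmetry of the triangle exactly once.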

I shall have more to say about group actions, but for now I’ll content myself with this message (which I’ve said a few times, but let me say it once more).

Abstraction is great, but don’t get too carried away with it. In particular, if you know that a group is isomorphic to a group of symmetries, that gives you direct access to a lot of information about it. Don’t throw that information away (unless for some reason you like complicated fiddly proofs that don’t tell you why a result is true).

### 17 Responses to “Group actions I”

1. Paul Matthews Says:

I prefer definition 1, because I like to think of groups acting on R^n (eg A_4 acting on R^3 rather than on the 4 points of a tetrahedron).

• gowers Says:

I don’t understand what you’re saying here. Can’t one just define an action of $A_4$ on $\mathbb{R}^3$ to be a homomorphism from $A_4$ to a group of transformations of $\mathbb{R}^3$ (the obvious group to go for being $SO(3)$)? The question of which definition to use seems to be independent of what kinds of actions you want to consider.

• Paul Matthews Says:

Well yes, but that involves generalising the idea of permutation to transformations (not how permutations are usually taught), as well as the concepts of the group SO(3) and group homomorphisms.
To me (and I would guess to most students just learning the subject?) definition 1 is more natural and easier to understand. I suppose it depends on how one’s brain works – unlike cardster below I think in pictures!

2. Cardster Says:

I like the second definition. It is far, far clearer and makes sense of the usual one. (However, I am highly non-visual and prefer algebraic explanations.)

3. Mazoit Says:

I prefer the infix definition of 1: an action is a product * such that
g*x=y…

It is closer to the notation g(x) which you mention quickly after.

• gowers Says:

I agree that that’s nice — it sort of captures both ways of thinking about actions and lets you pass freely from one to the other.

4. schlafly Says:

I favor introducing groups by defining them as sets of symmetries (i.e., permutations) that are closed under composition and inverse. Learning to think abstractly may be admirable, but I just don’t see how it is better to think of the associative law as a purely algebraic property, and avoid thinking about group actions.

• Qiaochu Yuan Says:

The point of defining groups abstractly isn’t to avoid thinking about group actions; it’s to separate an abstract group from a particular concrete realization of it, in the same way that we separate abstract manifolds from a particular embedding of them into Euclidean space. Just as the same group can act in various ways on various objects, the same manifold can be embedded in various ways into Euclidean space, and separating these two aspects of the group out clarifies many aspects of group theory (e.g. which properties are intrinsic to a group and which depend on a choice of action).

• Anonymous Says:

A group=hand metaphor once occurred to me, and I still like it. If a group is a hand, then a group acting on something is the hand grasping something. The point of groups is to act, just as the point of hands is to grasp. But it’s also very important for hands to be able to let go so they can one day grasp something else. For the same reason, it’s important for groups to be able to let go of their actions.

5. Qiaochu Yuan Says:

Regarding the two definitions of group action, it seems to me that the really crucial point to understand here is the notion of currying, or in other words the natural isomorphism $\text{Hom}(X \times Y, Z) \cong \text{Hom}(X, Z^Y)$ where $Z^Y$ is (in this context) the set of functions $Y \to Z$. When defining a group action (so $Y = Z$ and $X = G$ is a group) it is probably more intuitive to use the second definition, but

1) the two definitions are equivalent for sets by currying, and

2) the second definition behaves poorly once one moves beyond the realm of sets.

For example, the second definition behaves poorly when one considers topological groups and continuous group actions, since in general it may not be possible to put a nice topology on $\text{Aut}(Z)$ such that continuous actions are precisely continuous group homomorphisms $G \to \text{Aut}(Z)$. The situation is even worse for, for example, algebraic groups and algebraic group actions.

6. Bryn Says:

This article has helped to somewhat elucidate group actions for me. The second definition has helped a lot with gaining an intuitive understanding of what is going on, I like it in itself and as an auxiliary to the first definition. You’re a great help, Tim, cheers.

7. student Says:

Dear Gowers:

I have several questions regarding the “group presentations” part:

“It’s not hard to use these relations to reduce every product you can make out of $a,b,a^{-1},\ \hbox{and}\ \ b^{-1}$ to an element of the form $b^j$ or $ab^j$ with $0\leq j\leq n-1$.”

What would be a formal proof of this statement? The argument in this article is based on a few examples. How can one apply the “algorithm” without knowing what the specific word is?

“One quick answer is that we can find a concrete group — the symmetry group of a pentagon — and two elements of that group — one reflection and one rotation — that satisfy the given relations.”

To make this argument rigorous, does one actually show that this $D_{10}$ satisfies the given relations AND no more?

“since there are 10 distinct standard forms” gives that $D_{10}$ has at least 10 elements, so where does the “at most 10” part come from? The sentence “That same argument shows that $D_{10}$ has at least 10 elements, and since it has at most 10 elements (since there are 10 distinct standard forms) it has exactly 10 elements.” is confusing.

8. Groups and Group Actions Lecture 11 | Theorem of the week Says:

[…] also going to meet the notion of a group action.  You could read this post by Tim Gowers to start to get a feel for what that’s […]

9. Groups and Groups Actions: Lecture 12 | Theorem of the week Says:

[…] blog post about group actions by Tim Gowers is well worth a […]

12. algebra 5 – last instructions Says: