The traditional presentation of normal subgroups and quotient groups goes something like this. First, you define a subgroup to be normal if it satisfies a certain funny condition. Then, given a group $G$ and a normal subgroup $H$ of $G$, you show that you can define an operation on the cosets of $H$, and that that operation turns the set of all cosets into a group, called the quotient group. Ideally, you also show that one can’t give a natural group structure to the left cosets of an arbitrary subgroup: that justifies restricting attention to normal subgroups.

There’s nothing terribly wrong with this approach, but it does leave one question unanswered: why bother with all this stuff? The traditional approach to *that* question is to ignore it, confident that the answer will gradually reveal itself. The more group theory you do, the more normal subgroups and quotients will arise naturally and demonstrate their utility, so if you just diligently keep studying, you will (fairly soon) come to regard normal subgroups and quotient groups as natural concepts that were obviously worth introducing.

But a slight variant of the question is harder to answer: why did anybody bother to introduce these concepts in the first place? Surely the concepts were introduced for a *reason*, and not in the vague hope that they would turn out to be useful.

The obvious way of answering this second question is to look at the history. It turns out that normal subgroups were introduced by Galois (who was also the founder of group theory) as part of his study of the solubility of polynomials by radicals. That’s slightly unfortunate, since it means that to understand why normal subgroups were introduced, one has to put in a lot of work understanding the theory of solving polynomials.

However, there is another way of justifying the introduction of a new concept into mathematics. Instead of looking at the *actual* history of that concept, one can look at a *fictitious* history. If you can tell a plausible story about why a concept *might have been* invented, then that is sufficient to make it seem reasonable. It solves the mystery of how anyone could have thought of the concept, and it also shows that it was pretty well inevitable that the concept would have been introduced sooner or later.

In this post, then, I’d like to give a fictitious account of why normal subgroups and quotient groups were introduced into group theory, once some of the more basic concepts were already in place.

The first phase of group theory (in this fictitious account) consisted in spotting that many mathematical structures, such as symmetries of Platonic solids, permutations of a finite set, and non-singular matrices, had features in common that could be abstracted out. This led to the formulation of the axioms for group theory: associativity, identity, inverses.

It was noticed very early — indeed, this observation was part of what drove the initial development of the subject — that two groups could be defined very differently and yet be “basically the same”. For example, the group of rotations of a regular pentagon was basically the same as the group of integers mod 5 under addition, and the group of symmetries of a rectangle (that wasn’t a square) was basically the same as the group of transformations of $\mathbb{R}^3$ that consisted of the identity and the three half turns about the coordinate axes.

The first attempts to make precise this intuition that groups could be “different but basically the same” were a little clumsy. Two groups $G$ and $H$ were said to be *identical up to permutation and relabelling* [please bear in mind that none of what I'm saying is true -- this definition included] if you could find orderings of the elements of $G$ and the elements of $H$ such that if you formed the multiplication tables, then they would correspond, in the sense that if the element listed in the $i$th place times the element listed in the $j$th place equals the element listed in the $k$th place in $G$, then the same is true in $H$. Later, this definition was tidied up so that it became the definition of *isomorphism* that we are now familiar with: an *isomorphism* from $G$ to $H$ is a bijection $\phi:G\to H$ such that $\phi(g_1g_2)=\phi(g_1)\phi(g_2)$ for every $g_1,g_2\in G$. Two groups $G$ and $H$ were said to be *isomorphic* if there was an isomorphism between them.
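
The multiplication-table formulation can be checked by brute force on tiny groups. Here is a minimal Python sketch (not from the post; the group encodings and the helper name `isomorphic` are my own) that tries every bijection between two four-element groups and tests multiplicativity:

```python
from itertools import permutations

# Z4: the integers mod 4 under addition.
z4 = list(range(4))
z4_op = lambda a, b: (a + b) % 4

# Rotations of a square, labelled by quarter-turn counts;
# composing rotations adds the counts mod 4.
rot = ['r0', 'r1', 'r2', 'r3']
rot_op = lambda a, b: 'r%d' % ((int(a[1]) + int(b[1])) % 4)

# The Klein four-group: pairs of bits under coordinatewise xor.
klein = [(0, 0), (0, 1), (1, 0), (1, 1)]
klein_op = lambda a, b: (a[0] ^ b[0], a[1] ^ b[1])

def isomorphic(A, op_a, B, op_b):
    """Brute force: try every bijection A -> B and test multiplicativity."""
    for image in permutations(B):
        f = dict(zip(A, image))
        if all(f[op_a(x, y)] == op_b(f[x], f[y]) for x in A for y in A):
            return True
    return False

print(isomorphic(z4, z4_op, rot, rot_op))      # True
print(isomorphic(z4, z4_op, klein, klein_op))  # False
```

Of course this brute force is hopeless beyond very small groups; the point is only that the “corresponding multiplication tables” definition is completely concrete.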

A further important step was taken when it was observed that functions that satisfy the multiplicativity property were interesting and important even if they weren’t bijections. To give just one of many examples, an extremely useful property of determinants is that $\det(AB)$ is always equal to $\det(A)\det(B)$. Phenomena such as this led people to study what were initially called *multiplicative functions* between groups. Later, they were renamed *homomorphisms*, to stress the similarity with isomorphisms.

It was noted almost immediately that if $\phi:G\to K$ is a multiplicative function, or homomorphism as we now call it, then the set of $g\in G$ such that $\phi(g)$ is the identity forms a subgroup of $G$. Moreover, many natural subgroups occur in that way. For example, the group of rotations of $\mathbb{R}^3$ is what you get if you take the group of all rigid motions of $\mathbb{R}^3$ that fix the origin and restrict to the ones that have determinant equal to 1. For that and many similar reasons, the notion of the *kernel* of a homomorphism was introduced.

By this time, people had got a bit of a taste for the purely abstract study of group theory: it excited them that they could think about a group such as $C_5$ without saying whether its elements were rotations of a pentagon or integers mod 5 or something else entirely. So people started studying groups *for their own sake*. And one of the first questions they asked was this: we know that kernels of homomorphisms are subgroups, but what about the converse? That is, is every subgroup the kernel of some homomorphism?

This problem turned out not to be very interesting, since there were easy counterexamples. For example, if you take the permutation group $S_3$ and take the subgroup that consists of the identity and the transposition $(1\ 2)$, then that is not the kernel of any homomorphism. The original proof of this fact went something like this. Suppose we know that $(1\ 2)$ goes to the identity. We can use that to deduce that, say, $(1\ 3)$ also goes to the identity. We do this by “using $(1\ 2)$ to do $(1\ 3)$” as follows. We first find a permutation that switches $2$ and $3$ — the obvious one being $(2\ 3)$. We then switch $2$ and $3$, perform the permutation $(1\ 2)$ and finally switch $2$ and $3$ back again. More formally, we calculate the permutation $(2\ 3)(1\ 2)(2\ 3)$, and we note that because we began and ended by swapping $2$ and $3$ round, this new permutation does to $3$ what the old one did to $2$ and vice versa. So the resulting permutation is $(1\ 3)$ instead of $(1\ 2)$ — and $(2\ 3)(1\ 2)(2\ 3)$ is the same as $(1\ 3)$.

What does that prove? Well, if we now think about what $\phi((2\ 3)(1\ 2)(2\ 3))$ must be, it is the result of multiplying $\phi((2\ 3))$ by $\phi((1\ 2))$ by $\phi((2\ 3))$. But since $\phi((1\ 2))$ is the identity and $(2\ 3)$ is its own inverse, we end up just doing $\phi((2\ 3))$ and inverting it again. This shows that $(1\ 3)$ also belongs to the kernel.
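
The conjugation computation itself can be checked mechanically. A small Python sketch (my own encoding of permutations as dicts, not anything from the post) verifying that $(2\ 3)(1\ 2)(2\ 3)=(1\ 3)$:

```python
# Permutations of {1, 2, 3} as dicts; compose(p, q) applies q first, then p.
def compose(p, q):
    return {x: p[q[x]] for x in q}

t12 = {1: 2, 2: 1, 3: 3}   # the transposition (1 2)
t23 = {1: 1, 2: 3, 3: 2}   # the transposition (2 3)
t13 = {1: 3, 2: 2, 3: 1}   # the transposition (1 3)

# "Use (1 2) to do (1 3)": swap 2 and 3, apply (1 2), then swap 2 and 3 back.
conjugated = compose(t23, compose(t12, t23))
print(conjugated == t13)  # True
```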

It was soon realized that this basic trick could be used to generate lots of counterexamples. All you had to do was find a group $G$, a subgroup $H$, an element $h\in H$ and an element $g\in G$ such that $ghg^{-1}$ was not in $H$. And that was easy to do.

Soon this observation was turned into a formal definition. A subgroup $H$ of a group $G$ was said to be *closed under conjugation* if $ghg^{-1}\in H$ whenever $g\in G$ and $h\in H$. And now the original question was modified in an obvious way. The argument used to generate counterexamples could be encapsulated in the following statement: the kernel of a homomorphism must always be closed under conjugation. (Proof: if $\phi(h)=e$, then

$\phi(ghg^{-1})=\phi(g)\phi(h)\phi(g)^{-1}=\phi(g)\phi(g)^{-1}=e$.)

So what about the converse? Is every subgroup that is closed under conjugation the kernel of some homomorphism?
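
The closed-under-conjugation condition is easy to test exhaustively on small groups. A Python sketch (the encodings and helper names are my own) confirming that $\{e,(1\ 2)\}$ fails the condition in $S_3$ while the subgroup of rotations $A_3$ satisfies it:

```python
from itertools import permutations as perms

# Elements of S3 as tuples p, where p[i] is the image of i (0-indexed).
S3 = list(perms(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
inverse = lambda p: tuple(sorted(range(3), key=lambda i: p[i]))

def closed_under_conjugation(H, G):
    """Check g h g^{-1} lies in H for every g in G and h in H."""
    return all(compose(g, compose(h, inverse(g))) in H for g in G for h in H)

e = (0, 1, 2)
t = (1, 0, 2)                            # a transposition, swapping the first two points
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # the rotations: a subgroup of index 2

print(closed_under_conjugation([e, t], S3))  # False
print(closed_under_conjugation(A3, S3))      # True
```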

This question is an example of a phenomenon that occurs frequently in mathematics. You find a very simple *necessary* condition for something to be the case. (In this example, being closed under conjugation is a necessary condition for a subgroup to be the kernel of a homomorphism.) You then attempt to show that that condition is sufficient. It’s always a pleasant surprise when you manage, since it is far from obvious in advance that the conditions you have identified are the only obstacles to getting what you want.

Here, for instance, it isn’t obvious at all that just because a subgroup satisfies a condition that it clearly has to satisfy, it must be the kernel of some homomorphism. Where’s that homomorphism going to come from? There isn’t another group around, let alone a map from $G$ to that group.

This problem was found quite hard at the time, though the technique used to solve it has since become standard. The original thought process that led to a solution went something like this.

*If you are trying to find something complicated but have no idea where to start, then pretend you’ve found what you are looking for and see what you can say about it.*

I think of this method as a huge generalization of what we do when we solve equations. If someone says, “Find a number that gives you 128 when you multiply it by itself and add 7,” you might (if you had only just started algebra) think as follows. “I’ll pretend I’ve got the number, and I’ll call it $x$. The property $x$ has is that $x^2+7=128$. That tells me that $x^2=121$, which in turn tells me that $x=11$ or $x=-11$. So let me see whether one of those works. Yes, it does! I can take $x=11$ and then I get $11\times 11+7=121+7=128$.”

Let’s try something similar here. I’ve got a group $G$ and a subgroup $H$ that’s closed under conjugation. Let’s pretend we have a homomorphism $\phi$ from $G$ to some group $K$ and that the kernel of $\phi$ is $H$. What can we say about $\phi$?

Well, one thing we know immediately is that $\phi(h)=e$ for every $h\in H$, since that was our initial assumption. What can we do with that information? One thing it tells us is that $\phi(gh)=\phi(g)$ whenever $g\in G$ and $h\in H$, and also that $\phi(hg)=\phi(g)$. So that tells us that $\phi(g_1)=\phi(g_2)$ whenever we can find $h\in H$ such that $g_2=g_1h$ or $g_2=hg_1$.

Let’s explore that a little more. Let $g$ be an element of $G$. Can we say precisely for which elements $g'$ it *must* be the case that $\phi(g')=\phi(g)$? We’ve shown that it must when $g'=gh$ or $g'=hg$ for some $h\in H$. To prove that, we used the fact that every $h\in H$ belongs to the kernel of $\phi$. But we also know the converse: that every element of the kernel of $\phi$ belongs to $H$. So if $\phi(g')=\phi(g)$, then $g^{-1}g'\in H$, since $\phi(g^{-1}g')=\phi(g)^{-1}\phi(g')=e$.

What we have just established, using only the information that $H$ is the kernel of $\phi$, is that $\phi(g')=\phi(g)$ if and only if $g^{-1}g'\in H$, which is the same as saying that $g'\in gH$. In other words, $\phi$ is constant on the left cosets of $H$ but takes different values on different left cosets.

But I could have argued slightly differently. I could have shown that $\phi(g')=\phi(g)$ implies that $g'g^{-1}\in H$, and therefore concluded that $\phi(g')=\phi(g)$ if and only if $g'g^{-1}\in H$, or $g'\in Hg$. This would have shown that $\phi$ is constant on the *right* cosets of $H$, and that it takes different values on different right cosets.

These two observations are incompatible with each other unless every left coset is a right coset and vice versa. So we’d better check that. What right coset of $H$ might be equal to $gH$? Well, it had better contain $g$, so $Hg$ is a pretty obvious choice. Does $gH=Hg$? The answer is yes if and only if $gHg^{-1}=H$. But $H$ is closed under conjugation, so we’re OK. [By the way, if you are anxious about my writing equations that involve not just elements of $G$ but subgroups of $G$ and then doing things like multiplying both sides on the right by $g^{-1}$, then you have good instincts. The reasoning is valid, but it is important to check that it is valid. I'll leave that as an exercise if you haven't done it already.]
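
For the small example above, the coincidence of left and right cosets can be checked by machine. A Python sketch (my own encodings) comparing $gH$ with $Hg$ for a subgroup of $S_3$ that is closed under conjugation and one that isn’t:

```python
from itertools import permutations as perms

# S3 as tuples p, where p[i] is the image of i; compose applies q first, then p.
S3 = list(perms(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))

left = lambda g, H: {compose(g, h) for h in H}    # the left coset gH
right = lambda g, H: {compose(h, g) for h in H}   # the right coset Hg

H_bad = {(0, 1, 2), (1, 0, 2)}                    # {e, (1 2)}: not closed under conjugation
H_good = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}        # A3: closed under conjugation

print(all(left(g, H_good) == right(g, H_good) for g in S3))  # True
print(all(left(g, H_bad) == right(g, H_bad) for g in S3))    # False
```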

Where have we got to now? We have shown that *if* $\phi$ is a homomorphism from $G$ to a group $K$, and *if* the kernel of $\phi$ is $H$, then $\phi$ must be constant on the cosets of $H$ — and we have also shown that I’m allowed to say “cosets” because the left and right cosets coincide. Also, $\phi$ must take different values on different cosets.

Is that all we can say? Very much not. If we have an ounce of mathematical curiosity, then sooner or later we will start to wonder whether we can say anything about how the values of $\phi$ on different cosets are related to each other. If we know that $\phi$ takes the value $k_1$ everywhere on the coset $g_1H$ and the value $k_2$ everywhere on the coset $g_2H$, can we deduce anything from that? Well, the main thing we know about $\phi$ is that it is a homomorphism, so let’s try to use that fact. If $u=g_1h_1\in g_1H$ and $v=g_2h_2\in g_2H$, then $\phi(u)=k_1$ and $\phi(v)=k_2$, so $\phi(uv)=k_1k_2$, by the multiplicativity property. By what we have just established, that tells us that $\phi$ will take the value $k_1k_2$ on the entire coset to which $uv$ belongs. But what is that coset? To answer that we would like to rewrite $uv=g_1h_1g_2h_2$ as a product that begins with $g_1g_2$ and ends with something in $H$. It would be nice if we could let $h_1$ and $g_2$ swap places.

Can we say that $h_1g_2=g_2h_1$? Unfortunately not. But let’s play around a little. We know that $H$ is closed under conjugation, so we might try to find a conjugation. And we can! Rearranging the equation we were hoping for gives us that $h_1=g_2^{-1}h_1g_2$. There is no reason to suppose that that is true, but we do at least know that the right hand side belongs to $H$. So we can at least write $g_2^{-1}h_1g_2=h_1'$ for some $h_1'\in H$. And rearranging that tells us that $h_1g_2=g_2h_1'$. So $uv=g_1h_1g_2h_2=g_1g_2h_1'h_2$. That tells us that the coset that contains $uv$ is $g_1g_2H$, which is as nice an answer as we could have hoped for.

What does it tell us? Let’s write $\phi(gH)$ to mean the constant value that $\phi$ takes everywhere on the coset $gH$. Then we have shown that if $\phi(g_1H)=k_1$ and $\phi(g_2H)=k_2$, then $\phi(g_1g_2H)=k_1k_2$.

OK, we’ve found all sorts of properties that $\phi$ must have, but are we any nearer to finding $\phi$? Yes we are, in that losing-at-chess sense that our options are getting more and more restricted, which makes it easier to work out what to do. Here’s a trick that reduces our options still further. One of the observations we have made is that $\phi$ is constant on every coset and takes different values on different cosets. That means that there is a bijection between the cosets of $H$ and the elements of the image of $\phi$.

How does that help when we don’t actually know what $\phi$ is, or what its image is? It actually helps a lot. We are free to *define* the image. Can we think of a set that’s in one-to-one correspondence with the set of all cosets of $H$? Yes of course we can: just take the set of cosets itself!

But hang on, you might say, isn’t that a bit dangerous? There are lots of sets that are in one-to-one correspondence with any given set, so what reason is there to think that the set of cosets itself is a good choice? Well, here are two reasons.

(i) We are given absolutely no data in the problem other than the group $G$ and the subgroup $H$, so it is highly likely that the homomorphism $\phi$ and the group $K$ that $\phi$ maps to will be built out of $G$ and $H$ in some way.

(ii) In a sense it doesn’t actually matter what set we define the group operation on, since if we define it on a set $A$ and $\theta:A\to B$ is a bijection, then we can use the group operation on $A$ to define essentially the same group operation on $B$ by $b_1b_2=\theta(\theta^{-1}(b_1)\theta^{-1}(b_2))$.
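
Point (ii) is easy to illustrate concretely. A Python sketch (the sets and the bijection $\theta$ are invented for illustration) that transports addition mod 3 along a bijection onto a set of colour names:

```python
# Transporting a group operation along a bijection theta : A -> B.
# A is Z3 under addition; B is a set of colour names.
A = [0, 1, 2]
op_a = lambda a1, a2: (a1 + a2) % 3

theta = {0: 'red', 1: 'green', 2: 'blue'}
theta_inv = {v: k for k, v in theta.items()}

# Define b1 * b2 = theta(theta^{-1}(b1) + theta^{-1}(b2)).
op_b = lambda b1, b2: theta[op_a(theta_inv[b1], theta_inv[b2])]

print(op_b('green', 'blue'))  # 'red', since 1 + 2 = 0 mod 3
```

The colours now form a group isomorphic to $\mathbb{Z}_3$, with 'red' as the identity, which is the whole point: the underlying set is interchangeable.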

So now we’ve managed to cut things down further. We want to define a binary operation on the set of all cosets of $H$ that will make it into a group, and we want the function $\phi$ that takes $g$ to the coset $gH$ to be a homomorphism.

Now let’s go back to what we have managed to establish about $\phi$. An important property was that $\phi(g_1H)\phi(g_2H)=\phi(g_1g_2H)$ (where $\phi(gH)$ meant the constant value that $\phi$ takes on the coset $gH$). But now we’ve decided that we’re going to define $\phi(g)$ to be the coset $gH$ itself. The only thing that we haven’t decided is what the binary operation on the set of all cosets should be. But what we established earlier about $\phi$ forces our hand completely. Since $\phi(g_1)\phi(g_2)$ must equal $\phi(g_1g_2)$ and since $\phi(g)=gH$, it follows that $(g_1H)(g_2H)$ must be $g_1g_2H$. We have arrived at the definition of the quotient group and the quotient map, and thereby solved the problem.

Usually when the quotient group is defined, one defines the binary operation on the set of cosets and then checks that it is well-defined. In the course of the above thoughts, we have basically already checked this.

In my fictitious world, there was one final stage in the early development of group theory, which was that all the thoughts that led to the definition of the quotient group were carefully suppressed. The solution to the problem of which subgroups are kernels was presented like this.

**Theorem.** *A subgroup $H$ of a group $G$ is the kernel of some homomorphism if and only if it is closed under conjugation.*

**Proof.** First we show that the condition is necessary. Let $\phi:G\to K$ be a homomorphism with kernel $H$. Then for every $g\in G$ and every $h\in H$ we have

$\phi(ghg^{-1})=\phi(g)\phi(h)\phi(g)^{-1}=\phi(g)\phi(g)^{-1}=e$,

which proves that $ghg^{-1}$ also belongs to the kernel of $\phi$, and thus also to $H$. That proves that $H$ is closed under conjugation.

Now suppose that $H$ is closed under conjugation. Define a group $K$ as follows. Its elements are the left cosets of $H$ (which, it can be shown, are also the right cosets of $H$). We define a binary operation on these cosets by taking $(g_1H)(g_2H)$ to equal $g_1g_2H$.

There are a few things we must check. First, we must make sure that the definition we have just given does not change if we choose different elements $g_1'$ and $g_2'$ of the cosets $g_1H$ and $g_2H$. A quick way of doing that is to note that a different way of defining $(g_1H)(g_2H)$ is as the set of all products $uv$ such that $u\in g_1H$ and $v\in g_2H$. That definition clearly depends on the cosets themselves and not on how they are described, but does it give us $g_1g_2H$? Well, it certainly contains $g_1g_2H$, since $g_1g_2h$ is the product of $g_1\in g_1H$ and $g_2h\in g_2H$. In the other direction, $g_1h_1g_2h_2=g_1g_2(g_2^{-1}h_1g_2)h_2\in g_1g_2H$, so it is also contained in $g_1g_2H$.

Now let us define a function $\phi:G\to K$ by $\phi(g)=gH$. Then $\phi(g_1)\phi(g_2)=(g_1H)(g_2H)$, which, by definition of the group operation on $K$, is equal to $g_1g_2H=\phi(g_1g_2)$. Therefore, $\phi$ is a homomorphism. For $\phi(g)$ to equal the identity element $H$ of $K$ we need $gH=H$, which happens if and only if $g\in H$, so the kernel of $\phi$ is $H$, as required.
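
The whole construction can be verified mechanically for a small example. A Python sketch (my own encodings, not from the post) that builds the quotient map for $H=A_3$ inside $S_3$, encodes each coset as a frozenset, and checks that the map is a homomorphism with kernel $H$:

```python
from itertools import permutations as perms

# S3 as tuples p, where p[i] is the image of i; compose applies q first, then p.
S3 = list(perms(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # A3, closed under conjugation

# The quotient map: send g to its coset gH, encoded as a frozenset.
phi = lambda g: frozenset(compose(g, h) for h in H)

# Multiplying cosets elementwise; for a subgroup closed under
# conjugation this set of products is again a single coset.
coset_product = lambda A, B: frozenset(compose(a, b) for a in A for b in B)

# phi is a homomorphism ...
print(all(phi(compose(g1, g2)) == coset_product(phi(g1), phi(g2))
          for g1 in S3 for g2 in S3))  # True

# ... and its kernel (elements mapping to the identity coset H) is exactly H.
identity_coset = phi((0, 1, 2))
print(sorted(g for g in S3 if phi(g) == identity_coset) == sorted(H))  # True
```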

Unfortunately, each year a few of the brighter mathematics students found this argument reasonably easy to digest. So the decision was taken to suppress all mention of why the group $K$ was constructed. Instead, it became common practice to define the *quotient group* $G/H$ for no apparent reason and to point out only later that the function $\phi(g)=gH$, known as the *quotient map*, was a homomorphism with kernel $H$. Now, at last, the goal of making the concept difficult for everybody had been triumphantly achieved.

A further development was the realization that the method that had been arrived at was very general indeed. After this proof, different notions of “quotient” kept appearing all over mathematics, and in an effort to find a unified description of them, the notion of an equivalence relation was formulated. But that is another story (perhaps to be presented in another post).

As an afterthought, here’s a different way of presenting the proof, which uses equivalence relations instead of partitioning into cosets. I’ll switch to using the phrase “normal subgroup” rather than talking about subgroups being closed under conjugation.

**Theorem.** *Let $H$ be a normal subgroup of a group $G$. Then there is a group $K$ and a homomorphism $\phi:G\to K$ such that $H$ is the kernel of $\phi$.*

**Proof.** Define two elements $g_1$ and $g_2$ of $G$ to be $H$-*equivalent*, and write $g_1\sim g_2$, if there exists $h\in H$ such that $g_2=g_1h$. It is easy to check that this is an equivalence relation.

Now we shall prove that if $g_1\sim g_1'$ and $g_2\sim g_2'$, then $g_1g_2\sim g_1'g_2'$. Let $h_1$ and $h_2$ be such that $g_1'=g_1h_1$ and $g_2'=g_2h_2$. Then

$g_1'g_2'=g_1h_1g_2h_2=g_1g_2(g_2^{-1}h_1g_2)h_2$,

where $(g_2^{-1}h_1g_2)h_2$ belongs to $H$, since $H$ is a normal subgroup.

It follows that we can define a group operation on the equivalence classes of $\sim$: if $A$ and $B$ are two equivalence classes, then all products $uv$ with $u\in A$ and $v\in B$ are equivalent, and we let that equivalence class be the product of $A$ and $B$.

The group axioms for this operation basically follow if we replace “$=$” by “$\sim$” in the usual group axioms. For instance, given any three elements $g_1,g_2,g_3$ of the group, we know that $(g_1g_2)g_3=g_1(g_2g_3)$. It follows that $(g_1g_2)g_3\sim g_1(g_2g_3)$, and from that and the definition of the product it follows that $([g_1][g_2])[g_3]=[g_1]([g_2][g_3])$ (where $[g]$ stands for the equivalence class of $g$). And a similar argument shows that the map that takes $g$ to $[g]$ for each $g\in G$ is a homomorphism. The kernel of this map is $H$.
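
The congruence property at the heart of this proof can likewise be checked exhaustively for a small example. A Python sketch (my own encodings), with $H=A_3$ inside $S_3$:

```python
from itertools import permutations as perms

# S3 as tuples p, where p[i] is the image of i; compose applies q first, then p.
S3 = list(perms(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))

H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]                    # A3, a normal subgroup of S3
equiv = lambda a, b: any(b == compose(a, h) for h in H)  # b = a h for some h in H

# If g1 ~ g1' and g2 ~ g2', then g1 g2 ~ g1' g2'.
congruence = all(
    equiv(compose(g1, g2), compose(g1p, g2p))
    for g1 in S3 for g1p in S3 if equiv(g1, g1p)
    for g2 in S3 for g2p in S3 if equiv(g2, g2p)
)
print(congruence)  # True
```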

November 20, 2011 at 5:02 pm |

I know it’s easy, but perhaps there should also be a comment about associativity etc for the multiplication on the quotient group in the penultimate version, where it looks like it might be missing currently? (It is there in the equivalence class version.)

Of course, everything actually follows from the fact that $\phi$ is an isomorphism, if you allow yourself to define an isomorphism between things with just any old binary operation, rather than insisting on groups. Because then anything isomorphic to a group is a group! But this might not be desirable.

November 20, 2011 at 5:12 pm

Oops! I suppose that I mean that $\phi$ is a surjective homomorphism, and a surjective homomorphic image of a group has to be a group?

November 21, 2011 at 4:36 am |

Wonderful presentation. It is added to my list of blog posts that are better than the standard textbook presentations on a topic. Especially relevant because I have been bouncing the “normal subgroup” around in my head recently, trying to develop an intuition for it and decide if I should try to delve further past the surface of group theory. Thank you! “A normal subgroup is the kernel of some homomorphism, representing a ‘collapsible’ ‘basis dimension’ of the group’s structure. {e} is the degenerate one. Any subset that is closed under conjugation works.” Yay!

I am not sure about “fictitious history” though… I think that is confusing and can be misleading (since it is fictitious). Why not just say “math took centuries to develop the hard way, but you can learn a lot of it in one lifetime if you follow this presentation that cleans it up with the benefit of hindsight?”

November 21, 2011 at 9:37 am |

Lang introduces normal subgroups in the same way as you, stressing their role as kernels of homomorphisms. You can see it in Lang’s book “Algebra”, GTM 211.

November 21, 2011 at 8:21 pm |

A way that I find good to think of the concept of normal subgroup is to think of the action of the group on itself. The action of the group on the quotient group is a simplification of the action of the group on itself.

This can be visualized by thinking of the group as a cloud of points. Any one of those points represents a permutation of the set of points, so we can imagine each point flying to where another point was. Imagine the cosets of the normal subgroup as circles round the points in the coset. Then the quotient group is the group containing the permutations of these circles that we get from the permutations of the points.

Bearing in mind that we don’t care about the internal structure of the elements of the quotient group, we could collapse the points in each circle into a single point, giving a cloud of points like the one we started with. The set of cosets is a construction a bit like the way you can define numbers as sets, and we will never talk about whether $x\in C$ for an element $C$ of a quotient group.

I don’t know how easy it would have been to come up with this picture if you had no conception of quotient groups.

April 23, 2014 at 9:39 pm

I have a hard time understanding what “the action of a group” means. I could not understand normal subgroups until after I’d come up with this narrative on my own and realized normal subgroups were exactly the kernels of homomorphisms.

November 22, 2011 at 10:01 am |

Great article!

I found an interesting presentation in Penrose, Road to Reality, Chapter 13. He uses a square over the complex plane, with vertices at 1, i, -1, -i. He writes (C is the operation of complex conjugation):

We can exhibit a non-normal subgroup of the group of symmetries of the square, as the subgroup of two elements {1, C}. It is non-normal because {1, C}i = {i, Ci} whereas i{1, C} = {i, iC}. Note that this subgroup arises as the new (reduced) symmetry group if we mark our square with a horizontal arrow pointing off to the right

That is, he “breaks symmetry” (drawing an arrow from 0 to 1) to obtain a non-normal subgroup. He writes:

In the case of O(3), there happens to be only one non-trivial normal subgroup,[13.7] namely SO(3), but there are many non-normal subgroups. Non-normal examples are obtained if we select some appropriate finite set of points on the sphere, and ask for the symmetries of the sphere with these points marked. If we mark just a single point, then the subgroup consists of rotations of the sphere about the axis joining the origin to this point (Fig. 13.3c). Alternatively, we could, for example, mark points that are the vertices of a regular polyhedron. Then the subgroup is finite, and consists of the symmetry group of that particular polyhedron

Hmmm… does every non-normal subgroup arise from such a “mark the points/draw an arrow” operation?

December 24, 2011 at 4:17 am |

I am reading Mac Lane and Birkhoff’s Algebra textbook right now, and they develop normal subgroups in almost exactly the same way as your fictional history.

Section 2.9 (3rd edition): “We now discuss a necessary condition on a subgroup S in order that S be the kernel of some morphism. We shall show that the necessary condition is also sufficient. ” The next sentence defines “normal”. (The preceding section introduced cosets. The succeeding section introduces quotient groups and coset products.)

That book was pretty much the first textbook of modern algebra, so indeed we have lost something along the way over the years.

February 24, 2012 at 2:15 am |

This is brilliantly written, seriously well done.

I guess that understanding the underlying history of theory does help you grasp concepts better.

March 21, 2012 at 9:36 am |

Thank you, Mr. Gowers!

You made my group theory class much more interesting.

I am often annoyed by obscure and counter-intuitive ways of presenting concepts and ideas (not only in group theory, pretty much in every subject I’ve taken there is at least one case of this).

Mathematics seems like an arcane art that only a gifted few who “get it” can grasp, when presented in the “usual” way.

Your presentation is really inviting and motivating. Thank you again!

May 22, 2012 at 9:20 pm |

An excellent post, spot on psychologically, and actually made me laugh out loud. I specifically remember as an undergraduate fresh out of school being very bemused hearing the lecturer (Graeme Segal, no less) present all the stuff about equivalence classes and quotient groups as if it obviously made sense and feeling like everybody else must have read some book that I hadn’t. Thank you for finally clearing that up!

Andrew White


June 21, 2012 at 8:31 pm |

hey guys, I am looking for a way to prove that the quotient of a group by one of its normal subgroups is isomorphic to a subgroup of the parent group. Any clue? Actually, I need to prove that the PSL group is isomorphic to a subgroup of the general linear groups.

June 21, 2012 at 8:53 pm

In general it is not true that a quotient group of a group G is isomorphic to a subgroup of G. For your second question, are you asking whether PSL(n) is isomorphic to a subgroup of GL(n)? Or do you just want GL(m) for some m? By the way, I strongly recommend the website Mathematics Stack Exchange for questions like this: you will probably get several good answers very quickly.

August 31, 2012 at 2:11 am |

Can I say that the main idea in this post can actually be generalized to almost all the mathematics courses in college? For those mathematical concepts/objects which are “strange” or “non-natural”, one should finally come up with such a “fictitious history” in order to understand what on earth those concepts are talking about.

Is there an analogy between normal subgroups and ideals in the sense of the “fictitious history” mentioned in this post?

October 17, 2012 at 12:08 pm |

Is it logically sound (with respect to the usual definition(s) of normal) to simply define a normal subgroup as one that is the kernel of some (any) homomorphism?

The kernel of any homomorphism is normal, and as established in the post, any normal subgroup is the kernel of a homomorphism. Then I believe the answer is yes?

April 6, 2013 at 8:45 pm |

An interesting non-fiction tidbit that could relate to this alternative history: “Fabian Stedman: The First Group Theorist?”

Author(s): Arthur T. White

Source: The American Mathematical Monthly, Vol. 103, No. 9 (Nov., 1996), pp. 771-778 Published by: Mathematical Association of America


April 23, 2014 at 11:40 pm |

gowers,

I absolutely agree. I wasn’t able to grasp normal subgroups until I had managed to construct for myself the natural pursuit you describe. Early on I thought “normal” was part of the conditions required to be a subgroup (all subgroups are normal). I quickly moved past that, but then for years I thought normal subgroups were subgroups of the center. (The first error was too inclusive; the second excluded some (“most”) normal subgroups.) Lately, pondering subnormal series and composition series, it seemed to me that a quotient group of G must be isomorphic to a normal subgroup of G, but I couldn’t prove it. A Google search found your comment indicating that is false. Thank you, the statement of that fact doesn’t appear to be very common.