How to use Zorn’s lemma

I am continuing my series of sample articles for the Tricks Wiki with one that is intended to represent a general class of such articles. It is common practice in lecture courses (at least if the ones I attended as an undergraduate are anything to go by) to state useful theorems, lemmas, propositions, etc., without going to much trouble to explain why they are useful. Of course, there are many ways to pick up this further understanding: taking note of where and how such results are used, doing carefully designed exercises, and so on. Nevertheless, it is often the case that more could be done to help people recognise the signs that indicate that a particular result can be applied.

This article, which is principally aimed at undergraduates early on in a mathematics degree, is inspired by an experience I myself had as an undergraduate. I had a sheet of challenging problems (set by Béla Bollobás) and one of them completely stumped me. (I’m sure several of them stumped me but this is the one that sticks in my mind.) I can’t remember exactly what the question was, but it was something of similar difficulty to that of determining whether an additive function from $\mathbb{R}$ to $\mathbb{R}$ was necessarily linear. My supervision partner solved the problem using Zorn’s lemma, which we had been told about in a lecture, and I just sat there in disbelief because it hadn’t even remotely occurred to me that Zorn’s lemma might be useful. At some point in the intervening years, I “got” Zorn’s lemma and now find it straightforward to see where it is needed. This article is intended to speed up that process for other people.

One might argue that it is better to work out how to apply theorems for oneself. I’m in two minds about that. Probably for any individual theorem, if one works out for oneself how to recognise situations where it can be applied then that is better than being told by somebody else. But working it out can take quite a lot of time and effort, so if one has help then one may be able to learn how to apply more theorems, and one may be able to learn about theorems in branches of mathematics other than one’s own. On balance, I think that a series of articles on how to use various important mathematical principles would be a useful resource, one that I would love to have to help me understand better the many areas of mathematics where I feel shaky. So here’s a first attempt at Zorn’s lemma. So far I have only two examples of applications: at some stage it would be good to have several more.

Title: How to use Zorn’s lemma.

Quick description: If you are building a mathematical object in stages and find that (i) you have not finished even after infinitely many stages, and (ii) there seems to be nothing to stop you continuing to build, then Zorn’s lemma may well be able to help you.

Prerequisites: Basic concepts of undergraduate mathematics, such as vector spaces.

Example 1: A function $f$ from $\mathbb{R}$ to $\mathbb{R}$ is called additive if $f(x+y)=f(x)+f(y)$ for every $x,y\in\mathbb{R}$ . Clearly any function of the form $f(x)=\lambda x$ is additive. Are there any other additive functions?

An easy induction shows that if $f(1)=\lambda$ then $f(n)=\lambda n$ for every positive integer $n$ . (This is because, $f(n+1)=f(n)+f(1)$ .) It is also easy to prove that $f(0)=0$ (since e.g. $f(0)=f(0)+f(0)$ ). And from this it follows that $f(n)+f(-n)=0$ , so $f(-n)=-\lambda n$ , for every positive integer $n$ . Another easy induction shows that $f(mx)=mf(x)$ for every real number $x$ and every positive integer $m$ . Therefore, $mf(x/m)=f(x)$ , from which it follows that $f(x/m)=f(x)/m)$ for every real number $x$ and every positive integer $m$ . From these observations it follows that if $f(1)=\lambda$ then $f(x)=\lambda x$ for every rational number $x$ .

At this point it seems to be hard to deduce anything about other values of $f(x)$ . Indeed, there doesn’t seem to be much obstacle to defining $f(\sqrt 2)$ to be anything we like. If we set $f(\sqrt 2)$ to be $\mu$ , then an argument similar to the argument of the previous paragraph shows that $f(x\sqrt 2)=\mu x$ for every rational number $x$ , but no number of the form $x\sqrt 2$ is rational except when $x=0$ , so this is not going to conflict with the choices we have already made. We will of course be forced to set $f(x+y\sqrt 2)$ to be $\lambda x+\mu y$ , but there is no problem in doing that.

Let us be slightly more explicit about why this isn’t a problem: it is because if $x+y\sqrt 2=x'+y'\sqrt 2$ then $x=x'$ and $y=y'$ (since otherwise we would find that $\sqrt 2=(x'-x)/(y-y')$ , which is rational). This will enable us to see more clearly what is going on later.

The discussion so far strongly suggests that there should be a function that is additive but not of the form $f(x)=\lambda x$ . We haven’t yet defined one, since by no means every real number is of the form $x+y\sqrt 2$ with $x$ and $y$ rational. But we have produced a partially defined additive non-linear function, and the method we have used is rather flexible. Indeed, if we pick another number, such as $\pi$ , say, where $f$ is not yet defined, then we can extend the definition to all numbers of the form $x+y\sqrt 2+z\pi$ with $x,y,z\in\mathbb{Q}$ by setting $f(x+y\sqrt 2+z\pi)=\lambda x+\mu y+\nu z$ for some arbitrarily chosen $\nu$ .

More generally, we could construct a sequence of numbers $t_1,t_2,\dots$ with $t_1=1$ , with the property that no $t_n$ is a rational linear combination of $t_1,t_2,\dots,t_{n-1}$ . And then we could define $f(t_i)=\lambda_i$ , for arbitrarily chosen $\lambda_i$ , which would tell us that $f(t_1x_1+\dots+t_nx_n)=\lambda_1x_1+\dots+\lambda_nx_n$ for every sequence $x_1,\dots,x_n$ of rational numbers.

The trouble is, even when we have built an infinite sequence in this way, we still haven’t defined $f$ for all real numbers, since the set of rational linear combinations of those numbers is countable. However, we can still continue to build our function, since we can pick a new real number $s_1$ that is not a rational linear combination of the $t_i$ and choose a value for $f(s_1)$ . And then we can choose $s_2$ that is not a rational linear combination of the $t_i$ and $s_1$ , and so on. But again we find that even if we produce an infinite sequence of $s_i$ we have still defined $f$ for only countably many real numbers.

A good way of regarding what we are doing is this: we are considering the real numbers as a vector space over the rationals, and we are trying to build a basis for this vector space, where this means a collection $B$ of real numbers such that every real number is a rational linear combination of numbers in $B$ in precisely one way. Then if we define the values of $f$ however we like for the numbers in $B$ and define the values of $f$ in the obvious way for rational linear combinations of those numbers, we have a function from $\mathbb{R}$ to $\mathbb{R}$ that is linear over the rationals, and hence additive, but not necessarily of the form $f(x)=\lambda x$ .

Now $\mathbb{R}$ , considered as a vector space over $\mathbb{Q}$ , is certainly infinite-dimensional. In fact, it has uncountable dimension. So does it have a basis? (All we mean by “has uncountable dimension” is “cannot be spanned by countably many vectors,” so it is not true by definition that it has a basis.) Inspired by finite-dimensional vector spaces it is tempting to say, “Pick a maximal linearly independent set,” since such a set is not just linearly independent but also spans the whole space, since if it didn’t we could just pick an element that did not belong to its linear span and we could add it to the linearly independent set, contradicting maximality.

So now we seem to be done: we are looking for a basis of $\mathbb{R}$ over $\mathbb{Q}$ , and all we need to do to find a basis of any vector space is take a maximal linearly independent set.

But why should a maximal linearly independent set exist? Isn’t that exactly the difficulty we were facing earlier: we could carry on picking more and more rationally independent real numbers but we never seemed to reach the point where we could no longer continue.

Let us now interrupt this example for a more general discussion of Zorn’s lemma and how to use it.

General discussion: We are now in a very typical situation where Zorn’s lemma can be applied. We would like to build a maximal object, and we feel as though we ought to be able to, because any non-maximal object can easily be extended. The usual statement of Zorn’s lemma is as follows. A partially ordered set is a set $X$ together with an ordering $\leq$ of the elements of $X$ that is transitive and antisymmetric (this means that if $x\leq y$ and $y\leq x$ then $x=y$ ). A typical example is where $X$ is a collection of sets and $x\leq y$ if and only if $x\subset y$ . (Here I use the symbol “ $\subset$ ” to mean “is a subset of” and not “is a proper subset of”.) A chain in a partially ordered set $X$ is a totally ordered subset of $X$ : that is, a subset $Y$ such that if $y,z\in Y$ then either $y\leq z$ or $z\leq y$ . An upper bound for a subset $Y$ of a partially ordered set $X$ is an element $u$ such that $y\leq u$ for every $y\in Y$ . And a maximal element in a partially ordered set $X$ is an element $x_0$ such that the only element $x\in X$ with $x_0\leq x$ is $x_0$ itself. Zorn’s lemma states that if $X$ is a partially ordered set such that every chain in $X$ has an upper bound, then $X$ has a maximal element. (Note that a maximal element does not have to be bigger than everything else: it just mustn’t be smaller than anything else.)

In order to see how this rather abstract-looking statement relates to the kind of problem we had earlier, let us imagine that we have a partially ordered set $X$ and are looking for a maximal element. We could try to build one as follows. We start with an element $x_1$ . If it isn’t maximal then we find a bigger element $x_2$ . If that isn’t maximal we find a bigger element $x_3$ , and so on. That gives us an increasing sequence $x_1,x_2,\dots$ . Now we seem to be stuck, and indeed sometimes we are stuck. For example, if $X$ is the set of natural numbers with their usual ordering then we might have created the sequence $1,2,3,\dots$ , which would not help us to find a maximal element — not surprisingly, since $\mathbb{N}$ doesn’t have a maximal element.

However, $\mathbb{N}$ does not satisfy the hypothesis of Zorn’s lemma, because the sequence $1,2,3,\dots$ is a chain with no upper bound. If $X$ does satisfy this hypothesis then the sequence $x_1,x_2,x_3,\dots$ , which is also a chain, has an upper bound, which we could call $x_\omega$ . If this is not maximal, then we can find a larger element $x_{\omega+1}$ . If that is not maximal, then we can find a yet larger element $x_{\omega+2}$ , and so on.

Example 1 continued: Note the similarity between the position we are now in and the position we were in when we were trying to create a basis for $\mathbb{R}$ over $\mathbb{Q}$ . Once again, we can continue to create larger and larger objects, but there seems to be no easy way of saying that the process eventually ends. Or rather, there is an easy way: Zorn’s lemma tells us that it ends.

Let us see how Zorn’s lemma applies in our example. The objects we were looking at were subsets of $\mathbb{R}$ that were linearly independent over $\mathbb{Q}$ . We noted that a maximal linearly independent subset of $\mathbb{R}$ spanned the whole of $\mathbb{R}$ , where by “maximal” we meant that the set was not contained in any larger linearly independent set. Thus, we were looking at the set of all linearly independent subsets of $\mathbb{R}$ , with the partial order $\subset$ .

All we have to do if we want to apply Zorn’s lemma is check that every chain has an upper bound. So let us imagine that we have a collection $Y$ of linearly independent subsets of $\mathbb{R}$ and that for any two of those sets one is contained in the other. What could serve as an upper bound? By definition it has to be a set that contains all the sets in $Y$ , so it has to contain their union. We want it to be linearly independent, so the smaller it is, the better. So there is basically only one candidate to try: the union itself. Is the union linearly independent? Well, if $t_1,\dots,t_n$ belong to the union, then each $t_i$ belongs to some linearly independent set $L_i\in Y$ . Because $Y$ is a chain, one of these sets $L_i$ contains all the others. If that is $L_j$ , then the linear independence of $L_j$ implies that no non-trivial linear combination of $t_1,\dots,t_n$ can be zero, which proves that the union of the sets in $Y$ is linearly independent, just as we wanted. Therefore, by Zorn’s lemma, there is a maximal linearly independent set. Earlier we observed that such a set was a basis and could be used to create additive functions not of the form $f(x)=\lambda x$ , so our problem is now solved.

Further general remarks: How, one might ask, is Zorn’s lemma itself proved? One answer is that it cannot be proved: it is just an axiom. But a slightly more informative answer is that it is equivalent to the axiom of choice and the well-ordering principle. A hint of why this should be can be found in the attempted proof above. There we created an infinite sequence $x_1,x_2,x_3,\dots$ , which we then continued “transfinitely” with the elements $x_\omega,x_{\omega+1},x_{\omega+2},\dots$ . This transfinite process can continue until a maximal element is reached, but to do so one needs to make infinitely many choices (since there isn’t a way of defining the next element of the sequence). Thus, the axiom of choice comes into play. If we knew in advance that $X$ could be well-ordered (that is, given a total ordering such that every non-empty subset had a minimal element), then we could build up the sequence by always taking the minimal element that worked. And it’s an easy exercise to use Zorn’s lemma to prove that every set has a well-ordering.

Example 2: Let us justify this last statement, since it gives another representative application of Zorn’s lemma. Suppose, then, that we have a set $X$ and we would like to give it a well-ordering. That is, we would like to define a total ordering on the elements of $X$ such that every non-empty subset $Y$ of $X$ has a minimal element.

Once again, we find ourselves in a situation where there are no obvious constraints to building the object we require, other than the “transfinite length of time” needed to complete the process. We just pick an arbitrary element $x_1$ to be the minimal element of $X$ itself, then an arbitrary element $x_2$ to be the minimal element of $X\setminus\{x_1\}$ , and so on. In other words, at each stage we would like to choose an element not yet chosen and declare it to be the next element in our ordering.

To convert that rough idea into a Zorn’s-lemma argument, we need to define a partial ordering on the set of “incomplete attempts” at defining a well-ordering on $X$ . An incomplete attempt means a subset $Y$ of $X$ and a well-ordering of that subset. Let us define an attempt to be precisely that: a subset $Y$ of $X$ with a well-ordering of the elements of $Y$ . The partial ordering on the set of all attempts should reflect the idea of one attempt extending another, so the obvious partial ordering to take is as follows: given two attempts $Y$ and $Z$ , we say that $Y\leq Z$ if $Y$ is an initial segment of $Z$ . This means that $Y$ is a subset of $Z$ , that the ordering associated with $Y$ is the same as the ordering associated with $Z$ when you restrict it to $Y$ , and that every element of $Y$ is less than every element of $Z\setminus Y$ in the ordering on $Z$ .

Does this partially ordered set satisfy the chain condition? Well, if we have a chain of attempts, then we can define an upper bound to be the union $U$ of the sets in that chain, with the following ordering: $u\leq u'$ if there is some attempt $Y$ in the chain such that $u$ and $u'$ both belong to $Y$ and $u\leq u'$ in $Y$ . This ordering is well-defined, because if $Z$ is another attempt that contains $u$ and $u'$ , then either $Y\leq Z$ or $Z\leq Y$ (since $Y$ and $Z$ both belong to the chain), and the definition of $\leq$ guarantees that the orderings on $Y$ and $Z$ are consistent.

Is the ordering on $U$ a well-ordering? Yes, since if $V$ is a non-empty subset of $U$ , then there must be an attempt $Y$ in the chain such that $V\cap Y$ is non-empty. But then $V\cap Y$ has a minimal element $v$ in $Y$ . This must be minimal in $U$ as well. To see why, suppose that $v'\in V\setminus Y$ and $v'< v$ . Then there must exist an attempt $W$ in the chain with $v'\in W$ . Since $v'\notin Y$ it follows that $Y$ does not contain $W$ , so we must have $Y\leq W$ . Therefore, $Y$ must be an initial segment of $W$ and we have $v'< v$ with $v'\in W\setminus Y$ and $v\in Y$ . This is a contradiction.

We have now shown that $U$ is well-ordered, and thereby verified the chain condition. (Note that verifying the chain condition was fairly straightforward: this is true of many applications of Zorn’s lemma.) Therefore, by Zorn’s lemma, there is a maximal attempt, which we hope will be a well-ordering of the whole of $X$ .

If it is not the whole of $X$ , then it is a well-ordering of some proper subset $Y$ of $X$ . But we can easily extend this attempt: let $z$ be any element of $X\setminus Y$ and define $Z$ to be $Y\cup\{z\}$ , ordering $Z$ by taking the ordering we already have on $Y$ and stipulating in addition that $y\leq z$ for every $y\in Y$ . (Of course, we also stipulate that $z\leq z$ .) This produces a larger attempt, contradicting the maximality of $Y$ . This contradiction completes the proof that (Zorn’s lemma implies that) every set can be well-ordered.

Further general remarks: Note that the final step of the last argument, the proof that every maximal attempt must be a well-ordering of the whole of $X$ , corresponds to the informal observation made earlier that we can keep on and on extending a well-ordering, while the verification of the chain condition corresponds to the fact that if we produce an infinite sequence of attempts then we can take their union and carry on. Again, this is typical of Zorn’s-lemma arguments.

So how, in general, does one recognise the need for Zorn’s lemma and how does one construct an appropriate partially ordered set in order to apply it? The clues are in the two examples above. Typically, one is trying to build a structure of some kind (such as a basis for a vector space, or a well-ordering of a set). The natural way to do it appears to be to build the structure up in stages, but there are too many stages for this to work straightforwardly. However, once one has an idea of what a stage is and what the building-up process is, one can wheel out Zorn’s lemma to finish the job. The partially ordered set will consist of all objects that might conceivably be stages in the construction, and one of these objects will be smaller than another if it might conceivably come before the other in the building-up process. If the resulting partial order satisfies the chain condition and if a maximal element must be a structure of the kind one is trying to build, then the proof is complete.

This entry was posted on August 12, 2008 at 1:25 pm and is filed under Mathematical pedagogy, Mathematics on the internet. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

43 Responses to “How to use Zorn’s lemma”

carlo Says:
August 12, 2008 at 2:22 pm | Reply
Outstanding article: this is exactly the sort of things
one needs to learn Mathematics.
Thank you so much!

P.S. I look forward to see the tricks wiki realized!
drini Says:
August 12, 2008 at 3:53 pm | Reply
Is the ordering on U a well-ordering? Yes, since if V is a non-empty …

What is U ? W?

Thanks: it was indeed $W$ and I’ve now corrected my slip (by changing what I originally called $W$ to $U$ rather than the other way round).
Additive Functions « The Unapologetic Mathematician Says:
August 12, 2008 at 7:58 pm | Reply
[…] busy with a few things today, like assembling my furniture. Luckily, Tim Gowers has a post on “How to use Zorn’s lemma”. His example is the construction of additive (not linear) functions from to […]
anon. Says:
August 12, 2008 at 8:48 pm | Reply
Let me echo carlo: this is a terrific article! Thanks so much for writing this up. Really makes me wish the web had been around when I was in grad school…
alex Says:
August 13, 2008 at 12:04 am | Reply
I really like the exposition style here. There is no “magic,” no arguments where the writer simply exhibits an object which satisfies all the properties one wants, leaving the reader no clue about how to come up with such objects.

I wish all math books were written this way!
Michael Nielsen Says:
August 13, 2008 at 12:04 am | Reply
Has the tricks wiki actually been set up? It’ll be great to see.
Terence Tao Says:
August 13, 2008 at 3:31 am | Reply
I like to think of Zorn’s lemma as a guarantee that greedy algorithms can go on as long as necessary, before being stopped by some sort of obstruction (either a chain without an upper bound, or a maximal element). So any finitary construction that is built via the greedy algorithm (e.g. selecting a basis for a finite-dimensional vector space greedily) has a decent chance of extending to the infinitary setting as well by a Zorn’s lemma argument. Which is of course essentially what you are saying above 🙂
Mark Says:
August 13, 2008 at 5:16 am | Reply
To me, the problem

Does there exist an additive function on $\mathbb{R}$ that isn’t of the form $f(x)=\lambda x$ ?

is more intuitive if thought about in greater generality. Considering $\mathbb{R}$ as a Banach space, then the problem is easily seen to be equivalent to:

Does there exist an unbounded linear operator on $\mathbb{R}$ ?

By essentially the argument presented in the post one can construct an unbounded linear operator on any “sufficiently non-trivial” Banach space. One thing that has always struck me as a bit curious is that while it appears to be necessary to use the axiom of choice to construct an unbounded linear operator on $\mathbb{R}$ , on most “interesting” Banach spaces (from a functional analysis view) there exists very natural unbounded linear operators, which do not require the axiom of choice to construct.
gowers Says:
August 13, 2008 at 8:34 am | Reply
Mark, You can’t quite mean what you write here, since $\mathbb{R}$ considered as a Banach space is one-dimensional. It can’t be a Banach space over $\mathbb{Q}$ either, since $\mathbb{Q}$ isn’t complete, and even to make it a normed space over $\mathbb{Q}$ would seem to require the axiom of choice. So I don’t really know what an unbounded linear operator on $\mathbb{R}$ is. I also take issue with your assertion that on most interesting Banach spaces there are obvious unbounded linear operators. For instance, take the most interesting Banach space of all: a separable Hilbert space, which for definiteness we could take to be the sequence space $\ell_2$ . If $(e_n)$ is the standard basis, we could define $f(e_n)=n$ , say, and extend linearly, but this function is defined just on finite linear combinations of the $e_n$ . To turn that into a completely defined unbounded linear map we need a basis (in the vector space sense) of $\ell_2$ , and I’m pretty sure that needs the axiom of choice. So as far as I can see the only way to produce explicit examples of unbounded linear operators is to define them on artificial normed (but not Banach) spaces such as the space of sequences that are zero except in finitely many places (with any norm you like).
Chris Eagle Says:
August 13, 2008 at 12:01 pm | Reply
I think I remember Imre Leader telling me that it’s consistent with ZF that every linear functional on every Banach space is bounded.
gowers Says:
August 13, 2008 at 1:56 pm | Reply
Terry, I sort of agree with you but I’m not sure you mean what I would mean by “greedy”. I think of a greedy algorithm as one where you make the locally best choice at each stage, without paying attention to possible long-term consequences. So Zorn’s lemma would be needed only if the locally best choice is not unique (as happens, for example, when one is proving the Hahn-Banach theorem). I suppose one could regard building a basis for a vector space as greedy in the sense that every vector not in the linear span of the vectors chosen so far is equally good, so they are all “locally best”. But I think it would be more natural to substitute something like “non-determined” for “greedy” in what you say. But the idea of saying that Zorn tells you that you can keep going until you reach one of two kinds of obstacle is a nice perspective on it.
Emmanuel Kowalski Says:
August 13, 2008 at 3:08 pm | Reply
If I remember right something that a colleague analyst once explained to me, any measurable linear functional on a Hilbert space (at least; I don’t remember if this works for arbitrary linear operators or Banach spaces) is automatically continuous. Here, measurable refers to the Borel sigma-algebra.

So intuitively, if you can “write down” such a functional, either it is not defined everywhere, and then it may be unbounded (e.g., trying to define f –> f'(0) on L^2([0,1])), or it is in fact continuous.
Mark Meckes Says:
August 13, 2008 at 4:00 pm | Reply
Similarly to Emmanuel’s comment, it is stated in Dugundji’s Topology that every nonlinear additive function on the real numbers is not Lebesgue measurable.
alphonse Says:
August 13, 2008 at 4:10 pm | Reply
Dear Sir, this is a very well-written article thank you. But is this really a trick given that, as you point out, Zorn’s lemma is the axiom of choice in disguise? Might it not be called ‘a fact in ZFC’ rather?

I mean, I thought (perhaps wrongly) that the axiom of choice is used precisely so that such existence results are easily obtained or in fact possible at all (i.e. it’s not a surprise, it’s a consciously built-in feature of our theory).

That said I’m as everyone else eagerly awaiting the day the Tricks Wiki will be launched!
gowers Says:
August 13, 2008 at 5:22 pm | Reply
Alphonse, you raise an important point. I’m slightly worried about the word “trick” because what the site will really be is a repository for methods and techniques, which won’t have to be tricks in the usual clever-clever sense of the word. Basically, any topic will be suitable for the site, provided its main aim is to explain to the reader how to prove certain kinds of theorems. But this means that one thing the articles won’t do is merely state facts. For example, the point of this one is not the fact of Zorn’s lemma (in ZFC) but rather the way it is typically used in the proofs of theorems. The reason I discussed the relationship between Zorn’s lemma and the axiom of choice was partly that it would be criminal not to and partly that it provided me with a good second example of an application of the lemma.

It’s also worth saying that from the point of view of the working mathematician, and also of the Tricks Wiki, Zorn’s lemma is not the same as the axiom of choice. Yes, they are logically equivalent, but they are not equivalently useful: for many applications of the axiom of choice it is far more convenient to use Zorn’s lemma instead, because if you don’t then you effectively end up having to prove it, or a special case that is no easier than the result in its full generality. (Alternatively, one could use induction on the ordinals: if I have time then I may write a companion article to this one, in which I would explain how Zorn’s-lemma proofs can often be converted quite easily into transfinite-induction proofs, and vice versa.)

Mark, I will soon have another Tricki article ready that will include amongst other things a proof of precisely the (contrapositive of the) result you refer to: every additive Lebesgue measurable function on $\mathbb{R}$ is linear.
Emmanuel Kowalski Says:
August 13, 2008 at 7:40 pm | Reply
As a followup to my previous comment, I found again the paper where a proof of the “automatic continuity” was given (indeed, for X–>Y linear with X Banach space and Y a normed space); the argument there (due to Étienne Matheron) is short and nice (and clever) enough that I just explained it in another post.
Terence Tao Says:
August 13, 2008 at 10:46 pm | Reply
Hmm, fair enough; Zorn’s lemma doesn’t require that each choice be locally best, only that it be better than (i.e. larger in the partial ordering than) the status quo.

One nice thing about this perspective is that it contrasts transfinite induction (which corresponds to the case where the choice is deterministic at each stage) with Zorn’s lemma (which is the non-determined choice case). It is then clear how the axiom of choice is used to reduce the latter to the former.
elacaraq Says:
August 27, 2008 at 6:59 am | Reply
Dear Sir, after reading your interesting article, my understanding of Zorn’s Lemma is: if we need a “maximal” set with property P (such as a set of basis, or a maximal ideal), we just pick an ascending chain of sets with that property P, and show their union has property P, then by Zorn’s Lemma such a maximal set exists. Is this understanding correct?

I like your articles a lot. Hope to see your explanation of transfinite induction soon.
gowers Says:
August 27, 2008 at 12:44 pm | Reply
I’d say that’s roughly it, but I prefer to be a bit more specific about the notion of “ascending chain” just to be absolutely clear that we’re usually talking about more than just set containment. (In other words, the basis example is not wholly representative.) In general, we are dealing with objects with more structure than mere sets, and the ordering associated with the chains is something along the lines of “object A is smaller than object B if B could turn out to be an extension of A in the building-up process”. Also, when you say “we just pick an ascending chain” what you mean is more like “whatever ascending chain we pick”.
Jimmy Says:
September 1, 2008 at 8:43 am | Reply
I have some doubt about the following statement:

Now \mathbb{R}, considered as a vector space over \mathbb{Q}, is certainly infinite-dimensional. In fact, it has uncountable dimension. So does it have a basis? (All we mean by “has uncountable dimension” is “cannot be spanned by countably many vectors,” so it is not true by definition that it has a basis.)

Do you mean the number of basis is uncountably infinite? (Suppose A is the set of basis.) I think it should be countable many, by the simple reason that we could write R as Q^A, which reveals that A should be countable many. Would you please explain that?

Thanks.
Zen Harper Says:
September 1, 2008 at 11:47 am | Reply
Hi everyone,

I recently saw in a book (Topological Vector Spaces, Distributions and Kernels by Francois Treves) that there is a Borel Graph Theorem for Banach spaces (and more generally), i.e. a big generalisation of the Closed Graph Theorem.

This is vaguely related to the above discussion; any unbounded linear operator on a Banach space must have a non-Borel graph. For $R^n$ this is weaker than (but in some ways “close to”) saying non-Lebesgue measurable, which intuitively (at least to me) means “very nasty” and “close to” requiring the Axiom of Choice (again, “very nasty and nonconstructive” in my opinion).

Apologies for all this vagueness, and the logical gaps. It doesn’t prove anything, but it intuitively suggests (at least to me) that unbounded linear operators between Banach spaces (defined on the whole space) are necessarily nonconstructive, which is in fact true as E. Kowalski says above (and gives a nice link about this).

Incidentally, does anyone else feel that AC is rather nasty, and that most undergraduate (or even graduate) students are not sufficiently well aware of this? Of course the extent of allowable nastiness is a matter of taste; I personally try to avoid it whenever possible, and explicitly warn students when it is being used. But maybe I am in a minority.
gowers Says:
September 2, 2008 at 9:27 am | Reply
Jimmy,

What is wrong with your friend’s argument is that the definition of the set spanned by the basis A is the set of *finite* linear combinations of elements of A. So R is not Q^A but the much smaller subset that consists of functions that are 0 in all but finitely many places. If A is countable then it’s easy to check that the set of finite linear combinations of elements of A is also countable. So if it’s to equal R then A must be uncountable.
The forthcoming launch of the Tricki « Gowers’s Weblog Says:
October 15, 2008 at 10:59 pm | Reply
[…] the article that explains the technique? Similarly, if you are trying to build some structure and Zorn’s lemma is what you need to make your argument work, what key words or tags are going to help you? It […]
olaiju Says:
January 19, 2009 at 11:37 am | Reply
pls i want to no the process of making an in complete normed space a complete normed space
245B, Notes 7: Well-ordered sets, ordinals, and Zorn’s lemma (optional) « What’s new Says:
January 31, 2009 at 6:08 pm | Reply
[…] Jan 31: See also "How to use Zorn’s lemma", by Tim Gowers.] Possibly related posts: (automatically generated)How to use Zorn’s […]
Prasad Says:
April 13, 2010 at 7:35 pm | Reply
In the following paragraph, I do not follow the argument from the lines starting with the quotation marks.

Is the ordering on $U$ a well-ordering? Yes, since if $V$ is a non-empty subset of $U$, then there must be an attempt $Y$ in the chain such that $V\cap Y$ is non-empty. But then $V\cap Y$ has a minimal element $v$ in $Y$. “This must be minimal in $U$ as well. To see why, suppose that $v’\leq v$ in $U$. Then there must exist an attempt $W$ in the chain with $v’\in W$. If $v’\notin V$ then $V$ does not contain $W$, so we must have $V\leq W$. But this is a contradiction, since $V$ must then be an initial segment of $W$ and we have $v’\leq v$ with $v’\in W\setminus V$ and $v\in V$. Therefore, $v’\in V$, and since $v$ is minimal in $V$ we must have $v’=v$.”

When you say “This must be minimal in $U$ as well”, you actually mean that we need to show that $v$ is a minimal element of $V$ in $U$. Am I right? You consider an element $v’ \leq v$ in $U$. The elements $v’,v \in W$ for some attempt $W$. If $v’ \notin V$, then $V$ doe not contain $W$. This is where I get totally confused. How could you conclude that $V \leq W$ when we do not know that $V$ is an attempt. Again, in the end you claim that $v$ is minimal in $V$, when that is what we set out to prove.
gowers Says:
April 14, 2010 at 3:12 am | Reply
You’re absolutely right. I accidentally wrote $V$ several times when I meant $Y$ . I’ve changed it now. I hope it’s correct, though I haven’t yet checked quite as hard as I should, so it may need further adjustment.
Prasad Says:
April 14, 2010 at 8:09 am | Reply
Thank you for your quick response. My hunt for a good understanding of Zorn’s lemma started with the following problem: Suppose f:S \Rightarrow T and g:T \Rightarrow S. Then show that there exist sets A \subseteq S and B \subseteq T such that f(A) = B and g(T\setminus B) = S\setminus A.

I am not sure if this problem requires Zorn’s lemma but I thought that I had a proof using Zorn’s lemma.
gowers Says:
April 14, 2010 at 9:13 am | Reply
That looks like the lemma you need to prove the Schroeder-Bernstein theorem, which does indeed require the axiom of choice.
chandrasekhar Says:
September 20, 2010 at 1:02 am | Reply
Superb article sir.

The first example, can be further seen as Automorphisms of R which are only the Identity. The standard way of proving it is showing f to be identity on rationals and then proving f to be continuous which would imply f being identity on the whole of R.

Another theorem, which i learnt makes use of Zorn’s lemma is “Existence of Maximal ideals in a Ring R with 1”.
245B, Notes 7: Well-ordered sets, ordinals, and Zorn’s lemma (optional) « mathTHÍCHinTOÁNmyHỌCbrain Says:
December 29, 2010 at 8:35 am | Reply
[…] Jan 31: See also "How to use Zorn's lemma", by Tim […]
Things | Mathematicalypse Says:
April 25, 2013 at 3:10 am | Reply
[…] How to use Zorn’s Lemma (the right […]
An exercise in transfinite magic | Dan Shved Says:
April 29, 2013 at 9:25 pm | Reply
[…] we are now going to turn this sketch into a real proof using Zorn’s lemma. Oh, by the way: here is an absolutely wonderful post about Zorn’s lemma on Tim Gowers’s blog. Anyway, here […]
Andy Says:
May 8, 2013 at 9:39 pm | Reply
gowers, why do you say that Shroeder-Bernstein requires the Axiom of Choice? Given injective maps from X->Y and Y->X, you can explicitly construct a bijection between X and Y; no choices are involved.
- gowers Says:
  May 11, 2013 at 11:58 am
  I wrote that false statement nearly three years ago, and have no idea what was going on in my head at the time …
Bohr (not Niels) revisited | 68, 27, 4, 0, 0, 0, 0, ... Says:
November 19, 2013 at 2:01 am | Reply
[…] learned about Harald’s work in my functional analysis class. By Zorn’s lemma, every Hilbert space admits an orthonormal basis (ONB). However, only separable Hilbert spaces have […]
Cauchy’s Functional Equation and Zorn’s Lemma | Power Overwhelming Says:
April 10, 2015 at 7:12 pm | Reply
[…] the words of Timothy Gowers, If you are building a mathematical object in stages and find that (i) you have not finished even […]
Maxis Jaisi Says:
December 11, 2017 at 3:19 pm | Reply
Can the choice of ${\lambda}_{i}$ be made totally arbitrarily in the construction? If a choice is made so that every member of $\{ {\lambda}_{i} \}_{i \in I}$ is a multiple of some number $a \in \mathbf{R}$ , then the map we construct will be the identity map, which is linear.
- Maxis Jaisi Says:
  December 11, 2017 at 3:20 pm
  Ooops, the map constructed won’t necessarily be the identity map, but it’ll be linear.
Maxis Jaisi Says:
December 11, 2017 at 3:25 pm | Reply
Ignore my previous comment. Here’s what I really wanted to say:

” Can the choice of ${\lambda}_{i}$ be made totally arbitrarily in the construction? If I choose ${\lambda}_{i} = a \in \mathbf{R}$ for every ${\lambda}_{i}$ in the maximally spanning set, we would have constructed a linear map.”
Cauchy’s functional Equation (Part 2) – Infinity Says:
July 29, 2019 at 1:35 pm | Reply
[…] as a vector space over ,from Zorn’s lemma(here is a great article to understand Zorn’s lemma ) it is a standard result that there always […]
John Aiken Says:
February 14, 2021 at 4:19 am | Reply
von Neumann is reputed to have once said to a student, “You don’t understand mathematics, you just accept it.”

Well, you have to at least know how to use it, even if you can’t understand it. Well, I certainly don’t “understand” Zorn’s lemma, but now I know how to use it.

Thank you for giving me an example of how to use Zorn’s lemma. I can use your example to see how the proofs of many theorems in mathematics work, even if the Theorems themselves seem rather mysterious to me.

An interesting thing in your example is that it was only after you had your maximum element that you proved that it was equal to the whole set. WOW!
Anonymous Says:
September 22, 2023 at 6:46 pm | Reply
This was so inspirational, such an epic story. I can’t believe that this happened. I can’t wait to read the sequel. 10/10