My spies tell me that the second lecture of the Group Theory course contained a discussion of functions, and in particular bijections — for which it was necessary to prove a few results about injections and surjections. Since a good understanding of functions is essential throughout mathematics, perhaps a post on the topic would be in order. My aim in these posts is not to cover the material all over again, but rather to stress the points that you need to understand in order to be able to write proofs. So let me give a few thoughts about functions, in no particular order.
1. The range and domain matter.
Two functions are considered equal if they have the same domain and the same range, and if they do the same thing to everything in the domain. In other words, there is more to a function than what it does. For example, let’s write for the set of real numbers and for the set of non-negative real numbers, and let’s define two functions as follows.
In a certain sense, these two functions are defined by the same mathematical process — that of taking a real number and squaring it. But the process is not a function. They are different functions because they have different domains and ranges.
Why do we insist on this point? One pretty convincing reason is that if we didn’t, then we wouldn’t be able to talk about injections, surjections and bijections. For example, the function I have just defined is a bijection, whereas the function that I defined before it was neither an injection nor a surjection.
2. Learn the definition of injection. Don’t try to reconstruct it for yourself.
I’ve lost count of the number of times I have seen attempted explanations like this for what it means to say that is an injection.
(i) is always unique.
(ii) takes exactly one value for every .
(iii) If then .
Those three attempted definitions are all completely wrong. Here are four correct ones. For all three I’ll assume that is a function with domain and range .
(i) Every has at most one preimage. [A preimage of means an element such that .]
(ii) For every , if , then .
(iii) For each there is at most one such that .
(iv) For every , if , then .
Of these, I recommend avoiding (iii). It’s correct, but it seems to be the definition that leads to the kind of nonsense of the first three definitions if you misremember it. Note that (ii) and (iv) are closely related: to convert (ii) into (iv) I simply replaced the statement
inside the quantifier by its contrapositive
I recommend usually avoiding (iv) as well. Why? Because it is usually easier to work with the “positive” hypothesis that than it is to work with the “negative” hypothesis that .
3. How to prove that a function is an injection.
If you are asked to show that a function is an injection, then my advice is to use definition (ii) above, by which I of course mean the correct definition (ii) rather than the incorrect one. Here’s the definition again.
How do we use that? Well, it involves a two-step process that you should program into your brain until you do it without thinking. If you ever find yourself asked to prove that a function is an injection, you first bring the definition above into the front of your mind. In order to be able to do that, you have to have learnt the definition first. A major theme of these posts will be that you can program yourself to do a lot of things automatically, but in order to do so you have to invest some time learning definitions and basic results.
Once you have recalled the definition, the second step is very easy indeed, and is something that I covered in the post about handling variables. If you have to prove a statement of the form
then you write down the words, “Let “. I call this the “let” trick. A great thing it does for you is remove one layer of quantification: if you find lots of quantifiers scary, then removing one can only be good news.
In most situations that crop up, the statement is itself of the form , and then you can do a little more while still on autopilot. If you need to prove the statement
then the first words you write should be
At any rate, you should declare the variable and then tell your reader that you are assuming .
Let us see how that applies here. The statement we want to prove is as follows.
Applying the “let” trick, we begin as follows.
We then see that we want to show that one thing implies another. So we follow up with what one might call the “suppose” trick, and end up with this.
Up to this point, no thought whatsoever has been required. So if you have any difficulty before getting to this stage, it is an excellent example of what I called a “fake difficulty” in an earlier post. The only reasons you might have for finding it difficult are
(a) that you haven’t learnt the definition of an injection;
(b) that you haven’t programmed yourself to apply the “let” and “suppose” tricks automatically.
Neither of those counts as a legitimate difficulty.
Let me show this process in action. You were shown the proof that a composition of two injections is an injection. This is a classic example of a result with a proof that you should try not to learn. Of course, you need to know the proof, but instead of learning it, you should try to find it easy. If you don’t already find it easy, let me show you just how little you have to think of if you follow the advice I have just given.
First, we ought to write out more precisely the result we are trying to prove, making clear what the domains and ranges are, declaring all our variables, and so on.
Proposition. Let , and be sets, let and let . Suppose that and are injections. Then their composition is an injection.
OK, let’s get started with the proof. What is that we are trying to prove? Oh yes: that the function is an injection. And what does that mean? If you’ve got the definition at your fingertips you’ll remember instantly that we need to show that for every , if , then . Applying the “let” and “suppose” tricks, we write this:
Now there’s something else that is good to do in a situation like this, which is to write the expressions and in a more transparent way. We define the value of at to be , so why don’t we write it like that, so we can see more easily what is being said? In fact, let’s rewrite the sentence I’ve just written, so that it reads as follows.
How did I know that it would be a good idea to rewrite the sentence like that? Well, I have to admit that it was a matter of experience, but there are a few situations where this kind of rewriting is almost always a good idea. This is one of them. Another is when you have a sentence of the form , where is a set that has been defined in terms of some property. For example, the closed interval is defined as the set of all real numbers such that . If you see the sentence , you will usually make life easier if you replace it by the equivalent sentence . As you go through the course, you should make a little collection of useful rewritings of this kind so that you do them without thinking. I’ll do what I can to get you started.
Right, where were we? We had written this.
Our aim at this point is to prove that . (If you don’t find it obvious that that is our aim, then you should go back and reread this section.) Why should and be equal? It’s not clear, but then why would it be clear? After all, we haven’t used the hypotheses yet!
Our hypotheses were that is an injection and that is an injection. If you have the definition of injection at your fingertips, then you will know that this tells you the following.
The first of these statements looks very promising, since the conclusion we are aiming for is precisely that two elements of are equal. But to use this statement we need to establish that , and all we have so far is that . So we’re not in a position to use the first hypothesis yet.
What about the second? for this we need to find two elements and of such that . We know that . Does that give us our and ?
Yes it does. Since takes elements of to elements of , we know that and are elements of . So it would be insane not to see what happens if we set and . If we do so, then we find that and therefore that (because is an injection). Remembering what and were, we realize that we have established that . But that was what we were looking for in order to use the fact that is an injection, so now we find that , as we wanted.
Now that was a rather long account of what might happen in your head, but the proof itself is very short. Here it is in full.
Note that in that proof I did not write out the definition of an injection. I just had it in my head and I assumed that the reader did too. I also assumed that the reader was happy with the “let” and “suppose” tricks.
Note also that, as the above proof makes very clear, one can think of “ is an injection” as a kind of cancellation law. It says that whenever you write something of the form you can “cancel” the s on both sides to get . So the phrase “since is an injection” can be read as “I’m allowed to cancel the s — honest.”
Here is a second example. If you have understood the above, then you might like to try it yourself before reading on: you should find it easy. Recall that a function is strictly increasing if whenever .
Exercise. Let be a strictly increasing function. Prove that is an injection.
The point of this exercise is to write out the kind of short, crisp, utterly rigorous and non-wordy argument of the kind that I used to prove that a composition of injections is an injection, using the kinds of proof-generating techniques I have been talking about. Merely satisfying yourself that the statement is “clearly true” is not enough. I’d almost go as far as to say that if you have to stop and think why the result is true (rather than just proceeding on autopilot and never getting stuck), then you haven’t done it properly. While that’s a slight exaggeration, I am at least happy to say that thought is not needed, and even that there is a danger that too much thought will lead you astray.
OK, let’s try operating on autopilot. We are trying to show that our strictly increasing function is an injection. So we write out the following first line without thinking.
Hmm … we seem to be in trouble, because there isn’t an obvious way of using the fact that is increasing. Or rather, there is, but the argument is more like one by contradiction: we want to say to ourselves that the fact that tells us that can’t be less than (or else would be less than ) and can’t be less than (or else would be less than ). So the only possibility left is that .
But the fact that we’ve ended up arguing like that suggests that we would have been better off going for the contrapositive in the first place. That is, we could take definition (iv) as the one we are using and start the argument as follows.
How do we use the fact that is strictly increasing? Well, if then either or . That gives us our way in. So the next line of the proof is as follows.
Now recall what I said in the post on AND and OR: if you want to use a statement with an OR in the middle, then you have to split into cases and show that the conclusion holds in all cases. So that’s what we do next.
I think that is the most natural proof of this statement, but just out of interest, here’s a slightly different way of doing it. It’s not all that different really — it just shifts the taking of the contrapositive to an earlier stage and packages it up as a lemma. (A lemma, by the way, is a statement that you establish separately with a view to using it to prove other statements.)
Lemma. Let be a strictly increasing function and let and be real numbers. If , then .
Proof. This is simply the contrapositive of the statement that if then .
With that lemma to hand, we can now argue as follows.
Proposition. Let be a strictly increasing function from to . Then is an injection.
Proof. Let and be real numbers with . Then and . By the lemma, it follows that and . Therefore, .
Notice that once I had packaged up the contrapositive-taking inside the lemma, I was able to use the “let” and “suppose” tricks with Definition (ii) rather than Definition (iv).
4. How to prove that a function is a surjection.
I’ll be briefer about this, since I’ve made several general points in the previous section that I don’t need to repeat.
Suppose, then, that we are given a function and are asked to prove that it is a surjection. How should the proof begin? Once again, we should bring the relevant definition to the front of our mind, which is as follows:
Since what we are trying to prove begins, “For every ,” we can apply the “let” trick. Therefore, the first line of the proof should be, “Let .” Once we’ve written that, our task is to prove that there is some such that . That is, we are trying to solve an equation.
Up to now I may have given the impression that I don’t believe that there is such a thing as genuine mathematical difficulty. But that is not at all what I think. There are many apparent difficulties that shouldn’t be difficulties at all, but as a general rule, finding things is where the going starts to get tough. Sometimes that isn’t true and those things are handed to you on a plate, but sometimes very sophisticated mathematics is needed. Here we have an arbitrary and we need to show that there is some such that . Let me give a few examples, some where it is easy and some where it is hard.
Example 1. Let be given by the formula , where and are real numbers and . Prove that is a surjection.
Solution: Let . Then . The result follows.
To come up with that proof I solved the equation for .
Example 2. Let be given by the formula . Prove that is a surjection.
Solution: Let . If , then , and if , then . Therefore, we can find such that . Since is continuous, there must be some between and such that .
Here I used the concept of continuity and a result called the intermediate value theorem: you will cover these in Analysis I next term. I used that result because I don’t know how to solve the equation (and neither do you). So I contented myself with proving that there exists a solution.
Example 3. A composition of two surjections is a surjection.
Solution: [Obtaining this argument is an entirely thought-free process. First we set up the problem properly by giving names to things.] Let , and be sets and let and be surjections. [Next, we apply the "let" trick to the statement we are trying to prove, which is that is a surjection, which is the statement that for every there exists such that -- which we rewrite as .] Let . [We can't immediately see an that will do, so let's turn to the hypotheses that is a surjection and that is a surjection. The first one doesn't help us much but the second does.] Since is a surjection there exists such that . [Now we've got a point in so we can use the hypothesis that is a surjection.] Since is a surjection, there exists such that . [Note that I went from "there exists " to behaving as though I had fixed , without actually saying that that was what I was doing. See the post on handling variables for a discussion of this. As for the proof we are trying to write, it's basically over.] But then . Therefore, is a surjection.
I mentioned that one way of thinking of the hypothesis that is a injection is that it tells you that you can cancel in any equation of the form . You can think of the hypothesis that is a surjection as saying that the equation always has at least one solution.
Example 4. The closed unit disc is the set of all points such that . Let be a continuous function from to such that whenever (That is, “fixes the boundary”.) Prove that is a surjection.
Solution: This is meant to illustrate that proving that a function is a surjection sometimes involves maths that is well beyond what you have covered so far. It relies on a result called Brouwer’s fixed point theorem. That result says that every continuous function from to has a fixed point. That is, if is continuous, then there must be some such that .
We make the decision to prove the result by contradiction and Brouwer’s theorem. That is, we shall assume that there exists a continuous function that fixes the boundary and is not a surjection, and we shall attempt to prove, from that, that there is a continuous function from to with no fixed point, which will contradict Brouwer’s fixed point theorem. (Of course, isn’t such a function, since every point on the boundary of is a fixed point of .)
Now, we are assuming that is not a surjection, so we need to flex our logical muscles and negate a statement with quantifiers. The statement is this.
Note that I’ve switched to representing points by single letters rather than giving them in coordinate form. So in the above statement and from now on and are points in the circle rather than coordinates of points.
Following the mechanical rules set out in the post on negation, we obtain the negation, which is this.
Now we want to define a continuous function with no fixed point. At this point we are trying to find or make something, so if we run into difficulties, we at least have the comfort of knowing that they are not fake difficulties. I happen to know how this one goes because at some point in the past I was shown it …
The idea is this. We shall do things in two stages. The first stage is to define something called a continuous retraction from to its boundary. That means a continuous function from to the boundary of that sends each boundary point to itself. (If you’re worried that such a function can’t exist, then don’t, because you’re right that it can’t. We’re writing a proof by contradiction, and that means that we’re in nonsense world.) For each we need to say what boundary point is. To do that, we take the point given above (by the negation of the statement that is a surjection) and draw a straight line segment from to . Note that our hypothesis tells us that and are different points. We then continue along that line until we hit the boundary of and let be the point where we hit.
If you know the right bits of maths (which you don’t), then it isn’t hard to show that is continuous. Also, if is on the boundary, then (by hypothesis), from which it follows that (by the way we defined ). So we’ve got our continuous retraction.
To create a continuous function with no fixed point, we just rotate the continuous retraction. That is, we just define to be what you get if you rotate through, say, 90 degrees anticlockwise about the origin. Then the only way could conceivably equal is if is itself on the boundary. But then isn’t , since it’s a rotation of .
Two important contrasts.
Concrete problems versus abstract problems.
Amongst the surjection proofs I’ve just discussed, there are two very different kinds. One is where I have a specific function, such as (from to ) and I show that it is a surjection. This I call a concrete problem. The other is where I have an unknown function about which I am given some information, and my task is to deduce from that information that it is a surjection. This I call an abstract problem.
It might be better to say that there is a spectrum with fully concrete at one end and fully abstract at the other. For example, when I proved that the function was a surjection, that felt pretty concrete, but I didn’t actually know what and were. I could have rephrased the problem to sound more abstract, by saying, “Let be a linear function with gradient not equal to zero. Prove that is a surjection.”
Finding things versus proving that they exist.
I’ve already drawn attention to this. For the first problem above, I actually found such that . For all the other problems, I proved that such an exists without actually finding one.
Graphs of injections and surjections.
I have encouraged you not to think when starting to write proofs that involve injections and surjections. This may go against the grain a little: surely one should be trying to understand injections and surjections rather than mechanically writing proofs without having the faintest idea what they mean.
My actual view about this is, not surprisingly, that both understanding and mechanical fluency are important. I am stressing the mechanical fluency side of things because my experience supervising suggests that where people run into problems it is the mechanical side that needs to be improved: the picture in their brains isn’t too bad, but they have trouble turning observations about this picture into a correctly written proof.
Nevertheless, it is worth saying just a little bit about how to visualize injections and surjections, at least in the case of functions from to . (Once you understand these, it should be reasonably easy to transfer your understanding to more general functions.)
The way to do this is to think about graphs. Let’s start with a very basic question: which subsets of the plane are graphs of functions from to ? Well, the point of a graph is that for each coordinate there is exactly one coordinate such that belongs to the graph. If the function in question is , then we say that . (Usually we think about this the other way round: we start with the function and draw the graph. Here I’m thinking about the graph and saying what the function is that it is a graph of.)
In visual terms, what this is saying is that a subset of is the graph of a function if and only if it satisfies the following condition.
Note that this is really saying two things: each vertical line must intersect in at least one point (or we wouldn’t have a value for ) and also at most one point (or we would have more than one value — which isn’t allowed).
Now let’s think about injections. For the purposes of this discussion it will be convenient to use the following definition.
If we’re given the graph of a function , how do we find all the solutions to the equation for some fixed ? We draw the horizontal line , and at each point where it cuts the graph we have a solution. Why? Because the line will cut the graph at a point of the form , and since the graph is of the function , that tells us that .
Therefore, if we want there to be at most one solution, that tells us that the line cuts the graph in at most one place. That gives us the following criterion for the graph of a function to be the graph of an injection.
A very similar line of reasoning leads to the following companion criterion for a function to be a surjection.
Putting those two together, we get a criterion for a function to be a bijection.
A final remark is this. If a function has an inverse, then the graph of the inverse is obtained from the graph of by reflecting it in the line , as you have probably seen at A level. But if you reflect the graph of a function, you don’t necessarily get the graph of a function. When do you get the graph of a function? Well, after the reflection you need every vertical line to cut your set in exactly one place, so before the reflection you need every horizontal line to cut the set in exactly one place. We can split that up into two conditions: cutting in at least one place and cutting in at most one place. That generates the definitions of surjection and injection. So if you ever stop to wonder where those definitions come from and why they are important, the answer is that they come from a consideration of what we need if we want to invert a function.
How to prove that a function is a bijection.
Do I need a section here? Surely if I’ve discussed how to show that a function is an injection and I’ve discussed how to show that a function is a surjection, then that’s all that’s needed. If you want to show that a function is a bijection then you have to show that it is an injection and that it is a surjection. Right?
Wrong. Suppose you were asked to prove that the function is a bijection from to . Would your argument look like this?
First let us show that is an injection. Suppose that . Subtracting 5 from both sides and then dividing both sides by 3 we find that . Now let us show that is a surjection. Let . Then . We have shown that is an injection and a surjection. It follows that is a bijection.
Perhaps it would. But then you have not yet picked up a very important mathematical principle, which is this.
Do we have any statement equivalent to “ is a bijection”? Yes we do. A function is a bijection if and only if it has an inverse. Let’s see what happens if we try to use this. We’re trying to show that the function is a bijection. Is it easy to show that it has an inverse? Yes, the function is the inverse. Therefore, is a bijection.
I’m not saying that you should always find an inverse if you want to show that a function is a bijection. What I am saying is that if it’s easy to find an inverse, then that will give you a better proof.
This tip holds not just for concrete problems (where you have to show that a specific function is a bijection) but also for more abstract ones (where you are given a function with certain properties and need to show that it is a bijection). And the general message that you shouldn’t always go back to the definition can be applied throughout mathematics. Indeed, so important is this principle that I plan to devote an entire post to it at some point.
There is one aspect of injections and surjections that I have not discussed here, which is how to show that a function is not an injection or a surjection. I hope to discuss that as part of a more general post on counterexamples, but let me end with a proof that the function is not an injection or a surjection, when considered as a function from to .
Proof that is not an injection: .
Proof that is not a surjection: there is no such that .
Note the total lack of waffle, and the fact that to prove my point I just needed one thing to go wrong in both cases.