## Domains, codomains, ranges, images, preimages, inverse images

If I were writing a textbook, I would have discussed the basics of functions before talking about injections and surjections, but this is not a textbook — it is a series of blog posts that provide a kind of commentary on some of the lecture courses. However, now that I have got on to the subject of functions, it probably makes sense to discuss them a bit more, especially as I hear that they have made an appearance in Numbers and Sets.

Let me start with the most basic question of all: what is a function? This is one of the first examples (of many, unfortunately) of a concept that you were probably reasonably happy with until your lecturer explained it to you. This is absolutely not a criticism of your lecturer (who is excellent, by the way). It’s more like a criticism of an entire mathematical tradition that goes back to the days when the foundations were laid for our subject in terms of set theory.

Let’s have some examples of functions. The ones you are likely to have come across are ones that take real numbers to real numbers: things like $f(x)=x^2$, $g(x)=e^x$ or $h(x)=-2\sin(x+\pi/3)$. If someone asked, “Yes, but what are $f,g$ and $h$?” then an appropriate response might be, “Well, $f(x)$ is the result of doing something to $x$ that turns it into another real number.”

Notice that in that last sentence I slightly avoided the issue. I didn’t say what $f$ itself was — I just said what $f(x)$ was. (It’s another real number.) Notice also that for a specific function we don’t feel quite as tempted to ask what it really is: we don’t say, “What is squared?” That’s because “squared” isn’t a noun, so it feels wrong to imagine that it must be a thing, just as you would expect strange looks if you went around asking, “What is and?” (That’s not to say that an answer isn’t possible: one could say that AND is a logical connective, and a logical connective is something that joins two statements to form a new statement, and it could, if you wanted, be regarded as a function from the set of all pairs of statements to the set of all statements, etc. etc.)

What really matters about a function is not so much its essence as the following fact.

• If $f$ is a function from $A$ to $B$ and $x$ is an element of $A$, then $f(x)$ is an element of $B$.
• Given a function $f:A\to B$, we can define a natural notion of the graph of that function. It is the set of all points $(x,f(x))$ such that $x\in A$. To put it another way, it is the set of all points $(x,y)\in A\times B$ such that $y=f(x)$. This set has the property (discussed in the previous post) that for every $x\in A$ there is exactly one $y\in B$ such that $(x,y)$ belongs to the graph of $f$.

And now the officially correct thing to do is to turn everything on its head and make the following definition.

• Let $A$ and $B$ be sets. A function from $A$ to $B$ is a subset $f$ of $A\times B$ such that for every $x\in A$ there is exactly one $y\in B$ such that $(x,y)\in f$.
• Then one goes on to say that it is traditional to write $y=f(x)$ instead of $(x,y)\in f$. Equivalently, we define $f(x)$ to be the unique $y$ such that $(x,y)\in f$.

Why does anyone bother with this strange definition? One reason is that we sometimes want to talk about the set of all functions from $A$ to $B$, or, even more commonly, the set of all functions from $A$ to $B$ that satisfy certain conditions. It’s one thing to be able to recognise a function when you see one, but how do you say what counts as a function? We somehow want to capture the idea that any way of associating with each $x\in A$ some $y\in B$ counts as a function, even if that “way of associating” isn’t given by a rule of any kind.

It may seem as though the “take any old subset of $A\times B$ as long as for each $x\in A$ there’s exactly one $y\in B$ such that $(x,y)$ belongs to that subset” is a pretty neat way of capturing this arbitrariness. And in a sense it is. I think psychologically we are happier with the idea of a completely arbitrary set than we are with the idea of a completely arbitrary “way of associating” elements of one set with elements of another. But just because we are happier with it, that doesn’t mean we should be happier with it. What is an “arbitrary” set of integers, say? Sets of integers defined by properties such as “the set of all $n$ such that $n$ is a prime greater than 1000″ are fine, but how can we capture the idea of a “completely arbitrary” set of integers that doesn’t have a definition of that kind? It’s more or less the same problem as that of capturing the idea of a completely arbitrary function that isn’t given by a rule. So, I respectfully submit, the subset-of-Cartesian-product definition of functions achieves virtually nothing.

That will probably provoke a lot of disagreement, so let me qualify it slightly. It is useful for some purposes to “reduce everything to sets”. If I am working on the foundations of mathematics and I show that every statement to do with functions can be translated into an equivalent statement to do with sets, then I have shown that if I can sort sets out then I don’t have to do any further work to sort out functions as well. But in what one might call “everyday mathematics” I think that the definition of functions in terms of subsets of Cartesian products is of no use whatsoever.

Hmm, I’m worried that I’m still exaggerating. Let me consider a statement that might make it seem important to answer the question, “What is a function?” It is the following.

• There are uncountably many functions from $\mathbb{N}$to $\mathbb{N}$.
• You haven’t yet been told what this means, but for now just think of “uncountably many” as meaning “not just infinitely many but an extra-specially big infinitely many” or something like that. We obviously can’t prove a statement like that by listing a whole bunch of functions, so doesn’t that force us to have some general idea of what a function is?

Here are two arguments that it doesn’t. One is that I can prove this by reducing it to another problem. For each positive real number $\alpha$ I can define $f(n)=\lceil\alpha n\rceil$ (this means the smallest integer greater than $\alpha n$). It’s easy to check that these are all different functions. And then we can appeal to the fact that there are uncountably many positive real numbers.

The second argument is closely related. It is also true that there are uncountably many subsets of $\mathbb{N}$, but nobody feels that we have to say what a subset of $\mathbb{N}$ is in order to make sense of this statement. We just need to know a few rules for dealing with sets: in particular, for defining new sets out of old ones.

Just before I move on, let me express particular distaste for any definition that begins, “A function from $A$ to $B$ is a relation such that …” I absolutely hate this. The reason I hate it is that functions and relations are, to any reasonable person, different kinds of things, except that I don’t want to call them things at all, so what I really mean is that they have a different grammar.

To illustrate what I mean by grammar, it’s rules like this.

1. If $f:A\to B$ and $x$ denotes an element of $A$, then $f(x)$ denotes an element of $B$.

2. If $P$ and $Q$ denote statements, then $P\vee Q$ denotes a statement.

3. If $\sim$ is a relation on $A\times B$, $x$ is an element of $A$, and $y$ is an element of $B$, then $x\sim y$ is a statement.

4. If $P$ is a property defined on a set $X$ and $x\in X$, then $P(x)$ is a statement.

5. If $P$ is a property defined on a set $X$, then $\{x\in X:P(x)\}$ is a subset of $X$.

I’ll talk more about this kind of thing in a later post, but I hope these examples give you the idea. And 1 and 3 demonstrate that the grammar of functions is not the same as the grammar of relations. The fact that you can turn them into equivalent concepts that do have the same grammar is neither here nor there. You can do that with nouns and adjectives too. For example, I could decide that from now on I’m going to say, “There’s a red” where I used to say, “That’s red.” If I wanted to say, “My car is green,” I would say, “My car is a green,” just as I might say, “That dog is a Rottweiler.” It would be possible (basically, nouns and adjectives are both ways of picking out some subset of the set of all possible objects in the world) — but it would also be a bit weird.

But none of this what I really wanted to talk about. Rather, I wanted to discuss a few confusing bits of terminology.

To begin with, what are domains, ranges and codomains? What’s confusing about this is that there isn’t a standard terminology. The one I was taught as an undergraduate was this. If $f:A\to B$ is a function, then $A$ is called the domain of $f$ and $B$ is called the range of $f$. That’s the sense in which I’ve been using the words in these posts and I hope it agrees with what your lecturers have said.

However, some people use the word “codomain” for $B$ instead of “range”. Worse still (from the point of view of communication between mathematicians), people who call $B$ the codomain often use the word “range” to refer to the set of values taken by $f$: that is, to the set $\{f(x):x\in A\}$. I call that the image of $f$, and I think that is probably the Cambridge standard.

To give an example, let’s take the function $f:\mathbb{R}\to\mathbb{R}$ defined by $f(x)=x^2$. In my terminology, the domain and range are both $\mathbb{R}$ and the image is the set of non-negative real numbers. I’m also happy to call $\mathbb{R}$ the codomain if you want — for me, “codomain” and “range” mean the same thing.

However, some people would say that the domain and codomain are $\mathbb{R}$ while the range is the set of non-negative reals. I think they would regard “range” as synonymous with “image”.

So I suppose we would all get along fine if we just abolished the word “range”.

Unfortunately, that isn’t the end of the confusion, since some people use the word “domain” to mean “set of points where $f$ makes sense”. To give an example, they might say this: “Let us define $f:\mathbb{R}\to\mathbb{R}$ by $f(x)=1/x(x+2)$. Then the domain of $f$ is the set of all real numbers apart from $0$ and $-2$.” Apparently, this use of the word “domain” is quite common in schools. According to my terminology, the function just defined was not a function from $\mathbb{R}$ to $\mathbb{R}$ at all. Why not? Because it doesn’t assign a real number to 0 or to -2.

So if you want to be safe, then given a function $f:A\to B$ you can call $A$ the domain, $B$ the codomain, and $\{f(x):x\in A\}$ the image.

That is still not the end of the potential confusion. I said earlier that the one thing you need to know about a function $f:A\to B$ is that if $x$ is an element of $A$ then $f(x)$ is an element of $B$. In a nice friendly world, you could deduce that whenever you see $f(***)$, then whatever $***$ is must be an element of $A$. Unfortunately that’s not the case in the world we actually inhabit: we often write $f(C)$ when $C$ is a subset of $A$.

Now in a sense that’s just plain incorrect. Functions from $A$ to $B$ are little machines that turn elements of $A$ into elements of $B$. So how can we write $f(C)$ if $C$ is a subset of $A$? The answer is that the wrongness of doing that tells us one of two things:

(i) the writer has made a mistake;

(ii) the writer means something different.

You should have enough confidence in your lecturers and textbooks to assume that it is (ii) that holds and not (i). So what does $f(C)$ mean? It means the set of all $f(x)$ such that $x\in C$, or in symbols $\{f(x):x\in C\}$. For example, if $f$ is the function from $\mathbb{N}$ to $\mathbb{N}$ that takes $n$ to $n^2$ and $E$ is the set of all even numbers, then $f(E)=\{4,16,36,64,100,144,...\}$. The set $f(C)$ is a subset of $B$ and it is called the image of $C$.

The alarm bells should be ringing again. Earlier, I defined the image of $f$ to be the set $\{f(x):x\in A\}$, which we now see that we can write as $f(A)$. So is the set $f(A)$ the image of $f$ (as I said earlier) or the image of $A$ (as would be consistent with the more recent definition)? You just have to be alert to the context. Functions have images, but if you’re talking about a given function that’s clear from the context, then you also talk about images of subsets. Oh, and while we’re at it, if $x$ is an element of $A$, then $f(x)$ is called the image of $x$.

There is nothing for it but to get used to the fact that the same words and notation can be used for concepts with different — and confusable — meanings. If you see $f(***)$, then one of the first things you should do is look at what’s in those brackets and ask yourself what kind of object it is. If it’s an element of the domain (which will usually be indicated by a lower-case letter, which helps to reduce the confusion), then what you’ve got is an element of the codomain. If it’s a subset of the domain, then what you’ve got is a subset of the codomain.

Let me give a different example of this kind of use of “element notation applied to subsets”. If $A$ and $B$ are sets of integers, we sometimes write $A+B$. You might object that this cannot be correct, on the grounds that $A$ and $B$ are sets and you don’t add sets together — you take things like unions and intersections. And that objection is right in the following sense: when we write $A+B$ we are not giving the usual meaning to the plus symbol. So what do we mean? We actually mean something fairly natural, which is the set of all numbers you can make by adding something in $A$ to something in $B$. In symbols, $A+B=\{x+y:x\in A, y\in B\}$.

A quick example: if $A=\{1,3,5,10\}$ and $B=\{1,4,7\}$, then $A+B=\{2,4,5,6,7,8,9,10,11,12,14,17\}$. Here, I’m not really adding sets: I’m adding the elements and forming a set out of all possible results. In a similar way, when I take the set $f(A)$, I’m not applying the function $f$ to the set $A$: I’m applying $f$ to the elements of $A$ and forming a set out of all possible results. It’s a very important distinction.

I’ve said that if $f:A\to B$ and $x$ is an element of $A$, then $f(x)$ is called the image of $x$. Something similar happens the other way round. If $f(x)=y$, then $x$ is called a preimage of $y$. Note a very important distinction between these two definitions: I talked about the image but a preimage. That’s because the definition of a function requires there to be exactly one image for each element $x\in A$, but if I pick $y\in B$ it might not have any preimages, and it might have more than one preimage.

Finally, a rare situation where we don’t use the same word twice — however, we make up for this big time in our choice of symbolic notation. If we have a subset $D$ of $B$, then the inverse image of $D$, denoted $f^{-1}(D)$, is defined to be the set of all preimages of elements of $D$. Equivalently, it is the set of all $x\in A$ such that $f(x)\in D$. Equivalently again, it is the set $\{x\in A:f(x)\in D\}$.

Here’s a quick example. Let $A=\{1,2,3,4,5\}$, let $B=\{1,2,3,4\}$ and define $f$ as follows: $f(1)=f(2)=1, f(3)=2, f(4)=f(5)=3$. Then the inverse image of the set $\{1,4\}$ is $\{1,2\}$. Why? Because 1 and 2 are the elements of $A$ that have images that belong to the set $\{1,4\}$.

I hope you will have noticed something very important about that example, which is that the function $f$ in question does not have an inverse. In fact, it doesn’t even come close: the fact that $f(1)=f(2)$ shows that it isn’t an injection (which means that if we tried to form an inverse $g$ we wouldn’t be able to decide between setting $g(1)=1$ and $g(1)=2$) and the fact that 4 has no preimage shows that it isn’t a surjection either (we would have no idea what value to give to $g(4)$). And yet, I happily wrote $f^{-1}(\{1,4\})$. (Actually, I didn’t write it, but I’m writing it now. And I’m happy.)

This is a very frequent source of confusion. Generation after generation of Cambridge undergraduates see an expression like $f^{-1}(D)$ and conclude, wrongly but not entirely unreasonably, that $f$ has an inverse. Indeed, it looks as though it must mean the image of $D$ under the inverse function of $f$. But it doesn’t (except that if $f$ does happen to have an inverse, then $f^{-1}(D)$ does happen to be the image of $D$ under that inverse).

The best I can do to help you with understanding inverse images of sets is this. If you ever see a sentence of the form $x\in f^{-1}(D)$, then you are at liberty to translate it into the equivalent but more transparent sentence $f(x)\in D$. What’s more, I recommend doing so.

Suppose, for example, that you are asked to prove the following simple fact. (At least, it’s very simple once you are used to the definitions and to standard techniques for writing proofs.)

Fact. Let $X$ and $Y$ be sets and let $f:X\to Y$. Let $A$ be a subset of $X$ and $B$ be a subset of $Y$. Prove that $f(A)\subset B$ if and only if $A\subset f^{-1}(B)$.

If you’re asked to prove an if and only if, then you start by assuming one side and deducing the other, and then you prove the implication in the opposite direction. So let’s begin by assuming that $f(A)\subset B$. What do we need to prove? We want to show that $A\subset f^{-1}(B)$. How do we show something like that? If you’ve learnt the definition of “is a subset of”, then you will call up to the front of your brain the following statement as what we want to prove.

• For every $x\in A$, $x\in f^{-1}(B)$.
• And if you have taken on board advice in the post on injections and surjections, you will now immediately write “Let $x\in A$.” The task is now to prove that $x\in f^{-1}(B)$.

Aha! That is a sentence of exactly the form that allows us to get rid of that nasty and confusing $f^{-1}$, since it is equivalent to the statement $f(x)\in B$. So we know that $x\in A$ and we want to prove that $f(x)\in B$. What were we given? Oh yes, that $f(A)\subset B$. But $x\in A$, so $f(x)\in f(A)$. But if $f(x)\in f(A)$ and $f(A)\subset B$, it follows (directly from the definition of “is a subset of”) that $f(x)\in B$, just as we wanted.

How about the other direction? This time we assume that $A\subset f^{-1}(B)$ and we want to prove that $f(A)\subset B$. So we want to prove that every element of $f(A)$ is an element of $B$. But every element of $f(A)$ is of the form $f(x)$ for some $x\in A$, so we can begin with, “Let $x\in A$,” and know that our target is to prove that $f(x)\in B$. (This is a slight elaboration of the “let” trick that is convenient for dealing with a situation where what we are sort of saying is, “For every $f(x)$ with $x\in A$, such-and-such happens.”) We know that $x\in A$, and that and the hypothesis that $A\subset f^{-1}(B)$ tell us that $x\in f^{-1}(B)$.

Aha! We can get rid of that nasty and confusing $f^{-1}$: we now know that $f(x)\in B$. Better still, that’s exactly what we were trying to prove.

Just in case I haven’t made it sufficiently clear, if $f:A\to B$, $y\in B$ and $f(x)=y$, then it is incorrect to say that $x$ is an inverse image of $y$ (it is a preimage) and it is incorrect to write $x=f^{-1}(y)$ (the function $f$ might not have an inverse, and if it doesn’t, then $f^{-1}(***)$ makes sense only if $***$ is a subset of the codomain of $f$). Similarly, if $D$ is a subset of $C$ then $f^{-1}(D)$ is not a preimage, or even the preimage, of $D$ (it is the inverse image). And the fact that we write $f^{-1}(D)$ does not mean that $f$ has an inverse.

It only remains for me to apologize on behalf of the mathematical community for the historical accidents that have led to this jumble of overlapping terminology and notation. You have no choice but to learn it and be very careful about using it. The one thing I would say is that one gets used to it — so much so that it becomes hard to remember what it was like to find it confusing.

That reminds me that I haven’t finished discussing non-standardness of terminology to do with functions. You will often see the following alternative terminology for injections, surjections and bijections. Injections are called one-to-one functions, surjections are called onto functions, and bijections are called one-to-one correspondences. This terminology is pretty confusing –see Terence Tao’s comment on one-to-one functions — but you probably have to learn it too.

I’m mentioning this here because I want to recommend another of the quizzes, the one on functions. If you aren’t quite sure whether you have understood the material in this post and the previous one (and what your lecturers have said on similar topics), then trying out that quiz will soon tell you how you are doing. But it uses “one-to one function”, “onto function” and “one-to-one correspondence” instead of “injection”, “surjection” and “bijection”, so you’ll need to be ready to use that terminology.

Added later. I received an email from Imre Leader expressing disagreement with my assertion that the set-theoretic definition of functions achieves nothing. Since he had some valid points to make, and since we ended up agreeing with each other completely (I think), I’d like to report on our exchange.

Imre’s point is that, as I said above, people just are more comfortable with the idea of an arbitrary set not defined by a nice property than they are with the idea of an arbitrary function not defined by a nice rule. I maintain that this is irrational, but even if it is, I can’t deny that it is true. So if you give the set-theoretic definition of functions, then you make completely clear that functions can be just as arbitrary as sets.

My eventual response (after a certain amount of thought, and an email that Imre disagreed with in a number of places) was that I agreed that the set-theoretic definition of functions helps one to understand how arbitrary functions can be, but that that benefit can be achieved in a different and better way. Instead of saying that a function from $A$ to $B$ is a subset $f$ of $A\times B$ such that for every $x\in A$ there is exactly one $y\in B$ such that $(x,y)\in f$, we should say that every function can be obtained from such a set. It turns out that Imre is entirely happy with that idea. (Also, he was at pains to stress that he is no fonder of turning everything into sets than I am.)

Here in detail is what I would say. Let $G$ (to stand for “graph”) be a subset of $A\times B$ such that for every $x\in A$ there is exactly one $y\in B$ with $(x,y)\in G$. Then we can define a function $f:A\to B$ in terms of $G$ by letting $f(x)$ be the unique $y$ such that $(x,y)\in G$. Moreover, every function from $A$ to $B$ can be obtained this way, since if $f$ is such a function we can define $G$ to be the set of all $(x,y)$ such that $y=f(x)$.

What I like about this approach is that it doesn’t feel unnecessarily paradoxical. I’m not saying, “Actually a function isn’t a function at all — it’s a funny kind of set.” Rather, I’m saying that there’s a one-to-one correspondence between functions and funny kinds of sets. This captures the arbitrariness of functions (if you believe that sets can be very arbitrary, then you can carry this arbitrariness over to functions), but it also preserves their function-like nature (given a set $G$ with certain properties, I then tell you a rule for associating elements of $B$ with elements of $A$).

### 31 Responses to “Domains, codomains, ranges, images, preimages, inverse images”

1. Qiaochu Yuan Says:

I sympathize with your concerns about defining functions as relations, but I don’t think it’s completely fair to say that functions and relations don’t have the same grammar. After all, there is a category of sets and relations, and morphisms can be composed as usual in this category, and the category of sets and functions is a subcategory of this category, and so forth. What’s new about the category of relations, I guess, is that one can also define daggers (transposes) of relations.

• gowers Says:

Perhaps what I ought to have said more clearly is that the standard examples of functions (things like $f(x)=x^2$) and the standard examples of relations (things like “is congruent mod $m$ to”) have different grammar. One converts elements of one set into elements of another, and the other goes between elements of two (usually equal) sets and produces a statement. It’s certainly possible to conceive of ways of talking about relations that make them feel much more function-like, and in some contexts (particularly, as I’ve discussed in the comments on a different post, when one is checking that a function is well-defined), I think there might even be some benefit in doing so.

• Qiaochu Yuan Says:

And conversely I think it’s beneficial at times to think about functions in a way that is more relation-like. For example, here’s a small observation. Abstractly, the inverse image construction $f^{-1}(B)$ you describe in the post is a contravariant corepresentable functor $\text{Hom}(-, 2)$ from the category of sets and functions to itself (or the category of Boolean algebras, etc.). The image construction $f(A)$ you describe is a covariant functor, but it isn’t representable in the category of sets and functions by a straightforward cardinality argument. It is, however, representable in the category of sets and relations, by $1$!

In fact inverse image and image for functions is a special case of inverse image and image for relations, which are related by transposition. Defining these constructions for functions only is, in a sense, breaking a natural symmetry.

• Terence Tao Says:

My basic rule of thumb to determine whether an object X is “truly” a set, as opposed to merely being modeled or represented by a set, is to ask oneself whether a statement such as “$x \in X$” could conceivably be useful for doing mathematics; similarly, I would only view $X$ as “truly” being a relation if sentences such as “$x X y$” could be viewed as useful mathematical statements.

For instance, I do not consider $1 \in 2$ to be a useful statement, so I do not view the numbers 1 and 2 to be “equal” to sets such as the von Neumann ordinals $\{ \emptyset \}$ or $\{ \emptyset, \{\emptyset\} \}$, instead viewing the latter merely as one possible model or representation of the former. Similarly, “$(x,y) \in f$” and “$x f y$” do not seem like a useful mathematical sentence to me if $f$ is a function, so I do not view functions as “truly” equal to sets or relations, but instead merely being represented by them.

In practice, these distinctions generally quite irrelevant, but occasionally it can be useful to maintain the distinction between a mathematical object and its representations, in order to be able to generalise the former in the absence of the latter. For instance, generalised functions such as distributions do not have any reasonable interpretations as sets or relations (at least, not in a fashion that recognisably corresponds to the analogous interpretations of classical functions), yet they still do many of the things that classical functions do (e.g. one can take linear combinations of generalised functions, limits of generalised functions, etc.). So this is one instance in which a dogmatic insistence that functions “are” sets or relations can get in the way of useful mathematical generalisation.

• Scott Carnahan Says:

I think functions as they are used in mathematical practice have a much more dynamic shape than relations and abstract sets. They have a strong directionality that suggests an irreversible process done by a machine, or perhaps the woes of time evolution. This is one reason why it may seem wrong to say that functions are relations or sets, but it seems okay to say that functions can be encoded as relations or sets.

The notion of function as set seems to be an artifact of a historical choice of foundations of mathematics. One could conceivably hatch an alternative, more computationally oriented foundation like a lambda calculus, where one does the same everyday mathematics, but where functions and processes are more primitive than sets and relations.

2. plm Says:

The “domain of a function not defined everywhere” can actually be called coimage.

For 1/x as a (pre)function from R to R in the category of sets (with morphisms relations with at most one output per input), the coimage is R^* -up to bijection.

• plm Says:

Actually what I write is wrong according to standard terminology, coimage as I used it would only coincide with the category theory concept for injective (pre)functions. (Coimage is ok for 1/x, but not for 1/x^2 for instance.)

3. Colin Reid Says:

I’m sure I’ve seen ‘preimage’ used to mean what you call the ‘inverse image’. The word ‘fibre’/'fiber’ is also popular in certain circles, though perhaps this means something more sophisticated than ‘inverse image’.

4. CD Says:

I am enjoying these posts a lot. I wonder if you have encountered as much difficulty teaching “codomain” as I have. In my experience it is something of a pons asinorum for students in abstract mathematics: either they internalize it more or less immediately, or they never do. It presents many pedagogical difficulties.

One is that much of our intuition for functions— and most use of functions outside of pure mathematics— simply does not require a precise choice of codomain. It’s quite common to have a definite, fixed choice of domain and something you want to do on it, but no fixed idea (or perhaps several distinct fixed ideas) of how you want to regard the result. Anybody who has ever worked with with a strongly typed programming language has, sooner or later, run up against a reminder that our pure-mathematics function concept contains just a little bit more structure than our intuition seems to want it to have.

A second issue is that for many purposes the choice of codomain is arbitrary, subject to the restriction that it be a set containing all of the values f(x), for x in the domain. This is reflected in the “a function is a set of ordered pairs” definition, which popular in textbooks, despite the fact that it does not encode the codomain concept at all. By this definition, for example, the squaring function from real numbers to real numbers, is the same as the squaring function from real numbers to nonnegative real numbers. In many contexts this is innocuous or even desirable, for more or less the same reasons that the non-strict behavior of non-strongly-typed programming languages is innocuous or desirable. In other contexts this is unwanted— for example, textbooks using this definition of “function” will still insist on a difference between my examples, because one is surjective and the other isn’t. But the fact that this difference is not actually encoded in the definition is not that commonly noticed. For the reason, I’d say, that a lot of the time it just doesn’t matter. (And not encoding the difference serves a small but useful pedagogical purpose. I’m not sure that students have a better time with functions if, on the first go-round, they learned “functions are sets” not with the graph, but with an ordered triple consisting of two sets and then the graph. This draws almost too much attention to the arbitrary choices made in modeling mathematics within set theory.)

The arbitrariness of the codomain seems related in my mind to the general idea that often we want to identify sets with sets containing those sets, whenever doing so simplifies the discussion and suppresses irrelevant choices in formalism. (A positive-integer-valued function is a real-valued function is a complex-valued function…) It takes experience and sensitivity to context— the context in which a function is being used, not just the context inherent in the definition of the function concept— to understand when the choice of codomain really matters and is not just an irrelevant choice of formalism.

So: when teaching the subject, one is in the awkward position of drawing a useful technical distinction— between the image of a function and the function that the set is “to”— but then following this immediately with examples showing that the choice of the object that resolves this distinction is sometimes arbitrary, and sometimes crucially important. The best I have been able to do with teaching it, is to emphasize that the codomain is not really something intuitive, but extra structure we add to the intuition, to have useful notions available to us when we need them.

5. Sam Alexander Says:

The way you defined “functions from A to B” in this post, you very deftly ducked under a very common problem.

When “function” is naively defined as “set of pairs with the single-valued property”, then what you call the range (and what some call the codomain) becomes non-well-defined.

For instance, using that naive definition, if I say f:R->R is defined by f(x)=x^2; and if I see g:R->[0,infty) is defined by g(x)=x^2; then f and g are actually the same set, and so if range/codomain could be well-defined, f and g would have to have the same one.

Of course, the category theorists get around this by defining a function to be (say) a triple (f,A,B) where A and B are sets and f is what you’ve defined here as “a function from A to B”.

• gowers Says:

That was in fact deliberate. It was pointed out to me a year or two ago, and CD also mentions this above, that many textbooks get the definition of function wrong (or at any rate are inconsistent about it) and I wanted to be careful to avoid that mistake but without drawing attention to it too much.

• Joel Says:

I always find this worrying: morally speaking, should a function “know” what its codomain is? The category theory approach certainly solves that one, but it doesn’t seem to be a common approach when introducing functions.
So, is the category theory approach the “right” one? If so, what would happen if we did try to teach that ordered triple version to starting undergraduates. CD says he isn’t sure this would be helpful. Has anyone tried it?
If we do try it, we would have to warn students that they will find variants on this definition in many books.
The “officially correct” definition in the original post says “A function from A to B is [a set of ordered pairs with certain properties] …”. Isn’t this in fact the “wrong” definition found in many textbooks, or is there a subtle difference, based on context, that I am missing?
The “officially correct” definition is, of course, the one I was taught as an undergraduate, and which I have always worked with, but subject to the concerns expressed above.
Joel

6. Thomas Says:

From my topology teacher I adopted the square bracket notation for images and preimages of sets: $$f[A]$$ and $$f^{-1}[A]$$, respectively. Though it gets messy if A is an interval and written as such …

• CD Says:

Some textbooks call attention to the difference by decorating the f (e.g. if f is a function from A to B, then f^* is the function from the power set of A to the power set of B induced by f in the usual way). This convention has a lot to recommend it— among other things, it is not just a notational choice, but is functorial, and a subconscious introduction to the idea of a functor. And it allows one to enforce completely rigid interpretation of notation at an educational stage when some students might abuse and over-generalize the apparent flexibility of being able to write both f(element of domain) and f(subset of domain).

On the downside, there is no standardized notation for the induced function on power sets— the standard is to write both f(x) and f(A)— and to any extent that confusion is avoided by not doing the standard thing, it is only postponed to the moment when students meet somebody who does do the standard thing.

7. JeffE Says:

In my own personal notes, I *always* use “into” for injections, in contrast with “onto”. For example: Stereographically project the plane into the sphere.

I haven’t quite convinced myself to use “unto” for bijections, though. (Map unto others as you would have them map unto you.)

8. Henning Makholm Says:

From my vantage point in computer science, the underlying message of this post seems to be that the language of ordinary mathematics is a _typed_ language, and that the habitual tendency of mathematicians not to explain explicitly what the typing rules are, or even acknowledge that they exist, engenders confusion and mistakes.

Expressing everything as sets is well and good for foundational purposes, but to insist that everything else is merely abbreviations for set-theoretical constructions amounts to denying every shred of intuition about what we’re doing — even though one often _cannot_ get from the symbols actually on the paper to their purportedly “real” set-theoretic meaning without knowing a lot of mostly unspoken rules of the subject area. It feels to me like insisting that all computer programming is just about bit patterns, so all the careful abstractions one builds when constructing a program are merely convenient (and ultimately irrelevant) abbreviations for particular bit manipulations. That’s just no way to get work done.

There’s a large and refined body of work in computer science on how to design and describe rules like the ones explained in this post such in a rigorous and unambiguous, yet expressive and flexible, way. We do it in the context of programming languages, but that’s really just a constructive subset of mathematics, and easily extended to cover all of it. I’ve often wondered why mathematicians are not eagerly making use of these ideas to simplify and structure the exposition of mathematics.

(Perhaps types got a bad rap after Russell’s theory of types turned out to be insufferably cumbersome to use relative to Zermelo’s untyped set theory? But that’s ignoring a century of later developments, where at least the last few decades have been keenly focused on practical convenience because programming languages are _tools_ rather than abstract objects of study).

In a well-developed typeful formulation of mathematics, I expect that one would define “function from A to B” just as an abstract type with certain axiomatic properties, and then use the set-of-pairs definition simply as a “reference implementation”, serving as an existence proof which guarantees that the concept makes sense and is safe to use.

• Qiaochu Yuan Says:

I’m glad someone said something about typing. Ever since I attempted to teach myself Haskell, I’ve thought that a typed foundation of mathematics would be both easier to learn and ultimately more faithful to mathematical practice, and I often wonder why more mathematicians aren’t trying to write one down (or if they have, why I haven’t heard about it).

• Terence Tao Says:

Well, mainstream mathematics certainly has categories, which are very type-friendly (cf. also their philosophy that equality between objects is “evil” and should be replaced by isomorphism whenever possible). I would imagine therefore that category theoretic foundations of mathematics (such as ETCS) would be quite amenable to typing, but I am not an expert on these matters.

In any event, once one moves away from foundations, I think most practicing mathematicians use some sort of informal typing when they discuss or do mathematics. For instance, I doubt mathematicians view a real number as an equivalence class of Cauchy sequences or as a Dedekind cut when manipulating such numbers.

• gowers Says:

I completely agree with this. In fact, I’ve been planning a post about parsing mathematical sentences, which will emphasize this point.

• Uwe Stroinski Says:

“I’ve often wondered why mathematicians are not eagerly making use of these ideas to simplify and structure the exposition of mathematics.”

Good point. We do mathematics bottom-up in assembly language and should then present it top-down in some object oriented frame using information hiding, inheritance, polymorphism aso. I think, from all these concepts ‘top-down’ seems to be the hardest to achieve.

• Henning Makholm Says:

Terrence: the problem with category theory is that while it can express a lot of mathematics, to do so it translates everything into its own language and henceforth it is something completely different.

Uwe, I have trouble discerning whether you’re being sarcastic. The goal of the program I’m imagining (which already makes it sound rather more grandiose than I think it should) would not be the recast mathematics in the mold of software development, but to vindicate the way everyday mathematics _already_ uses formulas. The most important thing to borrow from computer science might be a vocabulary for _describing_ systematically what must surely look to newcomers like an endless parade of disjoint ad-hoc notational conventions.

In any case, I think you’re conflating software engineering with programming language research. The former, being both practically important and very, very difficult to make solid theoretical progress in, is mostly a collection glorified rules of thumb, quite susceptible to fads and populated in part by loud, dogmatic amateur propagandists for current and past fads. The latter is a proper quasimathematical field of study with rigorous definitions, theorems, and heavy use of Greek letters.

I don’t really think OOP has much to offer mathematics. One of its ingredients, namely subtyping, is relevant on its own: as much as we don’t really want to identify the natural number 2 with the set {0,1}, we certainly do want it to be the same object as the integer 2, the rational number 2, the real number 2, and the complex number 2, which subtyping can help describe systematically (whereas a typical set-theoretic development would claim that there are “really” invisible injections being applied everywhere).

Insisting on a strict top-down order of development would be almost as misplaced in mathematics as it is in programming.

• Uwe Stroinski Says:

Henning: Oops. Absolutely no sarcasm intended. Sorry for being so sloppy. I very much sympathize with your approach (albeit I probably do not completely understand it).

Is software engineering really completely separable from programming language research ? I do not think so. There surely is a considerable overlap. At least there was, when I studied it.

In any case, typed variables with restricted scopes (as discussed in this thread) are just the beginning. Once you have that, why not making explicit when you overload an operator like + or $f^{-1}$. Polymorphisms like that are everyday business for mathematicians and can be very confusing to students. In terms of programming languages such concepts are best described by OOP languages. At least, that is what I think.

9. J.P. McCarthy Says:

Two minor points that I thought you were going to make. Firstly be careful to write $f^{-1}(\{x\})$ rather than $f^{-1}(x)$ so as not to imply that $f$ has an inverse.

Secondly that the issue with codomains and images is related to the fact that all injections are invertible if we make no distinction between codomain and image.

10. Luqing Ye Says:

Regard a function from set A to B as a subset of $A\times B$ which satisfy certain properties.This definition is also useful in such conditions:
In Terence Tao ‘s book “Analysis”,the part of the book which involve set theory,there are two ways of presenting the power set axiom:
(1)All the functions from set A from set B form a set.
(2)All the subset of set A form a set.
It seems only if we regard function as a subset of cartesian product can we prove that the two statements are equal.For more information,see “analysis” exercise 3.5.11

11. Chris Says:

It might be interesting also to note that there are some instances where it matters a great deal which codomain is chosen for a function. For example, in general topology the concept of a function being $\theta$-continuous is not preserved under restricting the codomain. So even though we often don’t need to specify the codomain explicitly, there are cases where the choice matters (and not just any set containing the image will do).

12. Two-place functions aren’t one-place functions, are they? | Logic Matters Says:

[...] Here’s a small niggle, that’s arisen rewriting a very early chapter of my Gödel book, and also in reading a couple of terrific blog posts by Tim Gowers (here and here). [...]

13. Think of a function | Explaining mathematics Says:

[...] (this appears to be the Nottingham standard), not the codomain of the function. Some, including Tim Gowers, prefer to use range to mean codomain (this appears to be the Cambridge standard). Perhaps we [...]

14. Vania Mascioni Says:

I feel that the dislike for the notion of “arbitrary set” as related to the notion of function might come from the fact that everyday practice makes us work with functions that not only have a name such as f(x), but an actual, working definition on how to get f(x) from x. In this sense, f(x) is the “name” standing for a specific operation, and we as a species are quite comfortable with naming things.

Yet, when I think of real numbers I should feel the same way: most of us tend to come up with examples of real numbers who have names, such as -2, square root of 5, or pi. Yet the crushing majority of real numbers are transcendental and will never have a name of their own in our entire lifetime. So, when trying to answer the question “what is a real number?” one is necessarily forced to use the more abstract approach since there is no way to exhibit your general real number. unless dealing with random-like sequences of decimal digits should feel “concrete” to anyone. Similarly, going back to functions if we try to think of a random-valued real function it feels somewhat silly to think of it as an f(x) since we have nowhere near a formula for it.

In this sense, I suspect that what Tim Gowers thinks about when speaking of the “grammar” of functions he is only referring to functions that actually can be meaningfully written as y = f(x) (with a “readable” f(x) part), and this seems quite analogous to the “grammar” associated to those real numbers that have a name: “see, square roof two is a real number.” But how can I reasonably fit in a sentence that forever unnamed (and rather unreal, in fact) transcendental “thing” that together with its brethren fills up most of the real line?

• obryant Says:

You’re using “transcendental” when you mean “computable”. As there are only countably many finite deterministic algorithms, there are only countably many reals whose decimal expansions we can compute by algorithm to arbitrary accuracy. We name transcendental numbers all the time ($\pi, e^{\sqrt{2}}$), but we never name noncomputable numbers. However, Chaitin would disagree.

15. rakayla Says:

this really didnt help to me can you please give more details

16. JEENATH NISHA Says:

i need theoretical proofs and examples for (a) f(AUB)=f(A)Uf(B)
(b) f-1(AUB)=f-1(A)Uf-1(B)
(c) f-1(A∩B)=f-1(A)∩f-1(B)