Basic logic — tips for handling variables

Roughly speaking, a variable is any letter you use to stand for an unknown object of a certain type. For example, if you write x+y=20, then x and y are variables. If you write, “Let A be a subset of \mathbb{N},” then A is a variable (it is an unknown set of a certain kind) whereas \mathbb{N} isn’t (it’s the name we give to the set of all positive integers). I suppose the definition I’ve just given isn’t quite perfect, since if I asked you to solve the simultaneous equations x+y=8 and x+3y=12, then one would normally call x and y variables even though their values are completely determined by the equations. Though even then one could say that they started out as “unknown”.

Just in case I’ve gone and confused the issue, let me try to clear it up instantly. It would be quite normal to say something like this: “Let x and y be two real numbers. Suppose that they satisfy the equations x+y=8 and x+3y=12. Determine the values of x and y.” It is then reasonable to call them variables, because when I started discussing them I gave no information about them whatever. I then went on to specify some relationships between x and y, and it so happened that from those relationships it was possible to deduce the exact values of x and y.
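For completeness, here is one way that deduction might go, written out as a worked elimination. The choice to subtract the first equation from the second is just mine; any other method leads to the same values.

```latex
\begin{align*}
(x+3y)-(x+y) &= 12-8 \quad\Longrightarrow\quad 2y=4 \quad\Longrightarrow\quad y=2,\\
x &= 8-y = 8-2 = 6.
\end{align*}
```

As a check, 6+2=8 and 6+3\cdot 2=12, so both equations hold.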

Free and bound variables.

There is a simple, but incredibly important, distinction between two kinds of variables. It’s one that you will have come across if you are familiar with the \Sigma notation for sums. Consider the expression \sum_{m=1}^nm^2. That involves two variables, m and n. But the role they play in the expression is very different indeed. To see the difference, ask yourself, for each variable, what difference it would make if you replaced it by a different variable. Let’s try it. I’ll start by replacing n by r. I get \sum_{m=1}^rm^2. So whereas before I was adding up the first n squares, now I’m adding up the first r squares, so if n\ne r then I’ll be getting a different number. Now let’s instead replace m by r. That gives us the expression \sum_{r=1}^nr^2. How does that differ from \sum_{m=1}^nm^2? It doesn’t. It’s just another way of writing the sum of the first n squares. We call n a free variable (roughly speaking because we are free to choose a value for it) and m a bound variable, or dummy variable.
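The same distinction shows up in programs. The sketch below is only an illustration, and the function names are mine rather than anything standard: n is a parameter, so it behaves like a free variable, while the variable inside the generator expression is bound and can be renamed without changing what is computed.

```python
def sum_of_squares(n):
    """Return 1^2 + 2^2 + ... + n^2; n plays the role of the free variable."""
    # m is bound: it exists only inside this generator expression.
    return sum(m * m for m in range(1, n + 1))


def sum_of_squares_renamed(n):
    """The same sum with the bound variable renamed from m to r."""
    return sum(r * r for r in range(1, n + 1))


# Renaming the bound variable changes nothing.
assert sum_of_squares(10) == sum_of_squares_renamed(10)

# Changing the value of the free variable changes the result.
assert sum_of_squares(10) != sum_of_squares(11)
```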

Here are two further ways of distinguishing between free and bound variables. The first is to ask yourself the question, “What value does this variable take?” If the question is sensible, then the variable is free, and if it’s a stupid question then the variable is bound. For instance, if I ask what the value of m is in the expression \sum_{m=1}^nm^2, that is a stupid question: it’s just standing for something that goes from 1 to n. (The same phenomenon occurs in computer programs with FOR loops. If I write “FOR m=1 TO n DO such and such”, then m isn’t something you can substitute a value for, whereas n is.) But if I ask what the value of n is, that’s not ridiculous at all: we might decide to set n=100 or n=6k+2 for some other variable k that’s floating around, and so on.

The second way is to see whether you can rewrite the expression in a way that doesn’t mention the variable in question. For example, I can rewrite \sum_{m=1}^nm^2 as 1^2+2^2+\dots+n^2. This test doesn’t always work that well. For example, t is a dummy variable in the expression \int_1^x\sin t dt, but it’s difficult to rewrite the expression without mentioning some variable that plays the role of t. In fact, it was difficult even in the summation case — I had to use dots and trust that you would know what I was talking about. Even so, the test may occasionally be helpful.
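For the integral, one can at least check numerically that the value depends only on x. The following is a rough sketch under my own assumptions (a midpoint rule with a hypothetical step count, and the standard closed form \cos 1-\cos x for comparison): t appears only inside the sum, and the caller never sees it.

```python
import math


def integral_of_sin(x, steps=100_000):
    """Approximate the integral of sin from 1 to x with the midpoint rule."""
    width = (x - 1.0) / steps
    # The midpoint 1 + (i + 0.5) * width plays the role of the dummy variable t.
    return sum(math.sin(1.0 + (i + 0.5) * width) for i in range(steps)) * width


def closed_form(x):
    """The same quantity written without any dummy variable."""
    return math.cos(1.0) - math.cos(x)


# The two expressions agree up to the numerical error of the approximation.
assert abs(integral_of_sin(2.0) - closed_form(2.0)) < 1e-8
```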

So far, I’ve talked about free and bound variables in expressions that stand for mathematical objects (in both cases numbers). However, the main theme of this section of the post is really free and bound variables in statements. Let me give an example.

  • For every x\in A there exists y\in B such that y\leq x\leq 2y.
    Here I’m imagining that A and B are sets of real numbers. The statement is telling me that every element of A can be sandwiched between some element of B and twice that element.

    The statement above involved four variables, A, B, x and y. I hope it is already obvious to you which ones are free and which are bound. In case it isn’t, just apply the test of seeing whether it makes a difference to what the statement is saying if you change a variable to something else. Then you will see that A and B are free variables and x and y are bound variables. Why? Because the statement is about A and B and not about x and y. (I recommend reading it and making sure not only that you understand what it means but also that you agree that it is saying something about A and B.) If you change A and B to C and D, then you are no longer saying that A and B are related in a certain way: you are saying that C and D are related in that way. But if you say,

  • For every a\in A there exists b\in B such that b\leq a\leq 2b.
    then you are expressing exactly the same relationship between A and B as you were before.
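    If it helps, the statement can be turned into a small program for finite sets. This is only a sketch, with function names I have made up: A and B are the parameters (the free variables), while the variables inside all and any are bound, so renaming them gives a function that behaves identically.

```python
def every_element_sandwiched(A, B):
    """True if for every x in A there exists y in B with y <= x <= 2*y."""
    return all(any(y <= x <= 2 * y for y in B) for x in A)


def same_statement_renamed(A, B):
    """The same statement with the bound variables renamed to a and b."""
    return all(any(b <= a <= 2 * b for b in B) for a in A)


# The result depends on A and B only, not on the names of the bound variables.
assert every_element_sandwiched({2, 3}, {2}) == same_statement_renamed({2, 3}, {2})
```

    Changing the sets passed in for A and B can change the answer; changing x and y to a and b cannot.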

    An easy way to tell which variables are free and which are bound in certain types of sentences is just to look and see which ones appear in quantifiers and which don’t. For example, take the following, by now fairly familiar, statement.

  • For every \epsilon>0 there exists N\in\mathbb{N} such that for every n\geq N, |a_n-a|<\epsilon.
    Four of the variables that appear are \epsilon, N, n and a. The status of a_n is a little less easy to describe: I’ll come back to it in a moment. Now \epsilon appears inside a quantifier, since the statement begins, “For every \epsilon>0.” Similarly, N and n appear inside quantifiers. But a doesn’t. Therefore, \epsilon, N and n are bound, but a is free.

    What about a_n? A good way to understand its role is to write it instead as f(n). That is, we treat the sequence a_1,a_2,a_3,\dots as a function f that takes positive integers and turns them into real numbers. With this new notation the sentence would be rewritten

  • For every \epsilon>0 there exists N\in\mathbb{N} such that for every n\geq N, |f(n)-a|<\epsilon.
    Now we can simply say that f is a variable: it stands for an unknown function from the positive integers to the real numbers. Since we have not quantified over f, it is a free variable.

    If that analysis is correct, then it should be the case that the sentence above is telling us about f and a, while \epsilon, N and n are just placeholders. And indeed that is the case. When we say that a sequence converges to a limit, we are talking about the sequence and the limit, and not all the other variables that come in when we write out the definition in full.
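    Here is a rough illustration of that in code. It is a finite spot check and certainly not a proof of convergence, and the example sequence 1/n, the threshold rule and all the names are my own assumptions. The point is only that f and a are passed in from outside, while \epsilon, N and n live inside the check.

```python
import math


def tail_within(f, a, epsilon, N, how_many=1000):
    """Spot-check |f(n) - a| < epsilon for n = N, N+1, ..., N+how_many-1."""
    return all(abs(f(n) - a) < epsilon for n in range(N, N + how_many))


def reciprocal(n):
    """Example sequence a_n = 1/n (an assumption made for illustration)."""
    return 1.0 / n


def choose_N(epsilon):
    """A threshold that works for the sequence 1/n with limit 0."""
    return math.ceil(1.0 / epsilon) + 1


# For each epsilon tried, some N makes the checked part of the tail close to 0.
assert all(tail_within(reciprocal, 0.0, eps, choose_N(eps)) for eps in (0.5, 0.1, 0.001))
```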

    Note that whether or not a variable is free depends very much on the statement that you regard it as being part of. For instance, suppose I write the following: “Let n\geq N. Then by the above calculation we see that |a_n-a|<\epsilon.” If I regard n as part of the second sentence only, then it is free. But if I regard it as part of both sentences, and if I regard those as a way of saying, “For every n\geq N, [the above calculation shows that] |a_n-a|<\epsilon,” then we are quantifying over n so it is bound. In a funny way, the word “Let” could be said to “liberate” n. (The phrase “with one bound he was free” comes to mind, but that really does confuse things.)

    I said in the title that I was going to offer some tips for handling variables. So here’s one.

  • Always be completely sure in your mind which variables are free and which are bound.
    But actually, the real message of this post is more basic.

  • Always introduce your variables to the reader before going on to talk about them.
    In this small respect, you should treat your variables like people. Suppose that you had two friends, Anne and David, who had never met. You wouldn’t begin by saying to David, “Anne went to Amsterdam a few months ago.” Only once you’d said, “This is Anne,” or words to that effect, would imparting information about Anne be appropriate social behaviour. I have often read supervision work, not really understood what has been written, and felt moved to ask something like, “What is \delta?” in roughly the tone of voice that David might ask, “Who’s Anne?” if you launched in with information about her last holiday.

    How do you introduce a variable? Let me illustrate by example. First, a definition: a function that takes real numbers to real numbers is called strictly increasing if …

    I’ve got to that point in my sentence and realized that it is rather difficult to say what I mean unless I give the function a name. Here’s what might have happened if I had struggled on with the sentence I was in the middle of writing.

  • A function that takes real numbers to real numbers is called strictly increasing if whenever you apply it to two real numbers, one of which is greater than the other, then the value it takes at the greater number is greater than the value it takes at the smaller number.
    Here’s a much clearer way of saying the same thing.

  • A function f from \mathbb{R} to \mathbb{R} is called strictly increasing if for every pair of real numbers x and y, if x<y then f(x)<f(y).
    I could have written that in a slightly less formal way as follows.

  • A function f from \mathbb{R} to \mathbb{R} is called strictly increasing if f(x)<f(y) whenever x<y.
    That is less formal because I didn’t specify what x and y were (leaving it to the context to make it clear that they were real numbers) and I used the word “whenever” (leaving it to the reader to work out how to convert that into a statement involving a universal quantifier). However, I am much more concerned with the difference between these last two formulations of the definition and the first one. And that difference is that I followed another tip for dealing with variables:

  • Give names to the things you are talking about.
    If you do that, then you nearly always convert clumsy, wordy sentences into much cleaner ones.
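    For what it’s worth, here is one way to write the “whenever” version of the definition of strictly increasing with its universal quantifiers made explicit. The layout is my own choice and nothing depends on it.

```latex
\forall x\in\mathbb{R}\ \ \forall y\in\mathbb{R}\ \ \bigl(x<y \implies f(x)<f(y)\bigr)
```

    In this formula x and y are bound, because they are quantified over, and f is free: the statement is about f.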

    Incidentally, this piece of advice very much depends on our modern practice of using letters to stand for whatever we feel like making them stand for. Before this practice was invented, nobody knew of any way of expressing mathematical statements apart from what I have been calling the clumsy, wordy way. For example, this is how the sixth proposition from Book II of Euclid’s Elements was stated (in translation of course, but the point still stands).

    If a straight line be bisected and a straight line be added to it in a straight line, the rectangle contained by the whole with the added straight line and the added straight line together with the square on the half is equal to the square on the straight line made up of the half and the added straight line.

    If you have the faintest idea what that’s saying, then you’re doing better than me. Personally, I find a sentence like that more or less impossible to understand. And if I do want to understand it, I have to translate it. Fortunately, we know how to avoid that style now, so please avoid it.

    Back to what I was really talking about, which was the principle that you should introduce your variables before you talk about them. As I’ve just been saying, a function f from the reals to the reals is strictly increasing if f(x)<f(y) whenever x<y. Imagine now that you had a question on an examples sheet that asked you to prove that the function f(x)=x^3 is strictly increasing. (By the way, a quick aside. You may notice that I wrote “examples sheet” and that almost all the sheets your lecturers hand out have at the top the words “example sheet”. In my day they had at the top the words “examples sheet”, for the simple reason that each sheet was a sheet of examples. I get irritated with the phrase “example sheet”, but as I write this I realize that if you replace the word “example” with “problem” or “question” then the plural seems utterly weird. In fact, I can’t think of a single example where “An X of Ys” would become “A Ys X” rather than “A Y X”. Somehow that’s even more irritating. And somehow “examples sheet” still feels right to me.)

    OK, how do we show that the function f(x)=x^3 is strictly increasing? Here’s how not to begin your argument.

  • Since x<y we know that y-x is positive.
    If you write that and I ever see what you’ve written, it will be a pure reflex for me to ask, “WHAT ARE x AND y?” If you think that that’s ridiculously pedantic and that the context makes it obvious that you have chosen two real numbers x and y with x<y, I would respond as follows.

    (i) It is true that I can tell that that is the context that you have set up in your brain before writing what you wrote.

    (ii) It is also true that that is the only context I can think of that makes sense of what you wrote.

    (iii) Nevertheless, you have not explained the context.

    (iv) If you get into the habit of not explaining the context, then you will run into difficulties when the proofs get a little bit more complicated.

    Point (iv) is the most important. As soon as you start needing to write proofs that involve definitions that have strings of two or three quantifiers, if you don’t say what your variables are doing then you’ll get into a mess.

    Back to this proof. What should one write instead? Something more like this.

  • Let x and y be real numbers with x<y. [That’s the “This is Anne” part.] Then y-x is positive. [By the way, Anne has been to Amsterdam recently.]
    It may seem so obvious to you that the function f(x)=x^3 is increasing that you’re not quite sure how to prove it. What I want to do is deduce this simple fact from very basic principles to do with how inequalities interact with addition and multiplication. The main two are these, which hold for any three real numbers a, b and c.

  • If a<b then a+c<b+c. [You can add anything you like to both sides of an inequality.]
  • If a<b and c>0 then ac<bc. [You can multiply both sides of an inequality by a positive number.]
    Let me indicate two arguments. The first is a bit crude but it gets the job done.

    If x and y are both positive, then

    x^3=x^2\cdot x<x^2\cdot y=(yx)\cdot x<(yx)\cdot y=y^2\cdot x<y^2\cdot y=y^3.

    There I made repeated use of the second principle above. I also used the fact that x^2, yx and y^2 are all positive, which can be deduced from the second principle too.

    If x and y are negative, we can use the fact that -x and -y are positive and the fact that x^3=-(-x)^3 and y^3=-(-y)^3 to deduce that x^3<y^3 from what we have just proved. And if one of x and y is zero, then we can use the fact that the other is positive (if x=0) or negative (if y=0). If one is positive and the other negative, we can use the zero case to prove what we want in two steps.

    I won’t give the full details of that argument, because there is a cleaner argument that does it all in one go. Note first that y^3-x^3=(y-x)(y^2+xy+x^2). Now y-x is positive, by assumption. As for the second bracket, it equals (y^2+x^2+(y+x)^2)/2, which is positive because the square of any number is non-negative and x and y are not both equal to 0. (Why is the square of any number non-negative? I leave that to you as an exercise.)
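    In case you want to check the two algebraic identities used in that argument, here they are expanded term by term. This is just routine multiplying out, set out in a layout of my own choosing.

```latex
\begin{align*}
(y-x)(y^2+xy+x^2) &= y^3+xy^2+x^2y-xy^2-x^2y-x^3 = y^3-x^3,\\
\frac{y^2+x^2+(y+x)^2}{2} &= \frac{y^2+x^2+y^2+2xy+x^2}{2} = y^2+xy+x^2.
\end{align*}
```

    So y^3-x^3 is a product of the positive number y-x and a second factor that is half a sum of three squares, not all of them zero.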

    Perhaps by this point you are getting cross with me because you think I should have just differentiated. There are all sorts of answers to that, but the main one is that I simply don’t like using calculus when there’s an elementary calculus-free argument around. Another is that the argument via differentiation is slightly more complicated than you might think. In any case, the point of this whole discussion was not the proof itself but the fact that I began it with, “Let x and y be real numbers with x<y.” The word “let” is incredibly useful for the purpose of introducing variables. It is typically used in two situations.

    Situation 1. You want to prove a statement about every element x of some set X. You begin your argument with, “Let x\in X.”

    That’s the situation we’ve just met. There I wanted to prove something about every pair of real numbers with the first less than the second. So I started the proof, “Let x and y be real numbers with x<y.”

    Situation 2. You have just established that some object x exists with a property P(x). You follow that up with, “Let x be a [insert name for type of object that x is] such that P(x).”

    Let me give an example of situation 2. Recall the result I mentioned in an earlier post, that if m and n are two positive integers that fail to generate the whole of \mathbb{Z} then they must have a common factor greater than 1. Let’s suppose that I’m in the middle of a proof and I find that the only way that my argument can fail is if there is some integer r that cannot be written in the form am+bn. My proof might continue as follows.

  • Since r cannot be written in the form am+bn, it follows that m and n have a common factor greater than 1. Let d be such a factor. [Proceed to talk about d.]
    Sometimes (in fact, extremely often) in this situation, mathematicians do something a bit sneaky. Instead of writing the careful introduction of d that I’ve given above, they write something more like this.

  • Since r cannot be written in the form am+bn, it follows that m and n have a common factor d>1. [Proceed to talk about d.]
    Strictly speaking, this second way of writing is badly incorrect because in the first sentence d is a bound variable (because the sentence is effectively saying, “There exists d>1 such that d is a factor of both m and n.”) but when one goes on to talk about d it has magically become a free variable. But when a linguistic practice becomes sufficiently widespread, it makes no sense to call it incorrect. It is better to regard the above as a convenient shorthand: if we pass from existentially quantifying over a variable x in one sentence to talking about x in the next sentence, it should be understood that what we really mean is that we existentially quantify over a different variable, then say, “Let x be [an example of what we’ve just shown to exist]”, and then proceed to talk about x. As long as you know exactly what you are doing, then this is OK.

    One further remark. If you establish the existence of x such that P(x) and then go off and discuss something else for a while, then it will be quite confusing if later on in the argument you treat x as a free variable. So the shorthand above is probably best kept for situations where you prove that something exists and then immediately go on to talk about it (where by “it” I really mean one of the many possible examples).
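    Programmers do something very similar to Situation 2 all the time: first establish that a thing exists, then give it a name, then use the name. Here is a small sketch with m=6 and n=9 chosen by me as an example; the function name is hypothetical. The search inside the function corresponds to “there exists d”, and the assignment to d corresponds to “Let d be such a factor.”

```python
def common_factor_greater_than_one(m, n):
    """Return the smallest d > 1 dividing both m and n, or None if there is none."""
    for d in range(2, min(m, n) + 1):
        if m % d == 0 and n % d == 0:
            return d
    return None


m, n = 6, 9

# "m and n have a common factor greater than 1. Let d be such a factor."
d = common_factor_greater_than_one(m, n)
assert d is not None

# From here on d is a named object we can talk about: every combination
# a*m + b*n that we try is a multiple of d.
assert all((a * m + b * n) % d == 0 for a in range(-5, 6) for b in range(-5, 6))
```

    Note that the d inside the function and the d introduced afterwards are different variables that happen to share a letter, which is exactly the shorthand described above.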

    Let me end with an exercise. Out of all the variables in the following sentence, which are free and which are bound? (The variables are A,x,\epsilon,f, and y.)

  • A is a subset of the set of all x such that there exists \epsilon>0 such that |f(y)-f(x)|<1 whenever |y-x|<\epsilon.
    19 Responses to “Basic logic — tips for handling variables”

    1. Rob Kent Says:

      In the third line of your fifth paragraph, shouldn’t the “m^2” read “n^2” instead?

      Thanks very much — corrected now.

    2. Dave Lewis Says:

      You’ve got x listed twice in the second to last sentence (“The variables are…”).

      And thanks for this too … also corrected now.

    3. Greg Martin Says:

      First a typo: you’ve written “If a<b then a+c<a+c", but the last "a" should be "b".

      Also: in your "crude" argument proving that f(x)=x^3 is strictly increasing, you address the cases: one of x,y equal to zero; both positive; both negative. Unless I'm missing it, you didn't mention the (easy) argument in the case that x,y have different signs.

      Further thanks. Also now corrected.

    4. Terence Tao Says:

      There is an interesting asymmetry in mathematical exposition, in that we have standard phrases for introducing variables (“Let x be an arbitrary element of the set S”, etc.) but we don’t have a standard phrase for removing variables (other than, perhaps, the end-of-proof symbol). This can occasionally lead to confusion: for instance, one may have two consecutive homework problems, both involving an unknown variable x, but in the first homework problem x ends up equal to 17, while in the second homework problem x ends up equal to 23. Of course, it is implicitly understood that the x which was introduced in the first homework problem has been “de-introduced” by the end of that problem, so that the letter x becomes available again for an unrelated usage, and so it is not legitimate to use some fact about x from the first problem in the second problem (unless this is explicitly circumvented with a phrase such as “Let x be the quantity from the previous problem”).

      This is one reason why it is good, when writing a lengthy argument, to encapsulate parts of the argument into lemmas and sublemmas, so that variables that are created for the sole purpose of proving one of these lemmas are automatically removed from the scene by the end-of-proof symbol, thus reducing clutter in the ambient “namespace”. (This is also why it is bad form to “reach into” the proof of such a lemma and use statements that involve variables whose scope is limited to that proof, to assist an argument outside of that proof.)

      There is an apocryphal story about a mathematics text which went to such extreme lengths to avoid namespace collision that once a symbol (e.g. x) was used anywhere in the book, it was not used again for any other purpose. Apparently, as the book progressed, the author quickly ran out of Roman letters (in both lower case and upper case), then the Greek, then Fraktur, and eventually was using Japanese katakana in the last few chapters!

    5. C# Says:

      You can get spectacularly difficult bugs when programming if you don’t properly declare your variables correctly, get the brackets round your conditionals correct (as in a previous post), fail to reset your variables (as per Professor Tao) and so on…
      It is a very practical form of misunderstanding.

    6. Tracy Hall Says:

      The expression “cos(1)-cos(x)” doesn’t mention t, and although it doesn’t express exactly the same idea as the definite integral, it does, for any value of x, represent the same quantity. Of course, there are plenty of easily-written functions without closed-form integrals (depending on what you mean by closed-form, since the useful ones like Li(x) have been given names of their own).

    7. g Says:

      When proving that x^2+xy+y^2>=0, you could instead have observed that it equals (x/2+y)^2 + 3/4 x^2. That would have had absolutely no advantages in terms of mathematical exposition, but it would have made use of the proposition of Euclid that you mentioned earlier!

      (Actually, I’m curious. Is it by any chance the case that you originally had just that in mind, then decided it would be more vivid to profess total incomprehension of the Euclid proposition, and then had to avoid using it as a lemma? 🙂 )

      • gowers Says:

        That’s wonderful — I had absolutely no idea that I could have used Euclid’s proposition. It did occur to me to use the observation you mentioned, since that is what I would have got from the usual procedure for diagonalizing a quadratic form. But I decided that I preferred the more symmetric expression.

    8. Richard Baron Says:

      I understand your concern about “examples sheet”, “example sheet” and other expressions that follow the same pattern. German makes this thing (I nearly wrote “everything”, but that would be untrue) so much simpler. It is a Lego language, so you just clip words together, generally in the singular: Beispielblatt, Problemblatt, Frageblatt.

      From Littlewood’s Miscellany, page 59 of the edition edited by Bela Bollobas (at least according to what Google Books shows me – I don’t have the book to hand), we have this story about variables:

      Schoolmaster: ‘Suppose x is the number of sheep in the problem.’ Pupil: ‘But Sir, suppose x is not the number of sheep.’ (I asked Prof. Wittgenstein was this not a profound philosophical joke, and he said it was.)

      • gowers Says:

        I think I’m with the pupil there: the schoolmaster should have said, “Let x be the number of sheep in the problem.” I would use the word “suppose” if I wanted to consider what happens when some fact holds. For example, I might go on to say, “Suppose that x is a multiple of 7.” That is, I use “let” for declaring the variable, and “suppose” for insisting that the already declared variable should satisfy some condition. I’m not certain that everyone would make the same decision, however.

      • Christian Says:

        I’m not sure that German is really much simpler in that regard. Just think of the word “Übungszettel”, which many people use instead of “Problemblatt”. (The word “Übung” refers to the official discussion sessions in which the examples sheets are discussed, “Zettel” means “Blatt”, and where the additional “s” comes from is not clear to me right now.)

      • Richard Baron Says:

        I am equally unsure as to the reason for the “s” on the end of “Übung”. It is not to form any case, since all cases of “Übung” have the same form as the nominative, the only difference being between the singular and the plural (-en). Perhaps it is there to make pronunciation easier. Or perhaps the declension used to be different, with an -s for the genitive.

        The extra “s” is standard when you clip another word onto the end of “Übung”. So while this peculiarity may lack justification, at least it is a regular peculiarity. There is no scope to debate what the correct form might be.

    9. k Says:

      Two more typos: in the fifth paragraph the sum is missing a summand, and in the seventh (i.e. the one starting “The statement above…”) the ‘B.’ has the full stop inside the LaTeX, so displays incorrectly

      Thanks — I’ve changed those two now.

    10. Marcus Cox Says:

      I am working on something called social mathematics, if you have time to chat please let me know.

    11. Injections, surjections and all that « Gowers's Weblog Says:

      […] recalled the definition, the second step is very easy indeed, and is something that I covered in the post about handling variables. If you have to prove a statement of the […]

    12. Daniel Brice (@fried_brice) Says:

      ‘ … I wrote “examples sheet” … for the simple reason that each sheet was a sheet of examples.’

      So do you say “fruits basket” or “fruit basket”? 😀

      Just rhetorical–not seriously expecting a reply.

      Also, thank you for making great blog entries!

    13. Anonymous Says:

      Suppose we want to write a direct proof that A\subseteq B, and start it with the standard “Let x\in A”. I suppose this is not Situation 2, for you have not proved that such x exists, and you know nothing about its type. In your examples of Situation 1, however, you seem to have carefully picked up cases in which you know beforehand that A was not empty. So, my question is: Is this still the kind of situation in which x is being “introduced” as an element of A? And how do you feel about the sentence “Fix x\in A” (introducing a so-called eigenvariable, or parameter), often used by languages (such as Isar) intended to formalize the mathematical vernacular?
