Basic logic — relationships between statements — converses and contrapositives

Converses.

What is the relationship between the following two statements?

1. If n is 1 or a prime number, then (n-1)!+1 is divisible by n.

2. If (n-1)!+1 is divisible by n, then n is 1 or a prime number.

At first sight, this doesn’t look a very difficult question: the first statement is of the form P\implies Q and the second is of the form Q\implies P. We say that the second statement is the converse of the first. (Note that the first statement is also the converse of the second.)

However, if you have read the previous posts in this series, I hope you will be slightly anxious about what I have just written, because I have not been clear about whether the above two statements are referring to some specific n or whether they are meant as general statements about all positive integers n. The latter is, I would suggest, the natural interpretation. And if you look at the kinds of converses that come up in a maths course, you will find that they are almost always not of specific statements, but rather of general statements.

This means that just as there are two notions of implication, there are also two notions of converse. One is the one I mentioned just a moment ago: the converse of P\implies Q is Q\implies P. The other involves a universal quantifier: we take a statement of the form \forall x\in X\ \ P(x)\implies Q(x) and its converse is \forall x\in X\ \ Q(x)\implies P(x). (We don’t have to have just one variable here. For instance, the converse of \forall x\in X\ \ \forall y\in Y\ \ P(x,y)\implies Q(x,y) is \forall x\in X\ \ \forall y\in Y\ \ Q(x,y)\implies P(x,y).)

This is a potential source of confusion, since the definition of converse is often given as the first one (that the converse of P\implies Q is Q\implies P), while almost all the interesting converses that actually come up in mathematics are of the second kind (where we universally quantify over one or more variables).

With that point behind us, let me discuss two examples of converses.

Example 1.

I’ll start with a simple one. Suppose you want to prove that two sets A and B are equal. The following is the criterion for this to be the case: every element of A is an element of B, and every element of B is an element of A. For convenience, let’s suppose that we know that A and B are both subsets of a set X. (For example, we might know that A and B consist of positive integers. In that case, X would be \mathbb{N}.) Then the statement that A=B would break down into two statements that we could write using quantifiers in the following way.

  • \forall x\in X\ \ x\in A\implies x\in B
  • \forall x\in X\ \ x\in B\implies x\in A
  • The first of these tells us that every element of A is an element of B, and the second that every element of B is an element of A. And those two statements are converses of each other.

    Example 2.

    Let m and n be two integers (not necessarily positive). Let us define the set of integers generated by m and n to be the set of all integers of the form am+bn, where a and b are themselves integers. We could call this the set of all “integer combinations” of m and n. An important question that comes up in the number theory that you will do this term is this: under what circumstances is the set generated by m and n equal to the set of all integers? That is, when can you make any integer you want out of m and n by adding appropriate multiples?

    To get some traction on this problem, it might be worth looking at some examples. Suppose that m=40 and n=70. Can we make all integers by adding multiples of m and n? Quite clearly not, since every integer we can make will be a multiple of 10. Obviously this example can be generalized: if m and n have a common factor d that’s bigger than 1, then they cannot generate the whole of \mathbb{Z} (since everything that they generate will be a multiple of d and not all integers are multiples of d).

    Let us express this observation more formally.

  • For every pair of integers m and n, if m and n have a common factor greater than 1, then m and n do not generate all of \mathbb{Z}.
  • Note that this statement is of the general form

  • \forall m,n\in\mathbb{Z}\ \ P(m,n)\implies Q(m,n)
  • As such, it has a converse, obtained by reversing the roles of the statements P(m,n) and Q(m,n). In words, the converse is as follows.

  • For every pair of integers m and n, if m and n do not generate the whole of \mathbb{Z}, then m and n have a common factor that is bigger than 1.
  • Now converses of true statements don’t have to be true. For example, the statement, “Every multiple of 10 is even,” is true, and its converse, “Every even number is a multiple of 10,” is false. However, the converse I have just written down happens to be true: it really is the case that if two integers don’t generate the whole of \mathbb{Z} then they must have a common factor bigger than 1. If you are not familiar with this fact, then I recommend pausing for a few seconds just to think how you might prove it.
    .
    .
    .
    .
    And unless you are very unusually quick, you will have come to the conclusion that it is not at all obvious how to prove it. This may seem slightly mysterious: the two properties, “do not generate all of \mathbb{Z}” and “have a common factor bigger than 1” are equivalent, so in a sense the same property, and yet the second feels stronger than the first, in the sense that the first is easy to deduce from the second, while the second is not so easy to deduce from the first. I discussed this phenomenon in a blog post a few years ago, which provoked a number of very interesting comments.

    Contrapositives.

    Let P and Q be two statements. Then the statement P\implies Q is true (here I’m using the truth-value definition) if and only if it is not the case that P is true and Q is false.

    When is \neg Q\implies\neg P true? Applying the same rule, the only thing that could stop it being true is if \neg Q is true and \neg P is false. But that’s the same as saying that P is true and Q is false. In other words, the conditions for P\implies Q to be true are precisely the same as the conditions for \neg Q\implies\neg P to be true. The statement \neg Q\implies\neg P is called the contrapositive of the statement P\implies Q. A statement and its contrapositive are always equivalent, but the importance of the contrapositive is that it is sometimes easier to prove than the original statement.

    As ever, the situation is more interesting if we quantify over a parameter involved in a statement. Consider a statement of the form

  • For every x\in X\ \ P(x)\implies Q(x).
  • For each x, the inner part of this statement is equivalent to \neg Q(x)\implies\neg P(x). Therefore, the entire statement is equivalent to

  • For every x\in X\ \ \neg Q(x)\implies\neg P(x).
  • That was a bit abstract, so let me look at a simple example. A surprisingly useful way of proving that two real numbers x and y are equal is to show that their difference is less than all positive numbers. That is, we rely on the following principle.

  • For every x,y\in\mathbb{R}, if for every \epsilon>0\ \ |x-y|<\epsilon, then x=y.
  • The statement I have just written is of the general form

  • \forall x,y\in\mathbb{R}\ \ P(x,y)\implies Q(x,y).
  • Indeed, P(x,y) is the statement,

  • For every \epsilon>0,\ \ |x-y|<\epsilon
  • and Q(x,y) is the statement x=y.

    Now the easiest way to prove this useful principle is to say the following. Suppose that x\ne y. Then x-y is some number that isn’t zero (since if x-y=0, then adding y to both sides gives us that x=y). But then |x-y| is a positive number, which shows that the statement

  • For every \epsilon>0,\ \ |x-y|<\epsilon
  • is false. (The statement claims that a certain inequality holds for every positive number \epsilon. But when \epsilon=|x-y| it doesn’t hold.)

    In that argument, we started with the assumption \neg Q(x,y) (in other words, we assumed that x\ne y) and ended up proving \neg P(x,y) (in other words, we proved that there is some positive number that is not greater than |x-y|).

    What makes it easier to prove the contrapositive in the cases when it is indeed easier? It’s a bit tricky to say in general, so I’m not going to try too hard. Instead, let me offer some stylistic advice, which is that you should always start by trying to prove a statement in the obvious direct way, turning to the contrapositive if (i) you get stuck and (ii) you can see why you will be less stuck if you try to prove the contrapositive. Let’s see how that advice would have played out in the example above.

    We would have started trying to prove the statement directly. So we’d have said, “Right, I’ve got two numbers x and y and I get to assume that their difference is smaller than any number I choose. Hmm … but what number do I choose?” At that point it wouldn’t be all that obvious. Note that one number we can’t choose is |x-y|, since we don’t know that that number is positive. In fact, we’re trying to prove that it is not positive!

    At that point, feeling stuck, we wonder whether starting with the assumption that x\ne y might help. And we see that it does, since it provides us with a positive real number, namely the difference between x and y. So when we decide to try to prove the contrapositive instead, we are not doing so merely because we are stuck — there are plenty of times when we get stuck where turning to the contrapositive is clearly of no help whatsoever — but for the additional reason that \neg Q gives us something that we needed in order to make progress.

    You may have heard this method called “proof by contradiction”. That isn’t strictly speaking correct (though this is a point that I myself didn’t understand until recently). There are two sorts of proof that make use of the negation of the conclusion. Just to give names to things, let’s suppose we are trying to prove the statement

  • For every x\in X\ \ P(x)\implies Q(x).
  • One type of proof is what I illustrated above: you simply prove the contrapositive instead. That is, you prove

  • For every x\in X\ \ \neg Q(x)\implies\neg P(x).
  • Another type of proof is this. You assume that the result is false. That is, you assume

  • There exists x\in X such that P(x) and \neg Q(x).
  • and from that you proceed to deduce a contradiction: that is, you deduce a statement and its negation. There is an interesting discussion of this difference at Mathoverflow, which also contains a link to a highly recommended blog post of Terence Tao.

    More on converses.

    Now that we have talked about contrapositives, I can say something further about converses. We know that a statement is equivalent to its contrapositive. What happens if we start with a statement and look at the contrapositive of its converse? It will of course be equivalent to the converse, but what will it actually look like?

    Well, if the original statement is P\implies Q, then the converse is Q\implies P, and the contrapositive of that is \neg P\implies \neg Q. What this tells us is that \neg P\implies\neg Q is another way of formulating the converse of P\implies Q. Or at least, that’s the bare truth-value way of putting it. For the more typical cases of interest, I would want to say that we can formulate the converse of

  • For every x\in X\ \ P(x)\implies Q(x).
  • as

  • For every x\in X\ \ \neg P(x)\implies\neg Q(x).
  • The practical consequence of this is that if you are asked to prove a statement that looks like this:

  • Let x\in X. Prove that the following two statements are equivalent: (i) P(x); (ii) Q(x).
  • and you start by showing that P(x)\implies Q(x) for some arbitrary x\in X (which means that you’ve proved it for all x\in X), then you have two options. Either you can say, “Conversely, let us assume that Q(x),” and proceed to try to deduce P(x), or you can say, “Conversely, let us assume that \neg P(x),” and proceed to try to deduce \neg Q(x). The one thing not to do is say, “Conversely, let us assume that \neg Q(x),” and proceed to try to prove \neg P(x). If you do that, then you are proving the contrapositive of what you proved before — i.e., basically the same result — and not the converse. I have seen this mistake many times. I’ve even made it myself, though I think I’ve always noticed before I’ve got very far. I recommend that if you ever decide to prove the converse of a statement by considering negations of the component parts, you check very carefully that you’ve got things the right way round.

    As an example of this, consider the pair of statements with which I began this post. Suppose that we want to prove the following equivalence.

  • For every positive integer n, the following are equivalent: (i) n=1 or n is a prime; (ii) (n-1)!+1 is divisible by n.
  • We might start by trying to show that (i) implies (ii). This turns out not to be obvious: it is a theorem that will be covered in the Numbers and Sets course, known as Wilson’s theorem. But suppose we’ve finished that and we wonder about the converse. If we go for the Q\implies P option, then we’ll be using a weird condition — that (n-1)!+1 is divisible by n — and trying to deduce from that a rather negative property of n — that it hasn’t got any non-trivial factors. If on the other hand we go for the \neg P\implies\neg Q option then we’re much better off. This time our hypothesis is that n is composite, which gives us some factors to play around with, and instead of being forced to use a weird condition we merely have to disprove it, which is somehow an easier task.

    Indeed, the whole deduction turns out to be easy this way round: if n isn’t prime then we can write n=ab with both a and b less than n. So if we write the numbers from 1 to n-1 out in a line, then a and b will be amongst them, from which it follows that (n-1)! is a multiple of ab, which equals n. Since it’s not possible for both (n-1)! and (n-1)!+1 to be multiples of n, we have proved that the strange condition doesn’t hold, just as we wanted.

    Or have we? Can you see a mistake in the above argument? If you can’t, then there is an important moral: don’t be fooled by your notation into making assumptions that you haven’t justified. In the above argument I assumed that a and b were different: if I hadn’t then it wouldn’t have followed that (n-1)! was a multiple of ab.

    There are two ways of correcting the argument. One is to note that we can always write n=ab with a and b different unless n is either a prime or the square of a prime. But if n=p^2 for some prime p, then both p and 2p are less than n — except if p=2 — so (n-1)! is a multiple of p^2. If p=2 then we just look and see that (n-1)!+1=7, which is not a multiple of 4.

    The other way of correcting the argument is cleaner because it doesn’t involve splitting into cases. We simply observe that if a is a factor of n that’s less than n, then a is also a factor of (n-1)!. Therefore a is not a factor of (n-1)!+1. (I’m also assuming of course that a\ne 1.) But a is a factor of any multiple of n, so (n-1)!+1 cannot be a multiple of n.

    So here we have an example of an equivalence with two properties mentioned above: one direction is easier than the other, and to prove the converse of a statement P\implies Q it is easier to show that \neg P\implies\neg Q than it is to show that Q\implies P.

    13 Responses to “Basic logic — relationships between statements — converses and contrapositives”

    1. Simon Tatham Says:

      ‘the two properties, “generate all of \mathbb{Z}” and “have a common factor bigger than 1″ are equivalent’: you meant the first one to be “do not generate all of \mathbb{Z}”, I think.

      Many thanks — corrected now.

    2. andrejbauer Says:

      About the beginning of your post. Clearly, the $n$ in the two statements is a _free parameter_. It is _not_ quantified, and the two statements are clearly converses of each other. The trouble is that many mathematicians do not understand the difference between a free parameter and a universally quantified one. But the difference is there. It is the same as the difference between a function and its definite integral. Perhaps it would be a good idea to say something about free variables in these posts. They are necessary in hypothetical reasoning, i.e., in the middle of a proof we often do have free parameters, and unavoidably so.

      • gowers Says:

        As a matter of fact, I do have a half-written post in which I discuss at some length the difference between free and bound variables. However, I’m not sure I agree with your second sentence. I think that when such a sentence is uttered, n is sometimes free and sometimes universally quantified — but only implicitly. For instance, if I say, “I discovered a cool fact yesterday: if p is a prime of the form 4m+1 then it can be written as a sum of two squares,” then I would maintain that what I actually mean is that every prime of the form 4m+1 can be written as a sum of two squares. However, if I start to prove this result and begin by saying, “Let p be a prime of the form 4m+1,” then p has that mysterious fixed-but-arbitrary status and perhaps it’s better to call it free.

        I’m not sure I completely buy the function/integral analogy, but you may be able to persuade me. If I take a statement P that involves a parameter n (presumably the analogue of the function) and form the statement \forall n\ \ P(n) then I get a statement without parameters, just as if I integrate f between 0 and 1 then I get a number that doesn’t depend on the variable that the function takes. But in the first case I get a statement from which I can deduce all the individual statements P(n), whereas from the definite integral I can’t say anything about individual values of f. So it seems to me that integrating is throwing away information in a way that universally quantifying isn’t.

        I’m also interested by your last three words. I can see that having free parameters in the middle of proofs is extremely natural, and of course I do it myself, but why can’t one avoid it by simply universally quantifying everything? For instance, if I want to prove that for every n\in\mathbb{N}\ \ P(n)\implies Q(n), I would normally say, “Let n\in\mathbb{N} and suppose that P(n),” and proceed to deduce Q(n). But …

        Hmm … honesty compels me to leave that last paragraph there, but I think I now see the answer to my question. If, for example, my argument went P(n)\implies R(n)\implies S(n)\implies Q(n) then the first line of my proof basically has to be that P(n)\implies R(n). I certainly can’t start with \forall n\in\mathbb{N}\ \ P(n) as my initial assumption (since I need to deduce Q(n) from P(n) and not from the quite possibly false statement that P holds for all n). So if I “universally quantify over everything”, what I’m doing is taking P(n) as my premise and then saying, “Oh, and by the way, this deduction works for all n,” at the end of the proof. Is that what you mean when you say that free parameters are unavoidable?

      • andrejbauer Says:

        Uh, there are a lot of interesting points here. (By the way, what do I have to write to get math notation?)

        We have a meta-theorem for first-order logic which says that P(x) with a free parameter x has a proof iff \forall x P(x) has a proof.
        In other words, as far as proving a statement goes, it does not matter whether the outer universal quantifiers are there. This is the precise meaning of “x is implicitly quantified”, I think. Also, this meta-theorem is sometimes misunderstood as saying that P(x) is equivalent to \forall x P(x), which is nonsense.

        Suppose I tell you that I integrated a function on [0,1] and got 42. Can you tell which function it was? No. Suppose I tell you that I universally quantified a statement over the domain [0,1] and I got true. Can you tell me which statement it was? No. We must not confuse the truth value of \forall x P(x) with the expression \forall x P(x). A fair analogy would be this: if I show you the expression \forall x P(x) then you can tell what P is. Likewise, if I show you the expression \int_0^1 f(x) dx then you can tell what f is.

        Yes, what you say about having free parameters in the middle of the proof is essentially what I am trying to say. Speaking as a logician, you simply must have free parameters because the rule of inference for universal quantifiers requires you to use them. Namely, in order to prove \forall x \in A . P(x) you must do the following: pick a letter which has not been used so far, say z, assume z \in A and prove P(z). Here the fresh letter is a free parameter, and it cannot be eliminated without significant changes to how we write down things.

      • gowers Says:

        Thanks — I see your point about integrating now. To get maths you do exactly as you are doing but insert “latex” after the first dollar with no space between the dollar and “latex”. So if you want, for example, to write \int_0^1f(x)dx, then what you actually type in is £latex \int_0^1f(x)dx£ except that the pound sign should be a dollar sign.

    3. Richard Baron Says:

      If we are keen to tie down the logic that lies behind “Oh, and by the way, this deduction works for all n”, should it go like this?

      1. Write out the proof. We start with P(n), then after several lines get to R(n), then after more lines to S(n), then eventually to Q(n). There is no binding of n in any of this.

      2. Use the deduction theorem several times, with liberal application of brackets, to convert the lines of proof into one gigantic sequence of nested conditionals.

      3. Now we have a single open formula, with free n. Put one last big pair of brackets round it.

      4. Finally, put the universal quantifier, for all n, on the front.

      Then no logician will be able to accuse a mathematician of any sloppiness. But we do need to be sure that the deduction theorem applies to the logic we implicitly invoke in writing proofs. (I guess it does.)

    4. anonymous Says:

      When I read this sentence,

      ”But then $latex| x – y |$ is a positive number that is not greater than $latex| x – y |$.”

      it seems like you may have intended to write something else, like

      ‘But then $latex| x – y |$ is a positive number, which shows that…”

    5. Blog of the Month- October 2011 | MY DIGITAL NOTEBOOK Says:

      […] Gower’s Weblog: Blog of the Month for October 2011. Prof. Timothy Gowers (a.k.a. Tim Gowers) is a field medalist and an eminent mathematician. He is a member of the Department of Pure Mathematics and Mathematical Statistics at Cambridge University . He is ideal of many student and math majors including me. I was not a huge fan of his writing as he  has  written  only a few articles after I joined the web. But as he started his series on ‘CAMBRIDGE TEACHING‘, I have become a regular reader of his weblog. One math student must read his posts on Cambridge Teaching. Most Recent Post: Basic logic — relationships between statements — converses and contrapositives  […]

    6. Chris Purcell Says:

      “For every positive integer n, the following are equivalent: (i) n is a prime; (ii) (n – 1)! + 1 is divisible by n.”

      1 is a positive integer. 1 is not prime. 0! + 1 = 2, which is divisible by 1. Hence, 1 is a counterexample.

    7. indirect proofs: contrapositives vs. proofs by contradiction « Sebastian Pokutta’s Blog Says:

      […] discussion on contrapositives vs. proofs by contradiction as part of Timothy Gowers’ Cambridge Math Tripos, mathoverflow, and Terry Tao’s blog. At first sight these two concepts, the contrapositive […]

    8. Blog of the Month Awards - October 2011 | Gaurav Tiwari Says:

      […] Gower’s Weblog: Blog of the Month for October 2011. Prof. Timothy Gowers (a.k.a. Tim Gowers) is a field medalist and an eminent mathematician. He is a member of the Department of Pure Mathematics and Mathematical Statistics at Cambridge University . He is ideal of many student and math majors including me. I was not a huge fan of his writing as he  has  written  only a few articles after I joined the web. But as he started his series on ‘CAMBRIDGE TEACHING‘, I have become a regular reader of his weblog. One math student must read his posts on Cambridge Teaching. Most Recent Post: Basic logic — relationships between statements — converses and contrapositives  […]

    Leave a Reply

    Fill in your details below or click an icon to log in:

    WordPress.com Logo

    You are commenting using your WordPress.com account. Log Out / Change )

    Twitter picture

    You are commenting using your Twitter account. Log Out / Change )

    Facebook photo

    You are commenting using your Facebook account. Log Out / Change )

    Google+ photo

    You are commenting using your Google+ account. Log Out / Change )

    Connecting to %s


    %d bloggers like this: