Converses.
What is the relationship between the following two statements?
1. If is 1 or a prime number, then is divisible by .
2. If is divisible by then is 1 or a prime number.
At first sight, this doesn’t look a very difficult question: the first statement is of the form and the second is of the form We say that the second statement is the converse of the first. (Note that the first statement is also the converse of the second.)
However, if you have read the previous posts in this series, I hope you will be slightly anxious about what I have just written, because I have not been clear about whether the above two statements are referring to some specific or whether they are meant as general statements about all positive integers . The latter is, I would suggest, the natural interpretation. And if you look at the kinds of converses that come up in a maths course, you will find that they are almost always not of specific statements, but rather of general statements.
This means that just as there are two notions of implication, there are also two notions of converse. One is the one I mentioned just a moment ago: the converse of is . The other involves a universal quantifier: we take a statement of the form and its converse is . (We don’t have to have just one variable here. For instance, the converse of is .)
This is a potential source of confusion, since the definition of converse is often given as the first one (that the converse of is ), while almost all the interesting converses that actually come up in mathematics are of the second kind (where we universally quantify over one or more variables).
With that point behind us, let me discuss two examples of converses.
Example 1.
I’ll start with a simple one. Suppose you want to prove that two sets and are equal. The following is the criterion for this to be the case: every element of is an element of and every element of is an element of For convenience, let’s suppose that we know that and are both subsets of a set . (For example, we might know that and consist of positive integers. In that case, would be .) Then the statement that would break down into two statements that we could write using quantifiers in the following way.
The first of these tells us that every element of is an element of , and the second that every element of is an element of . And those two statements are converses of each other.
Example 2.
Let and be two integers (not necessarily positive). Let us define the set of integers generated by and to be the set of all integers of the form , where and are themselves integers. We could call this the set of all “integer combinations” of and . An important question that comes up in the number theory that you will do this term is this: under what circumstances is the set generated by and equal to the set of all integers? That is, when can you make any integer you want out of and by adding appropriate multiples?
To get some traction on this problem, it might be worth looking at some examples. Suppose that and . Can we make all integers by adding multiples of and ? Quite clearly not, since every integer we can make will be a multiple of 10. Obviously this example can be generalized: if and have a common factor that’s bigger than 1, then they cannot generate the whole of (since everything that they generate will be a multiple of and not all integers are multiples of ).
Let us express this observation more formally.
Note that this statement is of the general form
As such, it has a converse, obtained by reversing the roles of the statements and . In words, the converse is as follows.
Now converses of true statements don’t have to be true. For example, the statement, “Every multiple of 10 is even,” is true, and its converse, “Every even number is a multiple of 10,” is false. However, the converse I have just written down happens to be true: it really is the case that if two integers don’t generate the whole of then they must have a common factor bigger than 1. If you are not familiar with this fact, then I recommend pausing for a few seconds just to think how you might prove it.
.
.
.
.
And unless you are very unusually quick, you will have come to the conclusion that it is not at all obvious how to prove it. This may seem slightly mysterious: the two properties, “do not generate all of ” and “have a common factor bigger than 1” are equivalent, so in a sense the same property, and yet the second feels stronger than the first, in the sense that the first is easy to deduce from the second, while the second is not so easy to deduce from the first. I discussed this phenomenon in a blog post a few years ago, which provoked a number of very interesting comments.
Contrapositives.
Let and be two statements. Then the statement is true (here I’m using the truth-value definition) if and only if it is not the case that is true and is false.
When is true? Applying the same rule, the only thing that could stop it being true is if is true and is false. But that’s the same as saying that is true and is false. In other words, the conditions for to be true are precisely the same as the conditions for to be true. The statement is called the contrapositive of the statement . A statement and its contrapositive are always equivalent, but the importance of the contrapositive is that it is sometimes easier to prove than the original statement.
As ever, the situation is more interesting if we quantify over a parameter involved in a statement. Consider a statement of the form
For each the inner part of this statement is equivalent to . Therefore, the entire statement is equivalent to
That was a bit abstract, so let me look at a simple example. A surprisingly useful way of proving that two real numbers and are equal is to show that their difference is less than all positive numbers. That is, we rely on the following principle.
The statement I have just written is of the general form
Indeed, is the statement,
and is the statement .
Now the easiest way to prove this useful principle is to say the following. Suppose that . Then is some number that isn’t zero (since if , then adding to both sides gives us that ). But then is a positive number, which shows that the statement
is false. (The statement claims that a certain inequality holds for every positive number . But when it doesn’t hold.)
In that argument, we started with the assumption (in other words, we assumed that ) and ended up proving (in other words, we proved that there is some positive number that is not greater than ).
What makes it easier to prove the contrapositive in the cases when it is indeed easier? It’s a bit tricky to say in general, so I’m not going to try too hard. Instead, let me offer some stylistic advice, which is that you should always start by trying to prove a statement in the obvious direct way, turning to the contrapositive if (i) you get stuck and (ii) you can see why you will be less stuck if you try to prove the contrapositive. Let’s see how that advice would have played out in the example above.
We would have started trying to prove the statement directly. So we’d have said, “Right, I’ve got two numbers and and I get to assume that their difference is smaller than any number I choose. Hmm … but what number do I choose?” At that point it wouldn’t be all that obvious. Note that one number we can’t choose is , since we don’t know that that number is positive. In fact, we’re trying to prove that it is not positive!
At that point, feeling stuck, we wonder whether starting with the assumption that might help. And we see that it does, since it provides us with a positive real number, namely the difference between and . So when we decide to try to prove the contrapositive instead, we are not doing so merely because we are stuck — there are plenty of times when we get stuck where turning to the contrapositive is clearly of no help whatsoever — but for the additional reason that gives us something that we needed in order to make progress.
You may have heard this method called “proof by contradiction”. That isn’t strictly speaking correct (though this is a point that I myself didn’t understand until recently). There are two sorts of proof that make use of the negation of the conclusion. Just to give names to things, let’s suppose we are trying to prove the statement
One type of proof is what I illustrated above: you simply prove the contrapositive instead. That is, you prove
Another type of proof is this. You assume that the result is false. That is, you assume
and from that you proceed to deduce a contradiction: that is, you deduce a statement and its negation. There is an interesting discussion of this difference at Mathoverflow, which also contains a link to a highly recommended blog post of Terence Tao.
More on converses.
Now that we have talked about contrapositives, I can say something further about converses. We know that a statement is equivalent to its contrapositive. What happens if we start with a statement and look at the contrapositive of its converse? It will of course be equivalent to the converse, but what will it actually look like?
Well, if the original statement is , then the converse is , and the contrapositive of that is . What this tells us is that is another way of formulating the converse of . Or at least, that’s the bare truth-value way of putting it. For the more typical cases of interest, I would want to say that we can formulate the converse of
as
The practical consequence of this is that if you are asked to prove a statement that looks like this:
and you start by showing that for some arbitrary (which means that you’ve proved it for all ), then you have two options. Either you can say, “Conversely, let us assume that ,” and proceed to try to deduce , or you can say, “Conversely, let us assume that ,” and proceed to try to deduce . The one thing not to do is say, “Conversely, let us assume that ,” and proceed to try to prove . If you do that, then you are proving the contrapositive of what you proved before — i.e., basically the same result — and not the converse. I have seen this mistake many times. I’ve even made it myself, though I think I’ve always noticed before I’ve got very far. I recommend that if you ever decide to prove the converse of a statement by considering negations of the component parts, you check very carefully that you’ve got things the right way round.
As an example of this, consider the pair of statements with which I began this post. Suppose that we want to prove the following equivalence.
We might start by trying to show that (i) implies (ii). This turns out not to be obvious: it is a theorem that will be covered in the Numbers and Sets course, known as Wilson’s theorem. But suppose we’ve finished that and we wonder about the converse. If we go for the option, then we’ll be using a weird condition — that is divisible by — and trying to deduce from that a rather negative property of — that it hasn’t got any non-trivial factors. If on the other hand we go for the option then we’re much better off. This time our hypothesis is that is composite, which gives us some factors to play around with, and instead of being forced to use a weird condition we merely have to disprove it, which is somehow an easier task.
Indeed, the whole deduction turns out to be easy this way round: if isn’t prime then we can write with both and less than . So if we write the numbers from 1 to out in a line, then and will be amongst them, from which it follows that is a multiple of , which equals . Since it’s not possible for both and to be multiples of , we have proved that the strange condition doesn’t hold, just as we wanted.
Or have we? Can you see a mistake in the above argument? If you can’t, then there is an important moral: don’t be fooled by your notation into making assumptions that you haven’t justified. In the above argument I assumed that and were different: if I hadn’t then it wouldn’t have followed that was a multiple of .
There are two ways of correcting the argument. One is to note that we can always write with and different unless is either a prime or the square of a prime. But if for some prime then both and are less than — except if — so is a multiple of . If then we just look and see that , which is not a multiple of 4.
The other way of correcting the argument is cleaner because it doesn’t involve splitting into cases. We simply observe that if is a factor of that’s less than , then is also a factor of . Therefore is not a factor of . (I’m also assuming of course that .) But is a factor of any multiple of , so cannot be a multiple of .
So here we have an example of an equivalence with two properties mentioned above: one direction is easier than the other, and to prove the converse of a statement it is easier to show that than it is to show that .
October 5, 2011 at 1:32 pm |
‘the two properties, “generate all of \mathbb{Z}” and “have a common factor bigger than 1″ are equivalent’: you meant the first one to be “do not generate all of \mathbb{Z}”, I think.
Many thanks — corrected now.
October 5, 2011 at 2:06 pm |
About the beginning of your post. Clearly, the $n$ in the two statements is a _free parameter_. It is _not_ quantified, and the two statements are clearly converses of each other. The trouble is that many mathematicians do not understand the difference between a free parameter and a universally quantified one. But the difference is there. It is the same as the difference between a function and its definite integral. Perhaps it would be a good idea to say something about free variables in these posts. They are necessary in hypothetical reasoning, i.e., in the middle of a proof we often do have free parameters, and unavoidably so.
October 5, 2011 at 2:33 pm
As a matter of fact, I do have a half-written post in which I discuss at some length the difference between free and bound variables. However, I’m not sure I agree with your second sentence. I think that when such a sentence is uttered, is sometimes free and sometimes universally quantified — but only implicitly. For instance, if I say, “I discovered a cool fact yesterday: if is a prime of the form then it can be written as a sum of two squares,” then I would maintain that what I actually mean is that every prime of the form can be written as a sum of two squares. However, if I start to prove this result and begin by saying, “Let be a prime of the form ,” then has that mysterious fixed-but-arbitrary status and perhaps it’s better to call it free.
I’m not sure I completely buy the function/integral analogy, but you may be able to persuade me. If I take a statement that involves a parameter (presumably the analogue of the function) and form the statement then I get a statement without parameters, just as if I integrate f between 0 and 1 then I get a number that doesn’t depend on the variable that the function takes. But in the first case I get a statement from which I can deduce all the individual statements , whereas from the definite integral I can’t say anything about individual values of . So it seems to me that integrating is throwing away information in a way that universally quantifying isn’t.
I’m also interested by your last three words. I can see that having free parameters in the middle of proofs is extremely natural, and of course I do it myself, but why can’t one avoid it by simply universally quantifying everything? For instance, if I want to prove that for every , I would normally say, “Let and suppose that ,” and proceed to deduce . But …
Hmm … honesty compels me to leave that last paragraph there, but I think I now see the answer to my question. If, for example, my argument went then the first line of my proof basically has to be that . I certainly can’t start with as my initial assumption (since I need to deduce from and not from the quite possibly false statement that holds for all ). So if I “universally quantify over everything”, what I’m doing is taking as my premise and then saying, “Oh, and by the way, this deduction works for all ,” at the end of the proof. Is that what you mean when you say that free parameters are unavoidable?
October 5, 2011 at 3:15 pm
Uh, there are a lot of interesting points here. (By the way, what do I have to write to get math notation?)
We have a meta-theorem for first-order logic which says that with a free parameter has a proof iff has a proof.
In other words, as far as proving a statement goes, it does not matter whether the outer universal quantifiers are there. This is the precise meaning of “ is implicitly quantified”, I think. Also, this meta-theorem is sometimes misunderstood as saying that is equivalent to , which is nonsense.
Suppose I tell you that I integrated a function on and got . Can you tell which function it was? No. Suppose I tell you that I universally quantified a statement over the domain and I got true. Can you tell me which statement it was? No. We must not confuse the truth value of with the expression . A fair analogy would be this: if I show you the expression then you can tell what is. Likewise, if I show you the expression then you can tell what is.
Yes, what you say about having free parameters in the middle of the proof is essentially what I am trying to say. Speaking as a logician, you simply must have free parameters because the rule of inference for universal quantifiers requires you to use them. Namely, in order to prove you must do the following: pick a letter which has not been used so far, say , assume and prove . Here the fresh letter is a free parameter, and it cannot be eliminated without significant changes to how we write down things.
October 5, 2011 at 3:28 pm
Thanks — I see your point about integrating now. To get maths you do exactly as you are doing but insert “latex” after the first dollar with no space between the dollar and “latex”. So if you want, for example, to write , then what you actually type in is £latex \int_0^1f(x)dx£ except that the pound sign should be a dollar sign.
October 5, 2011 at 2:52 pm |
If we are keen to tie down the logic that lies behind “Oh, and by the way, this deduction works for all n”, should it go like this?
1. Write out the proof. We start with P(n), then after several lines get to R(n), then after more lines to S(n), then eventually to Q(n). There is no binding of n in any of this.
2. Use the deduction theorem several times, with liberal application of brackets, to convert the lines of proof into one gigantic sequence of nested conditionals.
3. Now we have a single open formula, with free n. Put one last big pair of brackets round it.
4. Finally, put the universal quantifier, for all n, on the front.
Then no logician will be able to accuse a mathematician of any sloppiness. But we do need to be sure that the deduction theorem applies to the logic we implicitly invoke in writing proofs. (I guess it does.)
October 5, 2011 at 7:24 pm |
When I read this sentence,
”But then $latex| x – y |$ is a positive number that is not greater than $latex| x – y |$.”
it seems like you may have intended to write something else, like
‘But then $latex| x – y |$ is a positive number, which shows that…”
October 5, 2011 at 8:14 pm
I intended to write what I actually did write, but your suggestion is clearer, so I’ve adopted it.
October 6, 2011 at 7:49 am |
[…] Gower’s Weblog: Blog of the Month for October 2011. Prof. Timothy Gowers (a.k.a. Tim Gowers) is a field medalist and an eminent mathematician. He is a member of the Department of Pure Mathematics and Mathematical Statistics at Cambridge University . He is ideal of many student and math majors including me. I was not a huge fan of his writing as he has written only a few articles after I joined the web. But as he started his series on ‘CAMBRIDGE TEACHING‘, I have become a regular reader of his weblog. One math student must read his posts on Cambridge Teaching. Most Recent Post: Basic logic — relationships between statements — converses and contrapositives […]
October 7, 2011 at 8:18 pm |
“For every positive integer n, the following are equivalent: (i) n is a prime; (ii) (n – 1)! + 1 is divisible by n.”
1 is a positive integer. 1 is not prime. 0! + 1 = 2, which is divisible by 1. Hence, 1 is a counterexample.
October 7, 2011 at 9:32 pm
Damn. OK I’ll add in to (i) that n can be 1.
October 18, 2011 at 9:01 pm |
[…] discussion on contrapositives vs. proofs by contradiction as part of Timothy Gowers’ Cambridge Math Tripos, mathoverflow, and Terry Tao’s blog. At first sight these two concepts, the contrapositive […]
April 4, 2014 at 4:30 am |
[…] Gower’s Weblog: Blog of the Month for October 2011. Prof. Timothy Gowers (a.k.a. Tim Gowers) is a field medalist and an eminent mathematician. He is a member of the Department of Pure Mathematics and Mathematical Statistics at Cambridge University . He is ideal of many student and math majors including me. I was not a huge fan of his writing as he has written only a few articles after I joined the web. But as he started his series on ‘CAMBRIDGE TEACHING‘, I have become a regular reader of his weblog. One math student must read his posts on Cambridge Teaching. Most Recent Post: Basic logic — relationships between statements — converses and contrapositives […]
March 16, 2020 at 1:45 am |
coach Training
Basic logic — relationships between statements — converses and contrapositives | Gowers's Weblog
August 9, 2021 at 7:03 am |
http://ismp.Cpatu.embrapa.br/author/MarcReinha
Basic logic — relationships between statements — converses and contrapositives | Gowers's Weblog