It’s been a while since I have written a post in the “somewhat philosophical” category, which is where I put questions like “How can one statement be stronger than an another, equivalent, statement?” This post is about a question that I’ve intended for a long time to sort out in my mind but have found much harder than I expected. It seems to be possible to classify theorems into three types: ones where it would be ridiculous to use contradiction, ones where there are equally sensible proofs using contradiction or not using contradiction, and ones where contradiction seems forced. But what is it that puts a theorem into one of these three categories?
This is a question that arises when I am teaching somebody who comes up with a proof like this. “Suppose that the sequence is not convergent. Then … a few lines of calculation … which implies that . Contradiction.” They are sometimes quite surprised when you point out that the first and last lines of this proof can be crossed out. Slightly less laughable is a proof that is more like this. “We know that . Suppose that . Since the derivative of has absolute value at most 1 everywhere, it follows that , which is a contradiction. Therefore, .” There, it is clearly better to work directly from the premise that via the lemma that to the conclusion that . However, the usual proof of the lemma does use contradiction: one assumes that the conclusion is false and applies the mean value theorem.
The result of all this is that I don’t have a good tip of the form, “If your theorem is like this then try a proof by contradiction, and otherwise don’t.” For the remainder of this post I’ll discuss another couple of examples that show some of the complications that arise.
Example 1. The irrationality of the square root of 2.
This is of course the classic proof by contradiction, and one can even give a kind of quasi-proof that it must use contradiction. The reason is that the word “irrational” means “not equal to for any pair of integers “. If that is the definition, then let us suppose that the last two lines of the proof went: therefore has property ; therefore is irrational. We could then ask, “Why does having property imply that a number is irrational?” It might be obvious that property implies irrationality, but to prove it it would still be necessary to say, “Well, take any rational number … therefore does not have property .” (Why would that be necessary? For precisely the same reason! Perhaps this is an induction on proof length or something like that.)
With those thoughts in mind, consider the following argument. We begin by calculating the continued-fraction expansion of . We find that . The denominator of the fraction is , so we see that the continued-fraction expansion repeats itself, and, in one reasonably standard notation, is . In particular, it is infinite. Therefore, is irrational.
At first sight, this looks like a direct argument rather than a proof by contradiction: we used the hypothesis that to deduce that has a property that obviously implies irrationality. However, as I mentioned in the general remarks, one could question this by asking, “Why is it that a number with an infinite continued-fraction expansion has to be irrational?” The answer? It’s obvious that a rational number has a terminating continued fraction, because as you work it out the denominators keep decreasing … oops, sorry, that was a proof by contradiction.
So perhaps the answer is indeed that if you are trying to prove a negative statement, then you have to use a proof by contradiction. But what is a “negative statement”? How about the following theorem?
Theorem. If and are integers, then .
Aha, you say, the “not equals” makes that negative. But we can deal with that by a quick reformulation.
Theorem. If and are integers, then .
What’s negative about that? If you think that it’s somehow bound up in the notion of “strictly greater than”, then how about this?
Theorem. If and are integers, then there exists a real number such that .
That looks pretty positive to me, since it is asserting the existence of something.
It becomes less positive if you imagine how you would establish the existence of such an . The obvious thought would be, “Well, the only thing that could possibly go wrong is if , so all we have to prove is that .” And that’s negative again. So does that mean that a statement is negative if the only sensible way of proving it is to reduce it to a statement that includes the word “not”? Even if something like that is correct, it seems quite hard to formalize.
Here’s another example of that last type of question. Is being infinite a negative property? One might say yes, because it means not being finite. But when we were talking about continued fractions, all we cared about was sequences, and we can define a sequence to be infinite if its terms can be put in bijection with the natural numbers. (And we could define a set to be infinite if there is an injection to a proper subset. But is a proper subset a negative concept because it means not including all elements?)
Example 2. Continuous functions on closed bounded intervals.
Until recently I “knew” that the following was the case. If you want to prove something using the compactness of , then you can either prove it directly using the Heine-Borel theorem or you can prove it by contradiction by reformulating everything in terms of sequences and applying the Bolzano-Weierstrass theorem. For example, to prove that a continuous function on is bounded, you either find a neighbourhood of each point on which is bounded (by the definition of continuity) and cover by finitely many such neighbourhoods (by the Heine-Borel theorem), or you assume that is unbounded, construct a sequence such that for every , apply the Bolzano-Weierstrass theorem, and reach a contradiction.
I had also tacitly assumed that there is an algorithm for converting proofs of one kind into proofs of the other, though I had never actually tried to work out the details.
But recently, a colleague and I had a conversation that led to the following proof of the theorem, which disturbs my cosy view of how things work. The idea is to try to imitate the above proof by contradiction as much as possible, but without actually bothering to prove the result by contradiction. Specifically, one builds a sequence that is the most likely sequence to cause trouble, and then proves that it does not cause trouble. Here is how the argument goes.
Let be the supremum of . By the definition of a supremum, we can find a sequence such that . Pick a convergent subsequence , by Bolzano-Weierstrass. Then is a subsequence of , so . But if is the limit of the sequence , then . So . So is an upper bound for . (Note that this proof also shows that the bound is attained.)
There seems to be no contradiction involved. But again, if we dig a little deeper it starts to look as though what is really going on is that the contradiction is hidden in the “obvious” steps of the proof. For instance, how do we know that we can find a sequence such that ? We’ll need to split into two cases (unless we want to define the topology we are implicitly putting on the extended real line). The case where is not really worth investigating, since it instantly gives us that is bounded (though it was by unnecessarily doing this step that we obtained the added information that attains its upper bound). And if we then look at the case , how is what we are doing any different from assuming that is unbounded? I find myself rather confused.
One thing that seems to be coming out of these examples is that the notion of a proof by contradiction is relative to the definitions you use and the small results that you take for granted. For instance, I could define a number to be irrational if its continued-fraction expansion is non-terminating. I wouldn’t actually advocate doing that, but if one did, then the “direct” proof I gave of the irrationality of would be just that — direct. And if I do not allow myself to assume that then the proof that whenever appears to stop being direct and require a contradiction.
In which case, perhaps the advice that I give to students — proof by contradiction is a very useful tool, but try not to use it unless you really need it — is, though not completely precise, about the best one can do.