I have discussed how the mathematical meanings of the words “and”, “or” and “not” are not quite identical to their ordinary meanings. This is also true of the word “implies”, but rather more so. In fact, unravelling precisely what mathematicians mean by this word is a sufficiently complicated task that I have just decided to jettison an entire post on the subject and start all over again. (Roughly speaking what happened was that I wrote something, wasn’t happy with it for a number of reasons, made several fairly substantial changes, and ended up with something that simply wasn’t what I now feel like writing after having thought quite a bit more about what I want to say. The straw that broke the camel’s back was a comment by Daniel Hill in which he pointed out that “implies” wasn’t, strictly speaking, a connective at all.)
I’ll mention a number of fairly subtle distinctions in this post, and you may find that you can’t hold them all in your head. If so, don’t worry about it too much, because you can afford to blur most of the distinctions. There’s just one that is particularly important, which I’ll draw attention to when we get to it.
“Implies” versus “therefore” versus “if … then”.
The three words “implies”, “therefore”, and “if … then” (OK, the third one isn’t a word exactly, but it’s not a phrase either, so I don’t know what to call it) are all connected with the idea that one thing being true makes another thing true. You may have thought of them as all pretty much interchangeable. But are they exactly the same thing?
Some indication that they aren’t quite identical comes from the grammar of the words. Consider the following three sentences.
The first one is the most natural of the three. The second doesn’t quite read like a proper English sentence (because it isn’t), and the third, though correct grammatically, somehow doesn’t quite mean the same as the first, which is partly reflected by the fact that it is two sentences rather than one. (I could have used a semicolon instead of the full stop, but a comma would not have been enough.)
Let’s deal with the difference between “Therefore” and “if … then” first. The third formulation starts with the sentence, “It’s 11 o’clock.” Therefore, it is telling us that it’s 11 o’clock. By contrast, the first formulation gives us no indication of whether or not it is 11 o’clock (except perhaps if there is a note of panic in the voice of the person saying the sentence). So we use “therefore” when we establish one fact and then want to say that another fact is a consequence of it, whereas we use “if … then” if we want to convey that the second fact is a consequence of the first without making any judgment about whether the first is true.
How about “implies”? Before I discuss that, let me talk about another distinction, between mathematics and metamathematics. The former consists of statements like “31 is a prime number” or “The angles of a triangle add up to 180”. The latter consists of statements about mathematics rather than of mathematical statements themselves. For example, if I say, “The theorem that the angles of a triangle add up to 180 was known to the Greeks,” then I’m not talking about triangles (except indirectly) but about theorems to do with triangles.
The sort of metamathematics that concerns mathematicians is the sort that discusses properties of mathematical statements (notably whether they are true) and relationships between them (such as whether one implies another). Here are a few metamathematical statements.
In each of these four sentences I didn’t make mathematical statements. Rather, I referred to mathematical statements. The grammatical reason for this is that the word “implies”, in the English language, is supposed to link two noun phrases. You say that one thing implies another.
A noun phrase, by the way, is, roughly speaking, anything that could function as the subject of a sentence. For instance, “the man I was telling you about yesterday” is a noun phrase, since it functions as the subject of the sentence,
Other noun phrases in that sentence are “his bicycle” and “the third time”.
Let me write something stupid:
I wrote that because there is an important difference between two kinds of nonsense. The above sentence doesn’t make much sense, because you can’t imply a bicycle. However, it is at least grammatical in a way that
All this means that when we use “implies” in ordinary English, we are not connecting statements (because statements are not noun phrases) but talking about statements (because we use noun phrases to refer to statements).
I can think of three ways of turning statements into noun phrases. The first is rather crude: you put inverted commas round it. For example, if I want to do something about the incorrect sentence
then I could change it to
The second method is to come up with some name for the statement. That doesn’t work well here, but let’s have a go.
It works better for mathematical statements with established names such as the Bolzano-Weierstrass theorem.
The third method is to stick “that” or something like “the claim that” in front.
I mentioned above that “implies” is not, strictly speaking, a connective. Why is this? It’s because connectives are used to turn mathematical statements into mathematical statements. For example, we can use “and” to build the statement “ is prime and ” out of the two statements “ is prime” and ““. When we do that, the new statement isn’t referring to the old statements, but rather it contains them.
Unfortunately, as so often with this kind of thing, common mathematical usage is more complicated than the above discussion would suggest. Most people read the “$\Rightarrow$” symbol as “implies”. And most people are quite happy to write something like
which, according to what I said above, is ungrammatical because “implies” is not linking noun phrases. What I suggest you do here is not worry about this too much: confusion between mathematics and metamathematics is unlikely to be a problem when you are learning about Numbers and Sets and about Groups. If you are inclined to worry, then you could resolve to read a sentence like the above as “If then ” I would also say that the symbol “$\Rightarrow$” should in general be used fairly sparingly. In particular, don’t insert it into continuous prose. For instance, don’t write something like, “Therefore and ” Instead, write, “Therefore and which implies that ” (Note that in that last sentence the word “which” functioned as the subject of “implies” and referred back to the statement “ and “.)
Quotation and quasi-quotation.
If you like subtle distinctions that will not matter in your undergraduate mathematical studies, then read on. If you don’t, then feel free to skip this short section.
The distinction I want to draw attention to is between two uses of quotation marks. Just for good measure, let’s look at three different ways of doing something with the sentence, “There are infinitely many primes.”
- There are infinitely many primes, but only one of them is even.
- “There are infinitely many primes” is a famous theorem of mathematics.
- “There are infinitely many primes” is an expression made up of five words.
The first of these sentences is about numbers. As such, it doesn’t use quotation marks. The third sentence is about a linguistic expression. As such, it very definitely requires quotation marks, just as they are needed in the sentence
As for the second sentence, it is somewhere in between. It isn’t about numbers, but it’s also not about a linguistic expression. It’s about a mathematical fact. This use of quotation marks is sometimes called quasi-quotation. I won’t say any more but will instead refer you to the relevant Wikipedia article if you are interested. [Thanks to Mohan Ganesalingam for drawing my attention to it.]
Yes, but what do “if … then” and “implies” mean?
I’ve just spent rather a long time discussing the grammar of “implies”, “therefore” and “if … then” and said almost nothing about what they actually mean. To avoid confusion, I’m mainly going to discuss “if … then” since there is no doubt that that really is a connective. But sometimes I’m going to want to do what I’ve done in previous posts and use the letters P and Q to stand for statements, and here, unfortunately, there is a danger of the confusion creeping back. In particular, if one is being careful about it then one needs to be clear what “standing for a statement” actually means.
Is it something like the relationship between “The Riemann hypothesis” and “Every non-trivial zero of the Riemann zeta function has real part 1/2”? That is, are P and Q names for some statements? Not exactly, because we want to be able to make sense of the expression $P\wedge Q$ (recall that $\wedge$ is a symbolic way of writing “and”) and the word “and” links statements rather than names. (You don’t, for example, say, “The Riemann hypothesis and Fermat’s Last Theorem” if you want to assert that the Riemann hypothesis and Fermat’s Last Theorem are both true.) So we should think of P and Q as statements themselves; it’s just that they are unknown statements.
But in that case we shouldn’t be allowed to write $P\Rightarrow Q$, or at least not if $\Rightarrow$ means “implies”. But that’s just too bad. I’m going to write it, and if you’re worried about it then read “$P\Rightarrow Q$” as “if P then Q”. But actually what I recommend is not worrying about it and just knowing in your heart of hearts that it would be easy to replace what you are saying by something that is strictly correct if there were ever any danger of confusion.
So let us pause, take a deep breath, allow everything I’ve written so far to slip comfortably into the back of our minds, and turn to the question of what “if … then” and “implies” actually mean. And the answer is rather peculiar. In everyday English, when we use one of these words, we are trying to explain that there is a link between the two statements we are relating (either directly or by referring to them). For example, if I say, “If we continue to emit carbon dioxide into the atmosphere at the current rate then sea levels will rise by two metres by 2100,” I am suggesting a causal link between the two.
Let me now give the standard account of what mathematicians mean by “if … then”. Later I shall qualify it considerably, not because I think it is incorrect but because I think it doesn’t give the whole picture and can be unnecessarily off-putting. The standard thing to say is that $P\Rightarrow Q$ is true unless $P$ is true and $Q$ is false. That is, if you want to establish that $P\Rightarrow Q$, then the only thing that can go wrong is $P$ being true and $Q$ being false.
A brief interruption: purists will note that I have been inconsistent. If $P$ is a statement rather than something that refers to a statement, then I can’t say “$P$ is true”. I have to say, “‘$P$’ is true.” Alternatively, I should have said, “$P\Rightarrow Q$ unless $P$ and $\neg Q$.” Can we agree that I’ll be slightly sloppy here? (If you don’t understand why it’s sloppy, I don’t think it matters.)
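If it helps to see the standard account laid out mechanically, here is a minimal sketch in Python. The function name `implies` is my own choice for illustration. It encodes nothing more than the rule just stated: “if P then Q” fails only when P is true and Q is false.

```python
def implies(p: bool, q: bool) -> bool:
    """Truth-value 'if p then q': false only when p is true and q is false."""
    return (not p) or q


# Print the truth table for all four combinations of truth values.
for p in (True, False):
    for q in (True, False):
        print(f"P={p!s:5}  Q={q!s:5}  (if P then Q)={implies(p, q)}")
```

Exactly one row of the table comes out false, namely the row with P true and Q false.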
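Let me illustrate this with a few examples.
- If there were weapons of mass destruction in Iraq, then pigs can fly.
- The Riemann hypothesis implies Fermat’s Last Theorem.
- If $n$ is both even and odd, then $n=17$.
- If $n$ is a prime not equal to 2, then $n$ is odd.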
Of these four statements, the fourth one seems quite reasonable, while the other three are all a bit peculiar. For example, it’s quite obvious that (the recent Pink Floyd stunt notwithstanding) pigs cannot fly. Doesn’t that make the first sentence false? And how can one say that the Riemann hypothesis implies Fermat’s Last Theorem when nobody expects a proof of Fermat’s Last Theorem that uses the Riemann hypothesis? And surely if $n$ is both even and odd, it could just as well be 19. Can it be correct to say that it has to be 17? As for the fourth sentence, it seems fine: if $n$ is a prime not equal to 2, then it cannot have 2 as a factor (or it wouldn’t be prime), so it must indeed be odd.
Well, mathematicians would say that all four statements are true. That’s because the only way “If P then Q” can be false is if P is true and Q is false. You should understand this as a definition of “if … then”. Let’s check the four statements using this definition.
For the first one to be false, we would need there to have been weapons of mass destruction in Iraq and for pigs to be unable to fly. Well, we’ve got the earthbound pigs but there were no weapons of mass destruction in Iraq, so the first statement is true. (Again, this is not some metaphysical claim. It just follows from the way we have chosen to define “if … then”.)
For the second to be false, we would need the Riemann hypothesis to be true and Fermat’s Last Theorem to be false. Well, Andrew Wiles, with help from Richard Taylor, has proved Fermat’s Last Theorem, so it’s not false. So the second statement in the list is true.
As for the third, the only way for that to be false is if $n$ is both even and odd but is not equal to 17. But no number is both even and odd. Therefore, the third statement is true. The problem about $n$ equalling 19 doesn’t arise because there are no even and odd integers in the first place.
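The checks above can be mimicked in a few lines of Python. This is only a sketch under some assumptions of mine: the variable names are invented, and the truth of the Riemann hypothesis is treated as unknown by trying both possibilities. The statements about $n$ are checked only for $n$ up to 1000, which is evidence rather than proof.

```python
from math import isqrt


def implies(p: bool, q: bool) -> bool:
    """Truth-value 'if p then q': false only when p is true and q is false."""
    return (not p) or q


def is_prime(n: int) -> bool:
    """Trial division, adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


# Statement 1: the hypothesis is false, so the implication is true.
wmd_in_iraq = False
pigs_can_fly = False
statement_1 = implies(wmd_in_iraq, pigs_can_fly)

# Statement 2: Fermat's Last Theorem is proved, so the implication is true
# whichever truth value the Riemann hypothesis turns out to have.
fermat = True
statement_2 = all(implies(riemann, fermat) for riemann in (True, False))

# Statements 3 and 4, checked on a finite sample of positive integers.
sample = range(1, 1001)
statement_3 = all(implies(n % 2 == 0 and n % 2 == 1, n == 17) for n in sample)
statement_4 = all(implies(is_prime(n) and n != 2, n % 2 == 1) for n in sample)

print(statement_1, statement_2, statement_3, statement_4)
```

Notice that in statement 3 the hypothesis is false for every $n$ in the sample, so the conclusion `n == 17` is never actually tested.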
Truth values and “causes”.
There’s something unsatisfactory about the truth-value definition of “if … then” and “implies”. It seems to leave out the idea that one thing can be true because another is true. It would be quite wrong to say, for instance, that Fermat’s Last Theorem is true because the Riemann hypothesis is true.
Fortunately, there is a very close link between the truth-value definition and what I’ll call the causal concept of “if … then”. I’m not going to attempt a precise definition of the causal concept — I’m just referring to the basic idea of one statement’s being a reason for another.
Let’s go back to the one statement that felt reasonable in the list above. It was this: “If $n$ is a prime not equal to 2, then $n$ is odd.”
Now comes another somewhat subtle distinction, and this is the one I really care about. What does that statement above actually mean? I think a very natural way of interpreting it is this.
In other words, although it looks like a statement about some fixed number $n$, the fact that we have been told nothing whatsoever about $n$ makes us read it in a slightly different way. We say to ourselves, “Since we’ve been told nothing at all about $n$, this must be intended as a general statement about an arbitrary $n$. So what it’s really saying is that if a positive integer has one property (being a prime not equal to 2) then it has another (being odd).” If we’re thinking about things that way, then it’s rather tempting to say that the property “is a prime not equal to 2” implies the property “is odd”.
What I’ve just suggested is not standard mathematical practice, but in principle it could have been. However, it is incredibly important in mathematics to be completely sure at all times what kinds of objects one is dealing with. I said earlier that “if … then” connects statements and “implies” connects noun phrases that refer to statements. I did not say that either of them connects properties. So if I want to say that one property implies another, then I have to be absolutely clear that this is a different meaning of the word “implies” (even if it is related to the previous one).
OK, so let me be careful. First of all, what is a property? It’s what you get when you take a statement that concerns a variable and you remove that variable. For example, if I take the statement “$n$ is a perfect square” and remove the variable $n$ from it, I get the property “is a perfect square”. A property is a thing you say about something else. (It’s almost like an adjective, but not quite because of the extra “is”.) If you want to be more formal about it, if you are given a set like the set of all positive integers, a property associated with that set is a function from elements of the set to statements. For example, the property “is prime” takes the number $n$ to the statement “$n$ is prime”. (It is more conventional to say that all we actually care about is the truth values of these statements. So the property “is prime” takes the value TRUE at each prime number and FALSE at all other numbers. I’ll stick with my unconventional discussion here.)
Now suppose that we have two properties $A$ and $B$ associated with the positive integers. When do we say that $A$ implies $B$ (according to my unconventional definitions)? Well, for each positive integer $n$ we have a statement $A(n)$ and a statement $B(n)$. I’ll say that $A$ implies $B$ (in the property sense) if for every positive integer $n$, the statement $A(n)$ implies the statement $B(n)$ (in the truth-value sense). In other words, whenever $A(n)$ is true, so is $B(n)$, and otherwise anything can happen. In the example above, $A$ is the property “is a prime not equal to 2”, $B$ is the property “is odd”, and for each $n$, $A(n)$ is the statement “$n$ is a prime not equal to 2” and $B(n)$ is the statement “$n$ is odd”. Every time $A(n)$ is true, which it is when $n=3,5,7,11,\dots$, so is $B(n)$. This gives us the feeling that the property $A$ “causes” the property $B$.
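To make the property sense of “implies” concrete, here is a rough Python sketch in which a property is modelled in the conventional way, as a function from integers to truth values. The helper names (`property_implies`, `counterexamples`) are hypothetical, and because a program can only run through finitely many integers, the check is over a finite domain and proves nothing about all positive integers.

```python
from math import isqrt
from typing import Callable, Iterable, List

Property = Callable[[int], bool]


def implies(p: bool, q: bool) -> bool:
    """Truth-value 'if p then q': false only when p is true and q is false."""
    return (not p) or q


def counterexamples(a: Property, b: Property, domain: Iterable[int]) -> List[int]:
    """All n in the domain for which a(n) is true and b(n) is false."""
    return [n for n in domain if not implies(a(n), b(n))]


def property_implies(a: Property, b: Property, domain: Iterable[int]) -> bool:
    """A implies B on the domain if A(n) implies B(n) for every n in it."""
    return not counterexamples(a, b, domain)


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_odd_prime(n: int) -> bool:
    return is_prime(n) and n != 2


def is_odd(n: int) -> bool:
    return n % 2 == 1


domain = range(1, 1001)
print(property_implies(is_odd_prime, is_odd, domain))
print(counterexamples(is_prime, is_odd, domain))
```

Dropping the condition “not equal to 2” makes the property implication fail, and the list of counterexamples shows exactly where: at $n=2$, the one case in which the hypothesis is true and the conclusion false.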
Let me go back to the statement that seemed reasonable: “If $n$ is a prime not equal to 2, then $n$ is odd.”
It’s important to be careful about what this means. Is it a statement about some specific $n$? If so, then we must interpret the “if … then” in the strict truth-value sense. Or is it really a way of saying, “Every prime not equal to 2 is odd”? In that case, it has more of a causal feel to it.
The best way to keep everything clear at all times is not to write the above sentence when you’re really talking about all $n$. Instead, you can write, “For every positive integer $n$, if $n$ is a prime not equal to 2, then $n$ is odd.”
Now, if you pick out just the part of this statement that says, “If $n$ is a prime not equal to 2, then $n$ is odd,” then you have something that must be interpreted in the truth-value sense. But when you apply those truth-value statements to all positive integers simultaneously, what you end up with is the nice “causal” statement that the property “is a prime not equal to 2” implies the property “is odd”.
A silly deduction and a sensible deduction.
Because there is a sort of causal notion of implication, and because it is in a way what we really care about when doing mathematics, I very much prefer to illustrate the meaning of “implies” or “if … then” with reference to examples that include variables. If I just take two fixed statements like “Margaret Thatcher used to be Prime Minister of the UK” and “there was recently a tsunami in Japan” and tell you that, despite the lack of any obvious relationship between them, the first statement implies the second statement because the second statement happens to be true, then it is clear that the notion of implication I am using has nothing to do with one thing being true because another thing is true: not even the most rabidly left-wing person is going to blame the Japanese tsunami on Thatcher’s premiership. But a statement like, “If $x\in A$ then $x\in B$,” is completely reasonable. Moreover, because $x$ is a general element of $A$, which might be an infinite set, we can’t establish a statement like this by running through all $x$ and checking the truth values of the statements $x\in A$ and $x\in B$. Rather, we have to give a proof, that is, an explanation of why $x$ must belong to $B$ if it belongs to $A$. Thus, once you start looking at statements with variables, the truth-value notion of implication forces you to look for “reasons” and “causes” so that you can establish lots of truth-value facts at once. (I’m leaving out the possibility here that a statement could in some sense “just happen to be true”. For example, many people take seriously the following possibility. Perhaps the property “is even and at least 4” implies the property “is a sum of two primes” in the sense that no number is even and at least 4 without being a sum of two primes, but perhaps also there isn’t a reason for this; perhaps it just happens to be the case.)
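The last example illustrates rather well why running through cases is no substitute for a reason. The sketch below, with a bound `LIMIT` that I have chosen arbitrarily, checks that “is even and at least 4” implies “is a sum of two primes” for every positive integer up to that bound. However large one makes the bound, this establishes only finitely many truth-value facts and tells us nothing about why they hold.

```python
from math import isqrt


def implies(p: bool, q: bool) -> bool:
    """Truth-value 'if p then q': false only when p is true and q is false."""
    return (not p) or q


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_sum_of_two_primes(n: int) -> bool:
    """True if n = a + b for some primes a <= b."""
    return any(is_prime(a) and is_prime(n - a) for a in range(2, n // 2 + 1))


LIMIT = 1000  # arbitrary bound chosen for illustration

holds_up_to_limit = all(
    implies(n % 2 == 0 and n >= 4, is_sum_of_two_primes(n))
    for n in range(1, LIMIT + 1)
)
print(holds_up_to_limit)
```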
Here’s another illustration of the difference between statements that involve parameters and statements that don’t. Consider the following claim: if $\sqrt{2}$ is rational, then there is an integer that is both even and odd.
I’m going to prove it in two different ways.
Proof 1. $\sqrt{2}$ is irrational, so the statement “$\sqrt{2}$ is rational” is false, and therefore implies all other statements. In particular, it implies that there is an integer that is both even and odd.
Proof 2. If $\sqrt{2}$ is rational, then we can find positive integers $p$ and $q$ such that $p/q=\sqrt{2}$, which implies that $p^2=2q^2$. Let $k$ be the largest integer such that $p^2$ is a multiple of $2^k$. Since $p^2$ is a perfect square, $k$ must be even. (To see this, just consider the prime factorization of $p$.) But $p^2=2q^2$, and the largest $k$ for which $2q^2$ is a multiple of $2^k$ is odd. (To see this, just consider the prime factorization of $q$.) Therefore, $k$ is both even and odd, which proves the result.
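The two parity facts used in Proof 2 are easy to test numerically. The sketch below assumes that the number in question is $\sqrt{2}$, so that the equation being ruled out is $p^2=2q^2$. The function name `power_of_two` and the bound `BOUND` are mine. It is a sanity check on small cases, not a replacement for the argument about prime factorizations.

```python
def power_of_two(m: int) -> int:
    """Largest k such that the positive integer m is a multiple of 2**k."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    k = 0
    while m % 2 == 0:
        m //= 2
        k += 1
    return k


BOUND = 200  # arbitrary search bound chosen for illustration

# For a perfect square p*p the exponent of 2 is even ...
squares_even = all(power_of_two(p * p) % 2 == 0 for p in range(1, BOUND))

# ... whereas for twice a perfect square it is odd.
twice_squares_odd = all(power_of_two(2 * q * q) % 2 == 1 for q in range(1, BOUND))

# So p*p and 2*q*q can never be equal; a brute-force search agrees.
solutions = [
    (p, q)
    for p in range(1, BOUND)
    for q in range(1, BOUND)
    if p * p == 2 * q * q
]
print(squares_even, twice_squares_odd, solutions)
```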
Which of these two arguments is more interesting? Undoubtedly the second, since it actually gives us a proof of the irrationality of $\sqrt{2}$. So is the first argument valid at all? You might object to it on the grounds that it uses without proof the fact that $\sqrt{2}$ is irrational. But we can make the question more interesting as follows. There is (it happens) a different proof of the irrationality of $\sqrt{2}$ that does not involve the statement that some positive integer is both even and odd. What if we used that argument, concluded that “$\sqrt{2}$ is rational” was false, and then went on to deduce “there exists an integer that is both even and odd” in the way that argument 1 does above? Would that be a valid deduction?
I think the answer has to be yes, but it is not an interestingly valid deduction. It is not showing that the irrationality of $\sqrt{2}$ is in any way caused by a contradiction that involves parity, since we deduced that from another, and unrelated, false statement.
If we think of implication as primarily something we apply to statements with parameters, and therefore indirectly and in a different sense to properties, then our starting point is not the statement “$\sqrt{2}$ is irrational” but rather the statement “$p/q=\sqrt{2}$”. And our conclusion, that there exists an integer that is both even and odd, is deduced from the more precise (and informative) statement, “the highest $k$ such that $p^2$ is a multiple of $2^k$ is both even and odd”.
As a final remark about the above example, which allows me to emphasize a point I have already made, suppose that I start a proof of the irrationality of $\sqrt{2}$ by writing,
What I am really saying is that whatever $p$ and $q$ might be, if $p/q=\sqrt{2}$ then $p^2=2q^2$. In other words, although it looks as though I’m talking about a specific pair $p$ and $q$, in fact I’m making a general deduction.
What’s good about the usual convention concerning “if … then” and “implies”?
I think I have partially answered this question by pointing out that when we consider statements with parameters then the truth-value meaning of “implies” feels a lot closer to the more intuitive “causal” meaning of “implies”. However, the agreement isn’t total. One of the “silly” examples from early in this post was this: “If $n$ is both even and odd, then $n=17$.”
This looks odd, because although we know that $n$ can’t be both even and odd, we also feel that if $n$ were both even and odd, there would be nothing about that fact that steered $n$ towards the number 17 as opposed to any other number. I can’t deny the feeling of oddness. All I can say is that the hypothetical situation never arises because the hypothesis, that $n$ is even and odd, is impossible.
What I can do, however, is explain why I don’t want to try to find a different convention that would make this statement false. I don’t want to do that because it would force me to give up some general principles that I like. One of those I have already mentioned:
I hope you’ll agree that that looks highly reasonable, and we don’t want to start having ugly exceptions to it if we don’t have to.
Here’s another mathematical principle that I think you will also have to agree with.
Now let’s apply these two principles. I’m going to let $A$ be the property “is both even and odd” and I’m going to let $B$ be the property “equals 17”. Then the set of $n$ such that $A(n)$ is the empty set (since no $n$ is both even and odd). The set of $n$ such that $B(n)$ is the set $\{17\}$. Since the empty set is a subset of the set $\{17\}$, the first principle tells us that $A$ implies $B$.
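Here is a small Python sketch of that argument, with the positive integers cut down to a finite range so that the sets can actually be built. The names `set_A` and `set_B` are mine. It compares two ways of saying the same thing: the pointwise implication, and the inclusion of one set in the other.

```python
def implies(p: bool, q: bool) -> bool:
    """Truth-value 'if p then q': false only when p is true and q is false."""
    return (not p) or q


def A(n: int) -> bool:
    """The property 'is both even and odd'."""
    return n % 2 == 0 and n % 2 == 1


def B(n: int) -> bool:
    """The property 'equals 17'."""
    return n == 17


domain = range(1, 101)  # finite stand-in for the positive integers

set_A = {n for n in domain if A(n)}  # no n qualifies, so this is empty
set_B = {n for n in domain if B(n)}  # just the number 17

# Pointwise: A(n) implies B(n) for every n in the domain.
pointwise = all(implies(A(n), B(n)) for n in domain)

# Set-theoretic: the set for A is contained in the set for B.
subset = set_A <= set_B

print(set_A, set_B, pointwise, subset)
```

The two checks agree, and they would go on agreeing if `B` were replaced by any other property whatsoever, because the empty set is contained in every set.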
To summarize this discussion, the formal mathematical notion of implication is a bit strange, but most of the strangeness disappears if you just look at statements with parameters, which tend to be the statements we care about. Each such statement corresponds to a property of those parameters, and implication of properties is closer to our intuitive notion of one thing “making” another true than implication of statements. Even then there are one or two oddnesses, but these are a small price to pay for the cleanness and precision of the definition and for the fact that it allows us to hold on to some cherished general principles.
An exercise — not to be taken too seriously.
(i) Prove that Borsuk’s conjecture implies the Riemann hypothesis.
(ii) Comment on your proof.
Hint: if you find part (i) difficult, then you are not applying one of the pieces of general study advice I gave in the first post of this series.