I realized after writing the title of this post that it might look as though I was saying, “I’m going to discuss connectives … not!” Well, that’s not what I meant, since “not” is a connective and I’m about to discuss it.
If you don’t know how to negate a mathematical statement, then you won’t be able to do serious mathematics. It’s as simple as that. So how does the mathematical meaning of the word “not” differ from the ordinary meaning? To get an idea, let’s consider the following sentences.
“We are not amused.”
“He is not a happy man.”
“That was not a very clever thing to do.”
“ is not a perfect square.”
“ is an element of the set , but it is not the largest element of ”
“ is not a subset of .”
When Queen Victoria said, “We are not amused,” it is clear that what she meant was she was distinctly unamused, and not merely that she had failed to laugh. Similarly, if I say, “He is not a happy man,” I will usually mean that he is positively unhappy rather than neutral on the contentment scale. And if I say, “That was not a very clever thing to do,” I am saying, in a polite British way, that it was a stupid thing to do. (I could perhaps avoid that interpretation by stressing the word “very”. For example, if someone had made a good but reasonably standard move in chess and I knew enough about chess to tell that — which I don’t — then I might say, “That was not a very clever thing to do — but it was pretty good, so well done.”)
In all the examples above, the word “not” takes us from one end of some scale to the other: from amused to unamused, from happy to unhappy, from clever to stupid. In mathematics, the word “not” does not have this sense. If P is a mathematical statement, then “not P” is the mathematical statement that is true precisely when P is false. That is, if P is true, then “not P” is false, and if P is false, then “not P” is true. So if you want to understand a statement of the form “not P” then you should think to yourself, “What are the exact circumstances that need to hold for P to be false?”
In the case of the statement “n is not a perfect square” this is completely straightforward. We don’t have a notion of an utterly imperfect square, so there is no possibility of misinterpretation. We just mean that it is not the case that n is a perfect square. But take the statement “ is not the largest element of the set ” We have been told that is an element of If we want to show that is not the largest element of what do we have to establish? Do we need to show that is the smallest element of ? No. All we need to do is establish that it is not the case that is the largest element of The usual way to set out that objective would be to formulate it as the following statement.
If that statement is true, then is not the largest element of And if that statement is false, then there is no element of that is larger than and since is an element of that tells us that is the largest element of
The third mathematical statement above was “ is not a subset of .” Let me give two mistaken interpretations of that statement.
This is mistaken because it is perfectly possible for to fail to be a subset of without being a subset of For example, could be the set and could be the set
To work out what it means for to fail to be a subset of let us write out more carefully what it means to say that is a subset of . The usual definition (written out in a slightly wordy way because I’m still trying to avoid symbols) is this.
By the way, I should mention here a convention that you need to know about. When mathematicians give definitions, they tend to use the word “if” where “if and only if” might seem more appropriate. For example, I might write this: “an integer is even if there exists an integer such that ” You might argue that this definition says nothing about what happens if no such integer exists. Might 13 be even too? No, is the answer, because I am defining something and the convention is that you should simply understand that the words “and not otherwise” are implicit in what I’ve said, or equivalently that the “if” is really “if and only if”.
OK, if “A is a subset of B” can be translated into “Every element of is also an element of ” then how should we translate “A is not a subset of B”? This brings me to the second incorrect answer.
This makes the going-to-the-opposite-extreme mistake. Don’t do that. Faced with a statement of the form “not P”, think to yourself, “What needs to be true for P to be false?” Applied to this case, we ask, what needs to be true for “Every element of is an element of to be false?” The answer is that at least one element of should fail to be an element of . Or as mathematicians might normally write it,
Or more formally still,
As I hope you can guess if you didn’t know already, the symbol means “is not an element of”. As I also hope you can guess, putting a line through a symbol usually has the force of a not. You are probably already familiar with the symbol , which means “does not equal”.
There is a pair of logical principles known as de Morgan’s laws that you may well just take for granted, but that it is probably good to be consciously aware of. They concern what happens when you negate a statement that involves an “and” or an “or”. Let me illustrate them with some depressing scenes from my everyday life.
Once a year I have to renew the tax disc on my car. If I didn’t do that I would have to pay a fine, but doing it without a fuss involves being a bit more organized than I normally have it in me to be, so I find myself facing a last-minute panic. What is difficult about it? Well, I have to bring along my insurance and MOT certificates, as well as a form I am sent and a means of payment. (For non-UK readers, MOT stands for “Ministry of Transport” but it also means a test of roadworthiness that you have to have carried out once a year if your car is over three years old, which mine very definitely is.) I take all those along to the post office and can buy the new tax disc there.
Now suppose I were to arrive back from an expedition to the post office, the aim of which had been to get a new tax disc, and say to my wife, “I can’t believe it. I thought I had everything I needed, but I didn’t, and now I’m going to have to make another trip. Damn.” What could she deduce? Well, in order for me to have everything I needed, the following statement would have had to be true.
So from my failure to get a tax disc, she could deduce that the above statement was false. But what does it take for a statement like that to be false? It just takes one little slip-up. So she could deduce the following statement.
What happens here is that NOT turns AND into OR. Or to be a bit clearer about it, the statement “not (P and Q)” is the same as the statement “(not P) or (not Q)”. In the tax-disc example we had four statements linked by “and”, so the rule was a generalization of the basic de Morgan law, which told us that “not (P and Q and R and S)” was the same as “(not P) or (not Q) or (not R) or (not S)”.
As you might guess, the other de Morgan law is that NOT changes OR into AND. Suppose we vary the scenario above slightly. This time I am trying to open a bank account and I need some ID. I end up having to go back home and I say to my wife, “Damn, I didn’t manage to open the account, because they said that the only ID they would accept was a passport or a driving licence with a photo on it.” What could she deduce this time? Well, in order to open the account, I needed the following statement to be true.
What does it mean for that statement to be false? It means that I failed on both counts. In other words, this happened.
The more abstract rule here is that “not (P or Q)” is the same as “(not P) and (not Q)”.
A very general and possibly helpful rule of thumb applies to negation, including to several of the examples above. It’s this.
Negating something strong results in something weak, and negating something weak results in something strong.
For instance, suppose you are given two statements, which we’ll call P and Q. Then the statement “P and Q” is quite strong because it tells you that both the statements P and Q are true. By contrast, the statement “P or Q” is fairly weak because all it tells you is that one or other of P and Q is true and you don’t know which. Why do I use the words “strong” and “weak”? Well, it is easier for “P or Q” to be true than it is for “P and Q” to be true. That means that if “P and Q” is true, then I am getting a lot of information, whereas if “P or Q” is true I am getting less information. If you still don’t find that intuitively clear, then consider the statements
The first statement tells us that which is an extremely strong piece of information — we get to know exactly what number is. The second statement merely tells us that is one of the numbers 2,3,4,5,6,7,8,10,11,12,13,14,16,17,18,19,20,22,… which is giving us much less information. So “strong” basically means “tells us a lot”.
The rule of thumb is us that negating something strong — that is, pretty informative — gives us something weak — that is, not very informative — and vice versa. Therefore, de Morgan’s laws
are exactly what you would expect. The first law negates the strong statement “P and Q” and gets a weak statement “(not P) or (not Q)” and the second law negates a weak statement and gets a strong statement.
For another example, consider the statement
That, I hope you will agree, is strong information: it describes a most unusual state of affairs. So we would expect that negating it produces a very weak statement. And indeed, if I were to say,
you might well give me a funny look and ask, “Was there some reason that you expected it to be?” Note that the rule of thumb gives us a quick way of seeing that the negation of “This room is INCREDIBLY INCREDIBLY HOT” is not “This room is INCREDIBLY INCREDIBLY COLD.” After all, the second statement is also very strong, and we do not expect the negation of a strong statement to be strong.
I haven’t mentioned all the basic logical laws that concern AND, OR and NOT. One important one is the rule that two NOTs cancel. If I say, “It is not the case that A is not a subset of B” then I mean that A is a subset of B. In general, “not (not P)” is the same as P.
We can actually use that to deduce the second de Morgan law from the first. Here they are again.
To deduce the second, let me begin by applying the first to “not P” and “not Q”. (What I’m doing here is just like what you are allowed to do with an identity: I am substituting in a value. In this case, the first statement holds for any statements P and Q, so I am substituting in “not P”, which is a perfectly good statement, for P and “not Q” for Q.) I get this.
I’ve decided to dispense with some brackets here. Again just as with equations, there are conventions about what to do first when you don’t see brackets. And the convention is “do all your NOTs first”. So here “not P and not Q” means “(not P) and (not Q)”. It does not mean “not (P and not Q)”.
Using the rule that two NOTs cancel, we can simplify the above law to this.
Now we can “apply NOT to both sides” (just as with equations, if two things are the same and you do the same to both then you end up with two things that are still the same). We get this.
And finally, using once again the fact that two NOTs cancel, we get to the second de Morgan law.
A philosophical digression.
It would annoy some people if I left the discussion here, because some mathematicians feel strongly, and to many other mathematicians puzzlingly, that two NOTs do not cancel. That is, they maintain that “not (not P)” is not the same statement as P. That is because these mathematicians do not believe in the law of the excluded middle. If you believe that every statement is either true or false, then you can define (as I did) “not P” to be the statement that is true precisely when P is false and false precisely when P is true. But what if a statement doesn’t have to be true or false?
I don’t recommend worrying about this, but let me try to explain why mathematicians who ask this kind of question are not mad (or not necessarily, at any rate). There are at least three reasons that one might decide, in certain contexts, that not every statement has to be true or false.
The first is when we are dealing with statements that are not completely precise. Let me illustrate this with a few ordinary English sentences.
Do you want to say of each of those sentences that there must be a fact of the matter as to whether they are true or not (even if we might not know which)? Sometimes we can be pretty sure that Tony Blair is happy, but what about when he had just got up this morning and was cleaning his teeth? Was it definitely the case that one of the two statements, “Tony Blair is happy” or “Tony Blair is not happy” was true and the other false? (Here I’m interpreting the second statement as “It is not the case that Tony Blair is happy”.) A more reasonable attitude would surely be to say that being happy is a rather complex and not entirely precisely defined state of mind, so there is a bit of a grey area.
If you concede that there is this grey area, then how should you interpret the following sentence?
Or to put it more concisely, “not not (Tony Blair is happy)”.
When P is a vague sentence like “Tony Blair is happy” then it seems to me that a reasonable interpretation of “not P” is that it is sufficiently clear that P is false for it to be possible to state that confidently. Under this interpretation, “It is not the case that Tony Blair is happy” means that he is sufficiently clearly not happy for it to be possible to say so with confidence. Then “It is not the case that it is not the case that Tony Blair is happy,” means something like “It is clear that we cannot be clear that Tony Blair is not happy,” which is not the same as saying “It is clear that Tony Blair is happy.” When we say “Tony Blair is happy” we are ruling out the grey area, but when we say “not not(Tony Blair is happy)” we are allowing it. (Why? Because if Tony Blair’s mood is not clear to us, then we clearly cannot say with confidence that he is not happy, and therefore not not(Tony Blair is happy).) Perhaps if you asked him whether he was happy he would give a small sigh and say, “Well, I’m not unhappy.”
Yes, you might say, but the great thing about mathematics is that it eliminates vagueness. So surely the above considerations are simply irrelevant to mathematics.
That is by and large true, so let us consider a second type of statement.
These are both famous unsolved problems. So we don’t know whether they are true or false. Worse still, Gödel has shown us that it is at least conceivable that one of these statements (or another like it) cannot be proved or disproved. (That’s a bit of an oversimplification of what Gödel’s theorem says, for which I apologize to anyone it irritates.) So if we don’t have a proof, or even any certainty that there is a proof, what gives us such a huge confidence that these statements must have a determinate truth value? What does it mean? The problem now is not vagueness, but rather the lack of any accepted way of deciding whether or not the statement is true.
Suppose, for example, that we try to argue that even if we don’t know whether is irrational, there must nevertheless be a fact of the matter one way or the other. We might say something like this: either there are two integers p and q sitting out there such that or there aren’t. In principle we could just look through all the pairs of integers and check whether they equal Either we would find a pair that worked, or we wouldn’t.
Hmm … what is this “in principle” doing here? We live in a finite universe, so we can’t just look through infinitely many pairs of integers. So what happens in the actual universe? We find that at any one time the best we can hope for is to have looked through just finitely many pairs What can we conclude if none of the corresponding fractions is equal to ? Precisely nothing about whether is irrational.
Note, incidentally, that another “infinite algorithm” for solving the problem would simply be to work out the entire decimal expansion of and then go back and see whether it is a recurring decimal or not. Again, we can’t do this algorithm in practice because we live in a finite universe.
As a result of considerations like these, some mathematicians do not agree that a statement like “ is irrational” must have a determinate truth value. So again we have a grey area, but this time the reason is not vagueness but rather the lack of a proof.
I should also make clear that most mathematicians (I think) do believe that there must be a fact of the matter one way or the other, regardless of what we can prove. I myself don’t, but I am in the minority there.
A third reason for abandoning the idea that every statement must be either true or false is to insist on stricter standards for what counts as true. If you would like some idea of what I mean by this, I refer you to the excellent comment by Andrej Bauer below, and also to the Wikipedia article on intuitionism.
[This section was rewritten in response to criticisms from Andrej Bauer and Michael Hudson-Doyle below. There is no reason to set it in stone at any point, so further criticisms are welcome if you have them.]
Generalizing de Morgan’s laws.
If you feel like doing a little exercise, you could try using de Morgan’s laws together with the associativity of addition to deduce that
If you manage it, then that is a good sign: you are probably at, or well on the way to, the level of understanding of and fluency with “and”, “or” and “not” that you need to do undergraduate mathematics.
SPOILER ALERT — I’M ABOUT TO GIVE A SMALL HINT, SO IF YOU DON’T WANT IT THEN SKIP THE NEXT SENTENCE, WHICH IS IN FACT THE LAST SENTENCE OF THIS POST.
Hint: You should begin by putting the brackets back in and writing “P and (Q and R)” instead of “P and Q and R”.