## How can one equivalent statement be stronger than another?

It’s been a long time since any mathematical content was posted on this blog. This is in part because I have been diverting my mathematical efforts more to the Tricki (not to mention my own research), and indeed the existence of the Tricki means that this blog will probably become less active, though I may publish some Tricki articles here as well. But there are certain quasi-philosophical questions that I want to discuss and that are better discussed here. I have already written about one of my favourites: when are two proofs essentially the same? Another that is closely related is the question of how it can be that two mathematical statements are equivalent to each other and yet one is clearly “stronger”. This phenomenon will be familiar to all mathematicians, but here is an example that illustrates it particularly well. (The example is also familiar as a good example of the phenomenon).Hall’s theorem, sometimes known as Hall’s marriage theorem, is the following result. Let $G$ be a bipartite graph with finite vertex sets $X$ and $Y$ of the same size. A perfect matching in $G$ is a bijection $f:X\rightarrow Y$ such that $x$ and $f(x)$ are neighbours for every $x\in X$. Given a subset $A\subset X$, the neighbourhood $\Gamma(A)$ of $A$ is the set of all vertices in $Y$ that are joined to at least one vertex in $A$. A trivial necessary condition for the existence of a perfect matching is that for every subset $A\subset X$ the neighbourhood $\Gamma(A)$ is at least as big as $A$, since it must contain $f(A)$, which has the same size as $A$. This condition is called Hall’s condition.

What is not trivial at all is that Hall’s condition is also sufficient for the existence of a perfect matching. Here’s one of the shortest proofs. Let $G$ be a bipartite graph with vertex sets $X$ and $Y$ of size $n$ and suppose by induction that the result is true for all bipartite graphs with smaller vertex sets. Now we split into two cases. If there is a proper subset $A\subset X$ with $|\Gamma(A)|=|A|$, then by induction we can find a perfect matching from $A$ to $\Gamma(A)$. Let $B$ be the complement of $A$ in $X$. Then any subset $C\subset B$ has at least $|C|$ neighbours in $Y\setminus\Gamma(A)$, or else $C\cup A$ would have fewer than $|C\cup A|$ neighbours. So we can find a matching from $B$ to $Y\setminus\Gamma(A)$ as well. Putting the two together gives a perfect matching of $X$ and $Y$

Now suppose that $|\Gamma(A)|>|A|$ for every proper subset $A\subset X$. In that case we can pick an arbitrary $x\in X$ and let $f(x)$ be an arbitrary neighbour of $X$. Let $X'=X\setminus\{x\}$ and let $Y'=Y\setminus\{f(x)\}$. Then since we have removed just one vertex from $Y$, the restriction of $G$ to the vertex sets $X'$ and $Y'$ satisfies Hall’s condition, so by induction we have a perfect matching of $X'$ and $Y'$ and again we are done.

There can be no doubt that one of the directions of this equivalence is trivial compared with the other. And we are also tempted to say that the condition that $G$ has a perfect matching is “stronger” than Hall’s condition, since it trivially implies it. So Hall’s theorem has the flavour of obtaining a strong conclusion from a weak hypothesis. But how can this be if the hypothesis and conclusion are equivalent?

Before thinking about how to answer this question, it is perhaps a good idea to think about why exactly it seems mysterious, if indeed it does. This would be my suggestion. Normally when we say that condition P is stronger than condition Q we mean that P has consequences that Q does not have, or, more simply, that some objects satisfy Q without satisfying P. For example, the condition that a natural number is an odd prime is stronger than the condition that it is odd. But here we find ourselves wanting to say that one graph property is stronger than another even though the two properties have precisely the same consequences and pick out precisely the same set of graphs.

The solution to this little puzzle seems to be that it depends on a confusion between actual logical consequences and what we can easily perceive to be logical consequences. Or at least, that’s one suggestion for a solution, though I’m not sure I’m entirely satisfied by it. It does at least cover the case of Hall’s theorem: we could say that the existence of a perfect matching is psychologically stronger than Hall’s condition, since it trivially implies, and is not trivially implied by, Hall’s condition.

But I’d like to say something more “objective” somehow, and not tied to what people happen to find easy. To see what I might mean by this, consider another difference between the two conditions. If you were asked to check whether some bipartite graph $G$ satisfied Hall’s condition, what would you do? You could check the sizes of the neighbourhoods one by one, but there are exponentially many of them, so this would not be an appealing prospect. Alternatively, you could use a well-known polynomial-time algorithm that finds perfect matchings when they exist. This would clearly be much better. Is there some precise sense in which an efficient check that a graph $G$ satisfies Hall’s condition “has to” find a perfect matching?

Another thought is that Hall’s theorem is false for infinite graphs. For instance, if you let $X$ and $Y$ both be $\mathbb{N}$ and join $1$ in $X$ to all of $\mathbb{N}$, and all other $n$ in $X$ to $n-1$, then Hall’s condition is satisfied but there is no perfect matching. On the other hand, the existence of a perfect matching still trivially implies Hall’s condition. So we can generalize the two conditions in a natural way and we find that one is now stronger than the other in the precise formal sense. (It seems unlikely that we can do something like this systematically for all examples where one direction of an equivalence is trivial and the other hard. But it would be interesting to come up with other equivalences where it can be done.)

Since I’m throwing out questions here, I may as well mention the fact that is rarely talked about but is surely noticed by almost all mathematicians, which is that the notion of equivalence itself is used in an informal way. Why don’t we say that Fermat’s last theorem is equivalent to the four-colour theorem? After all, here’s a proof that FLT implies 4CT (in ZFC): obviously FLT+ZFC implies ZFC, and we know that ZFC implies 4CT. And similarly in the opposite direction. An answer to this question takes us back to ideas that were discussed in some of the responses to the post on sameness of proofs: one would like to say that when mathematicians talk of one statement implying another, the implication should not be “homotopic” to a silly implication such as the one just given of 4CT from FLT. (I feel some kind of structural property should be involved and not mere length of proof, but that is just a feeling that I can’t back up in any adequate way.)

One final remark is that questions of this kind show just how incomplete a picture of mathematics was provided by some philosophers early in the twentieth century, who held that it was merely a giant collection of tautologies. The way that tautologies relate to each other is fascinating and important, so even if one believes that mathematics consists of tautologies, the “merely” is unacceptable. It should be said that this particular view of mathematics went out of fashion fairly soon after it came into fashion, so it doesn’t really need attacking, but it would be interesting to develop this particular line of attack. Indeed, I’m fairly sure that theoretical computer scientists can say something precise about how Boolean tautologies relate to each other: that could perhaps provide a good toy model for “sensible” implications and “non-obvious” directions of equivalences.

### 48 Responses to “How can one equivalent statement be stronger than another?”

1. Terence Tao Says:

Another reason why the “difficult” implication in Hall’s theorem is non-trivial is that it is non-functorial: there is no canonical way to obtain the perfect matching from a bipartite graph obeying Hall’s condition which respects isomorphisms (i.e. relabeling of the graph). Instead, one must make many arbitrary choices. Thus, it largely falls outside the range of “abstract nonsense”. (The fact that the theorem fails in the infinite case is perhaps a symptom of this non-functoriality; the inability to generalise the result to a more abstract setting indicates that the implication must depend sensitively on all of its hypotheses, and so cannot be replaced by overly “soft” techniques that are insensitive to one or more of these hypotheses.)

[Perhaps one could sum up the two cultures of mathematics succinctly as “mathematics which generalises” and “mathematics which does not generalise”? :-)]

I could imagine another way to “certify” non-triviality of this sort of implication: suppose one had an example of a “low-complexity” graph for which Hall’s condition could be verified in some “low-complexity” or “easy” manner, but for which the only perfect matchings in the graph were somehow “high-complexity”. This would be evidence that the implication is doing something non-trivial. [The Szemeredi regularity lemma, as you well know, is a good example of this – any result whose conclusion has a complexity that is tower-exponential in the inputs can’t be all that trivial. 😉 For a similar reason, the Behrend example illustrates the non-triviality of the tautology that is Roth’s theorem. ].

I recently encountered an example of this in my work with Vitali Bergelson and Tamar Ziegler. Let F be a finite field of characteristic p, and let $P: F^n \to {\Bbb R}/{\Bbb Z}$ be a polynomial of degree d (i.e. the d+1^th derivatives of this polynomial vanish, or equivalently that $\|e(P)\|_{U^{d+1}(F^n)}=1$). At some point we needed a $p^{th}$ root of P, i.e. a function Q such that pQ=P. We wanted Q to be a polynomial as low a degree as possible. By breaking up P into monomials, and taking roots of each monomial separately, it turns out that one can make Q of degree d+p-1, which is best possible (I wrote about this in this blog post of mine). But the operation of breaking up into monomials is computationally intensive, in particular it cannot be used in any sort of constant-time probabilistic algorithm that would locally reconstruct Q from P in time uniform in n. [If one does not care about getting the minimal degree on Q, but would be happy with some bound on the degree of Q, there is a very cheap and local algorithm to obtain Q, namely to write Q = f(P) where f is some arbitrary branch of the $p^{th}$ root function.]

In fact, it appears that no such algorithm to obtain a minimal-degree root exists, because the ergodic theory analogue of the statement fails. Namely, given an action of $F^\omega$ on a probability space X, there exists polynomials P of degree d on X (where polynomiality is defined with respect to this action) which do not have a $p^{th}$ root Q of degree d+p-1 in X, which roughly speaking means that there is no low degree root Q of P which can be created out of a finite number of shifts of P. On the other hand, the monomial argument does eventually let one build such a low degree root of Q in an extension Y of the system X; thus these roots do exist in some sufficiently abstract sense, they are just not “measurable” enough to be definable within X.

2. Terence Tao Says:

Another example in a similar spirit came up in work I did with Tim Austin on property testing and repair in hypergraphs. Thanks to the hypergraph regularity and counting lemmas, many hypergraph properties P are known to be locally testable, which roughly speaking means that if a randomly chosen medium-size subhypergraph of a hypergraph G has a very high chance of obeying P, then it is possible to modify G slightly to another hypergraph G’ which also obeys P. For some properties P (e.g. the triangle-free property for graphs), the proof of local testability actually gives an efficient (probabilistic) algorithm for locally constructing G’ in terms of G, and so P is not just locally testable, but also locally correctable (or locally repairable). But Tim and I found some hypergraph properties which were locally testable but not locally repairable, thus indicating that the local testability result is in some sense non-trivial. The simplest example is that of total order: a directed graph is said to obey P if it is the graph of a total ordering of its vertices. It is not hard to see that P is locally testable, but it turns out that one cannot repair a partially corrupted total ordering on n vertices by purely local means; one has to make a large number (linear in n) of arbitrary choices before one can even begin to set down the corrected ordering.

3. Terence Tao Says:

As regards why we don’t view FLT as implying 4CT in any interesting way, perhaps it may help to see how robust the implication is with respect to perturbations. Consider a perturbation or generalisation FLT’ of FLT which we do not currently know to be true. Would having a proof of FLT’ be likely to help us make progress on some analogous perturbation or generalisation 4CT’ of 4CT that we also do not currently know to be true? Probably not – so the implication is non-robust. In contrast, most “useful” implications in mathematics tend to have at least some robustness with respect to at least some kinds of perturbations. (Indeed, us mathematicians spend a significant fraction of our research time exploring this sort of robustness.)

4. Andrej Bauer Says:

When speaking about equivalence of statements P and Q we need to be careful about the base theory. The stronger the base theory, the fewer distinctions can be made (as was already noted in the case of base theory ZFC and the statements FLT and 4CT). A familiar example is equivalence of various formulations of the axiom of choice, where it is evident that the base theory ought to be ZF rather than ZFC.

Now in the case at hand we should first identify a base theory in which it makes sense to talk about (non)-coincidence of graphs with bipartite matchins and graph satisfying Hall’s condition. But such a theory must be very, very weak, far weaker than Peano arithmetic. The book to look in for this sort of information is Stephen Simpson’s “Subsystems of Second Order Arithmetic”. If one condition really is stronger than the other, I would expect there to be a reasonable base theory which proves only one direction of the equivalence.

5. Terence Tao Says:

I should probably yield the floor after this comment, but: here is perhaps one proposal for defining the strength of an implication: a statement X strongly implies Y if one can use X to significantly shorten the proof of Y, or (if Y is false) one can use the falsity of Y to significantly shorten the demonstration of the falsity of X. (Thus, for instance, FLT does not strongly imply 4CT, indeed the proof of 4CT from FLT+ZFC is precisely one line longer than the proof of 4CT from ZFC.)

In practice, most theorems are not implications $X \implies Y$, but rather families of implications: “For all s, $X(s) \implies Y(s)$“. (For instance, in Hall’s theorem, s would be a bipartite graph, X(s) would be Hall’s condition applied to s, and Y(s) would be the claim that s has a perfect matching.) One can then generalise the previous definition by averaging over the types of s which are likely to come up in applications in which the implication could conceivably be useful. Since Hall’s theorem does significantly shorten the proof of existence of perfect matchings of many graphs in applications, it is a useful application. On the other hand, the converse Hall’s theorem rarely shortens the proof of expansion of a bipartite graph by much (indeed, since the proof of the converse is only a line or two, it can only shorten proofs by at most that amount.)

6. gowers Says:

I agree that one of the principal reasons for being interested in equivalences is that if $P\equiv Q$ and you want to prove $R\Rightarrow S$, then it’s enough to prove that $R\Rightarrow P$ and $Q\Rightarrow S$. And if $P$ is the “weak” half of the equivalence, then it will be easier to prove $R\Rightarrow P$ than $R\Rightarrow Q$, and also easier to prove $Q\Rightarrow S$ than $P\Rightarrow S$. But fleshing out the details of what you say in the second paragraph of your last comment is not all that easy I think: it might at first sight seem to be saying little more than that the proof in one direction is quite a bit longer than the proof in the other. But I have the impression that you’re being careful to say a bit more when you talk about averaging over graphs where one might conceivably want to find a perfect matching.

It feels to me as though the main thing, however, is not so much the length of the proof as a written-out object but the number of ideas it contains (which one could think of as a different notion of length). The trivial direction of Hall’s theorem is trivial because it needs no ideas, and the non-trivial direction is non-trivial because it needs at least one idea. A question provoked by this is whether anyone knows of two equivalent statements $P$ and $Q$ where there is a short but necessarily very ingenious proof that $P\Rightarrow Q$ and a trivial but necessarily quite long proof that $Q\Rightarrow P$. It would be interesting to see what one’s impression was of the relative strengths of such a $P$ and $Q$. I would have thought that abstract algebra would be the most promising place to look for such an equivalence if there is one.

7. Daniel Moskovich Says:

This is really your first “objective” condition, but Vaughan Jones proposed a measuring the strength of a mathematical statement a while back… I don’t know a reference, and many people have their own versions (Dror Bar-Natan for example), but from lowest to highest it runs:
0.5) Result which hints at a theory
1) A theory
2) Existence of a solution
3) Algorithm exists to find the solution (e.g. Hall’s condition)
4) Practical (e.g. polynomial-time) algorithm exists to find the solution (e.g. the condition of being bipartite)
5) The closed-form solution itself.

Thus, despite logical equivalence between the condition and implication of the theorem, which is true for any theorem, one has here an advance from a statement of strength 3 to a statement of strength 4.

8. Terence Tao Says:

Dear Tim,

As regards your question of an equivalence which is short but ingenious in one direction, and trivial but long in another, here is a somewhat contrived example: for every integer n between 1 and 560, n is prime iff $a^{n-1} = 1 \hbox{ mod } n$ for every $1 \leq a < n$ coprime to n. The “only if” direction follows from Fermat’s little theorem, which is simple but does require at least one non-trivial trick; but the “if” direction basically requires a tedious check of hundreds of cases. (561, of course, is the first Carmichael number.)

9. gowers Says:

That’s a nice example that forces me to modify the question a bit. If I examine my not-thinking-too-hard reaction to the two statements “$n$ is prime” and “$a^{n-1}=1$ mod $n$ for every $1\leq a coprime to $n$” with it understood that $n$ is an integer between 1 and 560, then I feel as though it is a stronger statement that $n$ is prime, since that one implies the second statement (via Fermat’s little theorem) while the second doesn’t really imply the first in the sensible sense because if you’re given an integer and told that it is a Carmichael number that doesn’t imply that it’s prime. So it seems to imply it in the FLT-implies-4CT sense. (I don’t rule out that there’s some test for primality that is shortened if you’re already given that your number is a Carmichael number — if that is the case then I’d have to modify my remarks somewhat.)

Now if I examine my response in more detail, I see that it comes down to something like this. I have a bit of a weakness for radical finitism, so I feel a bit doubtful about whether a proof that’s a tedious finite check can really count as trivial. (For example, is it a trivial fact that there’s no projective plane of order 10?) That feels too much like saying that Goldbach’s conjecture is trivial because God could just check it for every even integer. So I’ll admit that you’ve found something that satisfies the criterion I put forward (a long proof that requires no idea), but what I really meant was a proof that was trivial in the sense of giving a trivial explanation for why the statement was true. I should start with a weaker question: is there a proof that’s (i) explanatory (this is supposed to rule out long case checks) (ii) trivial (in that every step is completely obvious) and (iii) long? Then the second question would be whether there is such a proof for a statement that has a true converse that is not trivial but that can be proved quickly using a clever idea.

10. Gil Kalai Says:

Hello all,

here is an example which is somewhat in the spirit of Hall’s theorem but is much simpler: Suppose that A B, C, and D are finite and pairwise disjoint sets. Suppose that f is a bijection from A to B.

It is true that there is a bijection g from C to D if and only if there is a bijection h from A $\cup$ C to B $\cup$ D.

Now the proof that h exists given that g exists is very easy. And the proof that g exists if h exits is considerably harder.

11. Terence Tao Says:

Hmm, perhaps the finitary proofs that come out of proof mining an infinitary proof might qualify for the revised question? For instance, suppose one managed to prove (say) Rado’s theorem by an infinitary argument (e.g. topological dynamics). General proof theory then tells us that one should be able to translate that proof into a purely finitary one by working through each step one at a time and carefully disentangling all the infinitary quantifiers. (Infinitely large -> sufficiently large, and so forth.) Such a translation is basically trivial, still captures the explanatory power of the original – but will be tediously long.

One direction of Rado’s theorem is fairly straightforward (but perhaps requires one small idea), and can be done finitarily, whereas the finitisation of a simple infinitary proof of the hard direction of Rado’s theorem is likely to be long, trivial, and technically explanatory…

12. Gil Kalai Says:

13. gowers Says:

I have a couple of worries about that as an example. The first is that I’m not sure that there’s any trivial, as opposed to simple, proof of Rado’s theorem, even if that proof is allowed to be infinitary. But perhaps one could cheat slightly and make everything relative to an infinitary version. I.e., one could add “and the infinitary version of the hard direction of Rado’s theorem holds” to the two finitary statements that are equivalent. But what would one’s perception of the strengths of the two statements be in this relative context? I’m not sure. I’m also slightly worried about the mixing up of metamathematics with mathematics: it might mean that the finitary proof is metatrivial rather than trivial.

The sort of thing I was looking for was more like this. I can’t instantly think of an example, though I think something will occur to me at some point, but it is a common feeling when one is trying to write up a result, that some stage of the proof is trivial even though it might look rather complicated technically (and might even appear to the uninitiated to contain ideas). Or another possibility is something I remember from lecturing the Bott periodicity theorem a long time ago. The proof involves showing that a cycle of six maps is exact, and I found some of the steps of the proofs very hard to remember, even though I was convinced that to someone more steeped in that kind of mathematics they would have seemed trivial, in the sense that each step would be pretty well the only thing one could do.

14. Gil Kalai Says:

Dear all, a few words of explanation to the example I gave. Recall that A B C and D are finite and pairwise disjoint. We are given a bujection f from A to C.

Now, given a bijection g from B to D we can show the existence of a bijection h from A $\cup$ B to B $\cup D$ simply by defining h(x) = f(x) if x $\in$ A and h(x)=g(x) if x $\in B$. This of course wirks also when the sets are infinite.

Now given a bijection h from A $\cup$ B to C $\cup D$ we can prove that |B|=|D|. But this, while automatic for us, is conceptually much harder, requires the definition of the cardinality of a finite set, and basic properties of arithmetic operations. It also does not give us an explicit (or cannonical) function g.

Write f* for f$^{-1}$. There is a beautiful bijective construction of g given f and h which (I think) goes back to works of Dominique Foata in bujective combinatorics. Starting from an element x in B consider the sequence h(x), f*(h(x)), h(f*(h(x))), f*(h(f*(h(x)))),… which you continue untill reaching an element y in D. Then define y=g(x).

(You can see some similarity with the alternating path algorithm for perfect matching.)

15. Gil Says:

Regarding Hall’s theorem it is also interesting to examine the relations between the statements:

(1) For every subset A of X, A has at least as many neighbors as its size.

and

(2) For every subset B of Y, B has at least as many neighbors as its size.

By Hall’s theorem these two statements are equivalent. (We assumed that |X|=|Y|.) And clearly, they are equally strong. But while the implications between them are not artificial as some of the examples in the post I am not aware of a direct combinatorial implication.

There is a linear algebra context where (1) refers to a statement about the row-rank of a square matrix (2) refers to a statement about the column-rank and the existence of perfect matching follows from the fact that a regular matrix has a nonzero determinant. (To see this consider a X by Y natrix with zero entry for x and y which are non-neighbors and a generic variable entry if x and y are neighbors.) In this context the equivalence of (1) and (2) can be proven directly but passing from the combinatorial setting to the linear algebra formulation requires some work. ( This is related to Rado’s theorem :))

Finally, Tim wrote
“One final remark is that questions of this kind show just how incomplete a picture of mathematics was provided by some philosophers early in the twentieth century, who held that it was merely a giant collection of tautologies. The way that tautologies relate to each other is fascinating and important, so even if one believes that mathematics consists of tautologies, the “merely” is unacceptable.”

Well, I do not know what precise statements by philosopheres are referred to and also what is meant by the word “mathematics” in “a picture of mathematics.” Still one can claim that the giant collections of tautologies contains also all the finer information regarding the “fascinating and important” way tautologies relate to each other.

16. Terence Tao Says:

Hmm… I think in order to make the adjective “trivial”, well, non-trivial, one has to assume a certain base of knowledge that one can be trivial relative to. Very little in mathematics is trivial just relative to the axioms of ZFC, but quite a bit is trivial relative to a decent graduate education in mathematics (or, in what is (very) roughly equivalent, 3000 years of progress in mathematics). For instance, there are many “trivial” proofs of a statement X that look like this:

1. Hey, X looks like a job for Zorn’s lemma / Hahn-Banach theorem / Baire category theorem / diagram chasing / Szemeredi regularity lemma / greedy algorithm / random algorithm / Cauchy-Schwarz / [insert favourite tool here].

2. Set up all the objects required for your tool.

3. Verify all the hypotheses required for your tool.

4. Apply tool.

5. Use the conclusion of the tool to conclude X.

If one accepts the statement of the tool, and the general conditions under which the tool is known to be useful, as trivial, then the whole proof often becomes trivial and requires no additional ideas – but the actual verification of the remaining “routine” steps may actually be rather lengthy. (How often does one really check every commutative square or exact sequence of a large commutative diagram?)

17. gowers Says:

Gil, the answer to your last (implied) question is that I was referring to the logical positivists. They famously held that a statement that couldn’t be empirically verified was meaningless (to oversimplify a bit, since there were various versions of the verification principle, some less extreme than this). However, this left them with a problem, since they certainly didn’t want to condemn mathematics as meaningless, but it also can’t be empirically verified. (One might try to make a case that some of it can, but that would be a difficult one and it wouldn’t apply to the vast bulk of mathematics.) To get round this, they proposed the idea that mathematics consists of tautologies. I don’t know enough about them to know whether they came up with detailed theories about how these tautologies came to be meaningful. The tautologies idea was I think attributed to the early Wittgenstein.

Terry, perhaps this is turning into a discussion about what “trivial” means, and if it is then I need to think a bit. My instant reaction to the type of argument you talk about is that I’d use a word like “routine” rather than “trivial”. For example, the proof of the triangle-removal lemma is trivial in your sense, once you have the hint that the regularity lemma should be used, and if you are familiar with a typical use of the lemma, since once you apply it and do very standard things like removing sparse and irregular pairs, the verification is more or less instant. But I would describe that by saying that it has become routine to those who are familiar with arguments of that general type, rather than that it is trivial. It certainly feels very different from the easy direction of Hall’s theorem. (Of course, the long but trivial proof whose existence I am wondering about would probably also feel somewhat different from the easy direction of Hall’s theorem.) Routineness is obviously a relative concept, but I feel that triviality is less so: with some theorems there really does seem to be only one sensible step to take at each stage and taking sensible steps gets you from the premises to the conclusion. (The notion of “sensible” is relative to experience I suppose, but somehow it feels less dependent on it than routineness.)

18. luca Says:

To return to the issue of the 4CT being equivalent to FLT because they are both true (while intuitively they are “independent” statements), this is an issue that has come up in theoretical computer science when comparing the strength of different assumptions/conjectures.

For example, Impagliazzo and Rudich considered in 1989 the question of whether the existence of secure public-key encryption could be equivalent to the existence of one-way functions (which in turn is equivalent, via several non-trivial theorems, to the existence of secure private-key encryption). They wanted to provide a negative answer, but the problem with even stating what a negative answer could be is that we believe that both one-way functions and public-key cryptosystems exist, in which case the statements about their existence are equivalent because they are both true.

Their approach, roughly speaking, was to observe that implication results in cryptography are usually proved via reductions that establish stronger results as well, and that the corresponding stronger result for the implication “one way functions imply public key encryption” was false (or required a proof of $P\neq NP$, depending on the model of reductions used).

This is in the same spirit of Terry’s argument to distinguish the two directions of Hall’s theorem by noting that one direction does not generalize to the infinite case (and so it is not provable via techniques that automatically generalize to the infinite case), but the other direction does.

19. Gil Kalai Says:

Thanks a lot, Tim. I suppose regarding mathematics as a large collection of tautologies is a a reasonable abstraction and not the most daring or controversial (or strange) thing about logical positivism. (Or about Wittgenstein.) Perhaps regarding math as a large collection of tautologies is like regarding chess as merely a 2-player zero sum game with complete information.

As for logical positivism, this amazing movement that for many decades dominated philosophy and intellectual life, wanted to base mathematics ,science, and philosophy itself on logic, rejected mataphysics, rejected, as you said, everything not empirically based as meaningless, including, “beauty”, “justice” “ethics” and other central issues in human thinking and philosophy, (at some later time philosophers, perhaps Oxford-based, did try to draw a line between “scientific” and “meaningful”), tried to develop logical probability-calculus that will eventually allow to give a probability (like truth value) to every meaningfull statement, and promoted many other unbelievable ambitious and unrealistic goals, did not really had a chance to triumph, and today even philosophy came back to the old problems. Yet, logical positivism through its totally unrealistic goals achieved so much!

20. gowers Says:

Gil, it sounds as though you have some sympathy for logical positivism, so let me confess that, despite the tone of my remarks above, so do I. In particular, some people have held that logical positivism is instantly defeated by the fact that the verification principle itself is not empirically verifiable. But as a mathematician I find that refutation as unconvincing as an argument that mathematics doesn’t work because we don’t prove that proofs are valid. And it seems to me that a weak version of the verification principle that doesn’t talk about meaning but merely says that there is an important distinction between statements that are empirically testable and statements that are not, and that this distinction coincides fairly well (if not perfectly — think of string theory) with what we would like to think of as the distinction between scientific and nonscientific statements, is correct and useful. Where I (like most people) part company with logical positivists is in their view that we should try to reduce all statements to very low-level ones, just as we can in principle do in mathematics, and thereby establish their true status. The later Wittgenstein mocks this view very nicely by asking, “Then does someone who says that the broom is in the corner really mean: the broomstick is there, and so is the brush, and the broomstick is fixed in the brush?” I regard that as a one-sentence demolition of one of the major strands of logical positivism.

Returning to 4CT and FLT (after reading Luca’s comment) I wonder if in maths the notion of implication will be quite hard to formalize, even informally, so to speak. One quite convincing argument that 4CT doesn’t imply FLT is that if you imagine a parallel earth that’s almost identical to ours, but in which Appel and Haken didn’t prove 4CT (and neither did anyone else), then you certainly don’t think, “Well, in that case Wiles probably wouldn’t have proved FLT either.” In other words, knowing 4CT doesn’t give you any advantage if you want to prove FLT. But that establishes a stronger conclusion: not only does 4CT not imply FLT, it doesn’t even help to imply it. This suggests to me that there could be a grey area. Indeed, I think there is. For example, does Ken Ribet’s work imply FLT? It certainly helped Wiles to prove it. But somehow we feel that Wiles put in a very significant extra input. But suppose, contrary to fact, that Wiles’s extra input had consisted of one ingenious but easily understood observation, after which FLT had dropped straight out. Then we would probably have said that Ribet’s work implied FLT. So it seems that this notion of implication actually gives us a spectrum that ranges from complete irrelevance all the way to trivial implication.

21. Timothy Chow Says:

Andrej’s suggestion of looking to reverse mathematics (a la Simpson’s book) is appealing; certainly if one can define a very weak logical theory over which A implies B but B does not imply A, then we have a very satisfying formalization of “A is stronger than B.” But as I mentioned in my comment to “When are two proofs essentially the same?” it is not clear that reverse mathematics will always give us the desired answer to Tim’s question: Many people feel that Brouwer’s fixed-point theorem and Sperner’s lemma are “equivalent,” yet the former is stronger than the latter in the sense that Brouwer is not provable in RCA_0 while Sperner is. This is another illustration of the general principle that one should not confuse logical strength with psychological difficulty.

By the way, regarding the non-canonicality of the perfect matching in Hall’s theorem: There’s a very nice and not-well-known result by Blass, Gurevich, and Shelah (“On polynomial time computation over unordered structures, J. Symbolic Logic 67 (2002), 1093-1125) that shows that the non-canonicality isn’t as bad as you might think. To state the result, let’s define the “saturation” of a bipartite graph with (equal-sized) parts A and B as follows. First partition A and B into disjoint subsets {A_i} and {B_j} so that for every i and j, every vertex in A_i has the same number of neighbors in B_j, and every vertex in B_j has the same number of neighbors in A_i. (This partition can be found by a so-called “stable coloring algorithm,” and in particular is “canonical” in a certain sense.) The “saturation” of the graph is obtained by joining *every* vertex in A_i to *every* vertex in B_j as long as there is at least one edge from A_i to B_j.

Theorem (BGS). (A, B) has a perfect matching iff its saturation does.

Now checking whether the saturation has a perfect matching still requires the usual algorithm, but the point is that “picking” an A_i or a B_j can be done canonically because the A_i and B_j are canonically defined. Thus the non-canonicality is limited to picking an arbitrary vertex from each A_i or B_j. If this sounds like a subtle distinction, let me just say that the BGS theorem is the key to proving that perfect matchability of bipartite graphs is computable in something called “choiceless polynomial time with counting,” which I won’t define here but which roughly speaking corresponds to the class of graph properties computable in polynomial time without having to assign arbitrary labels to the vertices.

A very nice open problem is whether the perfect matchability of an arbitrary (non-bipartite) graph is in choiceless polynomial time with counting.

22. Andrej Bauer Says:

I asked about reverse mathematics of Hall’s theorem on the Foundations of Mathematics mailing list. Stephen Cook explained how things stand with regards to Hall’s theorem, see http://www.cs.nyu.edu/pipermail/fom/2008-December/013256.html

Perhaps it is worth pointing out what logicians are trying to accomplish with the plethora of logical systems (Stephen Cook manages to list six systems in his short post: AC0, VTC0, TC0, VP, S-1-2, NC2): they capture various computational/proof complexity classes and reasoning principles, i.e., they are mathematical manifestations of ideas such as those appearing in this discussion, e.g., “any proof of X involves a counting argument”, or “any proof of Y not using X must be long”.

23. luca Says:

Perhaps a way to formalize Tim’s comments on 4CT versus FLT is to think of associating a measure of “difficulty” to any proof, and then saying that “A implies B” if there is a proof of A=>B which is significantly less difficult than any (known) proof of B. (This does give a spectrum of possibilities.)

As a first approximation, one could take “difficulty” to be “length,” but it has been remarked above that there are many examples in which a lengthy proof can be simple and a short proof can be hard.

Suppose, however, that B is only known to have a routine but lengthy proof, and that someone discovers a clever and non-trivial trick giving a short proof that A=>B; it may be incorrect to say that A=>B has a “less difficult” proof than B, but certainly the short proof of A=>B shows that there is some relation between A and B. (So even the naive choice of formalizing difficulty as length does not lead to completely meaningless conclusions.)

By the way, even results in reverse mathematics can be framed this way, if one defines “difficulty” not as a quantitative measure but as a the strength of the adopted proof system.

24. Questões matemático-filosóficas profundas « problemas | teoremas Says:

[…] no seu último post (como pode  um enunciado ser mais forte do que outro que lhe é equivalente?, How can one equivalent statement be stronger than another?, no original). Esta questão vem no seguimento de outras classificadas na categoria somewhat […]

25. Benjamin Says:

Suppose I have an “elaborate” set of mathematics (perhaps with many axioms and theorems) which I’ll call P. And suppose that the following can be demonstrated from P:

1) P=>(R=>Q)

Now also suppose that the “ideas” in R are contained within the ideas in Q s.t.:

2) Q=>R – A somewhat trivial example might be that Q is simply a composite statement of (R & S).

So given system P, Q and R are equivalent. But it happens that I can derive statements with Q that I cannot derive with R. (Using my trivial case above, I can derive S from Q but I cannot derive S from R.) Hence, Q is “stronger” than R.

The possibility that Q is equivalent to R, yet stronger than R, is contingent on the fact that Q and R are embedded within a system P, at least as far as this example is concerned. Separated from that system, Q and R are simply not equivalent.

26. Carrie Jenkins Says:

Hi Tim,

The purely psychological differences are important; I think that they are really all that’s responsible for the persistant interest in what’s now known as the Church-Fitch paradox (aka the paradox of knowability or Fitch’s paradox).

But I wanted to suggest a way of thinking about the (first) kind of more objective difference you’re looking for. The thought I have is to take seriously the temptation you feel to use modal language (“Is there some precise sense in which an efficient check that a graph satisfies Hall’s condition “has to” find a perfect matching?”) and apply an idea from the philosophy of modality.

This is the idea of varying worlds whilst ‘holding fixed’ certain facts. Normally this is done with possible worlds but to apply the idea to mathematics we’d need to use impossible worlds too. The idea would be that as we hold fixed facts which are particularly mathematically important and *vary* other mathematical facts, we get a range of non-actual worlds in which one of the two equivalent statements always implies the other but not vice versa. This could give you a sense in which the ‘stronger’ one has more ‘strength’ than the ‘weaker’ one: the ‘stronger’ one keeps its power in other salient (impossible) worlds where the mathematical facts are different. Whereas the ‘weaker’ one loses its power in some of those worlds.

27. gowers Says:

Hi Carrie, That’s an interesting suggestion, but I’ve been unable to get it to go anywhere. Let me explain the difficulty I have. If I understand you correctly, what one would like is something like a “plausible mathematical world” that is in fact contradictory, but for which the contradiction doesn’t jump straight out at you. And one would like to devise such a world in which the trivial direction of Hall’s theorem is true (whatever that means when the world contains a contradiction, but let’s go for something like “easily seen to be true”) and the nontrivial direction is false.

My first problem is that I can devise such a world in an uninteresting way, simply by adding to ZF the statement “there exists a graph that satisfies Hall’s condition but that does not contain a perfect matching”. Such a world is plausible in the sense that the contradiction doesn’t jump out, but in order to specify it I haven’t done anything interesting. So one needs to decide what counts as an interesting construction of an impossible mathematical world.

My second problem arose from an attempt to deal with the first. Here is a possible reason for regarding the easy direction of Hall’s theorem as trivial: it’s that we can give a proof in a high-level language using little more than easy syllogisms. Such a proof would go something like this. (I think I’ve made it needlessly complicated actually, but that doesn’t affect the point I want to make.)

Bijections preserve cardinality.

The restriction of a bijection to a subset is a bijection.

Therefore, the restriction of a bijection to a subset preserves the cardinality of that subset.

A perfect matching is a bijection such that the image of any element is a neighbour of that element.

Given any subset, its neighbourhood contains the images of all its elements.

Therefore, given any subset, its neighbourhood contains the image of that subset.

By the earlier remark, the image of that subset has the same size as the subset.

Therefore, given any subset, its neighbourhood contains a set of the same size as that subset.

If a set contains another set, then the first set is at least as big as the second.

Therefore, given any subset, its neighbourhood is at least as big as the subset.

Anyhow, suppose I could carry out some analysis like that, and argue on the basis of it that the easy direction of Hall’s theorem was trivial because it all took place at a superficial level. I could then perhaps define a “plausible mathematical world” as being one where all superficial deductions were valid, or something like that. It might be necessary to restrict the statements that one was even allowed to consider, to avoid the objection that any proof can be broken down into steps that are superficial. Or one might wish to argue that “trivially implies” is not a transitive relation between statements. Again, these are problems, but they are not the main difficulty, it seems to me. The main problem is that if I could give an analysis that did justice to our idea of what counts as a trivial deduction, then it’s not clear what would be added by talk of plausible mathematical worlds: I could just define P to be stronger than Q if P trivially implies Q and Q does not trivially imply P.

Both my main difficulties can be summed up as follows: how would one get the notion of a plausible mathematical world to do any work? I’m not saying that I think it couldn’t, but just that I don’t at the moment see how it could.

28. Timothy Chow Says:

The possible-worlds point of view is, very roughly speaking, the same as what I and others have been calling the reverse-mathematics point of view. I think it is best not to think of “impossible worlds” or worlds that are in fact contradictory. Instead, think in terms of “nonstandard worlds.”

In the case of Hall’s theorem, Cook and Nguyen’s work means that we’re in good shape. Take the very weak logical theory VTC_0. In this theory we can *state* Hall’s theorem, and in the standard model of VTC_0, Hall’s theorem means what we think it means. However, only one direction of Hall’s theorem is known to be *provable* in VTC_0. The provability of the other direction in VTC_0 is an open problem, but let’s suppose that it’s not provable. This means that there is some nonstandard model of VTC_0 in which one direction of Hall’s theorem holds but the other doesn’t. In this nonstandard model, “Hall’s theorem” doesn’t quite mean what we normally think of it as meaning, because we’re interpreting the words slightly differently from usual. Nevertheless, to the extent that VTC_0 captures a natural subset of correct mathematical reasoning about finite combinatorics, this gives a precise and natural meaning to the statement that one side of Hall’s theorem is stronger than the other.

It’s worth noting that Cook and Nguyen’s logical systems are very closely related to standard computational complexity classes. Thus, “stronger” in this context means something like “harder to compute.” Of course, the informal notion of “easy” doesn’t *exactly* coincide with “easy to compute,” but low computational complexity is at least one reasonably plausible way to interpret “easy.”

29. Jaime Montuerto Says:

Dear Prof Gowers,

I have this question that is beyond me but might just be appropriate now. The FLT’s equation:

a^n + b^n = c^n and gcd(c^n – 1, a^n + b^n -1) = 1

is similar. I mean the second equation implies the first one. Can you show or give counterexample of positive ODD factor on the second equation? What’s interesting about the second equation is that it is based on Fermat’s Little theorem. The question is if the second equation is true then the first one is true. But we all knew that the first one is true. It doesn’t automatically implies the second one to be true.

Thanks

30. Carrie Jenkins Says:

Hi Tim,

Suppose we can specify in other terms (i.e. without making reference to what we happen to find obvious) a set of necessary and sufficient conditions for a deduction to count as ‘trivial’. I agree it’s not straightforward but let’s suppose. We then use it to specify the relevant class of worlds.

The point is that even if the new specification of the relevant worlds ends up delivering exactly the same worlds (same extension) as if you had just said ‘holding fixed the trivial deductions but varying other things’, provided you have specified it in other terms (different hyperintension) you have given an account of the sense in which one statement is stronger than the other which does not make reference to what we happen to find plausible/obvious, and could therefore reasonably be counted as objective.

Here’s an analogy. Suppose I like all and only green things. And someone is wondering if there is an objective difference between these things and all the others. Obviously a non-objective difference is that I only like some of them. But then we notice that everything I like is green, and everything else isn’t. Then we can say that there is an objective colour difference between the two groups. The factor we have identified marks an objective difference although it coincides with something non-objective (my preferences).

31. gowers Says:

Tim, the responses to my original post have left me very curious to find out more about reverse mathematics, in which I was already interested in fact, though from a safe distance. For instance, I was recently working on a problem with a colleague, David Conlon, and at one point we managed to get ourselves unstuck in a sort of reverse-mathematics way: we needed a conclusion Q to be true, and there was a seemingly obvious way of proving it. But we couldn’t get the hypotheses P that we needed to carry out this proof. But then, by focusing on the converse direction, we observed (i) that Q did not imply P and (ii) that there was a different, and simple, set of hypotheses P’ that was equivalent to Q and that we could actually obtain. That’s not supposed to be an addition to the fascinating work on reverse mathematics, but rather an indication that the reverse point of view is, as its proponents say, of interest in “real” mathematics and not just to logicians.

Returning to your comment, I am of course much happier with the idea of nonstandard models than I am with inconsistent models, and very interested and impressed that there is such a beautifully precise answer to the question I asked about Hall’s theorem. At first, the idea of an inconsistent model seems itself to be inconsistent (after all, the existence of a model surely guarantees consistency). But I suppose there might conceivably be a way of carrying out Carrie’s suggestion. First, one would need to define some notion like an “imperfect model” for some axioms. The axioms wouldn’t be true in the model, but neither would they be blatantly false. If that could be done, it’s conceivable that it might be easier to construct such a model than to show syntactically that an implication was not trivial.

32. Joel Says:

Hi Tim,
The following is, I think, far less interesting than the set of examples discussed above … but no-one appears to have mentioned a possible distinction between “stronger” and “strictly stronger” conditions. This problem actually appears all over the place. After all, mathematicians cannot even agree whether the notation $A \subset B$ means that $A$ is a strict subset of $B$ or whether it means the same thing as $A \subseteq B$!
For topologies I use the terms “stronger” and “weaker” in the non-strict sense, but with some misgivings. For example, one of my favourite results of elementary point-set topology is the fact that whenever a compact topology is stronger than a Hausdorff topology, then the two topologies are, in fact, equal … in which case it seems strange to use the term stronger.
I suppose that when comparing conditions, “stronger” is easier to demonstrate than “strictly stronger”, which may well depend in any case on the setting you are working in (as demonstrated above).
Joel

33. Pete Says:

Dear Gil,

with respect to your example –

I’d claim that the definition of ‘finite’ is not totally trivial. Of course it is necessary: if one takes an infinite set A, a set B of the same cardinality, a set C with one element, and D the empty set, then there exist a bijection f from A to B (by definition) and g from A union C to B union D (via well-ordering) but of course not h from C to D.

So, then, here is an easy proof, using Peano and ZF (but of course not C).

Let r be an enumeration of B union D, i.e. a bijection from B union D to [k] for some integer k (I see no other definition of ‘finite’ that doesn’t involve a non-trivial proof to make sense of the definition).

Let s be an enumeration of C.

Consider the set r(h(A union C)-f(A)). This is a set of integers; let q(j) be the j-th integer of this set in increasing order (which can be defined by functions via summation of an indicator).

Let g(c)=r^{-1}(q(s(c))) for c in C.

If g fails to be well-defined, then this can only be because q fails to be well-defined, which it does not (Peano arithmetic).

It is clear that g is injective and maps onto D; to see that it is surjective is again Peano arithmetic.

I think the fact that g(b) as given here is actually a bijection requires about as much thought as the fact that the iterative procedure you give actually terminates – that is, both fail if there are inappropriate infinite quantities floating around, and for both proving that they succeed requires some idea of ‘finite’, which in turn requires Peano arithmetic to make sense. The difference is I think the proof I give is ‘obvious’ in the sense that it’s the first thing you’d try if you didn’t care for elegance and were told that the word ‘finite’ in the statement is necessary: which for me means the proof is trivial.

34. Mark Bennet Says:

Perhaps another example, which seems to me to be similar, might help thinking about this. Try Wedderburn’s Theorem that every finite skew field is commutative, which means that the two properties ‘being a finite skew field’ and ‘being a finite field’ are equivalent.

Some observations

* Being a skew field can be explicitly made part of the definition of being a field – this is not normally pointed out (or if it is, it is quickly forgotten) as one tends to learn about fields first and skew fields later. One could do something similar with Hall’s theorem defining matched graphs and paired graphs (or whatever, with fairly obvious definitions) and saying ‘every finite matched graph is a paired graph’. How natural would it be to say that a paired graph is a matched graph + another condition? – I don’t think it quite works the same.

* With Wedderburn there are some fairly obvious finite and infinite examples to hand: with Hall, the language of marriage and the way I learned graph theory both give me a psychological bias to thinking of finite cases first.

* I quite like what Benjamin posted above, which raises the question what properties beyond those explicit in the definition of Q are needed to prove R?

* Are there any natural/obvious examples of this phenomenon which don’t depend on a finiteness condition?

35. Timothy Chow Says:

Here’s a comment that I meant to make earlier but forgot. There is a somewhat related question that I brought up for discussion on the Foundations of Mathematics (FOM) mailing list some years ago, which is what sense we can make of statements such as, “The Riemann hypothesis might be true.” Surely Yoda (of Star Wars fame) would object, “True, or true not. There is no `might'”? That is, either RH is true in all possible worlds or false in all possible worlds; it’s not that RH is true in some possible worlds and RH is false in some possible worlds.

One suggestion that came out of the FOM discussion was that work in “dynamic epistemic logic” is relevant. That is, “RH might be true” is interpreted as “We don’t know that RH is false,” and so we formalize “RH might be true” by formalizing the concept of knowledge. Note that knowledge can change over time even if mathematical facts can’t; hence the adjective “dynamic.”

I haven’t invested the effort to understand the details of dynamic epistemic logic so I’m not sure how much the theory has to offer, but it seems that it might be relevant to the current discussion too. As has already been pointed out, the term “easy to prove” seems to have strong psychological overtones, and so maybe “easy proof” can be plausibly formalized along the same lines that “knowledge” can be formalized.

36. Joel Says:

Here is another example of the original phenomenon. It is one of those well-known coffee-table exercises whose origins I do not know, and where no-one should spoil anyone else’s fun. I offer my second-year undergraduates a prize for the first correct solution each year.

Let $G$ be a semigroup. Consider the following three conditions that $G$ might satisfy:
(a) $G$ is a group;
(b) for all $g$ in $G$ there is a unique $g^*$ in $G$ such that $g g^* g = g$;
(c) for all $g$ in $G$ there is a unique $g^*$ in $G$ such that $g g^* g = g$ and ${g^* } g {g^*} = {g^*}$.

The main exercise is to show that (a) is equivalent to (b). The implication (a) implies (b) is essentially trivial, but (b) implies (a) is much more fun.

A superficial comparison of (b) and (c) suggests that (c) might be stronger than (b), but in fact it is strictly weaker: (c) defines an inverse semigroup.

Joel

37. Cahit Says:

I like to comment on the triviality and non-triviality when we compare strength of the two equivalent statements. In another blog (http://tbgloops.blogspot.com/2008/08/definetrivial.html )
triviality has been discussed along with the proof of the 4CT. The conclusion was that “Intelligence is the ability to create any non-trivial thing”. I think the reverse “Non-triviality is the ability to create any intelligent thing” is also reasonable.

38. GG Says:

There are various ways to start from a statement A and get another statement B. One can, for example, deduct B from A, negate A to get B, add quantifiers to A to get B. Here, we are studying the interaction of statements with respect to deduction. However, it may be interesting to consider another form of ‘interaction’ between statements. Specifically I was thinking about dualisation. Indeed, many statements have natural dual statements. I often found that to better understand a statement, it is revealing to consider its dual statement. Also, at least empirically, dual statements are often true.

So suppose one has two equivalent statements A and B (with respect to deduction), with A having a natural dual statement, but not B. Then one could say A is stronger than B with respect to dualisation. Indeed, A is more prone than B to “producing revealing and often true statements”. This is very vague and based on an intuition. I’ll try to find an example illustrating this.

39. Gil Kalai Says:

Dear all, Here is another example of a somewhat different nature.

Consider a finite POSET (partially ordered set) X.

Dilworth’s theorem asserts that X can be covered by chains whose number is the size of the largest antichainin X.

A much easier theorem asserts that X can be covered by antichains whose number is the size of the largest chain in X.

(A chain is a set of elements so that every two are comparable. An antichain is a set of elements so that every two (distinct) are incomparable.)

Now, these two theorems are equivalent because of a Theorem by Lovasz which asserts that the complement of a perfect graph is also perfect.

A (finite) graph G is perfect if for every induced subgraph H of G, H can be covered by $\alpha(H)$ cliques (=complete subgraphs) where $\alpha (H)$ is the independence number of H.

Starting from a post X we can desribe two graphs: the comperability graph where x and y are adjecent if x>y or y>x, and (its complement) the uncomparability graph where x and y are adjacent if neither x>y nor y>x. Lovasz’s theorem gives an equivalent between two theorems one which is easy and another which is harder.

In this example we are talking about equivalence of absolute statements (and not equivalence of properties of graphs that sometimes hold and sometimes not like the perfect matching example), so it is more similar to the equivalence between theorems like at the end of the post. But here the equivalence is “genuine” in some sense.

40. Joel Says:

I was also thinking of Dilworth’s theorem, but in a way closer to the Hall matching theorem setting. It is trivial that you need at least as many chains as the size of the largest antichain, but the reverse inequality is non-trivial.
For a given positive integer k, you can consider the two properties the poset might or might not have:
(a) at least k chains are needed to cover the poset;
(b) there is an antichain in the poset with at least k elements.
Dilworth’s theorem tells us that (a) and (b) are equivalent. However, the fact that (b) implies (a) is trivial, while (a) implies (b) is non-trivial.
Joel

41. Greg Marks Says:

Perhaps the reason it “feels strange” to observe that the Four Color Theorem is implied by Fermat’s Last Theorem (or its negation) is that human beings seem predisposed to seek out causality in the external world, so when confronted with the statement “P implies Q” our instinct is to feel that P has “caused” Q to be true. I think examples like P = Fermat’s Last Theorem and Q = Four Color Theorem bring this false intuition to the fore.

42. Peter Shor Says:

I have a little story about strength of equivalent statements. In 2003, I proved that four additivity conjectures in quantum information were equivalent. Let’s call these A, B, C, and D1. Four of these implications are essentially trivial, with one- or two-line proofs. These are A&rightarrow;B, A&rightarrowC;, B&rightarrowD;, and C&rightarrow;D. To get implications in the oppositie direction, you have to do a moderate amount of work.
Earlier this year, Matt Hastings discovered a very clever counterexample to the additivity conjecture. Obviously, since A is the strongest of these equivalent conjectures, you would expect that the counterexample was to conjecture A. But no, the counterexample was actually to weakest conjecture D. Why? Probably because D is the simplest of these equivalent conjectures, as well as being the weakest, so counterexamples to conejcture D are easiest to analyze.

1For those who want to keep track, A is known as strong superadditivity of entanglement of formation, B is additivity of entanglement of formation, C is additivity of classical capacity of a quantum channel, and D is additivity of minimum entropy output of a quantum channel. These names are irrelevant to the point of the story.

43. Jack Says:

This is so funny, I know it’s a “dead thread”, but I was thinking about this the other day and came across this post. By the way I am only an undergraduate. From my point of view it is no coinkydink that the example comes from (finite) combinatorics. The fact that the existence of a perfect matching implies Hall’s condition is trivial, the fact that Hall’s condition implies the existence of a perfect matching is nontrivial. Quite objectively (see next sentence). The former does not require induction to prove, but the latter does. Or that’s what I think it comes down to, anyway.

This is why you could say something like, Dilworth’s theorem is a special case of the perfect graph theorem in graph theory, and in turn Hall’s theorem is a special case of Dilworth’s theorem. You don’t need to induct to prove these implications, they sort of “spill out”, in not too many lines. I honestly think that if you tried to reverse these implications, you would end up using induction, and that is what makes these theorems stronger. I am particularly sure about the impossibility of getting from Dilworth’s theorem to the perfect graph theorem, in such a way.

I think this fits the sense I get that, for example, proving Dilworth’s theorem by induction is doing the donkey work, and deriving Hall’s theorem from Dilworth’s theorem is just fiddling. Fiddling fiddling fiddling.

44. Jack Says:

I seem to be making another post. I am sure my original post contains plenty of gaps, anyhow.

The “requires induction” idea is surely not rigorous, whatever that means, but on the other hand, I am convinced that it could be made so, to such a degree that rigor actually has a value.

But somehow I don’t think “ZFC” is going to be the slightest bit of help. For one thing, ZFC concerns the whole of mathematics, whereas what I am thinking about concerns only (finite) combinatorics [it probably extends to combinatorial set theory in some limited form, or whatever].

As such I have no idea how one might, even if you wanted to, come up with a new, general pan-mathematical meaning to the word “equivalent” that would be more wise.

But I think it is important that implications can “just seem wrong”. Earlier (while googling what lead me to this thread) I read that Hall’s theorem implies Dilworth’s theorem and the max-flow-min-cut theorem. These implications, though logically valid, just seem wrong somehow, and seem even more wrong when viewed through the spectacles of induction.

45. Jack Says:

Back to Hall’s theorem, I can see how you could derive this theorem, without induction, from Konig’s theorem that min vertex cover equals max matching in a bipartite graph. But I am not sure how one could do the converse.

46. Basic logic — relationships between statements — converses and contrapositives « Gowers's Weblog Says:

[…] the second, while the second is not so easy to deduce from the first. I discussed this phenomenon in a blog post a few years ago, which provoked a number of very interesting […]

47. Austin Arlitt Says:

“At first, the idea of an inconsistent model seems itself to be inconsistent (after all, the existence of a model surely guarantees consistency). But I suppose there might conceivably be a way of carrying out Carrie’s suggestion. First, one would need to define some notion like an “imperfect model” for some axioms. The axioms wouldn’t be true in the model, but neither would they be blatantly false. If that could be done, it’s conceivable that it might be easier to construct such a model than to show syntactically that an implication was not trivial.”

I might be misunderstanding your intent here, but here’s one “inconsistent” system you can construct trivially. (It is, of course, only trivial in the sense that after you know a powerful theorem it is obvious, with the theorem here being incompleteness.)

One can construct a provability predicate for ZFC. Call it F(x).
Then both ZFC + F(0=1) and ZFC + ~F(0=1) are consistent if ZFC is.

The second system trivially tacks on another statement, which happens to be true, but the first one adds a false axiom. This axiom cannot “interact” with the arithmetic component of ZFC, so no formal inconsistency arises.

I’ll just leave this thought here for posterity’s sake. After all, I stumbled onto this thread.

48. treeowl Says:

I know I’m a few years late to the party, but another nice example is the Cantor-Schröder-Bernstein theorem. In one direction it follows immediately from basic definitions. In the other direction, surprisingly, it’s possible to prove it from a fairly limited subset of ZF axioms (it seems to be provable without foundation, power set, or infinity, although doing it with neither power set nor infinity is tricky). In that sense, it partially breaks the idea that the context needed to prove the theorem adds to its “strength”.