This is a mathematical post rather than a Tricki post, but the title comes from the fact that the issue I want to discuss has arisen because I have made a statement in a Tricki article and can’t quite justify it to my satisfaction.

The article in question is about a useful piece of advice, which is that if one is having trouble proving a theorem, it can help a lot to try to prove rigorously that one’s approach cannot work. Sometimes one realizes as a result that it *can* work, but sometimes this exercise helps one to discover a feature of the problem that one has been wrongly overlooking. I have a few examples written up, but one of them is giving me second thoughts.

The example in question is this. Let be a subset of for some prime . Define a *gap* in *of length* to be a set of integers mod that contains no elements of . A nice (and well-known) question one can ask is the following. Let be of size at least . How small can one make the largest gap if one dilates ? That is, if we write for the size of the largest gap in a set , what is the largest possible value of over all sets of size at least ? (Here stands for the set .)

If you take a random set of size , then with high probability the largest gap in any dilate of will have size at most , where depends on only. After a bit of experimenting, it becomes natural to conjecture that this is the best possible bound. But is it?

This looks like a combinatorial question until one observes the following. Let be a prime of the form and let be the set of all non-zero quadratic residues mod . Then every dilate of is either the set of all non-zero quadratic residues or the set of all non-zero quadratic non-residues. Suppose it is the case that all of are quadratic residues mod . Then, since is a quadratic non-residue, all of are quadratic non-residues. Therefore, every dilate of has a gap of length . So the conjecture above will be false if there are infinitely many primes congruent to mod such that the first numbers are quadratic residues for some that tends to infinity faster than .

Suddenly we have a number-theoretic question to solve, and it turns out to be a well-known unsolved problem. Nobody has even ruled out that there is some such that for infinitely many primes all the numbers from to are quadratic residues mod . Even with the help of the Generalized Riemann Hypothesis this has been improved to rather than .

I want to claim that this example demonstrates that the original problem is “not really” a combinatorial problem. But to do that, I have to argue against someone whose reaction to it is, “Ah, this shows that an apparently number-theoretic problem is in fact a combinatorial one.” This is hard, because there *are* some very nice examples of number-theoretic statements that have been hugely generalized and then solved combinatorially. So what is it about this problem that makes it appear that purely combinatorial arguments are doomed to fail?

One relevant factor is that many combinatorial tools apply to large systems of sets. Initially, it looked as though the set of dilates of was a large set system, but the example of quadratic residues shows that it can in fact be very small. This fact does seem to lie behind the general feeling (or at least, I think it is a general feeling) that the problem cannot be solved combinatorially, but I would like a more general answer to the following question. Suppose you are trying to prove a theorem of the form “For every , .” Then suppose you come across a particular example of an for which you cannot prove . When is it clear that you will not be able to tackle the general theorem without first having a thorough understanding of why it is true for ?

Clearly the answer is not “always”. For example, might just be an example that is sufficiently complicated not to have any special features. It may then be slightly easier (for notational reasons, say) to think about , but thinking about isn’t interestingly different from thinking about the general case in such situations. One relevant feature of the example I have just discussed is that the appropriate tools for tackling the question about quadratic residues seem to be number-theoretic, which makes the special case *different* from the general case. Another sign of that is that one can easily imagine a solution to the question about quadratic residues not helping all that much to solve the general problem. But why isn’t that an argument for aiming at the general problem and ignoring the particular one?

Another feeling I have about this question, but can’t do proper justice to, relates to the discussion in the previous post. If the conjecture is true, then you can’t find a prime such that the first integers are quadratic residues for a very large . However, we can create something like a “probably inconsistent model” in which such a prime does exist, and then for any combinatorial argument about a general set , we can interpret it as referring to our fictitious that consists of quadratic residues modulo our strange prime . And in some sense our combinatorial arguments “never realize” that the model is inconsistent. Are there any reverse mathematicians out there who can make sense of this? Is it conceivable that there is some weak axiom system that is sufficient for combinatorial arguments, in which one might possibly not be able to prove that such a prime cannot exist?

January 20, 2009 at 3:07 pm |

It is known that there are infinitely many primes for which the smallest quadratic non-residue is at least c log p, for some positive c, and probably this can be ensured with the condition that p is 3 modulo 4. (This goes back to Chowla and Turan, and can be guessed by considering the values of the Legendre symbol (x/p) for small x, and checking that, essentially, they behave like independent Bernoulli random variables for x up to a small constant times log p). Moreover, on GRH, Montgomery has improved this to c(log p)( log log p), so the natural probabilistic guess is (probably) not true for this problem. (And this tends to confirm that the problem is not purely combinatorial, if we believe that the Riemann Hypothesis is not combinatorial).

January 20, 2009 at 3:11 pm |

Thanks for that — I rather suspected that my knowledge about this problem was incomplete. So perhaps my question should be slightly modified: why does it seem hopeless to find a combinatorial proof even of the weaker assertion that there is a dilate such that the gap has size at most ?

January 20, 2009 at 5:44 pm |

Well, from the point of view of classical analytic number theory, the $p^{o(1)}$ bound for the least quadratic non-residue (which is known in that area as a conjecture of Vinogradov) is considered out of reach at the moment because all “standard” attack methods depend on results which more or less directly involve properties of the L-functions which are considered extremely deep. (If not the Riemann Hypothesis, it can be the Lindelöf Hypothesis, or estimates for very short exponential sums).

Actually, that doesn’t mean there couldn’t be a clever combinatorial re-formulation that would lead to a proof of this conjecture. For instance, typically, the standard approaches lead to much more precise results than a bound just for the least quad. non-residue: they lead to equidistribution of residues and non-residues in very small sets; conversely, assuming conjectures on the size of this least quad. non-residue does not imply much about L-functions (as far as I know). So maybe it’s not “hopeless” to go around the L-functions, but at least this special case shows that the problem lies quite deep.

(Maybe an analogy could be drawn with the proof of RH for varieties over finite fields: Grothendieck’s “programme” for that was highly geometric, and maybe it was felt at some point that it was hopeless to look for an arithmetic proof, but in the end Deligne’s work did involve a lot a arithmetic — and a lot of geometry, but it didn’t involve proving first the intermediate results Grothendieck had in mind).

January 20, 2009 at 6:13 pm |

I know that when the sum-product phenomenon was discovered, and Jean Bourgain used it to show that thin sets of with multiplicative structure (e.g. multiplicative subgroups) expanded rapidly under addition to fill the whole group, there was a bit of excitement as to whether one could show the reverse direction (i.e. whether thin sets of with additive structure, such as an interval, expanded rapidly under multiplication to fill the whole multiplicative group), since this would in particular say something about the least quadratic non-residue (in particular, the analogue of Bourgain’s results would give a bound of ). But it turns out that the techniques are not symmetric (Bourgain crucially uses at a key point that multiplication distributes over addition, whereas addition does not distribute over multiplication), and besides, the ultimate ingredients used in the sum-product analysis are too “elementary”, and do not seem to point out any obvious inconsistency (or at least improbability) in an alternate universe in which there was a large p with a big interval of quadratic residues.

It seems to me that any proof of the least quadratic nonresidue problem must, at some point, introduce a new (or at least underused) fact about quadratic residues that was not fully exploited in the past, as it seems that having a large least quadratic nonresidue is perfectly compatible with the “standard” facts we know about residues, though there is no real formalisation of this that I know of. (For instance, my theorem with Ben that the primes contained arbitrarily long arithmetic progressions used extremely little number theory, but it still needed to know something moderately non-trivial about the primes, namely that (roughly speaking) they were a dense subset of a pseudorandom set of almost primes – a fact which had been implicitly used a couple times in the past, e.g. in papers of Ramare and of Green, but perhaps not as systematically as it should have been.)

January 20, 2009 at 7:26 pm |

Actually, I just read (I didn’t know it before…) that Graham and Ringrose have proved unconditionally that the least quad. non-residue is at least c(log p)(log log log p) for infinitely many primes, so the log p probabilistic estimate is indeed false.

January 20, 2009 at 10:58 pm |

For the general question—when it is clear that you need to understand a specific X before you can prove it for all X—I am tempted to say that the answer is, “Never.” There are certainly cases where it *seems* that you need to understand the particular case before you can hope to prove the general case, and the quadratic nonresidue problem may be an example. But is this *clearly* true? I’m skeptical. For comparison, note that many people felt that the implication “every elliptic curve is modular => Fermat’s Last Theorem” showed that it was hopeless to prove that every elliptic curve was modular until we understood enough number theory to prove Fermat’s Last Theorem. Strong intuitions can be wrong and different people can have different intuitions about these kinds of things.

The one exception to my “Never” would be a situation where you can easily deduce the general case from the special case.

January 21, 2009 at 7:56 am |

Certainly generalisation to a more abstract setting that “forgets” some of the original structure can be a powerful technique – though in order for such an argument to not be easily transferable back to the original concrete setting, the argument must somehow exploit a freedom in the abstract setting that is not easily available in the concrete special case. For example, one could imagine working on the gap problem by some sort of iterative procedure that starts with an arbitrary set and performs some operation to increase its min-max gap until it is extremised, at which point some useful structural information on that set is obtained. This is not an approach one would think of if one insisted only on working on the set of quadratic nonresidues. (Though it may end up that the quadratic nonresidues are already somehow the “worst” case, being so self-similar; one may even hope to formalise this with some sort of equivalence between the gap problem and the least quadratic nonresidue problem, though I don’t see how to achieve this. Curiously, Ben Green has recently had some thoughts along these lines, specifically that quadratic residues may somehow be the only obstruction to improvements to the large sieve inequality.)

It is of course quite difficult, as Timothy points out, to definitively rule out the utility of tackling a general case first before looking at the special case, because there may be ways to exploit the generality that one simply has not thought of yet. The one exception I can see is if the general case has an important “near-counterexample” that forces any proof of that general case to be very delicate, whilst the special case, being somehow “very far” from this counterexample, looks more amenable to cruder and simpler techniques. [Of course, if the general case has a

genuinecounterexample, then it becomes very easy to decide whether to spend any time trying to prove the general case. :-) ]January 21, 2009 at 5:04 pm |

Two small remarks. One is that it may well be that the conjecture suggested by the random case is also something like : although you can’t do better than if you want the gaps to be small with high probability, there is quite a bit of independence around, so it is conceivable that probabilistic methods could show that the bound occurs with positive (but small) probability. I don’t know how the GRH proof goes for the quadratic-residue problem, but my guess is that it is philosophically similar and does in fact show (i) that quadratic residues mod behave like random sets as varies and (ii) that for that very reason you expect from time to time.

As for the general case, I suspect that, as has been suggested, it may be hard to give a convincing reason for its being impossible to prove the theorem by purely combinatorial means. However, for any given purely combinatorial attack, it may be possible to use the quadratic-residues example to show that it does not work.

I am trying, but so far failing, to think of a problem where it definitely does happen that a single example kills off all hope of solving the general problem by general means. A first question then might be: if you can’t prove it by general means, then how

canyou prove it? A possible answer to that is that you first do some sort of classification and then deal with each class separately. So there might be examples in group theory, say, where it is hopeless to prove a theorem without using CSFG. But that would still leave the question of whether for any given problem there are clear signs that one has to use a classification. There are examples in group theory of classification-free proofs of theorems that had been thought to need classification, so perhaps the answer to this question is no.In the Tricki article, I wanted to find circumstances where it is

probablybetter to concentrate on the special case, so I’d settle for a reasonably convincing and generalizable argument that that is the case here, even if the argument didn’t establish it beyond all possible doubt, and even if one could think of exceptions to the general rule.January 21, 2009 at 8:59 pm |

Here’s an extreme example of interaction between particular and general case, but which may be useful: if there existed a general algorithmic way to decide whether any given diophantine equation has an integral solution, we know from the existence of a universal equation (a particular case!) that we would get a contradiction (i.e., this is the argument solving the Hilbert 10th Problem).

In the opposite direction, but still in this area, Robinson had shown that the general problem would be unsolvable, provided a single particular “exponential” relation could be encoded in polynomial form, and this had apparently been taken to mean that her approach was doomed to fail…(see the famous review MR0133227 (24 #A3061)).

January 21, 2009 at 9:00 pm |

P.S. I should have written “Davis, Putnam and Robinson”, not simply “Robinson”.

January 22, 2009 at 7:30 pm |

By the way, this may seem like a purely theoretical question, but the problem of how big the gaps can be is very closely related to the analysis of a very practical data structure in computer science, hash tables with linear probing. If one views the “gaps” as parts of the table that are filled in by hash table entries, and the dilation mod p as being a hash function, the gap length tells you what the longest access time can be.

See, e.g., [A. Pagh, R. Pagh, and M. Ruzic. Linear probing with constant independence. In Proc. 39th STOC, pages 318–327, 2007], which shows that hash functions with limited independence achieve good hashing behavior. Here “limited independence” is still rather more independence than you would get by dilating by a single randomly-chosen t, though.

January 24, 2009 at 12:55 am |

Dear Tim and others,

A couple of points. Firstly, there is something on lower bounds for this gap problem in the book of Konyagin and Shparlinski, but I forget exactly what they prove.

Secondly, the p^{o(1)} conjecture, even when delta = 1/2, implies the Kakeya conjecture! I wrote some notes on this a long time ago, and here they are:

http://www.dpmms.cam.ac.uk/~bjg23/restriction_kakeya_phenomena/notes10.ps

The connection is basically due to Bourgain from about 1994.

Tim’s “probably inconsistent model” is very close to the idea of doing analytic number theory in which one admits the possibility of a zero of some L-function (or, worse, a Siegel zero). Occasionally one can work through arguments keeping track of the contribution of the hypothetical zero and it can even help (cf. Heath-Brown’s paper proving twin primes on the assumption there is a very strong type of Siegel zero)

Best

Ben

January 28, 2009 at 2:09 am |

Minor issue: I think you may want p to be 3 mod 4 instead of 1 mod 4 in order to make this work.

[Thanks — I’ve changed it, and also changed Emmanuel Kowalski’s first comment.]February 20, 2009 at 4:44 am |

If we examine the random model in the case of quadratic non-residues, then the expected upper bound should be logp loglogp, the same as the lower bound proven by Montgomery conditional on GRH. The model goes something like the following. The values of the quadratic character is completely multiplicative, and hence determined by its values at the primes. If we assume that the character can be modeled by a random variable which takes the values 1 and -1 both with probability 1/2, then the upper bound x should satisfy 2^pi(x) is the same size as p, where pi(x) is the number of primes up to x. Assume pi(x) = x/log x, and solve for x. (Everything depends on x and p being “large”.)

November 20, 2009 at 11:23 am |

[…] be to start by finding a counterexample to the Littlewood conjecture … (This is an issue that I have discussed before on this blog. Sometimes combinatorial generalizations seem to be no easier to tackle than the […]