EDP25 — third guest post by Gil Kalai

The Polynomial Method

The polynomial method is another basic combinatorial technique that occasionally works. One way to describe the method is as a way to translate a combinatorial statement into the vanishing of a certain polynomial modulo p.

A demonstration of the method

Theorem: Every graph (or hypergraph) G with n vertices and 2n+1 edges contains a nontrivial subgraph H with all vertex-degrees divisible by 3.

(This is a theorem of Noga Alon, Shmuel Friedland, and me from 1984.)

Before the proof: If we want to get a subgraph with all vertex degrees even then we need n edges (or n+1 edges for hypergraphs). This has a simple linear algebra proof which also gives an efficient algorithm.

From-scratch proof sketch: Associate with every edge e of the graph a variable x_e. Consider the two polynomials

P= \prod_{v \in V(G)} ((\sum_{e: v \in e}x^2_e)^2-1), and

Q=\prod_{e\in E(G)}(x_e^2-1)

If the theorem is false then P-Q=0, as polynomials over the field with three elements. This is impossible since P is a polynomial of degree 4n while Q is a polynomial which has a monomial of degree 4n+2 with nonzero coefficient.

The theorem follows more directly from a theorem of Chevalley-Warning and even more directly from a theorem of Olson, but the above proof serves best our purpose.

Remarks about the polynomial method:

1) The polynomial method has many applications but only in specific cases. It is not nearly as widely applicable as, say, the probabilistic method.

2) Good basic references: A. Blokhuis, Polynomials in finite geometries and combinatorics. In Keith Walker, editor, Surveys in Combinatorics, 1993, pages 35-52. Cambridge University Press, 1993.

Noga Alon, Combinatorial Nullstellensatz, Combinatorics, Probability and Computing 8 (1999), 7-29.

3) The polynomial method is related to the “linear algebra method” in combinatorics. Often, however, direct linear algebraic proofs lead to efficient algorithms while this is not known for applications of the polynomial method. For example, no polynomial algorithm to find the graph H in the above theorem is known, and there is a related complexity class introduced by Christos Papadimitrou . The polynomial method is closely related to arguments coming from the theory of error-correcting codes, and to arguments in TCS related to interactive proofs and PCP.

The modular EDP.

The following is an equivalent way to formulate the Erdős 1932 conjecture that the discrepancy for EDP is unbounded.

1) Consider the sequence x_1,x_2,\dots as a sequence with \pm 1 modulo p, where p is a prime that we can choose as large as we want.

2) Then every number modulo p can be expressed as a sum of the sequence along a HAP modulo p.

Translating EDP (in this form) into a statement about polynomials modulo p is cumbersome. But one thing we may have going for us is that it suggests a natural extension of EDP where the supposed-to-vanish polynomial is simpler.

Modular EDP Conjecture: Consider a sequence x_1,x_2\dots ,x_n of non-zero numbers modulo p. Then if n is sufficiently large w.r.t. p, then every number can be expressed as a sum of the sequence along a HAP modulo p.

As in the original EDP we can consider general sequences or just multiplicative sequences.

The Polynomial identity required for the modular EDP

Here is the polynomial identity in n variables x_1,x_2,\dots,x_n we need to prove over Z/pZ when p grows to infinity with n as slow as we wish. For every k, 0 \le k\le p,

(*) \prod_{i=1}^nx_i\prod \{(x_d+x_{2d}+\cdots+x_{id})-k~:~d,i:~di\le n\}~=~0

These polynomials are not familiar but they are related to generating functions which arise in permutation statistics. In particular, when we look at the product

\prod_{i=1}^n(x_1+x_2+\dots+x_i)

and expand it to monomials, the coefficients have a combinatorial meaning in terms of permutations and inversions.

Given a permutation \pi (1),\pi (2),\dots,\pi (n), and an integer i,~1\le i\le n we can ask how many inversions are there between i and a smaller integer. This is a number between 1 and i-1.

The coefficient of x_1^{i_1}x_2^{i_2}\dots x_n^{i_n} in the above product is the number of permutations where there are i_j integers contributing j inversions. The proposed identity (*) may be expressed in terms of modular properties of such permutation statistics.

Challenge: Prove the modular EDP using the polynomial method.

What does the LDH tell us about the modular EDP?

It is especially easy to apply the large deviation heuristic to the modular version of EDP. Suppose we want to compute the probability that all HAP-sums miss the outcome y.

Given x_1+x_2+\dots +x_r \ne y, the probability that x_1+\dots+x_r+x_{r-1} is not y is 1-1/(p-1). So we are interested in the value of p with (1-1/(p-1))^{nlogn} = (p-1)^{-n}. (Restricting our attention to multiplicative sequences will divide the exponents on both sides by \log n.) Solving this equation gives us p=\log n \log\log n. The LDH heuristic comes with a firm prediction and a weak prediction. In this case the LDH gives

a) (Firm prediction) There are sequences violating the modular EDP when p> \log n\log\log n.

b) (Weak prediction) There are no such sequences when p<<\log n\log\log n.

The firm prediction is correct by the logn discrepancy constructions for EDP and as a matter of fact the LDH itself gives an even stronger prediction of \sqrt {\log n} for \pm1-sequences. By restricting our attention to \pm 1 sequences we see that the weak prediction is incorrect and LDH for the modular EDP is blind to the special substructure of \pm 1 sequences. Note that the firm conjecture is far from being known when we extend the modular EDP and replace all integers by a random subset of integers, or by square-free integers , or by SCJ-systems of integers etc.

About these ads

59 Responses to “EDP25 — third guest post by Gil Kalai”

  1. Polynomial examples | My2shoppe Says:

    [...] EDP25 — third guest post by Gil Kalai « Gowers's Weblog [...]

  2. gowers Says:

    It might be interesting to search for long sequences of non-zero numbers mod 7 such that no sum along a HAP is equal to \pm 3 mod 7. We have some very long \pm 1 sequences with this property. If we allow the sequence to take the values \pm 2 as well, does it help? Indeed, even the occasional \pm 3 would allow us to jump from 2 to -2 or the other way. So the question I’m basically asking here is whether avoiding \pm 3 mod 7 for arbitrary sequences of non-zero numbers mod 7, which in theory is strictly easier than avoiding \pm 3 when the sequence has to be \pm 1-valued, is enough easier to allow us to find examples of length greater than 1124.

  3. gowers Says:

    Thinking more naively about the modular problem, if one attempts to find a counterexample, then the way things will go wrong is if one can find a number n with factors d_1,\dots,d_p (and maybe others) such that the sums along the HAPs \{d_i,2d_i,\dots,n-d_i\} are all distinct. So the points where things get difficult will be numbers with quite a lot of factors.

    Another very minor remark is that if we are trying to prove the modular conjecture for a pair (p,n), then we have to consider a set of HAPs that is linearly dependent, since otherwise we can choose the sums to be whatever we like just by solving some simultaneous equations. Of course, just having a few dependences around isn’t a problem, since we don’t have to make the sums equal to specific values — we just have to avoid some value. Nevertheless, it might be worth looking for interesting dependencies between HAPs (of course, I’m referring to characteristic functions of HAPs mod p).

  4. gowers Says:

    Just for fun, here’s a greedy approach mod 7. I’m going to try to avoid the number 0 as a sum along any HAP, and at each stage I’ll choose the smallest number mod 7 that I can. (That is, my first choice will be 1, my second choice 2, etc. I’m not allowed to choose 0, not that I’d want to.) I’ll bunch it into groups of 7 numbers to make it easier to read

    It starts 1,1,1,1,1,1,2; 1,1,1,1,1,2,2; 1,1,1,3,1,1,1; and then gets a bit tedious to work out by hand. It would be interesting to know how far a greedy approach like this can go. And if it’s allowed to do a small amount of backtracking, can it go much much further?

    • Jason Conti Says:

      If I did this correctly, it looks like the limit without backtracking is length 47.

      1, 1, 1, 1, 1, 1, 2; 1, 1, 1, 1, 1, 2, 2; 1, 1, 1, 3, 1, 1, 1
      3, 1, 1, 1, 1, 1, 4; 1, 4, 1, 1, 1, 2, 2; 1, 1, 1, 1, 5, 1, 2
      2, 1, 3, 1, 2

      With backtracking after about 20 minutes it seems to be stuck at 298: http://paste.ubuntu.com/1188167/

    • Jason Conti Says:

      The output was truncated in the above pastebin, it was actually 299: http://paste.ubuntu.com/1188186/

    • gowers Says:

      Thinking about it a little further, it seems quite likely that avoiding 0 is harder than avoiding a non-zero number. In particular, I don’t see how to convert a counterexample to EDP into a sequence mod p that avoids 0.

      But actually that might make the avoiding-0 problem an interesting one (even if it doesn’t help with EDP). Here is the formal statement. Suppose we have a sequence x_1,x_2,\dots of integers mod p. Is there always a HAP along which the sum of the sequence is zero? If so, how long can the sequence be (as a function of p) before the zero sum is forced to appear?

    • gowers Says:

      Gil’s comment above with links to earlier comments in it got sent to the moderation queue so I have only just seen it. I now realize, after looking at the links, that I am playing catch-up a bit here. Anyhow, here’s another small thought about the avoiding-0 problem (though I think something similar can be said about avoiding a non-zero residue).

      Let’s consider the following very general problem. You have a collection \mathcal{A} of subsets of \{1,2,\dots,n\} and a prime p and you want to determine whether there exists a sequence x_1,\dots,x_n such that \sum_{i\in A}x_i\not\equiv 0 for every A\in\mathcal{A}. We are interested in the case where n is large and \mathcal{A} is the set of all HAPs in \{1,2,\dots,n\}.

      When p=2 the problem can be solved in polynomial time, since it is asking for every sum to equal 1, so all we have to do is solve some simultaneous equations. But for p>2 we are looking at intersections of complements of subspaces that are not affine subspaces themselves. This is highly reminiscent of 3-SAT, so I would guess that it is an NP-complete problem. (In fact, my guess is that it is a known NP-complete problem — does anyone recognise it?)

      That doesn’t mean we can’t solve the problem in the case where \mathcal{A} is the set of HAPs, but it suggests that we will have to use the structure of that set in a very strong way, so that there is no question of being able to extract from the proof a general procedure for determining whether zero sums exist. The polynomial method seems not to be ruled out by this consideration, since we can hope to use the structure of HAPs to say strong things about coefficients that we would not be able to say about general set systems. What does seem to be ruled out is any kind of approach based on finding a solution to a set of linear equations — such as a clever decomposition of matrices.

  5. gowers Says:

    I’ve thought of a further strengthening of the strong modular conjecture. It could well be false, but if it’s true it might be a helpful approach to the conjecture.

    To motivate it, let’s think about how one might go about proving the strong modular conjecture in an elementary way (rather than using the polynomial method — though I certainly think the polynomial method should be investigated too). One idea is to imagine trying to build a counterexample. Each time you get to a number, it will be contained in a certain number of HAPs (one for each factor), along which the sums of the sequence will be s_1,\dots,s_k, say. If you are trying to avoid the number t, that tells you that at this number you must not let your function equal t-s_i for any i\leq k.

    Assuming the modular conjecture is true, this will presumably become harder and harder. That is, as you choose more and more of your sequence, you will find that you are facing more and more forbidden values. My suggested strengthening is that eventually you cannot avoid a point n such that as you run through its factors d the sums along the HAPs d,2d,\dots,n-d take p-1 different values. None of those values can be the forbidden value, so the sums take all the values except t. Therefore, if you choose a non-zero value for f(n), you can’t help one of the resulting HAP sums being t. Note that there is a slight resemblance between this and the colour-focusing proof of van der Waerden’s theorem.

    This is a very much stronger statement than the modular conjecture itself, since for the modular conjecture all you need is to find p different HAPs with different sums: you don’t need them to be the HAPs that correspond to factors of some single number n. Because it is so much stronger, it seems sensible to begin by trying to find a simple counterexample. In the unlikely event that that turns out to be hard enough that there is a chance that the strengthening is true, one could hope for a proof strategy such as one has for van der Waerden’s theorem: one would devise a clever inductive hypothesis — something to do with there being many points with many forbidden values, but it is far from clear what the statement should be — and prove that as n got bigger and bigger you ruled out more and more until eventually you were forced to rule out everything.

    • gowers Says:

      Let me try to explain in a different way why it’s a strengthening, since I’m having slight doubts. Suppose that this conjecture is false. Does it allow us to build a counterexample to the strong modular conjecture? At first the answer might seem to be yes: you just keep on defining values and at each stage there are at most p-2 forbidden values so you can pick a non-zero value that isn’t forbidden. But who says you can do that? If the very strong conjecture is false, all it gives you is two functions f and g such that none of the sums f(d)+f(2d)+\dots+f((m-1)d)+g(md) takes the forbidden value. So you can change the value of f(n) to avoid HAPs that end at n going wrong, but you can’t do that for all n simultaneously.

      That reformulation in terms of two functions makes it clearer just how much of a strengthening it would be if the very strong conjecture turned out to be true.

    • Gil Kalai Says:

      Tim, can you please formulate the strengthening again.

    • gowers Says:

      OK, here are two (I think equivalent) versions.

      1. For every prime p and every function f:\mathbb{N}\to\mathbb{Z}_p\setminus\{0\} there exists n such that the sums f(d)+f(2d)+\dots+f(n-d) (where d runs over factors of n) take at least p-1 distinct values.

      2. For every prime p, every t and every pair of functions f,g:\mathbb{N}\to\mathbb{Z}_p\setminus\{0\} there exists a HAP \{d,2d,\dots,md\} such that f(d)+f(2d)+\dots+f((m-1)d+g(md)\equiv t mod p.

    • gowers Says:

      Actually, the equivalence doesn’t quite work. Let me prove what I can. If 1 holds and you have a pair of functions f,g, then choose n such that the sums f(d)+\dots+f(n-d) take at least p-1 distinct values. What I’m missing here is the assumption that none of these values is the forbidden value. If I knew that too, then I’d be able to argue that since g(n)\ne 0, then there must exist d such that f(d)+\dots+f(n-d)+g(n) does take the forbidden value. But I don’t have that assumption.

      Maybe I want to replace 1 by what I originally intended, which is to say that there exists n such that the sums f(d)+f(2d)+\dots+f(n-d) take all p possible values. Of course, that trivially implies the strong modular conjecture, since one of those values is forbidden, but it is clearly a stronger statement, since there is nothing to stop any particular HAP taking any particular value. Let me call that statement number 3.

      Let me see whether 2 and 3 are equivalent. If 3 holds, then you can’t choose a value for g(n), since that will force all p different sums and one of them will have to equal t. Conversely, if 3 doesn’t hold, then let f be such that for every n there are at most p-1 distinct values of f(d)+f(2d)+\dots+f(n-d) as d ranges over the factors of n. Then let a be a value not taken and set g(n)=t-a.

      That doesn’t quite work because t-a might be zero. So after all that, let me formulate four questions, of which the last two are equivalent. The first two were given in the previous comment. The other two are these.

      3. For every prime p and every function f:\mathbb{N}\to\mathbb{Z}_p there exists n such that the sums f(d)+f(2d)+\dots+f(n-d) take all p possible values as d ranges over the factors of n.

      4. For every prime p, every t, every function f:\mathbb{N}\to\mathbb{Z}_p\setminus\{0\} and every function g:\mathbb{N}\to\mathbb{Z}_p there exist d and m such that f(d)+f(2d)+\dots+f((m-1)d)+g(md)\equiv t mod p.

      I think all four of them imply the strong modular conjecture. It is also trivial that 3 implies 1 and 4 implies 2.

    • gowers Says:

      Another way of thinking about statement 3 is this. The strong modular conjecture says that you cannot avoid a HAP along which the sum is t. This conjecture says that even if the value you are avoiding is allowed to depend on the maximum element of the HAP, you still can’t avoid it. (The word “conjecture” is not very appropriate because I still think it might be too strong to be true.)

  6. gowers Says:

    I’m now going to strengthen the modular conjecture so much that it will be a miracle if it is true. So — in the full expectation of spotting a counterexample within seconds of formulating the statement — here goes. I want to remove addition from the problem and preserve just one thing: that if you add a non-zero number mod p then what you get is different from what you had before.

    What that tells you is that as you take partial sums along a HAP, you have a sequence such that no two consecutive values are the same. So let’s now take a two-variable function f with values in \mathbb{Z}_p the property that f(m,d)\ne f(m+1,d) for every d and m. Must there exist pairs (m_1,d_1),\dots,(m_p,d_p) such that m_1d_1=\dots=m_pd_p and the values of f(m_i,d_i) are all distinct?

    The answer must surely be no. A first step towards finding a counterexample might be to try to find an example with the stronger property that f(m,d) depends only on md. Hmm — but maybe that’s too strong. I have to go now but will think about this later.

    • gowers Says:

      Oops — that was a bad conjecture, even for one that was probably going to be false. Just make f(m,d) take one value if m is odd and another if it is even. Then the condition is satisfied and you only use up two values.

    • gowers Says:

      Let g:\mathbb{N}\to\mathbb{Z}_p be a function that never takes the value zero and let f(m,d)=g(d)+g(2d)+\dots+g(md) for every m,d. What properties of interest does f have? It’s difficult to formulate any that don’t involve addition in some way. For example, we know that if d is a factor of n then g(n)=f(n/d,d)-f(n/d,d-1), so the right-hand side has to be independent of n. But that involves addition and allows us to reconstruct g, so it doesn’t naturally lead to an interesting strengthening.

    • gowers Says:

      Actually, that last observation does at least give rise to some condition that can be stated without reference to addition. Since f(n/d,d)-f(n/d,d-1)=f(n/c,c)-f(n/c,c-1) for any two factors c,d of n, it follows that f(n/d,d)=f(n/c,c) if and only if f(n/d,d-1)=f(n/c,c-1). To phrase that differently, if ab=cd, then f(a,b)=f(c,d) if and only if f(a,b-1)=f(c,d-1).

      Unfortunately, that still doesn’t rule out the trivial example of colouring (m,n) according to the parity of m+n.

      In general, I’m trying to find a strengthening of the strong modular conjecture that can be stated as a colouring problem. But I’m coming to the conclusion that such a strengthening doesn’t exist (or at least doesn’t exist without being either too complicated to be worth stating or else boringly and trivially false).

  7. gowers Says:

    Here’s a question that I thought was relevant until I tried to explain the relevance and failed. Now I just think it’s a question that is interesting if it doesn’t have too easy an answer. Let’s define a graph on the positive integers by joining md to (m+1)d for any two positive integers m and d. Equivalently, we join x to y if x-y is the highest common factor of x and y. Is the chromatic number of this graph finite?

    • Alec Edgington Says:

      I think we can colour the numbers from 1 to N using \lfloor \log_2 N \rfloor + 1 colours, by letting the colour of 2^k m (where m is odd) be k. Can this be improved?

    • Alec Edgington Says:

      … Well, it can certainly be improved for small N. For example, one can colour the numbers from 1 to 8 using three colours:

      12345678
      ABACABCA
      
    • Alec Edgington Says:

      I think I see why the chromatic number must be infinite. If it is finite, say C, let n be such that C colours are needed to colour 1, 2, \ldots, n. Then C colours must be needed to colour n!+1, n!+2, \ldots, n!+n. But this means we can’t colour n! with any of those C colours, a contradiction.

    • gowers Says:

      Nice. Let me explain why I briefly thought that an infinite chromatic number would imply EDP and why I was wrong. I’m doing that just in case something else can be done with this circle of ideas.

      Suppose we have a function f:\mathbb{N}\to\mathbb{Z}_p. If f never takes zero values, then as you take partial sums along a HAP, you never have the same value twice in succession. That is, the partial sum up to md is never the same as the partial sum up to (m+1)d. So we have a p-colouring of \mathbb{N} such that no two integers of the form md and (m+1)d have the same colour. Therefore if the chromatic number of that graph is infinite, we have a proof of the strong modular conjecture, and hence of EDP.

      I’ve tried to present that argument in a way that will make it sound plausible if you read it quickly, but it is total nonsense. That’s because it confuses “partial sum up to md” with “partial sum up to md along the HAP of common difference d“. Those are two completely different things, and because they are different there is no argument.

      An even stronger reason to be suspicious of the argument is that it proves a blatantly false fact: that for every function f you can find a HAP such that the partial sums along that HAP take the same value twice in succession somewhere. If f is constantly 1, then that is not the case.

      Nevertheless, I feel as though the fact that the chromatic number is infinite might not be completely irrelevant — it tells us at least something about the structure of HAPs, and with a bound that is encouragingly large. (Not just encouragingly large, but rather amusing: something like 2!!!...! where there are n factorial signs, except that it’s slightly larger and a bit more complicated than that.) More to the point, the lower bound is large too.

    • gowers Says:

      I’m just thinking aloud here, rather than making a serious suggestion, but here is a statement that would imply EDP: that if you colour \mathbb{N} with finitely many colours, then you can find arbitrarily long monochromatic generalized HAPs (that is, sets of the form \{rd,(r+1)d,\dots,sd\}).

      That is so much stronger than van der Waerden’s theorem that it surely has to be false. But you’ve just proved it for progressions of length 2. If true, then it would imply the strong modular conjecture, since you’d just have to find a HAP where f took the same (non-zero) value p times in a row.

    • Klas Markström Says:

      So this the same as saying that for any fixed number of colours you can find a monochromatic generalized HAP of any length.

      Unless I am missing some obvious obstruction that doesn’t sound unreasonable.
      If it is false it would provide an interesting hypergraph, with each geneneralized HAP as an edge, with bounded chromatic number.

      ( I’m having a busy month but I am trying to catch up with the discussion so far )

    • gowers Says:

      It is false — see the next few comments.

    • Klas Markström Says:

      Ah. That’s what happens when commenting before having time to read all the comments. But at least it provides an interesting hypergraph.

  8. gowers Says:

    I’m going to try to settle the question I raised at the end of the replies to the previous comment. The question was this. Is it true that for every r and every k if you colour \mathbb{N} with r colours then you can find a monochromatic arithmetic progression of the form \{(m+1)d,\dots,(m+k)d\}? It will be a miracle if the answer is yes, but since that would (if I haven’t made a silly mistake) imply EDP, it seems worth checking.

    I probably ought to start by trying to find a counterexample, but instead I want to start by trying to generalize Alec’s proof for r=2 in a van der Waerdenish way.

    One thing we can use Alec’s argument to do is find, for any t, positive integers m,d such that the colours of the numbers md,md+1,\dots,md+t are the same as the colours of the numbers (m+1)d,(m+1)d+1,\dots,(m+1)d+t. Does that help? It seems not to: for instance, the pair md+1,(m+1)d+1 does not appear consecutively along a HAP. It is somehow too additive.

    For a colour-focusing argument, I want a whole lot of monochromatic pairs of the form m_id, (m_i+1)d_i such that all the m_id_i are the same. Actually, that would be just for two colours, but let’s start with that. I have two colours and I want an AP of length 3. If I can find m_1d_1=m_2d_2 such that … hmm, I don’t think this is working. I’ll think about counterexamples after all.

  9. gowers Says:

    Let me consider colourings that depend only on the largest power of 2 that divides a number. Those can’t work (for progressions of length 3, that is), and here is the simple proof. Pick r<s such that numbers such that 2^r is the highest power and numbers such that 2^s is the highest power get the same colour. Then the generalized HAP 2^s-2^r,2^s,2^s+2^r is monochromatic.

    Ah, but the statement is false for progressions of length 4, and here’s the simple proof. Colour according to the parity of the power. Out of four consecutive integers, two are odd and one is an odd multiple of 2. Therefore, out of four integers md, (m+1)d, (m+2)d, (m+3)d at least one will have an odd power of 2 as the highest that divides it and at least one will have an even power of 2.

    • gowers Says:

      Although it wouldn’t have much to do with EDP, I feel a sort of moral obligation to sort out the case of progressions of length 3.

      Ah — got it. Every number can be written in the form (3a+1)3^b or (3a+2)3^b. Let’s colour them according to which we get. Given numbers md, (m+1)d, (m+2)d, let’s suppose that d is of the form (3a+r)3^b. Then at least one of m, m+1, m+2 is congruent to 1 mod 3 and at least one is congruent to 2 mod 3. When we multiply those by d, then they get different colours.

      So the result is false even for two colours.

    • gowers Says:

      It’s sort of interesting that if we make one of those colours 1 and the other -1 then we get one of our favourite low-discrepancy sequences. I spotted that only after I had finished writing the comment.

  10. gowers Says:

    Now that I’ve done that it’s coming home to me just how crazy the statement was. If it had been true, it would have shown that for the normal EDP you must be able to find an arbitrarily long sequence of consecutive 1s or consecutive -1s along some HAP. But we know of sequences where that is false. What’s more, with those sequences one can’t improve matters in any way by passing to a generalized HAP — the discrepancy will still grow at most logarithmically.

  11. gowers Says:

    Hmm, I’ve got a fairly convincing argument that no pure colouring argument of the kind I was looking for can possibly work. Roughly speaking, I wanted an argument that would work as follows. Given a function f:\mathbb{N}\to\mathbb{Z}_p that is never zero, we define g(m,d) to be f(d)+\dots+f(md). We then use very general properties of g, which we think of as a colouring, to argue that g(m-1,d)=g(m,d) for some m and d, which implies that f(md)=0, a contradiction.

    But if q is a prime less than p, then we can pick a function to \mathbb{Z}_q, take its partial sums along HAPs, and then just compose the resulting function with an injection from \mathbb{Z}_q into \mathbb{Z}_p. Just about the only property we had of g was that it avoided some value. Well, we’ve now got a function that avoids a value and has all the same sameness relations as a function that definitely exists.

    Sorry not to be more precise about what is ruled out, but anyway I’m more or less convinced that EDP isn’t a Ramsey problem in disguise.

    But that still leaves the possibility of a super-strong modular conjecture, where we show that for any function f:\mathbb{N}\to\mathbb{Z}_p\setminus\{0\} there exist p HAPs all with the same maximal element such that the sums of f along them are all distinct.

  12. Gil Kalai Says:

    One obvious remark is the following: For a hypergraph H on an underlying set A we can define the modular discrepancy moddisc (H) as the minimum p such that every everywhere non zero function f:A\to Z_p, every t \in Z_p can be expressed as a sum t=f(S)=:\sum\{f(a):a \in S\} for some edge S of H. So it can be of some interest to study it for simpler hypergraphs were the ordinary discrepancy is understood. It will also be interesting to find even simpler examples where the polynomial method is useful.

    As Tim said we can consider other methods for this modular discrepancy versions. For example, if f is random then the nonzero values of f(S) have uniform distribution. So we can try to understand how the distribution of non zero f(S) deviate from uniform. Perhaps some Fourier ideas can work.

    • gowers Says:

      One remark in that direction is that the proof of Roth’s discrepancy theorem (that for every \pm 1-function defined on the first n integers you can find an AP on which the discrepancy is at least cn^{1/4}) allows you to pick an AP of some fixed length close to n^{1/2}. It is clear that you can’t obtain a modular discrepancy theorem when all the sets have the same size, since the constant function 1 is a counterexample. It might be worth collecting other simple observations of this kind, to see if we can come up with a class of hypergraphs that are good candidates for a modular discrepancy theorem to hold.

    • gowers Says:

      It occurs to me that the example just mentioned is a special case of a much more general example. If you can find a nowhere zero linear functional \phi that is constant on your hypergraph (that is, a function that does not take zero values and has the same sum in every set in the hypergraph), then it provides a very strong counterexample. It also shows that the hypergraph is contained in an affine subspace of codimension 1. Maybe a good general type of condition to impose on the hypergraph is that it should be well distributed with respect to affine subspaces of codimension 1. This backs up Gil’s thought that Fourier ideas could work.

      I’m not quite sure what I’m really saying here, however, since the Fourier conditions I can think of imply in a rather trivial way that the modular discrepancy is at least p. I suppose what would be interesting is a condition that does not depend on any particular prime p. In fact, that looks much better, since it means that if you find an affine subspace of codimension 1 that works for one prime p, there is no reason to suppose that it works for other primes.

      To give a simple example of what I mean by the last remark, if all the sets in your hypergraph have odd cardinality, then the modular conjecture fails when p=2 but it may well succeed for larger p. So we want Fourier conditions that are “robust”, in the sense that if you change the prime you don’t lose the condition.

    • Gil Kalai Says:

      When we talk about the modular conjecture it makes sense to assume that the hypergraph is closed under taking subsets.

      Of course it is natural to ask if the modular discrepancy of AP’s is behaves like n^{1/4}. Perhaps one of the proofs for Roth’ddiscrepancy theorem can be extended?

    • gowers Says:

      That sounds like a rather strong condition given that it fails to apply to interesting cases like APs or HAPs. But maybe one could ask for weaker conditions such as that for every set in the hypergraph it is possible to remove an element and get another set in the hypergraph. One could also ask for the hypergraph to be closed under intersections.

    • gowers Says:

      Actually, perhaps I’ve misunderstood your remark, since the modular conjecture seems to me to be trivial if the hypergraph contains at least one reasonably large set and is closed under taking subsets: by the pigeonhole principle pick p elements where the value of the function is the same and then if the forbidden element is r, choose a subset of those p elements of size s, where rs\equiv 1 mod p.

    • Gil Kalai Says:

      Sorry my remark made no sense.

    • gowers Says:

      Another remark is that if you allow arbitrary arithmetic progressions, then the modular conjecture holds, though the argument I have in mind gives a rubbishy bound. The argument is that if n is large enough, then by Szemerédi’s theorem there is an arithmetic progression of length p-1 on which the function is constant, and by choosing a subprogression of appropriate length you can make the sum whatever you like. It would be very interesting to see whether the correct bound is similar to the n^{1/4} bound for the conventional discrepancy theorem. The proof I understand (basically, a diagonal decomposition argument) doesn’t seem a promising candidate for modifying to handle the modular conjecture, since it proves the result for progressions of the same length.

    • gowers Says:

      And a further remark is that the question is not completely uninteresting even for progressions of common difference 1. Of course, the modular conjecture is easily seen to be false for such progressions. However, it seems to be quite a strong condition on a sequence a_1,\dots,a_n that some value t (mod p) should be avoided by all sums of the form a_r+\dots+a_s. For instance, if we consider the set A of all sums of the form a_r+\dots+a_m and the set B of all sums of the form a_{m+1}+\dots+a_s, then A+B does not include t. That places quite strong conditions on A and B. There are many statements of a similar kind that one can make. Maybe it is possible to show somehow that these sorts of properties cannot hold simultaneously for lots of common differences.

  13. gowers Says:

    I think it would be a very nice project, or perhaps I should call it sub-project, to try to prove the modular conjecture for arbitrary APs with a decent bound. And I’d be happy to start by considering the more symmetrical case where we define a function not on \{1,2,\dots,n\} but on the cyclic group \mathbb{Z}_n and we define an AP of length m to be a set of the form \{a,a+d,\dots,a+(m-1)d\} mod n.

    In the original problem, the reason you get high discrepancy is that you can express the L_2 norm of your function efficiently in terms of products of AP-discrepancies. So if the L_2 norm is large, then by averaging one of the products is large and therefore so is one of the discrepancies. Is there anything that could conceivably do the same job for the modular conjecture? We still get a nice representation of the identity matrix, but it’s not clear what can replace the averaging argument: if I know the sum of a whole lot of numbers mod p, it doesn’t tell me much about the original numbers.

    Maybe one could somehow obtain a lot of different expressions relating sums along various APs and deduce from that that all values are represented.

  14. gowers Says:

    Going back to Fourier analysis, how might that be made to work? In the abstract, if you’ve got a hypergraph H of subsets of \{1,2,\dots,n\}, then you might normally think of it as a 01-function defined on the cube \{0,1\}^n (taking the value 1 on characteristic functions of sets in H and 0 elsewhere). But we could think of it instead as being defined on \mathbb{Z}_p^n. That is, given a vector of length n with coordinates in \mathbb{Z}_p (strictly speaking, a function from \mathbb{Z}_n to \mathbb{Z}_p), if it happens to be the characteristic function of an AP mod n then it gets the value 1 and otherwise it gets the value 0.

    Suppose now that n is much larger than p and that there is some function \phi and some t such that \phi doesn’t sum to t (mod p) on any A\in H. Then if we average over all p characters that are constant on affine subspaces of the form \langle f,\phi\rangle=r, we will find that at least one of those characters has a large inner product with (the characteristic function of) H.

    Conversely, if H has no large Fourier coefficients, then for every non-zero function \phi you can find sets on which \phi sums to any value you choose mod p.

    This is a much stronger conclusion than we can hope to prove, since it applies to all non-zero functions and not just to functions that never take the value zero. So what we will actually need to prove is something more like that while H will have at least some large Fourier coefficients (for example, if you consider the function that takes the value 1 in one place and zero everywhere else, then it takes the value 0 or 1 at every point in H, which will lead to large coefficients), these will tend to result from variables with a large influence. Maybe we can show that H has small Fourier coefficients at points with lots of non-zero values.

    I need to try to be clearer what I mean. Every character on the group \mathbb{Z}_p^n will be of the form “take the inner product with the function x\mapsto\omega^{r.x}“, where \omega=\exp(2\pi i/p) and r.x=\sum_ir_ix_i mod p. If r has lots of non-zero values (as opposed to only taking non-zero values), then maybe we can show that H has a small inner product with this character.

    How could that be easier than the original problem? I don’t know that it could, but the hope would be that we could write down an expression that measures whether H has large Fourier coefficients for functions r with many non-zero entries (and this is quite reminiscent of expressions that people write down when studying Boolean functions and influence of variables) and interpret it as a condition on H that we then try to prove for the specific H that interests us — namely, the set of APs mod n.

  15. gowers Says:

    Here is a possible first step towards proving a modular conjecture for APs rather than HAPs. It is to attempt to prove that in the special case of \pm 1 functions the distribution of sums along APs mod p tends to the uniform distribution. There is some choice about how to make that statement precise, but here is one version. For every p and every \epsilon>0 there exist m,n such that for every f:\mathbb{Z}_n\to\{-1,1\}, if you choose a random AP P mod n of length between m and 2m (say by choosing the length randomly and then choosing the AP randomly of that length), then the distribution of \sum_{x\in P}f(x) mod p is within \epsilon (in your favourite metric) of uniform.

    Note that it is essential to vary the length, since otherwise the function f(x)\equiv 1 is a counterexample. However, it might be interesting to think about what the distribution can look like when the length is fixed. Maybe we could show that either it is uniform or it is close to constant (because, say, f is close to a constant function), and then argue that in the second case the constant varies linearly with the length of the AP. So let me ask another question. If f takes values \pm 1 and is not approximately constant, can we then get a statement such as the one in the previous paragraph but for random APs of just one length m?

  16. gowers Says:

    A general philosophy that might be used to prove results of the kind suggested in the previous comment is this. Let f:\mathbb{Z}_n\to\mathbb{Z} take many non-zero values and only values that are at most some constant C. Then if m and n are large enough (with n significantly larger than m), then the distribution of the sums \sum_{x\in A}f(x) over all APs mod n of length m is similar to the distribution of the sums over all subsets of \mathbb{Z}_n of size m. In other words, once you’ve fixed the cardinality, making the further restriction that your set has to be an AP does not have much effect on the distribution.

    It isn’t obvious that this philosophy is correct, so the first thing to do is perhaps to look for another obstruction besides cardinality.

    Actually, I’ve seen a counterexample straight away. If the function f is periodic with some small period k (where I allow a little break when you get to n and start again), then there is a reasonably high probability that an AP doesn’t wrap around and has common difference a multiple of k. Hmm, no, that’s not true if m is large — it will almost certainly wrap around. OK, I’m not quite sure whether periodic functions (and similarly arithmetically structured functions) are a problem or not. If they are, then it might be possible to salvage the philosophy by talking about the distribution of mod p values rather than the distribution of \mathbb{Z} values.

    OK, here’s an example. Suppose we take the function that’s 1 up to n/2 and -1 from n/2 onwards. Then a random AP of length m will almost always have common difference substantially greater than n/m, and when it does, the sum of the sequence along that AP will be bounded. (If n is odd, then there will be a tiny bias, but since m is a lot smaller than n, it is too tiny to affect anything.)

    I didn’t say that very well. If the common difference is d and d lies between, say, \alpha n and (1-\alpha)n, then the sum along the AP won’t be more than about \alpha^{-1}. This happens most of the time. But if you pick a random subset of size m then you typically get a sum of around \sqrt{m}. So the philosophy looks false if we talk about distributions of \mathbb{Z} values.

    Actually, the same example shows that it’s false for mod p values as well. With probability 1/2 the sum along your mod-n AP of length m will have modulus at most 4, so its value mod p will be between -4 and 4. That is certainly not the case for a random subset of size m.

    In fact, the example is a counterexample to what I wanted to prove in the previous comment, since the sum is between -4 and 4 whatever the length m, as long as the common difference behaves itself.

    That doesn’t quite kill things off, but it suggests that we might have to have a two-case argument, one for highly structured functions and one for quasirandom functions. A bit like another famous Roth theorem in fact.

  17. gowers Says:

    I now want to find a Fourier condition that will guarantee that sums over APs are approximately uniformly distributed. Ideally, I’d like the condition to be very weak. The kind of thing I would like to see is this.

    1. If the function has a small amount of randomness, then the sums over APs of length m are approximately uniformly distributed.

    2. If the function does not have a small amount of randomness, then there is a long AP on which it is approximately constant, and hence by averaging there is an AP of length p on which it is constant, which allows us to create any sum we like.

    What could “a small amount of randomness” mean? Well, it reminds me of dual uniformity norms. Suppose we have a definition of quasirandomness (we might go for small Fourier coefficients, but I’ll leave that open). Then saying that a function is quasirandom is saying something like that it is “random everywhere”. But what if we just want it to be “random somewhere”. We can capture that by saying that the function correlates with a quasirandom function.

    As an example, if you take a \pm 1 function that is constant up to n/2 and random from n/2 to n, that is not quasirandom (because of the highly structured constant part) but it correlates very strongly with a random \pm 1 function that agrees with it between n/2 and n.

    So one thing I’d like to be able to prove is that if you have a \pm 1 function f that correlates with a quasirandom function, then the sums of f along APs are approximately uniformly distributed mod p. Of course, the statement has to be made precise, but it seems better to worry about that only after trying to prove it (so that the proof to some extent dictates the statement).

    Experience suggests that not correlating with a quasirandom function (a.k.a. having a small dual norm) is a very strong property that is likely to lead to long APs where the function is constant. But it seems better to concentrate on stage 1 first.

  18. gowers Says:

    I want to get a feel for whether it’s actually true that a small amount of correlation with a quasirandom function causes the sums to be approximately uniformly distributed mod p. To do that, let’s consider the case where f is a random \pm 1 function up to t and then arbitrary between t+1 and n. Is it possible for the second part somehow to “compensate” for the randomness of the first part?

    If I want sums along mod-n APs to be approximately uniformly distributed mod p, then I obviously need mt to be significantly greater than n, since otherwise most APs of length m don’t intersect the first t elements of \mathbb{Z}_n, so if I make f 1 between t+1 and n, then most sums will be m mod p. So let’s assume that t is indeed quite a bit greater than n/m.

    Actually, I realize that I have a more basic problem. How do I even show that the sums are approximately uniformly distributed when the entirety of f is quasirandom? (If it’s genuinely random then that’s OK, but we need to use quasirandomness.) I think this may require some offline calculation.

  19. gowers Says:

    I now realize that I can say slightly more without the help of pen and paper (my ideal during polymath projects being not to use them at all but to write down all my thoughts, even if a significant percentage of them turn out to be a waste of time). Let \phi:\mathbb{Z}_n\to\mathbb{Z}_p. How can we measure how uniformly distributed the sums of \phi are on sets in some hypergraph H? Here is a suggestion that turns out to involve polynomials, though not in quite the same way as in the polynomial method described by Gil in the post. (I haven’t thought enough about this to tell whether there is some connection — perhaps even a very close one.)

    Let \omega=\exp(2\pi i/p). A natural way of encoding the sum \sum_{x\in A}\phi(x) is as \prod_{x\in A}f(x), where f(x)=\omega^{\phi(x)}. Moreover, a good test of whether the values \sum_{x\in A}\phi(x) are approximately uniformly distributed mod p as A ranges over some hypergraph H is to see whether the average \mathbb{E}_{A\in H}\prod_{x\in A}f(x) is small.

    However, while that is a necessary condition for the sums to be approximately uniformly distributed, it is not sufficient. To get a sufficient condition, we should ask for more. We need the averages \mathbb{E}_{A\in H}\prod_{x\in A}f_r(x) to be small for every non-zero r (mod p), where f_r(x)=\omega^{r\phi(x)}.

    When r=0, we know that the average is 1, so we could try calculating something natural-seeming like the sum of the fourth powers of those averages and asking for it to be close to 1. I’ll try that in a new comment.

  20. gowers Says:

    The quantity I want to calculate, or rather simplify, is

    \displaystyle \sum_r|\mathbb{E}_{A\in H}\prod_{x\in A}\omega^{r\phi(x)}|^4.

    The fourth power of the expectation expands to

    \displaystyle \mathbb{E}_{A_1,A_2,A_3,A_4\in H}\prod_{x_1\in A_1,\dots,x_4\in A_4}\omega^{r(\phi(x_1)+\phi(x_2)-\phi(x_3)-\phi(x_4)}.

    Bearing in mind that the product of the powers of \omega is \omega raised to the power the sum, we see that when we sum over r this expression vanishes except when

    \displaystyle \sum_{x_1\in A_1,\dots,x_4\in A_4}\phi(x_1)+\phi(x_2)-\phi(x_3)-\phi(x_4)=0.

    Unfortunately, that isn’t terribly helpful. Or at least, it doesn’t seem to be. In fact, I now see that I could have gone for second powers. And all I’d have got out at the end was that the values are well distributed if and only if you don’t get too many pairs A_1,A_2\in H such that \sum_{x_1\in A_1}\phi(x_1)=\sum_{x_2\in A_2}\phi(x_2), which is a simple fact that didn’t need Fourier analysis to be established.

    But maybe what that is telling us is that we shouldn’t sum over r and simplify — since that is returning us to the original problem. Rather, we should think carefully about why we might expect sums of the form

    \displaystyle \mathbb{E}_{A\in H}\prod_{x\in A}\omega^{r\phi(x)}

    to be small. That is, can we formulate properties of \phi that guarantee that they will be small? Of course, if we are fixing r, then we might as well write \omega^{\psi(x)} instead. In fact, it’s not obvious that we need to insist that the values taken by the function are pth roots of unity. Perhaps the right question is this. Let f:\mathbb{Z}_n\to\mathbb{C} take values of modulus 1. Find conditions on f and the hypergraph H that ensure that

    \displaystyle \mathbb{E}_{A\in H}\prod_{x\in A}f(x)

    is small.

  21. gowers Says:

    In the case where H consists of all arithmetic progressions of length m for some large m, I find this question quite puzzling. Consider, for example, what happens when m=4. (The puzzle will be that what I say will break down when m gets large and it will no longer be clear what happens.)

    When m=4, we can define a function where the average is large as follows. For each x we randomly decide whether to set f(x) to be \omega^{x^2}, \omega^{-x^2}, \omega^{3x^2} or \omega^{-3x^2}. If we now work out the expectation of \prod_{x\in A}f(x) over all such f and over all arithmetic progressions A of length 4, we find that we obtain \mathbb{E}_{A\in H}\prod{x\in A}g(x), where g(x)=(\omega^{x^2}+\omega^{-x^2}+\omega^{3x^2}+\omega^{-3x^2})/4. This splits into 4^4 terms, of which two contribute 1/4^4 and the rest contribute close to zero. To cut a long story short, the fact that x^2-3(x+d)^2+3(x+2d)^2-(x+3d)^2=0 means that the expectation of the product is not small, because quite a lot of the products are equal to 1.

    Thus, for progressions of length 4 it is possible for a function with very small Fourier coefficients (as f is above) to give rise to a large average.

    This argument can be generalized to larger m, but as m gets larger the size of the expectation gets smaller, so if m depends on n then we don’t have an example any more.

    What this shows is that if we are going to prove that some kind of Fourier structure is necessary for the average to be large, the proof will have to break down when the length of the arithmetic progressions is small.

    A second slightly puzzling feature of the problem is that if \phi is a random function from \mathbb{Z}_n to \mathbb{Z}_p subject to the restriction that it takes the values \pm 1, then for large m the sums \sum_{x\in A}\phi(x) will be approximately uniformly distributed mod p, but the function f(x)=\omega^{\phi(x)} has non-trivial large Fourier coefficients. Thus, we very much need to use the fact that we are taking a product over a large set to “spread out” the function f.

  22. gowers Says:

    Hmm, maybe I have an idea. Suppose we look at the average just for arithmetic progressions of length m and common difference 1. The first product is \prod_{x=0}^{m-1}f(x). Let’s call that z_0. Then the next product is z_1=z_0f(m)\overline{f(0)}. And the next one is z_2=f(m+1)\overline{f(1)}z_1. And so on.

    That didn’t quite work out as I hoped, so I’m not sure where I’m going with it. I’ll just make the remarks that one can do it for other common differences, and that if m is large then it is somehow more difficult for the products f(m+k)\overline{f(k)} to “conspire” to make the averages large. (I’m not sure why I say that, however.)

  23. gowers Says:

    I feel the need to understand better what Fourier transforms of high-degree polynomials look like. That is, suppose that we have an expression of the form \sum_A\lambda_A\prod_{x\in A}f(x). How can we rewrite it in terms of \hat{f}?

    I’ll do the obvious thing and consider just the monomial \prod_{x\in A}f(x). Let’s write A=\{x_1,\dots,x_k\}. Then

    \displaystyle \prod_{x\in A}f(x)=\prod_i(\sum_r\hat{f}(r)\omega^{rx_i})

    \displaystyle =\sum_{r_1,\dots,r_k}\hat{f}(r_1)\dots\hat{f}(r_k)\omega^{\sum_ir_ix_i}

    If r=(r_1,\dots,r_k), then let’s write \hat{f}(r) for \hat{f}(r_1)\dots\hat{f}(r_k) and r.A for \sum_ir_ix_i, where (x_1,\dots,x_k) is the “obvious” ordering of the elements of A. (This makes sense for APs.) Then if all the sets A have size k, the polynomial becomes \sum_A\lambda_A\sum_{r}\hat{f}(r)\omega^{r.A}. The only thing I can think of to do now is interchange summation to get \sum_r\hat{f}(r)\sum_A\lambda_A\omega^{r.A}. In our case, all the \lambda_As are equal, so let’s look instead at the simpler expression \sum_r\hat{f}(r)\mathbb{E}_A\omega^{r.A}. So a useful thing to do will be to work out an expression for the last expectation — with luck it will pick out certain rs.

  24. gowers Says:

    Let r=(r_1,\dots,r_k) and let’s work out \mathbb{E}_A\omega^{r.A}, where A ranges over all APs mod n of length k. I now see that I haven’t made clear that f:\mathbb{Z}_n\to\mathbb{C} and that \omega=\exp(2\pi i/n).

    We can rewrite our expectation as

    \mathbb{E}_{a,d}\omega^{r_1(a+d)+r_2(a+2d)+\dots+r_k(a+kd)}

    which equals

    \mathbb{E}_{a,d}\omega^{(r_1+\dots+r_k)a+(r_1+2r_2+\dots+kr_k)d}

    which equals 1 if r_1+\dots+r_k\equiv 0 and r_1+2r_2+\dots+kr_k\equiv 0 mod n and 0 otherwise. So we’re summing \hat{f}(r_1)\dots \hat{f}(r_k) over all (r_1,\dots,r_k) that satisfy those two equations.

    It’s not at all clear that that helps.

  25. Gil Kalai Says:

    I find it a little scary to talk Fourier about long sums of AP. Maybe we can get what we need by having certain psudorandomness for behavior of pairs of elements in HAPs or something like that?

    • gowers Says:

      I agree. I thought I’d better try it, and it was indeed scary. I think this is a great problem though. I am trying (so far without even the success that would deserve a comment on this thread) to think of some nice statement about APs that would be sufficient to imply the modular conjecture. I think I’ll collect the thoughts I have so far (mostly observations about approaches that can’t work, which at least have the merit of giving some clue about what a proof might look like) and write a post. Maybe some people will be interested to think about this problem, given that it isn’t known to be hard in the way that EDP is known to be hard.

    • gowers Says:

      I have now written the post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

Join 1,546 other followers

%d bloggers like this: