EDP1 — the official start of Polymath5

After several hundred comments (including everything, we’re at 688 at the time of writing) about the Erdős discrepancy problem, it may seem odd to be talking about the project starting. However, the focus of the discussion so far has been on experimental results, and I have deliberately restrained myself from saying too much about the theoretical side of the problem — that is, proposing approaches, trying to prove partial results, etc. — and I think others have too. So from now on any theoretical comments are welcome. I have been trying to delay this moment as long as possible in order to have a chance of doing various other things I have to do, but I have a relatively free couple of weeks coming up, the experimental evidence is starting to lead to the possibility of making some precise conjectures, and Terence Tao recently came in with a theoretical comment: this conjunction of circumstances fatally weakens the argument for holding out any further.

I had planned, in this first theoretical post, to concentrate on approaches that don’t work. The idea behind this was that the more you can rule out, the more your moves are forced and the more likely you are to find whatever correct argument might remain. However, I am less inclined to do this now because the body of experimental evidence does the proof-constraining very effectively without it. As I have already commented, the fact that there is a sequence of length 1124 and discrepancy 2 probably rules out a simple inductive argument for a lower bound of the form $c\log n$ (for the discrepancy of a sequence of length $n$). And the fact that this example has so much structure strongly suggests that we should attempt to prove that an infinite bounded-discrepancy sequence must have structure of this kind. (A key property seems to be quasi-multiplicativity. We basically know what a quasi-multiplicative sequence is, though we have not yet settled on a formal definition.)

When I first suggested the Erdős discrepancy problem as a possible Polymath project, one thing that worried me about it was that, unlike with the density Hales-Jewett theorem (for $k=3$) I did not have a suggestion for how one might go about solving it. However, that has changed, thanks to this experimental work. I now have a suggestion for a broad approach, and slightly more detailed suggestions for how one might go about proving the various steps. I haven’t forgotten that my initial suggestions for DHJ3 were eventually abandoned, and that could happen again here. But it is still somehow helpful to have the initial suggestions.

My view of the problem before the 1124 structure emerged was something like this. There was a natural special case of the problem, where you assume that your sequence is completely multiplicative, and not only was this probably already very hard, but it probably wouldn’t help all that much with the general problem. Now I feel a little bit more optimistic about the problem for multiplicative functions (after a small amount of thought about how to produce good ones, and why one might fail) and a lot more optimistic about their relevance to the general problem, at least if one generalizes to quasimultiplicative sequences. So I would suggest the following broad approach to the Erdős discrepancy problem.

1. Prove that completely multiplicative sequences cannot have bounded discrepancy.

2. Prove that every infinite sequence of bounded discrepancy can be “cleaned up” so that it becomes quasimultiplicative, in some yet to be determined, but pretty strong, sense.

3. Generalize 1 to quasimultiplicative sequences.

Let me now say a little bit more about 1-3 in an effort to explain why I don’t feel that carrying out those steps is necessarily hopeless.

The best way to understand why I feel optimistic (but only moderately so, as I think there could be some serious difficulties with what I am about to propose) about 1 is to do a little exercise that I did a few days ago, namely generate by hand a completely multiplicative $\pm 1$ sequence of length 246, all of whose partial sums lie between -2 and 2. It turns out not to be that hard, provided that you are prepared to look ahead at all the consequences of your choices so far (up to 246 — the significance of which is that Alec and his computer have proved that that is as far as you can get). Once you have done that, you start to see that awkward constraints can arise if you have a longish interval with many numbers in it that I’ll call odd-smooth, until someone proposes a better, or more standard, name. By this I mean the product of a perfect square with a product of distinct small primes. Equivalently, I mean a number such that every prime that occurs an odd number of times in the prime factorization is small. The relevance of this is that if $\epsilon$ is your completely multiplicative $\pm 1$ function and you want $\epsilon(n)$ to be 1 (respectively -1), and if $p_1,\dots,p_k$ are the primes that occur an odd number of times in the prime factorization of $n$, then an even (respectively odd) number of the values $\epsilon(p_i)$ must be -1. Thus, each $n$ for which you want $\epsilon(n)$ to be -1 imposes a constraint on the values of the $\epsilon(p_i)$ that can be thought of as a linear constraint over $\mathbb{F}_2$. The hope is that if you have too many of these constraints applying to too few primes $p_i$, then you won’t be able to satisfy them all simultaneously.

The situation isn’t quite as simple as that, because you are rarely forced to take a particular value for $\epsilon(n)$. However, if you have several odd-smooth numbers in close succession, then you do at least have many constraints on the values they can take, and there seems to be at least some hope of demonstrating that they cannot all hold simultaneously.

This line of thought suggests a further division of 1 into two subproblems.

1a. Can we say anything about the distribution of odd-smooth numbers? How dense is the set of numbers $n$ such that the primes that occur an odd number of times in the prime factorization of $n$ are all at most $K$? (Obviously they will get sparser as you go on. A typical number around $n$ will have around $\sqrt{\log n}$ prime factors, so most of these will be large, and typically a large prime factor will occur only once.) And can one show that the distribution is somewhat irregular, so that there are intervals that are particularly rich in odd-smooth numbers? (Because the odd-smooth numbers thin out, such an interval would have to be not too far out.) Can one at least argue heuristically that this ought to be the case?

This is the part of the proof strategy that I feel least confident about. It could be that these questions are known to be very hard. But I’m crossing my fingers and hoping for the best: are there any number theorists out there who can help? Would sieves be useful perhaps? If nobody comes forward then I’ll ask on mathoverflow whether anything is known about odd-smooth numbers.

1b. Can some kind of $\mathbb{F}_2$-linear algebra argument be developed that proves that the discrepancy of an infinite multiplicative $\pm 1$ sequence is unbounded, subject to reasonable assumptions about the distribution of odd-smooth numbers (and their factorizations)?

I admit that 1a looks pretty hard and 1b doesn’t obviously work. But I hope it’s at least a start.

I feel much more hopeful that something can be done here, and I think that if we managed to show that the Erdős discrepancy problem was equivalent to the Erdős discrepancy problem for a highly structured class of sequences, then we could be pretty satisfied.

Very roughly, the programme would be this. Over at mathoverflow I asked a question about how much a sequence of 1s and -1s could be tidied up if you are allowed to take pointwise limits of sequences of translates. To see the relevance to us, consider the operation $T_2$ that we have talked about quite a bit. This is the map that takes a sequence $(x_n)$ to the sequence $(x_{2n})$. If we look at the iterates of $T_2$, then it is natural to partition the natural numbers into geometric progressions of common ratio 2, one for each odd number, since on each such GP $T_2$ behaves like a shift. The reason this is of some interest is that if $x=(x_n)$ has discrepancy C, then so does $T_2(x)$, and hence so does $T_2^m(x)$ for every $m$. Furthermore, every pointwise limit of sequences of discrepancy $C$ has discrepancy at most $C$.

Let us call a property of sequences stable if it is closed under translates and pointwise limits. An example of a stable property is that of never having three 1s in a row. An example of an unstable property is that of having density 1 (meaning that the density of $n$ such that $x_n=1$ is 1). The reason that the latter is unstable is that one could have arbitrarily long blocks of -1s but far enough out that the density of the sequence was 1, and then one would be able to find a pointwise limit of translates that was zero everywhere. Periodicity and almost periodicity (by which I mean the type of behaviour you see in sequences such as $(-1)^{\lfloor \alpha n\rfloor}$) are stable properties, and highly relevant to us. Let us call a property inevitable if for every sequence there is a pointwise limit of translates that has the property. Of particular interest are properties that are both stable and inevitable.

Let $P$ be a stable and inevitable property. If we keep applying $T_2$ to a sequence $(x_n)$, we can ignore everything except the subsequence $x_1,x_2,x_4,x_8,\dots$ and pass to a new sequence where this subsequence has property $P$. We can then repeat the exercise for the subsequence $x_3,x_6,x_{12},x_{24},\dots$, and the stability means that we will not lose the property on the first GP. Continuing in this way, we can in fact get all GPs to have any stable and inevitable properties we like. So it seemed to make sense to find out what such properties were. There have been several helpful answers to this question on mathoverflow.

Unfortunately, but not at all surprisingly, for a general sequence the kind of properties we can get are far too weak to be useful for showing that our original sequence can be converted into one with multiplicative structure. (We can get quasiperiodicity, but the kind of thing we are looking for is much much stronger than that.) However, our sequences are very far from general: they are subsequences along geometric progressions of a sequence of bounded discrepancy. This raises all sorts of possibilities. For example, if we can prove that some rather weak property must hold, then perhaps we can pass to some limit and greatly strengthen the property. (An example of this, though not a very interesting one, is that if we have a sequence of density 1 we can turn it into the all-1s sequence.)

I have made a list of weak properties that bounded-discrepancy sequences ought to have, and for some of them I think there ought to be a good chance of proving that they really do have the properties. Rather than repeating the list here, I refer you to the interesting subquestions section on the first page of the Polymath5 wiki.

To summarize what I’ve been saying here, I think that, given that the discrepancy can be very low indeed for very long sequences, the best chance of proving the conjecture may well be to use infinitary methods. I hope that we may be able to prove some weakish facts about hypothetical counterexamples to the conjecture that suggest some kind of weak multiplicative structure, and then to argue that in the limit there is a much nicer multiplicative structure — some kind of quasimultiplicativity of the kind we have been observing.

I don’t have much to say about this: we should probably walk before we try to fly.

The other direction.

Of course, I don’t think we should abandon the attempt to find longer and longer sequences of low discrepancy: it has been extremely fruitful already, but the patterns we have found have a tantalizing way of breaking up when the numbers get large. At the moment it feels as though we won’t understand the breaking up properly unless we can get our hands on more data. And it’s probably a bit dangerous to assume that the conjecture is true, or at least it should be healthier to attempt to disprove it as well as attempting to prove it. There are many loose ends arising from the discussion so far, and there are likely to be many more. (See the wish-list section on the experimental results page of the wiki for a sample of these — others are scattered in the comments.) Even if the conjecture is true, it would still be interesting to have an efficient algorithm that could generate better sequences than we know how to describe theoretically. (It feels as though we are closing in on the 1124 sequence, or rather sequences, but it also feels as though the description of those would involve some slightly arbitrary elements that would not generalize straightforwardly to infinity.)

At one point I raised the question of whether it would be good to have separate threads for the theoretical and experimental aspects of this project. At the moment I don’t plan to do this: the two seem too closely intertwined, and the experience with DHJ was that having two parallel threads didn’t really work unless they were fairly disjoint. (That applied both to the material being discussed and, to a lesser extent, to the participants.) So what I hope will happen is that the experimental part of the project will continue apace, but that anyone who might have been holding back on discussing the theory will no longer do so — just as I, by posting this, have no longer done so.

102 Responses to “EDP1 — the official start of Polymath5”

1. Harrison Says:

I have a little comment about (1) and some obstructions that the “odd-smooth” approach would have to get around.

Actually, no I don’t. I can’t quite seem to make it explicit. But there seems to be some tension in the approach between additive combinatorics (specifically, Roth’s and Varnavides’ theorems) on the one hand, and the abc-conjecture on the other, which suggests that:

1. If the density of odd-smooth numbers in an interval is high enough, then we can partition the set of odd-smooth numbers in this interval into multiples of a few (actually, probably just one) large, smooth number, and numbers where the square part is large.

2. Any approach to showing that there have to be intervals with a large number of odd-smooth numbers must either work with a relatively low density — to avoid Roth’s theorem coming into play — or rely to a large extent on some special properties of odd-smooth numbers, and won’t be very “generic” at all.

• gowers Says:

I’ve tried to work out where the abc tension comes in and haven’t managed. I see the general idea: that we have three numbers $a+b=2c$ by Roth’s theorem and the odd-smooth property looks as though it could cause the radical of $abc$ to be too small. But I can’t actually get the calculations to work. Can you flesh out the details of your comment a bit more?

• Harrison Says:

I can’t quite get the calculations to work either, which is why I was vague in the first place :). Thinking about it further, it now seems unlikely that the abc conjecture is an actual barrier here, since a, b, c can very easily have prime factors of size around $c^{1/3}$.

Working through the logic, though, I think that you can say (on the abc conjecture) that if you replace “odd-smooth” by “(not 0 mod 4)-smooth” then any solution to $a + b = 2c$ in “(not 0 mod 4)-smooth” integers requires a, b, c to have a large common factor. This may or may not have density implications via something like Roth’s theorem; if it does, it might at least rule out some avenues of attack that would generalize to this case.

2. Polymath5 – Is 2 logarithmic in 1124? « Combinatorics and more Says:

[…] Update: Gowers’s theoretical post marking the official start of Polymath 5 appeared. […]

3. kristalcantwell Says:

Is it clear that every sufficiently large quasi-multiplicative sequence does not contain a large multiplicative sequence? That seems one possible way of proving that every sufficiently large quasi-multiplicative sequence has a large discrepancy prove that it contains a large multiplicative sequence having already shown that all sufficiently large multiplicative sequence has a large discrepancy. So that would be a question to look at if it that has not been done already.

4. Terence Tao Says:

Just a quick post here. The following model problem may be worth studying: suppose we have a function $f: {\Bbb F}_2^n \to \{-1,+1\}$ with the property that $f(x+e_1)+\ldots+f(x+e_i) = O(1)$ for all x and $1 \leq i \leq n$, where n is large (we can even take O(1) to be “bounded in magnitude by 2” if one likes). What does this tell us about f? This is a variant of the “free” discrepancy problem in my previous comment, but with a gratuitious 2-torsion hypothesis thrown in to simplify things (more generally, we would work in ${\Bbb Z}^n$ rather than ${\Bbb F}_2^n$).

A quick use of Plancherel and an averaging argument suggests to me that the Fourier coefficients of f must necessarily be concentrated in those frequencies $(\xi_1,\ldots,\xi_n)$ where the partial sums $\xi_1+\ldots+\xi_i$ are bounded on the average. One could imagine exploiting hypercontractivity to make this bound stronger; if we could exhibit one large Fourier coefficient in this set, this would get close to establishing approximate quasimultiplicativity of f (in the spirit of point 2 in the post).

I also had an unrelated comment. The boundedness of partial sums $f(d)+\ldots+f(nd)$ also implies the boundedness of interval sums $f(md)+\ldots+f(nd)$. So one can define the _modified_ discrepancy to be the sup of $|f(md)+\ldots+f(nd)|$. This only differs by a factor of 2 from actual discrepancy, but it allows for a few more subproblems on the way to showing that discrepancy 2 cannot occur. For instance, modified discrepancy cannot equal 1, because this would mean that f((n+1)d) is always equal to -f(nd), and thus f(2), f(3), f(4) all have a different sign, which is absurd. But can one rule out an infinite sequence of modified discrepancy 2? What about 3? Ruling out modified discrepancy 4 would rule out actual discrepancy 2.

• Terence Tao Says:

I can’t quite eliminate the modified discrepancy 2 sequences yet, but I found an interesting structure (after taking limits) which looks exploitable.

It’s convenient to work on the rationals rather than the integers. We can then dilate a function $f: {\Bbb Q} \to \{-1,+1\}$ by rationals without affecting the (modified) discrepancy. We can then take a sequence of these dilates and use the Arzela-Ascoli theorem to find a subsequence that converges pointwise; this pointwise limit will also have modified discrepancy. This is basically what Tim refers to as “improving” the +-1 sequence.

Now, suppose that f(x) and f(2x) have the same sign, say +1, for some rational x. Then, with modified discrepancy 2, f(3x) has to have sign -1. But also, looking at the sequence f(2x/2), f(3x/2), f(4x/2), we see that f(3x/2) has to have sign -1 also. So we see in particular that if f(x)=f(2x), then f( 3/2 x ) = f( 3x ) (with the opposite sign), and more generally f( (3/2)^j x ) = f( (3/2)^j 2x ) for any positive j.

Now, this means that for any x, either f( (3/2)^j x ) = f( (3/2)^j 2x ) for all sufficiently large j, or else f( (3/2)^j x ) = – f( (3/2)^j 2x ) for all sufficiently large j. If one then dilates f by (3/2)^j for large j and then takes a subsequence limit to improve the sequence, we then conclude that for this value of x, either f( (3/2)^j x ) = f( (3/2)^j 2x ) for _all_ j (including negative j), or f( (3/2)^j x ) = -f( (3/2)^j 2x ) for all j. As the rationals are countable, one can pull this trick off for all x by a diagonalisation trick. In other words, after enough improving of the function f, we end up with a function f such that f(2x)/f(x) is invariant in the 3/2 direction.

This seems to be the limit of what one can accomplish just by looking at the places 2 and 3. (Basically, the only consecutive blocks of 23-smooth numbers are 1,2,3,4 and 8,9, and the latter is too short to contradict modified discrepancy 2.) But it is now tempting to throw in the place 5 and see what one gets; the blocks 1,2,3,4,5,6 and 8,9,10 look very useful. (There is also 15,16, but again this is too short, and I don’t think there are any other blocks of 235-smooth numbers that are likely to be of use.)

• Terence Tao Says:

Ah, just realised that the above analysis also shows that (after improving the sequence) one has f(3x/2) = – f(x) for all x. In particular f(4x) = f(6x), which implies f(5x) = – f(4x) by modified discrepancy 2. Unfortunately this renders the 8,9,10 block useless: we now have f(9x)=f(4x) and f(10x) = -f(4x), so one always is consistent with modified discrepancy 2 in this case. But the antisymmetry in the 3/2 and 5/4 directions do mean that we can effectively replace the 3 and 5 places by 2.

Now we can work on 7. Since f(5x) = -f(4x) and f(6x) = -f(4x), we must have f(7x) = f(4x) by modified discrepancy 2. Then we can work on the 14,15,16 block: f(14 x) = f(8x), f(15 x) = f(8x), and so f(16x) = – f(8x) (and also f(13 x) = – f(8x) ). So now we also have antisymmetry in the 2 direction, and symmetry in the 13 direction.

Presumably one can keep going like this and eventually get a contradiction. Actually, a model problem would be to first show that a completely multiplicative function cannot have modified discrepancy 2; the above argument has a highly multiplicative flavour to it…

• obryant Says:

In this postIn this post, I called $|f(md)+\dots+f(nd)|$ the drift. For a multiplicative sequence, which is determined by its values on the primes, the drift must be at least 5. Trees connecting assignments on the primes to the drift, and where the drift must happen, are linked to from this post.

I never finished the run for drift=6 (only finished with f(2)=f(3)=1), as I think it would take my desktop 3 or 4 days to complete, and I wanted to go another direction.

• Terence Tao Says:

Ah, thanks for the link and terminology.

It turns out I made some sign errors in the previous discussion, and a lot of what I said is not justified. The best I can do so far is reduce to an f for which one always has

f(2x) f(3x) f(4x) f(6x) = 1

for every rational x, but then I get stuck. I’ve written up what is salvageable from my previous comment at

http://michaelnielsen.org/polymath1/index.php?title=Drift

5. Polymath5 officially starts « Euclidean Ramsey Theory Says:

[…] officially starts By kristalcantwell Polymath5 has officially started. There have been some interesting patterns showing up in some of the examples and now some […]

6. Sune Kristian Jakobsen Says:

One reason that 1 (to prove that completely multiplicative sequences cannot have bounded discrepancy) is difficult:

(If I understand you correctly) we are hoping to prove that no sequence have discrepancy at most C, by finding a set of intervals, such that all multiplicative function must have a sum greater that 2C in absolute value on at least one of the intervals. If this approach such simplify things a lot, we need the maximal length of these intervals to grow slowly (e.g. linear in C), but Walters’s example* shows that the intervals has to have length exponential in C.

*Walter’s example is the multiplicative function, that sent n=1 (mod 3) to 1, n=2 (mod 3) to -1 and 3 to -1.

• Sune Kristian Jakobsen Says:

“*Walter’s example is the multiplicative function, that sent n=1 (mod 3) to 1, n=2 (mod 3) to -1 and 3 to -1.”
That is not exactly Walter’s example, but a improved version of it. Both this function and Walter’s example (that sends 3 to 1) can be used in my comment

• gowers Says:

I’d be thinking of $C$ here as a constant (but an arbitrary one) so that ${}\exp(C)$ is “merely” another constant. Meanwhile $n$ would be tending to infinity, so compared with $n$ the intervals could potentially be pretty small. Of course, it means that the proof isn’t likely to be all that explicit, but somehow I don’t find your objection too worrying. (Actually, I’d be pretty interested by a proof that a multiplicative sequence had to have discrepancy at least 6, say.)

• Sune Kristian Jakobsen Says:

I thought the advantage of considering these intervals was, that we only had to look at a small number of terms in the sequence. But if the length of the intervals grows about as fast as the length of longest sequence with discrepancy C, the advantage is not so big (but perhaps this grows much faster than exponential). Of course I agree, that a proof that multiplicative sequences had to have discrepancy >5 would be very interesting.

Somewhat related: A few days ago I wondered, what is the shortest proof, that a multiplicative sequence of length 247 has discrepancy>2? I was thinking of something like the 3th paragraph of this comment, (except that it should prove that we cannot obtain better sequence even if we change the value at some of the small primes). Or perhaps I should ask, what is the shortest proof, that an infinite multiplicative sequence have discrepancy >2?

• gowers Says:

By the way, I agree that your remark is an important one to bear in mind, and you may well be right that the Walters example rules out any argument of the type I was hoping for. However, if one is not trying to find an exact bound, but merely to prove that a super-long sequence has to have discrepancy at least $C$, then perhaps there is some small hope.

7. Klas Markström Says:

As for The other direction, I am still working on finding longer sequences. Since I did not expect sequences of this length when I started my first run I had left some inefficiencies in the setup of the computation. Now I have made some changes and restarted the whole process. This means that there will be a few days when I am mostly eliminating things and no new examples shows up, but after that step is done it should be much easier to find longer sequences, if they exist.

8. Greg Martin Says:

I’d like to lobby against the terminology “odd-smooth”, because of the “-smooth”. While numbers with only small prime factors have traditionally been called “smooth” in the English literature, the word “smooth” is nondescriptive here and suffers from overuse in mathematics. The superior (in my mind) alternative is “friable”, which is not otherwise used in math and has the proper meaning for the definition (and is also the same word in English and French).

I find “odd-friable” an improvement, but still a bit, well, odd. The product of all primes that divide n with odd multiplicity – that is, the natural representative of the image of n when projected into ${\mathbb Q}^\times/({\mathbb Q} ^\times)^2$ – is often called the “squarefree kernel” of n. Perhaps some term like “friable kernel” would suit this situation? (I know we’re looking for an adjective….)

9. Greg Martin Says:

Regarding 1a: for a fixed squarefree number $s$, the number of integers not exceeding $x$ whose squarefree kernel is $s$ is simply the number of squares not exceeding $x/s$, which is $\sqrt{x/s} + O(1)$. Therefore the number of integers whose squarefree kernel does not exceed $K$ is $\sum_{s\mid P} \big( \sqrt{x/s} + O(1) \big) = \sqrt x \prod_{p\le K} ( 1 + 1/\sqrt p ) + O(2^{\pi(K)})$, where $P$ is the product of all primes not exceeding $K$. For fixed $K$ this basically tells us what we need to know; even if $K$ grows with $x$, one can quickly show that the product multiplying the $\sqrt x$ is about $\exp(\sqrt K/\log K)$. The error term $O(2^{\pi(K)})$ is already pretty small (depending on what $K$ is), but one might be able to make it even smaller by considering the Dirichlet series whose coefficients are the characteristic function of the numbers whose squarefree kernel is at most $K$. This makes me lean towards the opinion that such numbers are actually quite regularly distributed.

One additional note: a typical number around $n$ has around $\log\log n$ prime factors, rather than $\sqrt{\log n}$.

• gowers Says:

Oops — I know that (about the typical number of prime factors) but temporarily “unknew” it when I wrote what I wrote. I was muddling it up with the standard deviation being $\sqrt{\log\log n}$.

I don’t have strong feelings about what the best expression should be to replace “odd-smooth” though I admit that “odd-smooth” itself feels suboptimal …

• harrison Says:

I’m going to reply here, because this is where it’s the worst. For some reason WordPress doesn’t seem to want to parse “K” in TeX for me. Are other people having this problem?

10. Jason Dyer Says:

If anyone wants a nice, tidier problem (and is smarter than me, I haven’t managed it yet), it would be good to have a by-hand proof that when bounding the discrepancy below by -1 and above by 2, the maximal length of sequence is 41.

Because nearly everything in the sequence is forced, I have been trying to prove solely from the multiplicative structure that the forcing will happen.

11. gowers Says:

I’ve just noticed something, inspired partly by a remark of Alec’s (I’ll link to it when I can find it again, which I now have), and partly (and indirectly) by this comment of Kristall’s. Recall that we took the first length-1124 sequence and for each $d$ looked at the subsequence obtained by restricting to the values $d,2d,4d,8d,\dots$. We found that they all seemed to be shifts, or minus shifts, of each other.

This meant that we could associate with each $d$ a sign $\epsilon(d)$ that told us whether we took a shift or minus a shift, and this function turned out to be multiplicative (which in many cases is trivially true, and I think we should be able to prove in general without too much trouble, though I haven’t got to that point yet). What Alec remarked in passing was that this multiplicative function itself had low discrepancy.

What I’ve just realized is that that was not a coincidence. Let me sketch a proof of a general result. I’ll suppose that I have a sequence of discrepancy at most $C$, and I’ll make the following rather weak assumption, which holds in the case of the sequence of length 1124: that for each $d$ there is a bias either towards 1 or towards -1 in the sequence $x_d,x_{2d},x_{4d},\dots$. By that I mean that there’s some constant $t$ that’s greater than 1/2, and the same for all sequences, such that the proportion of 1s is at least t or at most 1-t for all such sequences. In practice we have much more than that — we can say that the 1s and -1s form a very nice almost periodic sequence, but that doesn’t seem to be needed here.

Given this assumption, we can define $\epsilon(d)$ to be 1 if the bias is the same as it is for the sequence $x_1,x_2,x_4,x_8,\dots$ and -1 otherwise. I now claim that the function $d\mapsto\epsilon(d)$ has discrepancy at most something like $C/(2t-1)$. The reason is simple. If you can find $d,2d,3d,\dots,md$ such that the modulus of $\epsilon(d)+\epsilon(2d)+\dots+\epsilon(md)$ is at least $K$, then pick some large $N$ and look at the double sum $\sum_{i=1}^N\sum_{j=1}^mx_{2^idj}$. For large $N$, if we sum first over $i$ then we’ll get something like $\sum_{j=1}^m(\epsilon(dj)N(2t-1)$, which we are assuming to be at least $KN(2t-1)$ in modulus. On the other hand, summing first over $j$ and assuming the bounded discrepancy hypothesis we know that it is at most $CN$. This proves the result.

What this shows in particular is that from a quasimultiplicative example (with one extra hypothesis about the bias, which does seem to happen in practice) we get another example that looks as though it has to be completely multiplicative. It isn’t a subsequence as Kristall hoped, but it is still built out of the other example. (Note that the discrepancy can go up, but it remains bounded.)

• Jason Dyer Says:

I’m not convinced we have a bias yet, because we’ve been using a depth-first search for most of our results with a preference for +1. That is, one would expect sequence #1 to have a bias, but what if we take the “middle sequence” (that is, if we have n possible sequences, sequence n/2) does it have a bias?

• gowers Says:

The point here is that if you look at any subsequence of the 1124 sequence of the form $x_d,x_{2d},x_{4d},\dots$ it will show a bias (as long as it’s long enough to do so), and in fact it has a much stronger property that it is plus or minus a shift of a subsequence of the sequence $01011010110110101101011011...$ (which can be defined in many ways — a convenient one for writing it down is to use the substitution rules 0 goes to 01 and 1 goes to 011 and then write down a sequence that is invariant under those substitutions). Here I’m writing 0 instead of + and 1 instead of -.

Since this sequence is biased, so are all its shifts and hence so are all the subsequences of the given form.

We are probably talking slightly at cross-purposes, since I don’t understand your explanation of what you mean by the “middle sequence”.

• gowers Says:

Or maybe I do. Are you saying that you think there could be other 1124 sequences that don’t exhibit any bias? If so, then you could be right, but if they are quasimultiplicative then the relevant group could not have any elements of odd order, since if it did then there would be some $r$ such that as you went along GPs of common ratio $r$ you’d get odd period, and hence bias.

It is still conceivable that one could have a Sturmian sequence with an irrational rotation and an interval of size $1/2$. That didn’t seem to happen in the 1124 example and I have to confess I don’t fully understand why not. If you did have such an interval then there would be no discernible signs $\epsilon(d)$.

• gowers Says:

Here’s a rather nicer way of thinking about the proof above. Suppose we have sequences $x^{(1)},\dots,x^{(N)}$ all of which have discrepancy at most $C$. Suppose also that for every $n$ the average of the $x_n^{(i)}$ has modulus at least $a$. Ah, I see now that I need to use the stronger hypothesis that the modulus is almost exactly $a$. (That applies to the first version of the argument too, but we get this assumption in the quasimultiplicative case.) Then obviously the discrepancy of the average of the sequences is at most $C$, so if we divide through by $a$ we get (almost exactly) a $\pm 1$ sequence of discrepancy at most $C/a$.

We now take a quasimultiplicative sequence $x$ and apply the above simple averaging argument to the sequences $x, T_2x, T_2^2x,\dots$. This shows that under certain nice circumstances there are other notions of limit that give you improved structure.

• Jason Dyer Says:

I am talking about the specific algorithm we have been using to generate the sequences. Because we are listing the first ones that comes out, it will have a bias. Early choices in the tree have a tendancy to go to +1 if there is a choice.

If it generates 5,000,000 sequences, we should look at the 2,500,000th sequence generated if we want to find the least bias (at least by my understanding of how Alec’s algorithm works).

Now, it’s possible I’m confused on what exact data you’re using, in which case I apologize.

• Klas Markström Says:

Jason,
The initial sequences which I have generated, and Alec used as seeds for his search, do not come in a simple DFS order. The algorithm I use is not a simple DFS and does not have an easily described order.

However I agree that we should be careful no to draw too strong conclusions from these examples since there could very well be more sequences which are just harder for the program to find.

• Sune Kristian Jakobsen Says:

Even if it had a ordering, the middle sequence wouldn’t necessarily be the right one to look at. Suppose we wanted to study all sequences of length n. Now the first sequence would be 000….00. This sequence has a lot of structure, that is not shared by all length n sequences. But if we look at the middle sequences we get 100….00, which also has a lot of structure!

• Jason Dyer Says:

Klas, point taken. (Although you can drop the seed in Alec’s program to nothing if you like.) Could you give more details on how your algorithm is implemented?

Sune, in our case, using a binary tree, without the discrepancy condition the middle sequence would be +-+-+-+-+-+-+ … but I do see how that might not be the best one to look at.

• gowers Says:

Jason, it’s also worth bearing in mind that I’m talking about bias along geometric progressions rather than bias of the whole sequence (which will trivially tend to zero if the discrepancy is bounded). In fact, I’m even talking about bias along GPs of one fixed ratio. An extreme case might be where the function is multiplicative and $x_2=1$. Then if you look at any GP with common ratio 2, the values along it will be constant, so maximally biased, but the sequence itself could be pretty unbiased.

• Jason Dyer Says:

I’m also speaking geometrically (in the ordering of the DFS I am thinking of, given the middle sequence any geometric sequence the would be as close to as possible the alternating one), but I wasn’t thinking in terms of one fixed ratio.

12. Mark Lewko Says:

I have a simple remark related to 1a. This is probably well-known, but I haven’t seen mentioned yet.

A natural question (which was raised by Obryant https://gowers.wordpress.com/2010/01/14/the-erds-discrepancy-problem-iv/#comment-5123) is if we can find arbitrarily large intervals of $k$-odd-smooth numbers. If we drop the -odd- the answer is no by Størmer’s theorem (see: http://en.wikipedia.org/wiki/St%C3%B8rmer%27s_theorem). In fact I believe (although can’t find a reference at the moment) that there is a generalization of Størmer’s theorem that states that there are only finitely many pairs of $k$-smooth numbers differing by a fixed number l (where the bound on the number of pairs is independent of l). If this is the case it should imply that we can’t find arbitrarily long arithmetic progressions of $k$-smooth numbers.

It might be interesting to see if the proof of these results can say anything about the $k$-odd-smooth setting.

• Harrison Says:

Mark, do you mean arbitrarily *large* or arbitrarily *long*? I thought you meant the latter, which is trivially false — every interval of length 4k must contain at least one integer which is not k-odd-smooth — but it occurs to me you might mean the former, so…

Actually it occurs to me that it should be entirely possible to modify the method for Stormer’s theorem — note that they come in pairs as solutions to a bunch of Pell equations, and prove that the solutions grow too quickly for there to be infinitely many consecutive ones.

• Mark Lewko Says:

It is true that you can’t have an AP of length 4k of k smooth numbers by trivial considerations (which I overlooked when I made the comment). However, Størmer’s theorem does give you a bit more, since there is no dependence on what the $k$ fixed primes are in the theorem.

• Harrison Says:

Oh, wait! There are totally infinitely many consecutive pairs of k-odd-(smooth|friable) numbers! Because if C is squarefree and k-friable, then every solution to $x^2 - C y^2 = \pm 1$ gives us a consecutive pair.

I suspect this isn’t true, though (abc conjecture), if we replace “squarefree part” by, say, “fifth-power-free part.” Which makes me wonder if it might in fact be easier to prove that a completely multiplicative function from $\mathbb{Z} \rightarrow \{1, e^{2\pi i/5}, e^{4 \pi i/5}, e^{6 \pi i/5}, e^{8 \pi i/5}\}$ has unbounded discrepancy.

13. obryant Says:

The ABC conjecture is in the mix here, too. I state it like this: let q(a,b,c) be $\log(c)/\log(r)$, where r is the squarefree part of abc. For every positive epsilon, there are only finitely many triples (a,b,c) with a+b=c and $q(a,b,c) \ge 1+epsilon$, and none with $q(a,b,c) \geq 2$.

So if $a_ii, c_i$ are k-friable and b is fixed, with $c_i-a_i=b$, then r is at most the product of b and the primes smaller than k, and since $q(a_i,b,c_i) \leq 2$, we get a bound for $c_i$.

Hence, there are only finitely many k-friable numbers with fixed difference.

14. Guy Srinivasan Says:

I’ve been thinking about trying to estimate the number of sequences up to index N with discrepancy D<=2. Here's my first attempt and its results:

Start with 1 sequence (the empty sequence), and estimate the number of new sequences of length n+1 for each sequence of length n. Start with a really simple model: if a HAP might constrain xi, which means the partial sum of the HAP is at +2, 0, or -2 right now, then take the probability that it is at +2 as 1/4, at 0 as 1/2, and at -2 as 1/4. As if the partial sum has been a random walk through [-2,2] so far.

I need to find how many HAPs are relevant to constraining xi. For each factor j of i, HAPj is relevant if it's had an even number of values so far, but not 0. So if i!=j and i%j==0 and i/j-1%2==0 then HAPj's relevant. Count Ki relevant HAPs.

Now just say that xi has two options with probability (1/2)^Ki, one option with probability (1/2+1/4)^Ki-(1/2)^Ki, and zero options otherwise. Let the expected number of sequences S(n+1) be S(n)*E(options), and run…

This value tracks the values on the wiki fairly well for a while, only off (underestimating) by a factor of 10 by N=59, and off by a factor of 117 by N=94. The obvious missing piece (there may be others) is that the probabilities aren't really 1/4, 1/2, and 1/4. They're weighted more toward transitioning from +1 to 0 than +1 to +2, since at minimum if a partial sum is at +1 you expect slightly more x=+1 than x=-1, which means you're slightly more likely to be constrained by some other HAP to make the next value -1. For example, setting x7, if HAP1 was at +1 at x5 then there were three +s in the sequence and only two -s which makes it more likely than "expected" that HAP2 is going to force x6 to be -1. And *that* means HAP1 ends up at +0 at x6 more often than at +2 at x6 when transitioning from +1 at x5.

I haven't yet tried to incorporate the bias into the counting, but the code's on the wiki.

• Guy Srinivasan Says:

I factored in a lot of the bias, and now it is off by only a factor of 7 up to N=97. What’s very nice is that the estimated count goes up and up (to around 10^40ish) for a while, but starts to come back down at about N=400, and drop to ~31 at N=1124, then ~6 at N=1125. On the one hand, clearly there’s some coincidence going on there since we know there are over 2 million at N=1124. On the other hand, this at least gives very close to the shape our data leads us to expect.

The bias I factored in, explicitly, was this: when trying to assign xi, any HAP f that might constrain our choice might also be biased further towards or away from being at +2 or -2 than my naive 1/4 apiece. One way to detect a bias is to ask if on HAPf’s last index there was another HAP g which forced the issue. If so, then the more indices HAPf and HAPg share, the more likely HAPg was to have forced the issue in such a way to drive HAPf to 0 rather than to +-2.

Some things I know are missing:
-What’s the best way, given that k of HAPf’s 2k-1 values are +1 and that it shares s of it’s values with HAPg, to estimate the odds that HAPg’s partial sum is +2 rather than 0 or -2? Right now I’m just estimating the probability any one of HAPg’s values is +1, then finding the probability of exactly 1 more +1 than -1, exactly the same number, and exactly 1 less, then normalizing through to sum to 1 since we know those are the only possible results in a valid HAPg. Clearly there are some approximations there that could be tightened.
-What about when 2+ HAPs are maybe constraining a value, if one of them has partial sum +2 then since it shares some indices with the others they are more likely to be at +2 than if we didn’t know the partial sum of the first? i.e. HAP’s partial sums should be positively correlated in general, right?

• gowers Says:

I was hoping people would start to produce some heuristic arguments of this kind, and the fact that you’ve matched the actual data so closely is very interesting. A quick reaction to your last question, though I don’t know how relevant it is: in the multiplicative case, all HAPs are either equal or minus each other, so if you have two HAPs pointing at a point, then whether or not they have the same partial sum depends only on two different partial sums of the sequence itself, and those shouldn’t be positively correlated unless they’re almost the same partial sum. That leads me to think that there shouldn’t be much correlation in the general case: if we condition on the discrepancy being at most 2, then if we have a partial sum in one sequence, it gives you only a very small bias in that sequence, which can easily be taken care of by the values in the other sequence — or something like that.

• Guy Srinivasan Says:

I am still working on this, but there is a rather serious error in the code on the wiki right now. Previously I had estimated the probability a relevant HAP would constrain index i and then used that to estimate the probability index i would have 0, 1, or 2 options when it had x constraining HAPs and an arbitrary HAP had probability p of being +2 or -2 instead of 0. Then I added additional logic to determine each individual constraining HAP’s bias toward 0, but instead of redoing the second part of the calculation, I accumulated all of the bias into the generic probability p and THEN said “oh but we have x HAPs”. So if for example each HAP had a 60% chance of being 0 instead of 50%, and there were 10, I’d say “okay a generic HAP here has 1.5^10/(1+1.5^10) chance of being 0 instead of 50%, and I have 10 HAPs each with that probability”.

So the fact that it got down to ~6 exactly at 1125 really WAS a coincidence. But I’m still optimistic about the approach since the distribution has the right shape, just haven’t gotten to fixing that glaring error and adding in some additional thoughts I had. Probably a weekend project.

15. harrison Says:

A question, not related to anything in particular but which I just thought of and which might be easy to prove and at least give us some practice proving statements about discrepancy:

Suppose that $f, g$ are functions from $\mathbb{Z}$ to the complex unit circle, and both of them have unbounded discrepancy. Then does their pointwise product $fg$ necessarily have unbounded discrepancy?

• gowers Says:

If there exists a sequence $h$ of bounded discrepancy, then let $f$ be a random sequence and take the two sequences $f$ and $hf^{-1}$. With probability 1 that will be a counterexample.

• harrison Says:

I figured something like that might happen, but didn’t see it because I was trying to come up with a constructive counterexample. $f, g$ have pretty large discrepancy too, we can guarantee at least of order $n^{1/2}$, right? I wonder if we can find two linear-discrepancy functions that multiply to give something like a logarithmic-discrepancy function, but I sort of doubt it would be useful for anything other than assuaging my curiosity.

• gowers Says:

A rather similar trick works. Let $h$ be a function of low discrepancy. Let $f$ equal 1 on odd numbers and $h$ on even numbers, and let $g$ equal 1 on even numbers and $h$ on odd numbers. Then $fg=h$ and both $f$ and $g$ have partial sums that grow at rate roughly $n/2$.

16. Alec Edgington Says:

I’ve tried generating sequences satisfying the formula given at the end of Tim’s post here. That is, writing $\theta(n)$ for the function that’s $1$ when $\lfloor (n+1)\frac{\sqrt{5}-1}{2} \rfloor = \lfloor n \frac{\sqrt{5}-1}{2} \rfloor$ and $-1$ otherwise, the sequences satisfy

$f(2^a 3^b 5^c 7^d) = \theta (5 + a + b + 2c - 5d) (-1)^{b+c+d}$.

Notwithstanding the anomalies in the 1124 sequence at $7^2$ and multiples of $7^3$, it is possible to generate rather long sequences this way. The longest I’ve found so far has length 714; I’ve posted it on the wiki.

I’ve also tried the same experiment with the formula

$f(2^a 3^b 5^c 7^d) = \theta (a+b+2c) (-1)^{b+c+d}$,

which also gives long sequences. It’s interesting that all the shift coefficients in these examples are Fibonacci numbers.

• gowers Says:

There are several further things that would interest me to know about these sequences. One is whether they are (as one would definitely expect) very similar to the first 1124 sequence. Another is whether if you allow some backtracking (that is, you allow the pattern to degenerate a bit even for 7-smooth numbers), then you can extend the sequences you’ve generated quite a bit further.

At the moment, my tentative picture of what is going on is this. To construct a good example, it’s very important to make good decisions at smooth numbers, because these will belong to lots of HAPs. For some reason that I have yet to understand (though I’ve had a few thoughts in this direction and will post them soon), these quasimultiplicative choices with golden-ratio-generated Sturmian sequences seem to be particularly good. And once one has chosen the values at smooth numbers, the remaining numbers tend not to be in too many HAPs, so one can choose the values there with much less danger of running into trouble (where “trouble” at $n$ is defined to mean that the partial sum along one HAP whose next term is $n$ is 2 and the partial sum along another is -2).

Something I may try if I can face it is to write out the values of the original 1124 sequence by hand at all smooth numbers, then fill in all values that are forced (by drift considerations), and finally try to fill in the remaining values using some kind of looking forward. Then again, there’s so much checking to do that I probably can’t face it. So perhaps what I’ll do instead is describe as carefully as I can the algorithm I’d use if I could program. Actually, perhaps I won’t bother with even that, since I’m beginning to think that enough is forced for a simple backtracking algorithm to be fairly efficient. Perhaps a more fruitful line of enquiry would be a theoretical one: to try to predict when trouble will arise if you’ve sorted out all smooth numbers in some nice way.

• Alec Edgington Says:

The first question is easy to answer, at least for the 714 sequence satisfying the first formula (the other is on my computer at home and I don’t have it to hand, but I’ll have a look later). It differs from the 1124 sequence in quite a few places (173 in all), including the primes 17, 23, 29, 31, 41 and 47.

• gowers Says:

That provides some sort of support for my hypothesis: that if you sort out the smooth numbers then the constraints for the non-smooth numbers will be reasonably easy to satisfy (for a while at least). I’ll still be interested to know what happens if you do some backtracking. I’d also be interested to know whether a promising formula for smooth numbers (perhaps involving primes up to about 19 or something like that) together with a semi-intelligent greedy/backtracking algorithm (semi-intelligent meaning that you fill in anything that’s forced first and only then make choices) could lead rapidly to a very long sequence with discrepancy at most 3.

In general, I would say that although these quasimultiplicative examples are very interesting and I want to understand them better, the fact that they can almost certainly be converted into (somewhat worse) multiplicative examples means that I no longer think that they are the key to solving the problem. In short, I think that, contrary to my original expectations, Step 3 of my suggested proof strategy is fairly easy. In my next post I’ll write a revised proof strategy to take into account this and a comment of Sune’s that has somewhat changed my view of how to approach the problem for multiplicative functions.

17. harrison Says:

A “proof-of-concept” idea for 1: Instead of viewing it as a linear algebra problem over $\mathbb{F}_2$, think of the algorithm as constructing a SAT instance with variables $x_p$ for p prime, corresponding to the pth element of the sequence being +1; and “forced elements” as being implications from some formula to another formula. Then we can take everything down to CNF and we have an instance of CNF-sat which encodes (at least to some extent) whether we can build a multiplicative sequence.

I believe that past some threshold of clauses to variables, it becomes true with probability one that a formula won’t be satisfiable. Does the process of turning the multiplicative question into a SAT instance surpass this threshold?

• gowers Says:

There are some difficulties in carrying out this idea. The main one is that if you convert “has discrepancy at most $C$” into an instance of SAT, then the instance is likely to be pretty atypical. (For example, you may end up with conditions like “an even number of values in this set A should be -1” and to convert that into an instance of SAT you’ll have to have a lot of clauses with literals in A.) So random instances of SAT (or rather, k-SAT for some fixed k, which makes the situation even worse) are not obviously a good model for what is going on.

18. gowers Says:

I’ve had another go at creating a long completely multiplicative sequence of discrepancy 2 by hand. This time, instead of making what felt like the most sensible choice at every stage, I was a bit greedier. What I did was to choose 1 at each prime unless I felt forced to choose -1. However, I was pretty assiduous about following the consequences of my decisions — that is, I looked forward a lot.

The results were as follows. I first filled in all the perfect squares, since they are forced to be 1s. Then I chose f(2)=1 (even though I know from Alec’s experiments that this would ultimately doom me to not finding the best possible example). Since f(4)=1 that forced f(3)=-1 and f(5)=-1. It also forced f(7)=1 but for a subtler reason, which allows me to explain a simple lemma that had a big influence on my calculations. Because of my choice at 2, I had f(8)=f(9)=1. That implies that the partial sum up to 8 is zero. Why? Well, it has to be even. It can’t be -2 if f(8)=1 and it can’t be 2 if f(9)=1. The general lemma is that if f(2n)=f(2n+1) then the partial sum up to 2n has to be zero. This allows one to deduce lots of partial sums later on even when there are gaps in the sequence.

The first unforced prime after my choice of f(2)=1 turned out to be 11, so my decision to be greedy told me to choose f(11)=1. Following the consequences of that, using multiplicity, the lemma that tells me that certain partial sums are zero, and the requirement that no partial sum lies outside [-2,2] (sometimes iterating if I was lucky and able to deduce the value at other primes) I ended up filling in a large proportion of the numbers up to 160, but I didn’t quite manage to force the value of 13.

At that stage I found an awkward choice. I could prove quite easily that f(13) had to be -1, but only by considering what happened around 13 and 26 together. But that somehow doesn’t count as greedy. I’d think of it as a backtracking algorithm with only a tiny amount of backtracking. So I tried out f(13)=-1 and found a contradiction because I had already dedcued that the partial sum up to 132 had to be 0 and I was now able to deduce that f(133)=f(134)=f(135)=1.

After this exercise, I am beginning to lose my (already tenuous) belief in the approach to the problem for multiplicative functions that I outlined in the post above. Somehow that approach relies too heavily on just the multiplicativity, whereas proving a lower bound of 3 seems to involve an interplay between multiplicativity and the discrepancy condition.

Perhaps a good next step would be to try to guess how much is forced if you are given the values at the first $k$ primes and are told that the discrepancy is at most $C$. (For larger $C$ it would seem that there is much less forcing from the discrepancy condition. In particular, nothing like the trivial lemma above holds. But perhaps there is some much more complicated lemma that one proves by induction on $C$.)

19. gowers Says:

A few more comments on “improving” examples.

First, here’s an argument, I think correct, that given any sequence of 0s and 1s, you can always find a pointwise limit that has a density. (As I commented in the post, having a density isn’t a stable property, but it is still of some interest to get it.)

Let $\delta$ be maximal such that for every positive $\epsilon$ and every $n$ there is an interval of length at least $n$ in which the sequence has density at least $\delta-\epsilon$.

Then for sufficiently large $n$, inside every interval of length $n$ the sequence has density at most $\delta+\epsilon$. (Here, $\epsilon$ is arbitrary.)

It follows that if $J$ is a long interval in which the sequence has density at least $\delta-\epsilon$, then it cannot have more than a rather short initial segment inside which the density is less than $\delta-\sqrt{\epsilon}$ (by averaging). So remove a maximal such initial segment. The result is an interval in which the density is at least $\delta-\epsilon$, such that for no initial segment is the density less than $\delta-\sqrt{\epsilon}$.

Taking a pointwise limit of such intervals (in an obvious sense) gives a sequence where all sufficiently long initial segments have density between $\delta-\sqrt{\epsilon}$ and $\delta+\epsilon$.

By putting a few of these subsequences together, we can obtain a pointwise limit of density $\delta$.

20. Matemáticos colaboram num projecto aberto de investigação conjunta através do blogue do Professor Timothy Gowers « problemas | teoremas Says:

[…] E, em 19.01.10, o artigo EDP1 — the official start of Polymath5, no Gowers’s Weblog,  lança oficialmente o ataque ao The Erdős discrepancy […]

21. gowers Says:

Continuing from my previous comment, let me now ask the following question (which will be relevant if we want to find a limit where the density is defined both along the GP 1,2,4,8,… and along the GP 3,6,12,24,…). Suppose you are given two 01-sequences. Is it possible to find a sequence of translates so that if you take either 01-sequence then those translates will tend pointwise to a sequence with well-defined density?

I’m going to give an even sketchier argument. I’m pretty sure it can be firmed up. We can think of our pair of 01 sequences as a single sequence that takes values in $\{0,1\}^2$, or equivalently $\{1,2,3,4\}$. What I claim is that given such a sequence one can find a pointwise limit of translates such that every symbol has a well-defined density. To do this, my strategy is to choose $(\delta_1,\delta_2,\delta_3,\delta_4)$ that is maximal in the lexicographical order and has the property that to any desired approximation you can find arbitrarily long intervals $J$ such that inside $J$ the densities of the 1s, 2s, 3s and 4s are roughly $\delta_1,\delta_2,\delta_3$ and $\delta_4$. Then an argument like the one in the earlier comment should be enough to show that we can remove some beginnings of intervals (hmm … maybe this isn’t quite so obvious actually, since helping one symbol could hinder another, so this argument will need to be worked out carefully, but I’d be amazed if the claim is false) and end up with the desired pointwise limit.

If that argument can be got to work, then the next step is to prove it for a countable collection of sequences. This time we find for each $n$ a system of translates such that the first n sequences converge pointwise along those translates to sequences with well-defined densities.

Now by diagonalization choose a sequence of translates such that all the sequences converge pointwise to sequences with well-defined densities.

If that works (and I stress again that it isn’t carefully checked), then it easily implies the following result: if there is a $\pm 1$ sequence of bounded discrepancy, then there is a $\pm 1$ sequence of bounded discrepancy such that all subsequences of the form $x_d,x_{2d},x_{4d},\dots$ have well-defined densities.

If we have such a sequence, then we can associate with each $d$ the density $\lambda(d)$ (by which I mean the limit of the average of the first $n$ terms) of the sequence $x_d,x_{2d},x_{4d},\dots$, and the simple averaging argument I presented yesterday shows that the sequence $\lambda(1),\lambda(2),\dots$ has bounded discrepancy.

However, it does not have to be a $\pm 1$ sequence or even proportional to a $\pm 1$ sequence. For that we need to relate the different GPs somehow, which we can only do (as far as I can see) by picking other common ratios and playing a similar game with them.

The condition that the sequence of $\lambda(d)$s has bounded discrepancy is not all that interesting unless the $\lambda(d)$ are bounded away from zero. This suggests to me that the following question is important.

Question. In all the good examples so far found it seems that subsequences along GPs have the following compactness property: for every $\epsilon>0$ there exists $N$ such that amongst any $N$ translates of the sequence there will be a pair with pointwise product of density at least $1-\epsilon$. Must this always be the case?

If so, then the subsequences must all be approximately periodic in a pretty strong, and I think usable, sense, which is what we observe.

A first step would be simply to prove that for every sequence of bounded discrepancy there is a GP that has a pointwise product with some translate of itself of non-zero density. I feel sure this must be a realistic target. In fact, I feel very optimistic at the moment that we can prove that if there is a counterexample then there is a completely multiplicative counterexample.

• Terence Tao Says:

I think one can use the Furstenberg correspondence principle and the Birkoff ergodic theorem to answer your first question affirmatively (existence of a subsequence limit in which all four sequences have a density). The orbit closure of the original sequence can be given an invariant measure (the Krylov-Bogulybov theorem, see e.g. these lecture notes of mine), and then Birkhoff tells us that all functions have pointwise averages converging almost everywhere. In particular there is a point where the four functions corresponding to the situation of interest have a joint limit. Furthermore the same argument works for countably many statistics.

The answer to your latter question though seems to be negative – you are asking for all minimal systems to be compact, and this is not the case. The simplest example is, as you well know, given from 2-step systems, e.g. consider the one-dimensional function f(n) that is +1 when sqrt{2} n^2 has fractional part less than 1/2, but -1 otherwise. This is already a minimal sequence (can’t be improved further by subsequences) but is not compact in your sense. (It is though a compact extension of a compact system: thus given N translates, there is a pair whose product is not of density 1-eps, but instead is a sequence which is itself compact, in that given M translates of _that_ sequence, one pair has product with large density.)

• Terence Tao Says:

p.s. Regarding the first question, the buzzword here is “generic point”. See for instance these notes of mine.

Regarding compact extensions, actually I oversimplified a bit in defining compact extension. A sequence would form a compact extension of a compact system if for every epsilon, there was a finite number of sequences, such that every shift of the original sequence could be approximated to within epsilon by a linear combination of these finite number of sequences, multiplied by compact (=almost periodic) sequences. To describe all this cleanly one needs the machinery of Hilbert modules; see these lecture notes of mine.

• Terence Tao Says:

p.p.s. I should have clarified that I was writing additively when defining f(n), etc., rather than multiplicatively. To work multiplicatively, one could e.g. define f(n) based on the fractional part of $\sqrt{2} m^2$, where m was the power of 2 (say) dividing n.

• gowers Says:

And I should have clarified that my question (the one I think you’re referring to when you say “latter question”) was not about general sequences but about (hypothetical) sequences of bounded discrepancy. There the computational evidence suggests that there is some kind of compactness property — or at least I think it does. Or have I misunderstood what you were saying?

22. gowers Says:

Here are some thoughts towards an answer to a modification of the question I asked in the last paragraph of my previous comment. What I set out to show was that for any $r$ you can find two GPs of common ratio $r$ such that the corresponding subsequences correlate positively with each other.

I attempted to do this as follows. I’ll take $r=2$ for simplicity but the argument is general. Pick very large $M$ and $N$ and define a function $f(m,n)=x_{2^mn}$ for each $1\leq m\leq M$ and $1\leq n\leq N$. If there is no correlation between any of the sequences corresponding to the GPs $(2^mn)$, then this function is quasirandom, which is a symmetric statement in the two variables, so there is also no correlation between the values taken at the HAPs $(1,2,3,\dots)$, $(2,4,6,\dots)$, $(4,8,12,\dots)$, etc. This means that we have a whole lot of HAPs along which there is bounded discrepancy, and we have no correlation between them.

This feels as though it ought to be impossible, though I don’t see a proof that it is. At any rate, it ought to be harder to find several noncorrelating counterexamples to the EDP than it is to find just one — this could be a question worth looking into.

It also prompts me to ask whether there is an interesting sense in which the set of sequences of discrepancy at most $C$ is compact. A “boring” sense in which it is compact is that any sequence of counterexamples has a subsequence that converges pointwise. (That’s compactness in the weak star topology or something.) I call that boring because it applies just as well to the set of all $\pm 1$ sequences, and therefore doesn’t tell us anything about EDP.

So what might an interesting topology be? Well, one notion of closeness that seems relevant is one that applies to translates of Sturmian sequences: given enough of those you can find two that agree on a set of density at least $1-\epsilon$ (in the very strong sense that between any two disagreements there is a gap of length about $\epsilon^{-1}$). That is too much to expect in general (except in the trivial sense that EDP is quite likely to be true, so the set of counterexamples may well be empty). So is there some notion of closeness of two examples that is much stronger than agreeing on a long initial segment, and substantially weaker than agreeing everywhere except on a rather sparse set, that could be helpful to us?

• Alec Edgington Says:

Since we can associate any bounded-discrepancy sequence with a Dirichlet series convergent in the right half-plane, we can think about notions of closeness for the analytic functions so defined. There are many ways to do this, but perhaps something about the behaviour of the functions near to the real axis would be appropriate.

In the case of completely multiplicative sequences, could it be interesting to consider two sequences close if they differ on a finite (or suitably sparse) set of primes? I suspect this may be too strong, though.

23. gowers Says:

I’m trying to understand better Alec’s examples of multiplicative sequences of length 246. He informs us that they all agree up to 67, and it is notable that with one exception all the primes up to 67 are chosen to have value $1$ if they are quadratic residues mod 5 and -1 if they are non-residues, while 5 itself takes the value -1. In other words, the function is very similar to a character-like function of the generalized O’Bryant type.

The exception is 41, which gets the value -1. Why should this be?

The answer is that 41 is the first place where the partial sum goes outside the range [-2,2]. One can see this quickly by noting that the partial sum up to $n$ is equal to the number of 1s in the base-5 representation that are coefficients of even powers of 5 minus the number of 1s that are coefficients of odd powers of 5 plus the number of 3s that are coefficients of odd powers of 5 minus the number of 3s that are coefficients of even powers of 5. So to get to 3 with as small a number as possible you have to take $5^2+3.5+1=41$.

So it’s as though the function recognizes that character-like functions work rather well, and deviates from one only when it absolutely has to. (Beyond 67 it deviates a lot, but by then the choices are not unique.)

Why, one might ask, does the function not choose the base-3 character-like function, which has a slower-growing discrepancy? I think I can answer this question as well. The value of the partial sums this time is the number of 1s that are coefficients of even powers minus the number of 1s that are coefficients of odd powers. So the first time you hit 3 is at $3^4+3^2+1=91$. The big difference here is that 91 is not a prime, so we are not free to choose its value. And if we’ve followed the base-3 character-like function as far as 13 then we’ve already committed ourselves to the “wrong” value. But changing such early values will mess things up good and proper: indeed, changing the value at 7 will push us towards the more base-5-like function.

What I don’t get from this is any sense of how one might actually define a particularly good way of choosing the values at primes.

One thought I had, but it raises number-theoretic questions I don’t know the answers to, is this. Is it conceivable that there’s a large, but not too large, prime $p$ such that all the numbers up to 67, apart from 41, are quadratic residues mod $p$ if and only if they are quadratic residues mod $5$. In a fit of madness I tried 41 itself, but it fails at the first hurdle: $7^2$ is congruent to 8, which is not a quadratic residue mod 5. As a result, I now see that one probably needs $p$ to be significantly bigger than 67 (so that you don’t run into trouble for trivial mod-5 reasons when you first go above $p$ in the sequence of squares).

What I’m getting at here is that there could conceivably be some miraculous sequence of primes such that their quadratic residues give you longer and longer very good multiplicative sequences.

Indeed, here’s a closely related question. Take a prime $p$ and take the quadratic residues mod $p$ up to some $m$ that’s quite a bit smaller than $p$. Could that form a sequence with no large partial sums? Trivial Fourier methods don’t seem to rule it out. Maybe something nice could be proved using quadratic reciprocity — no time to try that just at the moment.

• gowers Says:

I did find a quick moment after all. A simple use of quadratic reciprocity plus the Chinese remainder theorem will give a prime $p$ that has whatever sequence of initial residues you want, but it will have to satisfy a condition for each prime that would typically hold with probability about 1/2, so one would normally expect $p$ to be exponentially large in the number of primes you want to fix. So it isn’t interesting to find just any old prime $p$ — it has to be not too big. The question I still haven’t answered is whether if you take 5, 41 and all the primes up to 67 that are 2 or 3 mod 5, does the fact that all but 5 and 41 are non-residues mod 5 give this set a special structure that makes it possible to find a smaller $p$ that works (because, say, there is a positive correlation between the conditions that have to hold)?

• gowers Says:

It occurred to me that my explanation of why the multiplicative examples don’t start out like base-3 character-like functions was incomplete. Just because those functions first go wrong at 91 and 91 is composite, it doesn’t follow that we can’t make an adjustment at a slightly earlier prime. Given that the partial sum up to 91 is too big, we find ourselves wanting to change the value at an earlier prime from 1 to -1. This means that the earlier prime must be 1 mod 3. Looking back, we find that that rules out 89 and 83. In fact, the largest prime less than 91 that is 1 mod 3 is 79, and changing the value at 79 from 1 to -1 does help for a while. One first runs into new trouble at 111=81+27+3, which without the adjustment at 79 would give -1, but with it gives -3. So can we find a nice friendly prime that’s 2 mod 3 that we can change from a -1 to a 1? Yes, 107 will do the job. So we change 79 from 1 to -1 and change 107 (or 101 if we prefer) from -1 to 1.

We are surely doomed to get into trouble at some point soon. However, I don’t think I can claim any longer that I see why base 5 is preferred to base 3 (other than the boring reason that ultimately it happens to work better).

• Alec Edgington Says:

Inspired by this observation, I tried searching for multiplicative discrepancy-3 sequences, initializing the search by sending $p$ to $-1$ if and only if $p \in \{2,3\}\,(\mathrm{mod} 5)$. The search very quickly got to $852$. I’ve put the resulting sequence on the wiki; the formatting in 60 columns makes the quasi-periodicity quite apparent.

• gowers Says:

That’s very interesting. It would also be interesting to see what happens if you send 5 to -1 as well, but otherwise keep things as they were. I would expect that to be good too (though not necessarily better, as the partial sums of the quadratic residues mod 5 do not remain positive).

• Alec Edgington Says:

If you mean initialize the search with 5 going to -1 (rather than impose it as a constraint), I tried that too. It produced exactly the same sequence. This suggests (but doesn’t prove, since the search hasn’t finished) that for discrepancy 3 it is better to send 5 to +1.

• gowers Says:

Let me check I understand what you’re doing. Do you take the base-5 character-like sequence and then backtrack from it (where that means something like that you look for the first place where the partial sum leaves [-3,3] and make the change to the largest possible prime that will deal with the problem, and then iterate, and if you run out of options then go back to the second largest possible prime, etc. etc.)?

• Alec Edgington Says:

Correct. But, I apologize! I hadn’t actually tried sending 5 to -1. (What I had stupidly done was to send the prime indexed by 5 to -1.) When I did just try seeding the search with 5 going to -1 (and otherwise as before), it generated huge sequences (beyond the limit of 1500 I’d set in the program). Let me investigate this further.

• Alec Edgington Says:

In fact it gets (rapidly) to 1530. I’ve put the resulting sequence on the wiki.

• gowers Says:

That’s great — and I hope it won’t be the end of the story.

It would be quite useful to have a complete list of primes where corrections have taken place — that is, primes congruent to 1 or 4 that take the value -1 or primes congruent to 2 or 3 that take the value 1. One might expect such primes to come in pairs, with each pair enclosing a value that would otherwise have caused problems — or at least, one might expect this to start with, until the corrections themselves start to cause problems.

It occurs to me that we might even be able to prove a theoretical result here. If we take a character-like function, it’s not just the case that we have to go exponentially far before reaching a partial sum that’s bigger than $C$, but it’s also the case that even when we do start having large partial sums, they are very infrequent and surrounded by long portions of sequence with very low drift. So if $n$ is a troublesome value, then an obvious strategy for dealing with it is to find primes $p$ and $q$ close to $n$ and on either side of it. If the partial sum at $n$ is too large, then we also want $p$ to be congruent to 1 or 4 and $q$ to be congruent to 2 or 3. We then change the value at $p$ from 1 to -1 and the value at $q$ from -1 to 1.

There is a huge range in which we can look for suitable $p$ and $q$ (since the drift is small) — easily large enough for quantitative versions of Dirichlet’s theorem to guarantee that they can be found.

Of course, once you change the values at $p$ and $q$, you have to change the values at all multiples of $p$ and $q$. At least to start with, these will also come in nearby pairs, but pretty soon that is no longer the case, and after a while one will be making more and more adjustments until, presumably, all trace of the base-5 structure is gone. The interesting question is whether that allows us to reach a sequence of superexponential length. At the moment it feels to me as though it will remain exponential but give a bigger constant in the exponent.

My rough reasoning is that the interference from changed values of $p$ and $q$ could start to bite by about $pq$, especially if there are several adjustments being made and not just the first two. But that’s so rough that it could well be wrong. Perhaps we could get from an exponential construction to something more like $C^C$. Perhaps we could at least argue heuristically that some improvement like that should be possible.

24. Steven Heilman Says:

If you take the special sequences of length 246 and 1124, and you take their Fourier and Mellin transforms, what is the result?

25. gowers Says:

This is an attempt to understand why using a formula of the kind discussed in this comment for the values of a sequence at smooth numbers should be so helpful. In particular, I don’t have a good explanation for the appearance of the golden ratio (especially as the 1112 sequence seems to show that you can do pretty well without it).

But to get some kind of handle on the question, let’s consider the sum along the GP 1,2,4,8,… . That goes up by 1 every time the integer part of $n\alpha$ changes (where $\alpha=(\sqrt{5}-1)/2$) and down by 1 every time it doesn’t change. Therefore, one plus the increment is twice the increment of the $\lfloor\alpha\rfloor$, which implies that the partial sum of the first $k$ terms of this sequence is $2\lfloor\alpha k\rfloor-k$. Because $\alpha>1/2$, this gives us a slight positive drift.

For now let me forget about integer parts (though this is dangerous and I expect them to be important later in a way that may make the calculations difficult, just as integer parts cause a lot of difficulty if you try to estimate the number of primes from 1 to n in a naive way).

What I want to do is estimate the partial sums if I ignore everything except for the smooth numbers — that is, I replace all other values of the sequence by zero.

Let’s now try for summing the values at all numbers of the form $2^a3^b$ that are less than $n$. I’m taking the sum of the values with $b=0$ to be $\log_2n(2\alpha-1)$. We associate the sign -1 with the number 3 (in the 1124 sequence), so the sum over numbers of the form $2^a3$ should be more like $-\log_2(n/3)(2\alpha-1)$, which we can write as $-(\log_2n-\log_23)(2\alpha-1)$. Continuing like this for all powers of 3 up to $\log_3n$ we get $\sum_{b=0}^{\log_3n}(\log_2)^{-1}(-1)^b(\log n-b\log_3)(2\alpha-1)$, which I’ll sloppily estimate as $(\log_3n/2)\log 3(2\alpha-1)$, which equals $(\log n)(2\alpha-1)/2$.

At this stage I’m forced to take the errors into account a bit more carefully, since for each $b$ the estimate will only be correct up to $O(1)$ and I’ve taken $\log_3n$ values of $b$. But I think the total error is in fact bounded, roughly speaking because the dubious term (the one nearest to $\log_2(n/3^b)$) is moving along the Sturmian sequence at a fairly constant rate and changing sign with each step. I don’t have the stomach to try to make this remark precise. All I want to say is that the fact that we divided by 2 was quite encouraging, as it raises the hope that one might be able to divide by 2 an unbounded number of times and thereby get sublogarithmic growth — but recall that this is only for partial sums along smooth numbers, so it wouldn’t mean we had sublogarithmic growth for the whole sequence.

However, if the partial sums along the smooth numbers are particularly well controlled, then perhaps it is possible to argue that the values at other numbers can be chosen to keep the discrepancy sublogarithmic, the rough reason being that each such number is contained in only a small number of HAPs, so the choices shouldn’t affect each other too much and constraints shouldn’t conflict too much.

26. Jason Dyer Says:

Progress on case bounded below by -1 and above by 2: I still don’t have a proof the longest sequence is 41, but I wanted to mention I am having better luck by presuming the sequence is a standard C=1 sequence and treating it with that structure, considering the points where the sequence reaches 2 to be a separately structured set of anomalies.

27. obryant Says:

There are a number of “discrepancy” notions that are relevant to this problem (I list the ones I’ve though of below). The one we’ve focused on so far has superior qualities for computation. From a theoretical standpoint, however, perhaps one of the other discrepancies is the correct one. Also, computing examples that are extremal with regards to the other discrepancies would be enlightening, although I don’t see how to make the computations.

I’ll repeat from the wiki a bit, and then state the discrepancies. For any +/- 1 valued function x from the natural numbers and let $\mathcal{A}$ be a set of finite sets of natural numbers, we set $\delta(\mathcal{A}, x)$ to be the maximum of $\left|\sum_{a\in A} x_a\right|$ as A ranges over $\mathcal{A}$. By a sub-HAP, I mean a sequence of the form md, (m+1)d, …, nd.

Erdos’s Homogeneous Discrepancy Problem: Is there any x for which $\delta(\mathcal{A}_N,x)$ is a bounded function of N, where $\mathcal{A}_N$ is the set of

(i) HAPs whose largest element is at most N;

(ii) HAPs whose length is at most N;

(iii) HAPs whose difference is at most N;

(iv) sub-HAPs whose largest element is at most N;

(v) sub-HAPs whose length is at most N;

(vi) sub-HAPs whose difference is at most N.

Only (i) and (iv) are obviously finite: (i) is where most attention has gone and (iv) we’ve been calling the drift.
For multiplicative functions x, (i) and (ii) are the same. There are obvious inequalities connecting (i) and (iv), (ii) and (v), and (iii) and (vi), but the latter 3 are more refined.

There is also a suite of discrepancies like: “HAPs whose difference is at most N, length at most 2N, and largest element $N^2$“, but cataloging those seems pointless just now.

Walters’ multiplicative example gives x with $\delta_i \ll \log N$, $\delta_{ii} \ll \log N$, $\delta_{iii} = \infty$, $\delta_{iv} \ll \log N$, $\delta_{v} \ll \log N$, and $\delta_{vi} = \infty$.

I’d appreciate a check on the following x. Each positive integer a can be written in the form $a=\sum_{n\ge1} b_n (n!)$ in a unique way with $0\leq b_n \leq n$. Set $x_a=1$ if $\sum b_i$ is even, and $x_a=-1$ otherwise. Then $\delta_{iii}(N)$ is finite for every N. Assuming that’s correct, I wonder how fast it grows?

Is $\delta_{ii}(100)$ always at least 3? If so, then there’d likely be a computational proof. If not, knowledge of how that happens may help us stay out of blind alleys.

Also, there are a number of functions x that have arisen for which these provide a list of subproblems. Multiplicative functions, of course, but also automatic sequences (as raised by Borwein, Choi et. al.) and in particular the Thue-Morse sequence hasn’t been knocked out of the parked (although we do know its discrepancies go to infinity, we haven’t decided how quickly they do so). Also the Matryoshka sequences with composite moduli haven’t been fully optimized.

• Jason Dyer Says:

Another theoretical one to try (that can be a pain experimentally) are HAPs that allow up to x terms that force the sequence (for any d) outside the given discrepancy.

28. gowers Says:

I’ve thought of a much weaker question than EDP that might conceivably slot into that nice intermediate area between “trivial question” and “just as hard as the whole problem”.

I’m not sure if this is standard, but given a set $\Sigma$ of infinite 01-sequences, I’ll define the entropy of $\Sigma$ to be $\lim n^{-1}\log S_n$, where $S_n$ is the number of sequences of length $n$ that are initial segments of some sequence in $\Sigma$. This is closely related to the rates of growth that people have been looking at already. The limit may not always exist: let’s not bother about that for now.

Now let $\Sigma$ be the set of all infinite sequences of discrepancy at most $C$. We’d like to prove that $\Sigma$ is empty, but perhaps we could start by trying to prove that it has zero entropy. Roughly speaking, this would mean that the rate of growth of the set of sequences of length $n$ couldn’t be exponential. Turning that round, if you have $\exp(cn)$ sequences of length $n$, for sufficiently large $n$, does it follow that at least one of them has discrepancy greater than $C$?

It occurs to me that this question may be fairly simple. By the Sauer-Shelah lemma, if you have $M$ binary sequences of length $n$ and $k$ is such that $\binom n0+\dots+\binom nk\leq M$, then there must be a set $A$ of size $k$ such that every possible assignment of 0s and 1s to $A$ is the restriction of some sequence in your set. If $M$ is exponential in $n$, then this makes $k$ proportional to $n$. So it would be sufficient to answer the following question.

Question. Let $C$ and $c$ be positive constants. Let $n$ be sufficiently large, and let $A$ be a subset of $\{1,2,\dots,n\}$ of size at least $cn$. Is it possible to assign pluses and minuses to the numbers in $A$ in such a way as to force every sequence that agrees with those signs on $A$ to have discrepancy greater than $C$?

This feels to me like quite a nice question (but I’ve only just thought of it so I may turn out to be wrong). As yet I don’t have any instinct about how much easier it should be than the original question.

• Gil Says:

Maybe it is enough to assign only pluses?

NB Gil almost immediately withdrew this suggestion — see the next main comment.

• Alec Edgington Says:

Interesting. One could ask, given $C$ and $n$, what is the size $M_C (n)$ of the largest subset of $\{1, \ldots, n\}$ on which we can choose the assignments freely and still be able to extend to a discrepancy-$C$ sequence on $\{1, \ldots, n\}$. For example, $M_1 (11) = 2$, since one can choose the values at $1$ and $11$ freely. If I understand the question, you’re asking whether $\lim_{n \rightarrow \infty} \frac{1}{n} M_C (n)$ is zero or undefined for all $C$. (And what we actually expect is that it’s undefined.)

If we generalize slightly, replacing the limit $C$ by upper and lower limits $-a$ and $b$, and define the function $M_{a,b} (n)$ analogously, I think I’ve shown (but would have to double-check) that $M_{1,3} (83) = 4$.

• gowers Says:

I’m not sure I quite follow — I would expect that limit to be zero (assuming that the conjecture is true).

• Alec Edgington Says:

If the conjecture is true then $M_C (n)$ is undefined because there is no subset on which we can choose values freely and still extend. Or am I barking up the wrong tree entirely?

• gowers Says:

My fault — I was wrongly thinking that in that case the largest subset was the empty set, whereas in fact it’s the set of sets that you can take that is empty …

29. Gil Says:

(sorry, I withdraw this suggestion)

• gowers Says:

Since I too had the same idea, let me briefly explain why it doesn’t work, just to save others from having it as well. The problem is that if there is a long sequence of discrepancy $C$, then $A$ might be precisely the set of places where that sequence takes the value 1. If so, then restricting the values on $A$ to 1 isn’t a huge problem!

Since the values you choose on $A$ must therefore depend quite seriously on $A$, and since it is far from clear how to make the choice, the best chance might be to choose them randomly. So a very tentative conjecture is this. If you choose the values randomly on a set of size $cn$, then every extension to the whole of $\{1,2,\dots,n\}$ has discrepancy at least $\epsilon(c)\sqrt{n}$. (But all I actually care about is getting it to be unbounded, so if this conjecture is way too optimistic, it doesn’t matter too much.)

• gowers Says:

It occurs to me that this question could be investigated empirically. Suppose you fix a number like 1000, choose a random set of 200 numbers (say) between 1 and 1000, assign random values to those 200 numbers, and then try to choose the remaining values in such a way as to minimize the discrepancy. How well can you do?

A slightly better version of the experiment might be to choose the random set, but then to remove from it a few points so that you don’t get any long subintervals of HAP sequences.

What I would hope to see is a discrepancy that is forced to be quite large. Some of the numbers above might have to be adjusted, however.

One other possibility might be to choose the values at odd numbers randomly. Then the “odd discrepancy” would be around $\sqrt{n}$, or around 30 if $n=1000$. How far can one bring that down by cleverly choosing the values at the even numbers?

• gowers Says:

Actually that last question was not a good one. If you choose the values at odd numbers randomly, then somewhere you’ll get a drift proportional to $\sqrt{N}$. Then either the drift of the whole sequence or the drift of the even numbers in that same interval will also be proportional to $\sqrt{N}$. But I think one could ask the question for a random assignment to numbers that are 1 mod 3.

30. gowers Says:

I’d like to resurrect an idea that I had quite some time ago and that Sune was also interested in at one point. It’s based on the observation that the EDP is equivalent to what I think of as the “EDP with arbitrary shifts”.

I can be slightly more precise now than I was before. Let us choose, for each prime $p$, a “privileged” residue class $a_p$ mod $p$. And let us call any initial segment of that residue class (by which I mean that one picks some $N$ and takes the set of all numbers that are congruent to $a_p$ mod $p$ and at most $N$) a privileged arithmetic progression mod $p$, or PAP. Let us also do the same for prime powers, choosing $a_{p^r}$ in such a way that if $s, then $a_r$ is congruent to $a_s$ mod $p^s$. (This is to guarantee that the infinite PAP mod $p^r$ is a subset of the infinite PAP mod $p^s$.) And finally, let us say that any intersection of PAPs is a PAP. By the Chinese remainder theorem, for every $m$ there will be exactly one $a_m$ such that the set of positive integers congruent to $a_m$ mod $m$ is an infinite PAP with common difference $m$.

Also by the Chinese remainder theorem, for every choice of $a_{p^r}$ for finitely many prime powers $p^r$ we can find some $N$ that is congruent to $-a_{p^r}$ mod $p^r$ for every $p^r$. If we then shift the origin to $N$ look at the HAPs beyond $N$, they behave just like PAPs. So if we can find some choice of shifts $a_r$ that force unbounded PAP discrepancy, then we have proved EDP.

The motivation for bringing up this concept is that it seems at least possible that many phenomena we observe at the moment are misleading and depend on the fact that all the HAPs are “lined up” at zero, whereas the above observation shows that the lining up is not really an essential feature of the problem. (What is essential is that we’re allowed just one congruence class for each $m$ and that the congruence classes must be “consistent” with each other.)

I think it would be interesting, and perhaps even quite enlightening, to investigate empirically some fairly random shifted version of the problem. To set it up, choose a very large integer $L$ randomly (between $10^{20}$ and $10^{21}$, say) and define the infinite PAP mod $m$ to be the set of all positive integers that are congruent to $L$ mod $m$. A finite PAP is just an initial segment of that arithmetic progression, and the discrepancy of a sequence is the discrepancy with respect to all finite PAPs. What I would like to know is this: if we search for sequences of PAP discrepancy 2, do we end up finding very long examples, just as in the lined up HAP case, or do the best ones tend to be much shorter? (I suppose it’s also possible that they could end up being much longer — that too would be interesting.)

If they end up much shorter, it suggests that it might be easier to prove that PAP discrepancy was unbounded (for some suitable, or perhaps random, sequence of shifts) than to prove it for HAP discrepancy. And yet the two problems are equivalent.

One further remark is that it is not at all obvious what structure one might expect to find in the PAP problem. For instance, I can’t think of a useful notion of multiplicativity, or a geometric progression, or anything like that. So there’s at least some hope that by shifting the HAPs we rule out some of the examples that cause us difficulties when we try to prove anything in the homogeneous case.

• Alec Edgington Says:

I’ll try this tonight. I can see why it might be harder to find long sequences (because you’d tend to hit numbers that intersect a lot of PAPs sooner), but also why it might be easier (because for that very reason you’d be forced to adjust to the stronger constraints earlier on). It will be interesting to see the results.

• gowers Says:

If you do find long sequences it will particularly interesting to see whether they have any discernible structure. Very broadly speaking indeed, I am holding out the hope that random shifts will somehow randomize the sequence itself and thereby force it to have a large discrepancy. It’s conceivable that the discrepancy could even be much bigger than logarithmic, because in order to get to the point where the shifts are suitably random one might have to go exponentially far out (so it wouldn’t contradict the Walters example).

• Sune Kristian Jakobsen Says:

“It’s conceivable that the discrepancy could even be much bigger than logarithmic, because in order to get to the point where the shifts are suitably random one might have to go exponentially far out (so it wouldn’t contradict the Walters example).”
Doesn’t the corresponding shift of Walters example give a logarithmic discrepancy? I haven’t checked, but I assumed that that was the case when I wrote this comment. I just realized, that what you suggested in this comment is a special case of what I wrote about (looking at a set of intervals, instead of just at one interval). But I think this is the most (only?) interesting case, and it would be very interesting to see the result of Alec’ experiments.

• gowers Says:

I wondered about that and then sort of reassured myself that it didn’t. But now that you mention it I think that my self-reassurance was wishful thinking. In fact, it looks as though all you have to do is a shift to make sure that the biggest power of 3 you care about is shifted to the right place and that any number that isn’t a power of 3 will take care of itself.

So I take back my hopes of a much larger discrepancy, and I also take back my statement that I didn’t see much chance of structure. I still can’t work out quite what structure I would expect to see though. Perhaps that will become clearer if we get an example to look at.

• Sune Kristian Jakobsen Says:

Another idea:
Instead of trying to find the longest sequence _beginning_ in N with discrepancy (or should I say drift) at most C, we could try to find a sequence _around_ N. So we decide the value in N, then N+1, then N-1,… I don’t really think that this would be a better experiment, but it might at least give a different result.

Note that working backwards from N (decide N then N-1 then N-2,…) would be the same as working forward from -N or from (n!-N) for some large n.

• gowers Says:

I’m slightly struggling with the following question: is it easy to tell from the values between some large $N$ and $N+n$ whether a $\pm 1$ function is the restriction of some multiplicative function?

We can characterize multiplicative functions as mod-2 sums of characteristic functions of HAPs with prime common difference (when you convert 0s into 1s and 1s into -1s). Thus, we could characterize restrictions of multiplicative functions to $[N,N+n]$ as mod-2 linear combinations of all prime HAPs that intersect that interval.

That gives us what one might call a global way of testing whether a function is a restriction of a multiplicative function, and it gives a polynomial-time algorithm for it. But that’s not what I mean by “easy to tell”. To see what I mean, consider an alternative characterization of multiplicative functions: they are ones where $f(1)=1$ and if $ab=cd$ then $f(a)f(b)f(c)f(d)=1$. That one might call a local characterization. (If we omit the $f(1)=1$ then we characterize functions that are either multiplicative or minus a multiplicative function.) Equivalently, if you go along any GP, the values either alternate or are constant (and $f(1)=1$).

Is there anything that could count as a local characterization in the shifted case? I don’t have a precise formulation of this question, or I suspect it would be easy to answer.

• Alec Edgington Says:

This is exactly what I was wondering on my way home from work, and I’m hoping the investigation of shifted low-discrepancy sequences might shed some light on it. I have just got my program working — I hope. Too soon to draw many conclusions: I’ve just run it a couple of times, with different $L$ and different seeds; each time it’s got to lengths in the 200s without difficulty (and without terminating).

31. Sune Kristian Jakobsen Says:

Meta comment:
The one level threading makes it a bit difficult to make sure that you read all the comments. One way to do it is to follow the RSS feed, but it only contains the latest 10 comments (you could also click the “Notify me of follow-up comments via email”, but then you get a lot of emails). It would be great if you would increase this number in Settings -> Reading -> “Syndication feeds show the most recent” (I think this works for post feeds as well as comment feeds, but I’m not sure).

32. Thomas Sauvaget Says:

Meta-comment: as a follow-up to Tim’s and Mark’s requests from a few days ago, I’ve now made a longer step-by-step tutorial about using C/C++ source code on a Mac with screenshots, which vastly improves the quick one I had made then (and which was still full of hurdles for a novice I realized) http://thomas1111.wordpress.com/2010/01/18/installing-xcode-and-using-gcc-on-mac/
I hope to add some more tutorials on basic C and C++ to modify a code, and on shell scripting to process data files, in the coming two weeks.

33. Polymath Reflections « Combinatorics and more Says:

[…] a miniPolymath leading to collectively solving an olympiad problem, and an intensive Polymath5 devoted to the Erdos discrepency problem which is running as we speak. We have a plan for resuming […]