This is an emergency post, since the number of comments on the previous post about Erdős’s discrepancy problem has become unwieldy while I’ve been away enjoying a bit of sunshine in Luxor. My head is full of amazing details from the walls and columns of ancient Egyptian tombs and temples. The main thing I didn’t see, or at least not until I finally saw them sticking up through the mist from my plane window yesterday morning, was the pyramids, since those are near Cairo. However, I didn’t feel too bad about that as my guide book, the Lonely Planet guide to Egypt, assured me that the pyramids never fail to disappoint (which was just one of many little gems of bad writing in that book).
For the first time for ages, I was completely away from all email and internet for a week, but just before I left, the results of the polls, as they then stood, about the next Polymath project were already suggesting that the Erdős discrepancy problem was the clear favourite, so I couldn’t help thinking about it a certain amount while I was in Egypt. I said that the process of choosing the next problem was not going to be fully democratic, but the fact is that I love the Erdős discrepancy problem and am currently somewhat grabbed by it, so I see no reason to go against such a clear majority, especially as it also has a clear majority of people who say that they are ready to work on it.
Now that I’m back, I see that the longest sequence yet found with discrepancy at most 2 in any arithmetic progression of the form has gone up to 1124, and is exhibiting some clear structure. So yet another argument for choosing this particular project is that it has in a sense already begun. I’d like to regard what is happening now as a kind of preliminary stage before the true launch. The fact that such long sequences with low discrepancy can exist, and the fact that they appear to have to have some structure, are two extremely helpful pieces of information, since they place very definite constraints on what any proof could look like. For example, it no longer seems all that likely that the true bound for the best possible discrepancy of a sequence of length is logarithmic in , and even if it is, the proof will not be some easy induction if the bound is logarithmic to a very high base. And I don’t think one can rule out the possibility that there exists an infinite sequence of bounded discrepancy: at any rate, it seems foolish to stop trying to find longer and longer sequences, since there is almost certainly more that we can learn from them.
There are a couple of terminological decisions I’d like to make collectively before we start properly. Then main one is what we should call an arithmetic progression of the form . I’ve never managed to come up with a good answer to this, and the lack of a good phrase for it is annoying when writing about the Erdős discrepancy problem, so any suggestions would be very welcome. [While writing this post, I found a description of the problem that called them homogeneous arithmetic progressions. That seems a pretty good solution to me, so perhaps we should go for that unless someone comes up with a clear improvement.] A more minor decision is how to refer to the problem itself. I think I favour EDP, since three-letter acronyms seem somehow right. The only argument against that is that we talked about DHJ rather than DHJP (or DHJT), but I think that is outweighed by the catchiness of EDP: ED just doesn’t do it for me in the same way. A yet more minor decision is what number to give to this project. I’m pretty sure it should be polymath5 (following on from Terry Tao’s polymath4, which was the search for a deterministic algorithm for finding primes). But before giving it that tag I thought I’d quickly check in case anyone thinks I’m wrong. Another decision I’d like to make, since not having made it clearly yet is another source of slight annoyance when I try to write about the Erdős discrepancy problem, is what counts as a positive answer. I’m not sure what Erdős actually conjectured, but I’d like to assume that the conjecture is that all sequences have unbounded discrepancy. Therefore, if I refer to a positive answer, that is what I mean, and a counterexample is an infinite sequence with bounded discrepancy. Another practical question: is there some way of displaying good examples so that their structure can be more easily seen? At the very least it might be good to put them in tabular form, perhaps with each row of some highly composite length such as 24. If I just see a long line of pluses and minuses and I want to know what the subsequence is like, then it is extremely tedious to find out. I don’t know what WordPress’s support for tables is like, though. If anyone can help me to display the sequence of length 1124 (and others) in a nice way, I’d be extremely grateful.
At some point reasonably soon, I plan to write a post that will describe various approaches to the problem that do not work, in the hope of stimulating a discussion about what kind of approach could conceivably work — which at the moment is not at all clear to me (at least if the conjecture is true). That post, rather than this one, will be the “official launch” of the project, and it is at that point that I shall start working seriously on the problem.
For the benefit of anyone who does not want to wade through well over a hundred comments on the previous post, here is a very quick summary of what we know now that we did not know when I put up the post. The main thing is that it is possible to have extremely long sequences of discrepancy at most 2 (meaning that the sum over any homogeneous arithmetic progression has modulus at most 2). But there is more to say. A simple observation that I mentioned in the previous post is that multiplicative sequences, by which I mean sequences such that for every and , are good candidates for low-discrepancy sequences, since the discrepancy of such a sequence is bounded by the largest size of any of its partial sums, which sort of means that there is less to check.
Of course, that is not a very convincing argument, because the price we pay for having less to check is a huge constraint on the sequence: we are now free to choose its values only at primes and everything else is determined. And the initial experimental evidence quickly showed that multiplicativity was indeed too strong a constraint if one is looking for the longest possible sequence with a given discrepancy, since the longest sequence of discrepancy 2 is much longer than the longest multiplicative sequence of discrepancy 2. However, what the examples given in the comments on the previous post are suggesting is that a weaker form of multiplicativity may well still hold, perhaps in some approximate sense. We can characterize a multiplicative sequence as a sequence that starts with and has at most two distinct subsequences of the form (one of which will be minus the other — there will be exactly two unless the original sequence is 1 everywhere). The weaker property is simply that there should be a small number of distinct subsequences. The very long examples don’t quite exhibit this, but they do show something a bit like it: if you look at the subsequences of the given form, then there are six beginnings that appear a lot. This has led to a suggestion that another way of building good examples is to take an Abelian group , to choose a function such that for all and , and to compose that with a function from to . In the light of what we have seen, a good choice for appears to be . It turns out that if a sequence has only finitely many distinct subsequences , then it must have a subsequence of this form.
It doesn’t look as though the example of a sequence of length 1124 is of this form for the group , but it certainly does seem that we can get some understanding of the sequence (which we do not yet fully have) by looking at it with very much in mind. If the conjecture is false, then it seems possible that to produce an infinite sequence of bounded discrepancy one would start with a long finite sequence produce with the help of a small group, and one would then introduce complications that would eventually be explained by the presence of a much larger group, and one would iterate this process. (That is intentionally vague — I do not know how to make it less so.)
Incidentally, the account of the problem I alluded to above, by Josh Cooper, gives it as an open problem whether a multiplicative sequence can have bounded discrepancy. I have some dim memory of being told by a number theorist once that it couldn’t: does anyone know? The most obvious -valued multiplicative sequence, the Liouville function, defined to be -1 raised to the power the number of prime factors of n, has partial sums that grow at rate that is, I think, known to be at least in the rough region of and at most at this sort of rate if and only if the Riemann hypothesis is true. [Added later: I now know that both these statements are correct.] But what about an arbitrary multiplicative sequence? We know from the Walters base-3 sequence that the partial sums can grow quite slowly, so perhaps this question really is also unsolved, in which case it would make an excellent auxiliary project: a strictly easier question that now appears to be highly relevant to the main question.