Apologies for the attention-seeking title, but that really is the purpose of this post. I want to draw attention to some ideas that are buried in more comments than most people are likely to want to read, because I think there is a chance that all that stands between where we are now and a solution to EDP is a few soluble technical problems. It goes without saying that that chance is distinctly less than 100%, but I think it is high enough for it to be worth my going to some trouble to lay out as precisely as I can what the current approach is and what remains to be done. I’ll try to write it in such a way that it explains what is going on even to somebody who has not read any of the posts or comments so far. The exception to that is that I shall not repeat very basic things such as what the Erdős discrepancy problem is, what a homogeneous arithmetic progression is, etc. For that kind of information, I refer the reader to the front page of the Polymath5 wiki.
Representing the identity.
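Let me now discuss an approach that doesn’t work. (If you have been keeping up with the discussion, then this will be familiar material explained in a slightly different way.) Let $N$ be a large integer, and if $P$ and $Q$ are two HAPs contained in $\{1,2,\dots,N\}$ then write $P\otimes Q$ for the matrix that is 1 at $(x,y)$ if $x\in P$ and $y\in Q$ and 0 otherwise. In other words, it’s the characteristic function of $P\times Q$. Note that if $f:\{1,2,\dots,N\}\to\mathbb{R}$ then $\langle f,(P\otimes Q)f\rangle=\sum_{x\in P}f(x)\sum_{y\in Q}f(y)$. Let us write $\sum_{x\in P}f(x)$ as $f(P)$, etc.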
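As a sanity check on the identity just stated, here is a minimal Python sketch (not part of the original argument) that builds the 0/1 matrix of $P\times Q$ for two small HAPs and confirms that the quadratic form it defines is the product of the two HAP sums. The helper names `hap`, `tensor`, `quadratic_form` and `hap_sum`, and the choice $N=12$, are illustrative assumptions.

```python
import random

def hap(d, length):
    """The HAP {d, 2d, ..., length*d}."""
    return [d * k for k in range(1, length + 1)]

def tensor(P, Q, N):
    """N x N matrix that is 1 at (x, y) if x is in P and y is in Q, else 0 (1-indexed)."""
    P, Q = set(P), set(Q)
    return [[1 if (x in P and y in Q) else 0 for y in range(1, N + 1)]
            for x in range(1, N + 1)]

def quadratic_form(M, f):
    """<f, M f> for a sequence f stored as a 0-indexed list."""
    n = len(f)
    return sum(f[x] * M[x][y] * f[y] for x in range(n) for y in range(n))

def hap_sum(f, P):
    """f(P): the sum of f over the HAP P."""
    return sum(f[x - 1] for x in P)

N = 12
P = hap(2, 5)   # {2, 4, 6, 8, 10}
Q = hap(3, 4)   # {3, 6, 9, 12}
rng = random.Random(0)
f = [rng.choice([-1, 1]) for _ in range(N)]   # an arbitrary +-1 sequence

lhs = quadratic_form(tensor(P, Q, N), f)
rhs = hap_sum(f, P) * hap_sum(f, Q)
print(lhs == rhs)
```

The point of the check is only that $P\otimes Q$ has rank one, so any decomposition of a diagonal matrix into such pieces turns a sum of squares into a weighted sum of products of HAP sums.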
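Suppose that we could find for every pair $P,Q$ of HAPs contained in $\{1,2,\dots,N\}$ a coefficient $\lambda_{P,Q}$ in such a way that $\sum_{P,Q}\lambda_{P,Q}P\otimes Q=I_N$ and $\sum_{P,Q}|\lambda_{P,Q}|\leq cN$. Then for every real sequence $f=(f(1),\dots,f(N))$ we have
$$\sum_{x=1}^N f(x)^2=\langle f,I_Nf\rangle=\sum_{P,Q}\lambda_{P,Q}f(P)f(Q).$$
It follows by averaging that there exist $P$ and $Q$ such that
$$|f(P)||f(Q)|\geq c^{-1}N^{-1}\sum_{x=1}^N f(x)^2.$$
In particular, if each $f(x)=\pm 1$ then there must exist $P$ and $Q$ such that $|f(P)||f(Q)|\geq c^{-1}$, from which it follows that either $|f(P)|$ or $|f(Q)|$ is at least $c^{-1/2}$. So if we can get $c$ to tend to zero as $N$ tends to infinity then we are done.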
Unfortunately, it is impossible to get $c$ to tend to zero. The reason is that the above argument would imply that if $f(x)=\pm 1$ when $x\equiv\pm 1$ mod 3 (and $f(x)=0$ when $x$ is a multiple of 3), then for every $C$ there exists a HAP $P$ such that $|f(P)|\geq C$. But that is not true: $|f(P)|$ can never be greater than 1.
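The obstruction is easy to confirm by brute force. The following Python sketch, with the hypothetical helper `max_hap_discrepancy` and the arbitrary cutoff $N=300$, takes the sequence that is $1,-1,0$ on the residues $1,2,0$ mod 3 and computes the largest absolute partial sum along any HAP inside $\{1,\dots,N\}$, together with the sum of squares that the averaging argument would be comparing it with.

```python
def f(x):
    """1, -1, 0 according to whether x is 1, 2, 0 mod 3."""
    r = x % 3
    if r == 1:
        return 1
    if r == 2:
        return -1
    return 0

def max_hap_discrepancy(f, N):
    """Largest |f(d) + f(2d) + ... + f(kd)| over all HAPs {d, ..., kd} with kd <= N."""
    best = 0
    for d in range(1, N + 1):
        partial = 0
        for k in range(1, N // d + 1):
            partial += f(k * d)
            best = max(best, abs(partial))
    return best

N = 300
disc = max_hap_discrepancy(f, N)
energy = sum(f(x) ** 2 for x in range(1, N + 1))
print(disc, energy)
```

The sum of squares is about $2N/3$, so a decomposition of the identity with small $c$ would force a large HAP sum, yet the HAP sums here stay bounded.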
Because of this example, we have been trying a different approach, which is to look for a more general diagonal matrix and write that as a linear combination of matrices of the form $P\otimes Q$. If one generalizes the approach in this way, then it is no longer clear that it cannot work; indeed, it seems likely that it can. However, it also seems to be hard to find a suitable diagonal matrix, and hard to think how one might decompose it once it is found.
Working over the rationals instead.
The single main point of this post is to suggest a way of overcoming this last difficulty. And that is to resurrect an idea that was first raised right near the beginning of this project, which is to look at the problem for functions defined on the positive rationals rather than the positive integers. (It is a straightforward exercise to show that the two problems are equivalent. For details go to the Polymath5 wiki and look at the section on the first page with simple observations about EDP.)
The point is that the counterexamples that show that the approach cannot work for the integers all make crucial use of the fact that some numbers are more divisible by small factors than others. But over the positive rationals all numbers are equally divisible. Or to put it another way, multiplying by a positive rational is an automorphism of $\mathbb{Q}_+$. This suggests that perhaps over the rationals it would be possible to use the identity matrix.
Dealing with infinite sets.
Now a problem arises if we try to do this, which is that the rationals are infinite. So what are we supposed to say about the sum of coefficients when we decompose the identity into a linear combination $\sum_{P,Q}\lambda_{P,Q}P\otimes Q$?
Let me answer this question in two stages. First I’ll say what happens if we decompose the identity when the ground set is $\mathbb{N}$, which shows a way of dealing with infinite sets, and then I’ll move on to $\mathbb{Q}_+$, where some additional problems arise.
Suppose, then, that the infinite identity matrix (that is, the function $\delta_{xy}$ where $x$ and $y$ range over $\mathbb{N}$) has been expressed as a linear combination $\sum_{P,Q}\lambda_{P,Q}P\otimes Q$. Now let $f$ be a $\pm 1$ sequence. We’d like to show that it has unbounded discrepancy: that is, we’d like to show that for every $C$ there exists a HAP $P$ such that $|f(P)|\geq C$.
Our problem is that $\sum_{P,Q}|\lambda_{P,Q}|$ is going to be infinite. We’d somehow like to show that it has “density” at most $c$, where $c$ is an arbitrary positive constant, or perhaps show that it has density zero. One way we might do this is as follows. For each positive integer $N$ let $S_N$ be the sum of the $|\lambda_{P,Q}|$ over all $(P,Q)$ such that both $P$ and $Q$ have non-empty intersection with $\{1,2,\dots,N\}$. Then define the upper density of the coefficients to be $\limsup_{N\to\infty}N^{-1}S_N$. If this is zero we can say that the coefficients have density zero. And if it is at most $c$ then we can say that they have density at most $c$. (In fact, even the $\liminf$ would be OK: we just want the density to be small infinitely often.)
Let’s suppose that $S_N$ is at most $cN$. Then if we truncate the sequence $f$ at $N$, by changing all values after $f(N)$ to zero, we find that
$$N=\sum_{x=1}^N f(x)^2=\sum_{P,Q\in\mathcal{P}_N}\lambda_{P,Q}f(P)f(Q),$$
where I have written $\mathcal{P}_N$ for the set of all HAPs that have non-empty intersection with $\{1,2,\dots,N\}$. Since $\sum_{P,Q\in\mathcal{P}_N}|\lambda_{P,Q}|=S_N\leq cN$, the same averaging argument as before gives us a HAP (either $P$ or $Q$; WLOG $P$) such that $|f(P)|\geq c^{-1/2}$.
I fully admit that this is not very infinitary, but it is simple, and I’m not sure it matters too much that it is not infinitary. I’ll just briefly mention that one can use it to express the proof of Roth’s theorem (about AP discrepancy rather than HAP discrepancy). One expresses the infinite identity matrix as the following integral:
where One then expresses each function as a linear combination of HAPs of length (an arbitrary positive integer) and common difference at most One then obtains some cancellation in the coefficients, and proves that the density of the coefficients is at most (up to a constant factor). For details of how this calculation works, see this write-up, and in particular the third section.
The importance of the restriction on the length and common difference is that the edge effects (that is, APs that intersect without being contained in ) are negligible for large It is this feature that is slightly trickier to obtain for the rationals, to which I now turn.
Transferring to HAPs and rationals.
One useful feature of the set of APs of length and common difference at most is that each number greater than or equal to is contained in precisely such APs. A first question to ask ourselves is whether we can find a set of HAPs that covers the rationals in a similarly nice way. To start with, observe that if is a rational, then we can easily describe every HAP of length that contains Indeed, for every between and we have the HAP consisting of the first multiples of and that is all of them (since must be in the th place of the HAP for some and that and the length determine the HAP). So we have the extremely undeep result that every is contained in precisely HAPs of length (Note, however, how untrue this is if we work in the positive integers rather than the positive rationals.)
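The count of HAPs through a given rational can be checked directly with exact rational arithmetic. In the Python sketch below, the function names `haps_through` and `integer_haps_through` and the sample values are illustrative assumptions. A HAP of a fixed length through $x$ is determined by the position that $x$ occupies in it, and the integer version shows how the count collapses when divisibility gets in the way.

```python
from fractions import Fraction

def haps_through(x, length):
    """All HAPs {y, 2y, ..., length*y} of positive rationals that contain x.

    If x sits in position k then y = x / k, so there is one HAP per k.
    """
    x = Fraction(x)
    return [tuple(j * x / k for j in range(1, length + 1))
            for k in range(1, length + 1)]

def integer_haps_through(n, length):
    """Same question inside the positive integers: position k needs k to divide n."""
    return [tuple(j * (n // k) for j in range(1, length + 1))
            for k in range(1, length + 1) if n % k == 0]

x = Fraction(3, 7)
length = 5
rational_haps = haps_through(x, length)
print(len(set(rational_haps)))               # number of distinct HAPs through 3/7
print(len(integer_haps_through(7, length)))  # HAPs of length 5 in the integers through 7
print(len(integer_haps_through(12, length))) # ... and through 12
```

Over the rationals the count does not depend on $x$, whereas over the integers it is the number of divisors of $n$ that are at most the length.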
This looks promising, but we now need an analogue of the “increasing system of neighbourhoods” that was provided for us by the sets $\{1,2,\dots,N\}$. (It might have been more natural to work in $\mathbb{Z}$ and take the sets $\{-N,\dots,N\}$.) What is a sensible collection of finite sets with union equal to $\mathbb{Q}_+$?
One way of thinking about the sets is as follows. Using our system of APs, we can define a graph: we join to if there is an AP of length and common difference at most that contains both and The sets are quite close to increasing neighbourhoods in this graph: start with the number 1 and then take all points of distance at most from it. If we work with rather than then this graph is a Cayley graph, and after a while the neighbourhoods grow linearly with which is why the boundary effects are small.
What happens if we define a similar graph using HAPs in $\mathbb{Q}_+$? Now we are joining $x$ to $y$ if there exist positive integers $a$ and $b$, both at most $m$, such that $x/y=a/b$. That is precisely the condition that there exists a HAP of length $m$ that contains both $x$ and $y$. In other words, we take the multiplicative group of $\mathbb{Q}_+$ and inside it we take the Cayley graph with generators all numbers $a/b$ with $1\leq a,b\leq m$.
This feels like the right graph to take, but it has the slight drawback that it is not connected: it is impossible to get from $1$ to $x$ if $x$ is a rational such that in its lowest terms either its numerator or denominator is divisible by a prime greater than $m$. The connected component of 1 in the graph is the set of all rationals $a/b$ where both $a$ and $b$ are products of primes less than or equal to $m$. But this is not really a problem: we’ll just work in that component of the graph.
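To get a feel for this graph, one can compute its balls around 1 by repeatedly multiplying by the generators. The sketch below is only an illustration: the names `generators`, `ball` and `largest_prime_factor`, and the small parameter $m=3$, are assumptions made for the example. It lists the ball sizes for small radii and checks that nothing outside the component of $m$-smooth rationals is ever reached.

```python
from fractions import Fraction

def generators(m):
    """All ratios a/b with 1 <= a, b <= m (this set contains 1 and is closed under inverses)."""
    return {Fraction(a, b) for a in range(1, m + 1) for b in range(1, m + 1)}

def ball(m, radius):
    """Points at graph distance at most `radius` from 1 in the multiplicative Cayley graph."""
    gens = generators(m)
    points = {Fraction(1)}
    for _ in range(radius):
        # 1 is a generator, so each step keeps the old points and adds their neighbours.
        points = {x * g for x in points for g in gens}
    return points

def largest_prime_factor(n):
    """Largest prime dividing n (returns 1 when n == 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return max(largest, n) if n > 1 else largest

m = 3
sizes = [len(ball(m, r)) for r in range(5)]
B = ball(m, 4)
all_smooth = all(largest_prime_factor(x.numerator) <= m and
                 largest_prime_factor(x.denominator) <= m for x in B)
print(sizes, all_smooth)
```

Every point reached is of the form $2^i3^j$ with integer exponents, so for $m=3$ the component looks like a copy of $\mathbb{Z}^2$ and the balls grow polynomially rather than exponentially.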
Let’s write for the component of we are working in, and for the set of all points at distance at most from 1 in the graph. Now we can say how it would in principle be possible to prove EDP. We would like to find a way of writing the identity (this time thought of as the function where and range over ) in the form
where and are HAPs of length at most that are contained in For each let be the set of all such HAPs that have non-empty intersection with the neighbourhood Then we can define to be and we can define the upper density of the coefficients to be
Now let me show that if is at most for some sufficiently large then the HAP-discrepancy of every function on is at least This is by almost exactly the same argument that worked in The first step is to consider the restriction of to Then we know that
Now any HAP of length at most that intersects is contained in from which it follows that any HAP of length at most that intersects but is not contained in must intersect Since is a finitely generated Abelian group, the sets grow polynomially, which implies that the ratio tends to zero with
I’ll now be very slightly sketchy. We are supposing that is at most It follows that either or is noticeably smaller than In the second case we can change to and start again — we won’t be able to do this too many times so eventually we’ll reach the first case, where
In that case we have that
After that, the argument really is the same as before (give or take the small approximations). [Remark: I have not checked the details of the above sketch, but I’m confident that something along these lines can be done if this doesn’t quite work. It’s slightly more difficult than in because it isn’t obvious that the intersection of a HAP with is a HAP, whereas the intersection of an AP with is an AP.]
Characters on the rationals.
Now we must think about how to express the identity as a linear combination of HAP products with coefficients of density at most where is some arbitrary positive constant. Taking our cue from the case, it would be natural to express the identity on in terms of characters, and then to decompose the characters. So a preliminary task is to work out what the characters are.
This is (not surprisingly) a piece of known mathematics. I shall discuss it in a completely bare-hands way, but readers who don’t like that kind of discussion may prefer to look at the comment by BCnrd to this Mathoverflow question.
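First I’ll work out what the characters on $\mathbb{Q}$ are, and then I’ll look at the connected component described above. Recall that a character is a function $\chi$ from $\mathbb{Q}$ to the unit circle in $\mathbb{C}$ such that $\chi(x+y)=\chi(x)\chi(y)$ for every $x,y\in\mathbb{Q}$. That is, it is a homomorphism from $\mathbb{Q}$ to $\mathbb{T}$.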
Suppose we know the value of $\chi$ at a rational $x$. That tells us what $\chi(nx)$ is for every integer $n$. However, it does not tell us what $\chi(x/2)$ is, say. All we know is that $\chi(x/2)^2=\chi(x)$, which gives us two possibilities for $\chi(x/2)$. So in order to specify $\chi$ we need to specify its values at enough rationals that we can write every other rational as a multiple of one of them. And the choices we make at those rationals have to be compatible with each other.
A simple way of doing that is to choose the value of $\chi$ at $1/n!$ for every positive integer $n$, making sure that $\chi(1/n!)^n=\chi(1/(n-1)!)$. That is, we choose $\chi(1)$ to be any point in $\mathbb{T}$ and then for each $n\geq 2$ we have $n$ choices for $\chi(1/n!)$ given our choices up to that point.
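An equivalent way of thinking about this is that we choose a sequence $\theta_1,\theta_2,\dots$ of real numbers satisfying the conditions that $0\leq\theta_n<n!$ and $\theta_n-\theta_{n-1}$ is a multiple of $(n-1)!$. Then the corresponding character is defined at $x=p/q$ to be $e(\theta_nx)=e^{2\pi i\theta_nx}$ for any $n\geq q$. Note that this is well-defined, since if $m$ and $n$ are both at least $q$ then $\theta_m-\theta_n$ is a multiple of $q!$, so $e(\theta_mx)=e(\theta_nx)$.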
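The parametrization of a character by a compatible sequence of finite choices can be sketched in Python. The conventions used here are one concrete reading of the construction and should be treated as assumptions: $\theta_1\in[0,1)$, $\theta_n=\theta_{n-1}+j(n-1)!$ with $j\in\{0,\dots,n-1\}$, and $\chi(x)=e(\theta_nx)$ whenever $n!\,x$ is an integer. The names `sample_thetas` and `phase` are hypothetical, and rational values of $\theta_1$ are used only so that the arithmetic is exact.

```python
from fractions import Fraction
from math import factorial
import random

def sample_thetas(depth, rng):
    """theta_1 in [0, 1), then theta_n = theta_{n-1} + j * (n-1)! with j in {0, ..., n-1}."""
    thetas = [Fraction(rng.randrange(10**6), 10**6)]
    for n in range(2, depth + 1):
        j = rng.randrange(n)   # the n possible choices at stage n
        thetas.append(thetas[-1] + j * factorial(n - 1))
    return thetas

def phase(thetas, x, n=None):
    """Return t in [0, 1) with chi(x) = exp(2*pi*i*t), computed from theta_n.

    Requires n! * x to be an integer, so that x is a multiple of 1/n!.
    """
    x = Fraction(x)
    if n is None:
        n = len(thetas)
    if (factorial(n) * x).denominator != 1:
        raise ValueError("x is not a multiple of 1/n!")
    return (thetas[n - 1] * x) % 1

rng = random.Random(1)
thetas = sample_thetas(6, rng)
x, y = Fraction(2, 3), Fraction(3, 4)

# The value at x does not depend on which admissible n is used.
print(phase(thetas, x, 3) == phase(thetas, x, 6))
# Phases add modulo 1, i.e. chi(x + y) = chi(x) * chi(y).
print((phase(thetas, x) + phase(thetas, y)) % 1 == phase(thetas, x + y))
```

Truncating at a finite depth only describes the character on multiples of $1/\mathrm{depth}!$; the full character corresponds to continuing the sequence of choices indefinitely.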
Now because we chose an element of and then made a sequence of finite choices, it is easy to put a probability measure on the set of characters. We can therefore make sense of the expression
and prove that it is Let me briefly sketch this. Suppose that Then for every so we get 1. If then let when written in its lowest terms. If is the (random) sequence that determines then each is uniformly distributed in the interval and But is uniformly distributed in the interval so this expectation is zero (as ).
We have therefore shown that
where is the identity indexed by
What do we do if we want to modify this to work for ? Well, an initial complication that (I hope) turns out not to be a serious complication is that is not an additive group: it contains 1 and it does not contain any prime greater than However, it generates a subgroup of which consists of all rationals with denominators that are products of primes less than or equal to When I refer to “characters on ” I will really mean characters on
To describe these, we no longer need a sequence of reciprocals such that every rational is a multiple of one of them: we just want to capture all rationals in But that is straightforward: instead of taking the sequence we could take the sequence or we could replace by the product of all the primes up to There are any number of things that we could do.
Decomposing a character into “non-local” HAPs.
The thing that seems to me to make this approach very promising is that for any character on it is possible to partition into long HAPs on each of which is approximately constant. As this result suggests, it is possible to decompose in an efficient way as a linear combination of HAPs, which is very much the kind of thing we need to do in order to imitate the Roth proof.
I should warn in advance that it is not quite good enough for our purposes, because the HAPs we use are not “local” enough: they are sets of the form such that is small, but we do not also know that and are small. Without that, each number is contained in infinitely many HAPs, so we no longer have the condition that enabled us to define the “density” of a set of coefficients. Later I shall present a different idea that does use “local” HAPs, but fails for a different reason. My gut instinct is that these difficulties are not fundamental to the approach, but whether that is mathematical intuition or wishful thinking is hard to say.
Before I go any further, here is an easy lemma.
Lemma. Let be a character on and let Then there exists such that for every there exists a positive integer such that
Proof. Without loss of generality Let and let be the lowest common multiple of the numbers from 1 to [Thanks to David Speyer for pointing out at Mathoverflow that the l.c.m. is around a non-negligible improvement over .] Then by the usual pigeonhole argument we can find a positive integer such that so we can take
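Two ingredients of this proof are easy to experiment with: the pigeonhole step and the size of the l.c.m. compared with the factorial. The Python sketch below illustrates the pigeonhole step in its simplest setting only, namely for the character $r\mapsto e(\alpha r)$ on the integers, where it is Dirichlet's approximation theorem: among $r=1,\dots,M$ some $r\alpha$ lies within $1/M$ of an integer. That simplification, the names `best_multiplier` and `lcm_up_to`, and the sample values of $\alpha$ and $M$ are all assumptions made for illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, factorial

def dist_to_nearest_integer(t):
    """||t||: the distance from t to the nearest integer."""
    frac = t % 1
    return min(frac, 1 - frac)

def best_multiplier(alpha, M):
    """The r in 1..M that minimises ||r * alpha|| (ties go to the smallest r)."""
    return min(range(1, M + 1), key=lambda r: dist_to_nearest_integer(r * alpha))

def lcm_up_to(m):
    """Lowest common multiple of 1, 2, ..., m."""
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, m + 1), 1)

alpha = Fraction(314159, 100000)   # an arbitrary sample frequency
M = 50
r = best_multiplier(alpha, M)
print(r, dist_to_nearest_integer(r * alpha) < Fraction(1, M))
print(lcm_up_to(10), factorial(10))
```

When $\|r\alpha\|$ is small, $e(\alpha r)$ is close to 1, which is the form in which the pigeonhole principle is used above. The second line of output shows how much smaller the l.c.m. of $1,\dots,10$ is than $10!$.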
For the next result we define a HAP to be a set of the form
Corollary. Let be a character on let and let be a positive integer. Then we can partition into HAPs of lengths between and on each of which varies by at most
Proof. We begin by covering the integers. Find a positive integer such that Then on any HAP with common difference and length at most varies by at most We can partition the multiples of into HAPs of length and they will cover all the integers. (Indeed, they will cover all the multiples of )
We now want to fill in some gaps. Let us write and let us pick an integer a multiple of and greater than such that Between any two multiples of there are at least multiples of forming a HAP. This HAP can be partitioned into HAPs of common difference and lengths between and If we continue this process and make sure that every positive integer divides at least one (which is easy to do), then we are done.
Now a character restricted to a HAP is just a trigonometric function. If the character varies very slowly as you progress along the HAP, then convolving it with an interval (as in the Roth argument, but now we are talking about a chunk with the same common difference as the HAP) we obtain a multiple very close to 1 of the character itself. With the help of this observation, we can actually decompose the character efficiently as a linear combination of HAPs. Since this does not obviously help us, I will leave the details as an exercise.
What can we do with local HAPs?
Let us fix a positive integer and define a HAP to be local if it is of the form where (Thus, the definition of “local” depends on ) What happens if we have a character and try to decompose its restriction to as a linear combination of local HAPs (of reasonable length) on each of which varies very little?
The short answer is that it is easy to cover with HAPs of this kind, but it doesn’t seem to be easy to partition into them. In order to achieve the partition into non-local HAPs in we helped ourselves to smaller and smaller common differences, and correspondingly larger and larger values of and Another problem with that method was that what we are really searching for is a very nice and uniform way of decomposing characters, such as we had in the Roth proof. There the niceness was absolutely essential to getting enough cancellation for the proof to work, but it wasn’t essential to represent the identity — we could allow ourselves a bit extra as long as that extra was positive semidefinite.
So let’s not even try to partition Instead, we could simply take our character and use it as a guide to the coefficients that we will give to our HAPs.
The rough idea would be something like this. Given a character and a HAP we choose a coefficient in some nice way that makes it large when is small. It could for example be for some suitable And then we could use convolution to represent times the restriction of to as a linear combination of sub-HAPs of of length smaller than but not too much smaller, and common difference
We would know from the lemma above that every would be contained in at least some HAPs with large coefficients, so the restriction of every would in some sense be catered for. I think there would be some that were catered for too much (the precise relationship between those and remains to be worked out, but I think this will be straightforward), but I hope that the whole decomposition can be defined in a nice enough way for the function that results to be the pointwise product of the original character with a “nice” non-negative real function that’s bounded away from zero. More speculatively, one can hope that the coefficients have small density and that it is possible to subtract a not too small multiple of the identity from the matrix and still be left with something that’s non-negative definite.
An attempt to be more precise.
In this final section I want simply to guess at a matrix decomposition that might potentially prove EDP. As I write this sentence I do not know what the result will be, but the two most likely outcomes are that it will fail for some easily identifiable reason or that the calculations will be such that I cannot tell whether it fails or succeeds.
Actually, to make the guess just a little bit more systematic, let’s suppose that for each HAP in the class of HAPs we are considering and for each character we have a coefficient That is, corresponding to we are taking the function Given two HAPs and what will the coefficient of be in the decomposition ?
Expanding this out gives us
If we reverse the order of expectation and summation then we see that the coefficient of is
Now let us think what we are trying to achieve with the HAPs and Given a HAP of the form we want to use it to contribute to the representation of characters for which is small. If is such a character, then we know that the numbers vary only slowly. Therefore, if we take short subprogressions of the form then on each one will be roughly constant. If we fix a length (which may have to be logarithmic in or something like that), then we can represent the restriction of to as a linear combination of the HAPs of length give or take some problems at the two ends.
Roughly speaking, the coefficient of will be That is, if we write for the HAP then it is approximately true to say that the restriction of to is
Now we want to do this only if is close to 1. So the coefficient of in the decomposition of will in general be where is some nice function (which, if the Roth proof is anything to go by, means that it has non-negative and nicely summable Fourier coefficients) that is zero except at points that are close to 1.
What would we expect to be like? Is there any chance that it is close to a multiple of ?
Since is roughly constant on whenever is non-zero, the value of this function at is roughly the same as it is if we take just singletons — that is, if we set and define to be So the function we get should have a value at that is close to
In other words, we get something that has the same argument as but that has big modulus at a rational if it so happens that is close to 1 for unexpectedly many positive integers Now this can happen, but I would think that for nearly all and if is not too small, then would be about its expected value of So I am hoping that we will have some kind of rough proportionality statement.
Going back to the coefficient of in the decomposition, let us take the two HAPs and We have decided to define to be so is equal to
It is a significant problem that we are forced to consider the case where in the Roth proof, we did not have to look at products of APs with different common differences. But if we are to have any chance of decomposing into “local” HAPs, then necessarily we must use completely different common differences to deal with rational numbers that are a long way apart in the graph. This is such a significant difference from the Roth proof that it may be a reason for this approach not working at all. However, it does look as though there is plenty of cancellation.
I think the expression we would be trying to bound is
which is perhaps better thought of as
And our main strength would be that we would be free to choose as it suited us. There would be two challenges: to obtain a good bound on the above expression, and to prove that we could subtract a reasonable-sized multiple of the identity and still be left with a positive semi-definite matrix.
I think the above counts as a calculation for which I cannot tell whether it fails or succeeds. But I hope it may be enough to provoke a few thoughts.