In this post I want to elaborate on a strategy for proving EDP that I discussed in EDP15. Briefly, the idea is to take a representation-of-identity proof of Roth’s AP-discrepancy result and modify it so that it becomes a representation-of-diagonal proof of unbounded HAP-discrepancy.
The first step of this programme was an obvious one: obtain a clean and fully detailed proof in the APs case. That has now been completed, and a write-up can be found here. For the benefit of anyone who is interested in thinking about the next stage but doesn’t feel like reading a piece of formal mathematics, let me give a sketch of the argument here. That way, this post will be self-contained. Once I’ve given the sketch, I’ll say what I can about where we might go from here. It is just possible that we are in sight of the finishing line, but that is unlikely as it would depend on various guesses being correct, various potential technical problems not being actual, various calculations giving strong enough bounds, and so on. Thus, the final part of this post will be somewhat speculative, but I will make it as precise as I can in the hope that it will give rise to a fruitful line of enquiry.
Roth’s theorem on AP-discrepancy.
Recall that Roth proved that for every function $f:\{1,2,\dots,n\}\to\{-1,1\}$ there is an arithmetic progression $P\subseteq\{1,2,\dots,n\}$ such that $|\sum_{x\in P}f(x)|\geq cn^{1/4}$, and that this result was shown to be sharp by Matoušek and Spencer (who removed a log factor from an upper bound of Beck). Roth actually proved a more precise result: the discrepancy is attained on a progression of length at most $\sqrt{n}$. In this post I shall sketch an argument that shows that for any such $f$ we can find an AP of length at most $\sqrt{n}$ and common difference at most $\sqrt{n}$ such that $|\sum_{x\in P}f(x)|\geq cn^{1/4}$. (I haven't actually checked, but I'm 100% confident that this slightly stronger result would also follow from Roth's proof, and also from a proof due to Lovász that uses the semidefinite programming method that we've discussed at length in the past.)
Recall first some notation. If $P$ and $Q$ are subsets of $\{1,2,\dots,n\}$ then I shall also write $P$ and $Q$ for their characteristic functions. In general, given two functions $f$ and $g$ defined on $\{1,2,\dots,n\}$, $f\otimes g$ will stand for the rank-1 matrix with entries $(f\otimes g)(x,y)=f(x)\overline{g(y)}$.
Suppose now that we can find a diagonal matrix $D$, arithmetic progressions $P_1,\dots,P_m$ and $Q_1,\dots,Q_m$, and coefficients $c_1,\dots,c_m$ such that $D=\sum_{i=1}^m c_i\,P_i\otimes Q_i$. Let the diagonal entries of $D$ be real numbers $d_1,\dots,d_n$. Then for any function $f$ defined on $\{1,2,\dots,n\}$ we have $\sum_{x=1}^n d_xf(x)^2=\langle f,Df\rangle=\sum_{i=1}^m c_i\langle f,P_i\rangle\langle f,Q_i\rangle.$
Let $R$ be the ratio $\operatorname{tr}(D)/\sum_{i=1}^m|c_i|$. It follows from the above identity that if $f$ takes values in $\{-1,1\}$ then there exists $i$ such that $|\langle f,P_i\rangle\langle f,Q_i\rangle|\geq R$,
and hence that there exists an arithmetic progression $P$ such that $|\sum_{x\in P}f(x)|\geq\sqrt{R}$.
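To see the mechanism in action, here is a small numerical illustration (my own toy decomposition, not the efficient one constructed below): expanding each point mass as a difference of two initial segments writes the identity matrix as a combination of products of intervals, which are APs of common difference 1. The ratio $R$ is then only 1/4, so the guaranteed discrepancy is a feeble 1/2; the whole game is to find a decomposition for which $R$ is around $\sqrt{n}$.

```python
# Toy representation-of-diagonal bound (illustrative, not the post's):
# e_k = 1_{[1,k]} - 1_{[1,k-1]} expands I = sum_k e_k (x) e_k into 4n
# products of initial intervals with coefficients +-1, so
# R = tr(I) / sum_i |c_i| = n / 4n = 1/4, and every +-1 function f must
# satisfy |sum_{x in P} f(x)| >= sqrt(R) = 1/2 for some interval P.
import itertools
import numpy as np

n = 50
f = np.random.default_rng(1).choice([-1, 1], size=n)

prefix = np.concatenate([[0], np.cumsum(f)])    # <f, 1_{[1,k]}> for k = 0..n
R = n / (4 * n)                                 # trace over sum of |c_i|

# Largest |sum of f| over all intervals (a, b]; guaranteed >= sqrt(R).
best = max(abs(prefix[b] - prefix[a])
           for a, b in itertools.combinations(range(n + 1), 2))
print(best, best >= np.sqrt(R))
```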
The identity above is the representation-of-diagonal method in its purest form. However, as Moses pointed out (I haven't actually found the precise comment), it is enough to obtain an identity of the form $D+\Phi=\sum_i c_i\,P_i\otimes Q_i$, where $\Phi$ is positive semidefinite and everything else is as before. That is because $\langle f,\Phi f\rangle\geq 0$, so including $\Phi$ does not make things worse for us. And in fact, it turns out to introduce a useful extra flexibility.
We start with the following simple lemma. Up to now, the inner product has been the obvious one $\langle f,g\rangle=\sum_x f(x)\overline{g(x)}$. But from now on it is more convenient to take the inner product $\langle f,g\rangle=\mathbb{E}_x f(x)\overline{g(x)}$ (where $\mathbb{E}_x$ is, as usual, shorthand for $n^{-1}\sum_{x=1}^n$).
Lemma 1. Let $u_1,\dots,u_n$ be an orthonormal basis of $\mathbb{C}^n$. Then $\mathbb{E}_i\,u_i\otimes u_i=I$, the identity matrix.
Proof. Let $U$ be the unitary matrix with rows $u_1,\dots,u_n$ (unitary with respect to the normalized inner product just introduced). The entry at $(x,y)$ of $\mathbb{E}_i\,u_i\otimes u_i$ is $\mathbb{E}_i\,u_i(x)\overline{u_i(y)}$, which is the inner product of the $y$th and $x$th columns of $U$. Since the columns of a unitary matrix also form an orthonormal basis, this gives us $\mathbb{E}_i\,u_i\otimes u_i=I$, as required.
Corollary 2. For each $r$ let $\omega_r$ be the function defined by $\omega_r(x)=e^{2\pi irx/n}$. Then $I=\mathbb{E}_r\,\omega_r\otimes\omega_r$.
Proof. The functions $\omega_1,\dots,\omega_n$ form an orthonormal basis.
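Both statements are easy to check numerically; here is a quick sanity check of Corollary 2 (the conventions, namely the normalized inner product and the conjugate in the second slot of $\otimes$, are the ones set out above):

```python
# Check that the characters omega_r(x) = exp(2*pi*i*r*x/n) are orthonormal
# for the inner product <u, v> = E_x u(x) * conj(v(x)), and that averaging
# the rank-1 matrices omega_r (x) omega_r over r gives the identity matrix.
import numpy as np

n = 16
omegas = np.exp(2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)

gram = omegas @ omegas.conj().T / n             # entries <omega_r, omega_s>
avg = omegas.T @ omegas.conj() / n              # E_r omega_r (x) omega_r
print(np.allclose(gram, np.eye(n)), np.allclose(avg, np.eye(n)))
```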
The plan now is to decompose each trigonometric function $\omega_r$ into a linear combination of characteristic functions of suitably chosen APs. This automatically gives us a decomposition of each matrix $\omega_r\otimes\omega_r$ into a linear combination of products $P\otimes Q$. We then find that each individual product $P\otimes Q$ is used for several different $r$s, and when we add up the coefficients there is a considerable amount of cancellation. It is that cancellation that gives us a good upper bound on $\sum_i|c_i|$ and leads to the results claimed above.
At this point, let us fix $r$. We would like to decompose $\omega_r$ into a linear combination of arithmetic progressions mod $n$ with common difference at most $\sqrt{n}$. It will turn out that their cardinalities will be at most $\sqrt{n}$ as well. Provided that the product of the common difference and the cardinality is at most $n$, this will mean that each AP wraps around (that is, passes zero) at most once, so we can split it into two genuine APs, as in the sketch below.
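Here is the splitting step in code (a minimal sketch; the parameters are illustrative):

```python
# An AP mod n with first term a, common difference d and length m such that
# d*m <= n has residues that decrease at most once; splitting at that point
# yields at most two genuine (non-wrapping) APs.
def split_mod_ap(a, d, m, n):
    assert d * m <= n, "with d*m <= n the AP wraps at most once"
    terms = [(a + i * d) % n for i in range(m)]
    for i in range(1, m):
        if terms[i] < terms[i - 1]:             # the single wrap past zero
            return [(terms[0], d, i), (terms[i], d, m - i)]
    return [(terms[0], d, m)]

print(split_mod_ap(a=7, d=3, m=5, n=17))        # [(7, 3, 4), (2, 3, 1)]
```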
Given $r$, how do we choose a suitable common difference? Well, in an earlier and less successful version of this argument we chose for each $r$ a $d$ such that $d\leq\sqrt{n}$ and $\|rd/n\|\leq n^{-1/2}$. Here $\|\alpha\|$ stands for the distance from $\alpha$ to the nearest integer. This estimate implies that $|\omega_r(x+d)-\omega_r(x)|\leq 2\pi n^{-1/2}$, and that (together with the triangle inequality) implies that $|\omega_r(x+kd)-\omega_r(x)|\leq 2\pi kn^{-1/2}$, and more generally that $|\omega_r(x+jd)-\omega_r(x+kd)|\leq 2\pi|j-k|n^{-1/2}$. In more qualitative terms, adding a small multiple of $d$ to $x$ does not change the value of $\omega_r(x)$ very much.
Let $J$ be the interval $\{-m,\dots,m\}$ inside $\mathbb{Z}_n$, where $m=c\sqrt{n}$ for some smallish absolute constant $c>0$. Let $P=dJ=\{dj:j\in J\}$ and let $\mu$ be the characteristic measure of $P$ — that is, the function that takes the value $|P|^{-1}$ on $P$ and 0 outside $P$. From the observation just made it is straightforward to prove the following lemma. The fact that $\lambda$ is real follows from the symmetry of $P$ about 0.
Lemma 3. Let $r\in\mathbb{Z}_n$. Suppose that $d\leq\sqrt{n}$ and $\|rd/n\|\leq n^{-1/2}$. Then there is a real constant $\lambda$ between 1/2 and 1 such that $\mu*\omega_r=\lambda\omega_r$.
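Lemma 3 is easy to test numerically. In the sketch below the constants are illustrative choices, not the ones that the final argument would use:

```python
# Convolving omega_r with the characteristic measure mu of the symmetric
# progression {-md, ..., 0, ..., md} should just rescale it:
# mu * omega_r = lambda * omega_r with lambda real and between 1/2 and 1.
import numpy as np

n, r = 10007, 4000
d = next(d for d in range(1, int(n ** 0.5) + 1)
         if abs(r * d / n - round(r * d / n)) <= n ** -0.5)
m = int(0.05 * n ** 0.5)                        # interval radius c*sqrt(n)

x = np.arange(n)
omega = np.exp(2j * np.pi * r * x / n)
mu = np.zeros(n)
mu[(d * np.arange(-m, m + 1)) % n] = 1 / (2 * m + 1)

conv = np.fft.ifft(np.fft.fft(mu) * np.fft.fft(omega))   # cyclic convolution
lam = (conv / omega).real.mean()
print(lam, np.allclose(conv, lam * omega))      # lambda in [1/2, 1], eigenvector
```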
The next lemma is also easy to prove. Note that we have dropped the hypothesis that $\|rd/n\|$ is small and weakened the conclusion accordingly.
Lemma 4. Let $r\in\mathbb{Z}_n$ and suppose that $d\leq\sqrt{n}$. Then there is a real constant $\lambda\in[-1,1]$ such that $\mu*\omega_r=\lambda\omega_r$.
It will be problematic for us if $\lambda$ is negative, but a simple trick makes sure that that never happens: we simply convolve with $\mu$ twice. By the two lemmas above, we may conclude that $\mu*\mu*\omega_r=\lambda^2\omega_r$, where $\lambda^2$ is always non-negative, and is at least 1/4 when $d\leq\sqrt{n}$ and $\|rd/n\|\leq n^{-1/2}$. (That is because $(1/2)^2=1/4$.)
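The squaring trick is equally easy to check, continuing with the setup of the previous snippet:

```python
# Same setup as the previous snippet: mu is the characteristic measure of
# {-md, ..., md} and omega is the character omega_r. Convolving twice
# squares the eigenvalue, so the constant is lambda**2, automatically
# non-negative (and at least 1/4 whenever Lemma 3 applies: (1/2)**2 = 1/4).
import numpy as np

n, r, d, m = 10007, 4000, 5, 5        # the values found in the snippet above
x = np.arange(n)
omega = np.exp(2j * np.pi * r * x / n)
mu = np.zeros(n)
mu[(d * np.arange(-m, m + 1)) % n] = 1 / (2 * m + 1)

conv2 = np.fft.ifft(np.fft.fft(mu) ** 2 * np.fft.fft(omega))
lam2 = (conv2 / omega).real.mean()
print(lam2, lam2 >= 0)                # equals lambda**2 for Lemma 3's lambda
```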
A discrepancy bound of $cn^{1/4}(\log n)^{-1/2}$ can be obtained if for each $r$ we choose $d=d(r)\leq\sqrt{n}$ such that $\|rd(r)/n\|\leq n^{-1/2}$ (which is possible by the usual pigeonhole principle argument — see the sketch below), and replace $I=\mathbb{E}_r\,\omega_r\otimes\omega_r$ by $\mathbb{E}_r\,\lambda_r^{-2}(\mu_{d(r)}*\omega_r)\otimes(\mu_{d(r)}*\omega_r)$, writing $\mu_d$ for the characteristic measure above with common difference $d$ and $\lambda_r$ for the constant given by Lemma 3. This can be seen to be a linear combination of matrices of the form $P\otimes Q$.
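The pigeonhole step is concrete enough to spell out (an illustrative brute-force search; the pigeonhole argument is what guarantees that it succeeds):

```python
# Among d = 1, ..., floor(sqrt(n)) there is always some d with
# ||r*d/n|| <= 1/sqrt(n): the multiples 0, r, 2r, ..., floor(sqrt(n))*r give
# floor(sqrt(n)) + 1 points of Z_n, so two of them lie within sqrt(n) of
# each other, and their difference is a suitable d.
import math

def good_difference(r, n):
    for d in range(1, math.isqrt(n) + 1):
        frac = (r * d / n) % 1.0
        if min(frac, 1.0 - frac) <= n ** -0.5:
            return d
    raise AssertionError("impossible, by pigeonhole")

print([good_difference(r, 101) for r in (3, 17, 45, 60, 77)])
```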
The main ingredient that improves the bound from $cn^{1/4}(\log n)^{-1/2}$ to $cn^{1/4}$ is instead to choose rather carefully some coefficients $\kappa_d(r)\geq 0$ and to replace each $\omega_r\otimes\omega_r$ by $\sum_{d\leq\sqrt{n}}\kappa_d(r)\,(\mu_d*\omega_r)\otimes(\mu_d*\omega_r)$. For each $r$ the earlier proof took $\kappa_d(r)$ to be 1 for precisely one $d$ and 0 otherwise. What we do instead is approximate this in such a way that for each $d$ the function $r\mapsto\kappa_d(r)$ is "sufficiently smooth".
The next lemma, which I shall state without proof (the proof is straightforward), gives a good idea of what we mean by "sufficiently smooth". Given a function $g:\mathbb{Z}_n\to\mathbb{C}$ we define its Fourier transform $\hat{g}$ by the formula $\hat{g}(s)=\mathbb{E}_x g(x)e^{-2\pi isx/n}$.
Lemma 5. Let $h$ be a positive integer and define a function $g$ by $g(x)=1-|x|/h$ whenever $|x|\leq h$, and $g(x)=0$ otherwise. Then $\hat{g}(s)$ is non-negative and real for every $s$, and $\sum_s\hat{g}(s)=g(0)=1$.
From this and easy Fourier manipulations, we know that the function $r\mapsto g(rd)$ also has these two properties: that is, its Fourier transform is non-negative and real everywhere, and its Fourier coefficients sum to 1.
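Assuming, as a guess at the lost formula, that $g$ is the triangle function above, the two properties of Lemma 5 are easy to confirm:

```python
# The triangle g(x) = max(0, 1 - |x|/h) on Z_n (a Fejer-type kernel) has a
# real, non-negative Fourier transform whose coefficients sum to g(0) = 1.
import numpy as np

n, h = 1000, 31
x = np.minimum(np.arange(n), n - np.arange(n))  # |x| for x in Z_n
g = np.maximum(0.0, 1.0 - x / h)
g_hat = np.fft.fft(g) / n                       # E_x g(x) exp(-2pi i s x / n)
print(np.allclose(g_hat.imag, 0),               # real
      g_hat.real.min() >= -1e-12,               # non-negative
      np.isclose(g_hat.real.sum(), 1.0))        # coefficients sum to g(0) = 1
```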
We now take $h$ to be an integer approximately equal to $\sqrt{n}$ and we define $\kappa_d(r)$ to be $g(rd)$ (the dependence on $d$ is via the point at which we evaluate the function $g$).
We note here that for a typical $r$ there will be values of $d\leq\sqrt{n}$ such that $\kappa_d(r)\neq 0$, and that the usual pigeonhole argument gives us at least one. (There are some $r$ for which there are more values of $d$ with $\|rd/n\|$ small, but this does not matter.) On the other hand, since $\kappa_d(r)$ is nothing other than $g(rd)$, we know that for each fixed $d$ the function $r\mapsto\kappa_d(r)$ has non-negative Fourier coefficients that sum to 1. These properties are what are needed for the rest of the calculations to work.
We then take the matrix $\mathbb{E}_r\sum_{d\leq\sqrt{n}}\kappa_d(r)\,(\mu_d*\omega_r)\otimes(\mu_d*\omega_r).$
Since for each $r$ there is at least one $d$ with $\kappa_d(r)\geq 1/2$, the remark following Lemmas 3 and 4 tells us that this matrix has the form $\mathbb{E}_r\,c_r\,\omega_r\otimes\omega_r$ with each $c_r$ at least 1/8. It follows that the matrix minus $\frac{1}{8}I$ is positive semidefinite, since it equals $\mathbb{E}_r\,(c_r-\frac{1}{8})\,\omega_r\otimes\omega_r$.
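The positive semidefiniteness step can be checked in isolation (a sketch with arbitrary coefficients, since only the lower bound $c_r\geq 1/8$ matters):

```python
# Any matrix E_r c_r omega_r (x) omega_r with every c_r >= 1/8 dominates
# (1/8)I: subtracting (1/8) E_r omega_r (x) omega_r = (1/8)I leaves
# non-negative coefficients, hence a positive semidefinite matrix.
import numpy as np

n = 64
c = 1 / 8 + np.random.default_rng(0).random(n)  # arbitrary c_r >= 1/8
omegas = np.exp(2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)
M = omegas.T @ (c[:, None] * omegas.conj()) / n # E_r c_r omega_r (x) omega_r
print(np.linalg.eigvalsh(M - np.eye(n) / 8).min() >= -1e-10)   # True: PSD
```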
It remains to bound from above the sum of the absolute values of the coefficients of the various products of progressions used in the above decomposition. I won't reproduce the calculations here, but the Fourier estimates for the functions $r\mapsto\kappa_d(r)$ turn out to come in and allow us to calculate the sum exactly: it is proportional to $n^{1/2}$, whereas the trace of the diagonal part is proportional to $n$, a gain of a factor proportional to $n^{1/2}$. It follows that the discrepancy bound we get is proportional to $(n/n^{1/2})^{1/2}$. When the constants are chosen suitably this gives us the bound $cn^{1/4}$.
From APs to HAPs.
Is there any possibility of adapting the above proof to give a proof of EDP? I think there is, and in this final section I shall try to explain why.
To begin with, let us set $D$ to be the diagonal matrix with $f(1),f(2),\dots,f(n)$ down the diagonal. (By $f$ I mean the function discussed in the previous post and set of comments.) Now let us suppose that we could find an efficient HAP-decomposition of a matrix of the form
$\mathbb{E}_r\,b_r\,(\sqrt{f}\,\omega_r)\otimes(\sqrt{f}\,\omega_r)$ with every $b_r\geq 1$. The inner matrix $\mathbb{E}_r\,b_r\,\omega_r\otimes\omega_r$ is of the form $I+\Phi$ with $\Phi$ positive semidefinite, so this would give us an efficient decomposition of $D+\Psi$, where $\Psi$ is also positive semidefinite. Now the matrix we would like to decompose can also be written as $D=\mathbb{E}_r\,(\sqrt{f}\,\omega_r)\otimes(\sqrt{f}\,\omega_r).$
It is therefore tempting to look for a proof that is similar to the proof of Roth's discrepancy theorem above: we try to decompose each function $\sqrt{f}\,\omega_r$ (the value of which at $x$ is $\sqrt{f(x)}\,e^{2\pi irx/n}$) in an efficient and unified way as a linear combination of characteristic functions of homogeneous arithmetic progressions rather than just arithmetic progressions. And then we hope that there will be plenty of cancellation in the coefficients.
Now the decomposition of $\omega_r$ into APs very definitely used non-homogeneous ones. So why is there any hope of decomposing into HAPs? Well, the kind of thing one might try is this. Choose a small $c>0$ and for each $r$ choose some $d\leq\sqrt{n}$ such that $\|rd/n\|\leq n^{-1/2}$. Let $J$ be (as before) the interval $\{-m,\dots,m\}$ with $m=c\sqrt{n}$, and let $\mu$ be (as before) the characteristic measure of $dJ$. Then $\mu*\omega_r=\lambda\omega_r$, which tells us that $\omega_r$ can be written as a (very nice) linear combination of translates of $dJ$. The trouble is that most of these translates are non-homogeneous.
However, if we choose $c$ to grow to infinity so slowly that it can be thought of as a constant (this is a standard trick — treat $c$ as a large constant, and as long as, for each fixed $c$ and all sufficiently large $n$, we can get a discrepancy lower bound that tends to infinity with $n$, we are done), then the fraction of translates of $dJ$ that are non-homogeneous is “only” $1-1/d$. Furthermore, and much more importantly, $\omega_r$ is on average very small on the complement of the HAP $\{d,2d,3d,\dots\}$. So perhaps we can afford to ignore the decomposition outside this HAP.
How would we decompose the part inside the HAP? Well, let's suppose we were trying to decompose $f\omega_r$ rather than $\sqrt{f}\,\omega_r$. (We could do this, and the result would be a decomposition of a matrix $D'+\Psi'$, where $D'$ is the diagonal matrix with entries $f(x)^2$ and $\Psi'$ is positive semidefinite. I don't know whether changing the coefficients from $f(x)$ to $f(x)^2$ would be a problem.) Since $f$ is naturally expressed as a Dirichlet convolution (except at 1) we have a natural decomposition of $f$ into HAPs. I think we should also have a natural decomposition into HAPs of the restriction of $\omega_r$ to the multiples of $d$. So if we use that, and then take the coefficients from the Roth argument (or perhaps something else tailored to the new situation), then … just perhaps … we will have an efficient decomposition that gives a good approximation to $f\omega_r$. And if we want more than just a good approximation, then we could decompose the part of $f\omega_r$ that lies outside the HAP into functions supported on singletons. This would be inefficient, but (after cancellation of the contributions from different $r$) the total weight of the inefficiency should be small.
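To make the "natural decomposition of $f$ into HAPs" concrete, here is a sketch under the assumption that $f$ is a Dirichlet convolution $f=g*1$, i.e. $f(x)=\sum_{d\mid x}g(d)$ (the specific choice $f(x)=x$ below is an illustrative stand-in, not the function from the post):

```python
# If f(x) = sum_{d | x} g(d), then f is literally a linear combination of
# characteristic functions of the infinite HAPs {d, 2d, 3d, ...}, with
# coefficients g(d); Moebius inversion recovers those coefficients from f.
def hap_coefficients(f, n):
    g = [0.0] * (n + 1)
    for d in range(1, n + 1):
        g[d] = f(d) - sum(g[e] for e in range(1, d) if d % e == 0)
    return g

n = 30
g = hap_coefficients(lambda x: float(x), n)     # f(x) = x gives g = phi
rebuilt = [sum(g[d] for d in range(1, x + 1) if x % d == 0)
           for x in range(1, n + 1)]
print(rebuilt == [float(x) for x in range(1, n + 1)])  # True
print(g[1:11])  # Euler's totient: [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]
```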