In the comments on EDP18 we are considering a certain decomposition problem that can be understood in its own right. At various points I have asserted that if we can find a decomposition of a particular kind then we will have a positive solution to EDP. And at various points in the past I have even sketched proofs of this. But I think it would be a good idea to do more than merely sketch a proof. So in this post I shall (I hope) give a completely rigorous derivation of EDP from the existence of an appropriate decomposition. (Well, I may be slightly sketchy in places, but only about details where it is obvious that they can be made precise.) I shall also review some material from earlier posts and comments, rather than giving links.
Representing diagonal matrices
First, let me briefly look again at how the ROD (representation of diagonal) approach works. If and are HAPs, I shall write for the matrix such that if and 0 otherwise. The main thing we need to know about is that for every
Suppose now that is a diagonal matrix with diagonal entries and that we can write it as where each and each is a HAP. Then
If and for every then it follows that there exists such that
and from that it follows that there is a HAP such that So if we can make arbitrarily small, then EDP is proved.
The advantage of this approach (and similar approaches related to semidefinite programming) is that it replaces a universal question — every sequence has unbounded discrepancy — with an existential one — there exists a diagonal matrix that can be decomposed in a certain way. However, it doesn’t solve the problem just like that, because it is far from clear how to find such a decomposition of a diagonal matrix.
Representing the identity over the rationals
A more recent idea that appears to simplify the problem considerably is to work over the rationals, which allows one to take the diagonal matrix to be the identity. Let me give a heuristic argument for this, which I hope will make it clear what I mean, and then follow it up with a more rigorous version of the argument.
By “work over the rationals” I mean that we shall consider infinite matrices where the indices are positive rational numbers. The identity matrix is simply the matrix that is 1 if and otherwise. Matrix multiplication is defined in the usual way: We shall consider matrices with only finitely many non-zero entries in each row and column, so we do not have to worry about the sums being infinite.
Now let’s suppose, back in the normal integers-up-to- case that we have a decomposition
Given any matrix and any rational number define the -dilation of to be the matrix defined by That is, does to what does to Now let us take the matrix and call it the smearing of (The rough idea is that we’ve spread all over the place.)
If is a diagonal matrix, what is the smearing of ? Well if then for all positive rationals so And
That is, is the identity matrix multiplied by the trace of
Now each dilation of a matrix gives us another matrix of the same form — that is, another product of HAPs. (Moreover, if and have the same common difference, then that is true for the dilations as well.) So from a decomposition of a diagonal matrix into HAP products we can get a decomposition of the identity over into HAP products.
Where this gets a bit handwavy is the following statement: because the ratio of the sum of the absolute values of the coefficients to the trace of the diagonal matrix is small, the “average coefficient” in the infinite case is small too. Therefore, we can prove the same bound for functions defined on that we could prove for functions defined on It is this last step that I would like to make rigorous.
A finite approximation of the rationals
I shall use a technique that is standard in additive combinatorics (I’m referring here to regular Bohr sets), though the basic idea presumably goes back well beyond that. The particular way I’ll use it is essentially the same as what Terry Tao did in his Fourier reduction argument. I want to take a big finite subset of that is almost closed under multiplication by any number I could conceivably want to multiply by. By that I mean that if has size and is some rational that I might want to multiply by, then is almost the same size as To visualize this, imagine taking a huge sphere and intersecting it with Call the resulting set If we take a random point in and add a non-huge integer point to it, then unless we are extremely unlucky and is right near the edge of we will find that also belongs to So is almost closed under adding small integer vectors. I want to do the same but for multiplication, which is pretty similar to vector addition by prime factorization.
Suppose, then, that we have an matrix Then the non-zero entries of occur at pairs of the form where and are integers between and Therefore, the non-zero entries of the dilation are at pairs of the form where and are integers between and And from this it follows that the non-zero entries of the smearing of occur only at pairs of the form such that is a rational with numerator and denominator between and Let be the set of all such rationals.
Let us therefore construct a set of rationals such that for almost every (Here, stands for the obvious thing: the set ) We can do this in any number of ways. Perhaps the simplest is to let be the set of all primes up to to let be some large positive integer, and to take to be the set of all numbers of the form such that each lies between and Note that the product of any element of by an element of is an element of Note also that has cardinality So if we want of the points to have the property that then all we have to do is choose so large that which we can certainly do.
From now on, all we shall use about is this property: the details of its construction are not important. We shall also consider functions defined on which we are thinking of as a finite approximation to One remark about is that we can dilate it so that it becomes a set of integers, so the fact that I have been talking about rationals is just a convenient infinitary way of thinking about the problem rather than a fundamental difference of approach. (In particular, I am not using any clever notions of limit.)
Running the argument inside X
Now let’s suppose that we have an diagonal matrix that is decomposed as where each and is a HAP. (Incidentally, by “HAP” I mean here the slightly generalized HAPs of the form .) This time, instead of looking at the sum of all dilations of let us look instead at the sum of all dilations such that This gives us a diagonal matrix What can we say about it?
Well, For each the entry will be included in this sum if In particular, it will be included if Therefore, equals the trace of for at least elements For all it is at most the sum of the absolute values of the entries of and for it is zero.
This tells us, in a way that is easy to make precise, that the entries of are given by a function that is approximately the trace of times the characteristic function of (Because the functions are bounded and the support of the difference is much smaller than this approximation can be arranged to be good in any norm with )
But since any dilation of a HAP product is another such product, this means that we have a decomposition into HAP products of a diagonal matrix that is very close to the identity, where the indexing set is now rather than (Actually, the way I said it above, the matrix is supported not on but on But that can easily be fixed by summing over dilations corresponding not to all but to all such that So I won’t worry too much about this technical detail.)
The key point here is that the sum of the absolute values of the coefficients in the decomposition is roughly and the trace of the diagonal matrix we get, which is very close to the identity, is roughly times the trace of So we obtain roughly the same as before but now with the “identity matrix on “.
That is a precise and rigorously proved version of the statement that if we can find a diagonal decomposition with a small then we can find a representation of the identity on with the same small
The next thing to think about is what happens if we take a HAP product and sum over all its dilates. We are making the simplifying assumption that and have the same common difference. (It may turn out that we cannot achieve this. Then we simply switch to searching for more general decompositions where the common differences are allowed to be different.) Let us therefore assume that and What is the sum of all dilates of at a point such that and ?
Well, we need there to be some rational and integers and such that and Given such we know that Conversely, if can be written as with and then we can set We know then that (since ), and we have and Thus, the value of the smearing of at is the number of ways of writing with and That is, it is the number of integer points in the intersection of the rectangle with the line of gradient
Two thing follow from this. First, if we are going to look at smearings, then we need only look at products where the common differences of and are 1. That is, WLOG and The second thing is that if we have a linear combination of functions where each and each is a subinterval of then its smearing will be diagonal if and only if for every pair of distinct coprime integers we have where denotes the number of multiples of that lie in the rectangle
If we put all these observations together, together with the observation made in this comment, we obtain the following conclusion: the representation-of-diagonals approach to EDP works if and only if for every there exists a system of rectangles (each being an interval of positive integers between 1 and ) and coefficients such that
- for every pair of coprime positive integers between 1 and we have
I have explained the “only if” part of this. To see that a decomposition of this kind works, suppose that we have a function defined on For convenience we shall assume that takes values in though more general assumptions are possible. For convenience let us normalize so that We then take the smearing of the matrix This equals 1 on the diagonal (except at the boundary of — I’m referring to the smearing where we add all dilates for ) and 0 off it (again, perhaps with a few exceptions at the boundary). In other words, it gives us a very good approximation of the identity on
Let us write this decomposition as Here the HAPs run over all dilations of the HAPs by numbers Therefore, Since and the trace of the identity on is this tells us that our decomposition of the identity has been achieved with the same constant
Let us now calculate in two ways. The first is to think of it as (where is the identity indexed by ). The second is to use our decomposition and to think of it as
Since the two are the same (almost, at least), there must be some such that In particular, if for every then there must be some such that which proves EDP (since for sufficiently large the interval contains a dilate of , and HAPs remain HAPs after dilation).
EDP and EDP for multiplicative functions
Here is a thought that I haven’t followed through properly, but I’m pretty sure something can be got out of it. Suppose one tries to prove the existence of a decomposition of the kind we want, not by going ahead and finding one but by showing abstractly that such a thing exists. One can return to the formulation of the previous post, where we are trying to write as a linear combination of functions of the form which, like Alec, I prefer to write as on the understanding that this represents not the characteristic function of the set of all with and but the number of ways of representing a number as with and . By the Hahn-Banach theorem, this is impossible with a given upper bound for the sum of the absolute values of coefficients if and only if there is a certain functional that separates from all the functions By that I mean that and for every Now if is such a functional, and is a dilate of then I claim that is another such functional. Indeed,
(I haven’t checked that that is correct — it might be that the second term should be )
I have a slight technical difficulty here, which is that I don’t know about Let me assume that it is and try to justify that assumption in a moment. I then want to readjust what I said above by changing the new functional to if That ensures that we still send to 1. I think that to justify that assumption I might want to do something like assuming that that is optimal. But I didn’t promise a rigorous proof here so will come back to this unless someone else manages to fill in the details.
What it is supposed to be leading to is that we can assume that the function is a completely multiplicative -valued function. The rough idea would be that we could replace by bigger and bigger averages of the form and that if wasn’t completely multiplicative in the first place then these averages would eventually become small, which would contradict the fact that they separated from all functions of the form
If that is correct (and at the moment we have a rather dodgy step so this certainly cannot be assumed) then we seem to have something like a proof that if the representation-of-diagonals approach to EDP fails, then there is a completely multiplicative -valued function with bounded discrepancy. I’m suspicious of that statement, because it seems to suggest that the vector-valued formulation of the problem is equivalent to the -valued version. More likely, the effort to rescue the above argument from its technical problems would result in a modified statement that would be precisely what Terry proved in his Fourier reduction argument.
The moral of the previous section
I’m interested in trying to get the ideas of the previous section to work for their own sake, but there’s also a point I want to make about how one might hope to prove EDP by finding a rectangle decomposition of the kind discussed above. What the previous section suggests is that if we try to prove this using an abstract existence argument then we will have applied duality twice and will be back to the original hard problem (albeit in its completely multiplicative form, but even that seems to be hard). So we must resist the temptation to try to find an argument that says something like, “These functions are sufficiently numerous and spread about that we must be able to find a decomposition of the desired form,” for the simple reason that if there is some function of bounded discrepancy then it shows that they are not sufficiently numerous and spread about.
Instead, we must try to find a constructive proof that such a decomposition exists. Alec has got the process started for us by finding a general construction that gets arbitrarily close to 1/2. I think we will learn a lot if we can beat that barrier, even if initially we do so by finding a fairly formless example by means of a brute-force search.