In the comments on EDP18 we are considering a certain decomposition problem that can be understood in its own right. At various points I have asserted that if we can find a decomposition of a particular kind then we will have a positive solution to EDP. And at various points in the past I have even sketched proofs of this. But I think it would be a good idea to do more than merely sketch a proof. So in this post I shall (I hope) give a completely rigorous derivation of EDP from the existence of an appropriate decomposition. (Well, I may be slightly sketchy in places, but only about details where it is obvious that they can be made precise.) I shall also review some material from earlier posts and comments, rather than giving links.
Representing diagonal matrices
First, let me briefly look again at how the ROD (representation of diagonal) approach works. If and
are HAPs, I shall write
for the matrix
such that
if
and 0 otherwise. The main thing we need to know about
is that
for every
Suppose now that is a diagonal matrix with diagonal entries
and that we can write it as
where each
and each
is a HAP. Then
If and
for every
then it follows that there exists
such that
and from that it follows that there is a HAP such that
So if we can make
arbitrarily small, then EDP is proved.
The advantage of this approach (and similar approaches related to semidefinite programming) is that it replaces a universal question — every sequence has unbounded discrepancy — with an existential one — there exists a diagonal matrix that can be decomposed in a certain way. However, it doesn’t solve the problem just like that, because it is far from clear how to find such a decomposition of a diagonal matrix.
Representing the identity over the rationals
A more recent idea that appears to simplify the problem considerably is to work over the rationals, which allows one to take the diagonal matrix to be the identity. Let me give a heuristic argument for this, which I hope will make it clear what I mean, and then follow it up with a more rigorous version of the argument.
By “work over the rationals” I mean that we shall consider infinite matrices where the indices
are positive rational numbers. The identity matrix is simply the matrix that is 1 if
and
otherwise. Matrix multiplication is defined in the usual way:
We shall consider matrices with only finitely many non-zero entries in each row and column, so we do not have to worry about the sums being infinite.
Now let’s suppose, back in the normal integers-up-to- case that we have a decomposition
Given any matrix
and any rational number
define the
-dilation of
to be the matrix
defined by
That is,
does to
what
does to
Now let us take the matrix
and call it the smearing of
(The rough idea is that we’ve spread
all over the place.)
If is a diagonal matrix, what is the smearing
of
? Well if
then
for all positive rationals
so
And
That is, is the identity matrix multiplied by the trace of
Now each dilation of a matrix gives us another matrix of the same form — that is, another product of HAPs. (Moreover, if
and
have the same common difference, then that is true for the dilations as well.) So from a decomposition of a diagonal matrix into HAP products we can get a decomposition of the identity over
into HAP products.
Where this gets a bit handwavy is the following statement: because the ratio of the sum of the absolute values of the coefficients to the trace of the diagonal matrix is small, the “average coefficient” in the infinite case is small too. Therefore, we can prove the same bound for functions defined on that we could prove for functions defined on
It is this last step that I would like to make rigorous.
A finite approximation of the rationals
I shall use a technique that is standard in additive combinatorics (I’m referring here to regular Bohr sets), though the basic idea presumably goes back well beyond that. The particular way I’ll use it is essentially the same as what Terry Tao did in his Fourier reduction argument. I want to take a big finite subset of
that is almost closed under multiplication by any number I could conceivably want to multiply by. By that I mean that if
has size
and
is some rational that I might want to multiply by, then
is almost the same size as
To visualize this, imagine taking a huge sphere and intersecting it with
Call the resulting set
If we take a random point
in
and add a non-huge integer point
to it, then unless we are extremely unlucky and
is right near the edge of
we will find that
also belongs to
So
is almost closed under adding small integer vectors. I want to do the same but for multiplication, which is pretty similar to vector addition by prime factorization.
Suppose, then, that we have an matrix
Then the non-zero entries of
occur at pairs of the form
where
and
are integers between
and
Therefore, the non-zero entries of the dilation
are at pairs
of the form
where
and
are integers between
and
And from this it follows that the non-zero entries of the smearing
of
occur only at pairs of the form
such that
is a rational with numerator and denominator between
and
Let
be the set of all such rationals.
Let us therefore construct a set of rationals such that
for almost every
(Here,
stands for the obvious thing: the set
) We can do this in any number of ways. Perhaps the simplest is to let
be the set of all primes up to
to let
be some large positive integer, and to take
to be the set of all numbers of the form
such that each
lies between
and
Note that the product of any element of
by an element of
is an element of
Note also that
has cardinality
So if we want
of the points
to have the property that
then all we have to do is choose
so large that
which we can certainly do.
From now on, all we shall use about is this property: the details of its construction are not important. We shall also consider
functions defined on
which we are thinking of as a finite approximation to
One remark about
is that we can dilate it so that it becomes a set of integers, so the fact that I have been talking about rationals is just a convenient infinitary way of thinking about the problem rather than a fundamental difference of approach. (In particular, I am not using any clever notions of limit.)
Running the argument inside X
Now let’s suppose that we have an diagonal matrix
that is decomposed as
where each
and
is a HAP. (Incidentally, by “HAP” I mean here the slightly generalized HAPs of the form
.) This time, instead of looking at the sum of all dilations of
let us look instead at the sum of all dilations
such that
This gives us a diagonal matrix
What can we say about it?
Well, For each
the entry
will be included in this sum if
In particular, it will be included if
Therefore,
equals the trace of
for at least
elements
For all
it is at most the sum of the absolute values of the entries of
and for
it is zero.
This tells us, in a way that is easy to make precise, that the entries of are given by a function that is approximately the trace of
times the characteristic function of
(Because the functions are bounded and the support of the difference is much smaller than
this approximation can be arranged to be good in any
norm with
)
But since any dilation of a HAP product is another such product, this means that we have a decomposition into HAP products of a diagonal matrix that is very close to the identity, where the indexing set is now
rather than
(Actually, the way I said it above, the matrix is supported not on
but on
But that can easily be fixed by summing over dilations corresponding not to all
but to all
such that
So I won’t worry too much about this technical detail.)
The key point here is that the sum of the absolute values of the coefficients in the decomposition is roughly and the trace of the diagonal matrix we get, which is very close to the identity, is roughly
times the trace of
So we obtain roughly the same
as before but now with the “identity matrix on
“.
That is a precise and rigorously proved version of the statement that if we can find a diagonal decomposition with a small then we can find a representation of the identity on
with the same small
The next thing to think about is what happens if we take a HAP product and sum over all its dilates. We are making the simplifying assumption that
and
have the same common difference. (It may turn out that we cannot achieve this. Then we simply switch to searching for more general decompositions where the common differences are allowed to be different.) Let us therefore assume that
and
What is the sum of all dilates of
at a point
such that
and
?
Well, we need there to be some rational and integers
and
such that
and
Given such
we know that
Conversely, if
can be written as
with
and
then we can set
We know then that
(since
), and we have
and
Thus, the value of the smearing of
at
is the number of ways of writing
with
and
That is, it is the number of integer points in the intersection of the rectangle
with the line of gradient
Two thing follow from this. First, if we are going to look at smearings, then we need only look at products where the common differences of
and
are 1. That is, WLOG
and
The second thing is that if we have a linear combination of functions
where each
and each
is a subinterval of
then its smearing will be diagonal if and only if for every pair of distinct coprime integers
we have
where
denotes the number of multiples of
that lie in the rectangle
If we put all these observations together, together with the observation made in this comment, we obtain the following conclusion: the representation-of-diagonals approach to EDP works if and only if for every there exists a system of rectangles
(each
being an interval of positive integers between 1 and
) and coefficients
such that
- for every pair
of coprime positive integers between 1 and
we have
I have explained the “only if” part of this. To see that a decomposition of this kind works, suppose that we have a function defined on
For convenience we shall assume that
takes values in
though more general assumptions are possible. For convenience let us normalize so that
We then take the smearing of the matrix
This equals 1 on the diagonal (except at the boundary of
— I’m referring to the smearing where we add all dilates for
) and 0 off it (again, perhaps with a few exceptions at the boundary). In other words, it gives us a very good approximation of the identity on
Let us write this decomposition as Here the HAPs
run over all dilations of the HAPs
by numbers
Therefore,
Since
and the trace of the identity on
is
this tells us that our decomposition of the identity has been achieved with the same constant
Let us now calculate in two ways. The first is to think of it as
(where
is the identity indexed by
). The second is to use our decomposition and to think of it as
Since the two are the same (almost, at least), there must be some such that
In particular, if
for every
then there must be some
such that
which proves EDP (since for sufficiently large
the interval
contains a dilate of
, and HAPs remain HAPs after dilation).
EDP and EDP for multiplicative functions
Here is a thought that I haven’t followed through properly, but I’m pretty sure something can be got out of it. Suppose one tries to prove the existence of a decomposition of the kind we want, not by going ahead and finding one but by showing abstractly that such a thing exists. One can return to the formulation of the previous post, where we are trying to write as a linear combination of functions of the form
which, like Alec, I prefer to write as
on the understanding that this represents not the characteristic function of the set of all
with
and
but the number of ways of representing a number as
with
and
. By the Hahn-Banach theorem, this is impossible with a given upper bound for the sum of the absolute values of coefficients if and only if there is a certain functional that separates
from all the functions
By that I mean that
and
for every
Now if
is such a functional, and
is a dilate of
then I claim that
is another such functional. Indeed,
(I haven’t checked that that is correct — it might be that the second term should be )
I have a slight technical difficulty here, which is that I don’t know about Let me assume that it is
and try to justify that assumption in a moment. I then want to readjust what I said above by changing the new functional to
if
That ensures that we still send
to 1. I think that to justify that assumption I might want to do something like assuming that that
is optimal. But I didn’t promise a rigorous proof here so will come back to this unless someone else manages to fill in the details.
What it is supposed to be leading to is that we can assume that the function is a completely multiplicative
-valued function. The rough idea would be that we could replace
by bigger and bigger averages of the form
and that if
wasn’t completely multiplicative in the first place then these averages would eventually become small, which would contradict the fact that they separated
from all functions of the form
If that is correct (and at the moment we have a rather dodgy step so this certainly cannot be assumed) then we seem to have something like a proof that if the representation-of-diagonals approach to EDP fails, then there is a completely multiplicative -valued function with bounded discrepancy. I’m suspicious of that statement, because it seems to suggest that the vector-valued formulation of the problem is equivalent to the
-valued version. More likely, the effort to rescue the above argument from its technical problems would result in a modified statement that would be precisely what Terry proved in his Fourier reduction argument.
The moral of the previous section
I’m interested in trying to get the ideas of the previous section to work for their own sake, but there’s also a point I want to make about how one might hope to prove EDP by finding a rectangle decomposition of the kind discussed above. What the previous section suggests is that if we try to prove this using an abstract existence argument then we will have applied duality twice and will be back to the original hard problem (albeit in its completely multiplicative form, but even that seems to be hard). So we must resist the temptation to try to find an argument that says something like, “These functions are sufficiently numerous and spread about that we must be able to find a decomposition of the desired form,” for the simple reason that if there is some function of bounded discrepancy then it shows that they are not sufficiently numerous and spread about.
Instead, we must try to find a constructive proof that such a decomposition exists. Alec has got the process started for us by finding a general construction that gets arbitrarily close to 1/2. I think we will learn a lot if we can beat that barrier, even if initially we do so by finding a fairly formless example by means of a brute-force search.
September 7, 2010 at 8:03 pm |
[...] Polymath5 By kristalcantwell There is a new thread for Polymath5 here. Let me update this there is another thread here. [...]
September 8, 2010 at 3:16 am |
Apologies in advance for this being more meta-mathematics than mathematics. Reading over this approach to the EDP I am reminded of some notoriously thorny problems in operator theory: the Kadison-Singer problem, the Feichtinger conjecture, and other issues related to “pavable” operators and matrix pavings. Appropriately formulated, all concern the existence or nonexistence of various kinds of “decompositions” of matrices, with special attention paid to what is happening on the diagonal. (See e.g. http://www.aimath.org/WWN/kadisonsinger/ the paper “Equivalents of the Kadison-Singer problem.”) A good number of papers on pavable operators seem to follow the general theme that an operator that is “pavable” in any number of weak senses, must in fact be “pavable” in stronger, “more structured” senses. That seems very similar to what is desired here. (Nik Weaver even has a paper on Kadison-Singer stuff with the word “discrepancy” in the title— which is all I can say, not being all that familiar with his body of work, or any work on the EDP.)
Of course in this approach to the EDP it seems that the matrices one wants decompositions of are far from arbitrary (as they tend to be in Kadison-Singer type problems). They have a ton of extra “structure.” I wonder if there is anyone who is familiar with both this approach to the EDP and Kadison-Singer type problems and can comment on the distinction between the two. (e.g. are they similar enough that somebody with a strong opinion on the probable truth of the EDP should have a strong opinion on certain kinds of matrix pavings, or vice versa?)
September 10, 2010 at 9:01 pm
Gil also mentioned this a while back. It does look rather similar — though the impression I get from a skim through that article of Casazza and Edidin is that the problem has more of an algebraic and less of a number-theoretic flavour. (For example, the word ‘prime’ doesn’t occur anywhere in that article, whereas it’s hard to imagine a discussion of EDP that didn’t talk about primes at some point.) But it could well be that we could learn something from approaches to the Kadison–Singer problem.
September 9, 2010 at 1:04 am |
Not directly related to this post, but I am curious if experimental results of long sequences of discrepancy 2 are still interesting/relevant/helpful. I thought about it over the weekend after reading the EDP18 post and checking out the polymath wiki, and managed to come up with a method to generate such sequences, assuming I am computing the discrepancy correctly, and not overlooking some desired structure that makes the problem harder. Found a sequence of length 2026 that is relatively uniform here: http://pastebin.com/S3KZVx4V Another sequence of length 20386 that isn’t very uniform was found here: http://pastebin.com/fUwr6tg0
September 9, 2010 at 7:01 pm
Jason, your sequences start with three 1:s, so they have discrepancy at least 3.
September 9, 2010 at 7:09 pm
Indeed, if you look along the even numbers, you find that the first one (and possibly the second — I haven’t checked) has discrepancy a lot more than 3. This makes me think you may have some misconception about the definition of discrepancy.