I think this will be a bit long for a comment, so I’ll make it a post instead. I want to try to say as clearly as I can (which is not 100% clearly) what we know about a certain way of constructing a decomposition of the identity on Recall from the last post or two that what we want to do is this. Define a square in
to be a set of the form
where by
I mean the set of all positive integers
such that
Let us identify sets with their characteristic functions. We are trying to find, for any constant
a collection of squares
and some coefficients
with the following properties.
where
and
is the number of points in the interval that defines
or, more relevantly, the number of points in the intersection of
with the main diagonal of
- Let
Then for any pair of coprime positive integers
we have
The second condition tells us that the off-diagonal elements of the matrix you get when you convert the decomposition into a matrix indexed by are all zero, and the first condition tells us that we have an efficient decomposition in the sense that we care about. In my previous post I showed why obtaining a collection of squares for a constant
implies that the discrepancy of an arbitrary
sequence is at least
In this post I want to discuss some ideas for constructing such a system of squares and coefficients. I’ll look partly at ideas that don’t work, so that we can get a sense of what constraints are operating, and partly at ideas that might have a chance of working. I do not guarantee that the latter class of ideas will withstand even five minutes of serious thought: I have already found many approaches promising, only to dismiss them for almost trivial reasons. [Added later: the attempt to write up even the half promising ideas seems to have killed them off. So in the end this post consists entirely of half-baked ideas that I'm pretty sure don't work. I hope this will lead either to some new and better ideas or to a convincing argument that the approach I am trying to use to create a decomposition cannot work.]
Using squares and fly traps.
A general idea that I have not managed to rule out is to build a decomposition out of “squares and fly traps”. I’ve already said what a square is. If you take the two squares and
then their difference is the set of all points
or
such that
It has the shape of two adjacent edges of a square. It is this kind of shape that I am calling a fly trap.
The idea then is to take a collection of fly traps with negative coefficients and a collection of squares with positive coefficients. In order for the second condition to hold, we need the following to hold: as you go along any line from the origin other than the main diagonal if you sum up the coefficients associated with the squares you visit, then the result should be cancelled out by the sum of the coefficients associated with the fly traps. In particular, if all squares have coefficients equal to 1 and all fly traps have coefficients equal to -1, then the number of times the line hits a square should be the same as the number of times it hits a fly trap. (I think of the squares as sending out “flies” that are then caught by the fly traps, which have some nasty sticky substance at their points.)
It’s not really necessary for the fly traps all to point in the same direction, and there are other small adjustments one can make, but the basic square/fly-trap idea seems to be what the computer is telling us works best in very small cases. (It is far from clear that this is a good guide to what happens in much larger cases, but it seems sensible at least to consider the possibility.)
For a nice illustration of a square/fly-trap construction, see this picture that Alec produced. Alec also has a general construction that gives us arbitrarily close to 2. Rather than repeat it here, let me give a link to the relevant comment of Alec’s (if you think of the bar as squaring the interval, it will be consistent with what I am saying in this post), and a link to a similar comment of Christian’s.
This example (or rather family of examples) uses a single fly trap of width and squares of width 2 (unlike the example in Alec’s picture, which I therefore find more interesting, despite the fact that it gives a worse constant). It is instructive to see why this gives us a bound of
If the fly trap has width
then it has
off-diagonal points. So we need
flies. Each square of width 2 contributes two flies, so we need
such squares. This means that
(since the fly trap needs two squares to make it) and that
The ratio of these two numbers tends to 2 as
tends to infinity.
It is not hard to see that if we could use squares of width 3 instead, then we would be able to get a constant arbitrarily close to 3. However, significant difficulties arise almost immediately. However again, this could be good news, because if we can find some way of getting beyond 2, we may by that stage have found a pattern that can be generalized. And I think there is some hope of pushing
beyond 2, as I shall now try to explain.
One fly trap is not enough.
First, let us see why there is absolutely no hope of achieving this with just one fly trap. The argument is simple. Let be the flytrap
If all the off-diagonal points in the square
are caught by this fly trap, what can we say about
? One necessary condition is that both
and
are factors of
But this implies that
is at least
which in turn implies that
is at least
Since we need almost all points in the fly trap to catch flies, we need at least
distinct flies, which is more than
So, roughly speaking, we need a constant fraction of the numbers
of order of magnitude
to be such that both
and
are factors of
This just isn’t going to happen.
Note that if we make smaller, to give numbers near
a better chance of dividing
then we are forced to increase
(or else the flies miss the fly trap). And that makes things even worse — we now have fewer possible flies and a bigger fly trap.
I’m sure it would be easy to state and prove something rigorous here, but for now I’d prefer to leave those thoughts as a convincing enough demonstration that a single fly trap will not do the job. But if that’s the case, what can we do? Well, the obvious next thing to try is several fly traps.
Pure randomness as a way of catching flies.
How can we make use of multiple fly traps? A first thought is that if we take the square and send out some flies, we could create two traps, one to catch flies with maximum coordinate
and the other to catch flies with maximum coordinate
But the trouble with this is that it seems to be far too tailored to one particular square: it is hard to believe that such a trap could catch the flies from several different squares. We would be asking for two integers
and
such that there are many integers
such that one of
and
divides
and the other divides
Actually, on writing that I realize that I have given no thought to it at all, so perhaps it is worth trying to show that it cannot be done (just in case, contrary to expectations, it can be done).
Since and
are always coprime, there seems no point in giving
and
any common factors, so let’s take
and
to be highly smooth numbers that are coprime to each other. And let’s try to find 3-by-3 squares that send out flies that are caught by one of the fly traps
or
(I’m assuming that
and
are roughly the same size. If it is convenient to take different
s then I don’t mind doing it, but I don’t expect it to help.)
If is fixed and
and
are large, then … I think we’re completely dead. We have
fly trap points below the diagonal to get rid of and three flies below the diagonal per 3-by-3 square, so we need about
squares. If
is one of those squares, then for the fly at
not to miss the fly traps, we need
to be at least
where
is the rough size of
and
So we need
pairs
such that we can find an integer
with
and
But that more or less fixes the ratio of
to
and anyway
is bigger than
From this I am happy to conclude that we need to change our attitude and go for many fly traps. The idea would be that the reason a fly hits a fly trap is not that the fly starts out at a very carefully chosen point (roughly speaking, a “factor” of the fly trap) but that there are enough fly traps for it to be almost inevitable that the fly will hit at least one of them.
What I mean by “pure randomness” is that we use the following mechanism for ensuring that the fly at hits a trap. If
then we simply make sure that there are at least
traps, fairly randomly placed, or rather
traps for some large constant
Then the probability that the fly misses all traps is small: roughly speaking, the expected number of traps it hits is
and if we can get enough independence then we can hope that all flies will hit roughly
of the traps. (This model turns out to be much too crude, since the probability of a fly hitting a trap depends very much on divisibility properties of the coordinates of the fly and the trap. But let us work with it for now.)
Some back-of-envelope calculations.
Let us try to check the feasibility of this idea. An initial observation is that most fly traps are useless for our purposes. If you choose a random large integer
then the fraction of integers coprime to
will be around
(If
is the other integer, then the probability that
divides both
and
is
so the probability that they are coprime is roughly
) But if
is coprime to
then the point
cannot catch any flies. If we have a large set of such points, then we are in trouble.
To deal with this, it seems that the only option we have is to insist that our fly traps occur at highly composite values of
so that almost all other integers have quite high common factors with
and therefore give rise to points that can catch many flies. It will be convenient to call
and
the height and width of the fly trap
In that language, we want fly traps with highly composite heights. (Note that the height refers to the altitude of where the fly trap is placed, whereas the width measures the size of the trap itself. Indeed, “altitude” is probably a better word than “height” here, but I prefer an Anglo-Saxon word if there is one.)
Now let us suppose that we have 3-by-3 squares, and a reception committee of fly traps with highly composite heights between
and
If the widths of the fly traps are all
(or perhaps all between
and
or something like that), then we’ll need
fly traps if we want a one-to-one correspondence between flies and trap points, and a bit more than that if we want each fly to hit
traps. Let us take
fly traps.
Now consider a fly at say. If its chances of hitting a given trap are
then we’ll also need there to be about
fly traps. That is, we’ll want
to be about
And for that fly not to miss the traps altogether (because its angle from the main diagonal is too large), we’ll need
to be at most
So we’ll need
to be bigger than
That looks pretty problematic, because we now need a very large number of fly traps, and it will not be possible to put them all at highly smooth heights between
and
: there just aren’t that many highly smooth numbers.
Just to make that more conceptual, the problem we have is that there are two conflicting pressures on the flies. If they are not high enough, then the angle they make with the main diagonal is forced to be large and they therefore miss all the fly traps. But if they are too high, then they are very unlikely to hit any given fly trap, which forces the fly traps to be extremely numerous, which forces there to be several fly traps at non-smooth heights, and therefore several points in the traps that cannot catch flies.
Smooth traps and smooth squares.
Is there anything we can do to get round this problem? I think there may be. There was one questionable assumption in the discussion above, which was that the probability of a fly hitting any given fly trap was about
The condition we need for this fly to hit the trap
is that
should divide
and that
should be at least
Now if we choose
randomly, then of course the probability that
is
But if we choose it as a random number with lots of small prime factors, and if
also has quite a lot of small prime factors, then we hugely increase the chances that
For instance, if
is a random multiple of 6, and
also happens to be divisible by 6, then the chances that
divides
are now
Let us now go back to the attempt above. Again let us suppose that we have 3-by-3 squares. Again, if we are taking fly traps
with
between
and
and if we want each fly to hit
traps, then we will need about
or so traps. But now let us suppose that all the traps have very smooth heights. More precisely, let us suppose that all the heights
are such that all but a small proportion of integers have a fairly high common factor with
Simplifying absurdly, let us suppose that this gains us an extra factor of
when we think about the probability that a fly
or
is caught by a given fly trap: now the probability is more like
rather than
What does that do for us?
It means that now, if we want each fly to hit traps, we’ll need not
traps (where
is the height of the fly) but more like
traps. We already know we need about
traps, so equating the two we find that
needs to be about
And if we want a fly at that kind of height not to be too far from the diagonal to hit the traps, we need
to be at most
which tells us that
should have approximate size
which is rather better than the earlier estimate of
(up to a constant).
But at this point we have an important question: are there enough highly smooth numbers between
and
?
To answer that, we need to think about what the typical probability gain is for a given number. Suppose, for instance, that is divisible by
where
is some set of small integers. For what
can we say that a random integer
has a good chance of having a highest common factor with
of at least
?
The expected number of such that
is
and we can expect this to be reasonably concentrated if the expectation is not too small. Writing
for
and assuming that
is a fairly dense set of primes (something like a random set of
of the first
primes, say) then the expectation will be around
so the value we get for
assuming (not quite correctly) that the primes
that divide
are fairly evenly distributed, ought to be around
or around
(We could get there by saying that the typical size of a
is fairly close to
and we are choosing
of these primes.)
This is fairly worrying, because in order to gain a factor we have to make the set of
we are allowed to choose very much sparser. It seems as though we lose a lot more than we gain by doing this.
The “smooth squares” in the title of this section refer to the possibility that we might try to choose so that both
and
have quite a lot of small prime factors. But such numbers are hard to come by, so again it seems that any gain one might obtain is more than compensated for by the loss in their frequency.
Special configurations of squares and fly traps.
Can we achieve what we want by making very careful selections of our s? It’s clear that something that helps us is to have pairs of heights
such that
is, when written in its lowest terms, of the form
or
It’s quite easy to find such that all of
are of this form: just make
extremely smooth and make all the differences small. Then the differences will divide
and we are done. But what if we try to ensure that
and
are of the required form? Then we need large positive integers
and
such that
is of the form
for a positive integer
That is, we need the reciprocal of
to be an integer. Rearranging, we want
to be an integer. It’s moderately reassuring to observe that this can be done: for instance, if
and
we get
But how about if
and
are very large? Or perhaps they don’t have to be very large, just as long as we can find a set
such that many of the ratios
are integers.
Let’s think about this slightly further. Suppose we have such a collection of integers. Then choose with enough factors for all the numbers I mention to be integers, and for each
let
What we want is for
to divide
That is, we want
to divide
So we need
to be an integer. (This doesn’t look very symmetric, but it is true if and only if
is an integer.)
Suddenly this looks a bit easier. It looks as though we’ll be OK if we make the all fairly smooth and make their differences small. Hmm … except that that doesn’t look easy after all, since if
is smooth and
is small, then
will not be all that smooth.
I won’t think about this for the time being, but I think it may be possible to construct, in a not quite trivial way, an arbitrarily long sequence of integers such that
is always an integer.
Let’s suppose we managed that. Would it help? What we could do is this. We let be some huge factorial so that it’s divisible by whatever we need it to be divisible by. We then define the numbers
as above: that is,
Since whenever
we have
of the form
for some positive integer
we can find an integer
such that
and
Therefore, potentially at least, we could consider using the square to knock out some points in the fly traps at heights
and
However, for this to have a chance of working, we want to be big, since otherwise our flies will be out wide again, which will force the traps to be big and we’ll get into all sorts of problems.
But it’s problematic either way. If we want traps of width at most some fixed then we need
to be at least
For that we need the integers
to be of size at least
(since
), and more than that unless they are close together.
But we also need the to divide
so we can’t just choose the
and then make
huge. Rather, what we seem to want is a number
that has so many factors of size somewhat bigger than
that we can find interesting clusters of them such that many of the numbers
are integers.
I should think a bit more about how many of these numbers actually need to be integers. Perhaps we don’t need them all to be integers — if not, then we would have a much greater chance of success.
If the fly traps have width then we have
points below the diagonal that need to be hit. Each good pair
leads to three flies that can do the hitting. So it looks as though
needs to be bigger than
I think I must have made a mistake here, since there are basically only two chances to hit the point : we must do so either at
or at
So we need an extraordinary miracle to occur: it must be possible to partition (almost all of) the numbers
and
into pairs of consecutive integers. This does not feel possible to me.
I’m going to stop at this point. I’ll end with the obvious question: is it possible to create an example out of squares and fly traps? Part of me thinks that the square/fly-trap idea is almost forced, since we need the bigger points to avoid coprime pairs. I think also that I have not devoted enough space to discussing bigger fly traps — ones where the width is proportional to the height, say. This requires bigger squares, but it may be possible to do something. In fact, I’ll think about that (not for the first time) and if anything interesting comes out of it then I’ll extend this post.
September 10, 2010 at 8:33 pm |
Just to observe that finding smooth numbers close enough together in bunches is at the heart of the problem. The discrepancy 2 sequence looked as though it might possibly go on for ever, and was certainly longer than most initial expectations. It failed because smooth numbers which are close together (in a sense to be made precise) belong to enough different HAPs to cause a blockage – which seems very close to the ‘let’s design a flytrap’ problem.
Since we are now dealing with completely multiplicative sequences on the rationals, it is possible that there is a particularly effective flytrap associated in some way with the maximal completely multiplicative sequence with discrepancy 2. [maximal sequences which are not completely multiplicative might also play a part in this].
September 10, 2010 at 10:13 pm |
I have an observation that is similar in spirit to yours. (In case Mark’s comment gets replied to, I’m referring to the first comment on this post.) In a sense the thing that is causing me most difficulty is that there are lots of points
such that
and
are coprime. This too seems to be closely related to the “true” reason that the problem is hard: that there are many pairs of numbers such that choosing the value at one of them has almost no bearing on the value at the other.
A small comment is that we are not just restricting attention to completely multiplicative functions, though I suspect that we more or less can (especially if we allow them to be complex).
September 10, 2010 at 10:36 pm |
I’ve just had a thought that may be quite helpful on the theoretical side. A problem I’ve been having up to now is that small squares do not contain many flies and therefore do not hit many points in the fly traps. But there is another way that we could consider hitting them, which is with thickened flytraps.
Let me try to explain what I mean. Suppose that instead of a square of width 3, which gives us a contribution of 3 to the trace and 1 to the sum of coefficients, we were to take a function of the form
This would contribute 6 to the trace and 2 to the sum of (absolute values of) coefficients, so would be as good in that sense, but it would also send out far more flies. That would allow us to choose far fewer functions of this kind, so we might be able to place them in very special places where all of
are highly composite.
I don’t know whether this deals with my difficulties, but it gives us an extra parameter to play with, which can’t be a bad thing.
September 10, 2010 at 11:13 pm |
One other small thought, which might conceivably lead to something that could be searched for experimentally. Perhaps we could pick a whole lot of multiples of
for some small
, add together the corresponding Alec examples, and then hope that some of the 2-by-2 squares overlap or otherwise lead to potential gains in efficiency. For example, if we are lucky enough to have amongst our three 2-by-2 squares the squares
and
then we can replace them by a single 3-by-3 square
This will give us the same set of flies, but the efficiency will have gone up from 2 to 3 (since the sum of coefficients is now 1 instead of 3 and the contribution to the trace is now 3 instead of 6).
It’s just conceivable that we could get enough coincidences like this to push the value of
down below 1/2. But I think we might need a computer to search for them.
September 11, 2010 at 9:36 am
This is a great general idea, but unless mistaken I think the particular replacement you mention will not appear when adding Alec examples (since all the little 2-by-2 squares there are of the form
by construction, so one shall never have a
).
Perhaps one might try some reverse engineering, that is contruct three examples (even without a low
to start with) where one contains a
, a second a
and a last a
, in such a way that taking multiples allows to reproduce the trick again and again.
September 11, 2010 at 9:58 am
Oh yes. I think though that this can be dealt with by having some of the fly traps as inverted Ls and others as Ls. That is, some are of the form
and others are of the form
I’m in general keen on the idea of trying to get product constructions to work, so I’d be interested to understand better what you mean by your reverse engineering suggestion.
September 11, 2010 at 6:40 pm
What I meant is to design an example using your trick as a cornerstone. I think I finally managed to get it to work, the idea is to use something a bit reminiscent of telescoping series combined with your trick (resulting in an amplification of a small disturbance, like the tricki idea “create an epsilon of room”).
Namely, at step
we build a decomposition which has both the pair
,
and the third one but shifted by 1,
. So when we add the step
we get the cancellation between the pair of step
and the lone one of step
, i.e. we do the replacement you proposed and thus
decreases a little bit. When done and we add step
and do the new replacement to decrease
still a bit more, and so on.
Here is a drawing of an example which seems to work (it’s a picture of step
, the flies are cercled in green, and the traps in red). For instance at step 1 we have
, then adding step 2 to it and replacing we get
. So this way one would get
arbitratily small. Of course this very much requires independent confirmation, I may well have made an error again.
http://thomas1111.files.wordpress.com/2010/09/ideanew.jpg
September 11, 2010 at 7:05 pm
Typo in previous message: it should read “shifted by 1,
“.
September 11, 2010 at 8:41 pm
Thomas, I’m curious to understand your construction but I don’t. Would it be possible to say precisely what all the rectangles and coefficients are after, say, three steps of the construction?
September 11, 2010 at 9:26 pm
I find the general idea interesting. It seems that Tim is proposing a kind of transformation rule whereby we can replace any decomposition that includes
,
and
with the same decomposition with those three terms replaced by
, and still have a valid decomposition with smaller
. For example, we can replace
by
.
Now there may be other such transformations. And we can always form a convex combination of decompositions that achieve
(or better), to obtain a new decomposition that achieves
(or better).
So, I wonder whether given a sufficiently rich supply of decompositions (not necessarily particularly ‘good’ ones), of unbounded support, and of transformations like Tim’s, we could apply the transformations repeatedly, combined with taking convex combinations, to get better and better
.
I’d be interested to know whether there are other similar transformations, or whether this is a ‘one-off’ …
September 11, 2010 at 10:58 pm
All pieces have the same coefficient except for signs, so that computing
is done with the formula you derived earlier
, that is only the number of squares and rectangles and their sign and imbalance matter.
Here is a new picture with much better explanations (please disregard the previous messy one as I’ve changed color conventions): it shows what happens in three steps: first we have piece
which is made of two parts, then we add to it piece
, and finally we apply your trick. I’ve corrected some errors from my previous message: in fact piece
has
, and the addition of piece
and piece
after the trick becomes
.
(you can click on the image to see it full size in your browser)
http://thomas1111.files.wordpress.com/2010/09/idea-steps-explained.jpg
September 12, 2010 at 8:13 am
In fact the general formula after adding
pieces and doing the trick each time is
, so it doesn’t decrease arbitrarily as I thought, and my construction fails blattantly.
So to exploit properly your replacement idea, one should find a construction which manages at each step to use it more often than one adds new squares & rectangles, so as to obtain something like
with
.
September 11, 2010 at 11:30 am |
There seems to be a basic snag with what I was suggesting. Alec’s construction relies on our taking a multiple of
so that we can pick off all the points in a fly trap of width
(It would be sufficient, as he has noted elsewhere, to take the lowest common multiple of 1,2,…,k.) Then for each such multiple
he takes 2-by-2 squares near the numbers
Now we know that mod
the number
is congruent to
It follows that these numbers are all well-separated mod
And from that it follows that the kinds of coincidences I was hoping for, where we have pairs of numbers
and numbers
such that
is very close to
simply don’t exist.
I’m not sure about this, but I think it may be possible to prove that a construction with fly traps and with squares that are mostly of size 3 cannot be made to work. The rough idea of the proof (I have more of an idea of the proof than of the statement that it proves) is this. If you want fly traps of width
and you want all your flies to be at distance either 1 or 2 from the diagonal (defining the distance of
to the diagonal to be
), then almost every
used for a fly trap must be at a number such that
is divisible by almost every number from 1 to
But that makes every
divisible by half the l.c.m. of
Call that number
This means that if we have
and
then the distance between
and
is at least
if
and at least
if
and
So in all cases it is large.
What I think this proves is that the flies that are very close to the diagonal and that can be caught in highly smooth traps are well separated, and therefore cannot be covered efficiently by squares. And if we try to deal with this by making the traps less smooth, then we will introduce lots of points that cannot catch any flies that are very close to the diagonal.
I might have thought that we were in serious trouble at this point, but I think the thickened-L idea from a couple of comments back could be helpful.
September 11, 2010 at 6:52 pm |
“So in the end this post consists entirely of half-baked ideas that I’m pretty sure don’t work.”
Depending on the Topology of the Mathematical Ideas subspace of the “Ideocosm” [the space of all possible ideas], sometimes a 1/2 baked idea + a 1/4 baked idea a 1/8 baked idea a 1/16 baked idea … in the limit becomes a complete and correct idea.
September 11, 2010 at 6:59 pm |
Must have missed the “+ key from lack of coffee. Okay, if you permit me to try again:
Depending on the Topology of the Mathematical Ideas subspace of the “Ideocosm” [the space of all possible ideas], sometimes a 1/2 baked idea + a 1/4 baked idea + a 1/8 baked idea + a 1/16 baked idea … in the limit becomes a complete and correct idea.
Or should I be defining a Selberg zeta function over all closed geodesics on dynamical systems on the Ideocosm for surfaces of constant curvature -1 in 1-to-1 correspondence with the periodic orbits of the associated geodesic flow which takes place on the unit tangent bundle of the manifold?
September 12, 2010 at 11:38 am |
Here’s an idea for a general method of construction. But before I say what it is, I want to make an important general point, which is that there isn’t a hugely strong reason to believe that the decompositions we are currently searching for actually exist. That is because in order to simplify the search we are placing an artificial constraint on ourselves: to consider only those HAP products
for which
and
have the same common difference. Note that if this is the case, then we can divide them both through by the common difference, which is why we have been considering progressions with common difference 1. Maybe now that we have looked pretty hard it would be an appropriate moment to allow ourselves the occasional more general product if it seems to help. The only rule is that
and
must be segments of homogeneous progressions, and WLOG their common differences are at least coprime. So for example, if we wanted to use
we would be free to do so. Of course, if we allow ourselves extra flexibility, then we’ll have to experiment a bit before we get a feel for what the smallest “easy” value of
is.
But for the rest of this comment I want to stick with progressions of common difference 1. The basic idea can be summarized as follows: why not try to find a construction made out of fly traps and fly traps? That is, we could simply aim to take some linear combination of fly traps that cancels off the diagonal. (We know this is possible in non-trivial ways, since a square is itself a union of consecutive fly traps.)
Now at first this is hopeless, since the efficiency of a fly trap is at most 1/2 (it adds 2 to the sum of coefficients and makes a difference of 1 to the trace), meaning that the best
we can hope for looks like being 2. And if the positive and negative coefficients are fairly well balanced, then the efficiency will be smaller still. So there are two improvements that would need to follow. First, we would want the fly traps with negative coefficients to be quite a bit bigger than the ones with positive coefficients. That way, we would use fewer of them, so the trace would be proportional to the total number of fly traps, and perhaps even (if the negative fly traps are much longer on average) almost equal to it. But that will get the efficiency to
at best. To go any further we need something else to happen: we want a significant fraction of the positive fly traps to be partitionable into intervals of length greater than 1. If they also drop to the same point below the diagonal (which could be a rather awkward constraint to insist on — this is a potential problem with the suggestion), then they form thickened Ls and we can reduce the sum of coefficients.
This focuses attention on the following class of functions defined on rationals in
Let
be the characteristic function of the set
Can we find a linear combination of these functions that cancels, or almost cancels (leaving us to mop up the error) in an interesting way? And can we do it with plenty of clusters of consecutive
s?
I find this a clearer formulation of the problem — not too far from being equivalent to what we were trying to do anyway — since it focuses attention very firmly on the number theoretic questions that have to be solved. Also, one can simplify it a bit further by looking at
which is the characteristic function of
Bearing in mind the first part of this comment, we would be just as happy if we had a cluster of
s that formed a HAP segment. And I think we also wouldn’t mind choosing characteristic functions of sets such as 
One other remark I’ll make briefly but I’ll try to work out the details and post them in another comment later. I think if we always take the same ratio (or almost the same ratio)
for our functions
then we’ll probably be doomed to failure, since then the functions will all have small inner product with the characteristic function of rationals less than
minus the characteristic function of rationals greater than or equal to
whereas this is not true of the main diagonal. This is the essentially the same point that was made before about its not being possible to prove that there is high discrepancy on a “full” HAP, meaning that we fix
and insist that the HAP consists of all multiples of some
up to 
September 12, 2010 at 3:25 pm |
This comment is an attempt to understand when the existence of certain functions on
gives us a proof that decompositions of a certain kind cannot work.
Suppose, then, that we think we have a linear combination
where each
is a square
and the sum along off-diagonal lines is zero. It follows that for any function
defined on
that is constant along rays from the origin we have
Now
so if we can come up with a function
that has discrepancy at most
on any square
that we use in our decomposition, then the left hand side is at most
If in addition
then that tells us that
which tells us that we cannot make
any better than
with this collection of squares.
In particular, if we could find a completely multiplicative
-valued function
with HAP discrepancy at most
then we could define
and we would have the property
together with the property
so we would be able to take 
We shouldn’t be too discouraged by this of course: it is trivial that if there is a completely multiplicative function of bounded discrepancy then our approach is doomed to failure. Where this kind of observation might be useful is in demonstrating that certain even more restricted classes of decomposition cannot work.
For example, here is a slightly strange restriction that we cannot ignore. Let
be the Liouville function (that is, -1 to the power the number of prime factors). The above argument shows that to obtain a decomposition powerful enough to prove EDP we will have to use intervals
on which the sum of
is unbounded. This means that if we have the idea of some clever construction using a particular huge and smooth
or something like that, we have to have a reason that the intervals that arise are ones on which
sometimes has a large sum. Since
behaves in a fairly random way, especially on small scales, this looks like quite a hard thing to do explicitly. I think it more or less forces us to use more global arguments where there are so many intervals around that on average we are pretty sure that
(or any other multiplicative function) will have big discrepancy on some of them.
Ultimately, we won’t be trying to verify that multiplicative functions have large discrepancy on our collection of intervals. Rather, it will go the other way round — the existence of a clever decomposition will prove that this is the case. But it provides a plausibility check: if there aren’t very many intervals about and they are all quite small, and if there is enough freedom in the construction that we can regard where they are as somewhat random, then there is no reason for them not all to be intervals on which
has small discrepancy, so the attempt is almost certainly doomed to failure.
Now let me see whether the remark at the end of my previous comment is correct. Suppose we try to find a decomposition that is built out of squares
that all have the property that
for some fixed
(Since a fly trap is a difference of two squares, this applies to fly traps too.) An obvious function that might have small discrepancy on all of those squares is one where you choose
such that exactly half of the square
(now considered as a square in
) consists of points
with
Then one would expect the discrepancy on the integer approximations to these squares to be small too. What I don’t know at the time of writing is whether this discrepancy would be bounded or whether it would grow linearly. Hmm … or perhaps I do. It looks as though there would be a linear error arising from when the squares suddenly increase in width by 1 (in the integer case). So perhaps we can get away with squares of this form.
Incidentally, I forgot a condition: we want to insist that
is symmetric in
and
since we are insisting on the same for the decomposition.
September 12, 2010 at 3:39 pm |
As an immediate application of the criterion in the previous comment, we can say something about the approach suggested in the comment before that. I suggested that it might be a problem if we always tried to take
for our functions
And that is indeed the case, since if we define
to be 1 if
or
and -1 otherwise, then
will be 1 on approximately half of each fly trap and -1 on approximately half. This will make the discrepancy bounded on each fly trap (because we’ve fixed
) and will cause the approach to fail.
What this shows is that if we build a decomposition entirely out of fly traps, then we will be forced to use many different shapes of fly trap (where I am defining the shape to be the approximate ratio of width to height). I would dearly like to understand how differing shapes can be of any help. Perhaps it’s just for the simple reason discussed earlier: that we want the bottoms of the fly traps to be at the same level so that we don’t waste coefficients dealing with the small errors that would otherwise appear there.
September 13, 2010 at 9:07 am |
In this comment I’m going to try to find some interesting linear dependences amongst sets of the form
Alec’s example corresponds to this. We take
and then each
is an integer, so we can add up the singletons
and subtract the set 
As an experiment, let me try taking
and
Writing the fractions in their lowest terms we get
If we subtract off the set
we get the set
Apart from the
we can write this as a sum of singletons. And we can represent
as the difference between the multiples up to 5 and the multiples up to 4.
Translating back into 2×2 squares and fly traps this gives a decomposition with
Let me explain how I worked that out. Each AP of length more than 1 corresponds to a fly trap so it costs 2. For instance, the initial one consisting of the first ten multiples of 1/126 corresponds to the fly trap
A singleton of the form
can be realized as the 2×2 square
And a singleton that is not of this form can be represented as a difference of two APs, so has a cost of 4. We have two APs, four reciprocals and one stray rational 5/126, so the total cost comes to
The sum along the diagonal is 2 for each 2×2 square, 0 for the difference of two fly traps that gives us 5/126, and the main two fly traps have coefficients of 1 and -1, so the total comes to 8.
September 13, 2010 at 12:41 pm |
I seem to keep running into rather similar difficulties when I try to arrange for plenty of coincidences and efficiency gains. I want to see if I can understand these difficulties in a more precise way, by which I mean show that there are various requirements that hugely restrict the possibilities for any decomposition. At best, this would lead to more efficient ways of searching for decompositions.
If we stick with the idea of making everything out of fly traps and then sticking some of the fly traps together to save on the cost of coefficients, then in broad terms what we need is this. In order for
to be big compared with
it is important that the sizes of the
for which
is negative should be generally somewhat larger than the sizes of the
for which
is positive. It is also vital that there should be very few points with coprime coordinates in the second half of the decomposition, since it is impossible to cancel these out efficiently. This seems to tell us that the heights of the higher fly traps should be very smooth.
Let me try to quantify that last remark. For simplicity, let’s suppose that our combination of fly traps takes the form
where all of the
and
are fly traps. And suppose that we have managed to do this in such a way that the
are on average bigger than the
sufficiently so for the trace to be proportional to the number of
And let’s suppose that the
contain
points with coprime coordinates. Each of these points has to be cancelled out in a very crude way, at a cost of 4, so 4t needs to be small compared with the number of
Another constraint, which is quite hard to reconcile with the smoothness, is that we want there to be many consecutive heights. Combining that with the cancellation we need, that tells us that we want many heights to have factors that are very close to each other.
I can feel this comment getting less and less precise, so I’m going to abort it and try again when I’m clearer in my mind about what it is I want to say.
September 13, 2010 at 7:11 pm |
I’ve thought of a different way of trying to understand why the problem we are now thinking about is a difficult one. I more or less said it above, but now I think I have a clearer understanding of it. Actually, looking back, it seems that in this comment I came close to saying what I’m about to say here, but didn’t quite say it.
Anyhow, I just want to make the simple observation that if the current approach succeeds, then it implies a rather strong looking discrepancy result, though quite how strong is something I’m still not clear about. The observation is this. Suppose we can find a
-valued function
on
that is constant along rays (that is,
is always equal to
) and that has discrepancy bounded above by
on all squares
Then we cannot get a decomposition with
I gave the proof in the previous comment, but I then concentrated on functions of the form
where
is completely multiplicative. However, it may be helpful to think about more general functions.
For example, it occurred to me that if we could find a function
that is constant along rays and is such that for every
and
we have
then our approach would fail, since such a function has discrepancy at most 4 on any square. (This can be checked.)
However, it is quite easy to prove that no such function exists. The most conceptual way to see it is to observe that the entire function is determined by
I’ll illustrate that with an example:
One can show that
is a sum of a Morse-like function in
and a Morse-like function in
And such a function has no reason to be constant along rays — and indeed isn’t. The first place a problem shows up is this:
But we need
However, we could play a similar game with 4-by-4 squares, looking for a function
that sums to zero along every row and every column within each one of these squares. That allows a discrepancy of up to 16. Now we no longer have the situation where choosing the value at one place in a square forces all the other values — there are a number of different ways of getting all the row and column sums to be zero. So now it suddenly looks much much harder to prove that we cannot find such a function.
An additional very important moral of this is that if we use a restricted class of squares, such as squares that have a very small side length compared with how far up they are, then this discrepancy result has to hold for that class of squares. Indeed, if you’ve got a collection of squares and want to prove a lower bound for the
that you can use it to obtain, then a good way is to choose a suitable function
For instance, for Alec’s example one can choose
to be 1 when
and 0 otherwise. This has discrepancy 2 on all 2-by-2 rectangles and 1 on all fly traps, so we see immediately that it cannot do better than
(though of course this is also easy to see directly).
I’m fairly sure that this line of thought will make it very clear how the different squares must interact with each other if there is to be any hope of their beating the
bound. Basically, it is essential that there are lots of rays that intersect lots of squares.
September 13, 2010 at 8:53 pm
That’s a nice clarification of the problem.
Do we gain anything by insisting that
be a
-valued function? It looks to me as if any real-valued function will do. (Indeed the example you give at the end is a
-valued function.)
In your earlier comment, you wrote:
Did you mean
September 13, 2010 at 9:42 pm
I concentrated on
functions just because they’re quite nice to think about, but if we wanted to investigate the problem computationally, then it would probably be easier to look at more general functions, since then we’d have a linear program rather than an integer program. I allowed myself 0 to analyse your example because it didn’t seem to be that easy to do so without it, which does confirm that the extra flexibility could be useful. I’ve corrected the earlier comment, which was indeed wrong (though the correction was slightly different because I needed to sum over
).
If we do stick with
-valued functions, then we can formulate quite a nice combinatorial problem. Suppose we have a bunch of squares and we want to prove that no combination of them will give us a small value of
Then we can do it by finding a
-valued function that’s constant on rays and that has small discrepancy on every square.
The examples we’ve been looking at have tended to have rather small squares, so let us make the additional assumption that, apart from on the main diagonal, no square contains more than one point on any ray.
Now let’s form a set system. Its vertices are the rays, and the sets are given by the squares. That is, if
is one of our squares, we associate with
the set of all rays that pass through
(or equivalently the function
which because of our assumption takes values in
except at
where it takes the value
). We now want our set system to be sufficiently “dense” that it has a high discrepancy.
This is a necessary condition, but it’s a pretty strong necessary condition that should hugely narrow down the search for a construction. Indeed, it could even narrow it down so much that we were able to prove that there was no such set system, which would force us to consider HAP products with different common differences.
Now let’s narrow things down further and suppose that each ray that hits any squares at all hits exactly two squares (one with a coefficient of 1 and the other with a coefficient of -1). This tells us that in our set system each point is contained in precisely two sets. So an obvious preliminary question is whether we can think of any set system at all with the property that each point is contained in two sets and the discrepancy is at least, say, 10.
If each point is contained in two sets, we can form a graph: its vertices are the sets and its edges are the points, so if
belongs to
and
then we regard
and
as joined and
as the name of the edge joining
to
Now we are trying to find a
-valued function defined on the edges that is well-balanced at every vertex. Or rather, we want a graph where that cannot happen.
So now we have a problem that I rather like: is it true that for every constant
there is a finite graph such that, however you 2-colour the edges, there must be a vertex such that the number of blue edges coming out of that vertex differs from the number of red edges coming out of that vertex by at least
?
I’ll have a little think about this, but I’ll also post it now in case anyone else wants to think about it. It doesn’t seem as though it should be hard, but I don’t immediately see how to do it. I hope the answer will be that there does exist such a graph and that we can make it pretty sparse. If it turns out to be hard, then I’ll also post a question on Mathoverflow, but I won’t do that for the time being.
September 13, 2010 at 11:28 pm
A small remark about that last problem: if we can partition the edges of the graph into cycles of even length, then we can just colour the edges alternately red and blue along each cycle and we will have a discrepancy of zero on the edges at each vertex. In particular, this happens if the graph is bipartite and every vertex has even degree.
Also, if we have something like an expander of degree
then we can probably partition it into a bounded number of cycles. (I don’t know any theorem that actually says that we can, however.) Then we could do something similar, colouring edges alternately along the cycles. If a cycle was odd, we would have to have two consecutive edges of the same colour, but if it was long, then we would have a huge amount of choice about where to put the one bad vertex. So presumably we could have at most one bit of badness at each vertex, which would give us a discrepancy of at most 2.
So it seems that we want a graph that cannot be partitioned into even cycles and a few long odd cycles. It isn’t obvious to me how to construct such a graph.
But actually I’m starting to think that this problem is not as relevant to EDP as I thought. After all, if we have a system of squares where each ray hits exactly two squares, then the resulting graph must be bipartite for there to be any chance of a decomposition. (Proof: the coefficient of a square must be minus the coefficient of any neighbouring square, so the signs of the coefficients give you your bipartition.)
And now I think I have a trivial solution to the problem anyway. Consider the following algorithm for partitioning the edge set. We’ll construct a trail (that is, path that’s allowed to visit vertices more than once, but we are not allowed to revisit edges) and make it maximal in both directions. We do this by simply extending the trail however we like until we get stuck. How do we get stuck? The only way is if we arrive at a vertex and find that there are no unused edges leaving that vertex.
Having constructed a maximal trail, we colour its edges alternately. This contributes zero to the discrepancy of all vertices except those at the two ends of the trail, where it contributes 1 (or possibly 0 or 2 if those two ends happen to be at the same vertex). But a very important point is that if we cannot extend the trail any further, then there are no edges left at the vertex that has a discrepancy of up to 2, so if we now build some new trails, we will not contribute any further to its discrepancy.
So to continue, we simply remove all edges from the first maximal trail and start again. In that way we find a colouring of the edges such that at each vertex the discrepancy is at most 2.
The moral of this, which is quite interesting actually, is that to find a decomposition of the kind we want, we must have at least one ray that meets more than two squares, from which it is reasonable to conclude that we will actually want several rays that meet quite a bit more than two squares.
September 14, 2010 at 6:37 am
It occurs to me that the problem of constructing a function
, constant on rays, with low discrepancy on squares of the form
where
is an interval feels considerably easier than the problem of constructing multiplicative
-valued functions on
with low discrepancy on intervals (which we think is more or less equivalent to EDP). The reason is that for the latter problem, if we want to construct our function by assigning values to integers in order, we only have freedom to choose at primes, which are vanishingly rare as we go up. But for the two-dimensional problem, if we want to construct our function by assigning values in some moderately natural order, we have freedom to choose at all coprime pairs, which are relatively abundant – indeed, I think more than half of pairs will tend to be coprime. Given such a degree of freedom, it wouldn’t surprise me if it were quite easy (at least for a computer) to just write down a suitable function
with discrepancy 2 on squared intervals.
September 14, 2010 at 8:13 am
That’s certainly a tempting conclusion, and it may be correct. It would be quite interesting to get some idea of how easy the computer finds it, and perhaps we could prove quite quickly that any example with
would need
to be at least 1000 or something.
But it also seems to me conceivable that we would find that there were little clusters of smoothness — that is, rectangles that contain very few coprime pairs — that made the task difficult. It might even be that if we ran a program to find a function with small discrepancy on squared intervals, we could examine where it found itself doing a lot of backtracking, use that to identify a collection of points, and then search for a decomposition based on small squares that contain those points. This seems to me to be very much worth trying, as it could be an efficient method of searching for decompositions with significantly larger
I was about to write “squares that contain very few coprime pairs” above, when it occurred to me that the fly trap idea is more or less forced on us because a square that sits on the diagonal (that is, one of the form
contains
points of the form
(not to mention points of the form
etc.). So we have to go for long and thin.
I’m a bit anxious about the fact that we need a collection of squares such that many rays meet them more than twice. Even the one-dimensional consequences of this seems hard: we appear to need something like a collection of intervals such that almost every HAP that intersects any one of the intervals intersects at least two of them, and several HAPs intersect more than two of them. And we want the average lengths of these intervals to be significantly greater than 1. That could be another way of narrowing down the search when
is large: first search for some good
coordinates, and then search for decompositions that are largely based on those
coordinates.
September 14, 2010 at 8:35 am
The graph colouring problem is actually a variant of something which I have used in a paper on hard SAT-problem.
Given a connected graph wit hall degrees even we can find an eulerian walk on the graph, and colour the edges alternatingly along that path. If there is an even number of edges this gives discrepancy 0 and if their number is odd we get discrepancy 1. An old theorem of Kotzig shows that this construction is optimal on all eulerian graphs.
If there are vertices of odd degree we can add a matching among the odd degree vertices and use the eulerian walk construction on the new graph. By deleting the new edges we raise the discrepancy by at most 1. If there were vertices of even degree in the original graph we can avoid raising the discrepancy above 1 by starting the walk in a vertex of even degree.
September 14, 2010 at 9:17 pm
I’ve done a simple experiment to search for functions
, constant on rays, with low discrepancy on rectangles of the form
. (This is enough to give us four times the discrepancy on all rectangles, and is somewhat easier to work with.) The experiment has shown that it is at least not totally trivial. That is, if we step along diagonals of constant sum, in order ((1,1), (2,1), (3,1), (2,2), (4,1), (3,2), (5,1), (4,2), (3,3), (6,1), …), and we assume that the function is symmetric, and whenever we reach a coprime pair
we assign it a value that minimizes
(choosing positive values when the existing terms sum to zero), then the discrepancy goes up to at least 7 (we hit 7 at the point (143,13)).
September 14, 2010 at 12:05 am |
[...] Polymath5 By kristalcantwell There is a new thread for Polymath5 here. Let me update this there is another thread here. And yet another here. [...]
September 14, 2010 at 10:43 pm |
I’m going to think briefly about the 3-uniform hypergraph version of the graph problem I mentioned above that turned out to be easy and to have the opposite answer to the one I had hoped for.
The question becomes this. Does there exist
such that, given any collection of sets of size 3, is it always possible to colour them red and blue in such a way that for every point the number of blue sets containing that point and the number of red sets containing that point differ by at most
?
Let us call the points vertices and the 3-sets edges. (This is standard hypergraph terminology.) And let us define the discrepancy of a vertex to be the difference between the red degree of that vertex (meaning the number of red edges containing it) and the blue degree. Finally, let us define the edge discrepancy of the hypergraph to be the smallest we can make the maximum discrepancy at a vertex. (I’m calling it edge discrepancy because it is the edges that are coloured. The discrepancy of a hypergraph standardly refers to vertex colourings and minimizing the discrepancy over all edges.)
So now we are asking whether the edge-discrepancy of a 3-uniform hypergraph is bounded by some constant
This is a completely different problem from the graph case because the greedy algorithm we used there fails completely. Indeed, if you colour an edge red and then try to guarantee that you haven’t done too much damage, then you will typically have to colour three “neighbouring” edges blue, and then each of those will have two further neighbouring edges that need to be coloured red, and so on, expanding out into a 3-regular graph. Unlike a path, where there is trouble only at the end points, a 3-regular graph can grow at an exponential rate, which means that a positive proportion of its vertices could in theory be troublesome ones.
That leads me to suspect that the answer is going to be that there are 3-uniform hypergraphs with unbounded discrepancy. (Whether we could realize such a hypergraph in terms of squares and flytraps and rays that pass through them is quite another matter of course.) The obvious thing to try would be a fairly sparse random 3-uniform hypergraph. That would give us the kind of exponential growth that we expect to cause trouble.
It occurs to me that one might try to prove a lower bound on the discrepancy by the sorts of methods we are already using for EDP. That is, one might try to find a decomposition of the identity. Or perhaps one could work the other way round and try to build a hypergraph with a decomposition of the identity in mind.
Let me say in more detail what I mean by that. As a matter of fact, it’s quite simple now that I come to think of it. Let us call the set of edges that contain a vertex a star (because that is what we would call it in the graphs case). We are trying to find a hypergraph such that the set of stars has unbounded discrepancy. Now each edge is contained in exactly three stars, so this is equivalent to finding a set system with unbounded discrepancy such that each point is in exactly three sets. So we’d like to find an efficient decomposition of the identity as a linear combination of products of sets with the property that no point is in more than three of the sets we use.
I don’t immediately see how to do that, but I’d be quite surprised if it wasn’t possible.
September 14, 2010 at 11:12 pm
A quick comment before closing down for the evening.
If the hypergraph is too sparse then the local lemma should provide a good colouring.
September 15, 2010 at 8:32 am
Ah, that’s a very helpful remark.
September 15, 2010 at 9:23 am
Actually, having thought a little bit further I’m starting to wonder whether that is actually the case. If you randomly colour, then even the individual events you are trying to make happen will have quite low probability: if you want the discrepancy on a set of size
to be at most
say, then you’ll get a probability proportional to
But I thought that for LLL you needed that the sum of the failure probabilities of an event and all the other events on which it depends to be less than 1.
September 15, 2010 at 10:24 am
Yes the meaning of “too sparse” will definitely depend on the value of discrepancy we are considering. Getting things perfectly balanced will be difficult unless the graph is very sparse.
, since, roughly, we need $e(2k+1)e^{-k^2}$ to be smaller than 1.
A straightforward application of the LLL should give a bound on the discrepancy proportional to
September 15, 2010 at 10:29 am
Sorry, the “e^{-k^2}” should be the probability that the binomial random variable deviates more than
from
.
September 15, 2010 at 8:35 am |
I have now added a solutions file, called data-file-rectangular, for the case where general rectangles, rather than only squares, are allowed.
A quick inspection shows that the structure of the solutions has changed into a collection of small squares and narrow “bands”.
Alec, could you modify the program to use only rectangles with area at most K?
I think I can make the modification but I*m not comfortable with modifying someone elses code without having read enough of it to know that there are no unexpected side effects of the modifications.
September 15, 2010 at 11:35 am
A quick remark about this. From the point of view of proving EDP, rectangles are just as good as squares, since the inequality
implies a discrepancy of
just as well if
as if
Since we are trying to optimize the discrepancy for what in the context of the problem are very small values of
this probably means that the rectangle discrepancy gives us a better idea of what is going on. The reason we looked at square discrepancy was to try to reduce the number of variables. If looking for rectangles becomes prohibitively slow, then there is a compromise we might try, which is to look for arrangements of squares but to evaluate them differently, so that, for instance, a fly trap would have a cost of 1 rather than 2. In fact, that’s not quite correct. If we had a fly trap we would probably replace it by two rectangles of width 1 that overlapped on the diagonal, so we’d actually get a trace of 2 for a cost of 1 rather than a trace of 1 for a cost of 2.
It would be quite interesting to see whether the “trivial” bound changes when one adopts this point of view. I think perhaps it doesn’t, as 2-by-2 squares have a trace of 2 for a cost of 1, and the extra flexibility afforded by rectangles doesn’t seem to improve that. But if that’s the case, then it is good news as it increases the chances of finding an arrangement that beats the
bound.
September 16, 2010 at 7:28 pm
Earlier today I made the modifications needed in order to only use rectangles with a bounded are. Here area is really area, so a “2×2″ square has area 1.
I have added two solutions files data-file-rectangular-area-9 and data-file-rectangular-area-25
The solutions reached at the end of the area 9 run stand out in that, up to signs, they really have only two different coefficients, 1/36 and 1/18, and are still made up of 2×2 squares and rectangles of width 1.
September 20, 2010 at 3:32 pm
I have made one more solutions file, now with bounded area and no $1xK$ strips for K greater than 20. The restrictions are needed to keep the memory use down.
The file is “data-file-rectangular-area-4-length20″ in http://abel.math.umu.se/~klasm/Data/EDP/DECOMP/
The best solution in this class for N=66 is better than the best symmetric solution for N=840, and has only coefficient 1/12, with positive 2×2-squares and a single negative 1xK strip. However unless I am missing something a solution with that structure can never take us below C=1/2
September 15, 2010 at 4:58 pm |
I’ve just written some code to solve the linear programming problem which corresponds to finding a real-valued function
on
which is constant along rays, has
and which has maximum discrepancy over squares as small as possible.
So far, the results correspond precisely with the results for the problem of finding the decomposition with the smallest sum of coefficients. Namely, for
we get discrepancy
, which then goes up to
and then at
this goes up to
.
I suspect the two problems may be identical. For the decomposition problem we wish to minimise
constrained by
. For the discrepancy problem we wish to minimise
subject to
. Considering the numerical evidence I think there might be some sort of duality argument to relate the two problems.
This is my first post so I appologise if any of the LaTeX doesn't come out right.
September 15, 2010 at 5:10 pm
They are indeed the same problem — that was the motivation behind the discrepancy version. Do you have a sense of which is easier computationally? From a theoretical point of view it seems better to look for decompositions, because the discrepancy result would be strictly stronger than the statement that completely multiplicative functions have unbounded discrepancy, which we don’t know how to prove. But I am interested in the possibility of using the linear programming problem as a way of searching for good decompositions, especially if that turns out to be more efficient than searching for them directly.
September 15, 2010 at 6:49 pm |
Here’s a slightly strange observation, that’s either wrong or quite encouraging.
Suppose we are trying to find
values along the rays in such a way that the discrepancy on all rectangles is bounded. Then in particular this applies to rectangles of width 1. Now consider the rectangle of width
that consists of all points
where
runs from
to
Then the sum of the values on that rectangle must be bounded. Next, consider the rectangle that consists of all points
where
runs from
to
The values along here must be bounded too. Actually, I should say more: the partial sums as you go up the rectangle are bounded. But these are equal to the partial sums for the even
in the first rectangle. More generally, we find that the sums along all HAPs of
s in the first rectangle have to be bounded. So if EDP is true then we can’t find a
function that’s constant along rays and of bounded discrepancy on all rectangles.
This doesn’t quite prove that we can find a decomposition because we have to allow more general functions. I haven’t yet thought about …
Let’s just see what happens if we define
if either
or
is equal to
mod 3,
if
mod 3, and
if
mod 3. Is this constant on rays? Yes. Oh dear, it seems to have bounded discrepancy on all rectangles.
Phew, that got me worried, but I’ve just realized that it’s NOT constant along rays. So after that moment of madness I’ll continue the interrupted sentence.
… what happens if you do this.
September 15, 2010 at 7:30 pm
Ah, that’s a good observation. My gut feeling was completely wrong (assuming EDP, of course)!
I suppose the next thing to think about is what happens if we allow
and zero.
Alastair, could you post one or two of your solutions? It would be interesting to see how sparse the nonzero terms are.
September 15, 2010 at 8:39 pm
I find this a nice problem. At first it seems as though having 1s down the diagonal is an extremely weak condition, but then one realizes that if the discrepancy on every rectangle has to be bounded, which is equivalent (if we assume symmetry) to the discrepancy on every square
being bounded, then there must be a lot of -1s near the diagonal. And those imply a lot of -1s further away from the diagonal, which in turn force more +1s, and so on. At the moment, I don’t have a clear feeling about the theoretical version of exactly the question you ask: how sparse can the non-zero terms possibly be? It might be interesting to see whether if they have zero density (meaning the limit of the density inside
as
tends to infinity) then the discrepancy must be unbounded. Or rather, it might be interesting to see whether that is any easier to prove than the general statement.
September 15, 2010 at 10:16 pm |
Alec
My solutions are not sparse at all. For example for
, the solution is non-zero at all of the 91 coprime pairs involved. Some of the values are nice rationals, but others don’t appear to be, (although that’s maybe just a feature of how I’m converting to rationals). I can make the code, (which is for Matlab), or the solutions available if people want them but I don’t know how useful they’d be.
Tim
Could you possibly clarify, or reference a comment clarifying, why the problems are identical. I understand your explanation in a previous comment that the existance of a function with discrepancy
forces the
norm of the coefficients in a decomposition to be
, but I can’t see why the bound is atained. Is there a general way we can convert from a function with minimum discrepancy to the decomposition with best possible sum?
Computationally, solving the linear problem for discrepancy and that for decompositions seems fairly similar.
September 15, 2010 at 10:32 pm
I should have done that in my previous comment. The proof (or at least the proof that I like) uses the finite-dimensional version of the Hahn-Banach theorem. If you can’t express a function
as
times a convex combination of functions
that belong to some class
then
lies outside the convex hull of
so by the Hahn-Banach separation theorem there is a linear functional that separates
from
That is, there is a linear functional
such that
and
for every
In our case,
is the function
that is, the function defined on positive rationals that’s 1 at 0 and 0 everywhere else.
consists of functions obtained by taking a square
and counting for each rational how many pairs in the square have ratio equal to that rational. The property
is telling us that
and the property that
for every
is telling us that the discrepancy of the function
is at most
on every square.
September 15, 2010 at 10:38 pm
I think I would be quite interested in staring at your solutions for a bit, just to see whether anything can be read out of them. For example, I’d be interested to see whether there is a difference in behaviour at coprime pairs where the values are small, or where at least one of the numbers is reasonably smooth (obviously 12 is a bit small to tell that, but perhaps a point like (8,9) is different from a point like (5,7)), etc.
September 15, 2010 at 11:24 pm
The solution for
can be downloaded from http://dl.dropbox.com/u/3132222/12.txt
Its a text file with 3 columns, the first two giving the coprime pair and the third the value of the function at that pair. I haven’t included rational approximations to these as some of them seem very spurious. My code doesn’t assume that the function is symmetric in interchanging the two coordinates, hence we have values for both
and
. It looks like they’re all the same though, which isn’t surprising, but I haven’t checked it.
I’ll modify the code to assume reflective symmetry tomorrow and thus be able to produce some bigger solutions.
September 15, 2010 at 11:30 pm
If I’m not much mistaken, we can always replace a solution
by
so assuming symmetry should be fine.
September 16, 2010 at 12:38 pm
I’ve modified the code to assume symetry. The solution for
looks the same, but I’ve replaced the old version with it anyway. I’ve also got a solution for
which is at http://dl.dropbox.com/u/3132222/42.txt
September 16, 2010 at 7:53 pm
Here’s a plot of the solution for
:
http://www.obtext.com/erdos/sol42.png
I suggest saving the image file to your computer and zooming in. Light pixels correspond to large positive values, dark ones to large negative values.
September 15, 2010 at 11:53 pm |
A quick observation that’s so trivial it’s embarrassing, but I overlooked it for a while. It’s that while looking at rectangles of width 1 is sufficient to show that you can’t get a
-valued function that’s constant along rays and has bounded discrepancy on squares (if EDP is true), they are not enough to tell us about more general functions. The example is the function that’s 1 on the main diagonal and 0 everywhere else. This has discrepancy at most 1 on any rectangle of width 1.
Of course, we know that rectangles of width 1 are not good enough to make efficient decompositions, so this doesn’t come as a huge surprise. It seems to indicate that the
assumption is quite a strong one, though I’m not quite sure about that. It would be nice if the
version had some one-dimensional consequence: I think I’ll think about that next.
September 17, 2010 at 9:50 am |
This is to report on something I tried that has not yet worked. I thought (and still think) that it would be interesting to see if we could obtain a non-trivial estimate for the best bound that can be obtained using squares of side length at most 3 and fly traps. The way I wanted to do it was to construct a function that had small discrepancy on all such sets. The trivial bound is obtained by means of the function that is 1 when x=y and 0 otherwise. This has discrepancy at most 3 on all the sets that interest us. So the question is whether we can improve on 3. Note that if we just go for 2-by-2 squares, then Alec’s family of decompositions shows that this function is the best possible, since 2 really is the best bound. So this question amounts to asking whether, if we allow ourselves 3-by-3 squares as well, we can obtain a constant arbitrarily close to 1/3.
I set out on this hoping to find a clever function that would have discrepancy at most some constant less than 3 on all 2-by-2 squares, all 3-by-3 squares and all fly traps. (Of course, I’m always insisting that the function should be 1 on the diagonal and constant on rays.) I thought it was going to be easy because to deal with the squares I cared only about what happens for pairs (r,s) with |r-s| at most 2. And indeed, it is easy to get a non-trivial bound for the squares: for instance, if we define
to be 1 if
and
if
or 2, then the sum on all 2-by-2 squares is 3/4 and the sum on all 3-by-3 squares is -3/4. This starts to suggest a bound of 4/3, but we know that can’t be achieved, since the inclusion of 2-by-2 squares means that 2 is the best bound we can hope for. The reason this isn’t a contradiction is that we haven’t dealt with the fly traps.
And that is where things start to get difficult. Once you start putting in lots of values off the diagonal, as we have now done, you commit yourself to many more. From this point of view, the choice of
all the way down the diagonals
is disastrous for us, since it will force values of
everywhere on the fly traps of width
at
(that is, the ones Alec used). So it will in fact give rise to unbounded discrepancy.
Thus, there is quite a nice problem I don’t yet know how to answer. First and foremost, can one improve on the trivial bound of 3 for this problem? That is, can one find
and a function that’s 1 on the main diagonal, constant on rays, and that has discrepancy at most
on all squares
with
and on all fly traps? So far, I can’t even answer the following weaker question: can one get the discrepancy to be at most
on the squares and bounded on the fly traps?
September 17, 2010 at 12:18 pm
A simple observation that has some bearing on the relationship between the two questions is that the weaker version of the question for 2-by-2 squares is simple. Indeed, suppose we manage to get the discrepancy to be at most
on all 2-by-2 squares
Then the value of the function (assuming symmetry) has to be at most
at all points
It follows that the value at
is at most
for all
so the discrepancy is unbounded on fly traps.
I think this gives another perspective on why finding a decomposition using 3-by-3 squares is much harder than finding one using 2-by-2 squares: there is much more flexibility when it comes to devising functions with low discrepancy. For instance, we might try to do it as follows. First we define
to be
for every
Then we do a bit of adjustment: we look at numbers
with lots of small factors, where there will now be fly traps with large discrepancy, and we make some adjustments to the values at
when either
or
is of the form
for some small factor
of
That will involve creating some 2-by-2 squares where the discrepancy is now slightly bigger than 2, but we could keep it down to
perhaps.
Actually, I’ll stop there because I’ve got confused about what my aim is. Is it to keep the discrepancy down to almost 2, or is it merely to keep it below
? I’m interested in both problems, but they seem fairly different.
Instead, I’ll just make the general point that if the only numbers where we make adjustments are of the form
where
has many small factors and
is one of those small factors, then it’s not clear to me whether we have to make adjustments at big clusters of consecutive numbers. I think this may be at the heart of the problem — perhaps even at the whole of EDP — though I don’t yet have a precise formulation of what it is I’m asking.
September 17, 2010 at 5:14 pm |
I want to think aloud for a bit about the problem mentioned in my previous comment. Suppose that we start with a first approximation to what we want, by setting
and
to be -1/2 for every
Now we know that this causes problems for fly traps with lots of small factors, so the next step is to do an adjustment. For now I’ll concentrate on the “weak” problem, so all I want to do is make the discrepancy on fly traps bounded. Note that so far the discrepancy on the squares is 1 for 2-by-2 and 0 for 3-by-3, so we’ve got a bit of elbow room.
Let’s suppose that we want to keep the discrepancy on fly traps below something like 1000. Then we have a problem at
only if there exists
such that the number of factors of
that are less than
outnumbers the number of non-factors of
that are less than
by at least 1000. Let’s suppose that
and
are two such numbers, and that
and
are factors of
and
that are less than
Then
and
have at least 1000 factors in common, which … well, ordinarily it might suggest that
and
differ by the lowest common multiple of some pretty huge number, so that
and
are not close to each other.
But as I write that, I see that it’s not obviously true, which is good news. So let’s try to think in more detail about how it could be false.
I’ll take an extreme example. First we insist that both
and
are divisible by 1000!. Sorry, this example isn’t working. Back to my previous line of thought.
Let’s just look at
for a bit. Let
be some primes that do not divide
Then no number that is divisible by any of the
can be a factor of
It follows (provided
is not too big) that the probability that a number less than
is a factor of
is at most
or so. From that it follows that the sum of the reciprocals of the
cannot be too large. Assuming that
is not too small, that tells us that almost all primes up to
are prime factors of
Now I want to know whether it is possible to find some
such that there exist
and
less than
such that
is one number like that and
is another. The difficulty is that
and
are coprime, but maybe we can deal with that by multiplying them by
and
to get the extra factors we need. Except that that doesn’t seem very easy:
and
are much much smaller than
so they don’t seem to have enough smoothness to create all the extra divisibility that we need.
Let me try to say that more clearly. Here’s a number-theoretic question I don’t know the answer to. Fix a large positive integer
For how big a
can we find sets
and
of integers between 1 and
such that
and every number in
is coprime to every number in
? I suspect that
has to be quite a lot smaller than
One possibility is to partition the primes and take …
OK, that problem is easy. The best you can do is partition the primes into two and let
be all numbers you can make out of one set of primes and
be all numbers you can make out of the other. I feel as though that should make the product of the sizes of
and
be less than
but I don’t yet see why I’m saying that. Yes I do. If we make
and
maximal like that, then every number up to
can be written uniquely as a product of something in
and something in
(Just take the prime factorization and split it up in the way you have to.) But that just gives a lower bound on the product of the sizes of
and
I’m not sure how to get an upper bound. Actually, it’s false, since just by taking primes we can get
and
to have size
OK, I can at least prove, but won’t bother with the details, that one or other of
and
must have size 
I need to stop this rambling comment for a bit. But it’s looking quite hard to demonstrate that any major problems would arise if one decided to adjust the values on the very smooth fly traps to be zero, and made any other changes that were implied by that. Of course, there would still be many more changes that had to be made.
September 18, 2010 at 1:06 pm |
I’m finding it very difficult to say anything precise, or to decide whether certain things are likely to be possible. The question I’ve been struggling with above — can we have a function that is 1 when x=y, that has discrepancy at most 2.999 on all 2-by-2 and 3-by-3 squares, and that has bounded discrepancy on all fly traps — still seems to be hard, even though it is so weak that an answer either way would tell us little about EDP. But I still can’t solve it, so I want to make the question weaker still. Here, then, is a question that it really ought to be possible to answer.
The question is this. It is easy to create functions that are 1 on the main diagonal and that have discrepancy at most 2.999 on all squares
when
or
However, the obvious ones have the property that they give rise directly to unbounded discrepancy on fly traps. What I want to do is say precisely what I mean by “give rise directly to” and then ask whether what I have observed is necessary, or whether there exist cleverer examples that do not give rise directly to unbounded discrepancy on fly traps.
Here is the definition of “give rise directly to”. I just mean that once you’ve decided on the values of
when
is 1 or 2, you then put in all other values that follow from the condition
You then see if for every
there exists a fly trap such that it is impossible to fill in the remaining values on that fly trap so as to keep the discrepancy below
To be more precise, for each
you fill in all the values
that are forced by the constant-on-rays condition. Then you see whether there is an interval of
s such that the values are all fixed and add up to more than
in modulus.
September 18, 2010 at 2:51 pm |
An obvious way of weakening that question yet further is to insist that the interval in question starts at 1. So now the question is this. Suppose
is a function defined on the set of all
such that
and suppose that it is 1 when
and that the discrepancy on 2-by-2 and 3-by-3 squares of the form
never exceeds 2.999. (To make the question yet weaker we could go for 2.1 instead, so our hypothesis is stronger. In fact, I think that is probably a good idea. Note also that I am assuming that
for every
) Does it follow that for every
there exist positive integers
and
such that
and
?
I think I may already have established that this is not the case. If
then
What can we say about
mod
if it is an integer, and
otherwise? Answer: we know that it is a multiple of
or
Now if we have two distinct such numbers, they must differ by a lot. Either they will be two distinct multiples of
or
or they will use different
s but will in any case be distinct multiples of
for some
So now all one has to do is define
as follows. If
then
If either
or
is divisible by
then
If
or
and we have not already chosen the value of some
to be 0, then we set the value of
to be -1/2.
This guarantees immediately that the discrepancy on any 2-by-2 square is either 2 or 1 (depending on whether we put in 0 or -1/2 off the diagonal). As for a 3-by-3 square
I think the argument above shows that all its off-diagonal entries will be
unless one of
or
is
or
for some
such that
is a multiple of
Now the only thing that can stop the discrepancy on a 3-by-3 square being at most 2 is if all the off-diagonal entries are 0. Since the numbers
and
are well-separated, the only way even the entries
can both be zero is if
or
But even then, for the discrepancy to be more than 2, we need
to be zero as well. So we need
or
to be of the form
And that can’t happen.
If that argument is correct, then what it shows is this. We can’t find a proof that the discrepancy has to exceed 2 (that is, improve on
) by taking a bunch of 3-by-3 squares and one fly trap and arguing that if the discrepancy is at most 2 on those squares then it forces us to choose values on some fly trap that add up to something large.
That wasn’t very well put, but I think I’ve now thought of a nice way of putting it. I think it is possible to find a function with the following properties.
1.
whenever 
2.
whenever 
3. The discrepancy of
is at most 2 on all 2-by-2 or 3-by-3 squares.
4. The discrepancy of
is bounded on all fly traps.
In other words, any proof that the discrepancy is unbounded on fly traps if it is bounded on small squares would have to use the fact that
is constant not just on rays that go through points of the form
or
but on more general rays, even if we are talking only about 2-by-2 and 3-by-3 squares. I think the bound in 4 could be made pretty small too, but that needs checking.
September 18, 2010 at 5:36 pm |
Let me dualize that last problem in an attempt to understand it better. I think that the existence of such a function is telling us that we cannot find a decomposition
with the following properties.
1. Each
is a 2-by-2 or 3-by-3 square and each
is a fly trap.
2.
sums to 1 along the line
and to 0 along all lines that pass through a point of the form
or 
3. For all
that do not belong to one of the above rays, 
4.
Let me check that that does follow from the existence of a function
with the discrepancy property mentioned in the previous comment. I’ll do that by thinking about the sum
Let us split this sum up according to rays. The sum along the ray
is 1, since
is constantly 1 and
sums to 1. Along any ray that goes through a point of the form
or
the sum is zero, since
is constant and
sums to 0. Along any other ray, the sum is zero, since
is identically zero. So the sum is 1.
Now let’s get a contradiction by summing instead over the squares and fly traps used to decompose
We have that the sum is at most
But by hypothesis
for every
and
So the sum is at most
which, by hypothesis 4, is less than
the desired contradiction. As I say, I haven’t carefully checked, but I basically know that Hahn-Banach will show that this is necessary.
I think the existence of
therefore implies that there is no way of obtaining a bound better than
if we just use 2-by-2 squares, 3-by-3 squares and fly traps. (I don’t claim that I’ve definitively proved that — there are some important details that need checking, such as whether I can reduce
and what happens with intervals that do not start at 1. In fact, that second point may turn out to be particularly important.)
September 19, 2010 at 9:29 pm |
I’ve been searching for symmetric functions taking values in
with discrepancy 1 on rectangles of the form
. (This implies a discrepancy of 4 on all rectangles). To narrow down the search, and because it seems to be possible so far, I’m restricting the non-zero values
to
.
A depth-first search (favouring zero values) running in Python has so far reached 49:
http://www.obtext.com/erdos/psol-49.png
It seems to have got rather stuck at 50 (not surprisingly, since it seemed to go a bit mad at 25).
September 19, 2010 at 10:39 pm |
That’s a great picture. For my own part, given that I can’t by theoretical means find a decomposition that gets a bound better than
I’ve decided to concentrate on an easier task, which is to look for low-discrepancy functions that prove that decompositions with certain additional restrictions cannot exist. For instance, I think I may be able to manage a proof that it can’t be done with 3-by-3 squares and fly traps if all the fly traps have coefficients of the same sign, and perhaps even in general. I’m not actually claiming that as a result yet, but it seems a realistic target to determine whether it can be done. (I suppose I’d be happier if it could be done, but at the moment that is not what I believe.) I hope to have something a bit more definite to report over the next couple of days.
If that works, then what I would hope to do is generalize it as much as possible, the aim being to narrow down considerably what a decomposition could look like. That could have two possible benefits: making the computational problem of finding a decomposition much easier, and making the theoretical problem somewhat easier.
I’m interested in revisiting some of the experimental evidence from a few months ago from this point of view and seeing whether we can “dualize” it somehow. By that I mean that we could look at where various searches backtrack and try to build decompositions based on those points and relationships. For instance, perhaps the program could tell us that certain HAP-constraints were important ones and others were not. That would then give us a huge clue about how to find a decomposition.
One snag is that the decompositions tend to be equivalent to vector-valued problems, so it may be that the evidence we have collected so far is not in fact helpful for finding decompositions. And it’s not completely obvious to me how one should search for best possible bounds in the vector-valued case.
September 19, 2010 at 11:00 pm |
Another question I want to think about is the hypergraphs question I was discussing earlier. We have a collection of sets (squares or rectangles) and we want to find a function with small discrepancy on those sets, subject to the condition that certain values that that function takes have to be equal. To do this, we want to find a finite collection of the sets that we can use to form an efficient decomposition. But that will imply a discrepancy result for the following hypergraph. The vertices are the rays, and the hyperedges are sets of rays that go through a set in the collection.
An important difference between this question and most discrepancy questions is that we are not assuming that the function takes values
but instead are assuming that there is one vertex that belongs multiply to many hyperedges. It might be reasonable to assume that the ray
is the only ray that intersects any square in our collection more than once. In that case, we are trying to show that the effect is to give a positive kick to the sets in our collection that we cannot manage to cancel out.
What I’d like to investigate is the number-theoretic question of how to find squares that create a hypergraph that has even the remotest chance of achieving this. For instance, if any square contains a point that is the only point in its ray that intersects a square in our collection, then we can give that point any value we like and cancel out the discrepancy on that square. So we may as well get rid of that square.
I think I’ll write a new post soon and try to formulate various questions as precisely as I can.
September 20, 2010 at 7:28 am |
I haven’t followed the discussion in a while. I tried to catch up on the most recent comments in the last couple of days, but I am not up to speed on everything, so I may be asking naive questions and repeating things that you already know. It seems to me that the question being considered currently is very similar to some versions of the diagonal representation question we considered way back in EDP12 and EDP13. Is it the same as Problem 1 mentioned in Tim’s EDP13 post with the multiplicative constraint that Alec suggested in this comment ? The squares and fly-traps construction also looks similar to a construction we had earlier that was inspired by looking at LP solutions here .
It may be worthwhile to look at LP solutions for inspiration in constructing functions
that have low discrepancy on squares. In fact, if the squares and fly-traps construction does arise as an optimal solution to the LP, then the dual solution is a low discrepancy function with maximum discrepancy equal to the bound established by squares and fly-traps. One would hope that there should be considerable structure in this dual solution. One problem with looking for structure in these LP solutions is that you need large values of
since
. A trick to handle this issue is to start at
instead of
, so
plays the role of the highly divisible number. But I’m not sure if starting at
makes sense for the current discussion. In any case, it may be an easier variant to think about because there was a very definite trend in the optimal values of the LP for the problem over
– in fact the optimal value seemed to be exactly
.
I’m a little rusty, but I recall from looking at these LP solutions previously that it appeared that the optimal dual solution was probably not unique and there were some extra degrees of freedom. Even so, in principle, one should be able to determine the form of these dual solutions if indeed the squares and fly-traps construction arises as an optimal solution to the LP. What this tells us is that the squares used in the linear combination for the squares and fly-traps are the squares where the maximum discrepancy is achieved – all other discrepancy constraints are not tight. The tight constraints (provided we also know the sign for each of them) should determine a family of dual solutions and there must exist some choice of free variables such that the dual solution obtained satisfies all the discrepancy constraints.
September 20, 2010 at 12:08 pm
How amusing that we are revisiting an earlier discussion without (at least in my case) realizing it. But I think it is the right thing to do, and that we shouldn’t have forgotten about it the first time round.
What you say in the last paragraph is what I was saying, but more vaguely, in my last comment but one. I think it might be interesting to look at some of the experimental evidence coming out of the LP problem in order to get a feel for what the dual solutions are like — what I would be hoping is that it would be easier to spot patterns in the dual solutions.
I suppose another idea, though it might be rather too ambitious, would be to try to think of a strengthening of the discrepancy statement, insisting on lower discrepancy for some squares than for others, so that all the constraints become tight. Equivalently, one would attach different weights to the squares in a decomposition, so that some of them were more expensive than others, in the hope that the best decomposition then used the squares much more uniformly.