The situation we are now in with EDP is roughly this. We have a statement that would imply EDP and that seems easier to prove because it is a reasonably simple existential statement (there exists a decomposition with certain properties) rather than a universal statement (every function has unbounded discrepancy on HAPs). However, the existential statement seems itself to be hard, despite one’s initial expectation that it ought to be easier. So what do we do?
One tool that we have at our disposal is duality, which is what we used to convert the problem to an existential one in the first place. Now obviously we don’t want to apply duality twice and end up with the original problem, but, perhaps surprisingly, there are ways that applying duality twice could be useful.
Here are two such ways. The first is that you prove that a certain kind of decomposition would be sufficient to prove EDP. Then you argue that if such a decomposition exists, then a more restricted kind of decomposition must also exist. Dualizing again, one ends up with a new discrepancy problem that is different from the original one (though it will imply it). The second way is this: if it is not easy to write down a decomposition that works, then one wants to narrow down the search space. And one way of doing that is to prove rigorously that certain kinds of decompositions do not exist. And an efficient way of doing that is to use duality: that is, one finds a function with low discrepancy on the class of sets that one was hoping to use for the decomposition. Since this class is restricted, solving the discrepancy problem is easier than solving EDP (but this time it doesn’t imply EDP).
We have already had an example of the first use of dualizing twice. In this post I want to give in detail an example of the second.
General results about duality.
Recall that we are trying to find a linear combination with the following properties, and with as small as possible.
- Each is a “square” in the sense that it is the characteristic function of a set of the form where itself is an interval of positive integers.
- If and are distinct and coprime, then
Recall also that since an equivalent formulation is to replace conditions 2 and 4 by the condition that
Now the set of functions that satisfy conditions 2 and 3 is convex — indeed, it is an affine subspace And the set of linear combinations such that is also convex. If we cannot find such a decomposition, then these two convex sets must be disjoint, which implies that there is a linear functional that separates them. By rescaling, we can ask that should take the value 1 everywhere on and be less than 1 on every function that belongs to
The first of these two conditions on implies that is constant on all rays (that is, intersections of with lines through the origin), since otherwise we would be able to find points in such that the inner product with was whatever we wanted. The second condition implies that for every and is also implied by it. Therefore, if no decomposition of the desired form exists then there is a function that is 1 on the main diagonal, constant on rays, and that has discrepancy at most on all squares. The converse is almost trivial.
Now let us see what happens if we modify the requirements of the decomposition as follows. We shall no longer ask for the sets to be squares (so that we can allow, for instance, rectangles or fly traps). We shall insist that every set we use belongs to some set system We shall also assume that has disjoint subsets and such that if then the coefficient of is required to be non-negative, and if then the coefficient of is required to be non-positive.
What happens if we dualize the problem with these new restrictions? This time we replace the set of all combinations such that by the set of all linear combinations with that property and the additional property that if and if This tells us that is at most if at least if and at most in modulus for all other Again, the converse is easy.
Ruling out simple decompositions.
Now let us see how we can exploit this. I would like to obtain a lower bound for the best one can obtain if consists of 2-by-2 and 3-by-3 squares, and fly traps, and if the coefficient of every fly trap is non-positive.
To do this, let us define a function as follows. If then we take to be 1, as we must. If and one of or is a multiple of 3, then we take and otherwise we take (So the only way that can be -1/2 is if mod 3.) If and one of or is a multiple of 6, then we take (this is forced since we have already decided that ) and otherwise
That determines on all points such that and on all multiples of all such points. It is not hard to check that the sum of on any 2-by-2 or 3-by-3 square lies between 1 and 2. So it remains to look at the fly traps.
Since we are assuming that the coefficients of fly traps are all negative, our aim is to choose the remaining values of in such a way that the sum on any fly trap is never less than -2. There is a trivially best strategy for doing that, which is to choose to be infinity for all remaining points. (This is overkill, but it works.)
If we do that, then what could go wrong? We would have to have a fly trap on which the sum was less than 2, which means we would have some pair of positive integers such that the values of (or the same but with minuses) are all chosen before we put in any infinite values, and at least five of them are -1/2.
The condition that the values are all chosen is equivalent to the condition that every is a factor of Indeed, the value of is chosen only if or where is the highest common factor of and If then and if then so
Let us see what we can deduce from this. We know that or we certainly cannot have at least five s. Setting we deduce that Setting we deduce that And setting we deduce that Therefore, It follows that And from that it follows that so Since is coprime to both 2 and 3, (as is a multiple of 6). Therefore, from which it follows that so
There is a sort of race going on here. It is possible that but otherwise all must be 0 up to So now it follows that In particular, so But this implies that since now is a multiple of 36. So we’re sort of back to square one.
Once we know that what property must have if there is to be any hope that ? If then we need not to be a multiple of 3, so we need to be a multiple of 9. And if is a factor of but not of then we need not to be a multiple of 6, which it will not be since it is odd. So that is still a possibility, but it requires to be a multiple of 8.
Going back to our example, we now know that we can restrict attention to multiples of 8 or 9. But by the time we get to five of those, we have passed 16, which tells us that and that now needs to be a multiple of 16. And before we know it, we hit 27. And so on.
I won’t try to make that last bit rigorous, but I’m pretty sure that it’s possible to do so, and to prove that the race can never be won. What I mean by that is that for every fly trap we will be able to put in an infinity before we reach five -1/2s. If that is correct, then it proves that we cannot improve on a bound if we use a linear combination of 2-by-2 and 3-by-3 squares and try to cancel it out with some fly traps, all of which have the same sign.
I’ve just spotted that I made a mistake above, which is that a fly trap has two wings, so to speak, which means that for the sum to reach less than -2, we need only four -1/2s. (The diagonal starts you off at 1, and then each has the effect of subtracting 1, so we can afford up to three of them.) But I think the conclusions I drew are valid even for this tighter race.
What happens if in addition to fly traps we look at rectangles of width 1? We’ve dealt with ones that intersect the diagonal, but what about a rectangle such as the one that consists of all the points from to for some ?
Now we need five -1/2s. What can we say about the highest power of 2 that divides if all the points in divide ? Let be the largest power of 2 that divides any number in and let be the largest power of 3. Now let belong to that interval. How can fail to be 0?
If then a necessary condition is that should not be a multiple of 3, since otherwise But that implies that If but is not a factor of then must be a factor of since So the total number of -1/2s is at most the number of multiples of that belong to the interval plus the number of multiples of that belong to the interval. But since any three consecutive multiples of include a multiple of and any two consecutive multiples of include a multiple of the total number of -1/2s in a rectangle of width 1 that contains no infinity is at most 3.
This is actually a rigorous treatment of the fly trap case as well. It proves that we can’t obtain a decomposition with if all we allow ourselves to use is 2-by-2 squares, 3-by-3 squares, and fly traps and rectangles of width 1 with negative coefficients.
If we allow ourselves symmetric rectangle pairs, then the above function has discrepancy 3, so the argument no longer works. However, we can improve on that bound by changing the coefficient -1/2 to something slightly closer to zero. If we choose it to be then the worst discrepancy on a 3-by-3 square turns out to be and the worst on a rectangle pair is Equating the two, we can take and that gives us a discrepancy of which tells us that we can’t make any better than even with this extra flexibility.
A fatter example?
Suppose we try to extend the above result to include 4-by-4 squares as well. Now we have to give sensible values to points of the form The simplest way we might try to do that without messing anything up would be to define to be 0 if is not a multiple of 3 (in which case and are coprime) and otherwise. This means that except if mod 9, in which case it is
This does not add any s to the fly traps, since the only s we have put in the lines are ones that were already there. However, it does change some infinities to zeros. What difference does that make?
Well, the condition now for not to be infinite is that So if we have an interval on which is finite, and if is the highest power of 3 that divides any of the integers in that interval, then the highest power of 3 that divides is at least (whereas earlier it was ). Let’s suppose that it is exactly (since this is the most difficult case). If but does not divide then so is not divisible by 3, which means that So we must have This takes us back to the previous case, except that the highest power of 3 that divides is smaller. So now from the condition that should not be a multiple of 3, all we can deduce is that Unfortunately, if is the highest power of that goes into any number in an interval, can go into up to eight numbers in that interval. So it looks as though this method of fattening the example fails completely. This may be worth looking into in case it suggests a decomposition (though that is asking a lot, since there is no particular reason to suppose that the construction I have just attempted has to be exactly as it was in my attempt).
At the moment I am trying to solve the problem by means of the following time-honoured technique. First I try to prove the result. Then, when I run into difficulties, I try to disprove it. Then, when I run into difficulties there, I try to prove it again, or perhaps prove merely that the way that I was trying to disprove it cannot work. And so on. The more iterations of this process, the smaller the bearing any success will have on the problem, but if the problem is hard enough then it may be best to start small.
How does that relate to EDP? Well, we would like a decomposition. So we try finding a decomposition with stronger properties. Then, having failed to do so, we see whether we can prove that no decomposition with the stronger properties exists. Fortunately, duality gives us a technique for doing this: we try to find a function with small discrepancy on all the relevant sets. On failing to find such a function, we try to get some sense of why that is difficult, in the hope that it will lead us to a yet more refined class of possible decompositions. And so on.
So the next thing I feel like trying is to generalize the above argument to strips of width 3, 4, 5, etc. about the main diagonal. At some point difficulties will emerge (or if they don’t then we’ll have proved something very interesting). What I hope is that those difficulties will lead us to a construction that improves on I feel more and more that this would be a crucial breakthrough: if we can do that in a theoretical way (rather than just finding a pattern on a computer that we are unable to interpret, though even that would be very interesting) then we will have much more of a chance of generalizing it to arbitrary positive