Gil has a not quite finished third post that will appear soon on this blog. Meanwhile, here are a few thoughts I’ve had recently as I tried to get my brain into EDP mode again.
The approach to the Erdős discrepancy problem that, rightly or wrongly, I found most promising when we were last working on it was to prove a certain statement about matrices that can be shown quite easily to imply a positive solution to the problem. In this post, I’m going to treat that matrix statement as the problem, and think about how one might go about trying to prove it. I’ll give a brief explanation of why it implies EDP, but not of what the possible advantages of the approach are (discussions of which can be found in some of the earlier material).
First I’ll need to recall a couple of definitions and some notation. A homogeneous arithmetic progression, or HAP, is a set of the form $\{d,2d,3d,\dots,nd\}$ for some pair of positive integers $n$ and $d$. Let me say that $P$ is a HAP-function if it is the characteristic function of a HAP. Finally, if $f$ and $g$ are functions defined on $\mathbb{N}$, let $f\otimes g$ denote the function defined on $\mathbb{N}^2$ that takes the value $f(m)g(n)$ at $(m,n)$, which can be thought of as an infinite matrix.
Here, then, is the statement that it would be great to prove. What is nice (but also quite daunting) about it is that it is a pure existence statement.
Problem. Prove that for every constant $C$ there exists a real matrix decomposition $D=\sum_i\lambda_iP_i\otimes Q_i$ with the following properties.
(i) $D$ is a diagonal matrix and $\mathrm{tr}(D)\geq C\sum_i|\lambda_i|$.
(ii) Each $P_i$ and $Q_i$ is a HAP-function.
I have stated the problem in this form because (for a reason I will outline below) I am confident that such a decomposition exists. However, it seems to be quite hard to find.
Why would the existence of an efficient decomposition of a diagonal matrix into products of HAP-functions prove EDP? Well, let $\epsilon=(\epsilon_n)$ be a $\pm1$-sequence and consider the quantity $\sum_{m,n}D_{mn}\epsilon_m\epsilon_n$. On the one hand it equals
$\mathrm{tr}(D)$, since $\epsilon_n^2=1$ for every $n$ and $D$ vanishes off the diagonal, and on the other hand it equals $\sum_i\lambda_i\langle P_i,\epsilon\rangle\langle Q_i,\epsilon\rangle$.
Since $\mathrm{tr}(D)\geq C\sum_i|\lambda_i|$, there must exist $i$ such that $|\langle P_i,\epsilon\rangle\langle Q_i,\epsilon\rangle|\geq C$, from which it follows that there exists a HAP $P$ such that $|\sum_{n\in P}\epsilon_n|\geq C^{1/2}$. This shows that $\epsilon$ must have HAP-discrepancy at least $C^{1/2}$ for every $C$, and we are done.
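The bilinear identity used in this argument is easy to check numerically. Here is a toy sanity check in Python; the coefficients and HAP pairs are arbitrary choices of mine and certainly do not form an efficient decomposition — the point is only that the two ways of computing the quadratic form agree for any matrix of this shape.

```python
# Toy sanity check of the two ways of computing the quadratic form:
# for ANY matrix M = sum_i lambda_i (P_i tensor Q_i) and any sequence eps,
# sum_{m,n} M_{mn} eps_m eps_n = sum_i lambda_i <P_i, eps> <Q_i, eps>.
N = 30

def hap(d, k):
    # Characteristic function of the HAP {d, 2d, ..., kd}, as a vector indexed by n = 1..N.
    return [1.0 if n % d == 0 and n <= d * k else 0.0 for n in range(1, N + 1)]

# Arbitrary coefficients and HAP pairs, purely for illustration.
terms = [(1.0, hap(1, 10), hap(2, 5)),
         (-0.5, hap(3, 7), hap(3, 7)),
         (0.25, hap(2, 9), hap(4, 4))]

eps = [1 if n % 4 in (1, 2) else -1 for n in range(1, N + 1)]  # some +/-1 sequence

# Way 1: assemble the matrix M and compute the quadratic form entry by entry.
M = [[sum(lam * P[i] * Q[j] for lam, P, Q in terms) for j in range(N)] for i in range(N)]
way1 = sum(M[i][j] * eps[i] * eps[j] for i in range(N) for j in range(N))

# Way 2: use the decomposition term by term.
dot = lambda f, g: sum(a * b for a, b in zip(f, g))
way2 = sum(lam * dot(P, eps) * dot(Q, eps) for lam, P, Q in terms)

assert abs(way1 - way2) < 1e-9
```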
Of course, one can deduce anything from a false premise, so one needs at least some reason to believe that the decomposition exists. Quite a good reason is that the existence of the decomposition can be shown (without too much difficulty) to be equivalent to the following generalization of EDP.
Problem. Show that if $(u_n)$ and $(v_n)$ are any two sequences of vectors in a Hilbert space such that $\langle u_n,v_n\rangle\geq1$ for every $n$, then for every constant $C$ there exist HAPs $P$ and $Q$ such that $\bigl|\bigl\langle\sum_{m\in P}u_m,\sum_{n\in Q}v_n\bigr\rangle\bigr|\geq C$.
This is a strengthening of EDP, since it clearly implies EDP: just take the Hilbert space to be $\mathbb{R}$ and $u_n=v_n=\epsilon_n$. However, it doesn’t seem like such a huge strengthening that it might be false even if EDP is true. At any rate, it seems just as hard to find a counterexample to this as it does to EDP itself. (You might ask why I believe that EDP is true. It’s partly blind faith, but the heuristics presented by Gil Kalai in the previous post offer some support, as does — very weakly — the experimental evidence.)
How does one ever solve existence problems?
Let’s begin by thinking very generally about the situation we’re in: we have to come up with a mathematical object that has certain properties $P_1,\dots,P_k$. (Of course, we could combine those into one property, but it is often more natural to think of the object as having several different properties simultaneously.)
I’m going to list strategies until I can’t (or at least can’t immediately) think of any more that seem to have any chance of success. (An example of a strategy that I think has no chance of success for this problem is to search for a just-do-it proof, the problem being that the constraints appear to be quite delicate, whereas a just-do-it proof works better when one has a number of independent constraints that are reasonably easy to satisfy simultaneously.)
1. Use an object you already know, possibly with some small modifications.
2. Start with examples that work for a certain constant $C$, and use them as building blocks for larger and more complicated examples that work with a larger constant $C'$.
3. Use some kind of duality to prove that if an example does not exist, then some other object does exist, and show that in fact the second object does not exist.
4. Use a computer to search for small examples, try to discern a pattern in those examples, use that pattern to guess how to construct larger examples, and check that the guess works.
5. Strengthen the conditions on the object until there is a unique (or almost unique) object that satisfies those conditions.
6. Weaken the conditions on the object (e.g. by forgetting about some of the properties $P_i$), try to describe all objects that satisfy the weaker conditions, and then reformulate the problem as that of finding an object of that description that satisfies the stronger conditions. [As a very simple example, if you were required to find a positive solution of some quadratic equation, you would solve the equation without the positivity condition and then pick out a positive root.]
7. Assume that you have an object with the required properties, deduce some interesting further properties, find a natural object with the further properties, and hope that it has the original properties as well. [This is a variant of 6.]
How do the general strategies fare when it comes to EDP?
I will discuss them briefly at first, before concentrating in more detail on a few of them.
1. This doesn’t appear to be the kind of problem where there is already some matrix decomposition (or related mathematical object) waiting in the wings to be modified so that it becomes the example we want. Or if there is, the only way we are likely to discover it is by pinning down more precisely the properties we are interested in so that someone somewhere says, “Hang on, that sounds like X.” But I think it is very unlikely that we will discover an example that way.
2. I am definitely interested in the idea of starting with small examples and using them to build bigger examples. However, an approach like this needs plenty of thought: non-trivial small examples seem hard to come by, and the only obvious method for building bigger ones is taking linear combinations of smaller ones. (A trivial small example is to take a linear combination of the form $\sum_n\lambda_nP_n\otimes P_n$, where each $P_n$ is the characteristic function of the singleton $\{n\}$. This gives us $\mathrm{tr}(D)=\sum_n\lambda_n\leq\sum_n|\lambda_n|$, so nothing better than $C=1$. To get anything interesting, we need the $P_n$ to be HAP-functions for longer HAPs, but then it is very difficult to get the off-diagonal terms to cancel.)
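To make the triviality of the singleton decomposition concrete, here is a short Python illustration (the parameters are arbitrary choices of mine):

```python
# The trivial decomposition: each singleton {n} is itself a HAP (common
# difference n, length 1), so D = sum_n lambda_n (e_n tensor e_n) is a
# decomposition into products of HAP-functions.  Its trace equals
# sum_n lambda_n <= sum_n |lambda_n|, so the ratio never beats C = 1.
N = 20

def singleton(n):
    return [1.0 if m == n else 0.0 for m in range(1, N + 1)]

lam = [1.0] * N  # one coefficient for each singleton {1}, ..., {N}
D = [[sum(lam[k] * singleton(k + 1)[i] * singleton(k + 1)[j] for k in range(N))
      for j in range(N)] for i in range(N)]

trace = sum(D[i][i] for i in range(N))
coeff_sum = sum(abs(l) for l in lam)
assert all(D[i][j] == 0.0 for i in range(N) for j in range(N) if i != j)  # D is diagonal
assert trace == coeff_sum  # ratio exactly 1: no amplification
```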
3. We are trying to write a diagonal matrix with large trace as an efficient linear combination of objects of a certain form. That is a classic situation for applying the Hahn-Banach theorem: if there is no such linear combination, then there must be a separating functional. What can we say about that functional?
This approach seems promising at first, but is in fact of no use at all, since if you pursue it, the conclusion you come to is that if there is no such linear combination, then there are sequences of unit vectors $(u_n)$ and $(v_n)$ in a Hilbert space and a constant $C$ such that $\bigl|\bigl\langle\sum_{m\in P}u_m,\sum_{n\in Q}v_n\bigr\rangle\bigr|\leq C$ for every pair of HAPs $P$ and $Q$. The reason duality doesn’t help is that we used duality to reformulate EDP as a problem about decomposing diagonal matrices. If we use duality again, we get back to (a generalization of) EDP.
4. We have tried this. Moses Charikar and others have written programs to search for the most efficient decompositions (for this and for a related problem). The results are useful in that they give one some feel for what the diagonal matrix needs to be like, but it does not seem to be possible to obtain a large enough matrix to use it to go as far as guessing a formula. For that, more theoretical methods will be necessary (but knowing roughly what the answer needs to look like should make finding those methods quite a bit quicker).
5. One difficulty is that we are not looking for a unique object: it would be nice if the properties we were trying to obtain determined the matrix and its decomposition uniquely, since then we could hope to calculate them. An obvious property to add is that the object is in some sense extremal (though what that sense should be is not completely obvious). Another thing we can do is insist on various symmetries — something that has already been tried. Yet another is to restrict the class of HAPs that may be used: as far as we can tell at the moment, we can get away with HAPs whose common differences are all either prime or a power of 2. (I say that simply because it seems to be hard to find a sequence that has bounded discrepancy on all such HAPs. Gil’s heuristics suggest that the discrepancy should grow like $\sqrt{\log n}$ for sequences of length $n$.)
6. What weaker properties might we go for? One idea is simply to try to find a linear combination of products of HAP-functions that leads to a considerable amount of cancellation off the diagonal, even if there are still off-diagonal terms left. This would be interesting because if one naively tries to find a decomposition that works, it seems to be very hard to get a substantial amount of cancellation (except if very small HAPs are used, in which case the resulting decomposition is not very informative). Another (somewhat related) idea is to look for interesting decompositions but with not particularly large values of $C$. Yet another might be to allow a wider class of functions than HAP-functions.
7. I don’t have much idea of how to apply this strategy to the entire decomposition, but there is plenty one can say about the diagonal matrix that one wishes to decompose, and that information, if it could be expressed neatly, would surely be very useful: it seems much easier to try to decompose a specific diagonal matrix than to try to decompose a diagonal matrix that you have to choose first, when the vast majority of diagonal matrices cannot be efficiently decomposed.
The proof that the diagonal decomposition implies EDP actually proves a much stronger result. It gives us a non-negative function $b$ (the diagonal of $D$) such that if $(x_n)$ is any sequence of real numbers (not necessarily $\pm1$-valued) with $\sum_nb(n)x_n^2\geq1$, then $(x_n)$ has large HAP-discrepancy. To put that loosely, the sequence $b$ cannot correlate with the square of any sequence of bounded discrepancy. That puts very strong conditions on $b$, but so far we have not fully understood what those conditions are. It seems that we should be able to make progress on this, perhaps even adding reasonable conditions that would allow us to determine the function $b$ uniquely.
Creating new examples out of old ones.
Suppose that $D=\sum_i\lambda_iP_i\otimes Q_i$ and $D'=\sum_j\mu_jP'_j\otimes Q'_j$ are two decompositions of diagonal matrices. Clearly if we take a linear combination of $D$ and $D'$, we get another decomposition of a diagonal matrix, and if the original decompositions were into products of HAP-functions then so is the new one.
Can we do anything more interesting? Another possibility is to look for some kind of product. Now Dirichlet convolutions are natural objects to look at in the context of EDP, since the expression $f*g(n)=\sum_{d|n}f(d)g(n/d)$ can be thought of as follows. You start with a function $g$ defined on $\mathbb{N}$, then for each $d$ you define the dilate $g_d$ to be the function that takes the value $g(m)$ at $dm$ and 0 at non-multiples of $d$, and finally you take the linear combination $\sum_df(d)g_d$ of those dilates. If $g$ is identically 1, then this is taking a linear combination of HAPs.
We can’t take a Dirichlet product of two matrices, but we can do something similar: given functions $F$ and $G$ of two integer variables we can take the function $F*G(m,n)=\sum_{a|m}\sum_{b|n}F(a,b)G(m/a,n/b)$. If $F$ and $G$ are diagonal (meaning that they are zero at $(m,n)$ if $m\neq n$), then the terms of the sum are zero unless $a=b$ and $m/a=n/b$, so in this case $F*G$ is diagonal too, and $F*G(n,n)$, considered as a function of $n$ only, is the Dirichlet convolution of $F$ and $G$, also considered as functions of one variable.
This operation also respects products in the following sense. If $F=u\otimes v$ and $G=w\otimes z$, then
$F*G(m,n)=\sum_{a|m}\sum_{b|n}u(a)v(b)w(m/a)z(n/b)=\Bigl(\sum_{a|m}u(a)w(m/a)\Bigr)\Bigl(\sum_{b|n}v(b)z(n/b)\Bigr)$, which equals $(u*w)\otimes(v*z)(m,n)$, where $u*w$ is the Dirichlet convolution of $u$ and $w$ and $v*z$ is the Dirichlet convolution of $v$ and $z$.
One further obvious fact is that the operation is bilinear in the two matrices that it is applied to.
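For anyone who wants to check the algebra, here is a small Python verification that the two-variable convolution of two tensor products is the tensor product of the one-variable Dirichlet convolutions. The function names and the random test values are my own choices, purely for illustration.

```python
import random

N = 24

def divisors(n):
    return [a for a in range(1, n + 1) if n % a == 0]

def dconv(f, g):
    # Ordinary Dirichlet convolution of functions on {1,...,N} (dicts keyed by n).
    return {n: sum(f[a] * g[n // a] for a in divisors(n)) for n in range(1, N + 1)}

def dconv2(F, G):
    # Two-variable analogue: (F * G)(m, n) = sum over a|m, b|n of F(a,b) G(m/a, n/b).
    return {(m, n): sum(F[(a, b)] * G[(m // a, n // b)]
                        for a in divisors(m) for b in divisors(n))
            for m in range(1, N + 1) for n in range(1, N + 1)}

def tensor(u, v):
    return {(m, n): u[m] * v[n] for m in range(1, N + 1) for n in range(1, N + 1)}

random.seed(1)
u, v, w, z = ({n: random.randint(-2, 2) for n in range(1, N + 1)} for _ in range(4))

# (u tensor v) * (w tensor z) = (u * w) tensor (v * z)
lhs = dconv2(tensor(u, v), tensor(w, z))
rhs = tensor(dconv(u, w), dconv(v, z))
assert lhs == rhs
```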
So far everything is working swimmingly: out of two decompositions of diagonal matrices into linear combinations of products we build another one that has some number-theoretic significance. However, there is a problem: a Dirichlet convolution of two HAP-functions is not a HAP-function except in trivial cases.
How much does that matter? Without doing some detailed calculations I don’t know. Here are a couple of simple observations. Let $P$ and $Q$ be two HAP-functions. Then their Dirichlet convolution $P*Q$ is a sum of dilates of $Q$, one for each point in the HAP of which $P$ is the characteristic function, and a dilate of a HAP-function is again a HAP-function. So we can at least decompose the Dirichlet convolution of two HAP-functions as a sum of HAP-functions. The sum of the coefficients is the number of points in the HAP corresponding to $P$ — though since the situation is symmetrical, we can add dilates of $P$ instead, so it is better to say that the sum of the coefficients can be taken to be the length of the smaller of the two HAPs.
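The dilate decomposition is easy to confirm numerically. Here is a toy check (the particular HAPs and the cutoff are arbitrary choices of mine):

```python
N = 200

def hap_fn(d, k):
    # Characteristic function of {d, 2d, ..., kd} as a dict on {1,...,N}.
    return {n: 1 if n % d == 0 and n <= d * k else 0 for n in range(1, N + 1)}

def dconv(f, g):
    return {n: sum(f[a] * g[n // a] for a in range(1, n + 1) if n % a == 0)
            for n in range(1, N + 1)}

def dilate(f, t):
    # The dilate f_t takes the value f(m) at tm and 0 at non-multiples of t.
    return {n: (f[n // t] if n % t == 0 else 0) for n in range(1, N + 1)}

P = hap_fn(3, 4)   # characteristic function of {3, 6, 9, 12}
Q = hap_fn(2, 5)   # characteristic function of {2, 4, 6, 8, 10}

conv = dconv(P, Q)
sum_of_dilates = {n: sum(dilate(Q, p)[n] for p in (3, 6, 9, 12))
                  for n in range(1, N + 1)}
assert conv == sum_of_dilates              # one dilate of Q per point of P's HAP
assert sum(conv.values()) == 4 * 5         # total mass = |P| * |Q|
```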
That doesn’t sound very efficient, but we must also remember that when we take the Dirichlet convolution of two diagonal matrices, we obtain a diagonal matrix that sums to the product of the sums of the original two matrices. So there may be some gain there to compensate for the loss of efficiency. That is why I say that more detailed calculations are necessary, a point I will return to at some stage.
What can we say about the diagonal matrix $D$?
As I have already mentioned, if a diagonal decomposition of the required kind exists, then we can place strong constraints on the diagonal matrix itself. Writing $b(n)$ for $D_{nn}$, and normalizing so that $\sum_i|\lambda_i|=1$, it must have the following property: if $x$ is any real-valued function defined on the positive integers such that $\sum_nb(n)x(n)^2\geq C$, then the HAP-discrepancy of $x$ must be at least $\sqrt{C}$. (This follows from the proof that the existence of the decomposition for a diagonal matrix with $\mathrm{tr}(D)\geq C\sum_i|\lambda_i|$ implies that every $\pm1$-sequence has HAP-discrepancy at least $\sqrt{C}$.)
Turning that round, if we have any real-valued sequence $x$ of HAP-discrepancy at most $m$, then $\sum_nb(n)x(n)^2$ cannot be more than $m^2$.
A simple example of the kind of conclusion one can draw from that is that the sum of $b(n)$ over all $n$ that are not multiples of 3 is at most 1. This follows from the fact that the sequence $1,-1,0,1,-1,0,\dots$ has HAP-discrepancy 1. More generally, if $\sum_nb(n)$ is large, then for every small $d$ most of the largeness must occur on multiples of $d$.
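The discrepancy claim for the periodic sequence is easy to confirm by brute force; here is a quick check (the cutoff is an arbitrary choice of mine):

```python
# Brute-force check that x = 1, -1, 0, 1, -1, 0, ... has HAP-discrepancy 1:
# along every HAP {d, 2d, ..., kd} the partial sums stay in {-1, 0, 1}.
N = 3000
x = {n: (1, -1, 0)[(n - 1) % 3] for n in range(1, N + 1)}

worst = 0
for d in range(1, N + 1):
    partial = 0
    for m in range(d, N + 1, d):
        partial += x[m]
        worst = max(worst, abs(partial))
assert worst == 1
```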
In qualitative terms, this suggests that $b$ ought to be concentrated at numbers with many factors, and the experimental evidence backs this up, though not quite as cleanly as one might ideally like. (However, “edge effects” could well account for some of the peculiar features that are observed experimentally, so I still think it is worth trying to find a very aesthetically satisfying diagonal matrix $D$.)
Here is another reason to expect that $b(n)$ should be larger for numbers $n$ with many factors: such numbers are contained in lots of HAPs. We are trying to find a measure $b$ on $\mathbb{N}$ (at least if we insist that $b$ takes non-negative values, which seems a reasonable extra condition to try to impose) such that any function with a large $L_2$-norm with respect to that measure must have a large HAP-discrepancy. What might make it difficult to find a function with large $L_2$-norm and small HAP-discrepancy? It would be that we had lots of constraints on the values taken by the function. And the natural way that we can place lots of constraints on one value $x(n)$ is if $n$ has many factors and is therefore contained in many HAPs.
As an example of the opposite phenomenon, suppose we take an arbitrary sequence and alter its values however we like at all primes. What will happen to its HAP-discrepancy? Well, each HAP contains at most one prime, so we cannot have changed the HAP-discrepancy by more than 2. This illustrates fairly dramatically how little it matters what a sequence does at numbers with few factors.
A first wild guess.
It is tempting to jump straight from this kind of observation to a guess of a function $b$ that might work. For example, what if we took some primes $p_1<p_2<\dots<p_k$ and a collection $\mathcal{A}$ of subsets of $\{1,2,\dots,k\}$, and let $b$ be the characteristic measure of the set of all numbers that can be written as $\prod_{i\in A}p_i$ for some $A\in\mathcal{A}$. That is, $b$ takes the value $|\mathcal{A}|^{-1}$ on all such numbers and zero on all other numbers.
Does a function like that have any chance of working? Because we have ensured that it sums to 1, we would now be looking for a decomposition with absolute values of coefficients summing to at most $C^{-1}$ for a large constant $C$. But before we even think about doing that, we should try to find a sequence that is large on many of the products $\prod_{i\in A}p_i$ but that has bounded discrepancy. Only if we find it hard to do that is it worth looking for a decomposition.
Let me be more precise about what we are looking for. If we can find a decomposition of $D$ (the diagonal matrix with entries given by $b$) as $\sum_i\lambda_iP_i\otimes Q_i$, where the $P_i$ and $Q_i$ are HAP-functions and $\sum_i|\lambda_i|\leq C^{-1}$, then for every sequence $x$ we have two ways of calculating $\sum_{m,n}D_{mn}x_mx_n$. One of them gives us $\sum_nb(n)x_n^2$, while the other gives us $\sum_i\lambda_i\langle P_i,x\rangle\langle Q_i,x\rangle$. So if $\sum_nb(n)x_n^2\geq1$, then there must be some HAP on which $x$ has discrepancy at least $\sqrt{C}$.
Therefore, in order to show that this particular $b$ will not do, we need to find a sequence $x$ such that $x_n^2$ averages at least 1 on the numbers $\prod_{i\in A}p_i$ with $A\in\mathcal{A}$ but has bounded discrepancy.
Let us make our task easier to think about by looking for a sequence that takes values $\pm1$ on the numbers $\prod_{i\in A}p_i$. Can we choose the values elsewhere in such a way that we cancel out any discrepancy that we might accidentally pick up with those values?
To answer this question, let us think about how HAPs intersect the set, which, since I’m mentioning it quite a bit, I’d better give a name to: I’ll write $X$ for the set of numbers $\prod_{i\in A}p_i$ with $A\in\mathcal{A}$. Let me also write $q_S$ as shorthand for $\prod_{i\in S}p_i$. Any HAP that intersects $X$ must have a common difference of the form $q_S$ for some set $S$. The resulting HAP will contain every $q_A$ for which $S\subseteq A$ (provided it is long enough), which is rather pleasantly combinatorial.
Let’s suppose that we’ve chosen the values $x_{q_A}=\pm1$ everywhere on $X$. What we are trying to do now is choose values off $X$ (not necessarily $\pm1$) in such a way that any HAP-discrepancy contributed by values in $X$ is cancelled out by a roughly opposite discrepancy in the values outside $X$.
An obvious way to do that is to be fairly greedy about it. That is, we look at HAPs that have a substantial intersection with and simply choose a few further values on those HAPs, hoping that our choices won’t mess up too many discrepancies elsewhere. Let’s see if we can get something like that to work.
Consider, then, the (infinite) HAP with common difference $q_S$. As already mentioned, this intersects $X$ in every $q_A$ such that $S\subseteq A$. So as we trundle along the multiples $q_S,2q_S,3q_S,\dots$, we find that the partial sums of the $x_m$s that we encounter and have already chosen go up and down, and we would like to choose some further values to ensure that they remain bounded.
At which values of $m$ are we still at liberty to choose the value $x_m$? It must be a multiple of $q_S$, and two ways of ensuring that that multiple does not lie in $X$ are (i) to make sure that $m$ is also a multiple of some prime $p$ with $p\notin\{p_1,\dots,p_k\}$ and (ii) to make sure that $m$ is a multiple of $p_i^2$ for some $i$.
The question now is whether, if we do that, the value we choose will have an effect on lots of other HAPs at the same time. And it looks rather as though it will. Suppose $S$ is quite a large set. Then any multiple of $q_S$ that does not belong to $X$ is also a multiple of $q_T$ for every $T\subseteq S$, so there is indeed a danger that by choosing a value of $x_m$ we are messing up quite a lot of HAPs and not just the one we were interested in. We could of course try to sort out the HAPs with large $S$ first and later correct any damage we had done to smaller sets — but a large set has a lot of subsets.
This is another situation where the devil is in the detail. Maybe there is enough flexibility that some kind of careful greedy approach would work — perhaps we could even choose $x$ to be 1 everywhere on $X$ — but so far it seems at least possible that there exists an efficient decomposition of a diagonal matrix with entries that are concentrated on square-free numbers with many prime factors.
It is sometimes convenient to describe arithmetic progressions of the form $\{(a+1)d,(a+2)d,\dots,bd\}$ as HAPs. Let me do that now, and let me define two HAPs to be adjacent if they are disjoint and their union is also a HAP (in this generalized sense). In other words, two HAPs are adjacent if one is simply a continuation of the other.
If $P$ and $Q$ are HAP-functions coming from adjacent HAPs (in which case I’ll say that they are adjacent HAP-functions), then $(P-Q)\otimes(P-Q)$ is 1 on the diagonal at all points in the union of the two HAPs, while off the diagonal it is 1 on $(P\times P)\cup(Q\times Q)$ and $-1$ on $(P\times Q)\cup(Q\times P)$ (writing $P$ and $Q$ also for the HAPs themselves). If we take lots of products of this form, we can hope to get lots of cancellation off the diagonal, while getting none at all on the diagonal.
Let’s try to pursue this idea in more detail. Suppose we have some kind of probability distribution on functions of this form. That is, we pick, according to some distribution, a random set $S$ and then a random pair of adjacent HAPs $P$ and $Q$ with common difference $q_S$. Now let $P$ and $Q$ also stand for their characteristic functions and let $m$ and $n$ be two positive integers. What is the expected value of $\bigl((P-Q)\otimes(P-Q)\bigr)(m,n)$?
A necessary condition for this quantity to be non-zero is that $q_S$ should divide both $m$ and $n$. So let us condition on this event. If we are clever about the way we choose our probability distribution, we should be able to organize it so that if you choose a random $S$, then typically plenty of its subsets have a similar probability of being chosen. The reason that is potentially important is that if we fix a common difference and pick random pairs of adjacent HAPs $P$ and $Q$ of some given length, then pairs $(m,n)$ that are close to the diagonal will tend to get positive values (because normally if one of them is in $P$ then so is the other, and similarly for $Q$) while pairs that are a little further away tend to get negative values. But if we have many different common differences in operation then we can hope that some of these biases will cancel out: a difference that tends to result in a positive value at one scale might tend to result in a negative value at another scale.
While writing this I have just remembered a useful trick that we were well aware of during the previous attempt at EDP. It is straightforward to show that EDP is equivalent to the same question for the rationals. That is, one wishes to show that for every function defined on the positive rationals and every constant $C$ there is a HAP (the definition is obvious) on which the sum of the function has absolute value at least $C$. The nice thing about this formulation is a much greater symmetry: for every rational $q$, the map $x\mapsto qx$ is an isomorphism from $\mathbb{Q}$ (considered as a group under addition) to itself. This makes it unnecessary to think about numbers with lots of factors. (It is not hard to get into this nice situation using just integers — for example, you can just multiply everything by $N!$ for a very large $N$ and you can divide your numbers by whatever you like, within reason — but it is more natural to use rationals.)
The hope was that if we thought about functions defined on the rationals, then the rather peculiar properties of the diagonal matrix one wishes to decompose might become less peculiar. However, one would still need to think carefully about ratios of numbers.
I’m tempted to continue thinking aloud about these possibilities, but I’ll save that up for the comments.
A second wild guess.
I’m now going to ignore what I’ve just written about working with the rationals and go back to the question of trying to find a natural function $b$ that is biased towards positive integers with many factors. What I’d like to do is build a sequence of non-negative functions $b_1,b_2,b_3,\dots$, each better than the last, in the hope of eventually finding one that has (or at least doesn’t obviously not have) the property that every sequence $x$ such that $\sum_nb(n)x_n^2\geq1$ has large HAP-discrepancy. (To make this non-trivial, I also need $b$ to sum to 1.)
I’ll start with a function I know doesn’t work: let $b(n)=1/N$ for every $n\leq N$ and 0 for every $n>N$. That fails because the sequence $x=1,-1,0,1,-1,0,\dots$ has HAP-discrepancy 1 but $\sum_nb(n)x_n^2=2/3$ (at least if $N$ is a multiple of 3). That doesn’t satisfy the condition I said, but it does if we multiply the sequence by $\sqrt{3/2}$.
The problem there was that $b$ has a large sum on non-multiples of 3. To correct that and similar problems, it feels natural to give greater weight to numbers with more factors. But how? Well, let’s do it in a rather naive way and simply weight each number by how many factors it has (and therefore how many HAPs it belongs to). This at least has the advantage of being a natural number-theoretic function: it is the divisor function $d$, the Dirichlet convolution of the constant function 1 with itself (at least until you get beyond $N$, if the constant function is truncated at $N$). That is, $d(n)=\sum_{uv=n}1$.
Roughly how big is $\sum_{n\leq N}d(n)$? This sum is the number of pairs $(u,v)$ of positive integers such that $uv\leq N$ (since $d(n)$ is the number of pairs of positive integers $(u,v)$ such that $uv=n$). Counting, for each $u\leq N$, the number of possible $v$, we can crudely estimate this as $\sum_{u\leq N}N/u$, which is roughly $N\log N$. On the other hand, the Dirichlet convolution with itself of the function that is 1 up to $N$ and 0 thereafter sums to $N^2$ (since each pair of positive integers $(u,v)$ with $u,v\leq N$ contributes 1 to the sum). Since the two functions are equal up to $N$, we find that the vast majority of the mass of the second function lies beyond the point $N$ up to which it equals $d$.
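Both estimates are easy to test numerically. The following quick check, with a cutoff of my own choosing, confirms the $N\log N$ estimate and the fact that almost all of the mass of the full self-convolution lies beyond $N$:

```python
import math

N = 10000
d = [0] * (N + 1)
for a in range(1, N + 1):          # sieve the divisor function d(n)
    for m in range(a, N + 1, a):
        d[m] += 1

total = sum(d[1:])
# sum_{n <= N} d(n) is close to N log N ...
assert abs(total / (N * math.log(N)) - 1) < 0.1
# ... while the full self-convolution of 1_{<=N} has total mass N^2 (one unit
# for each pair (u, v) with u, v <= N), so almost all of that mass lies beyond N.
assert total / N**2 < 0.002
```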
But let’s not worry about that. On average, $d(n)$ grows like $\log n$, which is not too fast a rate of growth, so let’s simply take the function $d$ up to $N$, normalized so that it sums to 1. We now want to prove that every sequence $x$ such that $\sum_nb(n)x_n^2\geq1$ has HAP-discrepancy at least $g(N)$ (for some $g$ that tends to infinity with $N$). Alternatively, we want to find a counterexample, which will induce us to try to improve the function.
An obvious example to try is the usual one, namely the sequence $1,-1,0,1,-1,0,\dots$. To see whether this works, we need to estimate the sum of $d(n)$ over all non-multiples of 3 up to $N$. That equals the number of pairs $(u,v)$ such that neither $u$ nor $v$ is a multiple of 3 and $uv\leq N$. And that is roughly $4/9$ times what you get without the divisibility condition. In other words, it is roughly $\frac49N\log N$. So to get $\sum_nb(n)x_n^2$ to equal 1 we need to multiply the sequence by $3/2$, which gives it a HAP-discrepancy of $3/2$, a slight improvement on the $\sqrt{3/2}$ that we obtained with a constant function.
This doesn’t prove much — maybe there are better examples — but it suggests in a weak way that taking Dirichlet convolutions results in improved functions. To be clear what I mean by “improved”, I am now looking for a non-negative function $b$ defined on $\{1,2,\dots,N\}$ and summing to 1 such that for large $N$, or perhaps even all $N$, if $x$ is any sequence such that $\sum_nb(n)x_n^2\geq1$, then $x$ has HAP-discrepancy at least $g(N)$. The bigger I can get $g(N)$ to be, the better I consider the function to be.
If you take repeated Dirichlet convolutions of the constant function 1, then you get the functions $d_k$, where $d_k(n)$ is the number of ordered $k$-tuples $(u_1,\dots,u_k)$ of positive integers such that $u_1u_2\cdots u_k=n$. (The function $d$ is equal to $d_2$.) Suppose $k$ is large, $N$ is very large, and $x$ is a sequence such that $\sum_{n\leq N}d_k(n)x_n^2\geq\sum_{n\leq N}d_k(n)$. Does it follow that $x$ has large HAP-discrepancy? More ambitiously, does it have HAP-discrepancy that grows exponentially with $k$, provided that $N$ is large enough?
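The functions $d_k$ are easy to generate by repeated convolution, and the small values asserted below follow from the stars-and-bars count of ordered factorizations of prime powers. This is a toy sketch with parameters of my own choosing:

```python
N = 1000

def dconv(f, g):
    h = [0] * (N + 1)
    for a in range(1, N + 1):
        if f[a]:
            for b in range(1, N // a + 1):
                h[a * b] += f[a] * g[b]
    return h

one = [0] + [1] * N          # the constant function 1 on {1,...,N} (index 0 unused)
d2 = dconv(one, one)         # the divisor function d = d_2
d3 = dconv(d2, one)          # d_3(n) = number of ordered triples with product n

assert d2[12] == 6           # divisors of 12: 1, 2, 3, 4, 6, 12
assert d3[7] == 3            # (7,1,1), (1,7,1), (1,1,7)
assert d3[8] == 10           # 8 = 2^3, and C(3 + 2, 2) = 10 ordered triples
```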
Back to the rationals.
The following idea is one that feels familiar — I think something similar has come up already. But there’s nothing like a break from a problem to make an old stale idea seem new and fresh, and sometimes the apparent staleness was an illusion. So here it is again.
I’ll begin by creating a function on the diagonal that seems to me to have a sporting chance of being efficiently decomposable into HAPs. Let $G$ (for generating set) be the set of all rationals $a/b$ with $a$ and $b$ integers between 1 and $m$, and let $f$ be the $k$-fold multiplicative convolution of the characteristic function of $G$ with itself. That is, $f(q)$ is the number of ways of writing $q$ as $(a_1/b_1)(a_2/b_2)\cdots(a_k/b_k)$ with all the $a_i$ and $b_i$ being integers between 1 and $m$.
Before I think about how one might go about decomposing the diagonal “matrix” associated with $f$ (that is, a function defined on pairs of positive rationals that is zero at $(q,r)$ unless $q=r$) into products of HAP-functions, I want to see whether it looks hard to find a function $x$ such that $\sum_qf(q)x(q)^2\geq\sum_qf(q)$ but $x$ is of bounded HAP-discrepancy.
To have any chance of that, we need to understand a bit about $f$. We can get that understanding by representing every rational as a product of (possibly negative) powers of distinct primes. For the rationals $q$ where $f(q)$ is non-zero, the primes involved are all at most $m$, so we can think of $f$ as a function defined on a lattice $\mathbb{Z}^d$ of dimension $d=\pi(m)$ (the number of primes less than or equal to $m$). Then $f$ is the $k$-fold convolution of the characteristic function of $G$ regarded as a subset of this lattice. If $k$ is large enough, then this convolution will look like a large and spread-out Gaussian, which will have the potentially useful property that it is approximately invariant if you multiply (or divide) by an integer between 1 and $m$.
If $f$ has that property, does it rule out some kind of variant of the $1,-1,0,1,-1,0,\dots$ example? A key feature of that example is that the function in question is zero on multiples of 3, but in $\mathbb{Q}$ there is no such thing as a multiple of 3 (or rather, everything is a multiple of 3, so the concept ceases to make sense).
But that argument is far from conclusive. How about defining a function $x$ as follows: given a rational $a/b$ in its lowest terms, let $x(a/b)$ be 0 if either $a$ or $b$ is divisible by 3, and otherwise let it be 1 if $a\equiv b$ mod 3 and $-1$ if $a\equiv-b$ mod 3?
Let us think what the values of $x$ are at $d,2d,3d$, etc., where $d=a/b$ in its lowest terms. If $a$ is a multiple of 3, then all those values are 0 (since $b$ is not a multiple of 3). If $b$ is a multiple of 3 but $a$ is not, then $x(nd)=0$ except if $n$ is a multiple of 3. If $a$ and $b$ are congruent mod 3, then the values of $x(nd)$ go $1,-1,0,1,-1,0,\dots$. If they are not congruent mod 3, then they go $-1,1,0,-1,1,0,\dots$. That last argument worked even if $n$ was not coprime to $b$, so we have covered all cases.
What is the density of rationals $a/b$ such that neither numerator nor denominator is a multiple of 3? I haven’t made the question precise, and that allows me to give two different answers. The first answer is that if we pick the numerator and denominator randomly, then the probability that neither is a multiple of 3 is $4/9$. Also, the probability that they are coprime is $6/\pi^2$, and the probability that they are both coprime and not multiples of 3 can easily be shown by similar methods to be at least an absolute constant. So it seems that our function is supported on a set of rationals of positive density.
But another way of thinking about it suggests that the support of the function has zero density: every rational can be written as a product of powers of primes, and the probability that the power of 3 that we choose is 0 tends to 0 as we choose more and more rationals.
The second answer is, I think, the correct one for us, though the question is a bit misleading. The $f$-density of rationals for which the “3-coordinate” is zero does indeed tend to zero, so if we regard $f$ as our measure, rather than something more additive, then we do appear to have ruled out examples that are similar to the $1,-1,0,1,-1,0,\dots$ example.
Can we use information about the original problem to help us with the dual problem?
We know that for the integers there is not an efficient HAP-decomposition of a diagonal matrix whose diagonal is 1 for the first $N$ terms and zero afterwards. The proof we have is that if such a decomposition existed, then the sequence $1,-1,0,1,-1,0,\dots$ would have to have large HAP-discrepancy, which it doesn’t. But can we give a “direct” proof? I’m not quite sure what I mean by that, except that it should not involve the use of a separating functional. Ideally what I’d like to do is use that example and others like it to work out some rather general constraint that would end up saying a precise version of, “If an efficiently decomposable matrix equals 1 on the diagonal, then it must have quite a lot of stuff off the diagonal,” and ideally say a fair amount about that “stuff”.
The hope would be to find a clear obstacle on the dual side to a direct approach to decomposing the identity matrix, and then to show that when we move to the measure on the rationals defined in the previous section, that obstacle goes away and allows us to do what we wished we could do over the integers. The thing that would make life easier would be the ability to divide anything by small integers.
Actually, maybe it isn’t so important to go for the dual side. Let’s just understand what the existing proof is telling us goes wrong when we try to find an efficient HAP-decomposition. For the decomposition to be efficient, the HAPs we use have to be quite long. And for that to be the case, they must contain roughly as many numbers congruent to 1 mod 3 as numbers congruent to 2 mod 3. (They might contain none of either.) Therefore, any HAP product $P\otimes Q$ has roughly as many points $(m,n)$ congruent to $(1,1)$ or $(2,2)$ mod 3 as points congruent to $(1,2)$ or $(2,1)$ mod 3. Let us call these white points and black points, respectively. By roughly I mean that these two numbers differ by at most a constant. In fact, I think they differ by at most 1. But no points on the diagonal are congruent to $(1,2)$ or to $(2,1)$, so if a product of HAPs includes $k$ points on the diagonal that are congruent to $(1,1)$ or $(2,2)$, then it must include at least $k-1$ more black points than white points off the diagonal.
So, roughly speaking, we can’t build up the non-multiples of 3 on the diagonal without building up a bias towards black points off the diagonal.
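The claim that the white and black counts of a HAP product differ by at most 1 is easy to check by brute force. Here is a quick Python sketch (the code and names are mine, not part of the argument): the difference factorizes as the product of the two single-HAP imbalances, each of which lies in $\{-1,0,1\}$, and the loop at the end confirms this for many small cases.

```python
from itertools import product

def hap(d, length):
    """The HAP {d, 2d, ..., length*d}."""
    return [d * k for k in range(1, length + 1)]

def colour_counts(P, Q):
    """Count white points ((1,1) or (2,2) mod 3) and black points
    ((1,2) or (2,1) mod 3) in the product set P x Q."""
    white = black = 0
    for x, y in product(P, Q):
        r, s = x % 3, y % 3
        if r == s and r != 0:
            white += 1
        elif {r, s} == {1, 2}:
            black += 1
    return white, black

# For every small choice of differences and lengths, the two counts
# differ by at most 1.
for d, m, e, n in product([1, 2, 3, 4, 5], repeat=4):
    w, b = colour_counts(hap(d, m), hap(e, n))
    assert abs(w - b) <= 1
```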
Does that matter? Can’t we correct that unwanted bias by adding or subtracting some other HAP products? No we can’t, unless we also cancel out much of the sum on the diagonal.
What rescues us if we deal with rationals instead (or, more precisely, with a portion of the rationals weighted by the measure defined in the previous section)? Also, how easy is it to find an efficient decomposition so that the weight off the diagonal is comparable to the weight on it? I suspect the answer to the second question is that it is not easy, since so far we have used just the fact that one particular sequence has bounded discrepancy, but there are many other examples.
A variant of the HAP problem.
I stopped the previous section prematurely because I was feeling a bit stuck. Instead I would like to think about a modified problem that may well be easier. EDP asks for a high discrepancy inside some HAP. The set of HAPs has the nice symmetry property that if you dilate a HAP then you get another one. This is particularly good over the rationals, when you can also contract.
Now let me ask a slightly vague question and then attempt to make it precise. Can we prove EDP for a different “base collection” of sets? For EDP I’m regarding the base collection as the collection of all initial segments $\{1,2,\dots,n\}$. If you look at all dilations of those, you get all HAPs. Since I’m having trouble proving EDP for HAPs, I’d like to try to prove it for a different base collection. Of course, the smaller the collection, and the more closely related it is to the set of HAPs, the better, but for now my priority is to be able to prove something.
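Concretely, dilating the initial segment $\{1,2,\dots,n\}$ by $d$ gives exactly the HAP with common difference $d$ and length $n$, which is why the initial segments form a base collection for the HAPs. A trivial Python illustration (names mine):

```python
def dilate(S, d):
    """Dilate a set of integers by the factor d."""
    return {d * x for x in S}

# Dilating {1, ..., 7} by 3 gives the HAP {3, 6, ..., 21}.
initial_segment = set(range(1, 8))
assert dilate(initial_segment, 3) == {3 * k for k in range(1, 8)}
```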
I actually have a collection of sets in mind. Let us think of the rationals multiplicatively as an infinite-dimensional lattice, or rather the portion of that lattice that consists of points with only finitely many non-zero coordinates. That is, the point $(a_1,a_2,a_3,\dots)$ corresponds to the rational $2^{a_1}3^{a_2}5^{a_3}\cdots$, the $a_i$ being integers. I’m not sure exactly what I want to do next, so let me be slightly less precise. If we look at the rationals multiplicatively, as we are doing here, then dilation becomes simply translation. What I’d really like is something like an orthonormal basis, but I’d like it to consist largely of functions that are translates of one another. So far that won’t work, but it looks a lot more hopeful if we allow some kind of “spreading out” (which would be the analogue of taking longer and longer intervals). The most natural type of spreading out is, I would suggest, to take repeated convolutions. So a possible question — not necessarily the exact one I’ll end up asking — is this. Suppose we start with the set that we considered earlier. Is it true that every function that has a large norm relative to the measure defined by (multiplicatively) convolving the characteristic measure of that set $k$ times has a large inner product with some (multiplicative) translate of some iterated convolution of that set?
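The correspondence between positive rationals and lattice points, and the fact that dilation becomes translation, can be made concrete in a few lines of Python (my own sketch; only the first few primes are used as coordinates):

```python
from fractions import Fraction

PRIMES = [2, 3, 5, 7, 11, 13]  # enough coordinates for small examples

def to_lattice(q):
    """Exponent vector (a_1, a_2, ...) of a positive rational q = prod p_i^{a_i}."""
    q = Fraction(q)
    coords = []
    for p in PRIMES:
        a = 0
        while q.numerator % p == 0:
            q /= p
            a += 1
        while q.denominator % p == 0:
            q *= p
            a -= 1
        coords.append(a)
    assert q == 1, "need more prime coordinates"
    return tuple(coords)

def translate(coords, d):
    """Multiplying a rational by d is translation in the lattice."""
    return tuple(a + b for a, b in zip(coords, to_lattice(d)))

# 20/9 = 2^2 * 3^(-2) * 5, and dilating 5/6 by 12 gives 10.
assert to_lattice(Fraction(20, 9)) == (2, -2, 1, 0, 0, 0)
assert translate(to_lattice(Fraction(5, 6)), 12) == to_lattice(10)
```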
This problem should not be that hard, since it is purely multiplicative, so we can take logs and make it purely additive. If the answer turned out to be yes, then for any $\pm 1$-function we would get a large inner product with some iterated convolution of the basic set. If we could nicely decompose that iterated convolution into HAPs, we would then be done, though that seems like quite a lot to ask.
Hmm … I’ve almost instantly run into trouble. It seems that my “multiplicative version” of EDP is just plain trivially false. Consider the function that is 1 at products of an even number of primes and -1 at products of an odd number of primes. (This is the completely multiplicative Liouville function $\lambda$, with which we are already very familiar.) From a multiplicative point of view, where we think of the rationals as an infinite-dimensional lattice, this is putting 1 at “white points” and -1 at “black points”. If we now take any “cuboid”, by which I mean a set of lattice points $(a_1,\dots,a_k)$ with $b_i\le a_i<b_i+m_i$ for every $i$, then the sum of $\lambda$ on that cuboid is either 0, 1 or -1. I was about to say that I couldn’t see an elegant proof of this, but here’s a fairly quick one. If we remove two adjacent slices from the cuboid, we remove an equal number of white and black points. “Removing two adjacent slices” means reducing some $m_i$ by 2, provided $m_i\ge 2$. If some $m_i$ is even, then by repeating this operation for that $i$ we can remove all points, which shows that the numbers of white and black points were originally the same. If all the $m_i$ are odd, then we end up reducing the cuboid to a set with just one point, in which case we get $\pm 1$.
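The cuboid claim is also easy to verify numerically. On the exponent lattice, $\lambda$ at a point is $(-1)$ raised to the sum of the coordinates, so the sum over a cuboid factorizes into one alternating sum per coordinate, each of which is 0 or $\pm 1$. A brute-force check in Python (my code, not part of the argument):

```python
from itertools import product

def cuboid_lambda_sum(sides):
    """Sum of lambda over the cuboid prod_i {b_i, ..., b_i + m_i - 1}.
    Lambda at lattice point a is (-1)^(a_1 + ... + a_k), so the base
    points b_i only affect the overall sign; we sum over {0,...,m_i-1}."""
    total = 0
    for point in product(*[range(m) for m in sides]):
        total += (-1) ** sum(point)
    return total

# Every cuboid sum lies in {-1, 0, 1}.
for sides in product(range(1, 5), repeat=3):
    assert cuboid_lambda_sum(sides) in (-1, 0, 1)
```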
If we multiplicatively convolve any two cuboids, we obtain a function that can be expressed as a linear combination of translates of a cuboid, so its inner product with $\lambda$ will also be small, as will those of all its translates. (What’s more, those small inner products will themselves alternate in sign as you multiply or divide by primes.) So $\lambda$ will have small inner product with everything you build out of products of GPs using multiplicative convolutions.
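The reason the inner products stay small is that $\lambda$ is a $\pm 1$-valued character of the lattice: for the (additive) convolution on exponent vectors, $\langle f*g,\lambda\rangle=\langle f,\lambda\rangle\langle g,\lambda\rangle$, and each cuboid factor is in $\{-1,0,1\}$. A small Python sketch of this identity (all names mine):

```python
from itertools import product
from collections import defaultdict

def lam(point):
    """Lambda at a lattice point: (-1)^(sum of exponents)."""
    return (-1) ** (sum(point) % 2)

def convolve(f, g):
    """Additive convolution of finitely supported functions on the lattice
    (= multiplicative convolution on the rationals)."""
    h = defaultdict(int)
    for x, fx in f.items():
        for y, gy in g.items():
            h[tuple(a + b for a, b in zip(x, y))] += fx * gy
    return h

def inner(f):
    """Inner product of f with lambda."""
    return sum(v * lam(x) for x, v in f.items())

def cuboid(bases, sides):
    """Indicator function of the cuboid prod_i {b_i, ..., b_i + m_i - 1}."""
    return {pt: 1 for pt in
            product(*[range(b, b + m) for b, m in zip(bases, sides)])}

# The character identity: <f*g, lambda> = <f, lambda> * <g, lambda>.
f = cuboid((0, -1), (3, 5))
g = cuboid((2, 0), (3, 3))
assert inner(convolve(f, g)) == inner(f) * inner(g)
```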
It could be fruitful to think about why HAPs have a better chance of giving rise to discrepancy than the kinds of multiplicative sets I’ve just considered. In preparation for that, here is a question to which I know the answer. Let $A_1\subset A_2\subset A_3\subset\cdots$ be a nested collection of subsets of $\mathbb{N}$ with $|A_n|=n$ for every $n$. Is it possible for there to be a function $f:\mathbb{N}\to\{-1,1\}$ and a constant $C$ such that $|\sum_{x\in A_n}f(dx)|\le C$ for every $n$ and every $d$? In the case that $A_n=\{1,2,\dots,n\}$ for every $n$ this is EDP. But for a general nested collection, it is easy to see that bounded discrepancy is possible for that collection and all its dilates. You just take a completely multiplicative function such as $\lambda$ and choose $A_n=\{x_1,\dots,x_n\}$ in such a way that $\lambda(x_1)+\dots+\lambda(x_n)\in\{-1,0\}$ for every $n$ (and such that the $x_i$ are distinct and every integer is equal to $x_i$ for some $i$). Then the sum is always either -1 or 0, and multiplicativity ensures that the sum on dilates of the $A_n$ is always either 1, -1 or 0.
The partial sums of $\lambda$ grow (it is believed) like $n^{1/2}$, so if we choose our sets greedily — that is, by taking $x_{n+1}$ to be the smallest positive integer we have not yet chosen such that $\lambda$ takes the value we need — then $A_n$ will contain the first $m$ integers, where $m$ is around $n-\sqrt n$ (at most), and look a bit like a random selection of integers within around $\sqrt n$ of $n$ for the rest of the set. In other words, $A_n$ will be pretty close to the set $\{1,2,\dots,n\}$, so the system of sets $A_n$ and their dilates will be pretty close to the set of all HAPs. This suggests that proving EDP is going to be really quite delicate.
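The greedy construction can be carried out in a few lines. Below is a Python sketch (my own code; $\lambda$ is computed by naive trial division): the partial sums stay in $\{-1,0\}$ by construction, and complete multiplicativity makes every dilated sum equal to $\lambda(d)$ times a partial sum, hence also bounded.

```python
def liouville(n):
    """Liouville's function: (-1)^(number of prime factors of n, counted
    with multiplicity), computed by trial division."""
    count, p = 0, 2
    while n > 1:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return (-1) ** count

# Greedy choice: x_{n+1} is the smallest integer not yet chosen whose
# lambda-value keeps the partial sum in {-1, 0}.
A, used, s = [], set(), 0
for _ in range(400):
    want = -1 if s == 0 else 1
    x = 1
    while x in used or liouville(x) != want:
        x += 1
    used.add(x)
    A.append(x)
    s += want
    assert s in (-1, 0)

# Dilated sums: sum over A_n of lambda(d*x) = lambda(d) * (partial sum),
# so every dilate also has sum in {-1, 0, 1}.
for d in (2, 3, 7):
    assert sum(liouville(d * x) for x in A[:100]) in (-1, 0, 1)
```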
In fact, it will be more delicate still, since we can take a better multiplicative function than $\lambda$. If we take the unique completely multiplicative function that takes $n$ to 1 if $n$ is congruent to 1 mod 3, to $-1$ if $n$ is congruent to 2 mod 3, and 3 to -1, its partial sums grow logarithmically, so we can say something similar to the above but with $m$ around $n-\log n$ rather than around $n-\sqrt n$.
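The logarithmic growth of the partial sums is easy to confirm numerically: removing the multiples of 3 from an initial segment leaves a character sum in $\{0,1\}$, and each factor of 3 contributes at most 1 more, giving a bound of roughly $\log_3 N$. A Python check using a smallest-prime-factor sieve (implementation details mine):

```python
N = 10000
spf = list(range(N + 1))  # smallest prime factor sieve
for p in range(2, int(N ** 0.5) + 1):
    if spf[p] == p:
        for q in range(p * p, N + 1, p):
            if spf[q] == q:
                spf[q] = p

# Completely multiplicative: f(p) = 1 if p = 1 (mod 3), else f(p) = -1,
# which gives f(3) = -1 and f(p) = -1 for p = 2 (mod 3).
f = [0] * (N + 1)
f[1] = 1
for n in range(2, N + 1):
    p = spf[n]
    f[n] = f[n // p] * (1 if p % 3 == 1 else -1)

partial, worst = 0, 0
for n in range(1, N + 1):
    partial += f[n]
    worst = max(worst, abs(partial))
assert worst <= 10  # consistent with logarithmic growth: log_3(10000) is about 8.4
```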
We can go slightly further with this function. Because it is 1 on the infinite arithmetic progression $1,4,7,10,\dots$ and $-1$ on the infinite arithmetic progression $2,5,8,11,\dots$, we can choose our sets $A_n$ to be intervals up to roughly $n$ plus logarithmically short (non-homogeneous) arithmetic progressions of common difference 3.
Since these sets are cooked up so as to ensure that the function has bounded discrepancy, they might seem not that interesting. But I think they are important, because if we want to find an efficient decomposition of a diagonal matrix into HAP products, we probably need to understand what it is that makes HAPs so much better than HAPs with highly structured tiny logarithmic perturbations at the end. Otherwise we are in danger of taking seriously arguments that would work just as well for statements that we know to be false.
As I write, it occurs to me that we might be able to prove a result that would be quite a bit easier than EDP but nevertheless interesting (and in particular interesting enough to be publishable, unless the answer turns out to be disappointingly easy): that there exists a permutation $\pi$ of $\mathbb{N}$ such that the discrepancy of every $\pm 1$ sequence on the sets of the form $\{d\pi(1),d\pi(2),\dots,d\pi(n)\}$ is unbounded. We know it is not true for all permutations — even ones that in some sense don’t permute $\mathbb{N}$ by very much — and we suspect that it is true for the identity permutation but find that very hard to prove. What about if $\pi$ is in some sense “fairly random”? For example, what if we chose a large $N$ and randomly permuted the first $N$ integers. That would give us sets $A_n=\{\pi(1),\dots,\pi(n)\}$ and all their dilates. Could we prove that the discrepancy on that set system is at least some function of $N$ that tends to infinity? What if we drop the condition that the union of the $A_n$ should be all of $\mathbb{N}$? What if instead we ask that $A_n=\{x_1,\dots,x_n\}$ for some strictly increasing sequence $x_1<x_2<x_3<\cdots$?
I rather like those questions, which makes this a good place to stop the post — for the length of which I apologize.