Gil has a not quite finished third post that will appear soon on this blog. Meanwhile, here are a few thoughts I’ve had recently as I tried to get my brain into EDP mode again.

The approach to the Erdős discrepancy problem that, rightly or wrongly, I found most promising when we were last working on it was to prove a certain statement about matrices that can be shown quite easily to imply a positive solution to the problem. In this post, I’m going to treat that matrix statement as *the* problem, and think about how one might go about trying to prove it. I’ll give the brief explanation of why it implies EDP, but not of what the possible advantages of the approach are (discussions of which can be found in some of the earlier material).

First I’ll need to recall a couple of definitions and some notation. A *homogeneous arithmetic progression*, or HAP, is a set of the form for some pair of positive integers and . Let me say that is a *HAP-function* if it is the characteristic function of a HAP. Finally, if and are functions defined on , let denote the function defined on that takes the value at , which can be thought of as an infinite matrix.

Here, then, is the statement that it would be great to prove. What is nice (but also quite daunting) about it is that it is a pure existence statement.

**Problem.** Prove that for every there exists a real matrix decomposition with the following properties.

(i) is a diagonal matrix and .

(ii) Each and is a HAP-function.

(iii) .

I have stated the problem in this form because (for a reason I will outline below) I am confident that such a decomposition exists. However, it seems to be quite hard to find.

Why would the existence of an efficient decomposition of a diagonal matrix into products of HAP-functions prove EDP? Well, let be a -sequence and consider the quantity . On the one hand it equals

,

and on the other hand it equals

.

Since , there must exist such that , from which it follows that there exists a HAP such that . This shows that must have HAP-discrepancy at least for every , and we are done.

Of course, one can deduce anything from a false premise, so one needs at least some reason to believe that the decomposition exists. Quite a good reason is that the existence of the decomposition can be shown (without too much difficulty) to be equivalent to the following generalization of EDP.

**Problem.** Show that if and are any two sequences of vectors in a Hilbert space such that for every , then for every constant there exist HAPs and such that .

This is a strengthening of EDP, since it clearly implies EDP when the Hilbert space is . However, it doesn’t seem like such a huge strengthening that it might be false even if EDP is true. At any rate, it seems just as hard to find a counterexample to this as it does to EDP itself. (You might ask why I believe that EDP is true. It’s partly blind faith, but the heuristics presented by Gil Kalai in the previous post offer some support, as does — very weakly — the experimental evidence.)

**How does one ever solve existence problems?**

Let’s begin by thinking very generally about the situation we’re in: we have to come up with a mathematical object that has certain properties . (Of course, we could combine those into one property, but it is often more natural to think of the object as having several different properties simultaneously.)

I’m going to list strategies until I can’t (or at least can’t immediately) think of any more that seem to have any chance of success. (An example of a strategy that I think has no chance of success for this problem is to search for a just-do-it proof, the problem being that the constraints appear to be quite delicate, whereas a just-do-it proof works better when one has a number of independent constraints that are reasonably easy to satisfy simultaneously.)

1. Use an object you already know, possibly with some small modifications.

2. Start with examples that work for a certain constant , and use them as building blocks for larger and more complicated examples that work with a larger constant .

3. Use some kind of duality to prove that if an example does not exist, then some other object *does* exist, and show that in fact the second object does not exist.

4. Use a computer to search for small examples, try to discern a pattern in those examples, use that pattern to guess how to construct larger examples, and check that the guess works.

5. Strengthen the conditions on the object until there is a unique (or almost unique) object that satisfies those conditions.

6. Weaken the conditions on the object (e.g. by forgetting about some of the properties ), try to describe all objects that satisfy the weaker conditions, and then reformulate the problem as that of finding an object of that description that satisfies the stronger conditions. [As a very simple example, if you were required to find a positive solution of some quadratic equation, you would solve the equation without the positivity condition and then pick out a positive root.]

7. Assume that you have an object with the required properties, deduce some interesting further properties, find a natural object with the further properties, and hope that it has the original properties as well. [This is a variant of 6.]

**How do the general strategies fare when it comes to EDP?**

I will discuss them briefly at first, before concentrating in more detail on a few of them.

1. This doesn’t appear to be the kind of problem where there is already some matrix decomposition (or related mathematical object) waiting in the wings to be modified so that it becomes the example we want. Or if there is, the only way we are likely to discover it is by pinning down more precisely the properties we are interested in so that someone somewhere says, “Hang on, that sounds like X.” But I think it is very unlikely that we will discover an example that way.

2. I am definitely interested in the idea of starting with small examples and using them to build bigger examples. However, an approach like this needs plenty of thought: non-trivial small examples seem hard to come by, and the only obvious method for building bigger ones is taking linear combinations of smaller ones. (A trivial small example is to take a linear combination of the form where each is the characteristic function of a singleton. This gives us . To get anything interesting, we need the to be HAP-functions for longer HAPs, but then it is very difficult to get the off-diagonal terms to cancel.)

3. We are trying to write as an efficient linear combination of objects of a certain form. That is a classic situation for applying the Hahn-Banach theorem: if there is no such linear combination, then there must be a separating functional. What can we say about that functional?

This approach seems promising at first, but is in fact of no use at all, since if you pursue it, the conclusion you come to is that if there is no such linear combination, then there are sequences of unit vectors and in a Hilbert space and a constant such that for every pair of HAPs and . The reason duality doesn’t help is that we used duality to reformulate EDP as a problem about decomposing diagonal matrices. If we use duality again, we get back to (a generalization of) EDP.

4. We have tried this. Moses Charikar and others have written programs to search for the most efficient decompositions (for this and for a related problem). The results are useful in that they give one some feel for what the diagonal matrix needs to be like, but it does not seem to be possible to obtain a large enough matrix to use it to go as far as guessing a formula. For that, more theoretical methods will be necessary (but knowing roughly what the answer needs to look like should make finding those methods quite a bit quicker).

5. One difficulty is that we are not looking for a unique object: it would be nice if the properties we were trying to obtain determined the matrix and its decomposition uniquely, since then we could hope to *calculate* them. An obvious property to add is that the object is in some sense extremal (though what that sense is is not completely obvious). Another thing we can do is insist on various symmetries — something that has already been tried. Yet another is to restrict the class of HAPs that may be used: as far as we can tell at the moment, we can get away with HAPs with common differences that are all either prime or a power of 2. (I say that simply because it seems to be hard to find a sequence that has bounded discrepancy on all such HAPs. Gil’s heuristics suggest that the discrepancy should grow like for sequences of length .)

6. What weaker properties might we go for? One idea is simply to try to find a linear combination of products of HAP-functions that leads to a considerable amount of cancellation off the diagonal, even if there are still off-diagonal terms left. This would be interesting because if one naively tries to find a decomposition that works, it seems to be very hard to get a substantial amount of cancellation (except if very small HAPs are used, in which case the resulting decomposition is not very informative). Another (somewhat related) idea is to look for interesting decompositions but with not particularly large values of . Yet another might be to allow a wider class of functions that HAP-functions.

7. I don’t have much idea of how to apply this strategy to the entire decomposition, but there is plenty one can say about the diagonal matrix that ones wishes to decompose, and that information, if it could be expressed neatly, would surely be very useful: it seems much easier to try to decompose a specific diagonal matrix than to try to decompose a diagonal matrix that you have to choose first, when the vast majority of diagonal matrices *cannot* be efficiently decomposed.

The proof that the diagonal decomposition implies EDP actually proves a much stronger result. It gives us a function such that such that if is any sequence of real numbers (not necessarily -valued) with , then has unbounded HAP-discrepancy. To put that loosely, the sequence cannot correlate with the square of any sequence of bounded discrepancy. That puts very strong conditions on , but so far we have not fully understood what those conditions are. It seems that we should be able to make progress on this, perhaps even adding reasonable conditions that would allow us determine the function uniquely.

**Creating new examples out of old ones.**

Suppose that and . Clearly if we take a linear combination of and , we get another decomposition of a diagonal matrix, and if the original decompositions were into products of HAPs then so is the new one.

Can we do anything more interesting? Another possibility is to look for some kind of product. Now Dirichlet convolutions are natural objects to look at in the context of EDP, since the expression can be thought of as follows. You start with a function defined on , then you define the dilate to be the function that takes the value at and 0 at non-multiples of , and finally you take a linear combination of those dilates. If is identically 1, then this is taking a linear combination of HAPs.

We can’t take a Dirichlet product of two matrices, but we can do something similar: given functions and of two integer variables we can take the function . If and are diagonal (meaning that they are zero if ), then the terms of the sum are zero unless and , so in this case , considered as a function of only, is the Dirichlet convolution of and , also considered as functions of one variable.

This operation also respects products in the following sense. If and , then

,

which equals , where is the Dirichlet convolution of and and is the Dirichlet convolution of and .

One further obvious fact is that the operation is bilinear in the two matrices that it is applied to.

So far everything is working swimmingly: out of two decompositions of diagonal matrices into linear combinations of products we build another one that has some number-theoretic significance. However, there is a problem: a Dirichlet convolution of two HAP-functions is not an HAP-function except in trivial cases.

How much does that matter? Without doing some detailed calculations I don’t know. Here are a couple of simple observations. Let and be two HAP-functions. Then their Dirichlet convolution is a sum of dilates of , one for each point in the HAP of which is the characteristic function. So we can at least decompose the Dirichlet convolution of two HAP-functions as a sum of HAP-functions. The sum of the coefficients is the number of points in the HAP corresponding to — though since the situation is symmetrical, we can add dilates of instead, so it is better to say that the sum of the coefficients can be taken to be the length of the smaller of the two HAPs.

That doesn’t sound very efficient, but we must also remember that when we take the Dirichlet convolution of two diagonal matrices, we obtain a diagonal matrix that sums to the product of the sums of the original two matrices. So there may be some gain there to compensate for the loss of efficiency. That is why I say that more detailed calculations are necessary, a point I will return to at some stage.

**What can we say about the diagonal matrix ?**

As I have already mentioned, if a diagonal decomposition of the required kind exists, then we can place strong constraints on the diagonal matrix itself. Writing for , it must have the following property: if is any real-valued function defined on the integers such that , then the HAP-discrepancy of must be at least . (This follows from the proof that the existence of the decomposition for a diagonal matrix with implies that every -sequence has HAP-discrepancy at least .)

Turning that round, if we have any real-valued sequence of discrepancy at most , then cannot be more than .

A simple example of the kind of conclusion one can draw from that is that the sum of over all s that are not multiples of 3 is at most 1. This follows from the fact that the sequence has HAP-discrepancy 1. More generally, if is large, then for all small most of the largeness must occur on multiples of .

In qualitative terms, this suggests that ought to be concentrated at numbers with many factors, and the experimental evidence backs this up, though not quite as cleanly as one might ideally like. (However, “edge effects” could well account for some of the peculiar features that are observed experimentally, so I still think it is worth trying to find a very aesthetically satisfying diagonal matrix .)

Here is another reason to expect that should be larger for numbers with many factors: they are contained in lots of HAPs. We are trying to find a measure on (at least if we insist that takes non-negative values, which seems a reasonable extra condition to try to impose) such that any function with a large -norm with respect to that measure must have a large HAP-discrepancy. What might make it difficult to find a function with large -norm and small HAP-discrepancy? It would be that we had lots of constraints on the values taken by the functions. And the natural way that we can place lots of constraints on one value is if has many factors and is therefore contained in many HAPs.

As an example of the opposite phenomenon, suppose we take an arbitrary sequence and alter its values however we like at all primes. What will happen to its HAP-discrepancy? Well, each HAP contains at most one prime, so we cannot have changed the HAP-discrepancy by more than 2. This illustrates fairly dramatically how little it matters what a sequence does at numbers with few factors.

**A first wild guess.**

It is tempting to jump straight from this kind of observation to a guess of a function that might work. For example, what if we took some primes , a collection of subsets of and let be the characteristic measure of the set of all numbers that can be written as for some . That is, takes the value on all such numbers and zero on all other numbers.

Does a function like that have any chance of working? Because we have ensured that it sums to 1, we would now be looking for a decomposition with absolute values of coefficients summing to at most for a small constant . But before we even think about doing that, we should try to find a sequence that is large on many of the products of primes but that has bounded discrepancy. Only if we find it hard to do that is it worth looking for a decomposition.

Let me be more precise about what we are looking for. If we can find a decomposition of (the diagonal matrix with entries given by ) as where the and are HAP-functions and , then for every sequence , we have two ways of calculating . One of them gives us , while the other gives us . So if , then there must be some HAP on which has discrepancy at least .

Therefore, in order to show that this particular will not do, we need to find a sequence such that averages at least 1 on the numbers but has bounded discrepancy.

Let us make our task easier to think about by looking for a sequence that takes values on the numbers . Can we choose the values elsewhere in such a way that we cancel out any discrepancy that we might accidentally pick up with those values?

To answer this question, let us think about how HAPs intersect the set, which, since I’m mentioning it quite a bit, I’d better give a name to: I’ll write for the set of numbers with . Let me also write as shorthand for . Any HAP that intersects must have a common difference of the form for some set . The resulting HAP will contain every for which , which is rather pleasantly combinatorial.

Let’s suppose that we’ve chosen the values everywhere on . What we are trying to do now is choose values off (not necessarily ) in such a way that any HAP-discrepancy contributed by values in is cancelled out by a roughly opposite discrepancy in the values outside .

An obvious way to do that is to be fairly greedy about it. That is, we look at HAPs that have a substantial intersection with and simply choose a few further values on those HAPs, hoping that our choices won’t mess up too many discrepancies elsewhere. Let’s see if we can get something like that to work.

Consider, then, the (infinite) HAP with common difference . As already mentioned, this intersects in every such that . So as we trundle along the multiples , we find that the partial sums of the s that we encounter and have already chosen go up and down, and we would like to choose some further values to ensure that they remain bounded.

At which values of are we still at liberty to choose the value ? It must be a multiple of , and two ways of ensuring that that multiple does not lie in are (i) to make sure that is also a multiple of some with and (ii) to make sure that is a multiple of for some .

The question now is whether if we do that we will find that the value we choose has an effect on lots of other HAPs at the same time. And it looks rather as though it will. Suppose is quite a large set. Then any multiple that does not belong to is also a multiple of for every , so there is indeed a danger that by choosing a value of we are messing up quite a lot of HAPs and not just the one we were interested in. We could of course try to sort out the HAPs with large first and later correct any damage we had done to smaller sets — but a large set has a lot of subsets.

This is another situation where the devil is in the detail. Maybe there is enough flexibility that some kind of careful greedy approach would work — perhaps we could even choose to be 1 everywhere on — but so far it seems at least possible that there exists an efficient decomposition of a diagonal matrix with entries that are concentrated on square-free numbers with many prime factors.

It is sometimes convenient to describe arithmetic progressions of the form as HAPs. Let me do that now, and let me define two HAPs to be *adjacent* if they are disjoint and their union is also a HAP (in this generalized sense). In other words, two HAPs are adjacent if one is simply a continuation of the other.

If and are HAP-functions coming from adjacent HAPs and (in which case I’ll say that they are adjacent HAP-functions), then is 1 on the diagonal at all points in the HAP , while off the diagonal it is 1 on and latex P\times Q\cup Q\times P$. If we take lots of products of this form, we can hope to get lots of cancellation off the diagonal, while getting none at all on the diagonal.

Let’s try to pursue this idea in more detail. Suppose we have some kind of probability distribution on functions of this form. That is, we pick, according to some distribution, a random set and then a random pair of adjacent HAPs and . Now let and be their characteristic functions and let and be two positive integers. What is the expected value of ?

A necessary condition for this quantity to be non-zero is that should divide both and . So let us condition on this event. If we are clever about the way we choose our probability distribution, we should be able to organize it so that if you choose a random , then typically plenty of its subsets have a similar probability of being chosen. The reason that is potentially important is that if we fix a common difference and pick random pairs of adjacent HAPs and of some given length, then pairs that are close to the diagonal will tend to get positive values (because normally if one of them is in then so is the other, and similarly for ) while pairs that are a little further away tend to get negative values. But if we have many different common differences in operation then we can hope that some of these biases will cancel out: a difference that tends to result in a positive value at one scale might tend to result in a negative value at another scale.

While writing this I have just remembered a useful trick that we were well aware of during the previous attempt at EDP. It is straightforward to show that EDP is equivalent to the same question for the rationals. That is, one wishes to show that for every function defined on the positive rationals and every constant there is a HAP (the definition is obvious) on which the sum of the function has absolute value at least . The nice thing about this formulation is a much greater symmetry: for every rational , the map is an isomorphism from (considered as a group under addition) to itself. This makes it unnecessary to think about numbers with lots of factors. (It is not hard to get into this nice situation using just integers — for example, you can just multiply everything by for a very large and you can divide your numbers by whatever you like, within reason — but it is more natural to use rationals.)

The hope was that if we thought about functions defined on the rationals, then the rather peculiar properties of the diagonal matrix one wishes to decompose might become less peculiar. However, one would still need to think carefully about *ratios* of numbers.

I’m tempted to continue thinking aloud about these possibilities, but I’ll save that up for the comments.

**A second wild guess.**

I’m now going to ignore what I’ve just written about working with the rationals and go back to the question of trying to find a natural function that is biased towards positive integers with many factors. What I’d like to do is build a sequence of non-negative functions , each better than the last, in the hope of eventually finding one that has (or at least doesn’t obviously not have) the property that every sequence such that has large HAP-discrepancy. (To make this non-trivial, I also need .)

I’ll start with a function I know doesn’t work: let for every and 0 for every . That fails because the sequence has HAP-discrepancy 1 but (at least if is a multiple of 3). That doesn’t satisfy the condition I said, but it does if we multiply it by .

The problem there was that has a large sum on non-multiples of 3. To correct that and similar problems, it feels natural to give greater weight to numbers with more factors. But how? Well, let’s do it in a rather naive way and simply weight each number by how many factors it has (and therefore how many HAPs it belongs to). This at least has the advantage of being a natural number-theoretic function: it is (up to a constant) the Dirichlet convolution of the constant function 1 with itself, at least until you get beyond . That is, .

Roughly how big is ? This sum is the number of pairs of positive integers such that (since is the number of pairs of positive integers such that ). Counting the number of pairs with we can crudely estimate this as , which is roughly . On the other hand, the Dirichlet convolution with itself of the function that is up to and 0 thereafter sums to (since each pair of positive integers with contributes 1 to the sum). Since the two functions are equal up to , we find that the vast majority of the second function lies beyond the point where it equals .

But let’s not worry about that. On average, grows like , which is not too fast a rate of growth, so let’s simply take the function up to . We now want to prove that every sequence such that has HAP-discrepancy at least (for some that tends to infinity with ). Alternatively, we want to find a counterexample, which will induce us to try to improve the function.

An obvious example to try is the usual one, namely the sequence . To see whether this works, we need to estimate the sum of over all non-multiples of up to . That equals the number of pairs such that neither nor is a multiple of 3 and . And that is roughly times what you get without the divisibility condition. In other words, it is roughly . So to get to equal we need to multiply the sequence by , which gives it a HAP-discrepancy of , a slight improvement on the that we obtained with a constant function.

This doesn’t prove much — maybe there are better examples — but it suggests in a weak way that taking Dirichlet convolutions results in improved functions. To be clear what I mean by “improved”, I am now looking for a non-negative function defined on such that for large , or perhaps even all , if is any sequence such that , then has HAP-discrepancy at least . The bigger I can get to be, the better I consider the function to be.

If you take repeated Dirichlet convolutions of the constant function 1, then you get functions , where is the number of -tuples of positive integers such that . (The function is equal to .) Suppose is large, is very large, and is a sequence such that . Does it follow that has large HAP-discrepancy? More ambitiously, does it have HAP-discrepancy that grows exponentially with , provided that is large enough?

**Back to the rationals.**

The following idea is one that feels familiar — I think something similar has come up already. But there’s nothing like a break from a problem to make an old stale idea seem new and fresh, and sometimes the apparent staleness was an illusion. So here it is again.

I’ll begin by creating a function on the diagonal that seems to me to have a sporting chance of being efficiently decomposable into HAPs. Let (for generating set) be the set and let be the -fold multiplicative convolution of the characteristic function of with itself. That is, is the number of ways of writing as with and all the and being integers between 1 and .

Before I think about how one might go about decomposing the diagonal “matrix” (that is, a function defined on that is zero at unless ) into products of HAP-functions, I want to see whether it looks hard to find a function such that but is of bounded HAP-discrepancy.

To have any chance of that, we need to understand a bit about . We can get that understanding by representing every rational as a product of (possibly negative) powers of distinct primes. For the rationals where is non-zero, the primes are all at most , so we can think of as a function defined on a lattice of dimension (the number of primes less than or equal to ). Then is the -fold convolution of the characteristic function of regarded as a subset of this lattice. If is large enough, then this convolution will look like a large and spread-out Gaussian, which will have the potentially useful property that it is approximately invariant if you multiply (or divide) by an integer between 1 and .

If has that property, does it rule out some kind of variant of the example? A key feature of that example is that the function in question is zero on multiples of 3, but in there is no such thing as a multiple of 3 (or rather, everything is a multiple of 3 so the concept ceases to make sense).

But that argument is far from conclusive. How about defining a function as follows: given a rational in its lowest terms, let be 0 if either or is divisible by 3, and otherwise 1 if mod 3 and if mod 3?

Let us think what the values of are at , , etc. If is a multiple of 3, then all those values are 0 (since is not a multiple of 3). If is a multiple of but not , then except if is a multiple of . If and are congruent mod 3, then the values of go . If they are not congruent mod 3, then they go . That last argument worked even if , so we have covered all cases.

What is the density of rationals such that neither numerator nor denominator is a multiple of 3? I haven’t made the question precise, and that allows me to give two different answers. The first answer is that if we pick the numerator and denominator randomly, then the probability that neither is a multiple of 3 is . Also, the probability that they are coprime is , and the probability that they are both coprime and not multiples of 3 can easily be shown by similar methods to be at least an absolute constant. So it seems that our function is supported on a set of rationals of positive density.

But another way of thinking about it suggests that the support of the function has zero density: every rational can be written as a product of powers of primes, and the probability that the power of 3 that we choose is 0 tends to 0 as we choose more and more rationals.

The second answer is, I think, the correct one for us, though the question is a bit misleading. The -density of rationals for which the “3-coordinate” is zero does indeed tend to zero, so if we regard as our measure, rather than something more additive, then we do appear to have ruled out examples that are similar to the example.

**Can we use information about the original problem to help us with the dual problem?**

We know that for the integers there is not an efficient HAP-decomposition of a diagonal matrix that is 1 for the first terms and zero afterwards. The proof we have is that if such a decomposition existed, then the sequence would have to have large HAP-discrepancy, which it doesn’t. But can we give a “direct” proof? I’m not quite sure what I mean by that, except that it should not involve the use of a separating functional. Ideally what I’d like to do is use that example and others like it to work out some rather general constraint that would end up saying a precise version of, “If an efficiently decomposable matrix equals 1 on the diagonal, then it must have quite a lot of stuff off the diagonal,” and ideally say a fair amount about that “stuff”.

The hope would be to find a clear obstacle *on the dual side* to a direct approach to decomposing the identity matrix, and then to show that when we move to the measure on the rationals defined in the previous section, that obstacle goes away and allows us to do what we wished we could do over the integers. The thing that would make life easier would be the ability to divide anything by small integers.

Actually, maybe it isn’t so important to go for the dual side. Let’s just understand what the existing proof is telling us goes wrong when we try to find an efficient HAP-decomposition. For the decomposition to be efficient, the HAPs we use have to be quite long. And for that to be the case, they must contain roughly as many numbers congruent to 1 mod 3 as they do numbers congruent to 2 mod 3. (They might contain none of either.) Therefore, any HAP product has roughly as many points congruent to or mod 3 as points congruent to or mod 3. Let us call these *white* points and *black* points, respectively. By roughly I mean that these two numbers differ by at most a constant. In fact, I think they differ by at most 1. But no points on the diagonal are congruent to or to , so if a product of HAPs includes points on the diagonal that are congruent to or , then it must include at least more black points than white points off the diagonal.

So, roughly speaking, we can’t build up the non-multiples of 3 on the diagonal without building up a bias towards black points off the diagonal.

Does that matter? Can’t we *correct* that unwanted bias by adding or subtracting some other HAP products? No we can’t, unless we also cancel out much of the sum on the diagonal.

What rescues us if we deal with rationals instead (or more precisely a portion of the rationals weighted by the function )? Also, how easy is it to find an efficient decomposition so that the weight off the diagonal is comparable to the weight on it? I suspect the answer to the second question is that it is not easy, since we have used just the fact that the sequence has bounded discrepancy, but there are many other examples.

**A variant of the HAP problem.**

I stopped the previous section prematurely because I was feeling a bit stuck. Instead I would like to think about a modified problem that may well be easier. EDP asks for a high discrepancy inside some HAP. The set of HAPs has the nice symmetry property that if you dilate a HAP then you get another one. This is particularly good over the rationals, when you can also contract.

Now let me ask a slightly vague question and then attempt to make it precise. Can we prove EDP for a different “base collection” of sets? For EDP I’m regarding the base collection as the collection of all sets . If you look at all dilations of those, you get all HAPs. Since I’m having trouble proving EDP for HAPs, I’d like to try to prove it for a different base collection. Of course, the smaller the collection, and the more closely related it is to the set of HAPs, the better, but for now my priority is to be able to prove *something*.

I actually have a collection of sets in mind. Let us think of the rationals multiplicatively as an infinite-dimensional lattice, or rather the portion of that lattice that consists of points with only finitely many non-zero coordinates. That is, the point corresponds to the rational , the being integers. I’m not sure exactly what I want to do next, so let me be slightly less precise. If we look at the rationals multiplicatively, as we are doing here, then dilation becomes simply translation. What I’d really like is something like an orthonormal basis, but I’d like it to consist largely of functions that are translates of one another. So far that won’t work, but it looks a lot more hopeful if we allow some kind of “spreading out” (which would be the analogue of taking longer and longer intervals). The most natural type of spreading out is, I would suggest, to take repeated convolutions. So a possible question — not necessarily the exact one I’ll end up asking — is this. Suppose we start with the set that we considered earlier. Is it true that every function that has large norm relative to the measure defined by (multiplicatively) convolving times has a large inner product with some (multiplicative) translate of some iterated convolution of ?

This problem should be not that hard since it is purely multiplicative, so we can take logs and make it purely additive. If the answer turned out to be yes, then for any -function we would get a large inner product with some iterated convolution of . If we could nicely decompose that iterated convolution into HAPs, we would then be done, though that seems like quite a lot to ask.

Hmm … I’ve almost instantly run into trouble. It seems that my “multiplicative version” of EDP is just plain trivially false. Consider the function that is 1 for products of an even number of primes and -1 for products of odd numbers of primes. (This is the completely multiplicative function , with which we are already very familiar.) From a multiplicative point of view, where we think of as an infinite-dimensional lattice, this is putting 1 at “white points” and -1 at “black points”. If we now take any “cuboid”, by which I mean a set of the form for every , then the sum of on that cuboid is either 0, 1 or -1. I was about to say that I couldn’t see an elegant proof of this, but here’s a fairly quick one. If we remove two adjacent slices from the cuboid, we remove an equal number of white and black squares. “Removing two adjacent slices” means reducing by 2, provided . If some is even, then by repeating this operation for that we can remove all points, which shows that the numbers of white and black points were originally the same. If all are odd, then we end up reducing the cuboid to a set with just one point, in which case we get .

If we multiplicatively convolve any two cuboids, we obtain a function that can be expressed as a linear combination of translates of a cuboid, so its inner product with will also be small, as will those of all *its* translates. (What’s more, those small inner products will themselves alternate in sign as you multiply or divide by primes.) So will have small inner product with everything you build out of products of GPs using multiplicative convolutions.

It could be fruitful to think about why HAPs have a better chance of giving rise to discrepancy than the kinds of multiplicative sets I’ve just considered. In preparation for that, here is a question to which I know the answer. Let be a nested collection of subsets of such that for every and . Is it possible for there to be a function and a constant such that for every and every ? In the case that for every this is EDP. But for a general nested collection, it is easy to see that bounded discrepancy is possible for that collection and all its dilates. You just take a completely multiplicative function such as and choose in such a way that for every (and such that the are distinct and every integer is equal to for some ). Then the sum is always either -1 or 0, and multiplicativity ensures that the sum on dilates of the is always either 1, -1 or 0.

The partial sums of grow (it is believed) like , so if we choose our sets greedily — that is, by taking to be the smallest positive integer we have not yet chosen such that takes the value — then will contain the first integers, where is around (at most), and look a bit like a random selection of integers within around of for the rest of the set. In other words, will be pretty close to the set , so the system of sets will be pretty close to the set of all HAPs. This suggests that proving EDP is going to be really quite delicate.

In fact, it will be more delicate still, since we can take a better multiplicative function than . If we take the unique completely multiplicative function that takes to 1 if is congruent to 1 mod 3, to if is congruent to 2 mod 3, and 3 to -1, its partial sums grow logarithmically, so we can say something similar to the above but with around rather than around .

We can go slightly further with this function. Because it is 1 on the infinite arithmetic progression and on the infinite arithmetic progression , we can choose our sets to be intervals up to roughly plus logarithmically short (non-homogeneous) arithmetic progressions of common difference 3.

Since these sets are cooked up so as to ensure that the function has bounded discrepancy, they might seem not that interesting. But I think they are important, because if we want to find an efficient decomposition of a diagonal matrix into HAP products, we probably need to understand what it is that makes HAPs so much better than HAPs with highly structured tiny logarithmic perturbations at the end. Otherwise we are in danger of taking seriously arguments that would work just as well for statements that we know to be false.

As I write, it occurs to me that we might be able to prove a result that would be quite a bit easier than EDP but nevertheless interesting (and in particular interesting enough to be publishable, unless the answer turns out to be disappointingly easy): that there *exists* a permutation of such that the discrepancy of every sequence on the sets of the form is unbounded. We know it is not true for all permutations — even ones that in some sense don’t permute by very much — and we suspect that it is true for the identity permutation but find that very hard to prove. What about if is in some sense “fairly random”? For example, what if we chose a large and randomly permuted the first integers. That would give us sets and all their dilates. Could we prove that the discrepancy on that set system is at least ? What if we drop the condition that the union of the should be all of ? What if instead we ask that for some strictly increasing sequence ?

I rather like those questions, which makes this a good place to stop the post — for the length of which I apologize.

September 1, 2012 at 9:03 am |

“Problem. Show that if and are any two sequences of unit vectors in a Hilbert space, then for every constant there exist HAPs P and Q such that .”

This seems to fail if you choose two orthogonal unit vectors a and b and set the sequence a to be constantly a and the sequence b to be constantly b.

September 1, 2012 at 9:57 am

Whoops — I missed out the crucial condition that for every . I’ve now gone back and added it.

September 1, 2012 at 11:48 am

Ah, so they don’t have to be unit vectors?

September 1, 2012 at 12:03 pm

Indeed. I’ll change that too. (When I wrote the post I was lazy and just relied on my faulty memory rather than thinking about the statement or looking it up.)

September 1, 2012 at 10:30 am |

After a good night’s sleep I’m losing faith in the questions with which I finished the post. Let me give a rough reason that I think the answer to the last one (about increasing sequences) is yes, but for an uninteresting reason. What I say won’t quite be a proof, but it will be close. Recall first that for every prime of the form , there is a matrix with entries such that each row is the previous row translated by 1 mod , and such that the inner product of any two rows is -1. (The matrix is simple to define: if and only if is a quadratic residue. If you add an extra row and column filled with 1s, you get the Paley matrix of size , which has orthogonal rows and columns.) Inspired by that, let us take our sequence as follows: we let be a large prime of the form and let be , where is the th quadratic residue mod . If we dilate this sequence, then looked at along the powers of 2 what we are actually doing is translating it. If we also had some kind of wraparound (which we don’t), then the result would be a whole lot of translates that were all equal to the constant sequence plus contributions that were orthogonal to one another. In that case, we could simply expand any sequence in terms of that orthogonal basis and obtain a discrepancy of around . We could then string together a lot of sequences like that for a rapidly increasing sequence of primes .

Actually, I think this argument can be turned into a correct one if we simply repeat the sequence a couple of times so that a segment of it has the orthogonality properties I want, but I haven’t checked the details.

If all that is correct, then we need to impose extra conditions on the sequence to make it interesting if we can prove unbounded discrepancy.

September 1, 2012 at 10:42 am

As a PS to that last remark, if it is correct, it also kills off the permutations version, since we can pick a permutation in such a way that is any sequence we like, and then we can go back and “fill up” much later.

September 1, 2012 at 11:13 am

And a PPS that kills the question off completely. There was no need to use Paley matrices. All we have to do is make sure that for every and every subset there exists such that for every , is a term in the sequence if and only if . In other words, roughly speaking you make sure that the set of powers of 2 you choose contains a copy of every finite set. Then for every sequence, to show that it has discrepancy at least , pick so large that all subsets of have appeared by . Then we can dilate the sequence so that it either picks out exactly the 1s in the set or all the -1s. At least one of those will give us a discrepancy of .

The remark just above shows that this deals with the permutations version as well.

September 1, 2012 at 11:30 am |

The comment above and its subcomments dispose of the questions in the last paragraph. The following extra conditions may possibly rescue them. In the permutations case, one can insist that for every , and in the sequence case one can insist that .

In both cases, the conditions are there to rule out the appearance of long sequences of powers of 2 (or of any number for that matter). I hope they are a good combination of quite weak from a structural point of view, but strong enough to rule out all counterexamples that use the basic set of ideas that showed that the questions as originally formulated were uninteresting.

September 1, 2012 at 12:16 pm |

“The proof that the diagonal decomposition implies EDP actually proves a much stronger result. It gives us a function such that such that if is any sequence of real numbers (not necessarily -valued) with , then \epsilon has unbounded HAP-discrepancy.”

If the d(n)’s sum to infinity, there is a finite subset that sums to at least 1. By setting for all n in this set and 0 for all other n, we get a sequence with bounded discrepancy (since it has only finitely many non-zero terms) that satisfies . I don’t know if you meant the last formula to be ?

What is the simplest non-trivial (that is, ) example we know of a decomposition of a diagonal matrix?

September 1, 2012 at 3:21 pm

I seem to remember that we basically didn’t have a non-trivial example: I think finding one may be a rather hard problem (related to the fact that there are long sequences with small discrepancy even in the case and therefore quite possibly even longer ones in the two-vector-valued-sequences case). So if we want to obtain partial results, it might be better instead to try to find interesting decompositions of matrices that are “reasonably well concentrated on the diagonal”, whatever that might mean.

September 1, 2012 at 3:32 pm

I thought there was a computer search for such decompositions? Someone used linear programming to find possible values on the diagonal. Didn’t we have a decomposition of the corresponding diagonal matrix?

September 1, 2012 at 3:42 pm

I think so, but in order to be able to regard the constant as non-trivial we had to be generous in our definition of HAP — allowing sets of the form . I think if you insist on sets of the form then we didn’t find any examples with . That’s a problem for any approach that might hope to use some kind of product construction, since one would like the examples to get better rather than worse when you take products!

September 1, 2012 at 4:08 pm

Ah. So do we have simple non-trivial examples using the generalized HAPs? Or just using any sets? I can’t think of any examples.

September 1, 2012 at 4:39 pm

My memory is that even with generalized HAPs we basically didn’t have any examples. But I’ll wait to see whether anyone else remembers the situation better. (In theory one could look it up I suppose.)

September 1, 2012 at 2:39 pm |

cannot be supported on the square-free integers as +,-,-,0,+,+,-,0,+,-,-,0,… have discrepancy 1 and is only 0 on square-divisible numbers (on multiples of 4, actually). In general, as you also noted, the sum must be small for small d (it is easy to see that it must be bounded by d-1 using the sequence where every d’th term is 0, but as I recall from last time we worked on EDP, we conjectured (or proved?) it to be ).

On the other hand, we know that if is a sequence with for all i, then , so we cannot choose to be supported on e.g. the factorials.

September 1, 2012 at 4:30 pm |

I’ve remembered why the strengthening of EDP that says that the common difference of the HAP can be taken as either a prime or a power of 2 had a chance of being true. (But it seems a bit outrageous, so I would stop short of conjecturing that it

istrue.) If you want to have bounded discrepancy on all HAPs with common difference an odd prime, then that’s easy. For example, the sequence 1, -1, 1, -1, … will do the job. More generally, so will lots of periodic sequences of even length: if the period is then you need to ensure that for every prime that divides the number of residues divisible by taking the value 1 is equal to the number taking the value -1. An example with is repeated over and over again.But no periodic sequence can have bounded discrepancy on all HAPs with common difference a power of 2. Indeed, if the period is and is the largest power of 2 that divides , then the restriction of the sequence to the HAP of common difference is periodic with odd period, so has unbounded sum.

The Morse sequence is the natural example of a sequence with bounded discrepancy on all HAPs with common difference a power of 2. But the Morse sequence has unbounded discrepancy on other HAPs. (I can’t remember whether we checked this for primes, but I think we did.)

The fact that the strengthening isn’t obviously false suggests the following question: what sorts of matrices can we make out of products of HAPs of prime common difference? I’ll stop this comment and think about that question in a new comment.

September 1, 2012 at 5:19 pm |

I now want to think about what we can do if we take combinations of HAP products where the HAPs have prime common differences. I’ll start with a rather simple case, where we choose weights for each odd prime and take the sum , where is 1 on all multiples of and 0 everywhere else. The value of this sum at the point is . Can we choose the weights in a way that makes this sum “nice”?

One set of weights that springs to mind straight away is to take . In fact, I’d rather take all HAPs with prime power difference and take . This is using the von Mangoldt function (defined to be at and 0 at all integers that are not prime powers). That is, we are taking the sum , which is easily checked to equal .

The level sets of this function are sets of the form if you’ll excuse the confusing notation (the first is an ordered pair and the second the hcf of and ). They are of some moderate interest in that pairs on the diagonal tend to have much bigger higher common factors than pairs off the diagonal.

The matrix we have obtained is not diagonal, but we can still ask how efficient the decomposition is. The total mass on the diagonal up to is roughly . The number of powers of up to is roughly , but … hmm … that’s killed by the term. Up to a constant I think this is , which is around . (I’m not certain that was correct.) As for the sum of the coefficients, the sum over all powers of is roughly , so the total sum is roughly . So it looks as though we would have an efficient decomposition if only we could pretend that the off-diagonal terms didn’t exist.

What can we learn from a decomposition of a non-diagonal matrix into HAP products? If , then we can still look at in two different ways. The fact that the sum of the coefficients above was at most tells us that if is at least , then the sequence must have HAP discrepancy at least . Recalling that , that tells us that for a sequence to have bounded discrepancy on all HAPs with prime-power common difference, the quantity must be bounded.

I’ll stop this comment here, because I have no idea whether that is a difficult condition for a sequence to satisfy.

September 1, 2012 at 5:40 pm

A quick remark about that last question — one could consider attacking it using Moses Charikar’s idea. That is, one would like to try to subtract some mass from the diagonal and still have a matrix that is non-negative definite. Can that be done?

September 3, 2012 at 7:45 pm

Just the obvious remark: since a diagonally dominant matrix is PSD, one question to ask is how much can we subtract from the diagonal while the remaining matrix is still diagonally dominant.

So instead of a decomposition of a diagonal matrix, it is enough to look for a decomposition of a strongly diagonally dominant matrix, i.e. a decomposition of a symmetric matrix where each diagonal entry satisfies for . The goal is to maximise with respect to the coefficients in the decomposition.

September 2, 2012 at 10:49 am |

It occurs to me that a useful thing we might do is describe all matrices that can be efficiently decomposed into products of HAPs of common difference 1. If we understood that properly, then for the primes/powers-of-2 question, we would be in a position where we needed to choose just one matrix (of a kind we understood) for each possible common difference. There would be a similar benefit for EDP itself, though possibly one that would be harder to use.

Let me be clear what I mean by an “efficient decomposition” in this context. I mean a decomposition such that is significantly less than . That is, the sum of the absolute values of the coefficients of the products of HAP-functions is significantly less than the sum along the diagonal of the resulting matrix. If and are both the characteristic function of , then the contribution to the diagonal is , so a rough way of thinking about efficient decompositions is to say that we mostly use HAPs of length significantly greater than 1, and the sum of the positive contributions is not cancelled out by the sum of the negative contributions.

If all the HAPs have common difference 1, then one thing we know about the resulting matrix is that its inner product will be small with any matrix of the form , where is a sequence that has bounded partial sums. Note that this very strongly rules out being diagonal. However, it doesn’t rule out living in a band around the diagonal: indeed, if we take a sum of the form where each is an interval of length , then the resulting matrix will be efficiently decomposed and will live in a band of width around the diagonal.

As an exercise, can we characterize infinite sequences that have inner product at most with all sequences that have partial sums bounded above (in absolute value) by 1? It seems to say something about the amount of oscillation that’s allowed. Indeed, a necessary condition is that for any the sum should be at most . In fact, what it seems to boil down to is that the sum should be bounded.

The condition for matrices rather than sequences will probably be a bit more complicated, but I think it should be possible to work it out.

September 2, 2012 at 11:38 am |

Come to think of it, maybe the condition for matrices is the obvious generalization of the condition for sequences. Could it be that a matrix can be efficiently decomposed as a linear combination of products of intervals (of integers) if and only if the sum of all is small? If is itself a product of intervals, then that sum evaluates to 4 (a contribution of 1 coming from each corner of the rectangle), so if is a linear combination of such products then the sum is at most 4 times the sum of the absolute values of the coefficients. So the condition is necessary. But it remains to see whether it is sufficient: can every matrix for which that sum is bounded be decomposed efficiently into products of intervals?

September 2, 2012 at 1:00 pm |

To try to tackle that question, let me do the 1D case first, since I haven’t actually done so. If we have an arbitrary sequence, we can express it as a sum of initial segments of using something like partial summation. If the sequence is , and is the set (or rather its characteristic function), then the sequence can be written as .

In fact, that isn’t even remotely clever: we just observe that the sequence that is 1 in the th place and 0 everywhere else can be written as and see what we get.

Can we do something similar in 2D? Let’s write for the set of all such that and . Then takes the value 1 at and 0 everywhere else.

If we now take a function in two positive integer variables, then the sum we end up looking at is , which works out as . In particular, if the sum is at most , then there is a decomposition of into products of intervals, with sums of absolute values of coefficients at most .

September 2, 2012 at 3:13 pm |

I was aware while writing that that I had done precisely the same calculation when we last worked on EDP. Indeed, I feel myself being drawn inexorably towards an approach we had to the problem back then, which is worrying, because that approach seemed pretty hard to make work.

The observation back then was that if we work in the rationals, then we can take any decomposition of a diagonal map that we might have and average it with a whole lot of dilates of the same decomposition. In that way, we can obtain a decomposition of a diagonal matrix that is approximately invariant under multiplication or division by small positive integers. If we look “locally”, then we can pretend that we have a decomposition of the identity over the rationals (that is, the function defined on that’s 1 when and 0 otherwise).

What this is effectively saying (I think) is that we should be able to take a single matrix made out of products of intervals and then add up all its dilates. As long as is absolutely summable, this will give us a well-defined infinite matrix indexed by . Its value at will be . So what we’d like is a matrix with the two-dimensional bounded-total-oscillation property from the last comment that sums to zero along all “rays” in . This feels like a much easier thing to look for than a general decomposition of a diagonal matrix into products of HAP-functions, but is it? I’ll think about that in a new comment.

September 2, 2012 at 3:58 pm |

I need to correct what I said above. To work out the value at we first write as with and coprime positive integers, and then the value at is . So that has to be zero whenever and are coprime and not both equal to 1.

This won’t work, but I’d nevertheless like to think about a function of the form for some complex . That gives us , where is the Riemann zeta function and . That is completely hopeless, since whether or not it is zero appears to have no significant dependence on the pair , and in any case .

What was the thought that led to this hopeless idea? It was roughly this. If you take a function of the form for, say, an irrational , then as long as , the average along all multiples of will be 0. Meanwhile, if , then the average is 1. However, this function fails to do the job we want, because its total 2D oscillation is infinite.

What I’d like is something similar but with a matrix whose entries are absolutely summable. To prove that the sum along non-trivial rays was zero, I would ideally like something like a multiplicativity property that would allow me to use essentially the same argument for all rays — apart, that is, from the main diagonal where we wouldn’t get zero because we wouldn’t have the oscillation.

There may be a good number-theoretic reason for this being impossible, but suppose we had a completely multiplicative -valued function such that the Dirichlet series had a zero at some real number . What could we say about the function ? If we summed it along the diagonal, we’d get , which would be positive. But if we summed it along multiples of for some coprime pair, we would get . This is again hopeless, since again it doesn’t in fact distinguish between points on the diagonal and points off it.

September 2, 2012 at 5:01 pm |

On further reflection, it is silly to look at functions of the form with completely multiplicative, since the sum along multiples of will be which will not pick out the diagonal.

What if is not multiplicative? Is it too simple to look for a rank-1 function , say? Then the sum over multiples of is . For that to work, we would need the sequences to be orthogonal to each other whenever , and we would also want the function to have bounded total oscillation and to tend to zero. That seems to be equivalent to being absolutely summable.

Is there an absolutely summable sequence defined on the positive integers such that for any pair of coprime positive integers and the sum is zero? It seems like a lot to ask, and I think it might be reasonably easy to prove that no such sequence exists.

September 3, 2012 at 8:31 am

Realized later: having bounded total oscillation and tending to zero is

notequivalent to being absolutely summable (which is good news for a number of reasons). For example, if has bounded total oscillation and tends to zero and is obtained by taking the first terms to equal , the next terms to equal , and so on, then has bounded total oscillation and tends to zero. Also, any decreasing sequence that tends to zero trivially has bounded total oscillation.I think what was in the back of my mind was that if you want to choose moduli such that choosing even quite random-looking signs results in bounded total oscillation, then you’d better make the moduli have a finite sum.

September 2, 2012 at 5:52 pm |

Although the condition on the sequence is pretty strong, it might be worth at least attempting to obtain it with the help of a greedy algorithm or just-do-it argument. For example, perhaps one could fix some and take to be of the form for some bounded (maybe even -valued) sequence . If was only just bigger than 1, then it is just conceivable that one could keep track of all the sums that are required to be zero, and ensure that they are all tending to zero.

September 3, 2012 at 12:16 am |

How easy is it for a non-trivial sequence to be absolutely summable and to have the property that for every ? For that it would be enough to find a completely multiplicative -valued sequence and a real number such that . And by the intermediate value theorem, it would be enough to find such that that sum is negative, since it will tend to 1 as .

But let’s try a greedyish algorithm. I’ll start with . To compensate for that I’ll take . Now I’m fine for but not so fine for . To compensate for I could take , but now I’ve messed up , as well as again. That’s more or less inevitable, but if I keep going like this I’m not going to get an absolutely summable sequence.

So to make things a bit more hopeful, let me start with and then take . The advantage of this is that I can correct the and problems by taking . However, let me learn from my earlier mistakes and take . Now I have problems with (so far the sum is ), (so far the sum is ) and (so far the sum is ). As for the sum of the absolute values, I’ve got up to 1+1/2+1/2+1/4+1/4. I must stop for today, but I have the feeling that pursuing a construction like this I can get an example. Of course, it’s just a 1D example whereas I’m looking for something 2D-ish. But it might be a useful start.

September 3, 2012 at 9:03 am

I now realize that I don’t need absolute summability — see reply to previous comment but one.

September 3, 2012 at 8:57 am |

Another little thought. Suppose that is a zero of the Riemann zeta function. What can we say about the total oscillation of the sequence ? If , then . The sequence spirals in towards zero. Speaking roughly, there will be some number such that between successive powers of the sequence has done one rotation. So the total oscillation is going to be something like . In particular, it is finite.

And now, if we sum along multiples of we get .

September 3, 2012 at 9:07 am

Hmm … what I wrote there was a bit fanciful, since the Dirichlet series doesn’t actually have a convergent sum.

September 3, 2012 at 9:17 am |

If a series sums to zero and has bounded total oscillation, must it converge absolutely? I don’t think so. Let’s take the series , where and changes sign at . This series doesn’t converge absolutely. The total oscillation is, up to an additive constant, . To ensure that the sums tend to zero as , we need for some sequence that tends to zero. That is, the must grow more slowly than exponentially. But there is no problem in doing that

andgetting the sums of their reciprocals to be bounded. For instance, we can take . That gives us a series that converges. Adding an extra term on to the beginning, we can get the sum to be zero.September 3, 2012 at 3:21 pm |

I’ve just remembered something from the last time we tried EDP, which suggests that one has to be a little bit careful with this approach. It’s that it is not hard to give a non-trivial example of a linear combination of products of intervals, with absolutely summable coefficients, such that whenever and and are not both 1. The rough idea is this. First you take the interval for some large and multiply it by itself. You now have 1s on the diagonal and 1s off the diagonal. Now pick a pair of coprime integers that are at most . Suppose that the sum along multiples of is . We can make it zero by picking very long intervals and in such a way that contains multiples of , and attaching a coefficient of to this product. Moreover, we can make as large as we want, so we can make the coefficient as small as we want.

Once we have made this correction, we can go on and correct other pairs. Moreover, once we have ensured that the sum along multiples of is zero, we can be careful to avoid those multiples in all future products of intervals that we use — just by making those intervals start a very long way away from 0.

So in a rather trivial way we can find an infinite matrix that’s a linear combination of HAP products with long HAPs and coefficients whose absolute values sum to at most 1, and we can do so in such a way that the sum along the diagonal is at least and the sum along multiples of is 0 whenever and are coprime and not both 1.

I remember getting very excited about this, and then realizing that it doesn’t prove anything. I need to check why it doesn’t imply EDP, since that will make it necessary to impose an extra condition on the matrix .

September 3, 2012 at 4:18 pm |

OK, suppose we have an infinite matrix , indexed by the positive integers, with the following properties.

(i) can be written as , where and are characteristic functions of intervals of integers and .

(ii) .

(iii) If and are coprime and not both equal to 1, then .

Let me try to deduce EDP from the existence of . Barring a major surprise, I will fail, but I want to understand why.

Incidentally, there are two reasons that it would be extraordinary if the proof worked. The first is that I realized last time that it didn’t work. But that on its own is not a good enough reason — it’s easy to make mistakes when deciding that something can’t work. A better reason is that there is something too simple about this approach. A solution to EDP ought to involve more work, and I can’t escape the feeling that if an approach like this worked, then it would prove a number of other statements that we know to be false. (However, I don’t have a way of making that hunch precise.)

The first thing I’m going to do is create a matrix out of , this one indexed by the positive rationals. To do this, for each rational define by the formula , and then let . If is the largest rational that goes into both and , then for a pair and of coprime integers. Then , which is non-zero only if is an integer (since both and are integers and is an integer combination of them). The values of for which is an integer are , where takes the values . These sum to zero unless , by property (iii) of , while if , so , they sum to a constant greater than .

What we have here is a way of expressing the identity matrix indexed by as a linear combination of HAP products with coefficients that are in some sense “small”. The sense in which they are small is that each matrix contributes more to the diagonal than it does to the sum of the absolute values of the coefficients, by a factor of at least .

The hope was to use some kind of limiting argument to show that this clean and tidy infinite situation could be approximated by a less clean and tidy large finite situation. It was there that the trouble arose. I’ll start a new comment to think about it.

September 3, 2012 at 4:51 pm |

Let’s suppose I’ve got a -valued function defined on . Ah, here’s where things got a bit hairy. As usual, let’s calculate . On the one hand, since is just times the infinite identity matrix (indexed over ) we get . On the other hand, if has discrepancy at most on every HAP, then the inner product is at most for any two HAP functions and , so the contribution to from each is at most . That is, (by property (i) of the matrix ).

So, speaking

extremelyloosely, on the one hand we get a contribution of for every positive rational, since the diagonal contribution is , and on the other hand we get a contribution of at most for every positive rational, since the sum is .It would be great if this were some kind of contradiction when , but clearly for that we need to go into a lot more detail about how things tend to infinity. I’ll turn to that in a new comment.

September 3, 2012 at 5:14 pm |

One easy way to make everything finite is to take for not a function that is -valued everywhere, but rather a function that is -valued on some very large finite set of positive rationals. To give this any chance of working, we’d want the set to be “approximately closed” under multiplication by rationals with small numerator and denominator.

We know how to define such sets, but how big do we want “small” to be in this context? Here is where it gets problematic. One might think that it would be a good idea if numerators and denominators were allowed to go up to some such that in some useful sense “most of ” is defined on . But what does “most of” mean here? It can’t refer to the sums of the absolute values of the coefficients , since we can make all but the first one arbitrarily small.

At this point one notices that even if the coefficients of the HAP products are small, the HAP products themselves are large in some pretty natural norms. For example, the Hilbert-Schmidt norm of , if , is 1, and we were in a situation where the length of the HAPs we needed to take was proportional to the reciprocal of the coefficient we attached to the product.

September 3, 2012 at 5:28 pm |

I don’t want to throw away these ideas completely, however. Let’s make sure everything is finite by looking at a very large positive integer , and let’s go back to EDP where the HAPs have to have common differences that are either primes or powers of 2.

I’ll say that a matrix has

total oscillation at mostif . I’ll also call a matrix adilateof a matrix if and is zero unless both and are multiples of .I would like to find a matrix for each allowable common difference (that is, a prime or a power of 2) such that

(i) each is an matrix, where ;

(ii) the sum of the total oscillations of the is at most ;

(iii) if is the dilate of by a factor , then is a diagonal matrix with trace at least .

Now this looks really pretty hard if almost all the have to be prime, so I would like to try to argue that it cannot be done. If it can’t, then we might manage to find a very interesting sequence with bounded discrepancy (for one of the modified problems). If it can, then we’re in great shape for EDP.

September 3, 2012 at 7:04 pm |

Here’s the intuitive, but very possibly wrong, reason that it seems to me not to be possible to find a set of matrices of the above form. If each has low total oscillation, then there ought to be a fairly large rectangle inside which all the are roughly constant (or rather, roughly constant when both coordinates are multiples of and zero elsewhere).

Now let’s just look at primes and let’s suppose that the are actually constant matrices. Then the value of the sum of the at can be expressed by a formula of the form . But that expression just doesn’t look as though it can be anything like constant: for instance, if we restrict attention to small primes, then in any large rectangle we can find pairs and such that the highest common factor is a product of all small primes for which is large and positive, and other pairs that are coprime.

The weaknesses in that argument are that larger primes do exist, and they make things more complicated, and also that, as I noted earlier, one can impose quite a lot of oscillation on, say, the sequence , while still having a bounded total oscillation. So the large rectangle doesn’t obviously exist.

September 3, 2012 at 10:21 pm |

I’ve just realized something else, which makes the whole discussion more complicated. I was implicitly assuming that we would try to decompose a diagonal matrix as a linear combination of products of HAPs with the same common difference. But if we do that, then for every prime there are lots of points that are contained only in products where the common difference is , since a positive proportion of all points have highest common factor .

I think this means that for the primes and powers-of-two problem one basically has to take products of HAPs with different common differences.

September 4, 2012 at 10:53 am |

I want to test that last claim more carefully. Suppose that we cannot represent any diagonal matrix of trace at least as a linear combination of products of HAP functions where the two HAPs in each product have the same common difference that belongs to a set and the sum of the absolute values of the coefficients is at most 1. The set of diagonal matrices of trace at least is convex, as is the set of all linear combinations of products of HAP functions with the properties just specified, so there must be a separating functional , which we can regard as another matrix. If for every diagonal matrix of trace at least , then every diagonal entry of must be at least . But also, no two diagonal entries can be distinct, because then we could add something to one diagonal entry of and subtract the same amount from another, in such a way as to make the inner product with whatever we like. So the diagonal entries of must all equal some constant, and that constant must be at least .

Now let’s think about what it says about if for every linear combination of products of HAP functions, where the two HAPs in each product have the same common difference that belongs to a set and the sum of the absolute values of the coefficients is at most 1. That will be the case if and only if it is the case for each individual product. So we are asking for whenever and are HAPs that both have common difference , where is required to belong to .

This is a kind of 2D version of EDP. If we scale things up, then it asks the following. Do there exist a constant and a function such that

(i) for every , ;

(ii) for every and every pair of HAPs of common difference , the sum has absolute value at most ?

In particular, do these exist if is the set of all primes or powers of 2? If so, then for that problem we are forced to consider products of HAPs of different common differences.

Maybe this is an interesting problem to add to our collection of EDP-related questions, but I think there may turn out to be a fairly simple example that has the given properties.

September 4, 2012 at 12:44 pm |

Hmm, there is indeed a reasonably simple example. Start with a periodic doubly infinite sequence of period 9 with the following properties.

(i) .

(ii) .

(iii) .

An example is the sequence where are . The discrepancy of this sequence along any HAP of common difference that is not a multiple of 9 is at most 1. Now let us define to be . Then we certainly have property (i) from the previous comment. How about property (ii)? I’m fairly sure we get it with at most something like 6, for all that are not multiples of 9, that is.

Here’s a quick proof. For every and the sums and are both zero. But the sum over can be decomposed into a sum of these zero sums plus the sum over a product of HAPs of length at most 8. So the discrepancy is bounded (by a constant I can’t be bothered to work out — it’s trivially at most 64 but that can be improved quite a lot; in fact, it’s trivially at most 16; in fact, it’s trivially at most 8, but even that isn’t best possible).

This shows that we can’t use products with the same common difference if consists only of non-multiples of 9. I have a strong suspicion that we can do the same for non-multiples of for an arbitrary . All we need is numbers such that and such that for every that divides apart from itself we have . It is trivially possible to satisfy those equations. Just choose the numbers arbitrarily except that each time you reach for some proper factor of make sure that it cancels out the sum .

This shows that if we want to decompose a diagonal matrix efficiently into products of HAPs of the same common difference, and if we insist that the differences all belong to some set , then must contain multiples of every positive integer (or every small positive integer if we are thinking more quantitatively).

But we’re still left with a 2D question related to EDP that might turn out to be interesting. Let be an arbitrary function such that for every . Is it true that for every there must exist HAPs and with the same common difference such that ?

If is forced to be of the form , then this question reduces to EDP. Since we are looking at a far more general class of functions, and since EDP appears to be “only just true”, it seems highly likely that the answer to the above question is no. I just haven’t seen it yet.

September 4, 2012 at 1:04 pm |

Suppose we restrict attention in the problem just discussed to functions of the form . Let and . Then . We can write this as . This equals , where is the convolution of the characteristic functions of the intervals and .

This gives us quite a strange EDP-like problem. We have a class of functions ( being the amount by which we dilate ) and we’d like to find a function , defined on , that has a bounded inner product with every function in . Of course, something has to force to be “large” (so we can’t just take the zero function). At first, the condition appears to be ludicrously weak — all we ask is that should equal 1 — but it isn’t weak because the functions are not bounded at zero.

As ever, we can dualize. It will be impossible to find such a function if one can write the function (which takes the value 1 at 0 and 0 everywhere else) as a linear combination of the functions in such a way that the absolute values of the coefficients can have arbitrarily small sum.

Another way of looking at the dual problem is that we try to find an efficient matrix decomposition into HAP products where the common differences in each product are the same, but instead of asking for the resulting linear combination to be a diagonal matrix with large trace, we ask merely that the matrix has large trace and that all diagonals apart from the main diagonal sum to zero.

September 4, 2012 at 2:58 pm |

Actually, I don’t see that taking HAPs with different common differences helps all that much. Suppose we have our numbers extended periodically as in the previous comment but one and, as then, we define to be . If and are not multiples of then for every we have and . Therefore, if and are HAPs with common differences and , we can subtract from a whole lot of zero contributions until we’re left with the sum over a product of HAPs of length less than . So the discrepancy is bounded.

If that is correct, then we can set and prove that there is no efficient decomposition of the identity into products of HAPs of prime-power length. That ought to prove that some generalization of EDP is false for such HAPs. I’ll try to work out the details — I’m not quite sure that this is correct.

September 4, 2012 at 4:03 pm |

Looking back, I see that the proof that the nonexistence of a HAP-product decomposition implies the non-symmetric vector-valued EDP is very simple but also slightly strange. You take a separating functional as above, thought of as a matrix with 1s down the diagonal. You then take your two sequences of vectors to be the rows of the matrix and the unit vector basis. Then we have that for every and that is equal to the sum of the matrix values over . What’s odd about this is that the norms of the tend to infinity, so we can’t use compactness. So it looks as though we can find, for any , sequences of vectors and such that for every and has modulus at most (say) for every pair of HAPs and with common difference not a multiple of 6. But it doesn’t seem to follow obviously from that that we can find infinite sequences with the same property.

September 4, 2012 at 5:47 pm |

Maybe that problem doesn’t exist actually. Let be the standard basis of . Suppose we define to be and to be . (All indices are mod 6.) Then for every , exactly as we want. In other words, the periodicity allows us to collapse everything down to six dimensions.

So I’m pretty sure that is a very simple example that shows that the non-symmetric vector-valued EDP relies strongly on taking HAPs of

everycommon difference. Or at least, you have to have multiples of every positive integer amongst your common differences.That doesn’t answer the question about whether EDP is true for common differences that are primes or powers of 2, but it says that a 6D generalization is false.

It is genuinely unclear to me whether the 6D counterexample should lead one to believe that there is a 1D counterexample. In one dimension there is a parity argument (in any sequence of odd length the number of 1s cannot equal the number of -1s) that has no counterpart in higher dimensions. Maybe that is just a superficial fact that stops a certain example from working. But maybe all examples have to be “essentially periodic” in some sense, in which case there is a genuine distinction between 1D and higher dimensions.

Perhaps we should try to find a complex example …

September 4, 2012 at 10:19 pm |

Hmm … suppose you take . Then if is not a multiple of 6, then the discrepancy along HAPs (or even APs) of common difference is at most 2. So the primes and powers-of-2 version of EDP does indeed fail in the complex case (as does the prime-powers version).

What I think this (embarrassingly simple) observation shows is that if we want to prove EDP for primes and powers of 2, then we are forced to use in a strong way the fact that the sequences are -valued: we can’t use matrix decompositions to prove the result, since they prove more general results that are false.

And what

thatshows — at least if you believe that matrix decompositions are the most promising approach — is that we are not likely to make EDP easier if we try to prove a stronger result by restricting the common differences that are allowed. It seems to be important to useallcommon differences.September 4, 2012 at 11:03 pm |

Backtracking a bit, what about the earlier rather big strengthening of EDP that I was wondering about, if we don’t restrict the possible common differences? The question was this. Suppose you have a constant and a function such that for every . Must there exist HAPs and with the same common difference (but if that’s too much to ask, then I’m happy to drop that condition) such that ?

On writing that, I have suddenly had an idea for a counterexample for the same-common-difference version. I’ll describe it very roughly and think about it properly tomorrow. The rough idea is to paint with diagonal stripes of constant width and alternating colours (the colours being 1 and -1), but at an angle whose tangent is a badly approximable rational. The hope would be that if and have the same common difference, then …

No, I don’t think that works. Given any two real numbers and we can find an integer such that and are both very close to an integer, and I think that fact can be used to kill off the idea I had.

Actually, I think a counterexample to this can be turned into a counterexample to the non-symmetric vector-valued EDP with the added condition that and have the same common difference. So maybe I believe that the answer is yes (but possibly without the same-common-difference condition). It also feels like a friendlier formulation of the problem, and one where it might be interesting to do some computer searches.

Just in case anyone’s reading this (it feels a bit monomathic, so maybe nobody is), here’s a search that would interest me a lot. How large a matrix, or even just a matrix with 1s on the diagonal, is it possible to find such that the sum over every product of HAPs is at most … well, whatever small number is most interesting? Are there very big matrices where it’s at most 2? What about 3?

September 5, 2012 at 6:00 am

Seems like a good problem to try with linear programming. I hope to find the time to try it.

September 5, 2012 at 8:05 am

That would be great. On further reflection, I think the version is not so interesting (not that you could attack that with linear programming anyway) — it seems to me that a counterexample to EDP could probably be turned into a counterexample to the version. So I prefer to stick with the condition that the matrix is 1 on the diagonal.

September 5, 2012 at 8:30 am

Do you mean the other way around: a counterexample (= low discrepancy function) to the version can be turned into a counterexample to EDP? Because the direction you wrote seems clear: just take where is the low discrepancy function.

In fact, you constructed this problem as a weakening of vector-valued EDP. In the vector valued EDP case the function would need to be PSD in addition to for all . I am not totally clear at what step the weakening came in. We took the dual of vector-value EDP, and then restricted it more by looking for a decomposition of a diagonal matrix as opposed to a decomposition of a PSD matrix. Taking the dual of a dual with additional constraints, resulted in this weaker discrepancy problem, I suppose?

One benefit is that this can be tried with linear programming and hopefully scale a little further than the semidefinite programming search.

September 5, 2012 at 3:56 pm

Let me try to sort out what I meant, or at least should have meant. First, if you have a -function on of discrepancy at most (with respect to products of HAPs of the same common difference), then for arbitrarily large you can find a -function defined on of discrepancy at most . That is because if you look at the th row of the 2D function, then any HAP of common difference at most along that row can be expressed as the difference between two products of HAPs of that same common difference.

Conversely, if you have a low-discrepancy -function on then, as you say, its product with itself obviously has at most the square of that discrepancy with respect to products of HAPs.

Looking back at your question, it seems that I did indeed mean what you thought I meant, which is not what I actually wrote.

The one thing that I still haven’t quite seen the answer to is what happens if you look at products of HAPs of the same common difference and the same length, or perhaps even products of HAPs with themselves. I’m not sure how interesting that is, though.