## EDP13 — quick summary

I don’t have any major new themes to talk about in this post. We are continuing to look for ways of decomposing matrices that would give rise to proofs of strengthenings of EDP, but I would say that at the moment we are still at the stage of trying to get a feel for the problem. (If you are reading this and wondering what the decomposition problems are that I am talking about, then I suggest having a look at EDP10 and EDP12.)

The difficulty we face seems to be this. (I do not have full confidence in what I am about to write. It could be that I will change my mind, and it could be that others already have a different and better analysis.) I’ll take for the purposes of illustration the problem of writing a diagonal matrix with unbounded trace as a sum $\sum_i\lambda_iP_i\otimes Q_i$ where the $P_i$ and $Q_i$ are HAPs and $\sum_i|\lambda_i|=1$. After trying a few things, I have started to feel as though it isn’t going to be easy to find a completely explicit construction for this. That is partly because no pattern has jumped out at me from the output from Moses’s experiments, though I have not looked as hard as I might. So the best I can think of to do at the moment is something like this (but almost certainly not precisely this).

1. By some kind of wild guess, choose some HAPs $P_i$ and some positive constants $\lambda_i$.

2. Form the matrix $\sum_i\lambda_iP_i\otimes P_i$, which has trace $\sum_i\lambda_i|P_i|$.

3. Make those choices in such a way that $\sum_i\lambda_i|P_i|$ is significantly larger than $\sum_i\lambda_i$.

4. Prove that it is possible to cancel out the off-diagonal part of this matrix and at most half the diagonal part by subtracting a sum of the form $\sum_j\mu_j Q_j\otimes R_j$, with $\sum_j\mu_j$ also significantly smaller than $\sum_i\lambda_i|P_i|$.

The problem with this is twofold. I don’t know how to make the wild guess in the first place (though I suppose we should be trying to favour numbers with many factors) and I don’t see how to do the subtracting-off part. In other words, I don’t see how to make any of it work. And we also know that nothing too neat can work, or we’d end up proving things that we know to be false.

Now if we had a good idea about how to produce the matrix that we then wanted to make diagonal, then we could go ahead and think about how to make it diagonal, and if we had a good idea about what the subtracting-off procedure was, then perhaps we could come up with a matrix that was particularly well suited to that procedure. But to have to guess both at once is rather more problematic.

On the plus side, we have not been thinking about it for very long and have not tried all that many approaches. So I still feel optimistic about this method of proof.

Let me briefly mention two open problems that have come up recently and are related to EDP.

Problem 1. Does there exist a matrix $A$ with 1s down the diagonal such that for every set $P$ of the form $\{ad,(a+1)d,\dots,bd\}$ we have $|P^TAP|\leq 2$?

We can also write $P^TAP$ as $\sum_{i,j\in P}A_{ij}$. So this is another discrepancy problem. Moses, who came up with this problem, has done some computer experiments that suggest that such a matrix may exist. If it does, then it proves that there exist sequences of vectors $(u_i)$ and $(v_i)$ such that $\langle u_i,v_i\rangle=1$ for every $n$ but for no set $P$ of the above form is it the case that $|\langle \sum_{i\in P}u_i,\sum_{i\in P}v_i\rangle|>2$, which would disprove a strengthening of EDP.

Problem 2. For which sets of HAPs is EDP true? Specifically, if you have a $\pm 1$ sequence of length $n$ for sufficiently large $n$, must there be a HAP with common difference a prime or a power of 2 on which the discrepancy is at least $C$? And what about a HAP with common difference 1 or else a HAP of the form $\{d,2d,\dots,\lfloor n/d\rfloor d\}$?

In both cases, these are minimal reasonably natural sets of HAPs that rule out certain counterexamples. In the first case, the powers of 2 are there to stop the counterexample 1,1,-1,-1,1,1,-1,-1,… and others like them. In the second case, allowing all lengths for HAPs of common difference 1 is to stop you taking the first $n/2$ terms to be 1 and the rest to be -1.

Also, a “morally necessary” condition if you want unbounded discrepancy is that the HAPs should form a spanning set. If they don’t, then you can find a sequence (though not necessarily a $\pm 1$ sequence) that has zero discrepancy on any of them. Even if that doesn’t quite disprove EDP for that collection of HAPs, it certainly rules out the kinds of methods we are trying to use to prove it, and it also rules out vector-valued strengthenings of EDP.

To finish, let me mention that Klas Markstrom seems to be getting close to the first rigorous (and very computer-assisted) proof that there is no infinite $\pm$ sequence of discrepancy 2. See this sequence of comments to get an idea of where he has got to. I think we can forgive Mathias for the admission in his paper that “Repeated efforts to improve the proposition to the case $D=3$ have failed.”

### 68 Responses to “EDP13 — quick summary”

1. Klas Markström Says:

“And what about a HAP with common difference 1 or else a HAP of the form $\{d,2d,\dots,\lfloor n/d\rfloor d\}$?”

This is a quick thought late in the evening so I might be completely off. Can’t you get around this set by using an alternating sequence up to n/2 and then switching to alternating in the opposite way?

• gowers Says:

Hmm … that looks right.

In which case perhaps the next question is to try to find some kind of property of sequences that have bounded discrepancy on all full HAPs and then try to add in extra HAPs that look as though they rule out all sequences with that property.

A first step might be to try to describe sequences that have zero inner product with all full HAPs with common difference at most $k$. I’ve no idea whether there is anything nice to say here.

• Moses Charikar Says:

I think this works. All the HAPs of odd common difference are clearly ok. They can have a discrepancy of at most 1 on {1,…,n/2} and at most 1 on {n/2,…,n}. Consider a HAP of even common difference. It has large discrepancy on {1,…,n/2} and large discrepancy on {n/2+1,…,n}, but these two contributions have opposite signs and differ by at most 1 in magnitude. Here we are using the fact that the HAP extends all the way to the end of the interval. Hence the overall discrepancy is at most 1.

• Anonymous Says:

This could be optimized a bit by instead of using two subintervals using $\sqrt n$ subintervals. This should give a maximum discrepancy of square root order instead of linear.

• Klas Markström Says:

Anonymous in this case was I. For some reason my mobile phone ignored my name.

• Klas Markström Says:

Here is another problem on restricted sets of HAPs.

Assume that we look at sequneces of length $N$ and include the HAPs with difference 1, and the HAPs with difference $d$ for all prime $d\geq f(N)$.

For $f(N)=\sqrt(N)$ sequences with bounded discrepancy are easy to construct. How small can we make $f(N)$ without getting a large discrepancy?

2. A new Thread « Euclidean Ramsey Theory Says:

[...] new Thread By kristalcantwell There is a new thread for Polymath5. It looks like Klas Markstrom is getting close to a proof that for a large enough N [...]

3. Huy Nguyen Says:

I just want to make a comment about the SDP approach. Let me recap some of the stuff we know. By now, we probably have some good(?) guess of the tail dual variables $t_{k,d}$‘s: they should be of some form similar to $t_{k,d} = c_d e^{-\alpha k d}$. I think we have some reasonable guess for the $c_d$ as well, e.g. number of divisor of $d$ divided by $d$. Suppose, we enforce these guesses to the problem. Then, the primal becomes the following.

Minimize the sum $(t_{k,d} - t_{k+1,d}) (v_d + \ldots + v_{kd})^2$ subject to $v_i^2 = 1$ for all $i$.

This problem is similar to the SDP for max cut so I think it is informative to look at it with the algorithms for the max cut SDP in mind. There is a primal-dual algorithm by Klein and Lu for max cut that is applicable to our SDP as well. It works roughly as follows. Recall we associate a dual variable $b_i$ with the constraint $v_i^2 = 1$. Says we make some guess of these variables, not their exact values, but only their ratio $b_i/b_1 \forall i$. For symmetric problem like Roth’s theorem, it is much easier to guess the ratio of the $b_i$’s (they should be equal), but maybe we can do it here as well. With these guesses, we can rewrite our problem as follows.

Minimize the sum $(t_{k,d} - t_{k+1,d}) (v_d + \ldots + v_{kd})^2$ subject to $\sum_i b'_i v_i^2 = 1$ for all $i$.

Let $X$ be the Gram matrix of $v_i$. We can rewrite the above problem as minimizing $M\bullet X$ subject to $\sum_i b'_i X_{i,i} = 1$ and X being positive semidefinite.

First notice that the minimum is achieved by a rank 1 matrix $X = uu^T$. In fact, this is exactly the problem of finding the minimum eigenvalue of a matrix! Define a vector $z$ with $z_i = \sqrt{b'_i}u_i$, then $z$ is the eigenvector corresponding to the smallest eigenvalue of of the matrix $M'$ with $M'_{i,j} = M_{i,j}/\sqrt{b'_i b'_j}$.

Actually for any choice of $b'_i$, the smallest eigenvalue would give a lower bound on the discrepancy: the squared discrepancy should be at least $\lambda_n (\sum_i b'_i) / (\sum_d t_{1,d})$.

Thus, this gives a way for verifying guesses and coming up with bounds without fully solving the SDP: we can guess the tail variables and the ratio of the $b_i$, and then either numerically or theoretically bound the smallest eigenvalue of the scaled matrix $M'$.

Better yet, it also give a way to improve one’s guess, which is always a feature of primal-dual approaches. Namely, we can look at the matrix $X = uu^T$, and check which constraint $X_{i,i} \ge 1$ is violated and by how much. The more severely a constraint is violated, the more we should increase its weight $b'_i$.

• gowers Says:

That is very interesting. Let me see if I can extract from it the main thing that one needs to know in order to use it. I’m doing this partly as a test of my understanding.

The basic SDP appoach is to find a matrix M made out of HAPs and a diagonal matrix D with large trace such that M-D is positive definite, and hence $x^TMx\geq\mathrm{tr}(D)$ whenever $x$ is a $\pm 1$ vector. It also tells us that if $U$ is a matrix with columns $u_1,u_2,\dots,u_n$, then $\mathrm{tr}(U^TMU)\geq\sum_ib_i\|u_i\|^2$, where $b_i$ is the ith entry on the diagonal of D.

If we let $v_i=u_i\sqrt{b_i}$, then this becomes $\mathrm{tr}(V^TD^{-1/2}MD^{-1/2}V)\geq\sum_i\|v_i\|^2$, since $V$, the matrix with columns $v_1,\dots,v_n$, equals $D^{-1/2}U$.

Let $M'=D^{-1/2}MD^{-1/2}$. Then we are trying to minimize $\mathrm{tr}(V^TM'V)/\mathrm{tr}(V^TV)$. Writing the numerator and denominator out in full we get $\sum_i\langle v_i,M'v_i\rangle$ and $\sum_i\|v_i\|^2$, from which it is clear that the best we can do is minimize the ratio $\langle v,M'v\rangle/\|v\|^2$ (this is your remark that the minimum is achieved for a rank-1 matrix), which is obviously achieved when $v$ is an eigenvector of $M'$ with minimum eigenvalue.

The conclusion is this. Suppose we make a guess about what $M$ and $D$ should be. We can then easily form the matrix $M'$. If we can then find the eigenvector with minimum eigenvalue, we either get a proof that the approach has worked, or we get a rather informative violation of what we hoped would be true.

I think it might be very useful to start with a simple but bad guess, like taking all HAPs and taking D to be the identity matrix. Perhaps in that case, or something a bit like it, we could determine explicitly what the eigenvector with minimum eigenvalue is and see what happens in the first iteration of the algorithm. That might help us guess what the final state will look like. (I wouldn’t be altogether surprised if the minimal eigenvector was closely related to 1,-1,0,1,-1,0,1,-1,0,…)

• Moses Charikar Says:

Another point that Huy makes is that the bound you obtain from this is a valid lower bound even if you did not iterate and try to converge to the best choice of $b_i$. If you chose a reasonable combination of HAPs to produce the matrix M, and made a good guess about what the $b_i$s should be (perhaps by improving earlier guesses a few times), then we might have a good bound already. In any case, the bound is always valid – the eigenvalue computation tells us by what factor the $b_i$s should be scaled to produce a valid bound.

Huy, how does the algorithm update the $b_i$ values after the eigenvalue/eigenvector computation ?

• Huy Nguyen Says:

The original update rule for $b_i$ is based on the framework by Plotkin, Shmoys, and Tardos for packing and covering linear programming. Below is a similar update rule based on a survey paper by Arora, Hazan, and Kale. All errors are mine, of course.

Suppose we want to approximate the SDP solution up to a factor of $1+\epsilon$. We are probably fine with any constant approximation so it is fine to set $\epsilon$ to be, says, $0.5$.

First, set all $b_i = 1$. We maintain $b'_i = \frac{b_i}{\sum_j b_j}$, which can be thought of as a distribution over the constraints. We want to find min $M\bullet X$ subject to $X$ PSD and $\sum_i b'_i X_{ii} \ge 1$. Compute the smallest eigenvalue and the corresponding unit eigenvector $v$ of the matrix $M'$. We get the solution matrix $X_{ij} = v_i v_j / \sqrt{b'_i b'_j}$. The $b_i$ is updated to reflect how important that constraint is i.e. constraints that are violated get increased weight while constraints that are satisfied get decreased weight. Namely, $b_i = b_i (1-\epsilon(X_ii - 1)) = b_i(1-\epsilon\frac{v_i^2 {\sum_j b_j}}{b_i} - 1)$.

At the end, the solution $X^*$ is an average over all the rank-1 matrices found during the execution of the algorithm.

• Huy Nguyen Says:

Sorry, I made a few mistakes in the previous comment. The update rule should read $b_i = b_i (1-\epsilon(X_{ii} - 1)) = b_i(1-\epsilon(\frac{v_i^2 {\sum_j b_j}}{b_i} - 1))$.

• Huy Nguyen Says:

Sorry, there was another thing I forgot, which might be of some importance if one wants to actually implement this. The update rule is actually $b_i = b_i (1-\epsilon(X_{ii} - 1)/\rho) = b_i(1-\epsilon(\frac{v_i^2 {\sum_j b_j}}{b_i} - 1)/\rho)$ where $\rho$, called the width, is an upper bound on $|X_{ii}-1|$ returned by the eigenvector computation. Intuitively, we don’t want the update rule to make radical change in $b_i$ in one step, thus the scaling by $\rho$. The number of iterations is proportional to the width and a function of $\epsilon$. If we just let $M$ to be any combination of HAPs, $b_i$ could be just anything, e.g. 0, so the width $\rho$ could be huge. Thus, to control the width, one would actually need to modify $M$ slightly, e.g. use all length-$1$ HAPs so that all $b_i$ are not vanishingly small, and might have to use somewhat different update rules. I did not explain the conditioning steps in full since it is somewhat tedious and it would be easier to read the paper anyway at http://www.cs.brown.edu/research/pubs/pdfs/1996/Klein-1996-EAA.pdf

4. Alec Edgington Says:

Regarding Problem 1, I think it would initially be worth looking for solutions of the form $A_{ij} = f(j/i)$, where $f : \mathbb{Q} \rightarrow \mathbb{R}$ is a function on the rationals satisfying $f(1) = 1$ and $\lvert \sum_{a \leq i,j \leq b} f(j/i) \rvert \leq 2$ for all $a, b \in \mathbb{N}^+$. (This is analogous to looking for multiplicative sequences of low discrepancy, and at least might force numerical solutions to exhibit more structure.)

• Moses Charikar Says:

Yes, this might be a good restriction to impose on the solution. With this in place, in order to answer Problem 1, we only need to ensure that $|P^TAP|\leq 2$ for sets $P$ of the form $\{a,(a+1),\dots,b\}$ since the multiplicative condition gives the bound for sets $\{ad,(a+1)d,\dots,bd\}.$

As Alec points out, imposing this condition on the matrix corresponds to the multiplicative version of the “non-symmetric” vector-valued EDP. The hypothesis is that given any two sequences $v_1,v_2,\dots$ and $w_1,w_2,\dots$ of vectors in a Euclidean space satisfying (1) the diagonal condition $\langle v_i,w_i\rangle=1$ for every $i$, and (2) the multiplicative condition $\langle v_{di},w_{dj}\rangle=\langle v_i,w_j\rangle$ for every $i,j,d$, there are HAPs $P$ and $Q$ such that $|\langle\sum_{i\in P}v_i,\sum_{j\in Q}w_j\rangle|\geq C$. In fact, with the multiplicative condition in place, we only need to consider pairs of HAPs $P,Q$ with common differences $d_P,d_Q$ such that $(d_P,d_Q)=1$.

• Moses Charikar Says:

It looks like imposing the multiplicative condition does not change the value of the optimal solution (for small values of $n$ that I looked at). Also the optimum value of the LP seem to be $2-5/(n+2)$. As mentioned here , I looked at HAPs over $\{0,1,2,\ldots,n\}$. There still seem to be some degrees of freedom in the optimal solution, so the particular solution we get is somewhat arbitrary. Ideally, all variable values are multiples of $1/(n+2)$. This happens sometimes but not always.

• Alec Edgington Says:

Moses, would you be able to post some of your optimal solutions for small $n$ I’d be interested to know if there is any sign of multiplicativity in the sequence $A_{1m}$, for example.

• Moses Charikar Says:

Sure, look at the files mult*.txt here:
http://webspace.princeton.edu/users/moses/EDP/try-lp/
Format is
i … j … $A_{ij}$
I have solutions for n=8,18,23,28,38,48 (since the denominators are likely to be nice numbers for these values). Only relatively prime pairs i,j, with i<= j are output.

• Moses Charikar Says:

Even though we haven’t quite figured out the LP solutions for Problem 1 obtained by imposing this multiplicative condition, it might be worthwhile to impose this condition in constructing diagonal representations. What this means is that the diagonal representation we construct will be a lower bound for the “non-symmetric” vector-valued EDP with the additional multiplicative constraint $\langle v_{di},w_{dj}\rangle=\langle v_i,w_j\rangle$. This should be interesting in itself, since any lower bound we construct for this problem would automatically be a lower bound for the HAP discrepancy of multiplicative $\pm 1$ sequences. Perhaps more importantly, proving lower bounds for this restricted problem may be an easier intermediate step and give us a handle on the general problem.

We have some additional flexibility in constructing diagonal representations for this restricted setting. Let $A$ be the matrix we express as a convex combination of $\pm P \otimes Q$. Then all off-diagonal entries of $A$ need not be zero. In fact, we need to satisfy the weaker condition that certain sets of off-diagonal entries sum to zero. The condition is this: for all $i,j$ such that $(i,j)=1$, $\sum_{d \geq 1} A_{di,dj} = 0$. This also implies that in constructing the convex combination, we only need to consider terms $P \otimes Q$ with common differences $d_P,d_Q$ such that $(d_P,d_Q)=1$.

5. conformal Says:

http://www.ams.org/news/home-news.html#abel-2010

6. Gil Says:

Perhaps a weaker version of EDP, that we may yield so some techniques from extremal or probabilistic combinatorics is this:

Show that the set +- 1 sequences of length n with bounded discrepancy has entropy o(n). (Namely, the number of such sequences is $2^{o(n)}$.

• gowers Says:

We thought about this question way back here. I still rather like it though, and I think it could be easier than EDP itself but still interesting.

• Gil Says:

Not sure how that happened, but I’ve fixed it now.

• Moses Charikar Says:

Tim, on that note, the link to Mathias’s paper is broken in your EDP13 post.

Also now fixed — I don’t know what’s going on.

• Gil Says:

Yes indeed Tim raised this question early on and proposed to use the Sauer-Shelah theorem (or equivalently to show that the VC-dimension of small discrepency sequences is o(n)). One nice thing about this direction is that it may well apply to sequences with +1 -1 and 0s or sequences of arbitrary integers. (Or when we talk about approriate versions of Housdorff dimension rather than VC-dimension even for sequences of arbitrary complex numbers.)

Another nice thing about these questions is that they may be easier than EDP.

A strong form of these conjectures (which may well strengthen EDP) would be that when we fixed the permitted discrepancy T and consider infinite sequences (of +1,1,0; of arbitrary integers; of arbitrary complex numbers), then the dimension of sequences of bounded discrepency is bounded.

• Gil Says:

Maybe we can start by showing small entropy o(n/logn) or even much smaller for completely multiplicative functions with values +1/-1. Perhaps we can look at a large but not huge interval of integers, then look at all primes involved in integers at that interval and then show that we can assign them +-1 as to force large discrepency on that interval.

• Gil Says:

The specific suggestion of my previous comment is based on late-night favorable quantifier-confusion. But still I thing that Tim’s Sauer-Shelah-VC-dimension approach towards bounding the entropy of multiplicative bounded discrepency sequences should be persued some time. Does the harmonic analysis reduction to the (slightly generalized) multiplicative case apply to the entropic version of EDP?

7. Kristal Cantwell Says:

I have been looking at this entry in the wiki:

http://michaelnielsen.org/polymath1/index.php?title=Forced_Drifts_in_Multiplicative_Sequences

If the drift of a sequence is 5 then the discrepancy must be at least 3. So in it there is a proof that a multiplicative sequence of length 5298 or more must have discrepancy 3. It is rather long though.

If the drift of a sequence is 3 than the discrepancy must be at least 2. So here we have a proof that if a sequence has length 18 the discrepancy is at least 2. Here the proof is much shorter.

Possibly our proof that a sequence of length 247 has discrepancy 3 can be modified to get a limit lower than 5298 than for the maximum multiplicative sequence with drift less than 5.

• obryant Says:

You’re off by one, we need a drift of 6 to guarantee discrepancy 3. For example, – - + + + + + has drift of 5 (from -2 to 2).

• Kristal Cantwell Says:

If there are 5 consecutive 1′s in a row then wouldn’t they add 5 to the value of the sum of the numbers before them? So in your example the -2 should would have 5 added to to it to get 3 which would push the discrepancy to 3.

8. Moses Charikar Says:

I want to point out a small observation about the representation-of-diagonals approach. Recall that the goal is to express a diagonal matrix as $\sum_i\lambda_iP_i\otimes Q_i$ where $\sum_i |\lambda_i|=1$, and the lower bound (on the square discrepancy) obtained from this is the trace of the diagonal matrix. In fact, we don’t need to completely cancel off all the off-diagonal elements. If we obtain a (not necessarily diagonal) matrix $A = \sum_i\lambda_iP_i\otimes Q_i$, then this gives a lower bound of $\sum_i A_{i,i} -\sum_{i \not = j} |A_{i,j}|$. This sounds similar to the variant discussed earlier where we said we need to produce a diagonal plus a psd matrix (instead of a diagonal plus a zero matrix).

The proof that $\sum_i A_{i,i} -\sum_{i \not = j} |A_{i,j}|$ is a lower bound follows from the fact that for a $\pm 1$ vector $x$,
$x^T A x = \sum _i A_{i,i} x_i^2 + \sum _{i \not = j} A_{i,j} x_i x_j \geq \sum_i A_{i,i} - \sum_{i \not = j} |A_{i,j}|$. Now plug this into Tim’s proof (see his EDP12 post) to conclude that there exist HAPs $P$ and $Q$, such that $(\sum_{i \in P} x_i)(\sum_{j \in Q} x_j) \geq \sum _i A_{i,i} - \sum_{i \not = j} |A_{i,j}|$.

This is not a significant change, but it gives a little more breathing room in allowing almost diagonal representations.

• gowers Says:

I like this observation, since getting rid of the last few entries looked as though it might be quite troublesome. I suppose one could consider taking it a step further and saying that we don’t even need $\sum_{i,j}|A_{ij}|$ to be small, since that is not a necessary condition for $x^TAx$ to be small for every $\pm 1$ vector $x$. For example, if $A$ is a random $\pm 1$ matrix, then the maximum possible value of $x^TAx$ will not be $n^2$, but something much smaller like $n\log n$ I think. But pursuing this line of thought could just lead us straight back to the SDP approach, so it may well not be much extra help.

9. gowers Says:

This comment contains some rather vague and miscellaneous thoughts about where we stand.

1. First, an attempt to articulate what seems to me to be a serious difficulty with what we are trying to do. We have shown that EDP follows if we can find certain ways of decomposing matrices. From a look at the experimental evidence, one gets the impression (perhaps wrongly) that there may not be an explicit formula for the decomposition. If that is the case, then one needs to find a way of proving nonconstructively that a decomposition exists. Now a tempting move is to observe that if we cannot find a decomposition of the desired kind then there must be a functional that proves this. But that functional turns out to be precisely a counterexample to some strengthening of EDP and we are back where we started. So it seems as though we are forced to “construct” a decomposition (possibly with the help of probabilistic methods). How do we do that?

Huy had an interesting suggestion in the SDP case, and I think we should pursue that approach at some point soon.

2. If we diagonalize a symmetric matrix $A$, we end up finding an orthonormal basis $u_1,\dots,u_n$ and showing that $A=\sum_i\lambda_iu_i\otimes u_i$. Equivalently, we factorize $A$ as $U^TDU$ for some orthogonal matrix $U$ and diagonal matrix $D$. That tells us that $D=UAU^T$. The $ij$th entry of this matrix is $\sum_{r,s}U_{ir}A_{rs}U_{js}$. If the $r$th column of $U$ is $u_r$, then this tells us that the matrix is $\sum_{r,s}A_{rs}u_r\otimes u_s$. So maybe, just maybe, we could define a symmetric matrix $A$ in some natural way so that we expected its eigenvectors to be efficiently expressible as linear combinations of HAPs.

As usual, I haven’t checked completely carefully that everything I’ve just written is 100% correct. But the general point stands: it may be helpful to think about what kinds of symmetric linear maps are likely to have HAP-ish eigenvectors.

Of course, we also want other properties: even if we can find a decomposition we need it to be efficient. So this is not a suggestion for how to solve the problem, so much as a suggestion for how perhaps to get slightly closer to a solution.

3. The kinds of matrices that are likely to have eigenvectors built out of HAPs are matrices that are themselves built out of HAPs. We could even build the matrices together with their diagonalizations. To do this, we would want to solve the following problem: find an orthonormal basis where each vector is a linear combination of HAPs and do it in a “non-trivial way”. (I’m not quite sure what the last condition means, but I don’t want to take e.g. the standard basis.) Having chosen the orthonormal basis, we would also want to choose the diagonal map. Perhaps one could do that in a clever way that would cause quite a few coefficients to cancel. (Again, this is meant as a rather vague thought.)

• Moses Charikar Says:

Continuing in speculative mode …. I was wondering whether we might learn some lessons from the SDP solutions about what good test vectors might be (for the representation of diagonals approach). Alternately, think of these as rather vague thoughts on how we might build matrices out of HAPs so as to have eigenvectors built out of HAPs.

1. Let $P_{k,d}$ denote the characteristic vector for the HAP $\{d,2d,\ldots,kd\}$. Let $M_d = \sum_k c_{k,d} P_{k,d} \otimes P_{k,d}$. The SDP solutions we have looked at suggest that we should take $c_{k,d} = C_d e^{-\mu kd}$. Ignore the constants $C_d$ for now. The SDP data seems to suggest a conjecture along the following lines: There exists a sequence $b_n$ with unbounded sum such that for any vector $x$, $x^t M_d x \geq \sum_i b_i x_i^2$ for some $d$.

Do the matrices $M_d$ suggest any particular test vector made from HAPs of common difference $d$ ? I don’t know how to express $M_d = v \otimes v$, but here is a small observation: Let $v_d = \sum_k \pm \sqrt{c_{k,d}} P_{k,d}$, where the $\pm$ signs are chosen independently and at random. Then $M_d = E[v_d \otimes v_d]$. Now I am not sure this is particularly useful, because we now believe that we should not be able to get a good diagonal representation solely from vectors of the form $v_d$. A positive answer to Problem 1 (in Tim’s EDP13 post) says that we ought to be mixing HAPs of different common differences. But perhaps random vectors such as $v_d$ could be part of the solution.

2. One way of mixing vectors of different differences would be to compute the orthogonal component $u_d$ of $v_d$ relative to the other $v_{d'}$ (where ‘other’ could refer to all $d' \not = d$ or all $d' < d$ say). I'm not sure where this will lead us, but if we construct the matrix $\sum_d \lambda_d u_d \otimes u_d$, then we ensure that the eigenvectors are precisely the vectors $u_d$ (assuming they are normalized to unit vectors) with corresponding eigenvalues $\lambda_d$. (Here we used the fact that the $u_d$ are mutually orthogonal.) The expressions $u_d \otimes u_d$ lead to mixing HAPs with different common differences.

A couple of issues with this: Firstly, the vectors $v_d$ as defined are random vectors, so I am not sure how we would reason about orthogonal components etc. Secondly, eventually, we want to prove a lower bound of the form $\sum b_i x_i^2$. Similar to Huy's suggestion about the SDP approach earlier, we may need to scale all the vectors we are dealing with by guessing $b_i$ in advance, and then dividing the $i$th coordinate by $1/\sqrt{b_i}$. Now compute orthogonal components etc for these scaled vectors instead of the original ones. But this seems quite messy, not to mention that it requires us to guess the $b_i$ values which we don't have a good handle on.

On another note, I am out of town for the rest of this week and may not be able to participate in the discussion until I get back.

• gowers Says:

On the same note as your last, I too will be away for a few days. I may be able to participate to a limited extent, but that is the most it will be until early next week.

10. Kristal Cantwell Says:

If a sequence has discrepancy less than four then at odd numbers the discrepancy will be plus or minus three or plus or minus one. If there are two consecutive plus ones or two consecutive minus ones starting at an odd number I claim that the value of the discrepancy at the starting point must be plus or minus 1. If It had value 3 or minus three then it would have the value 4 or minus 4 at the point proceeding or after it so those values are forbidden.

This is an attempt to generalize the result that if the discrepancy is less than two than if there are two consecutive ones or negative ones at an even number the value of the sum must be zero there.

• Kristal Cantwell Says:

I have an idea on using this to start a proof that the discrepancy of every multiplicative sequence must be at least 4. The idea is to start with a number that must be plus or minus 1 and then add at least 5 to it or subtract at most 5 and get plus or minus 5.

The starting point is http://michaelnielsen.org/polymath1/index.php?title=Forced_Drifts_in_Multiplicative_Sequences
particularly the cases showing the drift must be 5. For about two
or three hundred of the cases the starting point for the sequence of drift 5 is odd. If I have worked that out correctly that is when the difference is odd. For these numbers if they start with two values with the same sign then we can apply the above theorem. This would eliminate all but 300 cases. Now that is a lot of cases but it is still possible to deal with them if they are easy enough. I will be looking at some of these cases and seeing if this can be made to work.

• obryant Says:

I never posted it, but the drift 6 case with f(2)=1 finished quickly. For f(2)=-1, the code wouldn’t make it across $\mu_3$.

BTW, I just posted the code that generated the “forced drifts in multiplicative sequences” data.

• Kristal Cantwell Says:

I have been looking at the cases showing the drift must be 5 and so far I think I have the following:

If there is a multiplicative function with discrepancy 3 or less then:
If f(2)=1
then f(3)=f(5)=-1
If f(2)=1 and f(7)=1
then f(11)=f(13)=f(17)=-1

Let me assume I have the drift 6 case with f(2)=1 finished

Now we have from above this result:

If there is a sequence with discrepancy less than four there are two consecutive plus ones or two consecutive minus ones starting at an odd number the value of the discrepancy at the starting point must be plus or minus 1.

if the sequence starts at an even number the values must be 0 or plus or minus 2.

Then look at where the plus six sequence starts it must start with two values of the same sign or it contains a smaller +6 sequence the by the above two results the value at the start of this +6 sequence must be plus or minus 1, plus or minus 2 or zero. So we have that in that case the discrepancy must be 4.

So we would have if a multiplicative sequence has discrepancy less than
4 it must have f(2)=-1.

I would like to see the partial results for a drift of 6 because it would by the above reasoning allow me to deal with some of the open cases as above. Also I would like to try to complete the proof of drift 6 as I have a feeling I might be able to deal with some of the open cases.

• obryant Says:

I’ve now posted the drift=6 data (for f(2)=1). The data is unformatted; sorry for that. I’ve lost my Mathematica to HTML code that did the other tables.

• Kristal Cantwell Says:

I have made an error in the above posts. I need drift 7 to force discrepancy 4. Drift 6 only works for one where the drift starts with consecutive values of the same sign starting at an odd number. Drift 5
can be used to prove discrepancy 3.

11. Klas Markström Says:

This appeared on arxiv today, but I haven’t had time to take a proper look at it yet.
http://arxiv.org/abs/1003.5388

• Kristal Cantwell Says:

The result is that if the sum of the prime values of f (f is a multiplicative function from one to zero) less than x are large enough then the multiplicative function diverges. If we can show that a bounded f forces a bounded multiplicative then this result if it holds would allow further restrictions on a bounded f.

Also if we have a game in which player one has 2 moves per turn and player two has one and player one is trying to force a large discrepancy and player two prevent it then if this theorem is true the game will always be a win for player one. And this is true for any variation where player one has enough extra moves to force a large enough discrepancy in the signs of the primes.

This looks like an interesting paper.

12. obryant Says:

I’ve been doing some computations on the “least nonzero digit”-type constructions. That is, fix an integer m and a subset A of $\{1,2, \dots ,m-1\}$, and set $\tau_{m,A}(n)=1$ if the least significant nonzero digit of n (base m) is in A, and $\tau_{m,A}(n)=-1$ otherwise.

Our favorite $\lambda_3$ comes from $m=3, A=\{1\}$, but also from $m=9, A=\{1, 3, 4, 7\}$. Our second favorite $\mu_3$ comes from $m=9, A=\{1, 4, 6, 7\}$, and is our current record-holder for slowest growing discrepancy. And lest one think that only small examples are interesting, $m=29, A=\{1, 4, 5, 6, 7, 9, 13, 16, 20, 22, 23, 24, 25, 28\}$ yields a function with smaller discrepancy than $\lambda_3$.

A little arithmetic (spelled out below) reveals that $\tau_{m,A}$ has logarithmic discrepancy if and only if A contains exactly half of the nonzero elements of every additive subgroup of the integers modulo m. If this condition is met, then

$\delta(\tau_{m,A},N) \leq M(m,A) \log_m(N) + C$,

where $M(m,A)$ is the maximum absolute value (over $1\leq d < m, 1\leq \ell < m/\gcd(d,m)$) of

$\sum_{i=1}^\ell \tau_{m,A}( i d \mod m ).$

We should choose A so as to make $M(m,A)$ as small as possible; set $M(m),\tau_m$ be the resulting functions. Set also $c(m) = M(m)/\log(m)$, the coefficient of $\log(N)$ in the discrepancy of $\tau_m$. So c(m) is what we want to minimize.

The above-mentioned computations are of the triples $(m,M(m),c(m))$.

(3, 1, 0.910)

(5, 1, 0.621)

(7, 2, 1.028)

(9, 1, 0.455)

(11, 2, 0.834)

(13, 2, 0.780)

(15, 2, 0.739)

(17, 2, 0.706)

(19, 3, 1.019)

(21, ?, ?)

(23, 4, 1.278)

(25, 2, 0.621)

(27, at most 3, at most 0.910)

(29, 3, 0.891)

(31, 4, 1.165)

(33, ?, ?)

(35, ?, ?)

(37, at least 4, at least 1.108)

For most of these, there are very few sets A that work so well; for m=29 the set mentioned earlier is the unique (well, with its negative) set with M(29,A)=3. My computation so far has been focused (for simplicity) on prime values of m.

Note that for any positive integers m,r, with m odd, we have

$c(m^r) \leq c(m)$

so that c(m) does not conveniently go off to infinity. However, over prime values $c(p) \gg p^{1/4}/\log(p)$. So there is some real benefit to composite m. I haven’t ruled out M(105)=2, which would be a new record-low discrepancy (there’s no reason to think that M(105)=2, save fantasy, but I see no reason to think it doesn’t, either).

Now to justify the earlier claim: $\tau_m$ has logarithmic discrepancy if and only if A contains exactly half of the nonzero elements of every mod-m subgroup. First, the easy direction. If A does not contain exactly half of the subgroup generated by d, then $\sum_{i=1}^\ell \tau_{m,A}(i d)$ grows linearly with $\ell$, and so we get linear discrepancy.

Now suppose that A does have exactly half of every subgroup, which we use in the form $\sum_{k=1}^{m/(d,m)-1} \tau(k d)=0$. First we work out $\sum_{i=1}^\ell \tau(i d)$. Set $m'=m/(d,m)= |\langle d \rangle |$, $M=\lfloor \ell/m'\rfloor m'$. We have

$\sum_{i=1}^\ell \tau(i d) = \sum_{i=1}^M \tau(i d) + \sum_{i=M+1}^\ell \tau(i d)$

$\sum_{i=1}^M \tau(i d) = \sum_{j=1}^{\lfloor \ell/m'\rfloor} \sum_{k=0}^{m'-1} \tau((j m'-k)d)$

$\sum_{k=0}^{m'-1} \tau((j m'-k)d) = \left(\sum_{k=1}^{m'-1} \tau(k d)\right)+\tau(jm'd) = \tau(jm'd)$

And since $m'd$ is a multiple of m, we have $\tau(jm'd)=\tau(jm'd/m)=\tau(jd/(d,m))$. We also have

$\sum_{i=M+1}^\ell \tau(i d) = \sum_{i=1}^{\ell\mod m'} \tau(i d \bmod m)$

Together, we have:

$\sum_{i=1}^\ell \tau(i d) = \sum_{i=1}^{\lfloor \ell/m'\rfloor} \tau(i d/(d,m)) + \sum_{i=1}^{\ell\mod m'} \tau(i d \bmod m)$.

We can capture this recurrence by writing:

$S(d,\ell) \leq M(m,A) + S(d/(d,m), \ell/(m/(d,m))).$

Now $\delta(\tau,N) \leq \max_{d\ell\leq N} S(d,\ell)$, and the recurrence reduces the product $d\ell$ by a factor of $m$ each time it is used. This give $\delta \leq M(m,A) \log_m(N) + M(m,A)$.

• Alec Edgington Says:

Kevin, for the prime $m$ that you’ve worked out, is the optimal set $A$ always the set of quadratic residues modulo $m$?

• obryant Says:

In summary, quadratic residues are the unique optimal for m in {3,5,17, 29}, are the nonunique optimal for m in {7,13,19,37}, and are not as good as optimal for m in {11,23,31}.

Just being concerned about the coefficient of the log in the discrepancy gives some symmetries: all of A, t*A (the dilation of A mod m by a factor of t, with (t,m)=1), A’ (the complement of A in {1,2,…,m-1}, and t*A’ give the same coefficient. Ignoring those symmetries, here are the optimal A I’ve found so far:

m=7, A={1,5,6}

m=11, A={1,2,4,7,9}

m=13, A={1,2,4,9,11,12}

m=19, A={1, 4, 5, 6, 7, 9, 11, 16, 17} (quad. res.)

m=19, A={1, 2, 3, 5, 7, 12, 14, 17, 18}

m=19, A={1, 2, 3, 5, 9, 10, 16, 17, 18}

m=19, A={1, 2, 3, 6, 7, 12, 13, 16, 18}

m=19, A={1, 2, 3, 6, 7, 12, 13, 17, 18}

m=23, A={1, 2, 3, 4, 6, 8, 11, 12, 15, 17, 20}, and 567 other equivalence classes of optimal A, none of which are the quad. res.

m=29, A={1, 4, 5, 6, 7, 9, 13, 16, 20, 22, 23, 24, 25, 28} (quad res)

m=31, A={1, 2, 3, 4, 6, 8, 11, 12, 19, 23, 25, 27, 28, 29, 30}, and 51 other equivalence classes of optimal A, none of which are the quad. res.

m=37, A={1, 3, 4, 7, 9, 10, 11, 12, 16, 21, 25, 26, 27, 28, 30, 33, 34, 36} (quad res), and 12 other equivalence classes of optimal A

• obryant Says:

Latest data implies M(105)>2, so it can’t set a record. All that’s needed for a record is an odd m strictly between $9^r$ and $9^{r+1}$ with $M(m) \leq r$, or to show that $M(9^r) for some r. For r=1, this is impossible by computation for m=11 and because any sequence of length 12 has homo. disc. at least 2 even without wrap-around. I was hoping that perhaps m=105 could be worked with by understanding m=15,21,35, but even if that's the case we must have $M(105)\geq M(21) =3$, so that it won't be a record.

I'm not convinced that M(m)/log(m) is bounded away from 0 (it certainly seems likely), but I don't see any way to make progress on this.

Here's the new data (giving one example for each modulus)

m=3, M=1, A={1}

m=5, M=1, A={1,4}

m=7, M=2, A={1,2,4}

m=9, M=1, A={1,4,6,7}

m=11, M=2, A={1,2,4,7,9}

m=13, M=2, A={1,3,4,9,10,12}

m=15, M=2, A={1, 3, 4, 6, 10, 11, 13}

m=17, M=2, A={1,2,4,8,9,13,15,16}

m=19, M=3, A={1, 4, 5, 6, 7, 9, 11, 16, 17}

m=21, M=3, A={1, 2, 3, 6, 8, 13, 14, 15, 17, 20}

m=23, M=4, A={1, 2, 3, 4, 6, 8, 11, 12, 15, 17, 20}

m=25, M=2, A={1, 2, 6, 7, 11, 12, 15, 16, 17, 20, 21, 22}

m=27, M=2, A={1, 4, 6, 7, 10, 13, 15, 16, 18, 19, 22, 24, 25}

m=29, M=3, A={1, 4, 5, 6, 7, 9, 13, 16, 20, 22, 23, 24, 25, 28}

m=31, M=4, A={1, 2, 3, 4, 6, 8, 11, 12, 19, 23, 25, 27, 28, 29, 30}

m=33, M=3, A={1,2,4,7,9,12,13,15,18,20,22,23,24,26,29,31}

m=35, M=3, A={1,3,6,8,10,13,14,17,20,21,22,24,25,27,29,31,34}

m=37, M=4, A={1, 3, 4, 7, 9, 10, 11, 12, 16, 21, 25, 26, 27, 28, 30, 33, 34, 36}

m=39, M=3, A={1,2,7,9,11,12,13,14,15,19,20,24,25,27,28,30,32,37,38}

m=81, M=2, A=({1,4,6,7}+{0,9,18,27,36,45,54,63,72}) \cup (9*{1,4,6,7})

• obryant Says:

m=41, M>3

m=49, M=2, A=({1,2,4}+{0,7,14,21,28,35,42}) \cup (7*{3,5,6})

• obryant Says:

More data:

M>2 for m=43,47,51,53,55,57,59,61,63,65

M=2 for m=45 (A={1,3,4,9,10,11,12,14,16,19,21,25,26,29,30,31,34,36,39,40,41,44})

Currently working on: M(121).

13. Alec Edgington Says:

Here is a question in harmonic analysis to which the answer must surely be known (and negative). A positive answer would imply a counterexample to the Erdős discrepancy conjecture. But I’m interested in the negative answer too because I hope it might involve a construction that’s relevant to EDP.

Given a sequence $a_n$ of complex numbers with $\lvert a_n \rvert = 1$ (or we could insist that $a_n = \pm 1$), consider the partial sums of the power series: $f_N(z) = \sum_{n \leq N} a_n z^n$.

Is there such a sequence, and a constant $C > 0$, such that $\lvert f_N(z) \rvert \leq C$ for every $N$ and every $z$ in the closed unit disc?

Any such sequence is a counterexample to the Erdős conjecture, because $\lvert \sum_{n \leq N, d \mid n} a_n \rvert = \lvert \sum_{n \leq N} (\frac{1}{d} \sum_{r = 0}^{d-1} e^{2i\pi rn/d}) a_n\rvert$, which is bounded by $\frac{1}{d} \sum_{r = 0}^{d-1} \lvert f_N(e^{2i\pi r/d}) \rvert \leq \frac{1}{d} d C = C$.

By the maximum-modulus principle it’s enough to consider the bound on the unit circle. Assuming that the answer is negative, it would be interesting to have a proof that explicitly gave us a sequence $z_N = e^{2i\pi \theta_N}$ where $\theta_N = r_N / d_N$ and $\lvert f_N(z_N)\rvert \rightarrow \infty$ as $N \rightarrow \infty$. Then the $d_N$ would be differences at which the sequence would have to ‘work hard’, in some sense, to have bounded discrepancy.

• gowers Says:

If I’ve got the right question here after a quick reading, you find that the average of $|f_N(z)|^2$ over all $z$ in the unit circle is $N$.

• Alec Edgington Says:

So it is …

14. gowers Says:

I’m quite busy at the moment, so for a while I still don’t have time to put in a lot of effort on EDP. But in order to demonstrate that I haven’t given up on it — I’m just having a rest — I thought I’d write a quick comment.

I’m still thinking about trying to find a good decomposition of a diagonal matrix. I’m going to assume that this matrix is non-negative (even though we don’t actually need that to be the case). So let us call it $D^2$.

Now let $U$ be an orthogonal matrix and let the columns of $U$ be $u_1,\dots,u_n$. Then $UU^T$ can easily be checked to be $\sum_iu_i\otimes u_i$. Since that is the identity, it follows that $DU(DU)^T=D^2$. Now the columns of $DU$ are $Du_1,\dots,Du_n$, so $D^2=\sum_i (Du_i)\otimes(Du_i)$.

This gives us another way we could perhaps approach finding a diagonal representation. The idea would be to begin by finding some natural orthonormal basis — an obvious choice would be the trigonometric functions — then multiply them pointwise by a vector $(\lambda_1,\dots,\lambda_n)$ (the coefficients of the diagonal matrix $D$), and hope to choose the $\lambda_i$ in some clever way to make the resulting vectors $Du_i$ efficiently representable as linear combinations of HAPs.

It might sound as though a serious disadvantage of this approach is the usual difficulty that we don’t know how to guess the $\lambda_i$. But I’m not sure if that’s true: we could begin by looking at which vectors we find we can represent efficiently by HAPs, and then use those to guide us to our choice of the $\lambda_i$. And we also have some clues from the experimental evidence about which diagonal matrices can be represented.

15. obryant Says:

Set $M(m)$ as defined a few posts ago, and recall that we want $M(m)/\log(m)$ to be small. From exhaustive computations, we get that M(49)=2, which is quite possibly the largest m with M(m)=2. In any case, when m is a square we seem to get smaller-than-usual M’s. Also, M(27) is smaller than nearby values.

This led me to consider whether there might be a finite field construction of the sets A that work well. The way finite fields are used to generate as-good-as-random constructions is to use the connection between the cyclic multiplicative group and the representation of $x^i$ in some fixed basis. This isn’t so nice for us, since the multiplicative group will always have even order, whereas we need m to be odd.

Wait, the mult group isn’t always of even order. If $q=2^k$, then the multiplicative group has order $m=2^k-1$, which is at least odd. This still isn’t so nice for us, though, since such m will be prime for infinitely many k (no, I can’t prove that), and we know from Roth’s result that $M(p) \gg p^{1/4}$ for primes $p$.

Short story even shorter: I don’t see finite field arithmetic generating any good EDP constructions.

My machine is chugging away on deciding if M(121) is 2 or 3 or 4, and I expect that it will finish in a week or two.

16. Uwe Stroinski Says:

From a double counting argument we get ${-c\leq\frac{1}{n}\sum_{i=1}^n \sigma_0(i)x_i\leq c}$ (with ${\sigma_0}$ the divisor counting function) as a necessary condition. This condition is weak, however it led me to consider the divisor counts near 246 (the length of the longest completely multiplicative sequence with discrepancy 2) and 1.124 (the length of the so far known longest sequence with discrepancy 2). The proofs we have so far do not give us an idea of what makes these numbers so special. Now, a heuristical argument could be: 247 is near to 240, which is the first number with 20 divisors. In the non-c.m. case the situation is as follows: 840 has 32 divisors, 1.080 has 32 divisors and 1.120 has 24 divisors (still a lot). Either this divisor count heuristics is just a coincidence or it does only hold for c.m., or 1.124 is probably not optimal. Unfortunately, my lap top has not enough power to do searches beyond length 400. 1.260 (36 divisors) would be the next maximum.

17. Gil Kalai Says:

Here are a few questions that I am curious about and maybe they can be answered easily:

1) How is our basic $\mu_3$ multiplicative example (based on the least significant non zero digit in the ternary expansion) behaves if we restrict it just to the first k primes (and let it be 0 on other primes).

2) How small maximum discrepency can we ensure simultanusly for a
multiplicative -1 +1 function and all its restrictions when we force the value 0 on a finite subset of primes.

3) suppose we just ask EDP for square free numbers. (I still tend to speculate you can get logarithmically low discrepency in this case as well.) How good is $\mu_3$? (Maybe this was already answered.)

4) How does the greedy-look-ahead algorithm for multiplicative sequences works. In this algorithm, given n, you choose the value of f on the kth prime as to minimize the discrepency (for all intervals [0,r] r <=n), when the values for all larger primes is set to 0.

I think there is a good shot that it will be logarithmic.

• obryant Says:

What do you mean by “EDP for square free numbers”?

• Gil Kalai Says:

Dear Kevin, here I mean the following: you want to assign the square free numbers values -1 and 1 so that when you sum these values for the square free numbers in every HAP the discrepency (absolute value of the sum) is small.

I would expect that just like the original EDP, you cannot make the discrepency uniformy bounded. But that you can make it grow logrithmically.

18. Moses Charikar Says:

I have been tied up with other things and taking a hiatus from EDP. I just want to summarize various LP/SDP approaches to EDP. What I am saying is nothing new, but is a summary of what we know about the various strengthenings of EDP we’ve considered and their relationships.

Consider a $\pm 1$ sequence $X = (x_1,x_2,\ldots)$, and consider the associated matrix $A$ such that $A_{ij} = x_i x_j$, i.e. $A = X X^T$. The approach is to prove something about $A$ which would imply that the HAP discrepancy of $X$ is large, i.e. that $P \cdot X$ is large for some $P$ that is the characeristic vector of a HAP. In proving something like this, we need to exploit the fact that $A$ is obtained from a $\pm 1$ sequence. The various approaches differ in what properties of $A$ they use giving rise to a sequence of strengthenings of EDP. At the very least, they use the fact that the diagonal entries are 1, and in fact some use only this fact.

Here is a sequence of conjectures, ordered from strongest to weakest — each is stronger than EDP and would imply EDP if true.

Conjecture 1 : For any matrix $A$ with ones on the diagonal , there exists a HAP $P$ such that $|P^T A P|$ is large.

If true, this would imply EDP, because in particular, consider $A = X X^T$. Then $P^T A P = (X \cdot P)^2$. Thus $|P^T A P|$ is large implies that the discrepancy of $X$ is large. Unfortunately, this conjecture is too strong. Alec gave a construction of a matrix $A$ such that $|P^T A P| \leq 1$ for all HAPs $P$.

Conjecture 2 : For any matrix $A$ with ones on the diagonal , there exist HAPs $P,Q$ such that $|P^T A Q|$ is large.

The proof that this would imply EDP is similar to the argument earlier. Consider $A = X^T X$. Then $P^T A Q = (X \cdot P)(X \cdot Q)$, hence $|P^T A Q|$ large implies that the discrepancy of $X$ is large.

This conjecture is the basis for Tim’s representation of diagonals approach. We can express the problem of finding an matrix $A$ such that $|P^T A Q|$ is bounded as a linear program, then the dual problem gives a lower bound for how small $\max_{P,Q} |P^T A Q|$ can be made by choosing $A$ apppropriately. In fact, the dual is exactly the problem of constructing the diagonal representation that Tim defined.

We have also considered the stronger
Conjecture 1.5 : $|P^T A Q|$ is large for HAPs $P,Q$ with the same common difference . Experimental evidence suggests that this is false (see Problem 1 in Tim’s EDP13 post). So far, we don’t have an explicit construction of a matrix $A$ with ones on the diagonal that falsifies this conjecture, but I believe this should be possible.

Conjecture 3 : For any positive semidefinite matrix $A$ with ones on the diagonal , there exists a HAP $P$ such that $P^T A P$ is large.

Note that $A=X^T X$ is positive semidefinite, hence the statement applies to it. This conjecture is the basis of the SDP approach to proving a lower bound on EDP. The problem of finding the “best” psd matrix $A$ can be formulated as a semidefinite program and giving a lower bound for the value of this SDP by constructing a feasible dual solution amounts to producing a certain quadratic form, and subtracting a large diagonal term from it such that the remainder is positive semidefinite.

We ought to consider the analog of Conjecture 2 for psd matrices giving the potentially weaker statement:
Conjecture 4 : For any positive semidefinite matrix $A$ with ones on the diagonal , there exists HAPs $P,Q$ such that $|P^T A Q|$ is large.

In fact, it turns out that Conjecture 4 is equivalent to Conjecture 3, since (as Tim pointed out earlier) for psd matrices $A$, $|P^T A Q| \leq (P^TAP)^{1/2}(Q^TAQ)^{1/2}$.

• Gil Says:

Is there any relation to the Paving conjecture (aka Kadison-Singer conj)? (maybe this is related to EDP12′s discussion). It looks artificially similar.

• Moses Charikar Says:

I don’t know enough about Kadison-Singer to say if there is a connection. Gil, can you elaborate on what you think the relation might be ?

19. Uwe Stroinski Says:

I was toying around with obryants description of log-bounded examples and our earlier discussion on flipping signs. Let’s see what we can get.

Let f be completely multiplicative, fix a prime p and define another completely multiplicative g(p):=-f(p) and g(q):=f(q) for prime q not equal to p. Then, since f and g coincide on non-multiples of p, the partial sum of g up to n satisfies

${g[n] = f[n]-\sum_{i=1}^{\lfloor\frac{n}{p}\rfloor} f(p i)+\sum_{i=1}^{\lfloor\frac{n}{p}\rfloor} g(p i)}$

Using c.m. and the definition of g we end up with a recurrence

${g[n] = f[n]-f(p)\left(f\left[\left\lfloor\frac{n}{p}\right\rfloor\right]+g\left[\left\lfloor\frac{n}{p}\right\rfloor\right]\right)}$.

We solve this for ${n=p^m}$ and get

${g[p^m]=f[p^m]+2\sum_{i=0}^{m-1} (-f(p))^{m-i}f[p^i].}$.

For f with bounded discrepancy ${|f(n)|\leq c}$ this yields the following estimate

${|g[n]|\leq 2c\log_p n +c.}$

at prime powers ${n=p^m}$. Therefore, for a c.m. function to satisfy EDP it is necessary that the flipped function is log-bounded at the powers of the ‘flipped prime’.

Obryant’s log-bounded examples have discrepancy less than ${M(p,A) \log_p(n) + M(p,A)}$ (in his notation). In principle they could be flips of bounded discrepancy functions. I am not optimistic that this is the case, though.

20. Klas Markström Says:

In case someone involved has some software for computing Haar wavelet transforms I think it could be interesting to see what such a transform of the long discrepancy 2 sequences look like, both for the general and multiplicative versions.

21. Klas Markström Says:

Since it has been a bit quiet here now I can at least add a short progress report regarding the C=2 case. The computers are still working on the last stub and my original plan was to have it completed by now. However due to machine problem I had to restart some of the subcases from my backup copies and that set things back by about two weeks.

22. Jason Dyer Says:

I have written “A gentle introduction to the 5th Polymath project”:

http://numberwarrior.wordpress.com/2010/04/21/a-gentle-introduction-to-the-5th-polymath-project/

I did not link to this blog, only to the wiki page. I figure anyone qualified enough to join in will be able to find their way here.

23. Jason Dyer Says:

I have gone back to the Wiki and added a page for my graph theory generalization of the problem some 1000 comments back:

http://michaelnielsen.org/polymath1/index.php?title=Generalize_to_a_graph-theoretic_formulation

The prime factorization algorithm is horribly messy (and not entirely bug-checked).

I still have hope for a statement stronger than the EDP which will be a generalized combinatoric version of the problem.

24. Polymath5 – A collection of results concerning the completely multiplicative case « Random Thoughts Says:

[...] Computer experiments so far did not find a that grows slower than . For a given let . Then withis the largest such that there is an with . Proof idea: Do a computer search among all completely multiplicative discrepancy 2 functions with domain ( must be less than 247). A beating thus has at least discrepancy and thus implying . [...]

25. Polymath5 – A somewhat surprising observation « Random Thoughts Says:

[...] this post, Kevin O’Bryant has considered “least nonzero digit”-type examples . Encouraged [...]