I don’t have any major new themes to talk about in this post. We are continuing to look for ways of decomposing matrices that would give rise to proofs of strengthenings of EDP, but I would say that at the moment we are still at the stage of trying to get a feel for the problem. (If you are reading this and wondering what the decomposition problems are that I am talking about, then I suggest having a look at EDP10 and EDP12.)
The difficulty we face seems to be this. (I do not have full confidence in what I am about to write. It could be that I will change my mind, and it could be that others already have a different and better analysis.) I’ll take for the purposes of illustration the problem of writing a diagonal matrix with unbounded trace as a sum where the
and
are HAPs and
. After trying a few things, I have started to feel as though it isn’t going to be easy to find a completely explicit construction for this. That is partly because no pattern has jumped out at me from the output from Moses’s experiments, though I have not looked as hard as I might. So the best I can think of to do at the moment is something like this (but almost certainly not precisely this).
1. By some kind of wild guess, choose some HAPs and some positive constants
.
2. Form the matrix , which has trace
.
3. Make those choices in such a way that is significantly larger than
.
4. Prove that it is possible to cancel out the off-diagonal part of this matrix and at most half the diagonal part by subtracting a sum of the form , with
also significantly smaller than
.
The problem with this is twofold. I don’t know how to make the wild guess in the first place (though I suppose we should be trying to favour numbers with many factors) and I don’t see how to do the subtracting-off part. In other words, I don’t see how to make any of it work. And we also know that nothing too neat can work, or we’d end up proving things that we know to be false.
Now if we had a good idea about how to produce the matrix that we then wanted to make diagonal, then we could go ahead and think about how to make it diagonal, and if we had a good idea about what the subtracting-off procedure was, then perhaps we could come up with a matrix that was particularly well suited to that procedure. But to have to guess both at once is rather more problematic.
On the plus side, we have not been thinking about it for very long and have not tried all that many approaches. So I still feel optimistic about this method of proof.
Let me briefly mention two open problems that have come up recently and are related to EDP.
Problem 1. Does there exist a matrix with 1s down the diagonal such that for every set
of the form
we have
?
We can also write as
. So this is another discrepancy problem. Moses, who came up with this problem, has done some computer experiments that suggest that such a matrix may exist. If it does, then it proves that there exist sequences of vectors
and
such that
for every
but for no set
of the above form is it the case that
, which would disprove a strengthening of EDP.
Problem 2. For which sets of HAPs is EDP true? Specifically, if you have a sequence of length
for sufficiently large
, must there be a HAP with common difference a prime or a power of 2 on which the discrepancy is at least
? And what about a HAP with common difference 1 or else a HAP of the form
?
In both cases, these are minimal reasonably natural sets of HAPs that rule out certain counterexamples. In the first case, the powers of 2 are there to stop the counterexample 1,1,-1,-1,1,1,-1,-1,… and others like them. In the second case, allowing all lengths for HAPs of common difference 1 is to stop you taking the first terms to be 1 and the rest to be -1.
Also, a “morally necessary” condition if you want unbounded discrepancy is that the HAPs should form a spanning set. If they don’t, then you can find a sequence (though not necessarily a sequence) that has zero discrepancy on any of them. Even if that doesn’t quite disprove EDP for that collection of HAPs, it certainly rules out the kinds of methods we are trying to use to prove it, and it also rules out vector-valued strengthenings of EDP.
To finish, let me mention that Klas Markstrom seems to be getting close to the first rigorous (and very computer-assisted) proof that there is no infinite sequence of discrepancy 2. See this sequence of comments to get an idea of where he has got to. I think we can forgive Mathias for the admission in his paper that “Repeated efforts to improve the proposition to the case
have failed.”
March 23, 2010 at 11:05 pm |
“And what about a HAP with common difference 1 or else a HAP of the form
?”
This is a quick thought late in the evening so I might be completely off. Can’t you get around this set by using an alternating sequence up to n/2 and then switching to alternating in the opposite way?
March 23, 2010 at 11:52 pm
Hmm … that looks right.
In which case perhaps the next question is to try to find some kind of property of sequences that have bounded discrepancy on all full HAPs and then try to add in extra HAPs that look as though they rule out all sequences with that property.
A first step might be to try to describe sequences that have zero inner product with all full HAPs with common difference at most
. I’ve no idea whether there is anything nice to say here.
March 24, 2010 at 12:00 am
I think this works. All the HAPs of odd common difference are clearly ok. They can have a discrepancy of at most 1 on {1,…,n/2} and at most 1 on {n/2,…,n}. Consider a HAP of even common difference. It has large discrepancy on {1,…,n/2} and large discrepancy on {n/2+1,…,n}, but these two contributions have opposite signs and differ by at most 1 in magnitude. Here we are using the fact that the HAP extends all the way to the end of the interval. Hence the overall discrepancy is at most 1.
March 24, 2010 at 10:34 am
This could be optimized a bit by instead of using two subintervals using
subintervals. This should give a maximum discrepancy of square root order instead of linear.
March 24, 2010 at 3:27 pm
Anonymous in this case was I. For some reason my mobile phone ignored my name.
April 20, 2010 at 8:30 am
Here is another problem on restricted sets of HAPs.
Assume that we look at sequneces of length
and include the HAPs with difference 1, and the HAPs with difference
for all prime
.
For
sequences with bounded discrepancy are easy to construct. How small can we make
without getting a large discrepancy?
March 24, 2010 at 12:19 am |
[...] new Thread By kristalcantwell There is a new thread for Polymath5. It looks like Klas Markstrom is getting close to a proof that for a large enough N [...]
March 24, 2010 at 2:15 am |
I just want to make a comment about the SDP approach. Let me recap some of the stuff we know. By now, we probably have some good(?) guess of the tail dual variables
‘s: they should be of some form similar to
. I think we have some reasonable guess for the
as well, e.g. number of divisor of
divided by $d$. Suppose, we enforce these guesses to the problem. Then, the primal becomes the following.
Minimize the sum
subject to
for all
.
This problem is similar to the SDP for max cut so I think it is informative to look at it with the algorithms for the max cut SDP in mind. There is a primal-dual algorithm by Klein and Lu for max cut that is applicable to our SDP as well. It works roughly as follows. Recall we associate a dual variable
with the constraint
. Says we make some guess of these variables, not their exact values, but only their ratio $b_i/b_1 \forall i$. For symmetric problem like Roth’s theorem, it is much easier to guess the ratio of the $b_i$’s (they should be equal), but maybe we can do it here as well. With these guesses, we can rewrite our problem as follows.
Minimize the sum
subject to
for all
.
Let
be the Gram matrix of
. We can rewrite the above problem as minimizing
subject to
and X being positive semidefinite.
First notice that the minimum is achieved by a rank 1 matrix
. In fact, this is exactly the problem of finding the minimum eigenvalue of a matrix! Define a vector
with
, then
is the eigenvector corresponding to the smallest eigenvalue of of the matrix
with
.
Actually for any choice of
, the smallest eigenvalue would give a lower bound on the discrepancy: the squared discrepancy should be at least
.
Thus, this gives a way for verifying guesses and coming up with bounds without fully solving the SDP: we can guess the tail variables and the ratio of the
, and then either numerically or theoretically bound the smallest eigenvalue of the scaled matrix
.
Better yet, it also give a way to improve one’s guess, which is always a feature of primal-dual approaches. Namely, we can look at the matrix
, and check which constraint
is violated and by how much. The more severely a constraint is violated, the more we should increase its weight
.
March 24, 2010 at 8:47 am
That is very interesting. Let me see if I can extract from it the main thing that one needs to know in order to use it. I’m doing this partly as a test of my understanding.
The basic SDP appoach is to find a matrix M made out of HAPs and a diagonal matrix D with large trace such that M-D is positive definite, and hence
whenever
is a
vector. It also tells us that if
is a matrix with columns
, then
, where
is the ith entry on the diagonal of D.
If we let
, then this becomes
, since
, the matrix with columns
, equals
.
Let
. Then we are trying to minimize
. Writing the numerator and denominator out in full we get
and
, from which it is clear that the best we can do is minimize the ratio
(this is your remark that the minimum is achieved for a rank-1 matrix), which is obviously achieved when
is an eigenvector of
with minimum eigenvalue.
The conclusion is this. Suppose we make a guess about what
and
should be. We can then easily form the matrix
. If we can then find the eigenvector with minimum eigenvalue, we either get a proof that the approach has worked, or we get a rather informative violation of what we hoped would be true.
I think it might be very useful to start with a simple but bad guess, like taking all HAPs and taking D to be the identity matrix. Perhaps in that case, or something a bit like it, we could determine explicitly what the eigenvector with minimum eigenvalue is and see what happens in the first iteration of the algorithm. That might help us guess what the final state will look like. (I wouldn’t be altogether surprised if the minimal eigenvector was closely related to 1,-1,0,1,-1,0,1,-1,0,…)
March 24, 2010 at 3:59 pm
Another point that Huy makes is that the bound you obtain from this is a valid lower bound even if you did not iterate and try to converge to the best choice of
. If you chose a reasonable combination of HAPs to produce the matrix M, and made a good guess about what the
s should be (perhaps by improving earlier guesses a few times), then we might have a good bound already. In any case, the bound is always valid – the eigenvalue computation tells us by what factor the
s should be scaled to produce a valid bound.
Huy, how does the algorithm update the
values after the eigenvalue/eigenvector computation ?
March 24, 2010 at 11:27 pm
The original update rule for
is based on the framework by Plotkin, Shmoys, and Tardos for packing and covering linear programming. Below is a similar update rule based on a survey paper by Arora, Hazan, and Kale. All errors are mine, of course.
Suppose we want to approximate the SDP solution up to a factor of
. We are probably fine with any constant approximation so it is fine to set
to be, says,
.
First, set all
. We maintain
, which can be thought of as a distribution over the constraints. We want to find min
subject to
PSD and
. Compute the smallest eigenvalue and the corresponding unit eigenvector
of the matrix
. We get the solution matrix
. The
is updated to reflect how important that constraint is i.e. constraints that are violated get increased weight while constraints that are satisfied get decreased weight. Namely,
.
At the end, the solution $X^*$ is an average over all the rank-1 matrices found during the execution of the algorithm.
March 24, 2010 at 11:29 pm
Sorry, I made a few mistakes in the previous comment. The update rule should read
.
March 25, 2010 at 2:27 am
Sorry, there was another thing I forgot, which might be of some importance if one wants to actually implement this. The update rule is actually
where
, called the width, is an upper bound on
returned by the eigenvector computation. Intuitively, we don’t want the update rule to make radical change in
in one step, thus the scaling by
. The number of iterations is proportional to the width and a function of $\epsilon$. If we just let
to be any combination of HAPs,
could be just anything, e.g. 0, so the width
could be huge. Thus, to control the width, one would actually need to modify
slightly, e.g. use all length-
HAPs so that all
are not vanishingly small, and might have to use somewhat different update rules. I did not explain the conditioning steps in full since it is somewhat tedious and it would be easier to read the paper anyway at http://www.cs.brown.edu/research/pubs/pdfs/1996/Klein-1996-EAA.pdf
March 24, 2010 at 8:25 am |
Regarding Problem 1, I think it would initially be worth looking for solutions of the form
, where
is a function on the rationals satisfying
and
for all
. (This is analogous to looking for multiplicative sequences of low discrepancy, and at least might force numerical solutions to exhibit more structure.)
March 25, 2010 at 5:26 am
Yes, this might be a good restriction to impose on the solution. With this in place, in order to answer Problem 1, we only need to ensure that
for sets
of the form
since the multiplicative condition gives the bound for sets
As Alec points out, imposing this condition on the matrix corresponds to the multiplicative version of the “non-symmetric” vector-valued EDP. The hypothesis is that given any two sequences
and
of vectors in a Euclidean space satisfying (1) the diagonal condition
for every
, and (2) the multiplicative condition
for every
, there are HAPs
and
such that
. In fact, with the multiplicative condition in place, we only need to consider pairs of HAPs
with common differences
such that
.
March 25, 2010 at 6:53 am
It looks like imposing the multiplicative condition does not change the value of the optimal solution (for small values of
that I looked at). Also the optimum value of the LP seem to be
. As mentioned here , I looked at HAPs over
. There still seem to be some degrees of freedom in the optimal solution, so the particular solution we get is somewhat arbitrary. Ideally, all variable values are multiples of
. This happens sometimes but not always.
March 25, 2010 at 7:12 am
Moses, would you be able to post some of your optimal solutions for small
I’d be interested to know if there is any sign of multiplicativity in the sequence
, for example.
March 25, 2010 at 1:09 pm
Sure, look at the files mult*.txt here:
http://webspace.princeton.edu/users/moses/EDP/try-lp/
Format is
i … j …
I have solutions for n=8,18,23,28,38,48 (since the denominators are likely to be nice numbers for these values). Only relatively prime pairs i,j, with i<= j are output.
March 26, 2010 at 5:24 am
Even though we haven’t quite figured out the LP solutions for Problem 1 obtained by imposing this multiplicative condition, it might be worthwhile to impose this condition in constructing diagonal representations. What this means is that the diagonal representation we construct will be a lower bound for the “non-symmetric” vector-valued EDP with the additional multiplicative constraint
. This should be interesting in itself, since any lower bound we construct for this problem would automatically be a lower bound for the HAP discrepancy of multiplicative
sequences. Perhaps more importantly, proving lower bounds for this restricted problem may be an easier intermediate step and give us a handle on the general problem.
We have some additional flexibility in constructing diagonal representations for this restricted setting. Let
be the matrix we express as a convex combination of
. Then all off-diagonal entries of
need not be zero. In fact, we need to satisfy the weaker condition that certain sets of off-diagonal entries sum to zero. The condition is this: for all
such that
,
. This also implies that in constructing the convex combination, we only need to consider terms
with common differences
such that
.
March 25, 2010 at 4:48 pm |
http://www.ams.org/news/home-news.html#abel-2010
March 26, 2010 at 6:57 am |
Perhaps a weaker version of EDP, that we may yield so some techniques from extremal or probabilistic combinatorics is this:
Show that the set +- 1 sequences of length n with bounded discrepancy has entropy o(n). (Namely, the number of such sequences is
.
March 26, 2010 at 8:05 am
We thought about this question way back here. I still rather like it though, and I think it could be easier than EDP itself but still interesting.
March 26, 2010 at 8:54 am
(The link “here” is broken)
Not sure how that happened, but I’ve fixed it now.
March 26, 2010 at 12:25 pm
Tim, on that note, the link to Mathias’s paper is broken in your EDP13 post.
Also now fixed — I don’t know what’s going on.
March 27, 2010 at 8:29 am
Yes indeed Tim raised this question early on and proposed to use the Sauer-Shelah theorem (or equivalently to show that the VC-dimension of small discrepency sequences is o(n)). One nice thing about this direction is that it may well apply to sequences with +1 -1 and 0s or sequences of arbitrary integers. (Or when we talk about approriate versions of Housdorff dimension rather than VC-dimension even for sequences of arbitrary complex numbers.)
Another nice thing about these questions is that they may be easier than EDP.
A strong form of these conjectures (which may well strengthen EDP) would be that when we fixed the permitted discrepancy T and consider infinite sequences (of +1,1,0; of arbitrary integers; of arbitrary complex numbers), then the dimension of sequences of bounded discrepency is bounded.
March 27, 2010 at 9:19 pm
Maybe we can start by showing small entropy o(n/logn) or even much smaller for completely multiplicative functions with values +1/-1. Perhaps we can look at a large but not huge interval of integers, then look at all primes involved in integers at that interval and then show that we can assign them +-1 as to force large discrepency on that interval.
March 28, 2010 at 9:04 pm
The specific suggestion of my previous comment is based on late-night favorable quantifier-confusion. But still I thing that Tim’s Sauer-Shelah-VC-dimension approach towards bounding the entropy of multiplicative bounded discrepency sequences should be persued some time. Does the harmonic analysis reduction to the (slightly generalized) multiplicative case apply to the entropic version of EDP?
March 28, 2010 at 8:26 pm |
I have been looking at this entry in the wiki:
http://michaelnielsen.org/polymath1/index.php?title=Forced_Drifts_in_Multiplicative_Sequences
If the drift of a sequence is 5 then the discrepancy must be at least 3. So in it there is a proof that a multiplicative sequence of length 5298 or more must have discrepancy 3. It is rather long though.
If the drift of a sequence is 3 than the discrepancy must be at least 2. So here we have a proof that if a sequence has length 18 the discrepancy is at least 2. Here the proof is much shorter.
Possibly our proof that a sequence of length 247 has discrepancy 3 can be modified to get a limit lower than 5298 than for the maximum multiplicative sequence with drift less than 5.
March 29, 2010 at 5:19 pm
You’re off by one, we need a drift of 6 to guarantee discrepancy 3. For example, – - + + + + + has drift of 5 (from -2 to 2).
March 29, 2010 at 6:16 pm
If there are 5 consecutive 1′s in a row then wouldn’t they add 5 to the value of the sum of the numbers before them? So in your example the -2 should would have 5 added to to it to get 3 which would push the discrepancy to 3.
March 29, 2010 at 5:27 am |
I want to point out a small observation about the representation-of-diagonals approach. Recall that the goal is to express a diagonal matrix as
where
, and the lower bound (on the square discrepancy) obtained from this is the trace of the diagonal matrix. In fact, we don’t need to completely cancel off all the off-diagonal elements. If we obtain a (not necessarily diagonal) matrix
, then this gives a lower bound of
. This sounds similar to the variant discussed earlier where we said we need to produce a diagonal plus a psd matrix (instead of a diagonal plus a zero matrix).
The proof that
is a lower bound follows from the fact that for a
vector
,
. Now plug this into Tim’s proof (see his EDP12 post) to conclude that there exist HAPs
and
, such that
.
This is not a significant change, but it gives a little more breathing room in allowing almost diagonal representations.
March 29, 2010 at 10:49 am
I like this observation, since getting rid of the last few entries looked as though it might be quite troublesome. I suppose one could consider taking it a step further and saying that we don’t even need
to be small, since that is not a necessary condition for
to be small for every
vector
. For example, if
is a random
matrix, then the maximum possible value of
will not be
, but something much smaller like
I think. But pursuing this line of thought could just lead us straight back to the SDP approach, so it may well not be much extra help.
March 29, 2010 at 11:37 am |
This comment contains some rather vague and miscellaneous thoughts about where we stand.
1. First, an attempt to articulate what seems to me to be a serious difficulty with what we are trying to do. We have shown that EDP follows if we can find certain ways of decomposing matrices. From a look at the experimental evidence, one gets the impression (perhaps wrongly) that there may not be an explicit formula for the decomposition. If that is the case, then one needs to find a way of proving nonconstructively that a decomposition exists. Now a tempting move is to observe that if we cannot find a decomposition of the desired kind then there must be a functional that proves this. But that functional turns out to be precisely a counterexample to some strengthening of EDP and we are back where we started. So it seems as though we are forced to “construct” a decomposition (possibly with the help of probabilistic methods). How do we do that?
Huy had an interesting suggestion in the SDP case, and I think we should pursue that approach at some point soon.
2. If we diagonalize a symmetric matrix
, we end up finding an orthonormal basis
and showing that
. Equivalently, we factorize
as
for some orthogonal matrix
and diagonal matrix
. That tells us that
. The
th entry of this matrix is
. If the
th column of
is
, then this tells us that the matrix is
. So maybe, just maybe, we could define a symmetric matrix
in some natural way so that we expected its eigenvectors to be efficiently expressible as linear combinations of HAPs.
As usual, I haven’t checked completely carefully that everything I’ve just written is 100% correct. But the general point stands: it may be helpful to think about what kinds of symmetric linear maps are likely to have HAP-ish eigenvectors.
Of course, we also want other properties: even if we can find a decomposition we need it to be efficient. So this is not a suggestion for how to solve the problem, so much as a suggestion for how perhaps to get slightly closer to a solution.
3. The kinds of matrices that are likely to have eigenvectors built out of HAPs are matrices that are themselves built out of HAPs. We could even build the matrices together with their diagonalizations. To do this, we would want to solve the following problem: find an orthonormal basis where each vector is a linear combination of HAPs and do it in a “non-trivial way”. (I’m not quite sure what the last condition means, but I don’t want to take e.g. the standard basis.) Having chosen the orthonormal basis, we would also want to choose the diagonal map. Perhaps one could do that in a clever way that would cause quite a few coefficients to cancel. (Again, this is meant as a rather vague thought.)
March 30, 2010 at 7:25 am
Continuing in speculative mode …. I was wondering whether we might learn some lessons from the SDP solutions about what good test vectors might be (for the representation of diagonals approach). Alternately, think of these as rather vague thoughts on how we might build matrices out of HAPs so as to have eigenvectors built out of HAPs.
1. Let
denote the characteristic vector for the HAP
. Let
. The SDP solutions we have looked at suggest that we should take
. Ignore the constants
for now. The SDP data seems to suggest a conjecture along the following lines: There exists a sequence
with unbounded sum such that for any vector
,
for some
.
Do the matrices
suggest any particular test vector made from HAPs of common difference
? I don’t know how to express
, but here is a small observation: Let
, where the
signs are chosen independently and at random. Then
. Now I am not sure this is particularly useful, because we now believe that we should not be able to get a good diagonal representation solely from vectors of the form
. A positive answer to Problem 1 (in Tim’s EDP13 post) says that we ought to be mixing HAPs of different common differences. But perhaps random vectors such as
could be part of the solution.
2. One way of mixing vectors of different differences would be to compute the orthogonal component
of
relative to the other
(where ‘other’ could refer to all
or all
say). I'm not sure where this will lead us, but if we construct the matrix
, then we ensure that the eigenvectors are precisely the vectors
(assuming they are normalized to unit vectors) with corresponding eigenvalues
. (Here we used the fact that the
are mutually orthogonal.) The expressions
lead to mixing HAPs with different common differences.
A couple of issues with this: Firstly, the vectors
as defined are random vectors, so I am not sure how we would reason about orthogonal components etc. Secondly, eventually, we want to prove a lower bound of the form
. Similar to Huy's suggestion about the SDP approach earlier, we may need to scale all the vectors we are dealing with by guessing
in advance, and then dividing the
th coordinate by
. Now compute orthogonal components etc for these scaled vectors instead of the original ones. But this seems quite messy, not to mention that it requires us to guess the
values which we don't have a good handle on.
On another note, I am out of town for the rest of this week and may not be able to participate in the discussion until I get back.
March 30, 2010 at 3:39 pm
On the same note as your last, I too will be away for a few days. I may be able to participate to a limited extent, but that is the most it will be until early next week.
March 29, 2010 at 11:00 pm |
If a sequence has discrepancy less than four then at odd numbers the discrepancy will be plus or minus three or plus or minus one. If there are two consecutive plus ones or two consecutive minus ones starting at an odd number I claim that the value of the discrepancy at the starting point must be plus or minus 1. If It had value 3 or minus three then it would have the value 4 or minus 4 at the point proceeding or after it so those values are forbidden.
This is an attempt to generalize the result that if the discrepancy is less than two than if there are two consecutive ones or negative ones at an even number the value of the sum must be zero there.
March 31, 2010 at 8:50 pm
I have an idea on using this to start a proof that the discrepancy of every multiplicative sequence must be at least 4. The idea is to start with a number that must be plus or minus 1 and then add at least 5 to it or subtract at most 5 and get plus or minus 5.
The starting point is http://michaelnielsen.org/polymath1/index.php?title=Forced_Drifts_in_Multiplicative_Sequences
particularly the cases showing the drift must be 5. For about two
or three hundred of the cases the starting point for the sequence of drift 5 is odd. If I have worked that out correctly that is when the difference is odd. For these numbers if they start with two values with the same sign then we can apply the above theorem. This would eliminate all but 300 cases. Now that is a lot of cases but it is still possible to deal with them if they are easy enough. I will be looking at some of these cases and seeing if this can be made to work.
March 31, 2010 at 10:32 pm
I never posted it, but the drift 6 case with f(2)=1 finished quickly. For f(2)=-1, the code wouldn’t make it across
.
BTW, I just posted the code that generated the “forced drifts in multiplicative sequences” data.
April 2, 2010 at 8:48 pm
I have been looking at the cases showing the drift must be 5 and so far I think I have the following:
If there is a multiplicative function with discrepancy 3 or less then:
If f(2)=1
then f(3)=f(5)=-1
If f(2)=1 and f(7)=1
then f(11)=f(13)=f(17)=-1
Let me assume I have the drift 6 case with f(2)=1 finished
Now we have from above this result:
If there is a sequence with discrepancy less than four there are two consecutive plus ones or two consecutive minus ones starting at an odd number the value of the discrepancy at the starting point must be plus or minus 1.
if the sequence starts at an even number the values must be 0 or plus or minus 2.
Then look at where the plus six sequence starts it must start with two values of the same sign or it contains a smaller +6 sequence the by the above two results the value at the start of this +6 sequence must be plus or minus 1, plus or minus 2 or zero. So we have that in that case the discrepancy must be 4.
So we would have if a multiplicative sequence has discrepancy less than
4 it must have f(2)=-1.
I would like to see the partial results for a drift of 6 because it would by the above reasoning allow me to deal with some of the open cases as above. Also I would like to try to complete the proof of drift 6 as I have a feeling I might be able to deal with some of the open cases.
April 3, 2010 at 4:34 am
I’ve now posted the drift=6 data (for f(2)=1). The data is unformatted; sorry for that. I’ve lost my Mathematica to HTML code that did the other tables.
April 4, 2010 at 10:22 pm
I have made an error in the above posts. I need drift 7 to force discrepancy 4. Drift 6 only works for one where the drift starts with consecutive values of the same sign starting at an odd number. Drift 5
can be used to prove discrepancy 3.
March 30, 2010 at 8:38 am |
This appeared on arxiv today, but I haven’t had time to take a proper look at it yet.
http://arxiv.org/abs/1003.5388
March 30, 2010 at 7:30 pm
The result is that if the sum of the prime values of f (f is a multiplicative function from one to zero) less than x are large enough then the multiplicative function diverges. If we can show that a bounded f forces a bounded multiplicative then this result if it holds would allow further restrictions on a bounded f.
Also if we have a game in which player one has 2 moves per turn and player two has one and player one is trying to force a large discrepancy and player two prevent it then if this theorem is true the game will always be a win for player one. And this is true for any variation where player one has enough extra moves to force a large enough discrepancy in the signs of the primes.
This looks like an interesting paper.
April 3, 2010 at 4:19 am |
I’ve been doing some computations on the “least nonzero digit”-type constructions. That is, fix an integer m and a subset A of
, and set
if the least significant nonzero digit of n (base m) is in A, and
otherwise.
Our favorite
comes from
, but also from
. Our second favorite
comes from
, and is our current record-holder for slowest growing discrepancy. And lest one think that only small examples are interesting,
yields a function with smaller discrepancy than
.
A little arithmetic (spelled out below) reveals that
has logarithmic discrepancy if and only if A contains exactly half of the nonzero elements of every additive subgroup of the integers modulo m. If this condition is met, then
where
is the maximum absolute value (over
) of
We should choose A so as to make
as small as possible; set
be the resulting functions. Set also
, the coefficient of
in the discrepancy of
. So c(m) is what we want to minimize.
The above-mentioned computations are of the triples
.
(3, 1, 0.910)
(5, 1, 0.621)
(7, 2, 1.028)
(9, 1, 0.455)
(11, 2, 0.834)
(13, 2, 0.780)
(15, 2, 0.739)
(17, 2, 0.706)
(19, 3, 1.019)
(21, ?, ?)
(23, 4, 1.278)
(25, 2, 0.621)
(27, at most 3, at most 0.910)
(29, 3, 0.891)
(31, 4, 1.165)
(33, ?, ?)
(35, ?, ?)
(37, at least 4, at least 1.108)
For most of these, there are very few sets A that work so well; for m=29 the set mentioned earlier is the unique (well, with its negative) set with M(29,A)=3. My computation so far has been focused (for simplicity) on prime values of m.
Note that for any positive integers m,r, with m odd, we have
so that c(m) does not conveniently go off to infinity. However, over prime values
. So there is some real benefit to composite m. I haven’t ruled out M(105)=2, which would be a new record-low discrepancy (there’s no reason to think that M(105)=2, save fantasy, but I see no reason to think it doesn’t, either).
Now to justify the earlier claim:
has logarithmic discrepancy if and only if A contains exactly half of the nonzero elements of every mod-m subgroup. First, the easy direction. If A does not contain exactly half of the subgroup generated by d, then
grows linearly with
, and so we get linear discrepancy.
Now suppose that A does have exactly half of every subgroup, which we use in the form
. First we work out
. Set
,
. We have
And since
is a multiple of m, we have
. We also have
Together, we have:
We can capture this recurrence by writing:
Now
, and the recurrence reduces the product
by a factor of
each time it is used. This give
.
April 5, 2010 at 7:54 am
Kevin, for the prime
that you’ve worked out, is the optimal set
always the set of quadratic residues modulo
?
April 5, 2010 at 3:48 pm
In summary, quadratic residues are the unique optimal for m in {3,5,17, 29}, are the nonunique optimal for m in {7,13,19,37}, and are not as good as optimal for m in {11,23,31}.
Just being concerned about the coefficient of the log in the discrepancy gives some symmetries: all of A, t*A (the dilation of A mod m by a factor of t, with (t,m)=1), A’ (the complement of A in {1,2,…,m-1}, and t*A’ give the same coefficient. Ignoring those symmetries, here are the optimal A I’ve found so far:
m=3, A={1} (quadratic residues)
m=5, A={1,4} (quadratic residues)
m=7, A={1,2,4} (quadratic residues)
m=7, A={1,5,6}
m=11, A={1,2,4,7,9}
m=13, A={1,3,4,9,10,12} (quadratic residues)
m=13, A={1,2,4,9,11,12}
m=17, A={1,2,4,8,9,13,15,16} (quadratic residues)
m=19, A={1, 4, 5, 6, 7, 9, 11, 16, 17} (quad. res.)
m=19, A={1, 2, 3, 5, 7, 12, 14, 17, 18}
m=19, A={1, 2, 3, 5, 9, 10, 16, 17, 18}
m=19, A={1, 2, 3, 6, 7, 12, 13, 16, 18}
m=19, A={1, 2, 3, 6, 7, 12, 13, 17, 18}
m=23, A={1, 2, 3, 4, 6, 8, 11, 12, 15, 17, 20}, and 567 other equivalence classes of optimal A, none of which are the quad. res.
m=29, A={1, 4, 5, 6, 7, 9, 13, 16, 20, 22, 23, 24, 25, 28} (quad res)
m=31, A={1, 2, 3, 4, 6, 8, 11, 12, 19, 23, 25, 27, 28, 29, 30}, and 51 other equivalence classes of optimal A, none of which are the quad. res.
m=37, A={1, 3, 4, 7, 9, 10, 11, 12, 16, 21, 25, 26, 27, 28, 30, 33, 34, 36} (quad res), and 12 other equivalence classes of optimal A
April 5, 2010 at 10:18 pm
Latest data implies M(105)>2, so it can’t set a record. All that’s needed for a record is an odd m strictly between
and
with
, or to show that
for some r. For r=1, this is impossible by computation for m=11 and because any sequence of length 12 has homo. disc. at least 2 even without wrap-around. I was hoping that perhaps m=105 could be worked with by understanding m=15,21,35, but even if that's the case we must have $M(105)\geq M(21) =3$, so that it won't be a record.
I'm not convinced that M(m)/log(m) is bounded away from 0 (it certainly seems likely), but I don't see any way to make progress on this.
Here's the new data (giving one example for each modulus)
m=3, M=1, A={1}
m=5, M=1, A={1,4}
m=7, M=2, A={1,2,4}
m=9, M=1, A={1,4,6,7}
m=11, M=2, A={1,2,4,7,9}
m=13, M=2, A={1,3,4,9,10,12}
m=15, M=2, A={1, 3, 4, 6, 10, 11, 13}
m=17, M=2, A={1,2,4,8,9,13,15,16}
m=19, M=3, A={1, 4, 5, 6, 7, 9, 11, 16, 17}
m=21, M=3, A={1, 2, 3, 6, 8, 13, 14, 15, 17, 20}
m=23, M=4, A={1, 2, 3, 4, 6, 8, 11, 12, 15, 17, 20}
m=25, M=2, A={1, 2, 6, 7, 11, 12, 15, 16, 17, 20, 21, 22}
m=27, M=2, A={1, 4, 6, 7, 10, 13, 15, 16, 18, 19, 22, 24, 25}
m=29, M=3, A={1, 4, 5, 6, 7, 9, 13, 16, 20, 22, 23, 24, 25, 28}
m=31, M=4, A={1, 2, 3, 4, 6, 8, 11, 12, 19, 23, 25, 27, 28, 29, 30}
m=33, M=3, A={1,2,4,7,9,12,13,15,18,20,22,23,24,26,29,31}
m=35, M=3, A={1,3,6,8,10,13,14,17,20,21,22,24,25,27,29,31,34}
m=37, M=4, A={1, 3, 4, 7, 9, 10, 11, 12, 16, 21, 25, 26, 27, 28, 30, 33, 34, 36}
m=39, M=3, A={1,2,7,9,11,12,13,14,15,19,20,24,25,27,28,30,32,37,38}
m=81, M=2, A=({1,4,6,7}+{0,9,18,27,36,45,54,63,72}) \cup (9*{1,4,6,7})
April 7, 2010 at 6:10 pm
m=41, M>3
m=49, M=2, A=({1,2,4}+{0,7,14,21,28,35,42}) \cup (7*{3,5,6})
April 11, 2010 at 8:54 pm
More data:
M>2 for m=43,47,51,53,55,57,59,61,63,65
M=2 for m=45 (A={1,3,4,9,10,11,12,14,16,19,21,25,26,29,30,31,34,36,39,40,41,44})
Currently working on: M(121).
April 3, 2010 at 11:54 am |
Here is a question in harmonic analysis to which the answer must surely be known (and negative). A positive answer would imply a counterexample to the Erdős discrepancy conjecture. But I’m interested in the negative answer too because I hope it might involve a construction that’s relevant to EDP.
Given a sequence
of complex numbers with
(or we could insist that
), consider the partial sums of the power series:
.
Is there such a sequence, and a constant
, such that
for every
and every
in the closed unit disc?
Any such sequence is a counterexample to the Erdős conjecture, because
, which is bounded by
.
By the maximum-modulus principle it’s enough to consider the bound on the unit circle. Assuming that the answer is negative, it would be interesting to have a proof that explicitly gave us a sequence
where
and
as
. Then the
would be differences at which the sequence would have to ‘work hard’, in some sense, to have bounded discrepancy.
April 3, 2010 at 2:56 pm
If I’ve got the right question here after a quick reading, you find that the average of
over all
in the unit circle is
.
April 3, 2010 at 4:30 pm
So it is …
April 9, 2010 at 7:38 pm |
I’m quite busy at the moment, so for a while I still don’t have time to put in a lot of effort on EDP. But in order to demonstrate that I haven’t given up on it — I’m just having a rest — I thought I’d write a quick comment.
I’m still thinking about trying to find a good decomposition of a diagonal matrix. I’m going to assume that this matrix is non-negative (even though we don’t actually need that to be the case). So let us call it
.
Now let
be an orthogonal matrix and let the columns of
be
. Then
can easily be checked to be
. Since that is the identity, it follows that
. Now the columns of
are
, so
.
This gives us another way we could perhaps approach finding a diagonal representation. The idea would be to begin by finding some natural orthonormal basis — an obvious choice would be the trigonometric functions — then multiply them pointwise by a vector
(the coefficients of the diagonal matrix
), and hope to choose the
in some clever way to make the resulting vectors
efficiently representable as linear combinations of HAPs.
It might sound as though a serious disadvantage of this approach is the usual difficulty that we don’t know how to guess the
. But I’m not sure if that’s true: we could begin by looking at which vectors we find we can represent efficiently by HAPs, and then use those to guide us to our choice of the
. And we also have some clues from the experimental evidence about which diagonal matrices can be represented.
April 11, 2010 at 9:10 pm |
Set
as defined a few posts ago, and recall that we want
to be small. From exhaustive computations, we get that M(49)=2, which is quite possibly the largest m with M(m)=2. In any case, when m is a square we seem to get smaller-than-usual M’s. Also, M(27) is smaller than nearby values.
This led me to consider whether there might be a finite field construction of the sets A that work well. The way finite fields are used to generate as-good-as-random constructions is to use the connection between the cyclic multiplicative group and the representation of
in some fixed basis. This isn’t so nice for us, since the multiplicative group will always have even order, whereas we need m to be odd.
Wait, the mult group isn’t always of even order. If
, then the multiplicative group has order $m=2^k-1$, which is at least odd. This still isn’t so nice for us, though, since such m will be prime for infinitely many k (no, I can’t prove that), and we know from Roth’s result that $M(p) \gg p^{1/4}$ for primes
.
Short story even shorter: I don’t see finite field arithmetic generating any good EDP constructions.
My machine is chugging away on deciding if M(121) is 2 or 3 or 4, and I expect that it will finish in a week or two.
April 12, 2010 at 9:58 am |
From a double counting argument we get
(with
the divisor counting function) as a necessary condition. This condition is weak, however it led me to consider the divisor counts near 246 (the length of the longest completely multiplicative sequence with discrepancy 2) and 1.124 (the length of the so far known longest sequence with discrepancy 2). The proofs we have so far do not give us an idea of what makes these numbers so special. Now, a heuristical argument could be: 247 is near to 240, which is the first number with 20 divisors. In the non-c.m. case the situation is as follows: 840 has 32 divisors, 1.080 has 32 divisors and 1.120 has 24 divisors (still a lot). Either this divisor count heuristics is just a coincidence or it does only hold for c.m., or 1.124 is probably not optimal. Unfortunately, my lap top has not enough power to do searches beyond length 400. 1.260 (36 divisors) would be the next maximum.
April 12, 2010 at 6:25 pm |
Here are a few questions that I am curious about and maybe they can be answered easily:
1) How is our basic
multiplicative example (based on the least significant non zero digit in the ternary expansion) behaves if we restrict it just to the first k primes (and let it be 0 on other primes).
2) How small maximum discrepency can we ensure simultanusly for a
multiplicative -1 +1 function and all its restrictions when we force the value 0 on a finite subset of primes.
3) suppose we just ask EDP for square free numbers. (I still tend to speculate you can get logarithmically low discrepency in this case as well.) How good is
? (Maybe this was already answered.)
4) How does the greedy-look-ahead algorithm for multiplicative sequences works. In this algorithm, given n, you choose the value of f on the kth prime as to minimize the discrepency (for all intervals [0,r] r <=n), when the values for all larger primes is set to 0.
I think there is a good shot that it will be logarithmic.
April 13, 2010 at 3:01 am
What do you mean by “EDP for square free numbers”?
April 13, 2010 at 6:11 am
Dear Kevin, here I mean the following: you want to assign the square free numbers values -1 and 1 so that when you sum these values for the square free numbers in every HAP the discrepency (absolute value of the sum) is small.
I would expect that just like the original EDP, you cannot make the discrepency uniformy bounded. But that you can make it grow logrithmically.
April 14, 2010 at 7:15 am |
I have been tied up with other things and taking a hiatus from EDP. I just want to summarize various LP/SDP approaches to EDP. What I am saying is nothing new, but is a summary of what we know about the various strengthenings of EDP we’ve considered and their relationships.
Consider a
sequence
, and consider the associated matrix
such that
, i.e.
. The approach is to prove something about
which would imply that the HAP discrepancy of
is large, i.e. that
is large for some
that is the characeristic vector of a HAP. In proving something like this, we need to exploit the fact that
is obtained from a
sequence. The various approaches differ in what properties of
they use giving rise to a sequence of strengthenings of EDP. At the very least, they use the fact that the diagonal entries are 1, and in fact some use only this fact.
Here is a sequence of conjectures, ordered from strongest to weakest — each is stronger than EDP and would imply EDP if true.
Conjecture 1 : For any matrix
with ones on the diagonal , there exists a HAP
such that
is large.
If true, this would imply EDP, because in particular, consider
. Then
. Thus
is large implies that the discrepancy of
is large. Unfortunately, this conjecture is too strong. Alec gave a construction of a matrix
such that
for all HAPs
.
Conjecture 2 : For any matrix
with ones on the diagonal , there exist HAPs
such that
is large.
The proof that this would imply EDP is similar to the argument earlier. Consider
. Then $P^T A Q = (X \cdot P)(X \cdot Q)$, hence
large implies that the discrepancy of
is large.
This conjecture is the basis for Tim’s representation of diagonals approach. We can express the problem of finding an matrix
such that
is bounded as a linear program, then the dual problem gives a lower bound for how small
can be made by choosing
apppropriately. In fact, the dual is exactly the problem of constructing the diagonal representation that Tim defined.
We have also considered the stronger
is large for HAPs
with the same common difference . Experimental evidence suggests that this is false (see Problem 1 in Tim’s EDP13 post). So far, we don’t have an explicit construction of a matrix
with ones on the diagonal that falsifies this conjecture, but I believe this should be possible.
Conjecture 1.5 :
Conjecture 3 : For any positive semidefinite matrix
with ones on the diagonal , there exists a HAP
such that
is large.
Note that
is positive semidefinite, hence the statement applies to it. This conjecture is the basis of the SDP approach to proving a lower bound on EDP. The problem of finding the “best” psd matrix
can be formulated as a semidefinite program and giving a lower bound for the value of this SDP by constructing a feasible dual solution amounts to producing a certain quadratic form, and subtracting a large diagonal term from it such that the remainder is positive semidefinite.
We ought to consider the analog of Conjecture 2 for psd matrices giving the potentially weaker statement:
with ones on the diagonal , there exists HAPs
such that
is large.
Conjecture 4 : For any positive semidefinite matrix
In fact, it turns out that Conjecture 4 is equivalent to Conjecture 3, since (as Tim pointed out earlier) for psd matrices
,
.
April 14, 2010 at 10:02 pm
Is there any relation to the Paving conjecture (aka Kadison-Singer conj)? (maybe this is related to EDP12′s discussion). It looks artificially similar.
April 15, 2010 at 6:56 am
I don’t know enough about Kadison-Singer to say if there is a connection. Gil, can you elaborate on what you think the relation might be ?
April 16, 2010 at 12:26 pm |
I was toying around with obryants description of log-bounded examples and our earlier discussion on flipping signs. Let’s see what we can get.
Let f be completely multiplicative, fix a prime p and define another completely multiplicative g(p):=-f(p) and g(q):=f(q) for prime q not equal to p. Then, since f and g coincide on non-multiples of p, the partial sum of g up to n satisfies
Using c.m. and the definition of g we end up with a recurrence
We solve this for
and get
For f with bounded discrepancy
this yields the following estimate
at prime powers
. Therefore, for a c.m. function to satisfy EDP it is necessary that the flipped function is log-bounded at the powers of the ‘flipped prime’.
Obryant’s log-bounded examples have discrepancy less than
(in his notation). In principle they could be flips of bounded discrepancy functions. I am not optimistic that this is the case, though.
April 20, 2010 at 8:21 am |
In case someone involved has some software for computing Haar wavelet transforms I think it could be interesting to see what such a transform of the long discrepancy 2 sequences look like, both for the general and multiplicative versions.
April 20, 2010 at 8:23 am |
Since it has been a bit quiet here now I can at least add a short progress report regarding the C=2 case. The computers are still working on the last stub and my original plan was to have it completed by now. However due to machine problem I had to restart some of the subcases from my backup copies and that set things back by about two weeks.
April 21, 2010 at 6:23 pm |
I have written “A gentle introduction to the 5th Polymath project”:
http://numberwarrior.wordpress.com/2010/04/21/a-gentle-introduction-to-the-5th-polymath-project/
I did not link to this blog, only to the wiki page. I figure anyone qualified enough to join in will be able to find their way here.
April 23, 2010 at 5:38 pm |
I have gone back to the Wiki and added a page for my graph theory generalization of the problem some 1000 comments back:
http://michaelnielsen.org/polymath1/index.php?title=Generalize_to_a_graph-theoretic_formulation
The prime factorization algorithm is horribly messy (and not entirely bug-checked).
I still have hope for a statement stronger than the EDP which will be a generalized combinatoric version of the problem.
March 23, 2011 at 6:36 pm |
[...] Computer experiments so far did not find a that grows slower than . For a given let . Then withis the largest such that there is an with . Proof idea: Do a computer search among all completely multiplicative discrepancy 2 functions with domain ( must be less than 247). A beating thus has at least discrepancy and thus implying . [...]
April 5, 2011 at 9:42 am |
[...] this post, Kevin O’Bryant has considered “least nonzero digit”-type examples . Encouraged [...]