This is another post for the sake of not having too many comments on any single post. I actually wrote a completely different one yesterday, but it contained some theoretical thoughts that are probably better held back until the project starts in earnest. So I’m going to make this post very short indeed.

A quick report on what’s going on right at the moment. We have just been fed some more data: several more sequences of length 1124 and discrepancy 2, some sequences that have HAP-sums bounded between numbers like -1 and 3 instead of -2 and 2, and some nice two-dimensional examples. The Polymath5 wiki has some details about these and other aspects of the problem.

If anyone feels like doing an experiment that I’d be interested to know the result of, they could investigate the following sequence. I want to know whether it is good for anything. It’s the simplest example I can think of of a sequence that appears to exhibit multiplicative behaviour but keeps breaking the pattern. Let be the completely multiplicative function that takes every prime to , where . Then define to be 1 or -1 according to whether the imaginary part of is positive or negative. (Of course, .) At what kind of rate does the discrepancy of this sequence seem to grow?

Another experimental question: are all the HAP subsequences of the new 1124 examples significantly different from all the HAP subsequences of the old one? In other words, is it reasonable to say that they are completely different examples, or are they more like modifications of HAP subsequences of the original one?

### Like this:

Like Loading...

*Related*

This entry was posted on January 9, 2010 at 7:47 pm and is filed under polymath5. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

January 9, 2010 at 8:49 pm |

Here are a few observations based on the data connected with the second sequence of length 1124.

First off, the 8-sequence starts out very similar to the original sequence, but becomes fairly different after a while. This I find interesting. It’s not hard to check it on the page linked to above. (It might be rather nice to have a third display with the 8-sequence directly below the 1-sequence so it was easy to see where they were the same and where different.)

Secondly, here are the starts of the 2-sequence and the 3-sequence. I’ve put them in pairs, for easy comparison. (Thus, the 2-sequence begins + – + + and the 3-sequence begins – + – -.) I’ve done the first 50 multiples. The only ones that are not reverses of each other are at (74,111) and at (82,123).

+ -/- +/+ -/+ -/- +/- +/- +/+ -/+ -/- +/+ -/- +/- +/+ -/+ -/- +/+ -/+ -/- +/+ -/- +/- +/+ -/+ -/- +/+ -/- +/- +/+ -/- +/- +/+ -/+ -/- +/+ -/- +/+ +/+ -/- +/- +/- -/+ -/- +/+ -/+ -/- +/- +/- +/+ -/+ –

So multiplication of n by 3/2 appears to multiply by -1. Peeping ahead a bit, I get the impression that the agreement remains remarkably good.

This suggests another idea for generating good sequences quickly. Perhaps one should artificially impose constraints of this kind (that multiplying by 3/2 reverses the sign) and initially search for sequences with the extra constraints. The fact that the constraints aren’t satisfied exactly would complicate matters, but perhaps one could have some other routine that came into play when one got into difficulties and made a few pairs go wrong. The main point is that the search space would be reduced a lot.

January 9, 2010 at 9:27 pm

The differences correspond to positions 37, 41, 71, 73 then 352, 353, 355, 356, 368, 372

ie the early entries (position 2 or 3) in the sequences for primes which tend to be on the irregular list plus entries near the end of the sequence.

The sequences for d=20 and d=26 are also remarkably close to that for d=3. Sequence d=14 differs at positions 37, 41, 47, 49, 56 and then more extensively. d=17 differs from d=3 first at position 23.

So this suggests that the search space could be constrained by looking to change these sequences in prime positions.

The sequences for d=5 and d=6 are close to identical. They differ at positions 59, 61, 176, 178, 184, 185, 186 ie again at primes and at late positions.

January 9, 2010 at 10:21 pm

d=8 differs from d=1 at positions 37, 41, 53, 67, 71, 73, 74, 82, 94, 98, 101, 103, 109, 112, 116 …

January 9, 2010 at 9:17 pm |

I have moved most of the experimental page on the wiki to subpages:

http://michaelnielsen.org/polymath1/index.php?title=Experimental_results

I have also added a wish list, with experiments we should try.

As I wrote on the talk page http://michaelnielsen.org/polymath1/index.php?title=Talk:The_Erd%C5%91s_discrepancy_problem (by the way, how do you “make words into links” on this blog? html-code?) we should take a look at our definition of quasi-multiplicative. Right now it is

Let us call a sequence quasi-multiplicative if it is a composition of a completely multiplicative function from to an Abelian group G with a function from G to {-1,1}.

But you can always choose G to be the group of positive rationals under multiplication, and the function to be the inclusion map.

We could say that G had to be finite, but then the sequence you define in you post is not quasi-multiplicative. We could also keep the definition, and notice that all sequences are quasi-multiplicative. Then we could still say things like: Let be a quasi-multiplicative with .

January 9, 2010 at 10:12 pm

That’s a good point. I’m thinking more and more that what we want is either a finite group or a topological group with a particularly simple function. For example, we might ask for the set of points that go to 1 to be open, though I think I want something simpler still.

How might one measure simplicity? The kind of thing I have in mind is that G would have a natural measure on it and if you look at the partition of G (into points that go to 1 and points that go to -1) and let G act on itself, then the images of the cells of the partition should have a good compactness property: for any there exists such that if you take any images of one of the cells of the partition, then two of those images must differ in a set of measure at most .

All this is of course inspired by ergodic-theoretic concepts. I’m thinking more and more that if the conjecture is true, then an infinitary approach could be a good one. I’ll say more about this in my next post.

January 10, 2010 at 9:21 am

I’d also been vaguely wondering about an ergodic approach to the problem. If we define a family of maps from sequences to sequences by , and we let be the set of sequences such that the sums are bounded by , then is an isomorphism between the monoid of positive integers under multiplication and the family (), and we are want to know whether . I was wondering if we could turn these maps into topological ones in a nice measure-theoretic context; but am not at all sure how.

January 10, 2010 at 9:36 am

My thoughts are similar (and in particular I have also been using the notation for that map), but not quite identical. What I thought we might be able to do is start with just one putative counterexample and take the closure of the set of sequences (in the product topology). Better than this would be to start by converting the original sequence into an example that works over the rationals, so now we would have a map for every positive rational . In particular, the maps would form a group and it would act on a compact space.

What I am hoping is that we would be able to prove some interesting facts about this action that would allow us to conclude that the sequence had some kind of multiplicative structure. A yet wilder hope is that we could end up with a contradiction.

January 10, 2010 at 1:02 pm

How do you make is well-defined for rational a? That would imply that you could determine the sequence from the and , without knowing d. Are you assuming some structure on ?

January 10, 2010 at 1:20 pm

Another good point. Let me think about it, because I think it may be possible to pass to an example where the maps are well-defined, even if they aren’t obviously well-defined to begin with. But I’m not sure about this yet.

January 10, 2010 at 1:51 pm

Of course it is possible if every HAP has a HAP identical to the original sequence, but that seems to be a strong assumption.

I don’t know if you can prove that every HAP has a HAP identical to the original sequence assuming that is well-define, but lets see what we can do:

Assume that is well-defined and that there is a HAP that don’t contain the original sequence. Now “x is a HAP of y and y is a HAP x” is a equivalence relation, and “x is a HAP of y” is a partial ordering on the equivalence classes. If the ordering has a maximal element, we would have a sequence that is a HAP of all its HAPs. I think this would contradict the fact that this is not the case for the original sequence and that is well-defined(?) So there must be a infinite sequence of sequences, such that if n>m then the sequence is not a HAP of . I don’t know if we can use this.

January 10, 2010 at 2:39 pm

This is just an idea. I’m not sure if it works:

We start with a . Now might be ill-defined, but we just define it to be one of the possible sequences (I think it should be one that is not a HAP of y if possible), and then we adjust accordingly. When we have done this for all sequences y and all primes p, we have a well-defined (using compactness argument, perhaps?)

January 10, 2010 at 3:11 pm

My idea was the following, but I also haven’t checked that it works. You take a sequence and define by setting to be the limit of along some ultrafilter. I think this should have some nice properties, but I haven’t had time to investigate it properly.

January 9, 2010 at 9:33 pm |

Tim, the discrepancy of your sequence up to appears at first sight to grow linearly with . Here are the discrepancies for :

103, 205, 313, 409, 539, 641, 752, 877, 1013, 1117

This is surprisingly fast growth. For large the discrepancy seems usually to be attained for , and to reflect a bias towards positive values. This must mean that the integer part of is biased towards even numbers as where . I’ve no idea why this should be so. (It’s possible that this phenomenon would disappear if one went much higher, as often happens with phenomena related to the distribution of primes. It’s also possible that I’ve made a mistake!)

January 9, 2010 at 10:00 pm

That’s disappointing, but perhaps it is indeed a small-number phenomenon — for instance, there are a lot of primes and they’ll all give positive answers, and unlike with the Liouville function this bias could well take longer to get rid of (because it isn’t as nicely cancelled out).

How about the following alternative suggestion: define to be alternately and as you go up the primes? That might share out the bias nice and evenly.

January 9, 2010 at 11:00 pm

The growth seems even faster now. I get the following discrepancies for :

167, 374, 535, 697, 899, 1053, 1238, 1405, 1593, 1731

This time the bias is towards negative numbers. So, with the above notation, the integer part of is apparently biased towards odd numbers (where ).

January 10, 2010 at 12:09 am

Funny — I was expecting the bias to be positive because I’d not spotted that would often be zero. One other possibility might perhaps be to send every prime to . But I’m losing faith in this idea now.

January 10, 2010 at 8:38 am

Ah, yes, the zeros rather spoil the calculations, because it’s not clear whether the function should be or . I was testing for the imaginary part being strictly positive, so got a lot of dubious s. If you make the function when then you get a positive bias that is stronger (the discrepancy at being ).

However, I strongly suspect that the apparent linear growth is a small-number phenomenon.

I’m not sure I understand your last suggestion: wouldn’t this just give the negative of the first sequence, except for ?

January 10, 2010 at 8:39 am

Oops, I mean at .

January 10, 2010 at 8:53 am

You’re right, that was a stupid suggestion. (I won’t bother to try to explain the confusion that lay behind it.)

January 9, 2010 at 9:57 pm |

I have combined the sequences and colored.

http://spreadsheets.google.com/ccc?key=0AkbsKAn5VTtvdE93ajdDOGtMeDlpNldncm5PRHM4S2c&hl=en

January 9, 2010 at 10:09 pm

It’s clear the more sensible comparison is to flip the signs in sequence #2, so I have done so:

http://spreadsheets.google.com/ccc?key=0AkbsKAn5VTtvdFhYQUlaQ3J5OV83czdDWTJDVmJwRmc&hl=en

January 9, 2010 at 10:03 pm |

I’ve also been computing Tim’s sequence. The results I find are broadly consistent with those of Alec, but are off by a few units. For N=1000 to 10000 I find :

78, 196, 292, 402, 524, 640, 746, 870, 1006, 1116.

I’ve made a plot of my data here:

http://thomas1111.wordpress.com/2010/01/09/discrepancy-of-a-golden-ratio-related-sequence/

I’m probably the one who made the mistake though. Alec: are the first few values given in the table in that link the same as yours, or already different?

January 9, 2010 at 10:41 pm

Thomas, how do you get all those discrepancies of zero? Or am I misunderstanding your table? (My sequence agrees with yours.)

January 9, 2010 at 10:47 pm

Ah, I see, you’re plotting the partial sums , rather than the maximum of over all and . The fact that our sequences look similar for large values confirms the dominance of and the preponderance of positive values.

January 9, 2010 at 10:56 pm

Yes that’s what I did, now I understand, thank you.

January 9, 2010 at 10:17 pm |

I have solved the easier problem I suggested earlier today, where we let C depend on d: choose an irrational number and let be 1 iff the integer part of is even.

January 9, 2010 at 10:27 pm

Funnily enough, that example appears in the post that I decided to hold back. If you let be badly approximable, then it gives a sequence that has discrepancy of order up to .

January 9, 2010 at 11:45 pm |

One meta remark: one thing that will be very useful is a summary or discussion of discrepency theory and in particular some techniques which may be relevant to proving the conjecture. Of course it will be ideal if some experts in discrepency will take part in the project.

January 9, 2010 at 11:56 pm |

[...] Filed under: polymath proposals — Gil Kalai @ 10:56 pm Is taking place on Gowers’s blog! Leave a Comment TrackBack [...]

January 10, 2010 at 12:06 am |

[...] (January 2010) Polymath5 devoted to Erdos discrepency problem is launched on Gowers’s blog. [...]

January 10, 2010 at 9:12 am |

Here are a couple of things I’m going to put on the wish list, but I’ll mention them here too, since they’re more likely to get noticed. (By the way, to the person — I’ve forgotten who it was — who asked how to get links into comments, yes, you do indeed just do it in the normal html way.)

1. Take a moderately large and search for the longest sequence of discrepancy 2 that’s constructed as follows. First, pick a completely multiplicative function f to the group . Then set to be 1 if f(n) lies between 0 and k-1, and -1 if f(n) lies between k and 2k-1. Alec has already done this for k=1 and k=3.

2. Search for the longest sequence of discrepancy 2 with the property that for every . The motivation for this is to produce a fundamentally different class of examples (different because their group structure would include an element of order 5). It’s not clear that it will work, since 32 is a fairly large number. However, if you’ve chosen then that will have some influence on several other choices, such as and , so maybe it will lead to something interesting.

January 10, 2010 at 11:04 am

1. I should say that I probably haven’t found the longest sequence possible for . (I’ll try to revisit this at some point.) I’ll put the data somewhere on the wiki.

2. Just looking at this now. The program is still running and has found a sequence of length . I’ll put this on the wiki too. Sorry, run out of time for now, so here’s a quick cut-and-paste of it:

+ + – – + + – + – – + – – – + + – + – – + + + – + – – + – + + + – – – + – – + – + – + + – + – – + + + – + – – – + + – + – – + + + – + – – + + + – + – – – + + – + – – + + – – + + – + – – + + – – + + + – + – + – – – – + – + – + + – + – – + + + + – + – – + – + – – – + + + – – + – + + – + + – + – + – – + – – – + – + + + + – + – – + – – + – + – + + – – – + + + – + + – + – – – – + + – + + – + – + + – – – – + + + – – + – – + + + + – – – + – + – – + – + – + + – – + + – + + – – + + – + – – + + + – – + + – – + – – + + – – – + + – + – + – – + + + – – + – + + – + + – – + – – + – – + – + + – + + + – – – + + – + + – – + – + + – – – + + – + + – – – – + + + + – – + – – + – + – + – – + + + – – – + + – – + – + + + + – – – + – + + – – + + – – + + – – – + + – – + + – – + – + – + + – + – + – + + – + – + – – + + + – + – – +

January 10, 2010 at 11:57 am

Some observations about that sequence. Here are the beginnings of its 2-subsequence and its 3-subsequence.

+ – + + – – – + + – + – – + + + – + – – – + + – + – – – + + – + – – + + +

– + – – + + + – – + – + + – – – + – + + + – – + – + + + – – – – + + – – +

So it seems that takes the sequence to minus itself, give or take the odd error.

appears to be another map that takes the sequence to minus itself. So it’s tempting to conclude that we have indeed got the group here. The difficulty is that it is not very easy to detect the presence of 5-torsion, except where it has been shoehorned in. But we should be able to do it by jumping from sparse sequences to equivalent denser ones: basically, we can be confident that we’ve got our group if there are ten classes of sequence starts, divided into five plus/minus pairs.

Let me try to check this. I’ll write [a,b;c,d,e] to mean that the a-sequence and b-sequence are roughly the same, and are roughly minus the c-, d-, and e-sequences, etc. So far I’ve got [1,32;7] and [2;3]. From this I can deduce [4;6] and I know that 4 and 6 don’t repeat anything from earlier, or else would not have order 5 after all. So the next sequence to look at is the 5-sequence, which starts as follows.

+ – + – + + – – – + – + + + –

This appears to be very close to minus the 4-sequence, and hence to the 6-sequence. So I’ll expand [4;6] to [4;5,6].

The 9-sequence should be the same as the 4-sequence, so we get [4,9;5,6]. I’ve still got only three of the five sequence pairs. The 8-sequence ought to give us a fourth class, so let me write out how it begins.

+ + – + – – – + +

That appears to be minus the 10-sequence, as indeed it should since the 4-sequence is minus the 5-sequence. In fact, we’ve got [8,18;10,12] from that. So the next unforced sequence is the 11-sequence, which starts

+ + – + – – – + +

so I think we can safely say that the 11-sequence equals the 8-sequence. So we have [8,11,18;10,12]. Next up is the 13-sequence, which starts (modulo medium probability that I have made a mistake)

– – + – + + + + – – +

which seems to be a new class of sequence, so I hope it will equal plus or minus the 16-sequence. That one starts

+ + – +

so there is some hope, though it’s not yet obviously different from the 8-sequence. I can’t face working out more though, so I’ll wait for someone to produce a nice presentation of the entire sequence, which will make this kind of thing much easier.

January 10, 2010 at 12:51 pm

I’ve just made a table of Alec’s sequence here:

http://thomas1111.wordpress.com/2010/01/10/tables-for-a-c10-candidate/

I’ll add the subsequences later on.

January 10, 2010 at 1:41 pm

Many thanks. I’ve checked quite a bit further with the help of your chart and the 13-sequence definitely does seem to be minus the 16-sequence.

January 10, 2010 at 3:11 pm

I’ve just posted a -invariant sequence of length on the wiki. (Search still going on.)

January 10, 2010 at 4:12 pm

If your search starts to get stuck, then the next thing to do might be to take the sequence you have so far and revert to the old algorithm — that is, allow backtracking that might mess up the -invariance slightly. With a bit of luck, that could lead to a significantly longer sequence that was still

approximately-invariant.January 10, 2010 at 5:02 pm

I’ve now worked out and displayed the subquences to at the page mentioned above.

The shortest subsquence has 12 elements, does that mean we should truncate the others there before comparing them? Combining with your previous comment the following 10 groups of similar sequences seem to exist: [1,32;7] [2;3] [4;5,6,28] [8,18,11;10,12] [13,20,24;16,22,25,30] [17,19,27;15;18] [14;21] [23] [26] [29]. Does that sort of make sense?

January 10, 2010 at 5:19 pm

Some further identifications need to be made. For example, the 2-sequence looks the same as the 21-sequence, so we get [2,14;3,21]. Also, the 9-sequence is missing in your list — it goes with 4, so we have [4,9;5,6,28]. Finally, the 17-sequence is minus the 8-sequence, which leads to the group [8,15,18;10,12,17,19,27]. We now seem to have five classes of sequence (as we want) together with three sporadic sequences.

January 10, 2010 at 5:33 pm

I’d see the sequences

[1,32; 7]

[2, 21; 3, 14]

[4, 9; 5, 6, 28]

[8, 11, 15, 18; 10, 12, 17, 19, 27]

[13, 20, 24; 16, 22, 25, 30]

With 23, 26, 29, 31 unallocated

January 10, 2010 at 5:57 pm

It’s a little troubling that the 26-sequence is unallocated. It ought by rights to be minus the 1-sequence. It does start like the 1-sequence but fairly quickly it stops working. I would very much like to understand these deviations from multiplicity — though it’s not clear what there is to understand.

January 10, 2010 at 12:27 pm |

Another piece of experimental data: the prime factorizations of all the places where the first and second sequence of length 1124 differ, until I couldn’t face continuing.

1, 7, 37, 41, 47, , 61, 67, , , 101, 103, 107, 109, 191, 193, , 263, 271, , , , , 307, , , , , , 359, , , , 379, 383, , , 397, , 409, .

It’s noticeable that fairly large primes tend to be involved in these numbers, suggesting that when you choose the sequence, you get, as you might expect, a fair amount of flexibility when it comes to choosing the values at numbers with few factors.

January 10, 2010 at 12:42 pm |

One of the things I thought about in Egypt, which is closely related to what I was working out in my previous comment above, was methods of converting sequences of bounded discrepancy into other such sequences. An obvious method is to change them in only finitely many places. Another is to choose an increasing sequence of pairwise coprime integers, pick a subsequence such that the values alternate, and then change everything in the subsequence. The reason that works is that apart from the sequence itself, no HAP contains more than one element of the sequence, so the discrepancy goes up by at most 2.

Are there less trivial ways of changing a sequence and preserving bounded discrepancy? Yes there are, but I don’t think I’ve reached anything like the limit of what can be done. The motivation for this question is to try to develop a notion of “essentially the same” and see whether it suggests a quantitative version that would go some way towards explaining the data about the length-1124 sequences.

Suppose I choose two primes with and . Suppose I now change all three. The effect will be to add -2 temporarily to the partial sums along the p-sequence and q-sequence, but eventually to add 2 again. However, I’ll also add first -2 and then -4 to the partial sums of the main sequence, getting back to -2. But I can correct for that by having another such triple later on, this time with and . I can then take an infinite sequence of such triples.

That’s only very slightly less trivial than the previous example, but at least it shows that there can be infinitely many pairs in the sequence that have common factors.

I’m fairly sure I had a more sophisticated example, but I’ve got to go now so I don’t have time to reconstruct it.

A second motivation for thinking about this is that if we identify a suitable notion of “is essentially the same as” then we will know that any structural property we try to prove about counterexamples will have to be “up to essential sameness”.

January 10, 2010 at 2:57 pm

Let and be two sequences with discrepancy at most . For each natural number m define the sequence such that if n<m and otherwise. I think that each has discrepancy at most 3C and and , so in a sense all sequences with bounded discrepancy are the same (if we are allowed to make infinitely many changes)?

January 10, 2010 at 4:18 pm

I don’t have a precise response to what you say, but somehow the difference between what I was doing and this is that my procedures for altering sequences gave new sequences that depended heavily on the old one. (The reason this is not a precise response is that you could artificially choose for each sequence a distinct sequence to convert it to.)

January 10, 2010 at 4:36 pm |

Here’s another experiment that should be pretty easy to program and might yield something interesting. It’s to look at the how the discrepancy appears to grow when you define a sequence using a greedy algorithm.

I say “a” greedy algorithm because there are various algorithms that could reasonably be described as greedy. Here are a few.

1. For each let be chosen so as to minimize the discrepancy so far, given the choices already made for . (If this does not uniquely determine then choose it arbitrarily, or randomly, or according to some simple rule like always equalling 1.)

2. Same as 1 but with additional constraints, in the hope that these make the sequence more likely to be good. For instance, one might insist that for every . Here, when choosing one would probably want to minimize the discrepancy up to if had already been chosen. Another obvious constraint to try is complete multiplicativity.

3. A greedy algorithm of sorts, but this time trying to minimize a different parameter. The first algorithm will do this: when you pick you look, for each factor of , at the partial sum along the multiples of up to but not including . This will give you a set of numbers (the possible partial sums). If is greater than then you set , if is greater than then you let , and if they are equal then you make the decision according to some rule that seems sensible. But it might be that you would end up with a slower-growing discrepancy if you regarded as a multiset and made the decision on some other basis. For instance, you could take the sum of over all positive elements (with multiplicity) and the sum of over all negative elements and choose according to which was bigger. Although that wouldn’t minimize the discrepancy at each stage, it might make the sequence better for future development because it wouldn’t sacrifice the needs of an overwhelming majority to those of a few rogue elements.

The motivation for these experiments is to see whether they, or some variants, appear to lead to sublogarithmic growth. If they do, then we could start trying to prove rigorously that sublogarithmic growth is possible. I still think that a function that arises in nature and satisfies ought to be sublogarithmic.

January 10, 2010 at 5:33 pm |

Yet another item for the wishlist. I produced by hand the following sequence that has (I think) discrepancy 2 and has the additional property that for every . I’d be interested to know what happens if this is used as a “seed sequence” for a backtracking algorithm: does it lead to a much longer sequence such that is roughly equal to ?

The sequence (which, amusingly, starts very like the Morse sequence) is this: + – – + – + + – – + + – + – + + – + – – + – – + + – – + + – + – – + – – + + + + – – – + + + – – + – + + – + – – + -

January 10, 2010 at 6:01 pm

Just seeding the backtracking algorithm with this sequence gets you fairly quickly to a sequence of length 371, but the property breaks down immediately (in other words, it’s by no means forced to continue). However, if you insist on it continuing, things get more interesting! You quickly get as far as 580. I’ll let the program run a little longer and then put the result on the wiki.

January 10, 2010 at 6:30 pm

OK, I’ve posted a sequence of length satisfying on the wiki.

Interesting that the constraint sped up the search so much. I’d be interested to see if any other multiplicative properties are apparent in it, and perhaps to use these as constraints in a new search with this as a seed.

January 10, 2010 at 6:44 pm

I was expecting constraints of that kind to speed up the search because it hugely reduces the size of the search space. I do wonder whether some mixture of imposing constraints and eventually relaxing them could break the 1124 barrier. I’d also be interested to understand why seems to be the preferred structure if you let the algorithm run on its own. Also, given that good sequences seem to like multiplicative structure, I find it rather mysterious that the constraint was under no pressure to continue.

By the way, if you backtrack with the sequence you’ve just produced, can you get much further or do you quickly get stuck?

January 10, 2010 at 6:45 pm

Now found one of length . Wiki page updated.

January 10, 2010 at 6:54 pm

Needless to say, it would be wonderful to have this sequence (or whatever holds the record by the time someone gets round to it) displayed in a nice colourful way and its subsequence structure looked at. My guess at the moment, without having looked properly at the sequence, is that there will be six broad classes of HAP-subsequences.

January 10, 2010 at 7:02 pm

When a group element of order 5 was forced, the primes 7 and 11 found better places in the group structure than in the examples of long sequences previously analysed – particularly 7. It may be that there is some kind of ‘natural’ order for each prime, and that in the absence of forcing the primes 2, 3 and 5 tend to dominate.

If this were true it might be possible to show that increasing C would accommodate more primes with their natural order (with prospects of a proof or counterexample).

January 10, 2010 at 7:44 pm

Quick answer to Tim’s question about unconstrained backtracking from the long constrained sequence: it’s not particularly productive. It gets to 622 and then sticks for a long time.

January 10, 2010 at 8:13 pm

This is beginning to suggest that having a sort of rough multiplicative structure is very helpful to a sequence, regardless of what that structure is, but that some structures work better than others.

Another interesting question is this. The original 1124 sequence had the property that is roughly the same as . What happens if one does a search imposing that constraint? Does one quickly get to a very long sequence, or does imposing the constraint too strictly mess things up? This is potentially important for understanding what’s going on: is the sequence with shorter because the constraint is imposed more strictly or because it is a less fruitful constraint?

January 10, 2010 at 9:01 pm

Here is an analysis of the length 594 sequence based on f(2r)=-f(r). This precludes the identity which has been so evident in previous long sequences of discrepancy at most 2 ie f(3r)=-f(2r).

617: 1 4 7 10 16 22 25 28 37 40

873: 19 31 34

1202: 3 12 30

1266: 43

1458: 9 21 26 36

1833: 39

2262: 27

2454: 41

2637: 13 18 33 42 45

2893: 6 15 24

3222: 17 23 38

3478: 2 5 8 11 14 20 29 32 35 44

In another notation this is

[1, 4, 7, 10, 16. 22, 25, 28, 37, 40; 2, 5, 8, 11, 14, 20, 29, 32, 35, 44]

{note that 2 has order 2 in this group}

[19, 31, 34; 17, 23, 38]

[3, 12, 30; 6, 15, 24]

{21 should be here, but isn’t}

[13, 18, 33, 42, 45; 9, 21, 26, 36]

[39; 27]

39, 41, 43 unallocated

January 10, 2010 at 9:10 pm

Initilailizing the search with the first 1124 sequence and imposing the constraint , we quickly reach a sequence of length 587, and then stick for a long time. I’ve put that sequence up on the wiki too.

January 10, 2010 at 9:47 pm

One of length 614 now on the wiki. And I have found one of length 594 that satisfies the 2=5 identity exactly, coming to the wiki soon.

January 10, 2010 at 10:28 pm

Some observations on the sequence of length 594 satisfying everywhere:

whenever .

whenever , except at the end.

everywhere except at the end.

January 10, 2010 at 10:59 pm

Sorry, that last should have been .

I’ve now found (and put on the wiki) a sequence of length 636 satisfying 1=-2=-5 exactly. The first and third of the above identities still hold, but the second does not: there are more exceptions to , apparently at multiples of 11, 13 as well as 7 (and a few others).

January 10, 2010 at 11:25 pm

Analysis of Alec’s 636 length sequence with subsequences of difference 2 and 5 both the negative of the main sequence.

617: 1 4 10 16 19 25 31 34 40 46

873: 7 13 22 28 33 37 42

1458: 3 12 30 48

1833: 18 27 45

2262: 9 36

2454: 41

2637: 6 15 24

3222: 11 14 21 26 29 35 39 44

3478: 2 5 8 17 20 23 32 38 47

3661: 43

or

[1, 4, 10, 16, 19, 25, 31, 34, 40, 46; 2, 5, 8, 17, 20, 23, 32, 38, 47]

[7, 13, 22, 28, 33, 37, 42; 11, 14, 21, 26, 29, 35, 39, 44]

[3, 12, 30, 48; 6, 15, 24]

[18, 27, 45; 9, 36]

with 41 and 43 unallocated

This looks highly regular, consistent with the method of construction. The behaviour and grouping of multiples of 3 looks interesting.

January 11, 2010 at 12:42 am

“which, amusingly, starts very like the Morse sequence”

I’m not actually convinced that this is that much of a coincidence. After all, if we only look at homogeneous progressions where d is a power of 2, the Morse sequence has discrepancy 1, right? (Right, it’s self-similar.)

So, a question (to which I don’t have any idea what the answer is): Are there analogous 0-1 sequences to the Morse sequence for other prime powers? I guess the idea would be to have the same self-similarity…

January 11, 2010 at 6:13 am

I’m not sure if you can naturally generalize to a zero-one sequence, but a generalization that maps to th roots of unity could be defined recursively by

January 11, 2010 at 6:16 am

I’ve uploaded to the wiki a sequence of length 688 that satisfies 1=-2=-5 and 11=-13 exactly.

January 11, 2010 at 7:59 am

And now one of length 854. Mark has pointed out that the 688 one also satisfies . I haven’t yet checked this for the longer one.

January 11, 2010 at 8:08 am

It does mostly except at multiples of 9.

Sadly I now have to go to work! But I’ll post those plots later.

January 11, 2010 at 8:35 am

Sequence length 854 is interesting. Subsequence 2 is the same as subsequence 13, which means that the forced sequences are the same. 3 and 7 both split off and disrupt the pattern.

617: 1 4 10 13 16 19 22 25 27 31 34 40 42 46 52 55 58 64

3478: 2 5 8 11 17 20 21 23 26 29 32 38 41 44 50 54 59 62 65

809: 18 37 45

3286: 9 36

873: 7 28 49

3222: 14 35 47 56

1458: 3 12 30 39 48 57

2637: 6 15 24 33 43 51 60 63

1266: 53

January 11, 2010 at 9:03 am

Sorry that should have been subsequence 2 the same as 11, the negative of 13 …

One way of interpreting what is happening here is that the constraints are forcing the constrained parts of the sequence to be multiplicative. The unconstrained parts then adjust to reflect the fact that it is impossible to have a fully multiplicative sequence (the behaviour of the primes 3 and 7)

January 11, 2010 at 10:26 am

Mark, in your analysis here the number 27 appears in an unexpected place: given that 3 is not in the list of numbers that do the same as 2, it should not be on the other side of 9. Is that a mistake, or is it just a bizarre feature of the sequence? I’m trying to work out whether we’ve got a structure going on here but this is confusing.

January 11, 2010 at 1:14 pm

Tim the 27 subsequence of the length 636 example does look out of place – it is the negative of 9 except in place 19 (it differs from the subsequence for difference 18 only in place 19).

But this is different in the 854 sequence where the 27 sequence moves, and the first difference between the sequences for 9 and 27 is in the 7th place (3 and 7 are the low primes which behave differently)

January 11, 2010 at 1:33 pm

In the 854 example, the 49 sequence is in a curious place. If the 7-sequence is not the same as the 1-sequence, it doesn’t seem right that it should be the same as the 49-sequence.

January 11, 2010 at 4:05 pm

In the 854 example, the 49-sequence is identical to that for 7 up to position 17 (as far as possible). The two sequences differ from the 1-sequence only in position 9 (and the 7-sequence goes on to differ in position 18, 36, 37 …).

The sequence 9, 18, 36, 37 goes on to include 45 and 53 – then 71, 72 74. Up to 53 these numbers are grouped in the shorter sequences.

I did a double-take with 49. There is some curious structure there, and the forcing technique seems to have eliminated some noise caused by arbitrary free choices in positions which would otherwise be unforced.

January 11, 2010 at 4:47 pm

It seems that the division into classes may not be so clear-cut. A pattern that is emerging from these forced sequences is the existence of pairs of HAPs that agree when values on some other HAP or set of HAPs are excluded.

January 10, 2010 at 6:56 pm |

Here are tables for the 584 one, to help spot further structure:

http://thomas1111.wordpress.com/2010/01/10/a-sequence-of-584-elements-with-t2x-tx/

January 10, 2010 at 7:16 pm

A preliminary look is very interesting. I looked at the first sixteen multiples of all odd numbers up to 17. That gave the following exact equalities, where I’ll write to mean that the a-sequence is plus or minus the b-sequence.

2=5=17, 3=-15 (as one would expect if 2=5), 7=-9, 11=-13. In addition, 3 and 13 are pretty similar (and hence 11 and 15 are pretty similar). I still haven’t worked out what the orders are of the transformations that are not of order 2.

January 10, 2010 at 7:26 pm

A possible way of guessing the order of is to look at the values at powers of 3. These are – – – + +, which isn’t all that helpful. If one uses the identity to look at the values at (where by the value at -a I mean minus the value at a) then you can extend this to – – – + + +. So I’m going to hazard a guess that has order 6. But I’ll need to think a bit more about this.

January 10, 2010 at 8:12 pm

The 2=5 identity is strikingly well observed. We have for all but 15 of the 116 possible , and for all up to 56.

January 10, 2010 at 9:21 pm

In the hope that there might be a simple explanation for the identity 2=5, I tried being awkward and imposing the additional constraint that . No such luck: I got to 356 with both of these constraints. That is the limit, however.

January 10, 2010 at 7:35 pm |

When analysing examples of long sequences, we should be aware of those elements whose value is flexible. For instance, in the original 1124-sequence, the values of , and can be (separately) changed without invalidating the sequence — the sequence remains valid as long as at most one of these elements is -1. Perhaps we should colour these elements differently?

On a completely separate note, I have looked at the number of sign flips in the HAP’s in this sequence (e.g. + – – + – ++ has 4 sign flips). Here is a list of d,f,r with f = no. of sign flips, r = f / (length – 1):

1, 691, 0.615316

2, 350, 0.623886

3, 226, 0.605898

4, 185, 0.660714

5, 151, 0.677130

6, 127, 0.682796

7, 99, 0.622642

8, 89, 0.640288

9, 83, 0.674797

10, 72, 0.648649

11, 63, 0.623762

12, 57, 0.619565

13, 49, 0.576471

14, 52, 0.658228

15, 46, 0.630137

16, 45, 0.652174

17, 42, 0.646154

18, 38, 0.622951

19, 37, 0.637931

20, 36, 0.654545

Is this hovering about the golden ratio? Or is it just noise?

January 10, 2010 at 8:40 pm |

[...] thread for polymath5 By kristalcantwell There is a new thread for polymath5 here. There is also a wiki page for experimental results [...]

January 10, 2010 at 9:24 pm |

@Alec:

What happened to the program you talked about here:

http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4734 ? I think this is the most interesting experiment, because it could tell us if we have to “break to system” all the time. Have you tried to correct it?

January 10, 2010 at 9:38 pm

Hi Sune,

I did correct it, but it was just a Python script and hadn’t got anywhere after running for some time. It needs to be written in C for speed. I intend to do that (unless anyone else would like to), but haven’t got round to it, what with all the other fascinating things to think about.

January 10, 2010 at 9:45 pm |

I think that it would be a good idea to have a polymath5 tag at Math Overflow I think this was mentioned in the previous thread. It would be useful because there could be a search for that tag that would give a complete list of questions so there could a link from the wiki to all the polymath5 questions at Math Overflow. Otherwise if there were a lot of questions it could become a problem to keep track of them.

January 11, 2010 at 2:16 am

I agree that this sounds like a good idea, particularly if we start asking a bunch of questions on MO. Tim asked a question a few days ago about partial sums of multiplicative functions — I’ll leave it up to him whether to retag. Otherwise, I’ll keep my eyes open.

January 10, 2010 at 10:15 pm |

I have a prediction but not the time to check it (which in any case I’d have to do by hand, whereas others here could do it in seconds). The prediction is that if you take two HAP subsequences of the 1124 sequence and look at the partial sums of their pointwise product, then they will grow approximately linearly with a gradient between -1 and 1 that’s a multiple of 1/3. The idea is that this gradient should correspond to the angle between the roots of unity corresponding to the elements of . This prediction is obviously correct when the two sequences are either approximately equal or approximately minus each other — it’s the other cases that interest me.

January 10, 2010 at 11:11 pm

I’m afraid my first few calculations are inconclusive. The sequence may just be too imperfect. But it may be worth plotting a histogram showing the distribution of the gradients to see whether there seem to be peaks at these values.

January 11, 2010 at 7:52 am

I do have some interesting plots of these gradients. I’ll upload them and post a link soon.

January 11, 2010 at 1:48 pm

Here is a plot to give some idea of the situation (apologies for the primitive presentation):

http://www.obtext.com/erdos/gradients.png

This plots, for the first 1124 sequence , the partial sums of for primes and up to and including .

Short of time now, but I can easily produce similar plots for other HAPs on request!

January 11, 2010 at 2:50 pm

An interesting phenomenon revealed by those graphs is that, whereas some pairs of HAPs start in synchrony and then get out of step towards the end, others start out of step and get into synchrony later on. We have probably been missing these in our analysis.

January 11, 2010 at 4:47 pm

That is indeed very interesting, and may completely change my view about what is going on. There now seems to be the possibility that sequences like to mess about a bit early on and then gradually get more and more multiplicative, which is almost the opposite of what I thought before.

Now that I’ve got a bit more time, let me explain why I made the prediction that doesn’t seem to be especially backed up by the data you have provided. Suppose that we define a sequence by taking a completely multiplicative function to and composing it with the map that takes 0,1,2 to 1 and 3,4,5 to -1. Suppose now that we compare the d-sequence with the 1-sequence. If , say, then for every . So we’d expect that and would be equal two thirds of the time and different for the other third. Moreover, I’m pretty sure we would expect the unequal places to be distributed very evenly, so that we should go one step down for every two steps up, giving a gradient of 1/3. Similarly, if we’d expect gradients of respectively.

The fact that this does not seem to be happening is one of those bits of bad news that is not really bad news at all, but a sign that something interesting and unexpected is going on.

January 11, 2010 at 5:01 pm

A caveat to my comment above: it is possible that those high correlations appearing late on may be an artefact of the search algorithm used to find this sequence: for example, by values being initialized to . it would be worth regenerating the sequence with random initialization values to see if the pattern persists.

January 11, 2010 at 1:20 am |

This is a meta-comment; it would be useful for those of us “following along at home” (i.e., haven’t contributed much yet but are optimistic about their chances of being able to contribute later on), and it would likely be useful even for the core group of contributors that has established itself so far, if we had some uniformity in how we analyzed and presented experimental results.

It shouldn’t be that difficult to set up and make available some little bit of software that made it really easy to extract and visualize and compare HAPs (given a text file containing a sequence of +s and -s and some whitespace as input) and various other patterns with a really basic UI, should it? Right now it’s probably not very necessary, but in the future if we start looking at more complex patterns it might become essential. I could probably hack something like this together myself if I had to, but I’m a pretty lousy programmer, plus my semester’s starting this week and I’ll have other obligations…

January 11, 2010 at 4:11 am

Part of the reason I was working with CSV files is they allow import into Excel or Mathlab or whatever preferred software you might want to use for aforementioned analysis.

The source to convert raw text to CSV is in C and not that complicated, if you want me to upload it.

January 11, 2010 at 4:31 am

Uploaded, although I don’t have time to fight with the wiki software to fix the formatting.

http://michaelnielsen.org/polymath1/index.php?title=Convert_raw_input_string_into_CSV_table

January 11, 2010 at 10:02 am |

Since I haven’t seen it on the wiki, nor apparently discussed in previous posts (I may overlooked it), I’m wondering if listing valid sequences as a function of length has already been discussed. In particular, to what extent could an analogy to prime sieving be made (and to, say, Euclid’s proof that there are infinitel many primes)? I’ve started the list here: http://thomas1111.wordpress.com/2010/01/11/number-of-sequences-as-a-function-of-length/

January 11, 2010 at 10:29 am |

Thomas, it occurs to me that the following would be easy to produce and very useful. For each sequence, it would be great to have a diagram that displayed a grid of numbers, with the number mn in the mth row and the nth column, together with a plus or minus and a colour, given by the value of . Probably going up to about 20 columns and 30 rows would be enough (if the sequence has length at least 600 — one could reduce it for shorter sequences). With such a chart, it would become a much easier matter to compare the values of the sequence at HAPs.

January 11, 2010 at 3:15 pm

I’ve just finished writing the corresponding program, and I’ve put a fairly clean source code on the wiki http://michaelnielsen.org/polymath1/index.php?title=Create_tables_in_an_HTML_file_from_an_input_sequence so everyone’s invited to use it for themselves or add other features.

To see the kind of table containg HAPs it ouputs have a look at the bottom of my first post: http://thomas1111.wordpress.com/2010/01/09/erdos-discrepancy-a-program-to-get-html-tables/

Please let me know if something looks incorrect. I couldn’t find a way to upload my compiled version for windows on my blog nor on the wiki, I’ll try to find a host if someone’s interested.

January 11, 2010 at 3:40 pm

I made a comment which seems to be kept in the moderation bin. Basically that program you’ve asked for is now finished, the code is on the wiki, and to see the kind of HAP table it produces have a look at the bottom of this page: http://thomas1111.wordpress.com/2010/01/09/erdos-discrepancy-a-program-to-get-html-tables/

January 11, 2010 at 11:28 am |

This is one for Alec. I still haven’t given up on the golden ratio (or some other irrational) making an appearance and leading to interesting examples with low discrepancy. My first attempt was a dismal failure, but here’s a method that might work. In fact, it would also be very interesting if it

failsto work.Recently, we have been experimenting with sequences that are subject to constraints, which can be summarized as saying that maps of the form are forced to have certain periodicity properties (or are forced to equal each other). I am interested in forcing to be

quasiperiodic, or “periodic with an irrational period”. To achieve this is slightly trickier than it is just to make , so let me try to say exactly what I’m asking for.Given a point we can define a sequence by taking to be the golden ratio and looking at whether the imaginary part of is positive or negative. (Let’s choose so that it is never zero.) I’ll call this a sequence of period . I’m interested to create a sequence subject to the constraint that every subsequence of the form is periodic with period .

The difficulty is that there are infinitely many choices for . One could artificially reduce this to two: . I think even this could work. A more complicated alternative would be to choose some small (and probably even) integer k and choose a kth root of unity, optimizing in some way what happens at the first few values. But to begin with just choosing is very much in the spirit of what you have already been doing.

The motivation for this experiment is that multiplicative structure that is too exact seems not to work. Here is a multiplicative structure that avoids periodicity, and may thereby achieve theoretically what the experimental sequences seem to do. Having said that, I have not detected any evidence in the experimental sequences for this quasiperiodic behaviour. I just think it would be fascinating to see what happens if we try to impose it. Needless to say, there are many variants one could try if the first one doesn’t give good results.

January 11, 2010 at 11:50 am |

One question that I find interesting is if there is some notion of lines for so that the analogous question for these lines will imply Erdos’ conjecture and also will have a chance of being correct. (Combinatorial lines containing the 0 vector do not work.)

The following notion of lines may be appropriate: a line has p-1 points so that in every coordinate , and can be arbitrary numbers between 1 and p-1. (We can assume .)

If we replace by where t is large compared to p then such lines are maped to homogenious AP. So the question is if we give every element of {1,2,..,p-1}^n a sign is there such a line with unbounded dscrepency. (Maybe the answer is easily “no” but I dont see it off hand.) Such lines might be handled by some Fourieish arguments.

January 11, 2010 at 4:52 pm

For the reduction to work we do need that the lines will have only s points and that $r_i \times s < p$ for every i. The problem is of interest also for the lines as I defined above.

January 11, 2010 at 2:58 pm |

Alec, what does your program find if you require both that and ?

The reason that I ask is that I know that there is at least one sequence which has this property initally and can be extended to a length of at least a few hundred, but I don’t know if the property is preserved in the extensions.

January 11, 2010 at 5:26 pm |

The discussion continues here:

http://gowers.wordpress.com/2010/01/11/the-erds-discrepancy-problem-iii/