Again, this is just a brief report on what is going on. Plenty of work is still being put into making sure that the main results, questions, ideas, etc. are appearing on the Polymath5 wiki and especially (for the moment) the experimental results page, so looking at that is still a good way to keep up.

Alec came up with a sequence of length 1112 that is fairly different from the sequences of length 1124. See this comment and succeeding ones for information about it. I am particularly intrigued by the diagram showing partial sums of pointwise products of different HAP-subsequences. There is data here that demands to be explained, but the focus of the discussion is elsewhere for now.

Kevin has suggested thinking about how the number of sequences of length n compares with the number of sequences of length m, where by “sequences” I mean sequences with relevant properties such as having discrepancy 2 and perhaps being multiplicative as well. There is some interesting data here too.

The logarithm of the number of multiplicative sequences of discrepancy 2 and length n behaves in a suggestive way. Yet more to chew over.

I myself have started thinking about an algorithm for generating long low-discrepancy multiplicative sequences in the hope of proving that the discrepancy of such sequences can grow more slowly than logarithmically.

Gil has suggested looking at randomized modifications of the problem with a view to obtaining insights about the problem itself.

Alec has looked at the behaviour of Dirichlet series built out of some of the good examples we have.

Finally, I am hoping for good news from Klas in the not too distant future.

### Like this:

Like Loading...

*Related*

This entry was posted on January 14, 2010 at 12:20 pm and is filed under polymath5. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

January 14, 2010 at 3:07 pm |

Timothy, regarding the “intelligent greedy” algorithm, what I’d really like to do is a computer implementation, so I’d like to nail down the exact process.

From what I gather you’re using two kinds of forcing:

a.) using a completely multiplicative property

b.) using prohibited subsequences.

For b.) then it would be useful to start by generating a list of the prohibited subsequences.

Regarding a “stupider” version which might be more theoretically tractable, I was thinking of setting the main look-ahead of the sequence at numbers where (given a natural number x, is the corresponding term of the Möbius sequence) . (Here in the OEIS.) It’s a distinct-factor rich sequence but also related to something with known properties.

January 14, 2010 at 3:13 pm

I see at the very end of the last thread Thomas has similar thoughts to mine, so instead of coding I might focus instead on some theoretical algorithm analysis, perhaps with a finite-state automata or something along those lines.

January 14, 2010 at 4:10 pm

Your description of what I’m doing is right. Let me say a little bit more about the forbidden subsequences that kept coming up. I found that it was useful to put a little tick above a number every time I could prove that the partial sum up to that number had to be zero. (This may be less useful when the range of allowed sums becomes bigger.) For instance, as I said before, if you get the sequence – + + with the initial minus at an odd number 2n+1, then you know that the sum up to 2n must be 0, and you don’t need to know all the values up to 2n in order to deduce that. And of course, if you know what the partial sum is up to some m and you

doknow all the values from m to m+k, then you know the partial sum up to m+k.On the subject of a theoretical analysis, I think it would be interesting, and I also think it could well be possible, to describe circumstances that will force some partial sum to go above C in magnitude. (I’m still talking about completely multiplicative sequences.)

The kind of thing I mean is this. Let us say that the values at

forcethe value at n if are the factors of n that occur an odd number of times. For example, as I found when building a multiplicative sequence of length 246 by hand, the value at 2 forced the value at 242: once I had decided that I had no choice but to set to be -1 as well.Now let’s suppose we are trying to pick the values for all primes less than some M. We could make a list of all values that will be forced by the values we choose. If we did so, then for every n that is forced, there would be an associated set of the primes that force the value at n. If the number of -1s in is odd, then and otherwise . So if we have several intervals with a high density of forced numbers, we could find ourselves in a situation where we have several collections of sets and we need to choose -1s in such a way that for each collection we want to hit roughly half the sets an even number of times and roughly half an odd number of times. And that could turn out to be difficult.

Of course, proving the existence of these runs of forced numbers could be difficult too. But perhaps at the very least one could guess how much they should occur and try to come up with heuristic arguments that predict how far you need to go before you cannot avoid a discrepancy of C. My suspicion is that you can’t go for ever but that you can do better than exponentially in C.

January 14, 2010 at 4:52 pm

So that involves factoring n as st where s is a square and t is square-free and listing the prime factors of t. It ought to be possible to compute a long list for this quite rapidly.

January 14, 2010 at 5:54 pm

Comment to Mark Bennet, January 14, 2010 at 4:52 pm.

Just to confirm whether I understand your words in the general context.

For instance, for the full sequence by Thomas Sauvaget posted in Tables for the second 1124 sequence is , (; is a prime).

January 14, 2010 at 6:17 pm

Américo

My understanding is that in your example, Tim is noting that, in a multiplicative sequence, the value at 1124 is determined by that at 281.

So once the values at a number of primes are known, they force further values in a similar way.

Stage 1 of an algorithm is to fill in all the forced values.

Stage 2 is to deal with the implications of that forcing – so that if there are sequences of forced values matching particular patterns, they will imply constraints on any further choices at new primes. (Potentially, later, when we look at more intelligent searches, intervals in which the forced values are relatively dense, and match more sophisticated patterns). The fill in further forced values (stage 1) and iterate until no more values are forced by constraints, or a contradiction is reached (impossible to continue).

Stage 3 is then to determine that it is impossible to continue with the current value of C, or to make a free choice of the value at a new prime. It would be possible to do backtracking over free choices to find the optimum choice at each stage, and this may reveal further patterns to be taken into account at stage 2 to make the search more efficient.

January 14, 2010 at 6:22 pm

That’s a point (I’m referring to Mark’s first comment above). Having a list might be pretty useful as one could look for promisingly difficult areas and try to use them to prove lower bounds. For example, an initial target might be a proof that the discrepancy of an infinite multiplicative sequence cannot be 3. (The aim here would be to have a proof that doesn’t just do an exhaustive search.)

January 15, 2010 at 7:34 pm

I tried to implement something in a crude way – but overflowed the memory constraints on my laptop. I was aiming at stats up to N=100,000.

The procedure aimed at factoring n=pqr

p is the maximal square factor

q is a product of small primes (I went up to cube root of N)

r (which I left unanalysed) is then the product of at most two large primes.

Then analysing q gets you most of the constraints you are worried about.

January 16, 2010 at 9:05 pm

Mark

Thank you for your reply (January 14, 2010 at 6:17 pm)

and explanation.

January 14, 2010 at 6:32 pm |

This hasn’t gone much of anywhere, but in the Polymath spirit I’ll post what I did anyway.

I was thinking about how to represent what the relationships between forcings was rather than what the exact elements were, and using the C=1 case as an example I tried a graph structure where one could collapse nodes (see picture). With all the links and cycles drawn in the graph can eventually collapse into a single node (hence every move in the case C=1 is forced).

I was hoping to extract some meaning using leaves (which would include primes p such that 2p n) and cycles, but I haven’t had any luck yet.

January 14, 2010 at 6:34 pm

(HTML bugfix:

would include primes p such that )

January 14, 2010 at 6:35 pm

(Something is breaking in my equation attempts but I have no idea why.)

January 14, 2010 at 7:45 pm

I’ve noticed a couple of things that can break WordPress formulae: any double spaces in the equation, and sometimes the presence of less-than or greater-than signs.

January 14, 2010 at 8:14 pm |

Sometimes I post a draft post (or comment) just to make sure I get what I want. Then I copy the HTML code and post it as a comment.

January 14, 2010 at 8:17 pm |

My last comment was intended to

Jason Dyer, January 14, 2010 at 6:35 pm

“(Something is breaking in my equation attempts but I have no idea why.)”

January 14, 2010 at 8:19 pm |

This is probably too generalized/abstract/nonsensical to actually be useful for the problem, but it occurred to me just now and I can’t resist mentioning it, just in case:

Of course the positive integers form a poset under divisibility. A poset is a special kind of category.

In this category, each morphism has an associated integer. If we have a completely multiplicative sequence, this is equivalent to assigning to each morphism a +1 or -1 such that the sign of the morphism acts nicely taking commutative diagrams; equivalently it’s a 1-dimensional quiver representation over .

If we look at the first step in a quasi-multiplicative sequence, this gives us a 1-dimensional quiver representation (over C) which is also an irreducible representation of an Abelian group.

Can we study the problem using quiver representations? Like I said, I kind of doubt it (especially since it seems kind of hairy to define HAPs with any sort of generality) but it’s worth throwing out there.

January 14, 2010 at 8:56 pm |

A while back, Tim asked what happens if you take one of the 1124 sequences and search for sequences whose even terms match it. I thought this was an interesting idea, and tried it just now. It doesn’t get far. The maximum length of sequence subject to the constraint was 362. Applying unconstrained search using the resulting sequence got no further (in the time I let it run) than 363. Repeating the experiment with the additional constraint that , the maximum was 314 and the furthest I got with unconstrained search from there was 494. Still, it is quite interesting that it doesn’t work.

January 14, 2010 at 9:03 pm

It is certainly interesting. It occurs to me to wonder whether we might have more luck if we tried to choose which to and sign to go for (where here you chose and both plus and minus) with the multiplicative structure of the existing sequence very much in mind? It may be that one would still end up choosing and minus — I haven’t looked. For example, if is roughly minus 1, then it might make sense to look for a sequence whose third element came from the 1124 sequence.

But probably the idea just doesn’t work. I think we ought to see if we can think of a plausible reason for this.

January 14, 2010 at 9:29 pm

The 1124 sequence is closest to its 8-subsequence, so that might be worth a try. But the first 1124 sequence has 1 as a sporadic value (ie its initial values are not replicated in any of its own subsequences), and this may provide an explanation for the difficulty.

I’d try one of the more regular sequences: the second 1124 sequence has more regularity than the first.

January 15, 2010 at 8:08 am

Curiously, one gets exactly the same distance (363 and 314 respectively) constraining to be the first 1124 sequence as one does constraining to be the second 1124 sequence. (Note reversal of signs.) I can’t believe this is a coincidence, but whether it is something deep or a result of how the seed sequences were generated by Klas, I don’t know. Let me try it with the 1112 sequence next.

January 15, 2010 at 8:15 am

Not surprisingly, given how the 1112 sequence was formed, one gets much further in this case (to at least 998). Even with the ‘wrong’ sign constraint, one can get to 374 (but no further).

January 14, 2010 at 10:02 pm |

A quick update regarding refined greedy algorithms and multiplicative sequences: I’ve now a added a source code which I hope is straightforward enough to modify in order to test many ideas. So far I’ve done some runs with a method that compares the sum of partial sums in the little section here http://michaelnielsen.org/polymath1/index.php?title=Multiplicative_sequences#Sum_of_partial_sums The sequence I obtained is clearly not sub-logarithmic unfortunately. I’ll try to test more ideas using Tim’s comments tomorrow.

January 15, 2010 at 2:48 pm

I’ve now obtained and plotted some data which go some way in answering Tim’s question 4 of the wish list, and I’ve put it on the wiki http://michaelnielsen.org/polymath1/index.php?title=Multiplicative_sequences#Minimizing_D_up_to_the_next_prime

Namely, the answer thus seems: yes, the partial sums grow at least logarithmically.

January 15, 2010 at 3:06 pm

Thomas, in your last plot, I wonder what it would happen if you calculated as a weighted sum, giving weight to the discrepancy- sequence (in other words, weighting the HAPs according to how often they matter). This would be in line with the spirit of the greedy algorithm.

January 15, 2010 at 3:50 pm

Thomas, these plots are interesting, and seem to show that a naive greedy algorithm fails. My guess would be that a naive greedy algorithm is about as good as the Liouville function — that is, giving square-root-like discrepancy.

That still leaves me curious about the more sophisticated greedy algorithm where at each stage you try to minimize not just the discrepancy so far but some parameter associated with the difficulties you are potentially making for yourself later. (I call this greedy because what I am not allowing is any backtracking — once you’ve made a decision you have to stick to it.)

I’m not sure I see the point of plotting the partial sums along any of the HAPs other than the sequence itself, since the discrepancy for a multiplicative sequence is equal to the largest partial sum so far (in modulus).

January 15, 2010 at 3:54 pm

Of course: please ignore my last comment then — I was forgetting the functions were multiplicative.

January 14, 2010 at 10:53 pm |

While investigating the Dirichlet series, I noticed that the sequence obtained as the Dirichlet inverse of the 974-sequence with nice multiplicative properties has some rather nice ones of its own, evidently closely related to those of the original sequence. For example:

for all

whenever

whenever

(I’ve posted the sequence on the wiki, and plotted it here:

http://www.obtext.com/erdos/diri_inv_974.png.)

I think these properties can be derived from the Dirichlet inversion formula by induction, but haven’t unravelled the details: I’ll attempt to do so. It’s possible that understanding the properties of the Dirichlet inverse might provide an alternative way to generate long low-discrepancy sequences.

January 15, 2010 at 2:06 pm

OK, I’ve checked by induction that if , and is the Dirichlet inverse of , then for , and for all . Now I wonder what happens in the case of …

January 14, 2010 at 10:56 pm |

A while back, I posted about what I called the “HAP basis.” I finally wrote some code to compute the HAP basis of a given sequence, and have begun to look at the HAP basis of some of our low-discrepancy sequences. Disappointingly, what I was hoping (that they would “look like” bases of completely multiplicative sequences) seems not to be the case in general, but if someone else wants to try to find patterns (and I’ll keep looking), please go for it!

By the way, I’m considerably less optimistic about this than I was about an hour ago, but I’ll post it anyway. What motivated me to finally write the code was the thought that perhaps long low-discrepancy sequences only had a small number of vectors in their HAP basis — like how completely multiplicative sequences may only contain primes and prime powers in the basis.

I conjectured to myself that perhaps any sequence whose HAP basis had positive upper density had to have high discrepancy, and even thought briefly about how to go about proving it. (Density increment of some sort?) Again, I’m much less optimistic now that this is in fact something that we can prove (without proving the EDP itself), but it’s possible that it is true, provable, and just happens to give much worse bounds than the “actual” bound. If someone else thinks it could be attackable, please comment!

January 14, 2010 at 11:21 pm

Actually, I just tried something else which makes me

moreoptimistic about the conjecture possibly being approachable, which is run my program on Alec’s length-1112 sequence. So I think I’ll write a little more about my thoughts on it.So, the HAP basis is an alternative basis of the vector space of GF(2)-valued functions on N. It behaves rather nicely under passing to a HAP as a subsequence, and also it’s often possible to change slightly without affecting the discrepancy much. It also behaves nicely under truncation, but it’s pretty sensitive to changing the sequence pointwise. The definition of the basis is the characteristic functions of HAPs.

If you don’t look at discrepancy, the basis has the same statistical properties as any other basis (duh). In particular a random sequence of length N will decompose into about N/2 basis vectors.

One curious thing is that multiplicative sequences have highly restricted HAP-basis decompositions; they are sums of characteristic functions of HAPs with prime-power differences. Since multiplicative sequences (and sequences resembling them) seem to be able to have quite low discrepancy, I was led to my conjecture, which is again:

Let have positive upper density, and let be the sequence that decomposes in the HAP basis into exactly the (characteristic functions of) HAPs with common difference in S. Then has unbounded discrepancy.This sort of resembles Szemeredi’s theorem, and I wondered if some of the same techniques might not carry over.

January 14, 2010 at 11:42 pm

So how would we go about proving this conjecture? Well, one other thing about multiplicative and even quasi-multiplicative sequences is that passing to a HAP-subsequence doesn’t change the discrepancy much. So I thought, well, maybe we could try a “density increment” argument?

There are a few obvious problems here, though. First of all, I’m not even sure that we know how to prove the conjecture taking ! So the “base case” for density increment would probably be highly nontrivial. Second of all, you can’t partition with a finite number of HAP sequences (reference: Euclid), which is a major respect in which this problem differs from similar ones. Finally, note that passing to a HAP subsequence isn’t likely to drastically increase the upper density of the basis when you start with a random sequence, so you’d have to introduce yet another “random/structured” dichotomy.

Still, I hoped that one might be able to construct an argument along the following lines: First, prove the conjecture if the density is greater than some . (I'm not sure how this would go, even now.) Then, if there's some k for which has a denser basis, pass to that subsequence. Otherwise, use the fact that there's no such k to prove that one can add a positive fraction of new HAP basis vectors to obtain a new sequence which again would hopefully have bounded discrepancy.

This is rather farfetched, but I felt like if it had a chance of working, we'd be capable of making it work. But my hopes were dashed when I ran my program on the first sequence of length 1124 and saw that there were 494 basis vectors in the HAP decomposition — less than the "expected number" of 562, but not a whole lot less. I thought, well, clearly the density of the HAP basis isn't controlling much at this point.

But then I computed the HAP decomposition for Alec's 1112 sequence — it had only 247 basis vectors! (That's exactly half of 494, by the way, and it's also 1 more than the maximum length of a completely multiplicative sequence. I'm 99% sure that these are both coincidences, but I greatly hope I'm wrong, because that would be fascinating.) That's density about 1/4, which makes me wonder if there are some connections after all.

I no longer know if density increment is the way to go, though. I wonder if the 1124-sequence can be obtained from the 1112-sequence by pointwise operations and adding extra HAP vectors — actually trying to do this is probably out of my reach right now, but someone else might have an idea. I wish I knew the ergodic proof of Szemeredi; the "natural home" for my conjecture seems to be the infinitary setting, so I'd like to look at it with infinitary tools, but I don't have them in my toolbox. Apart from that… that's all I have to say for now.

January 15, 2010 at 1:02 am

One small remark is that the natural decomposition to look at for an additive combinatorialist would be into a small number of HAP-basis vectors plus some kind of quasirandom error, which would conveniently deal with the difficulty you are facing with changes at a small number of points.

I’m not altogether clear that a structure/randomness dichotomy approach could be made to work, but I don’t think it can be ruled out. The structured case would be multiplicative (and perhaps quasimultiplicative) sequences, and the random case would be one where you have almost no multiplicative structure at all. There are various ways of making that precise. I put one on the wiki in the interesting-subquestions section. Another, which I think was in a blog comment somewhere, is to say that a sequence has almost no multiplicative structure if the sum of over all quadruples such that is small. When the theoretical aspect of the project starts properly, proving that a low-discrepancy sequence must have at least some multiplicative structure is one of the first things I want to think about.

January 15, 2010 at 8:32 am

Harrison’s HAP basis looks very interesting. I could not understand the precise conjecture though. Can you restate?

January 15, 2010 at 10:56 am

Here’s what I understand it to be. I’d be interested to know whether my interpretation is correct. Let’s begin with the beginning of the first 1124 sequence. We’ll then expand it as a mod-2-linear combination of characteristic functions of HAPs. It’s easy to see that this can be done in precisely one way. In fact, I’m going to do it over instead because I think that gives interesting extra information (but you can always reduce mod 2 if you want to).

The sequence starts

+1 -1 +1 +1 -1 -1 -1 -1 +1 +1 -1 +1 +1 +1 -1 -1 +1 -1 +1 +1 -1 -1 -1 +1 -1 +1 +1 -1 +1 -1

Following Harrison I’ll write (d) for the HAP {d,2d,3d,…}. So this sequence is (1) plus

0 -2 0 0 -2 -2 -2 -2 0 0 -2 0 0 0 -2 -2 0 -2 0 0 -2 -2 -2 0 -2 0 0 -2 0 -2

which is (1)-2(2) plus

0 0 0 2 -2 0 -2 0 0 2 -2 2 0 2 -2 0 0 0 0 2 -2 0 -2 2 -2 2 0 0 0 0

which is (1)-2(2)+2(4) plus

0 0 0 0 -2 0 -2 -2 0 0 2 -2 0 0 2 -2 -2 0 0 0 0 -2 0 -2 0 -2 2 0 -2 0 0

which is (1)-2(2)+2(4)-2(5)-2(7) plus

0 0 0 0 0 0 0 -2 0 2 2 -2 0 2 4 -2 -2 0 0 2 2 -2 0 -2 2 -2 2 2 -2 2 0

(Remark: in order to get rid of the -2s at 5 and 7 there I “spoilt” a lot of zeros later on, which suggests that the HAP basis is not always very efficient, just as Harrison found.)

I think the idea is clear, so I’ll leave it there. I’m basically Möbius inverting. More precisely, if is the coefficient of (d) in this expansion, is the Dirichlet series associated with the original sequence , and , then . Since , it follows that .

It would be good if someone could do this to a few of the interesting sequences and post the results on the wiki. It would complement nicely Alex’s intriguing computations of Dirichlet inverses.

January 15, 2010 at 11:29 am

Hm, I meant to consider {+1, -1} multiplicatively (which is of course isomorphic to GF(2)), so the decomposition would instead proceed like:

+1 -1 +1 +1 -1 -1 -1 -1 +1 +1 -1 +1 +1 +1 -1 -1 +1 -1 +1 +1 -1 -1 -1 +1 -1 +1 +1 -1 +1 -1

Which I’ll write in binary so that it’s shorter, as:

010011110010001101001110100101

Now this is (2) = 010101… plus:

000110100111011000011011110000

which is (4) plus:

000010110110011100001010110100

which is (5) plus:

000000110010010100011010010101,

and so on.

(All of this is addition (mod 2), or multiplication in {-1, 1}.)

Actually this seems to be giving essentially the same basis as Tim’s decomposition — I’m not too sure why this should be the case, particularly if the sequence isn’t multiplicative, and it may be a coincidence. If they do agree in general (and I could easily be missing something obvious) then probably the computation over should be used since it would seem to give strictly more information.

January 15, 2010 at 11:31 am

OK: So let me see if I understood the conjecture:

Given a 0-1 sequance you can write it uniquely as a sum modulo 2 of characteristic functions of HAPs. Lets us write . Here (d) is the characteristic vector of the HAP (d,2d,3d,…).

Suppose that in this sum the set S has positive density.

Then there is no constant K such that the partial sums are all between -K and +K.

(Is this the conjecture?)

January 15, 2010 at 11:51 am

Well, not quite, but you have the right idea. First there’s the small issue of being a 0-1 sequence (in your formulation) rather than a +/- sequence; in addition, you need to take HAPs of difference other than 1, or else I’d imagine you can probably construct a counterexample. So replace the last statement by:

There is no constant C such that the partial sums are all between -C and +C.

And that’s the conjecture.

January 15, 2010 at 12:07 pm

I see, so your question is a special case of the entire conjecture. (You look at the discrepency of all HAPs, namely, you want that there is no constant C so that all the partial sums for every k and n are between -C and +C; right?)

(Also you insist that 1 is not in S?)

Do you have an example where the stronger statement is false: S is dense and yet all sums

are bounded between C and -C.

January 15, 2010 at 12:31 pm

Amusingly, I can’t construct a counterexample to your “stronger statement”, but I can easily prove that one exists! Think about the set of sequences obtained by replacing 0 by 01 and 1 by 10. There are too many of these sequences for them to all have small decompositions, but all of them have bounded sums of the form you give.

January 15, 2010 at 1:12 pm

Harrison, I think your construction is essentially the same as Tim’s, and is what I understood too. Alternatively, in product notation:

where .

The inversion formula is then:

I did experiment a while ago with searching over the instead of the , and found no dramatic difference in results. But it is an interesting construction, and perhaps as Tim suggests there may be important information in the coefficients over .

January 15, 2010 at 2:45 am |

By the way, I’m looking back over the old threads and this seemed to drop off a while back, but did we ever find a quasi-multiplicative sequence (factoring through ) with discrepancy 2 that was significantly longer than the longest completely multiplicative sequence? (I know Alec had a Python script which was meant to compute the longest such sequence, but it was too slow.) This seems to be one of the major things missing from the experimental results page, and I’d really like to be able to take a look at such a thing.

January 15, 2010 at 5:33 am |

Alec, I’m having trouble getting the depth-first search to work. I’m trying to get the four sequences you report for N(1,2), but I get no sequences. I set N to 41, C_min to -1, and N_TO_COUNT at 41, but the maximum sequence it manages is 11. That’s the only thing I change in the code.

January 15, 2010 at 5:52 am

For that matter, the N(1,2) sequence on the website doesn’t make sense to me either…

0 + – – + – + + – + +

0 1 0 -1 0 -1 1 2 1 2 3

am I reading this wrong?

January 15, 2010 at 5:56 am

Never mind. I really should get some sleep. Still can’t get the code to work, though.

January 15, 2010 at 7:43 am

Jason, you also need to remove the EXCLUDE conditions, either by changing the definition of EXCLUDE to ‘#define EXCLUDE 0’ or by commenting out the text ‘ || (EXCLUDE)’ in the program.

And another thing I should have mentioned is that init_x should not be longer than N. I neglected to put a check for this in the program.

January 16, 2010 at 1:22 am

Thanks, I got it working. I also modified the code display all the sequences (instead of only one) and should note that N(1,2) has only slight differences between the valid sequences:

0+–+-++-++–+-++-+—+-++–++-+-++–++-+-

0+–+-++-++–+-++-+—+-++–++-+-++—+++-

0+–+-++-++–+-++-+—+-++–++-+-++—+-++

0+–+-++-++–+-++-+—+-++–++-+-++—+-+-

January 15, 2010 at 11:16 am |

Alec, I’m writing some code to look at quasi-multiplicative sequences, and I’m attempting to do it — perhaps naively — by modifying your code for completely multiplicative sequences. However, I’m getting stuck at one point interpreting the code:

xx = [[[False]*n_primes, 1]] # 0th item = specification of primes for which x[p]=-1, 1st item = partial sum

It seems like xx is meant to contain some data about every low-discrepancy list of the length we’re at; but I can’t for the life of me figure out what data it contains. Can you shed some light on this situation?

January 15, 2010 at 1:05 pm

In that script, xx is a list of representations of good sequences. Each of those representations is a two-member list consisting of, first, a list of booleans corresponding to whether or not each prime is mapped to , and secondly, a number representing the resulting partial sum. In other words, xx[m][0][i] is True if and only if sequence m maps prime i to ; and xx[m][1] is the sum of sequence m.

January 15, 2010 at 11:31 am |

I’m going to post a question over at mathoverflow, because I’m almost certain there’s a good answer to it but I don’t know the answer myself. Here’s the question.

Suppose you take a sequence and you want to “improve it” by taking pointwise limits of translates. What properties can you guarantee to get in the limit?

Two examples illustrate what I think should be the extremes. If your sequence is random, then every finite subsequence occurs infinitely often, which means that it is easy to find translates that converge pointwise to any sequence you like. In particular, you can make the final sequence constant, though what interests me is that it is highly structured.

By contrast, let’s suppose that the sequence is quasiperiodic, or more concretely that you take a real number and define to be 1 or -1 according to whether the integer part of is even or odd. In that case, it is not hard to prove that every pointwise limit is quasiperiodic as well. (Unless I’ve made a mistake, the result is that there must be some such that is 1 or -1 according to whether the integer part of is even or odd.)

What can be said in general? Can we always “get rid of the randomness” and find a highly structured limit?

In our case I’m hoping to apply whatever insights people can provide to subsequences of the form , since if these are highly structured, then that should imply multiplicativity properties.

Incidentally, this time I tagged the question with polymath5. I agree that doing that is a good idea.

January 15, 2010 at 1:18 pm |

Here’s another possible mathoverflow question, if nobody here knows the answer. Depending on the answer, it might illuminate the situation with multiplicative sequences.

Let be the set of number whose squarefree part has no prime factors larger than k. Does contain arbitrarily long intervals?

Anticipating a negative answer, set to be the length of the longest interval in . Clearly , but does it grow linearly?

January 15, 2010 at 1:30 pm

One could prove some nonrigorous bounds by assuming that the maximum gap between consecutive primes is . That would mean that in order to get a long interval of the kind you want, you’d have to go exponentially far in , at which point you might argue as follows. A number has a positive probability of being square free, and typically you expect a number to have about prime factors, so in particular it will have a factor of size at least . If , then that gives us , which is a lot bigger than .

I haven’t checked carefully whether what I’ve just written is a load of rubbish, so it might be. But if not, then it suggests that long intervals, other than the trivial one, will be hard to find.

Having said that, it might well be enough to have long intervals in which

almostall the numbers belonged to . In that case, it would no longer be necessary to go a long way out to avoid primes, since one could afford a few primes. On the other hand, primes aren’t the only thing you need to avoid: for example you don’t like numbers that are products of primes with something small.January 15, 2010 at 6:32 pm |

Added to wish list: take some of the interesting long sequences that we’ve generated (particularly the ones that are

notconstrained to satisfy certain multiplicative relationships) and calculate the following. If the sequence has length , then work out the expected value of over all quadruples such that and all of are at most . For a random sequence one would expect this quantity to be fairly close to zero. For a completely multiplicative sequence it is 1, which is the trivial maximum. For a quasimultiplicative function with a small group it should be a not too small positive constant. My guess is that for the first sequence of length 1124 it will be something like 0.3, but I’d like to know whether that is right.January 15, 2010 at 10:20 pm

Tim, I’m computing it at the moment, I have a preliminary plot here which I’ll update tomorrow http://thomas1111.wordpress.com/2010/01/15/average-over-quadruples-of-the-first-1124-sequence/ Indeed it looks to be around 0.3ish…

January 16, 2010 at 1:33 am

That’s interesting. It would also be interesting (and much quicker) to do a similar computation with Alex’s discrepancy-1 sequence for the complex case. My guess would be that the parameter works out larger in this case.

January 16, 2010 at 9:18 am

The first computation is now finished (just short of 12 hours on my laptop, at 50%CPU): actually the final value is 0.493215…, so just about midway between complete randomness and complete multiplicativity, with a total of 2 541 424 such quadruples.

January 16, 2010 at 3:50 pm

I may have been calculating something different from Thomas. I looked for quartets for which ab=cd, and found 952678 of them. For example, there were 435 quartets with ab=30240 because there were 30 pairs with ab=30240, from 27*1120 to 168*180. It took about ten minutes on a 2GHz machine using Matlab.

The first length-1124 sequence gave a sum 310378 from 952678 terms, or average 0.3258. The second sequence a sum 299842 from 952678 terms, average 0.3147.

The 1112 sequence gave a sum 377253 out of 930443 terms, average 0.4055.

For the complex sequence, I guess you want to sum times the conjugate of . That gave a sum 2814 from 5535 terms or .5084.

January 16, 2010 at 4:04 pm

Oops, yes I did indeed want conjugates there. As I expected, the parameter is larger, but I have to admit that I thought it would be larger still. But it’s interesting to see this.

January 15, 2010 at 9:11 pm |

I decided to try searching for discrepancy-2 sequences with bounded Dirichlet inverses.

If we require the Dirichlet inverse to be bounded by 1, we obtain a multiplicative sequence of maximum length (246).

If we require it to be bounded by 2, we obtain a sequence of length 389. I haven’t looked into the properties of this sequence, but it is interesting that 389 is a frequent sticking-point in searches, so perhaps it represents the limit of some other useful property. I’ll post this sequence on the wiki.

The search with a bound of 3 on the Dirichlet inverse is still running (and has reached 489).

January 15, 2010 at 9:48 pm

Alec, have you tried to work out in what region the Dirichlet series converges if is a sequence of bounded discrepancy? Given the close relationship between the fact that the partial sums of the Möbius function are expected to grow at rate and the Dirichlet series is expected to converge for (and many similar facts), it seems at least possible that one could prove convergence for . I haven’t thought about this properly, but I’m tempted to see what I can prove using partial summation.

Actually, now that I’ve written that I suddenly feel much less sure about it: it seems to me that either it will be true for any sequence with bounded partial sums, in which case it isn’t very interesting to us since it isn’t using the discrepancy property, or it would need boundedness on all APs and not just HAPs, which is impossible. But it would still be good to clarify this.

January 15, 2010 at 10:11 pm

Tim, yes, we have convergence for . For this it is enough that

.

(This is proved in Hardy and Riesz, ‘The General Theory of Dirichlet’s Series’, pp. 6–7.)

So you are right, this fact doesn’t depend on the discrepancy property in any very useful way (unless we can get some handle on the Dirichlet series defined by the ).

January 15, 2010 at 10:17 pm

After reading that I realize that you have already pointed it out …

January 16, 2010 at 8:39 am

I’ve now analysed the 389-sequence that has DI bounded by 2. Up to 337, it is almost completely multiplicative, but departs at powers of 3 — while remaining multiplicative in the weaker sense. That is:

January 16, 2010 at 9:55 am

And in fact, that is almost as far as you can get with ordinary multiplicativity. The longest discrepancy-2 sequence with ordinary multiplicativity has length 344, and indeed satisfies the above identity.

January 16, 2010 at 1:01 am |

I may be missing something, but it seems to me that the first result of Borwein and Choi stated in their abstract is

muchsimpler than they make it seem. I’d be interested to know whether others agree with me about this.Note first that if n is congruent to 2 mod 3, then the number of prime factors (with multiplicity) that are congruent to 2 has to be odd. More generally, this is the case if the last non-zero digit of n in its ternary representation is a 2. So equals -1 if the last non-zero digit is a 2, and 1 otherwise. This is our old friend the function that Mark Walters defined. The partial sum of this function up to n can be computed (as I did in my first post) by partitioning the numbers up to n according to where the last non-zero digit is, and in that way one sees that it is exactly what they say, the number of 1s in the base-3 expansion. End of proof!

I’d be pretty surprised if their more general “surprizing” formula isn’t equally simple to prove. (An even bigger mystery about the paper is how they were allowed to get away with that Z.)

January 16, 2010 at 1:09 am

Just checked the mod-5 one, and it is indeed equally simple.

A couple of minutes later: just checked the mod-p one, and again it is equally simple.

January 16, 2010 at 2:17 am

I agree that it is as simple as you make it.

January 16, 2010 at 11:25 am

I’ve just found the published version of the paper, in which they’ve added a coauthor (Michael Coons) and presented the proofs in a way that makes them look less complicated. It looks to me as though they are essentially carrying out the investigation that you were interested in a week or so ago, though I haven’t checked how far they got. I’ll add this reference to the wiki.

January 16, 2010 at 2:32 am |

The following trees represent ways to assign primes in a completely multiplicative sequence. If a leaf is labeled {N,k}, it means that the corresponding assignment has with absolute value “drift”.

drift = 2

drift = 3

drift = 4

January 16, 2010 at 2:48 am

drift = 5

January 16, 2010 at 11:10 am

I find these diagrams interesting but I don’t understand them. For example, if I follow the branch that corresponds to the sequence posted by Alec on the wiki in the drift-5 diagram, it stops at 23 with the pair (23,7). What does that mean? (The values of the sequence up to 23 are 1+ 2- 3- 4+ 5- 6+ 7- 8- 9+ 10+ 11+ 12- 13- 14+ 15+ 16+ 17- 18- 19+ 20- 21+ 22- 23- .)

January 16, 2010 at 11:50 am

By the way, I’ve posted on the wiki a completely multiplicative sequence of length 627 that satisfies . I don’t know whether this is the limit.

January 16, 2010 at 1:28 pm

Alec, there are a few further bits of information that would be very useful to know in relation to your C=3 sequence, and you can probably provide them with your computer much more quickly than I can by hand. The main thing that would be good would be a list of the values that the sequence takes at primes. But it would also be interesting to know the largest m such that your sequence up to m can be extended to a completely multiplicative sequence of maximal length with C=2. I ask this because my hand-calculation suggested that if you go as far as you can with C=2 then you are suddenly forced to go up to C=4, so I’m curious to know how far you have to backtrack if what you actually want is C=3.

I’d also be interested to know roughly how your algorithm works, as it would be pointless to speculate about cleverer algorithms if they aren’t in fact any cleverer.

January 16, 2010 at 1:43 pm

I’ve found one of length 819 now (and put it on the wiki). I’ll do some analysis on it and report back.

The algorithm is not clever: it’s just an incremental depth-first search. I am intending to try to implement something a bit cleverer, with more look-ahead, this afternoon, and see how far that gets.

January 16, 2010 at 1:47 pm

I’m hoping it will get you

muchfurther, since that would suggest the possibility of a very interesting result (sublogarithmic growth for a multiplicative function).January 16, 2010 at 2:00 pm

Yes, that would be exciting; the jury’s still out on that question I think.

Right, I’ve added a list of the primes that map to . This list does not include , and therefore the sequence can’t be extended to a discrepancy-2 sequence of maximal length once it gets past . In fact, its discrepancy goes to at .

January 16, 2010 at 3:21 pm

@gowers: I think you followed 19- on the tree, whereas Alec’s has 19+. If you take 2,3,5,7,13,17,19,23 to – and 11 to +, then you get

0 1+ 2- 3- 4+ 5- 6+ 7- 8- 9+ 10+ 12- 13- 14+ 15+ 16+ 17- 18- 19- 20- 21+ 22- 23-

and the sum of the seven entries, the last of which is 23, is – – – – + – -, the sum of which is -5.

Alec’s sequence ends with label {372,5}: all five of 368, 369, 370, 371, 372 are negative.

January 16, 2010 at 3:27 pm

Alec’s new C=3 sequence (length 819) has drift 5 at 72,73,74,75,76 (all +).

January 16, 2010 at 3:30 pm

Yes that’s exactly what I did, so my confusion is now dealt with.

It occurs to me that another measure of drift would be the maximum difference that you can prove must occur between two partial sums. So if, for example, you have chosen the primes up to n and you find a consecutive block of k integers of which all but j are guaranteed (by multiplicativity) to take the value +1, then the drift is at least k-2j. I wonder how much difference that would make. I also wonder how a greedy algorithm would fare if it tried to minimize the drift at every stage.

I’m assuming that you calculate the drift of an interval only when the function is defined everywhere in that interval: am I right about that? (What I’m suggesting is that for not-yet-defined values one just assumes that they will end up making the sum in the interval smaller rather than bigger. An advantage of this is that it is a bit more continuous: if you choose the value at a large prime, it is unlikely to affect any drifts by more than 2.)

January 16, 2010 at 3:41 pm

The fact that Alec’s sequence takes the value 1 at 7 is a bit of a blow for greedy algorithms. Does it seem that the maximum length of a multiplicative sequence of discrepancy 3 that equals -1 at 7 is much shorter, or is this just an artefact of the algorithm (as might be the case if, say, you always started with and changed it only if you needed to when backtracking — but it would seem more natural to me to start with )?

January 16, 2010 at 3:52 pm

It could well be an artefact of the algorithm. This search actually did start with and I don’t think it’s tracked back that far. I’ll try using a maximal sequence as initialization and see what happens then…

January 16, 2010 at 3:53 pm

Started with , I mean.

January 16, 2010 at 4:39 pm

The good news for greedy algorithms is that you don’t have to change the values at any primes less than 71 if you want a long discrepancy-3 sequence. I’ve found one of length 545 (now on the wiki) that agrees at primes up to 67 with the maximal discrepancy-2 sequences, and that retains discrepancy 2 up to 243.

But this seems to contradict your earlier observation, so maybe one of us has made a mistake?

January 16, 2010 at 4:47 pm

I hope that if one of us has made a mistake then it’s me, as I wasn’t very pleased about the sudden jump from 2 to 4. My justification for that assertion was here. If I have a moment, I’ll check it again.

January 16, 2010 at 4:49 pm

Yes, I only calculated the drift of intervals that are completed determined. The calculation took less than a minute for drift=5. The computation for drift=6 ran for 10 hours and still was only about 40% done. The easy 40% (biased toward +’s and away from -‘s), actually.

The code isn’t optimized, and actually is running in Mathematica, so the timing isn’t interesting. The ratio of time is, however. The COMBINATORIAL EXPLOSION happens between drift=5 and drift=6.

Perhaps interesting is that the longest branches of the trees seem to be of the alternating sign sort. Sometimes two plusses followed by two minuses.

January 16, 2010 at 5:29 pm

Tim, in your analysis did you skip 244 (which maps to +1)? I’m also not sure why you say the sum to 241 must be at most 1 if we are allowing discrepancy 3.

January 16, 2010 at 3:46 am |

[…] this there is now a third thread for polymath5 see here. Let me update further there is now a fourth thread. There is also a polymath5 tag on Math […]

January 16, 2010 at 1:52 pm |

[…] 16.01.10: dentro do mesmo espírito, está em curso o The Erdős discrepancy problem, no blogue do Professor Gowers, que já vai no seu quarto artigo sobre este problema. […]

January 16, 2010 at 5:07 pm |

I have just tried a little computational (but simple enough to do by hand) experiment on the first 1124 sequence, and am absolutely fascinated by the result. Indeed, I now think there’s a real chance that we will be able to understand what this sequence is and perhaps even define an infinite extension of it.

The experiment was very simple. I took the numbers in turn and for each I wrote out the values at . Here is a table of the results. I start with what n is and then give the values. I have deliberately shifted the rows by varying amounts (I hope it will come out nicely when I submit the comment, but if not then perhaps someone like Thomas can produce a nicer table). Added later: sorry, it hasn’t come out all that well, so a proper table would be great.

2: – + – – + – – + – +

3: + – + + – + + – +

4: . + – – + – – + – +

5: . – + + – + + – +

6: . – + + – + + – +

7: . . – + – + + – + –

8: . . – – + – – + – +

9: . + – – + – – + –

10: . . + + – + + – +

11: . . – – + – – + –

12: . . + + – + + – +

13: . . + + – + + – +

14: . . . + – + + – + –

15: . . – – + – – + –

16: . . . – + – – + – +

17: . . . + – + + – + –

What I find remarkable about this table is just how perfect it is. The only anomalous value is (which will come as no surprise to Mark). If you changed that, then everything would be a shift, or minus a shift, of the sequence that you get for 2. (Perhaps I should have started with the row at 1.) Also, the row for 2 looks to me very like a “periodic sequence with irrational period”. Could it be the golden ratio? Also, it is notable that the amount you shift is closely related to the size of the number. So if we count the row I didn’t put (starting with 1) as the basic row, then rows 2 and 3 are shifted by 1, rows 4,5,6,9 are shifted by 2, rows 7,8,10,11,12,13,15 are shifted by 3, and rows 14,16,17 are shifted by 4. (The shifts at powers of 2 obviously have to be what they are.)

In the Polymath spirit, I am mentioning this straight away rather than trying to work out exactly what is going on and reporting back. One question I’d like to answer is whether the sequence + – + – – + – – + – + is compatible with a period that is related to the golden ratio. I very much hope so as I haven’t quite let go of that little fantasy. (That’s the sequence of values at 1,2,4,8,16,32,64,128,256,512,1024.) Another is whether everything continues to work wonderfully well if you look at powers of 3, though the sequences will be a bit short. Here’s what the values are at 1,3,9,27,81,243,729: + + + + – – -. And at twice those numbers: – – – + + +. Hmm, it’s looking pretty good!

January 16, 2010 at 5:20 pm

This also suggests that a modification to an experiment I suggested earlier could produce lovely long sequences. The earlier idea was to fix a sequence for powers of 2, given by a “periodic sequence with golden-ratio period” and then to insist that the values at d,2d,4d,8d,… were always given by either this sequence or minus this sequence. It now looks as though what I should have suggested was that the values were always given by a shift of this sequence, and that the shift associated with n should be roughly . With hindsight, this makes much better sense, since we shift by exactly at powers of 2.

So there’s a new item that’s jumped right to the top of my wishlist. Alec, can you get any super-long sequences this way?

January 16, 2010 at 5:22 pm

One other tiny thought is that the “anomaly” at 7 might actually be that the amount of the shift is unexpectedly large, or that it corresponds to a small phase shift that is just enough to tip the value at 7 from + to -.

January 16, 2010 at 5:33 pm

I’ve now found a golden-ratio-related sequence that agrees with the values of the 1124 sequence as you go along powers of 2, starting at 1. I’m getting excited — sorry, can’t help it.

January 16, 2010 at 7:43 pm

You’ve probably realized this already, but since you didn’t say it: If the sequence were perfectly multiplicative all the shifts would be 0 (and the rows would be periodic of period 1 or 2.) If the sequence were random, we wouldn’t expect any sort of structure like this to emerge.

I’m not

entirelyconvinced that this is a phenomenon that’s really unique to the 1124-sequence — it just seems to encapsulate some of the multiplicative structure in a rather nice way.January 16, 2010 at 8:49 pm

I looked at the 1124 sequence using powers of 3 instead of powers of 2, and the analogous subsequence appears to begin:

+ + + + – – – + …

With the following shifts:

n = 1: Shift of 0

n = 2: Shift of 1

n = 3: Shift of 1

n = 4: Shift of 2

n = 5: Shift of 2

n = 6: Shift of 2

n = 7: Shift of 0 (?!)

n = 8: Shift of 3

n = 9: Shift of 2

n = 10: Shift of 3

n = 11: Shift of 3

n = 12: Shift of 3

n = 13: Shift of 3

This is of course a very small sample but it does in fact look as though the shifts may be of size log n.

January 16, 2010 at 6:36 pm |

Interesting — there’s definitely a pattern here! If we write , and (to a good approximation) , where and is the shift (), then the seem to have quite low discrepancy too, as far as you’ve calculated.

January 16, 2010 at 6:44 pm |

I calculate, for :

January 16, 2010 at 8:12 pm

To continue a bit further, for :

: +1, -1, -1, -1, +1, +1, -1, +1, -1, -1, +1, +1, +1, -1, +1, -1, -1, +1, +1, +1, +1

: 0, 1, 2, 3, 2, 3, 3, 3, 4, 3, 4, 3, 4, 3, 5, 5, 4, 5, 4, 4, 5

Apart from one anomalous value in the 3 sequence (the one starting at 7) and one in the 16 sequence (the one starting at 33), the agreement is perfect up to here.

January 16, 2010 at 8:33 pm

Alec, I’d like to draw a distinction between serious anomalies and anomalies that are potentially explicable. I should say first what I now think the sequences may be. It seems to me that you get them by continually adding the golden ratio (where for convenience I’ll define this to be and assigning a + if the integer part changes and a – if it doesn’t, or the other way round. That’s equivalent to adding the golden ratio mod 1 and making your answer depend on whether you’re in a golden-ratio-sized interval or not.

If that’s the case, then each sequence will either be biased towards 1 or towards -1. If the former, then it should never have two -1s in a row. So I’d be interested to know whether the sequence for 16 has this property (that it doesn’t have both two 1s in a row and two -1s in a row).

Come to think of it, I can work this out for myself, so here goes. I get + – + + – -, which means that, disappointingly, it seems to be a serious anomaly rather than a mild one. (However, the anomaly first shows up at 1056, so maybe there are other examples that are OK.)

January 16, 2010 at 8:51 pm

In general, I now think we should be looking for a sequence defined as follows. (It may be necessary to modify this definition — this is a first attempt.) For the th prime we associate a sign and a real number . Then if we calculate the number mod 1. Next, we work out the product , and we set to be if belongs to a golden-ratio-sized interval and otherwise.

For this to work, it’s quite important that the signs Alec was calculating above (but renamed so that I define to be what he calls ) should be completely multiplicative. To my great delight, they seem to be.

January 16, 2010 at 8:55 pm

I meant to say in the previous comment that I think there’s a good chance of getting somewhere by making the wild guess that every is a small integer multiple of the golden ratio. From that it should be possible to guess what each is, and we can guess what the sign is just by seeing whether the corresponding sequence (when you multiply by powers of 2) has more 1s or more -1s. So one ought to be able to take the 1124 sequence and extract from it the formula that generates it — unless of course my wild guess is wrong.

January 16, 2010 at 9:03 pm

That definition, except for the part where we fix the size of the interval at , is a generalization of quasi-multiplicative sequence, right? In quasi-multiplicative sequence, we take all the to be rationals (with bounded denominators) and… hmm. I feel like there’s a small gap that may still need to be bridged — in quasi-multiplicative-land depends on but depends only on where is and not on any of the signs at the primes.

I still think we can, and almost certainly should, unify the two concepts. I have a sneaking suspicion that there’s some really deep, but rather well-understood, number-theory stuff lurking beneath the surface — adelic analysis? I don’t know enough about that stuff to be sure — and I’d love to have a clearer view of what it is.

January 17, 2010 at 2:08 pm |

Apologies if this has been considered before, I’m still trying to catch up with the whole conversation. Here’s a simple argument to suggest that there are length sequences of discrepancy 2. The argument is faulty, but I don’t know if fatally so.

There are , sequences of length such that and of length , so there are of lengthh . There are sequences in total, so the probability that a random sequence of length survives the sum test is . Looking at subsequence or our sequence, the probability that it survives the test is . (This is where the argument turns faulty, because I don’t know how correlated the events , are. I don't know if it kills the argument. Ignoring that and pressing on…) Continuing with the rest of the subsequences we get that the probability that the sequence has discrepancy is . Multiplying by the total number of sequences we get that there are length sequences of discrepancy 2.

Similar argument suggests that there are i.e. 0 for large enough sequences of discrepancy 1.

January 17, 2010 at 2:15 pm

I apologize for the LaTeX mistakes, this is my first time posting LaTeX in WordPress. I assumed I’d be put in a preview/fix mode, but I was wrong, and now I don’t see a way to edit it.

I meant to say that there are 4 x 3^{n-1} of +-1 sequences of length 2n and 2 x 3^n of length 2n+1. The norm brackets were meant to be absolute value brackets…

January 17, 2010 at 3:06 pm

I think there is likely to be a problem with probabilistic arguments, since it seems to me that the events “survives the d-HAP test” are likely to be negatively correlated. Somehow, the easiest way to survive the tests seems to be to be periodic with a different period (for instance, the sequence 1 -1 1 -1 1 -1 1 -1 … passes all tests when d is odd), so it feels as though if you pass one test it will increase the chances that you will fail another. This is of course a completely non-rigorous argument but for now I find it convincing.

January 17, 2010 at 2:41 pm |

I think that another interesting experimental direction would be exploring the structure of the trie constructed by the C=2 sequences by breadth first search instead of the depth first search, has anybody done that? I expect the width of the tree would overwhelm the memory pretty early on in the levels, so some judicious sampling would be necessary. The idea is to see whether the sequences are statistically similar to samples of some sort of a stochastic process. I’ll try to run some C code for that and report back.

January 17, 2010 at 2:54 pm |

The idea was mentioned in some of the replies to this comment. I’m not sure if anyone tried it.

The discussion is continued over here.