I am still working on this, but there is a rather serious error in the code on the wiki right now. Previously I had estimated the probability a relevant HAP would constrain index i and then used that to estimate the probability index i would have 0, 1, or 2 options when it had x constraining HAPs and an arbitrary HAP had probability p of being +2 or -2 instead of 0. Then I added additional logic to determine each individual constraining HAP’s bias toward 0, but instead of redoing the second part of the calculation, I accumulated all of the bias into the generic probability p and THEN said “oh but we have x HAPs”. So if for example each HAP had a 60% chance of being 0 instead of 50%, and there were 10, I’d say “okay a generic HAP here has 1.5^10/(1+1.5^10) chance of being 0 instead of 50%, and I have 10 HAPs each with that probability”.
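To put rough numbers on that example, here is a small Python sketch. The function name and the "all ten HAPs are 0" comparison are my own illustration, not the calculation on the wiki; the only inputs taken from the description above are the 60% figure and the count of 10.

```python
def accumulated_probability(p_zero: float, n_haps: int) -> float:
    """Fold the bias of n_haps HAPs into a single 'generic' probability via odds."""
    odds = p_zero / (1.0 - p_zero)      # 0.6 gives odds of 1.5
    total_odds = odds ** n_haps         # 1.5 ** 10
    return total_odds / (1.0 + total_odds)


n_haps = 10
p_single = 0.6                          # per-HAP chance of being 0, from the example
p_generic = accumulated_probability(p_single, n_haps)

# Illustration only: chance that all ten HAPs are 0 under each reading.
all_zero_intended = p_single ** n_haps  # each HAP keeps its own 60%
all_zero_buggy = p_generic ** n_haps    # bias counted once per HAP, then again for x HAPs

print(f"accumulated generic probability: {p_generic:.3f}")
print(f"all zero (intended): {all_zero_intended:.4f}")
print(f"all zero (buggy):    {all_zero_buggy:.4f}")
```

The accumulated probability comes out at roughly 0.98, so treating each of the 10 HAPs as having that probability applies the same bias twice.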

So the fact that it got down to ~6 exactly at 1125 really WAS a coincidence. But I’m still optimistic about the approach since the distribution has the right shape, just haven’t gotten to fixing that glaring error and adding in some additional thoughts I had. Probably a weekend project.

This is exactly what I was wondering on my way home from work, and I’m hoping the investigation of shifted low-discrepancy sequences might shed some light on it. I have just got my program working — I hope. Too soon to draw many conclusions: I’ve just run it a couple of times, with different and different seeds; each time it’s got to lengths in the 200s without difficulty (and without terminating).

I’m slightly struggling with the following question: is it easy to tell from the values between some large and whether a function is the restriction of some multiplicative function?

We can characterize multiplicative functions as mod-2 sums of characteristic functions of HAPs with prime common difference (when you convert 0s into 1s and 1s into -1s). Thus, we could characterize restrictions of multiplicative functions to as mod-2 linear combinations of all prime HAPs that intersect that interval.

That gives us what one might call a global way of testing whether a function is a restriction of a multiplicative function, and it gives a polynomial-time algorithm for it. But that’s not what I mean by “easy to tell”. To see what I mean, consider an alternative characterization of multiplicative functions: they are ones where and if then . That one might call a local characterization. (If we omit the then we characterize functions that are either multiplicative or minus a multiplicative function.) Equivalently, if you go along any GP, the values either alternate or are constant (and ).
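As a small illustration of the GP remark, here is a Python sketch of a necessary condition that can be checked on an interval. It assumes "multiplicative" means completely multiplicative with values ±1. In that case, whenever a, b, c form a three-term GP (so that ac = b²), the values at a and c must agree. I make no claim that passing this test is sufficient, and the helper names are my own.

```python
from math import isqrt


def liouville(n: int) -> int:
    """Liouville function: (-1) to the number of prime factors counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def gp_violations(values: dict[int, int]) -> list[tuple[int, int, int]]:
    """Return triples (a, b, c) with a < b < c, a*c == b*b and values[a] != values[c]."""
    keys = sorted(values)
    present = set(keys)
    bad = []
    for i, a in enumerate(keys):
        for c in keys[i + 1:]:
            b = isqrt(a * c)
            if b * b == a * c and b in present and values[a] != values[c]:
                bad.append((a, b, c))
    return bad


# Restriction of a genuinely multiplicative function to an interval: no violations.
restricted = {n: liouville(n) for n in range(100, 401)}
print(len(gp_violations(restricted)))

# Flip a single value and the GP 100, 120, 144 now breaks the pattern.
tampered = dict(restricted)
tampered[100] = -tampered[100]
print((100, 120, 144) in gp_violations(tampered))
```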

Is there anything that could count as a local characterization in the shifted case? I don’t have a precise formulation of this question, or I suspect it would be easy to answer.

I hope to add some more tutorials in the coming two weeks, on basic C and C++ for modifying code and on shell scripting for processing data files.

It works! Thanks.

I’ve changed it to 25. Let me know if it works. (But mysteriously I have an RSS comment feed for my own blog and it seems to display far more than just the last 10 comments.)

The one-level threading makes it a bit difficult to make sure that you have read all the comments. One way to do it is to follow the RSS feed, but it only contains the latest 10 comments (you could also click “Notify me of follow-up comments via email”, but then you get a lot of emails). It would be great if you would increase this number in Settings -> Reading -> “Syndication feeds show the most recent” (I think this works for post feeds as well as comment feeds, but I’m not sure).

Another idea:

Instead of trying to find the longest sequence _beginning_ in N with discrepancy (or should I say drift) at most C, we could try to find a sequence _around_ N. So we decide the value in N, then N+1, then N-1,… I don’t really think that this would be a better experiment, but it might at least give a different result.

Note that working backwards from N (decide N then N-1 then N-2,…) would be the same as working forward from -N or from (n!-N) for some large n.

I wondered about that and then sort of reassured myself that it didn’t. But now that you mention it I think that my self-reassurance was wishful thinking. In fact, it looks as though all you have to do is a shift to make sure that the biggest power of 3 you care about is shifted to the right place and that any number that isn’t a power of 3 will take care of itself.

So I take back my hopes of a much larger discrepancy, and I also take back my statement that I didn’t see much chance of structure. I still can’t work out quite what structure I would expect to see though. Perhaps that will become clearer if we get an example to look at.

“It’s conceivable that the discrepancy could even be much bigger than logarithmic, because in order to get to the point where the shifts are suitably random one might have to go exponentially far out (so it wouldn’t contradict the Walters example).”

Doesn’t the corresponding shift of the Walters example give a logarithmic discrepancy? I haven’t checked, but I assumed that that was the case when I wrote this comment. I just realized that what you suggested in this comment is a special case of what I wrote about (looking at a set of intervals, instead of just at one interval). But I think this is the most (only?) interesting case, and it would be very interesting to see the result of Alec’s experiments.

If you do find long sequences it will be particularly interesting to see whether they have any discernible structure. Very broadly speaking indeed, I am holding out the hope that random shifts will somehow randomize the sequence itself and thereby force it to have a large discrepancy. It’s conceivable that the discrepancy could even be much bigger than logarithmic, because in order to get to the point where the shifts are suitably random one might have to go exponentially far out (so it wouldn’t contradict the Walters example).

I’ll try this tonight. I can see why it might be harder to find long sequences (because you’d tend to hit numbers that intersect a lot of PAPs sooner), but also why it might be easier (because for that very reason you’d be forced to adjust to the stronger constraints earlier on). It will be interesting to see the results.

I can be slightly more precise now than I was before. Let us choose, for each prime $p$, a “privileged” residue class $r_p$ mod $p$. And let us call any initial segment of that residue class (by which I mean that one picks some $n$ and takes the set of all numbers that are congruent to $r_p$ mod $p$ and at most $n$) a *privileged arithmetic progression* mod $p$, or PAP. Let us also do the same for prime powers, choosing $r_{p^k}$ in such a way that if $j<k$, then $r_{p^k}$ is congruent to $r_{p^j}$ mod $p^j$. (This is to guarantee that the infinite PAP mod $p^k$ is a subset of the infinite PAP mod $p^j$.) And finally, let us say that any intersection of PAPs is a PAP. By the Chinese remainder theorem, for every $d$ there will be exactly one $r_d$ such that the set of positive integers congruent to $r_d$ mod $d$ is an infinite PAP with common difference $d$.

Also by the Chinese remainder theorem, for every choice of for finitely many prime powers we can find some that is congruent to mod for every . If we then shift the origin to look at the HAPs beyond , they behave just like PAPs. So if we can find *some* choice of shifts that force unbounded PAP discrepancy, then we have proved EDP.

The motivation for bringing up this concept is that it seems at least possible that many phenomena we observe at the moment are misleading and depend on the fact that all the HAPs are “lined up” at zero, whereas the above observation shows that the lining up is not really an essential feature of the problem. (What is essential is that we’re allowed just one congruence class for each and that the congruence classes must be “consistent” with each other.)

I think it would be interesting, and perhaps even quite enlightening, to investigate empirically some fairly random shifted version of the problem. To set it up, choose a very large integer randomly (between and , say) and define the infinite PAP mod to be the set of all positive integers that are congruent to mod . A finite PAP is just an initial segment of that arithmetic progression, and the discrepancy of a sequence is the discrepancy with respect to all finite PAPs. What I would like to know is this: if we search for sequences of PAP discrepancy 2, do we end up finding very long examples, just as in the lined up HAP case, or do the best ones tend to be much shorter? (I suppose it’s also possible that they could end up being much longer — that too would be interesting.)
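For anyone who wants to try this, here is a minimal Python sketch of how the PAP discrepancy of a given ±1 sequence might be computed. Two things in it are assumptions of mine rather than part of the description above: the PAP mod d is taken to be the positive integers x with x + shift divisible by d (the "HAPs beyond the shift" convention; the opposite sign would serve equally well for a random shift), and the range for the random shift is a placeholder.

```python
import random


def pap_discrepancy(seq: list[int], shift: int) -> int:
    """Largest |partial sum| over all finite PAPs; seq[i] is the value at position i + 1."""
    n = len(seq)
    worst = 0
    for d in range(1, n + 1):
        # Smallest positive x with x + shift divisible by d (assumed convention).
        first = (-shift) % d or d
        total = 0
        for x in range(first, n + 1, d):
            total += seq[x - 1]
            worst = max(worst, abs(total))
    return worst


rng = random.Random(1)
shift = rng.randrange(10**6, 10**9)          # hypothetical bounds for the random shift
seq = [rng.choice((1, -1)) for _ in range(200)]
print(pap_discrepancy(seq, shift))
```

With `shift = 0` the progressions are the usual HAPs, so the same function can be used to compare the shifted and lined-up cases on the same sequence.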

If they end up much shorter, it suggests that it might be easier to prove that PAP discrepancy was unbounded (for some suitable, or perhaps random, sequence of shifts) than to prove it for HAP discrepancy. And yet the two problems are equivalent.

One further remark is that it is not at all obvious what structure one might expect to find in the PAP problem. For instance, I can’t think of a useful notion of multiplicativity, or a geometric progression, or anything like that. So there’s at least some hope that by shifting the HAPs we rule out some of the examples that cause us difficulties when we try to prove anything in the homogeneous case.

Another theoretical one to try (which can be a pain experimentally) is HAPs that allow up to *x* terms that force the sequence (for any *d*) outside the given discrepancy.

My fault — I was wrongly thinking that in that case the largest subset was the empty set, whereas in fact it’s the set of sets that you can take that is empty …

If the conjecture is true then is undefined because there is no subset on which we can choose values freely and still extend. Or am I barking up the wrong tree entirely?

Actually that last question was not a good one. If you choose the values at odd numbers randomly, then somewhere you’ll get a drift proportional to . Then either the drift of the whole sequence or the drift of the even numbers in that same interval will also be proportional to . But I think one could ask the question for a random assignment to numbers that are 1 mod 3.

I’m not sure I quite follow — I would expect that limit to be zero (assuming that the conjecture is true).

It occurs to me that this question could be investigated empirically. Suppose you fix a number like 1000, choose a random set of 200 numbers (say) between 1 and 1000, assign random values to those 200 numbers, and then try to choose the remaining values in such a way as to minimize the discrepancy. How well can you do?

A slightly better version of the experiment might be to choose the random set, but then to remove from it a few points so that you don’t get any long subintervals of HAP sequences.

What I would hope to see is a discrepancy that is forced to be quite large. Some of the numbers above might have to be adjusted, however.
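A rough Python sketch of the basic experiment is below. Finding the true minimum is a hard search, so the sketch substitutes a simple greedy rule for the free positions (pick the sign that keeps the HAPs through that position as balanced as possible). It therefore only gives an upper bound on what a clever extension could achieve. The function names and the seed are my own choices.

```python
import random


def hap_discrepancy(seq: list[int]) -> int:
    """Largest |partial sum| along any HAP; seq is 1-indexed and seq[0] is unused."""
    n = len(seq) - 1
    worst = 0
    for d in range(1, n + 1):
        total = 0
        for x in range(d, n + 1, d):
            total += seq[x]
            worst = max(worst, abs(total))
    return worst


def greedy_extension(n: int, fixed: dict[int, int]) -> list[int]:
    """Keep the values in `fixed`; fill the remaining positions greedily, left to right."""
    seq = [0] * (n + 1)
    sums = [0] * (n + 1)              # sums[d] = running sum along the HAP with difference d
    for x in range(1, n + 1):
        divisors = [d for d in range(1, x + 1) if x % d == 0]
        if x in fixed:
            value = fixed[x]
        else:
            # Worst |partial sum| among the HAPs through x, for each candidate sign.
            cost = {s: max(abs(sums[d] + s) for d in divisors) for s in (1, -1)}
            value = 1 if cost[1] <= cost[-1] else -1
        seq[x] = value
        for d in divisors:
            sums[d] += value
    return seq


rng = random.Random(0)
n = 1000
chosen = rng.sample(range(1, n + 1), 200)            # the random set of 200 numbers
fixed = {x: rng.choice((1, -1)) for x in chosen}     # random values on that set
seq = greedy_extension(n, fixed)
print(hap_discrepancy(seq))
```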

One other possibility might be to choose the values at odd numbers randomly. Then the “odd discrepancy” would be around $\sqrt{n}$, or around 30 if $n=1000$. How far can one bring that down by cleverly choosing the values at the even numbers?

Interesting. One could ask, given and , what is the size of the largest subset of on which we can choose the assignments freely and still be able to extend to a discrepancy- sequence on . For example, , since one can choose the values at and freely. If I understand the question, you’re asking whether is zero or undefined for all . (And what we actually expect is that it’s undefined.)

If we generalize slightly, replacing the limit by upper and lower limits and , and define the function analogously, I think I’ve shown (but would have to double-check) that .

Since I too had the same idea, let me briefly explain why it doesn’t work, just to save others from having it as well. The problem is that if there is a long sequence of discrepancy , then might be precisely the set of places where that sequence takes the value 1. If so, then restricting the values on to 1 isn’t a huge problem!

Since the values you choose on must therefore depend quite seriously on , and since it is far from clear how to make the choice, the best chance might be to choose them randomly. So a very tentative conjecture is this. If you choose the values randomly on a set of size , then every extension to the whole of has discrepancy at least . (But all I actually care about is getting it to be unbounded, so if this conjecture is way too optimistic, it doesn’t matter too much.)

That’s great — and I hope it won’t be the end of the story.

It would be quite useful to have a complete list of primes where corrections have taken place: that is, primes congruent to 1 or 4 mod 5 that take the value -1, or primes congruent to 2 or 3 mod 5 that take the value 1. One might expect such primes to come in pairs, with each pair enclosing a value that would otherwise have caused problems; or at least, one might expect this to start with, until the corrections themselves start to cause problems.
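Extracting such a list from a sequence should be straightforward. Here is a minimal Python sketch, assuming the congruences are meant mod 5 (as the base-5 remark further down suggests) and that the values at primes are available as a dictionary. The function name and the sample values are made up purely for illustration and are not taken from any actual sequence.

```python
def corrected_primes(prime_values: dict[int, int]) -> list[int]:
    """Primes whose value differs from the character-like default mod 5."""
    expected = {1: 1, 4: 1, 2: -1, 3: -1}     # residue mod 5 -> default value
    return sorted(
        p for p, value in prime_values.items()
        if p % 5 in expected and value != expected[p % 5]
    )


# Hypothetical values at the first few primes (the prime 5 itself has no default here).
sample = {2: -1, 3: -1, 5: 1, 7: 1, 11: -1, 13: -1}
print(corrected_primes(sample))   # [7, 11]
```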

It occurs to me that we might even be able to prove a theoretical result here. If we take a character-like function, it’s not just the case that we have to go exponentially far before reaching a partial sum that’s bigger than , but it’s also the case that even when we do start having large partial sums, they are very infrequent and surrounded by long portions of sequence with very low drift. So if is a troublesome value, then an obvious strategy for dealing with it is to find primes and close to and on either side of it. If the partial sum at is too large, then we also want to be congruent to 1 or 4 and to be congruent to 2 or 3. We then change the value at from 1 to -1 and the value at from -1 to 1.

There is a huge range in which we can look for suitable and (since the drift is small) — easily large enough for quantitative versions of Dirichlet’s theorem to guarantee that they can be found.

Of course, once you change the values at and , you have to change the values at all multiples of and . At least to start with, these will also come in nearby pairs, but pretty soon that is no longer the case, and after a while one will be making more and more adjustments until, presumably, all trace of the base-5 structure is gone. The interesting question is whether that allows us to reach a sequence of superexponential length. At the moment it feels to me as though it will remain exponential but give a bigger constant in the exponent.

My rough reasoning is that the interference from changed values of and could start to bite by about , especially if there are several adjustments being made and not just the first two. But that’s so rough that it could well be wrong. Perhaps we could get from an exponential construction to something more like . Perhaps we could at least argue heuristically that some improvement like that should be possible.

Maybe it is enough to assign only pluses?

*NB Gil almost immediately withdrew this suggestion — see the next main comment.*