I’ve been pretty busy over the last month or so — hence my blog silence — and continue to be busy. But here is one more discussion of a problem that was on my earlier list of possible future Polymath projects. The problem in question is sometimes known as Erdős’s discrepancy problem. This will be a shortish post, because I don’t have much of a programme for solving the problem: if it were adopted, then we would be starting almost from scratch, but that could be an interesting experiment.

Here is the question.

**Problem.** *Is it possible to find a -valued sequence and a constant such that for every and every ?*

With the help of a little terminology we can ask this question in a slightly different way. If is a collection of subsets of a set and , define the *discrepancy* of to be the maximum value of over all . In our case, is the collection of all arithmetic progressions of the special form , and the question is whether there is any function that has bounded discrepancy for this . I say “bounded” rather than “finite” because one can define the function to be the discrepancy of with respect to all those sets in that are subsets of . Then the question is equivalent to asking whether there is any function for which is a bounded function of . The advantage of this formulation is that one can consider more precise questions such as what the best growth rate is. In particular, it allows one to prove results in the other direction. Here is one such result, observed by Mark Walters (and very possibly by several other people).

**Theorem.** *Let if the last non-zero digit of is 1 when is written in its ternary representation, and let if the last non-zero digit is a 2. Then the growth rate of the discrepancy function is logarithmic in .*

** Proof.** Note that for any and , since the map , is an isomorphism from the multiplicative group of non-zero integers mod 3 to the group under multiplication and last non-zero digits multiply together mod whatever base you’re working in. It follows that the theorem is true if and only if the partial sums grow logarithmically.

To see that this is the case, partition the integers from 1 to into subsets according to which digit is the last non-zero digit. As you go through the elements of any one of these sets in increasing order, the values alternate between 1 and -1. Therefore, their sum is either 0 or 1. There are at most such sets (that are non-empty), so the partial sum in question is at most . QED

This example rules out any approach that attempts to prove that the arithmetic progressions are “approximately orthogonal” and to deduce the result from some kind of coordinate expansion, since such results would give bounds more like .

The thought that goes into it also shows that we cannot hope to prove a more general conjecture that replaces the condition that by a looser condition that asks merely for to be bounded away from 0 on average. Indeed, the sequence has discrepancy at most 1 on all arithmetic progressions of the given form. However, I do not at this stage know of any counterexample to the following generalization that is designed to avoid this problem.

**Problem.** *Let , let and suppose that for every the lim inf of the partial averages is greater than . Can there be a constant such that for every and ? *

One might want to make map to to rule out boring counterexamples. The motivation for generalizing the problem in this way would be to try to make it more amenable to analytic techniques, but Walters’s example will always be a serious obstacle.

If I think of more to say about this problem, then I’ll add it. But since this post is quite short, and since this is the last problem I wish to describe in detail for now, let me also discuss my current thoughts about the next Polymath project for this blog.

It has been interesting to find how much my perception of a problem changes when I blog about it. For example, the effort to describe my tentative approach to the P versus NP problem led to my understanding that it couldn’t work, and I now do not regard that as a suitable project (though I would very much like to tackle a more realistic complexity question at some point in the future). I have found the discussion about modelling mathematically the origin of life extremely interesting: I think we are still some way from formulating a problem that is clearly good, so I would still like to wait a few months at least before embarking on a project on that general theme. (However, it might well be that formulating a good problem would be a necessary part of any such project — in that respect it would differ in an interesting way from projects where the aim is to prove a clearly stated conjecture.) I found myself regrabbed by Littlewood’s conjecture as soon as I posted about that, with the result that I now know the answers to many of the problems that I thought (rightly, as it turns out) would be significantly easier than the main problem. The result is that the drawback with that project, that it is a notoriously hard problem, is shown in rather sharper relief. If I felt that that was clearly the most popular choice, then I would certainly be happy to go along with it, but I am not sure it is the most suitable.

The least risky choices, in the sense that they are the most similar in nature to DHJ, would be the polynomial-DHJ-related problem about graph differences or Erdős’s discrepancy problem.

I welcome any thoughts that people might have about this. Once there has been some discussion, I will put up a poll, though I must warn in advance that the process will not be fully democratic, since I will attach more than one vote’s worth of weight to my own enthusiasm and to my perception of whether there is likely to be a nucleus of people who are ready to think seriously about the given problem. One reason I am drawn to the polynomial-DHJ project is that I already know of people who are very interested in that and ready to think about it. If you are keen on one of the other problems in a similar way, then feel free to let me know. If you don’t want to declare yourself publicly on this blog, then you could always drop me an email.

**An additional thought about the problem.**

I forgot to mention a variant of the problem that occurred to me, which is to look at the complex case instead. That is, one could ask the following.

**Problem.** *Let be a sequence of complex numbers, each of modulus 1. Is it necessarily true that for every constant there exist and such that ?*

And once one has asked that, one sees straight away that it is not really a complex version of the problem so much as an version. So here is a more general variant.

**Problem.** *Let be a positive integer and let be a sequence of unit vectors in . Is it necessarily true that for every there exist and such that ?*

I feel it should be easier than the original problem. Obviously, the larger is, the easier it is to find a sequence that works in . But if it is *not* possible to find such a sequence in any dimension, then I think that trying to prove it in a higher dimension ought to be easier because it forces you not to use special and ultimately irrelevant properties of sequences.

It suggests yet another problem, this one a generalization of the original problem but a specialization of the problem in .

**Problem.** *Let be a partition of , let be a sequence, and let . Must there exist some and positive integers and such that , where ?*

If the answer to this problem is no, then the answer to the previous problem is no as well, since one can take an orthonormal sequence and define to be , where is the colour class that contains .

However, I hope that the answer is yes, as is suggested by the Walters example. There, as I commented earlier, you can define a sequence that has bounded discrepancy, but if you want to make it a sequence then you are forced to assign values to multiples of 3 as well. In fact, that suggests a slight variant of the previous problem that is more explicitly Ramsey-like.

**Problem.** *Let be a partition of . Does there exist such that for every and every sequence there exist and such that ?*

The difference between this problem and the previous one is that is not allowed to depend on the sequence . It may be that the two problems are equivalent: I haven’t checked.

Another way to make the problem easier by making it more difficult, so to speak, would be to try to prove that there is no infinite sequence with bounded discrepancy even if one restricts to arithmetic progressions with common difference d that is not divisible by any of 2,3,5,7,…,101 or something like that. Then it would be an easy matter to create extremely long finite sequences that did have bounded discrepancy, so in trying to prove that there couldn’t be an infinite sequence, one would not be distracted by small-number difficulties and would be forced to come up with more global arguments. Let me state this formally as yet another problem.

**Problem.** *Let be a positive integer, let and let be a sequence. Must there exist and such that is not divisible by any prime and such that ?*

I’m not quite sure whether this is the most interesting problem that results from restricting the possible . Other possibilities are just to insist that or to insist that is divisible by a prime that is at least .

December 17, 2009 at 2:17 pm |

A Hungarian-umlaut “o” can be generated from Unicode: concatenate the symbols

& # 3 3 7 ;

and the browser should interpret it correctly. I’ll try here:

Erdős

More information than you want is at http://unicode.org/charts/ – this particular datum is near the bottom of page 14 at the “Latin Extended-A” link (and 337 is the decimal equivalent of the hexadecimal 0151 listed there).

December 19, 2009 at 1:46 pm

Many thanks — got it sorted out now.

December 17, 2009 at 2:37 pm |

This is a very naive question. But: In the original formulation, the set of vertices is the infinite set and the edges are the arithmetic progressions for . Then in the reformulation your set of vertices is and the set of edges is the family of AP’s on [N]$? I’m getting a bit confused comparing to AP discrepancy and the Roth, Matousek, Spencer bound . Could you explain the differences?

December 17, 2009 at 6:28 pm

The difference is that the APs are always of the special form . So it’s definitely not about normal AP discrepancy, but just about these very special ones that start at 0.

December 17, 2009 at 4:25 pm |

Constructively, wouldn’t it work to take the Copeland–Erdős constant and for every digit that is =5 put -1 in the sequence? I don’t know the exact nature of their proof of normality but it likely could be modified for varying values of d.

December 17, 2009 at 4:26 pm

Ok, that got HTML-mangled: for every digit less than 5 put -1 in the sequence, and for every digit 5 or greater put 1 in the sequence.

December 17, 2009 at 5:16 pm

Yes, this works: all you need is a lemma that says that if a number m is normal for k digits (for all k greater than or equal to 1) then if you construct a new number r choosing every dth digit, the new number is normal (only needing for k=1). This is trivial because of the reduction in number of needed digits. Suppose given some d, r is not normal. But that means m is not normal when selecting d digits. Therefore we have a contradiction.

Copeland and Erdős’s original proof

actually showed any concatenated increasing sequence of integers will work, not just the primes. They were unable (and I believe it still open) to prove this for any base given the original numbers are written in a fixed base, but that’s irrelevant for our current problem.

December 17, 2009 at 6:27 pm

The problem with your suggestion is that the discrepancy in a progression with common difference d will depend on d, whereas the problem looks for an upper bound that is uniform over everything.

December 17, 2009 at 6:50 pm

You’re right. I would give near-zero hope to a constructive proof then.

December 17, 2009 at 10:21 pm |

It is an appealing problem. Is there anything known about the analogous problem for linear subspaces of F_q^n. (Even when q=2.) We give every element of an n-dimensional space over a field with q elements a sign; is there a linear subspace with |sum of signs| >M.

And about the analogous problem for combinatorial lines containing 0: I.e., is for every M there is d and n so that for every signs assined to [d]^n there is a combinatorial line containing the all 0 vector so that |sum of signs over the line|>M.

(The second problem is stronger than the original one but maybe it is obviously false or knwon to be false.)

December 18, 2009 at 11:27 pm

“It is an appealing problem. Is there anything known about the analogous problem for linear subspaces of F_q^n. (Even when q=2.) We give every element of an n-dimensional space over a field with q elements a sign; is there a linear subspace with |sum of signs| >M.”

Wouldn’t that be a two coloring of the vector space? Then I think there is a Ramsey theorem that gives a monochromatic space for large enough n whose sum is the number of points in the space and that would be bigger than M.

“And about the analogous problem for combinatorial lines containing 0: I.e., is for every M there is d and n so that for every signs assigned to [d]^n there is a combinatorial line containing the all 0 vector so that |sum of signs over the line|>M.”

We could assign each point that is on a combinatorial line containing zero the following coloring if it is all zeros the color corresponding to one, if it has a nonzero coordinate its coordinates must be either zero or one particular value giving the the point the sign corresponding to that value plus one we get a correspondence from this problem to the problem on the integers for d+1. So if there is a working solution for all of the integers there will be a working solution for each d. If there is not a working solution for the integers and it breaks down at n we can assign n+1 to d. So I think this problem is equivalent to the problem on the integers.

December 19, 2009 at 12:54 am

I’m not sure I agree that the problem for combinatorial lines is equivalent. Also, Gil was talking about discrepancy for subspaces that contain 0. But I think you may be right that the Ramsey result (the Graham-Rothschild theorem?) is relevant here. For instance, in if we find a monochromatic affine subspace , then if it contains 0 we are done. If it does not, then we can add one dimension and obtain a subspace that can be partitioned into and a subspace that contains 0 and has the same cardinality as . Then the discrepancy on at least one of and must be large.

It feels to me as though a more complicated variant of this argument should work for as well, but I haven’t checked.

December 19, 2009 at 1:18 am

I think it is the Graham-Leeb-Rothschild theorem. Here is the corrected version of the paper in which it is proved: http://www.math.ucsd.edu/~sbutler/ron/87_10_categories.pdf.

December 19, 2009 at 7:01 pm

If we replace an arithmetic progression “d, 2d, 3d, …, nd” by a two dimensional array {id+je: where i=1,2,…,n and j=1,2,…,n} is the questionit known? (easy? trivial?)

December 19, 2009 at 7:03 pm

(sorry, it is trivial if we set e=1.)

December 19, 2009 at 7:34 pm

I don’t know how relevant this observation is, but if one took a counterexample to the original problem and defined a function on by , then the sum over any 2-dimensional array of the form would be bounded by the square of the bound that one had obtained in the 1D case. However, it’s not clear that this would enable one to find a counterexample for your question. (I haven’t yet worked out why it is trivial when .)

December 20, 2009 at 7:45 am

Sorry, I was wrong to say it is trivial. If we set e=1 then (for some d) in this 2 dimensional AP we have by Szemeredi theorem a whole AP with the same sign. “Clearly” this should suffice I thought. (Similarly to Tims comment above about F_2.) But this is not clear at all. Anyway, the problem for high dimensional AP might be easier than the 1 dimensional problem.

January 7, 2010 at 6:36 pm

the vectors in a combinatorial lines containing the zero vector have the 0 vector and some coordinates replaced by ’1′s ’2′s …’r’s. So if we give all 0-k vectors the sign (-1)^k the discrepency will be bounded by 1. So we need something more than combinatorial lines that will still somehow imply the conjecture for integers.

December 18, 2009 at 1:22 am |

I hope it’s okay for me to respond to your frame story, rather than just the specific Math question posed.

I very much admire your writing: “I have found the discussion about modelling mathematically the origin of life extremely interesting: I think we are still some way from formulating a problem that is clearly good, so I would still like to wait a few months at least before embarking on a project on that general theme. (However, it might well be that formulating a good problem would be a necessary part of any such project — in that respect it would differ in an interesting way from projects where the aim is to prove a clearly stated conjecture.)”

In a selfish way, I’d have wanted a polymath on that to start soon. But I felt compelled to explain how very hard the problem is even to state correctly. I suspect that this comes from Math/Computer Science people having accepted at best “toy world” models of Biology.

I might add (I’m not sure that this will come out correctly as a drawing through your web form) that the following diagram does NOT commute:

Cell (Monk’s) ———————> Cell (Organism)

| |

| |

V V

Cellular Automata —————-> Biology in 2-D + time

Upper Left to Upper right arrow: The word “cell” for the basic microstructure of life evolved via Robert Hooke coining the term “cell”, … as he described the microscopic structure of cork like a tiny, bare room or monk’s cell.

John von Neumann and Stan Ulam create the lower left corner of the diagram, to (academically improperly) cite wikipedia: “Stanisław Ulam, while working at the Los Alamos National Laboratory in the 1940s, studied the growth of crystals, using a simple lattice network as his model. At the same time, John von Neumann, Ulam’s colleague at Los Alamos, was working on the problem of self-replicating systems. Von Neumann’s initial design was founded upon the notion of one robot building another robot. This design is known as the kinematic model. As he developed this design, von Neumann came to realize the great difficulty of building a self-replicating robot, and of the great cost in providing the robot with a ‘sea of parts’ from which to build its replicant. Ulam suggested that von Neumann develop his design around a mathematical abstraction, such as the one Ulam used to study crystal growth. Thus was born the first system of cellular automata. Like Ulam’s lattice network, von Neumann’s cellular automata are two-dimensional, with his self-replicator implemented algorithmically. The result was a universal copier and constructor working within a CA with a small neighborhood (only those cells that touch are neighbors; for von Neumann’s cellular automata, only orthogonal cells), and with 29 states per cell. Von Neumann gave an existence proof that a particular pattern would make endless copies of itself within the given cellular universe. This design is known as the tessellation model, and is called a von Neumann universal constructor. In the 1970s a two-state, two-dimensional cellular automaton named Game of Life became very widely known, particularly among the early computing community. Invented by John Conway, and popularized by Martin Gardner in a Scientific American article…”

Unreached lower right: 4-D Biology. However, the popularity and abstract power of these conceal how un-biological they are. Ulam explained that his goal was to shrink the delta t and delta x, delta y discrete model with difference equations, to a continuous space-continuous time system of differential equations, which von Neumann felt that he alone was capable of solving.

Naive CS folks ever since have hoped that the 2-D cellular automata (2-D lartice structure + simple rules + time) would give dynamics that yield insight into 3-D cells and organisms + time. But on this planet, cells are intrinsically 4-D and smarter than little boxes with a small number of rules.

Latest support for this:

Cells Move in Mysterious Ways, Experiments Reveal

http://www.sciencedaily.com/releases/2009/12/091216151151.htm

ScienceDaily (Dec. 17, 2009) — Scientists at Brown University and the California Institute of Technology have for the first time tracked how cells move in three dimensions by measuring the force exerted by them on their surroundings. The scientists’ experiments revealed that cells move in a push-pull fashion, such as probing at depth, redistributing weight at various points and coiling and elongating….

What they found was cells move in intriguing ways. In one experiment, a cell is clearly shown operating in three dimensions by extending feelers into the gel, probing at depth, as if thrusting a leg downward in a pool. The Brown and Caltech scientists also found that as a cell moves, it engages in a host of push-pull actions: It redistributes its weight, it coils and elongates its body, and it varies the force with which it “grips,” or adheres, to a surface. Combined, the actions help the cell generate momentum and create a “rolling motion,” as Franck described it, that is more like walking than shuffling, as many scientists had previously characterized the movement.

“The motion itself is in three dimensions,” Franck said.

Franck’s lab plans to use the new force-measurement technique to examine how the movement of normal healthy cells differs from mutant or malignant ones. “That promises to give us greater insight in how cells’ behavior changes when they become diseased,” Franck said.

David Tirrell and Guruswami Ravichandran, scientists at Caltech, contributed to the research. The National Science Foundation funded the work.

December 18, 2009 at 1:23 am |

Cell (Monk’s) ———————> Cell (Organism)

|……………………………………. |

|……………………………………. |

V…………………………………… V

Cellular Automata —————-> Biology in 2-D + time

December 18, 2009 at 3:44 am |

Here is a reference a remember for this problem, where a particular case is solved.

Mathias, A. R. D. On a conjecture of Erdős and Čudakov.

“Combinatorics, geometry and probability” (Cambridge, 1993)

December 18, 2009 at 9:07 am

Would it be possible for you to tell us what the main result of that paper is?

Ah, I’ve found it now. He proves that if all the sums are at most 1, then they cannot be bounded below. The argument is very short and can be found online here.

January 3, 2010 at 7:40 pm

The maximal sequence for C=1 has length 11, and Mathias builds his counterexample on a contradiction with a sequence of length 12.

It looks like one of the building blocks for any proof that a sequence with discrepancy bounded above by 2 then it cannot be bounded below would have to use a maximal sequence for C=2 – this would get the total for some subsequence down to -3. There aren’t the same structural features to work with, but other features may emerge which could be used in a proof that this implies arbitrarily small totals.

December 18, 2009 at 2:40 pm |

If it’s still an open question whether suffices, that would surely be a good place to start.

For what it’s worth, this sequence achieves :

(Note that the index starts at zero.)

That’s the longest sequence my simple program has found. But I bet there’s scope for more computational trickery.

Of course, even being able to find arbitrarily long finite sequences that achieve would not (as far as I can see) guarantee that any infinite sequence does.

December 18, 2009 at 3:48 pm

I think it does guarantee it with the usual diagonal/compactness argument: you just take a sequence of finite sequences with lengths tending to infinity and pass to a subsequence that converges pointwise. Unless I’m missing something, that will give you an infinite sequence with a C that’s at least as small as the largest that you get for the finite sequences.

December 18, 2009 at 3:53 pm

Ah, yes indeed. Thanks.

December 18, 2009 at 6:14 pm

By the way, I meant to say that your example was interesting. It made me wonder whether it was better than Walters’s example (bearing in mind that with Walters’s example there is some flexibility: for any power of 3 you can reverse all multiples of that power, and you can compose operations of that type). I haven’t had time to check.

December 18, 2009 at 7:20 pm |

[...] polymath project By kristalcantwell Here is a link to a post on a possible polymath project. It is Erdős’s discrepancy [...]

December 18, 2009 at 10:28 pm |

(Actually you can append a or to the end of that sequence and it still works. I haven’t found any going beyond though, and I suspect that may be the limit.)

It occurred to me that the problem could be framed in terms of Banach spaces. Because the points with coordinates are the extreme points of the unit ball, the question is whether the map

defined by

is bounded.

December 19, 2009 at 8:12 am

Oops, I got myself confused and that comment was completely wrong!

December 19, 2009 at 1:22 pm |

Let me quickly try to optimize the Walters example by choosing more carefully the values at powers of 3. To start with I’ll go for . If we insist that the function is multiplicative and depends on the last non-zero digit, and that , then that determines everything. It starts

1+, 2-, 3-, 4+, 5-, 6+, 7+, 8-, 9+, 10+, 11-, 12-, 13+, 14-, 15+, 16+, 17-, 18-, 19+, 20-, 21-, 22+, 23-, 24+, 25+, 26-, 27-. The partial sums so far are bounded between -1 and 2 and the last one is equal to -1. If we therefore repeat the whole sequence then the partial sums between 28 and 53 will be between -2 and 1 with the last one equal to -1. We then take the value of 54 to be +, so we get a partial sum of 0. We then repeat the sequence yet again, but at 81 we have a new power of 3, the value at which we will set to be +. So the partial sums between 55 and 80 will be the same as those between 1 and 26, and the partial sum at 81 will be 1 instead of -1. We should soon get into trouble, since now we are starting all over again, but from 1 instead of 0. Indeed, since the partial sum at 10 is 2, the partial sum at 91 will be 3. But I think we’ve managed to get all the way up to 90 with all partial sums between -2 and 2, and since the sequence is multiplicative, that means that the sums along all APs of the special form are between -2 and 2 as well.

One might be able to go slightly further by artificially changing the value at 91. In fact, since the partial sums at 7 and 13 are both 1, and therefore not critical, this is definitely possible. Maybe a computer search could take over from here: start with the example that gets you all the way up to 90 and then search for ways of tinkering with it to extend it a bit further. But I suspect it won’t be possible to go all that much further. The fact that one can go as far as this, however, is a good indication of why the problem could be quite hard.

December 19, 2009 at 1:37 pm |

Actually, it occurs to me that a better way of tinkering would be to choose the value at 81 to be – instead of +. That way, the next 27 partial sums would all be between -2 and 1 (because the first 27 were between -1 and 2, but now the partial sum at 81 is -1). If we do this, then the partial sum at 81+27=108 will be -2, but we could perhaps try to fix that by choosing the value at 108 to be + instead of -. At this point, we are departing from multiplicativity enough to make various checks necessary, which I can’t face doing right now. Actually, I should also check that taking the value at 81 to be – doesn’t mess things up too much. If the partial sum up to 27 was -1, then the sum over all multiples of 3 up to 81 used to be 1 and we’ve now changed it to -1. But that will cause some problems because we rather needed the partial sum at 27 to be -1, so from the point of view of multiples of 3 we rather needed the partial sum at 81 to be 3. The problem turns out to arise when we get to 111.

I think, but haven’t quite checked hard enough to be 100% certain, that this gets us up to 110 though.

December 19, 2009 at 1:39 pm |

All this suggests that a rather nice preliminary goal for a project of this kind would be to prove that the sums along special APs cannot all lie between -2 and 2.

December 20, 2009 at 4:43 am |

Of the polymath projects you have listed that you want to try, I think this one and the one on the origins of life are probably the best from the point of view of getting loads of participants (alas, I will not be among them, as I just have too many things to do — and, I would like to think about the Polymath4 project some more). However, this one has the advantage over the origins of life in that it begins with a well-defined mathematical problem.

One more thing: I wanted to point out that your Theorem above is reminiscent of some beautiful identities of Peter Borwein and Stephen Choi:

http://www.cecm.sfu.ca/personal/pborwein/PAPERS/P204.pdf

Their identity is as follows: Let , where is the number of distinct prime factors of that are mod , counting multiplicity (Tim, this is your function , I believe). Then, is, of course, completely multiplicative. And what Borwein and Choi prove is the astonishing fact that

,

where is the number of ‘s in the base-3 expansion of . I say `astonishing’ here because one wouldn’t expect there to be such a strange relationship like that between multiplicative functions and partial sums.

December 20, 2009 at 3:05 pm |

This set me wondering about how far a completely multiplicative -valued sequence defined on the positive integers could continue with its partial sums lying between . Such a sequence is determined by its values at primes. This choice gets you as far as :

December 20, 2009 at 4:50 pm

That’s quite something. If we take that sequence and use it to define a function , then we can define a function on all the integers by taking to be evaluated at the last non-zero digit when is written in base 211. This will again be a completely multiplicative function and if the sum of your function up to 210 is zero, then the same argument that proves that the growth rate in Walters’s example is at most is enough to give a bound here of , which is smaller than by a factor of approximately 2.4. (If the sum up to 210 is not 0, then one can try other primes such as 197 or 199.) Again, this illustrates why the problem is hard: if there is no sequence with bounded discrepancy, it is difficult to use small examples to give a hint about how to prove this. Also, there doesn’t seem to be any chance of finding the best possible sequence.

December 21, 2009 at 11:01 am

By the way, I’ve checked that the furthest you can possibly get with a completely multiplicative sequence having partial sums between is . There are exactly five hundred sequences that get that far, but they all trip over at !

December 21, 2009 at 11:31 am

For what it’s worth, if you let be 1 if the last non-zero digit of is 1 or 4 base 5 and -1 if it is 2 or 3, then you can improve the growth function from to . But that’s not as good as .

December 21, 2009 at 1:43 pm

Of the aforementioned sequences, it turns out that have . So with your construction above, these would give a bound of .

December 21, 2009 at 1:49 pm

Sorry, that should be .

December 22, 2009 at 10:56 pm |

Brute-force search reveals 359 as the length of the longest sequence having partial sums between +/- 2. Here is an example:

+ + – + – - + – + + – + – - + + – - + – + + – + – - – - + – + +

- + + – + – + – - + – + + + – - – + + – + + – + – - + + – - – +

+ – + – + – + – - + – + + + – - + – - – + + – + – + + – + – - -

+ + + + – + – + – - + + – - – - + – + + – + – + + – + – - + + -

- + + – + – - – + + – + + + – + – - – + + – + – - – + – + + + +

- – - – + + – + + – + – - + + + – + – - + – + – - + + + – + – -

+ – - + – + – - + – + + + – - + – - + + – - + + – + + – - – + -

+ + + – - + + + – - + – - + + + – - + – - + + – - + – + + – - +

+ – - + – + + – - + + + – - + – + – + + – - – - + + – - + + – +

+ – - + + – - + – + – - + + + – + – - + + + – - + – + + – + – -

+ – + – - + – + – + + – + – - + + – - + – + + + – - + – - + + +

- – + – - + +

And 11 is the length of the longest sequence having partial sums between +/- 1. So…any sequence of length (2n)!/2 has a partial sum of absolute value >= n, if n is 1, 2, or 3. I sense a pattern!

December 22, 2009 at 11:35 pm

That’s interesting. It leaves me curious to know how far it is possible to go with a brute-force search. By brute force you obviously don’t mean just checking all sequences separately. I presume the point is that most sequences are ruled out fairly early on so you gain a lot by doing a depth-first search.

Anyhow, the question I now find myself wanting an answer to is whether there is a way of mathematically constructing an example that gives 359, or more generally (2n)!/2. The fact that there are several examples (if I read between the lines correctly) suggests that a linear-algebraic approach might be worth thinking about.

December 23, 2009 at 12:10 am

>It leaves me curious to know how far it is possible to go with a brute-force search.

Well, the next one up is 8!/2 = 2160, which is out of reach to my current program, by several orders of magnitude.

>The fact that there are several examples…

Yes, if only because the value of x_p for any prime p more than half-way through the sequence has no effect on any of the partial sums.

December 23, 2009 at 3:00 am

You meant to say that 8!/2=20160.

December 23, 2009 at 8:40 am

Looking again at your example, it seems to be + at 2,4,6,8, which would give a sum of 4. Also, the partial sums of your sequence seem to reach 5 pretty quickly. And going further I seem to be able to get up to 20, so in general the sequence is fairly biased towards +. Am I misunderstanding something?

December 23, 2009 at 9:19 am

Ah, it was WordPress’s fault. I’ve put spaces between your pluses and minuses and now they’re all showing up …

December 23, 2009 at 9:46 am

Going back to the question of how your brute-force algorithm worked, I note that not only can the sequence take any value at a prime p such that 2p is greater than 359, but it can also do so at any 2p such that 3p is greater than 359 or at any 3p such that 4p is greater than 359 and x_p+x_2p=0, or at any 4p such that the sequence hasn’t already messed up and 5p is greater than 359.

The number of such places is fairly large, and makes me wonder whether two raised to that power is larger than the number of operations a computer can reasonably be asked to perform. On the other hand, one can tell the computer in advance not to bother looking at all those possibilities and just to set them all to +. Did you do something like that?

Actually, on further reflection I realize that it’s not true that the value at a prime greater than 180 can be set arbitrarily, since we care about the partial sums of the entire sequence. So if these partial sums hit both the values 2 and -2 fairly late on, then the values at all earlier primes cannot be changed (or at least cannot be changed without fairly serious consequences later). So most of this comment is nonsense.

December 23, 2009 at 9:57 am

If the pattern is that simple, would an inductive proof be out of the question? It would be enough to show that, given a sequence of length whose sums are bounded by on all its special APs, there must be a subsequence of length whose sums are bounded by on all its special APs…

December 23, 2009 at 10:04 am

(Just to clarify: the ‘special APs’ of the subsequence are defined relative to its own numbering. If the subsequence happened also to be a special AP of the original sequence, then one could use the same numbering, but that’s perhaps too much to ask.)

December 23, 2009 at 11:10 am

The problem with that approach, I think, is that pretty well any sequence of length N will be a subsequence of the sequence of length N(2C+1)(2C+2), since the latter sequence will be pretty densely packed with 1s and -1s. So I think you need to put some conditions on the subsequence, which in any case seems a good idea because one would want somehow to use the property that the main sequence had.

December 23, 2009 at 11:47 am

Yes, I didn’t want to be too specific, but was thinking, for example, of chopping the long sequence into contiguous blocks, or considering subsequences consisting of the first multiples of some . (But I haven’t checked whether either of these has a chance.)

December 23, 2009 at 1:18 pm

Yes, something like that would make more sense. It might be worth trawling through TonyG’s example to see what its restrictions to APs look like. If we could find a way of generating sequences of length 359 that did not rely on brute force, it might give useful clues.

December 23, 2009 at 11:05 am |

I’m going to start a new comment, though in a way it is still a reply to the previous one. I’m in two minds about the conjecture that there is an exact bound of (2n)!/2 for the point where you are first forced to get a sum of modulus n. On the one hand, I don’t feel I can completely dismiss a sequence that starts 1, 12, 360. But on the other hand, the fact that the problem has a significantly number-theoretic character, by which I mean that the freedom you have to assign a value to an integer m depends a lot on what the factors of m are, makes it seem unlikely that a formula such as (2n)!/2 would be correct.

At the moment, I incline towards the view that the sequence 1,12, 360 is probably a coincidence (though perhaps it might be worth looking in Sloane’s database to see whether there are any more number-theoretic sequences that start like that). However, it still seems worth investigating. If it did turn out to be correct, then what could a proof conceivably look like? Is there, for instance, a clever way of associating with each term in a sequence of pluses and minuses a permutation of {1,2,…,2n} in such a way that if the sums over all APs lie between -n and n, then none of these permutations is equal to any other or equal to the reverse of any other (by which I mean what you get when you write the other backwards)? If one could do that with the additional property that none of the permutations in question was the identity or the reverse of the identity then one would be done. But I don’t hold out much hope for this approach: how will the sum of the sequence over all multiples of 17 have anything to do with whether two of the permutations are equivalent?

December 23, 2009 at 4:10 pm

Oh dear. After generating 169,137 sequences of length 359, my latest program found a sequence of length 360:

+ + – + – - + – + + – + – - + + – - + – + + – + – - – - + -

+ + – + + – + – + – - + – + + + – - – + + – + + – + – - – +

+ – - + + – + – + – + – - + – + + + – - + – - – + + – + – +

+ – + – - – + + + + – + – + – - + + – - – - + – + + – + – +

+ – + – - + + – - + + – + – - – + + – + + – - + – + – + + -

+ – - – + + + + – + – - + – + + – + – - + – - – + + + + – -

+ – + – - + + + – + – - + – - + + + – - – - + + + + – - – -

+ + + – + + – + – - + – - + + + – - – + + – + – - – + + + +

- – + – - + + + – + – - + – - – + + – + + – + – - + – + + -

- + + – + + – - – - + + + – + + – + – + – + – - + + – - – -

+ + + – - – + + + – + – - + + + – + – + – - + – - – + + – +

+ – + – + – + + – - – + + + – - – + – + + – + – + – + + – -

There goes my credibility. Sorry if I wasted anybody’s time.

December 23, 2009 at 4:38 pm

I’d also be interested to know how your algorithm works. If you simply keep track of all sequences that work up to , and let the list grow and shrink as you try appending successive values, the size quickly gets beyond the capacity of my computer (at there are sequences). But if you have a specific target, especially one with lots of factors, perhaps you can order the differently. Suppose you’re interested in sequences on ; you can start by considering all possible values of the where — which enter into the most APs — and then move on to the where , then those where , and finally the rest. But from your last comment (the fact that you went via ) it sounds like this isn’t what you’re doing.

December 23, 2009 at 5:00 pm

On second thoughts, that woudn’t help much: you’d probably still have millions of sequences (however many there are when you get to ) to consider at the last stage.

December 23, 2009 at 5:06 pm

The program is very simple — 79 lines of C code. It does nothing clever; it simply maintains, for each r and each d | r, the sum x_d + x_2d + … x_r. This is enough to make the depth-first search fast. It took several hours to find the sequence of length 360. There may be even longer sequences; I will work on this. I can post the current code here if anybody is interested.

The previous version had a faulty termination condition, which is why I thought 359 was the limit.

December 24, 2009 at 8:06 am

I see, thanks. So I presume the sequences of length were all dead-ends that it backtracked from before eventually finding the sequence? That’s quite a cull!

December 23, 2009 at 10:59 pm |

I reckon that adding -+++-+-+—++- to the end of Alec’s list gets us up to 374.

The method I’m using is a semi-manual one – I did brute force search up to 120 manually before inputting and checking Alec’s details.

Basically it is an Excel spreadsheet in which the dth row of the main table keeps track of the sums of the progression of difference d. I’m able to check the next digit =1 in the nth column, and if this doesn’t work, use -1, and if this doesn’t work backtrack. The table helps by keeping track of which progressions are critical at any stage. With a bit of intelligent design (or a better programming capacity) it would be possible to improve the search to flag entries which are forced, and conflicts in forced entries, which would avoid some time-consuming backtracking.

and 361ff = -+-++-+—++-+–+++–++ gets to 384

Of course this method isn’t particularly scalable

December 24, 2009 at 12:19 am

One thing I don’t understand here — if TonyG (I presume you actually meant him rather than Alec) could get to 360 and what you say is correct, and he is using a depth-first search, then why didn’t he come up with these extensions? I don’t know whether it’s you or Tony who should answer this question.

December 24, 2009 at 8:18 am

Looks like WordPress is up to its tricks again. I assume that it converts double minus signs to en dashes and triple minuses to em dashes.

So, inserting spaces, your first sequence should be:

- + + + – + – + – - – + + -

But I can’t work out what your second sequence should be.

December 24, 2009 at 9:36 am

>…why didn’t he come up with these extensions?

I still thought that 359 was the limit, so I had the program print a message and halt if it got to 360.

December 24, 2009 at 9:11 am |

Yes I did mean extending Tony’s sequence, not Alec’s.

Sorry about not understanding what wordpress does. I think the following (which has spaces) does up to 389 – it differs from Tony’s from position 355.

+ + – + – - + – + + – + – - + + – - + – + + – + – - – - + – + + – + + – + – + – - + – + + + – - – + + – + + – + – - – + + – - + + – + – + – + – - + – + + + – - + – - – + + – + – + + – + – - – + + + + – + – + – - + + – - – - + – + + – + – + + – + – - + + – - + + – + – - – + + – + + – - + – + – + + – + – - – + + + + – + – - + – + + – + – - + – - – + + + + – - + – + – - + + + – + – - + – - + + + – - – - + + + + – - – - + + + – + + – + – - + – - + + + – - – + + – + – - – + + + + – - + – - + + + – + – - + – - – + + – + + – + – - + – + + – - + + – + + – - – - + + + – + + – + – + – + – - + + – - – - + + + – - – + + + – + – - + + + – + – + – - + – - – + + – + + – + – + – + + – - – + + + – - – + – + + – + – - + + – + – - + – - + + – + – - + + – + + – + + – + – - + + – - – - +

December 24, 2009 at 9:16 am

Sorry, I’m also messing up the threading by replying to the wrong thing.

Obviously, using my crude search I’m not trying to generate every sequence, so I’m not checking hundreds of thousands, just trying to find a sequence which works.

One disadvantage of this is that I’m not searching for sum=0 in useful places.

December 24, 2009 at 10:35 am

I’m just running a C program now that will go as far as 1024 if necessary. It’s reached your 389 and is thinking hard. The fact that multiples of 30 seem to be big sticking points for the search may be a useful clue when it comes to proving bounds by methods other than brute force.

December 24, 2009 at 11:29 am |

The following small question occurred to me as possibly a good candidate for the first thing to attempt to prove theoretically, because there is a chance that it is easier than the problem itself. In Adrian Mathias’s paper, he proves that if the sums have an upper bound of 1, then they cannot be bounded below. Might it be possible to prove that if the sums cannot be bounded between -2 and 2, then if they are bounded above by 2, then they cannot be bounded below? That is, I’m asking for a conditional result: assuming that the sums must eventually get above 2 in absolute value, prove the one-sided result that if they are bounded above by 2 then they must get arbitrarily large on the negative side. (Even proving that they must get down to -4 would be interesting, of course.)

In response to Alec’s 10.35 remark, I’d be interested to know whether the sticking points tend to be arithmetic progressions with small common difference (up to 6), as this multiples-of-30 phenomenon might suggest. If so, one might be able to speed up the search by trying to take care of these small-difference APs first. Of course, one would need to find a “space” of examples that work for the small-difference APs, so that it was then possible to make further adjustments to deal with larger ones.

December 24, 2009 at 12:05 pm

This oddly was the same thought I had.

For any sequence of length n there are 2^n potential solutions. However, 1/2 of the solutions are identical to the other half with the exception of a sign change (eg there are ( 2^n/2 solutions and 2^n/2 conjugate solutions). I would agree that Mathias didn’t check his conjugate solution, because it still appears to meet his other constraints (since he only specified an upper bound).

Now the only question is what does the conjugate solution represent?

I would think that any conjugate solution to a preferred solution represents a mapping to the negative integers. So the total set of 2^n solutions appears to be mapableto the full set of integers and not just the positive integers (since any solution I map to the positive integers automatically has a conjugate that can be mapped to the negative integers).

It appears though that any test solution that fails to meet an upper bound will automatically have a conjugate that meets the upper bound, but fails to meet any equivalent magnitude negative lower bound. I have playing with the idea of what it might mean to define a superposition of solutions and conjugate solutions, but I am not sure if that would be useful in solving the conjecture.

December 24, 2009 at 1:11 pm

On sticking points, I’d make the following observations about a depth first style or greedy search (just see how fay you can go).

There is never a problem choosing a value at any prime p, because this is only in the progressions with difference 1 or p, and the p progression doesn’t have enough elements yet to cause a problem. The primes can be used (with some constraints) to sort out the values in difference 1.

Similarly 2p only has a problem in difference 1 or 2 (not p) and this is generally easy to fix. However the choice of p and 2p can force a value at 3p [and the same is true for any 'n' of course] – so it is possible to get a fair few forced values in the progression of difference 3.

4p causes a problem only at difference 1 or 2 or 4 – but as time goes on, other values in difference 4 get forced, and it is these, in part, which seem to be causing problems making progress beyond 389, combined with values at 330/340/350/360 in the difference 10 progression.

Some other specific issues:

With the values we have so far 390 is forced to be -1 by values at 130/260.

As an example of a blockage with higher primes 374 is forced to be -1 in the 77 progression, and this has caused problems because 341 and 352 are easily forced to be -1 too, and this can leave problems with the 11 progression if 363 is not +1

These blockages with higher primes require more backtracking to fix, and it is likely to be efficient programming to identify constraints in the progressions with larger differences, and cure the smaller ones by search. It looks as though the small differences are causing the problem, but in fact it seems to me that it is the constraints imposed by the larger differences which are more significant to a search.

As primes become sparser, the proportion of easy values decreases, and there is less flexibility to fix any problems.

December 24, 2009 at 12:56 pm |

Here’s an idea for a completely different way of searching for a solution. The “conventional” way would be to extend a given solution number by number and do a depth-first search. One can think of that as doing a search with respect to the standard basis of functions (from {1,2,…,n} to {-1,1}). But a much more natural “basis” for this problem consists of characteristic functions of arithmetic progressions. That’s partly because these are closely related to the statement of the problem, and partly because if you pick out a subprogression of {1,2,…,n} and change all the entries in that subprogression, you don’t have too dramatic an effect on any of the sums, unless the sequence you had was a very bad one to start with, in which case you may well improve it substantially.

How could one use this to search for good sequences? The idea would be to start with the all-ones sequence, and then add characteristic functions of pretty long APs (by which I mean add them mod 2). There could be a scoring system for how good/bad a given sequence is, and one could look out for APs that improve the sequence as much as possible. For instance, one might start by adding the AP that consists of all odd numbers, since then one would get the sequence – + – + – + … (maybe I should say that I multiply pointwise by the function that is -1 on the AP and 1 everywhere else, but that is equivalent to addition mod 2) which is now beautifully behaved on all APs with odd common difference.

If one only ever used APs of the form {p,2p,3p,…} then one would always end up with a multiplicative sequence. So the idea would be to allow just a bit more flexibility than that. Perhaps by doing something like this one could end up with a sequence that did pretty well, and one could then go back to a more conventional depth-first search for ways to adjust the sequence to keep it within bounds everywhere. Most of the time one would fail, but the hope would be that this procedure would give one a good enough probability of success for examples to appear from time to time.

One thing I want to think about is this. If one just randomly adds together a few fairly long APs, will the resulting growth rate be logarithmic? One would have to think a bit about what the precise probability distribution should be, but I think there is the potential here to give a large source of examples that achieve logarithmic growth. That feels to me to be important as a way of coming to grips with the problem — one wants to “understand the enemy”, so to speak. This, incidentally, is another reason that I think looking at the standard basis is not the right thing to do, since a purely random sequence will give you a growth rate of around even if you just look at the partial sums .

Having said all that, the numbers we now have suggest to me that the true growth rate is probably slower than logarithmic. So another interesting initial target would be to prove precisely that — in other words, to attempt to construct an example of a sequence where the AP-sums are bounded between -n and n and the length of the sequence is something like rather than .

December 24, 2009 at 3:43 pm

That is a very interesting idea.

One idea it gave me straightaway was to initialize the search with a multiplicative sequence of length 246 (instead of with the singleton 1). When I tried this and ran the program, it very quickly reached 599. (The old program was still running and still stuck at 389.)

December 24, 2009 at 5:34 pm

That’s pretty extraordinary — I would have expected the longest sequence with C=2 to be much shorter than that.

I think it would be very interesting to take some of these examples and see whether they can be expressed as a mod-2 sum of not too many characteristic functions of APs. Or perhaps they can be mostly expressed like that, but with a few adjustments. However, this may not be an easy computational problem, since the somewhat similar problem of finding a point in a given subspace of F_2^n that is close to a point that’s given to you is in general hard, if I remember correctly. It’s also an important problem in coding theory, for obvious reasons.

December 24, 2009 at 10:56 pm

The idea of searching over an alternative -basis suggests a transformation of the original problem, in terms of the coordinates of the original sequence in the new basis. If we take the set of APs of the form as our basis — as seems rather natural — then the question is whether there is a -valued sequence and a finite such that for all ,

.

This doesn’t look particularly friendly, however, as in general not many of the terms factor out of the sum.

For the record, here is the longest sequence I’ve found so far, of length 669. (This is from the search that started with the long multiplicative sequence. It’s been stuck here a long time.)

+ – - + – + – - + + + – - + + + – - + – + – - + + + – - + – + – - + + + – - + + – - – + – + – - + – + – + + – + – - + + + – - + + + – - + – - – + + – + – - + – + + – + + + – - – + + – - + – + – - + + – - + + – - + – + + + – - + + + – - + – + – + + – + – - + – + – - + + + – - – + + + – + – - – - + + + – - + – + + – - + + – - – + + – - + – + – + + – + – + + – + – - + + + – - + + – - – + – + – - + – + + – + + – - – + + – + + – + + – - – - + – + + + + – - – - + – + + + + – - – + + – - + – - + – + + + – - + – + – - + + + – - + – + + – + – + – - + + + – - + – + + – - – + + – + + – + – + – + – - – - + – + + + + – - – + + – - + + + – - + – + – - + – + + + – - + – - + + + – - – + + – - + + – + – - + + – + + – - – - + + + – - + – + + – - – + – + + + + – - – - + + – + + – - + – + + – - + – + + – + + – - + – - + – - + – + + – + + – + – + – + + – + – + – - + + – - + – + + – - – - + + – + + – - + + – + – - + – + + – + + – + – - – + + – + – - + – - + – + + – + + – - + + + – - + + – - + – - + + + – - – + – - + – + + – - + – + + – + + – - + + + – - + + – - + – + + – - + – + + – + – - – + – - + + – + – + + – - – + + – - + + + – + – - – + + – + + + – - + – + + – - – + – + + + – - + + – - + – - + + – + – + – - + + – - + – + + – + + – - + – - + – + + – - + – + + – + + – - + – - + + + – - – + – + + – + + – - + – + + – + – - – + – - + – +

December 26, 2009 at 10:48 am

I think the reason that this calculation is hanging is that you quite likely have to backtrack before 536 to have a chance of getting to 670.

December 28, 2009 at 8:51 am

The problem at is that while . After reaching the program quickly backtracks to well below . After I left it running overnight it had got back to . It seems that the multiplicativity of the initial sequence up to gives such a lot of flexibility that it is difficult to rule out numbers in the upper part of that range.

I’ve tried starting with a few different multiplicative sequences of length , and they have all got stuck at . At that point they have all had identical sequences and . (I’m assuming that .) So it did occur to me that the values on these two APs may be forced, and hence that is the limit — but the evidence is pretty scanty. It’s interesting to note that the multiplicative sequences of length all agree at primes up to and including .

December 29, 2009 at 1:20 am

A way to speed up brute force searched (perhaps you’re already implementing this) is to first optimize the order at which you fix entries. For example, there is little point in selecting $x_p$ for a prime p, before 2p is also reached.

The optimal order may well depend on the length of the sequence one is aiming for. A good order will be one where the sums along many arithmetic progressions are determined as soon as possible.

December 29, 2009 at 9:06 am

Yes, I had wondered about trying to optimize the order in which the values are tried when searching. It isn’t quite so simple, though, because the value at does determine a lot of AP sums (those of difference 1 passing through ). I think if you want the maximum number of AP sums to be determined with numbers, the optimum choice does turn out to be . (This determines about , or about , APs.)

On the other hand, it may be worth changing the order if one is particularly interested in certain APs. For example, I’m interested in whether the values at and are forced for a sequence of length with . So maybe I should be setting those values first in order to eliminate impossible sequences as early as possible.

On that topic, I’m currently running a version of the program that checks those values whenever it finds a sequence of length . By the time it had tracked back to it had found such sequences, all of which agreed at those points. This lends some support to the idea that a proof could focus on them.

December 29, 2009 at 9:09 pm

I have just belatedly noticed a symmetry that could save a great deal of wasted search time. If you have a sequence that works (with ) up to , and you have two primes both greater than (so that the only relevant APs that they enter into are those with difference one), and if and have opposite sign, and the partial sums of the sequence from to are all the same sign as or zero, then you can negate both and and the sequence will still work.

That may sound like a lot of conditions, but I'm pretty sure this observation would reduce the time needed for a complete search by many factors of two. I wonder if there are other symmetries of this nature to be found.

December 31, 2009 at 10:18 am

There is at least one other symmetry of this kind, related to the first. If are primes between and , and , with all of the partial sums of APs of difference or between and having the sign of or zero, then the sequence obtained by swapping and works up to if and only if the original sequence does.

This is less useful than the first, but does allow some further economies in searching.

December 26, 2009 at 4:14 am |

A couple of probably obvious observations, but ones that haven’t been stated.

We can define an inner product for the string of -1′s and 1′s.

The inner product will be equal to the length of the string n (eg x * x = n).

The arithmetic progressions are analogous to samples of the original string.

If the sample interval is d, then then the inner product of the sample is n/d.

This implies a decreasing upper bound on the sum of the samples as the sample interval d increases.

This implies that there is a set of constants C, n, d that can satisfy the problem, but C isn’t arbitrary.

December 26, 2009 at 11:10 am

Some more trivial observations.

If f(r) has bounded discrepancy for all r , then so does g(r) = f(nr).

The idea of a completely multiplicative function gives g(r)=f(r) or g(r)=-f(r), but within any viable f there will be concealed a family of related functions. It ought to be possible to identify some closure properties for such a family. Is it possible that such a family is infinite?

If there is a viable f, we also have f(2r), f(4r), f(8r) etc

If we have f(r) therefore, we can create a trial function g(2r) = f(r), and then have to fill in the odd values of g (this process could be called expansion).

Parity considerations suggest that the sum along any AP up to 2r+1 will be odd. If C is odd, 3 for example, and the sequence is good up to 2r, it can be completed to 2r+1 (with either of ±1).

There are as many APs with odd d as with even d, so we have half the values to play with and half the APs to control (the even ones are all done by f). But there are constraints at the even positions in the odd APs, which are already fixed.

December 26, 2009 at 4:50 pm |

Another idea occurred to me for a method that might produce a faster search for examples, but the people who’ve actually done some searching will have a better intuition than I have for whether it would actually work.

The idea is based on the thought, which has already been expressed above, that multiples of 30 are particularly likely to be problematic because they have lots of small factors, and hence lots of potentially conflicting constraints. To combat this, one could perhaps define a “block” to be a -sequence of length 30 with the property that all AP sums are between -2 and 2, and furthermore . One could then make a big list of blocks (of which I imagine there are quite a lot), noting various properties they have, such as whether the sum is 1 or -1.

Come to think of it, the multiples of 2 are slightly problematic, so another thing one could do is make a list of all pairs of blocks that give you a sequence that works all the way up to 60. But that’s just a detail. The main idea would be to restrict one’s search to concatenations of blocks. I think it could eliminate a lot of repetition from the search, and therefore be much faster. The drawback is that it isn’t clear that all very long sequences are concatenations of blocks as I’ve defined them above, but it seems to me that if the multiples of 2, 3 and 5 are in good shape at each multiple of 30 then it ought to be quite useful.

There are also a number of variants one could imagine of this basic idea. For instance, having made the list, then instead of doing a completely basic depth-first search, one could do a slightly cleverer one as follows: having chosen blocks , then one would see whether any of the next thirty values were forced, and then look only for blocks that took the right values at the forced places. Or one could make a list of mini-blocks that dealt with the next six places and first make a list of mini-blocks that don’t cause trouble and only then search for blocks that start with one of the mini-blocks that had not been ruled out. In other words, one could do a depth-first search, but because each node in the tree would have many children, one could also use simple tricks to rule out all but a few of the children when the sequence started to get long.

December 26, 2009 at 6:46 pm

More generally, one could for each block make a table of the following information. For every d and every a between 0 and d one would have an interval I(a,d) of all the values taken by the sums . So, for example, if you have chosen the first ten blocks, and if , then you will need a new block such that the sums etc. all lie between -4 and 0.

December 27, 2009 at 1:10 pm |

I’d like to draw attention here to the fact that I’ve added a new section to the original post, with some additional variants of the original problem.

December 28, 2009 at 7:12 pm |

A question that interests me, related to the final problem in the updated post above, is this. If we worry only about those progressions such that has a prime factor greater than , then how big does need to be before one can get very long sequences with ? For example, if has to be a multiple of a prime greater than 10 (meaning that if the discrepancy on an arithmetic progression with some other is large then we don’t care), can we find a sequence of length 500? I imagine this would be a piece of cake to program, especially if you’ve already got a program for looking at . (I’d also be interested to know whether forgetting about a few small makes sequences with so easy to find that you can just go on and on and on without getting stuck.)

December 28, 2009 at 9:08 pm

If we only care about APs with a common difference divisible by a prime greater than , then with the longest I’ve found with a quick search has length :

+ – - + – + – - + + + – - + + + – - + – + – - + + + – - + – + – - + + + – - + + – - – + – + – - + – + – + + – + – - + + + – - + + + – - + – - – + + – + + – + – + + – + + + – - – + – - – + – + – - – + + – + + – - + – - + + – - + + + + – + – + – + + – + – - + – +

With we can do at least ; with , at least . Longer sequences may be possible, but aren’t a doddle to come by.

December 28, 2009 at 9:50 pm

In response to the question about ignoring a few small in the case of , my initial impression is that it doesn’t make a dramatic difference. If we stop caring about , my program sticks at ; and even if we stop caring about anything divisible by or , it sticks at . I haven’t run it for long, though.

December 28, 2009 at 10:54 pm

That lends support to the pretty plausible hypothesis that when gets large it’s the APs with larger common difference that really count.

Another question while I’m still here. Can you get any sense experimentally of what happens if you just look at prime common differences?

Added two seconds later: oops, that’s a silly question. If you just look at odd primes then you can take the sequence , and if you want 2 in there as well then you can take 1,1,-1,-1,1,1,-1,-1,…

December 29, 2009 at 1:30 am

Come to think of it, the above numbers (for ) must be best possible. For if is the smallest prime greater than , then we care about all sub-APs of ; and one of these must fail.

December 29, 2009 at 8:20 am

I meant to say ‘if we stop caring about ’ (not ‘’).

December 28, 2009 at 7:53 pm |

I am unlikely to be able to do much to this blog over the next week, so here’s one final comment. It’s a small observation connected with how one might attempt to prove the result.

One could summarize the observation as saying that one can do quite a lot of shifting of the APs with respect to which the sequence is required to have bounded discrepancy. To explain this, I’ll pretend that we’re only interested in APs for which the common difference is either 1 or a prime number. A preliminary remark is that if you can prove that the discrepancy on APs of the form is unbounded, then you’ve solved the problem (since if that’s big then the discrepancy is big either on the first multiples of or the first multiples of ). But that means that instead of looking at the interval we can look at the interval for any of our choice. The intersections of the multiples of primes less than with that arithmetic progression will, when you subtract , be shifts of the multiples of , and by the Chinese remainder theorem we can choose, for each , what we want the shift to be.

For any particular choice of shifts, the problem could well be just as hard as the original problem. But it is just conceivable that one could get somewhere by taking a random collection of shifts. Of course, one would have to work out what happens for arithmetic progressions with composite common differences (unless it turns out that there must be unbounded discrepancy on a progression with prime common difference, but that feels rather optimistic).

December 29, 2009 at 12:05 pm

Assume that there is a sequence with bounded discrepancy, and that () has a solution . Now we can use a shift to prove:

There exist a sequence and a constant such that for any and

This is just the shift that Gowers mentioned. But lets consider an infinite number of congruences (), such that any finite collection of them has a solution . Using a compactness argument we can now prove:

There exist a sequence and a constant such that for any and

December 29, 2009 at 6:56 pm

A small thought on this idea. Let be the set of differences in the APs we are interested in. (So in the actual problem.) This argument relies on having the property that any two of its elements are coprime. But such are, in a rough sense, orthogonal to the ones we are really interested in: namely, those that contain long APs of the form . So any step in the direction of loosening the dependence on such sets would be interesting.

December 30, 2009 at 11:18 am |

I believe this is an example of length 717 for C=2. There are at least 14 different examples of length 700

{1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, -1, 1, -1, 1, 1, -1, 1,

-1, -1, -1, 1, -1, 1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, -1, 1,

-1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, -1,

-1, -1, 1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1,

-1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1, 1, -1,

-1, -1, 1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1,

1, -1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, 1,

1, 1, -1, -1, 1, -1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1,

1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1,

1, 1, -1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,

-1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1,

1, -1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, 1, -1,

-1, 1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, 1,

-1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1,

1, -1, -1, -1, 1, 1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, 1, -1,

-1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1,

1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1,

-1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, 1,

-1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1,

-1, -1, -1, 1, -1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, 1,

-1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1,

1, -1, 1, -1, -1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1,

-1, 1, 1, -1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1,

-1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1,

-1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1,

-1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1,

-1, 1, -1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1,

1, 1, -1, 1, -1, 1, -1, -1, 1, 1, -1, -1, -1, -1, 1, -1, 1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1,

-1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, 1, -1, 1,

-1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1,

-1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, 1,

1, -1, -1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1, -1, 1, 1,

1, -1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1}

December 30, 2009 at 11:41 am

Wow. And a little modification of that gets us to 727:

+ – + – + + – - – + – + – + – + + – + – - – + – + + + + – + – - + – - + – + – + + – + – - + – + + – + + – - + + – - – - + + + – + + – + – - + – - + + – + – - + + – + – - + + + – + – - – + – + + – - – + – + – + + – + + + – - – + – + + – + – - – + + + + – + – - + – - + – - + + + + – - + – - – + + + + – + – - + + – - + – + – + – - + – + – + + – + – - + + – + + – + – - + – - + – - + – + + + – - + – + + – + + – + – - + – - + – - + + – + + – + – - + – + – - + + + – + – - + + – - + – + + – + + – + – - + – - + – - + + – + + – + – - + + – + – - + – + + + – - + – - – + + + – + + – + – - – + + + – - + – - + – - + + – + – - + + + – + – + – - + + – + – - + – - + + – + + + – - – + + – + – - + – - + + – + + – + – - + + – + – + + – - + + – + – - + – - + + – + + – + – - + + – + – - + – - + + – + + – + – - – + – + + – + – + + – - + – - + + – - + – + + – + – - + + – - – + + – + + + – + – - + – - – + – + + – + – - + + – + + – + – - + + – + – - + + – + + – + – - + – - + + – + + – + – - – + – + – - + – + + + – + + – + – - + + – + – - + – - + + – + – + – - + – + – + + – + – - + – - + + + + – - – + + – + – + – - + + + – + – + – - + + – - – - + – + + + – + – - + – - + + – + + + – - – + + – + – - + – - + + + + – - + – - + + – + + – + – - – + – + + – + – - + + – + – + + – - + – - + + – - + – + + – + + – + – - + – - + – - + + – + – - + + + – + – - + + + – + – - – + + – - – + + – - + – + + + + – - – - + – + + – + + – + + – + – - + – - + + – + + – + – - + – +

December 30, 2009 at 12:01 pm

Yes I haven’t really tried to optimize the construction here so please try out some backtracking on truncations of the sequence. When my program have finished n=700 I’ll try n=800. The the first run will probably finish some time tonight.

I have started out with all admissible sequences of length 25 and then successively eliminated those which could not be extended to length 100, 200,300, 400,500,600. There was a large drop in the number of remaining sequences for n=600, and many for which the program reached it’s time limit.

This was done by a simple modification of a program I already had so there is probably room for some specialised improvements if one really want’s to push this problem.

December 30, 2009 at 7:24 pm

Well, when I went off for Christmas I had this idea that 719 might be the limit (3!! = 720).

If we have a long sequence such as this we rely on the APs with differences 2, 3, 4 etc all being extendable. It may be possible to identify limits on the sequence by examining the APs with small differences.

January 3, 2010 at 8:20 pm

Klas

Do you have a record of your 14 sequences up to 700?

These could come in pretty handy in analysing the subsequences of long sequences.

It might then also be possible to identify sequences we could usefully extend which have been filtered out by your program by the time limit.

Mark

January 3, 2010 at 8:36 pm

Mark

I do not have the full sequences available right now, only the initial stubs which were extended to length 700. But later I can find explicit extensions as well. I have already modified the program so it keeps the full sequence rather than just the stubs.

When the run had finished it had identified 202 such which are extendable to length 700. There are at least 72 which can be extended to length 798.

Now I do not work with multiples of 100 any more bur rather number with many factors, and I have begun to extend the remaining stubs to longer lengths than 25.

My program does not filter anything out due to the time limit. The program keeps all sequences which get a time out as potentially extendable, it only removes those which it finds inextendable.

December 30, 2009 at 12:52 pm |

It occurred to me that there is a sort of self similarity involved in defining the curve. The problem in some sense is asking whether it is possible to define some sort of self similar curve with respect to constraint C.

December 30, 2009 at 9:59 pm |

Here is an example of length 798 for C=2, there is at least one more.

{1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1,

-1, 1, -1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1, -1, -1,

1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, -1, 1, -1, 1, 1, 1, 1,

-1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, -1, -1, 1, 1, 1, -1, 1,

-1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, -1, 1, 1,

-1, 1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1, -1, -1,

1, -1, 1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1,

-1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, 1, 1, -1, -1, 1, -1, -1, 1,

1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, -1, -1, 1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, 1,

-1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1,

1, -1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1,

-1, -1, 1, 1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1,

1, -1, -1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, -1, -1, 1, 1,

-1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1,

1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1,

1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1,

-1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1,

1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1,

-1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, -1, 1, 1, -1, 1, 1,

-1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1, -1,

1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, 1, 1,

-1, -1, 1, 1, -1, -1, -1, -1, 1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1,

-1, 1, -1, -1, 1, 1, 1, -1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, 1,

-1, 1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, -1, 1,

-1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1,

1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1,

-1, -1, 1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1, -1, -1, 1,

-1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1,

1, -1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, -1, 1, 1,

-1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1,

-1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1,

1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1,

-1, -1, 1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1,

-1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, 1,

-1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

-1, -1, 1, -1, 1, 1, 1, -1, 1, 1, -1, 1, -1, 1, -1, 1, -1, -1, -1,

-1, 1, -1, 1, 1, -1, 1, 1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1,

-1, 1, 1, 1, 1, -1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, -1,

-1, 1, -1, 1, -1, 1, 1, 1, -1, -1, 1, 1}

December 30, 2009 at 10:50 pm

That is 799, I think? Backtracking from there gets you quickly up to 832:

+ – - + – + + – - + + + – - – + + – + – + – - – + + + + – + – - – + – + – - + + + – + – - + + + – - – - + – + + + + – - – + + – + + – + – - – - + + + – + – - + + – + – - + + – + + – + – - + + – + – - + – - + + – - + + + – - – + – + + – + – + – + – + + – + – - – + – + – - + – + + + + – - – + – - + + – + + – + – - + + – + – - + – - + + + + – - + – - + + – - + – + + – + + – + – - – - + + + – + – - + – - + + – + + – + – - + + – + + – + – - – - + + – - + – + + + – - + – + – - + + – + + – + + – + – - + – - + + – - + + + – - + – - + + – + – - + – + + + – - + – + – + – - + – + – + – - + + – + + – + – - + + – + – + + – - – + – + + + – - – + + – + – - + – - + + – + + – + – + – + – - + – + – - + + + + – - + – - + + – + – - + – - + + + + – - + – - + + – + + – + – - + + – + – - + – - + + – + + – + – - – + + – + – - – + + – + + – - + + – - + – + + – + – - + + – + – - + – - + + + + – - – + – + + – - + + + – - + + – - – - + – + + + + – - – + + – + – - + + – + – - + + – + + – + – - + – - + – - + + – + – - + + – + – - + + – + + – + – - + + – + – - + – - + + – + + – + – - + + – + + – + – - – + + + – - + – - + + – + + – + – - + + – - – - + – + + + – + – - + – - + + – + + + – - – + + – + – - + – - + + – + – - + + – + + – - + + – - – + – + + + + – - – + + – + – - + – - + – + + + – - + – + + – - + – + – - + + + + – - – - + + + – + – - + – - + + + – - + + – - + + – + – - + + – - + – + + – + – - + – - + + – + – - + + – + + + – - – - + – + + – + – - + + + – - – + – + + + – + + – + – + – + – - – - + – + + – + + – + – + – + + – + – - – - + + + + – + – - – - + + – + + + – - – + – - + – + – + – + + – + + + – + – - – + + – - – + + + – - + – - + – + + + + – + – - + +

December 31, 2009 at 3:34 pm |

Just thinking out loud, please forgive any redundancy:

Start with a string of length n.

If p is the probability of a +1 occurring, then q = 1-p is the probability of -1 occurring.

Constraint C places an upper and lower bound on q.

qmax = 1/2 – (Cd/2n)

qmin = 1/2 + (Cd/2n)

for the initial string, d=1, therefore

qmax = 1/2 – (C/2n)

qmin = 1/2 + (C/2n)

and for any C, as n approaches infinity, qmax and qmin converge at 1/2.

when d > 1, then as d approaches n,

qmax = 1

qmin = 0

I have some more thoughts on this, but I want to make them more concise before I post them.

December 31, 2009 at 3:37 pm |

Sorry

when d > 1, then as Cd approaches 2n,

qmax = 1

qmin = 0

December 31, 2009 at 4:11 pm |

[...] For a description of Erdos’ discrepancy problem and a large discussion see this blog on Gowers’s blog. [...]

December 31, 2009 at 7:37 pm |

I must be suffering from a fit of dyslexia…please forgive

when d > 1, then as |Cd| approaches n,

qmax = 1

qmin = 0

January 1, 2010 at 4:42 pm |

Here is a sequence of length 897 for C=2, no additional optimization done

{1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, -1, -1, 1,

1, -1, -1, 1, 1, -1, 1, 1, -1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1,

-1, 1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, -1, 1, 1,

1, -1, -1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,

-1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, 1, 1,

-1, -1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1,

-1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, -1,

1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1,

-1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1, -1,

-1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, 1,

1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1,

1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, 1, 1,

-1, 1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, -1,

-1, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1,

1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1,

-1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1,

1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, -1,

1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1,

-1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1,

1, -1, 1, -1, 1, 1, -1, -1, -1, 1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1,

1, -1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, 1, 1, -1,

1, -1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, 1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1,

1, -1, 1, 1, -1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

1, -1, -1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

1, -1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1,

-1, 1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, 1,

1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1,

1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1,

-1, -1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1,

-1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1,

1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1,

1, -1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1,

-1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1, 1, 1, -1, -1, -1, -1,

1, 1, -1, 1, -1, 1, 1, 1, -1, -1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, 1}

January 1, 2010 at 9:29 pm

Simple optimization only gets you one further in this case, to 898:

+ – + + – - + – + + – + – + – - – - + + – - + + – + + – - – + + + – + – - + – - + + – + + – + – - + + – - + – + – + + + – - – - + – + + + – - + + + – - + + – + – - + – - + + – + – - + – + + + – - + – + – - + + – + – - + – + + + – - – + + – - – + + + + – - – - + + – + + + – - + – - + + – + – + – - + – + + + – - – - + + – + + – - + + – + – - + – + + – - + – + – - + – + – + + – - + + – - + + + – + – - + – - + + + + – + – - + – - + + – - + – + + – - – + – + + – + – + – - + + + – + – - + – + + – - + – + + – + + – - + – - + – + + – - + + – + – + – - + – + + + – - – - + + – - + – + + + – - – + + – + + – - + + + – - – + + – + – - + – + + – + + – - + – + + – - + – - + – + – - + + – + + – - + + – + – - + – + + – - + – + + – + + – - + – - + – + + – - + – - + – + + – - + – + + – + + – - + – - + + + + – - + – + + – + – - + + – - + – - + + – - – + + – + + – - + + + – - + – - – + – + + – + + – - + + – + – + + – - – + – + + + – - – - + + + – + + – - + – - + – + + – - + – + + – + + – - + – + + – + – - – + – - + – + + + – + – - + – + + – - + – + – + + – - – + – + + – + + – - + – - + – + + + – + – - + – + + – - + – - + – + + – - + + + + – + – - – + – + + – + + – - + – - – + + + – - + – + + – - + + – + – + + – + – - – + – - + – + + – - + – + + – + + – - + – + + – + + – - + – - + – + + – - + + – - – + + + + – - + + – + – - – + – - + – + + + – - – + + – + + – - + – + + – - + + – + – - + – + + – + + – - + – + + – - – - + + – + + – - + + – + – + + – - + – - + – + + – - + – + + – + + – - + – - + – + + – + – - + – + – + – - + – + + + + – - – + + – + – + + – - + – - + – - + + + + – + – - + – + – + – - + – + + + – - + + – - + + – - + – + + – + – + – - – - + – + + – - + – + + – + + – - + – + + – - + + – + – - + + + – - – - + – + + + + – - – - + + – + – + + + – - + – + + – - + – - + – + + – + +

The sticking point here is that while .

I'd be interested to know the number of sequences of length that can be extended to , for the that you have so far (?).

January 1, 2010 at 9:39 pm

I can add a 1 to this to get a sequence of 898.

899 = 29 x 31 and the sequences with difference 29 and 31 are the ones blocking progress to 899.

January 2, 2010 at 7:13 pm

Looking at the subsequences of the length 898 sequence (ie the sequences obtained by picking out the elements rd for fixed d) there are quite a lot of different initial sequences.

The one early observation which may have some interest (because it involves small primes) is:

The sequence for d=5 is the negative of that for d=4 except at positions 59 and 61 (up to limit 179)

The sequence for d=6 is the negative of that for d=4 (up to limit 149)

Some other connections

The (shorter) sequences for d=36 and d=43 are identical to the negative of that for d=9 (so far as they go)

d=18 is the negative of d=10 and d=27

There seems to be a tendency for the first divergence in these subsequences (for those which are identical or negative for a longer stretch than usual) at primes eg 13, 17. I have not checked, but I have not noticed any divergence at any value which has two or more distinct prime factors.

January 2, 2010 at 7:17 pm

I still think it would be useful to investigate these subsequences in rather more depth, and to understand how far they can be extended. This may be related to Alan’s question about sequences of length 25 which can be extended.

January 3, 2010 at 12:01 pm

Alec, since my program has a time limit built in I do not known how many of the length 25 it is possible to extend to a given length. I have some which I know cna be extended and some which run out of time and are then passed on to the next step where they are often easier to eliminate.

January 3, 2010 at 2:42 pm

Re subsequences with differences first occurring at n where n has two or more prime factors.

The sequence of length 1124 below has subsequences for d=2, d=24 which are negatives of each other up to the 44th term.

So though prime numbers or prime powers seem to be strongly preferred, they are not forced.

January 2, 2010 at 10:06 am |

Tim

Some easy observations, and a question:

A trivial answer to part of your final question about common difference not divisible by 2, 3, 5 … 101.

For common difference not divisible by 2: the sequence which is 1 on odd numbers and -1 on even numbers has sum = 1 or 0 on every AP starting at 0 with odd difference. This is at the cost of every progression with even difference being unbounded (sum to nd is -n)

The sequence which is built starting with 1, and at each stage appending the negative of the sequence to date:

1

1, -1

1, -1, -1, 1

1, -1, -1, 1, -1, 1, 1, -1

Has sum -1, 0 or 1 on every progression whose difference is a power of 2.

So it looks as if it is 2 which causes a problem.

Is it possible to prove, for example, for C= 2 that there is an infinite sequence which works for differences which are divisible only by 2 and 3 (generalise to other sets of primes, perhaps higher powers of 2 and other values of C)?

January 2, 2010 at 12:36 pm |

For the problem in higher dimensions, it’s conceivable that the answer might depend on the particular norm chosen for the condition on the . (For the condition on the AP sums, of course, it makes no difference to the problem which norm one chooses.)

I’ve done a little investigation using the Euclidean norm for both in two dimensions, and found that one can get at least as far as with . (One can achieve this using points of the form ).

January 2, 2010 at 3:41 pm

And with the norm and , one can get to at least in two dimensions even restricting the points to and .

January 3, 2010 at 9:45 am

I realized that with the Euclidean norm in two dimensions, vectors placed on a regular hexagon are particularly useful. Using only such vectors, with we can get as far as (and no further). The following sequence achieves this, where :

[0, 3, 3, 0, 3, 0, 2, 4, 0, 1, 4, 3, 1, 5, 0, 2, 4, 3, 1, 5, 5, 1, 4, 1, 2, 5, 3, 2, 5, 4, 1, 4, 1, 2, 5, 0, 2, 4, 4, 1, 5, 2, 1, 4, 3, 1, 5, 5, 2, 4, 1, 2, 4, 0, 1, 5, 4, 1, 4, 2, 2, 4, 0, 1, 5, 4, 2, 5, 1, 2, 4, 3, 1, 5, 5, 1, 4, 1, 3, 4, 2, 1, 0, 4, 1, 4, 3, 1, 5, 0, 2, 4, 5, 2, 4, 1, 1, 5, 3, 2, 0, 3, 4, 5, 1, 1, 3, 4, 1, 5, 0, 2, 4, 1, 3, 5]

By using only these vectors we constrain the sums to a set of seven points: the six points of the hexagon and zero.

January 3, 2010 at 12:07 pm |

Here is a sequence of length 1073 of C=2.

{1, -1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1,

-1, 1, 1, -1, 1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1,

1, 1, 1, -1, -1, 1, -1, -1, 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,

-1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1,

1, -1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1,

-1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1,

1, -1, 1, 1, -1, 1, -1, 1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1,

1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1,

-1, 1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1,

-1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1,

1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1,

1, -1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, 1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1,

-1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1,

1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1,

1, 1, -1, 1, -1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, 1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1,

1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1,

1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1,

1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, 1, -1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1,

1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1,

-1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1,

-1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1,

-1, 1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1,

-1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1,

-1, -1, 1, 1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1,

1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1,

1, -1, 1, -1, 1, -1}

My main program, which aims for a sequeqce of a specific length, was used to find sequene of length 1008 and then a very simple extrension program was run to find the sequence of length 1073. The sequence of length 1008 seems to have quite a lot of extension so it would be interesting to see what comes out if it is used as the initial sequqnce for a backtrack program. Here is the sequence of length 1008

{1, -1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1,

-1, 1, 1, -1, 1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1,

1, 1, 1, -1, -1, 1, -1, -1, 1, 1, -1, -1, 1, 1, 1, -1, -1, 1, 1, -1,

1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1,

-1, 1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, 1, -1, -1,

1, -1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1,

-1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1,

1, -1, 1, 1, -1, 1, -1, 1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, 1, -1, -1, 1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1,

1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1,

-1, 1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, -1, -1, 1, 1, -1, -1, 1,

-1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1, -1,

1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1,

1, -1, -1, 1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, 1, -1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1,

-1, -1, 1, 1, 1, 1, -1, -1, -1, -1, 1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

1, -1, -1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1,

1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, -1, 1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1,

1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1,

1, 1, -1, 1, -1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, 1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1,

1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1, -1,

1, 1, -1, -1, 1, 1, -1, 1, -1, 1, 1, -1, 1, -1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1,

1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, 1, -1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1,

1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1,

-1, 1, 1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, -1, 1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1,

-1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1,

-1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1,

-1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, 1, -1, 1, -1,

-1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1,

-1, -1, 1, -1, -1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1, -1, 1, 1, -1,

-1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, 1, -1, -1, 1, -1,

-1, 1, -1, 1, 1, -1, -1, -1, 1, 1, -1, -1, 1, 1, -1, -1, 1, -1, 1, 1,

-1, 1, 1, -1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, 1, -1, -1, 1, -1, 1,

1, -1, -1, 1, -1, 1, 1, -1, 1, -1, 1, -1, -1, 1, -1, 1, -1, 1, 1, -1,

-1, 1, -1, 1, -1, -1, 1, 1, 1, -1, 1, -1, -1, -1, 1, 1, 1, -1, -1, 1,

-1, -1, 1, -1, 1, 1, -1, -1, 1, 1, 1, 1, -1, 1, -1, -1, -1, 1, -1, 1}

January 3, 2010 at 12:44 pm

1124, in no time at all:

+ – + + – - – - + + – + + + – - + – + + – - – + – + + – + -

- + + – + – - + – - + + – + + – - – + + + – - + – + – - + +

+ + – - + – - + + – - + + + – - + + – + – - + – - + + – + -

- + – + + + – - + – + – - + + – - – + + – + + + – - – + – -

+ – + + + + – - – - + + – + + + – - + – - + – - + – + – + +

- + + + – - – - + + + + – - – + + – + – - + – + + – - + – +

+ – + – + – + + – - – + + – + + – - + – - + – - + + + + – +

- – + – - + + – - + – + + – - – + – + + – + + + – - + + – -

+ – - + – + + – - + – + + – + + – - + – - + – + + – - + + -

+ – + – - + – + + + – - + – + + – - – - + + – - + – + + – +

+ – - + – + + – - + + – + – - + – + – - + + – - + – + + + -

+ – - + – + – - + + + + – - – + + – + – - + – + + – - + – +

+ – + + – - + – - + – + + – - + + – - – + + – - + – + + – +

+ – - + – - + + + + – - – - + + – + + – + + – - + – - + + -

- – + + – + + – - + – + + – + – - – + – + + – + + – - + – -

+ – + + – - + + – + + + – - – + – + + – + + – - + – - – + +

+ – - + – + – - + + + – + – + + – + – - – + + – + – + + – -

+ – - + – + + – - + – + – + + – + – + – - + – + + – - + – -

+ – + + + – - – + + – + + + – + – - + – + + – - + – + + – +

- – - + – + + – + + – - + – - – + + + – - + – + + – - + + -

+ – + + – + – - – + – - + – + + – - + – + + – + + – - + – +

+ – + + – - + – - + – + + – - + – - + – + + – + – - + + – +

+ – - + – - + – + + – - + – + + – + + – - + – + + – - + + -

+ – - + – + + – - + – - + – + + – - + – + + – + + – + + – -

+ – + + – - + – - + – + + – - + – + – - + + + – + – - + – +

+ – - + – + – - + + + – - – + + – + + – - + + – + – + + – -

+ – - + – - + – + + – + – - + + + – + – - + – + + – - + – +

- – + + – - + – + + – + + – - + + – + – - + – - + – + + – +

+ – - + – + + – + + – - + – - + + + – - – + – - + – + + – -

+ + + + – + – - + + – - + – + + – - + – - + – + + – - + – +

+ – + + – - + – - + – + + – - – + + – - + + – - + – + + – +

+ – - + – - + – + + + – + – - + – + + – - + – + + – + – + -

- + – + – + + – - + – + – - + + + – + – - – + + + – - + – -

+ – + + – - + + + + – + – - – + – + + – + + – - + – - + + +

- – - + – - + – + + + – + – + + – + – - – + – - + + + + – -

+ – + + – - + – + + – + – - + – + + – - – + – - + + – + – +

+ – + – - – + – + + + + – + – - – - + – + + – + + – - + + -

+ + – + – - + – + + – + – -

January 3, 2010 at 7:12 pm

Working with 1124 sequence

Some interesting patterns emerging here.

Subsequence difference 2 is the same as the negative of subsequence 3 up to position 71.

Subsequence 4 is the negative of subsequence 5 as far as it goes and 6 is the same as 5 up to and including position 175. (4 and 6 – cf 2 and 3), 9 is the negative of 5 up to position 105 and of 40 to position 27 (as far as it goes)

Subsequence 7 is the same as subsequence 1 up to position 60.

Subsequence 8 is the negative of 10 up to position 80 and of 12 up to position 88 (cf relationships between sequences 4,5,6). It is the same as subsequence 15 up to position 73, and the negative of 27 up to position 40 (and the same as 1 up to position 47).

Subsequence 16 is the negative of sequences 20 and 24 (up to 43)

So it looks as if there may be something going on in the subsequences which is related to the multiplicative structure of the low primes, but which isn’t anything like “completely multiplicative” – and this may be worth some further investigation.

January 3, 2010 at 9:34 pm

Sorry, misread something subsequence 1 is not the same as subsequence 8, closest is 49.

January 3, 2010 at 10:45 pm

There is a strong tendency to subsequences which begin (my numbers based on a simple binary code identify sequences, codes for the negative of sequence n and 511-n) sequence [values of d for which sequence applies]:

(114) -1, 1, -1, -1, 1, 1, 1, -1, -1 … [2, 16, 21, 22, 25, 30, 36, 37, 39, 46, 57, 81

(397) 1, -1, 1, 1, -1, -1, -1, 1, 1 … [3, 14, 17, 20, 24, 26, 33, 38, 45, 54, 69

(181) 1, -1, 1, -1, 1, 1, 1, -1 … [4, 9, 29, 32, 35, 41, 42, 44, 50, 51, 60, 65, 72, 74, 78, 92, 95, 99, 109

(330) -1, 1 -1, 1, -1, -1, 1, -1, 1 … [5, 6, 28, 31, 34, 40, 43, 48, 52, 55, 63, 66, 67, 75, 76, 90, 103, 108, 111

(332) -1, -1, 1, 1, -1, -1, 1, -1, 1 … [8, 11, 15, 18, 23, 58, 64, 70, 79, 82, 84, 85, 88, 91, 93, 97, 100, 102, 106

(179) 1, 1, -1, -1, 1, 1, -1, 1, -1 … [10, 12, 13, 19, 27, 56, 59, 62, 68, 77, 80, 83, 86, 87, 89, 96, 101, 104, 105, 110

Apparently sporadic values

(269) 1, -1, 1, 1, -1, -1, -1 -1, 1 … [1, 49

(242) -1, 1, -1, -1, 1, 1, 1, 1, -1 … [7

(458) -1, 1, -1, 1, -1, -1, 1, 1, 1 … [47

(180) -1, -1, 1, -1, 1, 1, -1, 1, -1 … [53, 107

(331) 1, 1, -1, 1, -1, -1, 1, -1, 1 … [94

(333) 1, -1, 1, 1, -1, -1, 1, -1, 1 … [61, 112

(178) -1, 1, -1, -1, 1, 1, -1, 1, -1 … [98

(182) -1, 1, 1, -1, 1, 1, -1, 1, -1 … [71

(329) 1, -1, -1, 1, -1, -1, 1, -1, 1 … [73

Note that 8r tends to belong to the same sequence as r, and that while 1 and 7 belong to sporadic sequences, 8 and 56 belong to the main set.

If 2r belongs to n, 3r belongs to 511-n

The tendency of sporadic values to attach to primes, and the curious case of 7. Some of the higher values may change when the sequence is extended.

The persistence of the six strong sequences suggests further work on these – the analysis above is based on the first nine values in each sequence. It is, of course, possible to analyse further in the same way. But I thought I’d give this to see if anyone spotted any obvious patterns.

January 3, 2010 at 11:09 pm

The six strong sequences exhibit some very regular behaviour when d is multiplied by low primes, with the following cycle structure:

2 – (114, 181, 332) (397, 330, 179) [square of 5]

3 – (114, 330, 332, 397, 181, 179) [inverse of 5]

5 – (114, 179, 181, 397, 332, 330)

7 – (114, 397) (181, 330) (179, 332) [cube of 5] [13 has the same structure]

11 – (114) (397) (181) (330) (332) (179) [identity]

So we appear to be in the cyclic group of order 6.

This structure could be used in the search for longer sequences.

January 3, 2010 at 11:37 pm

Just to note that taking the coding out a little further, the same structure persists (here the negative sequence for code n is 4095-n). There are a few more sporadic values associated with particular primes, but six main sequences, with a cyclic group structure (note that numbers with the same code relate to the same group elements, so the code elements can be regarded as the elements of the group.

What happens when we take the sequence for 5 (group generator) or 3 (its inverse) and try to extend these? -ie begin with the sequence f(3r) or f(5r)

What about taking the sequence for 8 (group identity) – extending f(8r)? (or f(15r)) – how far do these get?

Or should we be asking different questions? How is this group structure best deployed? Does it apply only to this sequence, or does every long sequence constrained to C=2 have this cyclic group structure (as seems likely)? And how is the structure explained – where does it come from?

179: 13, 19

181: 29

1203: 10, 12, 27, 56, 59, 62, 68, 77, 80, 83

1204: 53

1205: 4, 9, 32, 35, 41, 42, 44, 50, 51, 60, 65, 72, 74, 78

1206: 71

1266: 7

1421: 3, 14, 17, 20, 24, 26, 33, 38, 45, 54, 69

1482: 47

2674: 2, 16, 21, 22, 25, 30, 36, 37, 39, 46, 57, 81

2829: 1, 49

2889: 73

2890: 5, 6, 28, 34, 40, 43, 48, 52, 55, 63, 66, 67, 75, 76

2892: 8, 15, 18, 58, 64, 70, 79, 82, 84, 85

2893: 61

3914: 31

3916: 11, 23

January 4, 2010 at 8:29 am

Passing to subsequences (eg f(8r) in previous post) looks as though it may reduce the number of sporadic sequences (so 7 would not be sporadic any more) and may help to illuminate the question of whether these sporadic subsequences are getting in the way of progress, or whether they are necessary.

January 4, 2010 at 10:10 am

Note that – it appears (speculating beyond data) that if one of the six key sequences arises, they are all required. For example, each of the six is a subsequence of the [identity] sequence.

The case of 1, 7, 49 cannot be usefully investigated further (it may be significant) until (?) we have a much longer sequence to show what happens for 343.

January 4, 2010 at 1:57 pm

Actually, it looks as though 7 would be sporadic in f(8r), because 112 is sporadic. It would be interesting to get deeper into this behaviour of 7 (which induces the same group element as 10, 13 …) and to know whether it might be reflected in larger primes too.

January 4, 2010 at 4:36 pm

The sequences belonging to r=8 and r=10 split themselves into two when they are extended to 15 digits, while the others persist – but this involves differences in entries from about the 800th place – and it may be that a successful long sequence would differ in these places …

… or it might be that the pattern is both necessary and unsustainable that far out (leading to a contradiction).

January 6, 2010 at 12:28 pm

Mark, the 181 sequence you wrote down doesn’t have discrepancy 2. I presume that’s just a typo, but what should it be?

I’ve just realized I can answer that for myself. The final 1 and -1 should be the other way round. If you confirm that you agree with that then I’ll edit your comment.

January 6, 2010 at 12:50 pm

No, he forgot a -1. 181 should be the negative of 330.

January 3, 2010 at 4:15 pm |

Instead of asking for a function from the natural numbers (just another way to represent the sequence ) with bounded discrepancy, we could ask for such a function from the positive rational numbers . That is, such that for any positive rational number and any integer :

Using a compactness argument, these two questions turn to be equivalent. In other words: Assuming that there is a sequence with bounded discrepancy, we can find a finite sequence, that can be extended in the way Mark described in http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/#comment-4618 arbitrary many times (and by arbitrary factors).

January 3, 2010 at 9:46 pm |

Here is a possible way of making progress by looking at the subsequences.

Suppose there is an infinite sequence f(n) for some C (one dimensional case).

Examine the subsequences f(rd) for fixed d – these will also do for C.

Suppose that only a finite number of infinite sequences exist for the value C.

Then, by taking subsequences whose differences are powers of 2, we must find two the same (there are only a finite number).

Take the smallest sequence with the lower power of 2 as the base sequence. We now have a sequence for C which has a subsequence identical to the original sequence (g(n)=g(nd) for some fixed d and all n)

We can therefore find the sequence g(n) for which g(n) = g(nd) for the smallest possible d. What can we say about de in this case (eg with the current case of C=2)?

January 4, 2010 at 8:40 am

See above – it looks as though the C=2 case forces 6 significant subsequences plus some sporadic ones. The six subsequences have a group structure (cyclic, order 6). [NB this is the one dimensional case]

This structure could do with some explanation – it depends on a map which is nearly a homomorphism from the multiplicative group of positive integers to the cyclic group of order 6 [the sporadic values are exceptions]. The subsequence structure shows that there will be a family of such maps.

January 4, 2010 at 12:29 pm

Ignore that comment about a family of maps.

Attempts to extend the six basic sequences as far as possible will illuminate the structure further.

January 4, 2010 at 12:42 pm

Note that the primes associated with the sporadic sequences seem to induce permutations of the six basic sequences which are members of the cyclic group (though the data supporting this are thin). It may therefore be possible to associate a group element to each sporadic sequence too.

January 4, 2010 at 10:49 pm

Mark, I wonder (quite speculatively) if some of the patterns you are finding arise because the sequence is ‘trying’ in various places to approximate the Legendre symbol modulo a prime . This is a promising candidate for bounded discrepancy because it is multiplicative and periodic modulo , so that the partial sums on all special APs are bounded by a constant depending only on . It doesn’t actually solve the problem, of course, because it’s zero on multiples of , but perhaps long sequences can be obtained by judicious placement of s and s at those multiples.

January 5, 2010 at 12:01 am

Interesting thought, Alec, but I’m not sure it will work (it may) – because I’d expect some clear distinction between primes of the forms 4n+1 and 4n+3, which doesn’t quite seem to be there. And there is the issue of what is to be done with the prime 2, which seems to me to have a key role, but which I don’t really understand.

The issues at the back of my mind are:

1. Without some tractable structure it would seem that an infinite sequence with maximum discrepancy 2 would be impossible to prove. It has seemed several times that a limit may have been reached, and the extent of progress has been surprising every time. There is as yet no explanation of this, in terms of how the blockages in particular cases relate to a deep structure.

2. If a limit to sequences with maximum discrepancy 2 is eventually reached, then the case C=3 is there to be explored. It seems to me that any limit will then be very large indeed, and certainly beyond my computer resources – and that some understanding of how structure arises in the sequences could give some tools for analysis. As I noted above, there are different parity issues which arise for odd C, and the even subsequence may be very significant.

January 5, 2010 at 1:55 pm

Incidentally, I don’t think one can improve on the bound by using the Legendre symbol modulo , filling in the multiples of with the Legendre symbol modulo , and so on — at least not without choosing the very carefully. According to the paper of Borwein and Choi referred to above, Schur has shown that the partial sums of the Legendre symbol reach at least . (Actually it’s not entirely clear from the remark in that paper whether this is true for all or just for infinitely many — can anyone confirm this?)

January 18, 2010 at 9:08 pm

I believe it is true for all p, via a Fourier analysis argument (Parseval’s thm), if my memory serves me.

January 4, 2010 at 2:58 pm |

I think Mark’s observations are very interesting. If I understand him correctly, a sequence of this type without sporadic subsequences would be of this type:

Let be a finite abelian ( seems to be a good one) and definite a multiplicative function and a function . Now define .

Notice the a multiplicative sequence is of this form with .

January 4, 2010 at 3:04 pm

I hope this one contains fewer typos :s :

Let be a finite abelian group ( seems to be a good one) and define a function such that and a function . Now define .

January 5, 2010 at 4:47 pm

I think we should use this to search for long sequences with bounded discrepancy. I order to define a sequence of the above form, you only need to specify , for every element in and for every prime. A good starting point seems to be , , for and otherwise, and .

I haven’t checked, but I think that if a sequence only have a finite number of different subsequences, then it must have a subsequence of the above form.

January 5, 2010 at 10:19 pm

Here is the proof that if a sequence only have a finite number of different subsequences, then it must have a subsequence of the above form:

Let be a sequence with only finitely many different subsequences. Now you can alway find a subsequence such that every subsequence of contains a subsequence identical to : If this is not true for , there must be a subsequence of such that don’t contain a subsequence identical to . Now try : This sequence contains fewer different subsequences, so you can only do this a finite number of times. Finally you find a that works.

Now I want to show that is of the above form. Let and be two natural numbers such that . For any natural number $d_3$ we have . In other word: If the “-subsequence” and the -subsequence of the -sequence are the same subsequence, then for any subsequence of , the -subsequence and the -subsequence of are the same. We now identify the natural number with the sequence . We want to define a group structure on the subsequences: Define the composition of two subsequences by choosing a natural number that represents them, multiply these numbers, and take the corresponding subsequence. This composition is clearly associative, commutative and has identity and inverse.

January 6, 2010 at 12:10 am

So the question is whether the sporadic bits are smoothed out in the big picture, or whether they multiply and become uncontrollably complex.

It is trivial, of course, that the homomorphic image of a group is determined by the images of generators, so the behaviour of primes is fundamental.

The fact that changing the sign of a sequence does not change the discrepancy pattern has always implied a group element of order 2. It is the existence of an order 3 element which I don’t have a handle on yet. And I conjecture that this would help to identify “pure sequences” rather than the ones we have which may contain sporadic fluctuations. The fluctuations might be necessary, of course, but they would be understood in their proper context.

I would like to be able to make a conjecture, for example, whether C=3 would give a group element of order 4 (=3+1) or order 5 (the 3rd prime) or something else. An abelian group of order 6, or 30, is necessarily cyclic, but this is not true for order 12 or 24.

January 6, 2010 at 12:34 am

I’m not sure I understand. If the cyclic group of order 6 works for C=2 it will also work for C>2. Are assuming that the conjecture is true (no sequences have bounded discrepancy) and talking about patterns in long finite sequences for C=2 and C=3?

January 6, 2010 at 7:52 am

I’m planning for the contingency that C=2 is false.

January 5, 2010 at 8:55 pm |

Just got back from my internet-free week. Not much time for anything right this moment, but I have to make one comment: wow! That’s at the breaking of the 1000 barrier. I had assumed that the conjecture was true (that bounded discrepancy is impossible) but I think that can no longer be taken for granted, even if it still seems more likely. I have a few simple observations that I’ll post later.

January 11, 2010 at 5:25 pm |

The discussion continues here:

http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/

August 28, 2011 at 7:40 am |

[...] of the form (r,2r,3r,…kr}. EDP was extensively discussed and studied in polymath5. Here is the link of the first post. Here are links to all polymath5 posts. Here is the link to polymath5 [...]

August 22, 2012 at 4:59 pm |

[...] sequences where grows slowly. EDP was extensively discussed and studied in polymath5. Here is the link of the first post. Here are links to all polymath5 posts. Here is the link to polymath5 wiki. A sequence is called [...]

August 23, 2012 at 10:54 am |

If you assumed that there exists an infinite sequence such that C=2, could you prove that the sequence had a given sub-sequence? It obviously has all length 2 sequences as sub-sequences, and it has + + – and – - + as sub-sequences, but is it possible to prove that the sequence has other sub-sequences?

March 25, 2013 at 9:24 pm |

[...] sums stay within . For a proof of this and much more information about the EDP, see this original post by Timothy Gowers, a more-recent post by Gil, and this Polymath [...]

July 31, 2013 at 6:08 am |

Have we tried to prove that the drift must be at least 4? We could make it easier by trying to prove that the drift is at least 4 or the discrepancy is at least 3. I will work on it.

July 31, 2013 at 10:12 am

Never mind, I took a look at the 1124 sequence, and the drift of 4 doesn’t happen for the first 100 or so, so it would be really hard to prove it must eventually happen.