EDP7 — emergency post

I don’t feel particularly ready for a post at this point, but the previous one has got to 100 comments, so this is one of the quick summaries again — but this one is even shorter than usual.

We are still doing an experimental investigation of multiplicative functions, trying to understand how special they have to be if they have low discrepancy. Ian Martin has produced some beautiful plots of the graphs of partial sums of multiplicative functions generated by various greedy algorithms. See this comment and the ensuing discussion.

Terence Tao has some thoughts about how one might try to reduce to the character-like case.

I came up with a proof strategy that I thought looked promising until I realized that it made predictions that are false for character-like functions such as \lambda_3 and \mu_3. Even if the idea doesn’t solve the problem, I think it may be good for something, so I have written a wiki page about it. Gil has had thoughts of a somewhat similar, but not identical, kind. Here is a related comment of Gil’s, and here are some more amazing plots of Ian’s. (I think we should set up a page on the wiki devoted to these plots and the ideas that led to them.) Regardless of what happens with EDP itself, I think we have some fascinating problems to think about, which can be summed up as, “What is going on with these plots?”

105 Responses to “EDP7 — emergency post”

  1. Various Polymath Projects « Euclidean Ramsey Theory Says:

    […] Let me update this there is a new thread for Polymath5. […]

  2. gowers Says:

    I’ve been having another look at this plot of Ian’s. It’s the partial sums of the completely multiplicative function you get if you define f(p) to be 1 if and only if the partial sum up to p-1 is less than or equal to -4. I used to be quietly confident that the fairly low partial sums we were witnessing was a sign that the growth we were witnessing was n^{o(1)}, but now I’m not so sure. The reason for my doubts is that the “greedy positive” sequence seems to show n^{1/4} type growth, which makes me think that perhaps this one does too. Given that the fourth root of 2,500,000 is about 40 or so, the numbers do seem to be in the right ball park.

    It’s interesting to see how the graph is very flat to start with and then appears to take off quite rapidly, after which a new function (the more solid blue) takes over and interferes with the old one, causing it to drop back down a bit before resuming a more rapid growth. I would dearly like to know whether this kind of pattern continues, but I understand from Ian that the computations are becoming quite slow now, so this may not be feasible. Or perhaps the issue is one of storage: how many values of f is it reasonable to store directly? I’d be quite interested to know the answer to that so I can try to think what the limit of feasibility would be. So my two questions are: roughly how many numbers can you store and roughly how many arithmetical operations can you do per second? I realize these questions are a bit vague, but vague answers would suffice.

    The algorithm I have in mind would be this. You define two arrays, x_1,\dots,x_N and y_1,\dots,y_N, for some large N. You initialize them to 1,0,0,0,\dots and 0,0,0,\dots. Then a global step of the algorithm consists in doing the following.

    1. Find the smallest n such that x_n=0 (which will always be a prime).

    2. Set f(n)=1 if y_{n-1}\leq -4 and f(n)=-1 otherwise.

    3. For every m such that f(m)\ne 0, set f(mn)=f(m)f(n).

    4. (i) Set k=0. (ii) Let y_{n+k}=y_{n+k-1}+x_{n+k}. (iii) Add 1 to k. (iv) If x_{n+k}=0 then goto 1, else goto 4 (ii).

    This seems to need space N and time around N^2 (because of step 3).

  3. Alec Edgington Says:

    Below is Tim’s algorithm as a Python script (pylab is just for the plotting). On my PC here at work this runs very quickly with N = 5000000, but runs out of memory at 10000000. On a better machine or in a faster compiled language it could no doubt do a little better.

    import pylab as pl
    N = 1000000
    x = [0]*(N+2)
    y = [0]*(N+1)
    x[1] = 1
    n0 = 2
    while n0 <= N:
      x_n0 = 1 if y[n0-1] <= -4 else -1
      for m in range(1,N/n0+1):
        if x[m] != 0:
          x[m*n0] = x[m] * x_n0
      while x[n0] != 0:
        y[n0] = y[n0-1] + x[n0]
        n0 += 1
  4. gowers Says:

    Would it be possible to provide plots of the partial sums up to 5000000? It would be nice to see the partial sum as a function of \log n, and also the log of the partial sum as a function of \log n. We won’t get all that much more information than we already have from Ian’s plots, but even a little might be helpful.

    If the algorithm is running very quickly but needing too much space, one could cut down the space requirement by a factor of two and slightly increase the complication of the calculation by just storing the partial sums up to n_0 and the values thereafter, since the values up to n_0 can be got by differencing the partial sums. I don’t know enough about how computers work to know whether that would mean you could go twice as far.

    If one wanted to save even more space, one could save just the values at primes and the partial sum up to n_0 (but not the previous partial sums), doing a plot as one went along, and perhaps taking note of each time the partial sum reached a new record, and calculating f(n) each time by working out its prime factorization. That feels as though it could get you over 15 times as far if space remains the main constraint, unless one is forced to waste space by storing a table of the primes in increasing order. But even then one should be able to get a seven-fold improvement. (I’m approximating the log of 10^6 by 15.)

  5. Alec Edgington Says:

    Partial sums plotted against \log n:

    Logarithm of absolute value of partial sums plotted against \log n:

    I’ll have a go at optimizing the program (but no time at the moment…).

    • gowers Says:

      Oh dear, it is as I feared. The log versus log graph looks very linear, so it seems that this greedy algorithm is in fact giving us power-type behaviour. However, it also seems that the power is definitely smaller than 1/2, with my money still on 1/4.

      I’d still very much like to understand this better, but I feel a lot less optimistic about finding n^{o(1)}-type growth this way.

      Maybe my optimism would increase if the power went down if one changed the -4 threshold to a smaller number such as Johan’s original -10 or something smaller like -50. If that happened, then I would expect to be able to obtain n^{o(1)} growth by having a variable threshold such as -c\log n. So it might be worth running the program again with this tiny change.

    • Alec Edgington Says:

      I managed to get up to 2 \times 10^7, and these plots may give some cause for optimism. They are log-log plots of the maximum value attained by the partial sums, for three versions of the algorithm: setting f(n)=1 when the partial sums are less than or equal to -1, -4 and -10 respectively. But as usual one would really like more data!

    • Alec Edgington Says:

      (Tim, my reply is in your moderation queue — and I meant f(n) = +1 when I said -1; could you correct this please?)

      Is it possible to upload images to the wiki?

    • Klas Markström Says:

      Alec, what limits your runs? Time or memory? How much is used?

    • Alec Edgington Says:

      Here are some log-log plots going up to 5 \times 10^7:

      The blue dots show where a new maximum occurs; the red dot shows the final point on each graph.

    • Alec Edgington Says:

      Klas, memory seems to be the main constraint; these used between 1 and 2 GB. The program could be optimized further.

    • Klas Markström Says:

      Alec, if you want to try some longer runs after optimizing the code I have access to some linux machines with 16 Gb RAM.

    • gowers Says:

      Alec, one piece of “more data” I would very much like to see, which should be easy for you given what you’ve already produced, would be a diagram like this one but with a -10 threshold. I prefer the plots of the actual values (or rather the logs of the absolute values of the partial sum versus log n) to the plots that show records, because they give strictly more information: one can see what the records are fairly easily, but one can also discern underlying trends, such as the striking approximate linearity of the boundary of the solid blue part of the picture.

      My very slight cause for optimism is that in that diagram there seem to be three regions, each of which displays slightly different activity. One could I suppose argue that what happens up to 20 or so is not to be taken all that seriously, and that the difference between what happens up to {10}^{4} and what happens beyond {10}^{4} is not as significant as it looks, in which case we would seem to have n^{1/4} growth. But we might be very lucky and find that after a bit a new type of slower growth took over.

      The reason I’d be interested to see the -10 log versus log plot is to see if there is again some discernible linearity in the underlying trend (rather than in the outliers, where there isn’t), and if so how the gradient compares with 1/4. I am of course hoping that it will be smaller (or rather, that it will start out like 1/2, since to begin with we get the Liouville function, and will then drop to something smaller). But I’m not holding my breath …

      It must be possible to upload images to the wiki, and it would be very good to do so, but I’m afraid I don’t know how if you don’t (by which I mean that if you don’t find it obvious then it’ll be beyond me).

    • Alec Edgington Says:

      Here is a plot showing all values obtained with a threshold of -10:

      It seems that my plotting engine is consuming a lot of the memory, which is why I can’t get plots like this for higher N. There must be some way to get it to downsample…

    • Alec Edgington Says:

      Klas, the code (in Python, using the pylab module for plotting) is very short: pasted below. It would be good to see how far you can get with it. Time is O(N \log N), so I don’t think that will be an issue.

      import pylab as pl
      N = 5000 # limit of sequence
      C = -10 # set f = +1 when sums get less than or equal to this
      plot_name = 'loglogplot' # save with this filename
      x = [0]*(N+2)
      # generate sequence x
      x[1] = 1
      n0 = 2
      tot = 1
      while n0 <= N:
        x_n0 = 1 if tot <= -10 else -1
        for m in range(1,N/n0+1):
          if x[m] != 0:
            x[m*n0] = x[m] * x_n0
        while x[n0] != 0:
          tot += x[n0]
          n0 += 1
      # convert x to abs(sum(x))
      tot = 0
      for n in range(1,N+1):
        tot += x[n]
        x[n] = abs(tot)
      # make plot
      pl.title('log-log absolute partial sums, N = %d, C = %d' % (N,C))
      pl.savefig(plot_name + '.png')
    • gowers Says:

      OK, I am now fairly convinced that we’re running up against some kind of universality here and will be hard pressed to do better than a growth rate of n^{1/4} by these methods.

      The fact that we seem to be able to get n^{1/4} is still of some theoretical interest, however, since it tells us that aiming for n^{1/2} in the non-character-like case would be aiming for something too strong.

      I can’t help thinking of other variants that it would be good to try. Let me see if the following idea persuades anybody.

      The heuristic behind the greedy algorithm was supposed to be that adding a drift towards the origin at a rate of 1/k to a random walk should tend to keep it confined to within k most of the time, so if we can do what we want at primes then we ought to be able to produce logarithmic growth. The flaw in the argument is that if the walk ceases to be random then this drift term is (i) no longer strong enough and (ii) reinforces the non-randomness.

      Here is a revised heuristic. The Liouville function seems to behave like a random function, so we may gain something if we have a bias towards negative values. Indeed, in a small way we have already gained that by setting thresholds to be smaller than zero for when we choose f(p)=1. But we could try various other methods to see whether they do better. Here are a few.

      (i) If the partial sum up to p-1 is at least -A then set f(p)=-1. Otherwise, set f(p)=1 with probability \alpha. (It’s not obvious that we don’t want A=0. It seems likely that we want \alpha>1/2, but even this probably can’t be taken for granted.)

      (ii) If the primes are enumerated in increasing order as p_1,p_2,\dots, then let f(p_n)=-1 except if n is a multiple of (a smallish) k, in which case choose f(p_n) according to the usual greedy procedure (with a threshold of -A, which again might perhaps be zero).

      The thought behind these is that if we apply a greedy algorithm at fewer primes, or with a smaller probability, then we will disrupt the randomness of the Liouville function less, but will also make enough changes to add a substantial drift term. It probably won’t work, but it feels worth trying. Even if it gave n^{1/4} again, it would still be interesting as evidence that we have some kind of universal law going on (which it would be very good to explain heuristically).

    • gowers Says:

      It occurs to me to add that a more serious problem with the heuristic picture may be related to what Alec pointed out recently. He drew attention to a graph of the partial sums of the Liouville function that can be found on the Wikipedia page for that function. Although the peaks of the graph grow like the square root of n, the graph does not resemble a random walk all that closely — its oscillations are too regular.

      I’m tempted to ask the following question in response to that fact: is there some way of choosing a multiplicative function that makes its partial sums behave more like a random walk than those of the Liouville function? What about, for example, a random multiplicative function (where you choose f(p)=1 with probability \alpha)? A plot of one or two of those might be enlightening. If they look fairly random, then maybe the algorithms I suggested above could be modified so that the role of the Liouville function is played by a more random function.

    • Alec Edgington Says:

      Klas: oops, please replace

      tot <= -10

      in that code with

      tot <= C
    • Klas Markström Says:

      Alec, the large memory machines are purely for batch runs so I can’t do graphics on them. Maybe you could modify the program to save only consecutive maxima and minima in a file, or something similar which doesn’t grow too fast? Then I could run the program as far as possible and post the files.

      My replies are a bit low this week due to a course I have to attend.

    • Alec Edgington Says:

      Hi Klas,

      This modified version will just save the coordinates of each new maximum in a file.

      N = 5000 # limit of sequence
      C = -10 # set f = +1 when sums get less than or equal to this
      file_name = 'max_sums_%d_%d.txt' % (N,C) # save with this filename
      x = [0]*(N+2)
      # generate sequence x
      x[1] = 1
      n0 = 2
      tot = 1
      while n0 <= N:
        x_n0 = 1 if tot  t_max:
          ff.write('(%d, %d)\n' % (n, ta))
          t_max = ta
    • Alec Edgington Says:

      Klas, this modified version will just save the coordinates of each new maximum to a file:

      N = 5000 # limit of sequence
      C = -10 # set f = +1 when sums get less than or equal to this
      file_name = 'max_sums_%d_%d.txt' % (N,C) # save with this filename
      x = [0]*(N+2)
      # generate sequence x
      x[1] = 1
      n0 = 2
      tot = 1
      while n0 <= N:
        x_n0 = 1 if tot  t_max:
          ff.write('(%d, %d)\n' % (n, ta))
          t_max = ta
    • Alec Edgington Says:

      Hm, WordPress seems to be rejecting code snippets now. The modified program is here:


      This will just save the coordinates of each new maximum to a file.

    • Klas Markström Says:

      Alec, here are the results of a few runs on my local machine

      For larger cases I have to wait until one of the batch machines frees up.

    • Alec Edgington Says:

      Interesting, looking at the graphs I can’t decide whether the growth is more likely to be n^{o(1)} or \sqrt{n}

      We’ll see what the big machines come up with.

    • gowers Says:

      I’ve slightly lost track of what this graph is. I agree it’s hard to tell what the function underlying it is.

      I’d be very interested if it was possible to plot the “robust maxima”, that seemed to show up on some of the other graphs and behave very linearly (on the log versus log scale). One way it might be possible to do that would be to pick, for each n, an interval about n of some suitable width (perhaps it would have to be of width proportional to n, but if that got too big one could do random sampling), then draw a little graph of the percentage of m in the interval such that the partial sum up to m is at least R, and then identify an S such that as R approaches S from below this percentage decreases with a noticeably negative derivative, but after S the derivative is almost zero (because the function itself is almost zero).

      I’m not sure how clear that was, or how easy it would be to implement, but my feeling is that this “robust maximum” is likely to be given by a smoother function that can be more easily identified. And it would then give us very strong evidence for a lower bound for the growth rate.

    • Alec Edgington Says:

      It is a superposition of three log-log graphs (superposed because their behaviour was quite similar), corresponding to thresholds of -5, -10 and -20. (When the partial sum is at or below the threshold we set f(p) = +1). All three runs were up to 10^8.

      I agree it would be good to have more detailed plots showing the sections of linear-looking growth; perhaps it would be as simple as plotting the local variance of the partial sums with respect to an exponentially increasing window. I’ll see if I can knock together some code to do this.

    • Alec Edgington Says:

      Sorry, I should have said that the points indicate where a new maximum in the partial sums occurs (in any of the three runs).

    • Klas Markström Says:

      Alec, I have added some new data.

      I suggest that you should use different colours if you plot different runs in the same figure.

    • Alec Edgington Says:

      Klas, could you change the access permissions please?

    • Klas Markström Says:

      Done. I was in a hurry, between meetings, when I scp:d them over earlier and forgot to do that.

    • Alec Edgington Says:

      A couple more plots.

      The first is based on Klas’s latest upload, showing maxima up to 5 \times 10^8 on a log-log scale with a threshold of -10:

      Andthe second goes up to 10^8, with the same threshold, also on a log-log scale, but is an attempt to capture the ‘local’ maximum of the square of the partial sum (so divide gradients by two when comparing): each point is the maximum over a window; there are 100 non-overlapping windows of geometrically increasing size:

  6. Alec Edgington Says:

    PS. I added a plot for a -50 threshold:

  7. Guy Srinivasan Says:

    Each time we set a value at a prime to +-1, it seems to me we’re doing something like adding the previously determined partial sums stretched by a factor of the prime, and again stretched by the prime squared, etc. When x2 gets set to -1 our initial partial sums look periodic, and so everything after that’s going to be adding periodic functions. Can we look at functions that are the partial sums up to n as if only the primes up to some prime have nonzero values?

    Define F1(n) = 1 if n=0, 0 otherwise.
    F_2(n) = F_1(n) + (-1)F_1(\frac{n}{2}) + (+1)F_1(\frac{n}{4}) + (-1)F_1(\frac{n}{8}) ... = \frac{1}{2} (cos(\pi \frac{ln(n)}{ln(2)})+1)

    F_k(n) = \sum_{j=0}^{\infty} x_k^j F_{k-1}(\frac{n}{p_k^j})

    and with any luck that will collapse to some form we can deal with and the limit of F_k’s will converge to something we can deal with.

    Or if it turns out to not be easy in general, maybe at least with the \lambda_3 sequence?

    F3(n) with x3=+1 doesn’t converge… but it’s maybe not terrible when taking the sum (of the contributions of the powers k of the prime) to m-1 instead of infinity. I get F3(n) “!=” \frac{1}{2}(m+cos(\pi \frac{ln(n)}{ln(2)}) \frac{1-e^{-i \pi \frac{ln(3)}{ln(2)} m}}{1-e^{-i \pi \frac{ln(3)}{ln(2)}}})

  8. Ian Martin Says:

    Here’s a very small extension of the plot for the algorithm that’s constrained to remain positive I showed before.

    With my computer skills I’m not going to be able to extend this much further – I could probably leave my laptop working away for 24 hours and not get anywhere near the plot Alec could generate in an hour! (It would also be good to have an independent confirmation that the code which generated this plot is correct, since I’m somewhat less confident in it…)

  9. Alec Edgington Says:

    A little remark on the nature of the long discrepancy-2 sequences we’ve found. One way to express the pattern noted here is that the sequences frequently coincide with a linear combination of a small number of multiplicative sequences taking values in \mathbb{T}. This comes from taking the Fourier transform of \theta. In particular, they coincide at most 5-smooth values with a particular linear combination of three multiplicative sequences whose values are sixth roots of unity. This adds some weight to the idea (also suggested by Terence’s Fourier-reduction approach) that we perhaps ought to study the problem for \mathbb{T}-valued multiplicative sequences.

    • gowers Says:

      A further little remark on this topic. A feature of multiplicative functions to {-1,1} is that there are some rather crude dependences: if ab is a perfect square, then f(a)=f(b). This will probably place quite a big constraint on the growth rate of the partial sums (not that I can prove anything). But the dependences for complex multiplicative functions are a bit subtler, since perfect squares do not have to map to 1. It could be that this gives them a small advantage and explains why we get quasimultiplicativity appearing in the length-1124 sequences.

  10. Terence Tao Says:

    I’ve been thinking a little more about the idea of taking limits along translates f(x+n_j) where n_j is increasingly divisible. One thing this does for us is turn a 1D sequence with drift bounded by C into a 2D (or even an infinite-dim) sequence with drift bounded by C. For instance, starting with a 1D sequence f(x) of drift bounded by C, consider for each j the 2D sequence

    (x,y) -> f( x + n_j y )

    and take a subsequence limit to obtain a new function f(x,y), which has drift bounded by C along rays through the origin in the sense that |f( ax, ay) + … + f(bx, by)| <= C for all a,b,x,y, but one also has a new drift property in the horizontal direction in the sense that |f(ax,y)+…+f(bx,y)| is less than C.

    One can extend this procedure to higher dimensions. For instance, one can take a subsequence limit of the functions that take (x,y,z) to f(x+n_j y + n_j^2 z) and now we can bound |f(ax,ay,az)+…+f(bx,by,bz)|, |f(ax,ay,z)+…+f(bx,by,z)|, and |f(ax,y,z)+…+f(bx,y,z)| all by C. And indeed one can get a sequence on Z^omega (the union of the {\Bbb Z}^k for all k) which has all sorts of bounded drift properties.

    It’s tempting to try to hit this infinite-dimensional sequence with a Fourier reduction and/or a powerful Ramsey theorem (not quite Hales-Jewett – that would be amusing – but perhaps something like Hindman or Carlson-Simpson). I haven’t tried this yet.

    • Terence Tao Says:

      OK, it seems that this infinite dimensional extension of the function does not, in fact, gain very much. If f(x) has discrepancy at most C, then the function f(x,y) defined by f(x) when x is non-zero, and f(y) otherwise, has bounded discrepancy at most C+1 at the 2D level; similarly, f(x,y,z), defined as f(x) when x is non-zero, f(y) when x=0 and y is nonzero, and f(z) when x=y=0, also has discrepancy C+1, and similarly in higher dimensions – in all cases the discrepancy amplifies from C to C+1 at best, leading to no asymptotic advantage for the EDP problem. The point is that this model is not capturing the very long intervals that lead to log divergence for things like \mu_3. Ah well, back to the drawing board…

  11. Gil Kalai Says:

    Regarding the Fourier reduction. One side question is: if we take M to be a power of 2 and look at the Walsh expansion of (Z/MZ)^d it will also work, right?

    Another question is: It looks that if F is expressed as a complicated combination of multiplicative functions, or, in other word, if the coefficiets \hat F(\xi) are all small, then the argument is very “wasteful” and that if you start with additive combination multiplicative functions (even not to {-1,1} but, say, with some 0s in the range), the discrepency behaves almost additively (without cancellation).

    Is it of any interest to look at the Fourier description of our basic (-1)^(last non zero ternary digit) function and its expansion in term of “simpler” multiplicative functions?

    Somehow the Fourier expansion we to study the additive perodicity (like Terry’s comment above) is different than the one used in Terry’s proof.

    Very vaguely, it “feels” that we need a quantitative statement (along with a direct connection to discrepency) extending the trivial fact that if we have a sum of a finite number of periodic functions, so that each function vanishes on integers divisible by its periods then the sum also vanishes somewhere. I wonder if there is a complicated f.t. proof to this trivial statement.

    • Gil Says:

      (The question in the second paragraph is based on missing something in the Fourier proof…)

    • Gil Says:

      Here is some related question which, on the one hand, can be easier, but on the other hand it or an argument for proving it may be useful. (Yet on the third hand it may be nonsense):

      Let us consider a function from Z to {-1,0,1} (the definitions may continue to make sense when the range is C). Now lets consider the following measure for the discrepency in some large interval: for every HAP d 2d 3d …, rd we take (y_d+y_2d+…+y_rd)^2|y_rd|^2 and then we average over all HAP.

      So in other words, in our case when the values are -1, 0 and +1 we just assign discrepency 0 to an HAP whose last term is 0.

      Now consider a function so that the support has density t and we think of t as small. (the support refers to non zero entries.) So the average discrepency, as we defined it, should be at least t or so. The question is if we can show that the average discrepency is at least t log t.

      The role model: pieces of our matriushinka function: 0 for numbers not divisible by 3^k and then 0 1 or -1 according to the first non zero digit in the ternary expansion.

    • Gil Says:

      Just to make it clear, the role model is: x_k = 0 if k is not divisible by 3^k Otherwise, x_k is the value of the kth digit in the ternary expansion. This is a periodic function and the density t of the support is (2/3)3^{-k}.

      The reason we get t log t in our measure of discrepency is that while we can expect that the density of the sequence restricted to a HAP is t, in 1/3 of times it will be 3t and 1/9th of times it will be 9t etc. The question is if there is no way around this log t. of course if we take a random +1 -1 0 sequence of density t, a restriction to a HAP will not have much larger density but in this case the discrepency itself will be larger.

  12. Kristal Cantwell Says:

    I have been working on the human proof that a sequence with discrepancy 2 must have length at most 246. There was one case left f(2)=f(3)=f(5)=-1 and f(7) = 1. I think I have a proof of this. Let me outline it f(98)=f(99) =-1 which means there is a cut at 98 so the sum is 0 at 98. f(100) =1 so the sum is zero at 100. f(110)=f(111)=-1 so there is a cut at 110 so at 110 the sum is zero. This means that the sum of the numbers 101 to 110 is zero which means three of 101,103,107 and 109 must have positive value with the third negative.

    If we look at 200 201 202 and 203 we see that 200 and 201 have value -1 which means there is a cut at 200 and if 202 is negative then since 203 is negative the sum at 203 will be negative 3 so 202 must be positive and hence 101 must be negative and 103,107 and 109 must be positive.

    Now look at 206 through 215. Since 103 is positive 206 is negative and since 207 is negative there is a cut at 206 but the sum from 207 to 215 is negative 3 and we have a contradiction.

    I putting this in the wiki and updating it with some stuff in the thread EDP5 which is not up yet.

    • Kristal Cantwell Says:

      I have updated the wiki.

    • Kristal Cantwell Says:

      I am still working on the wiki which is here:


      I have been trying to get the proofs that there is no sequence with discrepancy 2 or less longer than 247 when f(2)=-1 and f(5) equals 1.

      That involves trying to expand a computer generated proof.

      So that involves expanding out something like this:

      Assume f(5) = 1

      Suppose f(11)= 1

      Setting f(2)=-1… Setting f(5)=1… Setting f(11)=1… Deducing… Setting f(3)=-1… Deducing… Setting f(7)=-1… Setting f(241)=1… Deducing… Setting f(23)=-1… Setting f(17)=-1… Setting f(19)=1… Deducing… Setting f(13)=-1… Setting f(47)=-1… Setting f(59)=1… Setting f(67)=-1… Setting f(239)=1… Setting f(251)=1… Setting f(31)=1… Deducing… Setting f(29)=-1… Setting f(53)=-1… Contradiction! f(1) = 1 and f(1) = -1 (from f(58)=-1)

      In this particular case I have had problems. I note that the program is going out of the 247 range to 251 at one step. I think I can prove this though but it may take more work.

      I am not sure there is a simple proof for the the sequence having length less than 247. There seem to be many long sequences and each of them is a case to be dealt with.

    • Uwe Stroinski Says:

      This morning I ran my prover under the premise that only conditions on the first 247 numbers are used. No simple proof was found.

      In case you are interested I can post the completed cases one by one. They are rather long and messy though.

    • Jason Dyer Says:

      I took just these deductions, before the fatal one:
      Assume f(5) = 1

      Suppose f(11)= 1

      Setting f(2)=-1… Setting f(5)=1… Setting f(11)=1…
      Deducing… Setting f(3)=-1… Deducing… Setting f(7)=-1…
      Setting f(241)=1… Deducing… Setting f(23)=-1… Setting f(17)=-1…
      Setting f(19)=1… Deducing… Setting f(13)=-1…

      and filling in further multilpicative values and checking by hand, I get a contradiction with a sum of 3 at the 25th position:

      0 1 2 3 4 5 6 7 8 9

      0 + – – + + + – – + 0-9
      – + – – + – + – – + 10-19
      + + + – + + ? ? ? ? 20-29

    • Jason Dyer Says:

      and of course I made a typo. Nevermind.

      I will continue working by hand to see if there’s some tricky logic step that will work here.

    • Jason Dyer Says:

      At the very least, you can get f(23)=-1 without jumping to f(241).

      Assume 5+, suppose 1+. Then 3-, and 2- and 3- make 6+.
      6+ forces 7-.
      Adding in multiplicity:

      0 1 2 3 4 5 6 7 8 9

      0|+ -|- +|+ + – -|+ 0-9
      -|+ -|a + – + b – c 10-19
      +|+ -|? +|+ A – – ? 20-29
      + ? – – B – + ? C A 30-39

      a and b can’t both be -.
      Suppose a and b are +. Then c is -.
      Suppoose a+ and b-, then c+.
      Suppose a- and b+, then c+.

      This represents three posibilities for abc: –+, -+-, and +–.

      Suppose -+-:

      0|+ -|- +|+ + – -|+ 0-9
      -|+ -|- + – + + – – 10-19
      +|+ -|- +|+ + – -|? 20-29
      +|+ -|- – – + ? + + 30-39 (impossible)

      The other two:

      abc –+
      ABC ++-
      0|+ -|- +|+ + – -|+ 0-9
      -|+ -|- + – + – – + 10-19
      +|+ -|- +|+ + – -|? 20-29
      + ? -|- +|- +|? – + 30-39
      – ? – ? +|+ + – -|+ 40-49

      abc +–
      ABC -++
      0|+ -|- +|+ + – -|+ 0-9
      -|+ -|+ + – + – – – 10-19
      +|+ -|- +|+ -|- – + 20-29
      +|+ -|- +|- +|? + – 30-39
      – ? – ? +|+ + – -|+ 40-49

      Both branches force 23 to be -.

    • Jason Dyer Says:

      …but of course 23 gets forced earlier without the branching. I think I’m going to stop now before I lose my marbles.

      My next step though would be to take the two branches to contradiction.

    • Kristal Cantwell Says:

      I would like to see the cases from the prover. If they are too long to post here then you could post them somewhere in the wiki a just give the URL here. Once you get f(11), f(3) and f(7) =-1. Then f(20),f(21) and f(22) are 1 and there is a cut at 20, so the sum at 20 is zero and if f(23) is 1 then the sum at 23 would be 3 so f(23) must be -1.

    • Uwe Stroinski Says:

      I have put some stuff on the Wiki directly behind the quoted part. It is not polished and not complete. Maybe we can use some human input to fix some further primes thereby reducing the number of cases to come. My program is not completely automatic and operating it further requires some work.

    • Kristal Cantwell Says:

      Thank you for the cases you posted. I found it on the wiki. I think I may be able to do something with it and if not prove the entire theorem at least fix some more primes.

  13. Uwe Stroinski Says:

    Towards an efficient proof for the non-existence of a multiplicative function (of length 247 or infinite) with discrepancy 2.

    We might learn something from tweaking the constants in our log-growth examples. Remember {\mu_3(i)} is 1 if the last digit in the ternary representation of i was 1 and it is -1 else. Let {\mu_3[i]:=\sum_{i=1}^n\mu_3(i)} denote the sum. We have {\mu_3[0]=0,\mu_3[1]=1,\mu_3[2]=0} and the recursion

    {\mu_3[3^{m-1}+i] = \mu_3[i] + 1,}

    {\mu_3[2\cdot 3^{m-1}+i] =\mu_3[i]}

    for {0\leq i\leq 3^{m-1}-1, 2\leq m}. {\mu_3[\cdot]} is positive, self-similar (btw nice picture) and satisfies

    \textstyle\max_{0\leq i\leq 3^m}\mu_3[i]=\mu_3\left[\frac{3^m-1}{2}\right]=m.

    The number of 1s (-1s) in {\mu_3} has asymptotic density 1/2. This is necessary for EDP. Are there conditions we can add to make this sufficient?

    If we modify {\mu_3} by an extra factor -1 if the number of zeros behind the last digit is odd, we still get logarithmic growth. Can we alter {\mu_3} further to get better constants?

  14. Snow and Polymath5 « Random Thoughts Says:

    […] jams I was not lazy and added up +1’s and -1’s to get an idea of what is going on here: Polymath5 project. The Erdös discrepancy problem is easy to formulate and really hard to solve. Give it a […]

  15. Klas Markström Says:

    When I was waiting for a bus yesterday, and happened to have my laptop at hand, I decided to take a look at the discrepancy of all multiplicative functions up to an integer n, rather than tryingout various algorithms for generating random samples from this space.

    Let Prime(k) denote the nth prime number.

    I fixed f(1)=1 and then for each integer k \leq 2^n-1 constructed the multiplicative function $f$ which is f(prime(i))=  (-1)^{digit(k,i)}. Here digit(k,i) is the ith binary digit of k, with the convention that the nth digit is the one which correspond to 2^0 The reason behind this convention is that it gives more structure to the pictures

    For each function f I have plotted the base 2 logarithm of the maximum of the modulus of the partial sum of f(x) for x \leq Prime(n)+1. So the value at x=10 in the figure is the base 2 logarithm of discrepancy of the function derived from the digits of the number 10. With the convention for digits mentions earlier all functions up to 2^{n-1} will have f(2)=(-1)^0, and the function corresponding to 0 is the function which is constant 1.

    Here is the plot for n=10





    For large n is not possible to test all function but one can sample them. Here is a plot based on sampling for n=200

    and one for n=500

    For small n one may also plot the correlation between the number of 1s in the binary expansion of i and the discrepancy of the ith function

    Here the constant fuction 1 is the extreme on the left and the Liouvilel function the extreme on the right

    Likewise one may plot the same correlation for the number of runs of digits instead

    Here the constant function and the Liouville function become the leftmost functions and the two extreme functions on the right are the two functions which are alternatingly 1 and -1 on the primes larger than one (Which I mentioned in an earlier post)

    • Klas Markström Says:

      “m of the maximum of |f(x)] for ” that should of course be the maximum modulus of the partial sum of f(x)

    • Alec Edgington Says:

      In relation to Klas’ plots I thought it would be interesting to look at the overall density and cumulative distribution of the maximum modulus of the partial sum, over all completely multiplicative functions up to N. Here are the plots for N=80:

      The plots illustrate the distribution of the logarithm of the maximum absolute sum divided by \log N. The shapes seem to be pretty much independent of N. In particular I noticed that the cumulative curve becomes almost vanishing below \frac{1}{4}. One might interpret this by saying that ‘almost all’ multiplicative functions exhibit growth of at least n^{\frac{1}{4}}, which is interesting in light of some of our previous observations for greedy algorithms.

    • Alec Edgington Says:

      PS. Having looked at some distribution graphs of random samples from longer random multiplicative sequences, such as this one (N=10^5, sample size 1000), I see that I was too hasty in claiming that the shape was pretty much independent of N. The distribution seems to get more concentrated around \frac{1}{2} (corresponding to \sqrt{n} growth) as N gets bigger.

    • Klas Markström Says:

      I believe that asymptotic bound is one of the results in the paper mentioned here

      I have only the read the review of the paper, but it would be interesting to look at the paper itself.

    • Alec Edgington Says:

      Ah yes. I haven’t found a copy of that paper either, but I did find this:

      Click to access sumy.pdf

      According to this paper (see remark following Theorem 3.2), if a_n is a random \mathbb{T}-valued multiplicative function (that is, the values a_p are independently and uniformly distributed on the unit circle), then, almost surely,

      \sum_{n \leq N} a_n = O((N \log N)^{\frac{1}{2}} (\log \log N)^{\frac{1}{2} + \epsilon}))

  16. Klas Markström Says:

    I think my long post ended up in the moderation queue. The Mathematica notebook for making the plots, which you hopefully will soon get to see, is available here

  17. Gil Kalai Says:

    Trying to keep track where we are (without yet digesting all the arguments) I wonder about the following statement from the wiki on the strategy described in the post above.

    “This strongly suggests that we might be able to prove a result that applies to all functions f that satisfy some Fourier condition. The condition should tell us that f does not correlate with residue classes modulo small primes, so it should tell us something like that \hat{f}(\alpha) is small whenever α is close to a rational with small denominator. Intuitively, these are the functions that ‘can tell the difference between HAPs and ordinary APs’.”

    My question is: Suppose we know that functions which do not correlate with every residue class modulo small primes (or small integers) have large discrepency, will the result about functions that have large correlation with a Direchlet character suffice to finish all cases?

    (Or, say, all completely multiplicative cases, which are more or less sufficient.)

    • Gil Kalai Says:

      Another question (again, this may already been discussed): moving from a function f(1),f(2),…,f(N) to g(1)=f(1), g(2) f(1)+f(2), …, g(N)=f(1)+…+f(N) looks very similar to moving from f(x) to g(x)=x\int_0^xf(t)dt so maybe there is a simple relation between fourier coefficients of g to those of f.

    • gowers Says:

      Gil, I thought I’d let you know that I’m still very interested in this line of attack. I’ve tried a few calculations privately (the kind of thing that is quite hard to put into a blog comment) and not made any serious progress. However, it might be worth trying to collect my thoughts and write something so that others can look at it and perhaps make better suggestions. I’ll try to find the time to do that.

    • Gil Kalai Says:

      There is the following informal suggestion.

      If f(1)+f(2)+…+f(n) is bounded for every n by a constant C then f is “roughly” periodic with a period depending on C.

      We would like to have some measure, perhaps based on the Fourier expansion, for what “roughly periodic” is.

      The following (non multiplicative) function was already mentioned and I think it will be interesting to understand in what sense the informal suggestion is correct for it.

      Divide the integers to blocks of 12 (say; blocks of two are fine too) and in each block take at random 6 +1’s and 6 -1’s.

      Is this function roughly periodic in some sense? Is this detected in some way by the Fourier coefficients?

      (I vaguely think that there is some sort of trichotomy: a) “roughly periodic” b) Fourier coefficients \hat f(t) are supported on t’s corresponding to small periods c) Fourier coefficients leak to “large periods” where they become irrelevant. So the function above seems to be in class c). )

  18. gowers Says:

    An interesting question raised by Sune (if I remember correctly) is whether if f is a bounded multiplicative function and g is the same as f except that you change it at one prime, then g must be unbounded. Here is a rather weak (but easy to prove) result in that direction. Let f be a complex multiplicative function with bounded partial sums, let p\geq 3 be a prime, and let C be a constant. I shall show that it is possible to change the value of f(p) so as to make a function g with a partial sum that is at least C.

    To do this, I define a function F as follows: F(n) is the partial sum of f up to n except over multiples of p. Equivalently, it is the partial sum of the multiplicative function h that is obtained from f by changing the value of f(p) to zero.

    Note that

    \displaystyle \sum_{x\leq n}f(x)=F(n)+f(p)F(\lfloor n/p\rfloor)+f(p)^2F(\lfloor n/p^2\rfloor)+\dots.

    Now for any k we can easily choose n such that F(\lfloor n/p^r\rfloor) has modulus at least 1/2 for every r\leq k and is zero when r>k. To do this, we observe that F(1) has modulus 1, that at least one out of F(p+1) and F(p+2) has modulus at least 1/2 (all the time my assumption is that |f(x)|=1 for every x), that if a is 1 or 2, then at least one of F(p^2+ap+1 or F(p^2+ap+2) has modulus at least 1/2, and so on. In this way, we build up a number p^{k-1}+a_{k-2}p^{k-2}+\dots+a_1p+a_0 such that every time you divide by p and take the integer part you have a number x such that |F(x)|\geq 1/2, as desired.

    But now, by Parseval, if you take the mean square modulus of

    \displaystyle \sum_{x\leq n}f(x)=F(n)+f(p)F(\lfloor n/p\rfloor)+f(p)^2F(\lfloor n/p^2\rfloor)+\dots

    over all possible choices of f(p)=\exp(2\pi i\theta), then you get at least k/4, which implies that there exists a choice of \theta for which you get at least \sqrt{k}/2.

    At the time of writing I don’t see an obvious compactness argument to get a single \theta that works for every C. However, here is a sketch of a not completely obvious and not completely checked argument.

    By continuity, the proof above gives not just a single \theta, but an interval of values of \theta that work (if I choose k such that \sqrt{k}/2\geq 2C, say). The problem I then face is that to apply Parseval I need \theta to run over the entire circle. But if we just let \theta run over an interval, then on the other side we get not the \ell_2 norm of all the F(\lfloor n/p^k\rfloor) but rather the \ell_2 norm of the convolution of that sequence with the Fourier transform of that interval (though we could take a nicer function supported in that interval, if e.g. we wanted a Féjer kernel rather than a Dirichlet kernel). But it seems to me (and this is what I have not checked) that we have enough flexibility to ensure that the \ell_2 norm of such a convolution can be made arbitrarily large: at each stage we have the chance to add one of two vectors in \ell_2 that have a distance apart that is bounded away from zero. I don’t quite see it yet, but am trying to write this straight out of my head. Anyhow, it seems to me that something like this should work.

  19. gowers Says:

    I have a strategy for thinking about multiplicative functions. It’s not so much a strategy for how to prove things as for how to attempt to prove things, but it still feels moderately promising. (I’m not using “moderately” in the mathematician’s sense of “very” but in its usual sense — roughly half way between a complete waste of time and something to get excited about.)

    Let me set up some notation. I’ll fix a multiplicative function f, and since I need a bit of generality in the discussion I’ll say that the values it takes belong to the set \{-1,0,1\}. (The point is that I’m interested in \pm 1-valued functions, but it is instructive to think about why the proposed argument fails for Legendre characters.) Given such a function, let us define F(n) to be the partial sum \sum_{x\leq n}f(x), and for any prime q, let us define G_q(n) to be the “miss-out-q partial sum” \sum_{x\leq n,(x,q)=1}f(x).

    Then we have the identity (which I have also mentioned in various other comments)

    \displaystyle F(n)=G_q(n)+f(q)G_q(\lfloor n/q\rfloor)+f(q)^2G_q(\lfloor n/q^2\rfloor)+\dots.

    The expression on the right-hand side looks like an infinite sum but of course it isn’t really, since all the terms are zero once q^k is larger than n.

    The general strategy at this point is to try to choose a value of n for which the above expression is large. Intuitively, it feels as though this ought to be possible, since we can proceed as follows. We try to pick a sequence a_0,a_1,a_2,\dots of integers between 0 and q-1, with a_0\ne 0, in such a way that for some k the sum

    \displaystyle f(q)^kG_q(a_0)+f(q)^{k-1}G_q(a_0q+a_1)+\dots

    \displaystyle \dots+G_q(a_0q^k+a_1q^{k-1}+\dots+a_{k-1}q+a_k)

    is large. Since for each i we have q choices for the number a_0q^i+\dots+a_i (given the choices we have already made for a_0,\dots,a_{i-1}), we ought to be able to do something simple like making the value of G_q as big as we can when i is even and as small as we can when i is odd (here I’m imagining that f(q)=-1 but one can also look at the case where f(q)=1, in which case one might consider trying to maximize G_q at every stage).

    Needless to say, a simple-minded strategy of this kind doesn’t work. However, it seems to me that the non-working of this strategy ought to place some pretty severe constraints on the function G_q, which is what interests me about the argument.

    To get some idea of what these constraints might be like, it helps to look at examples where we know that the strategy must fail. It is here that it is useful to allow f to take the value zero. Let’s consider the example where f is the function 1, -1, 0, 1, -1, 0, 1, -1, 0, …, otherwise known as the Legendre character mod 3. Since this function is completely multiplicative and has all its partial sums equal to 0 or 1, the argument is certainly guaranteed to fail. But how does it fail?

    Let us begin by taking q=2, a case I’ve already mentioned on this blog. In that case, G_q goes like this (starting with the value at 0): 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, … In other words, it is 1 if x is congruent to 1, 2, 3 or 4 mod 6 and 0 otherwise. We also have f(q)=f(2)=-1. So we now try to build up a sequence of numbers b_0, b_1, b_2,\dots, with the constraint that each b_i is either 2b_{i-1} or 2b_i, and we try to make G_q(b_0)-G_q(b_1)+\dots+(-1)^kG_q(b_k) large. (In terms of my earlier notation, b_i=a_0q^i+\dots+a_{i-1}q+a_i, q=2, and each a_i is 0 or 1.)

    To analyse this situation, let us form a directed graph on the set \{0,1, 2, 3, 4, 5\}. We have an edge from r to s if and only if it is possible for b_i to equal r mod 6 and b_{i+1} to equal s. The condition for this is that s should be 2r or 2r+1 mod 6. So 0 is joined to 0 and 1, 1 is joined to 2 and 3, 2 is joined to 4 and 5, 3 is joined to 0 and 1, 4 is joined to 2 and 3, and 5 is joined to 4 and 5.

    Now let us partition the vertices into the following three sets: A=\{0,5\}, B=\{1,4\}, and C=\{2,3\}. Note that every edge from B goes to C, and no edge from C goes to itself. This means that if we write out a sequence of As, Bs and Cs according to which set b_i falls into, we have the rules that B must always be followed by C, C can never be followed by B, and there is a third rule that I have not mentioned, which is that A can never be followed by C. This means that the sequence will look something like this: BCBCAAABCABCBCBCBCAAAABC … In other words, it is an arbitrary sequence made out of As and BCs.

    How does this correspond to the sum we are trying to evaluate? Well, we go along the sequence and at the ith term, if we have a B or a C then we add (-1)^i, and if we have an A then we add 0. (That is because G_q(x)=1 if and only if x belongs to either B or C.) Therefore, the sum can never be bigger than 1 in modulus.

    Because this comment is getting quite long, I’ll use a separate comment to discuss a different example.

    • gowers Says:

      The example I had in mind was what you get if you take q=7 instead of q=2 but keep f the same as above. But in the hours since I wrote the comment, I’ve come to see a bit more clearly what is happening, so I’ll try to describe the general phenomenon instead.

      As I write, I remember a serious weakness about this kind of argument, at least as so far presented, which is that it the only way that it exploits multiplicativity is when it comes to multiplication by q. It’s just possible that this could be OK, since one might start by choosing a q with some property, the choice depending on the function f. But even then it feels like too little to be considering.

      Nevertheless, let us consider a general recipe for creating bounded functions G such that it is not possible to find a sequence b_0, b_1, \dots, b_k such that each b_i is of the form qb_{i-1}+r with 0\leq r<q and |G(b_0)+\dots+G(b_k)|\geq C. (If you want an alternating sum, then that can be done too.) The idea is to build G up as follows. First choose the values of G between 1 and q-1. (We know that G(q)=G(q-1).) The difference between successive terms should always be \pm 1 when we do this, and we keep the values between -C and C. Now for each a between 1 and q-1 we choose the values between aq and aq+q-1 by taking the value of G(aq+r) to be -G(a)+h(a,r). Again, we insist that h(a,r) lies between -C and C. Next, we define G(aq^2+bq+c) to be -h(a,b)+h(a,b,c). In general, we let h be an arbitrary function defined on finite sequences of elements of the set \{1,2,\dots,q-1\} with the property that for every (a_1,\dots,a_{k-1}) the successive values of h(a_1,\dots,a_k) differ by \pm 1 as a_k varies, and then we define

      \displaystyle G(a_0q^k+\dots+a_k)=h(a_0,\dots,a_k)-h(a_0,\dots,a_{k-1}).

      Then if all values of h are confined to a set B, it follows that all values of G are confined to the set B-B. Also, if n=a_0q^k+\dots+a_k, then F(n) is equal to h(a_0,\dots,a_k)-h(a_0), which also belongs to the set B-B.

      The one other thing we must ensure when we choose the function h is that G(n)-G(n-1) is always \pm 1 when n is not a multiple of q and 0 when it is. I haven’t quite worked out a general criterion for this, but I think it is reasonably straightforward.

  20. gowers Says:

    Two small comments.

    The first is that if anyone feels like writing an easy program, then one thing I’d very much like to see is a plot on a log-versus-log scale of the partial sums of a random multiplicative function. What I want to know is whether it is noticeably different from a random walk, or whether regular oscillations magically show up.

    The second is to revisit the question of whether a multiplicative function that takes values \pm 1 if x is not a multiple of 3 and 0 if it is, and also has bounded partial sums, must be the function 1, -1, 0, 1, -1, 0, 1, -1, 0, … As Terry, and possibly others, pointed out, this looks as hard as the main question, since if you take any \pm 1-valued multiplicative function with bounded partial sums and replace its values at multiples of 3 by 0, then you will get a new multiplicative function with bounded partial sums. But if a \pm 1-valued multiplicative function with bounded partial sums has to equal 1 when x is congruent to 1 mod 3 and -1 when x is congruent to -1 mod 3, then it has to be either \lambda_3 or \mu_3, which is a contradiction since these two functions do not have bounded partial sums.

    The conclusion from this is that if there is a \pm 1-valued multiplicative function with bounded partial sums, then there is a counterexample to the question about functions that are 0 at multiples of 3, so a positive answer to that question would imply a positive answer to EDP for multiplicative functions.

    Despite this, one might argue that there is a small chance that looking at this problem could be a good approach. The reason is that at least with this problem we have an extremal example that we want to push for, whereas in the original problem we do not believe in any example at all (of a multiplicative function with bounded partial sums).

    And here is a counterargument: we would need in particular to prove that there is no multiplicative function f that’s zero on multiples of 3 and \pm 1 otherwise and has bounded partial sums, such that f(2)=1. How would we prove that? We have the problem that there is no extremal example to push for, because we do not believe that such a function exists at all. So the difficulty has not gone away after all.

    A further counterargument is that all the usual difficulties arise: we can find functions with f(2)=1 such that the partial sums grow logarithmically, we can use greedy algorithms to try to keep the partial sums small, and so on.

    • Alec Edgington Says:

      On random multiplicative functions: I have done some experiments with them and found that they looked superficially fairly similar to a normal random walk — at least, I couldn’t discern any magical large-scale oscillations. Here are two sample plots (log-log plots of the absolute partial sums of a random \pm 1-valued multiplicative function and a random \pm 1-valued function):

    • gowers Says:

      Just to add to the first point above, it would also be good to have a log-versus-log plot of an actual random walk, because it’s not completely obvious how it should look. That would make the comparison with the same thing for a random multiplicative function much easier.

    • Klas Markström Says:

      Well I already had a program for plotting random multiplicative functions so I just had to add a line which prints the figures to a file.

      You can find the plots here

      I included 5 pure random walks and 15 random multiplicative functions.

      I have taken the base 2 logarithm of both coordinates, taken \log(0)=0, and for other values used sign(s(x))\log(|s(x)|), where s(x) is the partial sum up to x

    • gowers Says:

      As ever, it is fascinating to get this kind of visual information. There seems to be a significant instability in the random multiplicative functions, since by no stretch of the imagination could one say that they all look roughly the same. But then again, the random walks look rather different too, so I’m not quite sure what to say. (I still feel that the qualitative differences in behaviour are more striking for the random multiplicative walks, though.) Perhaps it would be interesting to see whether the local behaviour of the partial sums of a random multiplicative function is roughly Brownian. One could do this by plotting such functions on ordinary linear scales between say 10^6 and 10^6+10^3. (Of course, the actual random walks could just be between 0 and 1000.)

      I still think that a very interesting sub-project would be to try to come up with some intuitive reason for the apparent n^{1/4} growth that showed up in some of the functions produced by greedy algorithms. Also, I still haven’t quite given up on the idea of getting something better by using cleverer algorithms.

      Actually, that gives me a thought. Suppose we tried a greedy algorithm but in Alec’s hexagonal lattice set-up. Perhaps we could start with f(2)=\exp(2\pi i/3), just to force the complex numbers to appear, and then define f(p) thereafter in order to minimize the modulus of the partial sum up to that point, always taking it to be a sixth root of unity. (Or we could go for all complex numbers of modulus 1, but then we would lose the exactness.) The oscillatory behaviour of the functions we have produced in the real case suggests to me that we might get some nice spirals showing up.

    • Klas Markström Says:

      I realised that by changing just two lines of code I could make the plot program much faster. I have added 15 plots of random multiplicative functions up to n=10^6.

    • Alec Edgington Says:

      I think that partial sums of a random multiplicative function should look locally Brownian if you go high enough. Given K, then for sufficiently large N, the values f(N), f(N+1), \ldots, f(N+K-1) will look like independent Rademacher random variables, since there will usually be primes p_0, p_1, \ldots, p_{k-1} such that p_i forces $f(N+i)$ and no others. (I’m waving my hands vaguely in the direction of the Erdos-Kac theorem…)

      However, perhaps the interesting question is how N depends on K.

    • gowers Says:

      I’m gradually coming to realize that at least some of what I called “qualitatively different behaviour” of different instances of random walks or random multiplicative walks is nothing of the kind: it just reflects the fact that on a log scale if you are near the origin then everything is blown up a lot. So now I’m wondering if there is some way of displaying the information that would be better. One obvious idea is just to display it with ordinary axes: exponentially slowing oscillations will still be apparent if they occur, but the local behaviour will now be the same everywhere in the random case.

      Basically I’m saying that I think I made a mistake when I suggested plotting things on a log-versus-log scale. Apologies.

    • Klas Markström Says:

      No need for apologies, I think the log-log plots actually show the periodic behaviour better and judging from the n=10^6 plots there seems to be several different types of behaviour.

      I have added 15 plots for n=10^6, with standard coordinates, to the directory.

  21. Uwe Stroinski Says:

    Let {f} be completely multiplicative, {\{\pm1\}}-valued with bounded partial sums and let {g} be completely multiplicative, the same as {f} except that you change it on one fixed prime {p}. Then {g} has unbounded partial sums.

    To see this, we count the occurrences of 1 and -1 in {g}. For {f} we have the estimates {\frac{n-c}{2}\leq | \{f(i)=f(p):1\leq i\leq n\} | \leq\frac{n+c}{2}} with {c} the bound in the discrepancy condition. This implies {\lim_{n\rightarrow\infty}\frac{1}{n}| \{f(i)=f(p):1\leq i\leq n\} |=\frac{1}{2}} for all bounded functions. For {g} the number of occurrences changes by {\sum_{i=1}^\infty (-1)^i\left\lfloor\frac{n}{p^i}\right\rfloor} counting the sign swaps. Evaluating the sum and taking limits yields {\lim_{n\rightarrow\infty}\frac{1}{n}| \{g(i)=f(p):1\leq i\leq n\} |=\frac{1}{2}-\frac{1}{p+1}\not= \frac{1}{2}}.

    That seems too easy to be true. If it was, we could try to use the inclusion-exclusion principle to get stronger results. Maybe even uniqueness.

    • gowers Says:

      I don’t understand this argument. Some of the sign swaps will go in one direction and some in another, whereas your calculation seems to be assuming that all sign swaps involving a given p^i go in the same direction. Or have I misunderstood?

    • Uwe Stroinski Says:

      You are right. Counting sign changes is not enough. One has to ensure that they are in the same direction. Ship sunk.

    • Uwe Stroinski Says:

      For completely multiplicative functions {x_{p i}}-subsequences are {f(p)} multiplies of the original sequence. Thus we have some control on the direction of flips and can use this to get a recursion for the partial sums (g and f are related by flipping f(p)):


      This puts some constraint on the possible unboundedness of the partial sum if we flip at a single prime. For example, if we flip {\mu_3(3)} we go from {\mu_3[i]\leq\log_3(2i+1)} to our (still nameless) record holder example with partial sum{[i]\leq\log_9(8i+1)} and vice versa. Can we jump from {\log}– to {x^\epsilon}-growth?

  22. Steven Heilman Says:

    I have a few questions which I hope are distinct from others that have been asked:

    Q1: Is it possible for a \pm1-valued completely multiplicative function f to have a negative average value? In [Granville and Soundararajan 2001], they show that a completely multiplicative function has partial sum (up to x) uniformly bounded below by (-.656999...+o(1))x. For fixed x, it is relatively easy to find an f which achieves this lower bound. But for a fixed f, can this lower bound hold for all x? If not, how close can one come to the lower bound?

    Q1a: Can one prove results about the time of a “record high/low” of the sum?

    Q1b: Can one prove ergodicity results about the values of f? I suppose this has been the implicit theme for some time. That is, we have chosen different weight functions over which we average the values of f (averaging of course being analogous with integration). Is there any better or more natural weight function than those that we have considered? Can stronger/weaker ergodicity results be proven?

    Q2: Suppose f has finite discrepancy. How well can it correlate with characters? This is a rather imprecise question but it is motivated by Theorem 2.1 of [BGS- Multiplicative Functions in Arithmetic Progressions]. In words, I interpret this theorem [perhaps incorrectly] as follows: f cannot be “totally uncorrelated” with all characters, because the left side is (essentially) O(1/x), but the right hand side is (1+y^2)e^{-y^2}+Error for y=y(x,f,\chi)=\{the distance from f to \chi along 1,\ldots,x\}. Though it looks nice as written, this bound may be rather ineffective.


    Finally, a trivial observation: Suppose we have a bounded, completely multiplicative function. Then, there exists N such that for all n>N, f(n) can no longer achieve any record highs or lows. In other words, after a while, f behaves better than it did for lower n. This last statement is clearly nonsensical, and I would have hoped it could be exploited in the proposed limiting arguments. However, I’m sure this (rather vague) leveraging scheme has been tried already.

  23. obryant Says:

    I know we looked at these sequences a few hundred posts ago, but I don’t recall anybody digging to the bottom. This isn’t the bottom, but maybe…

    What’s the discrepancy of the sequence x_n = (-1)^{\lfloor n \alpha \rfloor}?

    I considered the partial sums of such sequences in math.NT/0308087, but haven’t kept up with further progress on the problem.

    First with the obvious: with \alpha rational, the sequence is periodic and so has linear discrepancy.

    Less obvious: with $\alpha$ irrational, we can get the partial sums to grow aribtrarily slowly, and it’s possible logarithmic growth is typical. Here’re some details.

    Here are some details on that. Let S_N = \sum_{i=1}^N x_i, and let q_0=1, q_1, ... be the continuants (denominators of convergents) to \alpha. Then, taking p/q to be the convergent with q maximal (but strictly smaller than N), we get

    S_N = S_q + (-1)^p S_{N-q}

    Here’s an example (from math.NT/0308087), if the continued fraction of \alpha is [a_0; a_1, a_2,\dots], with a_0 odd and all the other a_i even, then all of the convergents have odd numerators, so that (-1)^p=-1. Thus, if N>2q, we get

    S_N = S_q - S_{N-q} = S_q-(S_q-S_{N-2q})=S_{N-2q}

    Further, S_{q_i} = 0 if i is odd, and S_{q_i}=-1 if i is even.

    These combine to give the following: let $q_i$ be the largest continuant not larger than N. Then

    |S_N | \leq i.

    If the continuants grow super exponentially, then this give $S_N$ being sublogarithmic. An example is

    \alpha = \frac 2{e-1}=[1;6,10,14,\dots].

    So if we could find such an $\alpha$ all of whose multiples had nice continued fractions, then we could get sublogarithmic growth. On the other hand, it’s likely that if such an \alpha existed, only the parities of the $a_i$ would be relevant, and so we could choose them to grow arbitrarily rapidly, thus achieving arbitrarily slow growth of discrepancy.

    On the third hand, this system of parities (odd, then all even) for the partial quotients is not the only system that is workable, just the only one that I’ve worked.

    • Gil Kalai Says:

      That’s very interesting! I suppose the challenge is to show that we cannot find $\alpha$ with the super exponential growth of the continued fraction expansion for each multiple. Now, I could not understand the comment about parities.

    • gowers Says:

      I’m confused. I thought examples like this had square-root growth. Here, roughly, was my argument. By the pigeonhole principle, for any m we can find r\leq m such that \alpha r is within Cm^{-1} of an even integer. But then the value of f at the first m/C multiples of \alpha r is the same, so the partial sum along that HAP is m/C, and we reach it by time \alpha mr/C\leq C'm^2.

    • obryant Says:

      You’re not confused. I was focused on getting the partial sums small, without looking yet at the other HAPs.

      So the 2 dimensional (x_n = (-1)^{\lfloor n \alpha \rfloor + \lfloor n \beta \rfloor}) version suffers from the same type of defect, but less so. Map n into [-1,1)^2 by n\mapsto (n\alpha \mod 2, n\beta \mod 2), and split [-1,1)^2 into m^2 squares with side length 1/m. By pigeonhole, there’s an r\leq m^2 with both (r\alpha,r\beta) within 1/m of an even integer. It follows then that x_{r}=x_{2r}=\dots=x_{mr}, so that \delta(m^3,x)\geq m. I’m under the impression that (excusing the constant) this pigeonhole argument gives the correct size of r, but we could perhaps get large discrepancy with a less strict r.

      Have we ruled out a sequence like

      x_n = (-1)^{\sum_i^{\infty} \lfloor {n\alpha_i} \rfloor}

      having logarithmic (or smaller) growth?

    • gowers Says:

      That’s an interesting example. It’s not obvious that one actually needs both r\alpha and r\beta to be close to an even integer for the partial sums along multiples of r to get large, however. Isn’t it enough if we just have r\alpha=2m+c and r\beta=2m'+c' with c and c' very close to each other? Then the parities of \lfloor r\alpha\rfloor and \lfloor r\beta\rfloor will be the same for a long time. But the pigeonhole argument gives that in time m rather than time m^2.

    • obryant Says:

      That seems right in principle, but I don’t think it’s enough for r\alpha,r\beta to be close mod 2. Say r\alpha=2m+2/3+\epsilon, r\beta=2m'+2/3-\epsilon. Then modulo 2 we have (\lfloor i r\alpha\rfloor, \lfloor i r\beta \rfloor) being (0,0), (1,1), (0,1).

    • gowers Says:

      You’re right, but I think one can argue like this: if ir\alpha and ir\beta are often close to an integer then either you can replace r by 2r and get them always close to an integer (when r\alpha is close to an integer plus 1/2) or the parity is the same more than 50% of the time, so you get linear growth in the partial sums. I haven’t checked that carefully, but the more general point I think holds (as you also seem to be suggesting) that a slight tweaking of the argument should be enough.

    • obryant Says:

      What I meant by “right in principle” is that asking for r\alpha,r\beta to both be close 0 mod 2 is asking for much more than we need. I still don’t see what we need to get square-root growth, though. I do agree that there most be *some* tweak that finishes this.

      We can’t guarantee that the first case (they are close to an integer) happens unless we go out to m^2.

      In the other case, for generic \alpha,\beta, the sequence (ir\alpha,ir\beta)\mod2 is u.d. for every $r$, so I don’t see linear growth at all.

      Do you think it’s square-root growth for the general question (infinitely many \alpha_i, tending to 0)?

    • gowers Says:

      Let me try to be more precise than I was in my previous comment. I still think square-root growth happens with the \alpha and \beta example, but maybe trying to prove it properly will cause me to realize some mistake.

      I argue as follows. First let us pick some r\leq m such that r(\alpha-\beta) is at most 2/m mod 2. Now let us ask when it can possibly be that \lfloor ir\alpha\rfloor and \lfloor ir\beta\rfloor have different parities. We know that ir\alpha and ir\beta differ by at most 2i/m mod 2, so a necessary condition is that they should both be within 2i/m of an integer.

      Ah, I now see that we are talking slightly at cross purposes. I certainly don’t say that we choose some r, once and for all, and that for that r we get linear growth. What I’m saying is that if you just look at the partial sums up to n you can get growth of around \sqrt{n} by choosing an appropriate r that depends on n.

      Going back to the argument, just about the only way that multiples of r\alpha can be close to an integer at least half the time is if r\alpha is close to a multiple of 1/2. If that is the case, then 4r\alpha is close to an integer, and hence so is 4r\beta, which means that the parities stay the same for a long time. If, on the other hand, multiples of r\alpha are close to an integer under half the time, then the parities are roughly the same over half the time and we get a linear lower bound.

      That’s still not properly detailed, but maybe now it is clearer what I meant.

      I was about to write that something similar ought to work for more numbers, but actually I now don’t find that obvious at all, even for three numbers. The only easy way I can see to get \lfloor\alpha\rfloor+\lfloor\beta\rfloor+\lfloor\gamma\rfloor to have a parity that’s heavily biased is to do something “too strong” like finding r with r\alpha and r\beta very close mod 2 and \gamma very close to 0 mod 2. So I am coming round to being interested in your example after all. (That’s not meant as a rude comment — I was interested in it, but just didn’t see how it could work. But with this small modification I no longer see how to show that it doesn’t.)

    • gowers Says:

      Going back to your original post, the same issue arises. You say that if all multiples of \alpha have continued fraction expansions of a certain kind then we get sublogarithmic growth. But what that means is not that there’s some uniform upper bound of C\log n for the partial sum of the first n terms along any HAP, but rather that for each r there is an upper bound of the form C_r\log n along the multiples of r.

    • obryant Says:

      The bottom line of the original post is this: If irrational \alpha has a continued fraction $[a_0;a_1,\dots]$ with $a_0 odd$ and all other $a_i$ even, then

      \sum_{n=1}^N (-1)^{\lfloor n\alpha \rfloor} \leq i

      where $q_i$ is the largest continuant (denominator of a convergent) smaller than $N$. Since the denominators have to grow at least as fast as the Fibonacci numbers, we get i \leq C \log N for an absolute constant C. The denominators can be made to grow arbitrarily fast, giving arbitrarily slow growth to the partial sums.

      So if we had a real \alpha all of whose multiples had this special shape, then we would have a sequence whose discrepancy is at most logarithmic. As noted a few comments ago, though, the discrepancy must be at least \sqrt N, thereby proving that there is no $\alpha$ with such special multiples.

    • gowers Says:

      The point I’m trying to make is that one has to be careful about what “all of whose multiples had this special shape” means. For example, if you take the number \sum_{n=1}^\infty 10^{-10^n}, then all its multiples will have continuants that grow very fast indeed. However, this will not be a uniform statement, so although the growth rate for any given multiple will be slow, the initial growth rate may be large. To put that another way, there’s an important difference between the eventual growth rate and, say, the minimum ratio between any pair of successive continuants. But perhaps it was the latter that you were talking about all along.

      None of this affects the point that it still seems to be interesting to think about the growth rate along HAPs of sequences of the form (-1)^{\lfloor\alpha n\rfloor+\lfloor\beta n\rfloor+\lfloor\gamma n\rfloor}.

    • Sune Kristian Jakobsen Says:

      Any {-1,1}-sequence is on the form
      x_n = (-1)^{\sum_i^{\infty} \lfloor {n\alpha_i} \rfloor}
      Given a sequence we want to find a set of \alphas that gives this sequence:
      Start with the empty set of \alphas. If n is the first term where the original sequence and the sequence obtained from the \alphas differ, add a \alpha, 1/n\leq \alpha < 1/(n-1) to the set of \alphas. Continue this, possibly infinity many times.

      Of course this might still be a good way to look at the sequences. We could assume that \alpha_n \to 0 much faster than 1/n. Or that the sum is finite.

  24. Sune Kristian Jakobsen Says:

    I want to suggest a generalization of the EDP. This is slightly more general that what I suggested in the last paragraph of this comment. I want to generalize EDP from sequences indexed by \mathbb{N} to sequences indexed by l_0=l_0(\mathbb{N}_0), the set of sequences of non-negative integers such that the sequence is constant 0 from some point. We think of the sequences as the powers in the prime factorization, so if a=(a_1,a_2,\dots) and b=(b_1,b_2,\dots) we define ab=(a_1+b_1,a_2+b_2,\dots). So we say that a=(a_1,a_2,\dots) divides b=(b_1,b_2,\dots) if a_1\leq b_1, a_2\leq b_2\dots.

    Now the formulation should be that for any sequences (x_n)_{n\in l_0} over {-1,1} (of course this could also be T or some other subset of a vector space) and any C\in \mathbb{N} there is n,d\in l_0 such that
    |\sum_{d|i,i\leq n}x_i|\geq C.
    But for this to make sense we need a ordering on l_0. This ordering should fulfill: If a\geq b and c\geq d then ac\geq bd and for any a\in l_0 the set \{b\in l_0 | b<a\} should be finite.

    I think (but I'm far from sure) that Terry reduction from general sequences to multiplicative sequences works for this general problem. So if there exist a sequence with bounded discrepancy for a given ordering of l_0, the must exist a multiplicative sequences over T with bounded discrepancy (wrt. the same ordering of l_0). Can anyone confirm this?

    I'm going to post another comment where I will explain why I think this problem is interesting.

    • Sune Kristian Jakobsen Says:

      The reason I think this generalization is interesting is that there are orderings of l_0 where the multiplicative EDP over {-1,1} fails: Lets call the elements p_i=(0,0,\dots ,0,1,0,0,\dots) primes. Now we set x_{(1,0,0,...)}=-1 and x_{(0,1,0,\dots)}=-1 and define log((1,0,0,...))=1 and log((0,1,0,...)) to be an irrational number. We use these logs to define an ordering in the obvious way. We let multiplicativity define the values at x_{a,b,0,0,\dots} and anytime the discrepancy becomes greater than 1, we define the next prime to be slightly less than where the problem arises. This way we can keep the discrepancy at 1.

      So anytime we have a strategy for proving EDP, we must be sure that the strategy doesn’t work for this ordering on l_0. You might think that we should use that the sequence of primes grows to quickly in the natural numbers, but here is another critical ordering:

      For any a=(a_1,a_2,a_3,\dots) we let f(a)=a_1+2a_2+4a_3+8a_4+\dots. This function gives us a partial ordering defined by f(a)>f(b)\Rightarrow a>b. If f(a)=f(b) we use the lexicographic ordering. We now define x_a=(-1)^{a_1+a_2+a_3+\dots}. I think this gives a sequence with discrepancy 1, but I haven’t found a proof.

      In this example the sequence of primes grows very fast. One thing to notices in this example is that you can’t define the ordering only by looking at the logs of the primes (unless you use infinitesimals): We have p_1^2> p_2 so 2log(p_1)>log(p_2) but p_1^{2n}<p_2^{n+1} for any n so 2log(p_1)\leq log(p_2).

      Do there exist orderings "defined using only logs" where the sequence of primes grows about as fast as in the natural numbers and where the EDP fails?

    • Sune Kristian Jakobsen Says:

      One obvious ordering on l_0 is: a\leq b  \Leftrightarrow 2^{a_1}3^{a_2}\dots \leq 2^{b_1}3^{b_2}\dots. For this ordering the problem is just the EDP. One nice thing about this ordering is that for any a\in l_0, the “numbers” (elements in l_0) divisible by a are exactly every nth number, where n=2^{a_1}3^{a_2}\dots. For a general ordering, the numbers divisible by a is not even periodic. It seems natural to use that fact that say every 3rd number is divisible by 3 in a proof of EDP. This motivates the question:

      Are there any other ordering such that exactly every 3rd number is divisible by some prime?
      and more generaly
      Are there any other ordering and any a\in l_0 and n\in\mathbb{N} such that the numbers divisible by a are exactly every nth number.

    • Gil Kalai Says:

      Dear Sune, I think this is a great point regarding the problem.

  25. Gil Kalai Says:

    I tried some approach which i may talk more about later but the following toy question (which might even be related to Sune’s extensions) came up.

    Suppose that you consider all numbers that can be presented as p_1^{n_1}p_2^{n_2}\dots p_d^{n_d} with their ususal order, so that
    1\le n_i\le m and ignore all other numbers. (So, in words, in case the latex wont compile, all the integers that can be written as the product of prime powers where you take the first d primes and the exponents at most m.)

    We give each such number a sign and we want to show that the discrepency on some “HAP” x 2x 3x …. is large. (But note that the sequence 1,2,… takes only those integers left alive).

    The reason this problem might be easier is that the (multiplicative) Fourier tools may go further.

    If we look at the argument in the Fourier reduction ( http://michaelnielsen.org/polymath1/index.php?title=Fourier_reduction ) it magically use sequences x 2x 3x … nx where n is very small so these sequences hardly interlace. (Now we allow n to be as large as possible as long as nx is still a “surviving” integer.)

    • Alec Edgington Says:

      Gil, I’m not quite sure what your question is, since you seem to be talking about a finite set of integers. (By the way, do you mean 0 \leq n_i \leq m?) To get a lower bound on the discrepancy as a function of d and m would clearly be interesting. But I suspect that we can probably get bounded discrepancy C_d if we restrict to numbers of the form p_1^{n_1} \cdots p_d^{n_d} for fixed d and arbitrary n_i, perhaps by some variant of the constructions discussed in the comments following this post — though it isn’t immediately obvious how…

  26. Gil Kalai Says:

    Alec, yes indeed the exponents should be between 0 and m. The question is if the discrepency is large when d is large.

  27. APrimos em Python/A Python script for computing primes « problemas | teoremas Says:

    […] Este script é uma modificação ao de  Alec Edgington neste comentário (em resposta a uma ideia de Tim Gowers.) Divulgado por Blogs de Ciência Blogs de Ciência […]

  28. EDP22 — first guest post from Gil Kalai « Gowers's Weblog Says:

    […] recover uniquely the Beurling primes. A much more general notion of “pseudointegers” was suggested over polymath5 by Sune Kristian Jakobsen. See also the overview over Polymath5′s wiki. An […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: