Just for the record: By the definition one finds immediately that (I guess that was the intuition behind the Poisson distribution). Basically I guess the only question is whether the conditioning destroys anything, but that seems unlikely to me.

]]>The Python code used is below:

import numpy as np
import matplotlib.pyplot as plt


def sample(n):
    """Sample a die: n faces in {1, ..., n}, conditioned on the sum being n(n+1)/2."""
    target = n * (n + 1) // 2
    while True:
        # randint's upper bound is exclusive, so n + 1 gives faces in 1..n
        dist = np.random.randint(1, n + 1, n)
        if np.sum(dist) == target:
            return dist


def delta(dist, k):
    """Return the first component of Delta_k"""
    return np.sum(dist == k) - 1


n = 10000
dist = sample(n)
kList = np.arange(n) + 1
deltaList = np.array([delta(dist, k) for k in kList])

# Plot directly
plt.figure()
plt.plot(kList, deltaList)
plt.xlabel('k')
plt.ylabel(r'$\Delta_k$')

# Histogram plot: shift so that np.bincount sees non-negative values
deltaListP = deltaList - deltaList.min()
binData = np.bincount(deltaListP)
binLabels = np.arange(len(binData)) + deltaList.min()

plt.figure()
plt.bar(binLabels, binData / n)
plt.xlabel('Value of $h_A(k)-h_A(k-1)$')
plt.ylabel('Frequency')
plt.show()

Here is maybe another approach to bounding the Fourier transform, focusing on the direct differences.

Consider the Fourier transform

(I drop the constant shift in the second component, because it only

adds a constant factor with modulus one)

Let

with the convention . Then

With this

Let

Then we can rewrite the Fourier transform as

As , the absolute value can be bounded as

The sum consists of summands and each of them is at most . Hence

if we can bound a positive fraction of the summands as

strictly less than , then is strictly bounded away

from .

As an illustration, assume that the dice is such that there exist

constants and and such that

and

and

(I guess that could already be true for a typical dice , but

otherwise the argument could easily be refined. Moreover, looking at

the distribution of for a typical dice seems approachable.)

Then for around (in the torus topology),

there exists a constant such that

(if we are not close to , the modulus of the Fourier transform is easily bounded

away from ).

This shows

and

With the constants , we can then easily

bound the Fourier transform as

This seems pretty good to me, because we only need a bound when

is relatively large (in particular I guess significantly

larger than ) and this shows that

for an appropriate constant .

N.B.: We can recover the approach sketched in the post, in a

generalised setting by comparing the term th neighbouring term,

i.e. writing the sum as

so that we need to bound terms of the form

where

The above approach took , but for refined

estimates maybe could be more helpful.

This seems vaguely analogous to forcing the sum of discrete dice to be n-choose-2, but other than that, this may not be relevant to whether discrete dice are intransitive, or not.

]]>I’m not familiar with R, so I don’t quite follow the details of this generation (or the comparison function). But it roughly appears to generate a continuous function on [0,1] with length_scale scaling the derivatives so as to affect the length scale over which the function values are roughly equal. Is that understanding at least close?

Unless I misunderstood your model, it sounds like the only difference is that in the large n limit, this model yields a continuous distribution whereas the discrete dice would represent a discontinuous distribution. If that is the only difference, then somehow the intransitivity _depends_ on the discontinuity!? Somehow normalized continuous functions on [0,1] are almost totally ordered (ties and exceptions to ordering are of probability 0), whereas normalized discontinuous functions are almost perfectly “not ordered”? Is there some simple reason this should be obvious? If true, this seems fascinating and non-intuitive to me.

Would this mean that in the large n limit, if an idea doesn’t somehow capture the essential “discontinuities”, and instead approximates it as something smooth, it is bound to fail to describe the discrete dice?

So I hope you continue poking at this continuous model of dice to pull out some more information. It sounds interesting and could also be telling us something important.

]]>I’m not quite sure this is relevant; I think it might be, though, because some of the discussion seems to involve the difference between continuous and discrete probability variables. Even if it doesn’t help settle the question, I think it’s an interesting sidelight.

It’s on Github at

https://github.com/joshtburdick/misc/blob/master/polymath/dice/fuzzyDice.pdf

(Actually, my main motivation for doing this was that I like the description “fuzzy dice” 🙂)

]]>Unfortunately, I don’t know how easy it is to get a local limit theorem using Stein – it tends to be used to control total variation.

]]>With indicator function if but otherwise we should be able to control in a similar way.

]]>Therefore, by Chebyshev, the probability that should be bounded above by something like . If we want that to be at most , then we need to be at least . This will be multiplied by a constant factor that depends on , but it should give us an upper bound of .

]]>Here’s why one might think that is true. Suppose we add 1 to . Then to the given number we will add and subtract . But is the multiplicity with which occurs in , and we would expect these multiplicities to behave like independent Poisson random variables of mean 1. So what we have here looks like a random walk where each step is given by a difference of two independent Poisson random variables. After several steps, the distribution should be approximately Gaussian, but what we’re interested in is not the distribution of the end point, but rather the distribution of a random point of the random walk. And this is a somewhat more slippery object.
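As a quick sanity check of the Poisson heuristic, here is a minimal sketch that draws n independent uniform faces from {1, ..., n} and compares the empirical distribution of the face multiplicities with the Poisson(1) probabilities. It deliberately ignores the conditioning on the sum, so it only illustrates the unconditioned picture; the function name `multiplicity_histogram`, the seed and the value of n are my own choices for illustration.

```python
import math
import numpy as np


def multiplicity_histogram(n, rng):
    """Fraction of faces j in {1..n} that occur exactly m times, for each m."""
    # Unconditioned die: n independent uniform faces, no constraint on the sum.
    faces = rng.integers(1, n + 1, size=n)
    # counts[j - 1] is the multiplicity of face j.
    counts = np.bincount(faces, minlength=n + 1)[1:]
    # hist[m] is the number of faces with multiplicity exactly m.
    hist = np.bincount(counts)
    return hist / n


rng = np.random.default_rng(0)
freq = multiplicity_histogram(100_000, rng)

print("m  empirical  Poisson(1)")
for m in range(6):
    empirical = float(freq[m]) if m < len(freq) else 0.0
    poisson = math.exp(-1) / math.factorial(m)
    print(f"{m}  {empirical:.4f}     {poisson:.4f}")
```

Whether the agreement survives the conditioning on the sum is exactly the point under discussion, so this should be read as a check of the heuristic only, not of the conditioned model.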

If we want to use the technique of pretending variables are independent, proving that something holds with very high probability, and then conditioning on an event that isn’t too unlikely, that limits what we can hope to prove here. For example, the probability that the normal random walk stays positive for steps is of order , so if we then condition on an event of probability we cannot rule out that this will happen. And in fact, I think that these kinds of things actually *will* happen: it seems that conditioning the sum of to be makes it quite likely that will be broadly positive for the first half and broadly negative for the second, in which case would be broadly positive.

However, for the purposes of bounding the Fourier coefficient, we don’t need to nail down the distribution too precisely: we just need it to avoid having “silly” properties such as always being even. I’d also quite like to know that it doesn’t take the same value too many times.

To that end, here’s a simple question to which the answer is probably known, but a quick Google search didn’t reveal it to me. Suppose we take a standard one-dimensional random walk for steps. Let be the number of times the origin is visited. For what value of does become smaller than ? (The 3 there is just some arbitrary constant that’s safely bigger than 1.) I would expect the answer to be something like times a logarithmic factor, since we would expect the walk to wander around in an interval of width of order , not necessarily visiting all of it, but not being too concentrated on any one part of it.
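For anyone who wants to experiment with this, here is a rough simulation sketch of the number of visits to the origin for a simple ±1 random walk. The thresholds below (multiples of the square root of the number of steps) are purely illustrative assumptions of mine, as are the names `origin_visits`, `m` and `trials`; it is not meant to answer the question, only to give a feel for the tail of the visit count.

```python
import numpy as np


def origin_visits(m, trials, rng):
    """Visits to 0 (after time 0) for each of `trials` simple random walks of m steps."""
    # Steps are +1 or -1 with equal probability; int8 keeps memory use modest.
    steps = rng.integers(0, 2, size=(trials, m), dtype=np.int8) * 2 - 1
    paths = np.cumsum(steps, axis=1, dtype=np.int32)
    return np.sum(paths == 0, axis=1)


rng = np.random.default_rng(0)
m, trials = 4_000, 2_000
visits = origin_visits(m, trials, rng)

# The expected number of visits is known to grow like sqrt(2m/pi).
print("mean visits:", visits.mean(), "vs sqrt(2m/pi):", np.sqrt(2 * m / np.pi))

# Empirical tail: how often the origin is visited at least c*sqrt(m) times.
for c in (1.0, 2.0, 3.0):
    tail = np.mean(visits >= c * np.sqrt(m))
    print(f"fraction of walks with >= {c} * sqrt(m) visits: {tail}")
```

With only a few thousand trials the far tail is of course invisible, so this can suggest the right scale but cannot say anything about very small probabilities.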

]]>One approach to LCLTs that’s in the literature is to use the Berry-Esseen theorem together with a separate argument that the distribution is “flat”, in the sense that if you change by a small amount, then the probability of landing at does not vary by much. Then one can estimate probabilities by using the Berry-Esseen theorem to get the probability of being inside some small region and the flatness to get the probabilities of the individual points in that region (which are roughly the same and have a known sum, so can be calculated). Although I chose to use characteristic functions above, I think that approach could work as well.
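To make the "interval probability plus flatness" idea concrete, here is a toy sketch for a much simpler object than the one in this thread: the sum of independent rolls of an ordinary fair die. It computes the exact distribution by repeated convolution, checks that the point probabilities in a short window around the mean are nearly equal, and then estimates a single point probability as the window mass divided by the window size. The parameters `faces`, `copies` and the window half-width 5 are assumptions chosen only for illustration.

```python
import numpy as np


def sum_pmf(faces, copies):
    """Exact pmf of the sum of `copies` independent uniform draws from {1..faces}."""
    single = np.full(faces, 1.0 / faces)  # index i corresponds to the value i + 1
    pmf = np.array([1.0])
    for _ in range(copies):
        pmf = np.convolve(pmf, single)
    return pmf  # index i corresponds to the sum i + copies


faces, copies = 6, 200
pmf = sum_pmf(faces, copies)

mean = copies * (faces + 1) / 2
var = copies * (faces**2 - 1) / 12
centre = int(round(mean)) - copies       # array index of the value nearest the mean
window = pmf[centre - 5 : centre + 6]    # 11 consecutive point probabilities

# "Flatness": nearby point probabilities differ by only a small factor.
flatness = window.max() / window.min()
# Point estimate from the interval mass, as in the argument above.
estimate = window.sum() / window.size
# Gaussian density at the mean, for comparison.
gauss = 1.0 / np.sqrt(2 * np.pi * var)

print("flatness ratio in window:", flatness)
print("exact point probability: ", pmf[centre])
print("window-mass estimate:    ", estimate)
print("Gaussian density at mean:", gauss)
```

In this toy case everything is independent, so both ingredients are easy; the hard part in the dice setting is establishing the flatness without independence.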

]]>