For simplicity, let us talk about the ties conjecture, for which it suffices to prove that converges to $0$.

Let , and let be an estimate of obtained by integrating the density of the standard normal distribution around the point . Then converges to , so all we need is to control the ratio .

Now, substituting $x=(0,0)$ into formula (2.5) on page 25 of the book, we get , while (again according to the book) is of order $c/\sqrt{n}$. This seemed clearly to imply that converges to $1$, and we would be done. The proof that converges to $1/2$ is similar.

However, while the constant in the Berry–Esseen theorem is universal ($0.4748$ works), the constant $c$ in formula (2.5) in the book seems to depend on the initial random vector $(X,Y)$. In our case, the vector is not fixed but changes with $n$, so $c$ depends on $n$, which makes the inequality meaningless.
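As a quick numerical sanity check of the universal constant (an illustration only, with fair $\pm 1$ steps, so $\sigma = \mathbb{E}|X|^3 = 1$ and the Berry–Esseen bound is $0.4748/\sqrt{n}$): the exact Kolmogorov distance between the standardised walk and the Gaussian stays below the bound.

```python
import math

def phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n = 500  # number of independent fair +/-1 steps
pmf = [math.comb(n, k) / 2 ** n for k in range(n + 1)]  # k = number of +1 steps

sup = 0.0  # Kolmogorov distance; attained at the jump points of the cdf
cdf = 0.0
for k in range(n + 1):
    s = 2 * k - n                      # value of the sum
    z = s / math.sqrt(n)               # standardised value
    sup = max(sup, abs(cdf - phi(z)))  # discrepancy just below the jump
    cdf += pmf[k]
    sup = max(sup, abs(cdf - phi(z)))  # discrepancy at the jump

bound = 0.4748 / math.sqrt(n)          # Berry-Esseen bound with sigma = rho = 1
print(sup, bound)                      # sup is below the bound
```

The point is that here one single number, $0.4748$, works for every $n$; the worry above is precisely that the local-limit analogue of this constant is not known (to us) to be universal.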

So, we need a version of the local central limit theorem, as in the book, but with all constants universal, and with any dependence on the initial random vector $(X,Y)$ made explicit (in terms of moments, etc.). It seems not easy to extract this from the proof in the book.

It seems to me that some kind of technical lemma will probably be needed to argue that the probability that the entire random walk lives in a sublattice tends to zero. But probably this too follows reasonably easily from the fact that for every .

However, in fact , where are the frequencies of each value, and implies , hence . So, .

On the other hand, most of the proof strategies we had for the dice problem, if they worked, could also be used to prove that converges to $0$, while converges to $1/2$…

The theorem in the paper measures closeness to a Gaussian as the maximum discrepancy between the probability that the Gaussian lies in a convex set and the probability that the given distribution lies in that set. However, the convex set that we want to restrict to will be one-dimensional, and therefore very small.

The kind of problem this could cause is illustrated by the following simple example. Let be a random variable that has mean zero and is supported on a finite subset of . Let and be independent copies of . Then a sum of independent copies of will approximate a normal distribution with mean and a certain variance.

Now let’s suppose that is non-zero. Then the normal distribution we converge to will assign non-zero probabilities to every point in .

But if instead we start with a different random variable with the same mean and variance as but supported on the even numbers, then the normal distribution we converge to will assign zero probability to numbers with an odd coordinate and will make up for it by assigning more weight to points in .

This is no problem when it comes to the probability of landing in a convex set, but it is more worrying when we condition on one of the coordinates being zero, which will lead to the probabilities being doubled, in a certain sense.
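To see the doubling concretely, here is a small self-contained computation (illustrative only: $X$ is uniform on $\{-1,0,1\}$, and $Y$ is a variable supported on the even numbers $\{-2,0,2\}$ chosen to have the same mean and variance). The exact point probability at $0$ is computed by repeated convolution and compared with the limiting Gaussian density; for the even-supported variable it comes out at roughly twice the density.

```python
import math

def convolve(a, b):
    """Convolution of two pmfs given as lists over consecutive integer values."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, p in enumerate(a):
        for j, q in enumerate(b):
            out[i + j] += p * q
    return out

def pmf_of_sum(pmf, n):
    """Exact pmf of the sum of n iid copies, by repeated convolution."""
    out = [1.0]
    for _ in range(n):
        out = convolve(out, pmf)
    return out

n = 200
pmf_x = [1/3, 1/3, 1/3]             # X uniform on {-1, 0, 1}: mean 0, variance 2/3
pmf_y = [1/12, 0, 10/12, 0, 1/12]   # Y on {-2, 0, 2}: same mean 0 and variance 2/3

p0_x = pmf_of_sum(pmf_x, n)[n]      # P(X_1 + ... + X_n = 0); index n is value 0
p0_y = pmf_of_sum(pmf_y, n)[2 * n]  # P(Y_1 + ... + Y_n = 0); index 2n is value 0

gauss = 1.0 / math.sqrt(2 * math.pi * n * (2/3))  # N(0, 2n/3) density at 0
print(p0_x / gauss, p0_y / gauss)   # close to 1 and close to 2 respectively
```

Both sums converge to the same Gaussian in distribution (so convex-set discrepancies are fine), but the point probabilities on the even sublattice are doubled, which is exactly the issue when conditioning on an exact value.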

In our case, we should be OK, because the variable whose value we are conditioning on will not be supported in a residue class, but I think it means we may have to delve into the proof of the multidimensional Berry–Esseen theorem rather than just quoting it — unless of course someone else has proved a version that is better adapted to our purpose.

This MathOverflow question points us to a multidimensional Berry–Esseen theorem. In particular, it looks as though Theorem 1.1 in this paper is probably what we want. Does it look strong enough to you? (I could probably work this out, but I think you’ll be able to do so a lot more quickly.)

One thing it gives us quite easily is a way of proving statements about the function . For convenience I’ll take a set of size as my basic object. Then if we list the elements of as , the elements of the multiset are (with multiplicity equal to the number of times they appear in this list). I’ll call this multiset and I’ll write for the number of elements of , with multiplicity, that are at most .

It doesn’t seem to be wholly pleasant to define directly in terms of the set : it is the maximal such that . However, we do at least have that the “inverse” is a nice function. That is, we know that the th largest face of the die is . And that is quite helpful. For instance, if instead of choosing a random that adds up to 1 we choose a random subset of where each element is chosen independently with probability , then we would normally expect to be around , and not to differ from by much more than . So we get, for this completely random model, that the th largest face of the die typically has value close to , and is well concentrated. If we now condition on the size of the set being and the sum of the faces of the die being , this will not change too much. And now we can just invert and say the same about .
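Since the exact formulas above have not survived the formatting, here is a hedged numerical illustration of the concentration claim for the completely random model, with made-up parameters $n$, $p$ and $j$: the $j$th largest chosen element concentrates around roughly $n - j/p$, with fluctuations of order $\sqrt{j}/p$ (much smaller than $n$).

```python
import random
import statistics

random.seed(0)
n, p, j, trials = 2000, 0.3, 100, 1000  # illustrative parameters only

samples = []
for _ in range(trials):
    # choose each element of {1, ..., n} independently with probability p
    chosen = [x for x in range(1, n + 1) if random.random() < p]
    chosen.sort(reverse=True)
    samples.append(chosen[j - 1])       # the j-th largest chosen element

mean = statistics.mean(samples)
sd = statistics.pstdev(samples)
print(mean, sd)  # mean near n - j/p, sd of order sqrt(j)/p
```

The standard deviation stays of order $\sqrt{j}$ rather than $n$, which is the “well concentrated” claim; conditioning on the size and the sum should, as said above, not change this picture much.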

We also have that beats if the number of such that is greater than the number of such that . This may turn out to be reasonably easy to work with — I’m not sure.
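That criterion is at least easy to compute directly. Here is a minimal sketch of the “beats” relation as just described, counting pairs in each direction and ignoring tied pairs (the example dice are the classic intransitive triple, purely for illustration, not dice from the model under discussion):

```python
from itertools import product

def beats(a, b):
    """Die a beats die b if more pairs (i, j) have a[i] > b[j]
    than have a[i] < b[j]; tied pairs count for neither side."""
    wins = sum(1 for x, y in product(a, b) if x > y)
    losses = sum(1 for x, y in product(a, b) if x < y)
    return wins > losses

# A classic intransitive triple:
A = [2, 2, 4, 4, 9, 9]
B = [1, 1, 6, 6, 8, 8]
C = [3, 3, 5, 5, 7, 7]
print(beats(A, B), beats(B, C), beats(C, A))  # True True True
```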

Small conjecture: if beats and beats then the probability that beats is approximately 1/2.

Big conjecture: the tournament obtained by the relation “beats” is quasirandom.

We now know that the small conjecture is equivalent to the statement that almost every die beats approximately half the other dice, and the big conjecture is equivalent to the statement that for almost every *pair* of dice, the remaining dice are split into four approximately equal classes according to the four possibilities for which of the two they beat.

I’ve only really been thinking about the sequences model, but I think the experimental evidence suggests that the same should be true for the multisets model as well.

Also, I think we may already essentially have a proof of the small conjecture for the sequences model, but I’m waiting to see whether Bogdan Grechuk agrees with me about this.

https://github.com/perfluxion/dicetools

“Data on k=4 tournaments and fraction of ties

Calculated with multiset distribution, 100000 random dice each test.”

I meant that each calculation involved 100000 tests (4 random dice per test).

My comment from a couple of days ago is still stuck in the moderation queue.

But suppose that we have some with . Ah, there’s a useful property here, which is that is increasing, so decreases by at most 1 when increases by 1. So we get values of with , which gives a lower bound for of and therefore a lower bound for the square root of this of , which is a lot bigger than .

So it’s possible that we’re done, but I’ve only half understood Bogdan’s comment, so I’ll need to await confirmation. Also, my calculation above could turn out to be nonsense — it’s not carefully checked.

It seems at least possible that for suitably optimized choices of and we’ll get a strong enough bound from this kind of argument. Indeed, if is something like , then I think the above sketch should show that with very high probability for at least half the values of , which would give a lower bound for of .

That looks unfortunate, but I now see that we can probably improve it quite substantially by observing that each time is large, it is probably surrounded by quite a lot of other large values (since the only way this could fail to be the case is if there are unexpected clusters in ).

I think it should be possible to say something about this quite easily — though whether what I say is strong enough I haven’t yet worked out.

Suppose we choose a random die by choosing a purely random sequence and conditioning on its having the right total. For the purely random sequence, is a sum of Bernoulli random variables with probability , so the probability that it differs from its mean by more than is going to be bounded above by for some positive . So if , the probability starts to be at most .

But when we condition on the sum being zero, this probability is multiplied by at most or so, since the probability that the sum is correct is of order . So I think we can safely say that for a typical die , no will be bigger than, say, .
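The local-limit scale for hitting one exact value of the sum — of order $1/\sqrt{n}$ — is easy to check numerically in the very simplest case (a fair binomial; this is only an illustration, not the conditioned model above):

```python
import math

# Exact probability that a sum of n fair coin flips hits its mean,
# compared with the local-CLT prediction sqrt(2 / (pi * n)).
ratios = []
for n in (100, 400, 1600):
    p_exact = math.comb(n, n // 2) / 2 ** n  # P(Bin(n, 1/2) = n/2)
    p_clt = math.sqrt(2.0 / (math.pi * n))
    ratios.append(p_exact / p_clt)
    print(n, p_exact, p_exact / p_clt)       # ratio tends to 1
```

So conditioning on the exact total costs only a factor of order $\sqrt{n}$, which is what makes the argument above go through.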

With another elementary argument I think we should be able to show that the sum of the squares of the will not be too small. So maybe the ingredients are almost all in place.
