Fortunately, the post I’m talking about was not posted (I had to register in between).

After identification of $A=(a_1,\dots,a_n)$ with $A^*=(n+1-a_n,\dots,n+1-a_1)$, I was trying to build $>>$, an “almost transitive” relation between these classes (which are pairs, or singletons if $A$ ties with $A^*$), in the hope that if $A$ beats $B$ and $B$ beats $C$, then whether $A$ beats $C$ depends only on $>>$.

I made a mistake in thinking I had this relation, but the two very basic facts that I like are:

1) $A$ beats $B$ iff $B^*$ beats $A^*$.

2) If $A$ does not tie with $B$, then either $A$ beats BOTH $B$ and $B^*$, or $B$ beats BOTH $A$ and $A^*$, so that we have a new relation on the quotient of all dice after identifying each die with its dual.

Then we have something like this: $X$ beats $Y$, which beats $Y^*$, which beats $X^*$ (where $X,Y\in \left\{A,A^*,B,B^*\right\}$ and $X$ does not tie with $Y$).

Now if $X$ beats $X^*$ we have a transitive chain involving all four dice… I have a feeling that investigating this might be interesting…
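Fact 1 is easy to check numerically. Here is a quick sketch (my own code, not from the thread), using the convention that $A$ beats $B$ when more pairs $(i,j)$ satisfy $a_i>b_j$ than $a_i<b_j$; fact 2 can be explored with the same two helper functions.

```python
import random

def beats(A, B):
    # +1 if A beats B, -1 if B beats A, 0 for a tie,
    # comparing all pairs of faces
    s = sum((a > b) - (a < b) for a in A for b in B)
    return (s > 0) - (s < 0)

def dual(A, n):
    # the dual die A* replaces each face a by n + 1 - a
    return [n + 1 - a for a in A]

n = 6
rng = random.Random(0)
for _ in range(500):
    A = [rng.randint(1, n) for _ in range(n)]
    B = [rng.randint(1, n) for _ in range(n)]
    # fact 1: A beats B iff B* beats A*
    assert beats(A, B) == beats(dual(B, n), dual(A, n))
print("fact 1 holds on 500 random pairs")
```

Fact 1 is actually an identity rather than an experiment: the sign of $a_i-b_j$ equals the sign of $(n+1-b_j)-(n+1-a_i)$ for every pair, so the two counts agree exactly.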

I’m in the middle of a new post, which I’ll now have to tone down slightly, to make clear that there is still work to do (or a convenient result to find in the literature). My guess is that we don’t actually need a new idea, but rather that we will have to go through the proof of the LCLT very carefully, extract some assumption we need about the random vector, and then prove that almost every die satisfies that assumption for its associated random vector. The kinds of properties we might want are that the support of the vector is quite small (of size about $n$) and that it is not concentrated in some sublattice of $\mathbb{Z}^2$.

Ah, for the second time I thought we were almost done, and then I see a problem. For fixed $A$ and random $j$ we have a 2-dimensional random vector $(X,Y)$ taking $n$ values with equal chances. Note that for odd $n$ it takes integer values only. Let $(X_1,Y_1),\dots,(X_n,Y_n)$ be i.i.d. copies of $(X,Y)$, and write $S_n=\sum_i X_i$, $T_n=\sum_i Y_i$.

For simplicity, let us talk about the ties conjecture, for which it suffices to prove that $P(S_n=0 \mid T_n=0)$ converges to $0$.

Let $p_n=P(S_n=0,\,T_n=0)$, and let $\tilde p_n$ be an estimate of $p_n$ obtained by integrating the density of the standard normal distribution around the point $(0,0)$. Then the Gaussian estimate of $P(S_n=0\mid T_n=0)$ converges to $0$, so all we need is to control the ratio $p_n/\tilde p_n$.

Now, substituting $x=(0,0)$ into formula (2.5) on page 25 of the book, we get an estimate for $P(S_n=0,\,T_n=0)$ whose relative error (again according to the book) is of order $c/\sqrt{n}$. This seemed clearly to imply that the ratio of the true probability to its Gaussian estimate converges to $1$, and we would be done. The proof of convergence to $1/2$ is similar.

However, while the constant in the Berry–Esseen theorem is universal ($0.4748$ works), the constant $c$ in formula (2.5) in the book seems to depend on the initial random vector $(X,Y)$. In our case the vector is not fixed but different for every $n$, so $c$ depends on $n$, which makes the inequality meaningless for our purposes.

So we need a version of the local central limit theorem, as in the book, but with all constants universal, and with the dependence on the initial random vector $(X,Y)$, if any, made explicit (in terms of moments, etc.). It does not seem easy to extract this from the proof in the book.
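To see concretely what kind of statement such a local CLT makes, here is a toy computation (my own example; the step distribution below is not the $(X,Y)$ from the dice problem). For an aperiodic, mean-zero step distribution on $\mathbb{Z}^2$ with covariance $\Sigma$, the local CLT predicts $P(S_n=(0,0))\approx 1/(2\pi n\sqrt{\det\Sigma})$, which we can compare with the exact probability computed by dynamic programming:

```python
import math

# Step uniform on five lattice points: mean zero, covariance (2/5)*I,
# and aperiodic (it includes the zero step), so no sublattice issues.
steps = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def point_prob(n, target=(0, 0)):
    """Exact P(S_n = target) by dynamic programming over the lattice."""
    dist = {(0, 0): 1.0}
    for _ in range(n):
        new = {}
        for (x, y), p in dist.items():
            for dx, dy in steps:
                key = (x + dx, y + dy)
                new[key] = new.get(key, 0.0) + p / 5
        dist = new
    return dist.get(target, 0.0)

n = 60
exact = point_prob(n)
# Gaussian prediction: 1 / (2*pi*n*sqrt(det Sigma)), sqrt(det Sigma) = 2/5
gaussian = 1.0 / (2 * math.pi * n * (2 / 5))
print(exact, gaussian, exact / gaussian)
```

The ratio is close to $1$ for this fixed step distribution; the whole difficulty described above is that for the dice problem the step distribution changes with $n$, so the implicit constant in the error term cannot be treated as fixed.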

That blog post is definitely worth a read, but let me add that everyone has access to the Lawler/Limic book.

Just as a reference for those following along who don’t have access to the Lawler/Limic book, Terry Tao gives a bit of related exposition in a blog post: https://terrytao.wordpress.com/2015/11/19/275a-notes-5-variants-of-the-central-limit-theorem/#more-8566. Hope it helps!

Fantastic! I also came across the Lawler/Limic book but had not understood it well enough to see that it had what we wanted (assuming that that is indeed the case).

It seems to me that there will probably be some kind of technical lemma needed to argue that the probability that the entire random walk lives in a sublattice tends to zero. But probably this too follows reasonably easily from the fact that for every .

It seems I have finally found the correct type of theorem we need: discrete local limit theorems for random vectors. The original result is due to Richter (“Multi-dimensional local limit theorems for large deviations”), but the formulation there is complicated. A recent book is “Random Walk: A Modern Introduction” by Gregory F. Lawler and Vlada Limic, Section 2, Theorem 2.1.1. There are different estimates there, but basically the theorem states that, for every single point $x$ in the lattice, the probability that we will be exactly at $x$ after $n$ steps can be estimated by the corresponding normal density, plus some error term.

An example. Fix a generic solution $(x_1,y_1),(x_2,y_2),(x_3,y_3)$ to the system of 5 equations $\sum_i x_i=\sum_i y_i=0$, $\sum_i x_i^2=\sum_i y_i^2=3$, $\sum_i x_iy_i=0$. Now let the random vector $(X,Y)$ take the values $(x_1,y_1),(x_2,y_2),(x_3,y_3)$, each with probability $1/3$. Then $EX=EY=0$, $EX^2=EY^2=1$, $EXY=0$. Let $(X_i,Y_i)$ be i.i.d. copies of this random vector, $S_n=\sum_i X_i$, $T_n=\sum_i Y_i$. Then the joint distribution of $(S_n/\sqrt{n},\,T_n/\sqrt{n})$ converges to the standard 2D normal. For example, $P(S_n>0,\,T_n>0)$ converges to $1/4$, etc.

However, in fact $S_n=N_1x_1+N_2x_2+N_3x_3$ and $T_n=N_1y_1+N_2y_2+N_3y_3$, where $N_1,N_2,N_3$ are the frequencies of the three values, and $N_1+N_2+N_3=n$ implies that for a generic choice of values the system $S_n=T_n=0$ has no integer solutions, hence $(S_n,T_n)$ never equals $(0,0)$. So $P(S_n=0,\,T_n=0)=0$.

On the other hand, most of the proof strategies we had for the dice problem, had they worked, could equally well be applied to this example to “prove” that the analogue of the tie probability converges to $0$ while the analogue of the winning probability converges to $1/2$…

Ironically, the one thing I was rather certain about (based on the voting rules analogy) was that the big conjecture would follow from the small conjecture. Oh, well 🙂

Thank you for the reference to the 2D Berry–Esseen theorem. Indeed, it does not seem applicable, and the problem is much trickier than I thought. I do not think the problem is that we have a discrete distribution. Even in the continuous case, the fact that a random vector $(X,Y)$ is “close” to Gaussian in the sense that “the maximum discrepancy on convex sets” (as defined in the reference) is small tells us nothing about the conditional distribution of $X$ given $Y=0$, because $P(Y=0)=0$…

I think the kind of statement we might want is that the distribution tends, in a suitable sense, to a distribution with a Gaussian-type formula (that is, the probability of being at a point $x$ is something like $ce^{-q(x)}$ for some positive definite quadratic form $q$) on some sublattice of $\mathbb{Z}^2$ (or more generally $\mathbb{Z}^d$).

It looks to me as though we wouldn’t be quite there, because we need a discrete version of the 2D Berry–Esseen theorem, and that raises certain considerations that may turn out to be non-trivial.

The theorem in the paper measures closeness to a Gaussian as the maximum discrepancy between the Gaussian probability of being in a convex set and the probability of the given distribution being in that set. However, the convex set that we want to restrict to will be a one-dimensional set, and therefore very small.

The kind of problem this could cause is illustrated by the following simple example. Let $Z$ be a random variable that has mean zero and is supported on a finite subset of $\mathbb{Z}$. Let $X$ and $Y$ be independent copies of $Z$. Then a sum of $n$ independent copies of $(X,Y)$ will approximate a normal distribution with mean $(0,0)$ and a certain variance.

Now let’s suppose that $Z$ is not supported in a residue class. Then the normal distribution we converge to will assign non-zero probabilities to every point in $\mathbb{Z}^2$.

But if instead we start with a different random variable $W$ with the same mean and variance as $Z$ but supported on the even numbers, then the normal distribution we converge to will assign zero probability to points with an odd coordinate and will make up for it by assigning more weight to points in $(2\mathbb{Z})^2$.

This is no problem when it comes to the probability of landing in a convex set, but it is more worrying when we condition on one of the coordinates being zero, which will lead to the probabilities being doubled, in a certain sense.

In our case, we should be OK, because the variable whose value we are conditioning on will not be supported in a residue class, but I think it means we may have to delve into the proof of the multidimensional Berry-Esseen theorem rather than just quoting it — unless of course someone else has proved a version that is better adapted to our purpose.
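This doubling is easy to see numerically. In the sketch below (my own toy example, one-dimensional for simplicity), a step distribution uniform on $\{-2,0,2\}$ lives on the even sublattice, and the exact probability of being at $0$ after $n$ steps is roughly twice the naive Gaussian density, while a step uniform on $\{-1,0,1\}$, which is not confined to a residue class, matches the density with no correction factor.

```python
import math

def point_prob(steps, n, target=0):
    # Exact P(S_n = target) for a sum of n i.i.d. uniform steps, by DP.
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for x, p in dist.items():
            for s in steps:
                new[x + s] = new.get(x + s, 0.0) + p / len(steps)
        dist = new
    return dist.get(target, 0.0)

def gauss_density(var, n):
    # value at 0 of the N(0, n*var) density
    return 1.0 / math.sqrt(2 * math.pi * n * var)

n = 80
# supported on the even numbers: the point probability doubles
print(point_prob([-2, 0, 2], n) / gauss_density(8 / 3, n))
# not supported in a residue class: no correction factor
print(point_prob([-1, 0, 1], n) / gauss_density(2 / 3, n))
```

The first ratio comes out close to $2$ (the index of $2\mathbb{Z}$ in $\mathbb{Z}$) and the second close to $1$, which is exactly the doubling that conditioning on a coordinate would have to contend with.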

Thanks for this clarification. I’ve added the 1/2 now.

This mathoverflow question points us to a multidimensional Berry-Esseen theorem. In particular, it looks as though Theorem 1.1 in this paper is probably what we want. Does it look strong enough to you? (I could probably work this out, but I think you’ll be able to do so a lot more quickly.)

I am sorry about “a power 1/2 in the denominator” (can you correct this?). As I noted in the comments to my post, control of the maximal $h_j$ for a typical die seems to be easy (thank you for adding the details; the observation that $h_j$ decreases by at most 1 indeed proves the statement), but for the argument to work the Berry–Esseen theorem should remain correct for random vectors, while the Wikipedia formulation is for the one-dimensional case only. I have asked for a reference, with no response yet. So I would not say we are done. But very close, one step away.

One thing it gives us quite easily is a way of proving statements about the counting function $g_A$ defined below. For convenience I’ll take a set $B\subset[2n-1]$ of size $n$ as my basic object. Then if we list the elements of $B$ as $b_1<b_2<\dots<b_n$, the elements of the multiset are the numbers $b_i-i+1$ (with multiplicity equal to the number of times they appear in this list). I’ll call this multiset $A$ and I’ll write $g_A(x)$ for the number of elements of $A$, with multiplicity, that are at most $x$.

It doesn’t seem to be wholly pleasant to define $g_A$ directly in terms of the set $B$: $g_A(x)$ is the maximal $j$ such that $b_j\le x+j-1$. However, we do at least have that the “inverse” is a nice function. That is, we know that the $j$th smallest face of the die is $b_j-j+1$. And that is quite helpful. For instance, if instead of choosing a random $A$ that adds up to $n(n+1)/2$ we instead choose a random subset of $[2n-1]$ where each element is chosen independently with probability $1/2$, then we would normally expect $b_j$ to be around $2j$, and not to differ from $2j$ by much more than $\sqrt{n}$. So we get for this completely random model that the $j$th smallest face of the die typically has value close to $j$, and is well concentrated. If we now condition on the size of the set being $n$ and the sum of the faces of the die being $n(n+1)/2$, this will not change too much. And now we can just invert and say the same about $g_A$.

We also have that $A$ beats $A'$ if the number of pairs $(i,j)$ such that $a_i>a'_j$ is greater than the number of pairs $(i,j)$ such that $a_i<a'_j$. This may turn out to be reasonably easy to work with — I’m not sure.

Well, I may be wrong, but I personally am leaning towards the small conjecture being correct and the big conjecture being false, where these are the following.

Small conjecture: if $A$ beats $B$ and $B$ beats $C$, then the probability that $A$ beats $C$ is approximately 1/2.

Big conjecture: the tournament obtained by the relation “beats” is quasirandom.

We now know that the small conjecture is equivalent to the statement that almost every die beats approximately half the other dice, and the big conjecture is equivalent to the statement that for almost every *pair* of dice, the remaining dice are split into four approximately equal classes according to the four possibilities for which of the two they beat.
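Here is a small Monte Carlo sketch of the first equivalence (my own hypothetical code, not the dicetools repository mentioned elsewhere in the thread): in the sequences model a die is a uniform element of $[n]^n$ conditioned on having face-sum $n(n+1)/2$, which for small $n$ can be sampled by rejection, and one can then count how many of the other dice a given die beats.

```python
import random

def beats(A, B):
    # +1 if A beats B, -1 if B beats A, 0 for a tie
    s = sum((a > b) - (a < b) for a in A for b in B)
    return (s > 0) - (s < 0)

def random_die(n, rng):
    # sequences model: uniform element of [n]^n with face-sum n(n+1)/2,
    # sampled by rejection (the acceptance rate is fine for small n)
    target = n * (n + 1) // 2
    while True:
        A = [rng.randint(1, n) for _ in range(n)]
        if sum(A) == target:
            return A

n = 6
rng = random.Random(1)
dice = [random_die(n, rng) for _ in range(300)]
A = dice[0]
wins = sum(beats(A, B) > 0 for B in dice[1:])
losses = sum(beats(A, B) < 0 for B in dice[1:])
print(wins, losses)  # typically roughly equal, up to ties and sampling noise
```

For $n$ this small, ties are common and the counts are noisy, so this is only an illustration of the statement being tested, not evidence either way.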

I’ve only really been thinking about the sequences model, but I think the experimental evidence is suggesting that the same should be true for the multisets model as well.

Also, I think we may already essentially have a proof of the small conjecture for the sequences model, but I’m waiting to see whether Bogdan Grechuk agrees with me about this.

I put the code on github in case someone wants to run similar calculations (or just wants to use the random dice generation methods mentioned above).

https://github.com/perfluxion/dicetools

Oops, correction on:

“Data on k=4 tournaments and fraction of ties

Calculated with multiset distribution, 100000 random dice each test.”

I meant each calculation involved 100000 tests (4 random dice each test)

Again, apologies about that, and a curse on WordPress. To make the comment easy to find, here is a link to it.

My comment from a couple of days ago is still stuck in the moderation queue.

I’m getting slightly confused, which is partly because I think that in Bogdan’s comment, in the place where he defined the relevant quantities, he forgot a power 1/2 in the denominator. That’s slightly better, and means that the lower bound I sketched for this denominator would be proportionally larger. So we still need to gain a further factor.

But suppose that we have some $j$ with $h_j\ge m$. Ah, there’s a useful property here, which is that the relevant counting function is increasing, so $h_j$ decreases by at most 1 when $j$ increases by 1. So we get at least $m/2$ values of $j'$ with $h_{j'}\ge m/2$, which gives a lower bound for $\sum_j h_j^2$ of $m^3/8$ and therefore a lower bound for the square root of this of $(m/2)^{3/2}$, which is a lot bigger than $m$.

So it’s possible that we’re done, but I’ve only half understood Bogdan’s comment, so I’ll need to await confirmation. Also, my calculation above could turn out to be nonsense — it’s not carefully checked.

Here’s the kind of elementary argument I’m talking about that should prove that the sum of the squares of the $h_j$ won’t usually be too small. Fix some $d$ and let us consider just the values of $h_j$ at multiples of $d$. The probability that any one of these is small is bounded above, and more generally the probability that $h_{(k+1)d}$ is small given the value of $h_{kd}$ is bounded above. So for lots of the $h_j$ to be small we need a lot of unlikely events to happen, the probabilities of which are multiplied together.

It seems at least possible that for suitably optimized choices of the parameters we’ll get a strong enough bound from this kind of argument. Indeed, I think the above sketch should show that with very high probability $h_j$ is reasonably large for at least half the values of $j$, which would give a lower bound for $\sum_j h_j^2$.

That looks unfortunate, but I now see that we can probably improve it quite substantially by observing that each time $h_j$ is large, it is probably surrounded by quite a lot of other large values (since the only way this could fail to be the case is if there are unexpected clusters in $A$).
