Hence, showing that with macroscopic probability, and are close for every should not be very hard. For a nonconditioned die, the variables and should be approximately Gaussian with explicit covariances, so for a conditioned die the are still jointly Gaussian. Proving it properly would require a local limit theorem to handle the double conditioning. But this time, the step distribution is known and very simple, so the proof should be easier than the previous one (or, even better, maybe a general result could be applied).

On the other hand, deducing from here that and are uniformly close does not seem obvious. We would need to show that cannot vary too much in a short interval. A possible way would be to show that is in some sense absolutely continuous with respect to a nonconditioned random walk, and then to use known results about the max of a random walk on a short interval. The absolute continuity also requires a local limit theorem, but this should not be too hard for the same reasons as above.

]]>That’s a good point, and I don’t see a way around it either.

But now I am thinking that “being excluded from the analysis in your main theorem” is *not* uncorrelated with “having lots of repeated faces” (and thus being relatively overrepresented in the multiset model), but is *negatively* correlated with it. If that’s true, then at least in some sense the main theorem should be easier in the multiset model than in the balanced sequence model (since the excluded cases are less common in its distribution).

It’s taking me a while to write up my reasons for that thought (and even once written they will be vague), so I thought I’d mention the general idea first.

]]>The difficulty with just not worrying about such coincidences as occur is that the weights are very sensitive to the numbers of coincidences. For example, if two values of multiplicity 3 are allowed to merge into one value of multiplicity 6, then the weight gets divided by 6!/(3! 3!) = 20. And it seems that taking account of this brings us back to the problem we started with (since if we knew how to deal with these mergers then we could simply take the multiplicity class to be all singletons and deal directly with the multiset model).
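
To make the sensitivity concrete, here is the arithmetic for the merger example, written out under the assumption that the weight of a multiset is the number of sequences that sort to it. The symbols $n$ (number of faces) and $m_1, \dots, m_r$ (multiplicities of the distinct values) are notation introduced for this sketch.

```latex
% Number of sequences that sort to a multiset with multiplicities m_1, ..., m_r:
\[
  N(m_1,\dots,m_r) \;=\; \frac{n!}{m_1!\, m_2! \cdots m_r!}.
\]
% Merging two values of multiplicity 3 into one of multiplicity 6
% replaces the factor 3!\,3! in the denominator by 6!:
\[
  \frac{N_{\text{after}}}{N_{\text{before}}}
  \;=\; \frac{3!\,3!}{6!}
  \;=\; \frac{36}{720}
  \;=\; \frac{1}{20}
  \;=\; \binom{6}{3}^{-1}.
\]
```

So a single merger of this kind already changes the count by a factor of 20, and larger multiplicities give larger factors.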

That’s just how it seems to me, but as with my previous remarks, anything that sounds pessimistic can potentially be knocked down by some observation that I have not made, or some additional factor that I have not taken into account, and I don’t rule that out here.

]]>(a) I guess the following won’t work, but I’d like to confirm that understanding (and that my reasoning makes sense about the other parts):

If we fix a “multiplicity class”, then a balanced sequence is just a sequence that (1) obeys certain equalities between elements (to make certain subsets of them equal), (2) obeys inequalities between the elements that are supposed to be distinct, (3) has the right sum (so it’s balanced). If the value of each subset of sequence elements which are required equal by (1) is given by an independent random variable, then is the probability of ((2) and (3)) too low? (I guess (2) and (3) are nearly independent.) For (3) I’d guess the probability is similar to the balanced sequence model (the condition still says that some linear sum of the variables has its expected value, I think); for (2) we’re saying that choices of random elements of fail to have overlaps, where depends on the multiplicity class but could be nearly as large as . I guess the probability of (2) is then roughly exponentially low in , which is why this doesn’t work. Is that right?
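
A rough numerical check of the guess about (2), as a minimal sketch: it assumes the independent values are uniform on {1, ..., n}, and the names `k` (number of independent values) and `n` (range of values) are labels chosen here, not taken from the discussion. Under that assumption the probability of no overlaps is the usual birthday-problem product.

```python
def prob_all_distinct(k, n):
    """Probability that k independent uniform draws from {1, ..., n} are pairwise distinct.

    Assumes uniform, independent draws (an assumption of this sketch).
    """
    p = 1.0
    for i in range(k):
        # The (i+1)-th draw must avoid the i values already taken.
        p *= (n - i) / n
    return p


if __name__ == "__main__":
    # With k a constant fraction of n, the probability shrinks rapidly as n grows.
    for n in (20, 100, 500):
        k = n // 2
        print(n, k, prob_all_distinct(k, n))
```

When `k` is a constant fraction of `n` this product decays exponentially in `n`, which matches the guess above; when `k` is of order the square root of `n` it stays bounded away from zero.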

(b) thinking out loud:

But what if we just omit condition (2)? Then we have some kind of generalization of a “multiplicity class” (except we want to think of it as a random distribution over dice, not just as a class of dice). It’s no longer true that all the dice in this distribution have the same preimage-size in the map from the balanced sequence model to the multiset model… but (in a typical die chosen from this distribution) most of the random variables have no overlaps with other ones, so only a few of the subsets of forced-equal sequence elements merge together to increase that preimage size. Can we conclude anything useful from this?

(We would want to first choose and fix one of these distributions, then show that using it to choose dice preserves the desired theorems, then show that choosing the original distribution properly (i.e. according to the right probability for each one) ends up approximating choosing a die using our desired distribution. In other words, we’d want some sum of these sort-of-like-multiplicity-class distributions to approximate our desired overall distribution.)

]]>Because the weights change so drastically between the sequence and multiset models (as you mention, the ratio can be as extreme as n! : 1), it feels like either something very special is being exploited for the conjectures to still hold in both models, or this should just happen fairly often with a change of weights. But we’ve already seen that the intransitivity is fairly fragile when changing the dice model.

I think this “something special” is the following. In the sequences model, the score distribution for a random die is very similar to a Gaussian, and I conjecture that this remains true with high probability even when we look at the score distribution for the subset of dice constrained to have some particular multiplicity pattern of values (e.g. 12 numbers are unique, 3 are repeated twice, 5 are repeated three times, etc.).
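
For concreteness, one possible way to encode the "multiplicity pattern" of a die is sketched below. The function name `multiplicity_class` and the representation (a dict from multiplicity to the number of distinct values with that multiplicity) are choices made for this illustration only.

```python
from collections import Counter


def multiplicity_class(die):
    """Map a die (list of face values) to its multiplicity pattern.

    Returns a dict {multiplicity: number of distinct values with that multiplicity},
    with keys in increasing order.
    """
    value_counts = Counter(die)                    # face value -> times it appears
    pattern = Counter(value_counts.values())       # multiplicity -> number of values
    return dict(sorted(pattern.items()))


if __name__ == "__main__":
    # Values 2 and 6 appear once; values 1 and 4 appear twice.
    print(multiplicity_class([1, 1, 2, 4, 4, 6]))
```

Two dice lie in the same class exactly when this function returns the same dict for both, so it could be used to group sampled dice before comparing their score distributions.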

Given the already completed sequence proof, the stricter conjecture is equivalent to saying the U variable is not correlated with the multiplicity of values. Looking at how U is defined, that sounds plausible to me, and may be provable.

If this stricter conjecture is true, then any change of weights for the random distribution will be fine as long as all dice within each “multiplicity class” are reweighted by the same factor. And this is the case for the shift from sequences to multisets.

]]>If one wants to come up with at least some distinguishing property, it seems good to focus on things like the number of repeated elements, or more generally how the numbers of the different elements are distributed. If we define a map from sequences of length to multisets by writing the sequences in increasing order, then the number of preimages of a multiset depends very strongly on how many repeated elements it has, with extremes ranging from 1 (for the multiset ) to (for the multiset ). Since multisets with many repeats give rise to far fewer sequences, one would expect that repeats are favoured in the multisets model compared with the sequences model. I would guess that from this it is possible to come up with some statistic to do with the number of repeats that holds with probability almost 1 in the multisets model and almost zero in the sequences model.
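
The dependence of the preimage count on repeats can be checked directly. The following is a small sketch, not part of the original argument: it counts the sequences that sort to a given multiset using the multinomial formula, and the helper name `preimage_count` is made up for this example.

```python
from collections import Counter
from math import factorial


def preimage_count(multiset):
    """Number of distinct sequences whose sorted form is the given multiset.

    Uses the multinomial formula n! / (m_1! * ... * m_r!), where the m_i are
    the multiplicities of the distinct values.
    """
    total = factorial(len(multiset))
    for m in Counter(multiset).values():
        total //= factorial(m)   # exact: each partial quotient is an integer
    return total


if __name__ == "__main__":
    print(preimage_count([1, 2, 3, 4, 5]))  # all faces distinct
    print(preimage_count([3, 3, 3, 3, 3]))  # all faces equal
    print(preimage_count([1, 1, 3, 5, 5]))  # two repeated pairs
```

The three example dice all have five faces summing to 15, and their preimage counts differ by more than two orders of magnitude, which illustrates how strongly the multiset model reweights dice with repeats.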

]]>I haven’t been following this project closely, but my impression is that your existing results can be characterized as “all dice behave ‘reasonably’ except for a negligible fraction, and among the ‘reasonable’ ones, our theorems hold, and from this it follows they hold in general”.

So if we take as our predicate that a die is ‘unreasonable’, then if switching to the multiset model (and thus changing the distribution over dice) makes any of the analogous theorem statements false (and if my general understanding is correct), that predicate has to be one which is negligible in the sequences model but not in the multisets model. (Let’s call that a “contrasting predicate”.)

(I’m not conjecturing these “contrasting predicates” don’t exist — in fact, I’m guessing that someone here might be immediately able to give an example of one — maybe it’s enough for the predicate to require that the distribution of element frequencies in the multiset has a certain property. But I’m wondering if thinking about the requirements on such a predicate might be illuminating.)

]]>