Over at the Polymath blog, Gil Kalai recently proposed a discussion about possible future Polymath projects. This post is partly to direct you to that discussion in case you haven’t noticed it and might have ideas to contribute, and partly to start a specific Polymathematical conversation. I don’t call it a Polymath project, but rather an idea I’d like to discuss that might or might not become the basis for a nice project. One thing that Gil and others have said is that it would be a good idea to experiment with various different levels of difficulty and importance of problem. Perhaps one way of getting a Polymath project to take off is to tackle a problem that isn’t necessarily all that hard or important, but is nevertheless sufficiently interesting to appeal to a critical mass of people. That is very much the spirit of this post.
Before I go any further, I should say that the topic in question is one about which I am not an expert, so it may well be that the answer to the question I’m about to ask is already known. I could I suppose try to find out on Mathoverflow, but I’m not sure I can formulate the question precisely enough to make a suitable Mathoverflow question, so instead I’m doing it here. This has the added advantage that if the question does seem suitable, then any discussion of it that there might be will take place where I would want any continuation of the discussion to take place.
Fast parallel sorting.
I am in the middle of writing an article about Szemerédi’s mathematical work (for a book about Abel Prize winners), and one of his results that I am writing about is a famous parallel sorting network, discovered with Ajtai and Komlós, that sorts objects in rounds. This is a gorgeous result, but unfortunately the constant is quite large, and remains so after subsequent improvements, which means that in practice their sorting network does not work as well as a simpler one that runs in time for a much smaller .
The thought I would like to explore is not a way of solving that problem — obtaining the AKS result with a reasonable constant — but it is closely related. I have a rough idea for a randomized method of sorting, and I’d like to know whether it can be made to work. If it gave a good constant that would be great, but I think even proving that it works with any constant would be nice, unless it has been done already — in which case congratulations to whoever did it and I very much like your result.
The rough idea is this. As with any fast parallel sorting method, the sorting should take place in rounds, where each round consists in partitioning all the pairs (or a constant fraction of them, but there isn’t much to be lost by doing extra comparisons) and swapping the objects round if the one that should come before the other in fact comes after it.
An obvious thing to try is a random partition. So you start with objects arranged in an arbitrary order. Your target is to rearrange them so that they are arranged according to some linear ordering that you don’t know, and all you are allowed to do is pairwise comparisons. To visualize it, I like to imagine that the objects are rocks that all look fairly similar, that you want to lay them out in a line going from lightest to heaviest, and that all you have to help you is a set of balances that can fit at most one rock on each side.
With this picture, the random method would be to partition the rocks randomly into pairs and do the corresponding comparisons. When you have compared rocks and , you put them back in the same two places in the line, but with the lighter one to the left of the heavier one (so they may have been switched round).
What happens if we do random comparisons of this kind? The answer is that it doesn’t work very well at all. To see why not, suppose that the ordering is approximately correct, in the sense that for every the th lightest rock is in roughly the th place. (To be more concrete, we could say that it is in the th place and that or something like that.) Then when we do a random comparison, the probability that we move any given rock is extremely small (in the concrete case around at most). Basically, we are wasting our time comparing rocks when we are pretty certain of the answer in advance.
This suggests an adjustment to the naive random strategy, which is to have a succession of random rounds, but to make them gradually more and more “localized”. (Something like this is quite similar to what Ajtai, Komlós and Szemerédi do, so this idea doesn’t come out of nowhere. But it is also very natural.) That is, at each stage, we would choose a distance scale and pair up the rocks randomly in a way that favours distances that are of order of magnitude , and we would make shrink exponentially quickly as we proceed.
The thing that makes it a bit complicated (and again this has strong echoes in the AKS proof) is that if you just do a constant number of rounds at one distance scale, then a few points will get “left behind” — by sheer bad luck they just don’t get compared with anything useful. So the challenge is to prove that as the distance scale shrinks, the points that are far from where they should be get moved with such high probability that they “catch up” with the other points that they should be close to.
It seems to me at least possible that a purely random strategy with shrinking distance scales could sort objects in rounds, and that if one devised the random partitions in a particularly nice way then the proof might even be rather nice — some kind of random walk with drift might be taking place for each rock.
So my questions are these.
(i) Has something like this been done already?
(ii) If not, is anyone interested in exploring the possibility?
A final remark is that something that contributed a great deal to the health of the discussion about the Erdős discrepancy problem was the possibility of doing computer experiments. That would apply to some extent here too: one could devise an algorithm along the above lines and just observe experimentally how it does, whether points catch up after getting left behind, etc.