This post is one of three threads that began in the comments after the post entitled A combinatorial approach to density Hales-Jewett. The other two are Upper and lower bounds for the density Hales-Jewett problem , hosted by Terence Tao on his blog, and DHJ — the triangle-removal approach, which is on this blog. These three threads are exploring different facets of the problem, though there are likely to be some relationships between them.
Many proofs of combinatorial density theorems rely in one way or another on an appropriate notion of quasirandomness. The idea is that, as we do, you have a dense subset of some structure and you want to prove that it must contain a substructure of a certain kind. You observe that a random subset of the given density will contain many substructures of that kind (with extremely high probability), and you then ask, “What properties do random sets have that cause them to contain all these substructures?” More formally, you look for some deterministic property of subsets of your structure that is sufficient to guarantee that a dense subset contains roughly the same number of substructures as a random subset of the same density.For example, suppose you are trying to find a triangle in a dense graph. (We can think of a graph with vertices as a subset of the complete graph with vertices. So the complete graph is our “structure” in this case.) It turns out that a quasirandomness property that gives you the right number of triangles is this: a graph of density is quasirandom if the number of quadruples such that and are all edges is approximately . That is, the number of labelled 4-cycles is roughly what it would be in a random graph. If this is the case, then it can be shown that it is automatically also the case that the number of triples such that and are edges is approximately , again the number that you would expect in a random graph.
This might seem a slightly disappointing answer to the question. We are trying to guarantee the expected number of one small subgraph and we assume as our quasirandomness property something that appears to be equally strong: that we have the expected number of another small subgraph. How can that possibly be a good thing to do?
A first answer is that if you have the right number of 4-cycles then it doesn’t just give you triangles: it gives you all small subgraphs. But an answer that is more pertinent to us here is that we can draw very useful consequences if a graph does not have the right number of 4-cycles. For example, we can find two large sets of vertices and such that the number of pairs such that is an edge is significantly larger than (which is what we would get for a random graph). Note that this is a global conclusion (an over-dense bipartite subgraph) deduced from a local hypothesis (the wrong number of 4-cycles). We can turn this round and say that we have an alternative definition of quasirandomness: a graph of density is quasirandom if for any two large sets and of vertices, the number of pairs such that is an edge is about . This definition turns out to be equivalent to the previous definition.
This sort of local-to-global implication is extremely useful. The local property can be used to show that sets have substructures. And if the local property fails, then one has some global property of the set that can be exploited in a number of ways.
It is not too important for the purposes of this thread to understand what all these ways are. But one of them does stand out: a global property is useful if it allows you to find a substructure of the main structure such that the intersection of with that substructure is denser than it would be for a random set. In the graphs case, our substructure will be a large complete bipartite graph. The obvious substructure to hope for in the density Hales-Jewett set-up is a combinatorial subspace (which is like a combinatorial line except that you have several different variable sets, with the number of these sets being called the dimension of the combinatorial subspace). However, as I shall argue in a comment after this summary, we will probably need to go for something more complicated than a combinatorial subspace.
So one question that it would be very good to answer is this: what on earth do we mean by a quasirandom subset of ? Or rather, what local property would guarantee roughly the expected number of combinatorial lines in a dense subset of ? (Technical point: even this is probably not quite the right question, since we ought to work not in but in a suitable triangular grid of slices, so that the uniform measure is a sensible one to use.) Of course, we would also like to understand what global properties would guarantee this, but that is a topic for the next section.
Obstructions to uniformity.
If one wants to devise a proof that depends on a quasirandomness property, then one wants that property itself to have some properties. These are (i) that it should have a local definition that allows one to show that quasirandom sets contain many combinatorial lines; (ii) that every non-quasirandom set should be non-quasirandom for a global reason (such as intersecting a large structured set more densely than expected); and (iii) that every set that is non-quasirandom for a global reason like this should look “too dense” in some substructure that is similar to the main structure.
One of the main purposes of this thread is to understand (ii) in the case of the density Hales-Jewett problem. What are the global reasons for a set not to have roughly the expected number of combinatorial lines? These are what is meant by obstructions to uniformity. The dream is to be able to say, “It’s clear that a set with one of these global properties will not necessarily have the expected number of combinatorial lines. But in fact, the converse is true too: if a set does not have the expected number of combinatorial lines then it must have one of these global properties.”
How might we work out what the obstructions to uniformity are? There are two possible approaches, both of which are valuable. One is just to think of as many genuinely different ways as we can for a set to fail to have the expected number of combinatorial lines. Another is to devise a local quasirandomness property and to see what we can deduce if that local property does not hold. So far, all the discussion has centred on the first approach, though in the end it is the second approach that one would have to use (as it is in other contexts where the theory of quasirandomness is applied).
There was quite a bit of discussion about the first approach in the “obstructions to uniformity” thread on the combinatorial approach to density Hales-Jewett post. It was summarized by Terence Tao in his comment number 148, so I won’t give lots of links here (though I will give some). But let me briefly say what the obstructions we now know of are, and add one that has not yet been mentioned.
Example 1. One fairly general class of obstructions is obtained as follows. Let be some fixed function from to , let be a positive integer, let be a structured set of integers mod (I’ll say what “structured” means in a moment), let be integers mod and let be the set of all sequences such that .
Suppose now that is a combinatorial line in , with the variable coordinates being 1 for , 2 for and 3 for . Let be the variable set. Then and . Therefore, . Now we already know that , so if there is a significant correlation between and the dilation of (let’s assume for simplicity that is invertible mod ), then knowing that and belong to makes it more (or less if the correlation is negative) likely that will belong to . Such correlation is possible if is a set such as an arithmetic progression or a Bohr set, and the fraction has reasonably small numerator and denominator when written in its lowest terms. (Incidentally, Boris Bukh is the expert on sums of sets with their dilates.)
The example given is what one might call an extreme obstruction. If is an extreme obstruction constructed in that way, then a less extreme obstruction derived from would be any set such that is significantly different from . A simple example of such a set is any set such that the number of sequences with is significantly different from . Here we are taking for , , for every , , and . (These are not the unique choices that give the example.)
Example 2. The next class of obstructions is closely related to obstructions that occur in the corners problem. Suppose that and are two subsets of , with density and , respectively. Then has density in . Now suppose that we have two points in of the form and for some . Then the point also belongs to . Therefore, if you have the two 45-degree vertices of a corner, you automatically get the 90-degree vertex. The effect of this is to give more corners than a random subset of of density . This is an important example in the corners problem, because if is a structured subset of and the sets and are random subsets of , then will have density roughly inside as well, unless is very small. The original proof of Ajtai and Szemerédi accepted this problem and used Szemerédi’s theorem to pass to a very small substructure with a density increase, and I think we should (initially at least) follow their lead, because it is likely to be much simpler. Shkredov, in his quantitative work on the corners problem, defined “structured set” to mean something like a set , where both and are highly quasirandom subsets of a structured subset of .
These facts about corners naturally suggest the following class of obstructions in the DHJ problem. This class is a generalization that I should have spotted of a class that was suggested in my comment 116 . It also relates very interestingly to some of the discussion over on the Sperner thread about simpler forms of DHJ. (See for example comment 335 and comment 344. )
Let , and be subsets of . Then let be the set of all sequences such that the 1-set of belongs to , the 2-set of belongs to and the 3-set of belongs to . Suppose that the resulting set happens to be dense (which is not necessarily the case even if , and are dense). Then has a tendency to contain too many combinatorial lines, however random the sets , and are, basically because if is a combinatorial line then you know things like that if the 1-set of lies in , then so does the 1-set of .
The discussion over at the Sperner thread has almost certainly already given us the tools to deal with a set of such a kind. The idea would be to find a partition of almost all of into combinatorial subspaces with dimension tending to infinity. The proof would be a development of the proof of DHJ(1,3), which it looks as though we already have. This would be an imitation of the Ajtai-Szemerédi approach to corners, we will find ourselves wanting to use DHJ(1,3) as a lemma in the proof of DHJ(2,3). (Actually, as I write this, I begin to see that it may not be quite so straightforward. We would probably need a multidimensional Sperner theorem, and I now realize that it is thoughts of this kind that motivate Jozsef’s attempt to find a density version of the finite-unions theorem—see comment 341 and several subsequent comments.)
Example 3. This is an intriguing example introduced by Ryan O’Donnell in comment 92. He considers (amongst other similar examples) the set of all sequences that contain a consecutive run of 1s, where is chosen so that the density of is between and , say. (A simple calculation shows that you want to take to be around .) Now if two points in a combinatorial line contain a run of 1s, then at least one, and hence both, of those points contains a run of 1s in its fixed coordinates, which forces the third point to contain a run of 1s as well. So once again we get the wrong number of combinatorial lines, because you only have to find two points in a combinatorial line to guarantee the third.
It is easy to see that a number of variants of this example can be defined. (For example, one can permute the ground set, or one can ask for a run whose length varies according to where you are, etc. etc.)
If we are going to try to prove that a non-quasirandom set must have an easily understood obstruction to uniformity, our task will be much easier if we can come up with a single description that applies to all the examples we know of. That is, we would like a single global property that is shared by all the above examples (and any other examples there might be) and that allows us to obtain a density increase on a substructure. In comment 148, Terry did just that, defining a notion of a standard obstruction to uniformity. However, I do not yet understand his definition well enough to be able to explain it here, or to know whether it covers the more general version of Example 2 above. Terry, if you’re reading this and would like to give a fully precise definition, it would be very helpful and I could update this post to include it.
The main questions for this thread, it seems to me, are these.
1. What would be a good definition of a quasirandom subset of , or rather of some suitable union of slices in ?
2. Have we thought of all the ways that a set can have too many combinatorial lines? (Ryan’s example is troubling here: it seems to contain an idea that could be much more widely applicable. One might call this the “or” idea.)
3. Do we have a unified description of all the obstructions to uniformity that we have thought of so far, or, even better, a description that might conceivably apply to all obstructions to uniformity?
Comments on this post should begin at 400.