**Update: Nets Katz and Michael Bateman have posted a preprint to the arxiv that gives an improvement to the bounds for the cap-set problem. More on this later.
**

I have been extremely pleased with the response to my first post on the cap-set problem, in that various people have very generously shared their ideas about it. Indeed, Nets Katz has even posted a comment that may give an improvement to the bound by a factor of for some very small (For reasons I have already explained, this would be very interesting if it worked.) This followed a previous comment in which he outlined some arguments that he had found with Michael Bateman that go a lot further than the ones that appeared in the section entitled “Sets with very weak additive structure” in my previous post.

Also, Ernie Croot, in addition to sketching a different very interesting approach to the problem, suggested that I should get in touch with Seva Lev, which I did. Seva has an example of a cap set that is interestingly different from the usual examples. I will discuss it in a moment.

My plan, by the way, is to try to understand the ideas in Ernie’s and Nets’s comments and then explain them in my own words in a future post. But I’m not yet ready to do that, so in this post I want to discuss some other aspects of the problem, including Seva’s example.

**Seva’s example of a cap set.**

Let me do that first. Regardless of what it may actually turn out to do for us, the example is a nice one. It’s defined as follows. Let We shall think of as where is the field with elements. Additively, this is the same as of course, but it gives us some multiplicative structure that we can use to define our set. The definition of the set, which I’ll call is very simple. Let be an element of that isn’t a square. Then we take the set of points of the form (Note that and are all points of so these points do indeed belong to )

Why is a cap set? Well, if it contained an AP of length 3, then we would have elements such that the following three points all belonged to and formed an arithmetic progression.

Taking the sum of the first and third points and subtracting twice the second (which mod 3 is the same as adding the second) and using the fact that the third coordinates must be in arithmetic progression, we find that But that implies that which contradicts the assumption that was not a square. (More precisely, the only way not to get a contradiction is if so the only APs are degenerate ones.)

What is interesting about this example? I won’t go into much detail, but will just point out a few of its features.

The only method I knew of for producing examples of cap sets until hearing from Seva was the following very simple-minded approach: take small examples and multiply them together. For example, the subset of contains no non-trivial progression, so the set doesn’t either. One can obtain slightly more interesting examples by starting with larger sets: if you have a cap set in of size where then for arbitrarily large you can find a cap set in of size

Seva’s example has density so it does not answer the question of whether a density of is possible. However, it does provide an example of a set that does not have an obvious product structure (and Seva has a proof of a rigorous statement to the effect that it does not have a product structure at all). Moreover, and this was one of Seva’s main points, the Fourier coefficients of the set are such that this set does not have any more bias on a codimension-1 subspace than is guaranteed by the simple Roth/Meshulam argument. In fact, it looks as though something similar is true for all small-codimension subspaces, for some suitable interpretation of “small”. However, because the density is exponentially small in the dimension, the example does not contradict the conjecture (or at least hope) that one can say something interesting after passing to a subspace of codimension that is a power of the logarithm of the density.

**Thoughts provoked by Ben Green’s observation.**

Recall the following question from the previous post.

**Question 1.** *Let be a cap set in of density For which values of and can one prove that there must be an affine subspace of of codimension such that the density of inside is at least *

If one is feeling very optimistic, one might hope to be able to take for some absolute constant and But if that feels a bit too bold (and after Ben’s comment it does seem to be asking quite a lot to prove it, even if it is true) then one could tone down the hope a bit and ask for to be for some absolute constant such as 4, say. Even this would be enough for a very substantial improvement to the Roth/Meshulam bound.

Now I’ve just realized that the question above is different from the one I actually asked. What I asked was the “off-diagonal” version of the question, where the initial assumption is not that is a cap set but that we have three sets and such that contains no triple in arithmetic progression. Ben made the observation that any interesting bound for Question 1 can easily be converted into a bound for the codimension of a subspace contained in where is a dense set. The argument is as follows. If we have a positive answer to Question 1 for and (in the question I took to be an absolute multiple of so I am generalizing slightly here), then let be a set of density Either is all of or it misses out a point In the latter case, misses out 0, so applying Question 1 to the three sets and we find a subspace of codimension such that the density of in is at least The number of iterations we can do of this argument is at most by which time we have passed to a subspace of codimension at most In particular, if and then contains a subspace of codimension at most

Ben made the point that if is too small then this bound is better than the remarkable recent bound obtained by Sanders. He also hinted at the converse point that perhaps one might try to argue in the opposite direction, using Sanders’s ideas (which I shall come to soon) to show that Question 1 has a positive answer as long as isn’t too small.

In order to investigate that possibility, we need to understand a bit about Sanders’s argument, and in particular the lemma of Croot and Sisask on which it depends. Rather than jumping straight in, I shall describe a proof strategy and then discuss whether it can be made to work.

Just before I do that, I’ll make an obvious comment given the discussion so far. It’s that I do not know of any technique for obtaining a density increment given that contains no non-trivial AP that does not work more generally, giving a density increment in on the assumption that there are very few solutions to the linear equation with and where and are sets of comparable density and and are non-zero. I can’t even think of a technique that *might* work. Until I can, or until somebody else can point me in the direction of such a technique, I shall look at the more general problem, which amounts to looking at what we can say about a set if there is some convolution that is very small on a set with all of and of comparable density. (This approximates the hypothesis that has density at most where is the density of and of )

**A possible approach to obtaining good cap-set bounds.**

If is a set of density and is not all of then in particular has density at most Suppose now that we could find a subspace such that was approximately equal to Recall that convolving with replaces the value with the average of over the coset Thus, what we are supposing is that the convolution tends to be approximately equal to its averaging projection, and therefore approximately invariant under translations by vectors in

Since has density at most , there is a set of density on which takes the value 0. And since we can say something like that there is a union of cosets of of total density at least on each of which has small density.

All we actually need is that there should be one coset on which is small. If we now pick any non-zero then it cannot often be the case that the density of is large in both and since otherwise we would add up a large number of reasonably large contributions to in the coset contradicting the fact that was supposed to be small there.

Speaking very approximately, for each we can put into or but not both. (That would be correct if . I am claiming that it is approximately true if we only approximately have that assumption.) It follows that must have at least double the density in some cosets (since it can intersect only half of them).

This sketchy argument would be of zero interest were it not for the fact that there is now a rather exciting new technique for obtaining statements of precisely the desired form. Having provided some motivation for finding approximate translation invariance of convolutions, let me now turn to the Croot-Sisask approach to obtaining it.

**The Croot-Sisask lemma.**

My aim in this section is not to give a full proof of the lemma, but merely to explain enough of the method to turn it into an exercise for anyone with a little experience in additive combinatorics. (I could try to spread the net a little wider, but my guess is that anyone who does not at least find the sketch below plausible enough to be persuaded that the result is correct will have lost interest in these posts by now.)

Suppose we have two subsets and of with densities and We want to be approximately invariant under all the translates from some subspace. But let’s settle for less for the time being — we’ll merely try to get it to be approximately invariant under as many translations as we can, not worrying about whether those translations form (or contain) a low-codimensional subspace.

Why should have any translation invariance properties at all? One answer comes from Fourier analysis: the Fourier coefficients of decay reasonably fast (they are absolutely summable), so we can approximate in (or indeed in any -norm with ) by keeping only the part of the Fourier transform with large Fourier coefficients, of which, by Parseval, there are not too many. But a character on is constant on cosets of a 1-codimensional subspace, so a linear combination of characters is constant on cosets of a -codimensional subspace. The problem with this argument is that has to be rather large. I won’t go into details, but this kind of argument is nothing like strong enough to give an improvement to the cap-set bounds, or even to equal them.

Croot and Sisask have a different approach that does not involve the Fourier transform. To explain it, it will be helpful to normalize as follows: instead of considering the convolution I’ll consider the convolution We can think of this probabilistically. Its value at is the probability that if you choose uniformly at random from Equivalently, you take all the sets with and take the average of their characteristic functions.

Suppose now that we could find a small such that is approximately equal to for a positive fraction of all sets of size Then there must be some set of size such that is approximately equal to for a positive proportion of all translates of But if is approximately equal to both and then is approximately equal to (Just to spell this out, if is close to then trivially is close to so the claim follows from the triangle inequality.)

How, then, are we to find a small set such that There is an obvious answer: just pick a random small subset of And this is exactly what Croot and Sisask do. Or rather, it’s almost exactly what they do, but because independent random variables are easier to handle, what they actually do is take elements from uniformly at random (with replacement) and consider the function If are distinct (which they usually are) and we let then this is indeed

To analyse how this works, consider the value at a point of the function It is the average of independent random variables, each of which involves picking a random and setting the value to be 1 if and 0 otherwise. That is, we have independent Bernoulli variables with mean and variance at most 1.

How many of these do we need in order that their sample mean will, with high probability, be close to Well, the variance of the sample mean of of them is at most so at the very least we need this to be at most To give an important example, suppose that is a random subset of of density Then is approximately so we need or And a moment’s thought shows that this is trivially necessary: the convolution is roughly constant, and since has density there is no hope of making it roughly constant if you take the average of fewer than translates.

I haven’t actually said what kind of approximation I am talking about here, but it is sort of implicit in the fact that I looked at the value at an individual is that I am looking at an approximation for some And this is indeed what Croot and Sisask, and also Sanders, do.

Now let us think about two further aspects of the method: how large a set it gives, and whether that set has any useful structure. The answer to the first question is that there are ordered subsets of of size (where is the density of and ), so if we have chosen such that for at least half of them we have then we can find at least of those that are translates of each other. So as a rule of thumb, the density of the set of good translates you get is where is the density of and is the size of your random samples.

In the case of the example where and are random sets of density this gives us a set of density Even if that set is a subspace, it has codimension so isn’t that good.

Before I discuss that further, let me quickly mention the matter of structure. Croot and Sisask observed that if you choose so that the approximations are all very good, by which I mean that we obtain a set such that is *very* close to for every then by the triangle inequality we still get pretty good approximations even if all we know about is that it belongs to for some not too large number of summands. But iterated sumsets have much more structure than arbitrary sets, a fact that was heavily exploited by Sanders (using Chang’s theorem).

However, the example above suggests that one might be able to do something else. After all, in that case is roughly constant, so it is roughly invariant under *all* translates. Can we get anywhere by understanding why the Croot-Sisask approach gives only a few of these translates?

Here is one natural thought that sheds some light on what goes wrong in the random example. Suppose we look for plenty of small sets such that approximates not in some norm but in the norm. Since all the translates of are roughly equal in the norm, this is going to be very easy — in fact, we can take The triangle inequality will then give us plenty of values such that is small. But we know that the Fourier coefficients of are absolutely summable. If the difference has small norm on the Fourier side, then it must have small -norm, so by Parseval we get some kind of bound on This suggests that the problem with the argument might be that the intermediate step it aims for (an bound) is too strong.

I haven’t checked what the bound is that comes out of the above argument, because in the next section I want to discuss an example that messes up all the thoughts I have had along these lines.

**A killer example?**

In this section I want to discuss an example that, to put things rather melodramatically, has been the bane of my life for many years. And I am not alone: for example, Nets Katz briefly alluded to it in one of his comments on my previous post. I am not yet well enough up on the new methods to know how damaging it is. I will say what I can about how I think the example might be dealt with, and will be very interested to know whether others have thoughts on the matter.

The basic idea behind the example is simple: take a random union of subspaces. Of course, there are some choices to be made here. We need to decide on the dimensions of the subspaces and the number that we shall take. Broadly speaking, if we take too many subspaces then their union will start to look like a fairly random set and we will no longer see its internal structure, whereas if we take too few then their union will be highly structured.

Suppose, then, that we take subspaces of codimension Since we have chosen them randomly, their pairwise intersections have codimension (with high probability). A property that will be very handy will be if we can treat the union as though its characteristic function were just the sum of the characteristic functions of the so we would like it if the measure of the intersections was small.

Let us write both for the subspace and for its characteristic function (a practice I was adopting above with the set when I convolved it with ). Let be the density of each The density of the union of the is, by the inclusion-exclusion formula, at least which is approximately provided is substantially less than 1. So let us assume this. Note that the square of the norm of the sum of the is (since the inner product of with is when and when ). So with this condition on and we have that the union and the sum are approximately equal in

Let The next thing I want to look at is the convolution Let and assume that which, as we have seen, implies that is small. By an easy case of Young’s inequality it follows that is small. I won’t bother with the precise calculation here — I’ll just feel free to use instead of .

Of course, is very easy to calculate: it is Now whereas if then is the constant function that takes value everywhere. (This second assertion follows from the fact that and are subspaces of density and that ) Therefore, equals

I would like to choose and so that the first term is not tiny compared with the second term Recall that we were talking about an approximation. A typical non-zero value taken by is so we need to be smaller than for which in turn we need to be at most or so.

If we also want to be about so that has density then it seems that a good choice is and So let us make that choice. Then for reasonably small the function is close to the function In other words, instead of being a constant function, it is biased by having twice as much density as it should on the set and slightly less everywhere else.

The rough point I want to make now is that is roughly constant only in directions that belong to almost all the subspaces But that forces us down to a subspace of codimension which is about So it seems as though if we want a subspace such that then it has to have disappointingly large codimension.

**Properties of the set A just constructed.**

In this section I shall try to understand the set a bit better, and try to see as precisely as I can what proof methods it rules out. (That is very much the point here. Obviously, a union of subspaces does not threaten to be a cap set, but it does have annoying properties that show that certain lemmas one would like are false.)

To begin with, let us work out the Fourier transform of By our choices of and this is well approximated by the Fourier transform of the function Let be the annihilator of for each Then it is easy to check that the Fourier transform of is Recalling that and noting that the subspaces are disjoint except for and have cardinality this gives us Fourier coefficients of size Moreover, they span a subspace of dimension which equals the Chang bound.

Next, let us observe that since is symmetric, the 3AP density of is which, by our discussion of above, is approximately This is significantly more than the that one expects in the random case.

Next, let us think what happens when we pass to an affine subspace Our set then becomes the union which has density at most (where I am also using to denote the uniform distribution on ). Let be the annihilator of Then

If intersects in a subspace of dimension then this works out as If we fix the dimension of to be then by convexity if then we maximize this by taking one to be and the rest to be zero. (I’m assuming that the are in general position.) Thus, the best density increment we can get on an -codimensional subspace is obtained by the obvious method of making that -codimensional subspace contain one of the subspaces That multiplies the density of that subspace by and doesn’t affect the densities of the other

Next, let us think about translation invariance of Recall that this is approximately equal to Now take a point For what density of do we have that ? Before we discuss that, we ought to decide how good we want the approximation to be. Since a typical value of is around it seems reasonable to ask for the difference to be small compared with For this it is important not to choose such that and

Now the density of is around (since we chose and to make this the case). If then it is not possible for and both to belong to and otherwise it is. So let be the set of such that Then to a pretty good approximation, has density around

It follows that unless belongs to at least half the the density of points such that is small compared with is at most Since in the sketchy argument earlier I wanted to know with reasonable accuracy on a set of density and since the density of that belong to half the is extremely small, this example seems to show that there is a fundamental limitation to any argument that seeks to find a subspace and prove that is roughly constant on translates of

The above argument is a bit sketchy and I confess that I have not (yet) backed it up with precise statements and rigorous calculations. But I am reasonably confident that I am drawing the correct moral from it.

**Some further observations concerning the above example.**

1. There is an obvious analogue in of the set I have just constructed in Instead of taking a random union of subspaces, one takes instead a random union of random Bohr sets (which will have predictable densities because they are defined by highly dissociated frequencies). This suggests that one ought to be able to prove the sharpness of Chang’s inequality in without resorting to niveau sets. I plan to check this. (Alternatively, if anyone feels like doing some of the estimates, then perhaps there is scope for a joint paper here. The project would be to take the discussion of the previous paragraph, make it rigorous, and generalize it to all finite Abelian groups. Of course, all that would depend on its being correct in the first place.) [Added later: I’ve now realized that what I wrote here is wrong, since the example is weaker than the Chang bound. Chang’s inequality tells us that the number of dissociated frequencies where the Fourier coefficient is at least is at most whereas this union-of-subspaces example gives only ]

2. It is interesting to note that there is an easy way of getting a density *increase* for the set above, but getting the density to *decrease* is much harder. To get the density to increase, we pick on a subspace and choose in such a way that If we want to decrease the density, then it looks as though we should choose to be disjoint from But this has a much smaller effect. This appears to demonstrate that the one-sidedness that both Nets Katz and I have run up against (and actually Seva Lev talks about it too) is a phenomenon that genuinely occurs. It also suggests that one might perhaps be able to find an interesting example by taking the *complement* of a union of subspaces, so that density *increases* become hard to obtain. However, I do not yet see how to make this thought work. (I can’t just take the complement of the set above as that has density But if I try to make it smaller, then the union and the sum are no longer roughly the same, so the example becomes harder to analyse. And now, a while later, I have made a fairly serious effort to analyse it and it seems to me to be hard to get the density to be small without the set becoming quasirandom.)

3. The main feature I would like to draw attention to in the above example is that it splits naturally into pieces, each of which has excellent translation-invariance properties. So a natural thing to try to do is find a general argument for splitting sets up into their “components” before attempting to apply an improved Croot-Sisask lemma. Alternatively, perhaps one can give an argument of the following kind: if a set contains pieces and that have low mutual additive energy, then is roughly constant, which implies that contains 3APs. If it does not contain such pieces, then ought to have some unusual structure that prevents this. Perhaps, for instance, there must be many directions such that is unusually large, and perhaps that implies that we can obtain a better density increment than Roth-Meshulam gives.

4. The annoying example has a large spectrum that splits into subspaces. We know that we can in fact assume that no subspace can contain too many elements of the large spectrum. Could it be that we can obtain useful translation invariance of under the additional assumption that the large spectrum of is “well-dispersed”?

As with my previous post, I end this one feeling as though there is much more to say. But I think my thoughts will be more organized if I leave it a few days, especially if I get some useful feedback on this post: I’ll be particularly interested in any ideas people might have for dealing with random unions of subspaces. Given that these cause difficulties for many of the ideas that occur to me, perhaps the title “What is difficult about the cap-set problem?” would have been more appropriate for this post than the last one.

January 19, 2011 at 9:20 am |

[…] (March 2009). Here are two “open discussion” posts on the cap set problem (first, and second) from Gowers’s blog (January 2011). These posts include several interesting Fourier theoretic […]

January 19, 2011 at 4:41 pm |

I have a further remark to make concerning the “killer example”, still based on the assumption that it has the properties I claimed it has. It is closely related to observation 2 above.

Suppose we want to try to prove that a set of density with no APs of length 3 must have a substantial density increase on some subspace of not very high codimension. It is tempting to try to prove a more general assertion, namely that if is any bounded function that averages and is significantly smaller than then averages significantly more than on some low-codimensional subspace.

However, I think the “killer example” shows that this statement is false. It gives us an example of a set with density and with AP-density roughly Now consider the function This again has density Its 3AP count is However, the fact that it is hard to obtain a density

decreasefor makes it hard to obtain a densityincreaseforThis shows that any proof that hopes to use a density-increment method to obtain a substantial improvement to the Roth-Meshulam bound must use in a crucial way the fact that the characteristic function of takes values in rather than, say, in More suggestively, the balanced function of takes values in and we must use the “one-sided” fact that its negative values cannot have large modulus.

Since this is potentially an important point, let me try to be more precise about what kinds of arguments the example rules out. To do that, I’ll bound from below the density of in an -codimensional affine subspace Let be the annihilator of and let be the annihilator of Then the density of in is unless contains a point of In the second case, contains a vector capable of distinguishing from Whether or not it does so depends on whether we choose the right translate of but the point is that the density of in is roughly where is the number of the that are intersected by Since the are assumed to be in general position, the density of in is at least roughly Thus, to get a density increase proportional to the number of dimensions we must lose is also proportional to

If this is correct, it gives a test that should be applied to any attempted proof that tries to use a density-increment argument: does the argument generalize in a natural way to -valued functions? If it does, then the attempt will not improve on the Roth-Meshulam bound.

As with the post, this comment comes with a health warning: I haven’t written things up carefully so it is possible that what I am saying is nonsense.

January 20, 2011 at 11:46 am |

I’d like to point out briefly that the arguments of Nets Katz and Michael Bateman (and the more simple-minded arguments from my previous post) pass the test I discussed in my previous comment. That is, they exploit positivity in a crucial way. Let me spell this out.

It’s clear that what I have just said

mustbe true, since one of the key facts driving what Nets was writing about is that you can’t have more than Fourier coefficients of size in a subspace of dimension whereas the function I discussed in my previous comment had Fourier coefficients lying in a subspace of dimension This isn’t a contradiction, because the function does not take values in though it does take values in But let us see in more detail why it isn’t a contradiction.The argument for the bound goes like this. (I have already given it, and Nets has given it in a different form.) Suppose we can find a subspace of dimension such that for values of the Fourier coefficient has modulus at least Then (The initial comes from the fact that Now the restriction of to is the Fourier transform of where is the subspace annihilated by (so has codimension ). Therefore, by Parseval, If is substantially less than 1, then this roughly tells us that and therefore that there is an affine subspace of codimension (a translate of ) where takes at least the value which is the same as saying that the density of in that affine subspace is at least It is here that I have used positivity: if I didn’t know that was positive, then its norm might be accounted for by some large negative values instead.

Now if we did Roth-Meshulam iterations, we would obtain a density increment of whereas actually we have obtained So the assumption that we cannot improve on Roth-Meshulam implies that cannot be substantially bigger than as claimed.

January 21, 2011 at 4:40 am |

Your construction of a union of a bunch of random subspaces made me think of a nice idea/result due to A. Blokhuis. Rather than write this all out, let me direct you to the following problem I posed at the ICM satellite conference on Analytic Number Theory:

See section 1.4 of the note. Basically, Blokhuis’s idea is an algebraic method that gives good bounds on the “union of subspaces” problem in that section (sec. 1.4).

Perhaps there is a way to combine Tom’s Bogolyubov theorem (or my work with Olof) with this algebraic approach to give a strong solution to the problem at the beginning of that note (since Tom’s result implies that sumsets contain 99 percent of the elements of a large coset progression, one almost has that contains the union , where is some large subspace/subgroup with the element removed). If so, then, as I comment, one would get strong upper bounds on the size of subsets in without 3APs, under a certain “uniformity assumption” (which perhaps can be removed) mentioned in the note.

Assuming this could be done, it makes me wonder whether one could say something similar in — can algebraic and combinatorial/analytic methods be combined to attack the CAP set problem?

January 24, 2011 at 6:08 am |

It is late so forgive me if what I write is not correct — but Lev’s example is a quadric surface S in A^3 over F = F_{3^m}, and I suppose what’s going on is that the non-squareness of lambda means that the two familes of lines contained in S are switched by the Galois group of F — in particular, there is no line at all defined over F and contained in S. But then, by Bezout, any line intersects S in at most two points. So I think something even stronger is true; that in Lev’s example, any F-arithmetic progression (i.e. a set of the form a + Fb with a,b in F^3) intersects S in at most two points. Any ordinary three-term arithmetic progression is contained in one of these.

January 27, 2011 at 11:16 am |

This is a bit of a non-comment, but I wanted to write some kind of acknowledgement of the comments of Ernie Croot and JSE above: I may not have thought of anything worth saying in response, but I found both of them interesting and thought-provoking.

January 28, 2011 at 7:49 am |

A comment to follow up on Jordan’s one. First, a set in with no three collinear points is called a cap because it looks like a cap (=hat like thing). An elliptic quadric (i.e. a quadric whose ruling is only defined in a quadratic extension) is a cap with points (where ). More surprising is the following theorem of B. Segre, if is odd, then a cap in whose is cardinality is at least (where is some constant I don’t remember) is contained in an elliptic quadric.

January 28, 2011 at 8:19 am

Many thanks for this interesting comment. I have to confess that the hattishness of a set with no three collinear points doesn’t jump out at me, unless you are talking about subsets of — is that the point?

I was about to ask why the last statement doesn’t solve the cap-set problem, but suddenly realized that not containing three collinear points in is much stronger than not containing a line in But that is an interesting result you mention.

In case you are wondering why your latex didn’t initially come out, it is because you put backslashes in front of “latex”.

January 29, 2011 at 7:04 am

Thanks for fixing my latex. Yes, the terminology is supposed to evoke the geometry of although it may not be necessarily true even there that a set with no 3 collinear points looks like a hat.

January 30, 2011 at 4:44 pm |

While we are on the topic of collinear points (and incidences), I thought I would mention a problem related to 3APs:

Problem. Can one encode three-term progressions in terms of point-line incidences in ? More specifically: suppose you are given a set of pairs , which represents the fact that the point is on the line . Furthermore, suppose that this list of incidences may not be exhaustive in that there could be other point-line incidences in the drawing not accounted for by the list. Then, no matter how you draw the configuration of points and lines, so long as the incidences are respected, must the set of points ALWAYS contain a 3AP?

Take, for example, the grid of points, with lines, each incident with three of those points. Some points are incident with of those lines; some are incident with only ; and one point is incident with lines. It turns out that there are many OTHER ways of drawing the points and lines (uncountably many, even after accounting for affine transformations), besides the lattice grid, such that the points do not contain a three-term progression. (It is a nice exercise to work out some examples!)

But what about grids, or , etc.?

….

The original motivation of this problem was to find a way to convert problems about 3APs into incidence questions that could then be approached algebraically (e.g. using elliptic curves).

January 30, 2011 at 9:18 pm

I’m not sure whether I understand the problem correctly, but do you mean by three points forming an 3AP that one of them is the midpoint of the line segment joining the two other ones? If so (or also otherwise) could you, please, indicate a point/line-configuration that enforces the presence of a 3AP? At the moment I cannot think of any, mainly because there seem to be far too many projective maps that preserve incidence but tend to destroy the midpoint property. (To be more precise: Take a drawing of a given configuration, colour all points which are fourth harmonic points of collinear triples of points you have used yellow, consider a line avoing all these yellow points and send it to infinity.)

January 31, 2011 at 4:27 am

Christian,

“but do you mean by three points forming a 3AP that one of them is the midpoint…?” Yes, that is what I mean.

“Can you indicate a… ?” I don’t know of a single one! That’s my question — prove that one exists.

Assuming that the point-line incidence restrictions for all the lines associated to some grid pin down the grid structure up to affine transformations, it should be possible to decide this algebraically: in order to bypass the problems of multiple configurations due to affine transformations (which preserve incidences), one can anchor three of the points to be and . Then, one can represent triples of collinear points by quadratic questions, which reflect the fact that the slopes of the segments and are equal. In principle, then, one can represent all the point-line incidences (using only the lines meeting three or more of the points) by a large number of these quadratic equations, which *should* have a unique solution (since the anchoring prevents multiple solutions coming from affine transformations). Perhaps one of these computer algebra packages (like MAGMA) can then verify that the equations indeed have a unique solution in the reals.

Assuming this is true (the incidence data for sufficiently large grids uniquely determines the structure of the grid up to affine transformations), a bigger and more interesting challenge would be to produce a set of point-line pairs where every line is incident to EXACTLY three points (i.e. not four or more), such that for any drawing of the incidence the points must ALWAYS contain a 3AP. I happen to believe it is possible to find such a configuration; in fact, I believe that projecting down a grid (and the associated lines hitting three points each) to the 2D plane (bijectively) will give such a configuration when the number of dimensions is large enough.

Why consider point configurations where every line meets exactly three points? Well, that has to do with ellitpic curves… another time, perhaps.

January 31, 2011 at 11:49 am |

Dear Ernie,

your argument involving a large number of quadratic equations sound compelling, but I still don’t see why the following rules out that, e.g. a grid, can enforce 3APs. The map preserves collinearity of points, so if we apply it to all points with integers ranging from to we get a copy of our grid. (I assume here that and are linearly independent over the rationals, so that we do not divide by zero. It is also useful to assume these numbers to be algebraically independent for the sake of discussion. If one does not want to rely on that, one can replace them by two other “sufficiently general” real numbers). Now in the new drawing no point seems to be the midpoint of two other points (basically because this would tell us something non-trivial about and .)

So it may be that the quadratic equations you produce are not very independent from one another. A possible reason for that, apart from getting only insted of equations for any $ latex k$ collinear points might be hidden in results such as Pappus’ or Desargues’ theorem, which seem to tell us that sometimes one of these equations is entailed by eight or nine others in a non-trivial way. Hilbert’s Nullstellensatz suggests that these implication can be made rather explicit in single equations, which would incidentally yield – probably rather ugly – one-line-proofs of these two theorems.)

January 31, 2011 at 11:53 am

Oups, apart from that I failed to press the reply button, a crucial formula did not parse, so here is another attempt to write it down with slightly more English: The point is mapped to .

February 1, 2011 at 2:17 am

Dear Christian,

Wow! That’s fantastic. I had thought about various kinds of maps like you had described, but had somehow overlooked this particular possibility. I suppose it sinks my hope of encoding 3APs by incidences, but at the same time it means that problems about 3APs really are fundamentally different from incidence-type problems (at least superficially).

By the way, there is an old problem posed by T. Tao (and I think also J. Solymosi and others) appearing in the following (problem 4.6)

about classifying collections of points and lines with almost the maximal number of incidences. The fact that there are maps such as you describe suggests that the answer will not simply be “dense subsets of uniform grids”, but will instead be much more complicated.

February 1, 2011 at 11:44 am

Dear Ernie,

it seems to me that this kind of map is by far to general to imply anything interesting about the inverse question associated with the Szemerédi-Trotter thereom, except for that it suffices to classify the relevant configurations up to “projective isomorphisms”. By this I mean: These maps can be applied not only to grids, but to all configurations of points and lines, so in particular to those maximizing the number of incidences. By the way, I think I can do Problem 4.13 from your list, but probably that’s not new given that the list seems to be a few years old.

Best regards,

Christian.

February 2, 2011 at 3:27 pm

About that problem 4.13. Let’s continue this conversation off of this blog. Send me an email.

I guess I was being silly about the 3APs and incidences question, now that I think about it. I should have thought about the artist’s drawing of a horizon — that transformation preserves lines and incidences, but not 3APs.

But funnily enough I had asked this problem about 3APs to several people (including a geometer), and not one of them thought of using these projective transformations.

February 4, 2011 at 1:00 pm |

Concerning my example — here is an idea of how one could try to develop it further, to construct even larger caps. Let be the field of order , and let and be quadratic forms over (more precisely, over in five variables). Denote by the set of common zeroes of and . Suppose that contains no (non-trivial) arithmetic progression \emph{with a difference from }, and consider the set

Any arithmetic progression in is of the form

where and for . The latter condition implies , whence . But now the former condition along with the assumption that contains no arithmetic progressions with the difference in show that ! Thus, is a cap, and we have . Can we choose , and so that (as )? (What gives a hope that this is possible is that we do allow progressions in a vast majority of directions!) If so, we have constructed a cap in of size which, as far as I remember (I may well be wrong about it though), beats the best known constructions.

February 11, 2011 at 10:34 am

I did a Monte Carlo with q=9, and I think found two quadratics whose zero-sets shared q points, all on a line through zero. (That is not surprising for homogeneous quadratics.)

If that happened in other fields, it would allow points in A, out of a total points, which is not as good as your other solution.

The only better result for this method would be if there are two quadratics whose only zero-point in common is the zero vector itself. In that case, the size of A would be and you get your density

February 12, 2011 at 6:54 am

The Chevally-Warning Theorem (1936) says that any set of polynomials over a finite field, with more variables than the total of the degrees of the polynomials, has simultaneous zeros.

In our case, that means two quadratics in five variables have simultaneous non-trivial zeros, so A can’t have points.

Or three quadratics in seven variables have simultaneous non-trivial zeros, and again we can’t do better than your first solution.

Perhaps two quadratics in eight variables could have just q simultaneous zeros?

February 15, 2011 at 8:47 am

Consider two quadratics in four variables whose only common zero is the zero vector. They lead to a solution (w,x,y,z,Q1,Q2) with points, and the same density as Seva’s original example.

I don’t know if they include new examples or not.

The only example I can think of is and . where is not a square.

Unfortunately, that is just two solutions multiplied together.

February 6, 2011 at 9:58 pm |

[…] Polymath6 has started. It is about improving the bounds for Roth’s theorem. See this post. Also there is a page on the polymath blog here. There is also a wiki here. Also see this post and this post. […]

June 14, 2011 at 9:53 pm |

[…] after many years, the Roth-Meshulam bound. See these two posts on Gowers’s blog (I,II). Very general theorems discovered independently by David Conlon and Tim Gowers and by Matheas […]

February 5, 2013 at 11:51 pm |

Regarding random union of subsets: It looks that if when we consider moving from V to a hyperplane, and then a codim 2 space and so on, the density-increase grows geometrically with the codimension, are there examples avoiding this feature?

July 8, 2020 at 5:00 pm |

[…] methods that for some ! This was very exciting. See these two posts on Gowers’s blog (I,II). This raised the question if the new method allows breaking the logarithmic barrier for […]