Archive for the ‘Straight maths’ Category

Reflections on the recent solution of the cap-set problem I

May 19, 2016

Sometimes blog posts about recent breakthroughs can be useful because they convey the main ideas of a proof without getting bogged down in the technical details. But the recent solution of the cap-set problem by Jordan Ellenberg, and independently and fractionally later by Dion Gijswijt, both making crucial use of an amazing lemma of Croot, Lev and Pach that was made public a week or so before, does not really invite that kind of post, since the papers are so short, and the ideas so transparent, that it’s hard to know how a blog post can explain them more clearly.

But as I’ve got a history with this problem, including posting about it on this blog in the past, I feel I can’t just not react. So in this post and a subsequent one (or ones) I want to do three things. The first is just to try to describe my own personal reaction to these events. The second is more mathematically interesting. As regular readers of this blog will know, I have a strong interest in the question of where mathematical ideas come from, and a strong conviction that they always result from a fairly systematic process — and that the opposite impression, that some ideas are incredible bolts from the blue that require “genius” or “sudden inspiration” to find, is an illusion that results from the way mathematicians present their proofs after they have discovered them.

From time to time an argument comes along that appears to present a stiff challenge to my view. The solution to the cap-set problem is a very good example: it’s easy to understand the proof, but the argument has a magic quality that leaves one wondering how on earth anybody thought of it. I’m referring particularly to the Croot-Lev-Pach lemma here. I don’t pretend to have a complete account of how the idea might have been discovered (if any of Ernie, Seva or Peter, or indeed anybody else, want to comment about this here, that would be extremely welcome), but I have some remarks.

The third thing I’d like to do reflects another interest of mine, which is avoiding duplication of effort. I’ve spent a little time thinking about whether there is a cheap way of getting a Behrend-type bound for Roth’s theorem out of these ideas (and I’m not the only one). Although I wasn’t expecting the answer to be yes, I think there is some value in publicizing some of the dead ends I’ve come across. Maybe it will save others from exploring them, or maybe, just maybe, it will stimulate somebody to find a way past the barriers that seem to be there.

Entropy and Sidorenko’s conjecture — after Szegedy

November 18, 2015

Here is a simple but important fact about bipartite graphs. Let G be a bipartite graph with (finite) vertex sets X and Y and edge density \alpha (meaning that the number of edges is \alpha |X||Y|). Now choose (x_1,x_2) uniformly at random from X^2 and (y_1,y_2) uniformly at random from Y^2. Then the probability that all of x_1y_1, x_1y_2, x_2y_1 and x_2y_2 are edges is at least \alpha^4.

The proof is very simple. For each x, let f_x:Y\to\{0,1\} be the characteristic function of the neighbourhood of x. That is, f_x(y)=1 if xy is an edge and 0 otherwise. Then \sum_{x,y}f_x(y) is the sum of the degrees of the x\in X, which is the number of edges of G, which is \alpha |X||Y|. If we set f=\sum_{x\in X}f_x, then this tells us that \sum_{y\in Y}f(y)=\alpha|X||Y|. By the Cauchy-Schwarz inequality, it follows that \sum_{y\in Y}f(y)^2\geq\alpha^2|X|^2|Y|.

But by the Cauchy-Schwarz inequality again,
(\sum_{y\in Y}f(y)^2)^2=(\sum_{y\in Y}\sum_{x_1,x_2\in X}f_{x_1}(y)f_{x_2}(y))^2
= (\sum_{x_1,x_2\in X}\sum_{y\in Y}f_{x_1}(y)f_{x_2}(y))^2
\leq |X|^2\sum_{x_1,x_2\in X}(\sum_{y\in Y}f_{x_1}(y)f_{x_2}(y))^2
=|X|^2\sum_{x_1,x_2\in X}\sum_{y_1,y_2\in Y}f_{x_1}(y_1)f_{x_2}(y_1)f_{x_1}(y_2)f_{x_2}(y_2).
That last expression is |X|^2 times the number of quadruples x_1,x_2,y_1,y_2 such that all of x_1y_1, x_1y_2, x_2y_1 and x_2y_2 are edges, and our previous estimate shows that it is at least \alpha^4|X|^4|Y|^2. Therefore, the probability that a random such quadruple consists entirely of edges is at least \alpha^4, as claimed (since there are |X|^2|Y|^2 possible quadruples to choose from). (more…)

Sums of kth powers

November 4, 2014

Recently I was giving a talk at a school in London and had cause to think about sums of kth powers, because I wanted to show people the difference between unenlightening proofs and enlightening ones. (My brief was to try to give them some idea of what proofs are and why they are worth bothering about.) On the train down, I realized that there is a very simple general argument that can be used to work out formulae for sums of kth powers. After a brief look online, I haven’t found precisely this method, though I’ve found plenty of methods in the vicinity of it. It’s bound to be in the literature somewhere, so perhaps somebody can point me towards a reference. But even so, it’s worth a short post.

I’ll start with the case k=1. I want to phrase a familiar argument in a way that will make it easy to generalize in the way I want to generalize it. Let R_2(n) be the rectangle consisting of all integer points (x,y) such that 1\leq x\leq n and 1\leq y\leq n+1. We can partition R_2(n) into those points for which x\geq y and those points for which x<y. The number of points of the first kind is 1 + 2 + ... + n, since for each x we get x possibilities for y. The number of points of the second kind is 0 + 1 + ... + n, since for each y we get y-1 possibilities for x. Therefore, 2(1 + 2 + ... + n) = n(n+1) and we get the familiar formula. (more…)


July 19, 2014

The title of this post is a nod to Terry Tao’s four mini-polymath discussions, in which IMO questions were solved collaboratively online. As the beginning of what I hope will be a long exercise in gathering data about how humans solve these kinds of problems, I decided to have a go at one of this year’s IMO problems, with the idea of writing down my thoughts as I went along. Because I was doing that (and doing it directly into a LaTeX file rather than using paper and pen), I took quite a long time to solve the problem: it was the first question, and therefore intended to be one of the easier ones, so in a competition one would hope to solve it quickly and move on to the more challenging questions 2 and 3 (particularly 3). You get an average of an hour and a half per question, and I think I took at least that, though I didn’t actually time myself.

What I wrote gives some kind of illustration of the twists and turns, many of them fruitless, that people typically take when solving a problem. If I were to draw a moral from it, it would be this: when trying to solve a problem, it is a mistake to expect to take a direct route to the solution. Instead, one formulates subquestions and gradually builds up a useful bank of observations until the direct route becomes clear. Given that we’ve just had the football world cup, I’ll draw an analogy that I find not too bad (though not perfect either): a team plays better if it patiently builds up to an attack on goal than if it hoofs the ball up the pitch or takes shots from a distance. Germany gave an extraordinary illustration of this in their 7-1 defeat of Brazil.

I imagine that the rest of this post will be much more interesting if you yourself solve the problem before reading what I did. I in turn would be interested in hearing about other people’s experiences with the problem: were they similar to mine, or quite different? I would very much like to get a feel for how varied people’s experiences are. If you’re a competitor who solved the problem, feel free to join the discussion!

If I find myself with some spare time, I might have a go at doing the same with some of the other questions.

What follows is exactly what I wrote (or rather typed), with no editing at all, apart from changing the LaTeX so that it compiles in WordPress and adding two comments that are clearly marked in red.

Problem Let a_0<a_1<a_2<\dots be an infinite sequence of positive integers. Prove that there exists a unique integer n\geq 1 such that

a_n <\frac{a_0+a_1+\dots+a_n}n\leq a_{n+1}\ .

The work of Endre Szemerédi

March 8, 2013

A few years ago, Springer published a book called, The Abel Prize: 2003-2007 The First Five Years. A brief calculation will reveal that a second volume ought to be due soon, and that is indeed the case. I was asked to write the article about Endre Szemerédi, the 2012 winner, which I have just finished. I am glad to say that Springer’s policy with regard to this book is extremely enlightened: not only am I allowed to post my article as a preprint, but the entire book will be posted on the Norwegian Academy of Sciences website and will be freely accessible.

I was told to write the article as I pleased — the articles in the first volume are very different in style from each other — so I went for a style that was not unlike what I might have written if I had wanted to present several of Szemerédi’s results in a series of blog posts. That is, I’ve tried to explain the ideas, and when the going gets tough I have skipped the details. So it seems appropriate to post the article on this blog.

If you look at it and happen to notice any typos, false statements, wrong emphases, etc., I think it isn’t too late to make changes, so I’d be grateful to hear about them. Here is the article.

Ted Odell

February 10, 2013


I was shocked and saddened to hear about a week ago that Ted Odell, a mathematician to whom I owe a lot, died suddenly on January 9th of a heart attack while he was travelling to this year’s joint AMS/MAA meeting in San Diego. He was 65, but seemed a lot younger.

Ted was a world leader in Banach space theory, and in particular in the infinite-dimensional theory. The wry and slightly enigmatic smile you see in the photo was extremely characteristic: if I imagine Ted, I automatically imagine him with exactly that smile. Less clear from the photo, though perhaps it can be guessed from the camera angle, is that he was extremely tall: he belonged to a handful of mathematicians I know who make me feel short (Tom Sanders and Alex Scott being two others).

I first met Ted when I went to my first ever conference, in Strobl am Wolfgangsee in Austria in 1989. I can’t remember how it came about, but I ended up chatting to him, and he started explaining to me in a wonderfully clear way — the kind of explanation you just can’t get from a textbook — how Tsirelson’s space worked. I read in an obituary by András Zsak (which starts on page 30 of this issue of the LMS newsletter) that Ted had a reputation for being kind and encouraging to young mathematicians. He certainly was to me at this conference, taking the time to give this explanation to a graduate student about whom he knew nothing. Most of the next section describes an argument that he sketched out for me on one of those yellow pads of paper that seem to be standard in US maths departments. (I think I’ve still got the yellow sheets that he let me keep, but I’ve no idea where they are.)

What are dense Sidon subsets of {1,2,…,n} like?

July 13, 2012

The short answer if you don’t feel like reading a post with some actual mathematics in it is that I don’t know.

Now for the longer answer. A subset A of \{1,2,\dots,n\} is called a Sidon set if the only solutions of the equation a+b=c+d with a,b,c,d\in A are the trivial ones with a=c and b=d or a=d and b=c. Since the number of pairs a\leq b with a,b\in A is |A|(|A|+1)/2 and 2\leq a+b\leq 2n whenever a,b\in A, it is trivial that if A is a Sidon set, then |A|(|A|+1)/2\leq 2n, which gives an upper bound for |A| of around 2\sqrt{n}.

A new way of proving sumset estimates

February 10, 2011

A very important role in additive combinatorics is played by the following theorem. If A is a finite subset of \mathbb{Z}, or indeed of any Abelian group, and if |A+A|\leq C|A|, then for any non-negative integers k and l we have the estimate |kA-lA|\leq C^{k+l}|A|. This is a theorem of Plünnecke that was rediscovered by Ruzsa. The proof had a structure of the following kind: define a type of graph, now known as a Plünnecke graph, and a notion of “magnification ratio”; formulate an inequality concerning Plünnecke graphs; using Menger’s theorem, prove the inequality for graphs with magnification ratio 1 (Menger’s theorem being used to find a collection of disjoint paths, the existence of which is a trivially sufficient condition for the inequality to hold in this case); using the tensor product trick and the construction of Plünnecke graphs that approximately have certain magnification ratios, deduce the result for all magnification ratios; construct a Plünnecke graph from a subset of an Abelian group; show, in a nice but not entirely obvious way, that the sumset estimates follow from the inequality about graphs.

The argument is an attractive one in the sense that it builds up to the result by means of clean and elementary steps (except perhaps that the approximations needed in the final step require one to get one’s hands slightly dirty). However, it is not an attractive argument if you have to give the proof in a lecture course: there are a lot of details to remember. I have this difficulty with other arguments too: there is a certain style of argument in combinatorics where one proves an amazing result after a long sequence of elementary observations. Reading such a proof can be easy, but remembering the precise sequence of elementary observations sometimes isn’t. Of course, if one feels this way about a proof, it is a good sign that one probably hasn’t fully understood the proof, as there is usually some story to tell about how the sequence of elementary observations was discovered. I freely admit to not fully understanding the proof of the sumset estimates in this deep way, and relying to some extent on memorization when I have lectured the result.

But now, I am glad to say, all that is a thing of the past. A research student of mine (about to be a former research student of mine), George Petridis, has come up with a ridiculously simple proof of the sumset estimates. Indeed, it is so ridiculously simple that my first reaction was that it could not be correct. Here’s the kind of reasoning I used: Terence Tao came up with a very nice argument, quite a bit simpler than the Plünnecke-Ruzsa proof, which gave a weaker bound that was still strong enough to be just as useful for applications in additive combinatorics. So it’s sort of obvious that getting the more exact bound must be irreducibly difficult. However, I should mention that George had previously come up with a new proof of Plünnecke’s graph inequality that had the same quality of “obvious wrongness”: it did not use Menger’s theorem, or anything like it, and did not use any kind of product trick.

Anyhow, the proof is so short that I can give it in full in this blog post, with George’s permission. (more…)

Polymath6: A is to B as C is to ???

February 5, 2011

The Polymath experiment is still very much in its infancy, with the result that we still have only a rather hazy idea of what the advantages and disadvantages are of open online multiple collaboration. It is easy to think of potential advantages and disadvantages, but the more actual experience we can draw on, the more we will get a feel for which of these plausible speculations are correct.

In my last post but one I outlined a few thoughts about the cap-set problem. Although this wasn’t meant as the beginning of a Polymath project (as I said in the post, it was modelled more on Gil Kalai’s notion of an “open discussion”), it had something in common with such projects, to the point where one of the worries people have about the Polymath approach actually occurred: I suggested a line of attack that was, unbeknownst to me, actively being pursued by somebody.

I do not know exactly what calculations went on in the minds of Michael Bateman and Nets Katz when they decided to go public with their ideas before they had fully worked them out, but my guess is that they wanted to establish that they had got there first. This they did by the very effective method of explaining that the simple observations that I had made were much weaker than what they already knew how to prove. As it happens, the story has had a happy ending for them, since they managed, soon after posting their comments, to push their ideas to the point where they have obtained the first improvement to the Roth/Meshulam bound, something that many people have wanted to do for a long time.

More on the cap-set problem

January 18, 2011

Update: Nets Katz and Michael Bateman have posted a preprint to the arxiv that gives an n^\epsilon improvement to the bounds for the cap-set problem. More on this later.

I have been extremely pleased with the response to my first post on the cap-set problem, in that various people have very generously shared their ideas about it. Indeed, Nets Katz has even posted a comment that may give an improvement to the bound by a factor of n^\epsilon for some very small \epsilon. (For reasons I have already explained, this would be very interesting if it worked.) This followed a previous comment in which he outlined some arguments that he had found with Michael Bateman that go a lot further than the ones that appeared in the section entitled “Sets with very weak additive structure” in my previous post.

Also, Ernie Croot, in addition to sketching a different very interesting approach to the problem, suggested that I should get in touch with Seva Lev, which I did. Seva has an example of a cap set that is interestingly different from the usual examples. I will discuss it in a moment.

My plan, by the way, is to try to understand the ideas in Ernie’s and Nets’s comments and then explain them in my own words in a future post. But I’m not yet ready to do that, so in this post I want to discuss some other aspects of the problem, including Seva’s example.