ICM2014 — Bhargava, Gentry, Sanders

On my last day at the ICM I ended up going to fewer talks. As on the previous two days the first plenary lecture was not to be missed — it was Maryam Mirzakhani — so despite my mounting tiredness I set my alarm appropriately. I was a little surprised when I got there by just how empty it was, until eventually I saw that on the screens at the front it said that the lecture was cancelled because of her Fields medallist’s lecture the following Tuesday. I belonged to the small minority that had not noticed this, partly because I have had a lot of trouble with my supposedly-smart phone so was there with a temporary and very primitive replacement which was not the kind of phone on to which one could download a special ICM app that kept one up to date with things like this. I had planned to skip the second lecture of the morning, so I slightly rued my lost couple of hours of potential sleep, while also looking forward to being able to use those hours to work, or perhaps make progress with writing these posts — I can’t remember which of the two I ended up doing.

As a result, the first talk I went to was Manjul Bhargava’s plenary lecture, which was another superb example of what a plenary lecture should be like. Like Jim Arthur, he began by telling us an absolutely central general problem in number theory, but interestingly it wasn’t the same problem — though it is related.

Bhargava’s central problem was this: given a function on the integers/rationals that takes integer/rational values, when does it take square values? In order to persuade us that this problem had been a central preoccupation of number theorists for a very long time, he took as his first example the function f(x,y)=x^2+y^2. Asking for this to take square values is asking for a Pythagorean triple, and people have been interested in those for thousands of years. To demonstrate this, he showed us a cuneiform tablet, which was probably the Babylonian tablet Plimpton 322, which contains a list of Pythagorean triples, some of which involve what in decimal notation are 5-digit numbers, and therefore not the kind of example one stumbles on without some kind of systematic procedure for generating it.
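One such systematic procedure is Euclid's parametrization (a, b, c) = (m² − n², 2mn, m² + n²). As an illustration of my own (not something from the talk), the following sketch generates primitive triples, including five-digit ones of the kind found on the tablet: (12709, 13500, 18541) is reportedly one of Plimpton 322's entries, and corresponds to m = 125, n = 54.

```python
from math import gcd

def euclid_triple(m, n):
    """Primitive Pythagorean triple from coprime m > n > 0 of opposite parity."""
    assert m > n > 0 and gcd(m, n) == 1 and (m - n) % 2 == 1
    return m * m - n * n, 2 * m * n, m * m + n * n

a, b, c = euclid_triple(125, 54)
print(a, b, c, a * a + b * b == c * c)   # 12709 13500 18541 True
```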

If one takes one’s function to be a cubic in one variable, then one obtains an elliptic curve, and rational points on elliptic curves are of course a huge topic in modern number theory, one to which Bhargava has made a major contribution. I won’t say much more about that, since I have already said a reasonable amount about it when discussing his laudatio. But there were a few extra details that are worth reporting.

He told us that Goldfeld, and later Katz and Sarnak, had conjectured that 50% of elliptic curves have rank 0 and 50% have rank 1 (so the density of elliptic curves with higher rank is zero). He then told us about some work of Brumer and McGuinness in 1990 that seems to cast doubt on this conjecture (which, in the Katz–Sarnak form, actually came later): they found that rank-2 curves occur quite often and their frequency increases as the coefficients get larger. More recent computational work has very strongly suggested that the conjecture is false: if you draw a graph of the average rank of elliptic curves as the size of the coefficients goes from 10^5 to 10^8, it increases quickly from 0.7 before tailing off and appearing to tend to about 0.87. Apparently the reaction of Katz and Sarnak was a cheerful, “Well, it will go down eventually.”

Bhargava was pretty sceptical about this, but became properly interested in the problem when he learnt about work of Brumer, who showed, assuming the generalized Riemann hypothesis and the Birch–Swinnerton-Dyer conjecture, that the average rank was bounded above by 2.3. As Bhargava put it, this was a result that depends on two million dollars' worth of conjectures. But that meant that if one could prove that the average rank of elliptic curves was greater than 2.3, then one would have shown that at least one of the generalized Riemann hypothesis and the Birch–Swinnerton-Dyer conjecture was false.

Still using the two million dollars' worth of conjectures, Heath-Brown got the bound down to 2 in 2004, and Young got it to 1.79 in 2009. Bhargava and Shankar managed to improve that by 0.9 and two million dollars: that is, they obtained an unconditional bound of 0.89, amusingly close to the apparent asymptote of the graph that comes from the computations. As Bhargava pointed out, if one could extend those computations and find that the average rank eventually surpassed 0.89, this would, paradoxically, be very good news for the conjecture of Katz and Sarnak, because it would prove that the graph did eventually have to start coming down.

More recently, with Chris Skinner, Bhargava got an unconditional lower bound of 0.2.

One thing I understood a bit better by the end of Bhargava’s lecture was the result that the Birch–Swinnerton-Dyer conjecture holds for a positive proportion of elliptic curves. Although this is a remarkable result, there is a sense in which it is a slight cheat. What I mean by that is that Bhargava and his collaborators have a clever way of proving that a positive proportion of elliptic curves have rank 1. Then they have a clever way of showing that for a positive proportion of those curves the order of the L-function at s=1 is also 1. What this argument doesn’t do, if my understanding is correct, is show something like this (except perhaps in some trivial sense):

  1. Every elliptic curve that satisfies a certain criterion also satisfies the Birch–Swinnerton-Dyer conjecture.
  2. A positive proportion of elliptic curves satisfy that criterion.

So in some sense, it doesn’t really get us any closer to establishing a connection between the rank of an elliptic curve and the order of the associated L-function at s=1. Perhaps in that respect it is a bit like the various results that say that a positive proportion of the zeros of the zeta function lie on the critical line, though I’m not sure whether that is a good analogy. Nevertheless, it is a remarkable result, in the sense that it proves something that looked out of reach.

Perhaps my favourite moment in Bhargava’s talk came when he gave us a hint about how he proved things. By this time he was talking about hyperelliptic curves (that is, curves y^2=P(x) where P is a polynomial of degree at least 5), where his main result is that most of them don’t have any rational solutions. How does he show that? The following slide, which I photographed, gives us a huge clue.


He looked at polynomials of degree 6. If the hyperelliptic curve y^2=a_0x^6+a_1x^5+\dots+a_5x+a_6 has a rational solution x=t, then by applying the change of variable x'=x-t, we can assume without loss of generality that the rational solution occurs at x=0, which tells us that a_6=c^2 for some rational c. But then you get the remarkable identity shown in the slide: a pair of explicit matrices A and B such that det(Ax-B)=a_0x^6+a_1x^5+\dots+a_5x+a_6. Note that to get these matrices, it was necessary to split a_6 up as a product c\times c, so we really are using the fact that there is a rational point on the curve. And apparently one can show that for most polynomials of degree 6 such a pair of matrices does not exist, so most polynomials of degree 6 do not take square values.
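As a sanity check on the change-of-variable step (my own illustration, with a made-up example, not anything from the slide): if y^2 = P(x) has a rational point at x = t, then the shifted polynomial Q(x) = P(x + t) has constant term Q(0) = P(t), which is a rational square.

```python
from fractions import Fraction
from math import comb

def taylor_shift(coeffs, t):
    """Coefficients of Q(x) = P(x + t); coefficients listed constant term first."""
    out = [Fraction(0)] * len(coeffs)
    for j, a in enumerate(coeffs):            # expand a * (x + t)^j
        for k in range(j + 1):
            out[k] += a * comb(j, k) * t ** (j - k)
    return out

# Made-up example: P(x) = x^6 + 3 has the rational point (1, 2), since P(1) = 4.
P = [Fraction(c) for c in (3, 0, 0, 0, 0, 0, 1)]
Q = taylor_shift(P, Fraction(1))
print(Q[0])   # 4, a rational square, so the new constant term splits as c * c with c = 2
```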

Just as the Babylonians didn’t find huge Pythagorean triples without some method of producing them, so Bhargava and his collaborators clearly didn’t find those matrices A and B without some method of producing them. He didn’t tell us what that method was, but my impression was that it belonged to the same circle of ideas as his work on generalizing Gauss’s composition law.

The lecture was rapturously received, especially by non-mathematicians in the audience (that could be interpreted as a subtly negative remark, but it isn’t meant that way), who came away from it amazed to feel that they had understood quite a bit of it. Afterwards, he was mobbed in a way that film stars might be used to, but mathematicians rather less so. I photographed that too.
If you give the photo coordinates in [0,1]^2, then Bhargava’s head is at around (1/4,1/2) and he is wearing a dark red shirt.

At 2pm there was the Gauss Prize lecture. I thought about skipping it, but then thought that that would be hypocritical of me after my views about people who left the laudationes just before the one for the Nevanlinna Prize. I shouldn’t be prejudiced against applied mathematics, and in any case Stanley Osher’s work, or at least part of it, is about image processing, something that I find very interesting.

I went to the talk thinking it would be given by Osher himself, but in fact it was given by someone else about his work. The slides were fairly dense, and there was a surprising amount of emphasis on what people call metrics — numbers of papers, H-factors and so on. The fact that the speaker said, “I realize there is more to academic output than these metrics,” somehow didn’t help. I found myself gradually zoning out of this talk and as a result, despite my initial good intentions, do not have anything more to say about Osher’s work, clearly interesting though it is.

I then did skip the first of the afternoon’s parallel sessions. I wondered about going to hear Mohammed Abouzaid, because I have heard that he is a rising star (or rather, an already risen star who probably has even further to rise), but I found his abstract too intimidating.

So the first talk I actually did go to was in the second session, when I went to hear Craig Gentry, a theoretical computer scientist famous for something called homomorphic encryption, which I had heard about without quite understanding what it was. My target for the 45 minutes was to remedy this situation.

In the end two things happened, one good and one bad. The good one was that early on in the talk Gentry explained what homomorphic encryption was in a way that was easy to understand. The bad one was that I was attacked by one of my periodic waves of tiredness, so after the early success I took in very little else — I was too absorbed in the struggle to keep my eyes open (or rather, to ensure that the brief moments when I shut them didn’t accidentally turn into stretches of several minutes).

The basic idea of homomorphic encryption is this. Suppose you have some function f that encrypts data, and let’s suppose that the items one encrypts are integers. Now suppose that you are given the encryptions f(m) and f(n) of m and n and want to work out the encryption of m+n. For an arbitrary encryption system f there’s not much you can do other than decrypt f(m) and f(n), add up the results, and then encrypt again. In other words, you can’t do it unless you know how to decrypt. But what if you want people to be able to do things to encrypted data (such as, say, carrying out transactions on someone’s bank account) without having access to the original data? You’d like some weird operation \oplus with the property that f(m+n)=f(m)\oplus f(n). I think now it is clear what the word “homomorphic” is doing here: we want f to be a homomorphism from (integers, +) to (encrypted integers, \oplus).
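To make the additive case concrete, here is a toy implementation of the Paillier cryptosystem — not Gentry's scheme, and with absurdly small parameters chosen purely for illustration — which is additively homomorphic: multiplying two ciphertexts mod n² gives an encryption of the sum of the plaintexts.

```python
import math
import random

# Toy Paillier cryptosystem: additively homomorphic, i.e. the product of two
# ciphertexts (mod n^2) decrypts to the sum of the plaintexts (mod n).
# These primes are tiny, so the scheme is NOT secure as written.
p, q = 10007, 10009
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)      # private key
g = n + 1                         # standard simple choice of public generator
mu = pow(lam, -1, n)              # valid because L(g^lam mod n^2) = lam mod n

def encrypt(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:    # the random mask r must be invertible mod n
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    L = (pow(c, lam, n2) - 1) // n
    return (L * mu) % n

c1, c2 = encrypt(17), encrypt(25)
print(decrypt((c1 * c2) % n2))    # prints 42 = 17 + 25, computed on ciphertexts only
```

Here the “weird operation” \oplus is just multiplication mod n², and whoever holds only the ciphertexts can compute it without being able to decrypt.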

Having said that, I think Gentry told us (but can’t remember for sure) that just doing this for addition was already known, and his achievement has been to find a system that allows you to add and multiply. So I think his encryption may be a ring homomorphism. Something I haven’t stressed enough here is that it isn’t enough for the “funny” operations \oplus and \otimes to exist: you need to be able to compute them efficiently without being able to decrypt efficiently. The little I took in about how he actually did this made it sound as though it was very clever: it wasn’t just some little trick that makes things easy once you’ve observed it.

If you want to know more, the talk is here.

The last talk I went to, of the entire congress, was that of Tom Sanders, who was talking about the context surrounding his remarkable work on Roth’s theorem on arithmetic progressions. Sanders was the first to show that a subset of \{1,2,\dots,n\} of density 1/(\log n)^{1-o(1)} must contain an arithmetic progression of length 3. This is tantalizingly close to the density of the primes in that interval, and also tantalizingly close to the density needed to prove the first non-trivial case of Erdős’s famous conjecture that a subset X of \mathbb{N} such that \sum_{n\in X}n^{-1}=\infty contains arithmetic progressions of all lengths.

Sanders discussed the general question of which configurations can be found in the primes, but also the question of why they can be found. For instance, quadruples a,b,c,d such that a+b=c+d can be found in the primes, but the proof has nothing to do with the primes other than their density: the number of pairs (a,b) with a,b prime and less than n is about (n/\log n)^2, and the number of possible sums is at most 2n, so some sum can be achieved in several ways. By contrast, while there are many solutions of the equation x+y=4z in the primes (an example is 13+31=4\times 11), one can easily find dense sets of integers with no solutions: for instance, the set of integers congruent to 1 mod 3 or the set of integers strictly between n/3 and 2n/3.
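The pigeonhole step in that density argument is easy to check numerically (a sketch of my own, with n = 1000 as an arbitrary choice):

```python
from collections import Counter

def primes_below(limit):
    """Simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * limit
    sieve[:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit, i)))
    return [i for i, is_p in enumerate(sieve) if is_p]

n = 1000
primes = primes_below(n)
pairs = len(primes) ** 2                   # ordered pairs (a, b) of primes below n
possible_sums = 2 * n                      # every sum a + b is below 2n
counts = Counter(a + b for a in primes for b in primes)
# Pigeonhole: some sum must be achieved at least pairs / possible_sums times.
print(pairs // possible_sums, max(counts.values()))
```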

Roth’s theorem concerns the equation x+y=2z, and while it has been known for many decades that there are many solutions to this equation in the primes, there is no proof known that uses only the density of the primes, and also no counterexample known that shows that that density is insufficient.

I had a conversation with Sanders after the talk, in which I asked him what he thought the lowest possible density was that guaranteed a progression of length 3. The two natural candidates, given what we know so far, are sets of size somewhere around n/\log n and somewhere around n\exp(-c\sqrt{\log n}). (The latter is the size of the densest known set with no progression of length 3.) Recent work of Schoen and Shkredov, building on Sanders’s ideas, has shown that the equation x_1+x_2+\dots+x_5=5y has non-trivial solutions in any set of density at least \exp(-(\log n)^c). I put it to him that the fact that Schoen and Shkredov needed the extra “smoothness” that comes from taking a fivefold sumset on the left-hand side rather than just a twofold one paradoxically casts doubt on the idea that this type of bound is correct for Roth’s theorem: rather, it suggests that perhaps the smoothness is actually needed. Sanders replied that this was not necessarily the case: while a convolution of two characteristic functions of dense sets can have “gaps”, in the sense of points x where the value is significantly less than expected, it is difficult for that value to go all the way down to zero.

That will be a bit too vague to be comprehensible if you are not an additive combinatorialist, so let me try to give a little bit more explanation. Let A be a subset of \mathbb{Z}_p (the integers mod p) of density \alpha. We say that A is c-quasirandom if the sizes of the intersections A\cap(A+x), which have mean \alpha^2 p, have standard deviation at most cp. Now one way for the standard deviation to be small is for most of the intersections to have roughly the same size, but for a few of them to be empty. That is the kind of situation that needs to happen if you want an unexpectedly dense set with no arithmetic progression of length 3. (This exact situation doesn’t have to happen, but I’m trying to convey the general feel of what does.) But in many situations, it seems to be hard to get these empty intersections, rather than merely intersections that are quite a bit smaller than average.
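To illustrate the typical behaviour (a toy simulation of my own, not anything from the talk): for a random subset A of \mathbb{Z}_p of density \alpha, the intersection sizes |A\cap(A+x)| cluster around \alpha^2 p, and in particular stay well away from zero.

```python
import random

random.seed(0)                     # arbitrary seed, for reproducibility
p, alpha = 1009, 0.3               # p prime, alpha the target density
A = {a for a in range(p) if random.random() < alpha}

# Sizes of the intersections A ∩ (A + x) for every nonzero shift x.
sizes = [len(A & {(a + x) % p for a in A}) for x in range(1, p)]
mean = sum(sizes) / len(sizes)
print(round(mean), min(sizes))     # mean is close to alpha^2 * p ≈ 91; min is far from 0
```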

After Sanders’s talk (which is here), I went back to my room. By this time, the stomach bug that I mentioned a few posts ago had struck, which wasn’t very good timing given that the conference banquet was coming up. Before that, I went up to the top of the hotel, where there was a stunning view over much of Seoul, to have a drink with Günter Ziegler and one other person whose name I have forgotten (if you’re reading this, I enjoyed meeting you and apologize for this memory lapse). Günter too had a stomach bug, but like me he had had a similar one shortly before coming to Korea, so neither of us could be sure that Korean food had anything to do with it.

The banquet was notable for an extraordinary Kung Fu performance that was put on for our entertainment. It included things like performers forming a human pyramid that other performers would run up in order to do a backwards somersault, in the middle of which they would demolish a piece of wood with a sharp blow from the foot. It was quite repetitive, but the tricks were sufficiently amazing to bear quite a bit of repetition.

My last memory of ICM2014 was of meeting Artur Avila in the lobby of the hotel at about 5:25am. I was waiting for the bus that would take me to the airport. “Are you leaving too?” I naively asked him. No, he was just getting back from a night on the town.

7 Responses to “ICM2014 — Bhargava, Gentry, Sanders”

  1. Pablo Says:

    Dear Tim,

    In this recent preprint, Schoen and Sisask obtain Behrend-shape upper bounds for 4 variable equations: http://arxiv.org/pdf/1408.2568v1.pdf

  2. galoisrepresentations Says:

    The analogy with zeros on the critical line is a good one. Except, in the latter case, one can get a pretty optimal result for only one million dollars.

  3. Gil Kalai Says:

    I ran a poll about the lowest possible density that guarantees 3-term AP among {1,2,…,n}.

    1) Out of 29 voters, 5 thought that it would be n/(log n)^c for some c < 1

    2) 7 voters thought it would be n/(log n)^c for some c ≥ 1

    3) 1 voter said n/(log n)^c is enough for any c, but if the answer is n/g(n) then g(n) is still on the polynomial side, in the sense that g(n) is smaller than some exp exp … exp log log log … log cn where the number of logs is one more than the number of exps

    4) 1 voter said that g(n) is larger, on the “linear side”: exp exp … exp log log log … log n with the same numbers of logs and exps (but not as large as the next answers)

    5) 6 voters said exp((log n)^c) for some c < 1/2

    6) 9 voters said exp((log n)^{1/2})

    I was the voter in 3, for some fantastic, baseless, wishful-thinking reasons.


  4. Gil Kalai Says:

    Let me repeat the outcome of the 2009 poll in a clearer way:

    Say the lowest possible density is n/g(n):

    1) 5 voters out of 29 – g(n) behaves like (\log n)^c for c<1

    2) 7 voters — g(n) behaves like (\log n)^c for some c\geq 1.

    3) 1 voter (me) g(n) is larger than any power of log (n) but g(g(n)) is smaller than log n

    4) 1 voter: g(n) is smaller than e^{\log n^c} for any c>0 but g(g(n)) is larger than log n.

    5) 6 voters: g(n) behaves like e^{{\log n}^c} for some c<1/2.

    6) 9 voters g(n) behaves like exp (sqrt (log n))

    The result by Schoen and Sisask strengthens the "case" for 5-6 (but the regularity consideration weakens this strengthening…). I am fond of the possibility that g(n) is larger than any power of log n but is still on the "polynomial side".

    For the capset problem the results were very decisive on the exponential side (I still hope better examples can be found.)

  5. Gil Kalai Says:

    Something has gone wrong with my comments: let me repeat just the first and second items. First: five votes for g(n) being a power less than one of log n. (We know now that this is not the case.) Second: seven votes that g(n) is some power greater than or equal to 1 of log n.
