I dragged myself out of bed on the third day feeling pretty terrible — in fact, terrible enough to be slightly worried that I would be doing myself some damage with this succession of short nights, which I couldn’t see how to do anything about, given the necessity of starting early (a result of the schedulers’ evil decision to put superstars and known excellent speakers on in the first slot of the day). But nothing much distinguishes the beginning of the third day from the beginnings of the two previous days, so let me jump to the first talk, Artur Avila’s plenary lecture.
But before I do so, I have remembered one small thing that did make this day slightly different. In order to put on the unpleasant insect repellent, I took off my name badge, and then I forgot to put it back on again. I realized what I had done just as the bus was pulling out of the hotel. I wondered whether it was worth delaying the bus for three minutes while I dashed to my room and back, but Irit Dinur, who was in the same bus (I had not previously realized she was even at the same hotel, because she took a much more relaxed attitude than I did about getting up for the first talk — the previous day she’d watched the streaming video from the hotel instead), told me that she had forgotten hers yesterday, and the only consequence was that she had had to go and ask for a replacement. Well, that wasn’t quite the only consequence — the replacement badge no longer said, “Invited Speaker” on it, and she did not get a new set of coupons for lunch and coffee.
I decided I could handle walking around as a mere delegate, and could even handle not having a lunch coupon, though that was slightly disappointing. But for some reason when I asked for my replacement badge it was an exact replica of the original one and it did come with the coupons. (So in theory I could have had seconds for lunch the next day — but in fact I ended up not even having firsts.)
Artur Avila is young (we were told that he was this ICM’s youngest plenary speaker) and strikingly handsome in a black polo shirt and dark jeans, as this photo demonstrates rather inadequately.
To his left in the photo is Etienne Ghys, who gave an excellent plenary lecture in Madrid four years ago.
However, I can in all sincerity say that the two plenary lectures I attended on this morning were excellent. The first thing I wrote about Avila’s was, “Started well.” I can’t remember quite what it was that provoked that, but it was probably something like a very good setting out of his stall — explaining in elementary terms what the talk was going to be about. His voice sounded oddly Russian to my ears (oddly since he’s not Russian at all) and then later reminded me a bit of Mihalis Dafermos, a colleague of mine at Cambridge who is, as his name suggests, Greek but who has spent quite a bit of time in the US. He shared with Mihalis a tendency to end sentences on a high-pitched note.
He spent quite a bit of the talk telling us about iterating polynomials such as the famous logistic map x ↦ λx(1−x), explaining why this was interesting, informing us what a Lyapounov exponent was (something I ought to have known but didn’t), and discussing period doubling, the onset of chaos, etc. After a short while I stopped taking notes and as a result I am left without a clear idea of what Avila’s own contribution to this area is. What I remember is that I did have a reasonable idea of it at the time, but now it is a week later and it has slipped out of my mind. The one thing I do remember is that Etienne Ghys told us that Avila is famous as a problem solver, and also as somebody who likes to work in collaboration.
Next was Irit Dinur’s plenary lecture. I asked her earlier in the morning whether she felt nervous about addressing an audience of something like 500 people (I’m not sure why I didn’t say 1000). She said she was, but in a way that was consistent with either feeling nervous or not feeling particularly nervous but just responding in a fairly automatic way to my question. Here is a picture of her and Hendrik Lenstra (who, as it happens, was chairman of the programme committee) just before her talk.
Lenstra started singing her praises, but fairly soon stopped and said that while there were plenty more good things he could tell us about her, he did not want to embarrass her further. She then started her talk with a good show of mild embarrassment, followed by some charming apologies, such as for having a not quite perfect set of slides because her computer had crashed when she had tried to put the finishing touches to them. (Apart from one small mistake that she later pointed out to us, there was no sign of this supposed imperfection.)
Pretty soon she was into her stride, and gave an exemplary ICM plenary lecture: keeping it simple and comprehensible to non-experts, and having time for a few hints about her own work at the end. My only disappointment was a selfish one — I would have preferred less introduction (things like being told what P and NP are and why they are important) and more about her proof of the PCP theorem. But the fact that I felt that way is the opposite of a criticism: it is a sign that she was doing her job. Strange though it may seem to some people, a significant proportion of professional mathematicians do not know what NP is, and if they had not been told then they would not have had a chance of understanding the talk. (At one point Dinur told us that NP does not stand for “non-polynomial time”. I think it is quite common to believe that it does, but one wonders how somebody can think that and also be aware that the question of whether P=NP is a famous open problem.)
I took no notes in this talk, but let me see if I can give an informal statement of the PCP theorem without completely butchering it. First, PCP stands for “probabilistically checkable proof”. What is a probabilistically checkable proof? Well, let me give a very simple example. Suppose I have a bag of marbles and I want to convince you that at least half of them are blue. And suppose that in fact two thirds of them are blue. Then I could invite you to perform the following experiment: you put your hand in the bag, take out a marble, look at it, and put it back again. The chances that you will take out a blue marble are 2/3. Now that doesn’t prove much, but if you do it a hundred times (shaking the bag a lot in between so that the outcomes are independent) then you will almost certainly draw out well over 50 blue marbles, so much so that you can calculate that the probability that I am lying is extremely small.
The main point here is that you can be convinced that I am telling the truth after looking at only a constant number of marbles, where by “convinced” I mean that you can make your probability of being wrong as small as you like.
However, this is not a perfect example, because it applies to just one bag of marbles, and does not work if, say, the proportion of blue marbles is exactly 50%. But if for some strange reason we knew in advance that the number of blue marbles would either be at most half the total number of marbles or at least two thirds of the total number of marbles, then to convince ourselves of which, with an error probability of at most 0.0001, say, it is enough to look at just a constant number of marbles.
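To put a number on “extremely small”, here is a quick calculation (my own illustration, not from the talk) of the chance that an honest bag, two thirds of whose marbles are blue, nevertheless produces at most 50 blue marbles in 100 independent draws:

```python
from math import comb

def prob_at_most(n, p, k):
    """Exact probability that Binomial(n, p) is at most k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# 100 draws (with replacement) from a bag that is 2/3 blue;
# we wrongly reject the "at least half blue" claim if we see <= 50 blue.
failure = prob_at_most(100, 2/3, 50)
print(f"P(at most 50 blue in 100 draws) ~ {failure:.2e}")  # well under 0.1%
```

So even this crude protocol, with only 100 looks at the bag, already leaves a comfortably small probability of being fooled.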
Now let’s consider another example that is related but that doesn’t quite work. This time suppose we are interested in whether a certain graph is 3-colourable. A typical “proof” that a graph is 3-colourable consists of … a 3-colouring of that graph. But that is not a probabilistically checkable proof, because you need to look at more than a constant amount of a colouring to be convinced that it is a proper colouring. For instance, suppose I had a graph that was not quite 3-colourable but had a way of colouring the vertices so that just a tiny fraction of edges joined vertices of the same colour. If you now looked at 100 edges, then with high probability each one would join two vertices of different colours in my “fake” 3-colouring. So with this protocol (given a 3-colourable graph, I 3-colour it and you look at a constant-sized random sample of edges and check whether they join vertices of different colours) you cannot be convinced of the reliability of my claim that the graph is 3-colourable (even if my colouring is in fact perfectly valid). Of course, with a different protocol, one that runs in polynomial time, you can be convinced — you just look at every single edge of my colouring — but that is merely telling us that the problem belongs to NP.
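A quick back-of-envelope computation (mine, not from the talk) makes the “with high probability” concrete: if a fraction ε of the edges are bad and you sample 100 edges independently, the fake colouring survives the whole check with probability roughly (1 − ε)^100.

```python
# If a fraction eps of edges join same-coloured vertices and we sample
# k edges independently at random, every sampled edge looks fine with
# probability (1 - eps)^k. With eps = 0.1% and k = 100 the fake
# colouring passes the test almost every time.
eps, k = 0.001, 100
p_fooled = (1 - eps) ** k
print(f"P(all {k} sampled edges look properly coloured) = {p_fooled:.3f}")
```

With ε = 0.1% the check passes about 90% of the time, so a constant-sized sample simply cannot distinguish a perfect colouring from a very slightly defective one.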
So how might one devise a probabilistically checkable proof that a graph is 3-colourable? What would we even want? Well, suppose I had a way of transforming any graph G, in polynomial time, into a new graph G' with the following properties.
- If G has a proper 3-colouring, then G' has a proper 3-colouring.
- If G does not have a proper 3-colouring, then for any 3-colouring of G', at least 10% of the edges of G' join vertices of the same colour.
Then we can adopt the following protocol. You give me a graph G and I transform it into G' and hand it back to you, together with a 3-colouring of the vertices of G'. (If you’re worried about how I found that 3-colouring, don’t. The point is that we’re not claiming that it is easy to find a colouring. We’re just talking about a system for efficiently verifying that a graph is 3-colourable if a 3-colouring is magically provided.) You then look at 1000 edges of G' at random and see whether any of them joins vertices of the same colour. If none of them does, then you should be convinced beyond all reasonable doubt that G is 3-colourable. Why? Because if G were not 3-colourable, then at least 10% of the edges of G' would join vertices that had the same colour in my colouring, and with extremely high probability you would choose several of these bad edges in your sample of 1000 random edges.
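The protocol above is simple enough to sketch in a few lines of Python (the names are my own, and G' is just handed in as an edge list; this is an illustration, not anybody’s actual implementation):

```python
import random

def spot_check(edges, colouring, samples=1000, rng=random):
    """Accept the claimed colouring of G' iff every one of `samples`
    randomly chosen edges joins two differently coloured vertices."""
    return all(colouring[u] != colouring[v]
               for u, v in (rng.choice(edges) for _ in range(samples)))

# Completeness: a properly 3-coloured triangle always passes.
triangle = [(0, 1), (1, 2), (0, 2)]
print(spot_check(triangle, [0, 1, 2]))

# Soundness: if G was not 3-colourable then at least 10% of the edges
# of G' are monochromatic under ANY colouring, so cheating passes with
# probability at most 0.9^1000.
print(f"P(cheating passes) <= {0.9 ** 1000:.1e}")
```

The point of the 10% gap is visible in that last line: 0.9 raised to the 1000th power is astronomically small, which is what makes a constant-sized (here 1000-edge) sample sufficient.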
I’m not going to say much more about what the PCP theorem says. But three points are particularly worth noting. One is the main point of the theorem itself, which is that every problem in NP has a probabilistically checkable proof system like the one just mentioned (which, I stress, depends on the question, so far left unanswered, of how to construct the graph G'). For example, if the problem is to input the statement of a mathematical theorem and output a proof (in some suitable formal language) of length at most n if such a proof exists, then there is a PCP system for that. In principle, we could all write our proofs in a formal language, then apply some clever transformation to them in such a way that all a journal would need to do to be 99.9999999% certain that the proof was correct is check 1000 bits of the transformed proof. As Dinur said, that is not actually a practical suggestion, but it is certainly an amusing one to fantasize about.
Another is that the PCP theorem turns out to be equivalent to the statement about graphs above. That is, if you have found a clever transformation of graphs that sends 3-colourable graphs to 3-colourable graphs, but sends non-3-colourable graphs to graphs that don’t just fail to be 3-colourable but fail badly to be 3-colourable, then you have in fact proved the entire theorem: all other examples can be reduced to this one.
If you think that makes the problem sound easy, then I recommend spending a couple of hours trying to think of a method of transforming graphs that has these two properties.
The third point is the question of why computer scientists are so interested in the theorem. It is regarded as not just a pretty theorem with amusing consequences, but as one of everybody’s top half dozen results in theoretical computer science. (Or at least, that’s how it seems to me that it is treated.) The main reason for this is that it can be shown also to be equivalent to hardness of approximation results.
To see what this means, consider the following attractive idea. We know that finding a 3-colouring of a graph is, if P≠NP, extremely hard. And the same goes for many other problems, such as finding a Hamilton cycle. But in practice we would like to be able to do some of these things. Perhaps a good compromise would be to accept slightly imperfect approximations: for instance, we might be satisfied with a colouring such that 99.9% of edges joined vertices of different colours.
But note that the PCP theorem in its graph formulation tells us that that cannot be done. If we had a way of efficiently finding a colouring that worked for 99.9% of edges, then we could apply it to our transformed graph G'. If we succeeded in finding such a colouring, it would tell us that the original graph had a proper 3-colouring, since if not then the best colouring of G' would yield at most 90% of edges that joined vertices of different colours. The moral of all this is that the PCP theorem tells us that even approximating is hard. There are by now some amazing results that say that even finding approximations that are very slightly better than trivial ones is hard. For instance, suppose you have a bunch of linear equations in n variables over F_2 and want to know how many of them can be simultaneously satisfied. If you choose the values of the variables randomly, then on average half the equations will be satisfied. Håstad proved that even if it is possible to satisfy 99% of the equations simultaneously, actually finding values of the variables such that 51% of the equations are satisfied simultaneously cannot be done in polynomial time if P≠NP.
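The trivial baseline in Håstad’s theorem is easy to check empirically. Here is a small simulation (my own, with arbitrary sizes) showing that a uniformly random assignment satisfies about half of a batch of random linear equations over F_2:

```python
import random

random.seed(0)
n, m = 50, 20000  # 50 variables, 20000 equations (sizes chosen arbitrarily)

# Each equation: a random coefficient vector over F_2 and a random
# right-hand-side bit b, asserting sum(a_i * x_i) = b (mod 2).
eqs = [([random.randrange(2) for _ in range(n)], random.randrange(2))
       for _ in range(m)]

# A uniformly random assignment of the variables...
x = [random.randrange(2) for _ in range(n)]

# ...satisfies each equation with probability 1/2, hence about half overall.
satisfied = sum((sum(a * v for a, v in zip(coeffs, x)) % 2) == b
                for coeffs, b in eqs)
print(f"random assignment satisfies {satisfied / m:.1%} of the equations")
```

Håstad’s result says that beating this do-nothing 50% guarantee by even 1%, for systems where 99% satisfiability is promised, is already as hard as the whole of NP.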
Having made those points, I’ve thought of a couple more things I’d like to say. The first is to stress the importance that constructing G' from G should be done with a polynomial-time algorithm: it is trivial to find some transformation that works (e.g. make a large complete graph if G cannot be properly 3-coloured and a large empty graph if it can).
I also want to say a tiny bit about the proof. You may find that the statement reminds you slightly of coding theory. A good code is one that transforms bit sequences to bit sequences in such a way that any two distinct sequences become very distinct. That’s a bit like a graph that is not 3-colourable becoming very non-3-colourable. So it will not come as a big surprise to hear that the original proof of the PCP theorem (due to Arora and Safra, and Arora, Lund, Motwani, Sudan and Szegedy, building on previous work by several further authors — you’ll have to look it up if you want to know who did what) did indeed involve coding theory in an essential way. (I’m saying this based on some vague memory of hearing the proof described, and have not managed to find definitive confirmation of it online.)
Dinur’s proof is different. As she pointed out to us, it is sufficient to find a way of transforming a graph with n vertices into a new graph with at most Cn vertices (for some absolute constant C) in such a way that if the original graph can be properly 3-coloured then so can the new one, while if it cannot be properly 3-coloured, then any defect in the colouring of the new graph, which initially affects just one edge … hmm … I now realize that what I felt as though I was understanding during the talk has not lasted well enough in my brain for me to be able to reproduce it. Here is what it felt like: you have a graph that cannot be 3-coloured without at least one error somewhere. For some reason, the graph is an expander (meaning that any set A of at most half the vertices has at least c|A| neighbours not in A, for some constant c) and when you build the next graph the error propagates to the immediate neighbourhood of where the errors were last time. To achieve this, the new graph has edges where the previous graph had short paths, but they are chosen in a clever way. So after logarithmically many transformations the error has spread all over the graph and you’ve got your desired outcome. But actually the steps are divided into two, which Dinur described as being a bit like inhaling and exhaling. The “exhaling” deals with some features that you don’t want that result from the “inhaling” part. Sorry that’s all a bit vague, but I have at least found a much less vague account for anyone who might be interested. (Of course, you can also look at Dinur’s paper, which, to judge from a quick look, seems to be very well written.)
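To give at least some feeling for why expansion makes an error spread in logarithmically many rounds, here is a toy experiment of my own (nothing to do with Dinur’s actual construction): start with one “bad” vertex and repeatedly replace the bad set by its neighbourhood. On a plain cycle this takes about half the number of vertices; adding the number-theoretic edges x ~ x⁻¹ mod p, which are known to make the graph expander-like, speeds the spreading up dramatically.

```python
p = 1009  # a prime, so every nonzero x has an inverse mod p

def spread(neigh):
    """Rounds until repeated neighbourhood-taking from {0} covers Z_p."""
    bad, rounds = {0}, 0
    while len(bad) < p:
        bad |= {y for x in bad for y in neigh(x)}
        rounds += 1
    return rounds

# Plain cycle: x is joined to x+1 and x-1 only.
cycle = lambda x: {(x + 1) % p, (x - 1) % p}
# Expander-like graph: cycle edges plus x ~ x^{-1} mod p (0 left fixed).
expander = lambda x: cycle(x) | {pow(x, p - 2, p) if x else 0}

print("cycle:   ", spread(cycle), "rounds")     # exactly (p - 1) // 2 = 504
print("expander:", spread(expander), "rounds")  # far fewer
```

The contrast between the two counts is the whole point: on an expander the bad set grows geometrically rather than additively, which is the mechanism behind “after logarithmically many transformations the error has spread all over the graph”.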