ICM2014 — Khot laudatio

After McMullen’s laudatio on Mirzakhani, it was time for Sanjeev Arora to talk about the work of the Nevanlinna prize winner Subhash Khot. It was also the time that a significant proportion of the audience decided that enough was enough and left the room. The same thing happened in Hyderabad four years ago, and on both occasions I was fairly shocked: I think it shows a striking disrespect, not so much for the speaker and prizewinner, though there is that aspect too, as for theoretical computer science in general. It seems to say, “Right, that’s the interesting prizes over — now we’re on to the ones that don’t really matter.” Because I have always been interested in computational complexity and related areas, my interest in the Nevanlinna prize is comparable to my interest in the Fields medals — indeed, in some ways it is greater because there is more chance that I will properly appreciate the achievements of the winner. And the list of past winners is incredible and includes some of my absolute mathematical heroes.

When the announcement was made a few hours earlier, my knowledge of Subhash Khot could be summarized as follows.

  1. He’s the person who formulated the unique games conjecture.
  2. I’ve been to a few talks on that in the past, including at least one by him, and there have been times in my life when I have briefly understood what it says.
  3. It’s a hardness conjecture that is a lot stronger than the assertion that P\neNP, and therefore a lot less obviously true.

What I hoped to get out of the laudatio was a return to the position of understanding what it says, and also some appreciation of what was so good about Khot’s work. Anybody can make a conjecture, but one doesn’t usually win a major prize for it. But sometimes a conjecture is so far from obvious, or requires such insight to formulate, or has such an importance on a field, that it is at least as big an achievement as proving a major theorem: the Birch–Swinnerton-Dyer conjecture and the various conjectures of Langlands are two obvious examples.

The unique games conjecture starts with a problem at the intersection of combinatorics and linear algebra.

Suppose you are given a collection of linear equations over the field \mathbb{F}_q. Then you can use Gaussian elimination to determine whether or not they have a solution. Now suppose that you find out that they do not have a solution. Then something you might consider doing is looking for an assignment to the variables that solves as many of the equations as possible. If q=2, then a random assignment will solve on average half the equations, so it must be possible to solve at least half the equations. So the interesting thing is to do better than 50%. A famous result of Johan Håstad states that this cannot be done, even when each equation involves just three variables. (Actually, that restriction to three variables is not the surprising aspect — there are many situations where doing something for 2 is easy and the difficulty kicks in at 3. For example, it is easy to determine whether a graph is 2-colourable — you just start at a vertex, colour all its neighbours differently, etc. etc., and since all moves are forced apart from when you start again at a new connected component, if the process doesn’t yield a colouring then you know there isn’t one — but NP-hard to determine whether it is 3-colourable.)

More precisely, Håstad’s result says that for any fixed \epsilon>0, if there were a polynomial-time algorithm that could tell you whether it was possible to satisfy at least a proportion 1/2+\epsilon of a collection of linear equations over \mathbb{F}_2 (each equation involving three variables), then P would equal NP. His proof relies on one of the big highlights of theoretical computer science: the PCP theorem.

The unique games conjecture also concerns maximizing the number of linear equations you can solve, but this time we work mod p and the equations are very special: they take the form x_i-x_j=c_{ij}.

To get a little intuition about this, I suppose one should do something I haven’t done until this very moment, and think about how one might go about finding a good algorithm for solving as many equations of this type from some collection as possible. An obvious observation is that once we’ve chosen x_i, the value of x_j is determined if we want to solve the equation x_i-x_j=c_{ij}. And that may well determine another variable, and so on. It feels natural to think of these equations as a labelled directed graph with the variables as vertices and with an edge from x_i to x_j labelled c_{ij} if the above equation is present in the system. Then following the implications of a choice of variables is closely related to exploring the component of that vertex in the graph. However, since our aim is to solve as many equations as possible, rather than all of them, we have the option of removing edges to make our task easier, though we want to remove as few edges as possible.

Maybe those few remarks will make it seem reasonably natural that the unique games conjecture can be connected with something called the max cut problem. This is the problem of finding a partition of the vertices of a graph into two sets such that the number of edges from one side to the other is as big as possible.

Actually, while browsing some slides of Håstad, I’ve just seen the following connection, which seems worth mentioning. If p=2 and all the c_{ij} equal 1, then x_i-x_j=c_{ij} if and only if the variables x_i and x_j get different assignments. So in this case, solving as many equations as possible is precisely the same as the max cut problem.

However, before we get too carried away with this, let me say what the unique games conjecture actually says. Apparently it has been reformulated a few times, and this version comes from 2004, whereas the original version was 2002. It says that even if 99% of the equations (of the form x_i-x_j=c_{ij} over \mathbb{F}_p) can be simultaneously satisfied, then it is still NP hard to determine whether 1% of them can be simultaneously satisfied. Note that it is important to allow p to be large here, since the random approach gives you a proportion p^{-1} straight away. Also, I think 99% and 1% are a friendly way of saying 1-\epsilon and \epsilon for an arbitrary fixed \epsilon >0.

In case the statement isn’t clear, let me put it slightly more formally. The unique games conjecture says the following. Suppose that for some \epsilon>0 there exists a polynomial-time algorithm that outputs YES if a proportion 1-\epsilon of the equations can be solved simultaneously and NO if it is impossible to solve more than a proportion \epsilon of them, with no requirements on what the algorithm should output if the maximum proportion lies between \epsilon and 1-\epsilon. Then P=NP.

At this point I should explain why the conjecture is called the unique games conjecture. But I’m not going to because I don’t know. I’ve been told a couple of times, but it never stays in my head, and when I do get told, I am also told that the name is something of a historical accident, since the later reformulations have nothing to do with games. So I think the name is best thought of as a strange type of label whose role is simply to identify the conjecture and not to describe it.

To give an idea of why the UGC is important, Arora took us back to an important paper of Goemans and Williamson from 1993 concerning the max cut problem. The simple random approach tells us that we can find a partition such that the size of the resulting cut is at least half the number of edges in the graph, since each edge has a 50% chance of joining a vertex in one half to a vertex in the other half. (Incidentally, there are standard “derandomization” techniques for converting observations like this into algorithms for finding the cuts. This is another beautiful idea from theoretical computer science, but it’s been around for long enough that people have got used to it.)

Goemans and Williamson were the first people to go beyond 50%. They used semidefinite programming to devise an algorithm that could find a cut for which the number of edges was at least 0.878 times the size of the max cut. I don’t know what that 0.878 really is — presumably some irrational number that came out of the proof — but it was sufficiently unnatural looking that there was a widespread belief that the bound would in due course be improved further. However, a check on that belief was given in 2004 by Khot, Kindler, Mossel and O’Donnell and in 2005 by Mossel, O’Donnell and Oleskiewicz (how they all contributed to the result I don’t know), who showed the very surprising result that if UGC is true, then the Goemans-Williamson bound is optimal. From what I understand, the proof is a lot more than just a clever observation that max cut can be reduced to unique games. If you don’t believe me, then try to explain to yourself how the constant 0.878 can arise in a simple way from a conjecture that involves only the constants “nearly 0” and “nearly 1”.

In general, it turns out that UGC implies sharp thresholds for approximability for many problems. What this means is that there is some threshold, below which you can do what you want with a polynomial-time algorithm and above which doing what you want is NP hard. (So in the max cut example the threshold is 0.878: getting smaller than that proportion can be done in polynomial time, and getting above that proportion is NP hard — at least if you believe UGC.)

Almost as interesting is that the thresholds predicted by UGC all come from rather standard techniques such as semidefinite programming and linear programming. So in some sense it is telling us not just that a certain bound is best possible but that a certain technique is best possible. To put it a bit crudely and inaccurately, it’s saying that for one of these problems, the best you can do with semidefinite programming is the best you can do full stop.

Arora said something even stronger that I haven’t properly understood, but I reproduce it for completeness. Apparently UGC even tells us that the failure of a standard algorithm to beat the threshold on a single instance implies that no algorithm can do better. I suppose that must mean that one can choose a clever instance in such a way that if the standard algorithm succeeds with that instance, then that fact can be converted into a machine for solving arbitrary instances of UGC. How you get from one instance of one problem to lots of instances of another is mysterious to me, but Arora did say that this result came as a big surprise.

There were a couple of other things that Arora said at the end of his talk to explain why Khot’s work was important. Apparently while the UGC is just a conjecture, and not even a conjecture that is confidently believed to be true (indeed, if you want to become famous, then it may be worth trying your hand at finding an efficient algorithm for it, since there seems to be a non-negligible chance that such an algorithm exists), it has led to a number of non-obvious predictions that have then been proved unconditionally.

Soon after Arora’s laudatio, Khot himself gave a talk. This was an odd piece of scheduling, since there was necessarily a considerable overlap between the two talks (in their content, that is). I’ll end by mentioning a reformulation of UGC that Khot talked about and Arora didn’t.

A very important concept in graph theory is that of expansion. Loosely speaking, a graph is called an expander if for any (not too large) set of vertices, there are many edges from that set to its complement. More precisely, if G is a d-regular graph and S is a set of vertices, then we define the expansion of S to be the number of edges leaving S divided by d|S| (the latter being the most such edges there could possibly be). Another way of looking at this is that you pick a random point s\in S and a random neighbour t of s, and define the expansion of S to be the probability that t is not in S.

The expansion of the graph as a whole is the minimum expansion over all subsets S of size at most n/2 (where n is the number of vertices of G). If this quantity is high, it is saying that G is “highly interconnected”.

Khot is interested in small-set expansion. That is, he picks a small \delta and takes the minimum over sets of size at most \delta n rather than at most n/2.

The precise reformulation I’m about to give is not in fact the one that Khot gave but rather a small modification that Boaz Barak, another well-known theoretical computer scientist, gave in his invited lecture a day later. The unique games conjecture is equivalent to the assertion that it is NP hard to distinguish between the following two classes of graphs.

  • Graphs where there exists a set of size at most \delta n with small expansion.
  • Graphs where every set of size at most \delta n has very big expansion.

I think for the latter one can take the expansion to be at least 1/2 for each such set, whereas for the former it is at most \epsilon for some small \epsilon that you can probably choose.

What is interesting here is that for ordinary expansion there is a simple characterization in terms of the size of the second largest eigenvalue of the adjacency matrix. Since eigenvalues can be approximated efficiently, there is an efficient method for determining whether a graph is an expander. UGC is equivalent to saying that when the sets get small, their expansion properties can “hide” in the graph in a rather strong way: you can’t tell the difference between a graph that has very good small-set expansion and a graph where there’s a set that fails very badly.

I had lunch with Boaz Barak on one of the days of the congress, so I asked him whether he believed UGC. He gave me a very interesting answer (a special case of the more general proposition that Boaz Barak has a lot of very interesting things to say about complexity), which I have unfortunately mostly forgotten. However, my rapidly fading memory is that he would like it to be true, because it would be a beautiful description of the boundary of what algorithms can do, but thinks it may very well be false. He thought that one possibility was that solving the problems that UGC says are NP hard is not in fact NP hard, but not possible in polynomial time either. It is perfectly possible for a problem to be of intermediate difficulty.

Although it wouldn’t directly contradict NP hardness, it would be very interesting to find an algorithm that solved the small-set expansion problem in a time that was only modestly superpolynomial: something like \exp((\log n)^4), say. That would probably get you an invitation to speak at an ICM.

18 Responses to “ICM2014 — Khot laudatio”

  1. Flit Says:

    “The same thing happened in Hyderabad four years ago, and on both occasions I was fairly shocked: I think it shows a striking disrespect, not so much for the speaker and prizewinner, though there is that aspect too, as for theoretical computer science in general. It seems to say, “Right, that’s the interesting prizes over — now we’re on to the ones that don’t really matter.” ”

    Well, the whole Nevanlinna prize is very strange. I don’t understand why it’s promoted by the IMU as a fifth Fields medal, since it only covers one area of mathematics. I think most people would react the same way if it was a prize reserved to topologists or analyst or any specific field.

  2. BQ Says:

    Possibly, Guy Kindler is missing in “2004 by Khot, Mossel and O’Donnell”?

    Thanks — updated.

  3. BQ Says:

    I read the wikipedia article about PCP that you linked to and (as a non-expert) I find it strange.

    The CSP reformulation sound very intuitive: it is hard to find a satisfying assignment and, if you can not find it, it is also hard to guarantee that all assignments satisfy no more than, say, 99% of constraints. (Did I understand it correctly?)

    On the other hand, the original formulation in terms of probabilistically checkable proofs is kind of mind blowing – if you have a logically correct statement, you can explain it a little better (increasing the length polynomially), and then I will only read 1000 words to convince myself with 99% confidence that the proof is correct!

    Does it mean that the equivalence between these two statements is just as non-trivial as the PCP formulation, let’s say, in the same sense that it is equivalent to 2 x 2 = 4?

    • Toronto Says:

      The equivalence is not completely trivial but is much more trivial than the proof of the PCP theorem, which is very hard.

    • BQ Says:

      Does it mean that somehow the surprise of the PCP theorem is just a matter of language? If an equivalent intuitive statement is just as difficult to prove, is there a disconnect between our intuitive understanding of what a `proof’ is and what it means to `check the proof’, and the technical definitions in the PCP theorem?

    • Attila Szasz Says:

      I’m afraid I don’t really think you internalized the gapCSP view of the PCP to the point that it should feel intuitive. It’s really the same theorem as the ‘surprising’ probabilistic witness one. The technicalities have a lot to do with error correction codes and expander graphs, which in turn – very roughly speaking – manage to stretch the badness of an instance into another string that is sort of uniformly bad in a way that allows for the constant-query log-randomness sound and complete PCP verification. Hardness of approximation comes into play with this surprising and tricky uniformization procedure that you can transform any NP instance into a 3SAT one in polynomial time, such that you can always force the no-instances to have their values (maximum ratio of clauses satisfiable under the mapping) under some fixed constant, say p, while maintaing the yes-ones at 1. This stops you from getting a better than p-approximation for MAX-3SAT, unless P=NP.

      On the other hand, you can surely take the interpretation and language of this result to reach philosophical heights when combined with the fact that all formalized math theorems are basically SAT instances, and PCP guarantees you that there exists these switch-off-a-few-bits strange proofs for anything you like.

    • Toronto Says:

      Perhaps a more accurate formulation of the PCP theorem in the CSP setting is: There is an efficient procedure that, given a list of constraints, outputs a slightly longer list of constraints such that if the original was satisfiable then the new one is satisfiable and if the original one was not satisfiable, then the new one cannot be even 90% satisfied (every assignment will satisfy at most 90% of the constraints).

      Does that make it less intuitive?

    • BQ Says:

      Yes, I can see that this efficient uniformization algorithm should be very difficult to find. Because the efficient algorithm can not check whether the formula is satisfiable, it is amazing that it can add constraints (can it simply add new constraints or it really needs to transform the formula?) without spoiling satisfiability. The 90% part look easier but, probably, it is not, since you have to balance both goals. Once this is done, is it difficult to see how one can observe unsatisfiability by looking at a finite number of constraints randomly? I guess the constraints can not be just randomly picked since they will be disjoint…

      I really like this formulation, because it highlights what the difficulty is better. Of course, it is possible I did not internalize this correctly yet and also wrongly assume that the checking part is easy once you have this algorithm.

    • Attila Szasz Says:

      Well, I’d say that most of this stuff doesn’t happen that directly or head-on you make it sound, and it’s also pretty standard that randomness is of help here, although the fact that a constant queries are sufficient is really kind of magical. (I think, historically it was discovered to be something like the square root of log n first)

      Actually, Dinur’s gap amplification via expanders is more direct in this sense, and sort of resembles more what you may have in mind, but you can get weaker results reducing some error correcting (and linearitiy testing) -friendly NP complete problem -like solving quadratic equtions over finite fields- to your PCP approach, which has a pretty different feel than messing around a constraint problem, yet works. Even the general approach of passing between the CSP and proof view needs a few smart moves with the notion of NP hardness, which might be mindtwisting for someone not that familiar with complexity theory thinking.

  4. vznvzn Says:

    thx for the overview of UGC esp the links to expander graphs link which seem to be the stars of many CS thms and are of great interest to mathematicians also.

    there is a natural synergy and also a low-level antagonism between mathematicians & computer scientists captured somewhat in the anecdote about all the mathematicians leaving the breakthrough/ award CS talk (the kind of offhand/ colorful/ revealing observation thta would never be reported almost anywhere else except a blog)…. have observed this friction in some occasional ways & it can be seen in your blog comments at times… the word “frenemies” comes to mind… often applied to teenage girl relationships :/

    anyway you are leading/ pioneering in building bridges in this area eg with your interest/ fascination with P vs NP etc & think that is highly commendable. & more on the theme of (longterm) math-CS fusion in my blog…

  5. luca Says:

    “Apparently UGC even tells us that the failure of a standard algorithm to beat the threshold on a single instance implies that no algorithm can do better. I suppose that must mean that one can choose a clever instance in such a way that if the standard algorithm succeeds with that instance, then that fact can be converted into a machine for solving arbitrary instances of UGC.”

    It works the other way around: an instance in which the standard algorithm performs poorly on a problem P (of a certain type) can be turned into a reduction from Unique Games to the problem P, showing that all algorithms have to perform similarly poorly, or else UGC is false.

    Very roughly, if the problem P is a graph problem, then a reduction from UG to the problem P describes how to turn each variable into a little graph, and how to turn each constraint into a set edges between the two little graphs corresponding to the two variables. An instance in which the standard algorithm performs poorly can be used to define the little graphs and the connections. The properties that a graph has to have to make the standard algorithm perform poorly are precisely the properties that make the reduction work.

  6. Josephina Says:

    This year, we celebrate the first woman Fields medalist.

    I for one look forward to the day when there will not be a separate prize for informatics AND we see the first informatician Fields medalist.

  7. Vikram Chandrasekhar Says:

    Thanks for the excellent post. This is the first post that gives me a lucid description of what the UGC is and what the implications are if UCG is TRUE.

    Are there sharp implications on what can and what cannot be done in polynomial time if UGC is proven FALSE? Any insights here would be appreciated. Thank you.

  8. Boaz Barak Says:

    This is a pretty good summary, and also very impressive (all I remember from the ICM is a haze of rushing to talks and wondering when I’ll get a chance for a bathroom break..).

    Trevisan’s AMS survey is a great source for more about the Unique Games Conjecture, including this transformation of a bad instance for an algorithm to a gadget used in the hardness reduction.

    BTW a slightly easier to remember form of the UGC is to simply talk about linear equations involving two variables (not necessarily in the form x_i - x_j = c_{i,j}). It turns out that these are equivalent. However, I do prefer to talk about the small-set expansion hypothesis (which was posited by Raghavendra and Steurer in 2008), since I believe the linear-algebraic structure is a red herring and the UGC is really about expansion in graphs.

    I should note however that these two conjectures are not (yet?) known to be formally equivalent, though I (and many others) view them as “morally equivalent” since almost all algorithmic and hardness results apply equally well to both conjectures. (The SSEH is known to imply the UGC, and all the reasons we have to believe the latter also apply to the former.) The constant in the expansion parameter can be taken to be arbitrarily close to 1 and not just 1/2.

    Another note is that a quasipolynomial-time algorithm for the Unique Games Conjecture would actually contradict its NP hardness, since it is widely believed that SAT doesn’t have such an algorithm (which would be implied if UG had a quasipolynomial time algorithm and SAT had a polynomial-time reduction to UG).

    Also, thank you for the compliment! I too don’t remember what I told you 🙂 but indeed, one of the main reasons to believe the UGC is the beautiful picture it gives us of the complexity landscape, though recently I’ve been tending more to the belief that it is false.

    One can still hope that the UGC’s refutation would be beautiful too, in the sense of not merely showing that the problem is not NP hard, but actually giving a new algorithmic paradigm that applies to many problems for which currently the UGC current posits a barrier of going beyond the known linear-programming/semidefinite-programming techniques. The Sum-of-Squares algorithm is currently our best hope for such a “beautiful refutation”.

  9. Gil Kalai Says:

    The UGC asserts that that SAT has a polynomial-time reduction to UGC. This seems quite plausible but it is also quite possible that UG has intermediate difficulty. Having “intermediate difficulty” (i think) need not be manifest “intermediate running time”. It is possible that UG (or small set expansion) are exponential (e.g. its running time matches the algorithm by Arora, Barak, and Steurer) but nevertheless there is no polynomial reduction between SAT and UGC.

  10. S Says:

    Thanks for this post.

    Just for the record: the 0.878 number of the Goemans-Williamson algorithm shows up as 2/\pi times the smallest positive value of \theta / (1 - \cos \theta). This seemed pretty arbitrary and just an artefact of the technique/proof, which is why it’s very surprising that the UGC (whose statement seems to bear no relation to semidefinite programming or trigonometry) implies it’s optimal.

  11. Sasho Nikolov Says:

    The short story with the two papers KKMO and MOO cited for the UG-hardness of approximating MaxCut better than the GW algorithm is that the proof in KKMO was initially conditional on the Majority is Stablest conjecture, which belongs to harmonic analysis on boolean functions and roughly says that among all balanced boolean functions all of whose variables are not very influential, majority is most stable to perturbing the arguments. MOO proved the conjecture, hence their share in the credit.

  12. jonas Says:

    See also Scott Aaronson’s blog entry about the same prize to Subhash Khot at “http://www.scottaaronson.com/blog/?p=1958”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: