My pennyworth about Deolalikar

Given that I have expressed rather unambiguously an interest in the P/NP problem on this blog, I feel duty bound to make some kind of remark about Deolalikar’s recent attempt to prove that P\ne NP. (If by some extraordinary chance you don’t know what I am talking about, then an excellent starting point is Dick Lipton’s blog, a link to which can be found in my blogroll.)

The problem is that I haven’t looked at Deolalikar’s paper in any detail, and already I can tell that I cannot possibly compete with the extremely knowledgeable comments that have been made by several TCS experts. So instead, let me say that, like everybody else, I greatly admire what Deolalikar has done, whether or not it turns out to be correct, and am happy to wait to see how things pan out.

Instead of commenting directly on his work, I thought I’d air some thoughts I had a few years ago. I was reminded of them by reading that an important feature of Deolalikar’s proof is that it uses uniformity in an essential way (though whether this is really the case has been disputed by some commenters).

For the benefit of readers not versed in complexity theory, let me briefly say what uniformity is. One of the key strategies in attempts to prove lower bounds for computational complexity is to use circuit models. Roughly speaking, a circuit is an array of AND, OR and NOT gates. You feed in a Boolean string, calculate what all the gates do, and read off an output. It turns out that a function that can be computed by a Turing machine in time n can be calculated by a circuit of size m, where m has a polynomial dependence on n. Therefore, a lower bound on the size of any circuit that computes a function gives a lower bound on the number of steps needed in a Turing computation of that function.
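To make the circuit model concrete, here is a minimal sketch in Python (my own illustrative encoding, not anything taken from Deolalikar’s paper or from the literature): a circuit is represented as a list of gates, each gate reads either input bits or the outputs of earlier gates, and evaluating the circuit means working through the gates in order.

```python
# A toy circuit evaluator. The encoding (tuples of the form ('x', i) for input
# bits and ('g', j) for gate outputs) is purely illustrative.

def evaluate_circuit(gates, inputs):
    """gates: list of (op, args); each arg is ('x', i) for input bit i
    or ('g', j) for the output of gate j. Returns the output of the last gate."""
    values = []
    def val(ref):
        kind, idx = ref
        return inputs[idx] if kind == 'x' else values[idx]
    for op, args in gates:
        if op == 'NOT':
            values.append(1 - val(args[0]))
        elif op == 'AND':
            values.append(int(all(val(a) for a in args)))
        elif op == 'OR':
            values.append(int(any(val(a) for a in args)))
    return values[-1]

# Example: a circuit computing x0 XOR x1 as (x0 OR x1) AND NOT(x0 AND x1).
xor_circuit = [
    ('OR',  [('x', 0), ('x', 1)]),   # gate 0
    ('AND', [('x', 0), ('x', 1)]),   # gate 1
    ('NOT', [('g', 1)]),             # gate 2
    ('AND', [('g', 0), ('g', 2)]),   # gate 3 (the output gate)
]
print([evaluate_circuit(xor_circuit, [a, b]) for a in (0, 1) for b in (0, 1)])
# prints [0, 1, 1, 0]
```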

Now a major difference between Turing complexity and circuit complexity is that if you have a sequence of related Boolean functions with inputs of increasing size (such as trying to find a Hamilton circuit in a graph with n vertices), then in a certain sense the Turing complexity is the number of steps you need if you use “the same process” whatever the size of the problem. And that is characteristic of the kinds of algorithms we produce in real life. A circuit-complexity lower bound, by contrast, is in principle a much stronger statement, since in the circuit model you are allowed to come up with a completely different circuit for each size of problem.

In practice, however, nobody has the slightest idea how completely different circuits for each size of problem could ever give you an advantage over a good old-fashioned algorithm. (I say that without being sure that it is true — so if anybody would like to point me towards a paper in the literature that gives an interesting non-uniform upper bound where there is not a corresponding known uniform upper bound, then I’d be fascinated and would happily correct what I’ve just said.) Nevertheless, the theoretical distinction is undeniably there.

One can make the two definitions equivalent by insisting that the sequence of circuits is uniform, meaning that they can be produced in polynomial time by an algorithm. If that is the case, then a polynomial upper bound for the circuit size gives you a polynomial-time algorithm for the original problem, since all you have to do is generate the circuit and then see what it does.
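As a toy illustration of that two-step recipe (again a sketch of my own, with an artificially easy problem standing in for anything interesting), the code below generates, for each input length n and in time polynomial in n, a circuit deciding “does the input contain a 1?”, and then simply evaluates the generated circuit on the input. The circuit-building function is the “uniform” part.

```python
# A uniform family of circuits for "the input contains at least one 1".
# Gate format: ('OR', ref, ref) where a ref is ('x', i) or ('g', j).

def build_or_circuit(n):
    """A chain of binary OR gates over x0, ..., x(n-1), written down in poly(n) time."""
    return [('OR', ('x', 0), ('x', 0))] + \
           [('OR', ('g', i - 1), ('x', i)) for i in range(1, n)]

def evaluate(gates, bits):
    vals = []
    for _, a, b in gates:
        get = lambda ref: bits[ref[1]] if ref[0] == 'x' else vals[ref[1]]
        vals.append(get(a) | get(b))
    return vals[-1]

def contains_a_one(bits):
    circuit = build_or_circuit(len(bits))   # step 1: generate the circuit
    return evaluate(circuit, bits)          # step 2: see what it does

print(contains_a_one([0, 0, 0]), contains_a_one([0, 1, 0]))   # prints 0 1
```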

My idle thoughts were concerned with the following question: what could a proof that P\ne NP conceivably look like if it made essential use of uniformity? That is, how might one reason about algorithms without generalizing them to circuits? It seems pretty hard, because circuits are nice set-theoretic objects that can be formulated in terms of intersections, unions and complements, whereas algorithms are difficult things that are made precise via quite complicated notions such as that of a Turing machine.

Before I go any further, I want to stress that this is a blog post rather than a paper. I am using it to air ideas that I do not consider interesting or original enough to attempt to publish in a more conventional way. Indeed, I would go further and say that, with one exception (a minor result in complexity that was published over a decade ago), I have not had any ideas in theoretical computer science that are both new and interesting to experts. Once or twice I’ve had ideas that met the latter criterion, but they have turned out to be very far from new. I’m not sure whether I’ve had any genuinely new ideas, but if I have, then they have been ideas with flaws that would be instantly apparent to experts.

Perhaps the most promising aspect of algorithms that is not shared by arbitrary circuits is the heavy use they make of iteration. In a sense, this is a triviality: a Turing computation just is the iteration of a fairly simple function, which, if you set it up correctly, will have a fixed point when the computation halts. But the connection can be made more direct via the equivalence of Turing computable functions and recursive functions.
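Here is a toy rendering of that remark (my own illustration; the “machine” is deliberately trivial and is not meant as a faithful formalization of a Turing machine): a configuration is repeatedly fed through a simple step function, and a halting configuration is precisely a fixed point of that function.

```python
# The toy machine erases 1s while moving rightwards and halts when it reads a 0.

def step(config):
    """One move of the toy machine; a halting configuration maps to itself."""
    tape, head, state = config
    if state == "halt":
        return config                          # fixed point: computation is over
    if tape.get(head, 0) == 1:
        new_tape = dict(tape)
        new_tape[head] = 0                     # erase the 1
        return (new_tape, head + 1, "scan")    # move right and keep scanning
    return (tape, head, "halt")                # read a 0 (blank): halt

def run(tape):
    config = (tape, 0, "scan")
    while True:
        nxt = step(config)
        if nxt == config:                      # the iteration has reached a fixed point
            return config
        config = nxt

print(run({0: 1, 1: 1}))                       # ({0: 0, 1: 0}, 2, 'halt')
```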

The problem from the point of view of proving lower bounds is that iteration can give rise to some very strange functions: one has only to think of sets like the Mandelbrot set. So it does not look easy to argue that a set of the form “the set of all Boolean strings for which iterating the following function reaches a fixed point, with output 1, in polynomial time” must have some simplicity property that is not shared by all sets in NP.

Nevertheless, here is a little fantasy. It has all sorts of superficial problems with it, and I’m pretty sure it has deep problems with it too, but perhaps that gives it some small interest for the non-expert as an illustration of what makes the P/NP problem so hard.

The idea has two aspects to it. The first is to exploit the fact that an obvious analogue of the statement P\ne NP in the infinite world is known to be true and is not that hard to prove. To spell out this analogy, let’s talk about sets instead of functions. (Corresponding to the function f:\{0,1\}^n\to\{0,1\} is the set \{x:f(x)=1\}.) A set in P (or rather, a set of polynomial circuit complexity) is one that can be built up from the basic coordinate hyperplanes \{x:x_i=1\} by means of not too many intersections, unions and complements. This should call to mind Borel subsets of \{0,1\}^{\mathbb{N}}, which are sets that can be built up from basic coordinate hyperplanes using only countable intersections, unions and complements. Now an NP set is a projection of a set in P. (That is, if S\subset\{0,1\}^m\times\{0,1\}^n is a set in P, then the set of all x\in\{0,1\}^m such that (x,y)\in S for some y\in\{0,1\}^n is a set in NP.) Projections of Borel sets are called analytic sets, and they are known to be different from Borel sets in general. Moreover, there are also “complete” sets: that is, there are universal analytic sets with the property that if they were Borel then all other analytic sets would have to be Borel.

Once one is aware of these facts, the temptation to try to exploit them is obvious. Indeed, this idea comes into the category of ideas that were quite interesting but far from new: it is credited to Michael Sipser, who, I should also say, had much more detailed and interesting proposals about how to use it than anything I have ever come up with. There are a number of snags with it, however, and I think it is now generally thought of as an approach to P\ne NP that is “known not to work”. (Again, if there are variants of the idea that make some attempt to get round the relativization, natural proofs and algebrization barriers, then I am very happy to retract that last statement.)

OK, here was the thought I had, which is almost certainly something that has been considered in depth by the TCS community — so this post is partly a request for a good reference, perhaps even to Sipser himself. It is not hard to show that the set of all graphs with vertex set \mathbb{N} that contain an infinite clique is analytic but not Borel. Could one perhaps take a purported polynomial-time algorithm for the clique problem for finite graphs, and “see what it does in the infinite case”? The hope would be that it would translate into an “efficient infinite procedure” for determining whether an infinite graph contains an infinite clique, which would in turn amount to a proof that the set of graphs containing an infinite clique was Borel. This would be a contradiction, which would prove that there was no polynomial-time algorithm for the clique problem in the finite case.

Before I list the numerous bad aspects of this idea, let me try to give it some veneer of plausibility by discussing one example where it does seem clear what the “infinite version” of a finite algorithm is. (But I’ll undercut myself in advance by saying that doing this for one algorithm is very different from doing it for a completely general algorithm.)

Let’s consider the following simple algorithm for deciding whether two given points x and y in a finite graph are joined by a path. You just write the number zero next to x, then 1 next to the neighbours of x, and in general at the kth stage you write a k next to all not-yet-labelled vertices that are neighbours of vertices that have been labelled (which will necessarily have been labelled k-1, or the new vertices would have been labelled earlier). You keep going until either you label y or you can no longer label any new vertices. It’s easy to see that this takes polynomial time (however you reasonably choose to implement it) and that by the end you have labelled all vertices that are connected to x with their distance from x.
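For concreteness, here is a short Python rendering of that labelling procedure (my own code; representing the graph as an adjacency dictionary is just an illustrative choice).

```python
# Breadth-first labelling: label x with 0, its neighbours with 1, and so on,
# until no new vertex can be labelled.

from collections import deque

def distance_labels(adj, x):
    """adj: dict mapping each vertex to a list of its neighbours.
    Returns the labels (= distances from x) of all vertices connected to x."""
    labels = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in labels:              # not yet labelled
                labels[w] = labels[v] + 1    # one more than its labelled neighbour
                queue.append(w)
    return labels

def joined(adj, x, y):
    return y in distance_labels(adj, x)

# A 5-vertex example: 0-1-2 form a path, and 3-4 form a separate component.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(joined(adj, 0, 2), joined(adj, 0, 4))   # prints True False
```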

Equally clearly, there is a sense in which this algorithm works for infinite graphs. You just do exactly the same thing that you do in the finite case, and after a countable number of steps either you will have identified a shortest path from x to y (if you label y then start at y and choose at each stage a neighbour with smaller label until you reach x) or you will have shown that x and y are not connected.

Not quite so obviously, the existence of this infinite algorithm can be used to show that the set of all infinite graphs such that x is joined to y is Borel.

So could this be an example of a much more general phenomenon, showing that polynomial-time algorithms have infinitary analogues that produce Borel sets? Here are a number of reasons to think not.

1. The parity function (which is 1 if there is an even number of 1s and 0 if there is an odd number of 1s) is computable in polynomial time. However, the natural infinitary analogue of this function is not even Lebesgue measurable (by the Lebesgue density theorem). More precisely, no function of infinite 01-sequences that changes its value whenever you change a single bit is Lebesgue measurable.

2. There are known complexity lower bounds for parity, but they concern restricted classes of circuits. Particularly relevant is Håstad’s result that parity requires exponentially large circuits if those circuits are of constant depth. This suggests that the correct finitary analogue of Borel sets is not polynomial-time computable sets but sets that are computable by polynomial-size circuits of bounded depth. (I think this is the direction Sipser went in, but I can’t find a copy of his paper Borel Sets and Circuit Complexity online and am not near a library.)

3. A straightforward result in descriptive set theory is that a set is Borel if and only if it and its complement are both analytic. It is conceivable that sets that are in both NP and co-NP are thereby in P, but if that is true then it certainly does not have a straightforward proof. The class NP\cap co-NP includes such famous problems as factorizing. I think it is fair to say that the general belief among experts is that NP\cap co-NP does not equal P, but that this belief is considerably less strong than the belief that P\ne NP. In particular, a number of people have expressed doubts about whether factorizing really requires super-polynomial time, and if a polynomial-time algorithm were to be found for factorizing, then as well as causing a massive shake-up of internet security, it would make it much harder to argue that P\ne NP\cap co-NP.

4. There is a polynomial-time algorithm for deciding whether a bipartite graph has a perfect matching. However, the set of infinite bipartite graphs that have a perfect matching is analytic but not Borel. So the infinitary analogue of the result about perfect matchings would have to be a slightly non-obvious one. (Perhaps it would be something like the statement that the set of bipartite graphs with perfect matchings in which no number is matched to another number that is too far away from it is Borel.)

5. When it is straightforward to produce an infinite version of a finite algorithm, it tends to be possible also to produce a rather boring infinite version. For instance, in the connectivity problem, one can check in a nice countable way whether x and y are joined by simply enumerating all finite paths and seeing whether any of them happens to link x to y. The obvious finitary analogue of this infinitary algorithm is not a polynomial-time algorithm (a short sketch after this list makes the contrast concrete). This suggests that Borelness is somehow “crude and insensitive” in a way that polynomial-time computability is not.

6. I am not sure about this one, but I have a feeling that experts would tell me that any attempt along these lines would relativize: that is, it would also prove the false statement that P\ne NP relative to every oracle. I don’t want to say more than that, because my understanding of this barrier is such that I am likely to get things wrong. Instead, let me refer you to this post of Terry Tao.
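To make the contrast in point 5 concrete, here is the “boring” procedure from that point rendered in Python (again my own illustrative code): it decides connectivity correctly by enumerating candidate paths, but in general it examines exponentially many sequences of vertices, unlike the labelling algorithm earlier in the post.

```python
# Decide whether x and y are joined by enumerating all sequences of distinct
# vertices starting at x and checking whether any of them is a path ending at y.

from itertools import permutations

def joined_by_enumeration(adj, x, y):
    vertices = list(adj)
    for length in range(1, len(vertices) + 1):
        for path in permutations(vertices, length):
            if path[0] != x or path[-1] != y:
                continue
            # Check that consecutive vertices in the candidate path are adjacent.
            if all(b in adj[a] for a, b in zip(path, path[1:])):
                return True
    return False

adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(joined_by_enumeration(adj, 0, 2), joined_by_enumeration(adj, 0, 4))  # True False
```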

Despite these fairly devastating objections, I can’t help having a teeny affection for this general approach. Probably I am wrong to do so, so if anyone would like to hit me with even stronger reasons to think that it cannot work, then you will cure me of a minor ailment and earn a corresponding amount of gratitude.

9 Responses to “My pennyworth about Deolalikar”

  1. Rahul Says:

    Great post! There are actually a few contexts where non-uniformity helps and we don’t know how to eliminate it. Here are three classes of ways in which it can be used; I’d be interested to hear of others:

    (1) To simulate randomness (“Adleman’s trick”). The class BPP of problems solvable in probabilistic polynomial time is in polynomial time with a polynomial amount of advice, but eliminating the advice is one of the major open questions of complexity theory (derandomization). Indeed, eliminating the advice would actually imply lower bounds by results of Kabanets and Impagliazzo.

    (2) The census trick: Perhaps the most interesting application of this is to the NE (non-deterministic time 2^O(n)) versus coNE (complement of NE) question, which is a “lifted” version of NP vs coNP: NE != coNE would imply NP != coNP, but not the other way around. Of course we don’t know if NE = coNE or not, but we know NE is in coNE with n+1 bits of advice (here n is the input length). The trick is to encode the number of strings belonging to your language L at input length n within your advice. Then in coNE, you can guess all the strings that are in L, using the advice to check that you have indeed guessed all such strings, and then accept only those strings which are not in L.

    (3) To encode a promise condition: Sometimes advice can be used in dealing with so-called “semantic” classes such as probabilistic polynomial time, where the acceptance and rejection criteria are mutually exclusive but not exhaustive. It is not known, for example, whether probabilistic quadratic time is more powerful than probabilistic linear time, but this is known if each class is given just 1 bit of advice. Another example here is the result that MA with 1 bit of advice (here MA is NP but with the verification being probabilistic) doesn’t have Boolean circuits of size n^k for any fixed k. Note that the upper bound here uses just 1 bit of non-uniformity while the lower bound is against algorithms with a fixed polynomial number of bits of non-uniformity.

    Having said all this, it’s true that we do not know how to take advantage of uniformity in our lower bounds. Indeed, most known ways to do this run up against the relativization barrier.

  2. none Says:

    I think Mike Sipser spent some time in the 1980s on an approach that sounds something like what you’re describing. He got some nice results out of it, but it fell short of separating P from NP.

  3. Ryan O'Donnell Says:

    Tim, regarding your point 2:

    Mike Sipser once told me that indeed, thinking about Borel Sets and Descriptive Complexity was what led him to the work in the Furst-Saxe-Sipser paper (which is the precursor to Håstad’s final optimal results).

    He said that his original random restriction arguments were for “infinite depth-2 circuits”, which actually made the analysis easier. He then managed to convert these to a finite analogue, with “finite vs. infinite fan-in” turning into “bounded vs. unbounded finite fan-in”.

  4. Cantor Says:

    Circuits can do (much) better than algorithms. Consider the set that contains a binary string x if and only if the nth Turing machine (according to some canonical order of Turing machines, say lexicographic) halts on the empty input, where n=|x| is the length of x. No algorithm can decide whether a given input x belongs to this set. On the other hand, constant circuits can: the circuit for input length n is the constant 1 if the nth machine halts and otherwise it is the constant 0.

  5. Kaveh Says:

    Here is Sipser’s paper:
    http://doi.acm.org/10.1145/800061.808733

  6. Albert Atserias Says:

    As you point out, the analogy between Borel sets and polynomial-time computable sets is appealing but breaks down at a few places. But perhaps the analogy between analytic sets and NP is more robust. One reason to think so is that any set in NP, which in the standard definition is the polynomial projection of a polynomial-time decidable set, can also be expressed as the polynomial projection of a set that can be decided by polynomial-size bounded-depth circuits (in a sense this is a rephrasing of the NP-completeness of the satisfiability problem for Boolean formulas in conjunctive normal form). If you buy this, you might find it interesting that Sipser himself gave a purely combinatorial proof that analytic sets are not closed under complement. Here is the paper:
    http://www.springerlink.com/content/661291h0w622rr76/

  7. KWRegan Says:

    One aspect of uniformity is conveyed by Dexter Kozen’s paper “Indexings of Subrecursive Classes”—there’s a reference in comments of this MathOverflow item. It contains an argument that “If P != NP is provable at all, then it is provable by diagonalization.” I’ve regarded it as a chicken/egg matter on which I took the opposite side from him, but actually it helped me advance my suggestion (or really reason for agreeing with some others) that Deolalikar’s strategy isn’t really using uniformity. If it is, and Dexter is right, one feels there should be more tracks of a diagonalization.

    Your attempt with infinite graphs might be informed by
    Attila Máté’s paper Nondeterministic polynomial-time computations and models of arithmetic. It cites earlier papers which also examine what happens to formalized polynomial-time computations when extended to nonstandard integers (or other nonstandard structures), which might give more control than what you sketch above.

  8. arnab Says:

    A beautiful post! Perhaps it’s easier to think about the infinitary versions starting from the logical characterizations of these complexity classes?
