Answers, results of polls, and a brief description of the program | Gowers's Weblog

I think it’s silly even when it works …

Dear Prof. Gowers,

For finite groups, at least, that would indeed be one way to prove the required cancellation. After repeating the multiplication order(x) times, you’d be left with y=z.
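This observation can be checked numerically. Here is a minimal sketch (my own illustration; the choice of group, the prime 7, and all names are mine, not from the comment): in a finite group x^order(x) is the identity, so starting from x*y = x*z and left-multiplying both sides by x until x has been applied order(x) times in total does indeed recover y = z.

```python
# Toy demonstration: in a finite group, repeatedly multiplying both sides of
# x*y == x*z by x on the left eventually restores y == z, because
# x^order(x) is the identity. The group here is the multiplicative group of
# integers mod 7 (an arbitrary choice for illustration).

P = 7  # a prime, so {1,...,6} under multiplication mod P is a group

def order(x):
    """Multiplicative order of x mod P."""
    k, acc = 1, x
    while acc != 1:
        acc = (acc * x) % P
        k += 1
    return k

x, y, z = 3, 2, 2                     # an instance of x*y == x*z
lhs, rhs = (x * y) % P, (x * z) % P   # one multiplication by x done already
for _ in range(order(x) - 1):         # apply x a further order(x)-1 times
    lhs, rhs = (x * lhs) % P, (x * rhs) % P
# Now lhs == x^order(x) * y == y, and similarly rhs == z.
print(lhs, rhs)  # prints: 2 2
```

Of course, as the surrounding discussion notes, this is a proof a sensible prover should never find: the direct route (multiply both sides by the inverse of x) is one step.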

So maybe it’s not so silly 🙂

Oops — the previous anonymous comment was mine. Rereading it, I should also emphasise that Thomas (Barnet-Lamb) is not just involved in the compiler but has been working with me on mathematical language since we were undergrads; it’s always been an entirely equal collaboration.

Tim has put that better than I could, but I’d like to present another angle on the same issue. The fact that humans are good at maths but surprisingly bad at logic suggests that something is ‘lost in translation’ when maths is “compiled” down into logic. My own interest in the language of mathematics originally stemmed from a feeling that studying it would give me some understanding of what it is that is lost. While in most respects that work has gone much further than I originally hoped, it has been less effective at pinning down what is lost in translation — and that’s a major reason why I became interested in attempting to analyse mathematical cognition more directly.

Incidentally, I should emphasise that the two projects are quite separate — Tom and I have been working on a mathematical-language-to-logic compiler, but it’s entirely separate from the work described in the blog post, except in the rather superficial sense that we (Tom, Tim and I) intend eventually to pipe the output of the compiler into the program described above.

]]>For my part, I don’t believe that humans can solve NP-complete problems. Therefore, mathematicians are not tackling general instances of problem A, but rather a very small class of “hard but not too hard” problems. I see what Mohan and I, and others in the human-oriented tradition, are doing as trying to understand what this class is and why it is so amenable to human mathematicians with their very limited short-term memories and slow processing speeds.

Concerning “in-principle barriers”: actually, we are talking about an algorithm for the problem A = “given a formal system and a statement S in it, determine whether S has a proof of length at most k in it (k is given in unary)”. So, if P=NP, we have no chance against the computer. Moreover, if A has even an efficient heuristic algorithm working for all “interesting” instances, we will lose our jobs. However, I believe that A has no efficient algorithm in any reasonable sense, and in that case smart hints can help a computer to work exponentially faster 🙂
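To make problem A concrete, here is a hedged toy instance of my own (the comment names no particular formal system): Hofstadter’s MIU string-rewriting system, with a brute-force breadth-first search over all derivations of length at most k from the axiom “MI”. The exponential growth of the frontier with k is exactly why exhaustive search is hopeless and smart hints matter.

```python
# Bounded proof search in the MIU system: is `target` derivable from the
# axiom "MI" in at most k rewriting steps? (A toy stand-in for problem A.)

def successors(s):
    """All strings reachable from s in one application of an MIU rule."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                    # rule 1: xI  -> xIU
    out.add(s + s[1:])                      # rule 2: Mx  -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])  # rule 3: III -> U
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])        # rule 4: UU  -> (nothing)
    return out

def provable_within(target, k):
    """Breadth-first search over all derivations of length <= k."""
    frontier = {"MI"}
    for _ in range(k):
        if target in frontier:
            return True
        frontier = set().union(*(successors(s) for s in frontier))
    return target in frontier

print(provable_within("MUI", 3))  # True: MI -> MII -> MIIII -> MUI
print(provable_within("MU", 6))   # False (famously, MU is never derivable)
```

Note that the search only certifies “no proof of length ≤ k”; ruling out MU altogether needs an invariant (the number of I’s is never divisible by 3), which is precisely the kind of insight brute force does not supply.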

]]>I agree that if we want a computer to prove a difficult theorem any time soon, then human interaction is unavoidable. But I don’t see any in-principle barriers to computers eventually putting us out of work. That would be sad, but the route to it could be very exciting as the human interaction gets less and less and the “boring” parts of the proofs that computers can handle get more and more advanced, freeing us up to think about the interesting parts.

Thank you for your comments! I am sure that your thesis was already “something potentially interesting to the ITP community”, but I agree that coming up with a program is even more impressive. I am happy to hear that you are planning to talk to them: this collaboration may result in the building of a new “human-oriented” ITP. I completely understand that there are significant difficulties “which will take some work to get around”, but the result, in the best case, would be a true revolution: all proofs readable, but at the same time fully certified! Moreover, the computer (by proving subgoals automatically) would provide invaluable help, at least with the “boring” parts of the argument… I am sure, however, that interaction with a human is unavoidable, and that the program will not be able to prove really difficult theorems fully automatically; otherwise we would all lose our jobs 🙂

It may be worth adding that our ability to produce human-readable output stems from the fact that we try to model human reasoning as closely as we possibly can. I think it would be very difficult to take an existing automated prover and massage its output into a human-readable format — I won’t say it’s impossible, but we at least have no idea of how to do it. More strongly, I don’t think we would even know where to start: our whole working methodology revolves around the fact that we are trying to think as human mathematicians do — whenever we get stuck, either with the reasoning or the write-up, we think about what we would do as humans and why we do it. Of course, we aren’t the first to work in this way — Newell and Simon, Woody Bledsoe, Alan Bundy and their many students are all giant figures associated with this ‘human-oriented’ approach to proving and we see ourselves as continuing in that vein. But modern automated theorem proving very much seems to have moved away from this tradition — for example, looking through the TPTP website and the associated solutions library, I couldn’t spot a single program from the human-oriented tradition.

Re: interactive theorem proving, I’m very far from an expert, but I think it is certainly the case that what we are doing is much closer to that tradition. But there seem to be subtle ‘impedance mismatches’ which will take some work to get around. For example, as I understand it the standard way of handling a disjunctive hypothesis is to split the goal into two completely separate subgoals — cf. http://kti.mff.cuni.cz/~urban/hol_light_html/Help/DISJ_CASES_TAC.html. A key consequence of this is that the _other_ hypotheses are explicitly duplicated. That doesn’t conform to a human way of thinking — we would normally think of the other hypotheses not as being duplicated, but as being ‘ambient’, accessible from both subproblems. Representing that explicitly is important for our approach, both on a moral level (because we’re trying to model human reasoning closely) and, as it turns out, on a practical level (because each hypothesis has an associated set of tags which change as one applies moves, and duplicating this wouldn’t make much sense). [I should say that if I have mischaracterised the situation in ITP, apologies in advance!]
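The contrast can be sketched in a few lines of code. This is entirely my own invention — not the authors’ data structure, and the class and field names are hypothetical — but it shows the intended behaviour: after a case split on a disjunction, the other hypotheses are shared by reference between the two subproblems rather than copied, so a tag attached to one of them is visible from both branches.

```python
# Sketch of 'ambient' hypotheses: a case split that shares, rather than
# duplicates, the hypotheses not involved in the disjunction.

class Hypothesis:
    def __init__(self, statement):
        self.statement = statement
        self.tags = set()          # tags evolve as moves are applied

class Problem:
    def __init__(self, ambient, extra, goal):
        self.ambient = ambient     # shared list: NOT copied per subproblem
        self.extra = extra         # hypotheses specific to this branch
        self.goal = goal
    def hypotheses(self):
        return self.ambient + self.extra

def split_disjunction(problem, disjunct_a, disjunct_b):
    """Case split: two subproblems that *share* the ambient hypotheses."""
    shared = problem.ambient + problem.extra
    return (Problem(shared, [Hypothesis(disjunct_a)], problem.goal),
            Problem(shared, [Hypothesis(disjunct_b)], problem.goal))

# Example: from {x in A, (x in B) or (x in C)}, split on the disjunction.
h = Hypothesis("x is an element of A")
p = Problem([h], [], "goal")
left, right = split_disjunction(p, "x is an element of B",
                                   "x is an element of C")

# Tagging the shared hypothesis once is visible from BOTH subproblems --
# impossible if it had been duplicated into each subgoal.
h.tags.add("used")
print("used" in left.hypotheses()[0].tags,
      "used" in right.hypotheses()[0].tags)   # prints: True True
```

A DISJ_CASES_TAC-style split would instead deep-copy `h` into each subgoal, and the two copies’ tags would immediately drift apart.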

So that’s a partial explanation of why we haven’t just built our system on top of an existing interactive theorem prover. There are other reasons, for example foundational ones. I do think gaps of this kind will be bridgeable. More generally, we are very much looking forward to talking to people in the theorem proving community about this. But I think that before starting a dialogue, the onus was on us to show that we had something potentially interesting to that community — hence writing this program. The main point of this iteration is to show that human-readable output is possible; we hope that later iterations will also be interesting because of what they are able to prove (vindicating the human-oriented approach). If we reach that stage then we would be delighted to see someone take our program and layer it on top of an interactive theorem prover to get the best of both worlds, namely certification of correctness and human-readable proofs.

I agree with you there too (that a fully automatic prover can be turned into an interactive prover). But there’s a big difference in emphasis if you actually set out to make your prover fully automatic.

Mohan is about to reply with his own comments on the issues you raise.

Thank you for your reply. Actually, automated and interactive theorem proving are very closely interconnected: the more an interactive prover can do fully automatically, the less detailed the hints it requires. For now, even an interactive prover able to do most of the boring bits automatically would be a breakthrough. It could then be continuously improved to require fewer and fewer hints, eventually proving highly non-trivial theorems with no hints at all. However, no matter how strong it becomes, there will ALWAYS be theorems it cannot prove without hints, so allowing interaction would be useful by definition.

Anyway, to “make the experience of interacting with an interactive theorem prover (or working with an automated one) far more pleasant”, two very important improvements are needed: (1) making the prover stronger, and (2) making it understand as much natural mathematical language as possible, and produce proofs that are as readable as possible. While there are a lot of teams working on (1), your team is probably the only one in the world able to make a breakthrough in (2). For this reason, I was surprised that you devote most of your time and effort to (1). But, in any case, the resulting proof of the test lemma (“A closed subset of a compact set is compact.”) looks much more natural than anything another prover can produce, and the results of your experiment are fantastic. So congratulations, good luck in future research, and I will look forward to any further updates.

The place where I disagree with you is when you say, “Obviously, a fully automated prover cannot go too far.” I’m interested in this because I want to understand how human mathematicians have “clever” ideas. I don’t want to give up and settle for a program where all cleverness is supplied by the user and the program just does the boring bits, though I do think that such a program could be a wonderful thing to have as long as not too many hints need to be supplied.

It’s true that there has been a lot of work in both fully automatic and interactive provers. We are very much going for the former, which is why we are not building on the work done with the latter. Also, both of us have a strong tendency to want to work things out for ourselves, since only in that way can we feel confident that the approaches we use are in some sense “forced”. That has indeed led to our reinventing a number of wheels. We hope to reap the reward for this over the next year or two by producing programs that can solve, fully automatically and in a human way, problems that previous fully automatic provers have not yet managed to solve.

Finally, I agree that the work of Mohan and Tom Barnet-Lamb has the potential to make the experience of interacting with an interactive theorem prover far more pleasant, and that that could be extremely exciting and important.

However, I have some questions about the aims of your program. First, and most importantly, I hope you agree that the final version of the “program” must be interactive, with the user able to give it hints? Obviously, a fully automated prover cannot go too far. Imagine, however, that you have a complicated theorem with 20 lemmas, and Lemma 12 is “A closed subset of a compact set is compact.” Imagine that you type this lemma and the program immediately writes a proof for you, so that you can move on to the next lemma. As a result, you would write the high-level outline of the proof, and the program would automatically fill in all the boring details. If the program cannot prove some lemmas, you give more details, until it can. Importantly, in this way no wrong result can be proved, so you would produce a 100% correct proof, with no need for a referee to check it! This is a revolution!!!

Second, it seems to me that you are duplicating work. Automated and interactive theorem provers DO exist and have been developed for years, so why spend time reinventing them? The problem with them is exactly the LANGUAGE. The existing theorem provers work in their own formats, far from usual mathematical language, and this is the main reason why almost no mathematicians use them. As far as I understand it, what is left to do is to write a program translating statements like “A closed subset of a compact set is compact.” into their languages, then use their algorithms to find a proof, and translate the proof back into natural language.

Third, there are strong teams that have been developing automated and interactive theorem provers for years; they have a lot of experience and proof tactics ready for use. Why did you decide not to cooperate with them, and instead work in a small team of two people? I think the existing teams could help with the “proof” part, while you concentrate on the “language” part, to produce a really great program which has the potential to become the default program for writing all mathematics in the near future!!

Am I misunderstanding something?

They are in tension certainly, but they are not incompatible if one defines the word “silly” appropriately. What we mean by “silly” is trying something that no sensible mathematician would try. We don’t mean trying something that with hindsight can be seen as obviously not going to work.

To take an example, suppose you’re trying to prove that every x such that P(x) and Q(x) satisfies R(x). You might say to yourself, “Hey, maybe I can do without assumption Q(x) and show that P(x) alone implies R(x).” And you might try seriously to prove that, before spotting a rather simple counterexample and feeling slightly foolish. But this is not necessarily “silly” in the sense we’re talking about. It would be silly if the reaction of pretty well any mathematician was that Q(x) was *obviously* necessary (though even then, finding a counterexample when Q(x) doesn’t hold could be informative). But often that is not the case and it wouldn’t be silly at all.
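A quick search can make this concrete. The following is a hedged toy instance of my own devising (the particular P, Q and R are my choices, not from the post): the true statement is “if x is a perfect square (P) and x is odd (Q), then x ≡ 1 mod 8 (R)”. Optimistically dropping Q and conjecturing that P alone implies R leads straight to the mildly embarrassing counterexample described above.

```python
# Searching for a counterexample to the over-optimistic conjecture
# "P(x) alone implies R(x)", after dropping the hypothesis Q(x).

import math

def P(x): return math.isqrt(x) ** 2 == x   # x is a perfect square
def Q(x): return x % 2 == 1                # x is odd
def R(x): return x % 8 == 1                # x is congruent to 1 mod 8

# Sanity check: P and Q together do imply R on many small cases
# (every odd square (2k+1)^2 = 4k(k+1) + 1 is 1 mod 8).
assert all(R(x) for x in range(1, 10_000) if P(x) and Q(x))

# Now try to do without Q: is every perfect square 1 mod 8?
counterexample = next(x for x in range(1, 10_000) if P(x) and not R(x))
print(counterexample)  # prints: 4
```

As the paragraph above notes, attempting this is not necessarily “silly”: the failed attempt yields a counterexample (an even square) that sharpens one’s sense of where the hypothesis Q is actually used.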

Here’s an example of the kind of move we would consider silly. Suppose you want to prove the cancellation law for groups: that xy = xz implies y = z. You start by picking y and z such that xy = xz. At that point you could deduce that xxy = xxz, then that xxxy = xxxz, and so on. It would be obvious to a human mathematician that doing that was getting nowhere, so we would want it to be obvious to any program that we wrote.
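One crude way a program might rule out this particular loop is a progress guard. The sketch below is entirely my own (the guard condition is a hypothetical heuristic, not anything from the actual program): a naive rewriter keeps left-multiplying both sides of xy = xz by x, producing ever-longer equations; the guarded version refuses a move that strictly grows both sides while leaving the old expression intact as a suffix.

```python
# Equations are pairs of strings; left-multiplying by x prepends "x".

def multiply_on_left(eq):
    lhs, rhs = eq
    return ("x" + lhs, "x" + rhs)

def naive_run(eq, steps):
    """Apply the 'silly' move blindly, recording every equation produced."""
    trail = [eq]
    for _ in range(steps):
        eq = multiply_on_left(eq)
        trail.append(eq)
    return trail

def guarded_run(eq, steps):
    """Refuse moves that only grow the equation without changing anything."""
    trail = [eq]
    for _ in range(steps):
        new = multiply_on_left(eq)
        if len(new[0]) > len(eq[0]) and new[0].endswith(eq[0]):
            break                      # no progress: do not take the move
        eq = new
        trail.append(eq)
    return trail

print(naive_run(("xy", "xz"), 3))        # xy=xz, xxy=xxz, xxxy=xxxz, ...
print(len(guarded_run(("xy", "xz"), 3))) # prints: 1 -- the move is never taken
```

Of course, a real system needs something subtler than string length (the guard as written would also block some legitimate moves); the point is only that “obviously getting nowhere” can be given a mechanical form.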

Conversely, when one sees a proof with an ‘idea’, it often manifests as some (at first) surprising step. E.g. if you’ve just learnt algebra in school then a step like ‘But wait! That looks like the difference of two squares …’ looks like something out of left field. And it’s not obvious at first where this will go in the system – is it method or data? It’s not a generic step independent of domain knowledge, but neither is it part of the definition of anything.

Do you have an idea about what an idea will look like? Do you think that somehow with more careful analysis of what a human would do next, ideas will drop out, or will ‘having an idea’ require having an idea?

Just to clarify, I don’t know whether Toby Gee has any interest in computer theorem proving, and mentioned him only parenthetically because he has written a paper with Tom Barnet-Lamb, who has worked closely with Mohan. The answer to your main question is no — and I wasn’t even close.
