Recent news concerning the Erdős discrepancy problem

February 11, 2014

I’ve just learnt from a reshare by Kevin O’Bryant of a post by Andrew Sutherland on Google Plus that a paper appeared on the arXiv today with an interesting result about the Erdős discrepancy problem, which was the subject of a Polymath project hosted on this blog four years ago.

The problem is to show that if (\epsilon_n) is an infinite sequence of \pm 1s, then for every C there exist d and m such that \sum_{i=1}^m\epsilon_{id} has modulus at least C. This result is straightforward to prove by an exhaustive search when C=2. One thing that the Polymath project did was to discover several sequences of length 1124 such that no sum has modulus greater than 2, and despite some effort nobody managed to find a longer one. That was enough to convince me that 1124 was the correct bound.
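To make the C=2 claim concrete, here is a small Python sketch (an illustration, not part of the original post). It computes the largest modulus of any sum along a homogeneous progression d, 2d, 3d, \dots, and uses that to carry out the exhaustive search: it finds, by brute force, the longest \pm 1 sequence all of whose such sums have modulus at most 1.

```python
from itertools import product

def max_hap_sum(eps):
    """Largest modulus of a sum eps[d-1] + eps[2d-1] + ... + eps[md-1]
    over all homogeneous arithmetic progressions d, 2d, ..., md."""
    n = len(eps)
    best = 0
    for d in range(1, n + 1):
        s = 0
        for i in range(d, n + 1, d):
            s += eps[i - 1]
            best = max(best, abs(s))
    return best

def longest_with_all_sums_at_most(C, limit=15):
    """Length of the longest +/-1 sequence (searching up to `limit`) whose
    homogeneous-progression sums all have modulus at most C. Brute force."""
    best = 0
    for n in range(1, limit + 1):
        if any(max_hap_sum(seq) <= C for seq in product((1, -1), repeat=n)):
            best = n
    return best

# The C = 2 case of the problem: how long can a sequence be if no sum is
# allowed to reach modulus 2?
print(longest_with_all_sums_at_most(1))
```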

However, the new result shows the danger of this kind of empirical evidence. The authors used state-of-the-art SAT solvers to find a sequence of length 1160 with no sum having modulus greater than 2, and also showed that this bound is best possible. Of this second statement, they write the following: “The negative witness, that is, the DRUP unsatisfiability certificate, is probably one of longest proofs of a non-trivial mathematical result ever produced. Its gigantic size is comparable, for example, with the size of the whole Wikipedia, so one may have doubts about to which degree this can be accepted as a proof of a mathematical statement.”

I personally am relaxed about huge computer proofs like this. It is conceivable that the authors made a mistake somewhere, but that is true of conventional proofs as well. The paper is by Boris Konev and Alexei Lisitsa and appears here.

Taylor’s theorem with the Lagrange form of the remainder

February 11, 2014

There are countless situations in mathematics where it helps to expand a function as a power series. Therefore, Taylor’s theorem, which gives us circumstances under which this can be done, is an important result of the course. It is also the one result that I was dreading lecturing, at least with the Lagrange form of the remainder, because in the past I have always found that the proof is one that I have not been able to understand properly. I don’t mean by that that I couldn’t follow the arguments I read. What I mean is that I couldn’t reproduce the proof without committing a couple of things to memory, which I would then forget again once I had presented them. Briefly, an argument that appears in a lot of textbooks uses a result called the Cauchy mean value theorem, and applies it to a cleverly chosen function. Whereas I understand what the mean value theorem is for, I somehow don’t have the same feeling about the Cauchy mean value theorem: it just works in this situation and happens to give the answer one wants. And I don’t see an easy way of predicting in advance what function to plug in.

I have always found this situation annoying, because a part of me said that the result ought to be a straightforward generalization of the mean value theorem, in the following sense. The mean value theorem applied to the interval [x,x+h] tells us that there exists y\in (x,x+h) such that f'(y)=\frac{f(x+h)-f(x)}h, and therefore that f(x+h)=f(x)+hf'(y). Writing y=x+\theta h for some \theta\in(0,1) we obtain the statement f(x+h)=f(x)+hf'(x+\theta h). This is the case n=1 of Taylor’s theorem. So can’t we find some kind of “polynomial mean value theorem” that will do the same job for approximating f by polynomials of higher degree?
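As a purely numerical illustration of the n=1 case (this example is not from the post, and the choice of f, x and h below is arbitrary), one can locate such a \theta by bisection:

```python
import math

# Illustrative only: f, x and h are arbitrary choices, not taken from the post.
f = math.sin
f_prime = math.cos
x, h = 0.3, 0.5

target = (f(x + h) - f(x)) / h   # the average slope over [x, x+h]

# Bisect for theta in (0,1) with f'(x + theta*h) = target.
# cos is decreasing on [0.3, 0.8], so f'(x + theta*h) - target changes sign.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if f_prime(x + mid * h) > target:
        lo = mid
    else:
        hi = mid

theta = (lo + hi) / 2
# The last two numbers agree, as the theorem predicts.
print(theta, f(x) + h * f_prime(x + theta * h), f(x + h))
```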

Now that I’ve been forced to lecture this result again (for the second time actually — the first was in Princeton about twelve years ago, when I just suffered and memorized the Cauchy mean value theorem approach), I have made a proper effort to explore this question, and have realized that the answer is yes. I’m sure there must be textbooks that do it this way, but the ones I’ve looked at all use the Cauchy mean value theorem. I don’t understand why, since it seems to me that the way of proving the result that I’m about to present makes the whole argument completely transparent. I’m actually looking forward to lecturing it (as I add this sentence to the post, the lecture is about half an hour in the future), since the demands on my memory are going to be close to zero.
Read the rest of this entry »

How to work out proofs in Analysis I

February 3, 2014

Now that we’ve had several results about sequences and series, it seems like a good time to step back a little and discuss how you should go about memorizing their proofs. And the very first thing to say about that is that you should attempt to do this while making as little use of your memory as you possibly can.

Suppose I were to ask you to memorize the sequence 5432187654321. Would you have to learn a string of 13 symbols? No, because after studying the sequence you would see that it is just counting down from 5 and then counting down from 8. What you want is for your memory of a proof to be like that too: you just keep doing the obvious thing except that from time to time the next step isn’t obvious, so you need to remember it. Even then, the better you can understand why the non-obvious step was in fact sensible, the easier it will be to memorize it, and as you get more experienced you may find that steps that previously seemed clever and non-obvious start to seem like the natural thing to do.

For some reason, Analysis I contains a number of proofs that experienced mathematicians find easy but many beginners find very hard. I want to try in this post to explain why the experienced mathematicians are right: many of these proofs really are easy, in the rather precise sense that if you just repeatedly do the obvious thing, you will end up with a proof. Others are mostly like that, with perhaps one smallish idea needed when the obvious steps run out. And even the hardest ones have easy parts to them.
Read the rest of this entry »

Introduction to Cambridge IA Analysis I 2014

January 11, 2014

This term I shall be giving Cambridge’s course Analysis I, a standard first course in analysis, covering convergence, infinite sums, continuity, differentiation and integration. This post is aimed at people attending that course. I plan to write a few posts as I go along, in which I will attempt to provide further explanations of the new concepts that will be covered, as well as giving advice about how to solve routine problems in the area. (This advice will be heavily influenced by my experience in attempting to teach a computer, about which I have reported elsewhere on this blog.)

I cannot promise to follow the amazing example of Vicky Neale, my predecessor on this course, who posted after every single lecture. However, her posts are still available online, so in some ways you are better off than the people who took Analysis I last year, since you will have her posts as well as mine. (I am making the assumption here that my posts will not contribute negatively to your understanding — I hope that proves to be correct.) Having said that, I probably won’t cover exactly the same material in each lecture as she did, so the correspondence between my lectures and her posts won’t be as good as the correspondence between her lectures and her posts. Nevertheless, I strongly recommend you look at her posts and see whether you find them helpful.

You will find this course much easier to understand if you are comfortable with basic logic. In particular, you should be clear about what “implies” means and should not be afraid of the quantifiers \exists and \forall. You may find a series of posts I wrote a couple of years ago helpful, and in particular the ones where I wrote about logic (NB, as with Vicky Neale’s posts above, they appear in reverse order). I also have a few old posts that are directly relevant to the Analysis I course (since they are old posts you may have to click on “older entries” a couple of times to reach them), but they are detailed discussions of Tripos questions rather than accompaniments to lectures. You may find them useful in the summer, and you may even be curious to have a quick look at them straight away, but for now your job is to learn mathematics rather than trying to get good at one particular style of exam, so I would not recommend devoting much time to them yet.
Read the rest of this entry »

DBD2 — success of a kind

January 9, 2014

Yesterday, as I was walking to my office in the morning, I planned to write a post in which I was going to say that Polymath9 had basically been a failure, though not a failure I minded about, since it hadn’t had any significant negative consequences. Part of the reason I wanted to say that was that for a few weeks I’ve been thinking about other things, and it seems better to “close” a project publicly than to leave it in a strange limbo.

When I got to my office, those other things I’ve been thinking about (the project with Mohan Ganesalingam on theorem proving) commanded my attention and the post didn’t get written. And then in the evening, with impeccable timing, Pavel Pudlak sent me an email with an observation that shows that one of the statements that I was hoping was false is in fact true: every subset of \{0,1\}^n can be Ramsey lifted to a very simple subset of a not much larger set. (If you have forgotten these definitions, or never read them in the first place, I’ll recap them in a moment.)

How much of a disaster is this? Well, it’s never a disaster to learn that a statement you wanted to go one way in fact goes the other way. It may be disappointing, but it’s much better to know the truth than to waste time chasing a fantasy. Also, there can be far more to it than that. The effect of discovering that your hopes are dashed is often that you readjust your hopes. If you had a subgoal that you now realize is unachievable, but you still believe that the main goal might be achievable, then your options have been narrowed down in a potentially useful way.

Is that the case here? I’ll offer a few preliminary thoughts on that question and see whether they lead to an interesting discussion. If they don’t, that’s fine — my general attitude is that I’m happy to think about all this on my own, but that I’d be even happier to discuss it with other people. The subtitle of this post is supposed to reflect the fact that I have gained something from making my ideas public, in that Pavel’s observation, though simple enough to understand, is one that I might have taken a long, or even infinite, time to make if I had worked entirely privately. So he has potentially saved me a lot of time, and that is one of the main points of mathematics done in the open.
Read the rest of this entry »

A little paradox

December 9, 2013

This post is intended as a footnote to one that I wrote a couple of years ago about the meaning of “implies” in mathematics, which was part of a series of posts designed as an introduction to certain aspects of university mathematics.

If you are reasonably comfortable with the kind of basic logic needed in an undergraduate course, then you may enjoy trying to find the flaw in the following argument, which must have a flaw, since I’m going to prove a general statement and then give a counterexample to it. If you find the exercise extremely easy, then you may prefer to hold back so that others who find it harder will have a chance to think about it. Or perhaps I should just say that if you don’t find it easy, then I think it would be a good exercise to think about it for a while before looking at other people’s suggested solutions.
Read the rest of this entry »

DBD1 — initial post

November 3, 2013

This post is intended as a launch of Polymath9. I have no idea how the project will go, but I think it may be rather short-lived, since the difficulties I am having at the moment look as though they could turn out to be serious ones that rule out any approach along the lines I have been thinking about. However, it is difficult to say that with any certainty, because the approach is fairly flexible, so even if the precise statements I have been trying to prove are false, it might be possible to come up with variants that are true. In a way I find that a good state of affairs, because it increases the chances of proving something interesting. Obviously it increases the chances of proving that P\ne NP if one has more ways of attacking the problem. (I’m not claiming that it increases the probability to one that is not small — just that it increases it.) But it also increases the chances of what I would regard as a very nice consolation prize if, as expected, the approach does not work, namely a new barrier to proving that P\ne NP. I don’t think it would be as fundamental a barrier as the three main barriers discovered so far, since it would not be showing that existing methods cannot work. Rather, it would be saying, “Here’s something we could try. Oh dear, it doesn’t work.” But as long as that something was reasonably general, I think its failure to work could be interesting enough to publish.

I’ve thought a little about what phrase to attach to the project (the equivalent of “density Hales-Jewett” or “Erdős discrepancy problem”). I don’t want to call it “P versus NP” because that is misleading: the project I have in mind is much more specific than that. It is to assess whether there is any possibility of proving complexity lower bounds by drawing inspiration from Martin’s proof of Borel determinacy. Only if the answer turned out to be yes, which for various reasons seems unlikely at the moment, would it be reasonable to think of this as a genuine attack on the P versus NP problem. So the phrase I’ve gone for is “discretized Borel determinacy”. That’s what DBD stands for above. It’s not a perfect description, but it will do.
Read the rest of this entry »

What I did in my summer holidays

October 24, 2013

This post is intended to accomplish several things at once. First and foremost, I want to explain (not just in this post) why I have been interested in Borel determinacy and in the natural proofs barrier. Roughly speaking (or should I say tl;dr?), I think that Martin’s proof of Borel determinacy has features that might just conceivably offer a way past that barrier.

As long-term readers of this blog will be aware, the P versus NP problem is one of my personal mathematical diseases (in Richard Lipton’s sense). I had been in remission for a few years, but last academic year I set a Cambridge Part III essay on barriers in complexity theory, and after marking the essays in June I thought I would just spend an hour or two thinking about the problem again, and that hour or two accidentally turned into about three months (and counting).

The trouble was that I had an idea that has refused to die, despite my best efforts to kill it. Like a particularly awkward virus, it has accomplished this by mutating rapidly, so that what it looks like now is very different from what it looked like at the beginning of the summer. (For example, at that stage I hadn’t thought of trying to model a proof on the proof of Borel determinacy.) So what am I to do?
Read the rest of this entry »

Holding a country to ransom

October 15, 2013

Here is a quick thought about the mathematics of the US shutdown, not to be taken too seriously (the thought I mean — the shutdown obviously is to be taken seriously). It’s for the benefit of anyone who is puzzled that the Tea Party can have such a large influence, and more generally how a political system can be stable when almost nobody likes it. I’m going to prove that in a country of n people, it is possible to devise a democratic system in which n^\alpha of those people control the decisions, where \alpha=\log 2/\log 3. For example, in a population of 100,000,000, all you need is a band of about 112,000 fanatics — or approximately 0.1% of the population. Although no actual country has such a system, and such an extreme distribution of power is unlikely, the systems we do have still allow a minority to exert undue influence, and for similar reasons. What I’m about to describe is the extreme case.
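The construction itself is in the rest of the post; as a rough preview, here is one scheme that achieves this exponent (the 2-out-of-3 splitting rule below is a standard “majority of majorities” device, offered as an illustration rather than as a quotation from the post): split the population into three equal groups, declare that a motion passes if it passes in at least two of the groups, and apply the same rule recursively within each group. A short Python check of the arithmetic:

```python
import math

def controllers(n):
    """Minimum number of voters needed to force every decision, assuming a
    population of size n (a power of 3) is split into three equal groups, a
    motion passes iff it passes in at least two of them, and the same rule
    is applied recursively within each group."""
    if n == 1:
        return 1
    return 2 * controllers(n // 3)

alpha = math.log(2) / math.log(3)
n = 3 ** 17                     # 129,140,163 people, roughly 10^8
# Both expressions give 2^17 = 131,072, which is about 0.1% of n.
print(controllers(n), round(n ** alpha))
```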
Read the rest of this entry »

Razborov and Rudich’s natural proofs argument

October 7, 2013

Introduction

The purpose of this post is to add some rigour to what I wrote in the previous post, and in particular to the subsection entitled “Why should we believe that the set of easily computable functions is a ‘random-like’ set?” There I proved that if the Rubik’s-cube-like problem is as hard as it looks, then there can be no polynomial-time-computable property that distinguishes between a random composition of n^k 3-bit scramblers and a purely random Boolean function. This implies that there can be no polynomial-time-computable “simplicity” property that is satisfied by every Boolean function of circuit complexity at most n^k but is failed by almost every Boolean function.
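The scramblers were defined in the previous post, which is not reproduced here, so the following Python sketch rests on an assumed reading rather than the post’s exact definitions: take a 3-bit scrambler to be a permutation of \{0,1\}^n that permutes the eight possible patterns on three chosen coordinates and fixes everything else, compose n^k of them at random, and read off a Boolean function (the choices k=3 and “first output bit” below are arbitrary).

```python
import random

def random_scrambler(n, rng):
    """A permutation of {0,1}^n acting on three randomly chosen coordinates:
    the eight patterns those coordinates can show are permuted at random and
    every other coordinate is left alone."""
    coords = rng.sample(range(n), 3)
    table = list(range(8))
    rng.shuffle(table)                         # random permutation of {0,1}^3
    def apply(x):                              # x is a tuple of n bits
        pattern = 4 * x[coords[0]] + 2 * x[coords[1]] + x[coords[2]]
        new = table[pattern]
        y = list(x)
        y[coords[0]], y[coords[1]], y[coords[2]] = (new >> 2) & 1, (new >> 1) & 1, new & 1
        return tuple(y)
    return apply

def random_composition(n, steps, rng):
    """Compose `steps` independent random 3-bit scramblers."""
    scramblers = [random_scrambler(n, rng) for _ in range(steps)]
    def apply(x):
        for s in scramblers:
            x = s(x)
        return x
    return apply

rng = random.Random(0)
n = 10
perm = random_composition(n, steps=n ** 3, rng=rng)
f = lambda x: perm(x)[0]                       # a Boolean function on {0,1}^n
print(f((0,) * n), f((1,) * n))
```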

I personally find the assumption that the Rubik’s-cube-like problem is hard very plausible. However, if you disagree with me, then I don’t have much more I can say (though see Boaz Barak’s first comment on the previous post). What Razborov and Rudich did was to use a different set of random polynomial-time-computable functions that has a better theoretical backing. They build these functions out of a pseudorandom function generator, which in turn is built out of a pseudorandom generator, which is known to exist if the discrete logarithm problem is hard. And the discrete logarithm problem is hard if factorizing large integers is hard. Since many people have tried hard to find an algorithm for factorizing large integers, there is some quite strong empirical evidence for this problem’s being hard. It’s true that there are also people who think that factorizing is not hard, but the existence of a pseudorandom generator does not depend on the hardness of factorizing in particular. Indeed, perhaps a more significant advantage of the Razborov-Rudich argument is that any pseudorandom generator will do, so the correctness of their conclusion is based on a weaker hypothesis than the one I used earlier.
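The standard way to build a pseudorandom function generator out of a length-doubling pseudorandom generator is the Goldreich-Goldwasser-Micali tree construction, and here is a toy Python sketch of it. The “generator” G below is just SHA-256 pressed into service so that the code runs; it is a stand-in with no supporting hardness assumption, so the sketch only shows the shape of the construction.

```python
import hashlib

# Toy stand-in for a length-doubling pseudorandom generator: 16 bytes in,
# 32 bytes out.  (SHA-256 is NOT a provable PRG; it is used here only to
# make the sketch self-contained.)
def G(seed: bytes) -> bytes:
    return hashlib.sha256(seed).digest()       # 32 bytes = two 16-byte halves

def ggm_prf(key: bytes, x: str) -> int:
    """GGM construction: walk down a binary tree, taking the left or right
    half of G's output according to each input bit; return one output bit."""
    seed = key
    for bit in x:                              # x is a bit string like "0110"
        out = G(seed)
        seed = out[:16] if bit == "0" else out[16:]
    return seed[0] & 1

key = b"0123456789abcdef"                      # 16-byte key (the PRF seed)
# The resulting Boolean function on {0,1}^4, tabulated.
print([ggm_prf(key, f"{i:04b}") for i in range(16)])
```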
Read the rest of this entry »

