In this post I want to revisit a topic that I first discussed on my web page here. My aim was to present a way in which one might discover a solution to the cubic without just being told it. However, the solution that arose was not very nice, and at the end I made the comment that I did not know a way of removing the rabbit-out-of-a-hat feeling that the usual much neater formula for the cubic (together with its derivation) left me with.

A couple of years ago, I put that situation right by stumbling on a very simple idea about quadratic equations that generalizes easily to cubics. More to the point, the stumble wasn’t completely random, so the entire approach can be justified as the result of standard and easy research strategies. I am no historian, but I would imagine that this idea is pretty similar to the idea (in some equivalent form) that first led to this solution.

I shall assume familiarity with solving quadratics—the problem here is to find the right way of generalizing this solution. (If you want to see how one might discover a solution to the quadratic then I cover that in the earlier discussion of cubics.) Given that, then the first step in a natural discovery of a solution of the cubic is the observation, which one can hardly help making, that solutions to quadratics take the form . If we now turn things round and just assume that solutions will take this form then we can get a very quick derivation of the quadratic formula, which, for simplicity, I will do just for quadratics of the form . (Of course, it is very easy to reduce the general case to this case, so this is not a serious loss of generality.)

The derivation comes from the well-known fact that the roots of such a quadratic must add up to and must multiply to give . The first fact tells us that and the second tells us that , which in turn tells us that , so that . By our earlier computation, this is . This gives the usual quadratic formula in the case .

Was that a fully justified argument? Yes, because once you are looking for roots of the form there is no mystery behind the idea of looking at what you know about the two roots, converting that into some equations for and and trying to solve those equations. You can’t tell in advance that the equations will have a nice solution, but it’s very natural to give the approach a try.

Now let us ask ourselves the following question: what would be the most blindingly obvious way of generalizing the above approach to cubics? There are two ideas we might have in connection with this. The first is to try to get the cubic into as simple a form as possible, and the second is to make a guess about the general form of the roots. Let us take each of these in turn, beginning with the second.

What is the most natural way of generalizing our choice above for the form of the roots? To ask this question another way: we are trying to find XXX, where XXX is to the number 3 as and are to the number 2. There is a very obvious guess: we should take , and , where , and are the three cube roots of some number . If we write for the cube root of 1 (or, to be more specific, the number , then we can write this guess as , and (where is some cube root of —it doesn’t matter which).

By analogy with the quadratic case, we are hoping that this will be the general form of a solution to the equation . But a moment’s thought shows that it cannot be. Let us see this in two different ways.

The first is that if that is the general form of the roots, then we have two degrees of freedom—the choice of and the choice of . But we are looking at a three-dimensional set of equations (since we are free to choose , and ). It is a good exercise to prove rigorously that our guess is *guaranteed* to be wrong for this reason, but for now let us be satisfied with the observation that it looks very worrying. Indeed, if life were that simple then it is hardly likely that solving the cubic would have been as hard a problem as it was.

A second way to see that the guess is wrong is to consider what happens if . Now we are looking at a cubic of the form , and if the roots take the form stated then, since their sum is now zero, we find that . But then the three roots are just the cube roots of , so they are the roots of the equation . In other words, the guess is wrong unless . (This is of course an instance of the fact that we do not have enough degrees of freedom.)

So, with this small extra insight into the problem, let us try to come up with a better guess. How do we generalize a pair such as and ? We want a triple of roots, but we also want each component of the triple to have *three* degrees of freedom. In other words, we want each root to be made out of a , a and a .

Since we don’t quite know how we will build the roots, a helpful idea at this point is to *lose* some information in the quadratic case. This is a slightly subtle point that I will discuss more in a moment. First let us merely observe that I *could* have represented the two roots of a quadratic as and , and it would still have been very easy to solve for and . Then the fact that a square root was involved would not have been a *guess* (however natural) but something that one actually *derived*, in a very easy and natural way.

Since this slight modification of the quadratic guess will turn out to be very helpful, it is important to establish that it could be justified. That is, I am not drawing a rabbit out of a hat at *this* point. The justification is as follows. In the cubic case we do not know exactly what the form of our guess would take. We could just make some *wild* guesses and hope to hit the right answer. But much better is to make more *general* guesses and then *work out* what their more precise forms must be. We can do that in the quadratic case, so it is a very sensible strategy to try to do the same for cubics.

Having established this point, let us see what happens. We are now trying to find the natural analogue for the number 3, built out of three variables , and , of the pair in the degree-2 case. The pair consists of a couple of linear combinations of and , so it is natural (though not essential to the discovery of the argument) to think of it as a linear transformation of the pair . That draws our attention to the matrix , and it is then very natural to wonder if this matrix has an obvious generalization to a matrix.

It does! This is the case of the well-known *circulant* matrix, but even if you don’t know that, you do know that the numbers 1 and -1 are the two square roots of 1. Moreover, this is not just a coincidence but the *reason* that they occur in our discussion of quadratics. So it is natural to try to build a matrix out of the three *cube* roots of 1, which are , and . In the end there is only one sensible choice to make (give or take the odd symmetry). It is the matrix . Thus, our guess for the forms of the three roots is , and .

This seems a very satisfactory guess (even if we don’t have a compelling reason to suppose that it will work). So now we are left with the task of solving for , and on the assumption that they are the roots of the cubic . At this point one could just plunge in, but it helps a lot to simplify the cubic first by “completing the cube”. This is the familiar idea (described in my other cubics discussion) that by substituting for you get a cubic in where the coefficient of is zero. So let’s just assume, as we may, that , so that we are looking for roots of . Since the roots add up to 0 and , this tells us that , so the three roots are now of the form , and . (We are therefore down to two degrees of freedom, but so is the cubic we are trying to solve.)

The information we know about these three roots is that their product is and that the sum of all the products of two of them is . So the next task is clear: expand out these expressions and see if we can solve the resulting equations in and . The details of this are not particularly important: you could stop reading now and just take on trust that we end up needing to solve quadratics and take cube roots, both of which we are allowed to assume that we can do. However, it’s nice to see that it really does work.

The product of the three numbers , and works out to be . (It’s instructive to do this calculation for yourself and see how the fact that makes the other two possible terms cancel. Then one can see that the fact that rather simple expressions come out of these calculations is not a coincidence.) As for the sum of the three products of two of them, it comes out to be , which equals . So we need and to take the values and , respectively. This tells us that and are the two roots of the equation , so, as claimed, we can solve for and by solving a quadratic and taking cube roots.

A small extra point is that one must think a bit about which cube roots to take, but that I will gloss over here.

An obvious question: what happens if one tries to generalize this approach to quartics and quintics? The answer is that in both cases it is obvious how to generalize the guess about the form that the roots should take. In the case of the quartic, when one guesses that they are of the form , , and , everything works out nicely, if you get rid of the term and hence of . You get some equations in and and they aren’t too hard to solve. If you try it for the quintic then, not too surprisingly, you end up with some equations that are more complicated than the quintic you started with.

Apologies for the matrices not coming out: I’ll repair that as soon as I can work out how to do so. [*Now sorted out, with help from comments below: many thanks*.]

September 15, 2007 at 7:49 pm |

Dear Timothy, let me praise your effort in “demystifying proofs”, i think that it is one of the most important roles of mathematics and we (mathematicians or just students like me) should try to spread a wider conscience of its importance.

This is a more “human” aspect of mathematics, since it concerns more “who is studing” rather than “what is being studied”.

I feel often tempted to try to learn proofs and to make proofs just by blindly applying magic tricks, trying to produce a proof that may be checked by a computer, but that would give no insight about what is being proved. I’m now learning to recognize such tendency as missing trust in what can be understood (while instead i’m blindly relying on my intuitive skills to “set up the mess”), and i’m trying to correct it by all means.

When (if) i will be a professor, i will be very careful while saying someone to “work hard”, to avoid people trying to do so by just studying theorems without sufficient awareness (instead of trying to develop it).

I’m just trying to express my thoughts, and i’d be happy to listen the advice of more experienced people. Thanks for your post!

September 15, 2007 at 8:03 pm |

Check out this post. It helped figure out how to get matrices working properly on WordPress.com

September 15, 2007 at 8:41 pm |

Dear Tim,

For matrices in wordpress, I find that \begin{pmatrix} … \\ … \end{pmatrix} works nicely, e.g. . (If you “edit” this comment you will be able to steal the latex code.)

It is amusing to recast your above discussion through the lens of Fourier analysis. One can view solving a polynomial as solving a system of symmetrised equations. For instance, by the factor theorem, the task of finding the three roots x,y,z of the cubic is equivalent to solving the system of equations

.

Now these equations are invariant under cyclic shift of the x,y,z. One of the key insights of Fourier analysis is that any mathematical problem which enjoys a translation invariance symmetry is likely to be clarified by use of the Fourier transform. This would motivate the Fourier substitution , , , which leads to

.

This lets one solve for u; to solve for v and w one can then observe a residual cyclic symmetry between v and w, prompting another Fourier transform , , which soon lets one solve for everything.

Presumably this can all be interpreted nicely in terms of the Galois group , and in particular to the solvability of that group, especially given that solvable groups can be “built up” from abelian groups such as and , which are precisely the groups which enjoy nice Fourier transforms. My Galois theory is incredibly rusty, though, so I don’t see the connection clearly.

October 29, 2013 at 11:47 am

It should be $3u^2-3vw=c$, I think.

I don’t see how the substitution $v = s+t$, $w = s-t$ helps; $v^3 = s+t$, $w^3 = s-t$ is what we need; I don’t see why this would be preferred on symmetry grounds.

August 26, 2013 at 5:46 pm

Dear Prof.Tao,

I give up your 245a real analysis blog series for a moment and decide to turn to complex analysis .Ancient mathematicians first encounter imaginary numbers while they were studying the root of cubic equations.Then I am troubled by the cubic equation,because I think that the method of solving the cubic equation is too unacceptable by me,only genius will discover them and accept them without any uncomfortable!

Luckily I see your comment.I think that the method of using discrete fourier transform is very good,although I think it is also not easy to come up with this method because I think there is no direct hint on why we should use fourier transform…

September 15, 2007 at 9:07 pm |

Terry: I didn’t have the “p” in “\pmatrix” and my slashes were the wrong way round: thanks for the tip. I idly wondered about Fourier analysis when the circulant matrix came up (of course) but didn’t get further than that “idly”. It is indeed a nice way of looking at it, and gives me some hope of carrying out a further project, possibly with the help of suggestions on this blog, of arriving at a completely demystified proof of the insolubility of the quintic. My Galois theory is also very rusty — in fact, I never really understood it properly as an undergraduate — which I regard as an essential qualification for carrying out such a project, since what I’d really like to do is end up with a proof that doesn’t use the language of group theory. Alternatively, I’d like to make enough elementary observations (all by following natural problem-solving techniques) that the idea of looking at automorphisms of number fields emerges of its own accord. The second approach is probably more tempting to anybody who does know their Galois theory well, but it is a challenge to do it if you aren’t allowed to draw rabbits out of hats. But your idea of

notdoing the initial simplification by getting rid of the quadratic term and seeing two symmetries in operation is a big help, since, as you say, it leads to the notion of solvability of a group (not that I have fully worked out the connection either, and one has to try to find the connection in a way that doesn’t rely on knowing that it is there to be found).September 15, 2007 at 9:53 pm |

About a demystified proof of the insolubility of the quintic: I heard (but haven’t checked myself) that a nice book about this is the one by Alekseev which explains a topological approach due to Arnold which is meant to be understandable by high school students. Here’s the amazon entry http://www.amazon.com/Abels-Theorem-Problems-Solutions-International/dp/1402021860 and here is a related note of Arnold about it http://www.institut.math.jussieu.fr/seminaires/singularites/abel.pdf

September 15, 2007 at 10:54 pm |

Thanks for the nice post. Give us more like this please :-)

I think you have a typo where you have written x3 – dx – c 3 instead of x2 – dx – c 3

September 16, 2007 at 12:05 am |

Phil: thanks — typo fixed.

September 16, 2007 at 8:34 pm |

I really enjoyed this post and the discussions afterwards. It seems many people are rusty on Galois theory. Every time I have learned Galois theory (and then promptly forgotten it) I always left feeling that it is an amazing theory but I would never know how to use it to solve a concrete question. For example: If one knows that a given quintic is solvable by radicals, how does one go from there and actually find the roots? Any thoughts?

September 17, 2007 at 5:33 am |

By some coincidence, there is another blog post on the solvability of the cubic, this time from the perspective of classical invariant theory, at

http://rigtriv.wordpress.com/2007/08/29/invariants-and-solving-polynomials/

Basically, the philosophy here is to only permit yourself to write down expressions (such as the discriminant) which transform nicely under projective changes of coordinates. There are relatively few of these expressions, and so you have a smaller “search space” in which to find the invariants that factorise the original polynomial into linear factors.

September 18, 2007 at 1:27 pm |

Hi Tim,

It was stated on your home page that any reasonable person wouldn’t be expected to solve the cubic in a few hours or so. However, how long would you expect say a cambridge undergraduate with no knowledge of solving the cubic to discover the solution?

And I wondered if it would be a good idea if you were to write a similar article on finding the closed form sum of 1+(2^m)+(3^m)…+n^m where n and m are positive integers, to offer some insight into how one might discover the formula for that?

Thanks,

John

September 18, 2007 at 11:15 pm |

JOHN SMITH:

There is a discussion in section 6.5 of Concrete Mathematics, by Graham, Knuth and Patashnik, of how one might discover and prove the formula for this sum by nothing but sheer obstinancy. Later, in section 7.3, they describe how the problem becomes easier when armed with the tool of generating functions. I’ll sketch the latter attack here, because generating functions are a great tool. Set S(m,n)=1+2^m+3^m+..+n^m. There are four approaches you might try in a generating function attack: you could define any of the four functions

A_m(w)=sum_n S(m,n) w^n, B_m(w)=sum_n S(m,n) w^n/n!, C_n(z)=sum_m S(m,n) z^m, or D_n(z)=sum_m S(m,n) z^m/m!

and try to get this function into a simple form. With A, B and C, we strike out. But, if we come back for a fourth swing, D_n has the nice closed form

(e^{nz}-1)/(e^z-1)=(sum_{k>=1} n^k z^k/k! ) (1/z-1/2+z/6-z^3/30+…)

and we get the closed formula immediately. So one approach to this sum comes down to knowing about generating functions, being willing to try again when the first attack fails, and some basic comfort manipulating series.

September 20, 2007 at 9:30 am |

[…] this? The reason is rather similar to the reason that I got rid of the square root round in my post about cubics. In both cases I wanted to generalize something that was a bit too complicated for it to be obvious […]

September 21, 2007 at 3:43 pm |

My answer to JOHN SMITH’s first question of September 18th is that I would expect most Cambridge undergraduates

notto manage to find a solution to the cubic unaided. However, that is not because they would be incapable of it. My post tries to demonstrate that by showing that you don’t have to have flashes of genius to solve the cubic — just the wish to generalize the quadratic solution in as natural a way as you can. Rather, it is because only a smallish proportion of Cambridge undergraduates (or indeed, mathematics undergraduates anywhere) start out with the belief that they could ever solve a mathematics problem that wasn’t carefully constructed to be solvable in a fairly routine way. There’s a simple way of getting this belief, which is to solve one hard problem. It may take a few hours, or a week, or a month, or six months, or even longer, but once someone has done it they understand from their own experience that it is possible. Then they start to have positive thoughts like, “What strategy will maximize my chances of solving this problem?” or “I seem to keep having the same difficulty — what could I do differently?” rather than negative ones like, “Maths is hard, and I haven’t been taught how to do this, so there’s no chance of my managing.” And then they find that solving difficult problems is just like many other activities: hard, yes, but not impossible and something that gets easier with practice.September 21, 2007 at 4:47 pm |

Thanks for that Tim. What you say is encouraging to me. I have deliberately Not read this post on the cubic in its entirety because I wanted to discover it without being told how which is I presume, the whole idea of the post.

At first I tried ‘completing the cube’, which seemed a natural thing to try but I was heading in the wrong direction..

December 5, 2007 at 5:57 pm |

Sorry for this somewhat late post – I just discovered your blog and am playing catch-up! Re: your “arriving at a completely demystified proof of the insolubility of the quintic”, I would recommend looking at Galois theory by the route through which it was originally discovered, rather than the more modern abstract point of view. Or more accurately, the various attempts at it (e.g. by Lagrange) which were subsequently refined (Galois) and what ideas led them at each stage. I haven’t yet read much about this myself, but as far as I remember the book “Pioneers of Representation Theory” has a nice section on this early on, including the sorts of explicit calculations gregknese was enquiring about – how do I actually “solve” this supposedly solvable equation?

On an unrelated note, when you say “we end up needing to solve quadratics and take cube roots, both of which we are allowed to assume that we can do”, I wonder how many people would actually consider cube roots of a complex number something basic they “can calculate”, as, unlike in the square root case, you can’t find the real and imaginary parts in terms of roots of only real numbers, so presumably would resort to using transcendental functions (in which case you could of course solve the quintic too.) Utlimately you have to make a choice of “numbers I can deal with” – something you’ve talked about before in other places.

December 27, 2007 at 7:25 am |

When Mark Kac was young he solved the cubic in a different way if a, b, and c are the roots and w^3 = 1 where w is not 1.

Let s = a + b + c, t = a + wb +w^2c, u = a + w^2b + wc

Then express t^3 + u^3 and t^3 * u^3 in terms of the symmetric terms.

There is a nice story to that solution you can find it in Mark Kac’s autobiography “Enigmas of Chance”.

It seems to me that this solution is related to Terence Tao’s solution as a sort of inverse transformation.

Since someone cited sums of powers I remembered finding a nice “geometrical” solution to the sum of squares 1^2 + 2^2 + 3^2 +…+ n^2. Organize one 1, two 2’s, three 3’s, …., and n n’s into an equilateral triangle in the natural way, then add the corresponding elements of the three versions of this triangle (the original and the rotated by 120 and 240 degrees) it nice to see that the sum is constant and equal to 2n+1. Since there are n(n+1)/2 elements in the triangle the total sum is n(n+1)/2*(2n+1) and the sum of the elements in the original triangle n(n+1)(2n+1)/6 which is the sum of the squares.

I would like to present this in a motivated way but I am afraid that is somewhat longer. I also tried to generalize this idea to get the sum of cubes but failed. Since it has such a nice sum I suppose that it could also have a nice geometrical method to it. Anyone has an idea?

December 30, 2007 at 9:32 pm |

As Nick already pointed out, the historical route to Galois theory is the most illuminating. According to the marvelous read

Jörg Bewersdorff. Galois Theory for Beginners: A Historical Perspective

http://tinyurl.com/2u6k52

Lagrange came up with the sum-of-roots-of-unity ansatz while trying to understand what’s going on with the quintic.

In fact, I find the usual “modern” presentation of Galois theory utterly incomprehensible. I mean, it’s not so bad and quite simple after knowing the historical approach of permuting the roots. But why, why is the easy way pretty much forgotten? Why struggle months with something abstract-nonsensical and void of motivation when there’s a dead simple and even more illuminating approach to understand it in hours? As Feynman (?) put it: “If you can’t explain it to a six year old, you don’t really understand it.” But I’m preaching to the choir :-)

December 30, 2007 at 9:47 pm |

Oh, and concerning the sum $1^k + 2^k + \dots + n^k$, there is a simple way for finding a formula for any given $k$: the simple observation is that in the cases $k=1,2,3$, the formula is a polynomial of degree $k+1$ and the idea is to make exactly this ansatz in the general case.

This is further simplified by changing coordinates, i.e. by using a different basis than $n^j$ for the vector space of polynomials. A better basis is $n, n(n-1), n(n-1)(n-2), \dots, n(n-1)\cdots(n-k)$ since these functions have nice sums. In other words, try to find an explicit formula for $\sum_{a=0}^n a(a-1)(a-2)$ and be delighted :-)

December 30, 2007 at 10:16 pm |

Heinrich: or you could use Faulhaber’s Fabulous Formula.

January 11, 2008 at 11:26 pm |

[…] of equations involving (because it contains a cubic, which is not good, see this post of mine and this one of Gowers’s which show how hard a single variable cubic is), we can instead homogenize and solve the system […]

February 22, 2014 at 6:41 am |

Dear Professor Gowers,below is how I thought.It mainly explains why the cubic root appears.Of couse,it is hinted by the history.

Before we try to solve the cubic,Let’s investigate how to solve the

quadratic.Let the roots of the quadratic be .According to the factor theorem,we have In order to figure out and ,we need another linear equation ,where are constants and both are nonzero,and ,so that we obtain a system of linear

equations then by solving this system of linear equations we could get .Obviously, is not a symmetric polynomial of ,but is a symmetric polynomial of .By the fundamental theorem of symmetric polynomials, can be expressed as a polynomial of $b$ and $c$,more specifically,

We HOPE that there is a relationship between and ,so that when we combine this relationship with the equation (1),we could obtain a value for .Suppose

then we get that ,so .When ,,this is not what we want.So let ,then .Then the equation (1) becomes By solving it we obtain that Then we managed to get a system of linear equations Solve it ,we can get .By using the same method,we try to solve the cubic.Let the roots of the cubic be ,then according to the factor theorem,we have We need another two linear equations together with the linear equation ,then by solving this sysmtem of linear equations we can get .Obviously,the polynomial is not a symmetric polynomial of ,because if it is symmetric,then ,then the system of linear equations do not have a unique solution,this is not what we want.But must be a symmetric polynomial of .By using the fundamental theorem of symmetric polynomials,this symmetric polynomial can be expressed as a polynomial of ,we denote it as .Now let’s see two sets of linear polynomials and Each of the

polynomial is a factor of and the polynomial in each set has the cyclic symmetric relation.We HOPE that there is a relationship among the polynomials in each sets.Assume that Then we get that ,.When ,,this is not what we want,when ,where ,at this

time,.Similary,when ,.According to the symmetry,let ,.Then we have Let ,,then we have It is also easy to

see that is a symmetric polynomial of ,so can be expressed as a polynomial of ,denoted as .So So and

can be easily solved.Finally,we managed to get a system of linear equations Then can be solved!(Note that finally will disappear.)