In this post I want to revisit a topic that I first discussed on my web page here. My aim was to present a way in which one might discover a solution to the cubic without just being told it. However, the solution that arose was not very nice, and at the end I made the comment that I did not know a way of removing the rabbit-out-of-a-hat feeling that the usual much neater formula for the cubic (together with its derivation) left me with.
A couple of years ago, I put that situation right by stumbling on a very simple idea about quadratic equations that generalizes easily to cubics. More to the point, the stumble wasn’t completely random, so the entire approach can be justified as the result of standard and easy research strategies. I am no historian, but I would imagine that this idea is pretty similar to the idea (in some equivalent form) that first led to this solution.
I shall assume familiarity with solving quadratics: the problem here is to find the right way of generalizing this solution. (If you want to see how one might discover a solution to the quadratic then I cover that in the earlier discussion of cubics.) Given that, the first step in a natural discovery of a solution of the cubic is the observation, which one can hardly help making, that solutions to quadratics take the form $u\pm\sqrt{v}$. If we now turn things round and just assume that solutions will take this form then we can get a very quick derivation of the quadratic formula, which, for simplicity, I will do just for quadratics of the form $x^2+ax+b$. (Of course, it is very easy to reduce the general case to this case, so this is not a serious loss of generality.)
The derivation comes from the well-known fact that the roots of such a quadratic must add up to $-a$ and must multiply to give $b$. The first fact tells us that $u=-a/2$ and the second tells us that $u^2-v=b$, which in turn tells us that $v=u^2-b$, so that $\sqrt{v}=\sqrt{u^2-b}$. By our earlier computation, this is $\sqrt{a^2/4-b}=\frac{1}{2}\sqrt{a^2-4b}$. This gives the usual quadratic formula in the case where the coefficient of $x^2$ is 1.
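As a quick sanity check, the derivation can be turned into a few lines of Python. This is only a sketch: it assumes the quadratic is written as x^2 + ax + b and the roots as u + sqrt(v) and u - sqrt(v), and the function name `solve_monic_quadratic` is just an illustrative choice.

```python
import cmath

def solve_monic_quadratic(a, b):
    """Roots of x^2 + a*x + b, assuming they have the form u + sqrt(v), u - sqrt(v)."""
    u = -a / 2           # the roots sum to 2u, which must equal -a
    v = u * u - b        # the roots multiply to u^2 - v, which must equal b
    s = cmath.sqrt(v)    # complex square root, so a negative v causes no trouble
    return u + s, u - s

# Example: x^2 - 3x + 2 = (x - 1)(x - 2)
r1, r2 = solve_monic_quadratic(-3, 2)
```

Nothing here goes beyond the two facts about the sum and the product of the roots; the square root only appears at the very last step.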
Was that a fully justified argument? Yes, because once you are looking for roots of the form $u\pm\sqrt{v}$ there is no mystery behind the idea of looking at what you know about the two roots, converting that into some equations for $u$ and $v$ and trying to solve those equations. You can’t tell in advance that the equations will have a nice solution, but it’s very natural to give the approach a try.
Now let us ask ourselves the following question: what would be the most blindingly obvious way of generalizing the above approach to cubics? There are two ideas we might have in connection with this. The first is to try to get the cubic into as simple a form as possible, and the second is to make a guess about the general form of the roots. Let us take each of these in turn, beginning with the second.
What is the most natural way of generalizing our choice above for the form of the roots? To ask this question another way: we are trying to find XXX, where XXX is to the number 3 as $u+\sqrt{v}$ and $u-\sqrt{v}$ are to the number 2. There is a very obvious guess: we should take $u+v_1$, $u+v_2$ and $u+v_3$, where $v_1$, $v_2$ and $v_3$ are the three cube roots of some number $v$. If we write $\omega$ for the cube root of 1 (or, to be more specific, the number $e^{2\pi i/3}$), then we can write this guess as $u+\sqrt[3]{v}$, $u+\omega\sqrt[3]{v}$ and $u+\omega^2\sqrt[3]{v}$ (where $\sqrt[3]{v}$ is some cube root of $v$; it doesn’t matter which).
By analogy with the quadratic case, we are hoping that this will be the general form of a solution to the equation $x^3+ax^2+bx+c=0$. But a moment’s thought shows that it cannot be. Let us see this in two different ways.
The first is that if that is the general form of the roots, then we have two degrees of freedom: the choice of $u$ and the choice of $v$. But we are looking at a three-dimensional set of equations (since we are free to choose $a$, $b$ and $c$). It is a good exercise to prove rigorously that our guess is guaranteed to be wrong for this reason, but for now let us be satisfied with the observation that it looks very worrying. Indeed, if life were that simple then it is hardly likely that solving the cubic would have been as hard a problem as it was.
A second way to see that the guess is wrong is to consider what happens if $a=0$. Now we are looking at a cubic of the form $x^3+bx+c$, and if the roots take the form stated then, since their sum is $3u$ and must now be zero, we find that $u=0$. But then the three roots are just the cube roots of $v$, so they are the roots of the equation $x^3-v=0$. In other words, the guess is wrong unless $b=0$. (This is of course an instance of the fact that we do not have enough degrees of freedom.)
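To make the second objection concrete, here is the expansion written out. The notation is an assumption on my part: $\omega=e^{2\pi i/3}$, and $r$ stands for any one cube root of $v$, so that the three guessed roots with $u=0$ are $r$, $\omega r$ and $\omega^2 r$.

```latex
\begin{align*}
(x-r)(x-\omega r)(x-\omega^2 r)
  &= x^3 - (1+\omega+\omega^2)\,r\,x^2 + (\omega+\omega^2+\omega^3)\,r^2 x - \omega^3 r^3 \\
  &= x^3 - r^3 \\
  &= x^3 - v,
\end{align*}
```

using $\omega^3=1$ and $1+\omega+\omega^2=0$. The coefficient of $x$ vanishes identically, so a cubic with no $x^2$ term and a nonzero $x$ term can never have roots of this shape.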
So, with this small extra insight into the problem, let us try to come up with a better guess. How do we generalize a pair such as $u+\sqrt{v}$ and $u-\sqrt{v}$? We want a triple of roots, but we also want each component of the triple to have three degrees of freedom. In other words, we want each root to be made out of a $u$, a $v$ and a $w$.
Since we don’t quite know how we will build the roots, a helpful idea at this point is to lose some information in the quadratic case. This is a slightly subtle point that I will discuss more in a moment. First let us merely observe that I could have represented the two roots of a quadratic as $u+v$ and $u-v$, and it would still have been very easy to solve for $u$ and $v$. Then the fact that a square root was involved would not have been a guess (however natural) but something that one actually derived, in a very easy and natural way.
Since this slight modification of the quadratic guess will turn out to be very helpful, it is important to establish that it can be justified. That is, I am not drawing a rabbit out of a hat at this point. The justification is as follows. In the cubic case we do not know exactly what form our guess should take. We could just make some wild guesses and hope to hit the right answer. But it is much better to make more general guesses and then work out what their more precise forms must be. We can do that in the quadratic case, so it is a very sensible strategy to try to do the same for cubics.
Having established this point, let us see what happens. We are now trying to find the natural analogue for the number 3, built out of three variables $u$, $v$ and $w$, of the pair $(u+v,\,u-v)$ in the degree-2 case. The pair consists of a couple of linear combinations of $u$ and $v$, so it is natural (though not essential to the discovery of the argument) to think of it as a linear transformation of the pair $(u,v)$. That draws our attention to the matrix $\begin{pmatrix}1&1\\1&-1\end{pmatrix}$, and it is then very natural to wonder if this matrix has an obvious generalization to a $3\times 3$ matrix.
It does! This is the case $n=2$ of the well-known Fourier matrix (the matrix that diagonalizes circulant matrices), but even if you don’t know that, you do know that the numbers 1 and -1 are the two square roots of 1. Moreover, this is not just a coincidence but the reason that they occur in our discussion of quadratics. So it is natural to try to build a $3\times 3$ matrix out of the three cube roots of 1, which are $1$, $\omega$ and $\omega^2$. In the end there is only one sensible choice to make (give or take the odd symmetry). It is the matrix $\begin{pmatrix}1&1&1\\1&\omega&\omega^2\\1&\omega^2&\omega\end{pmatrix}$. Thus, our guess for the forms of the three roots is $u+v+w$, $u+\omega v+\omega^2 w$ and $u+\omega^2 v+\omega w$.
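The guess is easy to play with numerically. The sketch below assumes the matrix has rows (1, 1, 1), (1, ω, ω²) and (1, ω², ω), with ω = e^{2πi/3}, and applies it to a triple (u, v, w); the names `OMEGA`, `M` and `roots_from_uvw` are illustrative only.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)   # the cube root of 1 with argument 2*pi/3

# Rows of the 3x3 matrix built from the three cube roots of 1.
M = [
    [1, 1, 1],
    [1, OMEGA, OMEGA**2],
    [1, OMEGA**2, OMEGA],
]

def roots_from_uvw(u, v, w):
    """Apply M to (u, v, w) to obtain the three guessed roots."""
    return [row[0] * u + row[1] * v + row[2] * w for row in M]

roots = roots_from_uvw(2.0, 1.0, -0.5)
total = sum(roots)   # equals 3u up to rounding, because 1 + OMEGA + OMEGA**2 = 0
```

The sum of the three guessed roots depends only on u, which is what makes the later simplification to u = 0 possible.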
This seems a very satisfactory guess (even if we don’t have a compelling reason to suppose that it will work). So now we are left with the task of solving for $u$, $v$ and $w$ on the assumption that the three expressions above are the roots of the cubic $x^3+ax^2+bx+c$. At this point one could just plunge in, but it helps a lot to simplify the cubic first by “completing the cube”. This is the familiar idea (described in my other cubics discussion) that by substituting $y-a/3$ for $x$ you get a cubic in $y$ where the coefficient of $y^2$ is zero. So let’s just assume, as we may, that $a=0$, so that we are looking for roots of $x^3+bx+c$. Since the roots add up to 0 and $1+\omega+\omega^2=0$, this tells us that $u=0$, so the three roots are now of the form $v+w$, $\omega v+\omega^2 w$ and $\omega^2 v+\omega w$. (We are therefore down to two degrees of freedom, but so is the cubic we are trying to solve.)
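“Completing the cube” can be carried out once and for all. The sketch below assumes the cubic is written as x^3 + ax^2 + bx + c and that the substitution is x = y - a/3; the function name `depress_cubic` and the names `p`, `q` for the new coefficients are illustrative choices rather than notation from the discussion.

```python
def depress_cubic(a, b, c):
    """Coefficients (p, q) of y^3 + p*y + q, obtained from
    x^3 + a*x^2 + b*x + c by substituting x = y - a/3."""
    p = b - a * a / 3
    q = 2 * a**3 / 27 - a * b / 3 + c
    return p, q

# Example: x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
p, q = depress_cubic(-6, 11, -6)
```

In the example the reduced cubic is y^3 - y, with roots -1, 0 and 1, and adding a/3 back (that is, subtracting 2 from each of the original roots 1, 2, 3) accounts for the shift.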
The information we know about these three roots is that their product is $-c$ and that the sum of all the products of two of them is $b$. So the next task is clear: expand out these expressions and see if we can solve the resulting equations in $v$ and $w$. The details of this are not particularly important: you could stop reading now and just take on trust that we end up needing to solve quadratics and take cube roots, both of which we are allowed to assume that we can do. However, it’s nice to see that it really does work.
The product of the three numbers $v+w$, $\omega v+\omega^2 w$ and $\omega^2 v+\omega w$ works out to be $v^3+w^3$. (It’s instructive to do this calculation for yourself and see how the fact that $1+\omega+\omega^2=0$ makes the other two possible terms cancel. Then one can see that the fact that rather simple expressions come out of these calculations is not a coincidence.) As for the sum of the three products of two of them, it comes out to be $3(\omega+\omega^2)vw$, which equals $-3vw$. So we need $v^3+w^3$ and $vw$ to take the values $-c$ and $-b/3$, respectively. This tells us that $v^3$ and $w^3$ are the two roots of the equation $x^2+cx-b^3/27=0$, so, as claimed, we can solve for $v$ and $w$ by solving a quadratic and taking cube roots.
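For anyone who would like to check the product without doing it from scratch, here is one way of organizing the calculation, assuming the three roots are $v+w$, $\omega v+\omega^2 w$ and $\omega^2 v+\omega w$ with $\omega=e^{2\pi i/3}$. Multiply the second and third roots first:

```latex
\begin{align*}
(\omega v+\omega^2 w)(\omega^2 v+\omega w)
  &= \omega^3 v^2 + (\omega^2+\omega^4)\,vw + \omega^3 w^2 \\
  &= v^2 + (\omega+\omega^2)\,vw + w^2 \\
  &= v^2 - vw + w^2, \\
(v+w)(v^2-vw+w^2) &= v^3+w^3 .
\end{align*}
```

The terms in $v^2w$ and $vw^2$ are the ones that cancel, and they do so precisely because $\omega+\omega^2=-1$.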
A small extra point is that one must think a bit about which cube roots to take, but that I will gloss over here.
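Here is a rough numerical sketch of the whole procedure for a cubic x^3 + bx + c. It assumes the relations v^3 + w^3 = -c and vw = -b/3, and it sidesteps the question of which cube roots to pair by taking any cube root for v and then defining w = -b/(3v). The function name `solve_depressed_cubic` and the tolerance are illustrative choices, not part of the argument.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)   # the cube root of 1 with argument 2*pi/3

def solve_depressed_cubic(b, c):
    """Roots of x^3 + b*x + c, using v^3 + w^3 = -c and v*w = -b/3."""
    # v^3 and w^3 are the two roots of t^2 + c*t - b^3/27 = 0.
    disc = cmath.sqrt(c * c + 4 * b**3 / 27)
    # Take the root of larger modulus so that v is nonzero whenever possible.
    v_cubed = max((-c + disc) / 2, (-c - disc) / 2, key=abs)
    if abs(v_cubed) < 1e-300:
        return [0j, 0j, 0j]      # b = c = 0, so x^3 = 0
    v = v_cubed ** (1 / 3)       # any one cube root will do
    w = -b / (3 * v)             # chosen so that v*w = -b/3 holds exactly
    return [v + w, OMEGA * v + OMEGA**2 * w, OMEGA**2 * v + OMEGA * w]

# Example: x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
roots = solve_depressed_cubic(-7, 6)
```

Defining w from v, rather than taking an independent cube root of w^3, is what guarantees that the product vw comes out as -b/3 and not as -b/3 times a cube root of 1.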
An obvious question: what happens if one tries to generalize this approach to quartics and quintics? The answer is that in both cases it is obvious how to generalize the guess about the form that the roots should take. In the case of the quartic, when one guesses that they are of the form $t+u+v+w$, $t+iu-v-iw$, $t-u+v-w$ and $t-iu-v+iw$, everything works out nicely, if you get rid of the $x^3$ term and hence of $t$. You get some equations in $u$, $v$ and $w$ and they aren’t too hard to solve. If you try it for the quintic then, not too surprisingly, you end up with some equations that are more complicated than the quintic you started with.
Apologies for the matrices not coming out: I’ll repair that as soon as I can work out how to do so. [Now sorted out, with help from comments below: many thanks.]