I hope that most of you have either asked yourselves this question explicitly, or at least felt a vague sense of unease about how the definitions I gave in lectures, namely
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots$$
and
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots,$$
relate to things like the opposite, adjacent and hypotenuse. Using the power-series definitions, we proved several facts about trigonometric functions, such as the addition formulae, their derivatives, and the fact that they are periodic. But we didn’t quite get to the stage of proving that if $x^2+y^2=1$ and $\theta$ is the angle that the line from $(0,0)$ to $(x,y)$ makes with the line from $(0,0)$ to $(1,0)$, then $x=\cos\theta$ and $y=\sin\theta$. So how does one establish that? How does one even define the angle? In this post, I will give one possible answer to these questions.
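As a purely numerical aside, here is a minimal Python sketch of the two power series. The function names `cos_series` and `sin_series` and the cut-off of 30 terms are my own choices for illustration, and the comparison with the library functions is only a sanity check, not a proof of anything.

```python
import math

def cos_series(x, terms=30):
    """Partial sum of 1 - x^2/2! + x^4/4! - ... using `terms` terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Each term is the previous one times -x^2 / ((2n+1)(2n+2)).
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total

def sin_series(x, terms=30):
    """Partial sum of x - x^3/3! + x^5/5! - ... using `terms` terms."""
    total, term = 0.0, x
    for n in range(terms):
        total += term
        # Each term is the previous one times -x^2 / ((2n+2)(2n+3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total

# Compare with the library functions at a few sample points.
for x in (0.5, 1.0, 2.0):
    print(x, cos_series(x) - math.cos(x), sin_series(x) - math.sin(x))
```

For moderate values of $x$ the partial sums agree with the library values to within rounding error, which is at least consistent with the claim that the two definitions describe the same functions.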
A couple of possible approaches that I won’t attempt to use
A cheating and not wholly satisfactory method would be to define the angle $\theta$ to be $\cos^{-1}(x)$. Then it would be trivial that $x=\cos\theta$ and we could use facts we know to prove that $y=\sin\theta$. (Or could we? Wouldn’t we just get that it was $\pm\sin\theta$? The fact that many angles have the same $\cos$ and $\sin$ creates annoying difficulties for this approach, though ones that could in principle be circumvented.) But if we did this, how could we be confident that the notion of angle we had just defined coincided with what we think angle should be? The problem has not been fully solved.
Another approach might be to define trigonometric functions geometrically, prove that they have the basic properties that we established using the power-series definitions, and prove that these properties characterize the trigonometric functions (meaning that any two functions $c$ and $s$ that have the properties must be $\cos$ and $\sin$). However, this still requires us to make sense of the notion of angle somehow, and we might also feel slightly worried about whether the geometric arguments we used to justify the addition formulae and the like were truly rigorous. (I’m not saying it can’t be done satisfactorily, just that I don’t immediately see a good way of doing it, and I have a different approach to present.)
How are radians defined? You take a line L starting at the origin, and it hits the unit circle at some point P. Then the angle that line makes with the horizontal (or rather, the horizontal heading out to the right) is defined to be the length of the circular arc that goes anticlockwise round the unit circle from $(1,0)$ to P. (This defines a number between 0 and $2\pi$, but we can worry about numbers outside this range later.)
Calculating the length of a circular arc
There is nothing wrong with this definition, except that it requires us to make rigorous sense of the length of a circular arc. How are we to do this?
For simplicity, let’s assume that our point P is $(x,y)$ and that both $x$ and $y$ are positive. So P is in the top right quadrant of the unit circle. How can we define and then calculate the length of the arc from $(1,0)$ to $(x,y)$, or equivalently from $(x,y)$ to $(1,0)$?
One non-rigorous but informative way of thinking about this is that for each $t$ between $x$ and $1$, we should take an interval $[t,t+dt]$, work out the length of the bit of the circle vertically above this interval, and sum up all those lengths. The bit of the circle in question is a straight line (since $dt$ is infinitesimally small) and by similar triangles its length is $dt/\sqrt{1-t^2}$.
How did I write that down? Well, the big triangle I was thinking of was one with vertices $(0,0)$, $(t,0)$ and the point on the circle directly above $(t,0)$, which is $(t,\sqrt{1-t^2})$, by Pythagoras’s theorem. The little triangle has one side of length $dt$, which corresponds to the side in the big triangle of length $\sqrt{1-t^2}$. Since the big triangle has hypotenuse 1, the hypotenuse of the little triangle is $dt/\sqrt{1-t^2}$, as I claimed.
Adding all these little lengths up, we get $\int_x^1\frac{dt}{\sqrt{1-t^2}}$, so it remains to evaluate this integral.
This is of course a very standard integral, usually solved by substituting $\cos\theta$ or $\sin\theta$ for $t$. If you do that, you find that the length works out as $\cos^{-1}(x)$, which is just what we hoped. However, we haven’t discussed integration by substitution in this course, so let us see it in a more elementary way (not that proving an appropriate form of the integration-by-substitution rule is especially hard).
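Using the rules for differentiating inverses, we find that
$$\frac{d}{dt}\cos^{-1}(t) = \frac{1}{\cos'(\cos^{-1}(t))} = \frac{-1}{\sin(\cos^{-1}(t))},$$
and since $\sin\theta=\sqrt{1-\cos^2\theta}$ for $\theta$ between $0$ and $\pi$, this gives us $\frac{d}{dt}\cos^{-1}(t)=-1/\sqrt{1-t^2}$. So the integrand $1/\sqrt{1-t^2}$ has $-\cos^{-1}(t)$ as an antiderivative, and therefore, by the fundamental theorem of calculus,
$$\int_x^1\frac{dt}{\sqrt{1-t^2}} = \left[-\cos^{-1}(t)\right]_x^1 = \cos^{-1}(x)-\cos^{-1}(1) = \cos^{-1}(x).$$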
So the angle $\theta$ between the horizontal and the line joining the origin to $(x,y)$ is (by definition) the length of the arc from $(1,0)$ to $(x,y)$, which we have calculated to be $\cos^{-1}(x)$. Therefore, $x=\cos\theta$.
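Here is a small numerical check of the antiderivative used in this calculation, written as a Python sketch. The helper names (`integrand`, `antiderivative`, `central_difference`) and the sample points are hypothetical choices of mine, and `math.acos` is assumed to agree with the inverse of the power-series cosine.

```python
import math

def integrand(t):
    """The arc-length integrand 1/sqrt(1 - t^2)."""
    return 1.0 / math.sqrt(1.0 - t * t)

def antiderivative(t):
    """The claimed antiderivative, -arccos(t)."""
    return -math.acos(t)

def central_difference(F, t, step=1e-6):
    """Symmetric finite-difference approximation to F'(t)."""
    return (F(t + step) - F(t - step)) / (2.0 * step)

# The derivative of -arccos(t) should match the integrand at interior points.
sample_points = [0.1, 0.3, 0.6, 0.9]
max_gap = max(
    abs(central_difference(antiderivative, t) - integrand(t))
    for t in sample_points
)

# Fundamental theorem of calculus: the length of the arc from (x, y) to (1, 0).
x = 0.6
arc_length = antiderivative(1.0) - antiderivative(x)
print(max_gap, arc_length, math.cos(arc_length))
```

The finite differences agree with the integrand up to a tiny numerical error, and taking the cosine of the computed arc length recovers $x$, as the argument predicts.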
How close was that to being rigorous?
The process I just went through, of saying “Let’s add up a whole lot of infinitesimal lengths; that says we should write down the following integral; calculating the integral gives us L, so the length is L,” is a process that one often goes through when calculating similar quantities. Why are we so confident that it is OK?
I sometimes realize with mathematical questions like this that I have been a mathematician for many years and never bothered to worry about them. It’s just sort of obvious that if a function is reasonably nice, then writing something down that’s approximately true with $\delta x$ and turning $\delta x$ into $dx$ and writing a nice $\int$ sign in front gives you a correct expression for the quantity in question. But let’s try to think a bit about how we might define length rigorously.
First, we should say what a curve is. There are various definitions, according to how much niceness one wants to assume, but let me take a basic definition: a curve is a continuous function from an interval $[a,b]$ to $\mathbb{R}^2$. (I haven’t defined continuous functions to $\mathbb{R}^2$, but it simply means that if $f(t)=(g(t),h(t))$, then $g$ and $h$ are both continuous functions from $[a,b]$ to $\mathbb{R}$.)
This is an example of a curious habit of mathematicians of defining objects as things that they clearly aren’t. Surely a curve is not a function: it’s a special sort of subset of the plane. In fact, shouldn’t a curve be defined as the image of a continuous function from $[a,b]$ to $\mathbb{R}^2$? It’s true that that corresponds more closely to what we are thinking of when we use the word “curve”, but the definition I’ve just given turns out to be more convenient, though it’s important to add that two curves (as I’ve defined them) $f_1:[a,b]\to\mathbb{R}^2$ and $f_2:[c,d]\to\mathbb{R}^2$ are equivalent if there is a strictly increasing continuous bijection $\phi:[a,b]\to[c,d]$ such that $f_1(t)=f_2(\phi(t))$ for every $t\in[a,b]$. In this situation, we think of $f_1$ and $f_2$ as different ways of representing the same curve.
Incidentally, if you want a reason not to identify curves with their images, then one quite good reason is the existence of objects called space-filling curves. These are continuous functions from intervals of reals to $\mathbb{R}^2$ that fill up entire two-dimensional sets. Here’s a picture of one, lifted from Wikipedia.
It shows the first few iterations of a process that gives you a sequence of functions that converge to a continuous limit that fills up an entire square.
Lengths of curves
Going back to lengths, let’s think about how one might define them. The one thing we know how to define is the length of a line segment. (Strictly speaking, I’m not allowed to say that, since a line segment isn’t a function, but let’s understand it as a particularly simple function from an interval to a line segment in the plane.) Given that, a reasonable definition of length would seem to be to approximate a given curve by a whole lot of little line segments. That leads to the following idea for at least approximating the length of a curve $f:[a,b]\to\mathbb{R}^2$. We take a dissection $a=t_0<t_1<\dots<t_n=b$ and add up all the little distances $d(f(t_{i-1}),f(t_i))$. Here I am defining the distance between two points in $\mathbb{R}^2$ in the normal way by Pythagoras’s theorem. Writing $f(t)=(g(t),h(t))$, this gives us the expression
$$\sum_{i=1}^n\sqrt{(g(t_i)-g(t_{i-1}))^2+(h(t_i)-h(t_{i-1}))^2}$$
for the approximate length given by the dissection. We then hope that as the differences $t_i-t_{i-1}$ get smaller and smaller, these estimates will tend to a limit. It isn’t hard to see that if you refine a dissection, then the estimate increases, or at least does not decrease (you are replacing the length of a line segment that joins two points by the length of a path that consists of line segments and joins the same two points).
Actually, that hope is not always fulfilled: sometimes the estimates tend to infinity. Indeed, for space-filling curves, or fractal-like curves such as the Koch snowflake, the estimates do tend to infinity. In this case, we say that they have infinite length. But if the estimates tend to a limit as the maximum of the differences $t_i-t_{i-1}$ tends to zero, we call that limit the length of the curve. A curve that has a finite length defined this way is called rectifiable.
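To see the definition in action, here is a minimal Python sketch that computes these polygonal estimates for equally spaced dissections. The names `polygonal_length` and `upper_arc`, the choice $x=0.6$, and the use of `math.acos` for comparison are all my own assumptions for illustration. Doubling the number of pieces refines the dissection, so the estimates should not decrease.

```python
import math

def polygonal_length(f, a, b, n):
    """Sum of the straight-line distances between the values of f at
    n + 1 equally spaced points of [a, b]."""
    pts = [f(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def upper_arc(t):
    """The point of the unit circle directly above (t, 0)."""
    # max(...) guards against a tiny negative value caused by rounding.
    return (t, math.sqrt(max(0.0, 1.0 - t * t)))

x = 0.6
# Each dissection refines the previous one (the number of pieces doubles).
estimates = [polygonal_length(upper_arc, x, 1.0, 2 ** k) for k in range(1, 12)]
for k, estimate in enumerate(estimates, start=1):
    print(2 ** k, estimate)
print("arccos(x) for comparison:", math.acos(x))

# By contrast, the k-th polygon in the usual construction of a Koch curve
# on a unit segment has 4^k sides of length 3^(-k), so these estimates
# grow without bound.
koch_estimates = [(4.0 / 3.0) ** k for k in range(0, 40, 10)]
print(koch_estimates)
```

For the circular arc the estimates creep up towards a finite value, while for the Koch curve they blow up, which is the distinction between rectifiable and non-rectifiable curves.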
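Suppose now that we have a curve given by $f(t)=(g(t),h(t))$ and that the two functions $g$ and $h$ are continuously differentiable. Then both $g'$ and $h'$ are bounded on $[a,b]$, so let’s suppose that $M$ is an upper bound for $|g'|$ and $|h'|$. Then by the mean value theorem,
$$|g(t_i)-g(t_{i-1})|\le M(t_i-t_{i-1})\quad\text{and}\quad|h(t_i)-h(t_{i-1})|\le M(t_i-t_{i-1}).$$
Therefore, each term of the estimate is at most $\sqrt{2}M(t_i-t_{i-1})$, so
$$\sum_{i=1}^n\sqrt{(g(t_i)-g(t_{i-1}))^2+(h(t_i)-h(t_{i-1}))^2}\le\sqrt{2}M(b-a)$$
for every dissection, which implies that the curve is rectifiable. (Remark: I didn’t really use the continuity of the derivatives there, just their boundedness.)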
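We can say slightly more than this, however. The differentiability of $g$ tells us that $g(t_i)-g(t_{i-1})=(t_i-t_{i-1})g'(c_i)$ for some $c_i\in(t_{i-1},t_i)$. And similarly for $h$ with some $d_i\in(t_{i-1},t_i)$. Therefore, the estimate for the length can be written
$$\sum_{i=1}^n(t_i-t_{i-1})\sqrt{g'(c_i)^2+h'(d_i)^2}.$$
This looks very similar to the kind of thing we write down when doing Riemann integration, so let’s see whether we can find a precise connection. We are concerned with the function $k(t)=\sqrt{g'(t)^2+h'(t)^2}$. If we now do use the continuity of $g'$ and $h'$, then $k$ is continuous too, so it can be integrated. Now since $c_i$ and $d_i$ belong to the interval $[t_{i-1},t_i]$, the sums $\sum_{i=1}^n(t_i-t_{i-1})k(c_i)$ and $\sum_{i=1}^n(t_i-t_{i-1})k(d_i)$ both lie between the lower and upper sums given by the dissection. That implies the same, up to an error that tends to zero with the mesh, for
$$\sum_{i=1}^n(t_i-t_{i-1})\sqrt{g'(c_i)^2+h'(d_i)^2}.$$
(The error is there because $c_i$ and $d_i$ may differ. Since $h'$ is uniformly continuous on $[a,b]$, we have $|h'(d_i)-h'(c_i)|\le\epsilon$ for every $i$ once the mesh is small enough, and then this sum differs from $\sum_{i=1}^n(t_i-t_{i-1})k(c_i)$ by at most $\epsilon(b-a)$.)
Since $k$ is integrable, the limit of $\sum_{i=1}^n(t_i-t_{i-1})\sqrt{g'(c_i)^2+h'(d_i)^2}$ as the largest $t_i-t_{i-1}$ (which is often called the mesh of the dissection) tends to zero is $\int_a^bk(t)\,dt$.
We have shown that the length of the curve is given by the formula
$$\int_a^b\sqrt{g'(t)^2+h'(t)^2}\,dt.$$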
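The agreement between the polygonal definition and the integral formula can be checked numerically on a curve whose length is known in closed form. The sketch below uses the parabola $t\mapsto(t,t^2)$ on $[0,1]$, which is my own choice of example (it does not appear in the argument above), together with hypothetical helper names and a simple midpoint rule for the integral.

```python
import math

def polygonal_length(g, h, a, b, n):
    """Polygonal estimate of the length of t -> (g(t), h(t)) on [a, b]
    using n equal pieces."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    pts = [(g(t), h(t)) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def integral_length(dg, dh, a, b, n):
    """Midpoint-rule approximation to the integral of
    sqrt(g'(t)^2 + h'(t)^2) over [a, b]."""
    step = (b - a) / n
    mids = (a + (i + 0.5) * step for i in range(n))
    return step * sum(math.hypot(dg(t), dh(t)) for t in mids)

# Example curve (an assumption for illustration): the parabola t -> (t, t^2).
g, h = (lambda t: t), (lambda t: t * t)
dg, dh = (lambda t: 1.0), (lambda t: 2.0 * t)

poly = polygonal_length(g, h, 0.0, 1.0, 1000)
integ = integral_length(dg, dh, 0.0, 1.0, 1000)
# Closed form of the integral of sqrt(1 + 4t^2) from 0 to 1.
exact = math.sqrt(5.0) / 2.0 + math.asinh(2.0) / 4.0
print(poly, integ, exact)
```

Both numbers agree with the closed form to several decimal places, and the polygonal estimate sits just below it, as it must.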
Now, finally, let’s see whether we can justify our calculation of the length of the arc of the unit circle between $(x,y)$ and $(1,0)$. It would be nice to parametrize the circle as $(\cos\theta,\sin\theta)$, but we can’t do that, since we are defining $\theta$ using length, so we would end up with a circular definition (in more than one sense). [Actually, we can do something very close to this. See the final section of the post for details.] So let’s parametrize it as follows. We’ll define $f$ on the interval $[x,1]$ and we’ll send $t$ to $(t,\sqrt{1-t^2})$. Then $g'(t)=1$ and $h'(t)=-t/\sqrt{1-t^2}$, so
$$\sqrt{g'(t)^2+h'(t)^2}=\sqrt{1+\frac{t^2}{1-t^2}}=\frac{1}{\sqrt{1-t^2}}.$$
So the length is $\int_x^1\frac{dt}{\sqrt{1-t^2}}$, which is exactly the expression we wrote down earlier.
Let me make two quick remarks about that. First, you might argue that although I have shown that the final expression is indeed correct, I haven’t shown that the informal argument is (essentially) correct. But I more or less have, since what I have effectively done is calculate the lengths of the hypotenuses of the little triangles in a slightly different way. Before, I used the fact that one side was $dt$ and used similar triangles. Here I’ve used the fact that one side is $dt$ and another side is $h'(t)\,dt=-t\,dt/\sqrt{1-t^2}$ and used Pythagoras.
A slightly more serious objection is that for this calculation I used a general result that depended on the assumption that both $g$ and $h$ are continuously differentiable, but didn’t check that the appropriate conditions held, which they don’t. The problem is that $h(t)=\sqrt{1-t^2}$, so $h'(t)=-t/\sqrt{1-t^2}$, which tends to infinity in absolute value as $t\to 1$ and is undefined at $t=1$.
However, it is easy to get round this problem. What we do is integrate from $x$ to $1-\epsilon$, in which case the argument is valid, and then let $\epsilon$ tend to zero. The integral between $x$ and $1-\epsilon$ is $\cos^{-1}(x)-\cos^{-1}(1-\epsilon)$, and that tends to $\cos^{-1}(x)$.
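The truncation argument is easy to watch numerically. The following Python sketch integrates $1/\sqrt{1-t^2}$ from $x$ to $1-\epsilon$ with a plain midpoint rule; the helper names, the value $x=0.6$ and the particular values of $\epsilon$ are assumptions made for illustration only.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def integrand(t):
    """The arc-length integrand 1/sqrt(1 - t^2)."""
    return 1.0 / math.sqrt(1.0 - t * t)

x = 0.6
results = []
for eps in (1e-1, 1e-2, 1e-3):
    numeric = midpoint_integral(integrand, x, 1.0 - eps)
    closed_form = math.acos(x) - math.acos(1.0 - eps)
    shortfall = math.acos(x) - numeric  # what the truncation leaves out
    results.append((eps, numeric, closed_form, shortfall))
    print(eps, numeric, closed_form, shortfall)
```

On each truncated interval the numerical value matches $\cos^{-1}(x)-\cos^{-1}(1-\epsilon)$ closely, and the shortfall from $\cos^{-1}(x)$ shrinks as $\epsilon$ does, though only about as fast as $\sqrt{2\epsilon}$.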
One final remark is that this length calculation explains why the usual substitution of $\cos\theta$ for $t$ in an integral involving $\sqrt{1-t^2}$, such as $\int\frac{dt}{\sqrt{1-t^2}}$, is not a piece of unmotivated magic. It is just a way of switching from one parametrization of a circular arc (using the x-coordinate) to another (using the angle, or equivalently the distance along the circular arc) that one expects to be simpler.
An easier argument
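Thanks to a comment of Jason Fordham below, I now realize that we can after all parametrize the circle as $(\cos\theta,\sin\theta)$. However, this $\theta$ is not the $\theta$ I’m trying to calculate, so let’s call it $\phi$. I’m just taking $\phi$ to be an ordinary real number, and I’m defining $\cos\phi$ and $\sin\phi$ using the power-series definition. Then the arc of the unit circle that goes from $(1,0)$ to $(x,y)$ can be defined as the curve $f$ defined on the interval $[0,\cos^{-1}(x)]$ by the formula $f(\phi)=(\cos\phi,\sin\phi)$. The general formula for the length of a curve then gives us
$$\int_0^{\cos^{-1}(x)}\sqrt{\sin^2\phi+\cos^2\phi}\,d\phi=\int_0^{\cos^{-1}(x)}1\,d\phi=\cos^{-1}(x).$$
So the length $\theta$ of the arc satisfies $x=\cos\theta$.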
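As a last sanity check, here is a Python sketch that parametrizes the arc using partial sums of the power series and measures its polygonal length directly. The function names, the 30-term cut-off and the endpoint value `theta = 1.0` are all hypothetical choices of mine; the point is only that the measured length comes out close to the parameter value.

```python
import math

def cos_series(phi, terms=30):
    """Partial sum of the power series for cos."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= -phi * phi / ((2 * n + 1) * (2 * n + 2))
    return total

def sin_series(phi, terms=30):
    """Partial sum of the power series for sin."""
    total, term = 0.0, phi
    for n in range(terms):
        total += term
        term *= -phi * phi / ((2 * n + 2) * (2 * n + 3))
    return total

def arc_length_estimate(theta, n):
    """Polygonal length of phi -> (cos phi, sin phi) on [0, theta],
    using n equal pieces and the series definitions above."""
    pts = [(cos_series(theta * i / n), sin_series(theta * i / n))
           for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

theta = 1.0  # endpoint of the parameter interval (an arbitrary example)
estimate = arc_length_estimate(theta, 1000)
# The integrand sqrt(sin^2 + cos^2) should be 1 along the arc.
speed_gap = max(
    abs(math.hypot(cos_series(p), sin_series(p)) - 1.0)
    for p in (0.0, 0.25, 0.5, 0.75, 1.0)
)
print(estimate, speed_gap)
```

The estimate is within a whisker of `theta`, which is what the formula above says it should be: the parameter of the power-series functions really is the arc length.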