I’m writing this post as a way of preparing for a lecture. I want to discuss the result that a power series $\sum_{n=0}^\infty a_nz^n$ is differentiable inside its circle of convergence, and the derivative is given by the obvious formula $\sum_{n=1}^\infty na_nz^{n-1}$. In other words, inside the circle of convergence we can think of a power series as like a polynomial of degree $\infty$ for the purposes of differentiation.
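A preliminary question about this is why it is not more or less obvious. After all, writing $f(z)=\sum_{n=0}^\infty a_nz^n$, we have the following facts.
- Writing $S_N(z)=\sum_{n=0}^N a_nz^n$, we have that $S_N(z)\to f(z)$.
- For each $N$, $S_N'(z)=\sum_{n=1}^N na_nz^{n-1}$.
If we knew that $S_N'(z)\to f'(z)$, then we would be done.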
Ah, you might be thinking, how do we know that the sequence $(S_N'(z))$ converges? But it turns out that that is not the problem: it is reasonably straightforward to show that it converges. (Roughly speaking, inside the circle of convergence the series $\sum_{n=0}^\infty a_nz^n$ converges at least as fast as a GP, and multiplying the $n$th term by $n$ doesn’t stop a GP converging, as can easily be seen with the help of the ratio test.) So, writing $g(z)$ for $\sum_{n=1}^\infty na_nz^{n-1}$, we have the following facts at our disposal.
- $S_N(z)\to f(z)$.
- $S_N'(z)\to g(z)$.
Doesn’t it follow from that that $f'(z)=g(z)$?
We are appealing here to a general principle, which is that if some functions $f_n$ converge to $f$ and their derivatives $f_n'$ converge to $g$, then $f$ is differentiable with $f'=g$. Is this general principle correct?
Unfortunately, it isn’t. Suppose we take some continuous functions $g_n$ that converge to a step function. (Roughly speaking, you make $g_n$ be 0 up to 0, then linear with gradient $n$ until it hits 1, then 1 from that point onwards.) And suppose we then let $f_n$ be the function that differentiates to $g_n$ and is 0 up to 0. Then the $f_n$ converge to the function that is 0 up to 0 and $x$ for positive $x$. This function almost differentiates to the step function, but it isn’t differentiable at 0.
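For a quick numerical illustration of this counterexample, here is a small Python sketch. The names `g`, `f`, `limit_f` and `sup_gap` are illustrative choices of mine, and the closed form used for `f` is worked out by hand by integrating the ramp, so treat it as an assumption rather than anything canonical. It suggests that the antiderivatives get uniformly close to the function that is 0 for $x\leq 0$ and $x$ for $x>0$, while the one-sided difference quotients of that limit at 0 disagree.

```python
def g(n, x):
    """Ramp: 0 for x <= 0, gradient n on [0, 1/n], then constant 1."""
    return min(max(n * x, 0.0), 1.0)


def f(n, x):
    """Antiderivative of g(n, .) that is 0 for x <= 0 (computed by hand)."""
    if x <= 0:
        return 0.0
    if x <= 1.0 / n:
        return 0.5 * n * x * x
    return x - 0.5 / n


def limit_f(x):
    """Pointwise limit of f(n, x) as n grows: 0 up to 0, then x."""
    return max(x, 0.0)


def sup_gap(n, points):
    """Largest distance between f(n, .) and the limit on a grid of points."""
    return max(abs(f(n, x) - limit_f(x)) for x in points)


points = [k / 1000.0 for k in range(-1000, 1001)]
for n in (1, 10, 100, 1000):
    print(n, sup_gap(n, points))  # the gap shrinks as n grows

# One-sided difference quotients of the limit function at 0.
h = 1e-6
right = (limit_f(h) - limit_f(0.0)) / h
left = (limit_f(0.0) - limit_f(-h)) / h
print(right, left)
```

The two quotients stay different however small `h` is taken, which is the failure of differentiability at 0 described above.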
So we’ve somehow got to use particular facts about power series in order to prove our result — we can’t appeal to general considerations, because then we are appealing to a principle that isn’t true. (Actually, in principle some compromise might be possible, where we show that functions defined by power series have a certain property and then use nothing apart from that property from that point on. But as it happens, we shall not do this.)
Why can’t we just jump in and prove it with a big calculation?
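We have a formula for $f(z)$. Why don’t we write out a formula for $\frac{f(z+h)-f(z)}{h}$ and see if we can tell what happens when $h\to 0$?
That is certainly a sensible first thing to try, so let’s see what happens.
$$\frac{f(z+h)-f(z)}{h}=\sum_{n=1}^\infty a_n\frac{(z+h)^n-z^n}{h}$$
What can we do with that? Perhaps we’d better apply the binomial theorem. Then we find that the right-hand side is equal to
$$\sum_{n=1}^\infty a_n\Bigl(nz^{n-1}+\binom n2z^{n-2}h+\dots+h^{n-1}\Bigr)$$
Part of the above expression gives us what we want, namely $\sum_{n=1}^\infty na_nz^{n-1}$. So we’re left wanting to prove that
$$\sum_{n=2}^\infty a_n\Bigl(\binom n2z^{n-2}h+\binom n3z^{n-3}h^2+\dots+h^{n-1}\Bigr)$$
tends to 0 as $h\to 0$.
Unfortunately, as $n$ gets big, some of those binomial coefficients get pretty big too. Indeed, when $n$ is bigger than $1/|h|$, the growth in the binomial coefficients seems to outstrip the shrinking of the powers of $h$. What can we do?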
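To get a feel for the worry, here is a minimal Python sketch. The helper name `scaled_coefficient` is hypothetical, and the value of $h$ is an arbitrary choice. It computes $\binom nk|h|^{k-1}$ for a fixed small $h$, ignoring the factors $a_n$ and $z^{n-k}$, so it only illustrates the difficulty and proves nothing.

```python
from math import comb


def scaled_coefficient(n, k, h):
    """Binomial coefficient C(n, k) times |h|^(k-1)."""
    return comb(n, k) * abs(h) ** (k - 1)


h = 0.01  # so 1/|h| is 100
for n in (10, 100, 1000):
    # k = 2 and k = 3 give the first two terms that are left over
    # once the wanted term n z^(n-1) has been removed.
    print(n, scaled_coefficient(n, 2, h), scaled_coefficient(n, 3, h))
```

For $n$ well below $1/|h|$ the scaled coefficients are small, but once $n$ is well above $1/|h|$ they are large and grow with $k$, at least for the first few values of $k$.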
A useful trick
Fortunately, there is a better (for our purposes at least) way of writing $(z+h)^n-z^n$. We just expanded out $(z+h)^n$ using the binomial theorem. But we could instead have used the expansion
$$a^n-b^n=(a-b)(a^{n-1}+a^{n-2}b+\dots+ab^{n-2}+b^{n-1})$$
Applying that with $a=z+h$ and $b=z$, we get
$$(z+h)^n-z^n=h\bigl((z+h)^{n-1}+(z+h)^{n-2}z+\dots+z^{n-1}\bigr)$$
Just before we continue, note that this gives us an alternative, and in my view nicer, way to see that the derivative of $z^n$ is $nz^{n-1}$, since if you divide the right-hand side by $h$ and let $h\to 0$ then each of the $n$ terms tends to $z^{n-1}$.
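The expansion $a^n-b^n=(a-b)(a^{n-1}+a^{n-2}b+\dots+b^{n-1})$ with $a=z+h$ and $b=z$ can be checked exactly for small $n$. The sketch below uses Python’s `Fraction` type so that there is no rounding. The function name `geometric_factor` and the sample values of $z$ and $h$ are my own arbitrary choices.

```python
from fractions import Fraction


def geometric_factor(a, b, n):
    """Return a^(n-1) + a^(n-2) b + ... + a b^(n-2) + b^(n-1)."""
    return sum(a ** (n - 1 - k) * b ** k for k in range(n))


z = Fraction(2, 3)
h = Fraction(1, 1000)

for n in range(1, 8):
    lhs = (z + h) ** n - z ** n
    rhs = h * geometric_factor(z + h, z, n)
    assert lhs == rhs  # exact equality, since Fractions avoid rounding

# Dividing by h leaves n terms, each close to z^(n-1) when h is small.
n = 5
quotient = geometric_factor(z + h, z, n)
print(float(quotient), float(n * z ** (n - 1)))
```

The two printed numbers should be close, which is the statement that the difference quotient of $z^n$ is near $nz^{n-1}$ for small $h$.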
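Anyhow, if we use this trick, then $\frac{f(z+h)-f(z)}{h}$ works out to be
$$\sum_{n=1}^\infty a_n\bigl((z+h)^{n-1}+(z+h)^{n-2}z+\dots+z^{n-1}\bigr)$$
Now let’s subtract the thing we want this to tend to, which is $\sum_{n=1}^\infty na_nz^{n-1}$. (This is not valid unless we know that this series converges. So at some stage we will need to prove that.) If we think of $nz^{n-1}$ as a sum of $n$ copies of $z^{n-1}$, then we can write the difference as
$$\sum_{n=1}^\infty a_n\bigl(((z+h)^{n-1}-z^{n-1})+((z+h)^{n-2}z-z^{n-1})+\dots+(z^{n-1}-z^{n-1})\bigr)$$
which equals (the $n=1$ term being zero)
$$\sum_{n=2}^\infty a_n\bigl(((z+h)^{n-1}-z^{n-1})+z((z+h)^{n-2}-z^{n-2})+\dots+z^{n-2}((z+h)-z)\bigr)$$
Now $(z+h)^k-z^k$ is another example of the expansion we had above. That is, we can write it as
$$h\bigl((z+h)^{k-1}+(z+h)^{k-2}z+\dots+z^{k-1}\bigr)$$
We haven’t yet mentioned the radius of convergence of the original power series, but let’s do so now. Suppose it is $R$, that $r$ is such that $|z|<r<R$, and that we have chosen $h$ small enough that $|z+h|\leq r$. Then the modulus of the expression above is at most $k|h|r^{k-1}$.
It follows that
$$\bigl|((z+h)^{n-1}-z^{n-1})+z((z+h)^{n-2}-z^{n-2})+\dots+z^{n-2}((z+h)-z)\bigr|\leq|h|\bigl((n-1)r^{n-2}+(n-2)r^{n-2}+\dots+r^{n-2}\bigr)$$
Since $1+2+\dots+(n-1)=\binom n2$, this is equal to $|h|\binom n2r^{n-2}$. Summing over $n$, the modulus of the whole difference is at most $|h|\sum_{n=2}^\infty\binom n2|a_n|r^{n-2}$.
So this will tend to zero as $h\to 0$ as long as we can prove that the sum $\sum_{n=2}^\infty\binom n2|a_n|r^{n-2}$ converges.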
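As a sanity check on the estimate, the following Python sketch compares the actual error $\bigl|\frac{f(z+h)-f(z)}{h}-\sum na_nz^{n-1}\bigr|$ with the bound $|h|\sum\binom n2|a_n|r^{n-2}$. To keep everything finite it uses a polynomial, namely a truncated geometric series with all coefficients equal to 1; that choice, the values of $z$, $r$ and $h$, and all the function names are assumptions made purely for illustration.

```python
from math import comb


def poly(coeffs, z):
    """Evaluate sum of a_n z^n for the given coefficient list."""
    return sum(a * z ** n for n, a in enumerate(coeffs))


def poly_derivative(coeffs, z):
    """Evaluate the term-by-term derivative, sum of n a_n z^(n-1)."""
    return sum(n * a * z ** (n - 1) for n, a in enumerate(coeffs) if n >= 1)


def error_bound(coeffs, h, r):
    """The bound |h| * sum of C(n, 2) |a_n| r^(n-2)."""
    return abs(h) * sum(
        comb(n, 2) * abs(a) * r ** (n - 2) for n, a in enumerate(coeffs) if n >= 2
    )


def gap_and_bound(coeffs, z, h, r):
    """Return (actual error of the difference quotient, claimed bound)."""
    assert abs(z) <= r and abs(z + h) <= r  # hypotheses of the estimate
    quotient = (poly(coeffs, z + h) - poly(coeffs, z)) / h
    return abs(quotient - poly_derivative(coeffs, z)), error_bound(coeffs, h, r)


coeffs = [1.0] * 40  # hypothetical example: 1 + z + ... + z^39
z = 0.3 + 0.2j
r = 0.5
for h in (0.1, 0.01 - 0.01j, 1e-3j):
    gap, bound = gap_and_bound(coeffs, z, h, r)
    print(abs(h), gap, bound)
```

In each row the error should sit below the bound, and both should shrink roughly in proportion to $|h|$.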
Convergence of the power series you get when you differentiate term by term
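Let’s prove a lemma to deal with that last point. It says that if $r$ is smaller than the radius of convergence of the power series $\sum_{n=0}^\infty a_nz^n$, then the power series $\sum_{n=1}^\infty n|a_n|r^{n-1}$ converges.
The proof is very similar to an argument we have seen already. Let $R$ be the radius of convergence, and pick $s$ with $r<s<R$. Then the power series $\sum_{n=0}^\infty a_ns^n$ converges, so the terms $|a_ns^n|$ are bounded above, by $M$, say. Then
$$n|a_n|r^{n-1}=n|a_n|s^{n-1}(r/s)^{n-1}\leq\frac Msn(r/s)^{n-1}$$
But the series $\sum_{n=1}^\infty n(r/s)^{n-1}$ converges, by the ratio test. Therefore, by the comparison test, the series $\sum_{n=1}^\infty n|a_n|r^{n-1}$ converges.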
This shows also that if $|z|<R$ then the power series $\sum_{n=1}^\infty na_nz^{n-1}$ converges (since we have just proved that it converges absolutely). So if we differentiate a power series term by term, we get a new power series that has the same radius of convergence, something we needed earlier.
If we apply this lemma a second time, we get that the power series $\sum_{n=2}^\infty n(n-1)|a_n|r^{n-2}$ converges, and dividing by 2 that gives us what we wanted above, namely that $\sum_{n=2}^\infty\binom n2|a_n|r^{n-2}$ converges.
A couple of applications
An obvious way of applying the result is to take some of your favourite power series and differentiate them term by term. This illustrates the very important general point that if you can obtain something in two different ways, then you usually end up proving something interesting.
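So let’s take the function $e^z=\sum_{n=0}^\infty\frac{z^n}{n!}$, which we have shown converges everywhere. Then we can obtain the derivative either by differentiating the function itself or by differentiating the power series term by term. That tells us that $\frac{d}{dz}e^z=\sum_{n=1}^\infty\frac{nz^{n-1}}{n!}$, which simplifies to $\sum_{n=1}^\infty\frac{z^{n-1}}{(n-1)!}$, which in turn simplifies to $\sum_{n=0}^\infty\frac{z^n}{n!}$, which equals $e^z$.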
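The coefficient bookkeeping for the exponential series can be checked mechanically. In the sketch below (the function names are hypothetical, and exact `Fraction` arithmetic is used to avoid rounding), term-by-term differentiation sends the coefficient list $(a_0,a_1,a_2,\dots)$ to $(a_1,2a_2,3a_3,\dots)$, and for $a_n=1/n!$ this returns the list we started with, one term shorter.

```python
from fractions import Fraction
from math import factorial


def exp_coeffs(count):
    """First `count` coefficients 1/n! of the exponential series."""
    return [Fraction(1, factorial(n)) for n in range(count)]


def differentiate_coeffs(coeffs):
    """Coefficients of the term-by-term derivative: b_m = (m + 1) a_(m+1)."""
    return [n * a for n, a in enumerate(coeffs)][1:]


a = exp_coeffs(12)
b = differentiate_coeffs(a)
print(b == a[:-1])  # the derivative has the same coefficients
```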
Earlier we proved this result by writing $e^{z+h}$ as $e^ze^h$ and proving that $\frac{e^h-1}{h}\to 1$ as $h\to 0$. I still prefer that proof, but you are at liberty to disagree.
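As another example, let us consider the power series $\sum_{n=0}^\infty z^n$. When $|z|<1$ this equals $\frac1{1-z}$, by the formula for summing a GP. We can now differentiate the power series term by term, and we can also differentiate the function $\frac1{1-z}$. Doing so tells us the interesting fact that
$$\sum_{n=1}^\infty nz^{n-1}=\frac1{(1-z)^2}$$
We can see that in another way as well. By our result on multiplying power series, the product of $\sum_{n=0}^\infty z^n$ with itself is the power series $\sum_{n=0}^\infty c_nz^n$, where $(c_n)$ is the convolution of the constant sequence $(1,1,1,\dots)$ with itself. That is, $c_n=\sum_{k=0}^na_kb_{n-k}$ with every $a_k$ and $b_{n-k}$ equal to 1, which gives us $c_n=n+1$. (This agrees with the previous answer, since $\sum_{n=0}^\infty(n+1)z^n$ is the same as $\sum_{n=1}^\infty nz^{n-1}$.)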
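Both routes to the formula for $\frac1{(1-z)^2}$ are easy to test numerically. The Python sketch below first convolves the constant sequence of 1s with itself and then compares a long partial sum of $\sum nz^{n-1}$ with $\frac1{(1-z)^2}$. The helper names, the sample point $z$ and the number of terms are arbitrary choices for illustration.

```python
def convolve(a, b):
    """First min(len(a), len(b)) terms of the convolution c_n = sum a_k b_(n-k)."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


def partial_sum(z, terms):
    """Partial sum of n z^(n-1) for n = 1, ..., terms."""
    return sum(n * z ** (n - 1) for n in range(1, terms + 1))


ones = [1] * 10
c = convolve(ones, ones)
print(c)  # the coefficients of the square of the geometric series

z = 0.4 - 0.3j  # a sample point with |z| = 0.5 < 1
approx = partial_sum(z, 200)
exact = 1 / (1 - z) ** 2
print(abs(approx - exact))
```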
Tidying up the proof
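In the proof above, we used the identity
$$a^n-b^n=(a-b)(a^{n-1}+a^{n-2}b+\dots+b^{n-1})$$
with $a=z+h$ and $b=z$, and then we used it again to calculate what happened when we subtracted $nz^{n-1}$. Can we get those calculations out of the way in advance? That is, can we begin by finding a nice formula for $a^n-b^n-n(a-b)b^{n-1}$?
We obviously can, by subtracting $n(a-b)b^{n-1}$ from the right-hand side and simplifying, much as we did in the proof above (with $a=z+h$ and $b=z$). However, we can do things a bit more slickly as follows. Start with the identity
$$a^n-b^n=(a-b)(a^{n-1}+a^{n-2}b+\dots+b^{n-1})$$
Differentiating both sides with respect to $b$, we get
$$-nb^{n-1}=-(a^{n-1}+a^{n-2}b+\dots+b^{n-1})+(a-b)(a^{n-2}+2a^{n-3}b+\dots+(n-1)b^{n-2})$$
If we now multiply through by $a-b$, use the identity again, and take $z+h$ for $a$ and $z$ for $b$, we deduce that
$$(z+h)^n-z^n-nhz^{n-1}$$
is equal to
$$h^2\bigl((z+h)^{n-2}+2(z+h)^{n-3}z+\dots+(n-1)z^{n-2}\bigr)$$
In particular, if $|z|$ and $|z+h|$ are both at most $r$, then $|(z+h)^n-z^n-nhz^{n-1}|\leq|h|^2\binom n2r^{n-2}$, which is the main fact we needed in the proof.
Armed with this fact, we could argue as follows. We want to show that
$$f(z+h)-f(z)-h\sum_{n=1}^\infty na_nz^{n-1}$$
is $o(h)$. By the inequality we have just proved, if $|z|$ and $|z+h|$ are at most $r$, then the modulus of this expression is at most
$$|h|^2\sum_{n=2}^\infty\binom n2|a_n|r^{n-2}$$
and an earlier lemma told us that this converges within the circle of convergence. So the quantity we want to be $o(h)$ is in fact bounded above by a multiple of $|h|^2$. (Sometimes people use the notation $O(h^2)$ for this. The $O$ means “bounded above in modulus by a constant multiple of the modulus of”.)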
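The algebraic identity behind this tidier argument can be stated as $a^n-b^n-n(a-b)b^{n-1}=(a-b)^2\bigl(a^{n-2}+2a^{n-3}b+\dots+(n-1)b^{n-2}\bigr)$, and it is easy to test exactly. Here is a short Python sketch that does so with `Fraction` arithmetic. The names `weighted_sum` and `remainder` and the sample values are hypothetical, and a finite check like this is of course not a proof.

```python
from fractions import Fraction


def weighted_sum(a, b, n):
    """Return a^(n-2) + 2 a^(n-3) b + ... + (n-1) b^(n-2) (empty sum if n = 1)."""
    return sum(k * a ** (n - 1 - k) * b ** (k - 1) for k in range(1, n))


def remainder(a, b, n):
    """Return a^n - b^n - n (a - b) b^(n-1)."""
    return a ** n - b ** n - n * (a - b) * b ** (n - 1)


z = Fraction(3, 7)
h = Fraction(-1, 50)
a, b = z + h, z  # the substitution used in the text

for n in range(1, 10):
    assert remainder(a, b, n) == (a - b) ** 2 * weighted_sum(a, b, n)

# The weights 1 + 2 + ... + (n-1) add up to C(n, 2), which gives the bound.
n = 6
print(weighted_sum(Fraction(1), Fraction(1), n), n * (n - 1) // 2)
```

Setting $a=b=1$ in the weighted sum recovers $\binom n2$, which is where the factor $\binom n2r^{n-2}$ in the inequality comes from.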
Was the “trick” a trick?
The proof in this post has relied heavily on the idea, which appeared to come from nowhere, of writing $(z+h)^n-z^n$ not in the obvious way, which is
$$nz^{n-1}h+\binom n2z^{n-2}h^2+\dots+h^n$$
but in a “clever” way, namely
$$h\bigl((z+h)^{n-1}+(z+h)^{n-2}z+\dots+z^{n-1}\bigr)$$
Is this something one just has to remember, or can it be regarded as the natural thing to do?
I chose the words “can it be regarded as” quite carefully, since I want to argue that it is the natural thing to do, but when I was preparing this lecture, I didn’t find it the natural thing to do, as I shall now explain. I came to this result with the following background. Many years ago, I lectured an IB course called Further Analysis, which was a sort of combination of the current courses Metric and Topological Spaces and Complex Analysis, all packed into 16 lectures. (Amazingly, it worked quite well, though it was a challenge to get through all the material.) As a result of lecturing that, I learnt a proof that power series can be differentiated term by term inside their circle of convergence, but the proof uses a number of results from complex analysis. I then believed what some people say, which is that the complex analysis proof of this result is a very good advertisement for complex analysis, since a direct proof is horrible. And then at some point I was chatting to Imre Leader about the reorganization of various courses, and he told me that it was a myth that proving the result directly was hard. It wasn’t trivial, he said, but it was basically fine. In fact, it may even be thanks to him that the result is in the course.
Until a few days ago, I didn’t bother to check for myself that the proof wasn’t too bad; I just believed what he said. And then with the lecture coming up, I decided that the time had finally come to check it: something that I assumed would be a reasonably simple exercise. I duly did the obvious thing, including expanding $(z+h)^n$ using the binomial theorem, and got stuck.
I would like to be able to say that I then thought hard about why I was stuck, and after a while thought of the idea of expanding $(z+h)^n-z^n$ using the expansion of $a^n-b^n$. But actually that is not what happened. What happened was that I thought, “Damn, I’m going to have to look up the proof.” I found a few proofs online that looked dauntingly complicated and I couldn’t face reading them properly, apart from one that was quite nice and that for a while I thought I would use. But one thing all the proofs had in common was the use of that expansion, so that was how the idea occurred to me.
So what follows is a rational reconstruction of what I wish had been my thought processes, rather than of what actually went on in my mind.
Let’s go back to the question of how to differentiate $z^n$. I commented above that one could do it using the $a^n-b^n$ expansion, and said that I even preferred that approach. But how might one think of doing it that way? There is a very simple answer to that, which is to use one of the alternative definitions of differentiability, namely that $f$ is differentiable at $z$ with derivative $f'(z)$ if $\frac{f(w)-f(z)}{w-z}\to f'(z)$ as $w\to z$. This is simply replacing $z+h$ by $w$, but that is nice because it has the effect of making the expression more symmetrical. (One might argue that since we are talking about differentiability at $z$, the variables $z$ and $w$ are playing different roles, so there is not much motivation for symmetry. And indeed, that is why calling one point $z$ and the other $z+h$ is often a good idea. But symmetry is … well … sort of good to have even when not terribly strongly motivated.)
If we use this definition, then the derivative of $z^n$ is the limit as $w\to z$ of $\frac{w^n-z^n}{w-z}$, and now there is no temptation to use the binomial expansion (we would first have to write $w^n$ as $(z+(w-z))^n$ and the whole thing would be disgusting) and the absolutely obvious thing to do is to observe that we have a nice formula for the ratio in question, namely
$$w^{n-1}+w^{n-2}z+\dots+wz^{n-2}+z^{n-1}$$
which obviously tends to $nz^{n-1}$ as $w\to z$.
In fact, the whole proof is arguably nicer if one uses $z$ and $w$ rather than $z$ and $z+h$.
Thus, the “clever” expansion is the natural one to do with the symmetric definition of differentiation, whereas the binomial expansion is the natural one to do with the $z+h$ definition. So in the presentation above, I have slightly obscured the origins of the argument by applying the clever expansion to the $z+h$ definition.
Another way of seeing that it is natural is to think about how we prove the statement that a product of limits is the limit of the products. The essence of this is to show that if $a$ is close to $a'$ and $b$ is close to $b'$, then $ab$ is close to $a'b'$. This we do by arguing that $ab$ is close to $a'b$, and that $a'b$ is close to $a'b'$.
Suppose we apply a similar technique to try to show that $(z+h)^n$ is close to $z^n$. How might we represent their difference? A natural way of doing it would be to convert all the $(z+h)$s into $z$s in a sequence of $n$ steps. That is, we would argue that $(z+h)^n$ is close to $(z+h)^{n-1}z$, which is close to $(z+h)^{n-2}z^2$, and so on.
But the difference between $(z+h)^{n-k}z^k$ and $(z+h)^{n-k-1}z^{k+1}$ is $h(z+h)^{n-k-1}z^k$, so if we adopt this approach, then we will end up showing precisely that
$$(z+h)^n-z^n=h\bigl((z+h)^{n-1}+(z+h)^{n-2}z+\dots+z^{n-1}\bigr)$$
February 22, 2014 at 3:45 pm |
If you assume uniform convergence of the functions, this principle becomes rather usefully true, doesn’t it?
However this is presumably dealt with later (perhaps along with the Riemann integral?).
February 22, 2014 at 5:06 pm
To be more precise, we can get away with assuming pointwise convergence of $f_n$ to a continuous function $f$ as long as the derivatives $f_n'$ are continuous and converge uniformly to $g$, at least on some open interval/set around the point we are looking at.
My favourite proof of this uses the Riemann integral and the Fundamental Theorem of Calculus, but there are also some relatively elementary proofs available using e.g. suitable Mean Value Theorem estimates.
February 23, 2014 at 10:33 am
You’re right that we haven’t done uniform convergence or the Riemann integral yet, and in fact uniform convergence isn’t on the syllabus for this course, but if I have time at the end of the course, having done integration, I might mention this proof.
February 22, 2014 at 3:57 pm |
Another way would be to state a general principle with stronger hypotheses that make it true ;). I suggest the following: let $(f_n)$ be a sequence of twice-differentiable functions such that
(i) $f_n$ converges pointwise to $f$,
(ii) $f_n'$ converges pointwise to $g$,
(iii) $f_n''$ is bounded by $M$, for some $M$ independent of $n$.
Then $f$ is differentiable and $f'=g$.
This applies easily to power series since the pointwise convergence and the bound can be established by the ratio test.
PROOF. Let $x$ be in your domain and $h\neq 0$. For all $n$ we have
$$\Bigl|\frac{f(x+h)-f(x)}{h}-g(x)\Bigr|\leq\Bigl|\frac{f(x+h)-f_n(x+h)}{h}\Bigr|+\Bigl|\frac{f_n(x)-f(x)}{h}\Bigr|+|f_n'(x)-g(x)|+\Bigl|\frac{f_n(x+h)-f_n(x)}{h}-f_n'(x)\Bigr|$$
The first and the second terms tend to zero as $n\to\infty$, by hypothesis (i). The third term as well, by hypothesis (ii). By Taylor’s theorem with Lagrange form of the remainder (the one you told us about a few days ago!), the last term is bounded by $M|h|/2$. This proves that $f$ is differentiable at $x$ and that $f'(x)=g(x)$.
February 23, 2014 at 10:34 am
Thanks for this. I’m thinking of adding a section to the post above, giving your argument in full detail, but if any of my students are reading this, I recommend trying to fill in the details for yourself.
February 22, 2014 at 11:17 pm |
“So what follows is a rational reconstruction of what I wish had been my thought processes, rather than of what actually went on in my mind.”
I absolutely love this statement, and I think this admission makes this post much more beautiful than if you had presented the “trick” from the start. I try to make my students realize that slick textbook and article presentations frequently obscure the problem-solving process itself. While such education is not necessarily the point of journal articles, textbooks for real analysis and other subjects too often present “nice” proofs without discussing how on Earth a person actually thinks to do these things. I truly appreciate your attempts to motivate proofs and to minimize the use of “magic bullets” (e.g. use of Cauchy’s generalized MVT with specially chosen functions.)
Thank you for this blog. I enjoy it.
July 30, 2014 at 1:56 pm |
Thank you for a very interesting post. This is a bit late, but I’d like to give another real variable proof of the result in which the heavy work is done by uniform convergence of power series strictly within their circle of convergence.
Fix
such that
and choose
such that
there exists
such that, whenever
, the errors in approximating (i)
by
, (ii)
by
and (iii)
by $\sum_{n=0}^N na_n z^{n-1}$ are all
. Since
for some (complicated) polynomial
, the left-hand side is
for all sufficiently small
. Hence
for all sufficiently small
.
July 30, 2014 at 6:38 pm
Sorry, the second paragraph and the LaTeX have got completely mangled. Please could you delete everything and I will put a post on my blog with the idea.
November 5, 2016 at 3:59 pm |
Thank you very much for making these thoughts of yours public, Gowers.