I’m writing this post as a way of preparing for a lecture. I want to discuss the result that a power series is differentiable inside its circle of convergence, and the derivative is given by the obvious formula . In other words, inside the circle of convergence we can think of a power series as like a polynomial of degree for the purposes of differentiation.

A preliminary question about this is why it is not more or less obvious. After all, writing , we have the following facts.

- Writing , we have that .
- For each , .

If we knew that , then we would be done.

Ah, you might be thinking, how do we know that the sequence converges? But it turns out that that is not the problem: it is reasonably straightforward to show that it converges. (Roughly speaking, inside the circle of convergence the series converges at least as fast as a GP, and multiplying the th term by doesn’t stop a GP converging (as can easily be seen with the help of the ratio test). So, writing for , we have the following facts at our disposal.

Doesn’t it follow from that that ?

We are appealing here to a general principle, which is that if some functions converge to and their derivatives converge to , then is differentiable with . Is this general principle correct?

Unfortunately, it isn’t. Suppose we take some continuous functions that converge to a step function. (Roughly speaking, you make be 0 up to 0, then linear with gradient until it hits 1, then 1 from that point onwards.) And suppose we then let be the function that differentiates to and is 0 up to 0. Then the converge to the function that is 0 up to 0 and for positive . This function *almost* differentiates to the step function, but it isn’t differentiable at 0.

So we’ve somehow got to use particular facts about power series in order to prove our result — we can’t appeal to general considerations, because then we are appealing to a principle that isn’t true. (Actually, in principle some compromise might be possible, where we show that functions defined by power series have a certain property and then use nothing apart from that property from that point on. But as it happens, we shall not do this.)

### Why can’t we just jump in and prove it with a big calculation?

We have a formula for . Why don’t we write out a formula for and see if we can tell what happens when ?

That is certainly a sensible first thing to try, so let’s see what happens.

What can we do with that? Perhaps we’d better apply the binomial theorem. Then we find that the right-hand side is equal to

Part of the above expression gives us what we want, namely . So we’re left wanting to prove that

tends to 0 as .

Unfortunately, as gets big, some of those binomial coefficients get pretty big too. Indeed, when is bigger than , the growth in the binomial coefficients seems to outstrip the shrinking of the powers of . What can we do?

### A useful trick

Fortunately, there is a better (for our purposes at least) way of writing . We just expanded out using the binomial theorem. But we could instead have used the expansion

Applying that with and , we get

Just before we continue, note that this gives us an alternative, and in my view nicer, way to see that the derivative of is , since if you divide the right-hand side by and let then each of the terms tends to .

Anyhow, if we use this trick, then works out to be

Now let’s subtract the thing we want this to tend to, which is . (This is not valid unless we know that this series converges. So at some stage we will need to prove that.) If we think of as a sum of copies of , then we can write the difference as

which equals

Now is another example of the expansion we had above. That is, we can write it as

We haven’t yet mentioned the radius of convergence of the original power series, but let’s do so now. Suppose it is , that is such that , and that we have chosen small enough that . Then the modulus of the expression above is at most .

It follows that

Since , this is equal to .

So this will tend to zero as as long as we can prove that the sum converges.

### Convergence of the power series you get when you differentiate term by term

Let’s prove a lemma to deal with that last point. It says that if is smaller than the radius of convergence of the power series , then the power series converges.

The proof is very similar to an argument we have seen already. Let be the radius of convergence, and pick with . Then the power series converges, so the terms are bounded above, by , say. Then .

But the series converges, by the ratio test. Therefore, by the comparison test, the series converges.

This shows also that if then the power series converges (since we have just proved that it converges absolutely). So if we differentiate a power series term by term, we get a new power series that has the same radius of convergence, something we needed earlier.

If we apply this lemma a second time, we get that the power series converges, and dividing by 2 that gives us what we wanted above, namely that converges.

### A couple of applications

An obvious way of applying the result is to take some of your favourite power series and differentiate them term by term. This illustrates the very important general point that if you can obtain something in two different ways, then you usually end up proving something interesting.

So let’s take the function , which we have shown converges everywhere. Then we can obtain the derivative either by differentiating the function itself or by differentiating the power series term by term. That tells us that

, which simplifies to , which in turn simplifies to , which equals .

Earlier we proved this result by writing as and proving that . I still prefer that proof, but you are at liberty to disagree.

As another example, let us consider the power series . When this equals , by the formula for summing a GP. We can now differentiate the power series term by term, and we can also differentiate the function . Doing so tells us the interesting fact that

We can see that in another way as well. By our result on multiplying power series, the product of with itself is the power series , where is the convolution of the constant sequence with itself. That is, with every and equal to 1, which gives us . (This agrees with the previous answer, since is the same as .)

### Tidying up the proof

In the proof above, we used the identity

with and , and then we used it again to calculate what happened when we subtracted . Can we get those calculations out of the way in advance? That is, can we begin by finding a nice formula for ?

We obviously can, by subtracting from the right-hand side and simplifying, much as we did in the proof above (with and ). However, we can do things a bit more slickly as follows. Start with the identity

Differentiating both sides with respect to , we get

If we now take for and for , we deduce that is equal to

In particular, if and are both at most , then , which is the main fact we needed in the proof.

Armed with this fact, we could argue as follows. We want to show that

is . By the inequality we have just proved, if and are at most , then the modulus of this expression is at most

and an earlier lemma told us that this converges within the circle of convergence. So the quantity we want to be is in fact bounded above by a multiple of . (Sometimes people use the notation for this. The means “bounded above in modulus by a constant multiple of the modulus of”.)

### Was the “trick” a trick?

The proof in this post has relied heavily on the idea, which appeared to come from nowhere, of writing not in the obvious way, which is

but in a “clever” way, namely

Is this something one just has to remember, or can it be regarded as the natural thing to do?

I chose the words “can it be regarded as” quite carefully, since I want to argue that it is the natural thing to do, but when I was preparing this lecture, I didn’t find it the natural thing to do, as I shall now explain. I came to this result with the following background. Many years ago, I lectured a IB course called Further Analysis, which was a sort of combination of the current courses Metric and Topological Spaces and Complex Analysis, all packed into 16 lectures. (Amazingly, it worked quite well, though it was a challenge to get through all the material.) As a result of lecturing that, I learnt a proof that power series can be differentiated term by term inside their circle of convergence, but the proof uses a number of results from complex analysis. I then believed what some people say, which is that the complex analysis proof of this result is a very good advertisement for complex analysis, since a direct proof is horrible. And then at some point I was chatting to Imre Leader about the reorganization of various courses, and he told me that it was a myth that proving the result directly was hard. It wasn’t trivial, he said, but it was basically fine. In fact, it may even be thanks to him that the result is in the course.

Until a few days ago, I didn’t bother to check for myself that the proof wasn’t too bad — I just believed what he said. And then with the lecture coming up, I decided that the time had finally come to check it: something that I assumed would be a reasonably simple exercise. I duly did the obvious thing, including expanding using the binomial theorem, and got stuck.

I would like to be able to say that I then thought hard about why I was stuck, and after a while thought of the idea of expanding using the expansion of . But actually that is not what happened. What happened was that I thought, “Damn, I’m going to have to look up the proof.” I found a few proofs online that looked dauntingly complicated and I couldn’t face reading them properly, apart from one that was quite nice and that for a while I thought I would use. But one thing all the proofs had in common was the use of that expansion, so that was how the idea occurred to me.

So what follows is a rational reconstruction of what I *wish* had been my thought processes, rather than of what actually went on in my mind.

Let’s go back to the question of how to differentiate . I commented above that one could do it using the expansion, and said that I even preferred that approach. But how might one think of doing it that way? There is a very simple answer to that, which is to use one of the alternative definitions of differentiability, namely that is differentiable at with derivative if as . This is simply replacing by , but that is nice because it has the effect of making the expression more symmetrical. (One might argue that since we are talking about differentiability *at* , the variables and are playing different roles, so there is not much motivation for symmetry. And indeed, that is why calling one point and the other is often a good idea. But symmetry is … well … sort of good to have even when not terribly strongly motivated.)

If we use this definition, then the derivative of is the limit as of , and now there is no temptation to use the binomial expansion (we would first have to write as and the whole thing would be disgusting) and the absolutely obvious thing to do is to observe that we have a nice formula for the ratio in question, namely

which obviously tends to as .

In fact, the whole proof is arguably nicer if one uses and rather than and .

Thus, the “clever” expansion is the natural one to do with the symmetric definition of differentiation, whereas the binomial expansion is the natural one to do with the definition. So in the presentation above, I have slightly obscured the origins of the argument by applying the clever expansion to the definition.

Another way of seeing that it is natural is to think about how we prove the statement that a product of limits is the limit of the products. The essence of this is to show that if is close to and is close to , then is close to . This we do by arguing that is close to , and that is close to .

Suppose we apply a similar technique to try to show that is close to . How might we represent their difference? A natural way of doing it would be to convert all the s into s in a sequence of steps. That is, we would argue that is close to , which is close to , and so on.

But the difference between and is , so if we adopt this approach, the we will end up showing precisely that

February 22, 2014 at 3:45 pm |

If you assume

uniformconvergence of the functions, this principle becomes rather usefully true, doesn’t it?However this is presumably dealt with later (perhaps along with the Riemann integral?).

February 22, 2014 at 5:06 pm

To be more precise, we can get away with assuming pointwise convergence of to a continuous function as long as the derivatives are continuous and converge

uniformlyto , at least on some open interval/set around the point we are looking at.My favourite proof of this uses the Riemann integral and the Fundamental Theorem of Calculus, but there are also some relatively elementary proofs available using e.g. suitable Mean Value Theorem estimates.

February 23, 2014 at 10:33 am

You’re right that we haven’t done uniform convergence or the Riemann integral yet, and in fact uniform convergence isn’t on the syllabus for this course, but if I have time at the end of the course, having done integration, I might mention this proof.

February 22, 2014 at 3:57 pm |

Another way would to state a general principle with stronger hypothesis to makes it true ;). I suggest the following : Let be a sequence a function, two times differentiable, such that

(i) converges pointwise to

(ii) converges pointwise to

(iii) is bounded by , for some independent of

then is differentiable and .

This applies easily to power series since the pointwise convergence and the bound can be established by the ratio test.

PROOF.

Let in your domain and .

For all we have

The first and the second terms tends to zero as , by the

hypothesis (i). The third term as well, by hypothesis (ii). By Taylor’s

theorem with Lagrange form of the remainder (the one you told us about few days

ago!), the last term is bounded by . This proves that is

differentiable at and that .

February 23, 2014 at 10:34 am

Thanks for this. I’m thinking of adding a section to the post above, giving your argument in full detail, but if any of my students are reading this, I recommend trying to fill in the details for yourself.

February 22, 2014 at 11:17 pm |

“So what follows is a rational reconstruction of what I wish had been my thought processes, rather than of what actually went on in my mind.”

I absolutely love this statement, and I think this admission makes this post much more beautiful than if you had presented the “trick” from the start. I try to make my students realize that slick textbook and article presentations frequently obscure the problem-solving process itself. While such education is not necessarily the point of journal articles, textbooks for real analysis and other subjects too often present “nice” proofs without discussing how on Earth a person actually thinks to do these things. I truly appreciate your attempts to motivate proofs and to minimize the use of “magic bullets” (e.g. use of Cauchy’s generalized MVT with specially chosen functions.)

Thank you for this blog. I enjoy it.

February 24, 2014 at 6:18 am |

Reblogged this on Math Online Tom Circle.

July 30, 2014 at 1:56 pm |

Thank you for a very interesting post. This is a bit late, but I’d like to give another real variable proof of the result in which the heavy work is done by uniform convergence of power series strictly within their circle of convergence.

Fix such that and choose such that there exists such that, whenever , the errors in approximating (i) by , (ii) by and (iii) by $\sum_{n=0}^N na_n z^{n-1}$ are all . Since

for some (complicated) polynomial , the left-hand side is for all sufficiently small . Hence

for all sufficiently small .

July 30, 2014 at 6:38 pm

Sorry, the second paragraph and the LaTeX have got completely mangled. Please could you delete everything and I will put a post on my blog with the idea.