## Sums of kth powers

Recently I was giving a talk at a school in London and had cause to think about sums of kth powers, because I wanted to show people the difference between unenlightening proofs and enlightening ones. (My brief was to try to give them some idea of what proofs are and why they are worth bothering about.) On the train down, I realized that there is a very simple general argument that can be used to work out formulae for sums of kth powers. After a brief look online, I haven’t found precisely this method, though I’ve found plenty of methods in the vicinity of it. It’s bound to be in the literature somewhere, so perhaps somebody can point me towards a reference. But even so, it’s worth a short post.

I’ll start with the case $k=1$. I want to phrase a familiar argument in a way that will make it easy to generalize in the way I want to generalize it. Let $R_2(n)$ be the rectangle consisting of all integer points $(x,y)$ such that $1\leq x\leq n$ and $1\leq y\leq n+1$. We can partition $R_2(n)$ into those points for which $x\geq y$ and those points for which $x. The number of points of the first kind is $1 + 2 + ... + n$, since for each $x$ we get $x$ possibilities for $y$. The number of points of the second kind is $0 + 1 + ... + n$, since for each $y$ we get $y-1$ possibilities for $x$. Therefore, $2(1 + 2 + ... + n) = n(n+1)$ and we get the familiar formula.

Now let’s move to sums of squares. This time I’ll let $R_3(n)$ be the set of points $(x,y,z)$ with $1\leq x\leq n$, $1\leq y\leq n+1$ and $1\leq z\leq n+1$. We can partition $R_3(n)$ into three sets, the first consisting of points for which $x$ is maximal, the second of points for which $y$ is maximal and $x$ is not maximal, and the third of points for which $z$ is strictly larger than both $x$ and $y$. The numbers of points in these sets are easily seen to be

$1^2 + 2^2 + \dots + n^2$

$0.1 + 1.2 + 2.3 + \dots + n(n+1) = 1^2 + 2^2 + \dots + n^2 + 1 + 2 + \dots + n$

and

$1^2 + 2^2 + \dots + n^2$,

respectively. This gives us the formula

$3(1^2 + 2^2 + \dots + n^2) = n(n+1)^2 - n(n+1)/2 = n(n+1)(n+1/2)$

from which we get the familiar formula $\frac 16 n(n+1)(2n+1)$ for the sum of the first $n$ squares. Writing $S_k(n)$ for the sum of the first $n$ $k$th powers, we also get the relationship $3S_2(n) + S_1(n) = n(n+1)^2.$

A striking fact about power sums is that $S_3(n) = S_1(n)^2$. One way of explaining this can be found in a rather nice animation that I came across as a result of a Google Plus post of Richard Green. Another comes from continuing with the approach here.

This time I’ll let $R_4(n)$ be the set of points $(x,y,z,w)$ such that $x$ and $y$ are between 1 and n and $z$ and $w$ are between 1 and $n+1$. Again I’ll partition according to which of $(x,y,z,w)$ is the largest, taking the first one that’s largest when there is a tie. That gives me four sets. Here are their sizes.

$x$ first largest: $1^3 + 2^3 + \dots + n^3$.

$y$ first largest: $0.1^2 + 1.2^2 + \dots + (n-1).n^2$.

$z$ first largest: $0^2.1 + 1^2.2 + \dots + n^2.(n+1)$.

$w$ first largest: $0^3 + 1^3 + 2^3 + \dots + n^3$.

These sizes can be written as $S_3(n), S_3(n)-S_2(n), S_3(n)+S_2(n)$, and $S_3(n)$. So we get $4S_3(n) = n^2(n+1)^2$, which gives us that $S_3(n)=S_1(n)^2$. It also gives us a kind of explanation of that fact: for $S_1(n)$ we decompose $R_2(n)$ into two equal pieces of size $S_1(n)$, while for $S_3(n)$ we decompose $R_2(n)^2$ into four pieces that don’t quite all have size $S_3(n)$ but the two errors cancel out.

To see that this is a partial but not total coincidence, I’m going to jump to $S_5(n)$ now. I’ll let $R_6(n)$ be the set of points $(u,v,w,x,y,z)$ such that $u,v,w$ are between 1 and $n$ and $x,y,z$ are between 1 and $n+1$. This time the calculations are as follows.

$u$ first largest: $\sum_{k=1}^n k^5=S_5(n)$.

$v$ first largest: $\sum_{k=1}^n (k-1)k^4 = S_5(n)-S_4(n)$.

$w$ first largest: $\sum_{k=1}^n (k-1)^2k^3 = S_5(n)-2S_4(n)+S_3(n)$.

$x$ first largest: $\sum_{k=1}^nk^3(k+1)^2=S_5(n)+2S_4(n)+S_3(n)$.

$y$ first largest: $\sum_{k=1}^nk^4(k+1)=S_5(n)+S_4(n)$.

$z$ first largest: $\sum_{k=1}^nk^5=S_5(n)$.

Adding all these up we find that $6S_5(n)+2S_3(n)=n^3(n+1)^3$. From that we get that

$S_5(n)=\frac 16n^3(n+1)^3-\frac 1{12}n^2(n+1)^2$

$=\frac 1{12}n^2(n+1)^2(2n(n+1)-1)=\frac 1{12}n^2(n+1)^2(2n^2+2n-1).$

In general, if we use this method to calculate $1^k+\dots+n^k$ when $k$ is odd, then (as with other methods) we obtain a relationship between $S_k(n)$ and earlier $S_j(n)$. But what is nice about it is that there is a lot of cancellation: all the $S_j(n)$ for $j$ even make a contribution of zero.

Indeed, if $k=2r+1$, then we have $k+1$ sets, and their sizes are $S_k(n)$, $S_k(n)-S_{k-1}(n)$, $S_k(n)-2S_{k-1}(n)+S_{k-2}(n)$, and so on down to

$S_k(n)-\binom r1S_{k-1}(n)+\binom r2S_{k-2}(n)-\dots+(-1)^rS_{k-r}(n)$

and then the same thing but with all minus signs in these linear combinations replaced by plus signs. Adding it all up we get a linear combination of $S_k(n)$, $S_{k-2}(n)$, … , $S_{k-t}(n)$ equalling $n^r(n+1)^r$, where $t=r$ if $r$ is even and $t=r-1$ if $r$ is odd. When $k$ is small, we don’t have to take into account too many $S_j(n)$, so the formulae remain quite nice for a while before they become disgusting.

Note that it is quite easy to work out the coefficients of the various $S_{k-2j}$ in the above linear combination: they are just sums of binomial coefficients. Several other methods require one to solve simultaneous equations, though they are usually in triangular form, so not too bad.

A small remark is that the basic idea of this argument is to discretize a much easier continuous argument that shows that $\int_0^xt^kdt=x^{k+1}/(k+1)$. That argument is to take the $(k+1)$-dimensional cube consisting of all points $(t_1,\dots,t_{k+1})$ such that each $t_i$ belongs to the interval $[0,x]$ and partition it into $k+1$ pieces according to which coordinate is largest. (I call the argument geometrical because these pieces are pyramids with $k$-cube bases.) In the continuous case, we don’t have to worry about what happens if there is more than one largest coordinate, since that set has measure zero. Each piece has measure $\int_0^xt^kdt$, and there are $k$ pieces, while the cube has measure $x^{k+1}$, so we are done.

A second remark is that the method I previously knew of for calculating sums of $k$th powers is to exploit the fact that deserves to be called the natural discrete analogue of the above. Define $n_k$ to be $n(n-1)\dots(n-k+1)$. This we think of as the discrete analogue of $x^k$. Then we do a discrete analogue of differentiating, which is to look at $(n+1)_k-n_k$, which equals $n(n-1)\dots(n-k+2)(n+1-(n-k+1))=kn_{k-1}$. This is the discrete analogue of the fact that the derivative of $x^k$ is $kx^{k-1}$. Next, we use the discrete analogue of the fundamental theorem of calculus, which is the statement that $\sum_{i=m}^{n-1}(f(i+1)-f(i))=f(n)-f(m)$ to deduce that $\sum_{m=1}^n km_{k-1}=n_k$. This gives us for each $k$ a polynomial of degree $k$ that we can sum easily, namely $n_k$, and then to work out the sum of the first $k$th powers, we write it as a linear combination of the $n_k$, sum everything, and simplify the answer. That works fine, but the calculations are quite a bit more complicated than what I did above, and the proof is too algebraic to explain why the answers have the fairly nice forms they do. (For example, why is $n^2(n+1)^2$ a factor of the sum of the first $n$ $k$th powers whenever $k$ is odd and at least 3? From the argument I gave earlier in the post, this follows fairly easily by induction.)

### 20 Responses to “Sums of kth powers”

1. misha Says:

1) Once upon a time David Bernstein explained to me the combinatorial meaning of the formula for the sum of m(m-1)…(m-k+1)/k! where m runs from 1 to n. To pick k+1 numbers out of the first n, we pick the biggest one in our future sample first, and then pick k numbers from the ones that are smaller.

2) From the finite difference considerations it is clear that the formula for the sum of the k-th powers is a polinomial of degree k+1 in n. Actually we can just guess that from examples. Then we can figure out the coefficients of this polynomial by fitting the small n data, and then finish the proof by induction.

2. misha Says:

An approach similar to yours is explained in sections 3.1-3.4 of “Mathematical Discovery” by George Polya, where the whole capter 3 is called “Recursion.”

3. mathtuition88 Says:

Reblogged this on Singapore Maths Tuition and commented:
Very nice proof for the sum of kth powers. Recommended for reading!

4. hhanche Says:

I’ll look at this proof later, but I have a somewhat off topic comment: I find the TeX’d formulas really hard to read, as the font is so much thinner than the surrounding text font. I am old enough to have developed some cataracts, and reading the mathematics here takes real effort. Could you consider either moving to MathJax, or else configuring your setup to use a fatter font? I imagine either stix or Knuth’s concrete fonts would be an improvement. But MathJax, or some clever solution uses MathML is best, for then the formula will scale along when I scale up the text.

• Bruce Smith Says:

I am using the latest Firefox on a Mac; for me, the math does scale properly along with the text, when I make the text larger using cmd-+.

5. Kevin Carlson Says:

This is nice! Seems in the $k=1$ case that you want the subdivision to be $x\geq y$ and $x, which the current version has reversed.

Thanks — I’ve corrected that now.

6. Martin Cohen Says:

Looks like inclusion-exclusion.

Not sure if that’s significant or obvious.

• gowers Says:

It is significant, in that it can be used to give a slightly different method of proof. I’ll illustrate it with summing squares. This time, let’s take the discrete cube consisting of all points $(x,y,z)$ with every coordinate between 1 and $n$, and let’s simply count points with $x$ maximal, points with $y$ maximal and points with $z$ maximal. I’ll call these sets $A, B$ and $C$.

Then $A,B$ and $C$ each have size $1^2+2^2+\dots+n^2$. The intersection of any two of them has size $1+2+\dots+n$. And finally the intersection of all three of them has size $1+1+\dots+1$. So by the inclusion-exclusion formula we get

$n^3 = 3S_2(n)-3S_1(n)+S_0(n).$

That gives us that $6S_2(n)=2n^3+3n(n+1)-2n$, which equals $n(2n^2+3n+1)=n(n+1)(2n+1)$.

In general, if we do things this way, we get that

$n^k=kS_{k-1}(n)-\binom k2S_{k-2}(n)+\dots+(-1)^{k-1}S_0(n)$

For example, $n^2=2S_1(n)-n$, so $S_1(n)=n(n+1)/2$.

This formula is perhaps simpler conceptually than the one in the post, but requires more calculation because you have to calculate all the earlier $S_j(n)$.

7. Simon Tatham Says:

A really quick approach to summing $k$th powers, which I think is different from both your method here and the $n_k$ method you mention at the end (which was previously the only one I knew too) was pointed out to me in 2008 by Gareth Taylor, which is to observe that (for example)

$\sum (i+1)^3 = \sum i^3 + 3\sum i^2 + 3\sum i + \sum 1$

and since most of the LHS sum of $(i+1)^3$ cancels against most of the RHS sum of $i^3$, we can rearrange trivially to get a formula for the next term down, namely $\sum i^2$:

$3\sum_{i=1}^{n} i^2 = (n+1)^3 - 1 - 3\sum i - \sum 1$

This approach generalises to give a formula for each $S_k$ in terms of all the previous ones:

$S_0 = {1\over 1} \left( (n+1) - 1 \right) = n$

$S_1 = {1\over 2} \left( (n+1)^{2} - 1 - S_0 \right) = {n(n+1)\over2}$

$S_2 = {1\over 3} \left( (n+1)^{3} - 1 - S_0 - 3S_1 \right) = {n(n+1)(2n+1)\over6}$

$S_3 = {1\over 4} \left( (n+1)^{4} - 1 - S_0 - 4S_1 - 6S_2 \right) = {n^2(n+1)^2\over4}$

$S_4 = {1\over 5} \left( (n+1)^{5} - 1 - S_0 - 5S_1 - 10S_2 - 10S_3 \right) = {n(n+1)(2n+1)(3n^2+3n-1)\over30}$

and so on in an obvious pattern.

• gowers Says:

That is indeed very easy conceptually, though the $n^r(n+1)^r$ trick in the approach in the post leads to cancellation that means that it requires less calculation. Also, with this approach it looks like a miracle that $S_3$ ends up simplifying down to $n^2(n+1)^2/4$, rather than being a very natural consequence of the argument.

8. Sums of powers of k: a more elegant method | Beyond Solutions Says:

[…] Last week, I put up a post on deriving a formula for for positive integer , using the method of telescoping sums. Interestingly enough, Tim Gowers, a Fields Medallist, just put up a post outlining a very elegant outline of how to derive these formulas. (See his post here.) […]

9. Vincent Says:

Very nice! Tangentially related: do you happen to know what (if any) is the next term in the sequence of equations:

$$3^2 + 4^2 = 5^2$$
$$3^3 + 3^4 + 3^5 = 3^6$$

?

• Vincent Says:

(Also: how do I make part of my replies into Latex if it is apparently not by typing ‘$$’?) • Vincent Says: O I misspelled, I meant:$$3^2 + 4^2 = 5^23^3 + 4^3 + 5^3 = 6^3
which is not only more related to the post than what I wrote, but also more true.

• gowers Says:

I didn’t know that $3^3+4^3+5^3=6^3$. (To get the latex to appear one writes @latex 3^3+4^3+5^3=6^3@ except that instead of the @ signs one puts \$ signs.)

I presume the obvious pattern doesn’t continue. In other words, I presume that it is not true that $3^4+4^4+5^4+6^4=7^4$. Indeed, it can’t hold because the left-hand side is even.

One could also ask the much more general question of whether there are other examples of identities of the form

$a^k+(a+1)^k+\dots+b^k=(b+1)^k.$

For that to hold, we need $(b-1)^k+b^k\leq (b+1)^k$, which means that $b$ must be small enough that the powers are still increasing pretty fast. So we need $(b+1)^k$ to be of a similar order of magnitude to $\sum_{a=1}^ba^k$. This order of magnitude should be around $b^{k+1}/k$, so that needs to be around $(b+1)^k$. A back-of-envelope calculation then tells us that $b$ will have to be of similar magnitude to $k$. But whether there are any more solutions I don’t know: I wouldn’t expect an infinite family of solutions.

In fact, I think there could be a heuristic argument that suggests that there are only finitely many solutions. Roughly speaking, for each $k$ there will be only a small number of $b$s that could possibly work. If you now test each one, it will have a probability of … this is the bit I’m not quite sure about, but I’m pretty confident that the answer is small … something like $k^{-ck}$ of actually working. Summing over all $k$ gives a finite result.

10. Anonymous Says:

Very nice argument. btw there is a typo right below “and so on down to” where ${2\choose 1}$ should be ${r\choose 1}$

Thanks — I’ve put that right now.

11. Tom Bancroft Says:

Hey Tim how are you? How is the maths going? Tom Bancroft ex-Fineline drummer

12. Mr Math Says:

Very nice article.

13. Lee Says:

My approach for summing kth powers is similar to the $n_k$ approach, but uses binomial coefficients, in Pascal’s triangle. First we need to write n^k in terms of binomial coefficients. Let’s try k=3. I write the values for j^3 on a line and then I look at differences, writing those on a second line. Then I write the second line differences on the third line, etc.

0, 1, 8, 27, 64, 125, …
1, 7, 19, 37, 61, …
6, 12, 18, 24, …
6, 6, 6, 6, …
0, 0, 0, …

From this I read the first column to tell me that j^3 = 0 (j choose 0) + 1 (j choose 1) + 6 (j choose 2) + 6 (j choose 3). Then I observe on Pascal’s triangle that the sum of (j choose i) from j = 0 to n is simply (n+1 choose i+1). Thus S_k(n) = 0 (n+1 choose 1) + 1 (n+1 choose 2) + 6 (n+1 choose 3) + 6 (n+1 choose 4).