A mathematician watches tennis II

This has been a year to remember for anybody whose interest in tennis is more that of a nerd than that of a tennis player (which, given the uselessness of my serve, very much applies to me), in that it has given us two records that may well never be beaten. First we have Roger Federer’s record of 23 consecutive Grand Slam semi-finals (set at the Australian Open, and finally fixed at 23 when he lost in the quarter-finals at Roland Garros), and now, something I’ve been hoping for all my life: a seemingly endless match. At the time of writing, John Isner and Nicolas Mahut are waiting to resume a match that has gone into a third day. They will do so later today, with the score standing at 59-59 in the final set. This doesn’t just beat previous records — it utterly smashes them. This set is way more than twice as long as the previous longest set in a Grand Slam, it alone is far longer than the previous longest ever full match in professional tennis, both players have served far more aces in a single match (95 for Mahut, 98 for Isner) than anybody before, and so on. And if you also take account of the fact that the previous two sets had to be settled by tie-breaks, with no breaks of serve in either, then we have had 142 games in a row with no breaks of serve. (I can’t remember when the break occurred in the second set, but even this number 142 can probably be improved slightly.) [Update. The match is now over, with Isner winning 70-68, so the eventual number of consecutive unbroken service games was 137 in the final set, 161 if you include the previous two sets, and a few more still, I think, if you include the last few games of the second set. The number of aces for both players ended up well into triple figures.]

Isner said, with some justification, that nothing like this will ever happen again. But with how much justification? As ever, to answer this question involves choosing some kind of probabilistic model, and it is far from obvious how to choose an appropriate one. But it is possible to get some feel for the probabilities by looking at a crude model, while being fully aware that it is not realistic.

So let’s begin by assuming that two players are playing a single game of tennis, and that the returner wins each point with probability q=1-p, with the outcomes of the points being independent. (This independence assumption is not usually a good one, but watching the Isner-Mahut match I got the feeling that it was better than usual. Both players somehow retained their composure throughout and just played their normal games, which happened to involve remarkably good serving.)

A quick way to work out the probability that the returner wins (that is, the probability of a break of serve) is to consider what happens after the first six points. (If the game is over before the end of the sixth point, we can let the players play up to two more meaningless points and the probabilities are unaffected.) If either player has won at least four of those six points, then that player wins the game, and otherwise the score is three points all, which is deuce. So if Q is the probability that the returner wins from deuce, then the probability that the returner wins the game works out, by a simple application of the binomial distribution, to be

q^6 + 6q^5p + 15q^4p^2 + 20q^3p^3Q.

As for Q, it satisfies the equation

Q=q^2+2pqQ,

since if the returner wins the next two points then he wins (I say “he” because I’m talking about a men’s match here), if each player wins one point then it’s deuce again, and otherwise the returner loses. Thus, Q=q^2/(1-2pq).
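The two formulas above can be turned into a few lines of code. What follows is only a sketch of the crude model (independent points, fixed q); the function names are made up for illustration, and exact fractions are used so that the sanity checks involve no rounding.

```python
from fractions import Fraction


def deuce_break_probability(q):
    """Q = q^2 / (1 - 2pq): probability that the returner wins from deuce."""
    p = 1 - q
    return q**2 / (1 - 2 * p * q)


def break_probability(q):
    """Probability that the returner wins the game (six-point argument)."""
    p = 1 - q
    Q = deuce_break_probability(q)
    return (q**6                     # returner wins all six points
            + 6 * q**5 * p           # returner wins five of six
            + 15 * q**4 * p**2       # returner wins four of six
            + 20 * q**3 * p**3 * Q)  # three each (deuce), then returner wins


# Sanity checks: with evenly matched points the game is a coin toss,
# and a returner who never wins a point never breaks.
print(break_probability(Fraction(1, 2)))  # 1/2
print(break_probability(Fraction(0)))     # 0
```

The symmetric case q=1/2 is a useful check: the formula should return exactly 1/2, and it does.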

I did a quick back-of-envelope calculation to see what would happen if q=1/4 — that is, if the probability of winning a point on the other person’s serve is 1/4. If I didn’t make a mistake, the probability of winning a game worked out to be something like 1/16. Let’s take that as the probability. Then the probability of a run of 142 games without a break of serve is (15/16)^{142}, which should be around e^{-142/16}. But 142/16 is about 9, and e^{-9} is around 0.0001234 (what were the chances that it would turn out to be such a nice number?), which is about 1 in 8000.
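As a check on the back-of-envelope arithmetic, here is a small sketch that evaluates the formula at q=1/4 and compares three versions of the run probability: (15/16)^142, the approximation e^{-142/16}, and the figure obtained from the exact break probability. The variable names are illustrative only. Under the model as stated, the exact break probability at q=1/4 is 13/256, which is about 1/20 rather than 1/16, so the 1-in-8000 figure should be read as an order-of-magnitude estimate; the exact value gives something nearer 1 in 1,600.

```python
import math
from fractions import Fraction


def break_probability(q):
    """Returner's chance of winning a game when he wins each point w.p. q."""
    p = 1 - q
    Q = q**2 / (1 - 2 * p * q)  # returner wins from deuce
    return q**6 + 6 * q**5 * p + 15 * q**4 * p**2 + 20 * q**3 * p**3 * Q


GAMES = 142  # unbroken service games at the time of writing

exact_break = break_probability(Fraction(1, 4))  # exact value under the model
rough_break = Fraction(1, 16)                    # rough figure used in the text

run_rough = float(1 - rough_break) ** GAMES      # (15/16)^142
run_approx = math.exp(-GAMES / 16)               # e^{-142/16}
run_exact = float(1 - exact_break) ** GAMES      # (243/256)^142

print("exact break probability:", exact_break)   # 13/256
print("(15/16)^142      ~ 1 in", round(1 / run_rough))
print("e^(-142/16)      ~ 1 in", round(1 / run_approx))
print("with exact break ~ 1 in", round(1 / run_exact))
```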

Given that each Grand Slam tournament involves 127 matches, so that in each year there are 508 Grand Slam matches, this seems to suggest that even in Grand Slams we should expect an event like this about every 16 years. But it suggests nothing of the kind. In almost all matches, this probabilistic model is ludicrously wrong: if, for example, one player is significantly better than the other, as is very often the case, or the match ebbs and flows, as is even more often the case, then it is certainly not the case that, game in and game out, the probability of the server winning the point is 3/4. This model is non-ridiculous only under very special circumstances: both players need to be excellent and very consistent servers, and they need to be evenly matched (on the day at least). And this state of affairs needs to last. And even then, the chances that there will be an extraordinarily long run of games without a break of serve are only 1/8000 (though that probability is quite sensitive to my choice of q, and perhaps in the Isner-Mahut match it was more like 1/5 or 1/6, since there certainly seemed to be a lot of love games, in which case the likelihood would go up substantially).
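To get a feel for how sensitive the estimate is to q, the sketch below tabulates the break probability and the chance of 142 consecutive holds for q = 1/4, 1/5 and 1/6. It uses the same crude model (independent points, the same q for both players in every game), and the names are illustrative. The run probability rises steeply as q falls.

```python
def break_probability(q):
    """Returner's chance of winning a game when he wins each point w.p. q."""
    p = 1 - q
    Q = q**2 / (1 - 2 * p * q)  # returner wins from deuce
    return q**6 + 6 * q**5 * p + 15 * q**4 * p**2 + 20 * q**3 * p**3 * Q


def run_probability(q, games=142):
    """Chance that `games` consecutive service games all go with serve."""
    return (1 - break_probability(q)) ** games


# Tabulate the three values of q mentioned in the text.
results = {}
for denom in (4, 5, 6):
    q = 1 / denom
    results[denom] = run_probability(q)
    print(f"q = 1/{denom}: break = {break_probability(q):.4f}, "
          f"run of 142 = {results[denom]:.5f}")
```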

Something like this seems to have happened with Isner and Mahut. One of the remarkable things about the match was the quality of the actual tennis. Particularly remarkable was that neither player went through phases of losing rhythm and being unable to get their first serves in. (There were perhaps mini-phases, but there was no sign that they could not be explained by pure chance.) And since both players were, obviously, pretty tired, the server’s advantage was presumably increased as the match wore on.

This last factor perhaps explains why the previous records were not just beaten but smashed, which would otherwise be rather mysterious. If both players got into a groove on their serves, and were too tired to cope with each other’s serves, then the value of q would have gone down. But the match had to go on for quite a long time for this to happen. So perhaps the conditional probability of a long run given that there has been a long run up to now is higher than the probability of a long run starting from scratch. (In probability jargon, the process is not memoryless.)
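The point about the process not being memoryless can be illustrated with a toy calculation. Everything in the fatigue schedule below is an assumption made purely for illustration: q is taken to fall linearly from 0.25 to 0.17 over the first 100 games and then stay put. The sketch compares the chance that the next 20 games are all holds at the start of the run with the same chance 100 games in.

```python
def break_probability(q):
    """Returner's chance of winning a game when he wins each point w.p. q."""
    p = 1 - q
    Q = q**2 / (1 - 2 * p * q)  # returner wins from deuce
    return q**6 + 6 * q**5 * p + 15 * q**4 * p**2 + 20 * q**3 * p**3 * Q


def q_at_game(k, q_start=0.25, q_end=0.17, ramp_games=100):
    """Hypothetical fatigue schedule: q declines linearly, then levels off."""
    frac = min(k / ramp_games, 1.0)
    return q_start + (q_end - q_start) * frac


def prob_next_unbroken(start, length):
    """Chance that games start, ..., start+length-1 are all holds."""
    prob = 1.0
    for k in range(start, start + length):
        prob *= 1 - break_probability(q_at_game(k))
    return prob


fresh = prob_next_unbroken(0, 20)    # 20 more holds from the very start
late = prob_next_unbroken(100, 20)   # 20 more holds after 100 games
print(f"from scratch: {fresh:.3f}, after 100 games: {late:.3f}")
```

With a constant q the two numbers would be equal; under the assumed schedule the later one is larger, which is the sense in which a long run makes a still longer run more likely.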

After all that, is Isner right to say that this will never happen again? To answer that we would have to guess how often two very good and evenly matched servers are pitted against each other. But let’s suppose that that happens in say 50 of the 508 Grand Slam matches (it would be 100 if we added in women’s matches, but the server’s advantage in women’s tennis is much less, so I think one can confidently say that this will never happen in a women’s match). And let’s suppose that 20 of those matches go to five sets (which is probably quite a generous estimate). And finally, let’s guess that the probability of a run of 118 games in the final set without a break of serve is somewhat higher than 1/8000, say 1/4000. Then we would expect a match like the Isner-Mahut match about once every 200 years.
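The final estimate is just two guesses multiplied together, as the short sketch below spells out. The inputs are the figures guessed in the paragraph above; the last line adds one extra quantity, the chance of seeing at least one such match within that 200-year window, treating years as independent.

```python
from fractions import Fraction

candidate_matches_per_year = 20      # guessed five-set matches between big servers
p_long_run = Fraction(1, 4000)       # guessed chance of a 118-game unbroken final set

expected_per_year = candidate_matches_per_year * p_long_run
years_between = 1 / expected_per_year
print(years_between)  # 200

# Chance of at least one such match somewhere in a 200-year window.
matches_in_window = candidate_matches_per_year * int(years_between)
p_within_window = 1 - float(1 - p_long_run) ** matches_in_window
print(round(p_within_window, 2))
```

So "once every 200 years" is an average: under these guesses there is still roughly a one-in-three chance of waiting longer than that.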

Will tennis still be played in its current form in 200 years’ time? I don’t think that can be anything like guaranteed. So Isner’s bold statement could well be right.

27 Responses to “A mathematician watches tennis II”

  1. Richard Elwes Says:

    Thank you for this. I was also watching the match (with my jaw on the floor) and it occurred to me to try this sort of estimate – I’m delighted to have been saved the effort. The point you make about the server’s advantage increasing as the players tire is very pertinent I think.

    More detailed statistics of the match (so far!) are here. This suggests that q for both players (over the whole match) is a bit less than 1/4, for Isner it’s nearer 1/5. But, as you predicted, it’s decreased over the match. Isner started out at around q=32% in the first set (which admittedly he won with a break of serve), with a total value of q=42/155=27% for sets 1-4, dipping to 53/304=17% in the final set. Mahut has been a lot more consistent (remarkably so) with q=33/129=26% for sets 1-4 combined, dipping to 72/318=23% in set 5. I expect the point would be further reinforced by the breakdown within the final set where they really were dying on their feet. But that data’s not available that I can see.

    Actually (15/16)^42 is nearer 1/10000, highly unlikely certainly, but not as freakish as I had first expected. (If it had come out at many millions to one, you’d have thought there’d have to be a question of match-fixing, albeit of a rather unusual sort. Glad to see my cynicism is not justified.)

  2. Richard Elwes Says:

    Erratum: “(15/16)^142 is nearer 1/10000”

  3. WierdDreams Says:

    Wonderful Analysis of once in a million phenomenon.

  4. Stones Cry Out - If they keep silent… » Things Heard: e124v4 Says:

    […] How about tennis? […]

  5. Greg Martin Says:

    “… the server’s advantage in women’s tennis is much less, so I think one can confidently say that this will never happen in a women’s match”

    There are a whole lot of statements made about things that would “never happen” in women’s sports, from marathon times to dunking basketballs, back to women being interested in sports at all (and way back to women not being able to handle the stress of breathing hard). I think we should be leery of imagining that we’re seeing today the eternal pinnacle of skill level in women’s tennis serving.

    • gowers Says:

      You’re right of course. What I meant was not never as in never ever, but never in a women’s tennis match as women’s tennis matches have been ever since I have watched them, with on average many more breaks of serve than in men’s matches. From that I deduce that the value of q is higher for women than it is for men, and even a small difference in q makes a large difference to the probability of going for 142 games with no break of serve. I wouldn’t completely rule out that in the course of time the value of q for women will become comparable to what it is for men at the moment, though it seems a bit unlikely. (For one thing, Isner and Mahut are both much taller than almost all women — Isner especially — and height confers a considerable advantage to a big server.)

  6. PC Says:

    It will actually be a bit longer, because one of the Grand Slams, the US Open, adopts a 5th set tie break. So there are only 381 matches in a year.

  7. alreadydone Says:

    Longest women’s match: In 1984, Vicki Nelson took 6 hours, 31 minutes to defeat Jean Hepner 6–4, 7–6(11) in a first round match in the 1984 Ginny of Richmond tournament in Richmond, Virginia. The match featured a 29-minute, 643-shot rally, the longest in professional tennis history.
    See: http://en.wikipedia.org/wiki/Longest_tennis_match_records
    It doesn’t take a strong serve or a five-set match to make a long match; a mere two-set match with extremely long rallies can also set a record.

    • gowers Says:

      I certainly don’t dispute that — my comments were about the number of games rather than the time taken. Having said which, any attempt at analysing a 643-shot rally using a crude probabilistic model would lead to such tiny probabilities that it demands an explanation. The only one I can think of is that for some reason both players decided that they would not try to win the rally but just stay in it. So they (I am guessing) kept the ball reasonably central, and deepish but not too deep. As a result, their probabilities of missing would have become much smaller and 643 would have become still extraordinary but no longer inexplicable.

      I’d be curious to know whether the rally ended with an unforced error or a winner. Is there an account of it somewhere?

    • gowers Says:

      Found one here. It ended with a winner.

  8. Biweekly Links – 06-25-2010 « God, Your Book Is Great !! Says:

    […] Mahut – You can check some highlights here . And a mathematical analysis of the game is at A mathematician watches tennis II. Another analysis on the World cup first round results is at Reconstructing World Cup results. I […]

  9. Plutoman Says:

    Great analysis. Here’s mine: http://fixedandfloating.blogspot.com/2010/06/greatest-match-ever-some-analysis.html

    Yep the US Open uses the 5th-set shoot-out, so 381 Grand Slam matches is correct. There’s also the Davis Cup though (no 5th-set tie-breaks in live matches, though the number of live matches in a year isn’t fixed, and I haven’t tried estimating it).
    For some reason I find that two-set women’s match the most fascinating of all. A virtual half-hour rally?! That’s just so far off the charts as to be almost unreal. If that tie-break had gone the other way we’d probably already have had a 9-hour match there.

  10. kristalcantwell Says:

    This reminds me of the book _The Black Swan_. As I recall the idea there was that the distribution of extreme cases is a lot higher if the distribution is not normal. In this case the complicating factor that might allow a longer match would be that the game might change. It has already changed with the introduction of newer racquets. There might be a similar change that gave a greater advantage to the server. If this happened the probability of the record being broken might be higher than the statistics would indicate.

  11. Randall Says:

    An analysis with a bit more emphasis on the particular matchup:

    http://blogs.wsj.com/dailyfix/2010/06/24/isner-fitting-winner-of-marathon-wimbledon-match/

    The most amusing line is “their tired legs couldn’t return serve even at their usually mediocre rates”.

    Gilovich (1985), “The hot hand in basketball: on the misperception of random sequences” (PDF)

    I haven’t read the above article (and there might be far better examples), but I think its conclusions are basically valid across a wide spectrum of sport. If so, your concern that “this state of affairs needs to last” isn’t so much of a concern. In tennis, where fatigue favors the server, the state may actually improve in a monotone fashion.

    • Randall Says:

      Another thing the “naive model” doesn’t account for (besides the obvious, i.e. random selection of two player tendency profiles and a surface, all of which will make a long match much more likely than the naive model would predict) is the fact that one of the courts is more to the advantage of the server than the other (I believe it is the ad court that favors a left handed server, the deuce court that favors a right handed server). Not sure what the numbers are, but any difference in the percentages on the two courts would again make breaking serve harder.

  12. Remo Says:

    http://answers.yahoo.com/question/index;_ylt=AuCWixxAtj15Xi61VeJBLJHty6IX;_ylv=3?qid=20100624094437AAGGcJK&show=7#profile-info-d3339228e63a53e480f5bb320ea841cfaa

    Another model

  13. Travels in a Mathematical World Says:

    Carnival of Mathematics #67…

    …while Tim Gowers is musing on a year of tennis in A mathematician watches tennis II….

  14. mostlymath Says:

    Very interesting read. I must admit I didn’t watch the match from quite such a mathematical perspective, but it’s a fascinating take on it. I’ll have to watch with a keener mind in future.

  15. Gaurav Says:

    Dear Tim, Can I just say how grateful I am that someone with your accomplishments is willing to share your thoughts with people of only elementary mathematical knowledge through this blog and your wonderful red book.

  16. Mens Accessories Says:

    tim,

    thank you for the share – made me reminisce about how great maths class in uni was.

    Keep up the great work.

  17. sam Says:

    Top 10 latest ATP Tennis Rankings

    1. Rafael Nadal,Spain,10,925 points

    2. Roger Federer, Switzerland, 7,215

    3. Novak Djokovic, Serbia, 7,085

    4. Andy Murray, Britain, 5,305

    5. Robin Soderling, Sweden, 4,830

    6. Nikolay Davydenko, Russia, 4,195

    7. Tomas Berdych, Czech Republic, 3,950

    8. Fernando Verdasco, Spain, 3,430

    9. Juan Martin del Potro, Argentina, 3,170

    10. Jo-Wilfried Tsonga, France, 3,095
    (As on August 16, 2010)

  18. A newbie Says:

    A stupid question but I must ask it. Professor, how did you get the mathematical notation into your blog? What program are you using please? I know this must be obvious to all the mathematicians out there but I don’t know the answer so I had to ask.

  19. SuperTramP Says:

    very nice analysis. The probability actually says nothing about the real world though 🙂 But loved the way you approached it nevertheless.

  20. Mike Smith Says:

    I’m wondering if there is a way to compare one player’s service winning probability against his opponent’s returning percentage? E.g., if player A wins 75% of first serves, but player B wins 50% of first serve returns.

  21. Novak Djokovic Says:

    Tennis is a weird ass sport, but we all seem to love to watch it. Including scientists!

  22. How Many Extended Play Sets Will Be Required to Decide a Match? | JD's Tennis Scoring System Says:

    […] ended with a fifth set score of 70-68.) According to one mathematician a match of this length is a once-in-200 year occurrence  so it makes a useful worst case […]
