A mathematician watches tennis

Because the French Open and Wimbledon have been available on the BBC website I’ve been watching a lot of tennis recently. And as I do so I can’t help thinking about whether mathematics has anything to say about the tactics that the players should adopt in various situations. And the more I think (or rather, idly muse) about this question, the more it becomes clear that the modelling problem it presents is a pretty hard one. Most of this post will be a discussion of questions rather than a serious attempt to supply answers.

Just to make the discussion more concrete, here are a couple of more specific questions, which I’ll come back to later. The first one is fairly simple.

1. It is generally held to be a slight advantage to serve first in a set. The reasoning goes like this. Let’s suppose (for simplicity) that the game goes with serve till 4-4. If you are serving first, then you will be in a very dangerous position if your serve is broken, since you will then have to break back immediately or lose the set. However, at least you won’t have lost. By contrast, if you are serving second and the score is 4-5, then you can’t afford to be broken — if you are broken then you lose the set and do not get even a small chance to redeem yourself. And if you have just broken your opponent so that it’s 5-4, then you still have the task of serving for the set.

However, a simple model would suggest that this reasoning is flawed. If you have a probability p of winning a game on your serve and a probability q of winning it on your opponent’s serve, then over the next two games you have a probability pq of winning both, p(1-q)+q(1-p) of winning one, and (1-p)(1-q) of losing both, and the order the games are played in makes no difference.
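The order-independence claim can be checked numerically for a whole set rather than just two games. What follows is a minimal sketch, assuming that every service game is independent, that p is my probability of holding serve and q my probability of breaking, and that the set is an advantage set (no tie-break, so from 5-5 play continues until someone is two games ahead). The function name `set_win_prob` and the sample values 0.8 and 0.3 are illustrative assumptions, not data.

```python
from functools import lru_cache


def set_win_prob(p, q, serving_first):
    """P(I win an advantage set) when games are independent.

    p: probability that I win a game on my serve (assumed, 0 < p < 1)
    q: probability that I win a game on my opponent's serve (assumed, 0 < q < 1)
    """
    # From 5-5 the next two games contain one service game each, so the set
    # is decided by who wins both games of a pair first; split pairs repeat.
    tied = p * q / (p * q + (1 - p) * (1 - q))

    @lru_cache(maxsize=None)
    def win(mine, theirs):
        if mine == 5 and theirs == 5:
            return tied
        if mine == 6:
            return 1.0
        if theirs == 6:
            return 0.0
        # Games alternate servers; whoever serves first serves the even-numbered
        # games when counting from zero.
        i_serve = ((mine + theirs) % 2 == 0) == serving_first
        w = p if i_serve else q
        return w * win(mine + 1, theirs) + (1 - w) * win(mine, theirs + 1)

    return win(0, 0)


print(set_win_prob(0.8, 0.3, serving_first=True))
print(set_win_prob(0.8, 0.3, serving_first=False))
```

Under these assumptions the two printed values agree up to rounding, which is the model's way of saying that any real advantage in serving first has to come from something the model leaves out.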

I’ll come back to the (not particularly interesting) solution of this “paradox” in a moment, but before that let me turn to a second and rather general piece of conventional tennis wisdom.

2. It seems pretty obvious that different circumstances call for different styles of play. For example, if you are a break up and 40-0 on your serve, then it feels as though you can afford to take a few risks, whereas if you are serving a second serve at match point down, then you should play it a little bit safe, since a double fault would cost you the match (and similarly, if you are going for a winner, you shouldn’t aim too close to the line, and so on).

But a very simple argument suggests that a different piece of tennis wisdom, the one that says, “Forget where you are in the game and just take things point by point” must be correct. After all, the best strategy cannot be anything other than to maximize the probability that you win the point, so if it ever makes sense to serve a reasonably powerful second serve and risk serving a double fault, then it makes sense even if you are match point down.

I have more mathematics-of-tennis questions that I want to mention, but first let’s dispose of these two, by thinking about what the model is that underlies the second argument in each case and what important factors it fails to take into account.

It is a model much loved by setters of questions in elementary probability: on each point, if player A is serving then there is a probability p that A will win and a probability 1-p that B will win, whereas if player B is serving then these probabilities are q and 1-q. Moreover, the outcomes of all the points are independent.

If that were a realistic model, then there would indeed be no advantage in serving first, and one should play the same tactics on every point. However, there are two (at least) very important things that it does not take into account. One is that the probabilities change according to how nervous a player is (and also, it has to be said, appear to ebb and flow during a match for no apparent reason). So for example it might be a good strategy to serve a rather lame second serve when you are match point down, since if you do a normal one then nerves are more likely to cause you to mess it up, and your opponent’s nerves may cause them to mess up their return even if under normal circumstances they could hit a winner off it. Similarly, the pressure of serving at 4-5 down is greater than the pressure of serving at 5-4 up, so perhaps there really is some justification for the belief that serving first is an advantage.

Unfortunately, there doesn’t seem to be an easy way of realistically incorporating players’ psychological states into a model, so let’s set that question aside (unless someone feels like having a go at it). But there is a second consideration that does lend itself to a mathematical analysis, and that is the fact that if you play precisely the same tactic every point, then you will be handing an advantage to your opponent. For instance, you may win the point with probability p if you serve out wide, and with probability q<p if you serve near the T, but if you serve out wide every time, then your opponent can take advantage of this by standing wider, at which point the probability of winning the point will no longer be p but something lower that may well be less than q.

I think (without having checked) that one can deal with the second consideration by defining a strategy to be something that tells you not what shot to play at each stage, but what probability distribution to choose your shot from: e.g., it might tell you to serve out wide 50% of the time, for the T 30% of the time, and straight at the body 20% of the time. And the optimal strategy would be the probability distribution that maximized your overall probability of winning the point. That sounds to me like fairly standard game theory.
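To make the game-theoretic reading concrete, here is a small sketch of the simplest possible case: the server chooses between two serves (wide or near the T) and the receiver chooses which one to anticipate. The payoff numbers are invented for illustration, and the function name `solve_2x2_zero_sum` is hypothetical; the closed-form solution of a 2x2 zero-sum game is standard.

```python
def solve_2x2_zero_sum(m):
    """Solve a 2x2 zero-sum game for the row player (the maximizer).

    m[i][j] is the row player's winning probability when the row player picks
    option i and the column player picks option j.
    Returns (x, value), where x is the probability of playing row 0.
    """
    (a, b), (c, d) = m
    row_floor = (min(a, b), min(c, d))
    maximin = max(row_floor)
    minimax = min(max(a, c), max(b, d))
    if maximin >= minimax:
        # Saddle point: a pure strategy is already optimal.
        return (1.0 if row_floor[0] >= row_floor[1] else 0.0), maximin
    denom = a - b - c + d
    return (d - c) / denom, (a * d - b * c) / denom


# Assumed probabilities that the server wins the point.
# Rows: serve wide, serve near the T. Columns: receiver covers wide, covers T.
SERVE = [[0.55, 0.75],
         [0.70, 0.50]]

x, value = solve_2x2_zero_sum(SERVE)
print(f"serve wide with probability {x:.2f}; server wins {value:.3f} of points")
```

With these made-up numbers, always serving wide guarantees only 0.55 once the receiver adapts, while mixing guarantees more, which is the point of choosing a distribution rather than a shot.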

A third factor I won't discuss here is conservation of energy — not the basic law of physics, but the more homely physical principle that says that if you run around too much then you will get tired later on. That might tell you that if you can just get a shot back by running all the way across the court, but can't do anything with it when you do except gently pat it over the net straight at your opponent, then it might be better not to bother, even if it increases your chances of winning the point from 0 to 1/100.

There's one more question I want to mention, and it's the main one I wondered about. It is how one should model the tactical decisions that are made during a rally. As a warm-up, here is a special case. It's your first serve. Is it better to go all out for an ace, or should you try for a serve that has a slightly higher chance of going in and still leaves you in a very strong position? Under suitable assumptions, that question is pretty easy once one puts some numbers in. Let's suppose that if you go for an ace, then you'll serve one with probability p and you'll serve a fault with probability 1-p. (That ignores the possibility that your opponent might occasionally be able to get your serve back, but let us indeed ignore that.) And let's suppose that if you go for a serve that is merely very hard to return, then you'll succeed with probability q and serve a fault with probability 1-q, and furthermore that if you succeed then your opponent will give you the opportunity to smash, and that your smash will win the point with probability r (which is very close to 1) and will go out with probability 1-r. Then if you go for an ace, you win with probability p, and if you go for a win on your second shot then you win with probability qr.

After that trivial calculation, the question that remains is whether p is likely to be higher or lower than qr, and rather boringly I don't have any idea. By roughly how much do professional tennis players increase their first-serve percentage if they make their serves very slightly slower, or very slightly less close to the lines? And how much can they afford to do that before their serves become easy to return? Obviously it depends a lot on who is playing, so a different question might be whether it is in fact the case that there are professional tennis players who are trying for aces when it is clearly not the best strategy. (Of course, the mixing-it-up principle could complicate this question and lead to the conclusion that one should sometimes try for an ace and sometimes go for a good first serve that leaves one in a strong position in the ensuing rally.)

Another simple question: is it conceivable that it might sometimes be good tactics to attempt first-serve-type serves all the time? For instance, if you are Ivo Karlovic on a good day and your first-serve percentage is 75% (as happened at times during Wimbledon), and if you are winning 95% of the points when your first serve goes in, then your probability of winning the point if you go for first serves every time is 0.95\times 0.75+0.25(0.95\times 0.75), which is 0.95\times(0.75\times 1.25), which is 19/20 times 15/16, which is certainly at least 7/8. That gives your opponent almost no chance of winning a game.
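The arithmetic, and the step from points to games, can be sketched as follows. This assumes that the two serve attempts are independent and that points within a game are independent with the same winning probability; the function names are made up for the example, and the game formula is the standard one for a game played to four points with a two-point margin.

```python
def point_win_two_first_serves(in_prob, win_if_in):
    """P(server wins the point) when both attempts are first-serve-type serves."""
    first = in_prob * win_if_in
    second = (1 - in_prob) * in_prob * win_if_in  # only reached after a fault
    return first + second


def game_win(p):
    """P(server wins the game) if each point is won independently with probability p."""
    q = 1 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q * q)  # win to love, 15 or 30
    reach_deuce = 20 * p**3 * q**3                  # three points each
    win_from_deuce = p * p / (1 - 2 * p * q)
    return before_deuce + reach_deuce * win_from_deuce


point = point_win_two_first_serves(0.75, 0.95)
print(point, game_win(point))
```

Under these assumptions the per-point figure is 0.890625, and the corresponding probability of holding serve comes out above 0.99.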

Maybe I should change that question to a related one: why is it that nobody ever plays that tactic, except perhaps on the occasional point?

Why the first serve should be riskier than the second is a fairly simple question. Let's suppose that with my best strategy, if I have just one serve left then I have a probability r of winning the point. And now suppose that it is in fact my first serve. Let's suppose that I can choose the probability p that my first serve goes in, and that there is some function q(p), which tells me the probability that I will win the point given that my probability-p serve goes in. So for example, if I go for an ace, then p may be fairly low, but q(p)=1, whereas if I go for a safer first serve then p is higher and q(p) correspondingly lower.

If I have just one serve left, then I have a very simple optimization problem: I just need to maximize pq(p), and we are assuming that its maximal value is r. But if I have two serves left, then the quantity I am trying to maximize is pq(p)+(1-p)r. What difference does that make? Well, we are assuming that q(p) is a decreasing function of p. If p is chosen to maximize pq(p), then, at least if q is differentiable, decreasing p by a small \delta will decrease pq(p) by o(\delta) (since the derivative of pq(p) at that p is zero) but will also increase (1-p)r by \delta r. Therefore, the maximum of pq(p)+(1-p)r will occur at a lower value of p, which corresponds to a riskier serve. (I've made various assumptions there that I can't be bothered to make explicit.)
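A quick numerical illustration of this argument, using a made-up choice of q(p): the sketch below assumes q(p) = 1 - p^2/2, which is decreasing and equals 1 for the riskiest serves, and finds both optima by a crude grid search. Nothing about this particular q comes from real tennis data.

```python
def q(p):
    """Hypothetical chance of winning the point given that the serve goes in."""
    return 1 - 0.5 * p * p


GRID = [i / 1000 for i in range(1001)]


def best(objective):
    """Grid point in [0, 1] that maximizes the objective."""
    return max(GRID, key=objective)


# One serve left: maximize p * q(p); its maximum is r.
p_second = best(lambda p: p * q(p))
r = p_second * q(p_second)

# Two serves left: a fault still leaves a second serve worth r.
p_first = best(lambda p: p * q(p) + (1 - p) * r)

print(f"second serve: p = {p_second:.3f}, win probability r = {r:.3f}")
print(f"first serve:  p = {p_first:.3f}")
```

For this q the optimal first serve goes in noticeably less often than the optimal second serve, in line with the derivative argument.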

Finally, here is the general case of the question that intrigued me most. It is to try to find a model that would justify the following piece of reasonable-sounding tennis advice: that if you are in the middle of trading groundstrokes with your opponent, then you should not go directly for a winner, but should instead try to play yourself into a stronger and stronger position until you can hit a winner with less risk. That advice may not be universally applicable, but it does appear to be the case that some players sometimes go for winners when they would have been better advised to be patient. (Equally, if you are playing Roger Federer, you may realize that you have no hope of winning a long rally so you are more or less forced to go for a risky shot that will be a winner if it comes off.)

How does one even think about such questions mathematically? I'm more concerned about that than about the actual answers that a model might yield. One might try something like this. In any given situation, you have a range of shots you can attempt. They have various probabilities of success, and if they do succeed then your opponent has a range of possibilities that depends on the shot you do.

It is natural to simplify the model as follows. In any given situation, you have a function q(p) like the one discussed earlier. That is, you can choose a shot that will go in with probability p, and if it goes in you will win the rally with probability q(p). But hang on, how do we have any idea what q(p) is? Well, by hitting your probability-p shot you give your opponent a choice similar to yours: a function r that says "If I hit a probability-s shot then I will win the rally with probability r(s)." But things are more complicated, because r depends not just on s but also on how risky your shot was — that is, on p.

In other words, your aim is clearly to maximize pq(p), but this can be broken down further as a wish to maximize p(1-\max_s s\,r(p,s)).

We can take this further. Let p_1,p_2,p_3\dots measure the riskiness of the shots and let r(p_1,\dots,p_k) denote my probability of winning if I chose to play a p_1-shot, you chose to play a p_2-shot, and so on. (This is a slight change because it focuses on my probability of winning and saves me having to write "1-".) Then it looks as though my best strategy is to maximize p_1r(p_1), which is the same as maximizing p_1(1-\max_{p_2} p_2(1-r(p_1,p_2)))=p_1\min_{p_2}(q_2+p_2r(p_1,p_2)), where q_2=1-p_2, which is the same as maximizing p_1\min_{p_2}(q_2+p_2\max_{p_3}p_3r(p_1,p_2,p_3)), and so on. Let me just write out the kth version of this expression in full gory detail. My probability of winning appears to be (I haven't checked this carefully) \max_{p_1}p_1\min_{p_2}(q_2+p_2\max_{p_3}p_3\dots\min_{p_k}(q_k+p_kr(p_1,\dots,p_k))) if k is even and a similar expression ending with \max_{p_k}p_kr(p_1,\dots,p_k) if k is odd.

But k isn't bounded, so it seems as though we need to take some kind of limit as k tends to infinity. Does anyone have any idea what sort of thing that limit is? It makes my head spin like the ball on a heavily sliced second serve. And that is just the question of modelling the situation in the first place: actually solving the optimization problem given the function r looks pretty unpleasant. So are there simplifying assumptions that would at least allow one to justify some of the advice given to players about when to attack, when to go for winners, and so on?
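One way to get a feel for the limit is to truncate the recursion at depth k, replace the unexplored tail by an arbitrary guess, and see how much the guess matters. The sketch below does this for a deliberately crude toy model that is not the model above: there is a menu of three hypothetical shots, each with a probability of going in and a "pressure" factor that scales down the opponent's next shot. All names and numbers are assumptions made for illustration. Because every shot misses with probability at least 0.05, the influence of the tail guess shrinks by a factor of at least 0.95 per level.

```python
# Hypothetical shot menu: (probability the shot goes in,
#                          factor applied to the opponent's next shot).
SHOTS = ((0.95, 1.0), (0.75, 0.8), (0.45, 0.3))


def hitter_wins(depth, leaf):
    """P(the player about to hit wins the rally), looking `depth` shots ahead.

    `leaf` is an arbitrary guess for the hitter's winning probability once the
    look-ahead runs out. The rally starts from a neutral ball (factor 1.0).
    """
    eases = {1.0} | {ease for _, ease in SHOTS}
    value = {ease: leaf for ease in eases}  # depth 0: just the guess
    for _ in range(depth):
        # A shot that goes in hands the opponent a position worth
        # value[next_ease] to them, so it is worth 1 - value[next_ease] to me.
        value = {
            ease: max(p_in * ease * (1 - value[next_ease])
                      for p_in, next_ease in SHOTS)
            for ease in eases
        }
    return value[1.0]


for k in (1, 5, 20, 200):
    print(k, round(hitter_wins(k, 0.0), 4), round(hitter_wins(k, 1.0), 4))
```

In this toy setting the two columns approach each other as k grows, so a large but finite k gives essentially the same answer whatever is assumed about the tail. Whether the same happens for a realistic r is exactly the kind of assumption that would need checking.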

One final remark: I do not for one moment think that anything in this post, even when fully developed, would be of the slightest use to a tennis player. I just thought I'd better make that clear.

29 Responses to “A mathematician watches tennis”

  1. roland Says:

    Andy Roddick sometimes adopts your proposed tactic of going for first-serve-type serves only. That fits with your calculation as he is one of the best servers (inferior to Karlovic though). But in real life the tactic is dangerous. Players go through ups and downs and if one misses his first service that alone indicates that one might miss the next first-serve-type too.
    Another reason might be that the strategy does not work in tie-breaks, where slower but more reliable second-serve-type serves are sensible. So for every player it makes sense to get into the rhythm of winning points with slower second serves during the set.

  2. roland Says:

    Forgot to add: good post! Looking forward to seeing the next one!

  3. Theo Says:

    I have only epsilon to add to your very nice post.

    Well, epsilon + epsilon^2. The epsilon^2 part is to point out a sign error (I believe) in the paragraph that starts “If I have just one serve left,”. In particular: increasing p decreases (1-p)r, not increases it, so you want a riskier serve on your first serve, which is how the players actually play, unless I’ve made a sign error.

    The epsilon part is the following, with the caveat that I am no expert in any part of probability, and also haven’t spent more time than it took to read your post to think about any of your questions. But I believe that there is some work by Josh Tenenbaum (a mathematical cognitive psychologist at MIT) on other parts of probability theory, and in one way the flavor is different. In particular, if memory serves, then Tenenbaum has a model of probability requiring a similar limit, but shows that the contributions of large-k probabilities vanish fairly quickly: you can truncate the sequence and replace the tail with an arbitrary probability as long as your arbitrary number was not 1 or 0.

    I should condition the above paragraph more strongly. My knowledge of Tenenbaum’s work comes from one lunch-time casual conversation with his partner, rather than from reading his papers. Moreover, he was looking at a very different question, and it’s likely that his models are sufficiently different that my connection of the two is purely superficial.

    • gowers Says:

      Thanks for pointing out the sign mistake, which I’ve now corrected.

      I was thinking about my question about large k after posting this post and thought that something along the lines of what you say should be correct: that is, that one should be able to get a good model by taking a large but finite k. I think all that would be needed is some assumption like that on any given shot there is at least a 0.0001 probability that it will go out (or if that seems unrealistic, try going for three consecutive shots — if you do a ludicrously safe shot, then your opponent will surely do a shot that leaves you with at least a 1% chance of missing the one after that) so that the probability that a rally lasts for at least k shots tends to zero as k tends to infinity.

  4. Mark Bennet Says:

    Is it possible that first serve = greatest chance of winning point with a defined minimum energy (so there is a constraint on effort, and if the point is won, it is cheaply won)?

    Second serve = greatest chance of winning point at a higher energy cost. Except that the statistics seem to show that second serves lose more often than first ones.

    I have had similar thoughts – why not first serves all the time?

  5. Rajiv Says:

    Moreover, if points are independently decided, there’s a winning streak factor – a player who is able to win a tough point is often seen to steamroll the next point or two or even the entire set.

    On the other hand, winning a freak point can instill a false sense of bravado and the player could make the silliest of unforced errors in the next point.

    In short, it may be useful to learn models that don’t treat points independently.

  6. Harrison Says:

    Have there been statistical analyses done on “streaks” in tennis? I know, for instance, that there don’t appear to be any such things outside of what’s predicted assuming independence.

    Tennis might be different, of course, but we humans are notoriously bad at seeing patterns where there aren’t any. 🙂

    • Harrison Says:

      Sorry, my first sentence there should end with “… in basketball or baseball.”

    • Timothy Chow Says:

      The fascinating book “Anthology of Statistics in Sports” (Albert, Bennett, Cochran) has an article by Jackson and Mosurski, “Heavy Defeats in Tennis: Psychological Momentum or Random Effect?” that discusses precisely this question. The paper shows that there is some evidence of non-independence in tennis, unlike basketball for instance.

  7. roland Says:

    I think your trick to get rid of the 1-‘s is flawed. For example take the formula after “which is the same as maximizing ..” (first occurrence). We minimize the probability that my opponent gets his shot in times the probability that that shot makes me win the point. The outcome should be the most improbable shot that leaves me next to no chance to get the ball back.

    • gowers Says:

      You’re quite right of course. I’ll change it when I get a spare moment. [Now changed — I still haven’t checked it quite as carefully as I should but I think I’m closer this time.]

  8. speedy Says:

    I have read this discussion before in John Haigh’s Taking Chances.

  9. gowers Says:

    I should have a look at that. Another book that probably discusses it is How to take a penalty: the hidden mathematics of sport, by Rob Eastaway and, again, John Haigh.

  10. oz Says:

    1. Interestingly, in tie-breakers the serving order is different. While in the ‘regular’ match the serving alternates: ABABAB .. in a tie-breaker it is: ABBAABBAA .. Why were the rules chosen to be so? The only difference is that in a tie-breaker each ‘mini-game’ has p closer to half than in a game (since in a game you have to win 4 points with a 2-point lead, which sort of ‘pushes’ the single-point p away from half). Suppose both players are of the same quality (so p=q) – then in both a ‘standard set’ setting and in a tie-breaker both players have prob. 1/2 to win. Still, psychologically, I can’t help feeling that the tie-breaker system is ‘more fair’ – maybe since in this system each player has an advantage in terms of the number of games served so far.

    2. Another interesting issue which could be modeled mathematically is when to challenge the calls. Each player has 3 incorrect challenges in each set. Let’s say that you estimate the probability that a call (against you) is incorrect to be alpha. Then a naive strategy is to challenge whenever alpha>=r for some threshold r (let’s say setting r=90%, so you challenge only if you’re 90% sure that the call is wrong). But surely in the last games you should adjust your challenging according to how many challenges you’ve got left, the position in the game etc. (e.g. trailing 0-40 on your opponent’s serve it might be not so productive to challenge even if you’re almost sure you are correct, since you’re probably gonna lose the game anyway)

  11. Elsinor Says:

    A similar fun piece of mathematics is applicable to the game of squash.

    In squash, the winner of a rally is the server for the next point, and only a server can win a point. A game or set is won by the first player to reach 9 points, typically by 2 or more points. However, if the score reaches 8-8 then the receiver can choose whether first to 9 points or first to 10 points wins the game. The receiver should choose “10 points” if their (fixed) probability of winning any rally is larger than 0.38.

  12. Richard Gill Says:

    Gerard Sierksma (operations research, Univ. Groningen) is a sportsman and mathematician who also has a company whose software is used by several big football clubs, the Dutch Olympic committee etc, for tactics and strategy in a number of games. I don’t know if he tackled tennis yet.

  13. Richard Gill Says:

    ps his home page

    http://www.rug.nl/staff/g.sierksma/index

  14. Antonis Says:

    Jan Vecer of Columbia University has been working in this direction:
    http://www.columbia.edu/cu/news/04/01/janVecer.html

  15. Kevin O'Bryant Says:

    I once heard Agassi answer the question “how is it that Federer is able to raise his game so much on the big points?” Agassi’s answer was that good players will notice a weakness in the opponent (maybe foot position correlating with type of serve, or standing too far out when receiving a serve), but won’t exploit it immediately. They’ll save the weakness for a crucial moment in the match (when behind a few points in the second set tiebreak against Roddick, for example).

    Interestingly, this possibility arises only because of the peculiar scoring system of tennis. There wouldn’t be any advantage in soccer or basketball, that I see, to not exploiting any advantage immediately.

    This isn’t really different from the Allied powers not using information (and allowing ships to sink and men to die) in order to conceal the breaking of the Enigma code. That is, use the info often enough to gain advantage, but seldom enough that the opponent can dismiss it as a fluke.

    • gowers Says:

      That’s very interesting, but I’m not sure I agree that the same idea couldn’t be applied to soccer. Suppose, for example, that you are playing a soccer match where your team needs to win and the opposing team needs at least a draw. And just to make the example very extreme, suppose you noticed such a glaring weakness in your opponents’ tactics that you had a guaranteed goal whenever you wanted it, but as soon as you cashed in they would realize their mistake and correct it. Finally, suppose your team was not very good in defence. Then it would make sense not to exploit the loophole but just play conventionally and hope that you were not behind near the end of the game, and then score the guaranteed extra goal right near the end when the opposing team wouldn’t have time to react to it by switching to a more attacking style of play. But like the tennis example, this is exploiting a particular feature of football — that the game ends after 90 minutes. If the aim were to be the first team to get to three goals, or the first to be two goals ahead, with no time limit, then I don’t see any reason not to exploit the advantage immediately.

      Also, a variant of what Agassi talked about could apply in soccer, which is that if you noticed a weakness in one of the opposing team’s defenders, then you might decide not to exploit it unless there was a good chance of going on to score a goal. That is, the equivalent for soccer of playing a big point in tennis would be something like being near to the other team’s goal with not many of their defenders in the way.

    • Mark Bennet Says:

      I once heard Gary Lineker not quite denying that during the first part of a game he would deliberately allow himself to be caught offside in positions which might otherwise be advantageous in order to induce a significant weakness which could later be exploited.

      I guess the same could be done in tennis too.

  16. Can we take games seriously? « Subramanian Ramamoorthy’s Weblog Says:

    […] practical problems under discussion. In this context, I was pleasantly surprised and pleased to see this excellent post by the eminent mathematician Tim Gowers about our difficulty in modelling such games […]

  17. Mariano Beguerisse Says:

    Very nice post!
    It touches the very bottom of my Maths/Tennis-loving heart!!
    Recently, as part of my research on network science, I told a collaborator that I wanted to do some work on tennis and he pointed out a paper for me to read. I think that it could be worth sharing here (I’ve just begun to read it so I don’t know how good it is):

    Monte Carlo Tennis
    Paul K. Newton and Kamran Aslam
    SIAM Review
    Volume 48 , Issue 4 (November 2006)
    Pages: 722 – 742
    http://dx.doi.org/10.1137/050640278

    Greetings to all,

    Mariano.

  18. Steve Lawford Says:

    Jan Magnus (Tilburg) has done a lot of interesting work with his co-authors on the statistics of tennis, including the probability of a player winning the match, conditional on being at a particular stage of the match. See Section 14 of http://center.uvt.nl/staff/magnus/subjects.pdf for links to papers.

  19. Alyssa Grossen Says:

    This post is really interesting!

    For one thing, those who read it will hopefully be encouraged to view sports with a more mathematical eye, such as what probabilities and statistics are involved with certain strategies. Not only can the numerical ideas in this post be applied to tennis, but they can be applied to other sports as well.

    Stephen R. Clarke has also done research regarding mathematics in tennis that I also found interesting. Clarke uses tree diagrams in order to connect the probabilities of a player winning a match to the current score of the match.

    link:

    http://researchbank.swinburne.edu.au/vital/access/manager/Repository/swin:7781

  20. dropship Says:

    such as what probabilities and statistics are involved with certain strategies

  21. Gil Kalai Says:

    There is some literature on optimal mixed strategies for serving in tennis starting with the paper Walker, M. and Wooders, J. “Minimax Play at Wimbledon.” American Economic Review 91 (5): 1521-1538 Dec 2001. Joel Wiles thesis is also on this topic http://econ.duke.edu/uploads/assets/dje/2006_Symp/Wiles.pdf

  22. Mike the Zim Says:

    I’ve always wondered why a serving player with a 40 love lead doesn’t just hit 6 attempts at an ace.

    Playing out two or even three points to try and re-gain an advantage (or point) over your opponent when it’s widely held that the service is already an advantage you don’t have to earn!

    Once a serve is hit and subsequently returned the odds are (by and large) 1 to 1 that either player wins the point but an aggressive ace hunting type of serve keeps the odds for the server – and in my example you have 6 tries to get it right, whereas if you grind out the point you only have 3 tries which is in fact divided by half cuz your opponent is still trying to win!

  23. blogofmaths Says:

    A very interesting post. I’ve often thought about this kind of thing myself, being interested in both mathematics and tennis. It’s interesting to see how far maths can be used to model situations which initially seem too complicated to model accurately.

    I made a similar post about sports in general: https://blogofmaths.wordpress.com/2016/07/07/a-question-of-sport/
