#### Transcript Forecasting in Tennis

```
Forecasting the Winner of a Tennis Match
Franc Klaassen
University of Amsterdam (NL)
Jan R. Magnus
Tilburg University (NL)
TST Congress, London
July 29, 2003
Overview
Forecasting: one aspect of a larger tennis project
Motivation for forecasting
How to compute forecasts during a match?
Forecasting in practice: graph of the 2003 Ladies’ Singles Wimbledon final
Robustness of the graph
Conclusion.
Tennis project
Testing hypotheses (six papers):
  7th game is the most important game in a set: false
  Real champions win the big points: true.
Service strategy (in progress):
  How to choose the strengths of 1st and 2nd services to
  maximize the probability of winning a point?
Rule changes (one paper):
  How to reduce the service dominance? Presented at TST-1.
Forecasting (two papers):
  Forecasting winner while match is in progress: TST-2.
Motivation for forecasting
Forecasting the winner of a tennis match:
Before a match
  Using odds from bookmakers
  Using statistical model, e.g.,
    Boulier and Stekler (1999)
    Clarke and Dyte (2000)
During a match
  Using statistical model
  => focus of our paper.
Why forecasting during match?
TV spectators want information on:
1. Which player leads at this moment?
2. Who is most likely to win the match?
3. How did the match develop up to now (momentum, winning mood)?
TV spectators get info on
Score: gives info on
2 (Likely winner): Partially
  4-6 for Agassi-Hewitt => Hewitt will probably win,
  4-6 for Agassi-Henman => Agassi will still be the favorite
3 (Development up to now): Partially
  5-5 can result after 4-4 (match in balance),
  but also after 5-0 (one player is in a winning mood)
=> Room for improvement regarding 2 and 3.
TV spectators also get info on
Match/set stats (%1st serve in,...): give info on
2 (Likely winner): Not much
3 (Development up to now): Partially
  Comparison of 2nd set with 1st set statistics gives some insight,
  but each statistic is too aggregate to give a clear picture.
Note: summary stats provide detailed info on specific aspects of each player
=> useful, but beyond scope of our paper.
=> Still room for improvement regarding 2 and 3
=> Purpose of current paper.
Idea
Present the probability that a player will win match;
update it as match unfolds (real-time forecasting).
Example: Agassi-Hewitt
At start of match: Agassi wins with prob. 60%
At 4-6: Agassi wins with prob. 30%
At 4-6/0-3: Agassi wins with prob. 20%.
Use graph to visualize the probs. at all points played so far.
How to compute the forecasts during a match?
Suppose: match between players A and B.
Goal: Prob{A wins match} at each point up to now.
This probability depends on 2 inputs (besides score):
  Prob{A wins match} at start of match
  Prob{A wins point on serve}+Prob{B wins point on serve}.
Implementation using our computer program TENNISPROB:
  Choose the two inputs before the match and keep them constant
  Type in the score at each point
=> TENNISPROB gives Prob{A wins match} very quickly.
How to choose the two inputs?
Prob{A wins match} at start of match
  We provide an estimate based on rankings (e.g., 80%),
  but one can easily improve/overrule that estimate if one has specific
  other info (injury problems, specific ability on a surface,...) (e.g., 70%)
  => In the end there is one starting point of the graph (70%).
Prob{A wins point on serve}+Prob{B wins point on serve}
  We provide an estimate based on rankings (e.g., 120%: both players
  win 60% of their points on service)
  No need for adjustment: the graph hardly depends on our choice
  => There is an estimate (120%).
Forecasting in practice: Serena-Venus Williams at Wimbledon 2003
Before the match starts, we choose inputs:
Prob{Serena wins match} = 70%
Prob{Serena wins point on serve}+Prob{Venus wins point on serve} = 116%.
Then the match starts and graph builds up
Note: match has not yet been completed
=> graph does not use info on later points!
[Figure, built up over six slides: Probability S. Williams wins match (vertical axis, 0.0 to 1.0) against Point number (horizontal axis, 0 to 180), with set 1, set 2, and set 3 marked as the match progresses.]
Robustness of the graph
Our choices for the two input probabilities may not be
perfectly correct; is that a problem?
=> Does profile change a lot if one chooses:
  Starting probability:
    60% or 80% instead of 70%?
  Prob{Serena wins point on serve}+Prob{Venus wins point on serve}:
    110% or 120% instead of 116%?
[Figure, two slides: Probability S. Williams wins match (vertical axis, 0.0 to 1.0) against Point number (horizontal axis, 0 to 180), with set 1, set 2, and set 3 marked.]
Conclusion
We have introduced a robust method to
forecast winner of match as match unfolds
  New: existing papers focus on forecasting at start of match, while
  we do it also for matches in progress
  Info on who will win match and on development of match till now
  Single line makes the information visible at a glance
  Graph can be generated instantly
  and for any match (not just at Wimbledon)
=> Graph is useful in addition to score & summary statistics.
Potential application:
present graph during change of ends => TV commentator
can discuss match developments so far (turning points,..)
Future research
So far: two input probs. are kept fixed during match:
updating may improve graph, but value-added is unclear.
Other aspects of tennis project:
  Service strategy
  Development of tennis over time:
    Has return indeed improved?
    In what respects has the women’s game changed?
  Differences between Wimbledon and other tournaments:
    Impact of surfaces: grass, clay, hard court
=> Need more data on grand slam/ATP/WTA tournaments.
[Figure: Probability S. Williams wins match (vertical axis, 0.0 to 1.0) against Point number (horizontal axis, 0 to 180), with set 1, set 2, and set 3 marked.]
```
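
The slides describe the forecast as a function of the score and two fixed inputs: Prob{A wins match} at the start, and the sum Prob{A wins point on serve}+Prob{B wins point on serve}. The Python sketch below shows one way such a forecast could be computed. It is not TENNISPROB, whose internals the slides do not describe. It assumes that points are independent with constant serve probabilities, that every set has a tiebreak at 6-6 (the Wimbledon 2003 final set did not), and that the score is entered at the start of a game. The names `game`, `tiebreak`, `match_prob`, and `calibrate` are hypothetical, and recovering the two serve probabilities from the inputs by bisection is likewise an assumption.

```python
from functools import lru_cache


def game(p):
    """Prob. that the server wins a game when winning each point with prob. p."""
    q = 1 - p
    deuce = p * p / (p * p + q * q)  # prob. of winning from deuce
    return p**4 * (1 + 4 * q + 10 * q * q) + 20 * p**3 * q**3 * deuce


def tiebreak(a, b, a_first=True):
    """Prob. A wins a tiebreak; a, b = P(A wins point) on A's / on B's serve."""
    @lru_cache(maxsize=None)
    def f(x, y):
        if x == 7:
            return 1.0
        if y == 7:
            return 0.0
        if x == y == 6:  # from 6-6 each pair of points has one serve each
            return a * b / (a * b + (1 - a) * (1 - b))
        # Serving order in a tiebreak: 1 point, then 2 points each in turn.
        first_serves = ((x + y + 1) // 2) % 2 == 0
        w = a if first_serves == a_first else b
        return w * f(x + 1, y) + (1 - w) * f(x, y + 1)
    return f(0, 0)


def match_prob(pa, pb, sets=(0, 0), games=(0, 0), a_serves=True, best_of=3):
    """Prob. A wins the match from a score given at the start of a game.

    pa, pb: prob. that A (resp. B) wins a point on own serve (kept constant).
    """
    ga, gb = game(pa), 1 - game(pb)  # P(A wins game) on A's / on B's serve
    need = best_of // 2 + 1

    @lru_cache(maxsize=None)
    def f(sa, sb, i, j, a_srv):
        if sa == need:
            return 1.0
        if sb == need:
            return 0.0
        if (i >= 6 and i - j >= 2) or i == 7:  # A has won the set
            return f(sa + 1, sb, 0, 0, a_srv)
        if (j >= 6 and j - i >= 2) or j == 7:  # B has won the set
            return f(sa, sb + 1, 0, 0, a_srv)
        if i == j == 6:
            w = tiebreak(pa, 1 - pb, a_srv)
        else:
            w = ga if a_srv else gb
        return (w * f(sa, sb, i + 1, j, not a_srv)
                + (1 - w) * f(sa, sb, i, j + 1, not a_srv))

    return f(sets[0], sets[1], games[0], games[1], a_serves)


def calibrate(p_start, serve_sum, best_of=3):
    """Find (pa, pb) with pa + pb = serve_sum and start-of-match prob. p_start.

    Bisection on d = pa - pb; match_prob increases with d.
    """
    lim = 0.99 * min(serve_sum, 2 - serve_sum)  # keeps pa, pb inside (0, 1)
    lo, hi = -lim, lim
    for _ in range(60):
        d = (lo + hi) / 2
        p = match_prob((serve_sum + d) / 2, (serve_sum - d) / 2, best_of=best_of)
        if p < p_start:
            lo = d
        else:
            hi = d
    d = (lo + hi) / 2
    return (serve_sum + d) / 2, (serve_sum - d) / 2


# Inputs from the Serena-Venus slide: 70% and 116%.
pa, pb = calibrate(0.70, 1.16)
print("start of match:", round(match_prob(pa, pb), 3))
# Illustrative scores only, not taken from the actual match.
print("after winning set 1:", round(match_prob(pa, pb, sets=(1, 0)), 3))
print("after losing set 1:", round(match_prob(pa, pb, sets=(0, 1)), 3))
```

Re-running `calibrate` with other inputs (for example 60% or 80%, or a sum of 110% or 120%) and comparing the resulting probabilities is one way to repeat the robustness check from the slides under these assumptions.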