Transcript Slide 1

Comparing Market Efficiency with
Traditional and Non-Traditional
Ratings Systems in ATP Tennis
Dr Adrian Schembri
Dr Anthony Bedford
Bradley O’Bree
Natalie Bressanutti
RMIT Sports Statistics Research Group
School of Mathematical and
Geospatial Sciences
RMIT University
Melbourne, Australia
www.rmit.edu.au/sportstats
Aims of the Presentation
• Structure of ATP tennis, rankings, and tournaments;
• Challenges associated with predicting outcomes of tennis matches;
• Utilising the SPARKS and Elo ratings to predict ATP tennis;
• Evaluate changes in market efficiency in tennis over the past eight years.
RMIT University©2011
RMIT Sports Statistics
Background to ATP Tennis
• ATP: Association of Tennis Professionals;
• Consists of 65 individual tournaments each year for men playing at the highest level;
• Additional:
  • 178 tournaments played in the Challenger Tour;
  • 534 tournaments played in Futures tennis.
ATP Tennis Rankings
• “Used to determine qualification for entry and seeding in all tournaments for both singles and doubles”;
• The rankings period is always the past 52 weeks prior to the current week.
ATP Tennis Rankings – Sept 12th, 2011
How Predictive are Tennis Rankings?
Case Study: Australian Hardcourt Titles
January, 1998
Adelaide, Surface – Hardcourt

Player                 Age        ATP Ranking
Lleyton Hewitt (AUS)   16 years   550
Andre Agassi (USA)     27 years   86 (6th in Jan, 1999)
How Predictive are Tennis Rankings?
Case Study: Aircel Chennai Open
January 4 - 10, 2010
Chennai, Surface – Hardcourt
Robby Ginepri v Robin Soderling: 6-4 7-5

Player            Age        Tourn Seed
Robby Ginepri     16 years   Unseeded
Robin Soderling   27 years   1
Challenges Associated with Predicting
Outcomes in ATP Tennis
• Individual sport and therefore natural variation due to individual differences prior to and during a match;
• Constant variations in the quality of different players:
  • Players climbing the rankings;
  • Players dropping in the rankings;
  • Players whose ranking remains stagnant.
• The importance of different tournaments varies for each individual player.
Recent Papers on Predicting ATP Tennis
and Evaluating Market Efficiency
• Forrest and McHale (2007) reviewed the potential for longshot bias in men’s tennis;
• Klaassen and Magnus (2003) developed a probability-based model to evaluate the likelihood of a player winning a match, whilst Easton and Uylangco (2010) extended this to a point-by-point model;
• A range of probability-based models are available online; however, these are typically volatile and reactive to events such as breaks in serve and each set result (e.g., www.strategicgames.com.au).
Aims of the Current Paper
• Evaluate the efficiency of various tennis betting markets over the past eight years;
• Compare the efficiency of these markets with traditional ratings systems such as Elo and a non-traditional ratings system such as SPARKS;
• Identify where inefficiencies in the market lie and the degree to which this has varied over time.
Elo Ratings and
the SPARKS
Model
www.rmit.edu.au/sportstats
Introduction to Ratings Systems
• Typically used to:
  • Monitor the relative ranking of players with other players in the same league;
  • Identify the probability of each team or player winning their next match.
• Have been developed in the context of individual (e.g., chess, tennis) or group-based sports (e.g., AFL football, NBA);
• The initial ratings suggest which player is likely to win, with the difference between their old ratings being used to calculate a new rating after the match is played.
Introduction to SPARKS
• Initially developed by Bedford and Clarke (2000) to provide an alternative to traditional ratings systems;
• Differs from Elo-type ratings systems as SPARKS considers the margin of the result;
• Has recently been utilised to evaluate other characteristics such as the travel effect in tennis (Bedford et al., 2011).
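Introduction to SPARKS
Optimisation process
max f(x) = wins
s.t. three model parameters bounded by [0, 10], [0, 100] and [0, 1000] respectively,
where P1new and P2new are the post-match ratings of players 1 and 2, each computed from that player's pre-match rating (P1old, P2old), the difference in SPARKS scores (S_p1 − S_p2 for player 1, S_p2 − S_p1 for player 2) and the difference in pre-match ratings (P1old − P2old for player 1, P2old − P1old for player 2).
[Parameter symbols and the exact form of the two update equations are not recoverable from the transcript.]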
SPARKS: Case Study
Robin Soderling (SWE) v Ryan Harrison (USA): 6-2 6-4

                     Robin Soderling (SWE)   Ryan Harrison (USA)
Seeding              1                       Qualifier
Pre-Match Rating     2986.3                  978.4
Expected Outcome     20.1                    -20.1
Observed Outcome     Win                     Loss
SPARKS               24                      6
SPARKS Difference    18                      -18
Residuals            -2.1                    2.1
Post-Match Rating    2975.9                  988.8
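The figures in this case study are consistent with a simple update rule, sketched below in Python. The divisor of 100 (rating points per SPARKS point) and the multiplier of 5 are not stated on the slide; they are inferred from the case-study numbers and should be read as assumptions, as should the function name `sparks_update`.

```python
def sparks_update(p1_old, p2_old, s1, s2, scale=100.0, k=5.0):
    """Return (expected, residual, p1_new, p2_new) for one match.

    scale and k are ASSUMED values inferred from the case-study numbers,
    not parameters quoted by the authors.
    """
    expected = (p1_old - p2_old) / scale   # expected SPARKS margin for player 1
    observed = s1 - s2                     # observed SPARKS difference
    residual = observed - expected         # how far the result beat expectation
    p1_new = p1_old + k * residual
    p2_new = p2_old - k * residual         # player 2 gets the mirror-image adjustment
    return expected, residual, p1_new, p2_new


# Soderling (2986.3, SPARKS 24) v Harrison (978.4, SPARKS 6)
expected, residual, p1_new, p2_new = sparks_update(2986.3, 978.4, 24, 6)
print(round(expected, 1), round(residual, 1), round(p1_new, 1), round(p2_new, 1))
```

Under these assumed constants the favourite loses rating points despite winning, because the observed margin (18) fell short of the expected margin (about 20.1).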
Longitudinal Examination of SPARKS
[Figure: chart with axis ticks running from 0 to 4500 and from 13000 to 43000; plotted data not recoverable.]
Limitations of SPARKS: Case Study

Player     Set 1  Set 2  Set 3  Set 4  Calculation  SPARKS (Diff)
Player 1   7      7      7             21 + (3*6)   39 (21)
Player 2   6      6      6             18 + (0*6)   18 (21)
Player 2 competitive in all three sets.

Player     Set 1  Set 2  Set 3  Set 4  Calculation  SPARKS (Diff)
Player 1   6      3      6      6      21 + (3*6)   39 (21)
Player 2   2      6      2      2      12 + (1*6)   18 (21)
Player 2 competitive in 1 out of 4 sets.
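The "Calculation" column implies that a SPARKS score is the number of games won plus 6 for each set won. A minimal Python sketch of that arithmetic follows; the function name `sparks_scores` and the input layout (one `(games_p1, games_p2)` tuple per set) are illustrative choices, not the authors' code.

```python
SET_MULTIPLIER = 6  # set multiplier quoted on the slides


def sparks_scores(set_scores, set_multiplier=SET_MULTIPLIER):
    """Return (sparks_p1, sparks_p2) from a list of (games_p1, games_p2) per set."""
    games1 = sum(g1 for g1, _ in set_scores)
    games2 = sum(g2 for _, g2 in set_scores)
    sets1 = sum(1 for g1, g2 in set_scores if g1 > g2)
    sets2 = sum(1 for g1, g2 in set_scores if g2 > g1)
    return games1 + set_multiplier * sets1, games2 + set_multiplier * sets2


close_match = [(7, 6), (7, 6), (7, 6)]          # three tight sets
one_sided = [(6, 2), (3, 6), (6, 2), (6, 2)]    # three lopsided sets, one dropped

print(sparks_scores(close_match))  # (39, 18)
print(sparks_scores(one_sided))    # (39, 18)
```

Both matches produce the same pair of scores, which is the limitation the slide illustrates: the score difference alone cannot distinguish a tight three-set match from a one-sided four-set match.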
Historical Results of the SPARKS Model
• The following table displays historical results of the raw SPARKS model over the past 8 years.

Year   Win Prediction in all ATP Matches
2003   .64
2004   .64
2005   .69
2006   .67
2007   .66
2008   .67
2009   .69
2010   .72
Banding of Probabilities
• Probability banding is used primarily to determine whether a model's predicted probability of a given result is accurate;
• Enables an assessment of whether the probability attributed to a given result is appropriate based on reviewing all results within the band;
• For example, if 200 matches within a given tennis season are within the .20 to .25 probability band, then between 20% and 25% (or approx. 45 matches) of these matches should be won by the players in question.

Lower Band   Upper Band   Midpoint
0.00         0.05         0.025
0.05         0.10         0.075
0.10         0.15         0.125
0.15         0.20         0.175
0.20         0.25         0.225
0.25         0.30         0.275
0.30         0.35         0.325
0.35         0.40         0.375
0.40         0.45         0.425
0.45         0.50         0.475
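The banding check described above can be sketched in a few lines of Python. This is an illustration only: the function name `band_calibration`, the input layout and the synthetic data are assumptions, and the data are invented to mirror the slide's 200-match example rather than taken from the study.

```python
import math
from collections import defaultdict


def band_calibration(predictions, width=0.05):
    """Group (predicted_probability, won) pairs into bands of the given width.

    Returns {(lower, upper): (n_matches, observed_win_rate)}.
    """
    n_bands = round(1 / width)
    counts = defaultdict(lambda: [0, 0])  # band index -> [matches, wins]
    for prob, won in predictions:
        # Small epsilon guards against float error at band edges (e.g. 0.15 / 0.05).
        idx = min(math.floor(prob / width + 1e-9), n_bands - 1)
        counts[idx][0] += 1
        counts[idx][1] += int(won)
    result = {}
    for idx in sorted(counts):
        n, wins = counts[idx]
        band = (round(idx * width, 10), round((idx + 1) * width, 10))
        result[band] = (n, wins / n)
    return result


# Synthetic example: 200 matches predicted at .225, of which 45 were won.
synthetic = [(0.225, i < 45) for i in range(200)]
print(band_calibration(synthetic))
```

A well-calibrated model would show an observed win rate close to each band's midpoint; large gaps between the two indicate over- or under-estimation in that band.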
Banding and the SPARKS Model

Lower Band   Upper Band   Midpoint     Lower Band   Upper Band   Midpoint
0.00         0.05         0.025        0.50         0.55         0.525
0.05         0.10         0.075        0.55         0.60         0.575
0.10         0.15         0.125        0.60         0.65         0.625
0.15         0.20         0.175        0.65         0.70         0.675
0.20         0.25         0.225        0.70         0.75         0.725
0.25         0.30         0.275        0.75         0.80         0.775
0.30         0.35         0.325        0.80         0.85         0.825
0.35         0.40         0.375        0.85         0.90         0.875
0.40         0.45         0.425        0.90         0.95         0.925
0.45         0.50         0.475        0.95         1.00         0.975

Bands from 0.00 to 0.50 (left) represent the underdog.
Bands from 0.50 to 1.00 (right) represent the favourite.
Banding and the SPARKS Model
Banding and the SPARKS Model (2003-2010)
Over-estimates the probability of the favourite winning.
Under-estimates the probability of the underdog winning.
Elo Ratings
www.rmit.edu.au/sportstats
Introduction to Elo Ratings
Elo ratings system developed by Árpád Élő to calculate relative skill levels of chess players:
$R_N = R_O + W(O - E)$
where:
$R_N$ = New rating
$R_O$ = Old rating
$O$ = Observed Score (1 for a win, 0 for a loss, 0.5 for a draw)
$E$ = Expected Score, $E = \frac{1}{1 + 10^{(R_A - R_B)/400}}$
$W$ = Multiplier (16 for masters, 32 for lesser players)
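A minimal Python sketch of the Elo update follows. It writes the exponent as (opponent rating − own rating)/400, which is the standard Elo form and matches the slide's expression if R_A is read as the opponent's rating; that reading is an assumption, since the slide does not define R_A and R_B. The ratings in the example are hypothetical.

```python
def expected_score(rating, opponent_rating):
    """Expected score E of the player rated `rating` (standard Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_update(rating, opponent_rating, observed, w=32):
    """R_N = R_O + W * (O - E); observed is 1 for a win, 0 for a loss."""
    return rating + w * (observed - expected_score(rating, opponent_rating))


# Hypothetical ratings: a 1600-rated favourite loses to a 1400-rated opponent.
favourite, underdog = 1600.0, 1400.0
print(round(expected_score(favourite, underdog), 3))
print(round(elo_update(favourite, underdog, observed=0), 1))
print(round(elo_update(underdog, favourite, observed=1), 1))
```

Because tennis has no draws, O only takes the values 0 and 1 here, and with a common multiplier W the points lost by one player equal the points gained by the other.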
Probability Bands: Elo Ratings
Probability Bands: Elo Ratings (2003-2010)
Probability Bands: Elo Ratings (2003-2006)
High variability in the majority of probability bands during the burn-in period.
Probability Bands: Elo Ratings (2007-2010)
Advantages and Shortcomings of SPARKS
and Elo Ratings
• SPARKS considers the margin of the result, often a difficult task in the context of tennis;
• Elo is only concerned with whether the player wins or loses, not the margin of victory in terms of the number of games or sets won;
• Elo provides a more efficient model in terms of probability banding, suggesting that evaluating the margin of matches may be misleading at times.
Market Efficiency of ATP
Tennis in Recent Years
www.rmit.edu.au/sportstats
ATP Betting Markets Used in the Current
Analysis

Market            Abbreviation
Bet 365           B365
Luxbet            LB
Expekt            EX
Stan James        SJ
Pinnacle Sports   PS
Elo ratings       Elo
SPARKS            SPARKS
Overall Efficiency of Each Market between
2003 and 2010

Market    2003  2004  2005  2006  2007  2008  2009  2010  Overall
B365      .71   .67   .70   .71   .72   .71   .70   .70   .703
LB        .70   .69   .68   .69   .70   .71   .70   .70   .697
PS        .71   .65   .70   .68   .72   .70   .70   .70   .696
SJ        .69   .69   .70   .67   .73   .69   .70   .71   .696
EX        .72   .65   .72   .69   .73   .70   .70   .69   .698
Elo       .59   .62   .66   .65   .70   .66   .68   .67   .654
SPARKS    .63   .64   .69   .67   .66   .60   .69   .72   .667
Overall   .68   .66   .69   .68   .71   .68   .70   .70   .69
Overall Efficiency of Each Market between
2003 and 2010
Heightened stability and efficiency across markets and seasons since 2008.
Converting Market Odds into a Probability
2011 US Open Final

                         Novak Djokovic   Rafael Nadal
Match Odds               $1.63            $2.25
Conversion               1/1.63           1/2.25
Probability of Winning   .61              .44
Accounting for the Over-Round
• The sum of the probability-odds in any given sporting contest typically exceeds 1, allowing the bookmaker to make a profit;
• The amount by which this sum exceeds 1 is referred to as the over-round;
• For example, if the sum of probabilities for a given match is equal to 1.084, the over-round is equal to .084 or 8.4%.
Accounting for the Over-Round
2011 US Open Final
Match score: 6–2 6–4 6–7 6–1

                         Novak Djokovic   Rafael Nadal
Match Odds               $1.63            $2.25
Conversion               1/1.63           1/2.25
Probability of Winning   .61              .44
Sum of Probabilities     1.05
Over-Round               5%
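The conversion from decimal odds to raw probabilities, and the resulting over-round, can be sketched in Python as follows. The function names are illustrative. Note that the slide sums the rounded values (.61 + .44 = 1.05), whereas the unrounded reciprocals give a slightly larger over-round.

```python
def implied_probabilities(decimal_odds):
    """Raw (unadjusted) probability for each outcome: 1 / decimal odds."""
    return [1.0 / odds for odds in decimal_odds]


def over_round(decimal_odds):
    """Amount by which the raw probabilities sum to more than 1."""
    return sum(implied_probabilities(decimal_odds)) - 1.0


odds = [1.63, 2.25]  # Djokovic, Nadal (2011 US Open final, from the slide)
print([round(p, 2) for p in implied_probabilities(odds)])  # [0.61, 0.44]
print(round(over_round(odds), 3))
```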
Comparison of Over-Round Across Markets
Kruskal-Wallis test with follow-up Mann-Whitney U tests: significant difference between all betting markets aside from Pinnacle Sports and Stan James.
Over-Round for Bet 365 (2003-2010)
Accounting for the Over-Round:
Normalised Probabilities and Equal Distribution

                                    Novak Djokovic   Rafael Nadal
Match Odds                          $1.63            $2.25
Raw Probability of Winning          .61              .44
Over-round                          .05              .05
Normalisation                       .61/1.05         .44/1.05
Normalised Probability of Winning   .58              .42
Equal Distribution                  .61 – (.05/2)    .44 – (.05/2)
Equalised Probability of Winning    .585             .415
Accounting for the Over-Round:
Normalised Probabilities and Equal Distribution

                                    Roger Federer    Bernard Tomic
Match Odds                          $1.07            $6.60
Raw Probability of Winning          .93              .15
Over-round                          .08              .08
Normalisation                       .93/1.08         .15/1.08
Normalised Probability of Winning   .86              .14
Equal Distribution                  .93 – (.08/2)    .15 – (.08/2)
Equalised Probability of Winning    .89              .11
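The two corrections for the over-round can be compared side by side with a short Python sketch. The function names are illustrative, and the code works from unrounded reciprocals of the odds, so third-decimal values can differ slightly from the slides, which round the raw probabilities first.

```python
def normalise(raw_probs):
    """Divide each raw probability by the sum, so the total is exactly 1."""
    total = sum(raw_probs)
    return [p / total for p in raw_probs]


def equal_distribution(raw_probs):
    """Subtract an equal share of the over-round from each raw probability."""
    share = (sum(raw_probs) - 1.0) / len(raw_probs)
    return [p - share for p in raw_probs]


matches = {
    "Djokovic v Nadal": (1.63, 2.25),
    "Federer v Tomic": (1.07, 6.60),
}
for name, odds in matches.items():
    raw = [1.0 / o for o in odds]
    print(name,
          [round(p, 2) for p in normalise(raw)],
          [round(p, 2) for p in equal_distribution(raw)])
```

The methods diverge most when one player is a heavy favourite: normalisation removes most of the over-round from the favourite, while equal distribution takes the same absolute amount from both players, which weighs proportionally more on the longshot.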
Market Efficiency in ATP Tennis
SPARKS significantly less efficient when compared with the betting markets for all bands aside from .50 - .55.
Market Efficiency in ATP Tennis - Raw
General inefficiency across bands, likely due to no correction for the over-round.
Market Efficiency in ATP Tennis - Normalised
Market Efficiency in ATP Tennis – Equal Diff
Relative consistency in efficiency and variability within each band across markets.
Market Efficiency in ATP Tennis – Equal Diff
Evidence of longshot bias for the .25 to .30 band.
Market Efficiency in ATP Tennis: Bet365
Longitudinal Changes in Market Efficiency
Homogeneity of variance tests revealed significantly less variability across markets in recent years.
Few significant differences emerged when comparing efficiency across the bands over the past 8 years.
Most Efficient Year: 2007
Least Efficient Year: 2004
Discussion of Findings
www.rmit.edu.au/sportstats
Psychological Player Considerations
• Form of an individual player will affect the context and potential outcome of the entire match, as opposed to a team-based sport where individual players have less impact or can be substituted off if out of form.
• Micro-events within a match, at times, have an impact on the outcome of the match. Examples:
  • Rain delays
  • Injury time-outs
  • Code violations
Shortcomings of the Current Analysis
• A set multiplier of ‘6’ was used for the SPARKS model based on the original SPARKS model published in 2000;
• Only a limited number of betting markets were incorporated, and therefore it was not possible to incorporate Betfair data into the analysis;
• Differences in market efficiency and inefficiency were not evaluated at the surface level. This would be particularly interesting if evaluated for clay, given the volatility of player performance on clay when compared with other surfaces.
Future Work
• Optimise the set multiplier of the SPARKS model;
• Develop a model that combines SPARKS and Elo ratings;
• Extend the current findings to incorporate women’s tennis given that evidence has shown greater volatility in the women’s game.
• Incorporate data on other potential predictors of tennis outcomes. Examples include:
  • The set sequence of the match
  • Surface
  • Importance of the tournament (e.g., Grand Slams)
Conclusions
• Whilst considerable variability was evident during the 2003 – 2007 seasons, there has been an increase in consistency across markets since 2008.
• Following a lengthy burn-in period of four years, the Elo model outperformed SPARKS and most betting markets across the majority of probability bands;
• Whilst not efficient in terms of probability banding, the SPARKS model was able to predict an equivalent proportion of winners to the betting markets, and outperformed some markets in recent years;
• A model that combines both Elo and SPARKS may yield the most efficient model.
Questions and Comments