Investigating the efficiency of the Asian handicap football betting market with ratings and Bayesian networks

Uncorrected Author Proof
Journal of Sports Analytics xx (2021) x–xx
DOI 10.3233/JSA-200588
IOS Press
Anthony C. Constantinou (a, b)
(a) Bayesian Artificial Intelligence Research lab, School of Electronic Engineering and Computer Science, Queen Mary University of London (QMUL), London, E1 4NS, UK
(b) The Alan Turing Institute, UK
Abstract. Despite the massive popularity of the Asian Handicap (AH) football (soccer) betting market, its efficiency has not been adequately studied by the relevant literature. This paper combines rating systems with Bayesian networks and presents the first published model specifically developed for prediction and assessment of the efficiency of the AH betting market. The results are based on 13 English Premier League seasons and are compared to the traditional market, where the bets are for win, lose or draw. Different betting situations have been examined including a) both average and maximum (best available) market odds, b) all possible betting decision thresholds between predicted and published odds, c) optimisations for both return-on-investment and profit, and d) simple stake adjustments to investigate how the variance of returns changes when targeting equivalent profit in both traditional and AH markets. While the AH market is found to share the inefficiencies of the traditional market, the findings reveal both interesting differences as well as similarities between the two.
Keywords: Directed acyclic graph, football prediction, graphical models, profitability, rating system, return-on-investment, soccer prediction
1. Introduction
Football prediction models have become immensely popular over the last couple of decades, and this is due to the increasing popularity of football betting. In the academic literature, such models typically focus on predicting the outcome of a match in terms of home win, draw, or away win; known as the 1X2 distribution. Several types of models have been published for this purpose and include rating systems (Leitner et al., 2008; Hvattum & Arntzen, 2010; Constantinou & Fenton, 2013; Wunderlich & Memmert, 2018), statistical methods (Dixon & Coles, 1997; Rue & Salvesen, 2000; Crowder et al., 2002; Goddard, 2005; Angelini & De Angelis, 2017), machine learning techniques (Huang & Chang, 2010; Arabzad et al., 2014; Pena, 2014), knowledge-based systems (Joseph et al., 2006), and hybrid methods that combine any of the above (Constantinou & Fenton, 2017; Constantinou, 2018; Hubacek et al., 2018). In the recent special issue international competition Machine Learning for Soccer, the models that topped the performance table are hybrid and heavily rely on rating systems (Constantinou, 2018; Hubacek et al., 2018).
Corresponding author: Anthony C. Constantinou. E-mail: a.constantinou@qmul.ac.uk.
In Asia, the most popular form of betting (also common in Europe) is the so-called Asian Handicap (AH). This form of betting introduces a hypothetical handicap (i.e., advantage) typically in favour of the weaker team. Specifically, traditional AH introduces a goal deficit to the team more likely to win before kick-off. The manipulation of the match outcome creates interesting situations in which betting is determined by hypothetical, rather than actual, match outcome. Examples of the various types of AH betting are provided in Section 2. This type of betting has also become popular in the UK over the last couple of decades. Football (soccer) syndicates are rumoured to bet millions per week, often on behalf of clients, on AH outcomes offered by bookmakers in Asia (Williams-Grut, 2016). This is because the Asian markets attract higher volumes of bets and offer greater market liquidity. Estimates suggest that over 70% of the betting turnover for football is recorded with Asian bookmakers (Kerr, 2018).
ISSN 2215-020X © 2021 – The authors. Published by IOS Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (CC BY-NC 4.0).
Whereas there is a large literature analysing traditional betting strategies, with researchers investigating how to optimise their betting, there have been only four published papers involving some analysis related to AH. Specifically, Vlastakis et al. (2008) used AH odds as one of their model variables to predict match scores and showed that they are a strong predictor of match outcomes. Grant et al. (2018) used AH odds, in conjunction with 1X2 odds, to analyse arbitrage opportunities and showed that these exist across a large number of fixed-odds and exchange market odds. Hofer and Leitner (2017) described how to derive information from live AH and Under/Over odds in order to maximise expected returns. Finally, and in an effort to educate gamblers, Hassanniakalager and Newall (n.d.) investigated the product risk associated with different football odds and showed that the AH odds would generally generate lower losses compared to other popular types of bet such as the 1X2, Under/Over, and correct score. Remarkably, no previous published work involves a model specifically designed for, and assessed with, AH bets.
The purpose of this paper is to investigate the efficiency of the AH market in relation to the 1X2 market. The 1X2 market has been extensively studied and the literature provides mixed empirical evidence regarding its efficiency, with most evidence pointing towards a weak-form efficient market (Giovanni & De Angelis, 2019). In this paper, the efficiency of both markets is measured in terms of the ability of the model to discover profitable betting opportunities given both average and maximum market odds. The model is based on a modified version of the pi-rating system, a previously published football rating system (Constantinou & Fenton, 2013) that generates ratings reflecting team scoring ability. The ratings are provided as input to a novel hybrid Bayesian Network (BN) model specifically constructed to simulate the influential relationships between possession, shots, and goals, to predict both 1X2 and AH outcomes.
A BN is a type of probabilistic model introduced by Pearl (Pearl, 1985) that consists of nodes and arcs. Nodes represent variables and arcs represent conditional dependencies. A BN that consists of both discrete and continuous variables, such as the one constructed in this study, is called a hybrid BN. Each variable has a corresponding Conditional Probability Distribution (CPD) that captures the magnitude as well as the shape of the relationship between directly linked variables. If we assume that the arcs in the BN represent influential relationships, then such a model can be viewed as a causal graph and represents a unique Directed Acyclic Graph (DAG) that can be used for interventional analysis. Otherwise, the arcs represent conditional dependencies that are not necessarily causal relationships, and such a BN is not a unique DAG but rather a Partial DAG that represents an equivalence class of DAGs. For a quick introduction to BNs, with a focus on football examples, see (Constantinou & Fenton, 2018).
Based on 13 English Premier League seasons and betting simulations under different assumptions, the findings reveal interesting differences as well as similarities between the AH and 1X2 markets. Importantly, the AH market is found to share inefficiencies with the traditional 1X2 market, and this provides opportunities for ‘beating’ the market. The paper is structured as follows: Section 2 describes the rules of the AH betting market, Section 3 describes the model, Section 4 covers the data and the process of model fitting, Section 5 presents the results, and Section 6 provides the concluding remarks.
2. Asian handicap betting rules
In what follows, the decimal odds system is used (also known as European odds) for payoff in the event of winning a bet. The decimal odds represent the total return ratio of the stake; implying that the stake is already included in the decimal number. For example, a payoff of ‘3’ returns three times the stake; i.e., a bet of £1 would return 1 × 3 = £3 (£2 profit). Odds also reflect probability that incorporates the bookmakers’ profit margin. An example from data is the Arsenal versus Crystal Palace match played on 21/04/2019 with average 1X2 market odds {1.54, 4.44, 6.03}, corresponding to probabilities {64.94%, 22.52%, 16.58%}. Summing up these probabilities gives us 104.04%, and the implied average profit margin of 4.04%.
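The arithmetic behind the Arsenal versus Crystal Palace example can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names `implied_probabilities` and `profit_margin` are hypothetical helpers and are not part of the paper.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to the bookmaker's implied probabilities (1 / odds)."""
    return [1.0 / o for o in decimal_odds]


def profit_margin(decimal_odds):
    """Bookmaker margin: the amount by which the implied probabilities exceed 1."""
    return sum(implied_probabilities(decimal_odds)) - 1.0


# Average 1X2 odds for Arsenal v Crystal Palace (21/04/2019), as quoted in the text.
odds_1x2 = [1.54, 4.44, 6.03]
print([round(100 * p, 2) for p in implied_probabilities(odds_1x2)])  # [64.94, 22.52, 16.58]
print(round(100 * profit_margin(odds_1x2), 2))                       # 4.04
```

The printed values match the probabilities and the 4.04% margin stated above.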
The AH betting market operates such that adversaries are handicapped according to their difference in strength. The term handicap means that one team is assigned a hypothetical score (including fractional) advantage before the match is played. All types of AH betting offered by the bookmakers involve two possible outcomes. This means that the AH betting market reduces the possible match outcomes from three (i.e., 1X2) to two. Each binary outcome corresponds to each team winning, with the odds adjusted subject to the given handicap.

Table 1
The whole-goal AH outcome for different hypothetical scores between Arsenal and Crystal Palace
Arsenal goals | Crystal Palace goals | Score difference | Handicap | Settlement score | AH winner
1 | 0 | 1 | –1 | 0 | Void
1 | 1 | 0 | –1 | –1 | Crystal Palace
3 | 1 | 2 | –1 | 1 | Arsenal
4 | 1 | 3 | –1 | 2 | Arsenal
0 | 0 | 0 | –1 | –1 | Crystal Palace
0 | 1 | –1 | –1 | –2 | Crystal Palace
1 | 2 | –1 | –1 | –2 | Crystal Palace

The standard AH betting involves assigning a hypothetical score advantage to the underdog. This represents the most common type of AH betting, and aims to make the contest equal. That is, the handicap applied is the one (footnote 1) that optimises the odds, for both teams to win, towards 2 (or 50% chance of winning) and hence, it maximises the uniformity of the AH payoff distribution. While this represents the most popular type of bet, the AH market offers different types of score advantage, each of which we discuss below, including the possibility to assign a hypothetical score advantage to the favourite rather than the underdog.
The market offers three types of AH betting that need to be represented explicitly in the model, as well as in the betting simulation. These are:
i. Whole-goal handicap: A team is given a whole-goal handicap such as –1 or +1. In this case, the possibility of a draw is eliminated by removing the draw outcome from the equation and normalising the probabilities of the residual two outcomes to sum up to 1. If a handicap draw is observed, the bet is voided (refunded).
Let us revisit the example from data discussed at the beginning of this section, involving Arsenal versus Crystal Palace with average 1X2 market odds {1.54, 4.44, 6.03}. Arsenal was the strong favourite. The bookmakers introduced the handicap of –1, which maximised the uniformity of the AH distribution with odds {1.87, 1.99}. The match ended 2–3; i.e., –1 for Arsenal. The AH winner was Crystal Palace since it won the match by one goal difference, which makes it two goals difference given the handicap; i.e., this made the settlement score, which is the match result after the handicap is considered, equal to –2. Table 1 illustrates how the whole-goal AH is determined based on other hypothetical score lines between Arsenal and Crystal Palace.
Footnote 1: The other handicaps do not share the same market liquidity, implying limited stakes and possibly also higher profit margins for the bookmaker, due to lower competition.
ii. Half-goal handicap: A team is given a half-goal handicap such as –1.5 or +1.5. Assuming a match between X and Y and a handicap of +1.5 (i.e., X receives a 1.5 goal advantage), and that a bet is placed on X, the bet would win as long as X does not lose by more than one goal difference; otherwise the bet is lost. In this case, the possibility of a draw is eliminated by the handicap itself, since it is not possible for the settlement score to be a draw.
An example from data is the Liverpool versus Wolves match played on 12/05/2019 with average 1X2 market odds {1.30, 5.62, 10.17}. Liverpool was the strong favourite. The bookmakers introduced the handicap of –1.5, which maximised the uniformity of the AH distribution with odds {1.91, 1.95}. The match ended 2–0 (i.e., +2) in favour of Liverpool. The AH winner was Liverpool since it won the match by two goals difference; i.e., 0.5 goals more than the handicap. This made the settlement score equal to 0.5. Table 2 illustrates how the half-goal AH is determined based on other hypothetical score lines between Liverpool and Wolves.

Table 2
The half-goal AH outcome for different hypothetical scores between Liverpool and Wolves
Liverpool goals | Wolves goals | Score difference | Handicap | Settlement score | AH winner
1 | 0 | 1 | –1.5 | –0.5 | Wolves
1 | 1 | 0 | –1.5 | –1.5 | Wolves
3 | 1 | 2 | –1.5 | 0.5 | Liverpool
4 | 1 | 3 | –1.5 | 1.5 | Liverpool
0 | 0 | 0 | –1.5 | –1.5 | Wolves
0 | 1 | –1 | –1.5 | –2.5 | Wolves
1 | 2 | –1 | –1.5 | –2.5 | Wolves

iii. Quarter-goal handicap: A team is given a quarter-goal handicap such as –0.25 or +0.25. This type of handicap is, in fact, a combined whole-goal and a half-goal handicap. For example, if we bet £10 on the away team to win given AH –0.25 with odds 2 (i.e., 50%), the stake would be divided between the nearest whole-goal and half-goal handicaps. That is, a £5 bet will be placed on the away team to win given AH ±0 (footnote 2) with odds 2.5 (i.e., 40%) and another £5 bet on the away team to win given AH –0.5 with odds 1.66 (i.e., 60%). Note that the odds for the quarter-goal handicap reflect the average payoff, in terms of probability, of the two nearest handicaps. Since this is a combination of two bets, each bet is executed independently. For example, a score of 0–0 would have resulted in voiding AH ±0 (i.e., £5 are returned) and winning AH –0.5 (i.e., £5 × 1.66 = £8.3 are returned).
An example from data is the Fulham versus Newcastle match played on 12/05/2019 with average 1X2 market odds {2.50, 3.53, 2.78}. Fulham was the weak favourite. The bookmakers introduced the handicap of –0.25, which maximised the uniformity of the AH distribution with odds {2.15, 1.75}. The match ended 0–4 (i.e., –4) in favour of Newcastle. The AH winner was Newcastle, since it won the match by four goals difference; i.e., 4.25 goals more than the handicap. This made the settlement score equal to –4.25. Table 3 illustrates how the quarter-goal AH is determined based on other hypothetical score lines between Fulham and Newcastle.
Footnote 2: A zero-goal AH implies no handicap, but that there must be a match winner; otherwise, the bet is voided.

Table 3
The quarter-goal AH outcome for different hypothetical scores between Fulham and Newcastle
Fulham goals | Newcastle goals | Score difference | Handicap | Settlement score | AH winner
1 | 0 | 1 | –0.25 (0 and –0.5) | 1 and 0.5 | Fulham
1 | 1 | 0 | –0.25 (0 and –0.5) | 0 and –0.5 | Void and Newcastle
3 | 1 | 2 | –0.25 (0 and –0.5) | 2 and 1.5 | Fulham
4 | 1 | 3 | –0.25 (0 and –0.5) | 3 and 2.5 | Fulham
0 | 0 | 0 | –0.25 (0 and –0.5) | 0 and –0.5 | Void and Newcastle
0 | 1 | –1 | –0.25 (0 and –0.5) | –1 and –1.5 | Newcastle
1 | 2 | –1 | –0.25 (0 and –0.5) | –1 and –1.5 | Newcastle

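The three settlement rules described in this section can be sketched in Python as follows. This is a minimal illustration rather than the paper's implementation: the function names are hypothetical, the handicap is assumed to be applied to the home team's score (as in Tables 1 to 3), and each half of a quarter-goal bet is assumed to carry its own odds, as in the £10 example above.

```python
def split_handicap(handicap):
    """Return the component handicaps a bet is settled on.

    Whole- and half-goal handicaps are a single bet. A quarter-goal handicap is
    split between the nearest whole-goal and half-goal handicaps (whole first).
    """
    if (handicap * 4) % 2 == 0:                      # multiple of 0.5
        return [handicap]
    parts = [handicap - 0.25, handicap + 0.25]
    return sorted(parts, key=lambda h: h % 1 != 0)   # whole-goal part first


def settle_parts(home_goals, away_goals, handicap, bet_on):
    """Outcome ("win", "void" or "lose") of each component bet; bet_on is "home" or "away"."""
    outcomes = []
    for h in split_handicap(handicap):
        settlement = (home_goals - away_goals) + h   # settlement score, home perspective
        if settlement == 0:
            outcomes.append("void")                  # handicap draw: stake refunded
        elif (settlement > 0) == (bet_on == "home"):
            outcomes.append("win")
        else:
            outcomes.append("lose")
    return outcomes


def total_return(outcomes, stake, part_odds):
    """Total amount returned (stake included), with the stake split equally across parts."""
    part_stake = stake / len(outcomes)
    total = 0.0
    for outcome, odds in zip(outcomes, part_odds):
        if outcome == "win":
            total += part_stake * odds
        elif outcome == "void":
            total += part_stake
    return total


# The 0-0 example from the text: £10 on the away team at AH -0.25,
# split into AH 0 (odds 2.5) and AH -0.5 (odds 1.66).
outcomes = settle_parts(0, 0, -0.25, "away")
print(outcomes, round(total_return(outcomes, 10, [2.5, 1.66]), 2))
```

For the 0–0 example the first half is voided and the second half wins, so £5 plus £8.3 is returned, in line with the text.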
3. The model
The overall model combines ratings with BNs. The rating system captures the skill of teams over time, and provides the ratings as an input into the BN model which captures the magnitude of the relationships between variables of interest. The two subsections that follow describe the rating system and the BN model respectively.
3.1. The rating system
The pi-rating is a football rating system that determines team ability based on the relative discrepancies in scores between adversaries. It was first introduced in (Constantinou & Fenton, 2013) and thereafter used in (Constantinou, 2018; Hubacek et al., 2018; 2019; Van Cutsem, 2019; Wheatcroft, 2020). Modified versions of the pi-rating also formed part of the top two performing models in the international competition Machine Learning for Soccer (Constantinou, 2018; Hubacek et al., 2018). This paper makes use of the original pi-rating system (Constantinou & Fenton, 2013), with two modifications described below.
The pi-ratings assign a ‘home’ (H) and an ‘away’ (A) rating to each team, to account for team-specific home advantage and away disadvantage. Therefore, when a team X plays against team Y, the match prediction is determined by team X’s rating H versus team Y’s rating A. The ratings are revised after each match based on two learning rates: a) the learning rate λ, which determines to what extent new match results override previous match results in terms of the impact in determining current team ratings, and b) the learning rate γ, which determines to what extent performances at home grounds influence a team’s away rating and vice versa. Therefore, at the end of a match between teams X and Y, the new ratings at time t are revised given the most recent ratings at time t−1 as follows:

$$\text{X's H rating:}\quad R^{t}_{XH} = R^{t-1}_{XH} + e_H\,\lambda$$
$$\text{X's A rating:}\quad R^{t}_{XA} = R^{t-1}_{XA} + \gamma\,\bigl(R^{t}_{XH} - R^{t-1}_{XH}\bigr)$$
$$\text{Y's A rating:}\quad R^{t}_{YA} = R^{t-1}_{YA} + e_A\,\lambda$$
$$\text{Y's H rating:}\quad R^{t}_{YH} = R^{t-1}_{YH} + \gamma\,\bigl(R^{t}_{YA} - R^{t-1}_{YA}\bigr)$$

where e is the error between the observed goal difference o and the rating difference p which, for home and away teams, is measured as follows:

$$e_H = o_H - p_H \quad\text{and}\quad e_A = o_A - p_A$$

respectively, where

$$o_H = G_{oH} - G_{oA} \quad\text{and}\quad o_A = G_{oA} - G_{oH}$$
$$p_H = G_{pH} - G_{pA} \quad\text{and}\quad p_A = G_{pA} - G_{pH}$$

where $G_{oH}$ and $G_{oA}$ are goals observed for home and away teams respectively, and similarly $G_{pH}$ and $G_{pA}$ are goals predicted for home and away teams.
While the original pi-ratings represent a diminished expectation of goal difference against the average opponent in the data, in this paper they represent the actual goal difference expectation. Specifically, the rating equation in this paper is simplified not to include the deterministic function $\psi(e) = 3 \times \log_{10}(1 + e)$ defined in the original paper (Constantinou & Fenton, 2013), which is a function that diminishes the importance of each additional goal difference under the assumption that a win is more important than increasing goal difference. The justification for this first modification is that, in AH, we are only interested in goal differences and thus, the motivation here is to optimise for goal difference rather than the ability to win matches.
The second modification involves the learning rate λ. In this paper, λ is multiplied by k when a match involves at least one team which had previously played fewer than 38 matches, according to available data. This modification aims to increase the speed by which team ratings converge for new teams during their first EPL season (each team plays 38 matches in a season), and is expected to be especially impactful during the very first season in the data since, at that point, all teams are considered ‘new’ by the rating. Therefore, the revised pi-ratings exclude ψ(e), defined above, and include k, as follows:

$$R^{t}_{XH} = R^{t-1}_{XH} + e_H\,\lambda\,k \quad\text{and}\quad R^{t}_{YA} = R^{t-1}_{YA} + e_A\,\lambda\,k$$

where k = 3 for match instances in which both teams had previously played fewer than 38 matches; otherwise k = 1. The parameter k was optimised in terms of minimising prediction error e. A limitation here is that the k parameter was optimised given integer inputs from 1 to 10. For future work, it is recommended that the k parameter is optimised given real numbers.
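The modified update rule can be illustrated with a short Python sketch. Several details here are assumptions rather than statements from the paper: the class `TeamRating` and the function `update_ratings` are hypothetical names, the predicted home goal difference is taken to be the home team's home rating minus the away team's away rating, the values used for λ and γ are placeholders, and the caller decides when a match qualifies for k = 3.

```python
from dataclasses import dataclass


@dataclass
class TeamRating:
    home: float = 0.0   # rating when playing at home (H)
    away: float = 0.0   # rating when playing away (A)


def update_ratings(x, y, home_goals, away_goals, lam, gamma, k=1):
    """Revise ratings in place after home team x plays away team y.

    Uses the raw error e (no psi function) and the multiplier k on lambda.
    Assumption: predicted home goal difference p_H = x.home - y.away.
    Returns the home error e_H.
    """
    o_h = home_goals - away_goals      # observed goal difference, home perspective
    p_h = x.home - y.away              # predicted goal difference (assumed form)
    e_h = o_h - p_h
    e_a = -e_h                         # o_A - p_A is the mirror image of o_H - p_H

    new_x_home = x.home + e_h * lam * k
    new_y_away = y.away + e_a * lam * k
    # A fraction gamma of each venue-specific change carries over to the other venue.
    x.away += gamma * (new_x_home - x.home)
    y.home += gamma * (new_y_away - y.away)
    x.home, y.away = new_x_home, new_y_away
    return e_h


# Placeholder learning rates, chosen only to make the arithmetic easy to follow.
x, y = TeamRating(), TeamRating()
update_ratings(x, y, home_goals=2, away_goals=0, lam=0.1, gamma=0.5)
print(x, y)
```

With both teams starting at zero, a 2–0 home win moves the home team's home rating up by 0.2 and its away rating by 0.1, with the mirror-image change for the away team.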
3.2. The BN model
The graph of a BN model can be automatically discovered from data, determined by knowledge, or a combination of the two. Learning the correct graph of a BN from data remains a major challenge in the fields of probabilistic machine learning and causal discovery. While some structure learning algorithms perform well with synthetic data, it is widely acknowledged that this level of performance does not extend to real-world data which typically incorporate noise and latent confounders.
In disciplines like bioinformatics, applying structure learning algorithms can reveal new insights that would otherwise remain unknown. However, these algorithms are less effective in areas with access to domain knowledge, such as in sports. As a result, the BN model in this paper has had its graphical structure determined by the temporal fact that possession influences the number of shots created, which in turn influences the number of shots on target, which in turn influences the number of goals scored. Each of these factors is also dependent on the level of rating difference between the two teams, as illustrated in Fig. 1. This type of model can also be characterised as a hierarchical Bayesian model of which the network is the structural representation, and where the nodes represent variables or hyperparameters of statistical distributions.
Fig. 1. The BN model. The Rating Difference (RD) is the only observable node in the network, determined by the pi-ratings, and represents the difference between the home team’s prior home rating and the away team’s prior away rating, $R^{t-1}_{XH} - R^{t-1}_{YA}$.
The temporal order of events in the BN graph naturally captures the importance of each event in predicting goals scored. For example, the graph assumes that shots on target have a direct influence on goals scored, whereas possession has an indirect influence and hence, while influential, it is assumed to be less impactful than shots on target. While the temporal order defines direct and indirect influences, note that the magnitude of direct influences is still determined by data.
For each match, the prior ratings are retrieved and the difference in team ratings is used as an input into the BN model, which is a Hybrid BN model consisting of both discrete and continuous variables, designed in AgenaRisk (Agena, 2019). Specifically, the actual input is the difference between prior home and away ratings, and is passed to the BN model as an observation to node Rating Difference (RD) in the form of

$$R^{t-1}_{XH} - R^{t-1}_{YA}$$

To ensure that the BN model is trained accurately with respect to the rating data, the parameterisation of the CPDs is also restricted to match instances in which both teams had previously played at least 38 matches. All the residual variables in the BN model are latent. Specifically,
i. The node RD, which represents the observable rating difference between teams, is a mixture of Gaussian probability density functions N(μ, σ²); one for each state of node Rating Difference Level (RDL). Specifically, for −∞ < RD < ∞,

$$f(RD \mid \mu, \sigma^2, RDL) = \left[\frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{(RD-\mu)^2}{2\sigma^2}}\right]_{RDL}$$

where parent RDL is a discrete distribution, μ is the average rating difference and σ² the variance of the rating differences. RDL consists of 23 states (footnote 3), where each state corresponds to a pre-determined level of rating difference as shown in Table 4. For example, the rating difference level 3 is parameterised based on all historical match instances in which adversaries had rating difference $RD = R^{t-1}_{XH} - R^{t-1}_{YA}$ ranging from 1.765 to < 1.93. The granularity of the 23 states has been chosen to ensure that for any level of rating difference there are enough data points (more than 50) for a reasonably well informed prior.

Table 4
Predetermined levels of rating difference
RDL | 1 | 2 | 3 | ... | 21 | 22 | 23
RD | ≥ 2.095 | ≥ 1.93 & < 2.095 | ≥ 1.765 & < 1.93 | intervals of 0.165 rating | ≥ –1.205 & < –1.04 | ≥ –1.37 & < –1.205 | < –1.37
Data points | 55 | 68 | 107 | ... | 93 | 58 | 55

ii. The node P, which represents ball possession, is a mixture of probability density functions Beta(α, β); one for each state of RDL. Specifically, for P ∈ [0, 1],

$$f(P \mid \alpha, \beta, RDL) = \left[\frac{P^{\alpha-1}(1-P)^{\beta-1}}{\mathrm{Beta}(\alpha, \beta)}\right]_{RDL}$$

where Beta(α, β) is the Beta function, α is the first shape parameter of the Beta distribution, also known as the alpha parameter, and represents the number of minutes the home team is in possession of the ball, and β is the second shape parameter of the Beta distribution, also known as the beta parameter, that represents the number of minutes the away team is in possession of the ball. Thus, P reflects the possession rate associated with the home team, over a Beta distribution, whereas for the possession of the away team the model assumes 1 − P.
iii. The node p(SM), which represents the probability to generate a shot per minute spent in possession of the ball, is also a mixture of probability density functions Beta(α, β) given RDL, where α is the number of shots and β is the number of minutes minus the number of shots.
Footnote 3: The decision to discretise RDL represents a practical choice for Hybrid BN modelling. In this case, discretising RD into RDL was necessary to capture conditional Beta-Binomial relationships from Possession to Goals scored, given the rating difference between adversaries.
iv. The node S, which represents the expected number of shots, is a Binomial probability mass function B(n, p),

$$f(k, n, p) = \Pr(k \mid n, p) = \Pr(S = k) = \frac{n!}{k!\,(n-k)!}\; p^{k}\,(1-p)^{n-k}$$

where n represents the number of minutes in possession of the ball defined as P × 90 (footnote 4), under the assumption a match lasts 90 playable minutes, and p is p(SM); i.e., the probability to generate a shot per minute spent in possession of the ball, as defined above.
v. The node p(ST), which represents the probability for a shot to be on target, is also a mixture of probability density functions Beta(α, β) given RDL, where α is the number of shots on target, and β is the number of shots off target; i.e., total shots minus shots on target.
vi. The node ST, which represents the expected number of shots on target, is also a Binomial probability mass function B(n, p), where n is the expected number of shots S and p is the probability for a shot to be on target p(ST).
vii. The node p(G), which represents the probability to score a goal, is also a mixture of probability density functions Beta(α, β) given RDL, where α is the number of goals scored, and β is the number of shots on target successfully defended; i.e., total shots on target minus goals scored.
viii. The node G, which represents the expected number of goals scored, is also a Binomial probability mass function B(n, p), where n is the expected number of shots on target ST, and p is the probability to score a goal p(G).
ix. The node 1X2 is a discrete distribution with states Home win, Draw, and Away win, determined by the distributions G of both home (H) and away (A) teams; i.e., 1X2 = “Home win” if $G_{oH} > G_{oA}$, “Away win” if $G_{oH} < G_{oA}$, “Draw” otherwise.
x. The node GD, which represents goal difference, is simply $G_{oH} - G_{oA}$.
xi. The node AH represents a set of nodes corresponding to all the possible AH outcomes with state probabilities for home and away wins, given GD, as defined in Section 2.
Footnote 4: For the away team (i.e., AT) it is (1 − P) × 90.

Table 5
The data variables used to train the rating system (R), the BN model (BN), and to simulate betting (B)
Variable | Details | Used in
Date | The date of the match | R
Home team | The team playing at home grounds | R
Away team | The team playing at away grounds | R
Match outcome (1X2) | The outcome of the match in terms of home win, draw, or away win | BN, B
Home possession rate | The ball possession rate of the team playing at home | BN
Away possession rate | The ball possession rate of the team playing away | BN
Home shots | The number of shots created by the home team | BN
Away shots | The number of shots created by the away team | BN
Home shots on target | The number of shots on target created by the home team | BN
Away shots on target | The number of shots on target created by the away team | BN
Home goals | The number of goals scored by the home team | R, BN, B
Away goals | The number of goals scored by the away team | R, BN, B
Team ratings | The difference between home team and away team ratings | BN
Handicap | The AH on which the market odds are based | B
1X2 odds | The average and maximum (i.e., best available) bookmaker 1X2 odds | B
AH odds | The average and maximum (i.e., best available) bookmaker AH odds | B

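The discretisation of RD into the 23 RDL states of Table 4 can be expressed as a simple lookup. The sketch below is illustrative only: the function name `rdl_state` is hypothetical, and it assumes that each state includes its lower bound, which is how the example for level 3 ("ranging from 1.765 to < 1.93") reads.

```python
# Lower bounds of RDL states 1..22: 2.095, 1.93, 1.765, ... in steps of 0.165 down to -1.37.
# Rounding to three decimals avoids floating-point drift at the state boundaries.
LOWER_BOUNDS = [round(2.095 - 0.165 * i, 3) for i in range(22)]


def rdl_state(rd):
    """Map a rating difference RD to its RDL state (1..23), assuming inclusive lower bounds."""
    for state, lower in enumerate(LOWER_BOUNDS, start=1):
        if rd >= lower:
            return state
    return 23   # RD < -1.37


print(rdl_state(1.8), rdl_state(0.0), rdl_state(-2.0))
```

A rating difference of 1.8 falls in level 3, matching the example given for that level in the text.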
It should be clear by this point that for both home and away teams: a) nodes P and p(SM) are hyperparameters of Beta node S, b) nodes S and p(ST) are hyperparameters of Beta node ST, and c) nodes ST and p(G) are hyperparameters of node G; effectively creating a Beta-Binomial Hybrid BN process.
4. Data and model fitting
431
4.1. Data432
The rating method, the BN model, and the betting simulation are based on data collected from www.football-data.co.uk and manually recorded from www.nowgoal.com. Table 5 specifies which of the data variables are used by the rating system, the BN model, and the betting simulation. For example, the rating system only requires information about goal data and hence, it only considers variables Date, Home team, Away team, Home goals, and Away goals. Since the ratings are used as an input into the BN model, they represent an additional BN variable and at the same time make the BN model independent of variables Date, Home team and Away team.
The data are based on the English Premier League (EPL) seasons 1992/93 to 2018/19. However, AH odds data were available only from season 2006/07 onwards, whereas ball possession data, which are needed by the BN model, were available only from season 2009/10 onwards. As a result, the rating system is trained with up to 27 seasons of data, the BN is parameterised with up to 10 seasons of data (since it requires possession data), and the betting simulation is performed over 13 seasons (since it requires AH odds data).
4.2. Model fitting
By definition, the ratings are developed in a temporal manner. That is, for a match prediction at time t the model considers team ratings at time t − 1. For any match prediction, a team’s rating will always be based on the most recent rating prior to the date of the match under prediction, and a team’s rating will always be based on past match results.
Conversely, the BN model functions as a machine learning model independent of time and is validated using leave-one-out cross validation (LOOCV). A prediction between teams that have rating difference Z, where Z is one of the 23 RDLs as defined in Table 4, is derived from all data matches with rating difference Z, excluding the match under prediction during validation.
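The leave-one-out idea can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the BN in the paper learns full conditional distributions, whereas this sketch only averages goal difference within a rating difference level, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def loocv_expected_goal_difference(matches):
    """Leave-one-out expected goal difference for each match.

    `matches` is a list of (rdl, goal_difference) pairs, where `rdl` is the
    rating difference level of the match. The prediction for a match is the
    mean goal difference of all OTHER matches sharing its level.
    Returns a list aligned with `matches`; an entry is None when no other
    match shares the level.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rdl, gd in matches:
        totals[rdl] += gd
        counts[rdl] += 1

    predictions = []
    for rdl, gd in matches:
        n_others = counts[rdl] - 1  # exclude the match under prediction
        if n_others == 0:
            predictions.append(None)
        else:
            predictions.append((totals[rdl] - gd) / n_others)
    return predictions
```

Because only the match under prediction is removed, every prediction draws on almost the full sample for its level, which is the property the text relies on.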
This combination of model parameterisation and validation with a rating system and a BN model is adopted from (Constantinou, 2018). The validation approach is unconventional because the BN model assumes no temporal relationships. When applied to past matches, it generates predictions at time t based on the whole data set, which may include future match results. The reason this approach works well, without overestimating the future accuracy of the model, is that it does not matter whether the data come from the past or the future. This is because the model assumes that the relationship between, for example, shots on target and goals scored remains invariant over time for the average EPL team, and empirical results support this claim. These include:
i. The results presented in Sections 5.2 and 5.3 which show that predictive accuracy is consistent across all 13 seasons tested, including the three seasons 2006/07 to 2008/09 which did not form part of the BN’s training data;
ii. The model in (Constantinou, 2018) which was based on this approach and ranked 2nd in the international Machine Learning for Soccer competition, with a prediction error consistent with the validation error.
The empirical support for the model extends to demonstrating that the model can produce good predictions for matches between teams X and Y even when the prediction is derived from match data that neither X nor Y participated in (Constantinou, 2018). This claim is also supported by the results presented in this paper. Specifically, during seasons 2006/07 to 2008/09 the following teams have had their performance determined by data that did not include any of their matches: a) Sheffield United, b) Charlton, and c) Derby. This occurred because, as discussed above, the BN model was trained with data from season 2009/10 onwards, which does not include any match instances associated with these teams. Their performance in terms of possession, shots, shots on target and goals scored was derived from other similar match situations in terms of rating difference between home and away teams.
This approach has advantages and disadvantages. The disadvantage is that, for those who are interested in how such a model would have performed in the past, the results only approximate past performance under the assumption the model would have been trained with at least the same amount of data as the test model. On the other hand, the advantage is that this approach allows us to preserve the sample size of the training data throughout validation, and this enables us to validate how the resulting model would have performed over multiple seasons without modifying its parameterisation (excluding the removal of a single sample; i.e. the match under assessment during validation).
To understand why this is important, consider evaluating match instances five seasons in the past. A temporal model would require the removal of the five most recent seasons from the training data. This would have led to limited samples for some of the predetermined levels of rating difference shown in Table 4. The limited data issue can only be overcome by reducing the number of predetermined levels of rating difference (i.e., the dimensionality of the model); but doing so would produce a different model than the one described. Instead, the approach adopted by (Constantinou, 2018) enables us to address the temporal aspect of the problem through the ratings and to preserve the fitting of the BN across all seasons tested; effectively enabling us to test the current parameterised model on multiple seasons independent of time.
5. Results
The results are reported in terms of rating (i.e., goal difference) error, predictive accuracy and profitability. Specifically, Section 5.1 assesses the accuracy of the modified pi-ratings in terms of expected goal difference error, Section 5.2 assesses the accuracy of the overall model in predicting both AH and 1X2 outcomes, and Section 5.3 assesses the capability of the model in terms of profitability in both 1X2 and AH markets.
5.1. Pi-ratings accuracy and overall model fitting
Figure 2 shows that the optimal λ and γ parameters, that minimise the goal difference error as defined in Section 3, are λ = 0.018 and γ = 0.7. Note that while the results are based on training data from seasons 1992/93 to 2018/19, the optimisation is restricted to match instances in which both teams had previously played at least 38 matches; a total of 9,073 match instances. This is to ensure that the model is optimised on matches in which both teams have had their ratings developed by at least one football season.
The optimal learning rates are fairly consistent with those reported in (Constantinou & Fenton, 2013) (i.e., λ = 0.035 and γ = 0.7) on the basis of minimising goal difference error over five EPL seasons, with those reported in (Van Cutsem, 2019) (i.e., where λ = 0.06 and γ = 0.6) on the basis of minimising mean squared goal difference error over eight EPL seasons, with those reported in (Constantinou, 2018), λ = 0.054 and γ = 0.79, on the basis of minimising the Rank Probability Score (RPS) error metric (Constantinou & Fenton, 2012) over multiple leagues worldwide, and with those reported in (Hubacek et al., 2018), λ = 0.06 and γ = 0.5, where pi-ratings had been used in conjunction with Gradient boosted trees parameters to minimise RPS over multiple leagues worldwide.
Fig. 2. The optimal modified pi-rating learning rates and associated prediction error e, given k = 3, are λ = 0.018 and γ = 0.7. The results are based on training data from seasons 1992/93 to 2018/19. The optimisation is restricted to match instances in which both teams had previously played at least 38 matches; a total of 9,073 match instances.
However, note that the optimal learning rate λ is lower in this study, and this is likely due to the modification that performs more aggressive rating revisions to the first 38 matches of each team, since it is intended to improve the speed of rating convergence. Interestingly, the overall mean goal difference error shown in Fig. 2, e = 1.2283 (or e² = 1.509), is considerably lower than those reported in (Constantinou & Fenton, 2013) and (Van Cutsem, 2019), where e² = 2.625 and e² = 2.66 respectively, and this suggests that the modifications have had a positive impact on the ratings.
Figure 3 illustrates the expected goal difference for each of the 23 rating difference levels. Level 23 represents the highest rating discrepancy in favour of the away team, where the average expectation of the match is a score difference of –1.38 (or 1.38 goals in favour of the away team), and level 1 represents the highest rating discrepancy in favour of the home team with an expected score difference of 2.18 (or 2.18 goals in favour of the home team). The graph reveals a linear relationship between rating discrepancy and score discrepancy.
Fig. 3. Sensitivity analysis between the 23 states of RDL node and the expected goal difference, with linear trendline superimposed as a dashed red line.
As shown in Table 4, the granularity of the 23 intervals was selected to ensure that for any rating difference state there are at least 50 data points for a reasonably well-informed prior of observed goal difference. As with any discretised variable, different splits produce slightly different results. In the case of the RDL distribution, any changes in discretisation will remain faithful to the linear relationship illustrated in Fig. 3; implying that we should expect only minor changes to the interval averages as long as the number of splits remains invariant and data points for each interval are maintained above 50.
Any minor model amendment is naturally expected to have minor impact on the predicted probabilities, and any minor impact is expected to have some influence on the results based on small discrepancies between predicted and published market odds (i.e., small θ values as defined later in subsection 5.3, such as θ = 1 or 2). However, no changes are expected for larger discrepancies. Since the conclusions in this paper are not driven by results that are based on such small differences between predicted and observed odds, any minor modification is not expected to alter concluding remarks.
Table 6
The discrepancy in prediction error e for the different combinations of λ and γ, relative to the optimal inputs of λ = 0.018 and γ = 0.7. Darker green cells represent lower discrepancy, whereas darker red cells represent higher discrepancy. Recommended inputs for λ and γ are the ones that generate up to 0.01% discrepancy
Table 6 presents the discrepancy in prediction error e for the different hyperparameter combinations of λ and γ, relative to the prediction error generated by the optimal inputs of λ = 0.018 and γ = 0.7. Recommended values for λ and γ, that could be considered as an alternative to the optimal values of λ = 0.018 and γ = 0.7, are the ones that generate up to 0.01% discrepancy in prediction error. A total of 31 different combinations, excluding the optimal combination, generate discrepancy within the 0.01% threshold. By keeping the value for λ static, the recommended hyperparameter combinations are a) λ = 0.017 and γ = 0.65–0.73, b) λ = 0.018 and γ = 0.65–0.74, c) λ = 0.019 and γ = 0.65–0.74, and d) λ = 0.02 and γ = 0.7–0.72.
5.2. Predictive accuracy
Predictive accuracy is measured for both 1X2 and AH distributions. The Brier Score is used to measure the accuracy of the binary AH outcome, and the RPS metric (Constantinou & Fenton, 2012) is used to measure the accuracy of the multinomial 1X2 distribution. The RPS can be viewed as a Brier Score extended to multinomial ordinal distributions.
Table 7
Predictive accuracy across all seasons, based on the Rank Probability Score (RPS) for multinomial 1X2 predictions and the Brier Score (BS) for binary AH predictions. Lower score indicates higher predictive accuracy for both RPS and BS
Season; RPS (1X2 accuracy); BS (AH accuracy)
2006/07 0.197 0.252
2007/08 0.184 0.248
2008/09 0.192 0.229
2009/10 0.188 0.199
2010/11 0.202 0.248
2011/12 0.205 0.257
2012/13 0.191 0.258
2013/14 0.195 0.249
2014/15 0.199 0.254
2015/16 0.213 0.254
2016/17 0.191 0.267
2017/18 0.192 0.253
2018/19 0.191 0.260
Overall 0.195 0.248
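The two scoring rules named in this subsection can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the function names are hypothetical, the RPS follows the usual cumulative-difference form for ordered outcomes, and the Brier Score is assumed to be the single-probability form, under which an uninformed 50/50 forecast scores 0.25.

```python
def brier_score(p: float, outcome: int) -> float:
    """Single-probability Brier score for a binary event (outcome is 0 or 1)."""
    return (p - outcome) ** 2


def rank_probability_score(probs, observed_index: int) -> float:
    """RPS for ordered outcomes, e.g. (home win, draw, away win).

    `probs` are the predicted probabilities in outcome order and
    `observed_index` is the position of the outcome that occurred.
    """
    r = len(probs)
    cum_p = 0.0  # cumulative predicted probability
    cum_o = 0.0  # cumulative observed indicator
    total = 0.0
    for i in range(r - 1):
        cum_p += probs[i]
        cum_o += 1.0 if i == observed_index else 0.0
        total += (cum_p - cum_o) ** 2
    return total / (r - 1)
```

For example, a forecast of (0.5, 0.3, 0.2) scores 0.145 when the home team wins and 0.445 when the away team wins, showing how the RPS penalises probability mass placed further from the observed outcome.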
Table 7 shows that the RPS error for the 1X2 outcomes ranges from 0.184 to 0.213 with an average RPS of 0.195 across all 13 EPL seasons. This result compares well relative to previous studies that assumed pi-ratings. Specifically, in (Constantinou, 2018) the RPS ranged from 0.187 to 0.236 for 52 different leagues worldwide, with an average RPS of 0.211 at validation, an average RPS of 0.203 for EPL matches, and an average RPS of 0.208 in the competition. Similarly, the average RPS was 0.2 in (Hubacek et al., 2018), according to Fig. 3, and 0.206 in the competition.
These results are consistent with those reported in Section 5.1, which show that the overall error e optimises lower in this study; i.e., the ratings more accurately predict score difference. The results from profitability presented in Section 5.3 are also consistent with these findings.
Fig. 4. Investigating the prediction shift over time, with reference to the four main variables of the BN model. The shift is measured in terms of the expected value of the specified distribution, and by adding a football season’s worth of data, to the training data set, at a time.
5.2.1. Time-series analysis and sample size requirements
The model is trained with data that cover 13 years, and not all the variables could be measured throughout this period. Still, the model seems to work well without evidence of bias. This subsection investigates whether the variables used in the model show any drift over time that the model might have filtered out. Moreover, because the dimensionality of the model has been adjusted relative to the available sample size, this subsection also reports the sample size required for the model priors to be well informed by data.
Figure 4 presents the results from time-series analysis in investigating potential shifts in the predictive outputs of the model over time. The analysis is performed by increasing the training data set by a single football season’s worth of data at a time, and the shift is measured in terms of changes in the expected value of the given distribution. As shown in Fig. 4, the analysis focuses on the four main variables of the BN model; namely possession, shots, shots on target, and goals scored. Moreover, the different levels of rating difference (refer to Table 4) are categorised into four groups, and are measured with reference to the 10 specified seasons.
The reason the first three, out of the 13, seasons are not considered here is because (as later shown in Table 8) the BN model was not trained with data samples from the first two seasons, and this also means that any results obtained during the third season rely on very low samples. As previously discussed in subsection 3.2, the reason the first two seasons are not considered by the BN model is because the BN is trained with match instances in which both teams had previously played at least 38 matches, to allow for the pi-ratings to converge to reasonably accurate estimates before being considered for model training.
The results are discussed with reference to Table 8, which presents the available samples for each of the 23 levels of rating difference, after each subsequent league data set is added to the training data set that was used to learn the BN model. The results show that many of the shifts occur in the first few seasons, and this is reasonable since the first seasons are the ones which rely on fewer samples. Shifts are also observed after adding the most recent leagues, but these shifts are largely restricted to the outputs of possession and shots on target, and involve matches with a level of rating difference between 1 and 6. The most important output of the model, which is the goal difference, remains essentially unchanged over time. Therefore, while it is reasonable to assume that shifts in playing style might occur naturally from season to season, collectively these shifts appear to have no meaningful impact on the prediction of goal difference, from which the 1X2 and AH distributions are formed.
Table 8
The number of data samples observed in each of the 23 intervals of Rating Difference Level (RDL) after each subsequent league data set is added to the training data. Darker red cells represent sample size less than 19 (equivalent to less than half a season’s data), orange cells represent sample size less than 38 (equivalent to more than half, but less than a full, season’s data), and white cells represent sufficient sample size
Lastly, the results in Table 8 suggest that the model described in this paper should be trained with at least five seasons of league data, to ensure that the model priors are well informed by data. Because the model is trained with publicly available data, which only increases over time, there is no motivation to use less data than what is currently available and hence, this requirement should not be viewed as a limitation.
5.3. Profitability
The assessment of profitability is based on 13 EPL seasons and considers both the average and the best available (maximum) market odds. The simulation is evaluated both in terms of maximising profit as well as the Return On Investment (ROI). A standard betting strategy is used where fixed single-unit bets (e.g., £1) are placed on 1X2 and AH outcomes with payoff that is higher than the model’s estimated unbiased payoff by at least θ, where θ is the discrepancy between the predicted probability and the payoff probability. For example, if the model predicts 51% and the bookmakers’ payoff for that event is 50% (i.e., odds of 2), then θ = 1; i.e., the bookmakers pay 1% more than the model’s estimate. The betting simulation is performed across all payoff decision thresholds θ, for both 1X2 and AH outcomes. The results are first discussed in terms of 1X2 betting performance with reference to Tables 9, 10 and 11; then in terms of AH betting performance with reference to Tables 12, 13 and 14.
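The decision rule described above can be sketched in Python. This is a simplified illustration, not the simulation used in the paper: the function names and the input layout are hypothetical, odds are assumed to be decimal odds, the payoff probability is taken as 1/odds, and each bet is treated as simply won or lost (AH outcomes that refund or split the stake are not modelled).

```python
def payoff_discrepancy(model_prob: float, decimal_odds: float) -> float:
    """θ in percentage points: model probability minus odds-implied probability."""
    implied_prob = 1.0 / decimal_odds
    return 100.0 * (model_prob - implied_prob)


def simulate_unit_bets(bets, threshold: float):
    """Place a one-unit bet whenever θ >= threshold.

    `bets` is an iterable of (model_prob, decimal_odds, won) tuples.
    Returns (number_of_bets, profit, roi); roi is None when no bet is placed.
    """
    n_bets = 0
    returns = 0.0
    for model_prob, odds, won in bets:
        if payoff_discrepancy(model_prob, odds) >= threshold:
            n_bets += 1
            if won:
                returns += odds  # stake of 1 returned together with winnings
    profit = returns - n_bets
    roi = profit / n_bets if n_bets else None
    return n_bets, profit, roi
```

With a model probability of 51% and odds of 2, `payoff_discrepancy` gives approximately 1, matching the worked example in the text.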
Table 9 presents the profitability generated by 1X2 bets over all possible payoff decision thresholds θ, assuming static θ across all 13 seasons, for both average and maximum market odds. Unsurprisingly, the results show that it is much easier for the model to generate profit when taking advantage of the maximum market odds. Moreover, low thresholds θ (i.e., when the predictions are roughly in agreement with market odds) are not profitable. Interestingly, ROI maximises at much higher thresholds θ compared to profit; i.e., profit maximises at 8% and 9% whereas ROI at 18% and 16%, for average and maximum market odds respectively. This is because lower thresholds θ generate a higher number of bets which can generate higher profit even if ROI is lower.
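The difference between the two optimisation targets can be checked directly from a subset of the Table 9 rows for average market odds. The sketch below is illustrative only; ROI is recomputed here as profit divided by number of bets from the rounded table values, so it may differ from the published ROI in the second decimal place.

```python
# (θ in %, number of bets, profit in units), average market odds, from Table 9.
rows = [
    (7, 1036, 38.0),
    (8, 814, 44.7),
    (14, 143, 29.5),
    (16, 76, 15.8),
    (17, 51, 10.8),
    (18, 37, 8.7),
]

# Profit is an absolute quantity, so more bets can outweigh a lower margin.
best_profit = max(rows, key=lambda r: r[2])
# ROI is profit per unit staked (one unit per bet), so it favours selectivity.
best_roi = max(rows, key=lambda r: r[2] / r[1])

print("profit-maximising θ:", best_profit[0])
print("ROI-maximising θ:", best_roi[0])
```

On these rows the profit-maximising threshold is 8% and the ROI-maximising threshold is 18%, in line with the figures quoted in the text for average market odds.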
Tables 10 and 11 show how the profitability changes when we consider the threshold θ that maximises ROI (Table 10) or profit (Table 11) per season, rather than considering a static θ across all seasons (Table 9), for both average and maximum market odds. Profit, once more, tends to maximise on lower thresholds θ compared to ROI. As an example, Table A1 provides the information used during the betting simulation to assess profitability for 1X2 outcomes, based on average odds of season 2010/11 as shown in Table 10.
The results show that the optimal threshold θ varies dramatically between seasons, and there is much to
Table 9
Profitability based on average (left) and maximum (right) market odds for 1X2 bets simulated over 13 EPL seasons; from 2006/07 to 2018/19
θ; average market odds: Bets, Bets won, Odds, Win rate, Returns, Profit, ROI; maximum market odds: Bets, Bets won, Odds, Win rate, Returns, Profit, ROI
0% 4334 1290 3.06 29.8% 3942.5 –391.5 –9.03% 4938 1514 3.22 30.7% 4879.0 –59.01 –1.20%
1% 3712 1105 3.06 29.8% 3375.9 –336.1 –9.05% 4794 1468 3.24 30.6% 4755.3 –38.67 –0.81%
2% 3109 937 3.11 30.1% 2909.9 –199.2 –6.41% 4305 1307 3.19 30.4% 4167.1 –137.91 –3.20%
3% 2538 775 3.09 30.5% 2397.8 –140.2 –5.52% 3658 1122 3.23 30.7% 3623.2 –34.81 –0.95%
4% 2072 643 3.09 31.0% 1987.2 –84.9 –4.10% 3067 950 3.26 31.0% 3096.4 29.38 0.96%
5% 1699 524 3.15 30.8% 1650.9 –48.1 –2.83% 2553 802 3.27 31.4% 2620.0 67.02 2.63%
6% 1339 415 3.19 31.0% 1325.3 –13.7 –1.02% 2076 656 3.29 31.6% 2157.8 81.77 3.94%
7% 1036 329 3.26 31.8% 1074.0 38.0 3.67% 1682 526 3.34 31.3% 1756.3 74.29 4.42%
8% 814 260 3.30 31.9% 858.7 44.7 5.49% 1345 422 3.42 31.4% 1444.6 99.56 7.40%
9% 612 198 3.17 32.4% 626.8 14.8 2.42% 1049 352 3.48 33.6% 1226.2 177.18 16.89%
10% 452 148 3.13 32.7% 462.6 10.6 2.34% 807 267 3.40 33.1% 906.6 99.63 12.35%
11% 320 101 3.12 31.6% 315.2 –4.8 –1.51% 608 199 3.28 32.7% 652.3 44.29 7.28%
12% 241 75 3.16 31.1% 236.9 –4.1 –1.68% 451 155 3.30 34.4% 511.8 60.78 13.48%
13% 191 65 3.27 34.0% 212.3 21.3 11.13% 324 108 3.27 33.3% 352.7 28.65 8.84%
14% 143 50 3.45 35.0% 172.5 29.5 20.59% 241 80 3.38 33.2% 270.4 29.37 12.19%
15% 103 37 3.10 35.9% 114.8 11.8 11.48% 184 62 3.48 33.7% 216.0 31.98 17.38%
16% 76 27 3.40 35.5% 91.8 15.8 20.84% 132 46 3.48 34.8% 160.0 27.99 21.20%
17% 51 17 3.64 33.3% 61.8 10.8 21.20% 101 37 3.31 36.6% 122.3 21.31 21.10%
18% 37 12 3.81 32.4% 45.7 8.7 23.59% 76 26 3.46 34.2% 89.9 13.91 18.30%
19% 24 7 3.10 29.2% 21.7 –2.3 –9.58% 52 18 3.82 34.6% 68.7 16.67 32.06%
20% 19 5 3.44 26.3% 17.2 –1.8 –9.37% 34 11 3.51 32.4% 38.6 4.61 13.56%
Table 10
The payoff discrepancies θ that maximise ROI per season (in yellow shading), based on 1X2 bets and for both average (left) and maximum (right) market odds. The optimal θ discrepancy is chosen over all θ that generate at least 30 bets in a single season
Season; average market odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI; maximum market odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI
2006/07 8% 34 13 3.4 38.2% 43.7 9.66 28.41% 10% 39 16 3.4 41.0% 54.6 15.6 40.00%
2007/08 7% 63 21 2.8 33.3% 58.4 –4.56 –7.24% 11% 41 15 3.1 36.6% 47.1 6.11 14.90%
2008/09 3% 181 72 3.1 39.8% 224.3 43.3 23.92% 5% 202 77 3.5 38.1% 268.0 66.04 32.69%
2009/10 2% 242 57 3.5 23.6% 199.9 –42.09 –17.39% 13% 30 9 3.7 30.0% 33.0 3.02 10.07%
2010/11 10% 33 17 2.8 51.5% 48.2 15.18 46.00% 10% 59 29 3.0 49.2% 86.6 27.56 46.71%
2011/12 0% 326 103 3.3 31.6% 338.7 12.68 3.89% 1% 370 119 3.6 32.2% 433.4 63.4 17.14%
2012/13 7% 65 22 3.5 33.8% 77.9 12.89 19.83% 9% 71 25 3.8 35.2% 94.5 23.45 33.03%
2013/14 10% 30 10 3.7 33.3% 36.6 6.59 21.97% 12% 33 13 3.8 39.4% 49.7 16.7 50.61%
2014/15 7% 73 33 3.1 45.2% 101.6 28.58 39.15% 9% 72 33 3.3 45.8% 109.5 37.51 52.10%
2015/16 10% 65 25 3.0 38.5% 74.7 9.65 14.85% 12% 57 21 3.2 36.8% 67.7 10.71 18.79%
2016/17 9% 58 14 4.8 24.1% 67.7 9.72 16.76% 10% 80 19 5.0 23.8% 94.7 14.73 18.41%
2017/18 8% 85 24 3.9 28.2% 93.3 8.28 9.74% 13% 31 11 3.5 35.5% 38.4 7.43 23.97%
2018/19 13% 32 11 3.5 34.4% 38.3 6.28 19.63% 14% 34 11 3.7 32.4% 40.4 6.36 18.71%
Overall 4.35% 1287 422 3.38 32.79% 1403.16 116.16 9.03% 8.01% 1119 398 3.59 35.57% 1417.62 298.62 26.69%
be gained by identifying the optimal θ. However, the high variance of θ suggests that it is not reasonable to expect that we will be able to successfully predict the optimal value of θ before a season starts. Moreover, while the results are restricted to cases with 30 or more bets in a single season, it is clear that in many cases the sample size remains insufficient for deriving reliable and robust single-season conclusions. This means that the maximised profitability presented in Tables 10 and 11 is not a realistic expectation of real-world performance; only Table 9 is. These results are important because they highlight the danger when optimising models based on the results of a single season (or few seasons), which is often the case in the
Table 11
The payoff discrepancies θ that maximise profit per season (in yellow shading), based on 1X2 bets and for both average (left) and maximum (right) market odds. The optimal θ discrepancy is chosen over all θ that generate at least 30 bets in a single season
Season; average bookmaker odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI; maximum bookmaker odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI
2006/07 8% 34 13 3.4 38.2% 43.7 9.66 28.41% 5% 166 56 3.4 33.7% 188.4 22.39 13.49%
2007/08 8% 45 14 2.9 31.1% 40.8 –4.16 –9.24% 11% 41 15 3.1 36.6% 47.1 6.11 14.90%
2008/09 3% 181 72 3.1 39.8% 224.3 43.3 23.92% 5% 202 77 3.5 38.1% 268.0 66.04 32.69%
2009/10 10% 41 11 3.1 26.8% 33.7 –7.28 –17.76% 13% 30 9 3.7 30.0% 33.0 3.02 10.07%
2010/11 8% 59 27 2.8 45.8% 75.9 16.85 28.56% 6% 164 63 3.1 38.4% 197.2 33.19 20.24%
2011/12 0% 326 103 3.3 31.6% 338.7 12.68 3.89% 1% 370 119 3.6 32.2% 433.4 63.4 17.14%
2012/13 7% 65 22 3.5 33.8% 77.9 12.89 19.83% 9% 71 25 3.8 35.2% 94.5 23.45 33.03%
2013/14 5% 123 44 3.3 35.8% 143.2 20.21 16.43% 0% 380 127 3.4 33.4% 438.1 58.11 15.29%
2014/15 7% 73 33 3.1 45.2% 101.6 28.58 39.15% 8% 91 40 3.3 44.0% 133.2 42.22 46.40%
2015/16 8% 94 34 3.1 36.2% 104.7 10.66 11.34% 1% 371 130 3.1 35.0% 400.4 29.41 7.93%
2016/17 9% 58 14 4.8 24.1% 67.7 9.72 16.76% 10% 80 19 5.0 23.8% 94.7 14.73 18.41%
2017/18 8% 85 24 3.9 28.2% 93.3 8.28 9.74% 9% 93 26 4.2 28.0% 108.0 15.04 16.17%
2018/19 4% 188 67 3.1 35.6% 204.5 16.46 8.76% 5% 206 73 3.0 35.4% 222.3 16.25 7.89%
Overall 4.62% 1372 478 3.28 34.84% 1549.85 177.85 12.96% 4.61% 2265 779 3.52 34.39% 2658.36 393.36 17.37%
Table 12
Profitability based on average (left) and maximum (right) market odds for AH bets simulated over 13 EPL seasons; from 2006/07 to 2018/19
θ; average market odds: Bets, Bets won, Odds, Win rate, Returns, Profit, ROI; maximum market odds: Bets, Bets won, Odds, Win rate, Returns, Profit, ROI
0% 3914 2329 1.62 59.5% 3762.8 –151.22 –3.86% 4703 2830 1.66 60.2% 4707.0 4.01 0.09%
1% 3471 2051 1.62 59.1% 3326.5 –144.52 –4.16% 4228 2527 1.66 59.8% 4203.2 –24.76 –0.59%
2% 3064 1817 1.61 59.3% 2929.4 –134.61 –4.39% 3788 2247 1.66 59.3% 3736.8 –51.25 –1.35%
3% 2665 1584 1.61 59.4% 2554.4 –110.65 –4.15% 3375 1998 1.67 59.2% 3340.7 –34.28 –1.02%
4% 2286 1357 1.62 59.4% 2201.3 –84.71 –3.71% 2974 1766 1.67 59.4% 2949.3 –24.73 –0.83%
5% 1932 1157 1.62 59.9% 1873.4 –58.56 –3.03% 2586 1526 1.67 59.0% 2545.8 –40.17 –1.55%
6% 1640 984 1.63 60.0% 1600.9 –39.09 –2.38% 2192 1304 1.67 59.5% 2171.3 –20.74 –0.95%
7% 1386 837 1.63 60.4% 1368.5 –17.52 –1.26% 1883 1124 1.67 59.7% 1877.9 –5.15 –0.27%
8% 1135 688 1.63 60.6% 1124.2 –10.78 –0.95% 1587 955 1.68 60.2% 1600.0 13.03 0.82%
9% 936 572 1.64 61.1% 938.5 2.45 0.26% 1307 790 1.68 60.4% 1325.1 18.06 1.38%
10% 765 460 1.65 60.1% 759.0 –5.98 –0.78% 1080 658 1.69 60.9% 1111.7 31.73 2.94%
11% 614 368 1.66 59.9% 612.1 –1.89 –0.31% 896 541 1.70 60.4% 920.6 24.61 2.75%
12% 488 292 1.65 59.8% 480.9 –7.12 –1.46% 728 427 1.71 58.7% 728.3 0.25 0.03%
13% 394 233 1.63 59.1% 379.7 –14.29 –3.63% 571 330 1.71 57.8% 564.5 –6.46 –1.13%
14% 312 191 1.65 61.2% 314.5 2.49 0.80% 463 279 1.71 60.3% 478.3 15.26 3.30%
15% 250 155 1.65 62.0% 255.1 5.07 2.03% 369 222 1.70 60.2% 376.8 7.76 2.10%
16% 201 126 1.67 62.7% 210.2 9.20 4.57% 300 183 1.68 61.0% 308.0 8.02 2.67%
17% 138 83 1.67 60.1% 138.8 0.80 0.58% 237 149 1.70 62.9% 253.7 16.75 7.07%
18% 113 70 1.68 61.9% 117.7 4.72 4.18% 180 108 1.69 60.0% 182.1 2.12 1.18%
19% 87 53 1.66 60.9% 87.9 0.86 0.99% 136 85 1.70 62.5% 144.5 8.48 6.24%
20% 62 40 1.74 64.5% 69.7 7.65 12.34% 108 69 1.72 63.9% 118.8 10.81 10.01%
21% 44 31 1.74 70.5% 54.0 10.01 22.75% 78 51 1.73 65.4% 88.3 10.34 13.26%
22% 30 20 1.87 66.7% 37.3 7.31 24.37% 60 40 1.77 66.7% 70.8 10.81 18.02%
23% 24 17 1.85 70.8% 31.4 7.37 30.71% 41 26 1.95 63.4% 50.7 9.70 23.66%
24% 16 11 1.84 68.8% 20.2 4.20 26.25% 33 21 2.05 63.6% 43.0 9.98 30.24%
25% 11 8 1.81 72.7% 14.5 3.45 31.36% 27 19 2.04 70.4% 38.7 11.73 43.44%
literature. This outcome is also discussed in Section 6, point iv.
Interestingly, while maximising profit per season is guaranteed to also maximise profit over all seasons (Table 11), the same does not apply to ROI (see Table 10 and compare it to Table 11). That is, optimising the betting strategy for maximum ROI per season does not necessarily imply that ROI will
Table 13
The payoff discrepancies θ that maximise ROI per season (in yellow shading), based on AH bets and for both average (left) and maximum (right) market odds. The optimal θ discrepancy is chosen over all θ that generate at least 30 bets in a single season
Season; average market odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI; maximum market odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI
2006/07 7% 66 46 1.6 69.7% 74.9 8.945 13.55% 8% 80 57 1.6 71.3% 91.4 11.385 14.23%
2007/08 8% 71 47 1.5 66.2% 72.2 1.165 1.64% 10% 70 47 1.5 67.1% 72.7 2.745 3.92%
2008/09 12% 30 22 1.8 73.3% 38.8 8.78 29.27% 14% 31 21 1.9 67.7% 40.9 9.88 31.87%
2009/10 11% 52 33 1.5 63.5% 49.6 –2.39 –4.60% 16% 31 22 1.4 71.0% 31.6 0.61 1.97%
2010/11 11% 43 28 1.8 65.1% 51.8 8.75 20.35% 14% 33 22 1.8 66.7% 39.0 6.01 18.21%
2011/12 10% 51 29 1.8 56.9% 51.7 0.74 1.45% 14% 34 20 1.8 58.8% 36.0 1.98 5.82%
2012/13 9% 55 40 1.6 72.7% 65.6 10.55 19.18% 13% 30 23 1.5 76.7% 35.4 5.415 18.05%
2013/14 10% 58 38 1.7 65.5% 64.0 5.96 10.28% 14% 33 22 1.7 66.7% 37.9 4.92 14.91%
2014/15 8% 64 39 1.8 60.9% 68.7 4.68 7.31% 9% 72 46 1.8 63.9% 83.4 11.41 15.85%
2015/16 8% 109 68 1.6 62.4% 107.2 –1.76 –1.61% 11% 87 55 1.7 63.2% 91.7 4.745 5.45%
2016/17 15% 33 21 1.6 63.6% 33.4 0.405 1.23% 17% 33 21 1.6 63.6% 34.2 1.18 3.58%
2017/18 14% 41 25 1.7 61.0% 43.3 2.29 5.59% 17% 31 20 1.8 64.5% 36.3 5.305 17.11%
2018/19 2% 267 163 1.7 61.0% 272.5 5.485 2.05% 4% 256 156 1.7 60.9% 266.6 10.59 4.14%
Overall 7.45% 940 599 1.66 63.7% 993.6 53.6 5.70% 10.04% 821 532 1.68 64.8% 897.2 76.175 9.28%
Table 14
The payoff discrepancies θ that maximise profit per season (in yellow shading), based on AH bets and for both average (left) and maximum (right) market odds. The optimal θ discrepancy is chosen over all θ that generate at least 30 bets in a single season
Season; average market odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI; maximum market odds: θ, Bets, Bets won, Odds, Win rate, Returns, Profit, ROI
2006/07 1% 227 151 1.6 66.5% 244.2 17.22 7.59% 1% 308 204 1.7 66.2% 339.5 31.46 10.21%
2007/08 8% 71 47 1.5 66.2% 72.2 1.165 1.64% 8% 103 68 1.6 66.0% 106.6 3.64 3.53%
2008/09 9% 67 46 1.8 68.7% 80.9 13.89 20.73% 6% 168 102 1.8 60.7% 188.5 20.47 12.18%
2009/10 12% 39 25 1.5 64.1% 37.1 –1.89 –4.85% 16% 31 22 1.4 71.0% 31.6 0.61 1.97%
2010/11 11% 43 28 1.8 65.1% 51.8 8.75 20.35% 0% 356 220 1.7 61.8% 376.7 20.68 5.81%
2011/12 10% 51 29 1.8 56.9% 51.7 0.74 1.45% 14% 34 20 1.8 58.8% 36.0 1.98 5.82%
2012/13 9% 55 40 1.6 72.7% 65.6 10.55 19.18% 6% 154 104 1.6 67.5% 163.0 9.02 5.86%
2013/14 4% 172 111 1.7 64.5% 188.9 16.9 9.83% 6% 161 104 1.8 64.6% 183.0 22.035 13.69%
2014/15 6% 95 58 1.8 61.1% 101.6 6.64 6.99% 9% 72 46 1.8 63.9% 83.4 11.41 15.85%
2015/16 8% 109 68 1.6 62.4% 107.2 –1.76 –1.61% 11% 87 55 1.7 63.2% 91.7 4.745 5.45%
2016/17 15% 33 21 1.6 63.6% 33.4 0.405 1.23% 17% 33 21 1.6 63.6% 34.2 1.18 3.58%
2017/18 14% 41 25 1.7 61.0% 43.3 2.29 5.59% 0% 369 240 1.6 65.0% 382.6 13.565 3.68%
2018/19 2% 267 163 1.7 61.0% 272.5 5.485 2.05% 2% 318 197 1.7 61.9% 330.5 12.515 3.94%
Overall 5.57% 1270 812 1.66 63.9% 1350.4 80.385 6.33% 5.55% 2194 1403 1.69 63.9% 2347.3 153.31 6.99%
maximise across all seasons. Table 10 shows that even though ROI is maximised for each season independently, the overall ROI across all 13 seasons is 9.03% in the case of average odds, which is notably lower than the respective overall ROI of 12.96% in Table 11. However, this observation does not extend to the case of maximum market odds. This outcome is also discussed in Section 6, point vi.
Tables 12, 13, and 14 repeat the analysis of Tables 9, 10, and 11, but for AH rather than 1X2 betting. Table A2 presents an example of the information used during the betting simulation to assess profitability from AH bets, and it is based on average odds of season 2010/11 as shown in Table 13. Overall, the AH bets appear to generate both lower profit and lower ROI compared to 1X2 bets. As with 1X2 bets, optimising for maximum ROI per season leads to a lower ROI across all seasons, compared to maximising profit. Specifically, Table 13 shows that maximising ROI per season leads to an overall ROI of 5.70% for average odds, which is lower than the respective overall ROI of 6.33% in Table 14 when maximising profit. Once more, this observation only applies to the case of average market odds.
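The overall ROI figures quoted above can be reproduced from the per-season rows of Tables 13 and 14. The following is a minimal sketch, assuming unit stakes so that the total staked equals the number of bets; the variable and function names are illustrative and are not taken from the paper.

```python
# Per-season (bets, profit) pairs for AH bets at average market odds,
# copied from Tables 13 and 14 (seasons 2006/07 to 2018/19).
# Assumption: every bet is a unit stake, so total staked == number of bets.
TABLE_13_MAX_ROI = [
    (66, 8.945), (71, 1.165), (30, 8.78), (52, -2.39), (43, 8.75),
    (51, 0.74), (55, 10.55), (58, 5.96), (64, 4.68), (109, -1.76),
    (33, 0.405), (41, 2.29), (267, 5.485),
]
TABLE_14_MAX_PROFIT = [
    (227, 17.22), (71, 1.165), (67, 13.89), (39, -1.89), (43, 8.75),
    (51, 0.74), (55, 10.55), (172, 16.9), (95, 6.64), (109, -1.76),
    (33, 0.405), (41, 2.29), (267, 5.485),
]


def overall_roi(seasons):
    """Pooled ROI: total profit divided by total unit stakes."""
    total_bets = sum(bets for bets, _ in seasons)
    total_profit = sum(profit for _, profit in seasons)
    return total_profit / total_bets


roi_13 = overall_roi(TABLE_13_MAX_ROI)    # thresholds chosen to maximise ROI per season
roi_14 = overall_roi(TABLE_14_MAX_PROFIT)  # thresholds chosen to maximise profit per season
print(f"{roi_13:.2%} vs {roi_14:.2%}")
```

Because the pooled ROI weights each season by its number of bets, seasons in which the ROI-maximising threshold admits only about 30 bets contribute little to the total, which is one way to read why per-season ROI maximisation can end up with the lower overall ROI.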
An interesting observation is that AH betting generates a lower number of bets when θ is low, compared to 1X2 betting, and a higher number of bets when θ is high. This suggests that AH betting is less sensitive to the betting decision threshold θ compared to 1X2 betting, for both average and maximum market odds. Furthermore, the optimal threshold θ for AH bets does not vary as much as it did for 1X2 bets. Despite the relatively low variance of θ and the common occurrence of winning 60+ out of 100 AH bets, profitability is still inconsistent between seasons. This is because match bets do not share the same payoff.
5.3.1. Odds of bets simulated
When it comes to the bets simulated, the 1X2 bets tend to average odds greater than 3, which suggests that the model tends to recommend bets on outsiders; a behaviour that is consistent with previous studies including the original pi-rating (Constantinou & Fenton, 2013; Constantinou, 2018). Conversely, the AH bets tend to be simulated on favourite outcomes with average season odds typically ranging between 1.6 and 1.8. However, it is important to note that an issue with the AH odds retrieved from www.football-data.co.uk is that they do not always represent the odds associated with the handicap that maximises the uniformity of the AH distribution, as discussed in Section 2. For example, the AH odds for seasons 2009/10 and 2010/11 appear to be predominantly based on ±0 AH; i.e., no handicap, with the outcome of draw eliminated. Examples of this issue can also be viewed in Table A2; e.g., refer to the imbalanced AH odds for dates 14/08, 11/09 and 27/11.
According to Table 15, at least part of the AH odds of the first five seasons do not reflect the standard AH outcome, whereas the eight most recent seasons appear to be correctly based on the standard AH outcome that aims to make the competition equal. Results from predictive accuracy and profitability suggest that there is no meaningful difference between the first five and the last eight seasons. Therefore, we have no reason to assume that this might have influenced the overall conclusions. Finally, the preference of the model to bet on favourite AH outcomes remains consistent across all 13 seasons. This outcome is also discussed in Section 6, point ii.
Table 15
The mean average and mean maximum AH odds for each of the 13 seasons
Season | Average odds (HT, AT) | Maximum odds (HT, AT)
2006/07 1.89 1.97 1.95 2.05
2007/08 1.92 2.01 2.00 2.09
2008/09 1.85 2.30 1.94 2.50
2009/10 2.08 3.01 2.24 3.38
2010/11 1.87 2.24 1.94 2.38
2011/12 1.93 1.94 1.99 2.01
2012/13 1.93 1.95 1.99 2.01
2013/14 1.93 1.94 2.00 2.01
2014/15 1.92 1.95 1.98 2.01
2015/16 1.94 1.93 1.99 1.99
2016/17 1.95 1.93 2.01 1.99
2017/18 1.95 1.93 2.00 1.99
2018/19 1.96 1.94 2.03 2.00
A possible limitation here is that, while the AH market offers multiple handicaps for each match, this study has only considered one handicap per match. However, it is reasonable to assume that the results presented in this paper approximate the overall AH market. This is because when the model suggests a bet on team X for a given handicap, then we should expect the model to suggest a bet on team X regardless of the handicap, since any handicap must remain faithful to the expected goal difference of the match, which determines θ.
5.3.2. Betting stake adjustments
Figure 5 presents the cumulative profit generated over eight different betting scenarios that represent the combinations of the following betting options: a) optimising for maximum ROI or profit, b) optimising θ per season or across all seasons, and c) simulating 1X2 or AH bets. The results illustrate how the difference in profit and ROI evolves across the 13 seasons between 1X2 and AH bets. While AH bets generate considerably lower profit and ROI, their profitability is much less volatile than that of 1X2 bets and hence is subject to a lower risk of loss, which can often be detrimental. For example, note the significant losses for the two best performing scenarios during matches 1900 to 2100, which are both based on 1X2 bets. However, the lower risk of loss also limits profits.
A fairer assessment of risk between 1X2 and AH profits would be to simply optimise stakes such that, at the end of the betting period, they both produce the same profit. Figure 6 provides these results by extending the scenarios of Fig. 5 to include an additional betting scenario in which AH stakes are increased in proportion to the difference in cumulative profit between 1X2 and AH bets. For example, in Fig. 6a the new AH betting scenario assumes an increase of 5.59 times the stakes of AH bets, in order for the cumulative profit to become equal to that generated by 1X2 bets.
Fig. 5. Cumulative profit when the betting procedure is optimised for either profit or ROI, overall or per season, and based on either 1X2 or AH maximum market odds. The results are based on 13 EPL seasons, from 2006/07 to 2018/19. Optimisations for overall profit and ROI, across all 13 seasons, are restricted to θ discrepancies that generate at least 100 bets over those 13 seasons.
Fig. 6. Comparing the volatility of profits when the stakes of AH bets are increased by as much as required for the cumulative profit to match that of 1X2 bets.
Overall, the graphs suggest that if we want AH bets to generate as much profit as 1X2 bets do, then profit from AH bets will likely be subject to a similar risk of loss as with 1X2 bets. Therefore, while AH is often preferred due to the lower variance of returns, this advantage is rather eliminated when we need to bet proportionally larger to match the expected profit produced by the corresponding 1X2 bets. This outcome is also discussed in Section 6, point iii.
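The effect of a uniform stake increase on dispersion can be illustrated with a short sketch. It assumes every AH stake is multiplied by the same constant, and it pairs the per-season AH profits of Table 13 (average odds) with the 5.59 multiplier quoted for Fig. 6a purely for illustration; the paper does not state that this series and this multiplier belong to the same scenario, and the helper name is hypothetical.

```python
from statistics import pstdev

# Per-season AH profits at average odds, copied from Table 13 (unit stakes).
ah_season_profits = [
    8.945, 1.165, 8.78, -2.39, 8.75, 0.74, 10.55,
    5.96, 4.68, -1.76, 0.405, 2.29, 5.485,
]

# Multiplier quoted in the text for Fig. 6a; reused here only as an example value.
STAKE_MULTIPLIER = 5.59


def scale_stakes(profits, multiplier):
    """Profit series that results when every stake is multiplied by `multiplier`."""
    return [multiplier * p for p in profits]


scaled = scale_stakes(ah_season_profits, STAKE_MULTIPLIER)

# Total profit and its standard deviation both grow by the same factor,
# so the variance grows by the square of the multiplier.
print(sum(ah_season_profits), sum(scaled))
print(pstdev(ah_season_profits), pstdev(scaled))
```

Under this assumption the standard deviation of profit grows linearly with the multiplier, which is consistent with the observation that scaled AH bets lose much of their apparent low-variance advantage.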
6. Discussion and concluding remarks
This paper presented a model specifically developed for the prediction and assessment of the AH football betting market. The model is based on a modified version of the pi-ratings system which measures the relative scoring ability between teams. The modified pi-ratings are used as an input into a novel BN model that had its graphical structure determined by the temporal assumption Possession → Shots → Shots on Target → Goals scored, which captures the natural causal chain of these events via a Beta-Binomial Hybrid BN modelling process. One example of this assumption is that possession occurs before shots (or shots on target) and hence, shots are assumed to be more impactful than possession in terms of determining goals scored.
Using goal scoring data over the last 27 EPL seasons, the modified pi-ratings discovered a strong linear relationship between team rating difference and expected goal difference. However, the linear relationship is oscillatory (refer to Fig. 3) and this suggests that goal data alone may be insufficient in completely explaining team ability. Future work will investigate whether factors beyond goals scored could better explain this relationship. For example, in (Constantinou & Fenton, 2017) it was shown that the three teams who were promoted to the EPL, from the English Championship, tend to perform significantly better than the teams they replace. This is an important factor not taken into consideration by the pi-ratings; i.e., the teams are promoted with either an ignorant rating (if it is their first time in the EPL) or with the rating they had when they were last relegated, which clearly underestimates their performance once they return to the EPL.
AH betting is assessed with reference to the traditional 1X2 betting. The assessment is based on both average and maximum market odds and over all possible betting decision thresholds in terms of discrepancy between predicted and offered market odds. Furthermore, the assessment differentiates between betting strategies that are optimised for ROI and betting strategies that are optimised for profit. Key observations include:
i. The previous literature has generally focused on maximum market odds, and this is understandable since professional gamblers aim to maximise payoff. Still, average odds are important because they reveal the expected returns for the average gambler. Moreover, maximum odds are not attainable by everyone since many countries do not allow access to many of the online bookmakers, including exchange-based websites which often offer the best odds (excluding commission). This study shows that the maximum available market odds increase profits by up to four times relative to average odds. Specifically, taking advantage of the maximum market odds can lead to increased profits that range anywhere between 42% (refer to overall profits in Table 13) and 296% (refer to maximised profits in Table 9).
ii. The recommended AH bets tend to be on favourite outcomes with odds that typically average between 1.6 and 1.8 per season. Conversely, the recommended 1X2 bets tend to be on outsider outcomes with odds averaging above 3. The reduction of the problem from a three-state multinomial to a binary distribution (i.e., from 1X2 to AH) explains why the odds move from 1-in-3 to 1-in-2, but not why the recommended bets switch from outsiders to favourites.
iii. AH bets generate lower profit as well as ROI compared to 1X2 bets. Specifically, 1X2 bets are found to generate 2.5 to 5.5 times higher profit and 2.5 to 4 times higher ROI compared to AH bets (refer to Fig. 6). For this reason, returns from AH bets tend to be considerably less volatile and subject to a lower risk of loss. While this outcome is in agreement with (Hassanniakalager & Newall, n.d.), this presumed advantage of AH betting is flawed. This is because, when the betting stakes of AH bets are increased in proportion to the difference in cumulative profit between 1X2 and AH bets, the variance of profit from AH bets increases towards the variance of profit from 1X2 bets. This implies that, when aiming for the same profit at the end of the same period of time, AH bets are not necessarily less risky than 1X2 bets.
iv. Past studies often focus on a single football season, and profitability tends to be reported based on the betting decision threshold that maximises ROI under the assumption that the optimal betting decision threshold remains invariant between seasons. However, the results in this paper show that the optimal betting decision threshold varies dramatically between seasons, despite predictive accuracy being consistent across the 13 seasons, and this applies to both 1X2 and AH bets; albeit to a lower degree for AH bets.
This implies that the profitability presented in Tables 10, 11, 13, and 14 is not a realistic expectation of real-world performance. This is because the optimal betting decision threshold is not consistent between seasons, and the high variance suggests that it is unreasonable to assume we will be able to predict the decision threshold that maximises profit or ROI before a season starts. Therefore, the choice of evaluating football models based on the threshold that maximises profitability in a single football season, which is often the case in the literature, should be discouraged. Moreover, the optimal betting decision threshold is also dependent on whether we would like to maximise ROI or profit. On the other hand, Tables 9 and 12 represent a more realistic expectation of real-world performance, even though it is unlikely that we will follow a static betting decision threshold across this many seasons.
v. Neither profit nor ROI is consistent between seasons, and this applies to both 1X2 and AH bets. While the overall performance of the model is good enough to beat the market, it is still possible for the best possible betting decision threshold to be loss-making for a whole season (see Tables 10, 11, 13, 14). While this is true for average market odds, the risk is eliminated when we consider maximum odds; though some seasons were barely profitable.
vi. Finally, the results show that choosing to optimise for maximum ROI per season will likely produce undesired results in the long term, and this applies to both 1X2 and AH bets. On the other hand, choosing to optimise for maximum profit (rather than ROI) per season not only guarantees that the profit is maximised across all seasons, but also often generates a higher overall ROI, across all seasons, compared to the overall ROI generated when optimising for maximum ROI for each season independently. This finding is important since most of the previous studies focus on maximising ROI, often for individual seasons.
Lastly, it is important to note that this paper has considered football data up to season 2018/19. The two subsequent seasons have been partly affected by the COVID-19 pandemic, where many matches were played with fewer or without fans. Relevant studies have shown that this event had an insignificant or a significant negative effect on home advantage (Wunderlich et al., 2021; McCarrick et al., 2021). The model described in this paper does not consider this event. However, because the model relies on pi-ratings which involve a home and an away rating for each team, it can be easily adjusted to consider such previously unseen events. For example, if we assume that home advantage is not relevant for a particular match, we could consider assigning the 'away' ratings to both teams for that match. Future research could investigate whether such modelling modifications, which take into consideration the effect of playing in empty stadiums, improve predictive accuracy.
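The adjustment suggested above can be sketched as follows. This is an illustration only: the `TeamRating` structure, the `rating_difference` function and the numeric ratings are hypothetical, and the sketch assumes that the rating difference for a match is normally formed from the home team's home rating and the away team's away rating.

```python
from dataclasses import dataclass


@dataclass
class TeamRating:
    """Hypothetical container for a team's two pi-ratings."""
    home: float  # rating built from home matches
    away: float  # rating built from away matches


def rating_difference(home_team, away_team, ignore_home_advantage=False):
    """Rating difference for a match, from the home team's point of view.

    Assumption: normally the home team contributes its home rating and the
    away team its away rating. When home advantage is assumed irrelevant
    (e.g. an empty stadium), both teams contribute their away ratings.
    """
    home_side = home_team.away if ignore_home_advantage else home_team.home
    return home_side - away_team.away


# Hypothetical ratings, chosen only to show the effect of the switch.
team_a = TeamRating(home=1.0, away=0.25)
team_b = TeamRating(home=0.5, away=-0.25)

with_crowd = rating_difference(team_a, team_b)
behind_closed_doors = rating_difference(team_a, team_b, ignore_home_advantage=True)
```

With these made-up numbers the rating difference shrinks once the home rating is replaced by the away rating, which is the direction of change one would expect if home advantage is removed.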
Acknowledgments
This research was supported by the EPSRC Fellowship project EP/S001646/1 on Bayesian Artificial Intelligence for Decision Making under Uncertainty, by The Alan Turing Institute in the UK, and by Agena Ltd.
References
Agena. 2019. AgenaRisk: Bayesian Network Software for Risk Analysis and Decision Making. (Agena Ltd) Retrieved August 1, 2019, from https://www.agenarisk.com/
Angelini, G., & De Angelis, L. 2017. PARX model for football match predictions. Journal of Forecasting, 36, 795-807.
Arabzad, S. M., Araghi, M. E., Sadi-Nezhad, S., & Ghofrani, N. 2014. Football match results prediction using artificial neural networks; The case of Iran Pro League. International Journal of Applied Research on Industrial Engineering, 1(3), 159-179.
Constantinou, A. 2018. Dolores: A model that predicts football match outcomes from all over the world. Machine Learning, 1-27.
Constantinou, A., & Fenton, N. 2012. Solving the Problem of Inadequate Scoring Rules for Assessing Probabilistic Football Forecast Models. Journal of Quantitative Analysis in Sports, 8(1).
Constantinou, A., & Fenton, N. 2013. Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries. Journal of Quantitative Analysis in Sports, 9(1), 37-50.
Constantinou, A., & Fenton, N. 2017. Towards smart-data: Improving predictive accuracy in long-term football team performance. Knowledge-Based Systems, 124, 93-104.
Constantinou, A., & Fenton, N. 2018. Things to know about Bayesian Networks. Significance, 15(2), 19-23.
Crowder, M., Dixon, M., Ledford, A., & Robinson, M. 2002. Dynamic modelling and prediction of English Football League matches for betting. Journal of the Royal Statistical Society: Series D (The Statistician), 51(2), 157-168.
Dixon, M., & Coles, S. 1997. Modelling association football scores and inefficiencies in the football betting market. Applied Statistics, 46(2), 265-280.
Giovanni, A., & De Angelis, L. 2019. Efficiency of online football betting markets. International Journal of Forecasting, 35, 712-721.
Goddard, J. 2005. Regression models for forecasting goals and match results in association football. International Journal of Forecasting, 21(2), 331-340.
Grant, A., Oikonomidis, A., & Bruce, A. J. 2018. New entry, strategic diversity and efficiency in soccer betting markets: the creation and suppression of arbitrage opportunities. The European Journal of Finance, 24(18), 1799-1816.
Hassanniakalager, A., & Newall, P. (n.d.). A machine learning perspective on responsible gambling. Behavioural Public Policy, 1-24.
Hofer, V., & Leitner, J. 2017. Relative pricing of binary options in live soccer betting markets. Journal of Economic Dynamics & Control, 76, 66-85.
Huang, K., & Chang, W. 2010. A neural network method for prediction of 2006 World Cup Football Game. In Proceedings of the IEEE 2010 International Joint Conference on Neural Networks (IJCNN). Barcelona.
Hubacek, O., Gustav, S., & Zelezny, F. 2019. Score-based soccer match outcome modeling – an experimental review. MathSport International. Athens, Greece.
Hubacek, O., Sourek, G., & Zelezny, F. 2018. Learning to predict soccer results from relational data with gradient boosted trees. Machine Learning, 108, 29-47.
Hvattum, L. M., & Arntzen, H. 2010. Using ELO ratings for match result prediction in association football. International Journal of Forecasting, 26, 460-470.
Joseph, A., Fenton, N., & Neil, M. 2006. Predicting football results using Bayesian nets and other machine learning techniques. Knowledge-Based Systems, 19(7), 544-553.
Kerr, J. 2018. How can legislators protect sport from the integrity threat posed by cryptocurrencies? The International Sports Law Journal, 18, 79-97.
Leitner, C., Zeileis, A., & Hornik, K. 2008. Forecasting Sports Tournaments by Ratings of (prob)abilities: A comparison for the EURO 2008. International Journal of Forecasting, 26, 471-481.
McCarrick, D., Bilalic, M., Neave, N., & Wolfson, S. 2021. Home advantage during the COVID-19 pandemic: Analyses of European football leagues. Psychology of Sport and Exercise, 56, Article 102013.
Pearl, J. 1985. Bayesian Networks: A model of activated memory for evidential reasoning. In Proceedings of the Cognitive Science Society, 329-334.
Pena, J. L. 2014. A Markovian model for association football possession and its outcomes. arXiv:1403.7993 [math.PR].
Rue, H., & Salvesen, O. 2000. Prediction and retrospective analysis of soccer matches in a league. Journal of the Royal Statistical Society: Series D (The Statistician), 49(3), 399-418.
Van Cutsem, L. 2019. Pi Ratings. Retrieved July 8, 2019, from https://cran.r-project.org/web/packages/piratings/vignettes/README.html
Vlastakis, N., Dotsis, G., & Markellos, R. 2008. Nonlinear modelling of European football scores using support vector machines. Applied Economics, 40, 111-118.
Wheatcroft, E. 2020. A profitable model for predicting the over/under market in football. International Journal of Forecasting.
Williams-Grut, O. 2016. Inside Starlizard: The story of Britain's most successful gambler and the secretive company that helps him win. London, UK: Business Insider.
Wunderlich, F., & Memmert, D. 2018. The Betting Odds Rating System: Using soccer forecasts to forecast soccer. PLoS ONE, 13(6), e0198668.
Wunderlich, F., Weigelt, M., Rein, R., & Memmert, D. 2021. How does spectator presence affect football? Home advantage remains in European top-class football matches played without spectators during the COVID-19 pandemic. PLoS ONE, 16(3), e0248590.
Appendix A: Sample results from betting simulations
Table A1
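Details of profitability for case in Table 10: Season = 2010/11, Odds = Average, Bets = 1X2, Optimisation = ROI, and θ = 10%
Column groups: Model predictions [p(1) p(X) p(2)]; Average bookmakers' odds [Odds(1) Odds(X) Odds(2)]; Bookmakers' unnormalised prediction [p(1) p(X) p(2)]; Payoff discrepancy [θ(1) θ(X) θ(2)]; Bets simulated [Bet(1) Bet(X) Bet(2)]; Returns from bets [Return(1) Return(X) Return(2)]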
Date HT AT 1X2 p(1) p(X) p(2) Odds(1) Odds(X) Odds(2) p(1) p(X) p(2) θ(1) θ(X) θ(2) Bet(1) Bet(X) Bet(2) Return(1) Return(X) Return(2) Profit
14/08/2010 Aston Villa West Ham 1 0.62 0.22 0.16 1.96 3.30 4.03 0.51 0.30 0.25 0.11 –0.08 –0.09 1 0 0 1.96 0 0 0.96
14/08/2010 Wigan Blackpool 2 0.33 0.26 0.41 1.82 3.45 4.50 0.55 0.29 0.22 –0.22 –0.03 0.19 0 0 1 0 0 4.5 3.5
26/09/2010 Wolves Aston Villa 2 0.25 0.25 0.50 2.83 3.25 2.50 0.35 0.31 0.40 –0.10 –0.06 0.10 0 0 1 0 0 2.5 1.5
02/10/2010 Sunderland Man United X 0.13 0.20 0.67 4.93 3.45 1.75 0.20 0.29 0.57 –0.08 –0.09 0.10 0 0 1 0 0 0 –1
23/10/2010 Birmingham Blackpool 1 0.39 0.27 0.34 1.85 3.48 4.31 0.54 0.29 0.23 –0.15 –0.02 0.11 0 0 1 0 0 0 –1
24/10/2010 Liverpool Blackburn 1 0.73 0.17 0.10 1.66 3.64 5.43 0.60 0.27 0.18 0.12 –0.11 –0.08 1 0 0 1.66 0 0 0.66
01/11/2010 Blackpool West Brom 1 0.47 0.26 0.26 2.79 3.25 2.54 0.36 0.31 0.39 0.11 –0.04 –0.13 1 0 0 2.79 0 0 1.79
10/11/2010 Man City Man United X 0.25 0.25 0.50 2.57 3.22 2.75 0.39 0.31 0.36 –0.14 –0.06 0.14 0 0 1 0 0 0 –1
27/11/2010 Bolton Blackpool X 0.41 0.27 0.32 1.57 3.96 5.82 0.64 0.25 0.17 –0.22 0.01 0.15 0 0 1 0 0 0 –1
11/12/2010 Aston Villa West Brom 1 0.62 0.22 0.16 2.10 3.30 3.53 0.48 0.30 0.28 0.15 –0.08 –0.13 1 0 0 2.1 0 0 1.1
11/12/2010 Stoke Blackpool 2 0.41 0.27 0.32 1.62 3.86 5.41 0.62 0.26 0.18 –0.21 0.01 0.14 0 0 1 0 0 5.41 4.41
12/12/2010 Tottenham Chelsea X 0.25 0.25 0.51 2.84 3.28 2.49 0.35 0.30 0.40 –0.10 –0.06 0.10 0 0 1 0 0 0 –1
13/12/2010 Man United Arsenal 1 0.62 0.22 0.16 1.95 3.40 3.92 0.51 0.29 0.26 0.11 –0.07 –0.09 1 0 0 1.95 0 0 0.95
28/12/2010 Sunderland Blackpool 2 0.41 0.27 0.32 1.60 3.81 5.82 0.63 0.26 0.17 –0.21 0.00 0.15 0 0 1 0 0 5.82 4.82
28/12/2010 West Brom Blackburn 2 0.36 0.27 0.37 1.82 3.48 4.49 0.55 0.29 0.22 –0.19 –0.02 0.15 0 0 1 0 0 4.49 3.49
05/01/2011 Arsenal Man City X 0.63 0.21 0.15 1.94 3.49 3.86 0.52 0.29 0.26 0.12 –0.07 –0.11 1 0 0 0 0 0 –1
05/01/2011 Everton Tottenham 1 0.48 0.26 0.26 2.66 3.24 2.64 0.38 0.31 0.38 0.10 –0.05 –0.12 1 0 0 2.66 0 0 1.66
15/01/2011 West Brom Blackpool 1 0.33 0.26 0.41 1.78 3.68 4.48 0.56 0.27 0.22 –0.23 –0.01 0.19 0 0 1 0 0 0 –1
16/01/2011 Liverpool Everton X 0.63 0.22 0.16 2.20 3.22 3.43 0.45 0.31 0.29 0.17 –0.09 –0.14 1 0 0 0 0 0 –1
23/01/2011 Blackburn West Brom 1 0.56 0.24 0.21 2.22 3.27 3.24 0.45 0.31 0.31 0.11 –0.07 –0.10 1 0 0 2.22 0 0 1.22
01/02/2011 West Brom Wigan X 0.41 0.27 0.32 1.71 3.58 5.13 0.58 0.28 0.19 –0.17 –0.01 0.13 0 0 1 0 0 0 –1
12/02/2011 Man United Man City 1 0.67 0.20 0.13 1.77 3.57 4.61 0.56 0.28 0.22 0.10 –0.08 –0.08 1 0 0 1.77 0 0 0.77
12/02/2011 West Brom West Ham X 0.36 0.27 0.37 1.93 3.52 3.86 0.52 0.28 0.26 –0.16 –0.02 0.11 0 0 1 0 0 0 –1
20/02/2011 West Brom Wolves X 0.36 0.26 0.37 1.86 3.40 4.33 0.54 0.29 0.23 –0.18 –0.03 0.14 0 0 1 0 0 0 –1
26/02/2011 Wolves Blackpool 1 0.36 0.26 0.37 1.83 3.58 4.28 0.55 0.28 0.23 –0.18 –0.02 0.14 0 0 1 0 0 0 –1
19/03/2011 Man United Bolton 1 0.83 0.12 0.06 1.40 4.41 8.54 0.71 0.23 0.12 0.11 –0.11 –0.06 1 0 0 1.4 0 0 0.4
19/03/2011 West Brom Arsenal X 0.12 0.20 0.68 4.67 3.59 1.76 0.21 0.28 0.57 –0.09 –0.08 0.11 0 0 1 0 0 0 –1
09/04/2011 Wolves Everton 2 0.25 0.25 0.50 2.55 3.25 2.80 0.39 0.31 0.36 –0.14 –0.06 0.14 0 0 1 0 0 2.8 1.8
11/04/2011 Liverpool Man City 1 0.55 0.24 0.21 2.59 3.21 2.76 0.39 0.31 0.36 0.16 –0.07 –0.15 1 0 0 2.59 0 0 1.59
14/05/2011 West Brom Everton 1 0.26 0.24 0.50 2.63 3.28 2.69 0.38 0.30 0.37 –0.12 –0.06 0.13 0 0 1 0 0 0 –1
22/05/2011 Bolton Man City 2 0.33 0.26 0.41 4.92 3.72 1.69 0.20 0.27 0.59 0.13 –0.01 –0.19 1 0 0 0 0 0 –1
22/05/2011 Man United Blackpool 1 0.82 0.12 0.06 1.56 4.13 5.57 0.64 0.24 0.18 0.18 –0.12 –0.12 1 0 0 1.56 0 0 0.56
22/05/2011 Stoke Wigan 2 0.55 0.24 0.21 2.76 3.42 2.45 0.36 0.29 0.41 0.19 –0.05 –0.20 1 0 0 0 0 0 –1
TOTAL 15 0 18 22.66 0 25.52 15.18
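The columns of Table A1 can be related to one another with a short sketch. It assumes, as the rows of the table suggest, that the bookmakers' unnormalised prediction is the reciprocal of the odds, that the payoff discrepancy θ is the model probability minus that value, and that a unit bet is placed on any outcome whose θ reaches the threshold. The function names and the use of "greater than or equal to" are assumptions; because the published probabilities are rounded to two decimals, rows that sit exactly on the threshold may not reproduce.

```python
def payoff_discrepancies(model_probs, odds):
    """θ per outcome: model probability minus the unnormalised implied probability 1/odds."""
    return {outcome: model_probs[outcome] - 1.0 / odds[outcome] for outcome in model_probs}


def simulate_match(model_probs, odds, result, threshold):
    """Place a unit bet on each outcome with θ >= threshold; return (thetas, profit)."""
    thetas = payoff_discrepancies(model_probs, odds)
    profit = 0.0
    for outcome, theta in thetas.items():
        if theta >= threshold:
            returned = odds[outcome] if outcome == result else 0.0
            profit += returned - 1.0  # unit stake
    return thetas, profit


# Row 14/08/2010 Wigan v Blackpool from Table A1 (result: away win, "2").
probs = {"1": 0.33, "X": 0.26, "2": 0.41}
avg_odds = {"1": 1.82, "X": 3.45, "2": 4.50}
thetas, profit = simulate_match(probs, avg_odds, result="2", threshold=0.10)
rounded = {outcome: round(theta, 2) for outcome, theta in thetas.items()}
```

For this row the rounded discrepancies match the θ(1), θ(X) and θ(2) columns, and the single bet on the away win yields the profit of 3.5 shown in the table.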
Table A2
Details of profitability for case in Table 13: Season = 2010/11, Odds = Average, Bets = AH, Optimisation = ROI, and θ = 11%
Column groups: Goals [HT AT]; Goal difference; AH; Model predictions [p(1) p(2)]; Average bookmakers' odds [Odds(1) Odds(2)]; Bookmakers' unnormalised prediction [p(1) p(2)]; Payoff discrepancy [θ(1) θ(2)]; Bets simulated [Bet(1) Bet(2)]; Returns from bets [Return(1) Return(2)]
Date HT AT Goals(HT) Goals(AT) Goal-diff AH p(1) p(2) Odds(1) Odds(2) p(1) p(2) θ(1) θ(2) Bet(1) Bet(2) Return(1) Return(2) Profit
14/08/2010 Wigan Blackpool 0 4 –4 0 0.45 0.55 1.32 3.19 0.76 0.31 –0.31 0.24 0 1 0 3.19 2.19
21/08/2010 Arsenal Blackpool 6 0 6 –2 0.28 0.72 1.76 2.12 0.57 0.47 –0.28 0.24 0 1 0 0 –1
11/09/2010 Newcastle Blackpool 0 2 –2 0 0.65 0.35 1.20 4.23 0.83 0.24 –0.18 0.11 0 1 0 4.23 3.23
25/09/2010 West Ham Tottenham 1 0 1 0 0.51 0.49 2.54 1.47 0.39 0.68 0.11 –0.19 1 0 2.54 0 1.54
23/10/2010 Birmingham Blackpool 2 0 2 –0.5 0.39 0.61 1.85 2.01 0.54 0.50 –0.15 0.11 0 1 0 0 –1
24/10/2010 Liverpool Blackburn 2 1 1 –0.75 0.69 0.31 1.85 2.02 0.54 0.50 0.15 –0.19 1 0 1.425 0 0.425
30/10/2010 Man United Tottenham 2 0 2 –1 0.65 0.35 2.04 1.83 0.49 0.55 0.16 –0.19 1 0 2.04 0 1.04
01/11/2010 Blackpool West Brom 2 1 1 0 0.64 0.36 1.99 1.81 0.50 0.55 0.14 –0.20 1 0 1.99 0 0.99
10/11/2010 Man City Man United 0 0 0 0 0.33 0.67 1.82 1.98 0.55 0.51 –0.22 0.16 0 1 0 1 0
20/11/2010 Birmingham Chelsea 1 0 1 0.75 0.36 0.64 1.88 2.00 0.53 0.50 –0.17 0.14 0 1 0 0 –1
27/11/2010 Bolton Blackpool 2 2 0 0 0.56 0.44 1.20 4.17 0.83 0.24 –0.27 0.20 0 1 0 1 0
11/12/2010 Aston Villa West Brom 2 1 1 0 0.80 0.20 1.50 2.49 0.67 0.40 0.13 –0.20 1 0 1.5 0 0.5
11/12/2010 Stoke Blackpool 0 1 –1 –1 0.24 0.76 2.07 1.82 0.48 0.55 –0.24 0.21 0 1 0 1.82 0.82
26/12/2010 Aston Villa Tottenham 1 2 –1 0 0.57 0.43 2.22 1.64 0.45 0.61 0.12 –0.18 1 0 0 0 –1
28/12/2010 Sunderland Blackpool 0 2 –2 –1 0.25 0.75 2.06 1.81 0.49 0.55 –0.24 0.20 0 1 0 1.81 0.81
28/12/2010 West Brom Blackburn 1 3 –2 –0.5 0.36 0.64 1.82 2.05 0.55 0.49 –0.19 0.15 0 1 0 2.05 1.05
29/12/2010 Chelsea Bolton 1 0 1 –1.5 0.62 0.38 1.98 1.89 0.51 0.53 0.12 –0.15 1 0 0 0 –1
01/01/2011 Man City Blackpool 1 0 1 –1.5 0.37 0.63 1.83 2.03 0.55 0.49 –0.17 0.13 0 1 0 2.03 1.03
05/01/2011 Arsenal Man City 0 0 0 –0.5 0.63 0.37 1.94 1.93 0.52 0.52 0.12 –0.15 1 0 0 0 –1
05/01/2011 Everton Tottenham 2 1 1 0 0.65 0.35 1.90 1.92 0.53 0.52 0.12 –0.17 1 0 1.9 0 0.9
15/01/2011 West Brom Blackpool 3 2 1 –0.75 0.26 0.74 1.96 1.91 0.51 0.52 –0.25 0.22 0 1 0 0.5 –0.5
16/01/2011 Liverpool Everton 2 2 0 0 0.80 0.20 1.55 2.41 0.65 0.41 0.16 –0.22 1 0 1 0 0
01/02/2011 West Brom Wigan 2 2 0 –0.75 0.34 0.66 1.89 1.98 0.53 0.51 –0.19 0.16 0 1 0 1.98 0.98
12/02/2011 Man United Man City 2 1 1 –0.75 0.62 0.38 1.99 1.89 0.50 0.53 0.12 –0.15 1 0 1.495 0 0.495
12/02/2011 West Brom West Ham 3 3 0 –0.5 0.36 0.64 1.92 1.96 0.52 0.51 –0.16 0.13 0 1 0 1.96 0.96
20/02/2011 West Brom Wolves 1 1 0 –0.5 0.36 0.64 1.88 2.01 0.53 0.50 –0.17 0.14 0 1 0 2.01 1.01
26/02/2011 Wolves Blackpool 4 0 4 –0.5 0.36 0.64 1.83 2.05 0.55 0.49 –0.18 0.15 0 1 0 0 –1
05/03/2011 Arsenal Sunderland 0 0 0 –1 0.67 0.33 1.82 2.06 0.55 0.49 0.12 –0.16 1 0 0 0 –1
19/03/2011 Man United Bolton 1 0 1 –1.25 0.70 0.30 2.01 1.87 0.50 0.53 0.20 –0.23 1 0 0.5 0 –0.5
19/03/2011 West Brom Arsenal 2 2 0 0.75 0.37 0.63 1.84 2.03 0.54 0.49 –0.18 0.14 0 1 0 0 –1
03/04/2011 Fulham Blackpool 3 0 3 –1 0.31 0.69 1.92 1.94 0.52 0.52 –0.21 0.17 0 1 0 0 –1
09/04/2011 Man United Fulham 2 0 2 –1 0.70 0.30 1.86 2.01 0.54 0.50 0.16 –0.20 1 0 1.86 0 0.86
09/04/2011 Wolves Everton 0 3 –3 0 0.34 0.66 1.82 2.00 0.55 0.50 –0.21 0.16 0 1 0 2 1
10/04/2011 Blackpool Arsenal 1 3 –2 1.5 0.65 0.35 1.87 2.00 0.53 0.50 0.11 –0.15 1 0 0 0 –1
11/04/2011 Liverpool Man City 3 0 3 0 0.72 0.28 1.86 1.97 0.54 0.51 0.19 –0.23 1 0 1.86 0 0.86
16/04/2011 West Brom Chelsea 1 3 –2 0.75 0.36 0.64 1.95 1.94 0.51 0.52 –0.16 0.13 0 1 0 1.94 0.94
07/05/2011 Tottenham Blackpool 1 1 0 –1.5 0.40 0.60 1.81 2.05 0.55 0.49 –0.16 0.12 0 1 0 2.05 1.05
14/05/2011 Sunderland Wolves 1 3 –2 0 0.65 0.35 1.88 1.95 0.53 0.51 0.12 –0.17 1 0 0 0 –1
14/05/2011 West Brom Everton 1 0 1 0 0.34 0.66 1.89 1.94 0.53 0.52 –0.19 0.14 0 1 0 0 –1
15/05/2011 Arsenal Aston Villa 1 2 –1 –1.25 0.40 0.60 1.81 2.07 0.55 0.48 –0.15 0.11 0 1 0 2.07 1.07
22/05/2011 Bolton Man City 0 2 –2 0.75 0.67 0.33 2.00 1.88 0.50 0.53 0.17 –0.20 1 0 0 0 –1
22/05/2011 Man United Blackpool 4 2 2 –1 0.77 0.23 2.00 1.88 0.50 0.53 0.27 –0.31 1 0 2 0 1
22/05/2011 Stoke Wigan 0 1 –1 0 0.72 0.28 2.00 1.85 0.50 0.54 0.22 –0.26 1 0 0 0 –1
TOTAL 20 23 20.11 31.64 8.75
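The return and profit figures in the rows above are consistent with standard Asian handicap settlement at a unit stake. A minimal sketch of that arithmetic follows, assuming the columns read, in order: goal difference (home minus away), handicap applied to the home team, and decimal odds for the backed side. The function names `settle_line` and `ah_return` are hypothetical and are not taken from the paper. Quarter handicaps (for example –0.75 or –1.25) are treated as two half-stake bets on the neighbouring half and whole lines.

```python
def settle_line(goal_diff, handicap, odds):
    """Return per unit stake for one whole- or half-goal handicap line.

    goal_diff and handicap are both expressed from the backed side's view.
    """
    adjusted = goal_diff + handicap
    if adjusted > 0:
        return odds   # bet wins: stake returned times decimal odds
    if adjusted == 0:
        return 1.0    # push: stake refunded
    return 0.0        # bet loses


def ah_return(goal_diff, home_handicap, odds, side="home"):
    """Return per unit stake for an Asian handicap bet.

    goal_diff     -- home goals minus away goals
    home_handicap -- handicap applied to the home team (e.g. -0.75)
    odds          -- decimal odds of the side being backed
    side          -- "home" or "away"
    """
    handicap = home_handicap
    if side == "away":
        # Mirror the match so the backed side is always treated as "home".
        goal_diff, handicap = -goal_diff, -handicap

    if (handicap * 4) % 2 != 0:
        # Quarter line: split the stake over the two adjacent lines.
        lines = (handicap - 0.25, handicap + 0.25)
    else:
        lines = (handicap, handicap)
    return sum(0.5 * settle_line(goal_diff, h, odds) for h in lines)


# Man United 2-1 Man City, handicap -0.75, home backed at 1.99 (half win)
print(round(ah_return(1, -0.75, 1.99, "home"), 3))
# West Brom 3-2 Blackpool, handicap -0.75, away backed at 1.91 (half loss)
print(round(ah_return(1, -0.75, 1.91, "away"), 3))
```

Profit per bet is the return minus the unit stake. Under the same reading of the TOTAL row, 20 home bets and 23 away bets returned 20.11 + 31.64 = 51.75 units on 43 units staked, which gives the reported profit of 8.75.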