ORIGINAL PAPER
Accepted: 5 June 2022 / Published online: 19 July 2022
© The Author(s) 2022
Extended author information available on the last page of the article
Using artificial intelligence algorithms to predict self-reported problem gambling
with account-based player data in an online casino setting
Michael Auer · Mark D. Griffiths
Journal of Gambling Studies (2023) 39:1273–1294
https://doi.org/10.1007/s10899-022-10139-1
Abstract
In recent years researchers have emphasized the importance of artificial intelligence (AI)
algorithms as a tool to detect problem gambling online. AI algorithms require a training
dataset to learn the patterns of a prespecified group. Problem gambling screens are one
method for the collection of the necessary input data to train AI algorithms. The present
study’s main aim was to identify the most significant behavioral patterns which predict
self-reported problem gambling. In order to fulfil the aim, the study analyzed data from
a sample of real-world online casino players and matched their self-report (subjective)
responses concerning problem gambling with the participants’ actual (objective) gambling
behavior. More specifically, the authors were given access to the raw data of 1,287 players
from a European online gambling casino who answered questions on the Problem
Gambling Severity Index (PGSI) between September 2021 and February 2022. Random
forest and gradient boost machine algorithms were trained to predict self-reported problem
gambling based on the independent variables (e.g., wagering, depositing, gambling
frequency). The random forest model predicted self-reported problem gambling better
than gradient boost. Moreover, problem gamblers showed a distinct pattern with respect
to their gambling based on the player tracking data. More specifically, problem gamblers
lost more money per gambling day, lost more money per gambling session, and deposited
money more frequently per gambling session. Problem gamblers also tended to deplete
their gambling accounts more frequently compared to non-problem gamblers. A subgroup
of problem gamblers identified as being at greater harm (based on their response to PGSI
items) showed even higher values with respect to the aforementioned gambling behaviors.
The study showed that self-reported problem gambling can be predicted by AI algorithms
with high accuracy based on player tracking data.
Keywords: Online gambling · Artificial intelligence · Problem gambling · Player tracking · Online casino
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Introduction
Gambling disorder is a condition which affects around 0.5% of the general adult population
(Kessler et al., 2008; Abbott et al., 2018), although there is worldwide variation in problem
gambling (PG) prevalence, from below 1% of the adult population up to around 5–6%
(Calado & Griffiths, 2016). Over the past few decades, technology has facilitated gambling
and has made it more accessible and available through mobile devices such as tablets
and smartphones (Lopez-Gonzalez et al., 2021). Moreover, it has been noted that online
gambling is a medium of gambling rather than a type of gambling activity, and that most
internet gamblers also gamble offline (Wardle et al., 2011).
Online gambling participation has increased in recent years (Castrén et al., 2018; Chóliz
et al., 2021; Gainsbury, 2014; Rodríguez et al., 2017). A recent meta-analysis by Allami et
al. (2021) evaluated 57 risk factors from 104 gambling prevalence studies worldwide (with
sample sizes ranging from 5,327 to 273,946 in the studies examined). The risk factors in the
studies were ranked in regard to their association with problem gambling. The risk factor
with the highest odds ratio was online gambling. They also reported that continuous forms
of gambling (such as slot machines and casino games) were most associated with problem
gambling. There are also features of the internet that provide reasons as to why users can
spend so long online, including perceived anonymity, affordability, easy accessibility,
interactivity, immersion/dissociation, convenience, and disinhibition facilitation (Griffiths,
2003).
Most reviews of online gambling suggest it is a more ‘dangerous’ or ‘harmful’ medium
than offline gambling (e.g., Kuss & Griffiths, 2012; Mora-Salgueiro et al., 2021). For
instance, Sirola et al. (2018) assessed problem gambling among a sample of 1200 Finnish
internet users with the South Oaks Gambling Screen. The results showed that over half
of participants who had visited gambling-related online communities were either at-risk
gamblers or probable pathological gamblers (54.33%). In three different regression models,
visiting gambling-related online communities was a significant predictor of excessive
gambling. However, other studies have not found online gambling to be related to increased
problem gambling. For instance, Philander and MacKay (2014) used secondary data and
found that past-year participation in online gambling was related to a decrease in problem
gambling severity, which is the opposite of the popular view in the extant literature. Moreover,
in one of the few studies that compared offline-only gamblers, online-only gamblers, and
mixed-mode gamblers (i.e., those who gambled both online and offline) using a nationally
representative sample of British gamblers, Wardle et al. (2011) reported no problem gambling
among those who only gambled online. Problem gambling was highest among mixed-mode
gamblers followed by offline-only gamblers. The results suggest that the medium of
online gambling is not harmful in itself but that, for those who are vulnerable (e.g., problem
gamblers), the online medium could provide heightened risk because of its 24/7 availability.
Artificial intelligence, behavioral tracking, and gambling markers of harm
The terms ‘artificial intelligence’ (AI), ‘machine learning’ and ‘data science’ are often used
interchangeably. However, machine learning refers to a group of advanced statistical methods,
whereas AI can be regarded as the outcome of an advanced algorithm (Petit et al.,
2021). Online gambling facilitates the application of advanced analytical methods because
each and every transaction is assigned to one account and recorded. Auer and Griffiths
(2013) argued that good tools which track a player’s behavior should be able to support
informed player choice, and also help online gambling operators gain more insight into their
players’ behavioral patterns. AI methods have been applied for numerous purposes in gambling
research. Several studies have used AI methods to predict voluntary self-exclusion
(i.e., Dragicevic et al., 2015; Finkenwirth et al., 2021; Haeusler, 2016; Percy et al., 2016).
Two studies have used AI methods to predict self-reported problem gambling (Luquiens et
al., 2016; Louderback et al., 2021). Auer and Griffiths (2019) applied AI methods to predict
voluntary limit setting among a sample of Norwegian online players. Cerasa et al. (2018)
used AI methods to predict personality traits predictive of self-reported problem gambling
in a sample of 40 psychiatric patients recruited from specialized gambling clinics.
One of the innovations in gambling research over the past 15 years is the increasing use
of high-quality account-based behavioral tracking data provided by the gambling industry
to academic researchers. Both researchers and the gambling industry have utilized player
tracking data as a way to try to identify problem gambling (Auer & Griffiths, 2013; Deng
et al., 2019). For instance, AI methods were used by Ukhov et al. (2021) to compare online
casino players (n = 5000) and online sports bettors (n = 5000) and to see which features were
more predictive of problem gambling. The problem gambling sample was large (n = 5000,
comprising 2500 online casino players and 2500 online sports bettors, all of whom had
self-excluded specifically because they had problem gambling issues). They reported that the
number of daily wagers and the use of mobile devices (e.g., smartphones) were two of the
key predictors of problem gambling for online sports bettors, whereas session durations,
volume of approved deposits, and use of desktop computers were the key predictors for online
casino players. The study concluded that online problem gambling is not homogeneous and
that there are behavioral differences between problem gamblers based on preferred game
type.
As a result of a number of meetings between five major gambling operators (888 Holdings,
GVC Holdings [now called Entain], Sky Betting & Gaming, William Hill, and Paddy
Power), the Senet Group developed a set of nine markers of harm to identify problematic
gambling (McAuliffe et al., 2022). The Senet Group is an organization which was established
in 2014 by the leading high-street bookmakers in the UK and which was later taken over
by the Betting and Gaming Council (Narayan, 2020). Each of the nine markers (e.g., increase
in frequency of gambling, increased deposit frequency, failed deposits, late-night gambling)
is assigned one of four values (0 = no-risk, 1 = low-risk, 2 = medium-risk, and 3 = high-risk)
and the overall score across all nine markers can range between 0 and 27. The overall score
is also classified into categories (no-risk = 0–7, Level 1 = 8–9, Level 2 = 10–14, and Level
3 = 15–27), each of which results in a type of intervention (PwC and Responsible Gambling
Council, 2017). The markers of harm identify changes in gambling (e.g., yesterday’s deposit
was 2.5 times larger than the average deposit for the past six months) as well as an assessment
of overall gambling behavior that might be viewed as risky (e.g., making 20 or more deposits
in the last 28 days).
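The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (nine markers, each scored 0 to 3, summed and mapped to the four categories); the function name `classify_risk` is hypothetical and the code is not taken from the Senet Group's implementation.

```python
def classify_risk(marker_scores):
    """Sum nine marker scores (each 0-3) and map the total to a risk category."""
    if len(marker_scores) != 9:
        raise ValueError("expected exactly nine marker scores")
    if any(score not in (0, 1, 2, 3) for score in marker_scores):
        raise ValueError("each marker score must be 0, 1, 2, or 3")

    total = sum(marker_scores)  # ranges from 0 to 27
    if total <= 7:
        level = "no-risk"
    elif total <= 9:
        level = "Level 1"
    elif total <= 14:
        level = "Level 2"
    else:
        level = "Level 3"
    return total, level


# A player scoring 'low-risk' on eight markers and 'medium-risk' on one.
print(classify_risk([1, 1, 1, 1, 1, 1, 1, 1, 2]))
```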
In the peer-reviewed literature, McAuliffe et al. (2022) used two datasets from bwin’s
online sportsbook (one covering 2005–2007, and another covering 2015–2017) to evaluate
the prevalence of gambling markers of harm, as well as their intercorrelations, inter-individual
and intraindividual stability, and correlations with extreme betting activity, demographic
variables, and gambling harm proxies. The authors found that on an average day, less than
1% of players had risk scores high enough to trigger an intervention. They also found that
male gender and younger age were not positively correlated with the risk score. They also
reported that there were strong associations between the highest risk score during the study
period and being a top 6% or top 1% user in terms of number of bets or money wagered.
In the most recent (fifth) edition of the Diagnostic and Statistical Manual of Mental Disorders
(DSM-5), gambling disorder was identified as a behavioral addiction (American Psychiatric
Association, 2013). Catania and Griffiths (2021a) suggested ways that the DSM-5
criteria could be operationalized using behavioral tracking data. For instance, gambling
preoccupation was operationalized in four different ways, including the number of hours
players spent on the website and the number of wagers, and tolerance was operationalized in
two different ways, including the increase in the number of money deposits over time. They
used a sample of 982 online gamblers and the first three months of their gambling activity
and concluded that some DSM-5 criteria could be operationalized with player tracking
data. Through cluster analysis they identified four types of online gambler (non-problem
gamblers, at-risk gamblers, financially vulnerable gamblers, and emotionally vulnerable
gamblers), the latter two groups being problem gamblers and accounting for 1.23% of the
sample.
Problem gambling indicators using data from gamblers who have voluntarily self-excluded
A number of studies have examined the profile of gamblers who have utilized voluntary
self-exclusion (VSE) tools. Using behavioral tracking data (i.e., the first month of gambling
data among players who engaged in VSE because of gambling-related problems), Braverman
et al. (2012) reported that the characteristics of first-month betting were an increase
in wagering, frequent intensive betting, and high variability in amount of money wagered.
Finkenwirth et al. (2021) compared 2,157 Canadian online gamblers who had requested
VSE with 17,526 players who had not voluntarily self-excluded using 20 input variables
of gambling behavior. They applied AI algorithms to identify patterns indicative of future
self-exclusion. The variance in money bet per session was the most predictive explanatory
variable for VSE. Other significant variables were the number of bets, the number of games
per session, money bet from promotional offers, amount of money won per day, and the
number of sessions per day. Using a different methodology, Haeusler (2016) used payment
data from a sample of 2696 bwin.com players to predict voluntary self-exclusion utilizing
AI algorithms. The study found that the frequency of deposits and the amount of money
deposited, the variance of the single amounts withdrawn, the amount of funds subject to
reversed withdrawals (when a player initiates a withdrawal of money after winning money
on the website and then decides not to and cancels the process), and the use of smartphones
to deposit money into their gambling account were positively associated with
gambling self-exclusion.
Dragicevic et al. (2015) compared player tracking data from a sample of 347 players
who self-excluded with a control sample of 871 players who did not self-exclude. They also
compared the efficiency of different AI methods. Their main finding was that self-excluders
lost more money than the control group. Their analysis also found that self-excluders made
riskier bets than the control group. Catania and Griffiths (2021b) compared players who
closed their account due to a specific self-reported gambling addiction with players who
chose a six-month account closure option. Players who chose to close their account for six
months had low gambling activity and had only registered recently (i.e., just over 50% of
gamblers self-excluded within seven days of opening a gambling account, with one-fifth
self-excluding within 24 hours of opening an account). Catania and Griffiths concluded that
players who excluded voluntarily were too different to be treated as a homogeneous group
and that self-exclusion alone was not a good proxy for problem gambling. Using a variety
of machine learning techniques, Percy et al. (2016) reported that the most accurate method
in identifying VSE was the random forest method.
Auer and Griffiths (2016) also argued that voluntary self-exclusion should not be used as
a proxy measure for problem gambling. They noted that there was no evidence of a direct
relationship between long-term self-exclusion and problem gambling and that gamblers
self-exclude for various reasons. Moreover, they noted that many problem gamblers never
self-exclude and many self-excluders do not have gambling problems and do not exclude
for reasons concerning problem gambling.
Self-reported problem gambling
There are over 20 screens that can assess problem gambling (Stinchfield, 2014). Among the
most popular instruments are the South Oaks Gambling Screen (SOGS) and the Problem
Gambling Severity Index (PGSI). The SOGS is a 20-item scale and can reliably identify
individuals who are likely problem gamblers (Duvarci & Varan, 2001; Lesieur & Blume,
1987; Shaffer et al., 1999; Stinchfield, 2002). Strong et al. (2003) asserted that the SOGS
does not include less severe behavioral items and therefore may not do so well in identifying
people who are in the process of becoming problem gamblers.
The Problem Gambling Severity Index (PGSI; Ferris & Wynne, 2001) comprises nine
items, four of which assess problem gambling behaviors and five of which assess negative
consequences of gambling. In a sample of 12,299 Canadian adults, Holtgraves (2009) found
that one underlying factor explains the nine PGSI questions. Holtgraves (2009) argued that
the PGSI presents a viable alternative to the SOGS for assessing degrees of problem gambling
severity in a non-clinical context. The PGSI was developed to reflect more socially
oriented (rather than clinical) PG aspects (Petry, 2016). To date, the PGSI is arguably the
most widely used PG-screening tool (Calado & Griffiths, 2016).
Only a couple of studies have reported the association between self-reported problem
gambling and player tracking data among the same sample of online players (i.e., Luquiens
et al., 2016; Louderback et al., 2021). Luquiens et al. (2016) carried out a survey among
online poker players (n = 14,261) which included the Problem Gambling Severity Index
(PGSI). Their responses on the PGSI were compared with the tracking data of their actual
gambling. Almost one-fifth of the participants who completed the PGSI were classed as
problem gamblers (18%). The key risk factors reported for problem gambling were: being
male, being aged below 28 years, having 60+ wagering sessions during the one-month study
period, losing more than €45 during the one-month study period, depositing 3+ times during
a 12-hour period, staking more than €298 during the one-month study period, having more
than €1.7 mean loss per session during the one-month study period, and engaging in
multi-tabling (playing simultaneously on multiple poker tables).
Louderback et al. (2021) used the Brief Biosocial Gambling Screen (BBGS) to assess
self-reported problem gambling among a sample of online gamblers. Their aim was to identify
thresholds for low-risk gambling. Among other variables, they measured duration of
gambling activity, gambling variability, net loss, amount of money wagered, and changes in
gambling behavior as predictive variables. The area under the curve (AUC) in the prediction
of the BBGS status was between 0.58 and 0.657. They concluded that wagering €167.97 or
less each month, spending 6.71% or less of an individual’s annual income on online gambling
wagers, losing €26.11 or less on online gambling per month, and demonstrating variability
(i.e., standard deviation) in daily amount wagered of €35.14 or less were indicative of
low-risk gambling.
Previous papers have claimed that chasing losses can easily be observed by gambling
operators or researchers using account-based behavioral tracking data (e.g., Delfabbro et
al., 2012; Griffiths & Whitty, 2010). More recently, Challet-Bouju et al. (2020) and Perrot et
al. (2018) operationalized chasing losses as either three or more deposits within a 12-hour
period or a deposit less than one hour after a previous bet. Both studies clustered large
samples of online lottery and sports players and found that frequent session deposits were
correlated with high gambling intensity.
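The two chasing-loss rules described above (three or more deposits within a 12-hour period, or a deposit less than one hour after a previous bet) could be checked against timestamped account data roughly as follows. This is a minimal sketch, not the cited authors' code: the function names are hypothetical, and treating a gap of exactly 12 hours as "within" the window is an assumption.

```python
from datetime import datetime, timedelta


def three_deposits_within_12h(deposit_times):
    """True if any three deposits fall inside one 12-hour window."""
    times = sorted(deposit_times)
    window = timedelta(hours=12)  # assumption: a gap of exactly 12 h still counts
    for i in range(len(times) - 2):
        if times[i + 2] - times[i] <= window:
            return True
    return False


def deposit_soon_after_bet(deposit_times, bet_times):
    """True if any deposit was made less than one hour after an earlier bet."""
    one_hour = timedelta(hours=1)
    for deposit in deposit_times:
        if any(timedelta(0) < deposit - bet < one_hour for bet in bet_times):
            return True
    return False


start = datetime(2022, 1, 1, 12, 0)
deposits = [start, start + timedelta(hours=5), start + timedelta(hours=11)]
print(three_deposits_within_12h(deposits))
```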
The present study
Gambling regulations in a number of European countries (e.g., UK, Spain, Germany, Sweden,
Denmark) require license holders to identify problem gambling and regularly report the
number of problem gamblers to regulators. However, there is little research into the actual
playing behavior of problematic online gamblers. Luquiens et al.’s (2016) study was based
on online poker players and Louderback et al.’s (2021) study was based on relatively old
data from 2005 to 2010. Since then, internet gambling, as well as mobile gambling, has
significantly increased (McGee, 2020).
The present study utilized a recent sample of European online casino players and analyzed
the association between self-reported problem gambling and player tracking data. To
the best of the authors’ knowledge, Europe is the most highly regulated online gambling
environment which also includes the strictest player protection regulations. For that reason,
the authors examined a sample of European online casino players for the present study.
Moreover, the authors believe that the present study makes an important academic contribution.
The findings will be very helpful for online gambling operators as well as for regulators
and policymakers.
There were no specific hypotheses regarding the association between gambling behavior
and self-reported problem gambling. However, the study’s main aim was to identify the
most significant behavioral patterns which predict self-reported problem gambling. In order
to fulfil the aim, the present study analyzed data from a sample of real-world online casino
players and matched their self-report (subjective) responses concerning problem gambling
with the participants’ actual (objective) gambling behavior. The authors aimed to replicate
as many behavioral metrics used in previous research as possible for reasons of comparability.
Therefore, the study was necessarily explorative in nature.
Method
The authors were given access by a European online casino to raw data of all players who
had answered the nine questions of the Problem Gambling Severity Index (PGSI) between
September 2021 and February 2022. Furthermore, only players who placed at least one
wager in the 30 days prior to answering the PGSI items were included in the sample. Players
were not actively prompted to answer the PGSI. They could answer the PGSI at any time
as it was always available on the website in the gambling operator’s ‘Responsible Gaming’
section. Only the most recent set of answers was used for players who had answered
the PGSI multiple times during the study period. The nine PGSI questions are listed in
Appendix 1.
The data comprised each wager and each win as well as each deposit and each withdrawal
by all the individuals who met the inclusion criterion (i.e., gamblers who placed at
least one wager in the 30 days prior to answering the PGSI). The data also contained the
amount of money in the gambling account (balance) before and after each transaction. The
authors were also given access to each player’s age and gender. The authors computed gambling
sessions based on the raw data. Sessions were computed based on the timestamp of
the single wagers. If two wagers were placed within 15 minutes of each other, the time between
those two events counted as gambling session time, as has been done in other tracking studies
(Hopfgartner et al., 2021). If there was more than 15 minutes between two wagers, the time
between the two events was not counted as belonging to the same gambling session.
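The 15-minute session rule can be illustrated with a short Python sketch. It assumes only a list of wager timestamps per player; the function names are hypothetical and this is not the authors' code. Session length is taken as the time from the first to the last wager of a session, which equals the sum of the counted gaps.

```python
from datetime import datetime, timedelta


def split_into_sessions(wager_times, max_gap=timedelta(minutes=15)):
    """Group wager timestamps into sessions; a gap longer than max_gap starts a new session."""
    sessions = []
    for ts in sorted(wager_times):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            sessions[-1].append(ts)  # close enough to the previous wager
        else:
            sessions.append([ts])    # gap too long (or first wager): new session
    return sessions


def session_minutes(session):
    """Session duration in minutes, from first to last wager."""
    return (session[-1] - session[0]).total_seconds() / 60


day = datetime(2022, 1, 15)
wagers = [day.replace(hour=10, minute=m) for m in (0, 10, 20)]
wagers += [day.replace(hour=11, minute=m) for m in (0, 5)]
for s in split_into_sessions(wagers):
    print(len(s), "wagers,", session_minutes(s), "minutes")
```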
Statistical analysis
For each of the nine PGSI items, players could choose between the categories ‘Never’ (0),
‘Sometimes’ (1), ‘Most of the time’ (2) and ‘Almost always’ (3). Scores ranged between 0
and 27. The authors also had access to the number of seconds between the first click on the
PGSI site and the click on the submit button after answering all nine questions. Appendix
2 reports the player tracking features which were computed for each player for the 30 days
prior to answering the nine PGSI questions. The player tracking features measure the total
number of deposits and bets in the 30 days prior to answering the PGSI as well as average
amounts of money wagered per gambling day and per session. Furthermore, the authors
had access to data concerning prior self-exclusions (play breaks) as well as voluntary
limit-setting data. Two of the player tracking features in the present study were attempts to
operationalize and measure chasing losses (i.e., regular gambling account depletion and frequent
session depositing). These are operationally defined below.
Regular gambling account depletion (i.e., percentage of sessions ending with low
account balance): The authors had access to the amount of money in the gambling
account before and after each wagering transaction (also referred to as the balance).
The amount of money in the gambling account after the last game of a session was
computed. For each player, the authors computed the percentage of sessions in which there
was less than €5 in the gambling account at the end of the session. The present authors
believe that regularly depleting the gambling account may be an indication
of chasing and of not being able to stop gambling.
Frequent session depositing (i.e., average number of deposits per session/gambling
day): For each player, the authors computed the average number of monetary deposits
per session. Depositing frequently in a session may be an indication of chasing after
losses and not being able to stop or control gambling (Challet-Bouju et al., 2020).
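Both chasing-loss features reduce to simple per-player aggregates over sessions. The sketch below assumes each player's sessions have already been summarized as an end-of-session balance (in euros) and a deposit count; the function names and input layout are hypothetical and are not taken from the authors' code.

```python
def depletion_share(end_balances, threshold=5.0):
    """Percentage of sessions that ended with less than `threshold` euros in the account."""
    if not end_balances:
        raise ValueError("need at least one session")
    low = sum(1 for balance in end_balances if balance < threshold)
    return 100.0 * low / len(end_balances)


def mean_deposits_per_session(deposit_counts):
    """Average number of deposits made per session."""
    if not deposit_counts:
        raise ValueError("need at least one session")
    return sum(deposit_counts) / len(deposit_counts)


# Four sessions of one hypothetical player.
print(depletion_share([0.5, 12.0, 4.99, 5.0]))
print(mean_deposits_per_session([0, 2, 1, 1]))
```

Note that a balance of exactly €5 does not count as depleted here, following the "less than €5" wording.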
The authors also applied two widely used artificial intelligence (AI) algorithms to the
prediction of self-reported problem gambling.
Gradient boost machine learning (GBML): GBML is a method which fits the data with
numerous models that are then aggregated into a final model (Friedman, 2001). GBML
can detect linear as well as non-linear patterns.
Random forest (RF): RF is a popular machine learning method which fits the data with
numerous decision trees which are then aggregated into a final model (Liaw & Wiener,
2002). RF can detect linear as well as non-linear patterns.
The AI models’ predictive quality was measured using the area under the curve (AUC). A
value of 0.5 indicates a model that performs no better than chance and a value of 1 indicates
a perfect fit between the predicted and actual values. Ling et al. (2003) have argued that AUC
is a better way to measure the predictive quality of AI models than the percentage of correctly
classified records. The AUC is a goodness of fit statistic which can be used to evaluate model
quality (Bradley, 1997).
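One way to build intuition for the AUC is its rank interpretation: it equals the probability that a randomly chosen problem gambler receives a higher predicted score than a randomly chosen non-problem gambler, with ties counted as one half. The pairwise sketch below illustrates this with toy numbers; it is not the authors' code, and the function name `auc` is hypothetical.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


print(auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))   # perfect separation
print(auc([0, 0, 1, 1], [0.5, 0.5, 0.5, 0.5]))   # uninformative scores
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # one pair ranked wrongly
```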
The dependent variable was self-reported problem gambling and the independent variables
were player tracking features (listed in Appendix 2). The independent variables reflected
the behavior for the 30 days prior to answering the PGSI. In order to find the best fitting
configuration, an automatic parameter search was conducted for both algorithms (i.e., the
random forest and the gradient boost machine). The optimal parameters were used to compute
a random forest and a gradient boost machine algorithm.
AI methods such as the ones chosen in the present study provide little insight into the
importance of single variables as predictors of self-reported problem gambling. In order to
gain more understanding as to which variables contributed to an increased or decreased
likelihood of self-reported problem gambling, the authors applied a cluster analysis. Cluster
analysis is also referred to as unsupervised learning as it aims to classify data into subgroups
(Jain et al., 2008). The algorithm assigns the sample to groups where members of one group
are as similar as possible and members of different groups are as dissimilar as possible. In
the present study, cluster analysis is simply used as an approach to further understand the
relationship between the behavioral metrics and self-reported problem gambling.
The authors used the programming language Python (Van Rossum, 2007) to analyze
the dataset. The scikit-learn library (Pedregosa et al., 2011) was used for the machine learning
algorithms. The models’ performances were visually evaluated via their respective receiver
operating characteristic (ROC) curves (Hanley & McNeil, 1982) and numerically via the
area under the curve (Bradley, 1997). In order to test the validity of the machine learning
models, the data were split into a training and a test set. More specifically, 80% of the
data were used to train the models and 20% of the data were used to test the validity of the
models.
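A workflow of this kind (80/20 split, automatic parameter search, random forest and gradient boosting, AUC on the held-out data) might look roughly as follows in scikit-learn. This is a sketch under stated assumptions rather than the authors' pipeline: the data are synthetic stand-ins generated with `make_classification`, the parameter grids are illustrative, and the stratified split and 3-fold cross-validated grid search are assumptions, since the source does not specify how the parameter search was carried out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the player tracking features and the problem gambling label.
X, y = make_classification(n_samples=400, n_features=10, n_informative=4,
                           weights=[0.74, 0.26], random_state=0)

# 80% of the records train the models, 20% are held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Illustrative (hypothetical) parameter grids for the automatic search.
candidates = {
    "random_forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 100], "max_depth": [3, None]}),
    "gradient_boosting": (GradientBoostingClassifier(random_state=0),
                          {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]}),
}

test_auc = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)                 # refits the best configuration
    probs = search.predict_proba(X_test)[:, 1]   # predicted probability of class 1
    test_auc[name] = roc_auc_score(y_test, probs)

for name, value in test_auc.items():
    print(f"{name}: test AUC = {value:.3f}")
```

Computing the AUC only on the 20% test set keeps the reported value independent of the data used for fitting and parameter selection.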
Data cleaning and participants
A total of 1,287 players answered the nine PGSI questions between September 2021 and
February 2022. This was the time period for which data were made available to the authors.
Out of the 1,287 players, 60 players answered all nine questions with “almost always”,
which results in a score of 27 (4.66%). However, only eight players (0.62%) received a PGSI
score of 26. The relatively large number of players scoring 27 could be a result of rushing
through the nine questions without reading them sufficiently. For that reason, the authors
removed players with a very short response time from the data sample. Consequently, 945
players with a reasonable response time were retained. The distributions of the 945 players’
PGSI scores as well as the original 1,287 players’ PGSI scores are displayed in Fig. 1. Out of
the 945 players, only 11 players had a PGSI score of 27 (1.2%). The data cleaning process
also reduced the percentage of players who answered all nine questions with “never” from
22.5% to 19.7%. Answering all nine questions with “never” could also have been more likely
among the players who rushed through the nine questions without reading them sufficiently.
The average age of the 945 players was 41 years (SD = 11.81) and the sample comprised 433
females (46%) and 512 males (54%).
Results
Out of the 945 players, 248 players had a PGSI score of 8 or above (26%). A PGSI score of
8 or above indicates probable problem gambling. Figure 2 displays the distribution of the
four answers for each of the nine items for the group of problem gamblers. Item 6 (“Have
you felt that gambling has caused you any health problems, including stress or anxiety?”)
was answered most frequently (50%) with “almost always”. Item 4 (“Have you borrowed
money or sold anything to get money to gamble?”) was answered least frequently with
“almost always” (13%). Item 4 also has the largest percentage of problem gamblers who
answered “never” (35%).
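The scoring rule behind these figures can be written down compactly. The sketch below uses the answer coding from Appendix 1 and the cut-off of 8 stated above; the function and constant names are illustrative assumptions.

```python
# Answer coding as listed in Appendix 1.
ANSWER_POINTS = {"never": 0, "sometimes": 1, "most of the time": 2, "almost always": 3}
PROBLEM_GAMBLING_THRESHOLD = 8  # a score of 8 or above indicates probable problem gambling

def pgsi_score(answers):
    """Sum the points of the nine PGSI answers (range 0 to 27)."""
    if len(answers) != 9:
        raise ValueError("the PGSI has exactly nine items")
    return sum(ANSWER_POINTS[a.lower()] for a in answers)

def is_problem_gambler(answers):
    """Classify a respondent as a probable problem gambler."""
    return pgsi_score(answers) >= PROBLEM_GAMBLING_THRESHOLD

print(pgsi_score(["almost always"] * 9))                      # maximum score
print(is_problem_gambler(["sometimes"] * 8 + ["never"]))      # score of 8
print(is_problem_gambler(["sometimes"] * 7 + ["never"] * 2))  # score of 7
```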
Fig. 1 Percentage of players for each PGSI score before and after removing players with short response time
Artificial intelligence models
Random forest and gradient boost machine algorithms were trained to predict self-reported
problem gambling based on the independent variables (e.g., wagering, depositing, gambling
frequency). In order to find the best-fitting configuration, an automatic parameter search was
conducted for both algorithms. The optimal parameters were then used to fit a random forest
model and a gradient boost machine model. Figure 3 reports the receiver operating
characteristic (ROC) curve as well as the area under the curve (AUC) values for both
algorithms. The ROC curve plots the percentage of correctly classified problem gamblers (true
positive rate/sensitivity) against the percentage of wrongly classified non-problem gamblers
(false positive rate/1-specificity; see Narkhede, 2018) for different cut-off values of the
predicted probability of being a problem gambler. The area under the ROC curve is referred to
as the AUC. The entire chart is a square. Each side of the square has a length of one, which
leads to an area of 1 (1 × 1). The area on each side of the diagonal line is 0.5. The diagonal
line represents a random model, which has an AUC of 0.5. A perfect model, which classifies
each problem gambler and each non-problem gambler correctly, would have an AUC of 1. The
larger the AUC, the better the model quality. The random forest model’s AUC value computed
on the test data was 0.729. This was larger than the gradient boost model’s AUC, which was
0.67. This indicates that the random forest model predicts self-reported problem gambling
better. The random forest algorithm also reports the most important variables in the model.
These were age, amount of money deposited, amount of money bet, number of gambling days,
average monetary loss per gambling day, average monetary loss per session, average number of
monetary deposits per session, account depletion, and number of play breaks.
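To make the ROC and AUC description concrete, the following sketch computes the ROC points for every cut-off and the area under them with the trapezoidal rule. It is a generic textbook calculation on invented toy labels and scores, not the authors' evaluation code.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per cut-off."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the cut-off step by step; players scoring at or above it are flagged.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC points, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

labels = [1, 1, 0, 0]  # 1 = problem gambler, 0 = non-problem gambler (toy data)
perfect = auc(roc_points(labels, [0.9, 0.8, 0.3, 0.1]))        # every PG ranked above every NPG
uninformative = auc(roc_points(labels, [0.5, 0.5, 0.5, 0.5]))  # same score for everyone
partial = auc(roc_points(labels, [0.9, 0.4, 0.6, 0.1]))        # one PG ranked below one NPG
print(perfect, uninformative, partial)
```

The three toy cases reproduce the reference values in the text: 1 for a perfect ranking, 0.5 for a model that cannot separate the groups, and an intermediate value otherwise.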
Fig. 2 Percentage of players answering each of the nine PGSI items “never”, “sometimes”, “most of the
time”, and “almost always”
Cluster analysis
The random forest machine learning algorithm does not report whether there are positive
or negative correlations between the explanatory variables and self-reported problem
gambling. In order to gain further insight into the behavior of problem gamblers (PGs) compared
to non-problem gamblers (NPGs), the authors performed a k-means cluster analysis. The
aim was to find clusters of players with a higher percentage of self-reported problem
gambling. The previously reported variables with the highest importance in the random forest
machine learning algorithm were used in the cluster analysis. A z-score transformation was
applied to the variables (Mohamad et al., 2013). After this standardization, each variable
carried the same weight in the clustering process. The number of clusters was determined
using the elbow method (Kaufmann & Rousseeuw, 1990). The elbow method is a visual
approach which displays the within-cluster sum of squares for different numbers of clusters.
The optimal number of clusters appears at the so-called elbow, where the slope changes most
significantly. Figure 4 indicates that a five-cluster solution fitted the data best. The
five-cluster datapoint is also indicated by the red circle in Fig. 4. Although Fig. 4 looks
similar to Fig. 3, the two are completely unrelated. Figure 4 shows the within-cluster sum of
squares for different numbers of clusters and the area under the curve is meaningless for
these data.
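The standardization and the elbow computation can be sketched as below. This is an illustrative standard-library implementation on invented toy data: the farthest-first initialization is an assumption chosen to make the example deterministic, and the paper does not state which k-means implementation or initialization was used.

```python
from statistics import mean, pstdev

def z_scores(column):
    """Standardize one variable to mean 0 and standard deviation 1."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic start: each new centroid is the point farthest from those chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids

def kmeans_wss(points, k, iterations=25):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centroids = farthest_first(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(mean(dim) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)

# Toy data with three obvious groups (hypothetical values, not from the study).
deposits = [10, 12, 11, 100, 105, 98, 300, 310, 295]
gambling_days = [1, 2, 1, 10, 11, 9, 25, 27, 26]
points = list(zip(z_scores(deposits), z_scores(gambling_days)))

wss_by_k = {k: kmeans_wss(points, k) for k in range(1, 6)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 3))
```

Plotting `wss_by_k` against `k` gives the elbow chart. In this toy example the curve flattens after three clusters, in the same way that the authors read a five-cluster solution from Fig. 4.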
Fig. 3 Receiver operating curve of the random forest and a gradient boost algorithm on the test data
Fig. 4 Elbow chart visualizing the optimal number of clusters for the given dataset. (The red circle indicates that five clusters are the best possible solution)
Table 1 reports the average values for each of the five clusters. Self-reported problem
gambling and being female were not used in the cluster analysis. In total, 26% of the 945
players reported problem gambling based on their PGSI responses. The percentage of
self-reported problem gambling differed across the clusters. The largest percentage of PGs
was found in Cluster 1 (n = 124; 43%). None of the four other clusters had a percentage of
PGs above average. Cluster 5 had the lowest percentage of PGs (n = 13; 10%). With a mean
age of 31 years, players in Cluster 1 had the lowest mean age. The mean age across all 945
players was 40 years. The percentage of women in Cluster 1 (37%) was also lower than
the average percentage of women (45%). Only Cluster 3 had a lower percentage of women
(33%). On average, players in Cluster 1 deposited €385 (in the 30 days prior to answering
the PGSI), which was lower than the total average of €568. Only players in Cluster 3
deposited less money (€269). On average, players in Cluster 1 bet €2724, which was lower than
the total average of €5922. Only players in Cluster 3 bet less money (€2372).
On average, players in Cluster 1 gambled on 5.53 days during the previous 30-day period,
which was less frequent than the total average of seven days. According to Table 1, only
players in Cluster 3 (4.91 days) and Cluster 4 (5.32 days) gambled less frequently. Players in
Cluster 1 lost €45.77 per gambling day during the previous 30-day period, which was more than
the total average loss per gambling day
(€15.24). Only players in Cluster 3 lost more money per gambling day (€58.14). In Table 1, a
negative loss metric refers to a loss, which means the amount bet was larger than the amount
won. A positive loss metric refers to a win. On average, players in Cluster 4 won €91.70 per
gambling day. Players in Cluster 1 deposited 1.37 times per session. This was the highest value
across all clusters. Across all players, the average was 1.11 deposits per session. Players in
Cluster 1 lost €33.14 per session, which was higher than the average loss across all players
(€11.31). Only players in Cluster 3 lost more per session (€33.50). Moreover, 93% of players in
Cluster 1 usually gambled until they had less than €5 in their gambling account. In total, this
behavior occurred among 67% of all players. All other clusters’ respective values were lower.
In total, 21% of players took a play break. Cluster 1 had the largest percentage of players
taking play breaks (27%).
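Because the PGSI classification was not an input to the clustering, the share of PGs per cluster has to be computed afterwards by joining the cluster labels with the self-report outcome. A minimal sketch of that profiling step is shown below. The data are invented and the function name is an assumption.

```python
from collections import defaultdict

def pg_share_by_cluster(assignments):
    """assignments: iterable of (cluster_label, is_problem_gambler) pairs."""
    totals = defaultdict(int)
    problem = defaultdict(int)
    for cluster, is_pg in assignments:
        totals[cluster] += 1
        problem[cluster] += int(is_pg)
    return {cluster: problem[cluster] / totals[cluster] for cluster in sorted(totals)}

# Toy assignments (hypothetical): cluster label and PGSI-based classification.
toy = [(1, True), (1, True), (1, False), (1, False),
       (2, True), (2, False), (2, False), (2, False)]
shares = pg_share_by_cluster(toy)
overall = sum(is_pg for _, is_pg in toy) / len(toy)
above_average = [c for c, share in shares.items() if share > overall]
print(shares, overall, above_average)
```

The comparison against `overall` mirrors the statement that only Cluster 1 had a percentage of PGs above the sample average of 26%.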
Greater harm problem gamblers
Next, the authors selected a subgroup of PGs based on specific PGSI items. The nine items
in the PGSI are equally weighted, but some items are far more indicative of problem gambling
than others. For instance, responding “almost always” to some items
on the PGSI (e.g., “Have you felt that you might have a problem with gambling?”, “Has
gambling caused you any health problems, including stress or anxiety?” and “Has your
gambling caused any financial problems for you or your household?”) is much more
strongly associated with problem gambling than giving the same response to items such as
borrowing money from others to gamble and being criticized by others for gambling. Out of the
248 players who scored eight or above on the PGSI, 79 players answered at least one of the
three aforementioned questions (those considered more indicative of gambling harm) with
“almost always”. Consequently, 8.4% of the 945 players were in this subgroup of PGs.
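The subgroup rule can be expressed as a short predicate. The sketch below maps the three quoted questions to items 5, 6 and 8 of Appendix 1 and uses the answer coding given there (0 to 3). The function and constant names are illustrative assumptions.

```python
ALMOST_ALWAYS = 3
GREATER_HARM_ITEMS = (5, 6, 8)  # 1-based item numbers from Appendix 1
PROBLEM_GAMBLING_THRESHOLD = 8

def is_greater_harm(item_scores):
    """item_scores: nine integers (0-3), ordered as in Appendix 1."""
    if sum(item_scores) < PROBLEM_GAMBLING_THRESHOLD:
        return False  # not a problem gambler, so not in the subgroup
    return any(item_scores[i - 1] == ALMOST_ALWAYS for i in GREATER_HARM_ITEMS)

print(is_greater_harm([1, 1, 1, 0, 3, 1, 0, 1, 0]))  # PG with item 5 "almost always"
print(is_greater_harm([3, 3, 3, 0, 0, 0, 0, 0, 0]))  # PG, but none of items 5, 6, 8
print(is_greater_harm([0, 0, 0, 0, 3, 0, 0, 0, 0]))  # item 5 "almost always", score below 8
```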
Table 2 reports average values of the 248 PGs, the subgroup of 79 greater harm problem
gamblers (GHPGs) and the remaining 697 NPGs. The three numbers do not sum up to the
sample size, because the 79 GHPGs are included in the 248 PGs. PGs as well as the GHPGs
were younger than NPGs. PGs were on average 37 years old, the GHPGs were on average
36 years old, and NPGs were on average 43 years old. A total of 49% of NPGs were female.
Moreover, 40% of PGs and 32% of GHPGs were female. PGs and GHPGs deposited less
money, bet less money, and gambled less frequently than NPGs. On average, GHPGs lost
more money per gambling day (€122.15) as well as more money per session (€71.78) than
all the PGs (€68.24; €42.73). On average, NPGs deposited money once (1.04) per session.
PGs deposited 1.41 times per session and GHPGs deposited 1.53 times per session.
On average, PGs (€63) and GHPGs (€96.08) deposited more money per session than NPGs
(€49.35). Two-thirds of NPGs (65%) typically gambled until less than €5 was left in their
gambling account. The respective values for PGs and GHPGs were 78% and 79%. A total
of 12% of NPGs had play breaks in the 30 days prior to answering the PGSI questions,
compared with 46% of PGs and 59% of GHPGs. The average money bet per game
by PGs was €3.30, and GHPGs bet €5.73 per game. On average, NPGs bet €3.21 per game.
The same pattern was found for the standard deviation of the bet. NPGs’ average standard
deviation of the bet was €3.54, PGs’ average standard deviation of the bet was €3.79 and the
GHPGs’ standard deviation of the bet was €6.79. The average profile of PGs and NPGs was
similar to the findings in the cluster analysis. Compared to the entire group of PGs, GHPGs
deviated more from NPGs with respect to all the metrics listed in Table 2.
Table 1 Average values for each of the five computed clusters (with clusters sorted according to size)
                                                        Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5  Total
PG                                                      43%        25%        23%        13%        10%        26%
Age                                                     31         35         54         45         45         40
Female                                                  37%        43%        33%        79%        47%        45%
Amount of money deposited (€)                           385        630        269        470        1361       568
Amount of money bet (€)                                 2724       6897       2372       8340       13339      5922
Number of gambling days                                 5.53       6.00       4.91       5.32       19.71      7
Average monetary loss per gambling day (€)              -45.77     -18.85     -58.14     91.70      -1.73      -15.24
Average number of deposits per session                  1.37       1.13       1.14       0.77       0.85       1.11
Average monetary loss per session (€)                   -33.14     -6.26      -33.50     43.31      -1.49      -11.31
Percentage of sessions ending with low account balance  93%        54%        87%        15%        59%        67%
Play break (yes/no)                                     27%        20%        24%        14%        14%        21%
Number                                                  287        209        176        141        132        945
Percentage                                              30%        22%        19%        15%        14%
Discussion
Between September 2021 and February 2022, 1,287 players of a European online gambling
site answered the nine questions of the Problem Gambling Severity Index (PGSI). The
frequency of each PGSI score, ranging from 0 to 27, is displayed in Fig. 1. As expected,
the distribution is skewed with more players scoring in the lower range and fewer players
scoring in the higher range. However, there is a discrete step between scores of 26 and 27 on
the PGSI. Sixty participants (4.66%) answered all nine questions of the PGSI with “almost
always”. The present authors speculate that this spike was caused by participants who did
not read the questions sufficiently and simply answered each question “almost always”. A
similar spike was observed for players answering each question “never”. The authors also
had access to the time between navigating to the PGSI page and pressing the submit button
after answering all nine questions. After removing participants with unreasonably short
response times, the spike at a score of 27 disappeared (see Fig. 1). As far as the present
authors are aware, this is the first time that a study has accurately measured the response
times taken to complete a problem gambling screen. The results clearly indicate that response
time can be a crucial aspect in improving data quality.
In the present study, 26% of participants were PGs, which corresponds to a PGSI score
of 8 or above. The relatively high rate of self-reported problem gambling among the present
sample of online casino players is in line with previous findings. Lopez-Gonzalez et al.
(2018) collected responses to the PGSI in a sample of 659 Spanish sports-bettors. One-fifth
of them had a score of 8 or above and were classed as PGs (19.1%). Håkansson and
Widinghoff (2020) surveyed a sample of 1,004 Swedish online gamblers examining problem
gambling symptoms (using the PGSI). They reported that 44% of those reporting both past
30-day online casino gambling and live betting were problem gamblers. Moreover, 18% of those
reporting online casino gambling but no live betting were problem gamblers.
The high percentage of problem gamblers in self-report studies is in stark contrast to
previous studies classifying problem gamblers using pure behavioral tracking data. McAuliffe
et al. (2022) reported that less than 1% of players were regarded as high-risk based on
the Senet Group’s markers of harm. They also reported that Entain, which was part of the
group of companies which defined the markers of harm, identified less than 6% of players
as being high risk. Based on player tracking data, Catania and Griffiths (2021a) found that
only 1.21% of their sample displayed elevated values on DSM-5 criteria for gambling disorder
(although 33% were classed as at-risk gamblers). The large discrepancy between actual
gambling expenditure and self-reported gambling identified by previous studies (Auer &
Griffiths, 2017; Braverman et al., 2014) could help to explain the discrepancy between
the frequency of self-reported problem gambling and the proportion of high-risk
players based on behavioral tracking data.
Table 2 Average values for problem gamblers, greater harm problem gamblers, and non-problem gamblers
                                                        PGs        GHPGs      NPGs
N                                                       248 (26%)  79 (8.4%)  697 (74%)
Age                                                     37         36         43
Female                                                  40%        32%        49%
Amount of money deposited (€)                           432        478        631
Amount of money bet (€)                                 3253       2705       7032
Number of gambling days                                 5.79       4.76       8.30
Average monetary loss per gambling day (€)              -68.24     -122.15    4.22
Average number of deposits per session                  1.41       1.53       1.04
Average amount of money deposited per session (€)       63.00      96.08      49.35
Average money loss per session (€)                      -42.73     -71.87     -0.06
Percentage of sessions ending with low account balance  78%        79%        65%
Play break (yes/no)                                     46%        59%        12%
Average bet per game (€)                                3.30       5.73       3.21
Standard deviation bet (€)                              3.79       6.79       3.54
Two AI algorithms were used to predict PG based on player tracking features from the 30 days
prior to answering the PGSI. A random forest model achieved an AUC of 0.729 and a gradient
boost machine model achieved an AUC of 0.67. Both goodness-of-fit statistics were computed
on a test set which was left out from the model training. This level of model accuracy
is in line with previous results. Louderback et al. (2021) used the Brief Biosocial Gambling
Screen to assess self-reported problem gambling and they reported AUC values between
0.580 and 0.657. Luqiens et al. (2016) predicted self-reported problem gambling (using the
PGSI) in a sample of online poker players and reported an AUC of 0.73.
In order to provide greater insights into the association between the player tracking
features and self-reported problem gambling, a cluster analysis was performed. One cluster
contained 43% PGs. Players in this cluster lost more money per gambling day and session,
deposited more frequently per session, and depleted their gambling account in sessions more
frequently. They also had more play breaks in the 30 days prior to answering the PGSI.
However, in total they deposited less money, bet less money, and played less frequently.
The higher likelihood of depositing within sessions and the higher likelihood of depleting
the online account within-session could be indications of impaired self-control. Previous
studies have suggested that online gambling might have negative impacts on self-control
(Siemens et al., 2011). Two previous player tracking studies used frequent depositing as
proxy measures for chasing losses. Perrot et al. (2018) operationalized chasing losses as
either three or more deposits within a 12-hour period or a deposit less than one hour after a
previous bet. One subgroup of players was characterized by a high gambling activity and a
high probability of chasing behavior. Challet-Bouju et al. (2020) used the same operational-
ization of chasing losses as Perrot et al. (2018). In a cluster analysis they found a segment of
players with a high gambling activity which was associated with a high number of chasing
episodes.
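The operationalization of chasing losses attributed to Perrot et al. (2018) above can be sketched as a rule over timestamps. The handling of boundary cases (exactly 12 hours, exactly one hour) and the function name are assumptions, because the description above does not specify them.

```python
from datetime import datetime, timedelta

def has_chasing_episode(deposit_times, bet_times):
    """Flag chasing: 3+ deposits within 12 hours, or a deposit under 1 hour after a bet."""
    deposits = sorted(deposit_times)
    # Rule 1: any three consecutive deposits that fall within a 12-hour window.
    for first, third in zip(deposits, deposits[2:]):
        if third - first <= timedelta(hours=12):
            return True
    # Rule 2: a deposit made less than one hour after a previous bet.
    for deposit in deposits:
        if any(timedelta(0) < deposit - bet < timedelta(hours=1) for bet in bet_times):
            return True
    return False

t0 = datetime(2022, 1, 1, 10, 0)  # arbitrary toy timestamp
three_quick = has_chasing_episode(
    [t0, t0 + timedelta(hours=2), t0 + timedelta(hours=5)], [])
spread_out = has_chasing_episode(
    [t0, t0 + timedelta(hours=13), t0 + timedelta(hours=26)], [])
soon_after_bet = has_chasing_episode([t0 + timedelta(minutes=30)], [t0])
long_after_bet = has_chasing_episode([t0 + timedelta(hours=3)], [t0])
print(three_quick, spread_out, soon_after_bet, long_after_bet)
```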
The present authors developed a subgroup of PGs based on three PGSI items which
appear to be more strongly associated with problem gambling (“Have you felt that you
might have a problem with gambling?”, “Has gambling caused you any health problems,
including stress or anxiety?” and “Has your gambling caused any financial problems for
you or your household?”). A total of 8.4% of players answered at least one of these three
questions with “almost always”. This subgroup of PGs (‘greater harm problem gamblers’
[GHPGs]) lost more money per session and per active gambling day and deposited money
more frequently per session. In total they gambled and deposited less than all PGs.
Three-fifths of the GHPGs (59%) had play breaks compared to 46% of all PGs. This is in line
with expectations, as the GHPGs’ health and/or finances were impacted by gambling.
In the 30 days prior to answering the PGSI, PGs and GHPGs bet and deposited less
than NPGs. However, the PGs and GHPGs deposited more per session, lost more money
per session and day, and deposited more frequently per session. At first glance this seems
contradictory. The explanation lies most likely in the fact that PGs were much more likely
to self-exclude at some point in time during the 30 days prior to answering the PGSI. This
limited the number of days on which they could gamble, which is reflected in the lower number
of gambling days compared to NPGs. It is concluded that PGs played less frequently due
to self-exclusion, but on the days they gambled, they spent more than NPGs. This is in line
with previous studies which found that PGs spend more money than NPGs (Louderback et
al., 2021; Luqiens et al., 2016). The increased likelihood of self-exclusions among PGs
supports the notion that self-exclusion behavior is correlated with PG. Several previous studies
have used self-exclusion as a proxy for problem gambling (e.g., Dragicevic et al., 2015;
Percy et al., 2016; Finkenwirth et al., 2021).
One of the most influential metrics reported by the random forest algorithm was age. This
was also evident in the cluster analysis and in the average profiles of PGs and NPGs. PGs
were younger than NPGs. In their analysis of online poker players, Luqiens et al. (2016)
also found PGs to be younger than NPGs. Although gender was not selected by the machine
learning algorithms, there was a clear difference between PGs and NPGs. That difference
was also evident in the cluster analysis. The percentage of females in the PG group was
lower than that in the NPG group. Previous studies have also found problem gambling to be
more likely among males than females (e.g., Economou et al., 2019; Fröberg et al., 2015;
Husky et al., 2015).
The most important variables predicting self-reported problem gambling were age,
amount of money deposited, amount of money bet, number of gambling days, average
monetary loss per gambling day, average monetary loss per session, average number of
monetary deposits per session, account depletion, and number of play breaks. However,
analysis of these variables does not provide information about the direction of the asso-
ciation between independent variables and the dependent variable. Younger players for
example might have an elevated risk or a decreased risk. Consequently, additional cluster
analysis was performed which provided additional evidence concerning the variables most
predictive of problem gambling.
The findings regarding frequent depositing and depleting the gambling account balance
are particularly interesting because they are fundamental to most online gambling operators’
marketing practices. Players have to deposit before they can play and, to the best of the
present authors’ knowledge, online gambling operators are trying to make this process as easy
and as frictionless as possible. Often players can deposit with one click and/or are reminded
when their account balance decreases. The present study’s findings question these practices
and suggest that frequent depositing should be made more difficult. The present authors are
not aware of any regulation which would limit depositing frequency in short time periods or
prohibit operators from enticing monetary depositing within sessions.
The findings will be of interest to many different stakeholder groups including the gambling
industry, gambling policymakers, gambling regulators and researchers in the gambling
studies field. The findings provide empirical evidence concerning the most important
behavioral indicators of problem gambling which could be used by (i) the gambling industry
to help identify problem gamblers using account-based data, (ii) gambling policymakers
and regulators to make evidence-based informed decisions and policies in the area of player
protection and harm-minimization, and (iii) researchers in the gambling studies field to
replicate and/or build on the findings reported here with other samples from different gambling
operators and different countries.
Limitations
The present study has a number of limitations that should be considered when interpreting
the study’s key findings. First, only a relatively small number of participants answered the
PGSI questions, which were then used to train the AI algorithms. Second, the PGSI data were
self-report and therefore subject to established method biases (e.g., social desirability).
However, given that the self-report data appeared to support the objective player tracking
data, the self-report data would appear to have good face validity. Third, the study was
conducted with players from just one European online gambling operator during a specific (and
relatively short) period of time and therefore the data are not necessarily representative of
online gamblers more generally. The results could vary across operators and jurisdictions, as
well as other time periods. However, the main findings are in line with the findings from
previous research which identified that frequent session deposits were correlated with higher
gambling intensity. Additionally, the study’s validity was further improved by the fact that
the response time to the PGSI was measured and used to help identify potentially unreliable
answers. As this is the first study to correlate self-reported problem gambling with player
tracking data, future replication studies should be conducted with data from different
operators in other jurisdictions, utilize larger sample sizes, and study gambling behavior
over longer time periods (e.g., six months or a year).
Conclusions
The present study showed that self-reported problem gambling can be predicted by AI
algorithms with high accuracy based on player tracking data. The reported model accuracies
were in line with previous prediction studies in the area of responsible gambling. The
results also supported Auer and Griffiths’ (2016) assertion that not all PGs self-exclude and
vice versa. The GHPGs spent more money, deposited money more frequently within sessions,
and depleted the gambling account more frequently compared to all PGs and NPGs.
Numerous jurisdictions require operators to identify problem gambling based on
behavioral tracking data. For example, Sweden requires operators to monitor younger players
more thoroughly (Svenska Spel, 2020). This is supported by the fact that PGs were
younger in the present study. The findings of the present study provide more insight into
significant metrics and demographic differences concerning problem gamblers by using a mix
of objective (account-based tracking) data and subjective (self-report) data.
Appendix 1: Problem Gambling Severity Index items
Item number and question
(1) Have you bet more than you could really afford to lose?
(2) Have you needed to gamble with larger amounts of money to get the same excitement?
(3) Have you gone back to try to win back the money you’d lost?
(4) Have you borrowed money or sold anything to get money to gamble?
(5) Have you felt that you might have a problem with gambling?
(6) Have you felt that gambling has caused you any health problems, including stress or anxiety?
(7) Have people criticized your betting, or told you that you have a gambling problem, whether or not you thought it is true?
(8) Have you felt your gambling has caused financial problems for you or your household?
(9) Have you felt guilty about the way you gamble or what happens when you gamble?
Individuals can answer: Never (0), Sometimes (1), Most of the Time (2), Almost Always (3)
Appendix 2: Player tracking features based on the 30 days prior to answering the PGSI
Feature number  Feature
1               Age (in years)
2               Gender
3               Number of play breaks
4               Number of voluntary limit changes
5               Number of bets
6               Amount of money bet
7               Average bet amount
8               Standard deviation bet
9               Number of deposits
10              Amount of money deposited
11              Standard deviation deposits
12              Amount of money won
13              Amount of money lost (amount won minus amount bet)
14              Number of sessions
15              Total session length (in minutes)
16              Number of different gambling days
17              Average number of deposits per gambling day
18              Average number of deposits per session
19              Average amount of money lost per gambling day
20              Average monetary loss per session
21              Average amount of money deposited per gambling day
22              Average amount of money deposited per session
23              Percent of sessions ending with low account balance
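Several of the features listed above are simple aggregates of session-level records. The sketch below derives features 13, 16, 18, 19, 20 and 23 for one player. The record layout and function name are hypothetical, and the €5 low-balance rule is taken from the Results section. The raw data format of the operator is not described in the paper.

```python
LOW_BALANCE_EUR = 5.0  # "less than €5" left in the account, as described in the Results

def player_features(sessions):
    """Aggregate one player's sessions from the 30 days before the PGSI response."""
    n_sessions = len(sessions)
    gambling_days = len({s["day"] for s in sessions})
    # Feature 13: amount won minus amount bet, so a negative value is a loss.
    amount_lost = sum(s["won"] - s["bet"] for s in sessions)
    n_deposits = sum(s["deposits"] for s in sessions)
    low_balance = sum(1 for s in sessions if s["end_balance"] < LOW_BALANCE_EUR)
    return {
        "amount_lost": amount_lost,                               # feature 13
        "gambling_days": gambling_days,                           # feature 16
        "avg_deposits_per_session": n_deposits / n_sessions,      # feature 18
        "avg_loss_per_gambling_day": amount_lost / gambling_days, # feature 19
        "avg_loss_per_session": amount_lost / n_sessions,         # feature 20
        "pct_sessions_low_balance": 100 * low_balance / n_sessions,  # feature 23
    }

# Toy sessions for a single hypothetical player (amounts in euros).
sessions = [
    {"day": 1, "bet": 100, "won": 60, "deposits": 2, "end_balance": 0.5},
    {"day": 1, "bet": 50, "won": 70, "deposits": 0, "end_balance": 40.0},
    {"day": 3, "bet": 200, "won": 120, "deposits": 1, "end_balance": 2.0},
    {"day": 7, "bet": 40, "won": 10, "deposits": 1, "end_balance": 30.0},
]
features = player_features(sessions)
print(features)
```

With this sign convention the loss features are negative for losing players, which is how the loss rows in Tables 1 and 2 are reported.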
Funding None received.
Data Availability The data for this study are commercially sensitive and are not publicly available.
Declarations
Conflict of interest The second author’s university currently receives funding from Norsk Tipping (the gam-
bling operator owned by the Norwegian Government). The second author has received funding for a number
of research projects in the area of gambling education for young people, social responsibility in gambling and
gambling treatment from Gamble Aware (formerly the Responsibility in Gambling Trust), a charitable body
which funds its research program based on donations from the gambling industry. Both authors undertake
consultancy for various gaming companies in the area of social responsibility in gambling.
Ethical approval – Ethical approval was provided by the ethics committee of Nottingham Trent University.
Informed consent – Not applicable. Secondary data analysis.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,
and indicate if changes were made. The images or other third party material in this article are included in the
article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is
not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright
holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Abbott, M., Romild, U., & Volberg, R. (2018). The prevalence, incidence, and gender and age-specific
incidence of problem gambling: results of the Swedish longitudinal gambling study (Swelogs). Addiction,
113(4), 699–707
Allami, Y., Hodgins, D. C., Young, M., Brunelle, N., Currie, S., Dufour, M. … Nadeau, L. (2021). A meta-
analysis of problem gambling risk factors in the general adult population. Addiction, 116(11), 2968–2977
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.).
Arlington, VA: American Psychiatric Publishing
Auer, M., & Griffiths, M. D. (2013). Behavioral tracking tools, regulation, and corporate social responsibility
in online gambling. Gaming Law Review and Economics, 17(8), 579–583
Auer, M., & Griffiths, M. D. (2016). Should voluntary “self-exclusion” by gamblers be used as a proxy
measure for problem gambling? MOJ Addiction Medicine & Therapy, 2(2), 00019
Auer, M., & Griffiths, M. D. (2017). Self-reported losses versus actual losses in online gambling: An
empirical study. Journal of Gambling Studies, 33(3), 795–806
Auer, M., & Griffiths, M. D. (2022). Predicting limit-setting behavior of gamblers using machine learning
algorithms: A real-world study of Norwegian gamblers using account data. International Journal of
Mental Health and Addiction, 20, 771–778
Baggio, S., Gainsbury, S. M., Starcevic, V., Richard, J. B., Beck, F., & Billieux, J. (2018). Gender differences
in gambling preferences and problem gambling: A network-level analysis. International Gambling
Studies, 18(3), 512–525
Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning
algorithms. Pattern Recognition, 30(7), 1145–1159
Braverman, J., & Shaffer, H. J. (2012). How do gamblers start gambling: Identifying behavioural markers for
high-risk internet gambling. European Journal of Public Health, 22, 273–278
Braverman, J., Tom, M. A., & Shaffer, H. J. (2014). Accuracy of self-reported versus actual online gambling
wins and losses. Psychological Assessment, 26(3), 865
Calado, F., & Griffiths, M. D. (2016). Problem gambling worldwide: An update of empirical research
(2000–2015). Journal of Behavioral Addictions, 5, 592–613
Castrén, S., Heiskanen, M., & Salonen, A. H. (2018). Trends in gambling participation and gambling
severity among Finnish men and women: Cross-sectional population surveys in 2007, 2010 and 2015. BMJ
Open, 8(8), e022129
Catania, M., & Griffiths, M. D. (2021a). Applying the DSM-5 criteria for gambling disorder to online
gambling account-based tracking data: An empirical study utilizing cluster analysis. Journal of Gambling
Studies. https://doi.org/10.1007/s10899-021-10080-9. Advance online publication
Catania, M., & Griffiths, M. D. (2021b). Understanding online voluntary self-exclusion in gambling: An
empirical study using account-based behavioral tracking data. International Journal of Environmental
Research and Public Health, 18(4), 2000
Challet-Bouju, G., Hardouin, J. B., Thiabaud, E., Saillard, A., Donnio, Y., Grall-Bronnec, M., & Perrot, B.
(2020). Modeling early gambling behavior using indicators from online lottery gambling tracking data:
Longitudinal analysis. Journal of Medical Internet Research, 22(8), e17675
Chóliz, M., Marcos, M., & Lázaro-Mateo, J. (2021). The risk of online gambling: A study of gambling dis-
order prevalence rates in Spain. International Journal of Mental Health and Addiction, 19(2), 404–417
Cerasa, A., Lofaro, D., Cavedini, P., Martino, I., Bruni, A., Sarica, A. … Quattrone, A. (2018). Personality
biomarkers of pathological gambling: A machine learning study. Journal of Neuroscience Methods,
294, 7–14
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Journal of Gambling Studies (2023) 39:1273–1294
Chóliz, M. (2016). The challenge of online gambling: the effect of legalization on the increase in online
gambling addiction. Journal of Gambling Studies, 32(2), 749–756
Delfabbro, P. H., King, D. L., & Griffiths, M. D. (2012). Behavioural profiling of problem gamblers: A critical
review. International Gambling Studies, 12, 349–366
Deng, X., Lesch, T., & Clark, L. (2019). Applying data science to behavioral analysis of online gambling.
Current Addiction Reports, 6(3), 159–164
Dragicevic, S., Percy, C., Kudic, A., & Parke, J. (2015). A descriptive analysis of demographic and behavioral
data from internet gamblers and those who self-exclude from online gambling platforms. Journal of
Gambling Studies, 31(1), 105–132
Duvarci, I., & Varan, A. (2001). Reliability and validity study of the Turkish form of the South Oaks Gam-
bling Screen. Turk Psikiyatri Dergisi, 12, 34–45
Economou, M., Souliotis, K., Malliori, M., Peppou, L. E., Kontoangelos, K., Lazaratou, H. … Papageorgiou,
C. (2019). Problem gambling in Greece: prevalence and risk factors during the financial crisis. Journal
of Gambling Studies, 35(4), 1193–1210
Ferris, J., & Wynne, H. (2001). The Canadian Problem Gambling Index: Final report. Ottawa: Canadian
Centre on Substance Abuse
Finkenwirth, S., MacDonald, K., Deng, X., Lesch, T., & Clark, L. (2021). Using machine learning to predict
self-exclusion status in online gamblers on the PlayNow.com platform in British Columbia.
International Gambling Studies, 21(2), 220–237
Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics,
29(5), 1189–1232
Fröberg, F., Rosendahl, I. K., Abbott, M., Romild, U., Tengström, A., & Hallqvist, J. (2015). The incidence
of problem gambling in a representative cohort of Swedish female and male 16–24 year-olds by socio-
demographic characteristics, in comparison with 25–44 year-olds. Journal of Gambling Studies, 31(3),
621–641
Gainsbury, S. (2014). AGRC discussion paper on interactive gambling. Melbourne: Australian Gambling
Research Centre
Gainsbury, S. M., Russell, A., Hing, N., Wood, R., & Blaszczynski, A. (2013). The impact of internet gam-
bling on gambling problems: A comparison of moderate-risk and problem Internet and non-Internet
gamblers. Psychology of Addictive Behaviors, 27(4), 1092–1101
Griths, M. (2003). Internet gambling: Issues, concerns, and recommendations. CyberPsychology & Behav-
ior, 6(6), 557–568
Griths, M. D., & Whitty, M. W. (2010). Online behavioural tracking in internet gambling research: Ethical
and methodological issues. International Journal of Internet Research Ethics, 3, 104–117
Haeusler, J. (2016). Follow the money: Using payment behaviour as predictor for future self-exclusion.
International Gambling Studies, 16(2), 246–262
Håkansson, A., & Widinghoff, C. (2020). Over-indebtedness and problem gambling in a general population
sample of online gamblers. Frontiers in Psychiatry, 11, 7
Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating character-
istic (ROC) curve. Radiology, 143(1), 29–36
Hing, N., Russell, A. M. T., Gainsbury, S. M., & Blaszczynski, A. (2015). Characteristics and help-seeking
behaviors of Internet gamblers based on most problematic mode of gambling. Journal of Medical
Internet Research, 17(1), e3781
Holtgraves, T. (2009). Evaluating the problem gambling severity index. Journal of Gambling Studies, 25(1),
105–120
Hopfgartner, N., Auer, M., Santos, T., Helic, D., & Griffiths, M. D. (2021). The effect of mandatory play
breaks on subsequent gambling behavior among Norwegian online sports betting, slots and bingo play-
ers: A large-scale real world study. Journal of Gambling Studies. Advance online publication. https://
doi.org/10.1007/s10899-021-10078-3
Husky, M. M., Michel, G., Richard, J. B., Guignard, R., & Beck, F. (2015). Gender differences in the
associations of gambling activities and suicidal behaviors with problem gambling in a nationally representative
French sample. Addictive Behaviors, 45, 45–50
Jain, A. K. (2008). Data clustering: 50 years beyond k-means. In: Joint European Conference on Machine
Learning and Knowledge Discovery in Databases (pp. 3–4). Springer, Berlin, Heidelberg
Kaufman, L., & Rousseeuw, P. J. (2009). Finding groups in data: An introduction to cluster analysis. Chich-
ester: John Wiley & Sons
Kessler, R. C., Hwang, I., LaBrie, R., Petukhova, M., Sampson, N. A., Winters, K. C., & Shaffer, H. J. (2008).
DSM-IV pathological gambling in the National Comorbidity Survey Replication. Psychological Medi-
cine, 38(9), 1351–1360
Kuss, D. J., & Griths, M. D. (2012). Internet gambling behavior. In Z. Yan (Ed.), Encyclopedia of Cyber
Behavior (pp. 735–753). Hershey, PA: IGI Global
LaPlante, D. A., Nelson, S. E., LaBrie, R. A., & Shaffer, H. J. (2006). Men & women playing games: Gender
and the gambling preferences of Iowa gambling treatment program participants. Journal of Gambling
Studies, 22(1), 65–80
Lesieur, H. R. (1979). The compulsive gambler’s spiral of options and involvement. Psychiatry, 42(1), 79–87
Lesieur, H. R., & Blume, S. B. (1987). The South Oaks Gambling Screen (The SOGS): A new instrument for
the identication of pathological gamblers. American Journal of Psychiatry, 144, 1184–1188
Liaw, A., & Wiener, M. (2002). Classication and regression by random forest. R News, 2(3), 18–22
Likas, A., Vlassis, N., & Verbeek, J. J. (2003). The global k-means clustering algorithm. Pattern Recognition,
36(2), 451–461
Ling, C. X., Huang, J., & Zhang, H. (2003). AUC: A better measure than accuracy in comparing learning
algorithms. In: Conference of the Canadian Society for Computational Studies of Intelligence (pp. 329–
341). Springer, Berlin, Heidelberg
Lopez-Gonzalez, H., Estévez, A., & Griffiths, M. D. (2018). Spanish validation of the Problem Gambling
Severity Index: A confirmatory factor analysis with sports bettors. Journal of Behavioral Addictions,
7(3), 814–820
Lopez-Gonzalez, H., Griths, M. D., & Jiménez-Murcia, S. (2021). The erosion of intimacy and non-gam-
bling spheres by smartphone gambling: A qualitative study on workplace, bedtime, and bathroom dis-
ordered gambling. Mobile Media & Communication, 9, 254–273
Louderback, E. R., LaPlante, D. A., Currie, S. R., & Nelson, S. E. (2021). Developing and validating lower
risk online gambling thresholds with actual bettor data from a major internet gambling operator. Psy-
chology of Addictive Behaviors, 35(8), 921–938
Luquiens, A., Tanguy, M. L., Benyamina, A., Lagadec, M., Aubin, H. J., & Reynaud, M. (2016). Tracking
online poker problem gamblers with player account-based gambling data only. International Journal of
Methods in Psychiatric Research, 25(4), 333–342
McAulie, W. H., Louderback, E. R., Edson, T. C., LaPlante, D. A., & Nelson, S. E. (2022). Using “markers
of harm” to track risky gambling in two cohorts of online sports bettors. Journal of Gambling Studies.
https://doi.org/10.1007/s10899-021-10097-0. Advance online publication
McBride, J., & Derevensky, J. (2012). Internet gambling and risk-taking among students: An exploratory
study. Journal of Behavioral Addictions, 1(2), 50–58
McGee, D. (2020). On the normalisation of online sports gambling among young adult men in the UK: A
public health perspective. Public Health, 184, 89–94
Mohamad, I. B., & Usman, D. (2013). Standardization and its effects on K-means clustering algorithm.
Research Journal of Applied Sciences Engineering and Technology, 6(17), 3299–3303
Mora-Salgueiro, J., García-Estela, A., Hogg, B., Angarita-Osorio, N., Amann, B. L., Carlbring, P. … Colom,
F. (2021). The prevalence and clinical and sociodemographic factors of problem online gambling: A
systematic review. Journal of Gambling Studies, 37(3), 899–926
Narayan, N. (2020, April 08). BGC to take over assets and responsibilities of Senet Group. Retrieved
March 12, 2021, from https://europeangaming.eu/portal/latest-news/2020/04/08/68038/
bgc-to-takeover-assets-and-responsibilities-of-senet-group/
Narkhede, S. (2018). Understanding auc-roc curve. Towards Data Science, 26(1), 220–227
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O. … Duchesnay, E. (2011).
Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830
Percy, C., França, M., Dragičević, S., & d’Avila Garcez, A. (2016). Predicting online gambling self-exclu-
sion: An analysis of the performance of supervised machine learning models. International Gambling
Studies, 16(2), 193–210
Perrot, B., Hardouin, J. B., Grall-Bronnec, M., & Challet‐Bouju, G. (2018). Typology of online lotteries and
scratch games gamblers’ behaviours: A multilevel latent class cluster analysis applied to player account-
based gambling data.International Journal of Methods in Psychiatric Research, 27(4), e1746
Petry, N. M. (2016). Gambling disorder: The first officially recognized behavioral addiction. In N. M. Petry
(Ed.), Behavioral addictions: DSM-5® and beyond (pp. 7–42). New York, NY: Oxford University Press
Pettit, R. W., Fullem, R., Cheng, C., & Amos, C. I. (2021). Artificial intelligence, machine learning, and deep
learning for clinical outcome prediction. Emerging Topics in Life Sciences, 5(6), 729–745
Philander, K. S., & MacKay, T. L. (2014). Online gambling participation and problem gambling severity: Is
there a causal relationship? International Gambling Studies, 14(2), 214–227
Potenza, M. N., Maciejewski, P. K., & Mazure, C. M. (2006). A gender-based examination of past-year rec-
reational gamblers. Journal of Gambling Studies, 22(1), 41–64
PwC & Responsible Gambling Council (2017). Remote gambling research: Interim report on Phase 2.
London: Gamble Aware. Retrieved February 27, 2022, from: www.gamble-aware remote-gambling-
research phase-2 pwc-report august-2017-final.pdf
Rodríguez, P., Humphreys, B. R., & Simmons, R. (2017). Economics of sports betting. Northampton, UK:
Edward Elgar Publishing
Rossow, I. (2019). The total consumption model applied to gambling: Empirical validity and implications for
gambling policy. Nordic Studies on Alcohol and Drugs, 36(2), 66–76
Shaer, H. J., Hall, M. N., & Vander Bilt, J. (1999). Estimating the prevalence of disordered gambling
behavior in the United States and Canada: A research synthesis. American Journal of Public Health,
89, 1369–1376
Scholes-Balog, K. E., & Hemphill, S. A. (2012). Relationships between online gambling, mental health, and
substance use: a review. Cyberpsychology Behavior and Social Networking, 15(12), 688–692
Siemens, J. C., & Kopp, S. W. (2011). The influence of online gambling environments on self-control.
Journal of Public Policy & Marketing, 30(2), 279–293
Sirola, A., Kaakinen, M., & Oksanen, A. (2018). Excessive gambling and online gambling communities.
Journal of Gambling Studies, 34, 1313–1325
Stincheld, R. (2014). A review of problem gambling assessment instruments and brief screens. In D. Rich-
ards, A. Blaszczynski, & L. Nower (Eds.), Wiley-Blackwell handbook of disordered gambling (pp.
165–203). Oxford: Wiley
Stincheld, R., Govoni, R., & Frisch, G. R. (2007). A review of screening and assessment instruments for
problem and pathological gambling. In G. Smith, D. C. Hodgins, & R. Williams (Eds.), Research and
measurement issues in gambling studies (pp. 179–213). New York: Academic Press
Strong, D. R., Breen, R. B., Lesieur, H. R., & Lejuez, C. W. (2003). Using the Rasch model to evaluate
the South Oaks Gambling Screen for use with nonpathological gamblers. Addictive Behaviors, 28,
1465–1472
Svenska Spel (2021). Responsible gambling report 2020. Behind or work with responsible gambling.
Retrieved May 31, 2022, from: https://om.svenskaspel.se/wp-content/uploads/2021/03/responsible-
gambling-report-2020-final.pdf
Ukhov, I., Bjurgert, J., Auer, M., & Griffiths, M. D. (2021). Online problem gambling: a comparison of casino
players and sports bettors via predictive modeling using behavioral tracking data. Journal of Gambling
Studies, 37(3), 877–897
Van Rossum, G. (2007). Python programming language. Retrieved May 31, 2022, from: https://www.python.
org
Wardle, H., Moody, A., Griffiths, M. D., Orford, J., & Volberg, R. (2011). Defining the online gambler and
patterns of behaviour integration: Evidence from the British Gambling Prevalence Survey 2010. Inter-
national Gambling Studies, 11, 339–356
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional aliations.
Authors and Aliations
MichaelAuer1· Mark D.Griths2
Mark D. Griths
mark.griths@ntu.ac.uk
Michael Auer
m.auer@neccton.com
1 neccton GmbH, Davidgasse 5, 7052 Muellendorf, Austria
2 International Gaming Research Unit, Psychology Department, Nottingham Trent University,
50 Shakespeare Street, NG1 4FQ Nottingham, UK
... Online gambling raises public concerns due to its high accessibility and immediacy [10]. However, the use of account-based behavioral data by the gambling operators as well as researchers also provides unique opportunities to monitor gambling behavior [11], [12], [13]. Account-based gambling refers to bets placed on gambling activities from a centralized account that is linked to an identified individual [13]. ...
... One proxy measure for problematic gambling that has been used within previous studies is self-exclusion [15], [16], [17], [18], [19]. An alternative proxy measure could be a screening instrument provided to gamblers on the website [11], [20]. As the use of these tools is optional, the question of representativeness arises [10]. ...
... The test data yielded an AUC value of 0.729 for the random forest model. Thus, based on player tracking data, self-reported problematic gambling can therefore be predicted by AI algorithms, with a high level of accuracy [11]. In another study, 2157 Canadian online gamblers with a record of self-exclusion enrollment were compared with a control group of 17 526 nonexcluded gamblers. ...
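For readers unfamiliar with the metric, the AUC quoted in the excerpt above can be read as the probability that a randomly chosen problem gambler receives a higher model score than a randomly chosen non-problem gambler. The following is a minimal sketch of that pairwise definition using only the Python standard library. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from any of the studies cited here.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = self-reported problem gambler, 0 = not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]  # made-up model scores
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value such as 0.729 indicates that the model ranks a problem gambler above a non-problem gambler in roughly 73% of such pairs.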
Article
This study involves a comprehensive analysis of an anonymized dataset provided by a Swiss online casino that adds to the identification of reliable early indicators for problematic online gambling. Targeting gambling addiction prevention, our objective was to model and evaluate behavioral characteristics that signal early stages of problem gambling. We scrutinized player behaviors against a list of gamblers previously excluded for problematic gambling, using this as our target variable. Our approach combined traditional gambling risk indicators, as outlined in the existing literature, with innovative exploratory feature engineering and feature selection. This involved computing moving aggregates over specific periods to capture nuanced gambling patterns. All features were evaluated by assessing mutual information with the target variable as well as the collinearity of each pairwise combination of features. Based on our data analysis, we found that the total losses in the previous seven days, total deposits in the previous 15 days, total duration played in the previous seven days, stakes (amount bet per game) over the previous seven days, and making a deposit 12 h after a loss (chasing) were the most informative and independent risk indicators. To assess the accuracy of these indicators for early detection of problematic gambling and accordingly for responsible gambling interventions, we combined them in a linear regression model and compared its performance with the casino’s currently used model. We found that a binary decision model based on a linear combination of these indicators provided better recall, greater precision, and more timely decisions than the benchmark.
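The "moving aggregates over specific periods" mentioned in this abstract can be illustrated with a short sketch. It assumes each player's activity is available as a list of (date, amount) records and sums the amounts falling in a trailing window before a reference date. The function name `trailing_sum`, the record layout, and all values are hypothetical and do not come from the study.

```python
from datetime import date, timedelta


def trailing_sum(events, as_of, days):
    """Sum amounts of events dated within the `days` days before `as_of`.

    The window is [as_of - days, as_of): the reference date itself is
    excluded, so the feature uses only information from previous days.
    """
    start = as_of - timedelta(days=days)
    return sum(amount for day, amount in events if start <= day < as_of)


# Made-up records for a single player.
losses = [
    (date(2024, 1, 1), 50.0),
    (date(2024, 1, 5), 20.0),
    (date(2024, 1, 9), 30.0),
    (date(2024, 1, 10), 10.0),
]
deposits = [
    (date(2023, 12, 20), 100.0),
    (date(2024, 1, 2), 40.0),
    (date(2024, 1, 10), 60.0),
]

as_of = date(2024, 1, 11)
loss_7d = trailing_sum(losses, as_of, 7)        # losses in the previous 7 days
deposit_15d = trailing_sum(deposits, as_of, 15)  # deposits in the previous 15 days
print(loss_7d, deposit_15d)
```

Excluding the reference date from the window is one possible design choice; whether the original authors defined their windows this way is not stated in the abstract.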
... Therefore, researchers frequently gambling behavior can provide insight into the behavioral predictors of self-reported problem gambling. Five player tracking studies have used self-reported problem gambling to predict problem gambling using player-tracking data (Auer & Griffiths, 2023a; Louderback et al., 2021; Luquiens et al., 2016; Murch et al., 2023; Perrot et al., 2022). All five studies collected self-reported problem gambling data using a problem gambling screen (e.g., PGSI, Ferris & Wynne, 2001) and correlated the self-report results with objective player tracking data. ...
... Moreover, Auer and Griffiths (2023e) did not find significant correlations between responses to the BBGS and player-tracking data in a sample of 1000 online slot gamblers. Auer and Griffiths (2023a) were given access to the raw data of 1287 players from a European online gambling casino who answered questions on the PGSI. They used the player tracking data 30 days before completion of the PGSI (using 8 + for problem gambling) to train machine learning models. ...
... Problem gamblers also tended to deplete their gambling accounts more frequently compared to non-problem gamblers. Auer and Griffiths (2023a) also recommended that online gambling operators should make depositing more than once per day more difficult to prevent players from chasing their losses. However, Auer and Griffiths did not analyze country-specific differences regarding the association between self-reported problem gambling and player-tracking data. ...
Article
The prevalence of online gambling and the potential for related harm necessitate predictive models for early detection of problem gambling. The present study expands upon prior research by incorporating a cross-country approach to predict self-reported problem gambling using player-tracking data in an online casino setting. Utilizing a secondary dataset comprising 1743 British, Canadian, and Spanish online casino gamblers (39% female; mean age = 42.4 years; 27.4% scoring 8 + on the Problem Gambling Severity Index), the present study examined the association between demographic, behavioral, and monetary intensity variables with self-reported problem gambling, employing a hierarchical logistic regression model. The study also tested the efficacy of five different machine learning models to predict self-reported problem gambling among online casino gamblers from different countries. The findings indicated that behavioral variables, such as taking self-exclusions, frequent in-session monetary depositing, and account depletion, were paramount in predicting self-reported problem gambling over monetary intensity variables. The study also demonstrated that while machine learning models can effectively predict problem gambling across different countries without country-specific training data, incorporating such data improved the overall model performance. This suggests that specific behavioral patterns are universal, yet nuanced differences across countries exist that can improve prediction models.
... Player data and AI support a variety of commercial use-cases including recommendation systems, fraud detection, and customer relationship marketing (Auer & Griffiths, 2023; Chui et al., 2018). They also assist stakeholders to curb the potential negative impacts of gambling (Ghaharian et al., 2022). ...
... After screening the titles and abstracts, we deemed 44 records as potentially eligible for inclusion. From these, we excluded 26 for not meeting the eligibility criteria: 17 were not sufficiently related to the gambling field (e.g., Kim & Werbach, 2016; Uusitalo et al., 2021), 7 lacked a discussion of risks and/or ethical concerns (e.g., Auer & Griffiths, 2023; McAuliffe et al., 2022), and 4 were unidentified duplicates. We selected 16 studies to include in the final review. ...
Preprint
The proliferation of data and artificial intelligence (AI) throughout society has raised concerns about its potential misuse and threats across industries. In this paper we explore the risks and ethical considerations of AI applications in gambling, an industry that makes significant contributions to many tourism destinations and local economies around the world. We conducted a scoping review to collect the breadth of literature and to understand the current state of knowledge. Our search yielded 2,499 potentially relevant documents, from which we deemed 16 as eligible for inclusion. A content analysis revealed convergence around six main themes: (1) Explainability, (2) Exploitation, (3) Algorithmic Flaws, (4) Consumer Rights, (5) Accountability, and (6) Human-in-the-Loop. We found that these gambling-specific themes largely overlap with broader AI principles. Most records focused on algorithmic strategies to reduce gambling-related harm (n = 12/16), thus we call for more attention to be turned to commercially driven AI applications. We provide a theoretical evaluation that illustrates the challenges involved for stakeholders tasked with governing AI risks and associated ethical considerations. As a globally reaching product, gambling regulators and operators need to be cognizant, not just of philosophical principles, but also of the rich tapestry of global ethical traditions.
... To date, the study of AI in gambling has largely focused on its application to support gambling-harm minimization and prevention [23,24]. In this context, machine learning algorithms are trained on gamblers' behavioral tracking data to retrospectively detect individuals who may be at-risk of harm according to their patterns of play [25][26][27]. Behavioral markers of harm, including the frequency, volume, intensity, and volatility of betting, have emerged from this extant research. Most of these studies focus on methodology and predictive capacity, with the goal of informing practical applications. ...
Article
Artificial intelligence (AI) is a transformative technology with the potential to bring immense benefit while simultaneously presenting significant risks and ethical concerns. In this study, we focus on the application of AI, a source of controversy itself, within a controversial industry. We examine AI ethics within the gambling sector, where ethical concerns are already heightened due to the potentially addictive and harmful nature of the product. We conducted 33 in-depth interviews between June and August 2023, exploring gambling industry stakeholders’ perceptions of key ethical issues regarding AI use in the sector. We also explored potential solutions to mitigate AI risks with interviewees. Using a qualitative grounded theory approach, we uncover a theory where the benefits of AI, such as its use to support consumer protection, must be balanced with the risks, including bias and player exploitation. Additionally, knowledge and collaboration, education, and regulation were identified as risk mitigation strategies that could help promote ethical AI practices in the gambling sector. Our theory, “Gambling’s AI Ethical Paradox”, highlights the challenges involved when deploying AI in controversial industries and serves as a starting point for gambling industry stakeholders to shape AI governance as the sector expands and AI continues to evolve.
... If further research replicates and supports these findings, these escalation behaviors might be a particularly important focus for harm reduction efforts. Findings from prior research (e.g., Allami et al., 2021; Auer & Griffiths, 2023) essentially separate gamblers by gambling involvement, and thus inform prevention efforts by indicating that risk detection efforts should flag those who gamble more and that messaging should tell people to gamble less frequently and spend less money to reduce risk for gambling problems (e.g., Hodgins et al., 2023). However, it is just as important to understand what, among people who are involved in gambling at a moderate to high level, predicts harm and how to mitigate that risk. ...
Article
Online sports gambling involvement is discontinuous in nature, with small groups of highly involved gamblers exhibiting betting behavior that is distinctly greater than other gamblers. There is some question about whether these groups, defined by exceedingly high levels of play, also have equivalently high rates of gambling problems, and whether they maintain these play levels over time. The current study builds on past work by examining the long-term trajectories of play and voluntary self-exclusion patterns across two years among a cohort of 32,262 highly-involved and less-involved online sports gamblers. We also examine the relative importance of betting behavior change as a risk factor for gambling problems by testing whether high involvement as compared to escalation of involvement is a better predictor of future self-exclusion. Measures included betting activities, transactional activities, and self-exclusion activities on a European online betting platform between February 2015 and January 2017. Results showed that bettors who were most highly involved in the first 8 months of the study in terms of number of bets and net loss were more likely to continue gambling on the platform in months 9–24 than others. Bettors who were most highly involved in the first 8 months of the study in terms of net loss and amount wagered were more likely to use self-exclusion than others, and more likely to have multiple self-exclusions. Escalations in frequency of play and average bet size within the first 8 months emerged as significant predictors of self-exclusion, even when controlling for high involvement.
Chapter
The escalating demand for live sports streaming has catalyzed a profound media transformation, favoring digital platforms over traditional mass media. However, this shift has also exposed the dark side of online interactions, including gambling and offensive comments, detracting from the experience of regular viewers. To address this, we present BScFilter, a pioneering Deep Learning approach for filtering sports comments in a resource-constrained environment. Our aim is to create a supportive viewer environment by automatically detecting and categorizing comments as gamble, hate, or sports-related. Leveraging our own developed dataset of 6012 annotated Bengali sports comments, we explore a range of ML and DL algorithms. The hybrid CNN+BiLSTM model with Keras embedding emerges as the top performer, achieving an impressive F1-score of 96.87%. We conduct comprehensive quantitative and qualitative analyses, revealing the strengths and limitations of our approach. While displaying promising outcomes, BScFilter offers an effective remedy to cultivate a respectful digital atmosphere for sports lovers, attenuating the influence of detrimental comments.
Article
Online gambling poses novel risks for problem gambling, but also unique opportunities to detect and intervene with at-risk users. A consortium of gambling companies recently committed to using nine behavioral "Markers of Harm'' that can be calculated with online user data to estimate risk for gambling-related harm. The current study evaluates these markers in two independent samples of sports bettors, collected ten years apart. We find over a two-year period that most users never had high enough overall risk scores to indicate that they would have received an intervention. This observation is partly due to characteristics of our samples that are associated with lower risk for gambling-related harm, but might also be due to overly high risk thresholds or flaws in the design of some markers. Users with higher average risk scores had more intraindividual variability in risk scores. Younger age and male gender were not associated with higher average risk scores. The most active users were more likely than other users to have ever exceeded risk thresholds. Several risk scores significantly predicted proxies of gambling-related harm (e.g., account closure). Overall, the current Markers of Harm system has some correctable limitations that future risk detection systems should consider adopting.
AI is a broad concept, grouping initiatives that use a computer to perform tasks that would usually require a human to complete. AI methods are well suited to predict clinical outcomes. In practice, AI methods can be thought of as functions that learn the outcomes accompanying standardized input data to produce accurate outcome predictions when trialed with new data. Current methods for cleaning, creating, accessing, extracting, augmenting, and representing data for training AI clinical prediction models are well defined. The use of AI to predict clinical outcomes is a dynamic and rapidly evolving arena, with new methods and applications emerging. Extraction or accession of electronic health care records and combining these with patient genetic data is an area of present attention, with tremendous potential for future growth. Machine learning approaches, including decision tree methods of Random Forest and XGBoost, and deep learning techniques including deep multi-layer and recurrent neural networks, afford unique capabilities to accurately create predictions from high dimensional, multimodal data. Furthermore, AI methods are increasing our ability to accurately predict clinical outcomes that previously were difficult to model, including time-dependent and multi-class outcomes. Barriers to robust AI-based clinical outcome model deployment include changing AI product development interfaces, the specificity of regulation requirements, and limitations in ensuring model interpretability, generalizability, and adaptability over time.
In order to protect gamblers, gambling operators have introduced a wide range of responsible gambling (RG) tools. Mandatory play breaks (i.e., forced termination of a gambling session) and personalized feedback about the gambling expenditure are two RG tools that are frequently used. While the motivation behind mandatory play breaks is simple (i.e., gambling operators expect gamblers to reduce their gambling significantly as a result of an enforced break in play), empirical evidence supporting the efficacy of the mandatory breaks is still limited. The present study comprised a real-world experiment with the clientele of Norwegian gambling operator Norsk Tipping. On the Norsk Tipping gambling website, which offers slots, bingo and sports-betting, forced termination occurs if gamblers have played continuously for a one-hour period. The study tested the effect of different lengths of mandatory play breaks (90 s, 5 min, 15 min) on subsequent gambling behavior, as well as the effect of combined personalized feedback concerning money wagered, won, and net win/loss. In total 21,129 online players (61% male; mean age = 47.4 years) experienced at least one play break between April 17 and May 21 (2020) with 156,989 mandatory play breaks in total. Results indicated that a 15-min mandatory play break led to a disproportionately longer voluntary play pause compared to 5-min and 90-s mandatory play breaks. Personalized feedback appeared to have no additional effect on subsequent gambling and none of the mandatory play breaks appeared to affect the increase or decrease in money wagered once players started to gamble again.
The emergence of online gambling has raised concerns about potential gambling-related harm, and various measures have been implemented in order to minimise harm such as identifying and/or predicting potential markers of harm. The present study explored how the nine DSM-5 criteria for gambling disorder can be operationalised in terms of actual online gambling behaviour using account-based gambling tracking data. The authors were given access to an anonymised sample of 982 gamblers registered with an online gambling operator. The data collected for these gamblers consisted of their first three months’ gambling activity. The data points included customer service contacts, number of hours spent gambling, number of active days, deposit amounts and frequency, the number of times a responsible gambling tool (such as a deposit limit) was removed by the gamblers themselves, number of cancelled withdrawals, number of third-party requests, number of registered credit cards, and frequency of requesting bonuses through customer service (i.e., the number of instances of ‘bonus begging’). Using these metrics, most of the DSM-5 criteria for gambling disorder can be operationalized (at least to some extent) using actual transaction data. These metrics were then applied to a sample of online gamblers, and through cluster analysis four types of online gambler based on these metrics (non-problem gamblers, at-risk gamblers, financially vulnerable gamblers, and emotionally vulnerable gamblers) were identified. The present study is the first to examine the application of the DSM-5 criteria of gambling disorder to actual gambling behaviour using online gambling transaction data and suggests ways that gambling operators could identify problem gamblers online without the need for self-report diagnostic screening instruments.
Objective: To help individuals avoid potential negative consequences associated with their gambling, researchers have developed lower risk limits for time and financial involvement among populations of land-based gamblers. The present study extended these efforts to online gambler populations with prospective longitudinal data. Method: We used receiver operating characteristic curve analysis and logistic regression models predicting a positive Brief Biosocial Gambling Screen (BBGS; Gebauer et al., Canadian Journal of Psychiatry, 55, 2010, 82-90) to develop lower risk limits for six measures of gambling involvement among subscribers to an online gambling operator. We also tested the utility of these six newly developed online limits and three existing land-based limits for the BBGS outcome and proxies for gambling problems including: (a) voluntary self-limiting, (b) voluntary self-exclusion, (c) closing one's account, and (d) being assigned a flag for potential problem gambling by customer service. Results: We identified five optimal limits for lower risk online gambling with adequate sensitivity and specificity for predicting BBGS-positive status, four of which also received additional empirical support. These four empirically supported gambling limits were: (a) wagering 167.97 Euros or less each month; (b) spending 6.71% or less of one's annual income on online gambling wagers; (c) losing 26.11 Euros or less on online gambling per month; and (d) demonstrating variability (i.e., standard deviation) in daily amount wagered of 35.14 Euros or less during one's duration active. Conclusions: Our findings have implications for lower risk gambling limits research and suggest that unique limits might apply to online and land-based gambler populations.
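As a rough illustration of how a receiver operating characteristic analysis can yield a single limit, the sketch below scans candidate cutoffs and keeps the one that maximises Youden's J (sensitivity + specificity - 1). Youden's J is an assumption made here for illustration; the abstract does not state which optimality criterion the authors used. The function name `best_cutoff` and the toy data are hypothetical.

```python
def best_cutoff(values, positives):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    values:    one gambling-involvement measure per player (e.g. monthly wager)
    positives: 1 if the player screened positive, 0 otherwise
    A player is flagged as above the limit when value > cutoff.
    Youden's J is an illustrative choice, not the source's stated criterion.
    """
    n_pos = sum(positives)
    n_neg = len(positives) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative player")

    best = None
    for cutoff in sorted(set(values)):
        # True positives: screen-positive players above the cutoff.
        tp = sum(1 for v, y in zip(values, positives) if y and v > cutoff)
        # True negatives: screen-negative players at or below the cutoff.
        tn = sum(1 for v, y in zip(values, positives) if not y and v <= cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical toy data: monthly wagers and screen results for six players.
wagers = [10, 20, 30, 200, 250, 300]
screen_positive = [0, 0, 0, 1, 1, 1]
cutoff, sensitivity, specificity = best_cutoff(wagers, screen_positive)
```

With real data the two groups overlap, so the chosen cutoff trades sensitivity against specificity rather than separating the groups perfectly.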
Background and Aims: Few meta‐analyses have been conducted to pool the most constant risk factors for problem gambling. The present meta‐analysis summarizes effect sizes of the most frequently assessed problem gambling risk factors, ranks them according to effect size strength, and identifies any differences in effects across genders. Method: A random‐effects meta‐analysis was conducted on jurisdiction‐wide gambling prevalence surveys on the general adult population published until March 2019. One hundred four studies were eligible for meta‐analysis. Number of participants varied depending on the risk factor analyzed, and ranged from 5,327 to 273,946 (52% female). Weighted mean odds ratios were calculated for 57 risk factors (sociodemographic, psychosocial, gambling activity, and substance use correlates), allowing them to be ranked from largest to smallest with regards to their association with problem gambling. Results: The highest odds ratio was for Internet gambling (OR = 7.59, 95% CI [5.24, 10.99], p < .000) and the lowest was for employment status (OR = 1.03, 95% CI [0.87, 1.22], p = .718). The largest effect sizes were generally in the gambling activity category, and the smallest were in the sociodemographic category. No differences were found across genders for age‐associated risk. Conclusions: A meta‐analysis of 104 studies of gambling prevalence indicated that the most frequently assessed problem gambling risk factors with the highest effect sizes are associated with continuous play format gambling products.
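To make the reported statistics concrete, the sketch below computes an odds ratio and a 95% confidence interval for a single study from a 2x2 table, using the standard log-odds (Wald) method. It covers only the per-study quantity that a meta-analysis takes as input, not the random-effects pooling described in the abstract. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed problem gamblers      b: exposed non-problem gamblers
    c: unexposed problem gamblers    d: unexposed non-problem gamblers
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from any study in the meta-analysis.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that includes 1, as in the employment status result quoted above, means the data are compatible with no association.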
Online gambling has continued to grow alongside new ways to analyze data using behavioral tracking as a way to enhance consumer protection. A number of studies have analyzed consumers that have used voluntary self-exclusion (VSE) as a proxy measure for problem gambling. However, some scholars have argued that this is a poor proxy for problem gambling. Therefore, the present study examined this issue by analyzing customers (from the gambling operator Unibet) that have engaged in VSE. The participants comprised customers that chose to use the six-month VSE option (n = 7732), and customers that chose to close their Unibet account due to a specific self-reported gambling addiction (n = 141). Almost one-fifth of the customers that used six-month VSE only had gambling activity for less than 24 h (19.15%). Moreover, half of the customers had less than seven days of account registration prior to six-month VSE (50.39%). Customers who use VSE are too different to be treated as a homogenous group and therefore VSE is not a reliable proxy measure for problem gambling. The findings of this research are beneficial for operators, researchers, and policymakers because they provide insight into gambling behavior by analyzing real player behavior using tracking technologies, which is objective and unbiased.
The emergence and spread of new technologies have allowed for the introduction of new forms of gambling. Problem online gambling has specific characteristics, and its prevalence may differ from traditional forms of gambling. This paper systematically reviews studies that include data relevant to problem online gambling and to the sociodemographic and comorbidity variables related to it. A systematic literature search of the Medline database was conducted. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement, the preliminary search resulted in 427 articles, from which 20 were included in this systematic review based on pre-determined criteria. The reported prevalence of problem online gambling varied widely across the different studies. This heterogeneity is due to large variations in settings, instruments, and definitions of problem online gambling, which rules out a meta-analytic approach to the results. The sources of variability in the prevalence, the sociodemographic and comorbidity factors, and the implications for future research are discussed.
The identification of disordered gambling in the online environment may enable interventions to be targeted to those users experiencing harms. We tested the performance of machine learning in classifying online gamblers with and without a record of voluntary self-exclusion (VSE). We analyzed a one year dataset from PlayNow.com, the provincially owned online gambling platform in British Columbia, Canada. The primary model compared 2,157 gamblers with a record of VSE enrollment (6 months to 3 years) against 17,526 non-VSE controls, using 20 input variables of gambling behavior. Machine learning (random forest classifier) achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.75 (SD = 0.01). The input variable with the greatest predictive signal (based on feature importance values) was Variance in Money Bet per Session. Further analyses tested a logistic regression model as a benchmark, and tested the impact of key modeling decisions (including use of a balanced dataset, and data inclusion threshold). Across all models, machine learning algorithms were able to predict VSE status with performance between 0.65 and 0.76, using our behavioral inputs. These results provide proof-of-principle data for the applied use of behavioral tracking to identify disordered gambling, and highlight the importance of behavioral inputs reflecting betting variability.
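For readers unfamiliar with the AUROC metric quoted above, the following minimal sketch computes it directly from its pairwise definition: the probability that a randomly chosen positive case (here, a VSE gambler) receives a higher model score than a randomly chosen control, with ties counted as one half. The function name `auroc` and the scores are hypothetical, and this is not the study's evaluation code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive cases (e.g. VSE), 0 for controls
    scores: model outputs, where higher means more likely positive
    Runs in O(n_pos * n_neg) time, which is fine for small illustrations.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical scores for two VSE gamblers and two controls.
example = auroc([1, 0, 1, 0], [0.9, 0.8, 0.2, 0.1])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported range of 0.65 to 0.76 indicates moderate discrimination.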