Van Eetvelde et al. J EXP ORTOP (2021) 8:27
https://doi.org/10.1186/s40634-021-00346-x
REVIEW PAPER
Machine learning methods in sport injury
prediction and prevention: a systematic review
Hans Van Eetvelde1*, Luciana D. Mendonça2,3,4, Christophe Ley1, Romain Seil5 and Thomas Tischer6
Abstract
Purpose: Injuries are common in sports and can have significant physical, psychological and financial consequences.
Machine learning (ML) methods could be used to improve injury prediction and allow proper approaches to injury
prevention. The aim of our study was therefore to perform a systematic review of ML methods in sport injury predic-
tion and prevention.
Methods: A search of the PubMed database was performed on March 24th 2020. Eligible articles included original
studies investigating the role of ML for sport injury prediction and prevention. Two independent reviewers screened
articles, assessed eligibility, risk of bias and extracted data. Methodological quality and risk of bias were determined by
the Newcastle–Ottawa Scale. Study quality was evaluated using the GRADE working group methodology.
Results: Eleven out of 249 studies met inclusion/exclusion criteria. Different ML methods were used (tree-based
ensemble methods (n = 9), Support Vector Machines (n = 4), Artificial Neural Networks (n = 2)). The classification
methods were facilitated by preprocessing steps (n = 5) and optimized using over- and undersampling methods
(n = 6), hyperparameter tuning (n = 4), feature selection (n = 3) and dimensionality reduction (n = 1). Injury predictive
performance ranged from poor (Accuracy = 52%, AUC = 0.52) to strong (AUC = 0.87, f1-score = 85%).
Conclusions: Current ML methods can be used to identify athletes at high injury risk and be helpful to detect the
most important injury risk factors. Methodological quality of the analyses was sufficient in general, but could be fur-
ther improved. More effort should be put in the interpretation of the ML models.
Keywords: Machine Learning, Injury prediction, Injury prevention, Sport injury
© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/.
Background
Injuries are common in individual and team sports and
can have significant physical, psychosocial and financial
consequences [3, 13, 22]. Understanding injury risk fac-
tors and their interplay is thereby a key component of
preventing future injuries in sport [4]. An abundance
of research has attempted to identify injury risk factors
[4, 6, 28]. However, sports injuries are a consequence of
complex interactions of multiple risk factors and incit-
ing events making a comprehensive model necessary [6,
28]. It has to account for the events leading to the injury
situation, as well as to include a description of body and
joint biomechanics at the time of injury [4]. Due to the
many interactions between intrinsic and extrinsic risk
factors as well as their sometimes highly unpredictable
nature (e.g., contact with another player), the ability to
foresee the occurrence of an inciting injury event is
challenging. Therefore, predictive modelling should not only
focus on the prediction of the occurrence of an injury
itself but, moreover, it should try to identify injury risk at
an individual level and implement interventions to miti-
gate the level of risk [28]. In order to plan effective pre-
ventive intervention, it is therefore important to be aware
both of the various isolated risk factors and their interac-
tion [6].
Open Access
Journal of
Experimental Orthopaedics
*Correspondence: Hans.VanEetvelde@UGent.be
1 Department of Applied Mathematics, Computer Science and Statistics,
Ghent University, Krijgslaan 281-S9, 9000 Ghent, Belgium
Full list of author information is available at the end of the article
In recent years, the use of advanced Artificial Intelli-
gence (AI) methods has appeared in sports medicine to
tackle this challenging multi-faceted task [1, 5, 14, 16].
AI methods have already been used successfully in sports
science within the realm of game analysis, tactics, perfor-
mance analysis and outcome predictions [12, 17, 21] and
are about to start transforming clinical medicine [9, 31,
33, 39, 42]. However, for clinicians, the application and
the understanding of AI is often difficult [24]. Therefore,
the explanations of the core terms for AI application are
provided in Supplementary File S1.
AI is mostly narrowed down to Machine Learning (ML)
methods although it is a very broad concept comprising
every aspect of mimicking human intelligence. ML is the
study of algorithms that can automatically learn from
data to make new decisions [23]. Current ML methods
include Neural Networks, Support Vector Machines, or
Random Forests which are part of a ’Machine Learning
pipeline’ (Fig. 1). The available data for the ML model
has to be of high quality and can be any data deemed
useful for the purpose of injury prediction. This data is
split in two parts (data splitting), the so-called training
and test data. First, the algorithm has to learn the rela-
tionship between outcome of interest (injury or not) and
the potential contributing factors (also called predictors/
features/covariates/explanatory variables) from the
training data set. The test data can then be used to test the
prediction capacity of the algorithm learned from the
training data. It is important that this quality check is not
achieved on the training data, but on unseen data, hence
the data splitting at the beginning. The quality and size
of the data sets are important parameters for the quality
Fig. 1 Schematic figure of the Machine Learning approach. The entire Machine Learning process is shown. Parts in dotted shapes are optional or
not always necessary
of the results. To improve the quality of these large and
complex datasets and to ensure optimal operation of the
ML algorithms, data preprocessing methods (imputation,
standardization, discretization), dimension reduction
and feature selection can be applied (see Supplementary
File S1). Most ML procedures further require parameter
tuning, a sort of optimization of parameters which can-
not be estimated directly from the data (e.g., number of
trees to be used in a Random Forest). When the entire
ML pipeline is fitted on the training data, the outcome
of the test data is predicted. Since we know the true out-
come of the test data, this allows us to evaluate our estab-
lished prediction model. Finally, well-performing models
provide an idea of the most important risk factors, by
observing which factors have the largest influence in
these models.
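As a minimal illustration of the data splitting step described above, the following Python sketch (standard library only) shuffles a toy data set and holds out a test part. The function name `train_test_split`, the 30% test fraction, the fixed seed and the toy records are assumptions made for this example; they are not taken from any of the reviewed studies.

```python
import random

def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle records and split them into a training and a test part."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must lie strictly between 0 and 1")
    shuffled = list(records)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled records form the held-out test part.
    return shuffled[n_test:], shuffled[:n_test]

# Toy data (hypothetical): (athlete_id, injured) pairs.
athletes = [(i, i % 5 == 0) for i in range(20)]
train, test = train_test_split(athletes)
```

Any model fitting, tuning or preprocessing would then use `train` only, with `test` reserved for the final quality check.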
Considering that sport injury prediction and preven-
tion are trending topics in sport science [12, 13, 16], the
intention of this systematic review is to synthesize the
evidence of sophisticated ML algorithms in sport injury
prediction and prevention. Our systematic review differs
from the one by Claudino et al. [12] in that we focus on
injury prevention and risk factor identification together
with a deeper examination of the used ML analyses. The
following three topics are assessed:
1. Identify the currently used definition of ML as well as
predominantly used ML methods.
2. Identify the accuracy of the currently used ML meth-
ods to predict injury.
3. Evaluate the used methods for sport injury preven-
tion purposes.
Methods
This systematic review was performed in accordance with
the Preferred Reporting Items for Systematic Reviews
and Meta-Analyses (PRISMA) guidelines [30]. The
review protocol was prospectively registered at PROS-
PERO (International prospective register of systematic
reviews—ie, CRD42020177708).
Search strategy and inclusion/exclusion criteria
A systematic electronic search of the PubMed database
was executed on March 24th 2020 to identify studies
investigating Machine Learning methods in injury prediction
and prevention. The following search term was
used in all fields: ("deep learning" OR "artificial intelligence"
OR "machine learning" OR "neural network"
OR "neural networks" OR "support vector machines"
OR "nearest neighbor" OR "nearest neighbors" OR
"random forest" OR "random forests" OR "trees" OR
"elastic net" OR "ridge" OR "lasso" OR "boosting" OR
"predictive modeling" OR "learning algorithms" OR
"bayesian logistic regression") AND ("sport" OR "sports"
OR "athlete" OR "athletes") AND ("injury" OR
"injuries"). We did not use limits to perform the search and
no date restrictions were applied. Inclusion criteria were
as follows: (i) Original studies investigating the role of
machine learning for sport injury prediction and sport
injury prevention, (ii) English-language studies, (iii)
studies published online or in print in a peer-reviewed
journal. Injury prediction had to refer to predicting
either the occurrence, the severity, or the type of injuries
on the basis of risk factors. The exclusion criteria were
as follows: (i) not being sport-specific, (ii) not cover-
ing injury prevention or injury prediction, (iii) meeting
abstracts and proceedings. Also, studies were excluded
if the approach used was statistical rather than ML. This
explains why, for example, two papers from Hasler et al.
[19, 20] and one from Mendonça et al. [29] were not
included here.
Study selection
The titles and abstracts of all articles were screened for
relevance according to the inclusion and exclusion cri-
teria (L.M. and T.T.). If no abstract was available, the
full-text article was obtained to assess the relevance of
the study. The full text was subsequently reviewed for
possible inclusion in the systematic review for all arti-
cles that were not excluded during the initial screening
process. A third reviewer (H.E.) resolved between-
reviewer discrepancies. In addition to the electronic
search, the reference lists of all included articles and
review articles were manually searched (C.L., T.T.,
H.E.) for additional relevant articles. Moreover, if any
systematic reviews on ML in sport injury prediction
and prevention were identified during the screening
process, the reference list was screened to identify any
further studies.
Methodological quality
Methodological quality and risk of bias of included stud-
ies were determined by the Newcastle–Ottawa Scale
(NOS) [45]. Eligible studies were independently rated by
two authors blind to the study authors and institutions
(L.M. and T.T.), with discrepancies resolved by a third
author (H.E.). The NOS contains eight categories relating
to methodological quality and each study was given an
eventual score out of a maximum of 8 points. A score of
0–3 points equated to a low quality study, a score of 4–6
points equated to a moderate quality study, with a score
of 7–8 points required for a study to be given a score of
high quality.
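The score bands above can be written as a small lookup. This is only a sketch of the banding stated in the text (0–3 low, 4–6 moderate, 7–8 high); the function name `nos_quality` is hypothetical.

```python
def nos_quality(score):
    """Map a Newcastle-Ottawa Scale score (0-8, as used above) to a quality label."""
    if not 0 <= score <= 8:
        raise ValueError("score must be between 0 and 8")
    if score <= 3:
        return "low"
    if score <= 6:
        return "moderate"
    return "high"
```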
Data extraction
Characteristics of all included studies (i.e. participants,
type of study, sample size, etc.) and of the ML methods
used (i.e. data pre-processing, classification method,
etc.) were extracted independently by two reviewers
(H.E. and C.L.), with a third (L.M.) resolving potential
discrepancies.
Data analysis
Two independent reviewers (L.M. and T.T.) assessed the
quality of evidence using the GRADE methodology [18].
In the current review, evidence started at moderate certainty,
since investigation of publication bias was not possible
due to the small number of included trials. Then it
was downgraded by one level for imprecision when the
analysed sample was < 300 participants (serious impreci-
sion was downgraded by two levels); and by one level for
risk of bias when the mean NOS Score was < 6 out of 9.
Between-reviewer discrepancies were resolved by a third
investigator (R.S.).
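The downgrading rule described above can be sketched as follows. The text does not state a numeric threshold for "serious" imprecision, so it is passed in as an explicit flag; that flag, the function name `grade_certainty` and the floor at "very-low" are assumptions of this sketch rather than part of the review protocol.

```python
LEVELS = ["very-low", "low", "moderate", "high"]

def grade_certainty(n_participants, nos_score, serious_imprecision=False):
    """Apply the downgrading steps described in the text, starting at 'moderate'."""
    level = LEVELS.index("moderate")      # starting point used in this review
    if n_participants < 300:              # imprecision
        level -= 2 if serious_imprecision else 1
    if nos_score < 6:                     # risk of bias
        level -= 1
    return LEVELS[max(level, 0)]          # assumed floor: never below 'very-low'
```

For example, a study with 355 participants and a NOS score of 8 stays at "moderate", while one with 96 participants and a NOS score of 7 drops to "low".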
Results
In the scope of this systematic literature review, 246 arti-
cles were found, and an additional three articles added
by hand search from which a total of 11 articles were
included according to the strict inclusion/exclusion criteria
for this systematic review (Fig. 2).
Study characteristics
Table 1 lists all details of the included studies. Studies
were prospective cohort studies (n = 9) [2, 11, 25, 27,
32, 35, 36, 38, 41] or case–control studies (n = 2) [34,
46]. Most of them were performed in soccer (n = 4)
[2, 32, 35, 36] and Australian Football (n = 3) [11, 27,
38]. Two studies incorporated athletes from multiple
sports [25, 34]. The number of participants ranged
between 25 and 734 (Table 1). In seven studies, the athlete was
the unit of observation [2, 25, 32, 34, 35, 38, 46]. In the
remaining 4 studies, there were multiple observations
per player [11, 27, 36, 41]. Both occurrence (n = 11)
[2, 11, 25, 27, 32, 34–36, 38, 41, 46] and type of injury
(n = 2) (acute/overuse [35], contact/non-contact [27])
were evaluated, whereby lower limb muscle injury was
the most often assessed outcome [2, 25, 38] and only
one publication investigated specifically upper limb
injuries [46].
Study Quality
The methodological quality of the included studies
ranged from 5 to 8 on the NOS scale (Table 2). All studies
had proper ascertainment of outcome/exposure; follow-
up long enough for outcomes to occur, same method of
ascertainment for cases and controls; and adequacy of
follow-up of cohorts/non-response rate. Seven studies
(63.64%) were downgraded for methodological quality
due to imprecision and three (27.27%) because of risk of
bias (Table 3).
Fig. 2 PRISMA flow chart
Table 1 Study characteristics

Ayala et al., 2019 [2]
  Outcome variable: Occurrence of hamstring strain injury
  Predictor variables: Individual (sport-related background, demographic, previous hamstring strain injury), psychological and neuromuscular measurements
  Participants (age mean ± sd): 96 male professional soccer players from 4 teams in the 1st and 2nd league in Spain. 6 players that did not complete the tests and 4 players that left their teams were removed
  Period: 1 season (2013–2014)
  Study design: Prospective cohort
  Unit of observation: Player
  Number of observations: 86
  Total amount of injuries / no. of injured athletes (N =)a: NR/18
  Number of features: 229

Carey et al., 2018 [11]
  Outcome variable: Occurrence of non-contact injury, non-contact causing time loss injury and hamstring injury
  Predictor variables: Training load variables (+ Exponential Weighted Moving Average features and Acute Chronic Workload Ratio features)
  Participants (age mean ± sd): 75 male professional players from 1 team in the Australian Football League in Australia
  Period: 3 seasons (2014–2016)
  Study design: Prospective cohort
  Unit of observation: Player matches and player training sessions
  Number of observations: 13,867
  Total amount of injuries / no. of injured athletes (N =)a: Non-contact: 388/NR; Non-contact causing time loss: 198/NR; Hamstring: 72/NR
  Number of features: 58

López-Valenciano et al., 2018 [25]
  Outcome variable: Occurrence of lower extremity muscle injury
  Predictor variables: Individual (sport-related background, demographic, previous injury), psychological and neuromuscular measurements
  Participants (age mean ± sd): 132 male professional players in handball (34) and soccer (98) in the first three National Leagues in Spain. 6 players that did not complete the tests and 4 players that left their teams were removed
  Period: 1 season (2013–2014)
  Study design: Prospective cohort
  Unit of observation: Player
  Number of observations: 122
  Total amount of injuries / no. of injured athletes (N =)a: 32/29
  Number of features: 151
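
McCullagh et al., 2013 [27]
  Outcome variable: Occurrence of injury and injury type (contact or non-contact)
  Predictor variables: Workloads, squeeze test data, soft tissue scores, stress level, mood, sleep score, ankle flexibility, fatigue and player perceived performance, years played, player durability, age
  Participants (age mean ± sd): 39 male professional players from the Australian Football League in Australia
  Period: 1 season (2010)
  Study design: Prospective cohort
  Unit of observation: Player weeks
  Number of observations: 1210
  Total amount of injuries / no. of injured athletes (N =)a: 163/NR
  Number of features: 30

Oliver et al., 2020 [32]
  Outcome variable: Occurrence of non-contact lower limb injury
  Predictor variables: Personal data (age, Body Mass Index, etc.) and neuromuscular control tests data
  Participants (age mean ± sd): 355 male youth soccer players (age 14.3 ± 2.1) from Premier League and Championship clubs in England
  Period: 1 season (2014–2015)
  Study design: Prospective cohort
  Unit of observation: Player
  Number of observations: 355
  Total amount of injuries / no. of injured athletes (N =)a: NR/99
  Number of features: 20

Rodas et al., 2019 [34]
  Outcome variable: Occurrence of tendinopathy
  Predictor variables: Genetic markers
  Participants (age mean ± sd): 363 male (89%) and female (11%) professional soccer, futsal, basketball, handball and roller hockey players (age 25 ± 6) from FC Barcelona in Spain
  Period: 10 years (2008–2018)
  Study design: Case–control
  Unit of observation: Player
  Number of observations: 363
  Total amount of injuries / no. of injured athletes (N =)a: 199/199
  Number of features: 1 419 369

Rommers et al., 2020 [35]
  Outcome variable: Occurrence of injury and type of injury (acute and overuse)
  Predictor variables: Anthropometric measurements, motor coordination and physical fitness
  Participants (age mean ± sd): 734 male U10 to U15 youth soccer players (age 11.7 ± 1.7) of 7 premier league clubs in Belgium
  Period: 1 season (2017–2018)
  Study design: Prospective cohort
  Unit of observation: Player
  Number of observations: 734
  Total amount of injuries / no. of injured athletes (N =)a: NR/368
  Number of features: 29

Rossi et al., 2018 [36]
  Outcome variable: Occurrence of injury
  Predictor variables: Personal, workload features from GPS tracking data, previous injury
  Participants (age mean ± sd): 26 male professional soccer players (age 26 ± 4) in Italy
  Period: 1 season (2013–2014)
  Study design: Prospective cohort
  Unit of observation: Player training session
  Number of observations: 952
  Total amount of injuries / no. of injured athletes (N =)a: 23/13
  Number of features: 55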
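
Ruddy et al., 2018 [38]
  Outcome variable: Occurrence of hamstring strain injury
  Predictor variables: Age, previous hamstring strain injury, low levels of eccentric hamstring strength
  Participants (age mean ± sd): 362 male professional players from the Australian Football League in Australia: 186 in 2013 (age 23.2 ± 3.6) and 176 in 2015 (age 25.0 ± 3.4)
  Period: 2 seasons (2013, 2015)
  Study design: Prospective cohort
  Unit of observation: Player
  Number of observations: 2013: 186; 2015: 176
  Total amount of injuries / no. of injured athletes (N =)a: 2013: NR/27; 2015: NR/26
  Number of features: 3 or 8

Thornton et al., 2017 [41]
  Outcome variable: Occurrence of injury
  Predictor variables: Training intensity
  Participants (age mean ± sd): 25 male professional rugby players from the Australian National Rugby League in Australia. Athletes were included in the dataset if they sustained more than 3 injuries in total
  Period: 3 seasons (2013–2015)
  Study design: Prospective cohort
  Unit of observation: Player days
  Number of observations: NR
  Total amount of injuries / no. of injured athletes (N =)a: 156/25
  Number of features: NR

Whiteside et al., 2016 [46]
  Outcome variable: Occurrence of ulnar collateral ligament reconstruction
  Predictor variables: Demographic and pitching performance
  Participants (age mean ± sd): 208 male professional baseball pitchers from Major League Baseball in the USA and Canada: 104 cases (age 27.3 ± 3.8) and 104 controls (age 27.8 ± 3.7)
  Period: 5 years (2010–2015)
  Study design: Matched case–control
  Unit of observation: Player
  Number of observations: 208
  Total amount of injuries / no. of injured athletes (N =)a: NR/NR
  Number of features: 14
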
a The number relevant for the analysis is put in bold. If the unit of observation is the player, then the number of injured players is relevant, since one only detects whether a player gets injured at least once. If there are multiple
observations per player, the total number of injuries is relevant for the analysis
Data analysis characteristics
In all 11 papers, the outcome variable was the occur-
rence of injury or the type of injury, which are categori-
cal variables, making the base models classification
models. From the 11 considered papers, 9 papers used
tree-based models [2, 11, 25, 32, 34–36, 38, 41], 4
papers used Support Vector Machines [11, 34, 38, 46]
and 2 papers used Artificial Neural Networks [27, 38].
Eight out of 9 papers using tree-based models applied a
bagging strategy [2, 11, 25, 32, 34, 36, 38, 41], whereof
5 used a Random Forest approach [11, 34, 36, 38, 41].
Four papers used boosting algorithms to construct tree
ensemble methods [2, 25, 32, 35].
The training, validation and test strategy for the used
ML approaches varied largely between the different
studies. For the evaluation and comparison of the
methods, 7 papers [2, 25, 27, 32, 34, 36, 46] used cross-
validation and 4 [11, 35, 38, 41] used a single data-split-
ting approach. Of the former, four [2, 25, 32, 36] used
stratified cross-validation, which may be especially of
interest in unbalanced data, because it ensures that in
both training and test set the number of positive cases
(injuries) is sufficiently high. In 4 papers [11, 34, 35,
38] the training dataset was split further for tuning the
hyperparameters. In 3 papers [11, 36, 38] the authors
repeated their entire analysis a large number of times to
adjust for the randomness in the resampling and under/
oversampling methods.
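The idea behind stratified cross-validation can be sketched in a few lines of plain Python: indices are dealt out to folds per class, so that every fold receives a similar share of injuries. This is a generic illustration under assumed toy labels; the function name `stratified_k_fold` is hypothetical and the code is not taken from any of the reviewed papers.

```python
import random
from collections import defaultdict

def stratified_k_fold(labels, k=5, seed=0):
    """Return k lists of indices with roughly equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

# Toy labels (hypothetical): 10 injured (1) and 40 non-injured (0) athletes.
labels = [1] * 10 + [0] * 40
folds = stratified_k_fold(labels, k=5)
injuries_per_fold = [sum(labels[i] for i in fold) for fold in folds]
```

Each fold then serves once as the test set while the remaining folds form the training set.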
Three of the discussed papers used feature selection
methods [34, 36, 46]. Rodas et al. [34] used the LASSO
method for selecting significant features, Rossi et al. [36]
eliminated features by applying cross-validation on a separate
part of the data, and Whiteside et al. [46] evaluated
all possible feature subsets. Carey et al. [11] used Principal
Component Analysis for reducing the dimensionality
of the data instead of feature selection.
To adjust for imbalanced data, the training datasets
were over- and/or undersampled in 6 papers [2, 11, 25,
32, 36, 38]. All of them used oversampling of the minority
class (injuries) and 4 of them applied undersampling of
the majority class (non-injuries) [2, 11, 25, 32].
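A minimal sketch of random oversampling of the minority class is given below. It uses plain duplication rather than the synthetic minority oversampling applied in several of the reviewed papers, and it is meant to be applied to the training data only. The function name `random_oversample` and the toy rows are assumptions for illustration.

```python
import random

def random_oversample(rows, labels, minority_label=1, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    if not minority or len(minority) >= len(majority):
        return list(rows), list(labels)       # nothing to balance
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra   # original rows plus duplicates
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy training data (hypothetical): one injury among six observations.
train_rows = [[0.1], [0.4], [0.35], [0.8], [0.9], [0.2]]
train_labels = [0, 0, 0, 0, 1, 0]
balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)
```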
Data pre-processing was used in some papers to opti-
mize the performance of the classification methods. To
solve the missing values problem, three papers [2, 25, 34]
mentioned using imputation methods. In three papers [2,
25, 32], the continuous variables were transformed into
categorical variables, using cut-off values found in the
literature or based on the data. There was only one paper
[38] that mentioned a standardization of the continuous
variables.
Some of the studies had small deficits in the Machine
Learning Pipeline approach. Four papers in this review
had multiple observations per athlete [11, 27, 36, 41]
and it seems that players may appear in both the train-
ing and test datasets, which would be a violation of the
rule that the training and test dataset should be
independent from each other. The results of these studies
can therefore not be generalized to a bigger population.
The other mistakes were made in the preprocessing
phase. Four papers [2, 25, 32, 38] seemed to perform
discretization or standardization on the entire dataset
(including test dataset), which in that case would be an
example of data leakage, i.e. using the test data in the
training process. This should be avoided since it does
not reflect reality as the test dataset has to be seen as
future data. On the other hand, Ruddy et al. [38]
independently standardized the training and test dataset.
Applying different transformations on the training and
Table 2 Methodological quality of the included studies using the NOS scale [45]
Cohort studies             Selection  Comparability  Outcome   Conclusion
Oliver et al               ****       *              ***       8
Ayala et al                ****       -              ***       7
López-Valenciano et al     ****       -              ***       7
Rommers et al              ****       -              ***       7
Rossi et al                **         -              ***       5
Carey et al                **         -              ***       5
Ruddy et al                ****       -              ***       8
Thornton et al             **         -              ***       5
McCullagh & Whitfort       ****       -              ***       7
Case–control studies       Selection  Comparability  Exposure  Conclusion
Rodas et al                ****       -              ***       7
Whiteside et al            ****       -              ***       7
Table 3 Quality of Evidence according to GRADE [18]
Study                      Imprecision (n < 300)  Risk of bias (NOS < 6)  Conclusion
Oliver et al               N (n = 355)            N = 8                   moderate-quality
Ayala et al                Y (n = 96)             N = 7                   low-quality
López-Valenciano et al     Y (n = 132)            N = 7                   low-quality
Rommers et al              N (n = 734)            N = 7                   moderate-quality
Rossi et al                Y (n = 26)             Y = 5                   very-low
Carey et al                Y (n = 133)            Y = 5                   low-quality
Ruddy et al                N (n = 362)            N = 8                   moderate-quality
Thornton et al             Y (n = 25)             Y = 5                   very-low
McCullagh & Whitfort       Y (n = 39)             N = 7                   very-low
Rodas et al                N (n = 363)            N = 7                   moderate-quality
Whiteside et al            Y (n = 113)            N = 7                   low-quality
test dataset will cause non-optimal operation of the
classifier and can lead to lower predictive performance.
A structured overview of the data analysis characteristics
can be found in Table 4.
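The two pitfalls discussed above (athletes appearing in both training and test data, and preprocessing estimated on the full data set) can be avoided along the following lines. This is a hedged sketch with hypothetical function names and toy training-load values: the split is made per athlete, and the standardization parameters are estimated on the training part and then reused unchanged for the test part.

```python
import random
from statistics import mean, pstdev

def split_by_athlete(observations, test_fraction=0.3, seed=0):
    """Split (athlete_id, value) observations so no athlete is in both parts."""
    athlete_ids = sorted({athlete_id for athlete_id, _ in observations})
    random.Random(seed).shuffle(athlete_ids)
    n_test = max(1, round(len(athlete_ids) * test_fraction))
    test_ids = set(athlete_ids[:n_test])
    train = [obs for obs in observations if obs[0] not in test_ids]
    test = [obs for obs in observations if obs[0] in test_ids]
    return train, test

def fit_standardizer(values):
    """Estimate mean and standard deviation on the training values only."""
    mu, sigma = mean(values), pstdev(values)
    return mu, (sigma if sigma > 0 else 1.0)   # guard against constant features

def standardize(values, mu, sigma):
    """Apply an already fitted standardization to any set of values."""
    return [(v - mu) / sigma for v in values]

# Toy data (hypothetical): 10 athletes with 4 training-load observations each.
observations = [(athlete, 100.0 + 10 * athlete + session)
                for athlete in range(10) for session in range(4)]
train, test = split_by_athlete(observations)
mu, sigma = fit_standardizer([value for _, value in train])
train_scaled = standardize([value for _, value in train], mu, sigma)
test_scaled = standardize([value for _, value in test], mu, sigma)  # same mu, sigma
```

Because `mu` and `sigma` come from the training part alone, the test values are treated as future data and the same transformation is applied to both parts.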
Performance in predicting injury occurrence
In Table 5, the study results characteristics are given
for each of the included papers. For predicting the
occurrence of the outcome (injury in general, muscle
injury, …), seven papers used Area Under the ROC
Curve (AUC) as an evaluation measure [2, 11, 25, 32,
36, 38, 41], while the remaining four papers used only
metrics based on the confusion matrix, e.g. accuracy,
sensitivity, specificity, precision and f1-score. Eight out
of eleven studies [2, 25, 27, 32, 35, 36, 41, 46] reported
appropriate to good performance of the Machine
Learning prediction methods. AUC values for predict-
ing the outcome ranged between 0.64 and 0.87, and
high values were found for accuracy (75%–82.9%),
sensitivity (55.6%–94.5%), specificity (74.2%–87%)
and precision (50%–85%). Three papers [11, 34, 38]
reported low prediction potential for their built ML
models, showing low AUC (0.52–0.65) and accuracy
(52%) values.
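For readers less familiar with these evaluation measures, the sketch below computes the confusion-matrix metrics and the AUC for a toy example in plain Python. The AUC is computed as the fraction of (injured, non-injured) pairs in which the injured case receives the higher score, with ties counted as one half. The function names, the 0.5 decision threshold and the toy scores are assumptions of this example.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision and f1-score for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Toy example (hypothetical): 3 injured and 5 non-injured athletes.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Unlike the confusion-matrix metrics, the AUC does not depend on the chosen decision threshold.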
Most important injury predictors
Analysed risk factors included both modifiable (train-
ing load, psychological and neuromuscular assessment,
stress level, …) and non-modifiable (demographics,
genetic markers, anthropometric measurements, previous
injury, …) factors (for more details, see Table 1).
In 4 papers [2, 25, 32, 41], the authors have counted
the number of appearances of each feature in the final
ensemble of decision trees. Two papers [34, 46] counted
the number of times that a feature is selected by their
feature selection procedure. Rossi et al. [36] used the
decrease in Gini coefficient to measure the importance
of variables and Rommers et al. [35] used a SHAP summary
plot [26]. This plot was based on the Shapley
values in game theory and shows the importance of
the variables, as well as the relation between high/low
feature values and high/low injury risk. Because of the
wide variety of features used over the different papers,
not much consistency was found in the reported
important predictors. The features reported as important
by two papers each were previous injury [25, 36], higher
training load [36, 41], and higher body size (in youth
players only) [32, 35]. Note that lower training load
after previous injury might indicate a not fully recov-
ered athlete and can hence be considered being a risk
factor after previous injury [36].
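To illustrate what an impurity-based importance measure captures, the following sketch computes the decrease in Gini impurity achieved by a single threshold split on one feature. It assumes the standard Gini impurity definition used by tree-based methods and is not taken from the code of Rossi et al.; tree ensembles typically aggregate such decreases over all splits that use a feature. The function names and toy values are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def gini_decrease(values, labels, threshold):
    """Impurity of the parent node minus the weighted impurity of its two children."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
             + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted

# Toy data (hypothetical): training load per athlete and injury outcome.
training_load = [120, 150, 180, 300, 320, 350]
injured = [0, 0, 0, 1, 1, 1]
decrease = gini_decrease(training_load, injured, threshold=200)
```

A split that separates injured from non-injured athletes perfectly removes all impurity, whereas a split that leaves the class mix unchanged yields a decrease of zero.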
Discussion
The 11 studies included in this systematic review showed
that ML methods can be successfully applied for sport
injury prediction. The most promising results to predict
injury risk were obtained in elite youth football players
based on anthropometric, motor coordination and physical
performance measures with a high accuracy of 85%
[35], and in professional soccer based on a pre-season
screening evaluation with a high sensitivity (77.8%) and
specificity (83.8%) [2]. This is in contrast with several
authors who found that screening tests were not successful
in predicting sports injuries [40, 43]. These results
are promising in the sense that future models might help
coaches, physical trainers and medical practitioners in
the decision-making process for injury prevention.
Data inclusion was still limited in the analysed studies,
where only selected variables were included (e.g., only
anthropometric, motor coordination and physical performance
measures in the study by Rommers et al. [35]).
Nevertheless, the achieved accuracy was quite high and
future prediction might become even higher by using
smart machine learning approaches or by incorporating
more data (e.g., using sensors, more intense monitoring
of athletes) [44]. Future studies will need to refine the
target of injury prediction with AI/ML. This can either
be achieved with an increase of the number of different
injuries affecting a specific population or a study cohort
[35] or with a targeted inclusion of specific injuries with
a high injury incidence like hamstring injuries in football
or athletics [2, 38], or a high injury burden like anterior
cruciate ligament (ACL) injuries in pivoting sports or
ulnar collateral injuries in baseball [46]. The types and
number of injury risk factors to be included in these stud-
ies are manifold and vary for each target. Large datasets
may help the sports medicine community to improve the
understanding of the respective influence of each factor
on injury occurrence as well as their specific interactions
in a given environment, allowing for a more systemic
approach of sports injury prevention [6–8, 15].
In the new field of ML for sports injury prevention,
the level of quality of the published studies is of utmost
importance. The analysis of the methodological quality
of the 11 included studies indicates that they had very-
low to moderate methodological quality according to
the GRADE analysis. Imprecision (i.e., a study including
relatively few participants/events) is an issue that may
be improved with multicentric studies. Only 3 studies
[11, 36, 41] had a NOS score under 6 and only 1 study
[32] scored in comparability. In fact, the main reason
for a lower NOS score was lack of comparability, which
indicates that either cases and controls or exposed and
non-exposed individuals were not matched in the design
and/or confounders were not adjusted for in the analysis.
Table 4 Data analysis characteristics
Authors Train, Validate and Test Strategy Data Pre-processing Feature Selection/
Dimensionality Reduction Machine Learning Classification
Methods Deficits of ML Analysis
AYALA ET AL threefold stratified cross-vali-
dation for comparison of 68
algorithms
- Data imputation: missing data
were replaced by the mean
values of the players in the
same division
- Data discretization
No - Decision tree ensembles
- Adjusted for imbalance via
synthetic minority oversam-
pling
- Aggregated using bagging
and boosting methods
Discretization before data split-
ting
CAREY ET AL - Split in training dataset (data
of 2014 and 2015) and test
dataset (data of 2016)
- Hyperparameter tuning via
tenfold cross-validation
- Each analysis repeated 50
times
NR Principal Component Analysis - Decision tree ensembles (Ran-
dom Forests), Support Vector
Machines
- Adjusted for imbalance via
undersampling and synthetic
minority oversampling
Dependency between training
and test dataset
LÓPEZ-VALENCIANO ET AL fivefold stratified cross-valida-
tion for comparison of 68
algorithms
- Data imputation: missing data
were replaced by the mean
values of the players in the
same division
- Data discretization using litera-
ture and Weka software
No - Decision trees ensembles
- Adjusted for imbalance via
synthetic minority oversam-
pling, random oversampling,
random undersampling
- Aggregated using bagging
and boosting methods
Discretization before data split-
ting
MCCULLAGH ET AL tenfold cross-validation for
testing NR No Artificial Neural Networks with
backpropagation Dependency between training
and test dataset
OLIVER ET AL fivefold cross-validation for
comparison of 57 models - Data discretization using litera-
ture and Weka software No - Decision trees ensembles
- Adjusted for imbalance via
synthetic minority oversam-
pling, random oversampling,
random undersampling
- Aggregated using bagging
and boosting methods
Discretization before data split-
ting
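RODAS ET AL
  Train, Validate and Test Strategy: outer fivefold cross-validation for model testing; inner tenfold cross-validation for hyperparameter tuning
  Data Pre-processing: synthetic variant imputation
  Feature Selection/Dimensionality Reduction: Least Absolute Shrinkage and Selection Operator (LASSO)
  Machine Learning Classification Methods: decision tree ensembles (Random Forests), Support Vector Machines
  Deficits of ML Analysis: none listed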
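ROMMERS ET AL
  Train, Validate and Test Strategy: split into training (80%) and test (20%) dataset; cross-validation for tuning hyperparameters
  Data Pre-processing: NR
  Feature Selection/Dimensionality Reduction: No
  Machine Learning Classification Methods: decision tree ensembles; aggregated using boosting methods
  Deficits of ML Analysis: none listed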
Table 4 (continued)
Authors | Train, Validate and Test Strategy | Data Pre-processing | Feature Selection/Dimensionality Reduction | Machine Learning Classification Methods | Deficits of ML Analysis
ROSSI ET AL
  Train, Validate and Test Strategy: split into dataset 1 (30%) for feature elimination and dataset 2 (70%) for training and testing; stratified two-fold cross-validation on dataset 2; repeated 10,000 times
  Data Pre-processing: NR
  Feature Selection/Dimensionality Reduction: Recursive Feature Elimination with Cross-Validation
  Machine Learning Classification Methods: decision tree ensembles; adjusted for imbalance via adaptive synthetic sampling; aggregated using Random Forests
  Deficits of ML Analysis: dependency between training and test dataset
RUDDY ET AL
  Train, Validate and Test Strategy: Between Year approach: split into training dataset (2013) and test dataset (2015). Within Year approach: split into training (70%) and test (30%) dataset. Both approaches: tenfold cross-validation for hyperparameter tuning; each analysis repeated 10,000 times
  Data Pre-processing: data standardization
  Feature Selection/Dimensionality Reduction: No
  Machine Learning Classification Methods: single decision tree, decision tree ensembles (Random Forests), Artificial Neural Networks, Support Vector Machines; adjusted for imbalance via synthetic minority oversampling
  Deficits of ML Analysis: standardization independent in training and test dataset
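THORNTON ET AL
  Train, Validate and Test Strategy: split into training (70%), validation (15%), and test (15%) dataset
  Data Pre-processing: NR
  Feature Selection/Dimensionality Reduction: No
  Machine Learning Classification Methods: decision tree ensembles; aggregated using Random Forests
  Deficits of ML Analysis: none listed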
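WHITESIDE ET AL
  Train, Validate and Test Strategy: fivefold cross-validation for comparison of models
  Data Pre-processing: NR
  Feature Selection/Dimensionality Reduction: Brute Force feature selection (every possible subset of features is tested)
  Machine Learning Classification Methods: Support Vector Machines
  Deficits of ML Analysis: none listed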
Table 5 Study results characteristics
Authors, Year | Performance Measures (+ Values for Best ML Model) | Predictive Performance of ML Methods | Measures of Feature Importance | Most Important Injury Predictors
Ayala et al., 2019 [2]
  Performance Measures: AUC (0.873), Sensitivity (77.8%), Specificity (83.8%)
  Predictive Performance of ML Methods: An alternating decision tree, combined with synthetic minority oversampling and boosting, gave the best results
  Measures of Feature Importance: The frequency with which each of the features appears across the tree classifiers
  Most Important Injury Predictors: Sleep quality
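Carey et al., 2018 [10]
  Performance Measures: (Median) AUC (all below 0.65), Sensitivity, Specificity, Precision, False Discovery Rate, Likelihood Ratios
  Predictive Performance of ML Methods: The proposed ML models perform only marginally better than would be expected by random chance
  Measures of Feature Importance: NR
  Most Important Injury Predictors: NR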
López-Valenciano et al., 2018 [25]
  Performance Measures: AUC (0.747), Sensitivity (65.5%), Specificity (79.1%)
  Predictive Performance of ML Methods: An alternating decision tree, combined with synthetic minority oversampling and boosting, gave the best results
  Measures of Feature Importance: The frequency with which each of the features appears across the tree classifiers
  Most Important Injury Predictors: Sport devaluation, history of muscle injury in last season
McCullagh et al., 2013 [27]
  Performance Measures: Accuracy (82.9%), Sensitivity (94.5%), Specificity (81.1%)
  Predictive Performance of ML Methods: Indication that Artificial Neural Networks are able to derive meaningful information from the vast amount of data available to assist in the injury prediction process
  Measures of Feature Importance: NR
  Most Important Injury Predictors: NR
Oliver et al., 2020 [32]
  Performance Measures: AUC (0.663), Sensitivity (55.6%), Specificity (74.2%)
  Predictive Performance of ML Methods: The machine learning model provided improved sensitivity to predict injury
  Measures of Feature Importance: The frequency with which each of the features appears across the tree classifiers
  Most Important Injury Predictors: Interactions of asymmetry, knee valgus angle and body size
Rodas et al., 2019 [34]
  Performance Measures: Accuracy (52%), Sensitivity (75%), Specificity (23%)
  Predictive Performance of ML Methods: There is low prediction potential for presence or absence of tendinopathy
  Measures of Feature Importance: The number of times that a feature (genetic predictor) received a non-zero coefficient in the LASSO analysis
  Most Important Injury Predictors: rs10477683 in the fibrillin 2 gene was the most robust SNP (single-nucleotide polymorphism)
Rommers et al., 2020 [35]
  Performance Measures: F1-score (85%), Sensitivity (85%), Precision (85%)
  Predictive Performance of ML Methods: It is possible to predict injury with high accuracy
  Measures of Feature Importance: SHAP (SHapley Additive exPlanations) summary plot
  Most Important Injury Predictors: Higher predicted age at PHV (peak height velocity), longer legs, higher body height, lower body fat percentage
Rossi et al., 2018 [36]
  Performance Measures: (Mean) AUC (0.76), F1-score (64%), Sensitivity (80%), Specificity (87%), Precision (50%), Negative Predictive Value (96%)
  Predictive Performance of ML Methods: The single decision tree performs best in terms of precision
  Measures of Feature Importance: Mean decrease in Gini coefficient
  Most Important Injury Predictors: Previous injury (exponential weighted moving average), total distance (monotony of workload feature) and high-speed running distance (exponential weighted moving average)
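Ruddy et al., 2018 [38]
  Performance Measures: (Median) AUC (0.58, 0.57 and 0.52)
  Predictive Performance of ML Methods: Eccentric hamstring strength, age, and previous hamstring strain injury (HSI) data cannot be used to identify athletes at an increased risk of HSI with any consistency
  Measures of Feature Importance: NR
  Most Important Injury Predictors: NR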
Thornton et al., 2017 [41]
  Performance Measures: AUC (0.74, 0.65, 0.64 and 0.64)
  Predictive Performance of ML Methods: Machine learning techniques can appropriately monitor injury risk amongst professional team sport athletes
  Measures of Feature Importance: Number of times that each feature appears in the ensemble of decision trees
  Most Important Injury Predictors: The relative importance of each training load variable varied for each player
Whiteside et al., 2016 [46]
  Performance Measures: Accuracy (75%), Sensitivity (74%), Specificity (75%), Precision (75%), False Omission Rate (26%)
  Predictive Performance of ML Methods: Machine learning models can predict future ulnar collateral ligament surgeries with high accuracy
  Measures of Feature Importance: The frequency with which each feature appeared in the optimized models in the fivefold cross-validation
  Most Important Injury Predictors: Mean days between consecutive games, pitches in repertoire, mean pitch speed, horizontal release location
Oliver et al., the only paper considering comparability,
recruited 6 professional football teams of the English
Premier League and Championship and followed 355
athletes [32]. Injured and non-injured players were com-
pared in all continuous variables and all 95% CI presented
had proper range, indicating adequate matching between
groups. Future studies should be aware of this common
limitation and include the comparison between groups.
In terms of ML methodology, the following observations
can be made from this review. (i) Tree-based models are
currently the most popular ML models in sports medicine.
They are easy to visualize and interpret, and they can be
extended to ensemble methods for boosting and bagging
purposes or adapted to be cost-sensitive. The two
publications that did not use tree-based models were the
first to be published on the subject [27, 46], which supports
the trend that more recent studies adhere to this
methodology. (ii) The 11 papers varied widely in how they
trained and evaluated the ML models. It was surprising to
see that only 4 papers [11, 34, 35, 38] mentioned having
tuned the hyperparameters to optimize the performance of
the ML methods, since tuning hyperparameters is recommended
(though not mandatory) to get the most out of the ML
methods. The other studies may have used values from the
literature or the default values of the software used, which
may have led to a failure to identify the optimal model.
(iii) The findings from this review further reveal that
future studies involving ML approaches in the field of
sports injury prevention should aim for a higher
methodological quality. One of the identified deficits of
the analysed studies was the dependency between training
and test datasets.
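A dependency between training and test data can arise when a preprocessing step, such as standardization or discretization, is estimated on observations that are later used for testing. The following minimal Python sketch is not taken from any of the reviewed studies. It illustrates one common safeguard: the data are split first, and the standardization parameters are then estimated on the training portion only and reused for the test portion. The training-load values and the helper names (train_test_split, fit_standardizer, standardize) are hypothetical and serve for illustration only.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it before any preprocessing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_standardizer(values):
    """Estimate mean and standard deviation from the given values only."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return mean, (sd if sd > 0 else 1.0)  # guard against constant features


def standardize(values, mean, sd):
    """Apply previously estimated parameters to any set of values."""
    return [(v - mean) / sd for v in values]


# Hypothetical weekly training loads (arbitrary units), illustration only.
loads = [310.0, 295.0, 402.0, 388.0, 350.0, 275.0, 430.0, 365.0, 320.0, 410.0]

train, test = train_test_split(loads, test_fraction=0.3, seed=42)
mean, sd = fit_standardizer(train)      # parameters come from training data
train_z = standardize(train, mean, sd)
test_z = standardize(test, mean, sd)    # the same parameters are reused

print("train mean/sd after scaling:",
      round(statistics.fmean(train_z), 6), round(statistics.pstdev(train_z), 6))
print("standardized test values:", [round(z, 3) for z in test_z])
```

The same ordering applies to discretization thresholds, imputation values and feature selection: each would be derived inside the training folds and then applied unchanged to the held-out data.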
The predictive performance of the considered publications
was very heterogeneous. It should be emphasized that the
reported predictive performances cannot be seen as a
quality measure of the ML analysis per se, because they
depend on many other factors, such as the kind of included
risk factors, the design of the study, the sample size or
the unit of observation. This also became apparent when the
publications of Ayala et al. [2], López-Valenciano et al.
[25] and Oliver et al. [32] were compared with each other.
They used similar preprocessing and processing steps and
classification trees, but reported very different
performance values (AUC ranging from 0.663 to 0.873).
Furthermore, the reported measures (AUC, accuracy,
sensitivity, specificity) might not be the best measures to
evaluate the prediction models, since these measures only
see black and white (injured or not injured), while
probabilistic scoring rules, such as the Brier Score and
the Logarithmic Loss, would be able to evaluate the
exactness of a predicted probability (e.g. this player has
a 30% chance of getting injured), as stated by Carey et al.
[10]. From a clinical point of view, it could be more
informative to know the probability of injury instead of
only the classification into a high or low risk profile.
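To make the difference between classification measures and probabilistic scoring rules concrete, the short Python sketch below computes the Brier Score and the Logarithmic Loss for two sets of predicted injury probabilities. The outcomes and probabilities are invented for illustration and do not come from any reviewed study. Both hypothetical models classify every player correctly at a 0.5 threshold, yet the scoring rules separate them.

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the observed outcomes."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def accuracy(y_true, p_pred, threshold=0.5):
    """Share of correct high/low risk classifications at a fixed threshold."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


# Hypothetical outcomes (1 = injured) and predicted injury probabilities.
y = [1, 0, 0, 1, 0]
confident = [0.9, 0.1, 0.2, 0.8, 0.1]
hesitant = [0.6, 0.4, 0.45, 0.55, 0.4]

for name, p in (("confident", confident), ("hesitant", hesitant)):
    print(name, accuracy(y, p), round(brier_score(y, p), 4), round(log_loss(y, p), 4))
```

Lower values are better for both scoring rules, so the model whose probabilities lie closer to the observed outcomes is rewarded even though accuracy is identical.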
When dealing with injury prediction and prevention, it is
important to identify especially modifiable risk factors,
which can be intrinsic or extrinsic [28]. While some
studies did not provide any information on the relative
importance or influence of an individual risk factor,
others used the number of times that a considered variable
appeared in the ensemble of decision trees. Rossi et al.
[36] measured how much the predictive performance of an
algorithm would decrease if a specific variable were left
out as a predictor. Rommers et al. [35] provided a
visualisation of the influence of the risk factors on the
predicted injury risk. It therefore appears that more
effort should be made to understand the relative weight of
individual risk factors on the injury risk. This approach
may help guide practitioners in applying targeted
interventions to the athletes at high injury risk.
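The idea of measuring how much predictive performance drops when a variable is left out can be sketched in a few lines of Python. The example below is a simplified drop-column importance, not the procedure of any reviewed study (those relied on tree ensembles, mean decrease in Gini or SHAP values). A nearest-centroid classifier is assumed here only to keep the sketch short, and the two features ("workload" and "noise") and all data values are hypothetical.

```python
def centroid(rows, features):
    """Mean of the selected feature columns over a list of rows."""
    return [sum(r[f] for r in rows) / len(rows) for f in features]


def fit(X, y, features):
    """Nearest-centroid classifier: store one centroid per class label."""
    return {
        label: centroid([x for x, t in zip(X, y) if t == label], features)
        for label in sorted(set(y))
    }


def predict(model, x, features):
    """Return the label whose centroid is closest (squared distance)."""
    def dist(c):
        return sum((x[f] - cf) ** 2 for f, cf in zip(features, c))
    return min(model, key=lambda label: dist(model[label]))


def accuracy(X_train, y_train, X_test, y_test, features):
    """Fit on the training data and score on the held-out data."""
    model = fit(X_train, y_train, features)
    hits = sum(predict(model, x, features) == t for x, t in zip(X_test, y_test))
    return hits / len(y_test)


def drop_column_importance(X_train, y_train, X_test, y_test, n_features):
    """Accuracy lost when each feature in turn is left out of the model."""
    all_features = list(range(n_features))
    baseline = accuracy(X_train, y_train, X_test, y_test, all_features)
    importance = {}
    for f in all_features:
        reduced = [g for g in all_features if g != f]
        importance[f] = baseline - accuracy(X_train, y_train, X_test, y_test, reduced)
    return importance


# Hypothetical data: column 0 = standardized workload, column 1 = noise.
X_train = [[2.0, 0.1], [1.8, -0.2], [2.2, 0.3], [0.0, 0.0], [0.2, 0.2], [-0.2, -0.1]]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = injured
X_test = [[1.9, 0.0], [0.1, 0.1], [2.1, -0.1], [-0.1, 0.2]]
y_test = [1, 0, 1, 0]

importance = drop_column_importance(X_train, y_train, X_test, y_test, n_features=2)
for name, idx in (("workload", 0), ("noise", 1)):
    print(name, importance[idx])
```

In this toy setting, removing the informative feature costs accuracy while removing the noise feature does not. With real data, such importances would be estimated with cross-validation and interpreted with care when predictors are correlated.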
Limitations of the systematic review
Besides investigating the outcome of machine learning
algorithms in injury prediction and prevention, this
systematic review also focused on the methodology of AI/ML
studies, which probably makes some parts challenging to
read for sports medicine clinicians. To avoid
misinterpretation, a brief summary of AI/ML methods was
included. It is important to stress that a previous review
by Claudino et al. [12] about the use of AI in team sports
provided a first overview of the topic; however, it
included methods that were used in a clearly statistical
way, such as Bayesian logistic regression and single
decision tree classifiers. Using this categorization
implies that studies which were performed before the era of
AI/ML and which included statistical methods like linear or
logistic regression would need to be considered to get a
complete overview of the topic. This would seriously dilute
true ML approaches. Another limitation is the fact that
only the PubMed database was included in this review. Even
so, more relevant studies were found compared to reviews
using other databases, such as Claudino et al. [12].
Conclusion
This systematic review showed that ML methods may be used
to identify athletes at high injury risk during sport
participation and that they may be helpful to identify risk
factors. However, although the majority of the analysed
studies did apply machine learning methods properly to
predict injuries, the methodological study quality was
moderate to very low. Sports injury prediction is a growing
area and further developments in this promising field
should be encouraged with respect to the big potential of
AI/ML methods.
Supplementary Information
The online version contains supplementary material available at
https://doi.org/10.1186/s40634-021-00346-x.
Additional file 1: S1. Definitions of core terms important for AI
application in sport injury prediction and prevention.
Acknowledgements
Not applicable
Authors’ contributions
TT, HVE and LDM designed the search strategy and the overall setup of
the systematic review. HVE and CL performed the technical analysis of the
included studies. TT and LDM performed the methodological quality analysis
of the articles. All authors drafted the manuscript and performed multiple revi-
sions. All authors read and approved the final manuscript.
Funding
No special funding existed for the performed study.
Availability of data and materials
Not applicable.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Author details
1 Department of Applied Mathematics, Computer Science and Statistics,
Ghent University, Krijgslaan 281-S9, 9000 Ghent, Belgium. 2 Graduate Pro-
gram in Rehabilitation and Functional Performance, Universidade Federal
Dos Vales Do Jequitinhonha E Mucuri (UFVJM), Diamantina, Minas Gerais,
Brazil. 3 Department of Physical Therapy and Motor Rehabilitation, Faculty
of Medicine and Health Sciences, Ghent University, Ghent, Belgium. 4 Ministry
of Education of Brazil, CAPES Foundation, Brasília, Distrito Federal, Brazil.
5 Department of Orthopaedic Surgery, Centre Hospitalier Luxembourg
and Luxembourg Institute of Health, Luxembourg, Luxembourg. 6 Department
of Orthopaedic Surgery, University of Rostock, Rostock, Germany.
Received: 25 January 2021 Accepted: 15 March 2021
References
1. Adetiba E, Iweanya VC, Popoola SI, Adetiba JN, Menon C (2017) Automated detection of heart defects in athletes based on electrocardiography and artificial neural network. Cogent Eng 4:1411220
2. Ayala F, López-Valenciano A, Gámez Martín JA, De Ste CM, Vera-Garcia FJ, García-Vaquero MDP, Ruiz-Pérez I, Myer GD (2019) A Preventive Model for Hamstring Injuries in Professional Soccer: Learning Algorithms. Int J Sports Med 40:344–353. https://doi.org/10.1055/a-0826-1955
3. Bahr R, Clarsen B, Ekstrand J (2018) Why we should focus on the burden of injuries and illnesses, not just their incidence. Br J Sports Med 52:1018–1021. https://doi.org/10.1136/bjsports-2017-098160
4. Bahr R, Krosshaug T (2005) Understanding injury mechanisms: a key component of preventing injuries in sport. Br J Sports Med 39:324–329. https://doi.org/10.1136/bjsm.2005.018341
5. Bartlett JD, O’Connor F, Pitchford N, Torres-Ronda L, Robertson SJ (2017) Relationships Between Internal and External Training Load in Team-Sport Athletes: Evidence for an Individualized Approach. Int J Sports Physiol Perform 12:230–234. https://doi.org/10.1123/ijspp.2015-0791
6. Bittencourt NFN, Meeuwisse WH, Mendonça LD, Nettel-Aguirre A, Ocarino JM, Fonseca ST (2016) Complex systems approach for sports injuries: moving from risk factor identification to injury pattern recognition-narrative review and new concept. Br J Sports Med 50:1309–1314. https://doi.org/10.1136/bjsports-2015-095850
7. Bolling C, van Mechelen W, Pasman HR, Verhagen E (2018) Context Matters: Revisiting the First Step of the “Sequence of Prevention” of Sports Injuries. Sports Med Auckl NZ 48:2227–2234. https://doi.org/10.1007/s40279-018-0953-x
8. Bolling C, Mellette J, Pasman HR, van Mechelen W, Verhagen E (2019) From the safety net to the injury prevention web: applying systems thinking to unravel injury prevention challenges and opportunities in Cirque du Soleil. BMJ Open Sport Exerc Med 5:e000492. https://doi.org/10.1136/bmjsem-2018-000492
9. Cabitza F, Locoro A, Banfi G (2018) Machine Learning in Orthopedics: A Literature Review. Front Bioeng Biotechnol 6. https://doi.org/10.3389/fbioe.2018.00075
10. Carey DL, Crossley KM, Whiteley R, Mosler A, Ong K-L, Crow J, Morris ME (2018) Modeling Training Loads and Injuries: The Dangers of Discretization. Med Sci Sports Exerc 50:2267–2276. https://doi.org/10.1249/MSS.0000000000001685
11. Carey DL, Ong K, Whiteley R, Crossley KM, Crow J, Morris ME (2018) Predictive modelling of training loads and injury in Australian football. Int J Comput Sci Sport 17:49–66
12. Claudino JG, de Capanema DO, de Souza TV, Serrão JC, Machado Pereira AC, Nassis GP (2019) Current Approaches to the Use of Artificial Intelligence for Injury Risk Assessment and Performance Prediction in Team Sports: a Systematic Review. Sports Med - Open 5:28. https://doi.org/10.1186/s40798-019-0202-3
13. Emery CA, Pasanen K (2019) Current trends in sport injury prevention. Best Pract Res Clin Rheumatol 33:3–15. https://doi.org/10.1016/j.berh.2019.02.009
14. Ertelt T, Solomonovs I, Gronwald T (2018) Enhancement of force patterns classification based on Gaussian distributions. J Biomech 67:144–149. https://doi.org/10.1016/j.jbiomech.2017.12.006
15. Fonseca ST, Souza TR, Verhagen E, van Emmerik R, Bittencourt NFN, Mendonça LDM, Andrade AGP, Resende RA, Ocarino JM (2020) Sports Injury Forecasting and Complexity: A Synergetic Approach. Sports Med Auckl NZ 50:1757–1770. https://doi.org/10.1007/s40279-020-01326-4
16. Gastin PB, Hunkin SL, Fahrner B, Robertson S (2019) Deceleration, Acceleration, and Impacts Are Strong Contributors to Muscle Damage in Professional Australian Football. J Strength Cond Res 33:3374–3383. https://doi.org/10.1519/JSC.0000000000003023
17. Groll A, Ley C, Schauberger G, Eetvelde HV (2019) A hybrid random forest to predict soccer matches in international tournaments. J Quant Anal Sports 15:271–287. https://doi.org/10.1515/jqas-2018-0060
18. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schünemann HJ (2008) GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 336:924–926. https://doi.org/10.1136/bmj.39489.470347.AD
19. Hasler RM, Berov S, Benneker L, Dubler S, Spycher J, Heim D, Zimmermann H, Exadaktylos AK (2010) Are there risk factors for snowboard injuries? A case-control multicentre study of 559 snowboarders. Br J Sports Med 44:816–821. https://doi.org/10.1136/bjsm.2010.071357
20. Hasler RM, Dubler S, Benneker LM, Berov S, Spycher J, Heim D, Zimmermann H, Exadaktylos AK (2009) Are there risk factors in alpine skiing? A controlled multicentre survey of 1278 skiers. Br J Sports Med 43:1020–1025. https://doi.org/10.1136/bjsm.2009.064741
21. Hubáček O, Šourek G, Železný F (2019) Learning to predict soccer results from relational data with gradient boosted trees. Mach Learn 108:29–47. https://doi.org/10.1007/s10994-018-5704-6
22. Klein C, Luig P, Henke T, Platen P (2020) Injury burden differs considerably between single teams from German professional male football (soccer): surveillance of three consecutive seasons. Knee Surg Sports Traumatol Arthrosc Off J ESSKA 28:1656–1664. https://doi.org/10.1007/s00167-019-05623-y
23. Kuhn M, Johnson K (2013) Applied predictive modeling. Springer, New York
24. Liu Y, Chen P-HC, Krause J, Peng L (2019) How to Read Articles That Use Machine Learning: Users’ Guides to the Medical Literature. JAMA 322:1806–1816. https://doi.org/10.1001/jama.2019.16489
25. López-Valenciano A, Ayala F, Puerta JM, De Ste Croix MBA, Vera-Garcia FJ, Hernández-Sánchez S, Ruiz-Pérez I, Myer GD (2018) A Preventive Model for Muscle Injuries: A Novel Approach based on Learning Algorithms. Med Sci Sports Exerc 50:915–927. https://doi.org/10.1249/MSS.0000000000001535
26. Lundberg SM, Lee S-I (2017) A Unified Approach to Interpreting Model Predictions. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) 30th Conference on Neural Information Processing Systems (NIPS 2017). Curran Associates, Inc., Long Beach, p 4765–4774
27. McCullagh J, Whitfort T (2013) An investigation into the application of Artificial Neural Networks to the prediction of injuries in sport. Int J Sport Health Sci 7:356–360
28. Meeuwisse WH, Tyreman H, Hagel B, Emery C (2007) A dynamic model of etiology in sport injury: the recursive nature of risk and causation. Clin J Sport Med Off J Can Acad Sport Med 17:215–219. https://doi.org/10.1097/JSM.0b013e3180592a48
29. Mendonça LD, Ocarino JM, Bittencourt NFN, Macedo LG, Fonseca ST (2018) Association of Hip and Foot Factors With Patellar Tendinopathy (Jumper’s Knee) in Athletes. J Orthop Sports Phys Ther 48:676–684. https://doi.org/10.2519/jospt.2018.7426
30. Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group (2009) Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med 6:e1000097. https://doi.org/10.1371/journal.pmed.1000097
31. Myers TG, Ramkumar PN, Ricciardi BF, Urish KL, Kipper J, Ketonis C (2020) Artificial Intelligence and Orthopaedics: An Introduction for Clinicians. JBJS 102:830–840. https://doi.org/10.2106/JBJS.19.01128
32. Oliver JL, Ayala F, De Ste Croix MBA, Lloyd RS, Myer GD, Read PJ (2020) Using machine learning to improve our understanding of injury risk and prediction in elite male youth football players. J Sci Med Sport. https://doi.org/10.1016/j.jsams.2020.04.021
33. Parker W, Forster BB (2019) Artificial intelligence in sports medicine radiology: what’s coming? Br J Sports Med 53:1201–1202. https://doi.org/10.1136/bjsports-2018-099999
34. Rodas G, Osaba L, Arteta D, Pruna R, Fernández D, Lucia A (2019) Genomic Prediction of Tendinopathy Risk in Elite Team Sports. Int J Sports Physiol Perform:1–7. https://doi.org/10.1123/ijspp.2019-0431
35. Rommers N, Rössler R, Verhagen E, Vandecasteele F, Verstockt S, Vaeyens R, Lenoir M, D’Hondt E, Witvrouw E (2020) A Machine Learning Approach to Assess Injury Risk in Elite Youth Football Players. Med Sci Sports Exerc 52:1745–1751. https://doi.org/10.1249/MSS.0000000000002305
36. Rossi A, Pappalardo L, Cintia P, Iaia FM, Fernàndez J, Medina D (2018) Effective injury forecasting in soccer with GPS training data and machine learning. PLoS ONE 13:e0201264. https://doi.org/10.1371/journal.pone.0201264
37. Ruddy JD, Cormack SJ, Whiteley R, Williams MD, Timmins RG, Opar DA (2019) Modeling the Risk of Team Sport Injuries: A Narrative Review of Different Statistical Approaches. Front Physiol 10:829. https://doi.org/10.3389/fphys.2019.00829
38. Ruddy JD, Shield AJ, Maniar N, Williams MD, Duhig S, Timmins RG, Hickey J, Bourne MN, Opar DA (2018) Predictive Modeling of Hamstring Strain Injuries in Elite Australian Footballers. Med Sci Sports Exerc 50:906–914. https://doi.org/10.1249/MSS.0000000000001527
39. Shah P, Kendall F, Khozin S, Goosen R, Hu J, Laramie J, Ringel M, Schork N (2019) Artificial intelligence and machine learning in clinical development: a translational perspective. NPJ Digit Med 2:69. https://doi.org/10.1038/s41746-019-0148-3
40. Tervo T, Ermling J, Nordström A, Toss F (2020) The 9+ screening test score does not predict injuries in elite floorball players. Scand J Med Sci Sports 30:1232–1236. https://doi.org/10.1111/sms.13663
41. Thornton HR, Delaney JA, Duthie GM, Dascombe BJ (2017) Importance of Various Training-Load Measures in Injury Incidence of Professional Rugby League Athletes. Int J Sports Physiol Perform 12:819–824. https://doi.org/10.1123/ijspp.2016-0326
42. Topol EJ (2019) High-performance medicine: the convergence of human and artificial intelligence. Nat Med 25:44–56. https://doi.org/10.1038/s41591-018-0300-7
43. Trinidad-Fernandez M, Gonzalez-Sanchez M, Cuesta-Vargas AI (2019) Is a low Functional Movement Screen score (≤14/21) associated with injuries in sport? A systematic review and meta-analysis. BMJ Open Sport Exerc Med 5:e000501. https://doi.org/10.1136/bmjsem-2018-000501
44. Verhagen E, Bolling C (2015) Protecting the health of the @hlete: how online technology may aid our common goal to prevent injury and illness in sport. Br J Sports Med 49:1174–1178. https://doi.org/10.1136/bjsports-2014-094322
45. Wells G, Shea B, O’Connell D, Robertson J, Peterson J, Welch V, Losos M, Tugwell P The Newcastle-Ottawa Scale (NOS) for Assessing the Quality of Nonrandomized Studies in Meta-Analysis. http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp. Accessed 28 June 2020
46. Whiteside D, Martini DN, Lepley AS, Zernicke RF, Goulet GC (2016) Predictors of Ulnar Collateral Ligament Reconstruction in Major League Baseball Pitchers. Am J Sports Med 44:2202–2209. https://doi.org/10.1177/0363546516643812
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in pub-
lished maps and institutional affiliations.