Article
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Purpose: Staying injury free is a major factor for success in sports. Although injuries are difficult to forecast, novel technologies and data-science applications could provide important insights. Our purpose was to use machine learning for the prediction of injuries in runners, based on detailed training logs. Methods: Prediction of injuries was evaluated on a new data set of 74 high-level middle- and long-distance runners, over a period of 7 years. Two analytic approaches were applied. First, the training load from the previous 7 days was expressed as a time series, with each day's training being described by 10 features. These features were a combination of objective data from a global positioning system watch (eg, duration, distance), together with subjective data about the exertion and success of the training. Second, a training week was summarized by 22 aggregate features, and a time window of 3 weeks before the injury was considered. Results: A predictive system based on bagged XGBoost machine-learning models resulted in receiver operating characteristic curves with average areas under the curves of 0.724 and 0.678 for the day and week approaches, respectively. The results of the day approach especially reflect a reasonably high probability that our system makes correct injury predictions. Conclusions: Our machine-learning-based approach predicts a sizable portion of the injuries, in particular when the model is based on training-load data in the days preceding an injury. Overall, these results demonstrate the possible merits of using machine learning to predict injuries and tailor training programs for athletes.
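The "day approach" described in the abstract can be pictured as a sliding window over the training log. The following is a minimal sketch, assuming each day is already encoded as a list of 10 numbers; the function name `day_windows` and the choice to concatenate the days oldest-first are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence

DAYS_PER_WINDOW = 7    # from the abstract: training load of the previous 7 days
FEATURES_PER_DAY = 10  # from the abstract: each day described by 10 features


def day_windows(daily_features: Sequence[Sequence[float]],
                window: int = DAYS_PER_WINDOW) -> List[List[float]]:
    """Build one flat feature vector for every day that has `window` prior days.

    The vector for day t concatenates the features of days t-window .. t-1
    (oldest first), so the day being predicted never leaks into its own input.
    The flattening order is an assumption of this sketch.
    """
    rows = []
    for t in range(window, len(daily_features)):
        flat: List[float] = []
        for day in daily_features[t - window:t]:
            flat.extend(day)
        rows.append(flat)
    return rows


# Hypothetical 9-day log in which every feature of day d equals d.
log = [[float(d)] * FEATURES_PER_DAY for d in range(9)]
windows = day_windows(log)
print(len(windows), len(windows[0]))
```

With 7 days of 10 features each, every row has 70 values; a 9-day log yields rows for days 7 and 8 only.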


... Indeed, many sports organizations have already invested in measurement infrastructures to collect data on stress, or load, and recovery at a physical level (e.g. Brink et al., 2010; Cross et al., 2016; Impellizzeri et al., 2004; Jaspers et al., 2018; Lovdal et al., 2021; Van der Does et al., 2017). This can be extended with the collection of data on psychological stressors and states. ...
... Furthermore, the field of data science and sports analytics offers robust methods to integrate and analyze those multidimensional processes (e.g. Couceiro et al., 2016; De Leeuw et al., 2021; Jaspers et al., 2018; Lovdal et al., 2021; Orie et al., 2021). ...
... In line with the ergodicity issue of psychological resilience, a recent study showed that group-level statistics on load, recovery, and their relationship do not generalize to individual athletes (Neumann et al., 2021). Furthermore, Lovdal et al. (2021) demonstrated the merits of accounting for the temporal element in load and recovery research, thereby applying knowledge from data science. These authors applied a machine learning approach to load and recovery data from middle- and long-distance runners across several years. ...
Article
Full-text available
Athletes are exposed to various psychological and physiological stressors, such as losing matches and high training loads. Understanding and improving the resilience of athletes is therefore crucial to prevent performance decrements and psychological or physical problems. In this review, resilience is conceptualized as a dynamic process of bouncing back to normal functioning following stressors. This process has been of wide interest in psychology, but also in the physiology and sports science literature (e.g. load and recovery). To improve our understanding of the process of resilience, we argue for a collaborative synthesis of knowledge from the domains of psychology, physiology, sports science, and data science. Accordingly, we propose a multidisciplinary, dynamic, and personalized research agenda on resilience. We explain how new technologies and data science applications are important future trends (1) to detect warning signals for resilience losses in (combinations of) psychological and physiological changes, and (2) to provide athletes and their coaches with personalized feedback about athletes’ resilience.
... The machine learning algorithm chosen for this research is Extreme Gradient Boosting of decision trees, or XGBoost for short. This decision is motivated by its outstanding performance on various Kaggle benchmark data sets, among others, by its efficiency in learning and applying a model, and by its ability to determine the relevance of each independent variable, which facilitates the interpretation of the pipeline [38,39,40]. ...
... This is important to avoid overfitting. A similar approach was adopted in [40]. For a given app, the prediction (removed or not removed) is then determined from the average score of all predictions by the participating XGBoost models. ...
Preprint
Full-text available
Mobile app stores are the key distributors of mobile applications. They regularly apply vetting processes to the deployed apps. Yet, some of these vetting processes might be inadequate or applied late. The late removal of applications might have unpleasant consequences for developers and users alike. Thus, in this work we propose a data-driven predictive approach that determines whether the respective app will be removed or accepted. It also indicates the features' relevance that help the stakeholders in the interpretation. In turn, our approach can support developers in improving their apps and users in downloading the ones that are less likely to be removed. We focus on the Google App store and we compile a new data set of 870,515 applications, 56% of which have actually been removed from the market. Our proposed approach is a bootstrap aggregating of multiple XGBoost machine learning classifiers. We propose two models: user-centered using 47 features, and developer-centered using 37 features, the ones only available before deployment. We achieve the following Areas Under the ROC Curves (AUCs) on the test set: user-centered = 0.792, developer-centered = 0.762.
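Bootstrap aggregating (bagging) trains several models on resampled copies of the training set and averages their scores. The sketch below illustrates only that averaging idea, under loud assumptions: the base learner is a toy nearest-centroid scorer standing in for XGBoost (it is not the classifier used in the paper), each class is resampled separately so that every bag contains both classes, and all function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], float]  # maps a feature row to a score in [0, 1]


def fit_centroid_scorer(X: Sequence[Row], y: Sequence[int]) -> Model:
    """Toy stand-in base learner (NOT XGBoost): compare distances to class centroids."""
    def centroid(label: int) -> List[float]:
        rows = [x for x, lab in zip(X, y) if lab == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    c0, c1 = centroid(0), centroid(1)

    def score(x: Row) -> float:
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        # Far from class 0 and close to class 1 gives a score near 1.
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

    return score


def bagged_scorer(X: Sequence[Row], y: Sequence[int],
                  n_models: int = 9, seed: int = 0) -> Model:
    """Train `n_models` base learners on bootstrap samples and average their scores."""
    rng = random.Random(seed)
    pos = [i for i, lab in enumerate(y) if lab == 1]
    neg = [i for i, lab in enumerate(y) if lab == 0]
    models: List[Model] = []
    for _ in range(n_models):
        # Sample with replacement within each class (an assumption of this sketch).
        idx = rng.choices(pos, k=len(pos)) + rng.choices(neg, k=len(neg))
        models.append(fit_centroid_scorer([X[i] for i in idx], [y[i] for i in idx]))
    return lambda x: sum(m(x) for m in models) / len(models)


# Hypothetical two-feature data: label 1 clusters near (1, 1), label 0 near (0, 0).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [0.9, 1.0]]
y = [0, 0, 0, 1, 1, 1]
predict = bagged_scorer(X, y)
print(predict([0.0, 0.0]) < 0.5 < predict([1.0, 1.0]))
```

Averaging scores rather than hard votes keeps the output usable for threshold-free metrics such as the area under the ROC curve.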
... In recent years, several machine-learned injury-predicting models (Wilzman et al. (2022); Rommers et al. (2020); Oliver et al. (2020b)) and machine-learned models based on plantar pressure (Wilzman et al. (2022); Booth et al. (2020); Chen et al. (2021); Ardhianto et al. (2022); Botros et al. (2016); Nong et al. (2021); Jeon et al. (2008)) have been proposed and motivated the use of machine learning in this study. However, most of the proposed injury-predicting machine learning models focus on elite athletes in one particular sport such as soccer (Rossi et al. (2018); Ayala et al. (2019)), running (Lövdal et al. (2021); Martínez-Gramage et al. (2020)) or football (Carey et al. (2018); Rommers et al. (2020); Oliver et al. (2020b)) and the insights gained in these studies might not be transferable to other sports. Moreover, Winter et al. (2019) showed that factors that play a role in injury development depend on the skill level of the participants, which indicates that the findings from the aforementioned studies might not apply to non-elite athletes. ...
Article
Full-text available
Although running has many benefits for both physical and mental health, it also involves a risk of injuries, which result in negative physical, psychological and economic consequences. Those injuries are often linked to specific running biomechanical parameters, such as the pressure pattern of the foot while running, which could potentially be indicative of future injuries. Previous studies focus solely on some specific type of running injury and are often only applicable to a gender-specific or running-experience-specific population. The purpose of this study is, for both male and female first-year students, (i) to predict the development of a lower extremity overuse injury in the next 6 months based on foot pressure measurements from a pressure plate and (ii) to identify the predictive loading features. For the first objective, we developed a machine learning pipeline that analyzes foot pressure measurements and predicts whether a lower extremity overuse injury is likely to occur with an AUC of 0.639 and a Brier score of 0.201. For the second objective, we found that the higher pressures exerted on the forefoot are the most predictive for lower extremity overuse injuries and that foot areas from both the lateral and the medial side are needed. Furthermore, there are two kinds of predictive features: the angle of the FFT coefficients and the coefficients of the autoregressive AR process. However, these features are not interpretable in terms of the running biomechanics, limiting their practical use for injury prevention.
... The use of ML algorithms to predict sports injuries is a current trend in research [14,17,31], but practitioners should remain cautious regarding their use despite recent advances. There are ethical implications to consider [5], such as inadvertently hindering a player's career through a wrongfully attributed worse prognosis. ...
Article
Purpose: Achilles tendon ruptures (ATR) are career-threatening injuries in elite soccer players due to the decreased sports performance they commonly inflict. This study presents an exploratory data analysis of match participation before and after ATRs and an evaluation of the performance of a machine learning (ML) model based on pre-injury features to predict whether a player will return to a previous level of match participation. Methods: The website transfermarkt.com was mined, between January and March of 2021, for relevant entries regarding soccer players who suffered an ATR while playing in first or second leagues. The difference between average minutes played per match (MPM) 1 year before injury and between 1 and 2 years after the injury was used to identify patterns in match participation after injury. Clustering analysis was performed using k-means clustering. Predictions of post-injury match participation were made using the XGBoost classification algorithm. The performance of this model was evaluated using the area under the receiver operating characteristic curve (AUROC) and Brier score loss (BSL). Results: Two hundred and nine players were included in the study. Data from 32,853 matches was analysed. Exploratory data analysis revealed that forwards, midfielders and defenders increased match participation during the first year after injury, with goalkeepers still improving at 2 years. Players were grouped into four clusters regarding the difference between MPMs 1 year before injury and between 1 and 2 years after the injury. These groups ranged between a severe decrease (n = 34; - 59 ± 13 MPM), moderate decrease (n = 75; - 25 ± 8 MPM), maintenance (n = 70; 0 ± 8 MPM), or increase (n = 30; 32 ± 13 MPM). Regarding the predictive model, the average AUROC after cross-validation was 0.81 ± 0.10, and the BSL was 0.12, with the most important features relating to pre-injury match participation. 
Conclusion: Most players take 1 year to reach peak match participation after an ATR. Good performance was attained using a ML classifier to predict the level of match participation following an ATR, with features related to pre-injury match participation displaying the highest importance. Level of evidence: I.
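The Brier score loss mentioned above is the mean squared difference between predicted probabilities and the observed binary outcomes, so lower is better and 0 is perfect. A minimal sketch follows; the function name and the example numbers are illustrative and do not come from the study.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes and predicted probabilities.
print(brier_score([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))
```

A forecaster that always predicts 0.5 scores 0.25, which is a convenient baseline when reading reported values.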
... The authors used a Bayesian approach to statistically determine the association between subjects and a variety of variables including demographic and anthropometric (age, weight, height and BMI) and physiological variables (HR, breathing rate). Lovdal et al. [12] collected data, including external load and the success of training, from 74 high-level middle- and long-distance runners over a period of 7 years. They used XGBoost [3] to predict injury in the next session of competitive runners. ...
... Some studies have provided a first insight into machine learning methods to predict training process outcomes, such as injuries. [29][30][31] However, to date, there is no strong evidence on the accurate prediction of training process data. The system's complexity could again explain this and also the (sometimes limited) validity, reliability, and sensitivity of (sometimes inconsistently) collected data, which can be a pitfall and is occasionally quoted as "garbage in, garbage out." ...
Elite sport practitioners increasingly use data to support training process decisions related to athletes' health and performance. A careful application of data analytics is essential to gain valuable insights and recommendations that can guide decision making. In business organizations, data analytics are developed based on conceptual data analytics frameworks. The translation of such a framework to elite sport may benefit the use of data to support training process decisions. Purpose: The authors aim to present and discuss a conceptual data analytics framework, based on a taxonomy used in business analytics literature to help develop data analytics within elite sport organizations. Conclusions: The presented framework consists of 4 analytical steps structured by value and difficulty/complexity. While descriptive (step 1) and diagnostic analytics (step 2) focus on understanding the past training process, predictive (step 3) and prescriptive analytics (step 4) provide more guidance in planning the future. Although descriptive, diagnostic, and predictive analytics generate insights to inform decisions, prescriptive analytics can be used to drive decisions. However, the application of this type of advanced analytics is still challenging in elite sport. Thus, the current use of data in elite sport is more focused on informing decisions rather than driving them. The presented conceptual framework may help practitioners develop their analytical reasoning by providing new insights and guidance and may stimulate future collaborations between practitioners, researchers, and analytics experts.
... Moreover, the percentage of training days is also considered. The period of the last weeks is chosen because the existing literature concluded that it performs better than shorter time windows [7]-[10]. In the context of the runner's load, the number of competitions since the beginning of the month is taken into account. ...
Article
Full-text available
Background: Research on the application of technology in sports in Romania is completely lacking, and the existing studies at the international level have mainly been carried out in recent years. We considered it appropriate to highlight the best practice models of technology application in sports that can be multiplied, adapted, improved, and widely used. The paper aims to identify the use of technology and devices in sports, with an emphasis on their role in training and competitions with the aim of improving sports performance, to provide sports specialists, organizations, and authorities with a wide range of information regarding the connection between sport and technology. The results obtained regarding the application of technology in sports refer mainly to the following: techniques and technologies used in training and competition (portable localization technology and global positioning systems (GPS); Virtual Reality (VR) technology; video analysis; digital technologies integrated into sports training); aspects of sports training targeted through the use of technology (use of technology for athlete health, recovery, and injury management; use of technology for monitoring sports performance and various body indicators); training optimization and ecological dynamics and the sustainable development of sports. Conclusions: Unitary research, at a European or even global level, in a uniform theoretical and practical framework, could lead to much more efficient training with large increases in sports performance. The coaches and specialists working with the athlete determine the specificity of some elements of the training, depending on the characteristics of each athlete. Large clubs could become a factor in generating and disseminating knowledge related to training and competition monitoring, sports performance enhancement, and health, recovery, and injury management. 
Research directions for the use of technology in sport and the formation of connections with other fields can be extended. For example, combined technologies assisted by specialized software can be used. Creativity must be the starting point for the use and combination of existing technologies in sports and for the creation of new ones. Their creation and use involve the teamwork of athletes, coaches, and specialists from different fields, such as sports, physiology, psychology, biomechanics, informatics, etc.
Article
Mobile app stores are the key distributors of mobile applications. They regularly apply vetting processes to the deployed apps. Yet, some of these vetting processes might be inadequate or applied late. The late removal of applications might have unpleasant consequences for developers and users alike. Thus, in this work, we propose a data-driven predictive approach that determines whether the respective app will be removed or accepted. It also indicates the features’ relevance that helps the stakeholders in the interpretation. In turn, our approach can support developers in improving their apps and users in downloading the ones that are less likely to be removed. We focus on the Google App store and we compile a new data set of 870,515 applications, 56% of which have been removed from the market. Our proposed approach is a bootstrap aggregating of multiple XGBoost machine learning classifiers. We propose two models: user-centered using 47 features, and developer-centered using 37 features, which are available before publishing an app. We achieve the following Areas Under the ROC Curves (AUCs) on the test set: user-centered = 0.792, developer-centered = 0.762.
Article
Full-text available
Even though practicing sports has great health benefits, it also entails a risk of developing overuse injuries, which can elicit a negative impact on physical, mental, and financial health. Being able to predict the risk of an overuse injury arising is of widespread interest because this may play a vital role in preventing its occurrence. In this paper, we present a machine learning model trained to predict the occurrence of a lower-limb overuse injury (LLOI). This model was trained and evaluated using data from a three-dimensional accelerometer on the lower back, collected during a Cooper test performed by 161 first-year undergraduate students of a movement science program. In this study, gender-specific models performed better than mixed-gender models. The estimated area under the receiver operating characteristic curve of the best-performing male- and female-specific models, trained according to the presented approach, was, respectively, 0.615 and 0.645. In addition, the best-performing models were achieved by combining statistical and sports-specific features. Overall, the results demonstrated that a machine learning injury prediction model is a promising, yet challenging approach.
Article
Full-text available
We consider using the area under an empirical receiver operating characteristic curve to test the hypothesis that a predictive index combined with a range of cutoffs performs no better than pure chance in forecasting a binary outcome. This corresponds to the null hypothesis that the area in question, denoted as AUC, is 1/2. We show that if the predictive index comes from a first-stage regression model estimated over the same data set, then testing the null based on the standard asymptotic normality results leads to severe size distortion in general settings. We then analytically derive the proper asymptotic null distribution of the empirical AUC in a special case; namely, when the first-stage regressors are Bernoulli random variables. This distribution can be utilised to construct a fully in-sample test with correct size and more power than out-of-sample tests based on sample splitting, though practical application becomes cumbersome with more than two regressors.
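For orientation, the empirical AUC can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below shows only this statistic and why an uninformative index sits at 1/2; it does not implement the hypothesis test or the null distribution derived in the paper, and the function name is an assumption.

```python
from typing import Sequence


def empirical_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the empirical ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores.
print(empirical_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 3 of 4 pairs ordered correctly
print(empirical_auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))   # constant score: pure chance
```

The pairwise form is quadratic in the sample size, which is fine for illustration but slower than rank-based implementations on large data sets.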
Article
Full-text available
Purpose: To examine the association and predictive ability of internal load markers with regards to non-contact injuries in young elite soccer players. Methods: Twenty-two soccer players (18.6 ± .6 years) who competed in the Spanish U19 League participated in the study. During a full season, non-contact injuries were recorded and, using session rating of perceived exertion (s-RPE), internal weekly load (sum of load of all training sessions and matches for each week) and acute:chronic workload ratio (typically, acute = current week and chronic = rolling 4 week average) were calculated. A Generalized Estimating Equation analysis was used to examine association of weekly and acute:chronic load ratio markers with a non-contact injury in the subsequent week. Load variables were also analyzed for predictive ability with Receiver Operating Characteristic (ROC) curve and area under the curve (AUC). Results: No association was found for weekly load (CI 1.00, .99 to 1.00) and acute:chronic load ratio (CI .16, .01 to 1.84) with respect to injury occurrence. In addition, the analyzed load markers showed poor ability to predict injury occurrence (AUC<.50). Conclusions: The results of this study suggest that internal load markers are not associated with non-contact injuries in young soccer players and present poor predictive capacity with regards to the latter.
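The acute:chronic workload ratio described above divides the current week's load by a rolling 4-week average. A minimal sketch, assuming the chronic window includes the current week (the abstract does not say whether it does) and using made-up weekly loads in arbitrary units:

```python
from typing import List, Optional, Sequence


def acute_chronic_ratio(weekly_loads: Sequence[float],
                        chronic_weeks: int = 4) -> List[Optional[float]]:
    """Return the acute:chronic workload ratio for each week.

    acute   = load of the current week
    chronic = mean load of the last `chronic_weeks` weeks, current week included
              (an assumption of this sketch)
    Weeks without enough history, or with a zero chronic load, yield None.
    """
    ratios: List[Optional[float]] = []
    for i, acute in enumerate(weekly_loads):
        if i + 1 < chronic_weeks:
            ratios.append(None)
            continue
        window = weekly_loads[i + 1 - chronic_weeks:i + 1]
        chronic = sum(window) / chronic_weeks
        ratios.append(acute / chronic if chronic > 0 else None)
    return ratios


# Hypothetical weekly s-RPE loads: four steady weeks followed by a spike.
print(acute_chronic_ratio([100, 100, 100, 100, 200]))
```

Here the spike week has an acute load of 200 against a chronic load of 125, giving a ratio of 1.6; excluding the current week from the chronic window would give 2.0 instead, which is why the windowing convention should always be stated.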
Article
Full-text available
Injuries have a great impact on professional soccer, due to their large influence on team performance and the considerable costs of rehabilitation for players. Existing studies in the literature provide just a preliminary understanding of which factors mostly affect injury risk, while an evaluation of the potential of statistical models in forecasting injuries is still missing. In this paper, we propose a multi-dimensional approach to injury forecasting in professional soccer that is based on GPS measurements and machine learning. By using GPS tracking technology, we collect data describing the training workload of players in a professional soccer club during a season. We then construct an injury forecaster and show that it is both accurate and interpretable by providing a set of case studies of interest to soccer practitioners. Our approach opens a novel perspective on injury prevention, providing a set of simple and practical rules for evaluating and interpreting the complex relations between injury risk and training performance in professional soccer.
Article
Full-text available
In elite sports, training schedules are becoming increasingly complex, and a large number of parameters of such schedules need to be tuned to the specific physique of a given athlete. In this paper, we describe how extensive analysis of historical data can help optimise these parameters, and how possible pitfalls of under- and overtraining in the past can be avoided in future schedules. We treat the series of exercises an athlete undergoes as a discrete sequence of attributed events, that can be aggregated in various ways, to capture the many ways in which an athlete can prepare for an important test event. We report on a cooperation with the elite speed skating team LottoNL-Jumbo, who have recorded detailed training data over the last 15 years. The aim of the project was to analyse this potential source of knowledge, and extract actionable and interpretable patterns that can provide input to future improvements in training. We present two alternative techniques to aggregate sequences of exercises into a combined, long-term training effect, one of which based on a sliding window, and one based on a physiological model of how the body responds to exercise. Next, we use both linear modelling and Subgroup Discovery to extract meaningful models of the data.
Article
Full-text available
Athletes participating in elite sports are exposed to high training loads and increasingly saturated competition calendars. Emerging evidence indicates that poor load management is a major risk factor for injury. The International Olympic Committee convened an expert group to review the scientific evidence for the relationship of load (defined broadly to include rapid changes in training and competition load, competition calendar congestion, psychological load and travel) and health outcomes in sport. We summarise the results linking load to risk of injury in athletes, and provide athletes, coaches and support staff with practical guidelines to manage load in sport. This consensus statement includes guidelines for (1) prescription of training and competition load, as well as for (2) monitoring of training, competition and psychological load, athlete well-being and injury. In the process, we identified research priorities.
Article
Full-text available
Tactical match performance depends on the quality of actions of individual players or teams in space and time during match-play in order to be successful. Technological innovations have led to new possibilities to capture accurate spatio-temporal information of all players and unravel the dynamics and complexity of soccer matches. The main aim of this article is to give an overview of the current state of development of the analysis of position data in soccer. Based on the same single set of position data of a high-level 11 versus 11 match (Bayern Munich against FC Barcelona) three different promising approaches from the perspective of dynamic systems and neural networks will be presented: Tactical performance analysis revealed inter-player coordination, inter-team and inter-line coordination before critical events, as well as team-team interaction and compactness coefficients. This could lead to a multi-disciplinary discussion on match analyses in sport science and new avenues for theoretical and practical implications in soccer.
Article
Full-text available
Aim: To investigate whether acute workload (1 week total distance) and chronic workload (4-week average acute workload) predict injury in elite rugby league players. Methods: Data were collected from 53 elite players over two rugby league seasons. The ‘acute:chronic workload ratio’ was calculated by dividing acute workload by chronic workload. A value of greater than 1 represented an acute workload greater than chronic workload. All workload data were classified into discrete ranges by z-scores. Results: Compared with all other ratios, a very-high acute:chronic workload ratio (≥2.11) demonstrated the greatest risk of injury in the current week (16.7% injury risk) and subsequent week (11.8% injury risk). High chronic workload (>16 095 m) combined with a very high 2-week average acute:chronic workload ratio (≥1.54) was associated with the greatest risk of injury (28.6% injury risk). High chronic workload combined with a moderate workload ratio (1.02–1.18) had a smaller risk of injury than low chronic workload combined with several workload ratios (relative risk range from 0.3 to 0.7×/÷1.4 to 4.4; likelihood range=88–94%, likely). Considering acute and chronic workloads in isolation (ie, not as ratios) did not consistently predict injury risk. Conclusions: Higher workloads can have either positive or negative influences on injury risk in elite rugby league players. Specifically, compared with players who have a low chronic workload, players with a high chronic workload are more resistant to injury with moderate-low through moderate-high (0.85–1.35) acute:chronic workload ratios and less resistant to injury when subjected to ‘spikes’ in acute workload, that is, very-high acute:chronic workload ratios ∼1.5.
Article
Full-text available
To explore the association between in-season training load measures and injury risk in professional Rugby Union players. Methods: This was a one-season prospective cohort study of 173 Professional Rugby Union players from four English Premiership teams. Training load (duration × session-RPE) and time-loss injuries were recorded for all players for all pitch and gym based sessions. Generalised estimating equations were used to model the association between in-season training load measures and injury risk in the subsequent week. Injury risk increased linearly with one-week loads and week-to-week changes in loads, with a 2 standard deviation (SD) increase in these variables (1245 AU and 1069 AU, respectively) associated with odds ratios of 1.68 (95% CI 1.05-2.68) and 1.58 (95% CI: 0.98-2.54). When compared with the reference group (<3684 AU), a significant non-linear effect was evident for four-week cumulative loads, with a likely beneficial reduction in injury risk associated with intermediate loads of 5932 to 8651 AU (OR: 0.55, 95% CI: 0.22-1.38) (this range equates to around four weeks of average in-season training load), and a likely harmful effect evident for higher loads of >8651 AU (OR: 1.39, 95% CI: 0.98-1.98). Players had an increased risk of injury if they had high one-week cumulative loads (1245 AU), or large week-to-week changes in load (1069 AU). In addition, a 'U-shaped' relationship was observed for four-week cumulative loads, with an apparent increase in risk associated with higher loads (>8651 AU). These measures should therefore be monitored to inform injury risk reduction strategies.
Article
Full-text available
Study design: An explorative, 1-year prospective cohort study. Objective: To examine whether an association between a sudden change in weekly running distance and running-related injury varies according to injury type. Background: It is widely accepted that a sudden increase in running distance is strongly related to injury in runners. But the scientific knowledge supporting this assumption is limited. Methods: A volunteer sample of 874 healthy novice runners who started a self-structured running regimen were provided a global-positioning-system watch. After each running session during the study period, participants were categorized into 1 of the following exposure groups, based on the progression of their weekly running distance: less than 10% or regression, 10% to 30%, or more than 30%. The primary outcome was running-related injury. Results: A total of 202 runners sustained a running-related injury. Using Cox regression analysis, no statistically significant differences in injury rates were found across the 3 exposure groups. An increased rate of distance-related injuries (patellofemoral pain, iliotibial band syndrome, medial tibial stress syndrome, gluteus medius injury, greater trochanteric bursitis, injury to the tensor fascia latae, and patellar tendinopathy) existed in those who progressed their weekly running distance by more than 30% compared with those who progressed less than 10% (hazard ratio = 1.59; 95% confidence interval: 0.96, 2.66; P = .07). Conclusion: Novice runners who progressed their running distance by more than 30% over a 2-week period seem to be more vulnerable to distance-related injuries than runners who increase their running distance by less than 10%. Owing to the exploratory nature of the present study, randomized controlled trials are needed to verify these results, and more experimental studies are needed to validate the assumptions. 
Still, novice runners may be well advised to progress their weekly distances by less than 30% per week over a 2-week period.
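The three exposure groups named in the abstract can be expressed as a simple rule on the relative change in weekly distance. The sketch below assumes the progression is the percentage change from the previous week to the current week and that the boundaries 10% and 30% fall into the middle group; the abstract does not spell out either detail, and the function name is hypothetical.

```python
def progression_group(previous_km: float, current_km: float) -> str:
    """Classify the change in weekly running distance into an exposure group.

    Assumes progression = percentage change versus the previous week, with
    exactly 10% and exactly 30% assigned to the middle group.
    """
    if previous_km <= 0:
        raise ValueError("previous weekly distance must be positive")
    change_pct = (current_km - previous_km) / previous_km * 100.0
    if change_pct < 10.0:
        return "less than 10% or regression"
    if change_pct <= 30.0:
        return "10% to 30%"
    return "more than 30%"


# Hypothetical weekly distances in kilometres.
print(progression_group(20.0, 18.0))  # regression
print(progression_group(20.0, 24.0))  # +20%
print(progression_group(20.0, 30.0))  # +50%
```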
Article
Full-text available
Australian track and field has a strong focus on State and National elite youth programmes as the development pathway to elite senior international competition. Yet, there are no clearly defined parameters for appropriate training volumes, training intensities or competition schedules for youth athletes. This study sought to examine the training profiles of, and injuries suffered by, elite youth track and field athletes between the ages 13 and 17 years. The participants were 103 elite NSW athletes (age 17.7±2.4 years, 64% girls) who recalled, through a questionnaire, their training profiles (frequency, volume and intensity) and injuries (type, site and severity) at three age groups: 13-14 years, 15-16 years and at 17 years of age. Eighty-one athletes (78.6%) sustained 200 injuries (time loss > 3 weeks) that were predominantly classified as overuse (76%) with 17.3% of athletes retiring due to injuries prior to turning 18 years. The results, analysed using t-test, one-way analysis of variance and chi-square analysis, showed that injured athletes trained at a higher intensity at 13-14 years (p < 0.01), completed more high-intensity training sessions at 13-14 years (p < 0.01) and 15-16 years (p < 0.05) and had a higher yearly training load at 13-14 years (p < 0.01). There was a significant relationship between forced retirement and having sustained an overuse injury (p<0.05). These findings suggest that monitoring by coaches and athletes of training loads, intensity and the number of hard sessions completed each week is warranted to minimise injuries sustained by 13-16 year old athletes.
Article
Objectives: To examine the relationship between combined training and game loads and injury risk in elite Australian footballers. Design: Prospective cohort study. Methods: Forty-six elite Australian footballers (mean±SD age of 22.2±2.9 y) from one club were involved in a one-season study. Training and game loads (session-RPE multiplied by duration in min) and injuries were recorded each time an athlete exerted an exercise load. Rolling weekly sums and week-to-week changes in load were then modelled against injury data using a logistic regression model. Odds ratios (OR) were reported against a reference group of the lowest training load range. Results: Larger 1 weekly (>1750 AU, OR=2.44-3.38), 2 weekly (>4000 AU, OR=4.74) and previous to current week changes in load (>1250 AU, OR=2.58) significantly related (p<0.05) to a larger injury risk throughout the in-season phase. Players with 2-3 and 4-6 years of experience had a significantly lower injury risk compared to 7+ years players (OR=0.22, OR=0.28) when the previous to current week change in load was more than 1000 AU. No significant relationships were found between all derived load values and injury risk during the pre-season phase. Conclusions: In-season, as the amount of 1-2 weekly load or previous to current week increment in load increases, so does the risk of injury in elite Australian footballers. To reduce the risk of injury, derived training and game load values of weekly loads and previous week-to-week load changes should be individually monitored in elite Australian footballers.
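The load measure in this abstract (session-RPE multiplied by duration in minutes, summed over rolling weeks and compared week to week) can be sketched as follows. This is an illustrative reading only: the daily values are invented, the helper names are hypothetical, and the 7-day window is an assumption about how "rolling weekly sums" were formed.

```python
def session_load(rpe: int, duration_min: int) -> int:
    """Session load in arbitrary units (AU): session-RPE x duration in minutes."""
    return rpe * duration_min


def rolling_sums(daily_loads, window=7):
    """Sum of the most recent `window` days, one value per day once a
    full window of data is available."""
    return [
        sum(daily_loads[i - window + 1 : i + 1])
        for i in range(window - 1, len(daily_loads))
    ]


# Hypothetical (RPE, minutes) pairs for 14 consecutive days; (0, 0) is a rest day.
days = [
    (6, 90), (0, 0), (7, 60), (5, 75), (0, 0), (8, 120), (3, 40),
    (7, 90), (4, 60), (8, 60), (0, 0), (6, 80), (9, 120), (3, 30),
]
daily_loads = [session_load(rpe, minutes) for rpe, minutes in days]

weekly = rolling_sums(daily_loads)      # rolling 7-day loads
previous_week = weekly[0]               # days 1-7
current_week = weekly[-1]               # days 8-14
change = current_week - previous_week   # previous-to-current week change

print(previous_week, current_week, change)
```

For these invented data the two weekly loads are 2415 AU and 3000 AU, a week-to-week increase of 585 AU.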
Article
Elite youth soccer players have a relatively high risk for injuries and illnesses due to increased physical and psychosocial stress. The aim of this study is to investigate how measures to monitor stress and recovery, and its analysis, provide useful information for the prevention of injuries and illnesses in elite youth soccer players. 53 elite soccer players between 15 and 18 years of age participated in this study. To determine physical stress, soccer players registered training and match duration and session rating of perceived exertion for two competitive seasons by means of daily training logs. The Dutch version of the Recovery Stress Questionnaire for athletes (RESTQ-Sport) was administered monthly to assess the psychosocial stress-recovery state of players. The medical staff collected injury and illness data using the standardised Fédération Internationale de Football Association registration system. ORs and 95% CIs were calculated for injuries and illnesses using multinomial regression analyses. The independent measures were stress and recovery. During the study period, 320 injuries and 82 illnesses occurred. Multinomial regression demonstrated that physical stress was related to both injury and illness (range OR 1.01 to 2.59). Psychosocial stress and recovery were related to the occurrence of illness (range OR 0.56 to 2.27). Injuries are related to physical stress. Physical stress and psychosocial stress and recovery are important in relation to illness. Individual monitoring of stress and recovery may provide useful information to prevent soccer players from injuries and illnesses.
Article
Muscle grading of livestock is a primary component of valuation in the meat industry. In pigs, the muscularity of a live animal is traditionally estimated by visual and tactile inspection from an experienced assessor. In addition to being a time consuming process, scoring of this kind suffers from inconsistencies inherent to the subjectivity of human assessment. On the other hand, accurate, computer-driven methods for carcass composition estimation like magnetic resonance imaging (MRI) and computed tomography scans (CT-scans) are expensive and cumbersome to both the animals and their handlers. In this study, we propose a method that is fast, inexpensive, and non-invasive for estimating the muscularity of live pigs, using RGB-D computer vision and machine learning. We used morphological features extracted from depth images of pigs to train a classifier that estimates the muscle scores that are likely to be given by a human assessor. The depth images were obtained from a Kinect v1 camera which was placed over an aisle through which the pigs passed freely. The data came from 3246 pigs, each having 20 depth images, and a muscle score from 1 to 7 (reduced later to 5 scores) assigned by an experienced assessor. Classification based on morphological features of the pig’s body shape - using a gradient boosted classifier - resulted in a mean absolute error of 0.65 in 10-fold cross validation. Notably, the majority of the errors corresponded to pigs being classified as having muscle scores adjacent to the groundtruth labels given by the assessor. According to the end users of this application, the proposed approach could be used to replace expert assessors at the farm.
Article
Purpose: The influence of preceding load and future perceived wellness of professional soccer players is unexamined. This paper simultaneously evaluates the external load (EL) and internal load (IL) for different time frames in combination with presession wellness to predict future perceived wellness using machine learning techniques. Methods: Training and match data were collected from a professional soccer team. The EL was measured using global positioning system technology and accelerometry. The IL was obtained using the rating of perceived exertion multiplied by duration. Predictive models were constructed using gradient-boosted regression trees (GBRT) and one naive baseline method. The individual predictions of future wellness items (ie, fatigue, sleep quality, general muscle soreness, stress levels, and mood) were based on a set of EL and IL indicators in combination with presession wellness. The EL and IL were computed for acute and cumulative time frames. The GBRT model's performance on predicting the reported future wellness was compared with the naive baseline's performance by means of absolute prediction error and effect size. Results: The GBRT model outperformed the baseline for the wellness items such as fatigue, general muscle soreness, stress levels, and mood. In addition, only the combination of EL, IL, and presession perceived wellness resulted in nontrivial effects for predicting future wellness. Including the cumulative load did not improve the predictive performances. Conclusions: The findings may indicate the importance of including both acute load and presession perceived wellness in a broad monitoring approach in professional soccer.
Article
Monitoring of recovery in the context of athletic performance has gained significant importance during recent years. As a systematic process of data collection and evaluation, the monitoring of recovery can be implemented for various purposes. It may aid to prevent negative outcomes of training or competition, such as underrecovery, overtraining, or injuries. Further, it aims at establishing routines and strategies necessary to guarantee athletes’ readiness for performance by restoring their depleted resources. Comprehensive monitoring of recovery ideally encompasses a multidimensional approach, thereby considering biological, psychological, and social monitoring methods. From a biological perspective, physiological (e.g., cardiac parameters), biochemical (e.g., creatine kinase), hormonal (e.g., salivary cortisol) and immunological (e.g., immunoglobulin A) markers can be taken into account to operationalize training loads and recovery needs. Psychological approaches suggest the application of validated and reliable psychometric questionnaires (e.g., Recovery-Stress Questionnaire for Athletes) to measure a subjective perception of recovery as well as the subjective degree of training- or competition-induced fatigue. Social aspects also play a role in performance monitoring and may hence provide essential performance-related information. The implementation of a monitoring routine within athletic environments represents a continuous process which functions as an effective addition to training and depends on a range of conditions (e.g., organizational regulations, commitment of athletes). Current research in the field of monitoring aims at establishing individualized monitoring regimes that are referring to intraindividual reference values with the help of innovative technological devices.
Article
Purpose: Machine learning may contribute to understanding the relationship between the external load and internal load in professional soccer. Therefore, the relationship between external load indicators and the rating of perceived exertion (RPE) was examined using machine learning techniques on a group and individual level. Methods: Training data were collected from 38 professional soccer players over two seasons. The external load was measured using global positioning system technology and accelerometry. The internal load was obtained using the RPE. Predictive models were constructed using two machine learning techniques, artificial neural networks (ANNs) and least absolute shrinkage and selection operator (LASSO), and one naive baseline method. The predictions were based on a large set of external load indicators. Using each technique, one group model involving all players and one individual model for each player was constructed. These models' performance on predicting the reported RPE values for future training sessions was compared to the naive baseline's performance. Results: Both the ANN and LASSO models outperformed the baseline. Additionally, the LASSO model made more accurate predictions for the RPE than the ANN model. Furthermore, decelerations were identified as important external load indicators. Regardless of the applied machine learning technique, the group models resulted in equivalent or better predictions for the reported RPE values than the individual models. Conclusions: Machine learning techniques may have added value in predicting the RPE for future sessions to optimize training design and evaluation. Additionally, these techniques may be used in conjunction with expert knowledge to select key external load indicators for load monitoring.
Conference Paper
Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
Article
Objective: The aim of this study is to investigate if changes in perceived stress and recovery over the course of a season are risk factors for acute and overuse injuries. Design: A prospective nonexperimental cohort design. Setting: Data were gathered at the SportsFieldLab Groningen and at the facilities of the participating teams. Participants: Eighty-six male and female basketball, volleyball, and korfball players aged 21.9 ± 3.5 years. Interventions: In this 10-month observational study, the independent variables are the changes in perceived stress and recovery. Main outcome measures: The Recovery-Stress Questionnaire for Athletes (RESTQ-Sport) was filled out every 3 weeks throughout the season to assess changes in perceived stress and recovery. Acute and overuse injuries were registered by the teams' physical therapists. Odds ratios and 95% confidence intervals were calculated. Results: During one season, 66 acute and 62 overuse injuries were registered. Multinomial regression analysis showed that perceived General Recovery, shown in the scales Social Recovery and General Well-Being, decreased in the 6-week period before an acute injury (OR 0.59 and 0.61, respectively, P ≤ 0.05) compared with healthy periods. Risk of overuse injuries increased when perceived Sport Recovery, shown in the Personal Accomplishment scale, decreased in the 3-week period before the injury (OR 0.59, P ≤ 0.05) compared with healthy periods. Conclusions: Therefore, decreased perceived recovery can indicate an increased injury risk. General Recovery affects acute injury risk and Sport Recovery affects the risk of an overuse injury. Monitoring perceived recovery over the course of a season could give guidance for recovery enhancing practices to prevent injuries.
Article
Objectives: To investigate the impact of training modification on achieving performance goals. Previous research demonstrates an inverse relationship between injury burden and success in team sports. It is unknown whether this relationship exists within individual sports such as athletics. Design: A prospective, cohort study (n = 33 International Track and Field Athletes; 76 athlete seasons) across five international competition seasons. Methods: Athlete training status was recorded weekly over a 5-year period. Over the 6-month preparation season, relationships between training weeks completed, the number of injury/illness events and the success or failure of a performance goal at major championships were investigated. Two-by-two tables were constructed and attributable risks in the exposed (AFE) calculated. A mixed-model, logistic regression was used to determine the relationship between failure and burden per injury/illness. Receiver Operator Curve (ROC) analysis was performed to ascertain the optimal threshold of training week completion to maximise the chance of success. Results: Likelihood of achieving a performance goal increased by 7-times in those that completed >80% of planned training weeks (AUC, 0.72; 95%CI 0.64-0.81). Training availability accounted for 86% of successful seasons (AFE = 0.86, 95%CI, 0.46 to 0.96). The majority of new injuries occurred within the first month of the preparation season (30%) and most illnesses occurred within 2-months of the event (50%). For every modified training week the chance of success significantly reduced (OR = 0.74, 95%CI 0.58 to 0.94). Conclusions: Injuries and illnesses, and their influence on training availability, during preparation are major determinants of an athlete's chance of performance goal success or failure at the international level.
Article
Clinical prediction models provide risk estimates for the presence of disease (diagnosis) or an event in the future course of disease (prognosis) for individual patients. Although publications that present and evaluate such models are becoming more frequent, the methodology is often suboptimal. We propose that seven steps should be considered in developing prediction models: (i) consideration of the research question and initial data inspection; (ii) coding of predictors; (iii) model specification; (iv) model estimation; (v) evaluation of model performance; (vi) internal validation; and (vii) model presentation. The validity of a prediction model is ideally assessed in fully independent data, where we propose four key measures to evaluate model performance: calibration-in-the-large, or the model intercept (A); calibration slope (B); discrimination, with a concordance statistic (C); and clinical usefulness, with decision-curve analysis (D). As an application, we develop and validate prediction models for 30-day mortality in patients with an acute myocardial infarction. This illustrates the usefulness of the proposed framework to strengthen the methodological rigour and quality for prediction models in cardiovascular research.
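Of the four performance measures listed in this abstract, discrimination via the concordance statistic (C) is the easiest to illustrate. The sketch below computes it directly from its pairwise definition for a binary outcome; the function name `c_statistic` and the example data are hypothetical, and this is not the authors' code. For a binary outcome the value equals the area under the ROC curve.

```python
def c_statistic(outcomes, predicted_risks):
    """Concordance statistic for a binary outcome: the share of
    (event, non-event) pairs in which the event received the higher
    predicted risk, counting ties as half."""
    events = [p for y, p in zip(outcomes, predicted_risks) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted_risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical outcomes (1 = event) and predicted risks for four subjects.
outcomes = [0, 0, 1, 1]
predicted_risks = [0.10, 0.40, 0.35, 0.80]

c = c_statistic(outcomes, predicted_risks)
print(c)
```

Three of the four event/non-event pairs are ranked correctly here, so the statistic is 0.75; a value of 0.5 would correspond to chance-level ranking.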
Article
The aim of this study was to examine the discriminant ability of aerobic fitness measures among junior cyclists of different competitive levels and to examine whether these variables were able to predict the cyclists who reached the professional level. A total of 309 young cyclists (mean ± SD, age = 17.5 ± 0.5 yr, height = 178 ± 6 cm, weight = 66 ± 7 kg) performed an incremental maximal test to determine peak oxygen uptake (VO2peak) and respiratory compensation point. To examine the discriminant and predictive ability of these parameters, the cyclists were classified according to their competitive level and specialty: 1) national team (NAT) and nonnational team (non-NAT); 2) nonprofessionals (NP), and professional flat specialists and professional climbers; and 3) nonprofessionals (NP), professional continental, and ProTour. A logistic regression was used to test the accuracy of models generated using as predictors the laboratory measures of aerobic fitness and anthropometric data. The mean absolute and relative VO2peak were 4.7 ± 0.6 L·min⁻¹ and 71 ± 7 mL·kg⁻¹·min⁻¹, respectively. NAT displayed higher VO2 values than non-NAT. Professional flat specialists showed higher absolute VO2 values than NP. Professional climbers showed higher relative VO2 values than NP. ProTour showed higher aerobic fitness measures than NP. Using the receiver operating characteristic curve, body mass, absolute VO2peak, and VO2 at respiratory compensation point were found to discriminate NAT from non-NAT. Although some of these variables influenced the odds of becoming professionals (odds ratios from 1.10 to 2.86), no models were able to correctly identify the cyclists who became professionals. Traditional physiological measures of aerobic fitness are useful to identify junior cyclists who can excel in their category. However, these variables cannot be used for talent identification, if "talent" is interpreted as a young cyclist who will succeed in becoming a professional.
Article
Sixty runners belonging to two clubs were followed for 1 year with regard to training and injury. There were 55 injuries in 39 athletes. The injury rate per 1,000 hours of training was 2.5 in long-distance/marathon runners and 5.6 to 5.8 in sprinters and middle-distance runners. There were significant differences in the injury rate in different periods of the 12 month study, the highest rates occurring in spring and summer. In marathon runners there was a significant correlation between the injury rate during any 1 month and the distance covered during the preceding month (r = 0.59). In a retrospective analysis of the cause of injury, a training error alone or in combination with other factors was the most common injury-provoking factor (72%). The injury pattern varied among the three groups of runners: hamstring strain and tendinitis were most common in sprinters, backache and hip problems were most common in middle-distance runners, and foot problems were most common in marathon runners.
Article
The training programmes and competitive performances of 147 track and field athletes, from many different clubs within the UK, were analysed retrospectively in order to study the incidence, severity and types of injuries which they had suffered during the year September 1989-September 1990. This information was then related to the particular event in which they specialized as well as a number of hypothetical risk factors proposed for making them more prone to injury. Of the athletes 96 (65.3%) were male and 51 (34.7%) were female, and their ages ranged from 14 to 32 years, with their levels of competition ranging from 'competitive spectators' to UK internationals. A marked correlation was noted between their age, level of competition, number of supervised training sessions which they attended, and their incidence of injuries. However, certain other factors which were studied, such as their sex, the hours they trained, and the particular event in which they specialized appeared to provide no obvious relationship.
Article
This study examined the relationship between the bowling workload of first-class cricket fast bowlers and injury with the aim of identifying a workload threshold at which point the risk of injury increases. Ninety male fast bowlers (mean age 27 years, range 18-38 years) from six Australian state squads were observed for the 2000-2001 and/or 2001-2002 cricket seasons. Workload was quantified by examining fixture scorecards and conducting surveillance at training sessions. Injury data were obtained from Cricket Australia's Injury Surveillance System. Compared to bowlers with an average of 3-3.99 days between bowling sessions, bowlers with an average of less than 2 days (risk ratio (RR) = 2.4, 95% confidence interval (CI) 1.6 to 3.5) or 5 or more days between sessions (RR = 1.8, 95% CI 1.1 to 2.9) were at a significantly increased risk of injury. Compared to those bowlers with an average of 123-188 deliveries per week, bowlers with an average of fewer than 123 deliveries per week (RR = 1.4, 95% CI 1.0 to 2.0) or more than 188 deliveries per week (RR = 1.4, 95% CI 0.9 to 1.6) may also be at an increased risk of injury. There appears to be a dual fast bowling workload threshold beyond which the risk of injury increases and maintaining a workload that is too low or infrequent is an equally significant risk factor for injury as maintaining a high bowling workload. Further study is required to determine the reason why players who bowl infrequently suffer more injuries.
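For readers unfamiliar with the risk ratios reported in this abstract, the sketch below shows one common way to compute a risk ratio and an approximate 95% confidence interval from a two-by-two table, using the standard error of log(RR). The counts are invented for illustration, the function name `risk_ratio` is hypothetical, and the abstract does not state which interval method the authors used.

```python
import math


def risk_ratio(events_exposed, n_exposed, events_reference, n_reference, z=1.96):
    """Risk ratio of an exposed group against a reference group, with an
    approximate confidence interval built on the log scale."""
    risk_exposed = events_exposed / n_exposed
    risk_reference = events_reference / n_reference
    rr = risk_exposed / risk_reference
    # Standard error of log(RR) for cumulative-incidence data.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_reference - 1 / n_reference
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Invented counts: 30 of 100 bowlers injured in a hypothetical high-frequency
# group versus 15 of 100 in a hypothetical reference group.
rr, lower, upper = risk_ratio(30, 100, 15, 100)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1.0, as in this invented example, is what the abstract means by a significantly increased risk relative to the reference group.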
Monitoring stress and recovery: new insights for the prevention of injuries and illnesses in elite youth soccer players
  • Kapm Lemmink
Lemmink KAPM. Monitoring stress and recovery: new insights for the prevention of injuries and illnesses in elite youth soccer players. Br J Sports Med. 2010;44:809-815. doi:10.1136/bjsm.2009.069476
Bowling workload and the risk of injury in elite cricket fast bowlers
  • R Dennis
  • R Farhart
  • C Goumas
  • J Orchard
Dennis R, Farhart R, Goumas C, Orchard J. Bowling workload and the risk of injury in elite cricket fast bowlers. J Sci Med Sport. 2003;6:359-367. doi:10.1016/S1440-2440(03)80031-2
An examination of the training profiles and injuries in elite youth track and field athletes
  • D J Huxley
  • D O'Connor
  • P A Healey
Huxley DJ, O'Connor D, Healey PA. An examination of the training profiles and injuries in elite youth track and field athletes. Eur J Sport Sci. 2014;14:185-192. doi:10.1080/17461391.2013.809153
Athlete-customized injury prediction using training load statistical records and machine learning
  • A Naglah
  • F Khalifa
  • A Mahmoud
Naglah A, Khalifa F, Mahmoud A, et al. Athlete-customized injury prediction using training load statistical records and machine learning. IEEE Int Symp Signal Process
Tree Boosting With XGBoost - Why Does XGBoost Win "Every" Machine Learning Competition?
  • D Nielsen
Nielsen D. Tree Boosting With XGBoost - Why Does XGBoost Win "Every" Machine Learning Competition? [Master's thesis]. Trondheim, Norway: Norwegian University of Science and Technology; 2016.
Preventing in-game injuries for NBA players
  • H Talukder
  • T Vincent
  • G Foster
Talukder H, Vincent T, Foster G, et al. Preventing in-game injuries for NBA players. Paper presented at: MIT Sloan Analytics Conference, Boston MA. 2016.
An enhanced metric of injury risk utilizing artificial intelligence
  • C Dower
  • A Rafehi
  • J Weber
Dower C, Rafehi A, Weber J, Mohamad R. An enhanced metric of injury risk utilizing artificial intelligence. Paper presented at: MIT Sloan Analytics Conference, Boston, MA. 2018. https://www.alerteds.com/wp-content/uploads/2016/02/MITSSAC2018-An-enhancedmetric-of-injury-risk-utilizing-Artificial-Intelligence.pdf
Injury risk is increased by changes in perceived recovery of team sport players
  • Htd Van Der Does
  • M S Brink
  • Rta Otter
  • C Visscher
  • Kapm Lemmink
Van Der Does HTD, Brink MS, Otter RTA, Visscher C, Lemmink KAPM. Injury risk is increased by changes in perceived recovery of team sport players. Clin J Sport Med. 2017;27(1):46-51. PubMed ID: 26945309 doi:10.1097/JSM.0000000000000306