The Use of Predictive Models in Dynamic Treatment Planning

Saemundur O. Haraldsson∗‡ , Ragnheidur D. Brynjolfsdottir,
John R. Woodward, Kristin Siggeirsdottir∗† and Vilmundur Gudnason∗†
Janus Rehabilitation, Reykjavik, Iceland
The Icelandic Heart Association, Kopavogur, Iceland
Computing Science and Mathematics, University of Stirling, UK
Abstract—With the expanding load on healthcare and con-
sequent strain on budget, the demand for tools to increase
efficiency in treatments is rising. The use of prediction models
throughout the treatment to identify risk factors might be a
solution. In this paper we present a novel implementation of
a prediction tool and the first use of a dynamic predictor
in vocational rehabilitation practice. The tool is periodically
updated and improved with Genetic Improvement of software.
The predictor has been in use for 10 months and is evaluated
on predictions made during that time by comparing them with
actual treatment outcome. The results show that the predic-
tions have been consistently accurate throughout the patients’
treatment. After a learning phase of approximately 3 weeks, the
predictor classified patients with 100% accuracy and precision
on previously unseen data. The predictor is currently being
successfully used in a complex live system where specialists
have used it to make informed decisions.
Keywords-Prediction Models, Healthcare, Dynamic Planning,
Machine Learning, Vocational Rehabilitation, Genetic
Improvement of Software
I. INTRODUCTION
Computational Intelligence (CI) for eHealth [1] includes
Machine Learning for pattern recognition in patient data [2],
[3], Image Processing algorithms for diagnosis [4] and
monitoring [5], and epidemiology analysis [6], [7]. The use
of predictive models has been of particular interest to the
healthcare industry [8], specifically since enough
documentation has been stored digitally for the collections
to be considered “Big Data” [9], [10]. Traditionally
the use has been limited to predicting outcomes
before treatment begins, to help with treatment selection
or determine the risks versus benefits [11]–[13]. Prior to
this work, predictive models have not been used to guide
treatment when it is already in progress. Progress in CI and
automatic algorithm design [14] has created the potential
for using predictive models continually, not only at the
beginning of a treatment or to select one, but to make
dynamic decisions throughout. The use of predictive models
in healthcare serves mainly two goals:
• Increasing the likelihood of successful treatment
• Reducing overall cost of treatments
As individual goals, the former is arguably more important
since it affects health and quality of life. The latter has
multifaceted effects on society. However, they are closely
intertwined since a successful treatment reduces the risk
of relapse. This results in lower future financial burden on
healthcare and means being able to effectively prioritize
treatments without the loss of service [10]. Being able to
adjust treatment that has already begun would contribute to
achieving both goals. It would help with guidance towards
a successful outcome by providing objective insight into the
patient’s needs and situation, without the risk of “diagnostic
overshadowing” [15]. This is especially important when
patients suffer from multiple difficulties, both mental
and physical.
This paper presents a novel implementation of a CI
predictive tool. The tool dynamically maintains a predictive
model and specialises it, in situ, to a single facility with
the most up to date information. The implementation is in
use by Janus Rehabilitation (JR), Reykjavik. It has been in
constant testing in a busy and complicated treatment facility
since June 2016. The predictor is currently an add-on feature
on Janus Manager (JM), a bespoke software for a vocational
rehabilitation centre, developed and maintained by JR [16],
[17].
The remainder of this paper is structured as follows:
Section II gives a brief overview of related work, Section III
describes the implementation of the prediction method,
Section IV details how the method has been used in JR
and the data it is used on, and finally, Sections V and VI
discuss the evaluation of the use case in a practical setting.
II. BACKGROUND AND RELATED WORK
The term eHealth covers a vast literature [1]; a Google
Scholar search returns over 70K results. It has various
sub-fields of healthcare and health related research which
all combine technology and information with the goal of
assisting or treating people. The focus of this paper is the
use of CI in healthcare as increasingly more hospitals and
institutions convert from paper administration systems to
digitally stored records. Searching through electronic med-
ical records is less time consuming than doing it manually
[9], [10]. In addition, the search does not only involve
looking up specific details of a single patient, but searching
for general patterns [18], [19]. Although humans can identify
patterns in data, the vast amount of heterogeneous and often
unstructured data involved in medical records [20] will make
this more difficult. For that we need predictive models and
Machine Learning algorithms [21] to identify these patterns
and help us draw inferences. Predictive models have been
prevalent in healthcare research for some time and particularly
after advances in genome mapping [8], [22], [23].
Examples of successful use of models include predicting
outcomes of vocational rehabilitation in patients with brain
tumours [11], planning home care rehabilitation [12], and
predicting depression treatment outcome [13].
Figure 1. A flow chart showing the prediction and update process while a single patient receives treatment. The patient attends their treatment schedule
and provides data. The specialist records the information, reviews predictions, and plans the treatment jointly with the patient. The predictor processes the
data, makes predictions and visualises them. Lastly, the Genetic Improvement updates the predictor when the patient finishes.
Rehabilitation plays a large part in healthcare and is often
a complicated process. Many factors affect both the outcome
and its length [24] but current advances in CI could be
applied to related problems. Furthermore, there are no exam-
ples of predictive models that dynamically adapt themselves
during the rehabilitation process. This paper seeks to bridge
that gap by discussing a successful implementation of such
a model in practice.
The underlying predictive models in our software for
classification are based on Random Forest classifiers [25],
an ensemble method of tree predictors which has also
been used to diagnose chronic kidney disease [4]. Other
models include: Bayesian Interpolation [26], Support Vector
Machines regression [27], and Neural Networks [28].
The model is periodically updated with new information
and improved with a Genetic Improvement (GI) of software [29]
procedure. GI is an emerging field from Search
Based Software Engineering [30] which uses computational
search to improve existing software, for example by fixing
bugs [16], [17], [31], [32] and reducing execution time [33].
Typically, GI uses Genetic Programming [34] as the search
method but other search methods can be used.
III. JANUS REHABILITATION AND THE PREDICTOR
JR was established in 2000 and is one of the largest voca-
tional rehabilitation centres in Iceland [24], [35]. It employs
around 40 specialists that work in multiple interdisciplinary
teams. Each treatment plan is individually tailored by the
specialists in cooperation with the patient and periodically
reviewed and updated. There are three main reasons to
end treatment: a) the patient has begun work or education, b)
JR has exhausted its options for treatment, or c) the patient
decides to end the treatment prematurely, with or without
notification. JR has no control over c other than trying to
identify warning signs and intervene whenever possible. For
a and b, each team and their patient share the responsibility
for deciding when treatment has ended and planning follow-up
measures. JR has a large database of earlier patients that
has gradually been building up. In less than a year nearly
400 new instances from 73 patients have been added to the
database which already contained over 4300 instances at the
start. Each event in a patient’s prediction history counts as
one instance. Each instance currently has 180 features of 4
data types as listed in Table I. The features describe each
patient’s circumstances: physical, psychological, personal,
and sociological. The data processing is generic enough to
allow JR to add features whenever they decide to collect
new information about patients.
Since June 2016 JR has used the predictor and successfully
confirmed it as a viable tool. It differs from
traditional off-line predictors by updating its rules on-line
whenever new data is recorded. JR has developed a
predictive model which the rehabilitation specialists use to
inform decisions at every stage of the rehabilitation process.
Figure 1 shows a flow chart of a consultation between
a specialist and patient while the predictor operates and
is updated in the background. When the patient enters
the treatment, the specialist records the information to the
database which initiates the predictor. Three predictions are
made and stored, based on the entered data:
• Likelihood of successful rehabilitation
• Drop out probability
• Treatment length, in months
The first two are made with classification, while the last is
made with regression. The specialist can then review the
patient’s status with this new perspective and plan the next
steps. Additionally, the predictor lists the ten most influential
features for each prediction to help identify risk factors that
affect the outcome and length. These risk factors vary be-
tween patients, because each has their unique circumstances.
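As a rough illustration of how the ten most influential features might be listed, the sketch below ranks features by an importance score. The feature names and scores are invented for the example. In scikit-learn, a fitted Random Forest exposes comparable scores through its `feature_importances_` attribute, but how the predictor described here derives the influence of a feature for an individual prediction is not specified, so this is only an assumption.

```python
from typing import Dict, List, Tuple


def top_features(importances: Dict[str, float], k: int = 10) -> List[Tuple[str, float]]:
    """Return the k features with the largest importance, highest first.

    Ties are broken alphabetically so that the listing is reproducible.
    """
    ranked = sorted(importances.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical importance scores; the names loosely follow Table I.
example_importances = {
    "age": 0.12,
    "length_of_unemployment": 0.30,
    "quality_of_life": 0.25,
    "number_of_children": 0.03,
    "bullied": 0.08,
    "education": 0.22,
}

for name, score in top_features(example_importances, k=3):
    print(f"{name}: {score:.2f}")
```

With real data, the dictionary would hold one entry per feature (currently 180) and the default `k=10` would give the list shown to the specialist.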
This cycle is repeated every time new information is
recorded, for as long as the treatment lasts. When patients
finish treatment, all the collected data regarding them is
anonymized and added to the database. The current version
of the predictor is then evaluated by comparing all its
previously stored predictions about them with the actual
outcome.
The bottom layer of Figure 1 is the GI procedure which
updates the predictor with the new data. Its implementation
has been described previously [16], [17], [32], [33]. In
short, it evolves a population of edits that represent small
changes to a program. An edit can change the source code
by deleting, replacing, copying, or swapping code
segments. In this work the targeted software is a Python
script that pre-processes the data, selects a prediction
algorithm from scikit-learn [36], and tunes its parameters.
The objective of the improvement process is to minimise
the mean squared error of regression models and maximise
accuracy of classification models. The GI uses Monte Carlo
cross-validation [37], with 20 repetitions, to evaluate fitness.
The dataset is randomly divided into training and testing sets
of equal sizes. The fitness of each edit list is then the average
performance over 20 splits.
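The fitness evaluation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy regressor, the synthetic data, and the function names are all assumptions made for the example. In scikit-learn, `ShuffleSplit(n_splits=20, test_size=0.5)` would produce the same kind of repeated random splits.

```python
import random
from statistics import fmean
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, float]  # (feature, target); a deliberately tiny stand-in


def monte_carlo_cv(
    data: Sequence[Row],
    fit: Callable[[List[Row]], Callable[[float], float]],
    repetitions: int = 20,
    seed: int = 0,
) -> float:
    """Average test MSE over repeated random 50/50 train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repetitions):
        shuffled = list(data)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, test = shuffled[:half], shuffled[half:]
        model = fit(train)
        scores.append(fmean((model(x) - y) ** 2 for x, y in test))
    return fmean(scores)


def fit_mean(train: List[Row]) -> Callable[[float], float]:
    """Toy regressor: always predicts the mean target of the training set."""
    mean_y = fmean(y for _, y in train)
    return lambda x: mean_y


# Synthetic data, invented for illustration only.
toy = [(float(i), 2.0 * i + 1.0) for i in range(40)]
fitness = monte_carlo_cv(toy, fit_mean)
print(f"mean MSE over 20 splits: {fitness:.2f}")
```

Unlike k-fold cross-validation, the test sets of different repetitions may overlap, because every split is drawn independently.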
When the GI has finished, the best performing variation,
out of 2000 tested, replaces the current predictor instance
which fits three models, one for each of the three predicted
variables. They are then used for all predictions until the
next person leaves treatment and the updating process starts
again.
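To make the edit operators and the selection of the best variation more concrete, the following sketch applies delete, replace, copy, and swap edits to a list of code segments and keeps the best-scoring variant. It is a simplified assumption, not the procedure of [16], [17], [32], [33]: plain random sampling stands in for the evolutionary search, the segments are placeholder strings, and the toy fitness function is invented. In the real system the fitness would be the cross-validated model performance.

```python
import random
from typing import Callable, List, Tuple

Edit = Tuple[str, int, int]  # (operation, first index, second index)
OPERATIONS = ("delete", "replace", "copy", "swap")


def apply_edits(segments: List[str], edits: List[Edit]) -> List[str]:
    """Apply a list of edits to a copy of the code segments."""
    out = list(segments)
    for op, i, j in edits:
        if not out:  # nothing left to edit
            break
        i, j = i % len(out), j % len(out)
        if op == "delete":
            del out[i]
        elif op == "replace":  # overwrite segment j with segment i
            out[j] = out[i]
        elif op == "copy":  # insert a copy of segment i before position j
            out.insert(j, out[i])
        elif op == "swap":
            out[i], out[j] = out[j], out[i]
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return out


def best_of(
    segments: List[str],
    fitness: Callable[[List[str]], float],
    n_variants: int = 2000,
    seed: int = 0,
) -> Tuple[List[Edit], float]:
    """Sample random edit lists and return the best one with its fitness."""
    rng = random.Random(seed)
    best_edits: List[Edit] = []
    best_score = fitness(segments)
    for _ in range(n_variants):
        edits = [
            (rng.choice(OPERATIONS), rng.randrange(len(segments)), rng.randrange(len(segments)))
            for _ in range(rng.randint(1, 3))
        ]
        score = fitness(apply_edits(segments, edits))
        if score > best_score:
            best_edits, best_score = edits, score
    return best_edits, best_score


# Toy program and fitness: penalise identical neighbouring segments.
program = ["load", "scale", "scale", "fit"]
toy_fitness = lambda p: -sum(a == b for a, b in zip(p, p[1:]))
edits, score = best_of(program, toy_fitness)
print(edits, score)
```

Representing a variant as a list of edits rather than as a full copy of the program keeps each candidate small and makes it easy to see what was changed.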
IV. EVALUATION OF THE PREDICTOR
JM has been in use since March 2016 and the predictor
was added in June 2016. For JR, the most important evalu-
ation of the predictor is how its specialists experience it in
practice.
Table I
DATA TYPES OF FEATURES IN THE SET, NUMBER AND EXAMPLES.

Data type    Amount  Examples
Float        120     Age, Length of unemployment, Quality of Life measurement, Current treatment duration
Integer      18      Number of children, Number of medical diagnoses
Boolean      37      Bullied, Dyslexic, Been JR patient before
Categorical  5       Education, Income, Gender, Housing, Relationship status

However, we also verify the predictor objectively by evaluating
its performance on those patients that have completed
their treatment after it was implemented. The procedure
involves iterating over 73 versions of the predictor from
June 2016 until March 2017 and comparing each version’s
predictions with actual outcome. For the classification problems,
we measure accuracy ($c/n$) and precision ($p/n$), where $n$
is the number of predictions, $c$ is the number of correctly
labelled predictions, and $p$ is the number of correctly labelled
predictions of the positive class. Accuracy is the proportion
of correct labels while precision is the proportion of correct
positive labels. For the regression problems, with predictions
denoted $\hat{Y}$ and true labels $Y$, we measure mean
squared error (MSE) (1)
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)^2 \tag{1}$$
and median absolute deviation (MAD) (2)
$$\mathrm{MAD} = \mathrm{med}\left(\left|Y_i - \mathrm{med}(Y)\right|\right), \quad i \in [1, 2, \ldots, n] \tag{2}$$
The MAD is a robust variation measurement while the MSE
is a well-known measure of spread. Additionally, to evaluate
how the GI is affecting performance of the predictor we
compare these values before and after each improvement
process.
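The measures above are simple enough to compute directly. The sketch below follows the definitions as they are written in this section; the function names and the sample values are assumptions for illustration. Note that precision is implemented as $p/n$, as stated in the text, whereas the more common convention divides $p$ by the number of positive predictions.

```python
from statistics import fmean, median
from typing import Sequence


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """c / n: share of predictions whose label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_as_in_text(y_true: Sequence, y_pred: Sequence, positive=1) -> float:
    """p / n: correctly labelled positive-class predictions over all n predictions."""
    hits = sum(t == p == positive for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error, equation (1)."""
    return fmean((p - t) ** 2 for t, p in zip(y_true, y_pred))


def mad(y: Sequence[float]) -> float:
    """Median absolute deviation from the median, equation (2)."""
    centre = median(y)
    return median(abs(v - centre) for v in y)


# Invented sample values: class labels, then treatment lengths in months.
labels_true, labels_pred = [1, 0, 1, 1], [1, 0, 0, 1]
months_true, months_pred = [10.0, 12.0, 20.0], [11.0, 12.0, 17.0]

print(accuracy(labels_true, labels_pred))
print(precision_as_in_text(labels_true, labels_pred))
print(mse(months_true, months_pred))
print(mad([2.0, 4.0, 6.0, 8.0, 47.0]))
```

The last call shows why the MAD is described as robust: the single long treatment of 47 months barely affects it, while it would dominate a squared-error measure.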
V. RESULTS FOR THE PREDICTOR
A. The practical use of the predictor
JR’s specialists have expressed that being able to identify
important factors of the patient’s current status is particularly
helpful, along with the graph of previous predictions. It has
been used as a visual aid by demonstrating an increased
likelihood of a positive outcome, and also to encourage the
patient when they cannot perceive progression themselves.
Some specialists have also used it to expedite appointments
with their patient when the predictor shows increased drop
out probability. However, to be able to verify if the use of the
predictor has decreased the number of drop outs or shortened
rehabilitation length we need to collect data over a longer
period. There are a number of seasonal variables that might
have confounding effects, such as seasonal affective disorder.
B. Results for classification problems
The predictor did well on the two classification tasks: whether
treatment will be successful, and whether a patient will drop out. The
predictions for dropping out had over 99% accuracy from
the start, and after August 2016, the accuracy was 100%.
Its precision started at 85% but increased to 100% within
2 weeks (see Figure 2). Similarly, predictions of successful
treatment reached 100% accuracy and precision in the second
week after the release of the first version of the predictor.
Figure 2. Precision and accuracy of predictions for dropping out and
successful treatment over the trial period.
C. Results for regression problems
The regression models were able to predict treatment
length within three months from the actual duration. A three
month difference is an acceptable estimation because in
practice this is a lead-in time. Two predictor versions out of
a total of 73 had an error of up to five months. Those versions
were updated within a week of being activated and had no
measurable effect on therapy length of prevailing patients.
Figures 3 and 4 show the performance, as measured post hoc,
of a number of different models for every two weeks over the
period, June 2016–March 2017. The boxes in the figures
span the first and third quartiles, the blue line shows the best
performing models, and the green line is the performance
of the models that were in use each time. Note that the
performance of the model in use was always better than
the mean and median performing model. Furthermore, the
variation in the performance of the models gets increasingly
larger when adding more data to the training set. It is
possibly linked to the variation of treatment length as seen
in Figure 5.
D. Genetic Improvement
The GI was used to improve the selection and tuning
of both classification and regression models. The GI could
not improve beyond the maximum regarding the classification
accuracy, as seen in Figure 2, and the variation of the
accuracy between different versions of the predictor was less
than $1 \times 10^{-5}$. However, the regression models were quite
different, as mentioned in Section V-C. In Figures 3 and 4 we
can see the performance spread of the top performing version
of each generation from the GI. The green lines indicate the
performance of the best version on the test set after correct
results were known. The GI managed to keep predictions
mostly within three months from the actual length, causing
the overall performance of the predictor to be more than
adequate.
Figure 3. Distribution of post hoc evaluation of MAD for every two weeks
of updated models for treatment length. Both mean (diamond) and median
(triangle) are marked with each box.
Figure 4. Distribution of post hoc evaluation of MSE for every two weeks
of updated models for treatment length. Here, converted to Root Mean
Squared Error for scaling on y-axis. Both mean (diamond) and median
(triangle) are marked with each box.
VI. CONCLUSION
The performance of the predictor, presented here, was
outstanding, with almost 100% accuracy and precision in
classification predictions over a 10 month period. Additionally,
it has consistently predicted treatment length within
a satisfactory margin for vocational rehabilitation.
The predictor was developed to meet the demand for
an objective view of each patient’s status while receiving
treatment. To the authors’ best knowledge, JR is the first
facility to use a predictor designed for that purpose in
practice. JR’s specialists have integrated the use of the
predictor into their daily routine to get a clear view of the
progress of each patient, at every stage of their treatment.
With the increasing number of patients in treatment, the
predictor helps by identifying possible risk factors. It assists
the specialist to know where and when to intervene, possibly
shortening the treatment time. They have used the predictor
in various ways, discussing progress or the lack thereof
with the patient, to encourage or recognise what might be
interfering with the treatment. Some specialists have used the
predictor at the start of the rehabilitation to focus efforts on
specific areas in the patient’s circumstances.
Figure 5. The distribution of treatment length for the 73 patients that
finished treatment during the ten month period.
The predictor’s first 10 months in use have been evaluated
by comparing the initial predictions for patients that were
receiving treatment with actual outcomes and treatment
lengths after they finished. The results for classifying a pa-
tient’s treatment as drop out, unsuccessful or successful were
more than satisfactory according to experts in rehabilitation,
with near 100% accuracy.
The graph in Figure 5 shows us that the treatment time
can vary from 2 months to 47 and that is a possible culprit
for the large variation in treatment time predictions for
JR’s patients. However, the GI was consistently able to
find versions of the predictor that had decent performance.
Overall the predictions for treatment length were within 2
to 3 months from what actually occurred. The combination
of GI and prediction models has proven to be beneficial for
the vocational rehabilitation treatment.
The predictor is still in use and continually evolving with
the expanding dataset, providing dynamic predictions for the
specialists. With current progress in software and hardware
development it is well worth exploring automatic adjust-
ments of predictive models. Automatic algorithm design [38]
and GI are two of many methodologies to make portable CI
tools for healthcare, rather than depending on predefined
models that might work well for the general population but
not in specific treatments or facilities. In other words, GI
can adapt the predictor to the specific data and patients at a
given facility. This is why the predictor was able to perform
so well for JR: it was specialised to their database, which
contains a narrow population. The predictor is a valuable
asset for specialists, patients, and the facility as a whole.
The predictor needs to be adapted to each treatment facility and
database, and this can be achieved with GI. This predictor
can reduce cost and identify possible risk factors, helping
specialists to intervene earlier.
ACKNOWLEDGMENT
The work presented in this paper was done in collab-
oration with Janus Rehabilitation. The authors would like
to thank all the specialists for using the predictor and for
providing valuable feedback. Two of the authors are also
part of the DAASE project, which is funded by the EPSRC
Grant EP/J017515/1.
REFERENCES
[1] C. Pagliari, et al., “What Is eHealth (4): A Scoping Exercise to
Map the Field,” Journal of Medical Internet Research, vol. 7,
no. 1, p. e9, 3 2005.
[2] I. D. Falco, “Differential Evolution for automatic rule ex-
traction from medical databases,” Applied Soft Computing
Journal, vol. 13, no. 2, pp. 1265–1283, 2013.
[3] S. Schneeweiss, “Learning from Big Health Care Data,” New
England Journal of Medicine, vol. 370, no. 23, pp. 2161–
2163, 2014.
[4] A. Subasi, E. Alickovic, and J. Kevric, “Diagnosis of Chronic
Kidney Disease by Using Random Forest,” in CMBEBIH
2017: Proceedings of the International Conference on Med-
ical and Biological Engineering 2017, A. Badnjevic, Ed.
Singapore: Springer Singapore, 2017, pp. 589–594.
[5] D. Mulfari, A. Celesti, M. Fazio, M. Villari, and A. Puliafito,
“Using Google Cloud Vision in assistive technology
scenarios,” in 2016 IEEE Symposium on Computers and
Communication (ISCC), vol. 2016-Augus. IEEE, 6 2016,
pp. 214–219. [Online]. Available: http://ieeexplore.ieee.org/
document/7543742/
[6] P. Saripalli, “Analytic and learning framework for quantifying
Value in Value Based Care,” in Proceedings - IEEE Sympo-
sium on Computers and Communications. IEEE, 2016, pp.
261–266.
[7] K. Siggeirsdottir, et al., “Epidemiology of fractures in Iceland
and secular trends in major osteoporotic fractures 1989–
2008.” Osteoporosis international, vol. 25, no. 1, pp. 211–
219, 2014.
[8] J. Danesh, et al., “C-Reactive Protein and Other Circulating
Markers of Inflammation in the Prediction of Coronary Heart
Disease,” New England Journal of Medicine, vol. 350, no. 14,
pp. 1387–1397, 4 2004.
[9] K. Uragaki, et al., “Sequential Pattern Mining on Electronic
Medical Records with Handling Time Intervals and the Ef-
ficacy of Medicines,” in Proceedings - IEEE Symposium on
Computers and Communications. IEEE, 2016, pp. 1–6.
[10] E. Soares, et al., “Modular Health Kiosk for health self-
assessment,” in Proceedings - IEEE Symposium on Computers
and Communications. IEEE, 2016, pp. 278–280.
[11] S. L. Rusbridge, N. C. Walmsley, S. B. Griffiths, P. A.
Wilford, and J. H. Rees, “Predicting outcomes of voca-
tional rehabilitation in patients with brain tumours,” Psycho-
Oncology, vol. 22, no. 8, pp. 1907–1911, 2013.
[12] M. Zhu, Z. Zhang, J. P. Hirdes, and P. Stolee, “Using machine
learning algorithms to guide rehabilitation planning for home
care clients.” BMC medical informatics and decision making,
vol. 7, no. 1, p. 41, 2007.
[13] A. M. Chekroud, et al., “Cross-trial prediction of treatment
outcome in depression: A machine learning approach,” The
Lancet Psychiatry, vol. 3, no. 3, pp. 243–250, 2016.
[14] E. K. Burke, M. R. Hyde, G. Kendall, and J. Woodward,
“Automatic heuristic generation with genetic programming,”
in Proceedings of the 9th annual conference on Genetic and
evolutionary computation - GECCO ’07. New York, New
York, USA: ACM Press, 2007, p. 1559.
[15] G. Shefer, C. Henderson, L. M. Howard, J. Murray, and
G. Thornicroft, “Diagnostic Overshadowing and Other Chal-
lenges Involved in the Diagnostic Process of Patients with
Mental Illness Who Present in Emergency Departments with
Physical Symptoms - A Qualitative Study,” PLoS ONE, vol. 9,
no. 11, 2014.
[16] S. O. Haraldsson, J. R. Woodward, A. E. I. Brownlee, and
D. Cairns, “Exploring Fitness and Edit Distance of Mutated
Python Programs,” in Proceedings of the 17th European
Conference on Genetic Programming, EuroGP. Amsterdam,
The Netherlands: Springer Berlin Heidelberg, 2017.
[17] S. O. Haraldsson, J. R. Woodward, A. E. Brownlee, and
K. Siggeirsdottir, “Fixing Bugs in Your Sleep: How Genetic
Improvement Became an Overnight Success,” in Proceedings
of the 2017 Conference Companion on Genetic and Evolu-
tionary Computation Companion. Berlin, Germany: ACM,
2017.
[18] D. W. Bates, S. Saria, L. Ohno-Machado, A. Shah, and
G. Escobar, “Big data in health care: Using analytics to
identify and manage high-risk and high-cost patients,” Health
Affairs, vol. 33, no. 7, pp. 1123–1131, 2014.
[19] W. Raghupathi and V. Raghupathi, “Big data analytics in
healthcare: promise and potential,” Health Information Sci-
ence and Systems, vol. 2, no. 1, p. 3, 2014.
[20] Y. Wang, L. A. Kung, and T. A. Byrd, “Big data ana-
lytics: Understanding its capabilities and potential benefits
for healthcare organizations,” Technological Forecasting and
Social Change, no. March, 2016.
[21] T. M. Mitchell, “The Discipline of Machine Learning.” Tech.
Rep., 2006.
[22] E. A. Marques, et al., “Proximal Femur Volumetric Bone
Mineral Density and Mortality: 13 Years of Follow-Up of
the AGES-Reykjavik Study,” Journal of Bone and Mineral
Research, pp. n/a–n/a, 2017.
[23] N. A. Chatterjee, et al., “Genetic Obesity and the Risk of
Atrial Fibrillation: Clinical Perspective,” Circulation, vol. 135,
no. 8, pp. 741–754, 2 2017.
[24] K. Siggeirsdottir, et al., “Determinants of outcome of voca-
tional rehabilitation,” Work, vol. 55, no. 3, pp. 577–583, 2016.
[25] L. Breiman, “Random Forests,” Machine Learning, vol. 45,
no. 1, pp. 5–32, 2001.
[26] D. J. C. MacKay, “Bayesian Interpolation,” Neural
Computation, vol. 4, no. 3, pp. 415–447, 1992. [Online].
Available: http://www.mitpressjournals.org/doi/abs/10.1162/
neco.1992.4.3.415
[27] K. P. Bennett and C. Campbell, “Support vector machines:
hype or hallelujah?” ACM SIGKDD Explorations Newsletter,
vol. 2, no. 2, pp. 1–13, 2000. [Online]. Available:
http://portal.acm.org/citation.cfm?doid=380995.380999
[28] S. Haykin, Neural Networks: A Comprehensive Foundation,
1st ed. Upper Saddle River, New Jersey: Prentice Hall, 1994.
[29] J. Petke, S. O. Haraldsson, M. Harman, W. B. Langdon,
D. R. White, and J. R. Woodward, “Genetic Improvement
of Software: a Comprehensive Survey,” IEEE Transactions
on Evolutionary Computation, vol. To Appear, 2017.
[30] M. Harman and B. F. Jones, “Search-based software
engineering,” Information and Software Technology, vol. 43,
no. 14, pp. 833–839, 12 2001. [Online]. Available: http:
//linkinghub.elsevier.com/retrieve/pii/S0950584901001896
[31] C. Le Goues, T. Nguyen, S. Forrest, and W. Weimer,
“GenProg: A Generic Method for Automatic Software
Repair,” IEEE Transactions on Software Engineering, vol. 38,
no. 1, pp. 54–72, 2012. [Online]. Available: http://www.cs.
virginia.edu/weimer/p/weimer-tse2012-genprog.pdf
[32] S. O. Haraldsson, J. R. Woodward, and A. I. E. Brownlee,
“The Use of Automatic Test Data Generation for Genetic Im-
provement in a Live System,” in 8th International Workshop
on Search-Based Software Testing. Buenos Aires: ACM,
2017.
[33] S. O. Haraldsson, J. R. Woodward, A. E. Brownlee, A. V.
Smith, and V. Gudnason, “Genetic Improvement of Runtime
and its Fitness Landscape in a Bioinformatics Application,”
in Proceedings of the 2017 Conference Companion on Ge-
netic and Evolutionary Computation Companion. Berlin,
Germany: ACM, 2017.
[34] R. Poli, W. B. Langdon, and N. F. McPhee, A field guide
to genetic programming. (With contributions by J. R.
Koza): Published via http://lulu.com and freely available at
http://www.gp-field-guide.org.uk, 2008. [Online]. Available:
http://www.gp-field-guide.org.uk
[35] K. Siggeirsdottir, U. Alfredsdottir, G. Einarsdottir, and B. Y.
Jonsson, “A new approach in vocational rehabilitation in
Iceland: preliminary report.” Work, vol. 22, no. 1, pp. 3–8,
1 2004.
[36] F. Pedregosa, et al., “Scikit-learn: Machine Learning in
Python,” Journal of Machine Learning Research, vol. 12,
pp. 2825–2830, 2011.
[37] W. Dubitzky, M. Granzow, and D. Berrar, Fundamentals of
data mining in genomics and proteomics. Springer, 2007.
[38] J. Woodward and J. Swan, “The automatic generation of
mutation operators for genetic algorithms,” in GECCO’12,
14th annual conference on Genetic and evolutionary
computation, G. L. Pappa, J. Woodward, M. R. Hyde, and
J. Swan, Eds., Philadelphia, Pennsylvania, USA, 2012, pp.
67–74. [Online]. Available: http://dl.acm.org/citation.cfm?id=
2330796
... It was reported that the previous statistical methods to identify the optimal DTRs from the observational data were complicated and not readily implemented by researchers, especially when the survival time was the goal of interest. Hence, it was found that some researchers have viewed this practical issue as an aspect to take a great leap or one step forward in medicine [7][8][9][10][11] . ...
... The applied doses might be irregular and it would be basically determined by the recipient's pharmacological response and the variety of associated adverse effects. The idea of introducing adaptive control emerges as a response to the challenge, thus resulting into the more precise approach to adapt to the irregular dynamics of patient [11] . ...
Article
Background and Objective : Cancer is one of the major causes of death worldwide and chemotherapies are the most significant anti-cancer therapy, in spite of the emerging precision cancer medicines in the last 2 decades. The growing interest in developing the effective chemotherapy regimen with optimal drug dosing schedule to benefit the clinical cancer patients has spawned innovative solutions involving mathematical modeling since the chemotherapy regimens are administered cyclically until the futility or the occurrence of intolerable adverse events. Thus, in this present work, we reviewed the emerging trends involved in forming a computational solution from the aspect of reinforcement learning. Methods : Initially, this survey in-depth focused on the details of the dynamic treatment regimens from a broad perspective and then narrowed down to inspirations from reinforcement learning that were advantageous to chemotherapy dosing, including both offline reinforcement learning and supervised reinforcement learning. Results : The insights established in the chemotherapy-planning problem associated with the Reinforcement Learning (RL) has been discussed in this study. It showed that the researchers were able to widen their perspectives in comprehending the theoretical basis, dynamic treatment regimens (DTR), use of the adaptive control on DTR, and the associated RL techniques. Conclusions : This study reviewed the recent researches relevant to the topic, and highlighted the challenges, open questions, possible solutions, and future steps in inventing a realistic solution for the aforementioned problem.
... Since its establishment, it has serviced over 1500 patients with various complex mental and physical problems. Every patient receives a personalised approach to their rehabilitation and an interdisciplinary team with a designated coordinator who supports them through their journey [17][18][19][20]. ...
Article
Vocational Rehabilitation (VR) is a multidisciplinary health and social services process where patients are supported to permanently enter or re-enter the workforce. The aim of this retrospective cohort study spanning the last two decades was to investigate prolonged success after personalised VR and to identify aspects of VR that could be influenced to improve prolonged success. Former patients of an interdisciplinary VR centre in Iceland were surveyed and, using logistic regression, their responses were modelled with respect to socioeconomic information and descriptions of their progression and experience during the VR obtained by rigorous scientific methodology. Several aspects were found that could be influenced to increase the probability of prolonged success. These relate to three main areas: financial security, social skills and mental status, as well as the importance of time and support during the transition from VR to work or education. This was independent of age and gender. Personalised VR is a cornerstone of successful VR.
... Time is the concern addressed in the vast majority of papers, with 34 papers considering execution time [1, 2, 7, 8, 10, 14, 15, 17, 24, 32, 35, 39, 41-44, 47-50, 55, 58-63, 68, 70-72, 75, 77, 87, 88], number of CPU or bytecode instructions [4,11,12,21,22,85], or also loading time [23]. Other NFPs include code size [25,38,90,91], energy consumption [13,18,19,27], memory usage [7,8,88], accuracy of the underlying algorithm [30,31,59,60,62,81], readability [73], or other application-specific NFPs [37, 40, 45, 46, 51-53, 64, 65]. A summary is presented in Figure 1. ...
Conference Paper
Genetic improvement (GI) improves both functional properties of software, such as bug repair, and non-functional properties, such as execution time, energy consumption, or source code size. There are studies summarising and comparing GI tools for improving functional properties of software; however, there is no such study for improvement of its non-functional properties using GI. Therefore, this research aims to survey and report on the existing GI tools for improvement of non-functional properties of software. We conducted a literature review of available GI tools, and ran multiple experiments on the open-source tools found to examine their usability. We applied a cross-testing strategy to check whether the available tools can work on different programs. Overall, we found 63 GI papers that use a GI tool to improve non-functional properties of software, of which 31 are accompanied by open-source code. We were able to successfully run eight GI tools, and found that ultimately only two, Gin and PyGGI, can be readily applied to new general software.
Article
Despite the recent increase in research on improvement of non-functional properties of software, such as energy usage or program size, there is a lack of standard benchmarks for such work. This absence hinders progress in the field, and raises questions about the representativeness of current benchmarks of real-world software. To address these issues and facilitate further research on improvement of non-functional properties of software, we conducted a comprehensive survey on the benchmarks used in the field thus far. We searched five major online repositories of research work, collecting 5499 publications (4066 unique), and systematically identified relevant papers to construct a rich and diverse corpus of 425 relevant studies. We find that execution time is the most frequently improved property in research work (63%), while multi-objective improvement is rarely considered (7%). Static approaches for improvement of non-functional software properties are prevalent (51%), with exploratory approaches (18% evolutionary and 15% non-evolutionary) increasingly popular in the last 10 years. Only 39% of the 425 papers describe work that uses benchmark suites rather than single software; of those, SPEC is the most popular (63 papers). We also provide recommendations for future work, noting, for instance, the lack of benchmarks for non-functional improvement that cover Python, JavaScript, or mobile devices. All the details regarding the 425 identified papers are available on our dedicated webpage: https://bloa.github.io/nfunc_survey.
Preprint
Performance is a key quality of modern software. Although recent years have seen a spike in research on automated improvement of software's execution time, energy, memory consumption, etc., there is a noticeable lack of standard benchmarks for such work. It is also unclear how representative such benchmarks are of current software. Furthermore, non-functional properties of software are frequently targeted for improvement one at a time, neglecting potential negative impact on other properties. In order to facilitate more research on automated improvement of non-functional properties of software, we conducted a survey gathering benchmarks used in previous work. We considered 5 major online repositories of software engineering work: ACM Digital Library, IEEE Xplore, Scopus, Google Scholar, and arXiv. We gathered 5000 publications (3749 unique), which were systematically reviewed to identify work that empirically improves non-functional properties of software. We identified 386 relevant papers. We find that execution time is the most frequently targeted property for improvement (in 62% of relevant papers), while multi-objective improvement is rarely considered (5%). Static approaches are prevalent (in 53% of papers), with exploratory approaches (evolutionary in 18% and non-evolutionary in 14% of papers) increasingly popular in the last 10 years. Only 40% of the 386 papers describe work that uses benchmark suites rather than single software; of those, SPEC is the most popular (covered in 33 papers). We also provide recommendations for the choice of benchmarks in future work, noting, e.g., the lack of work that covers Python or JavaScript. We provide all programs found in the 386 papers on our dedicated webpage at https://bloa.github.io/nfunc_survey/. We hope that this effort will facilitate more research on the topic of automated improvement of software's non-functional properties.
Article
Janus Rehabilitation (Janus endurhæfing) provides medical vocational and occupational rehabilitation in which an interdisciplinary team of specialists helps participants return to the labour market. Most participants struggle with complex and multifaceted problems, mental and/or physical. They need vocational rehabilitation that is adapted to their differing needs. The organisation takes these needs into account by, among other things, offering different tracks. Development has taken place within the organisation; among other things, the company is a pioneer in the use of artificial intelligence within vocational rehabilitation. The role of physiotherapists in the rehabilitation has evolved in step with changing times, and today it consists for the most part of education and group training, in addition to individual treatment.
Conference Paper
We present a bespoke live system in commercial use with self-improving capability. During daytime business hours it provides an overview and control for many specialists to simultaneously schedule and observe the rehabilitation process for multiple clients. However, in the evening, after the last user logs out, it starts a self-analysis based on the day's recorded interactions. It generates test data from the recorded interactions for Genetic Improvement to fix any recorded bugs that have raised exceptions. The system has already been under test for over 6 months and has in that time identified, located, and fixed 22 bugs. No other bugs have been identified by other methods during that time. It demonstrates the effectiveness of simple test data generation and the ability of GI for improving live code. CCS CONCEPTS • Software and its engineering → Error handling and recovery; Automatic programming; Maintaining software; Search-based software engineering; Empirical software validation;
Conference Paper
We present a Genetic Improvement (GI) experiment on ProbAbel, a piece of bioinformatics software for Genome Wide Association (GWA) studies. The GI framework used here has previously been successfully used on Python programs and can, with minimal adaptation, be used on source code written in other languages. We achieve improvements in execution time without the loss of accuracy in output while also exploring the vast fitness landscape that the GI framework has to search. The runtime improvements achieved on smaller data sets scale up for larger data sets. Our findings are that for ProbAbel, the GI's execution time landscape is noisy but flat. We also confirm that human written code is robust with respect to small edits to the source code. CCS CONCEPTS • Software and its engineering → Genetic programming; Software performance; Search-based software engineering;
Article
Genetic improvement uses automated search to find improved versions of existing software. We present a comprehensive survey of this nascent field of research with a focus on the core papers in the area published between 1995 and 2015. We identified core publications including empirical studies, 96% of which use evolutionary algorithms (genetic programming in particular). Although we can trace the foundations of genetic improvement back to the origins of computer science itself, our analysis reveals a significant upsurge in activity since 2012. Genetic improvement has resulted in dramatic performance improvements for a diverse set of properties such as execution time, energy and memory consumption, as well as results for fixing and extending existing system functionality. Moreover, we present examples of research work that lies on the boundary between genetic improvement and other areas, such as program transformation, approximate computing, and software repair, with the intention of encouraging further exchange of ideas between researchers in these fields.
Conference Paper
In this paper we present a bespoke live system in commercial use that has been implemented with self-improving properties. During business hours it provides overview and control for many specialists to simultaneously schedule and observe the rehabilitation process for multiple clients. However in the evening, after the last user logs out, it starts a self-analysis based on the day’s recorded interactions and the self-improving process. It uses Search Based Software Testing (SBST) techniques to generate test data for Genetic Improvement (GI) to fix any bugs if exceptions have been recorded. The system has already been under testing for 4 months and demonstrates the effectiveness of simple test data generation and the power of GI for improving live code.
Conference Paper
Genetic Improvement (GI) is the process of using computational search techniques to improve existing software, e.g. in terms of execution time, power consumption or correctness. As in most heuristic search algorithms, the search is guided by fitness, with GI searching the space of program variants of the original software. The relationship between the program space and fitness is seldom simple and often quite difficult to analyse. This paper makes a preliminary analysis of GI's fitness distance measure on program repair with three small Python programs. Each program undergoes incremental mutations while the change in fitness, as measured by the proportion of tests passed, is monitored. We conclude that the fitness of these programs often does not change with single mutations and we also confirm the inherent discreteness of bug fixing fitness functions. Although our findings cannot be assumed to be general for other software, they provide us with interesting directions for further investigation.
Article
Background: Information regarding the determinants of successful vocational rehabilitation (VR) is scarce. Objective: Investigate whether sex, duration, quality of life and financial circumstances influence the success of VR. Methods: The study group consisted of 519 participants (293 women, 56%) who finished VR in the period 2000-2014. The group was divided into the following subgroups: dropouts, unsuccessful and successful VR. Data were collected by questionnaire. Results: Income had the most impact on whether the outcome was successful. Having supplemental income when entering the VR program increased the likelihood of a successful conclusion, odds ratio (OR) 5.60 (95% CI 2.43-13.59) (p < 0.001), as did being on sick leave, OR 5.02 (95% CI 1.93-13.79) (p < 0.001), or on rehabilitation pension, OR 1.93 (95% CI 1.07-3.52) (p < 0.03). The participants in the successful sub-group were older (p < 0.06) and stayed in rehabilitation longer (p < 0.001), compared to those who were unsuccessful. However, the effect on OR was limited: 1.03 (95% CI 1.01-1.06) and 1.04 (95% CI 1.02-1.07), respectively. Conclusions: For this sample, supplemental income appears to be the most important factor for a successful rehabilitation outcome. Checking financial status at the beginning of the rehabilitation process could minimize financial strain and increase the likelihood of success.
Chapter
Chronic kidney disease (CKD) is a global public health problem, affecting approximately 10% of the population worldwide. Yet, there is little direct evidence on how CKD can be diagnosed in a systematic and automatic manner. This paper investigates how CKD can be diagnosed by using machine learning (ML) techniques. ML algorithms have been a driving force in the detection of abnormalities in different physiological data, and are employed with great success in different classification tasks. In the present study, a number of different ML classifiers are experimentally validated on a real data set, taken from the UCI Machine Learning Repository, and our findings are compared with the findings reported in the recent literature. The results are quantitatively and qualitatively discussed, and our findings reveal that the random forest (RF) classifier achieves near-optimal performance in the identification of CKD subjects. Hence, we show that ML algorithms serve an important function in the diagnosis of CKD, with satisfactory robustness, and our findings suggest that RF can also be utilized for the diagnosis of similar diseases.
Article
Background: Observational studies have identified an association between body mass index (BMI) and incident atrial fibrillation (AF). Inferring causality from observational studies, however, is subject to residual confounding, reverse causation, and bias. The primary objective of this study was to evaluate the causal association between BMI and AF using genetic predictors of BMI. Methods: We identified 51 646 individuals of European ancestry without AF at baseline from seven prospective population-based cohorts initiated between 1987 and 2002 in the United States, Iceland, and the Netherlands with incident AF ascertained between 1987 and 2012. Cohort-specific mean follow-up ranged 7.4 to 19.2 years, over which period there were a total of 4178 cases of incident AF. We performed a Mendelian randomization with instrumental variable analysis to estimate a cohort-specific causal hazard ratio for the association between BMI and AF. Two genetic instruments for BMI were utilized: FTO genotype (rs1558902) and a BMI gene score comprised of 39 single nucleotide polymorphisms identified by genome-wide association studies to be associated with BMI. Cohort-specific estimates were combined by random-effects, inverse variance weighted meta-analysis. Results: In age- and sex-adjusted meta-analysis, both genetic instruments were significantly associated with BMI (FTO: 0.43 [95% CI: 0.32-0.54] kg/m² per A-allele, p<0.001; BMI gene score: 1.05 [95% CI: 0.90-1.20] kg/m² per 1-unit increase, p<0.001) and incident AF (FTO - HR: 1.07 [1.02-1.11] per A-allele, p=0.004; BMI gene score - HR: 1.11 [1.05-1.18] per 1-unit increase, p<0.001). Age- and sex-adjusted instrumental variable estimates for the causal association between BMI and incident AF were HR 1.15 [1.04-1.26] per kg/m², p=0.005 (FTO) and 1.11 [1.05-1.17] per kg/m², p<0.001 (BMI gene score). Both of these estimates were consistent with the meta-analyzed estimate between observed BMI and AF (age- and sex-adjusted HR 1.05 [1.04-1.06] per kg/m², p<0.001). Multivariable adjustment did not significantly change findings. Conclusions: Our data are consistent with a causal relationship between BMI and incident AF. These data support the possibility that public health initiatives targeting primordial prevention of obesity may reduce the incidence of AF.