Machine Learning Model Interpretability for Precision Medicine

Gajendra J. Katuwal* and Robert Chen+
*Rochester Institute of Technology, Rochester, NY 14623
+Georgia Institute of Technology, Atlanta, GA 30332

Abstract— Interpretability of machine learning
models is critical for data-driven precision
medicine efforts. However, highly predictive
models are generally complex and are difficult
to interpret. Here, using the Local Interpretable
Model-Agnostic Explanations (LIME) algorithm, we
show that complex models such as random forests
can be made interpretable. Using the MIMIC-II
dataset, we successfully predicted ICU mortality
with 80% balanced accuracy and were also able to
interpret the relative effect of the features on
prediction at the individual level.
Precision medicine holds great promise for
healthcare as it customizes medical care to an
individual’s unique disease state. The
widespread adoption of electronic health records
has resulted in a tsunami of data that can be
leveraged with machine learning-based approaches
to dissect clinical heterogeneity and aid
physicians in targeted decision making.
In general, highly accurate machine learning
models tend to become complex and hence are
difficult to interpret. In the trade-off between
predictive modeling and explanatory modeling,
explanatory power is especially valued by
healthcare practitioners because they want to
understand the contribution of specific features
to a model’s predictions. Understanding the
decision process of a predictive model is
essential before its decisions can be used in a
clinical setting, where they can affect the life
or death of a patient. A predictive model must
therefore either be interpretable itself or be
transformed into an interpretable form so that
users of the model can understand its decision
process. Model interpretability is thus vital for
the successful application of predictive models
in healthcare, especially for data-driven
precision medicine, since precision medicine
requires understanding a patient’s unique disease
state.
Interpretable models would deliver actionable
insights in line with precision medicine initiatives.
In this study, we perform a case study of the
application of model interpretability for precision
medicine on intensive care unit (ICU) data. ICUs
can benefit from the rich information that
improved model interpretability can provide.
models. In particular, one large area where ICU
physicians and staff can benefit is in early prediction
of mortality. This is a significant problem: the
average mortality rate in hospitals is between 8%
and 19%, or around 500,000 deaths annually. In this
study, we demonstrate how a complex highly
predictive model trained for ICU mortality
prediction can be approximated as a simple
interpretable model for each patient. Through the
approximated simple models, we show that the
important features’ contributions during the
decision process of the complex predictive model
can be uniquely understood for each patient.
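A minimal sketch of this local-approximation step is shown below, using the open-source lime package; the fitted classifier rf, the arrays X_train and X_test, and feature_names are placeholders standing in for this study's actual model and data.

# Sketch: locally approximating the complex model's decision for one
# patient with LIME. rf, X_train, X_test, and feature_names are assumed
# placeholders for this study's fitted model and extracted features.
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    X_train,                      # data used to sample local perturbations
    feature_names=feature_names,  # names of the extracted clinical features
    class_names=["no mortality", "mortality"],
    mode="classification",
)

# Fit a simple interpretable surrogate around one patient's feature
# vector and list the features weighted most heavily in its prediction.
explanation = explainer.explain_instance(
    X_test[0], rf.predict_proba, num_features=10
)
print(explanation.as_list())      # (feature, weight) pairs for this patient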
We extracted features from the Multiparameter
Intelligent Monitoring in Intensive Care II
(MIMIC-II) dataset containing 8,315 patients who
exhibited mortality and 23,974 patients who did not.
We extracted counts of medications, diagnoses, and
lab tests for all patients.
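As a rough illustration, per-patient counts of this kind could be assembled as in the sketch below; the file and column names are assumptions for the sketch, not the actual MIMIC-II schema.

# Sketch: per-patient count features from event tables. File and column
# names are illustrative assumptions, not the actual MIMIC-II schema.
import pandas as pd

def count_features(events, item_col, prefix):
    # One column per distinct item, counting occurrences per patient.
    counts = events.groupby(["patient_id", item_col]).size()
    return counts.unstack(fill_value=0).add_prefix(prefix)

meds = pd.read_csv("medications.csv")   # patient_id, drug
diags = pd.read_csv("diagnoses.csv")    # patient_id, icd9_code
labs = pd.read_csv("lab_tests.csv")     # patient_id, test_name

features = (
    count_features(meds, "drug", "med_")
    .join(count_features(diags, "icd9_code", "dx_"), how="outer")
    .join(count_features(labs, "test_name", "lab_"), how="outer")
    .fillna(0)
)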
We used 75% of the data for training and the
remaining 25% for testing. Feature selection and
classification were performed using scikit-learn
0.17.1. The top predictive features were selected
by an ANOVA F-value feature selection test under
10-fold cross-validation.
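A minimal sketch of this selection step follows; X and y stand for the extracted feature matrix and mortality labels, and k = 100 is an illustrative choice rather than the value used in the study.

# Sketch: ANOVA F-value feature selection checked across 10 stratified
# folds. X and y are assumed placeholders; k=100 is illustrative. The
# study used scikit-learn 0.17.1; the API below is the modern equivalent.
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, _ in skf.split(X, y):
    selector = SelectKBest(score_func=f_classif, k=100)
    selector.fit(X[train_idx], y[train_idx])
    print(selector.get_support(indices=True))  # indices of top-k features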
Next, a random forest (RF) model with 1000
trees was trained to predict mortality status
(where 0 indicates no mortality and 1 indicates
mortality). Gini impurity was used as the
splitting criterion while growing the decision
trees. A grid search was performed to select the
optimal number of predictors considered when
splitting a node of a decision tree in the RF.
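The sketch below shows one way this training and tuning step could look in scikit-learn; X_train and y_train are placeholders, and the max_features grid is an assumption rather than the grid used in the study.

# Sketch: RF with 1000 trees and Gini splits; grid search over the
# number of predictors considered at each split (max_features). The
# candidate grid is illustrative; "balanced_accuracy" scoring requires
# a recent scikit-learn version.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rf = RandomForestClassifier(
    n_estimators=1000,   # 1000 trees, as in the study
    criterion="gini",    # Gini impurity splitting criterion
    random_state=0,
)
grid = GridSearchCV(
    rf,
    param_grid={"max_features": ["sqrt", "log2", 0.1, 0.3, 0.5]},
    scoring="balanced_accuracy",
    cv=10,
)
grid.fit(X_train, y_train)
print(grid.best_params_)   # tuned number of predictors per split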