
Machine learning in the prediction of cancer therapy



Resistance to therapy remains a major cause of cancer treatment failures, resulting in many cancer-related deaths. Resistance can occur at any time during the treatment, even at the beginning. The current treatment plan is dependent mainly on cancer subtypes and the presence of genetic mutations. Evidently, the presence of a genetic mutation does not always predict the therapeutic response and can vary for different cancer subtypes. Therefore, there is an unmet need for predictive models to match a cancer patient with a specific drug or drug combination. Recent advancements in predictive models using artificial intelligence have shown great promise in preclinical settings. However, despite massive improvements in computational power, building clinically useable models remains challenging due to a lack of clinically meaningful pharmacogenomic data. In this review, we provide an overview of recent advancements in therapeutic response prediction using machine learning, which is the most widely used branch of artificial intelligence. We describe the basics of machine learning algorithms, illustrate their use, and highlight the current challenges in therapy response prediction for clinical practice.
Raihan Rafique, S.M. Riazul Islam, Julhash U. Kazi
Ideflod AB, Lund, Sweden
Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
Division of Translational Cancer Research, Department of Laboratory Medicine, Lund University, Lund, Sweden
Lund Stem Cell Center, Department of Laboratory Medicine, Lund University, Lund, Sweden
article info
Article history:
Received 14 March 2021
Received in revised form 6 July 2021
Accepted 7 July 2021
Available online 08 July 2021
Keywords:
Artificial intelligence
Deep learning
Monotherapy prediction
Drug combinations
Drug synergy
Variational autoencoder
Restricted Boltzmann machine
Support vector machines
Ridge regression
Elastic net
Random forests
Deep neural network
Convolutional neural network
Graph convolutional network
Matrix factorization
Factorization machine
Higher-order factorization machines
Visible neural network
Ordinary differential equation
© 2021 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and
Structural Biotechnology. This is an open access article under the CC BY license (http://creativecommons.
1. Introduction
2. Basics of therapy response prediction
2.1. Pharmacogenomic data resources
2.2. Data preprocessing
3. ML algorithms for drug response prediction
3.1. Linear regression
3.2. Nonlinear regression
3.3. Kernel functions
3.4. Deep learning
Corresponding author at: Division of Translational Cancer Research, Department of Laboratory Medicine, Lund University, Medicon village Building 404:C3, Scheelevägen
8, 22363 Lund, Sweden.
E-mail address: (J.U. Kazi).
Equal contributions.
Computational and Structural Biotechnology Journal 19 (2021) 4003–4017
4. Monotherapy response prediction
4.1. Classical ML models in monotherapy prediction
4.2. Deep neural networks in monotherapy prediction
4.3. Matrix factorization and factorization machines in monotherapy prediction
4.4. Autoencoders in monotherapy prediction
4.5. Graph convolutional networks in monotherapy prediction
4.6. Visible neural networks in monotherapy prediction
4.7. PDXs and organoids in monotherapy prediction
5. Drug synergy prediction
5.1. Drug synergy prediction using conventional ML methods
5.2. Drug synergy prediction using DL
5.3. Synergy prediction with a higher-order factorization machine
5.4. Synergy prediction using an autoencoder
5.5. Synergy prediction with a graph convolutional network
5.6. Restricted Boltzmann machine for predicting drug synergy
6. Limitations in the development of clinically relevant predictive models
7. Conclusion
CRediT authorship contribution statement
Declaration of Competing Interest
Acknowledgments
References
1. Introduction
Adaptive resistance mechanisms are highly dependent on can-
cer subtypes and applied treatments. Therefore, the resistance
mechanism needs to be defined for each cancer subtype and indi-
vidual treatment plan. Currently, hardly any tools exist to deter-
mine from the beginning whether a patient will respond to a
specific therapy or display resistance. Thus, there is an unmet need
to develop tools to identify drug responses in individual patients
for precision medicine. Recent technological advances have initi-
ated a new era of precision medicine through data-driven assess-
ment of diseases by combining machine learning (ML) and
biomedical science. The use of artificial intelligence such as ML
helps to extract meaningful conclusions by exploiting big data,
thereby improving treatment outcomes. ML is widely used in can-
cer research and is becoming increasingly popular for cancer detec-
tion and treatment. The main goal of precision medicine is to
provide therapies that not only increase the survival chances of
patients but also improve their quality of life by reducing
unwanted side effects. This can be achieved by matching patients
with appropriate therapies or therapeutic combinations.
Some of the early studies on ML and its applications in human
cancer research have been discussed elsewhere [1]. Several recent
overviews in this emerging field have provided valuable insights
into the relevant computational challenges and advancements
[2–8]. These overviews illustrated the importance of the field and
supported the notion that ML is a highly promising approach to
personalized therapy for cancer treatment. In a recent review, a
broad perspective was provided on how ML tools can be incorpo-
rated into clinical practice with a focus on biomarker development
[9]. Another review identified several challenges in omics data
analysis and data integration to obtain robust results in big-data-
assisted precision medicine [10]. Several other reviews dealt pri-
marily with the computational methods and software that are
required to advance data-driven precision oncology [11–13]. Also,
whereas Grothen et al. discussed artificial intelligence-based
investigations into cancer subtypes and disease prognosis from a
systems biology perspective [14], Biswas et al. reviewed artificial
intelligence applications for pharmacy informatics in a surveillance
and epidemiological context [15]. Another study systematically
explained how deep learning (DL), a subset of ML, has emerged
as a promising technique, highlighting various genomics and phar-
macogenomics data resources [16]. However, the aforementioned
studies did not focus strictly on drug response prediction from
clinical perspectives. In recent years, several surveys and review
articles have presented the potential and challenges of ML adop-
tion in clinical practice and drug response prediction in cancer
treatment [17–23]. Nonetheless, the area of applications of ML in
cancer treatment is so diverse that various issues still need to be
analyzed from a holistic perspective. In this review, we provide a
comprehensive overview of the ML solutions for drug response
prediction relating to the relevant clinical practices. In addition
to discussing the basics of therapy response prediction and related
ML principles, we systematically present the ML and DL
approaches that are promising for monotherapy and combination
therapy in cancer treatment, a focus that makes our article differ-
ent from existing surveys and reviews.
2. Basics of therapy response prediction
Predictive model development involves several steps that combine
biological data and ML algorithms. A brief workflow is depicted in
Fig. 1.
2.1. Pharmacogenomic data resources
High-quality biological data are a prerequisite for a good model.
Large-scale cell line data are publicly available from different plat-
forms and include genomic, transcriptomic, and drug response
data. Pharmacogenomic data for cell lines are available mainly
from the Cancer Cell Line Encyclopedia (CCLE) [24,25], NCI-60
[26], the Genomics of Drug Sensitivity in Cancer (GDSC) [27,28],
gCSI [29], and the Cancer Therapeutics Response Portal (CTRP)
[30,31]. PharmacoDB [32] and CellMinerCDB [33,34] provide
access to the curated data from different studies. These datasets
offer baseline genomic and transcriptomic data for cell lines cover-
ing a wide range of cancers. DrugComb [35] and DrugCombDB [36]
offer manually curated drug combination data from different
studies. Besides these pharmacogenomic data for cell lines, which
have been widely used to develop ML models, several initiatives
have recently been undertaken to generate pharmacogenomic data
from patient-derived xenografts (PDXs). Compared with cell lines,
PDXs are superior in predicting clinical activities. PDX Finder [37],
PRoXE [38], PDMR [39], and EurOPDX [40] provide comprehensive
data for PDXs. Several other studies also provide high-quality tran-
scriptomic and pharmacogenomic data that are useful for model
development or testing when combined with other datasets [41–
2.2. Data preprocessing
Data preprocessing is an important step in the ML approach.
Large-scale data preprocessing includes data selection, noise filter-
ing, imputation of missing values, feature selection, and
Data selection – Data selection remains the most challenging
aspect due to the possible inconsistencies between different data-
sets [46]. Studies comparing the largest public collections of phar-
macological and genomic data for cell lines suggest that each
dataset separately exhibits reasonable predictive power but that
combining datasets can further increase the classification accuracy
Feature selection – Large-scale cell line datasets comprise tran-
scriptomic, mutational, copy number variation (CNV), methylation,
and proteomic data. Although genetic features such as mutations,
CNV, and promoter methylation have been shown to provide
important therapeutic insights, these features seem to be limited
to individual tumors [27]. Therefore, it has been suggested that
transcriptomic features alone hold the most predictive power
and that the addition of genetic features marginally improves per-
formance of an ML model [48–50]. The feature-to-sample ratio
plays an important role in controlling the variances, with a smaller
ratio providing better prediction [51]. However, maintaining a
proper feature-to-sample ratio is challenging for pharmacoge-
nomic data. For example, transcriptomic data can have more than
15,000 features, while the number of samples in any pharmacoge-
nomic study remains between 100 and 1000. Systematically reduc-
ing the number of features (also known as dimensionality
reduction) by incorporating meaningful descriptions improves pre-
diction accuracy by reducing overfitting [52,53]. Several tech-
niques can be used for feature selection, including minimum
redundancy maximum relevance (mRMR), high-correlation filters,
principal component analysis, and backward feature elimination
Data normalization – Because the range of values of raw data
varies widely, a normalization technique (also known as feature
scaling) is applied to change the values of numeric columns in
the dataset to obtain a common scale, so that the associated objec-
tive functions work properly. Different ways exist to perform fea-
ture scaling, including min–max normalization, rank-invariant
set normalization, data standardization, cross-correlation, and
scaling to unit length [63].
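Three of the scaling methods named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and the expression values are hypothetical and are not taken from any of the cited studies.

```python
import math


def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def unit_length(values):
    """Divide by the Euclidean norm so the vector has length 1."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return list(values)
    return [v / norm for v in values]


# Hypothetical expression values of one gene across five cell lines.
expression = [2.0, 4.0, 6.0, 8.0, 10.0]
scaled = min_max(expression)        # values now lie in [0, 1]
z_scores = standardize(expression)  # mean 0, standard deviation 1
unit = unit_length(expression)      # Euclidean norm 1
```

In practice, scaling parameters such as the minimum, maximum, mean, and standard deviation would normally be estimated on the training data only and then reused for the test data, so that no information leaks from the test set.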
3. ML algorithms for drug response prediction
ML algorithms can be grouped into four major classes: super-
vised learning, semi-supervised learning, unsupervised learning,
and reinforcement learning [64,65]. Supervised learning algo-
rithms use a training dataset with known outcomes to build a
hypothetical function with decision variables that can later be used
to predict unknown samples (Fig. 2). On the other hand, unsuper-
vised learning algorithms use unlabeled data to find hidden struc-
tures or patterns; these algorithms are widely used in biological
research for clustering and pattern detection. Semi-supervised
learning algorithms are self-learning and can develop a prediction
model from partially labeled data [66]. A reinforcement learning
algorithm addresses a sequential decision problem in which the
algorithm solves a problem and learns from the solution [65]. In
this case, the algorithm discovers which actions result in the best
output on a trial-and-error basis. Supervised learning algorithms
are generally used for building classification models, and
these algorithms have also been widely tested for predicting treat-
ment outcomes. Therefore, in this review, we will focus mainly on
supervised learning algorithms.
3.1. Linear regression
Linear regression algorithms are simple and are among the most
popular ML algorithms, with a wide range of applications. The
standard algorithm, least squares regression, uses the sum of
squared residuals as the cost function to be minimized. Least
squares regression works with a simple dataset; however, with
increasing complexity, the algorithm shows overfitting (low bias
but large variance). To resolve this problem, several algorithms,
such as the ridge model, lasso model, and elastic net, have been
proposed. The cost functions in these models have been modified
to increase the bias and reduce the variance. In a ridge model, a
so-called L2 regularization, which is the squared value of the slope
multiplied by λ, has been added to the least squares cost function.
The least absolute shrinkage and selection operator (lasso) regular-
ization (known as L1 regularization) is similar to the ridge regular-
ization, but in this case, the added value is the absolute value of the
slope multiplied by λ. The elastic net algorithm adds contributions
from both L1 and L2 regularization; the cost function = min (sum of
the squared residuals + λ × squared value of slope + λ × absolute
value of slope). The λ parameter is a positive number that repre-
sents regularization strength. A larger λ value specifies stronger
regularization, while a near-zero value removes the regularization
so that all three algorithms become similar to the least squares
model (Fig. 3). By changing the value of λ, it is possible to select
meaningful features. Therefore, these methods can be applied to
feature selection as well as to classification and regression prob-
lems [24,28].
Fig. 1. Workflow for ML prediction model development. Pharmacogenomic data from cell lines, patient-derived xenografts (PDXs), and patient materials are ideal for ML
model development. Data from different sources are preprocessed and then divided into training (including cross-validation) and test groups. The training dataset is used to
build and validate the prediction model, while the test dataset is used for testing the model’s accuracy and precision. To develop a prediction model for clinical use, rigorous
preclinical assessment is required that can be performed using cell lines, PDXs, and patient materials that have not been used for model development. Additionally, the
efficacy of predicted drugs must be tested for disease-specific preclinical models. Finally, both the model and predicted drug will undergo a clinical trial.
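The three regularized cost functions described in this section can be written out for the simplest case of one feature, one slope, and one intercept. The sketch below is illustrative only: the function names and data are hypothetical, the same regularization strength (`lam`) is used for both penalties as in the text, and the intercept is assumed to be unpenalized. A closed-form ridge fit is included to show how the slope shrinks as the regularization strength grows.

```python
def ssr(xs, ys, slope, intercept):
    """Sum of squared residuals (the least squares cost)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def ridge_cost(xs, ys, slope, intercept, lam):
    """Least squares cost plus the L2 penalty: lam * slope squared."""
    return ssr(xs, ys, slope, intercept) + lam * slope ** 2


def lasso_cost(xs, ys, slope, intercept, lam):
    """Least squares cost plus the L1 penalty: lam * |slope|."""
    return ssr(xs, ys, slope, intercept) + lam * abs(slope)


def elastic_net_cost(xs, ys, slope, intercept, lam):
    """Least squares cost plus both penalties (same lam for each, as in the text)."""
    return ssr(xs, ys, slope, intercept) + lam * slope ** 2 + lam * abs(slope)


def fit_ridge(xs, ys, lam):
    """Closed-form minimizer of ridge_cost for a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + lam)  # lam = 0 gives the least squares slope
    return slope, mean_y - slope * mean_x


# Hypothetical data: one feature and a response for five samples.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
ols_slope, ols_intercept = fit_ridge(xs, ys, lam=0.0)
shrunk_slope, _ = fit_ridge(xs, ys, lam=10.0)  # smaller than ols_slope
```

With `lam` set to zero, all three cost functions reduce to the least squares cost, which mirrors the behavior described for a near-zero regularization strength.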
3.2. Nonlinear regression
Among the various supervised learning algorithms, the decision
tree is a relatively popular predictive modeling algorithm used to
classify simple data. A decision tree takes data in the root node
and, according to a test rule (representing the branch), keeps grow-
ing until it reaches a decision (representing a leaf node). The inter-
nal nodes represent different attributes (features) [67]. Each
internal node breaks the data into a small subset until it meets a
particular condition. It is a white-box-type algorithm, as each step
can be understood, interpreted, and visualized. Although the deci-
sion tree is useful for simple classification, with a larger dataset
that has many features, it displays poor prediction powers due to
overfitting. To resolve this problem, several advanced
decision-tree-based models have been developed. The random forest
algorithm randomly resamples the training data with replacement
(bootstrapping) into several subsets (bagging) and uses each subset
to build decision trees
(Fig. 4). The use of multiple random decision trees for prediction
increases the prediction accuracy [68]. Apart from the parallel
use of random multiple decision trees, boosting algorithms, such
as adaptive boosting (AdaBoost) and gradient boosting, use deci-
sion trees sequentially [69,70]. AdaBoost usually uses one-node
decision trees (decision stump), while gradient boosting uses deci-
sion trees of between 8 and 32 terminal nodes. Both adaptive and
gradient boosting algorithms display better prediction perfor-
mance than single decision trees. Furthermore, a more regularized
gradient boosting algorithm, extreme gradient boosting (XGBoost),
outperforms the former gradient boosting algorithms [71].
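The bootstrapping, bagging, and aggregation steps can be illustrated with a deliberately simplified ensemble. The sketch below is an assumption-laden toy: it uses one-node decision trees (stumps) on a single feature, omits the random feature subsetting that full random forest implementations typically add, and all names and data are hypothetical.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Choose the threshold that misclassifies the fewest samples.

    samples is a list of (x, label) pairs with label 0 or 1;
    the stump predicts 1 when x is greater than the threshold.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x > threshold) != bool(label) for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bootstrap(samples, rng):
    """Resample with replacement to the same size as the original dataset."""
    return [rng.choice(samples) for _ in samples]


def fit_forest(samples, n_trees, seed=0):
    """Bagging: fit one stump per bootstrapped dataset."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap(samples, rng)) for _ in range(n_trees)]


def predict(forest, x):
    """Aggregation: majority vote over all stumps."""
    votes = Counter(int(x > threshold) for threshold in forest)
    return votes.most_common(1)[0][0]


# Hypothetical data: a single feature value and a binary response label.
training = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
            (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
forest = fit_forest(training, n_trees=25)
low_call = predict(forest, 0.05)   # every stump votes 0 for this value
high_call = predict(forest, 0.95)  # every stump votes 1 for this value
```

For a regression problem, the vote in `predict` would be replaced by the mean or median of the individual tree outputs.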
3.3. Kernel functions
Kernel functions are widely used to transform data to a higher-
dimensional similarity space. Kernel functions can be linear, non-
linear, sigmoid, radial, polynomial, etc. Support vector machines
(SVMs) are among the most popular kernel-based algorithms that
can be used not only for supervised classification and regression
problems but also for unsupervised learning. In a two-
dimensional space, a linear SVM classifier is defined by a straight
line as a decision boundary (maximum margin classifier) with a
soft margin (Fig. 5A). In this case, the soft margins are also straight
lines that represent the minimal distance of any training point to
the decision boundary [72]. With simple one-dimensional data,
the decision boundary can be a point (Fig. 5B); however, for complex
problems, the data may need to be transformed to a higher
dimension to draw a decision boundary (Fig. 5C).
Fig. 2. Schematic representation of different ML algorithms. In a supervised learning model, all data have a known label, while the semi-supervised model can handle
partially labeled data. Both unsupervised and reinforcement learning algorithms can handle unlabeled data.
Fig. 3. A comparison of different linear regression algorithms. The sklearn.linear_model module from scikit-learn was used to generate example plots using a diabetes dataset
provided in scikit-learn. Plots show that by changing the λ value, regression can be regulated such that with a small λ value, all linear regression algorithms provide similar
regression. Color code: linear regression – blue, ridge regression – green, lasso – cyan, and elastic net – red. (For interpretation of the references to color in this figure legend,
the reader is referred to the web version of this article.)
Fig. 4. Schematic representation of random forest algorithm. The three major steps in the random forest algorithm are bootstrapping, bagging, and aggregation. During
bootstrapping, the training dataset is resampled into several small datasets, which are then bagged for the decision tree. The size of the bagged dataset remains the same but
bootstrapped decision trees are different from each other. All decision trees make predictions on test data, and in the aggregation step, all predictions are combined for the
final prediction. For a classification problem, the final prediction is made by majority voting, but for a regression problem, the final prediction uses the mean or median value.
Fig. 5. Support vector machine. (A) In a two-dimensional SVM classification system, the maximum margin classifier is a straight line (red line). Support vectors are the
nearest data points from the maximum margin classifier. The distance between support vectors and the maximum margin classifier is denoted as the soft margin. (B) In a
two-group, one-dimensional data space, the decision boundary is a point, as shown by the red line. (C) In a two-group one-dimensional data space where the decision
boundary cannot be drawn by a point, data are transformed by a kernel function to increase the dimension. (For interpretation of the references to color in this figure legend,
the reader is referred to the web version of this article.)
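The one-dimensional case in which no single point separates the two groups can be made concrete with a small sketch. The data, the quadratic feature map, and the function names below are hypothetical choices for illustration; other kernels would serve equally well.

```python
def feature_map(x):
    """Lift a one-dimensional value into two dimensions: (x, x squared)."""
    return (x, x * x)


def kernel(a, b):
    """Kernel equal to the inner product of feature_map(a) and feature_map(b)."""
    return a * b + (a * b) ** 2


def separable_by_point(values, labels):
    """True if a single threshold on the axis splits the two classes."""
    ordered = sorted(zip(values, labels))
    changes = sum(1 for (_, first), (_, second) in zip(ordered, ordered[1:])
                  if first != second)
    return changes <= 1


# Hypothetical data: class 1 sits at both ends of the axis, class 0 in the middle.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1]

before = separable_by_point(xs, labels)          # no single point works
squared = [feature_map(x)[1] for x in xs]        # the added dimension
after = separable_by_point(squared, labels)      # a threshold on x squared works
```

The `kernel` function shows why the transformation need not be computed explicitly: it returns the same value as the inner product of the two lifted points, which is the quantity a kernel-based classifier actually needs.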
3.4. Deep learning
DL methods are a type of ML method that can automatically dis-
cover appropriate representations for regression or classification
problems upon being fed with suitable data. The model can learn
complex functions and amplify important aspects while suppressing
irrelevant variations. During training, the algorithm takes the raw input
and processes it through hidden layers using nonlinear activation
functions. The algorithm tries to minimize certain cost functions
by defining values for the weights and biases (Fig. 6A). Usually, gra-
dient descent is used to find the minima. Gradients for all modules
can be determined by using the chain rule for derivatives, a proce-
dure that is known as backpropagation (starting from the output
and moving toward the input) [73]. DL algorithms have been suc-
cessfully employed in various domains, including image classifica-
tion, because of the availability of more data than features. The
development of DL models using genomic or transcriptomic data
is challenging due to the limited number of samples and the pres-
ence of many features. The selection of appropriate features can
reduce the feature-to-sample ratio and, thereby, prevent overfit-
ting. Furthermore, the addition of random dropout layers can help
the model learn important features and reduce overfitting (Fig. 6B).
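The training procedure described above (forward pass, cost function, chain rule, gradient descent) can be traced on the smallest possible network. This is a minimal sketch, assuming one input, one tanh hidden unit, a linear output, and a squared-error cost; the parameter values, learning rate, and function names are hypothetical, and dropout is not shown.

```python
import math


def forward(params, x):
    """Forward pass: weight, bias, nonlinear activation, then a linear output."""
    w1, b1, w2, b2 = params
    hidden = math.tanh(w1 * x + b1)
    return hidden, w2 * hidden + b2


def loss(params, x, y):
    """Squared-error cost for a single sample."""
    return (forward(params, x)[1] - y) ** 2


def gradients(params, x, y):
    """Backpropagation: chain rule applied from the output back to the input."""
    w1, b1, w2, b2 = params
    hidden, y_hat = forward(params, x)
    d_out = 2.0 * (y_hat - y)                 # d loss / d y_hat
    d_w2, d_b2 = d_out * hidden, d_out        # output-layer gradients
    d_pre = d_out * w2 * (1.0 - hidden ** 2)  # through tanh: 1 - tanh(z)^2
    d_w1, d_b1 = d_pre * x, d_pre             # hidden-layer gradients
    return [d_w1, d_b1, d_w2, d_b2]


def train(params, x, y, learning_rate=0.05, steps=100):
    """Plain gradient descent on a single training sample."""
    params = list(params)
    for _ in range(steps):
        grads = gradients(params, x, y)
        params = [p - learning_rate * g for p, g in zip(params, grads)]
    return params


# Hypothetical starting weights and a single (input, target) pair.
initial = [0.5, 0.1, -0.3, 0.2]
x, y = 1.0, 1.0
trained = train(initial, x, y)  # the cost is lower than at the start
```

Real DL frameworks compute these gradients automatically for networks with many layers and nodes; the hand-written version only makes the chain-rule steps visible.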
Convolutional neural networks (CNNs) are useful for feature
learning (Fig. 6C). During the convolution and pooling steps, the
algorithm of a CNN learns important features [73]. CNNs are
widely used for structured data, such as images; however, if the
data are stored in other types of architectures, such as graphs (an
example includes small-molecule drugs with multiple atoms and
chemical bonds), conventional CNNs cannot be used. In this case,
a different type of convolutional neural network, referred to as
graph convolutional networks (GCNs), can be applied to the
graph data [74]. GCNs have especially been used to extract atomic
features from drug structure (graph) data [75].
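A single graph convolution step on a drug-like graph can be sketched as follows. The sketch assumes one commonly used formulation, in which node features are averaged over neighbors through a symmetrically normalized adjacency matrix with self-loops and then passed through a weight matrix and a ReLU; whether the cited studies use exactly this form is not stated here. The three-atom molecule, the one-hot atom features, and the weights are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(degree_i * degree_j)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    degree = [sum(row) for row in with_loops]
    return [[with_loops[i][j] / math.sqrt(degree[i] * degree[j])
             for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weights):
    """One layer: aggregate neighbor features, apply weights, then ReLU."""
    aggregated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(aggregated, weights)]


# Hypothetical three-atom chain (atom 0 bonded to 1, atom 1 bonded to 2).
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
# One-hot atom features: [is_carbon, is_oxygen].
atom_features = [[1.0, 0.0],
                 [1.0, 0.0],
                 [0.0, 1.0]]
# Hypothetical weights; in a real model these are learned during training.
layer_weights = [[0.5, -1.0],
                 [1.0, 0.5]]

norm = normalized_adjacency(adjacency)
atom_embeddings = gcn_layer(adjacency, atom_features, layer_weights)
```

Each row of `atom_embeddings` is a new feature vector for one atom that now also reflects its bonded neighbors; stacking several such layers lets information travel further along the molecular graph.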
4. Monotherapy response prediction
Currently, only a few drug response prediction tools are avail-
able for clinical use. In fact, a couple of linear regression prediction
models are currently being used for certain types of cancers. A
supervised classification model using a 70-gene signature was
developed in 2002 to predict chemotherapy responses in breast
cancer [76]. The method was patented as MammaPrint and is cur-
rently used in the clinic for patients with early-stage breast cancer.
Later, a similar method was developed in which a linear regression
model based on the scores of a 21-gene signature (Oncotype DX)
was used to predict the chemotherapy responses in early-stage,
estrogen-receptor-positive, HER2-negative invasive breast cancer
[77]. Furthermore, a 50-gene signature was employed in multivari-
ate supervised learning (PAM50 or Prosigna, a breast cancer prog-
nostic gene signature assay) to predict treatment responses in
breast cancer [78]. Aside from these simple, cancer-subtype-
specific prediction models that are currently available in the clinic,
most other studies regarding monotherapy predictions are still in
the preclinical phase. Fig. 7 shows an overview of the methods that
have been used to develop monotherapy prediction models in the
past decade (a brief overview is included in Table 1).
Fig. 6. Deep learning (DL). (A) In a deep neural network (DNN) model, each node of the input data layer is fully connected to the hidden layer nodes. The first hidden layer
takes input data, multiplies it by weight, and adds a bias before applying a nonlinear activation function. The second hidden layer takes the first hidden layer as input and so
on until it reaches the output layer. (B) In a dropout layer, some nodes are randomly removed. (C) During the convolution, the dimension of input data is reduced using a
certain kernel size (in this example, 3x3) and the activation function. Then, features are pooled for further reduction. Finally, pooled features are flattened and applied to a
4.1. Classical ML models in monotherapy prediction
Sparse linear regression models have been used to predict drug
sensitivity in initial large-scale pharmacogenomic studies with cell
lines from various cancers [24,28,30]. These studies combined
genomic features with transcriptomic features from cell lines and
correlated them with corresponding drug sensitivity scores. The
ridge regression and elastic net algorithms were predominantly
employed for predictions [24,28,30,50,79]. However, due to the lin-
ear nature of the algorithms and the use of many features, these
models could easily become overfitted.
As discussed above, the performance of prediction algorithms is
largely influenced by biological feature selection [54,55,80,81].
Prediction performance can further be improved by incorporating
information on the similarity between cell lines and drugs [82].
Cell lines with a similar gene expression profile show similar
responses to a specific drug, while drugs with a similar chemical
structure display similar inhibitory effects toward different cell
lines. Therefore, a dual-layer network model that also considers
similarity information outperforms linear models [82]. Likewise,
a method based on a heterogeneous network in which the relation-
ships among drugs, drug targets, and cell lines were explicitly
incorporated was shown to better capture the relationship
between cell lines and drugs [83]. Collectively, a predictive model
with selected features performs better, and the addition of network
features improves the prediction accuracy.
The community-based NCI-DREAM study used a limited num-
ber of samples with a large number of genomic, transcriptomic,
and proteomic features [49]. The NCI-DREAM initiative developed
44 different drug sensitivity prediction models, with the Bayesian
multitask multikernel learning (BM-MKL) models performing rela-
tively better than other models. BM-MKL includes Bayesian infer-
ence, multitask learning, multiview learning (multiple data view),
and kernelized regression [49,84,85]. The standard model, kernel-
ized regression, is a nonlinear classification algorithm similar to
SVMs. Unlike the elastic net, kernelized regression captures the
nonlinear relationship between drug sensitivity and genomic or
transcriptomic features but simplifies the process by using a single
component for the predictions.
Besides using genomic or transcriptomic features to predict
drug sensitivity, the chemical and structural properties (also
known as descriptors) of drugs have been incorporated into the
learning algorithms. Combining drug descriptors with genomic or
transcriptomic data allows for the simultaneous prediction of
Table 1
Studies predicting monotherapy responses.
Year | Data | Features | Algorithm | Ref.
2012 | GDSC | Mutation, CNV, gene expression | Elastic net | [28]
     | CCLE | Mutation, CNV, gene expression | Elastic net | [24]
2013 | CCLE, GDSC | Gene expression (1000 selected genes) | Elastic net and other | [54]
     | CTRP | Mutation, CNV | Elastic net | [30]
     | GDSC | Selected genomic features | Neural networks and random forests | [80]
2014 | GDSC, clinical data | Gene expression | Ridge regression | [79]
     | CCLE, GDSC | Mutation, CNV, gene expression | Elastic net and ridge regression | [50]
     | GDSC, CCLE, NCI | Gene expression (1000 selected genes) | Random forest | [55]
     | NCI-DREAM | Mutation, CNV, gene expression, proteomic | BM-MKL | [49]
2015 | GDSC, CCLE | Gene expression | Cell line-drug network model | [82]
2016 | NCI | Mutation, CNV, gene expression, RPLA, miRNA | Random forest and support vector machine | [81]
     | GDSC 2 | Mutation, CNV, gene expression, methylation | Elastic net and random forest | [27]
     | LINCS | Gene expression | DNN | [88]
2018 | AML patient and cell line data | Gene expression | VAE + LASSO (DeepProfile) | [99]
     | GDSC | Genomic fingerprints | CNN | [91]
     | AML patient and cell line data | Gene expression, mutation, CNV, methylation | Network-based gene-drug associations | [87]
     | PharmacoDB, CMap | Gene expression | VAE (Dr.VAE) | [59]
     | CCLE, GDSC | Gene expression | Recommender systems | [94]
2019 | GDSC | Gene expression | DNN | [90]
     | TCGA, CCLE | Mutation, gene expression | VAE, DL (DeepDR) | [60]
     | GDSC | Mutations and CNV | CNN (tCNNS) | [105]
     | GDSC | Mutation, CNV, gene expression | DL (MOLI) | [92]
     | GDSC, CCLE | Gene expression | Autoencoder (DeepDSC) | [61]
2020 | PDXGEM | Gene expression | Random forest | [106]
     | GDSC, KEGG, STITCH | Gene expression, pathway | DL | [89]
     | GDSC, CCLE, CTRP | Gene expression, mutation, CNV, methylation | VNN | [62]
     | van de Wetering et al. [108], Lee et al. [109] | Gene expression, pathway | Ridge regression | [107]
2021 | GDSC | Mutations and CNV | Graph convolutional network | [104]
Fig. 7. ML algorithms used in the last decade to build monotherapy response prediction models. Earlier prediction models were developed mainly using classical ML
algorithms, whereas later models were developed mostly using DL algorithms. The majority of the studies used multi-omics data (mutation, CNV, methylation, and gene
expression) collected from large screening studies such as CCLE, GDSC, and CTRP. EN – elastic net, RF – random forest, NN – neural network, RR – ridge regression, BM-MKL –
Bayesian multitask multi-kernel learning, SVM – support vector machine, LASSO – least absolute shrinkage and selection operator, CNN – convolutional neural network, DNN
– deep neural network, AE – autoencoder, VAE – variational autoencoder, MF – matrix factorization, VNN – visible neural network, GCN – graph convolutional network.
R. Rafique, S.M. Riazul Islam and J.U. Kazi Computational and Structural Biotechnology Journal 19 (2021) 4003–4017
multiple drug responses from a single model, although it is a chal-
lenging task due to the further increase in the total number of fea-
tures [86]. Likewise, in a study with multicancer and multidrug
associations, a disease-specific multi-omics approach to predicting
gene-drug association was adopted in which each gene was
checked for a pathway association [87]. The method is useful for
identifying critical regulatory genes that can be targeted by a drug.
4.2. Deep neural networks in monotherapy prediction
Although DL has long been widely used in several areas of medical
science and drug discovery platforms, it has only recently been
applied to drug response prediction. Initially, feedforward
deep neural networks (DNNs) were applied to develop models
using selected genomic features [80] or transcriptomic data [88].
Later studies incorporated selected gene expression features with
pathway information to build DNN models [89,90]. In any case,
all these DNN models have been shown to outperform classical
ML models.
A CNN was used in the Cancer Drug Response Profile scan
(CDRscan) study, in which convolutions were applied separately
to genomic fingerprints of cell lines and molecular fingerprints of
drugs [91]. After convolution, those two sets of features were
merged and used with the drug response data to develop a DNN
model. Because a CNN learns important features during training
[73], the CDRscan method displays considerably higher robustness
and generalizability. A similar model (MOLI) was developed using
somatic mutations, CNVs, and gene expression data from GDSC
[92]; the model was later validated with PDXs and patient samples.
4.3. Matrix factorization and factorization machines in monotherapy
Matrix factorization (MF) is a supervised learning method that
has been widely used in popular e-commerce ML recommender
systems [93]. MF takes high-dimensional data, with missing information,
as input and decomposes it into lower-dimensional matrices
that share the same number of latent factors (Fig. 8A). The learning
algorithms in recommender systems are not general and must be
tailored to each specific model. A modified recommender system
was developed (CaDRReS) in which cell line features were first cal-
culated using gene expression information [94]. The MF method
determined the pharmacogenomic space (the dot product of the
cell line vector and the drug vector), and drug sensitivity was com-
puted using a specific linear algorithm. The model was compared
to other ML algorithms and was found to perform similarly to
the elastic net. Because the model provides a projection of cell lines
and drugs into the pharmacogenomic space, it is easy to explore
relationships between drugs and cell lines [94].
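The decomposition described above can be sketched as follows. This is not the CaDRReS implementation: the sensitivity matrix, the number of latent factors `k`, the learning rate, the regularization strength, and all function names are hypothetical, and plain stochastic gradient descent on the observed entries is assumed as the training procedure.

```python
import math
import random

def predict(P, Q, i, j):
    """Predicted sensitivity: dot product of cell line and drug vectors."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

def factorize(R, k=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Factorize R (cell lines x drugs) using only the observed entries."""
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(m)]
    observed = [(i, j) for i in range(n) for j in range(m)
                if R[i][j] is not None]
    for _ in range(epochs):
        for i, j in observed:
            err = R[i][j] - predict(P, Q, i, j)
            for f in range(k):
                p, q = P[i][f], Q[j][f]
                # Gradient step with L2 regularization on both factors.
                P[i][f] += lr * (err * q - reg * p)
                Q[j][f] += lr * (err * p - reg * q)
    return P, Q

def rmse(R, P, Q):
    """Root-mean-square error over the observed entries only."""
    errs = [(R[i][j] - predict(P, Q, i, j)) ** 2
            for i in range(len(R)) for j in range(len(R[0]))
            if R[i][j] is not None]
    return math.sqrt(sum(errs) / len(errs))

# Hypothetical sensitivity matrix (rows: cell lines, columns: drugs);
# None marks a cell line-drug pair that was not measured.
R = [
    [0.90, 0.50, 0.30],
    [0.72, 0.40, None],
    [0.45, 0.25, 0.15],
    [None, 0.10, 0.06],
]

P, Q = factorize(R)
train_rmse = rmse(R, P, Q)
missing_estimate = predict(P, Q, 1, 2)  # estimate for an unmeasured pair
```

Each row of `P` and `Q` places a cell line or a drug in the shared latent space, so relationships between them can be inspected directly from the learned vectors.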
In a recommender system, MF cannot add additional features
and cannot predict a completely new item, as the method is highly
dependent on data from input features. To resolve those issues, in
2010 Rendle introduced a generalized algorithm, the factorization
machine (FM) [95]. FMs are SVM-like predictors but can handle
data with high sparsity (Fig. 8B). Classical FMs can easily handle
second-order feature combinations but struggle with higher-
order feature combinations. Blondel et al. proposed an updated
algorithm for the easy handling of higher-order feature combina-
tions, referred to as higher-order factorization machines (HOFMs)
[96]. So far, HOFMs have not been used in monotherapy response
prediction; however, they have been employed to predict drug
combinations (as described below).
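A second-order FM prediction can be written out as a short sketch. The feature layout (one-hot cell line, one-hot drug, one extra feature) and every parameter value below are assumptions chosen for illustration. The naive pairwise sum is compared with the linear-time reformulation of the pairwise term that is commonly used for FMs.

```python
def fm_predict_naive(x, w0, w, V):
    """Bias + linear terms + all pairwise interactions <v_i, v_j> x_i x_j."""
    n = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            y += dot * x[i] * x[j]
    return y

def fm_predict_fast(x, w0, w, V):
    """Same prediction using 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        y += 0.5 * (s * s - s_sq)
    return y

# Hypothetical sparse input: [cell A, cell B | drug 1, drug 2, drug 3 | extra]
x = [1.0, 0.0, 0.0, 1.0, 0.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.0, 0.3, 0.1, 0.4]
V = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5],
     [0.4, 0.1], [-0.2, 0.2], [0.5, -0.3]]

y_naive = fm_predict_naive(x, w0, w, V)
y_fast = fm_predict_fast(x, w0, w, V)
```

Because every feature has its own latent vector, an interaction weight is available even for a cell line-drug pair that never occurred together in the training data, and additional features simply extend `x`.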
4.4. Autoencoders in monotherapy prediction
An autoencoder is an unsupervised DL model that can be used
to reduce the dimension of features. An autoencoder learns hidden
(latent) variables from the observed data through the mapping of
higher-dimensional data onto a lower-dimensional latent space.
An autoencoder consists of two different types of layers: encoding
layers and decoding layers, with encoding layers projecting higher-
dimensional input data onto lower dimensions and decoding layers
reconstructing the lower-dimensional data back to the higher-
dimensional data similar to input (Fig. 9A). The loss function is
the least squares difference between the input and output vectors.
In this case, if the decoding weights correspond to the encoding
weights, the output will be the same as the input (deterministic
encoding). In general, an autoencoder uses nonlinear activation
functions for data compression and can discover nonlinear
explanatory features; therefore, it can be used to reduce gene
expression features and uncover a biologically relevant latent
space [61,97].
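The reconstruction objective described above can be written compactly. The symbols are assumptions introduced here for illustration: $f_{\theta}$ denotes the encoder, $g_{\phi}$ the decoder, $\mathbf{z}$ the latent representation, and $N$ the number of samples.

```latex
\begin{aligned}
\mathbf{z} &= f_{\theta}(\mathbf{x}), \qquad
\hat{\mathbf{x}} = g_{\phi}(\mathbf{z}), \\
\mathcal{L}(\theta, \phi) &= \frac{1}{N} \sum_{n=1}^{N}
\left\lVert \mathbf{x}_n - g_{\phi}\!\left(f_{\theta}(\mathbf{x}_n)\right) \right\rVert_2^{2}.
\end{aligned}
```

Dimension reduction follows from choosing the dimension of $\mathbf{z}$ to be much smaller than that of $\mathbf{x}$.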
Besides the traditional autoencoder, the variational autoen-
coder (VAE) replaces the deterministic bottleneck layer with
stochastic sampling (mean and standard deviation) vectors
(Fig. 9B). The model includes regularization losses by adding a
Kullback-Leibler (KL) divergence term. This reparameterization
allows for backpropagation optimization and for learning the prob-
ability distribution of each latent variable instead of directly learn-
ing the latent variables [98].
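The two VAE-specific ingredients, the sampling step and the KL term, can be sketched as follows. The sketch assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term. The function names and the log-variance parameterization are illustrative choices, not taken from the cited studies.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence of N(mu, sigma^2) from N(0, 1), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def vae_loss(x, x_hat, mu, log_var):
    """Squared reconstruction error plus the KL regularization term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var)

rng = random.Random(0)
z = reparameterize([1.0, -2.0], [0.0, 0.0], rng)              # unit variance
z_narrow = reparameterize([1.0, -2.0], [-50.0, -50.0], rng)   # near-zero variance
kl_zero = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
kl_shifted = kl_to_standard_normal([1.0], [0.0])
```

Because the randomness enters only through `eps`, the sample remains a differentiable function of the mean and the log-variance, which is what makes gradient-based training possible.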
The DL model to predict drug response (DeepDR) combined
mutational data with gene expression data to develop a monother-
apy prediction model, implementing an autoencoder for both
mutational and gene expression data [60]. In this model, the
autoencoder was first applied to the TCGA data to transform the
mutational and gene expression features into a lower-
dimensional representation. The encoded representations of the
TCGA data were linked to a feedforward neural network trained
on CCLE data for monotherapy prediction. The use of autoencoding
Fig. 8. Matrix factorization and factorization machine. (A) In MF, a matrix is decomposed into two lower-dimensional matrices with the same latent factor. The dot product of
lower-dimensional matrices is used to reconstitute the new matrix to calculate the loss function. (B) An FM transforms sample and features data to the binary representation
and can incorporate additional features.
increased the sample number in the prediction model and, there-
fore, displayed better prediction performance. Besides an autoen-
coder, a VAE was used to reduce the higher-dimensional acute
myeloid leukemia (AML) patient gene expression data to an 8-
dimensional representation, and the VAE was then used to build
a linear regression model (lasso) for drug response prediction
[99]. Later, a drug response VAE (Dr.VAE) was developed using
drug-induced gene expression perturbation [59]. This study used
a semi-supervised VAE to predict monotherapy responses using
cell line data, and the model was shown to perform better than
several linear or nonlinear algorithms. The use of drug-induced
gene expression perturbation seems to be useful in determining
pathways that regulate drug response and therapy resistance
[100]. Nevertheless, anomaly detection with density estimation
can improve the prediction accuracy through false positive detec-
tion, but this still needs to be implemented [101].
4.5. Graph convolutional networks in monotherapy prediction
Therapy response prediction using multiple drugs requires the
incorporation of chemical information about the drugs. This can
be done in several ways. The 2D molecular fingerprint (also known
as the Morgan fingerprint or circular fingerprint) is commonly
measured by the extended-connectivity fingerprint (ECFP) algo-
rithm [102]. This algorithm determines partial structures and con-
verts them into a binary representation. Similarly, the 3D
fingerprint descriptor collects 3D information, including electro-
statics and molecular shape. The simplified molecular input line
entry specification (SMILES) representation was developed by Wei-
ninger and provides a linear notation method [103]. SMILES can be
used directly by a CNN. Molecular graphs are another type of flex-
ible representation of small-molecule drugs. The GraphDRP study
used a molecular graph representation in a GCN to extract molec-
ular features from drugs [104]. At the same time, a CNN was used
to extract genomic features from cell lines. Then, the features from
the GCN and CNN were combined and fed into the fully connected
feedforward neural network for drug sensitivity prediction. The
GCN model was compared to a recently developed CNN model
using the SMILES format to describe the drugs and was found to
perform better, suggesting that the use of graph data for drugs
improves predictive performance [105].
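How a GCN extracts features from a molecular graph can be illustrated with a single propagation step. This is a minimal sketch, not the GraphDRP architecture: the three-atom graph, the one-hot atom features, the weight matrix, and the mean pooling are assumptions, and the symmetric normalization with self-loops is the commonly used GCN formulation.

```python
import math

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def gcn_layer(adj, H, W):
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each atom keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, H), W)
    return [[max(0.0, v) for v in row] for row in out]

def mean_pool(H):
    """Average the atom vectors into one molecule-level vector."""
    return [sum(row[j] for row in H) / len(H) for j in range(len(H[0]))]

# Hypothetical three-atom chain (atoms 0-1-2) with one-hot atom types [C, O].
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
H = [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0]]
W = [[1.0, -1.0],
     [0.0, 1.0]]  # illustrative weights; in practice these are learned

atom_features = gcn_layer(adj, H, W)
drug_vector = mean_pool(atom_features)
```

After one layer, each atom vector mixes information from its bonded neighbors. The pooled `drug_vector` plays the role of the drug representation that would be combined with cell line features in a downstream network.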
4.6. Visible neural networks in monotherapy prediction
Model interpretation is an important research area in ML that
seeks to explain the internal rationale behind a model's prediction.
Biological ML models that were developed with prior knowledge
of network or structural data can be explained relatively easily. A
so-called visible neural network (VNN) incorporates genomic or
transcriptomic data considering the cellular architecture and sig-
naling pathways [62]. Chemical information about drugs was sep-
arately processed and then combined with the embedding
genotype data to develop the final prediction model (DrugCell).
The DrugCell method was compared to the elastic net and other
DNN models and found to have a similar or better predictive
4.7. PDXs and organoids in monotherapy prediction
Although most studies used cell line data to develop ML models,
recently the PDXGEM study applied PDXs to develop an ML model
[106]. In this study, drug activity was calculated as a percentage of
tumor volume changes. Baseline gene expression profiling data
were used to develop the model. Another recent study used data
from 3D organoid culture models and applied protein–protein
interaction networks [107]. The model was trained with pharma-
cogenomic data from two previous studies using ridge regression
[108,109]. This study developed a clinically relevant prediction
model that was also useful in identifying predictive biomarkers
[107]. Collectively, the use of PDXs and organoids in model devel-
opment increases the probability of successful clinical applications.
5. Drug synergy prediction
The use of monotherapy in cancer treatment is relatively rare,
and most cancer patients are treated with a combination of several
drugs. Cancer cells can easily develop resistance to monotherapy,
while the development of resistance to several drugs can be diffi-
cult or take longer. Therefore, combinatorial therapies are pre-
ferred over monotherapy in clinics for cancer treatment. A
combination of multiple drugs can have three different effects:
additive, antagonistic, and synergistic. The additive effect can be
considered a neutral effect, while the antagonistic effect is nega-
tive. The synergistic effect is preferable. Thus, predicting drug syn-
ergy will be highly beneficial for selecting effective combinations
for cancer treatment.
Drug synergy is usually calculated from a cell viability matrix, in
which a wide range of single and combinatorial drug effects are
recorded. The Institute for Molecular Medicine Finland (FIMM)
developed an experimental-computational pipeline to measure and
visualize synergy from drug combinations [110]. It allows for the
simultaneous measurement of several synergy scores, such as Bliss
independence [111], Loewe additivity [112], highest single agent
(HSA) [113], and zero interaction potency (ZIP) [114]. Later, the
study was extended to the prediction of drug combinations
[115]. Combenefit is yet another program for calculating synergy
scores, in particular Loewe additivity [116].
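Two of the reference models named above can be computed directly from fractional inhibition values. The sketch below is not the FIMM pipeline or Combenefit: effects are assumed to be fractional inhibitions between 0 and 1 at a single dose pair, and the tolerance used to label a combination is an arbitrary illustrative choice. Loewe additivity and ZIP are omitted because they require fitted dose-response curves.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined inhibition if the two drugs act independently."""
    return effect_a + effect_b - effect_a * effect_b

def hsa_expected(effect_a, effect_b):
    """Highest single agent: the better of the two monotherapy effects."""
    return max(effect_a, effect_b)

def synergy_excess(observed, expected):
    """Positive values indicate synergy; negative values indicate antagonism."""
    return observed - expected

def classify(excess, tol=0.05):
    """Label a combination; tol is an assumed, illustrative threshold."""
    if excess > tol:
        return "synergistic"
    if excess < -tol:
        return "antagonistic"
    return "additive"

# Hypothetical measurements at one dose pair.
effect_a, effect_b, observed = 0.4, 0.5, 0.8

bliss_excess = synergy_excess(observed, bliss_expected(effect_a, effect_b))
hsa_excess = synergy_excess(observed, hsa_expected(effect_a, effect_b))
label = classify(bliss_excess)
```

The two scores differ because they assume different null models, so the same measurement can look more or less synergistic depending on the reference chosen.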
Several attempts have been made to identify drug synergy using
cell lines from different cancers [117–123]. These studies provided
an initial framework for developing ML algorithms for predicting
drug synergy. A list of available in silico drug synergy prediction
models is given in Table 2.
Fig. 9. Autoencoder and variational autoencoder. (A) The autoencoder determines latent variables by reducing the dimensions during encoding. Then it decodes the data into
a similar form using the latent variables. (B) The VAE uses a similar process, except that the latent variables are replaced by mean and standard deviation vectors.
5.1. Drug synergy prediction using conventional ML methods
In silico methods integrating molecular data with pharmacolog-
ical data could potentially identify drug combinations with some
limitations [124]. A heterogeneous network-assisted inference
(HNAI) framework was developed using drug-drug interaction
pairs connecting approved drugs, phenotypic similarity, therapeu-
tic similarity, chemical structure similarity, and genomic similarity
using naive Bayes, decision tree, k-nearest neighbor (KNN), logistic
regression, and SVM algorithms [125]. Later, the DDIGIP method
was developed to predict drug-drug interactions (DDIs) by
implementing the Gaussian interaction profile (GIP) kernel with a
regularized least squares (RLS) classifier [126]. DDIGIP used the
similarity of drug features extracted from drug substructures,
targets, transporters, enzymes, pathways, indications, side effects,
off-label side effects, and drug-drug interaction data. Collectively, these methods
give valuable insights into drug-drug interactions but cannot pro-
vide information about whether certain drug combinations will be
effective for a specific patient. Gene expression data were used at a
limited scale to predict the effect of drug combinations by the Petri
net model [127], but the model requires gene expression profiles
for every drug pair, which limits its practical applications.
In a DREAM challenge, the human diffuse large B-cell lym-
phoma (DLBCL) cell line OCI-LY3 was treated with 91 compound
pairs of 14 drugs. The drug-induced genomic residual effect
model—which combined similarity and dissimilarity in compound
activity incorporating drug-induced gene perturbation, dose–re-
sponse, and pathway information—was reported to outperform
30 other models [128,129]. Although the accuracy of the predictive
models was not optimal for practical applications, this study raised
the prospect of building computational predictive models for
drug synergy prediction. The gene expression perturbation data
generated in this project are valuable for other studies and can
be used to train random forest models with the biological and
chemical properties of drugs, such as physicochemical properties,
target network distances, and targeted pathways [130]. Similarly,
Cuvitoglu et al. extracted the drug perturbation set of genes for
each drug from the transcriptome profile of Cmap data [131] and
calculated six different features: the distance between two drugs
(M1), the mutual information about biological processes (M2),
the gene ontology similarity (M3), the overlap of drug perturbation
sets (M4), the betweenness centrality of the drug combination net-
work (M5), and the degree of the drug combination network (M6)
[132]. Three models were developed using a naive Bayes classifier,
an SVM, and a random forest algorithm. Different features were
tested, and models combining the M5 and M6 features performed
the best. In addition, the CellBox method used perturbation data of
the melanoma SK-Mel-133 cell line treated with 12 different drugs
[133,134]. Using nonlinear ordinary differential equations (ODEs),
CellBox provided an interpretable ML system that can be used to
predict drug combinations in a dynamic system. This study pro-
vided mechanistic insights for designing a combination therapy
with an understandable predictive model. Taken together, these
studies suggest that drug perturbation data provide important
information about the regulation of biological features that can
be used to develop efficient ML models [100].
Models integrating the signaling network or pathway map have
been used to detect drug combinations with limited general appli-
cations [135–137]. Similarly, synergy prediction models developed
with naive Bayes classifiers [138] and random forest algorithms
[139,140] had limited use for specific cell models. Collectively, syn-
ergy prediction models developed using classical ML algorithms
displayed acceptable predictive performance with specific datasets
but largely lacked generalizability.
5.2. Drug synergy prediction using DL
DL has been employed in the prediction of drug synergy. Using
the NCI-ALMANAC database [141], it has been demonstrated that
the use of gene expression, microRNA, and proteome data, along
with drug descriptors, provides the highest prediction capability
with feedforward neural networks [142]. This model used two sub-
models to separately process drug descriptors and gene expression,
microRNA, and proteome data. The submodels were fully con-
nected neural networks that helped reduce the dimensionality of
the data before they were fed into the final model. This study pro-
vided important insight into the use of DL in feature selection and
model development.
The DeepSynergy study [143] used a previously published drug
synergy dataset [122] to build a DL model and compared it with
several classical ML methods, such as gradient boosting, random
forest algorithms, SVMs, and elastic nets. This feedforward DL
model, which used gene expression data with the chemical fea-
tures of both drugs to predict Loewe additivity, achieved consider-
able accuracy. The use of DL allowed the model to perform better
than other ML algorithms, but it should also be tested with
unknown samples.
Recently, transformer-boosted DL (TranSynergy) was developed,
in which three components were used: input dimension
reduction, a self-attention transformer, and a fully connected out-
put layer [144]. The input vector contained selected features from
two drugs (drug-target interaction profile) and the cell line (gene
Table 2
Studies predicting drug synergy.
Year | Study name | Data | Algorithm | Ref.
2015 | RACS | DCDB [151], KEGG, NCI-DREAM | Semi-supervised learning | [118]
2017 | Li et al. | DREAM [128] | Random forest | [130]
     | Gayvert et al. | Held et al. [120] | Random forest | [140]
     | SynGeNet | LINCS L1000, Held et al. [120] | Network-based | [136]
2018 | Xia et al. | NCI-ALMANAC [141] | DL | [142]
     | DeepSynergy | O’Neil 2016 [122] | DL | [143]
     | Deep belief | DREAM [128] | Restricted Boltzmann machine | [150]
2019 | SynGeNet | LINCS L1000 | Network-based | [137]
     | DREAM | CNV, mutation, methylation, and gene expression | Multiple | [117]
     | DDIGIP | DrugBank, SIDER, OFFSIDES | Regularized least squares | [126]
     | Cuvitoglu et al. | DCDB [151], Cmap [131] | Naive Bayes, support vector machine, and random forest | [132]
     | Malyutina et al. | O’Neil 2016 [122] | Elastic net, random forest, support vector machine | [115]
2020 | DeepGraph | O’Neil 2016 [122] | Graph convolutional network | [147]
     | comboFM | NCI-ALMANAC [141] | Higher-order factorization machines | [145]
2021 | CellBox | Perturbation data [134] | ODE | [133]
     | AuDNNsynergy | O’Neil 2016 [122] | Autoencoder | [146]
     | TranSynergy | O’Neil 2016 [122] | Transformer-boosted DL | [144]
expression). A fourth dimension was added if both gene expression
and gene dependency were used. The use of cell-line-gene depen-
dency, gene-gene interaction, and drug-target interaction provided
TranSynergy with a considerably higher predictive performance
and allowed the cellular effect of drug actions to be explained.
These methods provided a significant improvement over traditional
ML methods due to appropriate feature learning. However,
all those models used cell line synergy data [122], which
might limit their application in preclinical and/or clinical trial
5.3. Synergy prediction with a higher-order factorization machine
An HOFM model [96] was used in comboFM to capture fifth-
order feature combinations using data from two drugs, cell lines,
and dose–response matrices [145]. The model integrated chemical
descriptors of drugs and gene expression data of cell lines as addi-
tional features. comboFM was trained with a part of the NCI-
ALMANAC data, while the other part of the data was used for pre-
dictive performance testing. The fifth-order comboFM was found
to perform significantly better than second- and first-order predic-
tors, suggesting that the use of higher-order feature combinations
can improve predictive performance.
5.4. Synergy prediction using an autoencoder
An autoencoder has also been employed to predict drug synergy
[146]. AuDNNsynergy used multi-omics data from CCLE and TCGA
databases combined with previously published drug synergy data
[122]. In this study, three independent autoencoders were used
to reduce the dimensions of TCGA gene expression, mutation,
and copy number data. The reduced dimensions were then com-
bined with drug combination data to develop the model. The
model was compared with the recently developed DeepSynergy
model and was shown to perform better [143], suggesting that fea-
ture reduction using an autoencoder and the use of multi-omics
data influence predictive performance.
5.5. Synergy prediction with a graph convolutional network
A graph convolutional network (GCN) model was described
(DeepGraph) in which a drug-drug synergy network, a drug-
target interaction network, and a protein–protein interaction net-
work were used to build a cell-line-specific model [147]. In the
DeepGraph study, a cell-line-specific multirelational network
graph was generated and fed into the GCN encoder. A four-layer
neural network with a ReLU activation function was used for encoding,
and a sigmoid activation function was used for the embedding
output vector. The matrix decoder was used to decode the embed-
ding vector, which predicts the synergy score [74]. The prediction
performance of DeepGraph was comparable to that of DeepSyn-
ergy. Because the DeepGraph method used a cell-line-specific
drug-protein network and protein–protein interaction network
and because only limited data for drug-protein interactions were
available, the method’s performance might be biased.
5.6. Restricted Boltzmann machine for predicting drug synergy
The restricted Boltzmann machine (RBM) is a generative proba-
bilistic model that has been widely used for handling higher-
dimensional data [148]. The RBM is similar in function to an
autoencoder and can be used to extract meaningful features from
higher-dimensional data. Furthermore, multiple RBMs can be
stacked to form a deep belief network, which allows unsupervised
and supervised data to be combined. RBMs have been used to iden-
tify gene expression biomarkers that can help predict clinical out-
comes [149]. Chen et al. used RBMs to develop a deep belief
network [150] from the DREAM consortium’s drug target informa-
tion and baseline gene expression data [128]. Although the model
was compared with existing DREAM consortium models and was
shown to outperform these models, the leave-one-out approach
that was adopted in this study was not comparable to the original
DREAM consortium models, which were compared with external
6. Limitations in the development of clinically relevant
predictive models
Currently, most ML models have been developed using cell line
data. Cell line data are robust, relatively easy to generate, and use-
ful for hypothesis generation. However, cell line data must be com-
plemented with more disease-relevant patient data. A large-scale
pharmacogenomic study using patient data is currently technically
difficult because it requires a lot of primary patient materials. This
can potentially be overcome by using PDXs. The recent develop-
ment of PDX repositories will support large-scale clinically rele-
vant studies in the near future [37–40].
Most tumors grow in a multicellular environment in which the
surrounding cells create a favorable microenvironment for tumor
growth. Prediction models based on cell line data do not capture
the microenvironment’s contributions and might therefore never
reach the level of accuracy that is necessary in the clinic. Cultured
tumor organoids can likely mimic the microenvironment of a
patient’s tumor [107]. However, currently, only limited pharma-
cogenomic data from tumor organoids are available.
Several recent models used multi-omics data to build predictive
models [62,87,92]. Although the use of multi-omics data can
improve the prediction performance and can be very useful for
research purposes, it limits the practical use of the models in the
clinic. For prediction purposes, it would be costly and time-consuming
to determine mutations, CNVs, promoter methylation,
protein expression, gene expression, etc. for each patient separately.
Gene expression data can potentially reflect most cellular
processes because mutations, CNVs, and promoter methylation
might ultimately determine gene expression changes.
Most gene expression data currently available involve the baseline
expression of genes and do not reflect drug-induced perturbations
[24,28,30,80]. A few studies provided a limited amount of
drug-induced perturbation data, which were found to be very useful
for feature selection [59,134]. Thus, large-scale drug-induced
perturbation studies will help to develop better predictive models.
Nevertheless, drug synergy prediction is an important concept
that will have numerous uses in the clinic. At the same time, a
combination of several drugs can have severe adverse effects. Thus,
a comprehensive method is needed that will not only determine
drug synergy but also incorporate the adverse effect of drug com-
binations. Knowledge of safe and unsafe combinations of drugs
was used to build a linear regression prediction model [152–
154]. However, the model did not incorporate any biological data
to elucidate patient-specific side effects.
Several studies have highlighted implementation challenges
encountered in precision medicine solutions [155,156]. These chal-
lenges include data preprocessing, unstructured clinical text pro-
cessing, medical data processing and storage, and environmental
data collections. Apart from these challenges, the major challenge
might be the redesigning of clinical decision support systems so
that they can incorporate molecular, omics, and environmental
aspects of precision medicine. A comprehensive support system
is desirable to facilitate the curation of data from different sources
and multiple scales and to promote the interaction between bioin-
formatics and clinical informatics [155]. Building such a system
requires solving many integration and standardization issues.
As pointed out by many studies, model explainability, high-
quality training data, and collaborations between medical experts
and computational experts are some of the key factors affecting
the success of ML solutions for drug response prediction in cancer
treatment [9,157]. Although much omics information is available
and many theoretical frameworks exist, hands-on ML tools targeted
at physicians and medical professionals are scarce. In that
regard, various cloud-based cancer prediction tools, such as
OASISPRO [158], can be introduced to make ML solutions suitable for
large-scale clinical practice. Another study gave an overview of
general-purpose multi-omics tools that can be useful for gene
identification and cancer subtyping [159].
Clinical trials are essential for clinical research in general and
cancer treatment in particular. The three-phase trial approach is
considered standard practice but is designed primarily for gradu-
ally improving treatments. Our ability to understand and treat can-
cer has, however, evolved over time [21]. Because of the immense
role of ML in both clinical trials and clinical practice, the inclusion
of ML in regulatory frameworks is unavoidable.
7. Conclusion
The development of predictive models for monotherapy and
combinatorial therapies is important but highly challenging. The
recent advancement in ML algorithms holds promise for the devel-
opment of clinically relevant predictive models. Furthermore, more
pharmacogenomic data from disease-relevant organoids and PDXs
are becoming available, allowing clinical biases to be overcome.
Massive computational power is within easy reach for handling a
large amount of data that is exponentially increasing. In the near
future, the current lack of clinically relevant pharmacogenomic
data might also be overcome. Therefore, although current predic-
tive models are far from being ready for clinical use, they show
us a clear path toward precision medicine.
CRediT authorship contribution statement
Raihan Rafique: Writing - original draft, Writing - review &
editing. S.M. Riazul Islam: Writing - original draft, Writing -
review & editing. Julhash U. Kazi: Conceptualization, Writing -
original draft, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared
to influence the work reported in this paper.
Acknowledgements
This research was supported by the Crafoord Foundation (JUK),
the Swedish Cancer Society (JUK), and the Swedish Childhood Cancer
Foundation (JUK). Open Access funding is provided by Lund
References
[1] Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine
learning applications in cancer prognosis and prediction. Comput. Struct.
Biotechnol. J. 2015;13:8–17.
[2] Sharma A, Rani R. A systematic review of applications of machine learning in
cancer prediction and diagnosis. Arch. Comput. Methods Eng. 2021. https://
[3] Hamamoto R, Suvarna K, Yamada M, Kobayashi K, Shinkai N, Miyake M, et al.
Application of artificial intelligence technology in oncology: towards the
establishment of precision medicine. Cancers (Basel) 2020;12:3532. https://
[4] Putora PM, Baudis M, Beadle BM, El Naqa I, Giordano FA, Nicolay NH.
Oncology informatics: status quo and outlook. Oncology 2020;98(Suppl.
[5] Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci.
[6] Huang S, Yang J, Fong S, Zhao QI. Artificial intelligence in cancer diagnosis and
prognosis: opportunities and challenges. Cancer Lett. 2020;471:61–71.
[7] Nardini C. Machine learning in oncology: a review. Ecancermedicalscience
[8] Filipp FV. Opportunities for artificial intelligence in advancing precision
medicine. Curr. Genet. Med. Rep. 2019;7(4):208–13.
[9] Azuaje F. Artificial intelligence for precision oncology: beyond patient
stratification. NPJ Precis. Oncol. 2019;3:6.
[10] Patel SK, George B, Rai V. Artificial intelligence to decode cancer mechanism:
beyond patient stratification for precision oncology. Front. Pharmacol.
[11] Singer J, Irmisch A, Ruscheweyh HJ, Singer F, Toussaint NC, Levesque MP,
et al. Bioinformatics for precision oncology. Brief Bioinform
2019;20:778–88.
[12] Nicora G, Vitali F, Dagliati A, Geifman N, Bellazzi R. Integrated multi-omics
analyses in oncology: a review of machine learning methods and tools. Front.
Oncol. 2020;10:1030.
[13] Bera K, Schalper KA, Rimm DL, Velcheti V, Madabhushi A. Artificial
intelligence in digital pathology - new tools for diagnosis and precision
oncology. Nat. Rev. Clin. Oncol. 2019;16(11):703–15.
[14] Grothen AE, Tennant B, Wang C, Torres A, Bloodgood Sheppard B, Abastillas G,
et al. Application of artificial intelligence methods to pharmacy data for
cancer surveillance and epidemiology research: a systematic review. JCO Clin.
Cancer Inform. 2020;4:1051–8.
[15] Biswas N, Chakrabarti S. Artificial intelligence (AI)-based systems biology
approaches in multi-omics data analysis of cancer. Front. Oncol.
[16] Chiu YC, Chen HH, Gorthi A, Mostavi M, Zheng S, Huang Y, et al. Deep learning
of pharmacogenomics resources: moving towards precision oncology. Brief
Bioinform 2020;21:2066–83.
[17] Adam G, Rampasek L, Safikhani Z, Smirnov P, Haibe-Kains B, Goldenberg A.
Machine learning approaches to drug response prediction: challenges and
recent progress. NPJ Precis. Oncol. 2020;4:19.
[18] Cuocolo R, Caruso M, Perillo T, Ugga L, Petretta M. Machine Learning in
oncology: a clinical appraisal. Cancer Lett. 2020;481:55–62.
[19] Tanoli Z, Vaha-Koskela M, Aittokallio T. Artificial intelligence, machine
learning, and drug repurposing in cancer. Expert Opin. Drug Discov.
[20] Rauschert S, Raubenheimer K, Melton PE, Huang RC. Machine learning and
clinical epigenetics: a review of challenges for diagnosis and classification.
Clin. Epigenet. 2020;12:51.
[21] Li A, Bergan RC. Clinical trial design: past, present, and future in the context of
big data and precision medicine. Cancer 2020;126(22):4838–46.
[22] Fountzilas E, Tsimberidou AM. Overview of precision oncology trials:
challenges and opportunities. Expert. Rev. Clin. Pharmacol. 2018;11
[23] Li X, Warner JL. A review of precision oncology knowledgebases for
determining the clinical actionability of genetic variants. Front. Cell Dev.
Biol. 2020;8:48.
[24] Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al.
The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer
drug sensitivity. Nature 2012;483(7391):603–7.
[25] Ghandi M, Huang FW, Jané-Valbuena J, Kryukov GV, Lo CC, McDonald ER,
et al. Next-generation characterization of the Cancer Cell Line Encyclopedia.
Nature 2019;569(7757):503–8.
[26] Shoemaker RH. The NCI60 human tumour cell line anticancer drug screen.
Nat. Rev. Cancer 2006;6(10):813–23.
[27] Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, et al. A
landscape of pharmacogenomic interactions in cancer. Cell 2016;166
[28] Garnett MJ, Edelman EJ, Heidorn SJ, Greenman CD, Dastur A, Lau KW, et al.
Systematic identification of genomic markers of drug sensitivity in cancer
cells. Nature 2012;483(7391):570–5.
[29] Haverty PM, Lin E, Tan J, Yu Y, Lam B, Lianoglou S, et al. Reproducible
pharmacogenomic profiling of cancer cell line panels. Nature 2016;533
[30] Basu A, Bodycombe N, Cheah J, Price E, Liu Ke, Schaefer G, et al. An interactive
resource to identify cancer genetic and lineage dependencies targeted by
small molecules. Cell 2013;154(5):1151–61.
[31] Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price EV, Coletti ME, et al.
Harnessing connectivity in a large-scale small-molecule sensitivity dataset.
Cancer Discov. 2015;5(11):1210–23.
[32] Smirnov P, Kofia V, Maru A, Freeman M, Ho C, El-Hachem N, et al.
PharmacoDB: an integrative database for mining in vitro anticancer drug
screening studies. Nucleic Acids Res 2018;46:D994–D1002.
[33] Rajapakse VN, Luna A, Yamade M, Loman L, Varma S, Sunshine M, et al.
CellMinerCDB for integrative cross-database genomics and
pharmacogenomics analyses of cancer cell lines. iScience 2018;10:247–64.
R. Rafique, S.M. Riazul Islam and J.U. Kazi Computational and Structural Biotechnology Journal 19 (2021) 4003–4017
[34] Luna A, Elloumi F, Varma S, Wang Y, Rajapakse VN, Aladjem MI, et al.
CellMiner Cross-Database (CellMinerCDB) version 1.2: Exploration of
patient-derived cancer cell line pharmacogenomics. Nucleic Acids Res. 2021;49:
[35] Zagidullin B, Aldahdooh J, Zheng S, Wang W, Wang Y, Saad J, et al. DrugComb:
an integrative cancer drug combination data portal. Nucleic Acids Res
[36] Liu H, Zhang W, Zou B, Wang J, Deng Y, Deng L. DrugCombDB: a
comprehensive database of drug combinations toward the discovery of
combinatorial therapy. Nucleic Acids Res. 2020;48:D871–81.
[37] Conte N, Mason JC, Halmagyi C, Neuhauser S, Mosaku A, Yordanova G, et al.
PDX Finder: A portal for patient-derived tumor xenograft model discovery.
Nucleic Acids Res 2019;47:D1073–9.
[38] Townsend EC, Murakami MA, Christodoulou A, Christie AL, Köster J, DeSouza
TA, et al. The Public Repository of Xenografts Enables Discovery and
Randomized Phase II-like Trials in Mice. Cancer Cell 2016;29(4):574–86.
[39] PDMR (2021) NCI’s Patient-derived Models Repository. https://
[40] Hidalgo M, Amant F, Biankin AV, Budinská E, Byrne AT, Caldas C, et al. Patient-
derived xenograft models: an emerging platform for translational cancer
research. Cancer Discov. 2014;4(9):998–1013.
[41] Gao H, Korn JM, Ferretti S, Monahan JE, Wang Y, Singh M, et al. High-
throughput screening using patient-derived tumor xenografts to predict
clinical trial drug response. Nat. Med. 2015;21(11):1318–25.
[42] Mer AS, Ba-Alawi W, Smirnov P, Wang YX, Brew B, Ortmann J, et al.
Integrative Pharmacogenomics Analysis of Patient-Derived Xenografts.
Cancer Res. 2019;79(17):4539–50.
[43] Klijn C, Durinck S, Stawiski EW, Haverty PM, Jiang Z, Liu H, et al. A
comprehensive transcriptional portrait of human cancer cell lines. Nat.
Biotechnol. 2015;33(3):306–12.
[44] Greshock J, Bachman KE, Degenhardt YY, Jing J, Wen YH, Eastman S, et al.
Molecular target class is predictive of in vitro response profile. Cancer Res.
[45] Mpindi JP, Yadav B, Östling P, Gautam P, Malani D, Murumägi A, et al.
Consistency in drug response profiling. Nature 2016;540(7631):E5–6.
[46] Haibe-Kains B, El-Hachem N, Birkbak NJ, Jin AC, Beck AH, Aerts HJWL, et al.
Inconsistency in large pharmacogenomic studies. Nature 2013;504
[47] The Cancer Cell Line Encyclopedia and Genomics of Drug Sensitivity in Cancer
Investigators. Pharmacogenomic agreement between two cancer cell line
data sets. Nature 2015;528:84–7.
[48] Safikhani Z, Smirnov P, Thu KL, Silvester J, El-Hachem N, Quevedo R, et al.
Gene isoforms as expression-based biomarkers predictive of drug response
in vitro. Nat. Commun. 2017;8(1).
[49] Costello JC, Heiser LM, Georgii E, Gönen M, Menden MP, Wang NJ, et al. A
community effort to assess and improve drug sensitivity prediction
algorithms. Nat. Biotechnol. 2014;32(12):1202–12.
[50] Jang IS, Neto EC, Guinney J, Friend SH, Margolin AA. Systematic assessment of
analytical methods for drug sensitivity prediction from cancer cell line data.
Pac. Symp. Biocomput. 2014:63–74.
[51] Ali M, Aittokallio T. Machine learning and feature selection for drug response
prediction in precision oncology applications. Biophys. Rev. 2019;11(1):31–9.
[52] Koras K, Juraeva D, Kreis J, Mazur J, Staub E, Szczurek E. Feature selection
strategies for drug sensitivity prediction. Sci. Rep. 2020;10:9377.
[53] Ali M, Khan SA, Wennerberg K, Aittokallio T. Global proteomics profiling
improves drug sensitivity prediction: results from a multi-omics, pan-cancer
modeling approach. Bioinformatics 2018;34:1353–62.
[54] Papillon-Cavanagh S, De Jay N, Hachem N, Olsen C, Bontempi G, Aerts HJWL,
et al. Comparison and validation of genomic predictors for anticancer drug
sensitivity. J. Am. Med. Inform. Assoc. 2013;20(4):597–602.
[55] Stetson LC, Pearl T, Chen Y, Barnholtz-Sloan JS. Computational identification
of multi-omic correlates of anticancer therapeutic response. BMC Genom.
2014;15(Suppl 7):S2.
[56] Ding C, Peng H. Minimum redundancy feature selection from microarray gene
expression data. J. Bioinform. Comput. Biol. 2005;03(02):185–205.
[57] Lin TH, Li HT, Tsai KC. Implementing the Fisher’s discriminant ratio in a k-
means clustering algorithm for feature selection and data set trimming. J.
Chem. Inf. Comput. Sci. 2004;44(1):76–87.
[58] Nakajo M, Jinguji M, Tani A, Hirahara D, Nagano H, Takumi K, et al.
Application of a machine learning approach to characterization of liver
function using (99m)Tc-GSA SPECT/CT. Abdom Radiol (NY) 2021;46
[59] Rampasek L, Hidru D, Smirnov P, Haibe-Kains B, Goldenberg A. Dr.VAE:
improving drug response prediction via modeling of drug perturbation
effects. Bioinformatics 2019;35:3743–51.
[60] Chiu YC, Chen HIH, Zhang T, Zhang S, Gorthi A, Wang LJ, et al. Predicting drug
response of tumors from integrated genomic profiles by deep neural
networks. BMC Med. Genom. 2019;12(S1).
[61] Li M, Wang Y, Zheng R, Shi X, Li Y, Wu FX, et al. DeepDSC: a deep learning
method to predict drug sensitivity of cancer cell lines. IEEE/ACM Trans.
Comput. Biol. Bioinform. 2021;18(2):575–82.
[62] Kuenzi BM, Park J, Fong SH, Sanchez KS, Lee J, Kreisberg JF, et al. Predicting
drug response and synergy using a deep learning model of human cancer
cells. Cancer Cell 2020;38(672–684):e676.
[63] Liu X, Li N, Liu S, Wang J, Zhang N, Zheng X, et al. Normalization methods for
the analysis of unbalanced transcriptome data: a review. Front. Bioeng.
Biotechnol. 2019;7.
[64] Reinders C, Ackermann H, Yang MY, Rosenhahn B. Learning convolutional
neural networks for object detection with very little training data.
Multimodal Scene Understanding 2019:65–100.
[65] Jonsson A. Deep reinforcement learning in medicine. Kidney Dis. (Basel)
[66] Triguero I, García S, Herrera F. Self-labeled techniques for semi-supervised
learning: taxonomy, software and empirical study. Knowl. Inf. Syst. 2015;42
[67] Podgorelec V, Kokol P, Stiglic B, Rozman I. Decision trees: an overview and
their use in medicine. J. Med. Syst. 2002;26:445–63.
[68] Breiman L. Random forests. Mach. Learn. 2001;45:5–32.
[69] Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning
and an application to boosting. Computational Learning Theory, EuroCOLT
[70] Friedman J, Hastie T, Tibshirani R. Special Invited Paper. Additive logistic
regression: a statistical view of boosting. Ann. Stat. 2000;28:337–74.
[71] Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. arXiv 2016,
[72] Muller KR, Mika S, Ratsch G, Tsuda K, Scholkopf B. An introduction to kernel-
based learning algorithms. IEEE Trans. Neural Netw. 2001;12(2):181–201.
[73] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
[74] Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional
Networks. arXiv 2017:1609.02907v4.
[75] Sun M, Zhao S, Gilvary C, Elemento O, Zhou J, Wang F. Graph convolutional
networks for computational drug development and discovery. Brief
Bioinform 2020;21:919–35.
[76] van ’t Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AAM, Mao M, et al. Gene
expression profiling predicts clinical outcome of breast cancer. Nature
[77] Kaklamani V. A genetic signature can predict prognosis and response to
therapy in breast cancer: oncotype DX. Expert Rev. Mol. Diagn. 2006;6
[78] Parker JS, Mullins M, Cheang MCU, Leung S, Voduc D, Vickery T, et al.
Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin.
Oncol. 2009;27(8):1160–7.
[79] Geeleher P, Cox NJ, Huang R. Clinical drug response can be predicted using
baseline gene expression levels and in vitro drug sensitivity in cell lines.
Genome Biol. 2014;15(3):R47.
[80] Menden MP, Iorio F, Garnett M, McDermott U, Benes CH, Ballester PJ, et al.
Machine learning prediction of cancer cell sensitivity to drugs based on
genomic and chemical properties. PLoS ONE 2013;8(4):e61318. https://doi.
[81] Cortes-Ciriano I, van Westen GJ, Bouvier G, Nilges M, Overington JP, Bender A,
et al. Improved large-scale prediction of growth inhibition patterns using the
NCI60 cancer cell line panel. Bioinformatics 2016;32:85–95.
[82] Zhang N, Wang H, Fang Y, Wang J, Zheng X, Liu XS, et al. Predicting anticancer
drug responses using a dual-layer integrated cell line-drug network model.
PLoS Comput. Biol. 2015;11(9):e1004498.
[83] Zhang F, Wang M, Xi J, Yang J, Li A. A novel heterogeneous network-based
method for drug response prediction in cancer cell lines. Sci. Rep.
[84] Gonen M, Margolin AA. Drug susceptibility prediction against a panel of drugs
using kernelized Bayesian multitask learning. Bioinformatics 2014;30:
[85] Ammad-Ud-Din M, Khan SA, Wennerberg K, Aittokallio T. Systematic
identification of feature combinations for predicting drug response with
Bayesian multi-view multi-task linear regression. Bioinformatics 2017;33:
[86] Ammad-ud-din M, Georgii E, Gönen M, Laitinen T, Kallioniemi O, Wennerberg
K, et al. Integrative and personalized QSAR analysis in cancer by kernelized
Bayesian matrix factorization. J. Chem. Inf. Model. 2014;54(8):2347–59.
[87] Lee SI, Celik S, Logsdon BA, Lundberg SM, Martins TJ, Oehler VG, et al. A
machine learning approach to integrate big data for precision medicine in
acute myeloid leukemia. Nat. Commun. 2018;9(1).
[88] Aliper A, Plis S, Artemov A, Ulloa A, Mamoshina P, Zhavoronkov A. Deep
Learning Applications for Predicting Pharmacological Properties of Drugs and
Drug Repurposing Using Transcriptomic Data. Mol. Pharm. 2016;13
[89] Deng L, Cai Y, Zhang W, Yang W, Gao B, Liu H. Pathway-Guided Deep Neural
Network toward Interpretable and Predictive Modeling of Drug Sensitivity. J.
Chem. Inf. Model. 2020;60(10):4497–505.
[90] Sakellaropoulos T, Vougas K, Narang S, Koinis F, Kotsinas A, Polyzos A, et al. A
Deep Learning Framework for Predicting Response to Therapy in Cancer. Cell
Rep 2019;29(11):3367–3373.e4.
[91] Chang Y, Park H, Yang HJ, Lee S, Lee KY, Kim TS, et al. Cancer Drug Response
Profile scan (CDRscan): A Deep Learning Model That Predicts Drug
Effectiveness from Cancer Genomic Signature. Sci. Rep. 2018;8(1). https://doi.
[92] Sharifi-Noghabi H, Zolotareva O, Collins CC, Ester M. MOLI: multi-omics late
integration with deep neural networks for drug response prediction.
Bioinformatics 2019;35:i501–9.
[93] Koren Y, Bell R, Volinsky C. Matrix Factorization Techniques for Recommender
Systems. Computer 2009;42:30–7.
[94] Suphavilai C, Bertrand D, Nagarajan N. Predicting Cancer Drug Response using
a Recommender System. Bioinformatics 2018;34:3907–14.
[95] Rendle S. Factorization Machines. In: 2010 IEEE International Conference
on Data Mining. IEEE; 2010. p. 995–1000.
[96] Blondel M, Fujino A, Ueda N, Ishihata M. Higher-Order Factorization
Machines. In: 30th Conference on Neural Information Processing Systems
NIPS. p. 3351–9.
[97] Way GP, Greene CS. Extracting a biologically relevant latent space from
cancer transcriptomes with variational autoencoders. Pac Symp Biocomput
[98] Kingma DP, Welling M. Auto-Encoding Variational Bayes. arXiv
[99] Dincer AV, Celik S, Hiranuma N, Lee SI. DeepProfile: Deep learning of cancer
molecular profiles for precision medicine. bioRxiv 2018.
[100] Shah K, Ahmed M, Kazi JU. The Aurora kinase/beta-catenin axis contributes to
dexamethasone resistance in leukemia. npj Precis. Oncol. 2021;5:13. https://
[101] Nachman B, Shih D. Anomaly detection with density estimation. Phys Rev D
[102] Rogers D, Hahn M. Extended-connectivity fingerprints. J. Chem. Inf. Model.
[103] Weininger D. SMILES, a chemical language and information system. 1.
Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci.
[104] Nguyen TT, Nguyen GTT, Nguyen T, Le DH. Graph convolutional networks for
drug response prediction. IEEE/ACM Trans Comput Biol Bioinform 2021.
[105] Liu P, Li H, Li S, Leung KS. Improving prediction of phenotypic drug response
on cancer cell lines using deep convolutional network. BMC Bioinf.
[106] Kim Y, Kim D, Cao B, Carvajal R, Kim M. PDXGEM: patient-derived tumor
xenograft-based gene expression model for predicting clinical response to
anticancer therapy in cancer patients. BMC Bioinf. 2020;21:288.
[107] Kong JH, Lee H, Kim D, Han SK, Ha D, Shin K, et al. Network-based machine
learning in colorectal and bladder organoid models predicts anti-cancer drug
efficacy in patients. Nat. Commun. 2020;11(1).
[108] van de Wetering M, Francies HE, Francis JM, Bounova G, Iorio F, Pronk A, et al.
Prospective derivation of a living organoid biobank of colorectal cancer
patients. Cell 2015;161(4):933–45.
[109] Lee SH, Hu W, Matulay JT, Silva MV, Owczarek TB, Kim K, et al. Tumor
Evolution and Drug Response in Patient-Derived Organoid Models of Bladder
Cancer. Cell 2018;173(2):515–528.e17.
[110] He L, Kulesskiy E, Saarela J, Turunen L, Wennerberg K, Aittokallio T, et al.
Methods for High-throughput Drug Combination Screening and Synergy
Scoring. Methods Mol. Biol. 2018;1711:351–98.
[111] Bliss CI. The toxicity of poisons applied jointly. Ann. Appl. Biol.
[112] Loewe S. The problem of synergism and antagonism of combined drugs.
Arzneimittelforschung 1953;3:285–90.
[113] Tan X, Hu L, Luquette LJ, Gao G, Liu Y, Qu H, et al. Systematic identification of
synergistic drug pairs targeting HIV. Nat. Biotechnol. 2012;30(11):1125–30.
[114] Yadav B, Wennerberg K, Aittokallio T, Tang J. Searching for Drug Synergy in
Complex Dose-Response Landscapes Using an Interaction Potency Model.
Comput. Struct. Biotechnol. J. 2015;13:504–13.
[115] Malyutina A, Majumder MM, Wang W, Pessia A, Heckman CA, Tang J, et al.
Drug combination sensitivity scoring facilitates the discovery of synergistic
and efficacious drug combinations in cancer. PLoS Comput. Biol. 2019;15(5):
[116] Di Veroli GY, Fornari C, Wang D, Mollard S, Bramhall JL, Richards FM, et al.
Combenefit: an interactive platform for the analysis and visualization of drug
combinations. Bioinformatics 2016;32(18):2866–8.
[117] Menden MP, Wang D, Mason MJ, Szalai B, Bulusu KAC, Guan Y, et al.
Community assessment to advance computational prediction of cancer drug
combinations in a pharmacogenomic screen. Nat. Commun. 2019;10(1).
[118] Sun Y, Sheng Z, Ma C, Tang K, Zhu R, Wu Z, et al. Combining genomic and
network characteristics for extended capability in predicting synergistic
drugs for cancer. Nat. Commun. 2015;6(1).
[119] Roller DG, Axelrod M, Capaldo BJ, Jensen K, Mackey A, Weber MJ, et al.
Synthetic lethal screening with small-molecule inhibitors provides a
pathway to rational combination therapies for melanoma. Mol. Cancer
Ther. 2012;11(11):2505–15.
[120] Held MA, Langdon CG, Platt JT, Graham-Steed T, Liu Z, Chakraborty A, et al.
Genotype-selective combination therapies for melanoma identified by high-
throughput drug screening. Cancer Discov 2013;3(1):52–67.
[121] Kang Y, Hodges A, Ong E, Roberts W, Piermarocchi C, Paternostro G, et al.
Identification of drug combinations containing imatinib for treatment of
BCR-ABL+ leukemias. PLoS ONE 2014;9(7):e102221.
[122] O’Neil J, Benita Y, Feldman I, Chenard M, Roberts B, Liu Y, et al. An Unbiased
Oncology Compound Screen to Identify Novel Combination Strategies. Mol.
Cancer Ther. 2016;15(6):1155–62.
[123] Chan GKY, Wilson S, Schmidt S, Moffat JG. Unlocking the Potential of High-
Throughput Drug Combination Assays Using Acoustic Dispensing. J Lab
Autom 2016;21(1):125–32.
[124] Zhao XM, Iskar M, Zeller G, Kuhn M, van Noort V, Bork P, et al. Prediction of
drug combinations by integrating molecular and pharmacological data. PLoS
Comput. Biol. 2011;7(12):e1002323.
[125] Cheng F, Zhao Z. Machine learning-based prediction of drug-drug interactions
by integrating drug phenotypic, therapeutic, chemical, and genomic
properties. J Am Med Inform Assoc 2014;21:e278–286.
[126] Yan C, Duan G, Pan Y, Wu FX, Wang J. DDIGIP: predicting drug-drug
interactions based on Gaussian interaction profile kernels. BMC Bioinf.
[127] Jin G, Zhao H, Zhou X, Wong STC. An enhanced Petri-net model to predict
synergistic effects of pairwise drug combinations from gene microarray data.
Bioinformatics 2011;27(13):i310–6.
[128] Bansal M, Yang J, Karan C, Menden MP, Costello JC, Tang H, et al. A
community computational challenge to predict the activity of pairs of
compounds. Nat. Biotechnol. 2014;32(12):1213–22.
[129] Goswami CP, Cheng L, Alexander PS, Singal A, Li L. A New Drug Combinatory
Effect Prediction Algorithm on the Cancer Cell Based on Gene Expression and
Dose-Response Curve. CPT Pharmacometrics Syst Pharmacol 2015;4
[130] Li X, Xu Y, Cui H, Huang T, Wang D, Lian B, et al. Prediction of synergistic anti-
cancer drug combinations based on drug target network and drug induced
gene expression profiles. Artif. Intell. Med. 2017;83:35–43.
[131] Lamb J. The Connectivity Map: a new tool for biomedical research. Nat. Rev.
Cancer 2007;7(1):54–60.
[132] Cuvitoglu A, Zhou JX, Huang S, Isik Z. Predicting drug synergy for precision
medicine using network biology and machine learning. J Bioinform Comput
Biol 2019;17(02):1950012.
[133] Yuan B, Shen C, Luna A, Korkut A, Marks DS, Ingraham J, et al. CellBox:
Interpretable Machine Learning for Perturbation Biology with Application to
the Design of Cancer Combination Therapy. Cell Syst 2021;12(128–140):
[134] Korkut A, Wang W, Demir E, Aksoy BA, Jing X, Molinelli EJ, et al. Perturbation
biology nominates upstream-downstream drug combinations in RAF
inhibitor resistant melanoma cells. Elife 2015;4:e04640.
[135] Morris MK, Clarke DC, Osimiri LC, Lauffenburger DA. Systematic Analysis of
Quantitative Logic Model Ensembles Predicts Drug Combination Effects on
Cell Signaling Networks. CPT Pharmacometrics Syst Pharmacol 2016;5
[136] Regan KE, Payne PRO, Li F. Integrative network and transcriptomics-based
approach predicts genotype-specific drug combinations for melanoma. AMIA
Jt Summits Transl Sci Proc 2017;2017:247–56.
[137] Regan-Fendt KE, Xu J, DiVincenzo M, Duggan MC, Shakya R, Na R, et al.
Synergy from gene expression and network mining (SynGeNet) method
predicts synergistic drug combinations for diverse melanoma genomic
subtypes. npj Syst. Biol. Appl. 2019;5(1).
[138] Li P, Huang C, Fu Y, Wang J, Wu Z, Ru J, et al. Large-scale exploration and
analysis of drug combinations. Bioinformatics 2015;31(12):2007–16.
[139] Wildenhain J, Spitzer M, Dolma S, Jarvik N, White R, Roy M, et al. Prediction of
Synergism from Chemical-Genetic Interactions by Machine Learning. Cell
Syst 2015;1(6):383–95.
[140] Gayvert KM, Aly O, Platt J, Bosenberg MW, Stern DF, Elemento O, et al. A
Computational Approach for Identifying Synergistic Drug Combinations. PLoS
Comput. Biol. 2017;13(1):e1005308.
[141] Holbeck SL, Camalier R, Crowell JA, Govindharajulu JP, Hollingshead M,
Anderson LW, et al. The National Cancer Institute ALMANAC: A
Comprehensive Screening Resource for the Detection of Anticancer Drug
Pairs with Enhanced Therapeutic Activity. Cancer Res. 2017;77(13):3564–76.
[142] Xia F, Shukla M, Brettin T, Garcia-Cardona C, Cohn J, Allen JE, et al. Predicting
tumor cell line response to drug pairs with deep learning. BMC Bioinf.
[143] Preuer K, Lewis RPI, Hochreiter S, Bender A, Bulusu KC, Klambauer G.
DeepSynergy: predicting anti-cancer drug synergy with Deep Learning.
Bioinformatics 2018;34:1538–46.
[144] Liu Q, Xie L. TranSynergy: Mechanism-driven interpretable deep neural
network for the synergistic prediction and pathway deconvolution of drug
combinations. PLoS Comput. Biol. 2021;17:e1008653.
[145] Julkunen H, Cichonska A, Gautam P, Szedmak S, Douat J, Pahikkala T, et al.
Leveraging multi-way interactions for systematic prediction of pre-clinical
drug combination effects. Nat. Commun. 2020;11(1).
[146] Zhang T, Zhang L, Payne PRO, Li F. Synergistic Drug Combination Prediction
by Integrating Multiomics Data in Deep Learning Models. Methods Mol. Biol.
[147] Jiang P, Huang S, Fu Z, Sun Z, Lakowski TM, Hu P. Deep graph embedding for
prioritizing synergistic anticancer drug combinations. Comput. Struct.
Biotechnol. J. 2020;18:427–38.
[148] Larochelle H, Mandel M, Pascanu R, Bengio Y. Learning algorithms for the
classification restricted Boltzmann machine. J. Mach. Learn. Res.
[149] Jin T, Nguyen ND, Talos F, Wang D. ECMarker: interpretable machine learning
model identifies gene expression biomarkers predicting clinical outcomes
and reveals molecular mechanisms of human disease in early stages.
Bioinformatics 2020.
[150] Chen G, Tsoi A, Xu H, Zheng WJ. Predict effective drug combination by deep
belief network and ontology fingerprints. J. Biomed. Inform. 2018;85:149–54.
[151] Liu Y, Wei Q, Yu G, Gai W, Li Y, Chen X. DCDB 2.0: a major update of the drug
combination database. Database (Oxford) 2014;124.
[152] Huang H, Zhang P, Qu XA, Sanseau P, Yang L. Systematic prediction of drug
combinations based on clinical side-effects. Sci. Rep. 2014;4:7160.
[153] Torres NB, Altafini C. Drug combinatorics and side effect estimation on the
signed human drug-target network. BMC Syst. Biol. 2016;10:74.
[154] Gardiner LJ, Carrieri AP, Wilshaw J, Checkley S, Pyzer-Knapp EO, Krishna R.
Using human in vitro transcriptome analysis to build trustworthy machine
learning models for prediction of animal drug toxicity. Sci. Rep.
[155] Afzal M, Islam SMR, Hussain M, Lee S. Precision Medicine Informatics:
Principles, Prospects, and Challenges. IEEE Access 2020;8:13593–612.
[156] Kaur S, Singla J, Nkenyereye L, Jha S, Prashar D, Joshi GP, et al. Medical
Diagnostic Systems Using Artificial Intelligence (AI) Algorithms: Principles
and Perspectives. IEEE Access 2020;8:228049–69.
[157] Zhu W, Xie L, Han J, Guo X. The Application of Deep Learning in Cancer
Prognosis Prediction. Cancers (Basel) 2020:12.
[158] Yu KH, Fitzpatrick MR, Pappas L, Chan W, Kung J, Snyder M. Omics AnalySIs
System for PRecision Oncology (OASISPRO): a web-based omics analysis tool
for clinical phenotype prediction. Bioinformatics 2018;34:319–20. https://
[159] Sathyanarayanan A, Gupta R, Thompson EW, Nyholt DR, Bauer DC, Nagaraj
SH. A comparative study of multi-omics integration tools for cancer driver
gene identification and tumour subtyping. Brief Bioinform 2020;21:1920–36.
R. Rafique, S.M. Riazul Islam and J.U. Kazi Computational and Structural Biotechnology Journal 19 (2021) 4003–4017
... However, monotherapy can promote the development of drug resistances. Hence, there also exist methods that predict the sensitivity for drug combinations using cancer cell line panels 1,2,19,20 . A general issue with drug sensitivity data is that the high specificity of (targeted) anti-cancer drugs leads to an underrepresentation of sensitive samples. ...
Full-text available
Machine learning methods trained on cancer cell line panels are intensively studied for the prediction of optimal anti-cancer therapies. While classification approaches distinguish effective from ineffective drugs, regression approaches aim to quantify the degree of drug effectiveness. However, the high specificity of most anti-cancer drugs induces a skewed distribution of drug response values in favor of the more drug-resistant cell lines, negatively affecting the classification performance (class imbalance) and regression performance (regression imbalance) for the sensitive cell lines. Here, we present a novel approach called SimultAneoUs Regression and classificatiON Random Forests (SAURON-RF) based on the idea of performing a joint regression and classification analysis. We demonstrate that SAURON-RF improves the classification and regression performance for the sensitive cell lines at the expense of a moderate loss for the resistant ones. Furthermore, our results show that simultaneous classification and regression can be superior to regression or classification alone.
... In their study, they stated that machine learning applications have potential opportunities for clinical studies in the field of psychotherapy. In another study, Rafique et al. [13] examined the effect of machine learning on cancer treatment. ...
Full-text available
The Covid-19 pandemic is a deadly epidemic and continues to affect all world. This situation dragged the countries into a global crisis and caused the collapse of some health systems. Therefore, many technologies are needed to slow down the spread of the Covid-19 epidemic and produce solutions. In this context, some developments have been made with artificial intelligence, machine learning and deep learning support systems in order to alleviate the burden on the health system. In this study, a new Internet of Medical Things (IoMT) framework is proposed for the detection and early prevention of Covid-19 infection. In the proposed IoMT framework, a Covid-19 scenario consisting of various numbers of sensors is created in the Riverbed Modeler simulation software. The health data produced in this scenario is analyzed in real-time with Apache Spark technology and disease prediction is made. In order to provide more accurate results for Covid-19 disease prediction, Random Forest (RF) and Gradient Boosted Tree (GBT) Ensemble Learning classifiers, which are formed by Decision Tree (DT) classifiers, are compared for the performance evaluation. In addition, throughput, end-to-end delay results and Apache Spark data processing performance of heterogeneous nodes with different priorities are analyzed in the Covid-19 scenario. The MongoDB NoSQL database is used in the IoMT framework to store big health data produced in real-time and use it in subsequent processes. The proposed IoMT framework experimental results show that the GBTs classifier has the best performance with 95.70% training, 95.30% test accuracy and 0.970 Area Under the Curve (AUC) values. Moreover, the promising real-time performances of wireless body area network (WBAN) simulation scenario and Apache Spark show that they can be used for the early detection of Covid-19 disease.
... For the RFR model, n_estimators, max_depth, min_samples_split were all optimally adjusted using the GridSearchCV package. Although the DT is a popular and effective modeling algorithm mainly for predicting simple data [58], it can be more easily overfitted compared with the RFR, and provides poor prediction results [59]. The Lasso method obtains a relatively refined model by constructing a penalty function and forcing the sum of the absolute values of some regression coefficients to be within a fixed value and setting some of the other regression coefficients to be zero [60]. ...
Accurate prediction of food crop yield is of great significance for global food security and regional trade stability. Since remote sensing data collected from unmanned aerial vehicle (UAV) platforms have the features of flexibility and high resolution, these data can be used as samples to develop regional regression models for accurate prediction of crop yield at a field scale. The primary objective of this study was to construct regional prediction models for winter wheat yield based on multi-spectral UAV data and machine learning methods. Six machine learning methods including Gaussian process regression (GPR), support vector machine regression (SVR) and random forest regression (RFR) were used for the construction of the yield prediction models. Ten vegetation indices (VIs) extracted from canopy spectral images of winter wheat acquired from a multi-spectral UAV at five key growth stages in Xuzhou City, Jiangsu Province, China in 2021 were selected as the variables of the models. In addition, in situ measurements of wheat yield were obtained in a destructive sampling manner for prediction algorithm modeling and validation. Prediction results of single growth stages showed that the optimal model was GPR constructed from extremely strong correlated VIs (ESCVIs) at the filling stage (R² = 0.87, RMSE = 49.22 g/m², MAE = 42.74 g/m²). The results of multiple stages showed GPR achieved the highest accuracy (R² = 0.88, RMSE = 49.18 g/m², MAE = 42.57 g/m²) when the ESCVIs of the flowering and filling stages were used. Larger sampling plots were adopted to verify the accuracy of yield prediction; the results indicated that the GPR model has strong adaptability at different scales. These findings suggest that using machine learning methods and multi-spectral UAV data can accurately predict crop yield at the field scale and deliver a valuable application reference for farm-scale field crop management.
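The three regression metrics reported in the abstract above (R², RMSE, MAE) can be computed as in the following minimal sketch. It uses the standard textbook definitions; the function name `regression_metrics` and the sample values are illustrative assumptions and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values.

    Hypothetical helper for illustration; not the study's own code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


if __name__ == "__main__":
    # Made-up yields in g/m^2, for demonstration only.
    observed = [510.0, 620.0, 580.0, 700.0]
    predicted = [530.0, 600.0, 590.0, 660.0]
    r2, rmse, mae = regression_metrics(observed, predicted)
    print(f"R2={r2:.3f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

RMSE and MAE share the units of the target (here g/m²), while R² is unitless, which is why the abstract reports units only for the first two.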
... Machine learning is a promising approach for acupuncture efficacy predictions in the PPPM/3PM framework. Machine learning (ML) holds the promise to address these two questions [13] and has been regarded as a powerful tool for predictive, preventive, and personalized medicine (PPPM/3PM) [14]. PPPM/3PM is an advanced philosophy in the healthcare and disease care sectors that makes it possible to predict individual predisposition before the onset of disease, provide targeted preventive measures, and develop personalized treatment strategies tailored to the individual [15,16]. ...
Background: Acupuncture is safe and effective for functional dyspepsia (FD), but its efficacy varies among individuals. Predicting the response of different FD patients to acupuncture treatment in advance, and therefore administering tailored treatment to the individual, is consistent with the principle of predictive, preventive, and personalized medicine (PPPM/3PM). In the current study, individual efficacy prediction models were developed based on the support vector machine (SVM) algorithm and routine clinical features, aiming to predict the efficacy of acupuncture in treating FD and identify the FD patients who were suitable for acupuncture treatment. Methods: A total of 745 FD patients were collected from two clinical trials. All the patients received a 4-week acupuncture treatment. Based on the demographic and baseline clinical features of 80% of patients in trial 1, the SVM models were established to predict the acupuncture response and improvements of symptoms and quality of life (QoL) at the end of treatment. Then, the remaining 20% of patients in trial 1 and 193 patients in trial 2 were used to evaluate the internal and external generalization of these models, respectively. Results: These models could predict the efficacy of acupuncture successfully. In the internal test set, models achieved an accuracy of 0.773 in predicting acupuncture response and an R² of 0.446 and 0.413 in the prediction of QoL and symptom improvements, respectively. Additionally, these models generalized well in the independent validation set and could also predict, to a certain extent, the long-term efficacy of acupuncture at the 12-week follow-up. Gender, subtype of disease, and education level were finally identified as the critical predictive features. Conclusion: Based on the SVM algorithm and routine clinical features, this study established models to predict acupuncture efficacy for FD patients. The prediction models developed accordingly are promising to assist doctors in judging patients' responses to acupuncture in advance, so that they could tailor and adjust acupuncture treatment plans for different patients in a prospective rather than reactive manner, which could greatly improve the clinical efficacy of acupuncture treatment for FD and save medical expenditures. Supplementary information: The online version contains supplementary material available at 10.1007/s13167-022-00271-8.
... One emerging area in cancer drug research is the incorporation of machine learning algorithms with the advance of artificial intelligence (Tanoli et al., 2021). With the rapidly increased quantity and improved quality of cases archived in databases capturing both patient and treatment information, therapeutic response prediction facilitates the optimisation of cancer treatment with reduced drug resistance and enhanced efficacy (Rafique et al., 2021). However, lack of clinically proven pharmacogenomic data remains one of the primary challenges in this area. ...
... Since the advent of precision and personalized medicine, machine learning (ML) has received great interest as a promising tool for diagnosing and predicting optimal treatment for cancer patients [3,22-24]. Recent ML models were proposed to assist the noninvasive detection of 1p/19q co-deleted LGGs, based on the intensity, texture, and geometry obtained via radiomics features from medical images [4,5,10,18,19,25-33]. ...
The prognosis and treatment plans for patients diagnosed with low-grade gliomas (LGGs) may significantly be improved if there is evidence of chromosome 1p/19q co-deletion mutation. Many studies proved that the codeletion status of 1p/19q enhances the sensitivity of the tumor to different types of therapeutics. However, the current clinical gold standard of detecting this chromosomal mutation remains invasive and poses implicit risks to patients. Radiomics features derived from medical images have been used as a new approach for non-invasive diagnosis and clinical decisions. This study proposed an eXtreme Gradient Boosting (XGBoost)-based model to predict the 1p/19q codeletion status in a binary classification task. We trained our model on the public database extracted from The Cancer Imaging Archive (TCIA), including 159 LGG patients with 1p/19q co-deletion mutation status. The XGBoost was the baseline algorithm, and we combined the SHapley Additive exPlanations (SHAP) analysis to select the seven most optimal radiomics features to build the final predictive model. Our final model achieved an accuracy of 87% and 82.8% on the training set and external test set, respectively. With seven wavelet radiomics features, our XGBoost-based model can identify the 1p/19q codeletion status in LGG-diagnosed patients for better management and address the drawbacks of invasive gold-standard tests in clinical practice.
Artificial intelligence (AI) powered by the accumulating clinical and molecular data about cancer has fueled the expectation that a transformation in cancer treatments towards significant improvement of patient outcomes is at hand. However, such transformation has so far been elusive. The opacity of AI algorithms and the lack of quality annotated data available at population scale are among the challenges to the application of AI in oncology. Fundamentally, however, the heterogeneity of cancer and its evolutionary dynamics make every tumor's response to therapy sufficiently different from population-level, machine-learned statistical models, thereby challenging the capacity of these models to yield reliable inferences about treatment recommendations that can improve patient outcomes. This article reviews the nominal elements of clinical decision-making for precision oncology and frames the utility of AI for cancer treatment improvements in light of cancer's unique challenges.
In the early 1900s, multiple significant studies showed high incidences of cancer. During this period, studies with infectious agents produced only modest results, which looked irrelevant to people. Then, in the 1980s, groundbreaking evidence that a number of viruses can cause cancer in people began to emerge. Machine learning and deep learning techniques, including support vector machines (SVMs), artificial neural networks (ANNs), and convolutional neural networks (CNNs), have been widely employed in cancer detection and classification. The recurrence of cancer is also an important issue that needs to be predicted with significant accuracy. This chapter reviews the current state of the art of ANN models in the prediction of cancer recurrence.
A review of more than 4000 articles published in 2021 related to artificial intelligence in healthcare. A BrainX Community exclusive, annual publication with trends, specialist editorials and categorized references readily available to provide insights into related 2021 publications. Cite as: Mathur P, Mishra S, Awasthi R, Cywinski J, et al. (2022). Artificial Intelligence in Healthcare: 2021 Year in Review. DOI: 10.13140/RG.2.2.25350.24645/1
Purpose: To assess the utility of a machine-learning approach for predicting liver function based on technetium-99m-galactosyl serum albumin (99mTc-GSA) single photon emission computed tomography (SPECT)/CT. Methods: One hundred twenty-eight patients underwent a 99mTc-GSA SPECT/CT-based liver function evaluation. All were classified into the low liver-damage or high liver-damage group. Four clinical features (age, sex, background liver disease and histological type) and 8 quantitative 99mTc-GSA SPECT/CT features (receptor index [LHL15], clearance index [HH15], liver-SUVmax, liver-SUVmean, heart-SUVmax, metabolic volume of liver [MVL], total lesion GSA [TL-GSA, liver-SUVmean × MVL] and SUVmax ratio [liver-SUVmax/heart-SUVmax]) were obtained. To predict high liver damage, machine learning classification with feature selection based on Gini impurity and principal component analysis (PCA) was performed using a support vector machine and a random forest (RF) with a five-fold cross-validation scheme. To overcome imbalanced data, stratified sampling was used. The ability to predict high liver damage was evaluated using a receiver operating characteristic (ROC) curve analysis. Results: Four indices (LHL15, HH15, heart SUVmax and SUVmax ratio) yielded high areas under the ROC curves (AUCs) for predicting high liver damage (range: 0.89–0.93). In the machine learning classification, the RF with selected features (heart SUVmax, SUVmax ratio, LHL15, HH15, and background liver disease) and PCA model yielded the best performance for predicting high liver damage (AUC = 0.956, sensitivity = 96.3%, specificity = 90.0%, accuracy = 91.4%). Conclusion: A machine-learning approach based on clinical and quantitative 99mTc-GSA SPECT/CT parameters might be useful for predicting liver function.
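The evaluation measures named in the abstract above (AUC, sensitivity, specificity, accuracy) can be sketched in a few lines. This is a generic illustration under stated assumptions, not the study's pipeline: labels are coded 1 for the positive class (for example, high liver damage) and 0 otherwise, the AUC is computed through its rank interpretation, and the function names and sample scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute an AUC")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity_accuracy(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")

    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    labels = [0, 0, 1, 1]               # made-up ground truth
    scores = [0.1, 0.4, 0.35, 0.8]      # made-up classifier scores
    predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off
    print(roc_auc(labels, scores))
    print(sensitivity_specificity_accuracy(labels, predicted))
```

The AUC depends only on how the scores rank the samples, whereas sensitivity, specificity and accuracy all depend on the chosen cut-off, so the two kinds of figure answer different questions about the same classifier.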
Glucocorticoids, such as dexamethasone and prednisolone, are widely used in cancer treatment. Different hematological malignancies respond differently to this treatment which, as could be expected, correlates with treatment outcome. In this study, we have used a glucocorticoid-induced gene signature to develop a deep learning model that can predict dexamethasone sensitivity. By combining gene expression data from cell lines and patients with acute lymphoblastic leukemia, we observed that the model is useful for the classification of patients. Predicted samples have been used to detect deregulated pathways that lead to dexamethasone resistance. Gene set enrichment analysis, peptide substrate-based kinase profiling assay, and western blot analysis identified Aurora kinase, S6K, p38, and β-catenin as key signaling proteins involved in dexamethasone resistance. Deep learning-enabled drug synergy prediction followed by in vitro drug synergy analysis identified kinase inhibitors against Aurora kinase, JAK, S6K, and mTOR that displayed synergy with dexamethasone. Combining pathway enrichment, kinase regulation, and kinase inhibition data, we propose that Aurora kinase or its several direct or indirect downstream kinase effectors such as mTOR, S6K, p38, and JAK may be involved in β-catenin stabilization through phosphorylation-dependent inactivation of GSK-3β. Collectively, our data suggest that activation of the Aurora kinase/β-catenin axis during dexamethasone treatment may contribute to cell survival signaling which is possibly maintained in patients who are resistant to dexamethasone.
Drug combinations have demonstrated great potential in cancer treatments. They alleviate drug resistance and improve therapeutic efficacy. The fast-growing number of anti-cancer drugs has caused the experimental investigation of all drug combinations to become costly and time-consuming. Computational techniques can improve the efficiency of drug combination screening. Despite recent advances in applying machine learning to synergistic drug combination prediction, several challenges remain. First, the performance of existing methods is suboptimal, and there is still much room for improvement. Second, biological knowledge has not been fully incorporated into the model. Finally, many models lack interpretability, limiting their clinical applications. To address these challenges, we have developed a knowledge-enabled and self-attention transformer boosted deep learning model, TranSynergy, which improves the performance and interpretability of synergistic drug combination prediction. TranSynergy is designed so that the cellular effect of drug actions can be explicitly modeled through cell-line gene dependency, gene-gene interaction, and genome-wide drug-target interaction. A novel Shapley Additive Gene Set Enrichment Analysis (SA-GSEA) method has been developed to deconvolute genes that contribute to the synergistic drug combination and improve model interpretability. Extensive benchmark studies demonstrate that TranSynergy outperforms the state-of-the-art method, suggesting the potential of mechanism-driven machine learning. Novel pathways that are associated with the synergistic combinations are revealed and supported by experimental evidence. They may provide new insights into identifying biomarkers for precision medicine and discovering new anti-cancer therapies. Several new synergistic drug combinations have been predicted with high confidence for ovarian cancer which has few treatment options. The code is available at .
Introduction: Drug repurposing provides a cost-effective strategy to re-use approved drugs for new medical indications. Several machine learning algorithms (ML) and AI approaches have been developed for systematic identification of drug repurposing leads based on big data resources, hence further accelerating and de-risking the drug development process by computational means. Areas covered: The authors focus especially on supervised ML and AI methods that make use of publicly available databases and information resources. While most of the example applications are in the field of anticancer drug therapies, the methods and resources reviewed are widely applicable also to other indications including COVID-19 treatment. A particular emphasis is placed on the use of comprehensive target activity profiles of drugs that enable systematic repurposing process by extending the target profile of a drug to include potent off-targets with therapeutic potential for a new indication. Expert opinion: The scarcity of clinical patient data and the current focus on genetic aberrations as the primary drug targets may limit the performance of anticancer drug repurposing approaches that rely solely on genomics-based information. Functional testing of cancer patient cells exposed to large number of targeted therapies and their combinations provides an additional source of repurposing information for tissue-aware AI approaches.
Systematic perturbation of cells followed by comprehensive measurements of molecular and phenotypic responses provides informative data resources for constructing computational models of cell biology. Models that generalize well beyond training data can be used to identify combinatorial perturbations of potential therapeutic interest. Major challenges for machine learning on large biological datasets are to find global optima in a complex multidimensional space and mechanistically interpret the solutions. To address these challenges, we introduce a hybrid approach that combines explicit mathematical models of cell dynamics with a machine-learning framework, implemented in TensorFlow. We tested the modeling framework on a perturbation-response dataset of a melanoma cell line after drug treatments. The models can be efficiently trained to describe cellular behavior accurately. Even though completely data driven and independent of prior knowledge, the resulting de novo network models recapitulate some known interactions. The approach is readily applicable to various kinetic models of cell biology. A record of this paper’s Transparent Peer Review process is included in the Supplemental Information.
Disease diagnosis is the identification of a health issue, disease, disorder, or other condition that a person may have. Some diagnoses are straightforward, while others are trickier. Large data sets are available; however, there are few tools that can accurately determine the patterns and make predictions. The traditional methods used to diagnose a disease are manual and error-prone. The use of Artificial Intelligence (AI) predictive techniques enables automated diagnosis and reduces detection errors compared to exclusive human expertise. In this paper, we have reviewed the current literature for the last 10 years, from January 2009 to December 2019. The study considered the eight most frequently used databases, in which a total of 105 articles were found. A detailed analysis of those articles was conducted in order to classify the most used AI techniques for medical diagnostic systems. We further discuss various diseases along with the corresponding AI techniques, including Fuzzy Logic, Machine Learning, and Deep Learning. This research paper aims to reveal some important insights into current and previous AI techniques used in today’s medical research, particularly in heart disease prediction, brain disease, prostate, liver disease, and kidney disease. Finally, the paper also provides some avenues for future research on AI-based diagnostic systems based on a set of open problems and challenges.
CellMiner Cross-Database (CellMinerCDB) allows integration and analysis of molecular and pharmacological data within and across cancer cell line datasets from the National Cancer Institute (NCI), Broad Institute, Sanger/MGH and MD Anderson Cancer Center (MDACC). We present CellMinerCDB 1.2 with updates to datasets from NCI-60, Broad Cancer Cell Line Encyclopedia and Sanger/MGH, and the addition of new datasets, including NCI-ALMANAC drug combination, MDACC Cell Line Project proteomic, NCI-SCLC DNA copy number and methylation data, and Broad methylation, genetic dependency and metabolomic datasets. CellMinerCDB (v1.2) includes several improvements over the previously published version: (i) new and updated datasets; (ii) support for pattern comparisons and multivariate analyses across data sources; (iii) updated annotations with drug mechanism of action information and biologically relevant multigene signatures; (iv) analysis speedups via caching; (v) a new dataset download feature; (vi) improved visualization of subsets of multiple tissue types; (vii) breakdown of univariate associations by tissue type; and (viii) enhanced help information. The curation and common annotations (e.g. tissues of origin and identifiers) provided here across pharmacogenomic datasets increase the utility of the individual datasets to address multiple researcher question types, including data reproducibility, biomarker discovery and multivariate analysis of drug activity.
Background: Drug response prediction is an important problem in computational personalized medicine. Many machine-learning-based methods, especially deep learning-based ones, have been proposed for this task. However, these methods often represent the drugs as strings, which are not a natural way to depict molecules. Also, interpretation (e.g., what are the mutation or copy number aberration contributing to the drug response) has not been considered thoroughly. Methods: In this study, we propose a novel method, GraphDRP, based on graph convolutional network for the problem. In GraphDRP, drugs were represented in molecular graphs directly capturing the bonds among atoms, meanwhile cell lines were depicted as binary vectors of genomic aberrations. Representative features of drugs and cell lines were learned by convolution layers, then combined to represent for each drug-cell line pair. Finally, the response value of each drug-cell line pair was predicted by a fully-connected neural network. Four variants of graph convolutional networks were used for learning the features of drugs. Results: We found that GraphDRP outperforms tCNNS in all performance measures for all experiments. Also, through saliency maps of the resulting GraphDRP models, we discovered the contribution of the genomic aberrations to the responses. Conclusion: Representing drugs as graphs can improve the performance of drug response prediction. Availability of data and materials: Data and source code can be downloaded at
Advances in genome sequencing technology have empowered researchers to think beyond their imagination. Researchers are working hard to fight various genetic diseases such as cancer. Artificial intelligence has empowered research in the healthcare sector. The availability of open-source healthcare datasets has motivated researchers to develop applications that help in the early diagnosis and prognosis of diseases. Further, next-generation sequencing has helped to look into the detailed intricacies of biological systems. It has provided an efficient and cost-effective approach with higher accuracy. The advent of microRNAs, also known as small noncoding genes, has begun a paradigm shift in oncological research. We are now able to profile RNA expression using RNA-seq data. microRNA profiling has helped in uncovering their relationship in various genetic and biological processes. In this paper, we present a review of the machine learning perspective in cancer research. The best way to develop effective cancer treatments and drugs is to better understand the intricacies and complexities involved in the cancer microenvironment. Although a plethora of methods and techniques has been proposed in the literature, the deadliness of cancer still cannot be reduced. In such a situation, artificial intelligence (AI) or machine learning provides a reliable, fast, and efficient way to deal with such stringent diseases.
PURPOSE The implementation and utilization of electronic health records is generating a large volume and variety of data, which are difficult to process using traditional techniques. However, these data could help answer important questions in cancer surveillance and epidemiology research. Artificial intelligence (AI) data processing methods are capable of evaluating large volumes of data, yet current literature on their use in this context of pharmacy informatics is not well characterized. METHODS A systematic literature review was conducted to evaluate relevant publications within four domains (cancer, pharmacy, AI methods, population science) across PubMed, EMBASE, Scopus, and the Cochrane Library and included all publications indexed between July 17, 2008, and December 31, 2018. The search returned 3,271 publications, which were evaluated for inclusion. RESULTS There were 36 studies that met criteria for full-text abstraction. Of those, only 45% specifically identified the pharmacy data source, and 55% specified drug agents or drug classes. Multiple AI methods were used; 25% used machine learning (ML), 67% used natural language processing (NLP), and 8% combined ML and NLP. CONCLUSION This review demonstrates that the application of AI data methods for pharmacy informatics and cancer epidemiology research is expanding. However, the data sources and representations are often missing, challenging study replicability. In addition, there is no consistent format for reporting results, and one of the preferred metrics, F-score, is often missing. There is a resultant need for greater transparency of original data sources and performance of AI methods with pharmacy data to improve the translation of these results into meaningful outcomes.
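Since the abstract above singles out the F-score as a preferred but often unreported metric, the following minimal sketch shows how it is derived from confusion-matrix counts. It uses the standard F-beta definition; the function name `f_beta` and the counts in the usage example are illustrative assumptions, not figures from the review.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives and false negatives.

    beta=1 gives the usual F1 score (harmonic mean of precision and recall);
    beta>1 weights recall more heavily, beta<1 weights precision more heavily.
    """
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    if beta <= 0:
        raise ValueError("beta must be positive")

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention: score is zero when nothing is retrieved correctly

    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Made-up counts, e.g. drug mentions extracted from clinical notes.
    print(f"F1 = {f_beta(tp=8, fp=2, fn=4):.3f}")
    print(f"F2 = {f_beta(tp=8, fp=2, fn=4, beta=2.0):.3f}")
```

Reporting precision, recall and the F-score together, along with the counts they were computed from, is one simple way to make results comparable across studies.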