Contents lists available at ScienceDirect
Ageing Research Reviews
journal homepage: www.elsevier.com/locate/arr
Review
Artificial intelligence for aging and longevity research: Recent advances and perspectives
Alex Zhavoronkov a,b,c, Polina Mamoshina a,d, Quentin Vanhaelen a,*, Morten Scheibye-Knudsen e, Alexey Moskalev f, Alex Aliper a
a Pharmaceutical Artificial Intelligence Department, Insilico Medicine, Inc., Baltimore, MD, United States
b Biogerontology Research Foundation, London, United Kingdom
c Buck Institute for Research on Aging, Novato, CA, United States
d Department of Computer Science, University of Oxford, Oxford, United Kingdom
e Center for Healthy Aging, Department of Cellular and Molecular Medicine, University of Copenhagen, Denmark
f George Mason University, Fairfax, VA, United States
ARTICLE INFO
Keywords:
Aging biomarker
Drug discovery
Artificial intelligence
Deep learning
Reinforcement learning
Symbolic learning
Metalearning
Generative adversarial networks
ABSTRACT
The applications of modern artificial intelligence (AI) algorithms within the field of aging research offer
tremendous opportunities. Aging is an almost universal unifying feature possessed by all living organisms, tissues,
and cells. Modern deep learning techniques used to develop age predictors offer new possibilities for formerly
incompatible dynamic and static data types. AI biomarkers of aging enable a holistic view of biological processes
and allow for novel methods for building causal models: extracting the most important features and identifying
biological targets and mechanisms. Recent developments in generative adversarial networks (GANs) and
reinforcement learning (RL) permit the generation of diverse synthetic molecular and patient data, identification
of novel biological targets, and generation of novel molecular compounds with desired properties and
geroprotectors. These novel techniques can be combined into a unified, seamless end-to-end biomarker
development, target identification, drug discovery and real world evidence pipeline that may help accelerate and
improve pharmaceutical research and development practices. Modern AI is therefore expected to contribute to the
credibility and prominence of longevity biotechnology in the healthcare and pharmaceutical industry, and to the
convergence of countless areas of research.
1. Introduction
Aging can be defined as a gradual, multifactorial, time-dependent
process leading to the loss of function, biological and physical damage,
and the onset of multiple age-related diseases. Aging progressively
affects most regulatory mechanisms due to the hierarchical organization
of living systems. The human organism is a multi-level, complex system
composed of billions of independent cells that form different types of
tissues. These tissues are the main blocks used to assemble organs, and
these organs are organized in different systems, including the lymphatic,
respiratory, digestive, urinary, or reproductive systems, to achieve
specific tasks. Aging can be influenced by the complex interplay
between environmental, mechanistic, biochemical and evolutionary
constraints. Therefore, dysfunctions affecting only a few biological
processes within the cells of one or several organs can propagate to all parts
of the body. This explains why aging cannot be fully understood or
controlled when monitoring only a restricted number of physiological
processes. Taken together, aging appears to be the long-term result of
the disruption of different dynamical equilibria established between
antagonistic processes, rather than the result of a sudden appearance of
isolated molecular processes or components with intrinsic negative
effects. The systemic and multifactorial nature of aging explains why
understanding its biology and mechanisms is so complex and why, as
a consequence, aging research is continuously in need of
multidisciplinary and global approaches. Novel experimental techniques
have allowed the generation and accumulation of a huge amount of
aging-related data, including genomic (Gleeson et al., 2017; Yi et al.,
2013; Bennett et al., 2016), transcriptomic (Artemov et al., 2015;
Bolotin et al., 2017), microRNA (Zabolotneva et al., 2013),
proteomic (Di Meo et al., 2016), antigen (Ionov, 2010),
methylation (Yin et al., 2017), imaging (Lee et al., 2017a,b; Niklinski
et al., 2017), metagenomic (Alexander et al., 2017), mitochondrial
https://doi.org/10.1016/j.arr.2018.11.003
Received 29 September 2018; Received in revised form 7 November 2018; Accepted 21 November 2018
* Corresponding author.
E-mail address: vanhaelen@insilicomedicine.com (Q. Vanhaelen).
Ageing Research Reviews 49 (2019) 49–66
Available online 22 November 2018
1568-1637/ © 2018 Published by Elsevier B.V.
(Sotgia and Lisanti, 2017), metabolic (Nielsen, 2017) and physiological
(Pretorius and Bester, 2016) data. These data provide an unprecedentedly
detailed overview of the aging process. However, the analysis and
practical use of the information contained within this huge amount of
data also requires adapted computational approaches such as machine
learning (ML) and, more recently, deep learning (DL) techniques,
which are the cornerstones of modern artificial intelligence (AI)
technologies. From this point of view, recent AI advances
have had a major impact within the field of aging research (Moskalev
et al., 2017). The attractive feature of AI is its ability to identify
relevant patterns within complex, nonlinear data, without the need for
any a priori mechanistic understanding of the biological processes. AI
can thereby help unveil the mechanistic relationships taking place within the body.
Today, DL and AI algorithms have been successfully developed and
applied in many pharmaceutical areas (Mamoshina et al., 2016;
Gawehn et al., 2015; Lenselink et al., 2017; Chen et al., 2018) with
applications as wide-ranging as prediction of organic chemistry
reactions (Wei et al., 2016), identification of aging biomarkers
(Zhavoronkov et al., 2016), optimization of chemical synthesis (Segler
et al., 2018), prediction of pharmacological properties of drugs (Aliper
et al., 2016), analysis of relationships between certain lifestyle choices
like smoking and accelerated aging (Mamoshina et al., 2018c),
investigation of protein secondary structure (Spencer et al., 2015),
modeling features of RNA-binding protein targets (Zhang et al., 2016),
analysis of drug-induced hepatotoxicity (Xu et al., 2015), or the study of
long non-coding RNAs (Fan and Zhang, 2015).
As emphasized by Deep Knowledge Analytics (www.dkv.global/analytics)
in its recent industry analytical report entitled "AI for Drug
Discovery, Biomarker Development and Advanced R&D 2018", AI is
expected to make a major impact on healthcare. It can be used for the
development of effective personalized medicine based on the
interpretation of large medical databases gathered over the years by
companies and healthcare providers. There are currently several companies
applying AI technologies within the field of aging research. Bioage is a
company using ML and genomic data for the development of
biomarkers of aging and drug discovery for aging and age-related disease.
Insilico Medicine is developing DL-based algorithms and deploying an
integrated AI pipeline for aging research, biomarker development, and
drug discovery in one end-to-end learning pipeline. The goal of the
company is to find novel solutions for aging and age-related diseases
using advances in genomics, AI, and big data analysis. Atomwise is
using AI for aging research with a drug discovery pipeline targeting
age-related diseases that still lack effective treatments, like Alzheimer's
disease. These examples illustrate that AI technologies can be applied at
different levels, with the goal of making the development of new
pharmaceuticals quicker, cheaper, and more effective (Fleming,
2018). AI can be applied to accelerate the identification of
biomarkers of age, to identify new targets and geroprotectors,
to accelerate and optimize the development of new compounds
with specific desired properties, to improve patient prognosis by
reducing error rates, to help select the most appropriate treatment
by predicting treatment outcome (Trtica-Majnaric et al., 2010), and to
predict the chance of drug success during clinical trials or clinical
trial outcomes. For a broad overview of the use of AI in biomedicine,
we refer the reader to these recent reviews (Rifaioglu et al., 2018;
Ching et al., 2018; Tsigelny, 2018; Fabris et al., 2017).
The aim of this review is to provide a technical overview of the
advances and opportunities offered by AI for aging biomarker
development and anti-aging drug discovery. This work emphasizes that
despite their specific technical requirements, the computational
methods used for these tasks can be integrated within a single workflow
to optimize several steps of aging research, from the identification of
aging signatures to target identification and ad hoc molecule
generation. Currently, biomarker development is an intensive area of research
in geroscience, as it lays the foundation for efficient preclinical and
clinical evaluation of potential health span-extending interventions. We
describe how deep learned aging clocks are used to identify aging
biomarkers and other targets of interest. The emergence of AI-based
methods for small molecule drug discovery is a major change in the
standard drug discovery pipeline. For decades, computational methods
have been used for accelerating the identification of potential leads
during the early stages of the drug discovery process. However, with their
ability to generate molecules with specific properties, AI-based
molecular generators provide new, promising opportunities. The
main algorithms currently designed to that end are described, and the advantages
and current challenges of these approaches are summarized. One of the
major challenges for the use of AI technology is to obtain more reliable
predictions. This relies strongly on the ability to extract the most
relevant features. This paper examines the current strategies used in aging
research to extract more biologically relevant features and make
AI-based models more easily interpretable. To conclude, a short discussion
addresses a challenge specific to the increasing use of
AI within healthcare: regulatory issues around privacy protection, which have
become a major concern with the increasing amount of personal data
being stored, used, or even shared by AI-powered healthcare
applications.
2. Advances in artificial intelligence
2.1. Machine learning
Machine learning (ML) refers to algorithms that can learn from and
make predictions on data by building a model from sample inputs. ML
is commonly employed for computing tasks where designing and
programming explicit algorithms with good performance is difficult or
infeasible. Today, the most commonly used traditional ML methods include
k-nearest neighbors (kNN) (Altman, 1992; Kramer, 2013), logistic
regression (LR) (Walker and Duncan, 1967), support vector machines
(SVM), also called support vector networks (Cortes and Vapnik, 1995),
gradient boosting machines (GBM) (Mason et al., 1999; Friedman,
2001; Ayyadevara and Kishore Ayyadevara, 2018), and random forest
(RF) (Ho, 1995; Breiman, 2001; Fratello and Tagliaferri, 2018). The
performance of these methods can vary depending on the type of task
(regression or classification) and on the type and amount of data to handle.
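As an illustration of how one of these traditional methods operates, the following is a minimal sketch of kNN regression in plain Python. The two-feature samples, the ages, and the function name `knn_predict` are hypothetical and chosen only for the example; they are not taken from any study cited here.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict a value for `query` as the mean target of its k nearest neighbors."""
    # Rank training samples by Euclidean distance to the query point.
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical toy data: two made-up biomarker values per sample, age in years.
train_x = [(4.5, 5.0), (4.4, 5.2), (4.1, 5.6), (3.9, 5.9), (3.7, 6.3)]
train_y = [30.0, 35.0, 50.0, 60.0, 70.0]

# The three closest samples have ages 50, 60 and 35, so the estimate is their mean.
predicted_age = knn_predict(train_x, train_y, query=(4.0, 5.7), k=3)
print(predicted_age)
```

In practice the features would be scaled first, since kNN is sensitive to the units of each measurement.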
2.2. Deep learning
Deep structured learning, also called deep learning (DL) or
hierarchical learning, refers to a class of ML techniques that exploit many
layers of non-linear computational units to model complex relationships
among data. These architectures, composed of multiple layers, are
commonly called deep neural networks (DNNs), or sometimes stacked
neural networks. The difference between the initial single-hidden-layer
artificial neural networks (ANNs) and DNNs is the depth; that is, the
number of layers of nodes through which data is processed. Usually,
more than three layers (including input and output) qualify as "deep"
learning. Thus, "deep" is a technical term that means more than one
hidden layer. Like other standard neural network architectures, DNNs are
efficient universal approximators. But they have additional
characteristics, as they are based on the learning of multiple levels of features or
representations of the data. They use a cascade of many layers of
nonlinear processing units for feature extraction. Each successive layer
uses the output from the previous layer as input. Higher level features
are derived from lower level features to form a hierarchical
representation. This hierarchy of features is called a deep architecture.
These methods are capable of learning multiple levels of
representations that correspond to different levels of abstraction. These levels
form a hierarchy of concepts. Among the different architectures
proposed so far, recurrent neural networks, generative adversarial
networks, and transfer learning techniques are gaining popularity within
aging research and are often considered for various applications in
healthcare. As a consequence of the rise of DL, traditional ML methods
are now commonly used as baseline models to assess the performance
of more recent DNN-based models.
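The layer-by-layer cascade described in this section can be sketched as a forward pass in plain Python. This is only an illustrative sketch: the weights are hand-picked hypothetical values, not a trained model, and the choice of ReLU as the nonlinearity is an assumption of the example.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Feed each layer the previous layer's output; ReLU on hidden layers only."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations

# Hypothetical weights: 2 inputs -> 2 hidden units -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, -1.0]),
    ([[2.0, 1.0]], [0.5]),
]
output = forward([2.0, 1.0], layers)
print(output)
```

With two hidden layers, this toy network already meets the "more than one hidden layer" criterion given above.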
2.3. Reinforcement learning
Reinforcement learning (RL) refers to goal-oriented algorithms,
which learn how to attain a complex objective or maximize along a
particular dimension over many steps (Arulkumaran et al., 2017;
Kulkarni, 2017). The key feature of RL algorithms is that they operate
in a delayed return environment, where it is not obvious
which action leads to which outcome over many time steps. Thus, RL
aims at correlating immediate actions with the delayed returns they
produce. The reinforcement takes place in the sense that RL algorithms
are penalized when making the wrong decisions and rewarded
when making the right ones. RL algorithms are expected to increase
performance in more ambiguous, real-life environments (Nguyen et al.,
2017a,b).
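To make the notion of delayed return concrete, here is a minimal sketch of tabular Q-learning, one common RL algorithm, on a hypothetical five-state chain where the only reward arrives on reaching the last state. The environment, the hyperparameters, and all names are assumptions made for this example.

```python
import random

N_STATES = 5       # states 0..4; state 4 is terminal and yields the only reward
ACTIONS = (0, 1)   # 0 = step left, 1 = step right

def step(state, action):
    """Deterministic chain environment with a single delayed reward at the end."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=50, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # purely exploratory behavior policy
            next_state, reward = step(state, action)
            # Q-learning update: bootstrap from the best action in the next state,
            # which propagates the delayed reward back to earlier actions.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break
    return q

q_table = train()
# Greedy action per non-terminal state after training.
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
print(greedy_policy)
```

Because Q-learning is off-policy, the random behavior policy is sufficient here; the learned greedy policy still moves toward the rewarded state.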
2.4. Generative adversarial networks
Generative Adversarial Networks (GANs) are structured,
probabilistic models for generating data. Being an unsupervised technique,
GANs can be used to generate data similar to the dataset that the GAN
was trained on (Goodfellow et al., 2014; Goodfellow, 2017). Although
relatively new, GANs have already been applied in various fields,
including the prediction of compound properties and molecular
structure generation (Kadurin et al., 2017a,b;
Polykovskiy et al., 2018; Zhavoronkov et al., 2018a,b). A GAN consists
of two DNNs called the discriminator and the generator, both differentiable
functions. The discriminator estimates the probability that a given
sample comes from the real dataset. It works as a critic and is
optimized to distinguish the fake samples from the real ones. The
generator outputs synthetic samples, using as input a noise variable
drawn from a given distribution. It is trained to capture the real data distribution
so that it can generate samples that are as realistic as
possible. The generator should improve its output until the
discriminator is unable to distinguish the generated samples from the real
ones.
The two models compete against each other during the training
process. The goal of the generator is to trick the discriminator,
while the discriminator attempts not to be fooled. This competition
is a zero-sum game between the two models, which motivates
both to improve until the generated samples are
indistinguishable from the real data.
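This competition is commonly written as the minimax value function of the original GAN formulation (Goodfellow et al., 2014). The notation is an assumption of this sketch rather than notation used in this review: D is the discriminator, G the generator, p_data the real data distribution, and p_z the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to generated ones, while the generator minimizes the same quantity, which is what makes the game zero-sum.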
From a conceptual point of view, GANs share similarities with RL.
GANs appear more advantageous in the sense that it is possible to
"backpropagate" the gradient information from the discriminator back
to the generator network. Consequently, the generator knows how to
adapt its parameters in order to produce output data that can fool the
discriminator.
2.5. Transfer learning
Transfer learning (TL) is an ML method where the set of learned
features of a model for a specific task is reused, or repurposed, as the
starting point for a model on a second task. In practice, TL can be
applied in DL only when the model features learned from the first task
remain general, meaning that the features learned on the first task must also
be suitable for the second task (Torrey and Shavlik, 2010).
TL is used as an optimization technique that saves time or
yields better performance. This can be of great interest given the vast
computational and time resources needed to develop and train DNN
models on problems such as computer vision and natural language
processing tasks.
2.6. Meta learning
Meta learning aims at applying ML algorithms to metadata
obtained from ML experiments to improve the performance of the learning
algorithms themselves. The hypothesis behind meta learning is that by
using different kinds of metadata, such as properties of the learning
problem, algorithm properties (performance measures), or patterns
obtained from the data, one can learn, select, or combine different
learning algorithms to more effectively solve a given learning problem
(Zhou and Wu, 2018; Gupta et al., 2018). This approach could be
especially useful in the context of aging research because the most
effective way to train learning algorithms depends on the type of data
used as well as on the nature of the questions to be answered.
3. Databases for DL for aging research
Various initiatives have been launched to organize and disseminate
the large amount of biological data generated in aging research.
CellAge (http://genomics.senescence.info/cells/) is a manually curated
database of genes associated with cell senescence. The data come from
gene manipulation experiments in different human cell types. This
database is hosted within the Human Ageing Genomic Resources, a
collection of databases and tools designed to help researchers study the
genetics of human ageing (Tacutu et al., 2018). It includes various
resources such as the LongevityMap
(http://genomics.senescence.info/longevity/), a repository of genetic
association studies of longevity which aims at aggregating the current
knowledge of the genetics of human longevity (Budovsky et al., 2013).
GenAge (http://genomics.senescence.info/genes/) is a benchmark
database of age-related genes.
Geroprotectors (http://geroprotectors.org/) is a curated database of
geroprotectors. It contains more than 250 life-extension experiments in
11 wild-type model organisms, and data about more than 200
chemicals promoting longevity, including compounds that are approved
and available for human use. The different features of this database are
described in (Moskalev et al., 2015). The online crowdsourced pathway
annotation database AgingChart (http://agingchart.org) provides a list
of pathways implicated in aging and longevity (Moskalev et al., 2016).
Another source of information regarding aging research is the
International Aging Research Portfolio (IARP) (https://agingportfolio.org/).
IARP provides users with access to information about current trends in
aging research, major centers of research, key investigators, and
associated research programs (Zhavoronkov and Cantor, 2011; Kolesov
et al., 2014). IARP aims at helping funding organizations to collaborate,
make decisions, and set future directions for research efforts in aging.
4. Applications of AI in aging research
4.1. Aging biomarker discovery and personalized medicine
Precision medicine is dependent on robust quantitative biomarkers.
Biomarkers of aging are tools able to provide a quantitative foundation
upon which to evaluate the therapeutic efficacy of clinical,
health-span-extending interventions. However, one of the current major
impediments in human aging research is the absence of biomarkers that may
be targeted and measured to track the effectiveness of anti-aging
therapeutic interventions. This might be explained by the fact that
standard biomarkers are usually developed with the purpose of
measuring a strictly defined physiological process, and that specific clinical
procedures are based on the use of predefined biomarkers. As a result, they
are not necessarily adapted for measuring the effects of a systemic
process such as aging. Currently, many biomarkers of aging monitor not
just one but a restricted set of physiological functionalities whose
disruptions are known to trigger the onset of specific diseases and
malfunctions correlated with aging. Although this strategy provides
accurate results and useful information about aging itself, the
considered biomarkers are not always able to represent the health state
with enough accuracy. Therefore, there is still a need to develop
biomarkers which are objectively quantifiable and easily measurable
characteristics of biological aging. The design of such biomarkers, from
an experimental point of view, is a time-consuming and tedious
multi-step process that includes proof of concept, experimental validation and
analytical performance validation. AI technologies offer effective
alternatives for the development of aging biomarkers (Fig. 1A and B), and
DL-based aging clocks have already been used to identify quantitative
biomarkers. Different configurations and architectures have been
studied. Data types include medical big data obtained through genetics,
genomics (McCue and McCoy, 2017; Leung et al., 2016), biochemistry
(Tetko et al., 2016), proteomics (Issa et al., 2014) and clinical imaging
(Lee et al., 2017a,b). DL-based aging clocks may also differ in their
training and validation protocols.
The use and development of such aging clocks must be done with
caution. Indeed, it is important to distinguish the chronological
age, the number of years an individual has been alive, from the
biological age. The biological age, commonly referred to as the
physiological age or the metabolic age, is the measure of how well the different
organs, physiological processes, and regulatory systems of the body
perform and to what extent they are being maintained. In theory, monitoring
biological age provides an estimate of the health status of an individual.
Aging clocks estimate biological age from biological data and perform
linear or non-linear regressions for estimating the chronological age of
the individual. Thus, the training protocol of aging clocks aims at
minimizing the difference, called aging acceleration, between the
physiological age estimated by the model and the actual
chronological age of the individual. However, it was shown that improving
Fig. 1. Applications of artificial intelligence to aging research for biomarker development and target identification. 1A. Machine-learned predictors of biological age
at the organismal level and population level. 1B. Machine-learned age predictors at the cell and tissue level. 1C. Machine-learned predictors of cell type and
differentiation state.
chronological age estimation accuracy through error minimization can
also undermine the significance of biological age acceleration and reduce
the ability to differentiate among disease states or mortality risks
(Mamoshina et al., 2018b; Pyrkov et al., 2018). When using ML tools
such as DL architectures to unravel complex and nonlinear relations
between the features in the data and produce even more accurate
models, it is important to ensure that the search for improved
accuracy does not induce a significant loss of biological information.
Nevertheless, more accurate chronological age estimations from
biological samples can find applications as described below.
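The quantities discussed above can be sketched in a few lines of plain Python: aging acceleration as the per-individual difference between predicted and chronological age, together with the two accuracy metrics reported for the clocks reviewed here, the mean absolute error (MAE) and R². The ages below are hypothetical values chosen for the illustration, and the function names are not taken from any cited work.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def age_acceleration(actual, predicted):
    """Predicted (model-estimated) age minus chronological age, per individual."""
    return [p - a for a, p in zip(actual, predicted)]

# Hypothetical values in years, not taken from any study.
chronological = [30.0, 40.0, 50.0, 60.0, 70.0]
predicted = [33.0, 38.0, 50.0, 66.0, 69.0]

mae = mean_absolute_error(chronological, predicted)
r2 = r_squared(chronological, predicted)
acceleration = age_acceleration(chronological, predicted)
print(mae, r2, acceleration)
```

A positive acceleration marks an individual predicted older than their chronological age. Driving the MAE toward zero shrinks these per-individual differences as well, which is one way to see why a more accurate clock can carry less biological signal.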
4.1.1. Imaging biomarkers
Magnetic resonance imaging (MRI) is an advanced imaging
technique used for the observation of different diseases and parts of the
body. Different computational methods can be used for analyzing the
results of MRI. The analysis can be done either for classification,
that is, assigning a label to an MRI series (normal/abnormal, level of
severity, etc.), or for segmentation, that is, identifying the boundaries of various
tissues. In both cases, the analysis necessitates an extraction of
information from images. The most recent advancements in this field
were obtained using convolutional neural networks (CNNs), a type of
neural network specialized for processing image data (Akkus et al.,
2017; Badrinarayanan et al., 2017; Pereira et al., 2016; Liu et al., 2018).
Although MRI images can be used as exclusive inputs for DL models,
other studies have shown the potential to combine them with additional
features. For example, in (van der Burgh et al., 2017), a deep learned
survival predictor for patients suffering from amyotrophic lateral
sclerosis, a progressive neuromuscular disease, was assembled using
MRI structural connectivity and brain morphology data in addition to
commonly used clinical characteristics. The model demonstrated 84.4%
accuracy for classifying patients as short, medium, or long
survivors.
Structural MRI data has also been used for age estimation. It serves
as a biomarker for aging in adults, and for patients with conditions such
as Alzheimer's disease (Cole et al., 2016). Recently, a DNN-based
approach using structural, volumetric features derived from T1-weighted
MRI was shown to outperform RF or ANN for age prediction (Bermudez
et al., 2017). The technique was used on a sample of 3348 subjects aged
from 4 to 26 years. A biomarker called Brain Age Gap (BAG) was
created. DNNs yielded a BAG of 2.87 years compared to 2.77 years with
ANN, and 2.94 years using RF. Performance of the DNN was improved
by using ensemble methods, with a mean absolute error (MAE) of 2.38
years with a DNN ensemble. These results demonstrate that age can be
accurately predicted with unimodal imaging in a young population
using engineered features instead of raw images. The algorithm
developed in this study could also be used as a biomarker for
neurodevelopment and disease detection that can be easily translated to the
bedside.
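The benefit of ensembling mentioned above can be illustrated with a minimal sketch in plain Python, assuming the simplest form of ensemble: averaging the predictions of several models. The ages and the three sets of predictions are hypothetical, and this is not the ensemble procedure of the cited study.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def ensemble_predict(member_predictions):
    """Average the predictions of all members, sample by sample."""
    return [sum(sample) / len(sample) for sample in zip(*member_predictions)]

# Hypothetical ages (years) and the outputs of three hypothetical models.
actual = [10.0, 15.0, 20.0, 25.0]
members = [
    [12.0, 14.0, 23.0, 24.0],
    [9.0, 17.0, 19.0, 28.0],
    [11.0, 13.0, 21.0, 22.0],
]

member_maes = [mean_absolute_error(actual, m) for m in members]
ensemble_mae = mean_absolute_error(actual, ensemble_predict(members))
print(member_maes, ensemble_mae)
```

By the triangle inequality, the MAE of an averaged ensemble can never exceed the average MAE of its members. Beating the best single member, as happens with these toy numbers, is not guaranteed and depends on how uncorrelated the members' errors are.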
Other types of biomarkers of aging using images as inputs have been
suggested. For instance, in (Bobrov et al., 2018), the authors propose a
novel, non-invasive class of visual photographic biomarkers of aging
using the photographic images of eye corner areas for aging prediction.
The eye corner area of the human face is believed to be the most prone
to aging (Flament et al., 2013). To train and validate the model, a
dataset of around 8000 high-resolution left and right eye corner photos
labeled with true chronological age was used. The model is based on a
modified version of Xception (Chollet, 2017), a DNN-based model
where all layers except the last fully-connected layer are initialized
with weights pre-trained on the ImageNet database. Experiments
showed that the model is able to achieve a mean absolute error of 2.3
years within the age range of 20 to 80 years old. Those results suggest
that high-resolution images of eye corner wrinkles can be utilized to
obtain accurate chronological age estimation. Interestingly, age
predictions for individuals of 70 years old and older were less accurate. A
hypothesis suggested by the authors is that the divergence in human
phenotypes becomes larger as people age. In contrast, younger
people have relatively similar amounts of wrinkles and pigmentation,
a characteristic that also affected the accuracy of age prediction
among very young individuals.
4.1.2. "Omics" biomarkers
A transcriptomic-based age predictor was presented in (Mamoshina
et al., 2018b). To train this model, 545 transcriptomic samples from 12
datasets of human skeletal muscle, labeled according to
chronological age, were collected. For this study, several regression models
were built, including Elastic Net, SVM, kNN, RF, and a Deep Feature
Selection (DFS) model. A linear regression was used as a baseline and its
performance compared to other ML approaches. Although all models
achieved a strong correlation of predicted and chronological age, SVM
and DFS models clearly outperformed the other methods in age
prediction, achieving R² values of 0.83 and 0.83 and mean absolute error
(MAE) values of 7.20 and 6.24 years, respectively. The performance of
the models was also evaluated on gene expression samples of skeletal
muscle from the Genotype-Tissue Expression (GTEx)
project.
4.1.3. Multi-modal biomarkers
One of the first methods for identifying biomarkers of aging through
population age estimates using DNNs was proposed in (Zhavoronkov
et al., 2016). In this study, an ensemble of 21 DNNs of varying depth
and structure was used to predict human chronological age. The features, a
set of 41 biomarkers for each sample, were extracted from tens of
thousands of blood biochemistry samples from patients undergoing
routine physical examinations. Although highly variable in
nature, blood biochemistry tests are easy to perform. Furthermore,
they are in clinical use and commonly used by physicians. The best
performing DNN in the ensemble demonstrated an R² of 0.80 with an
MAE of 6.07 years, while the entire ensemble achieved an R² of 0.82 and
an MAE of 5.55 years. In order to analyze the importance of the different
features used, the permutation feature importance (PFI) method was
utilized. The five most important biomarkers identified were: albumin,
whose low level is associated with increased risk for heart failure in the
elderly; glucose, which is linked to metabolic health; alkaline
phosphatase, whose level in blood increases with age; erythrocytes, which are
known to be damaged by oxidative stress; and urea, which is known for
increasing oxidative stress. These five biomarkers monitor the
physiological status of the renal, liver and metabolic systems, and of respiratory
function. The associated features can be used for tracking physiological
processes related to aging.
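The idea behind permutation feature importance can be sketched in plain Python: shuffle one feature column at a time and record how much the prediction error grows. This is a minimal sketch, not the procedure of the cited study; the "model" below is a hypothetical stand-in function that uses only its first feature, and the data are randomly generated.

```python
import random

def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def permutation_importance(model, rows, targets, seed=0):
    """Increase in MAE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(targets, [model(r) for r in rows])
    importances = []
    for column in range(len(rows[0])):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)  # break the link between this feature and the target
        permuted_rows = [
            r[:column] + (value,) + r[column + 1:]
            for r, value in zip(rows, shuffled)
        ]
        permuted = mean_absolute_error(targets, [model(r) for r in permuted_rows])
        importances.append(permuted - baseline)
    return importances

# Hypothetical stand-in for a trained age predictor: it only uses feature 0.
def toy_model(row):
    return 20.0 + 10.0 * row[0]

data_rng = random.Random(1)
rows = [(data_rng.uniform(0, 5), data_rng.uniform(0, 5)) for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
print(importances)
```

Shuffling the feature the model relies on increases the error, while shuffling the ignored feature leaves it unchanged, so ranking features by this increase yields an importance ordering of the kind reported above.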
As explained above, aging acceleration is a biologically relevant
variable associated with the prevalence of major diseases and mortality.
Consequently, aging acceleration can also be connected to overall
health using a scale based on deviation from the patient's predicted
chronological age. This concept is applied in (Wang et al., 2017), where
a DNN-based predictive model of physiological age was developed with
the Mount Sinai Health System (MSHS) electronic medical record (EMR) data.
Physiological measurements, including vital signs and lab tests from the EMR, were used
as features to train the model. To identify the most relevant features
regarding age predictions, correlation analysis was performed. Among
all the vital signs, pulse pressure and systolic blood pressure show the
strongest positive correlation with chronological age for both genders.
Lab tests positively correlated with age included urea nitrogen, glucose,
hemoglobin A1C, and PROTIME/INR, whereas the lab tests most
negatively correlated with age are glomerular filtration rate estimate,
albumin, total protein, red blood cell count, and hematocrit.
Furthermore, correlations between physiological measurements and
chronological age were also investigated using unsupervised
hierarchical clustering on the LOWESS smoothed trends of the most
common physiological measurements across all patients. This approach
allows clustering of different physiological measurements with similar
trends. Finally, regression analysis was performed to evaluate how
these variables combine to predict physiological age. The
performances of three methods (RF, Elastic Nets, and DNN) were
compared and the DNN showed the best performance. The DL model
was then used for the prediction task. The results show that a
combination of vital signs and lab tests is more predictive of chronological
age than either data type used alone. Patients whose physiological age
was higher than their chronological age showed an increased prevalence of
hypertension and cardiovascular disorders, increased chronic
inflammation, possible chronic anemia, poor nutritional status,
decreased kidney function and potential liver damage. On the other hand,
patients predicted younger have, in general, opposite physiological
patterns for many of the physiological measurements, with some
exceptions. The specific physiological patterns identified include low risk
for hypertension and hyperlipidemia, healthy kidney and liver
functions, healthier nutritional status, and higher risk for venereal diseases.
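The quantities used throughout this section (MAE, R² and the deviation of predicted from chronological age) can be written down in a few lines. The following is a minimal sketch; the function names and the four example ages are hypothetical and do not come from any of the cited studies.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute error, in years when the inputs are ages."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def age_acceleration(predicted_age, chronological_age):
    """Positive values mean the sample is predicted older than it is."""
    return predicted_age - chronological_age


# Hypothetical example values (years).
chronological = [30.0, 45.0, 60.0, 75.0]
predicted = [34.0, 43.0, 66.0, 72.0]

mae = mean_absolute_error(chronological, predicted)
r2 = r_squared(chronological, predicted)
accelerations = [age_acceleration(p, c) for p, c in zip(predicted, chronological)]
```

Under this sign convention, patients "predicted older" have a positive acceleration and patients "predicted younger" a negative one.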
Taking into account ethnic differences in health, diet, lifestyle,
behavior, environmental exposures, and average rate of biological aging,
it was assumed that deep learned biomarkers of aging are population
dependent (Cohen et al., 2016; Zhavoronkov et al., 2016). Building on the
results obtained for predicting patient biological age from blood
biochemistry, a set of population-specific DL-based predictors of biological
age trained on blood biochemistry and hematological cell count
datasets was presented (Mamoshina et al., 2018a). Samples from patients
belonging to three distinct populations (Canada, South Korea, and
Eastern Europe) were selected. Compared to the first study, the
models used fewer features (21 compared to 41) to train three
separate deep networks on three specific ethnic populations. Models
were trained on 19 blood test features: 15 biochemistry markers
(Albumin, Glucose, Hemoglobin, Cholesterol, Sodium, Urea, LDL
Cholesterol, Triglycerides, Hematocrit, HDL Cholesterol, Total Protein,
Calcium, Creatinine, Potassium, and Total Bilirubin) and four cell
count markers, including Erythrocytes and Platelet count. Patient sex
and population type were also incorporated in the feature set. As in the
previous work, age prediction was treated as a regression task. The
model takes a vector of blood test values and returns a single value of
patient age. To evaluate the association of the predicted age
acceleration or age slowdown with all-cause mortality, hazard ratios were also
computed. The results showed that the best-performing predictor
achieved an MAE of 5.94 years, a greater predictive accuracy than
the best-performing predictor of the previously reported aging clock
(which achieved an MAE of 6.07 years). Furthermore, as in the
previous studies, deep learned predictors outperformed conventional ML
models. Interestingly, population type appeared as one of the most
important markers for age quantification. These results confirm the
hypothesis that ethnically diverse aging clocks are capable of predicting
chronological age, and quantify biological age with greater accuracy
than generic aging clocks.
4.1.4. Epigenetic biomarkers
Epigenetics refers to the mitotically heritable modifications in gene
expression which do not involve changes within the genetic code.
Epigenetic mechanisms are rather complex: a combination of
molecular, chemical and environmental factors (constituting the
epigenome) acts together with the genome to establish the unique
functionality of each cell type. DNA methylation is the most studied
epigenetic mark. It is characterized by the addition of a
methyl group to the cytosine in a cytosine-phosphate-guanine
dinucleotide, or CpG site. It was demonstrated that DNA methylation
marks show an age-associated pattern, which has been used to
design several epigenetic clocks of biological age (Hannum et al., 2013;
Horvath, 2013). The assessment of epigenetic DNA methylation age is
based on the association of the methylation level in selected CpG sites
with chronological age in a population. The methylation level of those
sites can be used to evaluate the chronological age of individuals
(Mitnitski, 2018). While DL techniques were recently applied to
identify epigenetic marks (Kim et al., 2016; Liu et al., 2016), ML
methods are also applied to develop tools using DNA methylation
patterns as a biomarker of aging. For example, in (Torabi Moghadam
et al., 2016) a pipeline of Monte Carlo Feature Selection and rule-based
modeling was developed in order to identify combinations of CpG sites
that classify samples into different age intervals based on DNA
methylation levels. In (Levine et al., 2018), an epigenetic biomarker of
aging was developed using data from whole blood. This biomarker was
found to correlate with age in every tissue and cell tested. Furthermore,
it is able to predict a variety of aging outcomes, including all-cause
mortality, cancers, health span, physical functioning, and Alzheimer's
disease. The combination of large amounts of epigenomics data
produced and stored in the digital space, along with the development of AI
technologies, will lead to the design of more accurate epigenetic clocks
in the near future (Schumacher, 2018).
4.2. Target identification
Identifying targets of interest is another critical aspect in the
development of effective anti-aging treatment. Different computational
approaches have been developed, including screening differences in
pathway activation patterns using pathway perturbation analysis.
Those methods characterize pathways as transcriptomic maps and can
be used to identify pathways showing large changes between young and
old individuals. The results of such analysis provide information about
pathways involved in aging (Fig. 2A). Another approach relies on
screening libraries of already known compounds using DNNs to
identify compounds with potential pro-longevity properties (Fig. 2B)
(Aliper et al., 2016). Features used by aging clocks can also be analyzed
to identify new targets. This approach was followed in (Mamoshina
et al., 2018b), where the list of genes used to predict age based on
transcriptomic profiles of skeletal muscles was further analyzed to
identify the genes most important for age prediction. Several methods
were used to evaluate the importance of features (genes) for age
prediction. These include ranking genes by the absolute values of their
regression coefficients in an ElasticNet model, applying the RF feature
importance algorithm to extract the Gini importance value of each
gene, and analyzing the relative importance values assigned to genes by
the DFS model. The Borda count algorithm was used to summarize
the ranks provided by these different methods and obtain final importance
values. In addition, the wrapper method, which was applied to identify
the most important blood markers for age prediction (Zhavoronkov
et al., 2016), was also used.
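The rank aggregation step can be sketched with the classical Borda count, in which each method awards a feature points according to its rank and the points are summed across methods. The gene labels and the three rankings below are placeholders, and the exact scoring variant used in the cited study is not specified here, so this should be read as an illustration of the principle only.

```python
def borda_count(rankings):
    """Aggregate several best-first rankings of the same items.

    Each ranking awards n - 1 points to its top item, n - 2 to the next,
    and so on down to 0. Returns the consensus order and the point totals.
    """
    n = len(rankings[0])
    points = {item: 0 for item in rankings[0]}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            points[item] += n - 1 - position
    # Ties are broken alphabetically to keep the result deterministic.
    consensus = sorted(points, key=lambda item: (-points[item], item))
    return consensus, points


# Hypothetical best-first rankings from three importance methods.
elastic_net_rank = ["gene_a", "gene_b", "gene_c", "gene_d"]
random_forest_rank = ["gene_b", "gene_a", "gene_d", "gene_c"]
dfs_rank = ["gene_a", "gene_c", "gene_b", "gene_d"]

consensus, points = borda_count([elastic_net_rank, random_forest_rank, dfs_rank])
```

A feature ranked consistently high by all methods ends up at the top of the consensus list even if no single method ranks it first every time.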
Interestingly, the list of the most important genes selected by the
Borda algorithm contains several genes already known as therapeutic
targets. To provide a more comprehensive overview of how those genes
are related to the aging of skeletal muscles, pathway perturbation
analysis was performed using the iPANDA algorithm (Ozerov et al.,
2016) to compare signatures of young and old muscle tissue. iPANDA
belongs to the fourth generation of data-driven pathway analysis
methods (Vanhaelen et al., 2017). Using a simplified description of the
pathways, these statistical methods can handle high-dimensional
data to analyze changes between two conditions in the expression
of genes belonging to common pathways. As these methods work in
terms of pathway activity levels and not in terms of individual genes,
they reduce the genomic complexity from tens of thousands of features
to measurements on dozens of pathways (Khatri et al., 2012; Li et al.,
2015). The two conditions were defined as follows. The samples from
individuals 16–30 years old were classified into the “young” group,
while individuals over 60 years old were used to assemble the “old”
group. The list of differentially expressed genes was computed, and
their expression profiles and a pathway database of 1856 annotated and
manually curated signaling pathway maps were used as input for the
iPANDA algorithm. The results confirmed the established mechanisms
of human skeletal muscle aging, including dysregulation of cytosolic
Ca²⁺ homeostasis, PPAR signaling and neurotransmitter recycling
along with IGFR and PI3K-Akt-mTOR signaling.
Drug repurposing is a commonly used alternative approach for
finding new targets or indications for already approved drugs.
Computational drug repurposing is a highly active field of research
within pharmacology, and many computational approaches are being
developed using various kinds of techniques (Hodos et al., 2016;
Vanhaelen et al., 2016; Alaimo et al., 2016; Vanhaelen, 2019). The
capabilities of DL and AI technologies are also being investigated in this
context. In (Aliper et al., 2016), the authors proposed a DNN-based
system to classify drugs into therapeutic categories based solely on their
transcriptomic data. Datasets used as inputs include samples exposed to
various drugs selected across the A549, MCF-7, and PC-3 cell lines. These
samples were gathered from the LINCS Project and linked to 12
therapeutic use categories derived from the MeSH therapeutic use section.
However, training the DNN using the entire dataset of
12,797 genes generated very poor results. To address this issue,
feature selection methods were used at the genomic and pathway
levels. At the genomic level, the features were gene
expression levels for “landmark genes,” genes that capture
approximately 80% of the information and possess great inferential value.
These features were used to train the DNN classifier. At the pathway
level, the activation scores of 271 signaling pathways were computed,
resulting in a final dataset containing 308, 454, and 433 drugs for the
A549, MCF-7, and PC-3 cell lines, respectively. Using this dataset,
another DNN classifier based only on pathway activation scores for the drug
perturbation profiles of the three cell lines was assembled. Interestingly,
this second classifier performed much better, suggesting
that pathway-level data are better suited to DNNs and to
classifying drugs into therapeutic use categories.
Interpreting their results from a repurposing perspective, the authors argued
that the “misclassified” samples for a certain drug could be an
indication of its potential for novel use in these exact “incorrectly” assigned
conditions. Misclassification, therefore, may lead to unexpected new
discoveries.
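The repurposing reading of misclassification can be made concrete with a small sketch that collects, for each drug, the categories a classifier assigned to its samples when these differ from the known category. The drug identifiers, category names and sample tuples are invented for illustration, and the counting scheme is an assumption rather than the procedure used by the authors.

```python
def repurposing_candidates(samples):
    """Count, per drug, the predicted categories that differ from the known one.

    `samples` is an iterable of (drug, known_category, predicted_category).
    """
    candidates = {}
    for drug, known, predicted in samples:
        if predicted != known:
            per_drug = candidates.setdefault(drug, {})
            per_drug[predicted] = per_drug.get(predicted, 0) + 1
    return candidates


# Hypothetical classifier outputs for perturbation samples of two drugs.
samples = [
    ("drug_1", "antineoplastic", "antineoplastic"),
    ("drug_1", "antineoplastic", "cardiovascular"),
    ("drug_1", "antineoplastic", "cardiovascular"),
    ("drug_2", "central nervous system", "central nervous system"),
]
candidates = repurposing_candidates(samples)
```

A drug whose samples are repeatedly assigned to the same unexpected category would be a natural first candidate for follow-up, although any such hypothesis would still require experimental validation.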
Fig. 2. Computational screening for geroprotectors and anti-aging drugs is a multi-step process. (A) Pathway and transcriptomic signatures extracted when comparing young and old individuals help to identify potential candidates and targets. (B) Already existing drugs and compounds can be screened using DNN-based approaches to identify key properties associated with anti-aging drugs and identify drugs with the desired set of characteristics. (C) Once designed, approved drugs and compounds reach the market. Feedback from users and patients provides useful information about the reliability of the drug discovery pipeline. Computational pipelines are flexible and can thus be easily adapted to meet patient expectations.
Once treatments and drugs have been developed and marketed, it is
important to assess to what extent they can lead to a significant
improvement of health status. Although feedback received from patients
over time can provide meaningful information to improve the drug
development stages or optimize the drug discovery engine (Fig. 2C), it
should be emphasized that the effects of aging take years or decades to
unfold and the experimental observation of the effects of anti-aging
treatments is not necessarily straightforward. For instance, these
difficulties are observed when analyzing the effect of the calorie
restriction (CR) diet. This diet is characterized by a reduction in caloric
intake below usual levels. Observations indicate that CR contributes to
health benefits and extends lifespan. This observation was initially
made in species with a short average lifespan when CR was initiated
early or in mid-life and was sustained for a substantial portion of the
lifespan while maintaining adequate intake of essential nutrients.
However, as emphasized in a recent study where the effects of CR were
analyzed over a period of 2 years (Redman et al., 2018), obtaining
similarly conclusive results for humans requires performing these
experiments over a longer period of time. In addition, other parameters
which vary over time, such as environmental, economic and social
factors or lifestyle, must be taken into account. One can expect that the
accumulation of health data will allow a more systematic use of
computational methods to assess a posteriori the effects of anti-aging drugs
and treatments. For instance, aging clocks can be used to measure how
the aging acceleration in treated patients varies upon treatment (Fig. 3).
Information gathered from individuals can also be used to identify the
main features that characterize the optimal healthy status of an
individual. This information can be applied to understand how anti-aging
drugs counteract the deleterious effects of lifestyle.
4.3. Small molecules drug discovery
Computational techniques are also developed for the design of drug
compounds and for generating large, virtual chemical libraries which
can be more efficiently screened for in silico drug discovery. Drug
discovery and development timelines can be further optimized by using DL
and AI technologies to characterize drug candidates according to likely
efficacy and safety, prior to preclinical and clinical trials. The
disruption of the standard discovery pipeline through AI technologies
will be beneficial for the identification of new candidates for
developing anti-aging therapies.
In this context, generative models, based on the GAN paradigm for
instance, have great potential due to their ability to generate virtual
molecules with desired chemical and biological properties. Several
such models have been proposed for molecular de novo design and
molecular feature extraction. In what follows, we examine the main
models recently released and briefly describe the challenges faced.
Kadurin et al. (2017a) proposed the first DL-based generator of
molecules. The core of the architecture is based on an adversarial
autoencoder (AAE) and aims at generating novel molecular fingerprints
with a defined set of parameters. A molecular fingerprint is a numerical
method to encode the structure of a molecule. The most common type
of fingerprint, called binary fingerprint, is a series of binary digits that
represent the presence or absence of particular substructures in the
molecule. The system takes a vector of binary fingerprints and the log
concentration of the molecule as inputs, and outputs a concentration
and a vector consisting of probabilities assigned to each bit of the
fingerprint. For training on fingerprints, the log concentrations of 6252
compounds profiled on the MCF-7 cell line and the corresponding
growth inhibition percentage (GI) data, which indicate the reduction
in the number of tumor cells after drug treatment, were used. To assess
the validity of the predictions, the generated fingerprints were used to
screen several million compounds from the PubChem database and
identify compounds for which anticancer activities are observed.
Other relevant biomedical properties of interest have also been either
tested or demonstrated.
Fig. 3. Assessing the efficacy of anti-aging drugs and proposed treatments is critical for advances in aging research. Samples exposed to such treatments or drugs can be used as inputs for aging clocks. Estimation of age acceleration can be obtained. Comparison with non-exposed samples gives a first estimation of the in vivo effects of these treatments. Another approach relies on the use of AI to extract features characterizing the optimal healthy state reached by most individuals from 25 to 35 years old. Computational analysis can track how these features are affected by various treatments such as the use of geroprotectors. On the other hand, the effects of various common daily behaviors and lifestyles can also be integrated within these studies.
This work was improved in (Kadurin et al., 2017b), where an
advanced AAE model for molecular feature extraction was presented.
Like the previous work (Kadurin et al., 2017a), this model,
called druGAN (drug Generative Adversarial Network), uses
fingerprints as a representation of the molecules. In order to measure the
similarity between the generated molecules and the original data, the
authors used the Tanimoto similarity, which is a synonym of Jaccard
similarity in this context. Experiments showed that druGAN offers
higher adjustability in generating molecular fingerprints, has a better
capacity for processing very large data sets of molecules, and is more
efficient in unsupervised pretraining for regression models. The study
includes a comparison between druGAN and a variational
autoencoder (VAE) model. VAE models are another type of commonly used
generative model based on DNN architectures. Interestingly, different
tests demonstrated that both AAE and VAE models can perform very
well depending on the kind of task to be solved. Consequently, both
VAE and AAE can be considered valuable tools that can be used in
drug discovery pipelines on fingerprints and on other representations of
the molecular structure. However, the authors pointed out several
limitations of present AAE architectures. For instance, the study used
MACCS molecular fingerprints, which are not ideal representations of
molecular structure. The molecular fingerprint has two disadvantages.
First, one fingerprint can match several molecules, so there is no
one-to-one mapping between molecules and fingerprints, and second, the
fingerprint representation contains less information about the molecule
topology than the string representation. Those disadvantages are not
shared by other more chemically and biologically relevant
representations of molecular structure such as the string representation of the
molecule (SMILES), InChI, or molecular graphs. This suggests that using
alternative representations could lead to better performance of
generative adversarial models (Fig. 4A).
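For binary fingerprints, the Tanimoto (Jaccard) similarity is the number of bits set in both fingerprints divided by the number of bits set in at least one of them. The sketch below uses two made-up 8-bit vectors; real MACCS fingerprints are longer, and returning 1.0 for two all-zero fingerprints is simply a convention chosen here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    # Convention chosen for this sketch: two empty fingerprints are identical.
    return both / either if either else 1.0


# Hypothetical 8-bit fingerprints; each bit flags one substructure.
fp_generated = [1, 1, 0, 1, 0, 0, 1, 0]
fp_reference = [1, 0, 0, 1, 1, 0, 1, 0]

similarity = tanimoto(fp_generated, fp_reference)  # 3 shared bits / 5 set bits
```

Because distinct molecules can share all of their flagged substructures, two different molecules may reach a similarity of 1.0, which is one way to see the lack of a one-to-one mapping noted above.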
Following this conclusion, a VAE model for learning continuous
representations of molecules represented in SMILES format was
introduced by Gómez-Bombarelli et al. (2018). Models based on
Recurrent Neural Networks (RNNs), a type of architecture better
suited to sequential data such as SMILES,
have also been investigated. For example, in (Bjerrum and Threlfall,
2017), the performances were further improved by using long
short-term memory (LSTM) cells and gated recurrent units (GRU).
Architectures based on the GAN paradigm were also developed. For example,
sequence generation via deep reinforcement learning (DRL) was
proposed in (Yu et al., 2017). The architecture, called Sequence Generative
Adversarial Network (SeqGAN), combines a GAN with an RL-based
generator. An extension of SeqGAN called ORGAN
(Objective-Reinforced Generative Adversarial Network) was proposed in (Guimaraes
et al., 2017). This model adds an “objective-reinforced” reward
function for particular sequences into the SeqGAN reward loss. Further
work based on objective functions for molecular design within the
ORGAN paradigm was done in (Sanchez-Lengeling et al., 2017). The
proposed architecture, ORGANIC (Objective-Reinforced Generative
Adversarial Network for Inverse-design Chemistry), used various
criteria as objective filters to train the ORGAN model. The results showed
that the use of different objective reward functions makes it possible to bias
the generation process and generate molecules with desired
user-specified properties.
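One simple way to picture an objective-reinforced reward is as a weighted mix of the discriminator score and a domain-specific objective score. The linear form and the weight `lam` below follow how the ORGAN reward is usually described, but the function name, the default weight and the example scores are assumptions made for this sketch.

```python
def objective_reinforced_reward(discriminator_score, objective_score, lam=0.5):
    """Mix a discriminator score with an objective score.

    lam = 1.0 recovers a purely adversarial reward (as in SeqGAN);
    lam = 0.0 rewards the objective only. Both scores are assumed to lie in [0, 1].
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be between 0 and 1")
    return lam * discriminator_score + (1.0 - lam) * objective_score


# Hypothetical scores for one generated sequence: it looks realistic to the
# discriminator (0.8) but scores poorly on the chosen objective (0.2).
balanced = objective_reinforced_reward(0.8, 0.2, lam=0.5)
adversarial_only = objective_reinforced_reward(0.8, 0.2, lam=1.0)
objective_only = objective_reinforced_reward(0.8, 0.2, lam=0.0)
```

Changing the objective score function, or the weight given to it, is what biases generation toward molecules with the desired properties.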
Following these works, the RANC model for the design of small
molecules was later presented (Zhavoronkov et al., 2018a). RANC is
based on the GAN and RL paradigms. Moreover, RANC uses a
differentiable neural computer (DNC) as a generator. The DNC is a category of
neural network with increased generation capabilities due to the
addition of an explicit memory bank. This additional module can help to
mitigate common problems found in adversarial settings. RANC was
trained on the SMILES string representation of the molecules, and results
showed that the generated molecules match the distributions of the lengths
and the key molecular descriptors of the training molecules.
Furthermore, comparisons with ORGANIC showed that RANC performed better
in terms of number of unique structures, Muegge criteria, QED scores
and number of generated molecules passing the medicinal chemistry
filters (MCFs).
Fig. 4. (A) AI de novo molecular generators offer interesting opportunities to optimize the identification and selection of molecules with desired properties. However, a systematic use of such approaches requires the establishment of standardized procedures and protocols for the training and validation of the models. Furthermore, systematic studies could be performed to better understand how the model performances depend on the specific architecture, loss functions, and combination of filters used. (B) AI-molecular generators are best used in combination with aging clocks. Targets identified from aging signatures and features used by aging clocks are used to define appropriate properties of molecules generated by AI generators. As for any computational methods, predictions obtained from AI-based generators must go through various phases of testing. Feedback obtained from these critical steps can help improve the global pipeline.
Another variant of molecular generator called Adversarial
Threshold Neural Computer (ATNC) was also designed (Zhavoronkov
et al., 2018b). Like the RANC and ORGANIC models, ATNC is based on
the GAN and RL paradigms and, like RANC, it also uses a DNC as
generator. However, ATNC includes a supplementary computational unit,
called the adversarial threshold (AT). The AT unit acts as a filter between
the agent (generator) and the environment (discriminator and objective
reward functions). In order to generate more diverse molecules, a new
objective reward function named Internal Diversity Clustering (IDC)
was introduced. The performances were compared with ORGANIC, and
both models were trained on the SMILES string representation of the
molecules. Four objective functions were used: the internal similarity,
the Muegge drug-likeness filter, the presence or absence of sp3-rich
fragments, and the IDC. The distributions of four molecular
descriptors (number of atoms, molecular weight, logP, and TPSA) were
analyzed, and five supplementary chemical statistical features were also
computed (internal diversity, number of unique heterocycles, number
of clusters, number of singletons, and number of compounds that did
not pass the medicinal chemistry filters). The analysis of
these molecular descriptors and chemical statistical features
demonstrated that the molecules generated by ATNC showed better
drug-likeness properties. One of the limitations of the ATNC emphasized by
the authors concerns the architecture itself. It was suggested that
replacing the GAN part with an AAE could provide the model with a
mechanism to control the percentage of correctly reconstructed molecules.
Other main limitations are related to the method used for representing
the molecules. ATNC, RANC and ORGANIC are SMILES-based models
and cannot properly employ fragment-based objective reward
functions. This is because the SMILES string of a fragment is not
necessarily found within the SMILES string of a molecule containing it,
owing to the SMILES notation. The authors suggested that using another
molecular representation, for example a graph representation in which each
molecule is represented in a unique way, could overcome this issue.
Finally, they suggested that using modern, multi-objective RL
techniques could allow the environment to optimize the scoring of the
molecules by using several objective rewards simultaneously (Fig. 4A).
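The fragment-matching limitation of SMILES strings can be shown with plain string operations. The same molecule admits several valid SMILES strings, so a naive substring test for a fragment may succeed on one and fail on another. The helper name is hypothetical; the SMILES strings used are standard alternative notations for phenol and ethanol.

```python
def naive_fragment_match(fragment_smiles, molecule_smiles):
    """Substring test, the only check available without parsing the structure."""
    return fragment_smiles in molecule_smiles


benzene_ring = "c1ccccc1"

# Two valid SMILES strings for the same molecule (phenol).
phenol_a = "Oc1ccccc1"
phenol_b = "c1ccc(O)cc1"

found_in_a = naive_fragment_match(benzene_ring, phenol_a)  # ring appears verbatim
found_in_b = naive_fragment_match(benzene_ring, phenol_b)  # ring is interrupted by "(O)"

# The same effect with a C-O fragment and two notations for ethanol.
co_in_cco = naive_fragment_match("CO", "CCO")
co_in_occ = naive_fragment_match("CO", "OCC")
```

A graph-based representation avoids this problem because substructure search then operates on atoms and bonds rather than on the order in which they happen to be written.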
The examples discussed in this section illustrate that this growing field of
research still lacks a unified set of benchmarks which could be used to
provide a framework to evaluate and compare different generative
models. Furthermore, it is necessary to formulate best practices for this
emerging industry of “AI molecule generators” at different levels,
including how much training data is required, for how long the model
should be trained, and what kind of metrics and loss functions are the
most appropriate for monitoring the performance and assessing the
validity of the outputs of these models. One example is replacing the
handcrafted rules commonly used in the context of de novo drug design with
rules learned from data. DiversityNet is a promising initiative to address
this issue. DiversityNet is a data science collaborative challenge which
asked participants to collaborate to design the most appropriate set of
tasks and metrics, and to select suitable datasets, to evaluate
molecule generators. AI-molecular generators have already been
successfully integrated in extended pipelines where their prediction power and
accuracy are improved by using biologically relevant features selected
through the use of aging clocks and other dimensionality reduction
techniques (Fig. 4B).
4.4. Regenerative medicine
The field of regenerative medicine aims to provide patients with
improved treatment and faster recovery through, for example, the use
of induced pluripotent stem cells (iPSCs) (Takahashi and Yamanaka,
2006; Takahashi et al., 2007), which can differentiate into different cell
lineages and, ultimately, into any cell type (Scudellari, 2016;
Yamanaka, 2012). Therefore, one could control iPSC differentiation to
treat various diseases (Jiang et al., 2014), for example by creating beta
islet cells to treat diabetes or neurons to treat neurological diseases.
These techniques could also be used to grow tissues and organs and
transplant them into the body, eliminating potential organ transplant
rejection. If applied at a larger scale, this could help to address the
shortage of organs available for transplants.
However, exploiting the potential of iPSCs requires full control of the
differentiation process itself. In past decades, much progress has been
made in understanding the complex dynamics taking place during stem
cell fate decisions at the genetic and epigenetic levels. At the genomic
level, pluripotency maintenance is regulated by transcription factors
(Thomson et al., 2011; Walker et al., 2007; Walker and Stanford, 2009;
Tantin, 2013) which act as master regulators of gene regulatory
networks (GRNs) (Iglesias-Bartolome and Gutkind, 2011; Ng and Surani,
2011; Dalton, 2013). From a dynamical point of view, GRNs are
organized following various dynamical motifs which act together and
contribute to improving the adaptation abilities and robustness of the
system (Yeo and Ng, 2013; Saint-André et al., 2016). Transcription
factors have been shown to continually attempt to specify
differentiation to their own lineage. Consequently, direct external interventions,
through activation or inhibition of one or several signaling pathways,
are necessary to reinforce the pluripotency state or to control the
differentiation to a specific lineage (Silva and Smith, 2008; Nowick and
Stubbs, 2010; Loh and Lim, 2011). From this point of view, the
pluripotency state could be considered a metastable state whose
maintenance depends on the properties of the external environment of the
cell. These discoveries explain why differentiating iPSCs into specific
cell types can only be achieved by following a rather complex
differentiation protocol, usually specific to each kind of cell (Malik and Rao,
2013).
Although significant efforts have been made and progress achieved
in establishing standardized sets of differentiation protocols for various
cell types (Si-Tayeb et al., 2010; Takeda et al., 2017; Dai et al., 2015;
Daniel et al., 2016), large-scale applications are still difficult. Within
this context, computational methods could be used to design AI
automated systems to create and adjust custom protocols to optimize the
success of stem cell differentiation processes. The opportunities offered
by AI techniques in the development of predictive models for
personalized treatments with engineered stem cells, immune cells, and
regenerated tissues in humans were recently reviewed in (Sniecinski and
Seghatchian, 2018). For instance, AI can be used to identify the state of
development of embryonic cells (Fig. 1C). An example of this kind of
application was recently published in (West et al., 2018). In this study,
a DNN ensemble was trained on transcriptomic data of 12,404 healthy,
untreated tissue samples from Affymetrix (4822 samples) and Illumina
(7582 samples) microarray platforms to classify samples
into five categories: embryonic stem cells (ESCs), induced
pluripotent stem cells (iPSCs), embryonic progenitor cells (EPCs), adult
stem cells (ASCs) and adult cells (ACs). The DNN outperformed
traditional ML methods (kNN, SVM, GBM) and achieved a mean F1
score (the harmonic mean of precision and recall) of 0.99 on the Affymetrix
microarray training dataset, and an F1 of 0.75 on the external validation
dataset. Interestingly, prediction performances of the DNN were
improved with dimensionality reduction using the pathway level analysis
approach iPANDA. Further feature importance analysis identified
repression of COX7A1 as a novel marker associated with the mammalian
embryonic-fetal transition (EFT). The computational methods
developed in this work have been made available through an online platform
called embryonic.ai (http://embryonic.ai). The AI-based platform was
developed by Insilico Medicine Inc. in collaboration with the company
Biotime Inc. (http://www.biotimeinc.com/). It gives access to the first
deep learned transcriptome-based classifier designed to compute the
embryonic score of a sample, an integrative metric of cell development
stage. Embryonic AI is an ensemble of DNNs trained and validated on
transcriptomics data representative of healthy ESC, iPSC, EPC, ASC and
AC types. Using data provided by the user, the system will output an
embryonic score.
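To make the F1 metric reported for the classifier concrete, the following minimal Python sketch computes a per-class F1 score and its unweighted mean over the five cell categories. The toy labels, the function names and the choice of an unweighted (macro) average are illustrative assumptions; the averaging scheme actually used in (West et al., 2018) is not specified here.

```python
# Minimal sketch: per-class F1 and its unweighted mean for a 5-class problem.
# Class names follow the text; the labels below are toy data, not study data.
CLASSES = ["ESC", "iPSC", "EPC", "ASC", "AC"]


def per_class_f1(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no correct prediction for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(y_true, y_pred, labels=CLASSES):
    """Unweighted (macro) mean of the per-class F1 scores (an assumption)."""
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


# Toy example: one ESC sample is misclassified as iPSC.
y_true = ["ESC", "ESC", "iPSC", "EPC", "ASC", "AC"]
y_pred = ["ESC", "iPSC", "iPSC", "EPC", "ASC", "AC"]
score = mean_f1(y_true, y_pred)
```

In this toy case the single error lowers both the ESC recall and the iPSC precision, which is why F1 is often preferred to plain accuracy when classes are imbalanced.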
Another example of the application of AI technology is the recent
introduction of an online catalog of 3D stem cell images produced using
DL analyses and cell lines altered with the gene-editing tool CRISPR
(Maxmen, 2017). This tool, called Allen Cell Explorer
(https://www.allencell.org/), will be used in the near future by
scientists to better understand iPSC structures and their relationships
to functions, and ultimately to diseases like cancer.
At a larger scale, AI can also offer practical solutions to other types
of challenges faced in regenerative techniques, such as the prediction of
tissue engineering results with ANNs (Xu et al., 2005; Shaikhina et al.,
2015) or the development of computational model-based neural
networks for more elaborate tissue engineering applications. One can
expect that the current trend toward AI-guided regenerative medicine will
bring impactful benefits in the near future.
Applying such regenerative technologies to design anti-aging drugs
or treatments targeting age-related diseases is an appealing prospect.
Several companies focus on developing such treatments, but most of the
clinical applications will require years of development and clinical
trials before potential FDA approval. Nevertheless, several near-term
applications are already undergoing clinical trials. This is the case for a
regenerative product for age-related macular degeneration, which is
being developed by AgeX Therapeutics (http://www.agexinc.com/), a
subsidiary of BioTime, Inc. AgeX Therapeutics is also developing
pluripotent stem cell-derived therapies for manufacturing brown fat cells.
These cells contribute to the regulation of metabolism as they burn
calories rather than store them. The amount of brown fat cells within
the body decreases with age, and restoring them might help maintain
the metabolic balance at the same level as in younger individuals.
4.5. Gene therapy
Gene therapy is an experimental technique that uses genes to treat
or prevent diseases, including inherited disorders, some types of cancer,
and certain viral infections. In practice, this technique is designed to
introduce genetic material into cells to compensate for abnormal genes
or to make a beneficial protein. Although promising, gene therapy is
currently being tested only for diseases that have no other cures
(Soleimani et al., 2015). However, in recent years gene therapy has
gained more attention due to several successes. The FDA has recently
approved the first gene therapies for treating forms of leukemia,
lymphoma and retinal dystrophy, an inherited disease. Although these
initial successes did not benefit from the emergence of AI techniques,
the rapidly growing amount of genomic data available has, as in other
fields of life science, triggered interest in applying AI to improve
gene therapies and, as a consequence, the potential of personalized
medicine, as both rely on matching the appropriate drug with
the right patient population, a task for which AI technologies are
well suited (Fig. 5). More specifically, AI is seen as a key ingredient
for improving the precision of the gene editing process. Although the
development of gene editing has provided new opportunities for exploring
personalized cures and treatments by providing scientists with the
ability to alter patient DNA, performing gene editing accurately
is still challenging. Several companies have developed AI-based
platforms to answer the needs of this specific sector. For example,
ATUM (https://www.atum.bio/), a California-based bioengineering
service organization and the largest US-based provider of synthetic
genes, is applying AI to gene synthesis and has developed a technology
called Leap-In transposase, which enables any recombinant DNA
sequence to behave as a transposon. Synpromics
(http://www.synpromics.com/), an Edinburgh-based company, uses AI to identify
patterns between genomic sequences and their involvement in cell
type-specific regulation of gene expression. Elevation, a project
developed by Microsoft, uses genomic data and AI to predict the
optimal position to edit a strand of DNA to alleviate side effects and speed
up the editing process (Listgarten et al., 2018).
Gene therapies also offer tremendous opportunities for designing
efficient anti-aging treatments. It is known that mitochondrial oxidative
stress may contribute to human aging. For example, increased expression
of catalase in the mitochondria results in much more potent
protection against oxidative stress (Bai et al., 1999; Arita et al., 2006). To
take advantage of this discovery, it has been suggested that an
adeno-associated virus vector expressing the mitochondria-targeted catalase
gene could be used as a gene therapy to prevent aging-related
pathology (Li and Duan, 2013). Telomerase gene therapy is another
possible application. This approach is based on the observation that
telomere shortening is linked with aging and disease and that
genetically lengthening telomeres through increased telomerase
expression may result in increased longevity. Telomerase gene
therapy is therefore a possible therapeutic intervention against aging and
age-related diseases (Boccardi and Herbig, 2012; Bär et al., 2016;
Muñoz-Lorente et al., 2018). One can expect that the use of AI
technologies to optimize the design of gene therapies will greatly help the
future development of anti-aging treatments.
4.6. Immuno-oncology and immunosenescence
Cancer is currently one of the main causes of death. This is due to an
aging population but also possibly related to unhealthy food habits,
changing lifestyles, and increasing consumption of tobacco-related
products. According to the National Cancer Institute, around 1.6 million
new cases of cancer were diagnosed in the USA in 2016. As these
numbers are expected to grow in the future, there is an increasing and
pressing demand for identifying new oncology drugs that can be used as
a part of anti-cancer treatments. Normally, the immune system is able
to recognize tumor cells and distinguish them from their normal
counterparts. However, in cancer patients, tumor cells escape from
immune system surveillance by dodging immune checkpoints
(inhibitory pathways to inactivate T-cells). Oncology drugs, also called
anti-cancer drugs or anti-neoplastic drugs, are agents that can be used
alone or in combination to control or destroy neoplastic cells. These
agents can be either systemic or targeted. In systemic therapy, the drug
spreads throughout the body, whereas in targeted therapy, the drug or
substance identifies the specific location, causing less harm to the
growth of neighboring healthy cells. There are several types of cancer
immunotherapies, using immune checkpoint modulators, immune system
modulators, therapeutic antibodies, immune cell therapy, or cancer
vaccines, the latter usually made from a patient's own tumor
cells or from substances produced by tumor cells.
A specific challenge when treating cancer comes from the fact that
each cancer and every cancer patient is different, and tumor cells
within a specific tumor site can be highly diverse. For that reason, a
strategy pursued in oncology research is to identify small subsets of
cancer patients that can benefit from a specific treatment. However, this
targeted approach had encountered limited success until recently,
because although researchers and doctors had access to large sets of data
from imaging, genomics, co-morbidities and previous treatments, they
did not have suitable methods to make efficient use of them.
With its ability to learn, predict, and advise based on vast amounts
of data, AI technology can identify patterns that can be used to predict
the prognosis of patients and advise medical practitioners on the different
options available, ranging from personalized medicine to
clinical trials with experimental therapies (Fig. 5). For example,
convolutional neural networks (CNNs) were trained to classify cancer
patients using immunohistochemistry of tumor tissues (Vandenberghe
et al., 2017). An ML-based tumor classifier was presented in (Capper
et al., 2018), and ML methods used specifically for breast cancer
pattern classification and forecast modeling were
reviewed in (Yue et al., 2018). Applications of standard ML-techniques
for cancer diagnosis have been covered in (Kourou et al., 2015) and
DL-based cancer diagnosis approaches were recently reviewed in (Hu et al.,
2018). As in the case of regenerative medicine, applying AI technology
as a diagnostic tool in oncology (detection of cancer) can significantly
reduce the error rate of diagnosis and also contribute to reducing
time-consuming activities. There are many initiatives to support the
application of AI within oncology. For example, the open research initiative
called EPIDEMIUM aims to bring together multiple players and apply AI
to the research of new cancer therapies. Many companies are working
to develop AI-based platforms to address challenges faced in oncology
and cancer research. For instance, AI is used by the company Sophia
Genetics to pinpoint the gene mutations behind cancer to assist doctors
in the prescription of the best treatment. The company Freenome
developed an AI genomics platform to predict patients' responses to
immuno-oncology therapies by observing changes in biomarkers
circulating in the bloodstream. In addition to being used to stratify patients,
AI can also be applied to identify synergistic combinations of cancer
targets in order to develop drugs against those targets; this strategy is
followed by Sanofi and GSK, which have partnered with Exscientia.
These efforts, although in their infancy, are beginning to generate
results: the company BenevolentAI, in partnership with
Janssen, has shown concrete results, leading to a drug candidate now
moving to a Phase II trial.
In non-cancerous cells, the immune system also undergoes alterations
with age. The most important alterations occur in the adaptive
immune system and involve T cells. Many of these alterations are
assumed to decrease the capacity of the immune system to combat
emerging or progressing tumors (Pawelec et al., 2010; Fulop et al., 2010;
Pawelec, 2017). The declining function of the immune system is known
as immunosenescence and leads to a higher incidence of infection,
cancer, and autoimmune disease-related mortality in the elderly
population (Pawelec, 2018). For these reasons, various strategies have
been suggested to combat immunosenescence, including cellular and
genetic therapies (Xu and Larbi, 2017). Immunosenescence involves a
shift in function of both adaptive and innate immune cells, leading to
reduced capacity to recognize new antigens and widespread chronic
inflammation (Stahl and Brown, 2015; Ventura et al., 2017). This state
can be assessed by measuring immune biomarkers that
differ between younger and older individuals and are associated with a
detrimental clinical outcome. Many such biomarkers have been assessed in
studies of younger and older adults. Well-established markers of
immunosenescence include CD45RO, CD45RA (Neuber et al., 2003), CD27
(Bulati et al., 2011; Xu and Larbi, 2017), CD57 (Xu and Larbi, 2017) and
CD28, whose downregulation occurs in response to chronic immune
stimulation in older individuals (Kennedy et al., 2016; Tu and Rao,
2016). CD31 is known as a good marker of CD4+ T cells (Larbi and
Fulop, 2014; Douaisi et al., 2017) whereas CD79A is a highly reliable
marker for B-cells. Downregulation of several genes has been associated
with a decrease in immune functions. For example, KLF4, whose
expression may wane with age, is known to control immune function
(Kennedy et al., 2016). TSPAN33 also wanes with age and plays a role
in red blood cell differentiation (Kennedy et al., 2016). Interleukin-7
(IL-7) plays a central, critical role in the homeostasis of the immune
system (Nguyen et al., 2017a,b). Immunosenescence is also correlated
with a lower expression level of IL-2, as highly differentiated T cells
accumulate with age and are unable to produce IL-2 (Henson and
Akbar, 2009). Other examples of genes whose expression is positively
or negatively correlated with the onset of immunosenescence are
described elsewhere (Bellavista and Franceschi, 2009; Opal et al., 2005;
Xu and Larbi, 2017; Rosenstiel et al., 2008). There have been several
studies where computational methods were applied to investigate the
mechanisms behind the onset of immunosenescence, especially its
interconnection with inflammation (Morrisette-Thomas et al., 2014;
Bektas et al., 2017). In the near future, our understanding of
immunosenescence and its interconnection with other processes could
take advantage of methodologies similar to the ones used for
predicting biological age. AI technology could identify key regulators
involved in the onset of immunosenescence and reveal the complexity
of the interplay with other key biological processes. These regulators
could, in turn, become targets for developing appropriate treatments.
5. AI for cross-species aging research
Demographic data, or life tables, such as the ones from the Human
Mortality Database (http://www.mortality.org) provide information to
analyze demographic trends including mortality and fertility rates.
Using life tables, one can extract survival curves showing the
proportion of individuals surviving to each age for a given species. The
analysis of these curves demonstrates that they exhibit specific topological
features which provide information about the specific aging patterns of
each species (Jones et al., 2014). As described in (Demetrius, 1978),
survival curves can be broadly classified into three types. Type-I
survival curves change little at early and middle ages and then decline at
late ages, as seen for humans. Type-II curves decrease almost linearly
with age, as seen for short-lived birds. Type-III curves decrease quickly
at early ages, as seen for most plants. Nevertheless, major, non-trivial
topological features of these curves are still poorly understood and a
current challenge in fundamental aging research is to find how aging
patterns and mortality curves are shaped.

Fig. 5. AI can be used in different ways for designing personalized treatments. AI platforms can be used as a diagnostic tool to reduce error rate. They are also useful for
stratifying patients according to their specific health condition. By combining more accurate diagnostics and a better knowledge of the health conditions of the patients,
AI platforms can be applied to design more effective treatments.

To that end, it is necessary to
identify the mechanisms responsible for the observed shapes. Currently,
there are several causal and relatively complex mechanistic models
which have been built to describe some of these topological features
(National Research Council (US) Committee on Population, 2012).
Although some of the models successfully predict the curves of several
species, they do not provide a complete framework for explaining the
diversity of aging patterns observed through the tree of life (Liu, 2015;
Dolejs, 1997; Kogan et al., 2015). The difficulty comes from the fact
that the mortality curves are the result of complex relationships
between living styles, effects of natural selection, environmental
conditions, and fine-tuning of cellular mechanisms. Many of these parameters
are specific to each species (Vanhaelen, 2015; Vanhaelen, 2018).
Although these models can include variables such as resource availability,
reproduction rate, and the effects of competition between species,
considering the effects of fine-tuning of cellular mechanisms is still
challenging. This requires having a detailed description of these
mechanisms, that is, a description of how aging occurs and propagates
within the living system. It is well known that there are many
mechanisms contributing to aging, including inflammation, apoptosis,
oxidative stress, accumulation of DNA damage, cell cycle deregulation,
mitochondrial dysfunction, and telomere shortening, to name a few. A
classical modeling approach would be to elaborate models combining
the effects of accumulation of mutations, senescence, effects of natural
selection and any other parameter and process supposed to intervene in
the onset and propagation of aging. Such a task presents tremendous
technical challenges (Tarkhov et al., 2017) and is made even more
complex by the fact that one can reasonably assume many of the
mechanisms or biological parameters which should be included in such
models are still either poorly understood or even completely unknown.
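As a rough illustration of how a survival curve is obtained from a life table and how the three curve types differ, the Python sketch below converts age-specific death probabilities into a survivorship curve and labels its shape. The mortality values, the function names and the threshold rule comparing early and late mortality are illustrative assumptions, not the classification procedure of (Demetrius, 1978).

```python
# Minimal sketch: from age-specific death probabilities q[x] to a survival
# curve, plus a crude shape label. All numbers below are invented toy values.

def survivorship(q):
    """Proportion alive at the start of each age class: l[0] = 1, l[x+1] = l[x] * (1 - q[x])."""
    curve = [1.0]
    for qx in q:
        curve.append(curve[-1] * (1.0 - qx))
    return curve


def curve_type(q, tolerance=0.25):
    """Heuristic label (an assumption of this sketch): compare the mean
    mortality in the first and second halves of the age range."""
    if len(q) < 2:
        raise ValueError("need at least two age classes")
    half = len(q) // 2
    early = sum(q[:half]) / half
    late = sum(q[half:]) / (len(q) - half)
    if late > early * (1 + tolerance):
        return "Type-I"    # mortality concentrated at late ages
    if late < early * (1 - tolerance):
        return "Type-III"  # mortality concentrated at early ages
    return "Type-II"       # roughly constant mortality


human_like = [0.01] * 6 + [0.5] * 4      # low early mortality, high late mortality
bird_like = [0.2] * 10                   # constant mortality
plant_like = [0.8, 0.6] + [0.05] * 8     # heavy early losses

labels = [curve_type(q) for q in (human_like, bird_like, plant_like)]
```

A data-driven approach would replace the fixed threshold with features learned from many life tables; the hand-written rule is only meant to make the three shapes tangible.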
Another approach (Fig. 6) would take advantage of the ability of AI
techniques to design specific algorithms whose goal would be to
systematically analyze the demographic data available for various species
in order to identify and extract the major features behind the shape of
survival curves. In addition, AI platforms could be designed to perform
cross-species analysis of such data. This could allow the examination of
common and distinct features of the aging process across different
species. The results of such investigations could lead to the
identification of generalized aging biomarkers. This approach could also be used
to analyze how evolution has shaped different species' aging patterns.
6. Generative adversarial networks (GANs) for generation of
synthetic data and target identification
As previously described, GANs represent a powerful new tool for the
generation of synthetic data. In situations where patient-specific
datasets are scarce, it is possible to use GANs to significantly augment the
original dataset by producing new data across the broad spectrum of
ages. It is also possible to simulate patient cases that did not exist in
nature by generating patients older than the current record of 122.5
years. This powerful technique can also be used to infer causality and
identify actionable biological processes or targets. By generating the
"older" or "younger" representation of the individual patient or patient
subpopulations, it is possible to identify the most important features
responsible for this change and explore the dependencies between these
features. The illustration in Fig. 7 (A) demonstrates this concept using
photographic data. It is clear from the photographs that the generated
130-year-old subject looks older than the original or synthetic
20-year-old subject. The application of GANs to the generation of shorter- and
longer-lived subjects depicted in Fig. 7 (B) may help identify the drivers
of the aging process as well as the protective mechanisms.
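The adversarial principle behind this kind of data generation can be sketched on a one-dimensional toy problem. In the Python example below, a linear generator and a logistic discriminator are trained against each other with hand-derived gradients. All names, parameter values and the synthetic "biomarker" data are hypothetical; such a small model is only meant to show the alternating update scheme and is not the architecture of any age-conditioned GAN discussed in this review.

```python
import math
import random

EPS = 1e-12  # avoids log(0) if the discriminator saturates


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def discriminator(x, w, c):
    """Estimated probability that sample x is real."""
    return sigmoid(w * x + c)


def generator(z, a, b):
    """Maps a noise value z to a synthetic sample."""
    return a * z + b


def d_loss_grads(real, fake, w, c):
    """Loss -E[log D(real)] - E[log(1 - D(fake))] and its gradients in (w, c)."""
    loss = gw = gc = 0.0
    for x in real:
        d = discriminator(x, w, c)
        loss -= math.log(d + EPS) / len(real)
        gw += (d - 1.0) * x / len(real)
        gc += (d - 1.0) / len(real)
    for x in fake:
        d = discriminator(x, w, c)
        loss -= math.log(1.0 - d + EPS) / len(fake)
        gw += d * x / len(fake)
        gc += d / len(fake)
    return loss, gw, gc


def g_loss_grads(noise, a, b, w, c):
    """Non-saturating generator loss -E[log D(G(z))] and its gradients in (a, b)."""
    loss = ga = gb = 0.0
    for z in noise:
        d = discriminator(generator(z, a, b), w, c)
        loss -= math.log(d + EPS) / len(noise)
        ga += (d - 1.0) * w * z / len(noise)
        gb += (d - 1.0) * w / len(noise)
    return loss, ga, gb


def train(real_data, steps=2000, batch=32, lr=0.05, seed=0):
    """Alternate one discriminator step and one generator step per iteration."""
    rng = random.Random(seed)
    a, b, w, c = 1.0, 0.0, 0.1, 0.0  # arbitrary starting values
    for _ in range(steps):
        real = rng.sample(real_data, batch)  # requires len(real_data) >= batch
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [generator(z, a, b) for z in noise]
        _, gw, gc = d_loss_grads(real, fake, w, c)
        w, c = w - lr * gw, c - lr * gc
        _, ga, gb = g_loss_grads(noise, a, b, w, c)
        a, b = a - lr * ga, b - lr * gb
    return a, b, w, c


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Hypothetical standardized biomarker values standing in for patient data.
    biomarker = [data_rng.gauss(2.0, 0.5) for _ in range(256)]
    a, b, w, c = train(biomarker)
    synthetic = [generator(data_rng.gauss(0.0, 1.0), a, b) for _ in range(5)]
    print(synthetic)
```

Practical GANs replace both linear maps with deep networks and rely on automatic differentiation; conditioning the generator on age would add age as an extra input to both players.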
7. Aging research for advancing artificial intelligence
While advances in AI are already making substantial contributions
to research in aging, the computational solutions specifically developed
for aging research could substantially advance research in AI. For
optimal use in aging research, AI should not only provide correct
predictions, but also give information about the features used to obtain the
predictions. The results provided should be interpretable in terms of
initial inputs, which can be of highly diverse origins. There have
already been several breakthroughs in making AI systems more
interpretable, contributing to the development of new memory systems
capable of capturing multi-modal continuous data and efficiently
forgetting unnecessary information.
Fig. 6. AI algorithms are capable of handling large amounts of data and identifying relevant patterns without prior information about the related phenomena. For this
reason, AI algorithms are adapted to analyze the complexity of demographic data. Analysis can include the extraction of the most important features used to
reproduce the topology of species. Specific curves can give an overview of the most important mechanisms involved. Cross-species analysis could provide
explanations about the origins of the huge diversity observed through the tree of life. The information gathered could be used to design generalized aging biomarkers,
identify common features of aging across species, and improve the understanding of how evolution shaped aging across species.

Improving the interpretability of AI-based algorithms can be done
using two different complementary approaches. First, the complex data
collated in many biological databases are well suited to DL but also
contain challenging features, including high dimensionality, noise, and
multiple, often incompatible, platforms. Consequently, while AI
architectures are able to extract features from the data automatically and
usually outperform other ML approaches in feature extraction tasks, it is
recommended to select a set of relevant features before training DL
models. Feature selection and extraction can involve dimensionality
reduction. Generic methods such as principal component analysis or
clustering methods can be applied. However, other feature extraction
methods preserving the biological function can also be used. In
practice, the appropriate reduction and feature extraction methods heavily
depend on the context of the study. For instance, when dealing with
transcriptomic data, dimensionality reduction should be applied prior
to training DNNs because the dataset contains a number of samples
much smaller than the number of genes. Supervised, knowledge-based
gene aggregation approaches, such as pathway analysis, are the most
suitable tools to that end. The list of the most relevant
perturbed pathways is then taken as the new list of features. The
strategy is to ensure that the features are relevant with respect to the
biological problem under study. Second, besides the appropriate selection
of features prior to the training of the model, the predictions obtained from
AI algorithms can be better interpreted by using methods such as the
permutation feature importance (PFI) technique (Altmann et al., 2010),
which allows evaluating the relative importance of features to DNN
prediction accuracy. Specific types of architectures, like DFS models (Li
et al., 2016), automatically implement similar procedures. These
ranked lists of features can be used as a starting point to investigate the
mechanisms behind the observed biological behavior and to put the
prediction of the algorithm in context.
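The idea behind permutation feature importance can be sketched in a few lines of Python: each feature column is shuffled in turn, and the resulting drop in accuracy is taken as that feature's importance. The toy model, the feature names and the function names below are hypothetical, and this is the basic permutation scheme rather than necessarily the exact procedure of (Altmann et al., 2010).

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled.

    predict: callable mapping a list of feature rows to a list of labels.
    X: list of rows (lists of feature values); y: list of true labels.
    """
    rng = random.Random(seed)
    baseline = accuracy(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical pathway-level features; the toy "model" only uses the first one.
FEATURES = ["pathway_A_score", "pathway_B_score"]


def toy_predict(rows):
    return [1 if row[0] > 0.5 else 0 for row in rows]


data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = toy_predict(X)  # labels generated by the toy model itself
ranking = sorted(zip(FEATURES, permutation_importance(toy_predict, X, y)),
                 key=lambda item: item[1], reverse=True)
```

Because the procedure only needs a prediction function, the same loop applies unchanged to a trained DNN; the ranked list it returns is the kind of starting point for mechanistic follow-up that the paragraph above describes.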
8. Artificial general intelligence (AGI) for aging research
While there are multiple efforts to develop artificial general
intelligence (AGI), also referred to as sentient AI, and even to transfer
human memory and capabilities into computers, there is no proof of
concept demonstrating the feasibility of any of these approaches
(Wallach et al., 2010; Deca and Koene, 2014). However, there is
substantial debate on AI safety and ethics. Regardless of the winning
approach to AGI and the probability of AGI emerging in the near future, it
may be important to develop a values-based rule book to train AGI to
maximize the number of quality-adjusted life years (QALY) for
everyone in the population. Maximizing global longevity and human
health span should be taught as the ultimate form of altruism to AGI.
9. Conclusion
The revolution in deep learning, which started with deep neural
networks outperforming humans in the ImageNet competition (He et al.,
2015) and RL in video games (Mnih et al., 2015), is rapidly propagating
into aging research. These efforts are driven by academia and industry
with the influx of government funding and venture capital. Aging is a
universal feature possessed by most living organisms, and therefore,
most of the advances in deep learning in the context of aging research
are in the field of biomarker development. Many age predictors,
commonly referred to as "aging clocks", have been developed for multiple data
types ranging from basic clinical blood tests, photos, videos, voice,
retinal scans, and medical imaging to microbiome data. These feature
selection, feature importance analysis, multimodal data analysis
and causal model development efforts not only help estimate the
biological relevance of these data types but also help advance research in
AI by making the DNNs more interpretable.

Fig. 7. GANs are well known for their abilities to generate models from data. These features have been largely used for image generation. Combining this ability with
the capabilities of GANs to handle large sets of data to capture complex features across individuals at different ages, one can build a GAN-based architecture to generate
synthetic health data and pictures of individuals at any age using the health data at a known given age.

The applications of AI presented in this work illustrate that AI
technologies are rapidly emerging and are starting to deliver promising
results in different fields of aging and longevity research (Fig. 8). One
can expect that multidisciplinary approaches combining the ability of
modern AI to generalize, learn strategies, and generate new models, objects
and data from learned features with accurate methods for feature
extraction and causality analysis will lead to new applications in every
area of preventative, regenerative and restorative medicine. AI is
progressively moving from the status of an overhyped technology with only
a few proof-of-concept examples to a massively adopted and accepted
trend in healthcare. An example of this trend occurred in early 2017
when the first DNN-based platform, Arterys Cardio DL, was officially
approved by the FDA. This platform is now widely used in the clinic.
Systems like Young.AI (http://young.ai) and aging.ai (http://aging.ai),
which estimate the biological age of a person using multiple
data types, may provide valuable insights into the person's health status
and evolve into disease-specific applications. Multi-modal integration
of the multiple aging clocks using modern AI will lead to a more holistic
approach to the understanding of biology and provide a unified theory
of aging and repair.
Encouraging progress can be seen from regulatory institutions.
Many regulatory authorities have initiated the development of a
regulatory framework to promote innovation and support the use of AI
technologies in healthcare. The first cloud-based DNN has recently been
approved by the FDA under the category of medical devices. In the EU,
a legislative proposal for regulations related to software for medical
devices for prediction and prognosis is currently under review.
In addition to the many technical challenges still faced by AI
technologies, another major concern in the application of AI technologies
within healthcare is related to the acquisition, generation, and use of
health data. Many people consider their health information private and
agree that it should be protected accordingly. Patients usually want to
know how their information is being handled. The fact that the transfer
of medical records from paper to electronic formats could increase the
chances of individuals accessing, using, or disclosing sensitive personal
health data has triggered many privacy concerns.
To address these concerns, regulatory efforts are underway to
ensure proper flow and use of healthcare records. The most recent
development is the enforcement of the General Data Protection Regulation
(GDPR) in Europe, which has strong implications for the
development of AI-based platforms. Although these regulations are
welcome and necessary to avoid abusive practices, regulatory
institutions should ensure that they do not become barriers to meaningful
technological development. The ability of AI to make accurate
predictions is heavily dependent on data availability. Access and regulation
should take into account that collaboration between healthcare and
AI-based companies is necessary to establish an efficient pipeline for data
acquisition.
The population specificity of many aging biomarkers
demonstrates the need for international collaborations and consortiums
focused on data economics, generation, and exchange, model exchange
and validation as well as meta-analysis, clinical trials and educational
programs.
Competing interest
Alex Zhavoronkov, Polina Mamoshina, Quentin Vanhaelen and Alex
Aliper are affiliated with Insilico Medicine, Inc., a company engaged in
aging research, which designs and uses AI-based algorithms for de novo
molecule generation and is also involved in biomarker development,
and hence may have competing financial interests.
References
Akkus, Z., Galimzianova, A., Hoogi, A., Rubin, D.L., Erickson, B.J., 2017. Deep learning
for brain MRI segmentation: state of the art and future directions. J. Digit. Imaging
30, 449–459.
Alaimo, S., Giugno, R., Pulvirenti, A., 2016. Recommendation techniques for drug–target
interaction prediction and drug repositioning. Methods in Molecular Biology. pp.
441–462.
Alexander, J.L., Wilson, I.D., Teare, J., Marchesi, J.R., Nicholson, J.K., Kinross, J.M.,
2017. Gut microbiota modulation of chemotherapy efficacy and toxicity. Nat. Rev.
Gastroenterol. Hepatol. 14, 356–365.
Aliper, A., Plis, S., Artemov, A., Ulloa, A., Mamoshina, P., Zhavoronkov, A., 2016. Deep
learning applications for predicting pharmacological properties of drugs and drug
repurposing using transcriptomic data. Mol. Pharm. 13, 2524–2530.
Altmann, A., Toloşi, L., Sander, O., Lengauer, T., 2010. Permutation importance: a
corrected feature importance measure. Bioinformatics 26, 1340–1347.
Altman, N.S., 1992. An introduction to kernel and nearest-neighbor nonparametric
regression. Am. Stat. 46, 175–185.
Arita, Y., Harkness, S.H., Kazzaz, J.A., Koo, H.-C., Joseph, A., Melendez, J.A., Davis, J.M.,
Chander, A., Li, Y., 2006. Mitochondrial localization of catalase provides optimal
protection from H2O2-induced cell death in lung epithelial cells. Am. J. Physiol. Lung
Cell Mol. Physiol. 290, L978–86.
Artemov, A., Aliper, A., Korzinkin, M., Lezhnina, K., Jellen, L., Zhukov, N., Roumiantsev,
S., Gaifullin, N., Zhavoronkov, A., Borisov, N., Buzdin, A., 2015. A method for
predicting target drug efficiency in cancer based on the analysis of signaling pathway
activation. Oncotarget 6, 29347–29356.
Arulkumaran, K., Deisenroth, M.P., Brundage, M., Bharath, A.A., 2017. Deep
reinforcement learning: a brief survey. IEEE Signal Process. Mag. 34, 26–38.
Ayyadevara, V.K., Kishore Ayyadevara, V., 2018. Gradient boosting machine. Pro
Machine Learning Algorithms. pp. 117–134.
Badrinarayanan, V., Kendall, A., Cipolla, R., 2017. SegNet: a deep convolutional
encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
39, 2481–2495.
Bai, J., Rodriguez, A.M., Melendez, J.A., Cederbaum, A.I., 1999. Overexpression of
catalase in cytosolic or mitochondrial compartment protects HepG2 cells against
oxidative injury. J. Biol. Chem. 274, 26217–26224.
Bär, C., Povedano, J.M., Serrano, R., Benitez-Buelga, C., Popkes, M., Formentini, I.,
Bobadilla, M., Bosch, F., Blasco, M.A., 2016. Telomerase gene therapy rescues
telomere length, bone marrow aplasia, and survival in mice with aplastic anemia. Blood
127, 1770–1779.
Bektas, A., Schurman, S.H., Sen, R., Ferrucci, L., 2017. Human T cell immunosenescence
and inflammation in aging. J. Leukoc. Biol. 102, 977–988.
Bellavista, E., Franceschi, C., 2009. Neuroimmune system: aging. Encyclopedia of
Neuroscience. pp. 471–476.
Bennett, C.W., Berchem, G., Kim, Y.J., El-Khoury, V., 2016. Cell-free DNA and
next-generation sequencing in the service of personalized medicine for lung cancer.
Oncotarget 7, 71013–71035.
Bermudez, C., et al., 2017. Accurate Age Estimation in a Pediatric Population Using Deep
Learning on T1-weighted MRI Structural Features.

Fig. 8. AI is progressively deployed in aging research. AI-based methods have
already been applied in different areas and contribute to optimizing research and
development pipelines. AI-based methods can either be used as standalone
approaches or integrated within wider end-to-end learning pipelines solving
complex tasks from hypothesis generation and target identification to real
world evidence analysis. These pipelines can include computational methods
used for more efficient feature selection and prior dimensionality reduction.
Both of these steps are likely to lead to more accurate and biologically relevant
outputs and substantially accelerate aging and disease research.

Bjerrum, E.J., Threlfall, R., 2017. Molecular Generation with Recurrent Neural
Networks (RNNs). arXiv Preprint arXiv:1705.04612.
Bobrov, E., Georgievskaya, A., Kiselev, K., Sevastopolsky, A., Zhavoronkov, A., Gurov, S.,
Rudakov, K., del Pilar Bonilla Tobar, M., Jaspers, S., Clemann, S., 2018.
PhotoAgeClock: deep learning algorithms for development of non-invasive visual
biomarkers of aging. Aging (Albany NY) (Nov). https://doi.org/10.18632/aging.101629.
Boccardi, V., Herbig, U., 2012. Telomerase gene therapy: a novel approach to combat
aging. EMBO Mol. Med. 4, 685–687.
Bolotin, D.A., Poslavsky, S., Davydov, A.N., Frenkel, F.E., Fanchi, L., Zolotareva, O.I.,
Hemmers, S., Putintseva, E.V., Obraztsova, A.S., Shugay, M., Ataullakhanov, R.I.,
Rudensky, A.Y., Schumacher, T.N., Chudakov, D.M., 2017. Antigen receptor repertoire
profiling from RNA-seq data. Nat. Biotechnol. 35, 908–911.
Breiman, L., 2001. Random forest. Mach. Learn. 45 (1), 5–32.
Budovsky, A., Craig, T., Wang, J., Tacutu, R., Csordas, A., Lourenço, J., Fraifeld, V.E., de
Magalhães, J.P., 2013. LongevityMap: a database of human genetic variants associated
with longevity. Trends Genet. 29, 559–560.
Bulati, M., Bua, S., Candore, G., Caruso, C., Dunn-Walters, D.K., Pellicanò, M., Wu, Y.-C.,
Colonna Romano, G., 2011. B cells and immunosenescence: a focus on IgG+IgD-
CD27- (DN) B cells in aged humans. Ageing Res. Rev. 10, 274–284.
Capper, D., Jones, D.T.W., Sill, M., Hovestadt, V., Schrimpf, D., Sturm, D., Koelsche, C.,
Sahm, F., Chavez, L., Reuss, D.E., Kratz, A., Wefers, A.K., Huang, K., Pajtler, K.W.,
Schweizer, L., Stichel, D., Olar, A., Engel, N.W., Lindenberg, K., Harter, P.N.,
Braczynski, A.K., Plate, K.H., Dohmen, H., Garvalov, B.K., Coras, R., Hölsken, A.,
Hewer, E., Bewerunge-Hudler, M., Schick, M., Fischer, R., Beschorner, R.,
Schittenhelm, J., Staszewski, O., Wani, K., Varlet, P., Pages, M., Temming, P.,
Lohmann, D., Selt, F., Witt, H., Milde, T., Witt, O., Aronica, E., Giangaspero, F.,
Rushing, E., Scheurlen, W., Geisenberger, C., Rodriguez, F.J., Becker, A., Preusser,
M., Haberler, C., Bjerkvig, R., Cryan, J., Farrell, M., Deckert, M., Hench, J., Frank, S.,
Serrano, J., Kannan, K., Tsirigos, A., Brück, W., Hofer, S., Brehmer, S., Seiz-
Rosenhagen, M., Hänggi, D., Hans, V., Rozsnoki, S., Hansford, J.R., Kohlhof, P.,
Kristensen, B.W., Lechner, M., Lopes, B., Mawrin, C., Ketter, R., Kulozik, A., Khatib,
Z., Heppner, F., Koch, A., Jouvet, A., Keohane, C., Mühleisen, H., Mueller, W., Pohl,
U., Prinz, M., Benner, A., Zapatka, M., Gottardo, N.G., Driever, P.H., Kramm, C.M.,
Müller, H.L., Rutkowski, S., von Hoff, K., Frühwald, M.C., Gnekow, A., Fleischhack,
G., Tippelt, S., Calaminus, G., Monoranu, C.-M., Perry, A., Jones, C., Jacques, T.S.,
Radlwimmer, B., Gessi, M., Pietsch, T., Schramm, J., Schackert, G., Westphal, M.,
Reifenberger, G., Wesseling, P., Weller, M., Collins, V.P., Blümcke, I., Bendszus, M.,
Debus, J., Huang, A., Jabado, N., Northcott, P.A., Paulus, W., Gajjar, A., Robinson,
G.W., Taylor, M.D., Jaunmuktane, Z., Ryzhova, M., Platten, M., Unterberg, A., Wick,
W., Karajannis, M.A., Mittelbronn, M., Acker, T., Hartmann, C., Aldape, K., Schüller,
U., Buslei, R., Lichter, P., Kool, M., Herold-Mende, C., Ellison, D.W., Hasselblatt, M.,
Snuderl, M., Brandner, S., Korshunov, A., von Deimling, A., Pfister, S.M., 2018. DNA
methylation-based classification of central nervous system tumours. Nature 555,
469–474.
Ching, T., Himmelstein, D.S., Beaulieu-Jones, B.K., Kalinin, A.A., Do, B.T., Way, G.P.,
Ferrero, E., Agapow, P.-M., Zietz, M., Hoffman, M.M., Xie, W., Rosen, G.L., Lengerich,
B.J., Israeli, J., Lanchantin, J., Woloszynek, S., Carpenter, A.E., Shrikumar, A., Xu, J.,
Cofer, E.M., Lavender, C.A., Turaga, S.C., Alexandari, A.M., Lu, Z., Harris, D.J.,
DeCaprio, D., Qi, Y., Kundaje, A., Peng, Y., Wiley, L.K., Segler, M.H.S., Boca, S.M.,
Swamidass, S.J., Huang, A., Gitter, A., Greene, C.S., 2018. Opportunities and obstacles
for deep learning in biology and medicine. J. R. Soc. Interface 15.
https://doi.org/10.1098/rsif.2017.0387.
Chollet, F., 2017. Xception: deep learning with depthwise separable convolutions. 2017
IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
https://doi.org/10.1109/cvpr.2017.195.
Cohen, A.A., Morissette-Thomas, V., Ferrucci, L., Fried, L.P., 2016. Deep biomarkers of
aging are population-dependent. Aging 8, 2253–2255.
Cole, J.H., Poudel, R.P.K., Tsagkrasoulis, D., Caan, M.W.A., Steves, C., Spector, T.D.,
Montana, G., 2016. Predicting Brain Age with Deep Learning from Raw Imaging Data
Results in a Reliable and Heritable Biomarker. arXiv:1612.02572 [stat.ML].
Cortes, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20, 273–297.
Dai, P., Harada, Y., Takamatsu, T., 2015. Highly efficient direct conversion of human
fibroblasts to neuronal cells by chemical compounds. J. Clin. Biochem. Nutr. 56,
166–170.
Dalton, S., 2013. Signaling networks in human pluripotent stem cells. Curr. Opin. Cell
Biol. 25, 241–246.
Daniel, M.G., Lemischka, I.R., Moore, K., 2016. Converting cell fates: generating hema-
topoietic stem cells de novo via transcription factor reprogramming. Ann. N. Y. Acad.
Sci. 1370, 24–35.
Deca, D., Koene, R.A., 2014. Experimental enhancement of neurphysiological function.
Front. Syst. Neurosci. 8, 189.
Demetrius, L., 1978. Adaptive value, entropy and survivorship curves. Nature 275,
213–214.
Di Meo, A., Pasic, M.D., Yousef, G.M., 2016. Proteomics and peptidomics: moving toward
precision medicine in urological malignancies. Oncotarget 7, 52460–52474.
Dolejs, J., 1997. The extension of Gompertz law's validity. Mech. Ageing Dev. 99,
233–244.
Douaisi, M., Resop, R.S., Nagasawa, M., Craft, J., Jamieson, B.D., Blom, B., Uittenbogaart,
C.H., 2017. CD31, a valuable marker to identify early and late stages of t cell
differentiation in the human Thymus. J. Immunol. 198, 2310–2319.
Fabris, F., Magalhães, J.P., de, Freitas, A.A., 2017. A review of supervised machine
learning applied to ageing research. Biogerontology 18, 171–188.
Fan, X.-N., Zhang, S.-W., 2015. lncRNA-MFDL: identification of human long non-coding
RNAs by fusing multiple features and using deep learning. Mol. Biosyst. 11, 892–897.
Flament, F., Bazin, R., Laquieze, S., Rubert, V., Simonpietri, E., Piot, B., 2013. Effect of the
sun on visible clinical signs of aging in Caucasian skin. Clin. Cosmet. Investig.
Dermatol. 6, 221–232.
Fleming, N., 2018. How artificial intelligence is changing drug discovery. Nature 557,
S55–S57.
Fratello, M., Tagliaferri, R., 2018. Decision trees and random forests. Reference Module in
Life Sciences.
Friedman, J.H., 2001. Greedy function approximation: a gradient boosting machine.
Ann. Stat. 29, 1189–1232.
Fulop, T., Kotb, R., Fortin, C.F., Pawelec, G., de Angelis, F., Larbi, A., 2010. Potential role
of immunosenescence in cancer development. Ann. N. Y. Acad. Sci. 1197, 158–165.
Gawehn, E., Hiss, J.A., Schneider, G., 2015. Deep learning in drug discovery. Mol. Inform.
35, 3–14.
Gleeson, F.C., Voss, J.S., Kipp, B.R., Kerr, S.E., Van Arnam, J.S., Mills, J.R., Marcou, C.A.,
Schneider, A.R., Tu, Z.J., Henry, M.R., Levy, M.J., 2017. Assessment of pancreatic
neuroendocrine tumor cytologic genotype diversity to guide personalized medicine
using a custom gastroenteropancreatic next-generation sequencing panel. Oncotarget
8, 93464–93475.
Gómez-Bombarelli, R., Wei, J.N., Duvenaud, D., Hernández-Lobato, J.M., Sánchez-
Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T.D., Adams, R.P.,
Aspuru-Guzik, A., 2018. Automatic chemical design using a data-driven continuous
representation of molecules. ACS Cent. Sci. 4, 268–276.
Goodfellow, I.J., 2017. NIPS 2016 Tutorial: Generative Adversarial Networks.
arXiv:1701.00160v4 [cs.LG].
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A., Bengio, Y., 2014. Generative adversarial nets. Adv. Neural Inf. Process.
Syst. 2672–2680.
Guimaraes, G.L., Sanchez-Lengeling, B., Farias, P.L.C., Aspuru-Guzik, A., 2017. Objective-
Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation
Models. arXiv Preprint arXiv:1705.10843.
Gupta, A., Eysenbach, B., Finn, C., Levine, S., 2018. Unsupervised Meta-Learning for
Reinforcement Learning. arXiv:1806.04640 [cs.LG].
Hannum, G., Guinney, J., Zhao, L., Zhang, L., Hughes, G., Sadda, S., Klotzle, B., Bibikova,
M., Fan, J.-B., Gao, Y., Deconde, R., Chen, M., Rajapakse, I., Friend, S., Ideker, T.,
Zhang, K., 2013. Genome-wide methylation profiles reveal quantitative views of
human aging rates. Mol. Cell 49, 359–367.
He, K., Zhang, X., Ren, S., Sun, J., 2015. Delving deep into rectifiers: surpassing human-
level performance on ImageNet classification. 2015 IEEE International Conference on
Computer Vision (ICCV). https://doi.org/10.1109/iccv.2015.123.
Henson, S.M., Akbar, A.N., 2009. KLRG1 – more than a marker for T cell senescence. Age
31, 285–291.
Ho, T.K., 1995. Random decision forests. Proceedings of 3rd International Conference on
Document Analysis and Recognition. Presented at the 3rd International Conference
on Document Analysis and Recognition, pp. 278–282.
Hodos, R.A., Kidd, B.A., Shameer, K., Readhead, B.P., Dudley, J.T., 2016. In silico
methods for drug repurposing and pharmacology. Wiley Interdiscip. Rev. Syst. Biol.
Med. 8, 186–210.
Horvath, S., 2013. DNA methylation age of human tissues and cell types. Genome Biol.
14, R115.
Hu, Z., Tang, J., Wang, Z., Zhang, K., Zhang, L., Sun, Q., 2018. Deep learning for image-
based cancer detection and diagnosis – a survey. Pattern Recognit. 83, 134–149.
Iglesias-Bartolome, R., Gutkind, J.S., 2011. Signaling circuitries controlling stem cell fate:
to be or not to be. Curr. Opin. Cell Biol. 23, 716–723.
Ionov, Y., 2010. A high throughput method for identifying personalized tumor-associated
antigens. Oncotarget 1, 148–155.
Issa, N.T., Byers, S.W., Dakshanamurthy, S., 2014. Big data: the next frontier for innovation
in therapeutics and healthcare. Expert Rev. Clin. Pharmacol. 7, 293–298.
Jiang, Z., Han, Y., Cao, X., 2014. Induced pluripotent stem cell (iPSCs) and their application
in immunotherapy. Cell. Mol. Immunol. 11, 17–24.
Jones, O.R., Scheuerlein, A., Salguero-Gómez, R., Camarda, C.G., Schaible, R., Casper,
B.B., Dahlgren, J.P., Ehrlén, J., García, M.B., Menges, E.S., Quintana-Ascencio, P.F.,
Caswell, H., Baudisch, A., Vaupel, J.W., 2014. Diversity of ageing across the tree of
life. Nature 505, 169–173.
Kadurin, A., Aliper, A., Kazennov, A., Mamoshina, P., Vanhaelen, Q., Khrabrov, K.,
Zhavoronkov, A., 2017a. The cornucopia of meaningful leads: applying deep adver-
sarial autoencoders for new molecule development in oncology. Oncotarget 8,
10883–10890.
Kadurin, A., Nikolenko, S., Khrabrov, K., Aliper, A., Zhavoronkov, A., 2017b. druGAN: an
advanced generative adversarial autoencoder model for de novo generation of new
molecules with desired molecular properties in silico. Mol. Pharm. 14, 3098–3104.
Kennedy, R.B., Ovsyannikova, I.G., Haralambieva, I.H., Oberg, A.L., Zimmermann, M.T.,
Grill, D.E., Poland, G.A., 2016. Immunosenescence-related transcriptomic and immunologic
changes in older individuals following influenza vaccination. Front.
Immunol. 7, 450.
Khatri, P., Sirota, M., Butte, A.J., 2012. Ten years of pathway analysis: current approaches
and outstanding challenges. PLoS Comput. Biol. 8, e1002375.
Kim, S.G., Theera-Ampornpunt, N., Fang, C.-H., Harwani, M., Grama, A., Chaterji, S.,
2016. Opening up the blackbox: an interpretable deep neural network-based classifier
for cell-type specific enhancer predictions. BMC Syst. Biol. 10 (Suppl 2), 54.
Kogan, V., Molodtsov, I., Menshikov, L.I., Shmookler Reis, R.J., Fedichev, P., 2015.
Stability analysis of a model gene network links aging, stress resistance, and negli-
gible senescence. Sci. Rep. 5, 13589.
Kolesov, A., Kamyshenkov, D., Litovchenko, M., Smekalova, E., Golovizin, A.,
Zhavoronkov, A., 2014. On multilabel classification methods of incompletely labeled
biomedical text data. Comput. Math. Methods Med. 2014.
Kourou, K., Exarchos, T.P., Exarchos, K.P., Karamouzis, M.V., Fotiadis, D.I., 2015.
Machine learning applications in cancer prognosis and prediction. Comput. Struct.
Biotechnol. J. 13, 8–17.
Kramer, O., 2013. K-nearest neighbors. Intelligent Systems Reference Library. pp. 13–23.
Kulkarni, P., 2017. Reinforcement and deep reinforcement machine learning. Intelligent
Systems Reference Library. pp. 59–83.
Larbi, A., Fulop, T., 2014. From "truly naïve" to "exhausted senescent" T cells: when
markers predict functionality. Cytometry A 85, 25–35.
Lee, D., Fontugne, J., Gumpeni, N., Park, K., MacDonald, T.Y., Robinson, B.D., Sboner, A.,
Rubin, M.A., Mosquera, J.M., Barbieri, C.E., 2017a. Molecular alterations in prostate
cancer and association with MRI features. Prostate Cancer Prostatic Dis. 20, 430–435.
Lee, J.-G., Jun, S., Cho, Y.-W., Lee, H., Kim, G.B., Seo, J.B., Kim, N., 2017b. Deep learning
in medical imaging: general overview. Korean J. Radiol. 18, 570–584.
Lenselink, E.B., Ten Dijke, N., Bongers, B., Papadatos, G., van Vlijmen, H.W.T.,
Kowalczyk, W., IJzerman, A.P., van Westen, G.J.P., 2017. Beyond the hype: deep
neural networks outperform established methods using a ChEMBL bioactivity
benchmark set. J. Cheminform. 9, 45.
Leung, M.K.K., Delong, A., Alipanahi, B., Frey, B.J., 2016. Machine learning in genomic
medicine: a review of computational problems and data sets. Proc. IEEE 104,
176–197.
Levine, M.E., Lu, A.T., Quach, A., Chen, B.H., Assimes, T.L., Bandinelli, S., Hou, L.,
Baccarelli, A.A., Stewart, J.D., Li, Y., Whitsel, E.A., Wilson, J.G., Reiner, A.P., Aviv,
A., Lohman, K., Liu, Y., Ferrucci, L., Horvath, S., 2018. An epigenetic biomarker of
aging for lifespan and healthspan. Aging 10, 573–591.
Li, D., Duan, D., 2013. Mitochondria-targeted antiaging gene therapy with adeno-associated
viral vectors. Methods Mol. Biol. 1048, 161–180.
Listgarten, J., Weinstein, M., Kleinstiver, B.P., Sousa, A.A., Joung, J.K., Crawford, J., Gao,
K., Hoang, L., Elibol, M., Doench, J.G., Fusi, N., 2018. Prediction of off-target activities
for the end-to-end design of CRISPR guide RNAs. Nat. Biomed. Eng. 2, 38–47.
Liu, F., Li, H., Ren, C., Bo, X., Shu, W., 2016. PEDLA: predicting enhancers with a deep
learning-based algorithmic framework. Sci. Rep. 6, 28517.
Liu, J., Pan, Y., Li, M., Chen, Z., Tang, L., Lu, C., Wang, J., 2018. Applications of deep
learning to MRI images: a survey. Big Data Min. Anal. 1, 1–18.
Liu, X., 2015. Life equations for the senescence process. Biochem. Biophys. Rep. 4,
228–233.
Li, X., Shen, L., Shang, X., Liu, W., 2015. Subpathway analysis based on signaling-
pathway impact analysis of signaling pathway. PLoS One 10, e0132813.
Li, Y., Chen, C.-Y., Wasserman, W.W., 2016. Deep feature selection: theory and application
to identify enhancers and promoters. J. Comput. Biol. 23, 322–336.
Loh, K.M., Lim, B., 2011. A precarious balance: pluripotency factors as lineage specifiers.
Cell Stem Cell 8, 363–369.
Malik, N., Rao, M.S., 2013. A review of the methods for human iPSC derivation. Methods
Mol. Biol. 997, 23–33.
Mamoshina, P., Kochetov, K., Putin, E., Cortese, F., Aliper, A., Lee, W.-S., Ahn, S.-M., Uhn,
L., Skjodt, N., Kovalchuk, O., Scheibye-Knudsen, M., Zhavoronkov, A., 2018a.
Population specific biomarkers of human aging: a big data study using South Korean,
Canadian and Eastern European patient populations. J. Gerontol. A Biol. Sci. Med.
Sci. https://doi.org/10.1093/gerona/gly005.
Mamoshina, P., Vieira, A., Putin, E., Zhavoronkov, A., 2016. Applications of deep learning
in biomedicine. Mol. Pharm. 13, 1445–1454.
Mamoshina, P., Volosnikova, M., Ozerov, I.V., Putin, E., Skibina, E., Cortese, F.,
Zhavoronkov, A., 2018b. Machine learning on human muscle transcriptomic data for
biomarker discovery and tissue-specific drug target identification. Front. Genet. 9,
242.
Mamoshina, P., Kochetov, K., Cortese, F., Kovalchuk, A., Aliper, A., Putin, E., Scheibye-
Knudsen, M., Cantor, C., Skjodt, N., Kovalchuk, O., Zhavoronkov, A., 2018c. Blood
Biochemistry Analysis to Detect Smoking Status and Quantify Accelerated Aging in
Smokers. Scientific Reports, 2018, in print.
Mason, L., Baxter, J., Bartlett, P.L., Frean, M., 1999. In: Solla, S.A., Leen, T.K., Müller, K.
(Eds.), Advances in Neural Information Processing Systems 12. MIT Press, pp.
512–518.
Maxmen, A., 2017. Machine learning predicts the look of stem cells. Nature.
https://doi.org/10.1038/nature.2017.21769.
McCue, M.E., McCoy, A.M., 2017. The scope of big data in one medicine: unprecedented
opportunities and challenges. Front. Vet. Sci. 4, 194.
Mitnitski, A.B., 2018. Epigenetic Biomarkers for Biological Age, in: Epigenetics of Aging
and Longevity. Elsevier, pp. 153–170.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A.,
Riedmiller, M., Fidjeland, A.K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A.,
Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., Hassabis, D., 2015.
Human-level control through deep reinforcement learning. Nature 518, 529–533.
Morrisette-Thomas, V., Cohen, A.A., Fülöp, T., Riesco, É., Legault, V., Li, Q., Milot, E.,
Dusseault-Bélanger, F., Ferrucci, L., 2014. Inflamm-aging does not simply reflect
increases in pro-inflammatory markers. Mech. Ageing Dev. 139, 49–57.
Moskalev, A., Anisimov, V., Aliper, A., Artemov, A., Asadullah, K., Belsky, D., Baranova,
A., de Grey, A., Dixit, V.D., Debonneuil, E., Dobrovolskaya, E., Fedichev, P.,
Fedintsev, A., Fraifeld, V., Franceschi, C., Freer, R., Fülöp, T., Feige, J., Gems, D.,
Gladyshev, V., Gorbunova, V., Irincheeva, I., Jager, S., Jazwinski, S.M., Kaeberlein,
M., Kennedy, B., Khaltourina, D., Kovalchuk, I., Kovalchuk, O., Kozin, S., Kulminski,
A., Lashmanova, E., Lezhnina, K., Liu, G.-H., Longo, V., Mamoshina, P., Maslov, A.,
Pedro de Magalhaes, J., Mitchell, J., Mitnitski, A., Nikolsky, Y., Ozerov, I., Pasyukova,
E., Peregudova, D., Popov, V., Proshkina, E., Putin, E., Rogaev, E., Rogina, B.,
Schastnaya, J., Seluanov, A., Shaposhnikov, M., Simm, A., Skulachev, V., Skulachev,
M., Solovev, I., Spindler, S., Stefanova, N., Suh, Y., Swick, A., Tower, J., Gudkov,
A.V., Vijg, J., Voronkov, A., West, M., Wagner, W., Yashin, A., Zemskaya, N.,
Zhumadilov, Z., Zhavoronkov, A., 2017. A review of the biomedical innovations for
healthy longevity. Aging 9, 7–25.
Moskalev, A., Chernyagina, E., de Magalhães, J.P., Barardo, D., Thoppil, H.,
Shaposhnikov, M., Budovsky, A., Fraifeld, V.E., Garazha, A., Tsvetkov, V.,
Bronovitsky, E., Bogomolov, V., Scerbacov, A., Kuryan, O., Gurinovich, R., Jellen,
L.C., Kennedy, B., Mamoshina, P., Dobrovolskaya, E., Aliper, A., Kaminsky, D.,
Zhavoronkov, A., 2015. Geroprotectors.org: a new, structured and curated database
of current therapeutic interventions in aging and age-related disease. Aging 7,
616–628.
Moskalev, A., Zhikrivetskaya, S., Shaposhnikov, M., Dobrovolskaya, E., Gurinovich, R.,
Kuryan, O., Pashuk, A., Jellen, L.J., Aliper, A., Peregudov, A., Zhavoronkov, A., 2016.
Aging Chart: a Community Resource for Rapid Exploratory Pathway Analysis of Age-
related Processes. Nucleic Acids Res. 44 (Database Issue), D894–D899.
Muñoz-Lorente, M.A., Martínez, P., Tejera, Á., Whittemore, K., Moisés-Silva, A.C., Bosch,
F., Blasco, M.A., 2018. AAV9-mediated telomerase activation does not accelerate
tumorigenesis in the context of oncogenic K-Ras-induced lung cancer. PLoS Genet.
14, e1007562.
National Research Council (US) Committee on Population, 2012. Between Zeus and the
Salmon: The Biodemography of Longevity. National Academies Press (US),
Washington (DC).
Neuber, K., Schmidt, S., Mensch, A., 2003. Telomere length measurement and determi-
nation of immunosenescence-related markers (CD28, CD45RO, CD45RA, interferon-
gamma and interleukin-4) in skin-homing T cells expressing the cutaneous lymphocyte
antigen: indication of a non-ageing T-cell subset. Immunology 109, 24–31.
Ng, H.-H., Surani, M.A., 2011. The transcriptional and signalling networks of pluripotency.
Nat. Cell Biol. 13, 490–496.
Nguyen, N.D., Nguyen, T., Nahavandi, S., 2017a. System design perspective for human-
level agents using deep reinforcement learning: a survey. IEEE Access 5,
27091–27102.
Nguyen, V., Mendelsohn, A., Larrick, J.W., 2017b. Interleukin-7 and immunosenescence.
J. Immunol. Res. 2017, 4807853.
Nielsen, J., 2017. Systems biology of metabolism: a driver for developing personalized
and precision medicine. Cell Metab. 25, 572–579.
Niklinski, J., Kretowski, A., Moniuszko, M., Reszec, J., Michalska-Falkowska, A., Niemira,
M., Ciborowski, M., Charkiewicz, R., Jurgilewicz, D., Kozlowski, M., Ramlau, R.,
Piwkowski, C., Kwasniewski, M., Kaczmarek, M., Ciereszko, A., Wasniewski, T.,
Mroz, R., Naumnik, W., Sierko, E., Paczkowska, M., Kisluk, J., Sulewska, A., Cybulski,
A., Mariak, Z., Kedra, B., Szamatowicz, J., Kurzawa, P., Minarowski, L., Charkiewicz,
A.E., Mroczko, B., Malyszko, J., Manegold, C., Pilz, L., Allgayer, H., Abba, M.L., Juhl,
H., Koch, F., MOBIT Study Group, 2017. Systematic biobanking, novel imaging
techniques, and advanced molecular analysis for precise tumor diagnosis and
therapy: the Polish MOBIT project. Adv. Med. Sci. 62, 405–413.
Nowick, K., Stubbs, L., 2010. Lineage-specific transcription factors and the evolution of
gene regulatory networks. Brief. Funct. Genomics 9, 65–78.
Opal, S.M., Girard, T.D., Ely, E.W., 2005. The immunopathogenesis of sepsis in elderly
patients. Clin. Infect. Dis. 41 (Suppl 7), S504–12.
Ozerov, I.V., Lezhnina, K.V., Izumchenko, E., Artemov, A.V., Medintsev, S., Vanhaelen,
Q., Aliper, A., Vijg, J., Osipov, A.N., Labat, I., West, M.D., Buzdin, A., Cantor, C.R.,
Nikolsky, Y., Borisov, N., Irincheeva, I., Khokhlovich, E., Sidransky, D., Camargo,
M.L., Zhavoronkov, A., 2016. In silico pathway activation network decomposition
analysis (iPANDA) as a method for biomarker development. Nat. Commun. 7, 13427.
Pawelec, G., 2018. Age and immunity: what is immunosenescence? Exp. Gerontol.
105, 4–9.
Pawelec, G., 2017. Immunosenescence and cancer. Biogerontology 18, 717–721.
Pawelec, G., Derhovanessian, E., Larbi, A., 2010. Immunosenescence and cancer. Crit.
Rev. Oncol. Hematol. 75, 165–172.
Pereira, S., Pinto, A., Alves, V., Silva, C.A., 2016. Brain tumor segmentation using convolutional
neural networks in MRI images. IEEE Trans. Med. Imaging 35, 1240–1251.
Pretorius, E., Bester, J., 2016. Viscoelasticity as a measurement of clot structure in poorly
controlled type 2 diabetes patients: towards a precision and personalized medicine
approach. Oncotarget 7, 50895–50907.
Polykovskiy, D., Zhebrak, A., Vetrov, D., Ivanenkov, Y., Aladinskiy, V., Bozdaganyan, M.,
Kadurin, A., 2018. Entangled conditional adversarial autoencoder for de-novo drug
discovery. Mol. Pharm. 15 (10), 4398–4405.
Pyrkov, T.V., Slipensky, K., Barg, M., Kondrashin, A., Zhurov, B., Zenin, A., Pyatnitskiy,
M., Menshikov, L., Markov, S., Fedichev, P.O., 2018. Extracting biological age from
biomedical data via deep learning: too much of a good thing? Sci. Rep. 8, 5210.
Redman, L.M., Smith, S.R., Burton, J.H., Martin, C.K., Ilyasova, D., Ravussin, E., 2018.
Metabolic slowing and reduced oxidative damage with sustained caloric restriction
support the rate of living and oxidative damage theories of aging. Cell Metab. 27,
805–815.e4.
Rifaioglu, A.S., Atas, H., Martin, M.J., Cetin-Atalay, R., Atalay, V., Dogan, T., 2018.
Recent applications of deep learning and machine intelligence on in silico drug discovery:
methods, tools and databases. Brief. Bioinform.
https://doi.org/10.1093/bib/bby061.
Rosenstiel, P., Derer, S., Till, A., Häsler, R., Eberstein, H., Bewig, B., Nikolaus, S., Nebel,
A., Schreiber, S., 2008. Systematic expression profiling of innate immune genes defines
a complex pattern of immunosenescence in peripheral and intestinal leukocytes.
Genes Immun. 9, 103–114.
Saint-André, V., Federation, A.J., Lin, C.Y., Abraham, B.J., Reddy, J., Lee, T.I., Bradner,
J.E., Young, R.A., 2016. Models of human core transcriptional regulatory circuitries.
Genome Res. 26, 385–396.
Sanchez-Lengeling, B., Outeiral, C., Guimaraes, G.L., Aspuru-Guzik, A., 2017. Optimizing
distributions over molecular space. An objective-reinforced generative adversarial
network for inverse-design chemistry (ORGANIC). ChemRxiv preprint, 5309668.
Schumacher, A., 2018. Epigenetics of aging and longevity. Epigenetics of Aging and
Longevity. Elsevier, pp. 499–509.
Scudellari, M., 2016. How iPS cells changed the world. Nature 534, 310–312.
Segler, M.H.S., Preuss, M., Waller, M.P., 2018. Planning chemical syntheses with deep
neural networks and symbolic AI. Nature 555, 604–610.
Shaikhina, T., Lowe, D., Daga, S., Briggs, D., Higgins, R., Khovanova, N., 2015. Machine
learning for predictive modelling based on small data in biomedical engineering.
IFAC-PapersOnLine 48, 469–474.
Silva, J., Smith, A., 2008. Capturing pluripotency. Cell 132, 532–536.
Si-Tayeb, K., Noto, F.K., Nagaoka, M., Li, J., Battle, M.A., Duris, C., North, P.E., Dalton, S.,
Duncan, S.A., 2010. Highly efficient generation of human hepatocyte-like cells from
induced pluripotent stem cells. Hepatology 51, 297–305.
Sniecinski, I., Seghatchian, J., 2018. Artificial intelligence: a joint narrative on potential
use in pediatric stem and immune cell therapies and regenerative medicine. Transfus.
Apher. Sci. 57, 422–424.
Soleimani, M., Merheb, M., Matar, R., 2015. Human gene therapy – the future of health
care. Hamdan Med. J. 8, 101.
Sotgia, F., Lisanti, M.P., 2017. Mitochondrial biomarkers predict tumor progression and
poor overall survival in gastric cancers: companion diagnostics for personalized
medicine. Oncotarget 8, 67117–67128.
Spencer, M., Eickholt, J., Cheng, Jianlin, 2015. A deep learning network approach to ab
initio protein secondary structure prediction. IEEE/ACM Trans. Comput. Biol.
Bioinform. 12, 103–112.
Stahl, E.C., Brown, B.N., 2015. Cell therapy strategies to combat immunosenescence.
Organogenesis 11, 159–172.
Tacutu, R., Thornton, D., Johnson, E., Budovsky, A., Barardo, D., Craig, T., Diana, E.,
Lehmann, G., Toren, D., Wang, J., Fraifeld, V.E., de Magalhães, J.P., 2018. Human
ageing genomic resources: new and updated databases. Nucleic Acids Res. 46,
D1083–D1090.
Takahashi, K., Tanabe, K., Ohnuki, M., Narita, M., Ichisaka, T., Tomoda, K., Yamanaka, S.,
2007. Induction of pluripotent stem cells from adult human fibroblasts by defined
factors. Cell 131, 861–872.
Takahashi, K., Yamanaka, S., 2006. Induction of pluripotent stem cells from mouse embryonic
and adult fibroblast cultures by defined factors. Cell 126, 663–676.
Takeda, Y., Harada, Y., Yoshikawa, T., Dai, P., 2017. Direct conversion of human fibroblasts
to brown adipocytes by small chemical compounds. Sci. Rep. 7, 4304.
Tantin, D., 2013. Oct transcription factors in development and stem cells: insights and
mechanisms. Development 140, 2857–2866.
Torabi Moghadam, B., Dabrowski, M., Kaminska, B., Grabherr, M.G., Komorowski, J.,
2016. Combinatorial identification of DNA methylation patterns over age in the
human brain. BMC Bioinformatics 17, 393.
Tarkhov, A.E., Menshikov, L.I., Fedichev, P.O., 2017. Strehler-Mildvan correlation is a
degenerate manifold of Gompertz fit. J. Theor. Biol. 416, 180–189.
Tetko, I.V., Engkvist, O., Koch, U., Reymond, J.-L., Chen, H., 2016. BIGCHEM: challenges
and opportunities for big data analysis in chemistry. Mol. Inform. 35, 615–621.
Torrey, L., Shavlik, J., 2010. Transfer learning. In: Olivas, E.S., Guerrero, J.D.M.,
Martinez-Sober, M., Magdalena-Benedito, J.R., Serrano López, A.J. (Eds.), Handbook
of Research on Machine Learning Applications and Trends. IGI Global, pp. 242–264.
Thomson, M., Liu, S.J., Zou, L.-N., Smith, Z., Meissner, A., Ramanathan, S., 2011.
Pluripotency factors in embryonic stem cells regulate differentiation into germ layers.
Cell 145, 875–889.
Trtica-Majnaric, L., Zekic-Susac, M., Sarlija, N., Vitale, B., 2010. Prediction of influenza
vaccination outcome by neural networks and logistic regression. J. Biomed. Inform.
43, 774–781.
Tsigelny, I.F., 2018. Artificial intelligence in drug combination therapy. Brief. Bioinform.
https://doi.org/10.1093/bib/bby004.
Tu, W., Rao, S., 2016. Mechanisms underlying t cell immunosenescence: aging and cy-
tomegalovirus infection. Front. Microbiol. 7, 2111.
Vandenberghe, M.E., Scott, M.L.J., Scorer, P.W., Söderberg, M., Balcerzak, D., Barker, C.,
2017. Relevance of deep learning to facilitate the diagnosis of HER2 status in breast
cancer. Sci. Rep. 7. https://doi.org/10.1038/srep45938.
Van der Burgh, H.K., Schmidt, R., Westeneng, H.-J., de Reus, M.A., van den Berg, L.H.,
van den Heuvel, M.P., 2017. Deep learning predictions of survival based on MRI in
amyotrophic lateral sclerosis. Neuroimage Clin. 13, 361–369.
Vanhaelen, Q. (Ed.), 2019. Computational Methods for Drug Repurposing, Methods in
Molecular Biology. Humana Press.
Vanhaelen, Q., 2018. Evolutionary theories of aging – a systemic and mechanistic perspective.
In: Ahmad, S.I. (Ed.), Aging: Exploring a Complex Phenomenon. CRC Press
Taylor & Francis Group, pp. 43–72.
Vanhaelen, Q., 2015. Aging as an optimization between cellular maintenance requirements
and evolutionary constraints. Curr. Aging Sci. 8, 110–119.
Vanhaelen, Q., Aliper, A.M., Zhavoronkov, A., 2017. A comparative review of computa-
tional methods for pathway perturbation analysis: dynamical and topological
perspectives. Mol. Biosyst. 13, 1692–1704.
Vanhaelen, Q., Mamoshina, P., Aliper, A.M., Artemov, A., Lezhnina, K., Ozerov, I., Labat,
I., Zhavoronkov, A., 2016. Design of efficient computational workflows for in silico
drug repurposing. Drug Discov. Today 22, 210–222.
Ventura, M.T., Casciaro, M., Gangemi, S., Buquicchio, R., 2017. Immunosenescence in
aging: between immune cells depletion and cytokines up-regulation. Clin. Mol.
Allergy 15, 21.
Walker, E., Ohishi, M., Davey, R.E., Zhang, W., Cassar, P.A., Tanaka, T.S., Der, S.D.,
Morris, Q., Hughes, T.R., Zandstra, P.W., Stanford, W.L., 2007. Prediction and testing
of novel transcriptional networks regulating embryonic stem cell self-renewal and
commitment. Cell Stem Cell 1, 71–86.
Walker, E., Stanford, W.L., 2009. Transcriptional networks regulating embryonic stem
cell fate decisions. Regulatory Networks in Stem Cells. pp. 87–100.
Walker, S.H., Duncan, D.B., 1967. Estimation of the probability of an event as a function
of several independent variables. Biometrika 54, 167.
Wallach, W., Franklin, S., Allen, C., 2010. A conceptual and computational model of
moral decision making in human and artificial agents. Top. Cogn. Sci. 2 (3), 454–485.
Wang, Z., Li, L., Glicksberg, B.S., Israel, A., Dudley, J.T., Ma'ayan, A., 2017. Predicting
age by mining electronic medical records with deep learning characterizes differences
between chronological and physiological age. J. Biomed. Inform. 76, 59–68.
Wei, J.N., Duvenaud, D., Aspuru-Guzik, A., 2016. Neural networks for the prediction of
organic chemistry reactions. ACS Cent. Sci. 2, 725–732.
West, M.D., Labat, I., Sternberg, H., Larocca, D., Nasonkin, I., Chapman, K.B., Singh, R.,
Makarev, E., Aliper, A., Kazennov, A., Alekseenko, A., Shuvalov, N., Cheskidova, E.,
Alekseev, A., Artemov, A., Putin, E., Mamoshina, P., Pryanichnikov, N., Larocca, J.,
Copeland, K., Izumchenko, E., Korzinkin, M., Zhavoronkov, A., 2018. Use of deep
neural network ensembles to identify embryonic-fetal transition markers: repression
of COX7A1 in embryonic and cancer cells. Oncotarget 9, 7796–7811.
Xu, J., Ge, H., Zhou, X., Yan, J., Chi, Q., Zhang, Z., 2005. Prediction of vascular tissue
engineering results with artificial neural networks. J. Biomed. Inform. 38, 417–421.
Xu, W., Larbi, A., 2017. Markers of T cell senescence in humans. Int. J. Mol. Sci. 18.
https://doi.org/10.3390/ijms18081742.
Xu, Y., Dai, Z., Chen, F., Gao, S., Pei, J., Lai, L., 2015. Deep learning for drug-induced liver
injury. J. Chem. Inf. Model. 55, 2085–2093.
Yamanaka, S., 2012. Induced pluripotent stem cells: past, present, and future. Cell Stem
Cell 10, 678–684.
Yeo, J.-C., Ng, H.-H., 2013. The transcriptional regulation of pluripotency. Cell Res. 23,
20–32.
Yi, K.H., Axtmayer, J., Gustin, J.P., Rajpurohit, A., Lauring, J., 2013. Functional analysis
of non-hotspot AKT1 mutants found in human breast cancers identifies novel driver
mutations: implications for personalized medicine. Oncotarget 4, 29–34.
Yin, A., Etcheverry, A., He, Y., Aubry, M., Barnholtz-Sloan, J., Zhang, L., Mao, X., Chen,
W., Liu, B., Zhang, W., Mosser, J., Zhang, X., 2017. Integrative analysis of novel
hypomethylation and gene expression signatures in glioblastomas. Oncotarget 8,
89607–89619.
Yu, L., Zhang, W., Wang, J., Yu, Y., 2017. SeqGAN: Sequence Generative Adversarial Nets
With Policy Gradient. arXiv Preprint arXiv:1609.05473.
Yue, W., Wang, Z., Chen, H., Payne, A., Liu, X., 2018. Machine learning with applications
in breast cancer diagnosis and prognosis. Des. Codes Cryptogr., Large-Scale Numer.
Optim. 2, 13.
Zabolotneva, A.A., Zhavoronkov, A.A., Shegay, P.V., Gaifullin, N.M., Alekseev, B.Y.,
Roumiantsev, S.A., Garazha, A.V., Kovalchuk, O., Aravin, A., Buzdin, A.A., 2013. A
systematic experimental evaluation of microRNA markers of human bladder cancer.
Front. Genet. 4, 247.
Zhang, S., Zhou, J., Hu, H., Gong, H., Chen, L., Cheng, C., Zeng, J., 2016. A deep learning
framework for modeling structural features of RNA-binding protein targets. Nucleic
Acids Res. 44, e32.
Zhavoronkov, A., Cantor, C.R., 2011. Methods for structuring scientific knowledge from
many areas related to aging research. PLoS One 6, e22597.
Zhavoronkov, A., Putin, E., Mamoshina, P., Aliper, A., Korzinkin, M., Moskalev, A.,
Kolosov, A., Ostrovskiy, A., Cantor, C., Vijg, J., Zhavoronkov, A., 2016. Deep
biomarkers of human aging: application of deep neural networks to biomarker
development. Aging 8, 1021–1033.
Zhavoronkov, A., Putin, E., Asadulaev, A., Ivanenkov, Y., Aladinskiy, V., Sanchez-
Lengeling, B., Aspuru-Guzik, A., 2018a. Reinforced adversarial neural computer for
de novo molecular design. J. Chem. Inf. Model. 58, 1194–1204.
Zhavoronkov, A., Putin, E., Asadulaev, A., Vanhaelen, Q., Ivanenkov, Y., Aladinskaya,
A.V., Aliper, A., 2018b. Adversarial threshold neural computer for molecular de novo
design. Mol. Pharm. https://doi.org/10.1021/acs.molpharmaceut.7b01137.
Zhou, F., Wu, B., 2018. Deep Meta-Learning: Learning to Learn in the Concept Space.