SYSTEMS-LEVEL QUALITY IMPROVEMENT
Data Mining Algorithms and Techniques in Mental
Health: A Systematic Review
Susel Góngora Alonso 1 & Isabel de la Torre-Díez 1 & Sofiane Hamrioui 2 & Miguel López-Coronado 1 & Diego Calvo Barreno 1 & Lola Morón Nozaleda 3 & Manuel Franco 4
Received: 17 May 2018 / Accepted: 16 July 2018 / Published online: 21 July 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018
Abstract
Data Mining in medicine is an emerging field of great importance to provide a prognosis and deeper understanding of disease
classification, specifically in Mental Health areas. The main objective of this paper is to present a review of the existing research works
in the literature, referring to the techniques and algorithms of Data Mining in Mental Health, specifically in the most prevalent diseases
such as: Dementia, Alzheimer, Schizophrenia and Depression. Academic databases that were used to perform the searches are Google
Scholar, IEEE Xplore, PubMed, Science Direct, Scopus and Web of Science, taking into account as date of publication the last 10 years,
from 2008 to the present. Several search criteria were established such as ‘techniques’ AND ‘Data Mining’ AND ‘Mental Health’,
‘algorithms’ AND ‘Data Mining’ AND ‘dementia’ AND ‘schizophrenia’ AND ‘depression’, etc. selecting the papers of greatest
interest. A total of 211 articles were found related to techniques and algorithms of Data Mining applied to the main Mental Health
diseases. 72 articles have been identified as relevant works of which 32% are Alzheimer’s, 22% dementia, 24% depression, 14%
schizophrenia and 8% bipolar disorders. Many of the papers show the prediction of risk factors in these diseases. From the review of the
research articles analyzed, it can be said that the use of Data Mining techniques applied to diseases such as dementia, schizophrenia,
depression, etc. can be of great help in clinical decision making and diagnosis prediction, and can improve the patient’s quality of life.
Keywords Algorithms . Data mining . Mental health . Techniques
Introduction
Mental Health is measured by a high grade of impairment, such as affective disorder that results in depression and different anxiety disorders. Worldwide, 25% of people suffer from Mental Health problems, in both developed and developing countries. The data is growing into terabytes and petabytes, 80% of which is unstructured, so it is difficult to process with database management tools and other traditional techniques. The global cost of Mental Health treatment is about $2.3 trillion. By improving the quality of treatments we can reduce costs significantly, and this quality can be improved by the introduction of Data Mining tools and techniques in Mental Health [1].
This article is part of the Topical Collection on Systems-Level Quality Improvement
* Isabel de la Torre-Díez, isator@tel.uva.es
Susel Góngora Alonso, suselgongoraalonso@gmail.com
Sofiane Hamrioui, Sofiane.Hamrioui@univ-nantes.fr
Miguel López-Coronado, miglop@tel.uva.es
Diego Calvo Barreno, info@diegocalvo.es
Lola Morón Nozaleda, lolamoron@gmail.com
Manuel Franco, mfm@intras.es
1 Department of Signal Theory and Communications, and Telematics Engineering, University of Valladolid, Paseo de Belén, 15, 47011 Valladolid, Spain
2 Bretagne Loire and Nantes Universities, UMR 6164, IETR Polytech Nantes, Nantes, France
3 Nozaleda and Lafora Mental Health Clinic, C/ José Ortega Y Gasset, 44, 28006 Madrid, Spain
4 Psychiatry Service, Hospital Zamora, Hernán Cortés, Zamora, Spain
Journal of Medical Systems (2018) 42: 161
https://doi.org/10.1007/s10916-018-1018-2
In the last two decades there has been a steady increase in the use of Data Mining techniques in various disciplines [2]. Data Mining provides a path to knowledge discovery and is a significant process for discovering patterns by exploring and modeling large amounts of data. It incorporates automatic learning algorithms to learn, extract and identify useful information and subsequent knowledge from large databases [3]. In the last 10 years Data Mining techniques have been used in medical research, mainly in neuroscience and biomedicine. More recently, psychiatry has begun to use these techniques to gain a better understanding of the genetic composition of mental diseases [4].
According to the World Health Organization (WHO) [5], there is a variety of mental disorders, the main ones being dementia, schizophrenia, depression, bipolar disorders and Alzheimer’s as a dementia-derived disease.
Currently, many people suffer from neurodegenerative disorders related to the brain [6]. These disorders lead to various diseases. Dementia is a general term for a decrease in mental capacity severe enough to interfere with daily life [7]. Alzheimer’s Disease (AD) is the most common type of dementia and represents 60–80% of dementia cases [8]. Diagnosing the disease at an early stage is a crucial task; it is therefore of medical interest to develop predictive tools to evaluate this risk [9].
The objective of predictive data extraction in this area is to
build models from high-dimensional medical information and
use them to predict diagnostic results on unseen medical data
in order to support clinical decision making [10]. Approaches
in predictive data extraction can be applied to the construction
of decision models for medical procedures, such as prognosis,
diagnosis and treatment planning, which can be embedded
into clinical systems as systematic support components [11].
In this paper we pose the following research question: are there works related to Data Mining techniques and algorithms applied to Mental Health with the purpose of obtaining predictions of diseases in this area? The aim of our paper is therefore to present a state-of-the-art review of Data Mining techniques and algorithms in the prevalent diseases of Mental Health. This exhaustive study is the main contribution of our paper and allows us to direct future research towards the creation of new prediction algorithms. This paper continues a first review [12] focused on analyzing sources and techniques of Big Data in the health sector and identifying which of these techniques are the most used in the prediction of chronic diseases.
Other reviews address the analysis and evaluation of Machine Learning techniques for the early detection of Alzheimer’s disease [13], as well as the scope and limits of Data Mining techniques for predictive analysis in Mental Health [14].
The methodology used in this review is described below, followed by the results obtained, their discussion and the final conclusions of this research work.
Methodology
In this paper we have carried out a review of the published works related to techniques and algorithms of Data Mining in Mental Health until March 2018. To carry out the review, the following scientific databases were used: Google Scholar, IEEE Xplore, PubMed, Science Direct, Scopus and Web of Science. These databases hold a large volume of scientific information in multidisciplinary fields, engineering and medicine, and give access to articles in scientific and academic journals, repositories, archives and other collections of scientific texts. The key terms introduced in the search engines of these databases are: “Techniques” AND “Algorithms” AND “Data Mining” AND (“dementia” OR “depression” OR “Alzheimer” OR “schizophrenia” OR “mental health”), both in Spanish and English. Those terms are searched in “Abstract/Title/Keywords”, from 2008 to the present. The search criteria shown in Table 1 are those provided specifically by the search engine of each database.
The papers were selected by reading the titles and abstracts of the results obtained, and the full article when necessary. The selection criteria used to classify the papers were the following: 1) Studies of Data Mining techniques applied to the main Mental Health diseases. 2) Studies of Data Mining algorithms applied to the main Mental Health diseases. 3) Studies aimed at other types of disease not related to Mental Health were excluded. Articles repeated in more than one database were counted only once. Fig. 1 shows the diagram used in the review.
Of the 211 publications found, 89 were duplicates or had a title irrelevant to this research. The abstracts of the remaining 122 studies were read and analyzed to see which were of interest, resulting in 72 documents with relevant contributions. The following section presents the most relevant works and analyzes the main techniques and algorithms found in the review.
Main techniques and algorithms of data
mining used in the review
Data Mining techniques have recently become a predominant field of research with wide applications in medical healthcare, financial services, telecommunications, natural sciences, etc. Data Mining is a process to discover useful models in data, with the aim of interpreting existing behaviors or predicting future results [15].
The Data Mining algorithms are classified into two catego-
ries: descriptive (or unsupervised learning) and predictive (or
supervised learning). Descriptive data mining clusters data by
measuring the similarity between objects (or records) and dis-
covers unknown patterns or relationships in data while predic-
tive learning infers prediction rules (classification / prediction
models) from (training) data and applies the rules to
unpredicted / unclassified data [16].
The algorithms used in the prognosis and diagnosis of Mental Health diseases are supervised learning algorithms that include Artificial Neural Networks (ANNs), Decision Trees (DT), genetic algorithms and linear discriminant analysis. Other techniques generally in use are Support Vector Machines (SVM), Association Rules (ARs) mining and Ensemble methods [12].
ANNs are computational models inspired by networks of the central nervous system, capable of machine learning and pattern recognition. In general, they are presented as systems of interconnected “neurons” that can compute values from inputs by feeding information through their network [9].
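As an illustration of how interconnected neurons compute values by feeding inputs through the network, the following is a minimal sketch in plain Python. The layer sizes, weights and the sigmoid activation are assumptions chosen for the example; they are not taken from any of the reviewed studies.

```python
import math

def sigmoid(z):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer of neurons: each takes a weighted sum plus a bias, then activates."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, network):
    """Feed the inputs through every (weights, biases) layer in turn."""
    for weights, biases in network:
        inputs = layer(inputs, weights, biases)
    return inputs

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron.
network = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = forward([1.0, 1.0], network)
```

Training, that is, adjusting the weights from data, is deliberately left out; the sketch only shows the forward computation.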
Convolutional Neural Networks (CNNs/ConvNets) are inspired by the human visual system and are similar to classic neural networks. This architecture has been specifically designed on the explicit assumption that the raw data consists of two-dimensional images, which allows certain properties to be encoded and reduces the number of hyperparameters. The CNN topology uses spatial relationships to reduce the number of parameters that must be learned and, therefore, improves upon general feed-forward back propagation training [17].
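The reduction in learned parameters can be made concrete with a small calculation. The sketch below compares a fully connected layer with a convolutional layer; the image size, number of units and filter shape are hypothetical values chosen only for illustration.

```python
def dense_params(n_inputs, n_units):
    """Fully connected layer: one weight per input per unit, plus one bias per unit."""
    return n_inputs * n_units + n_units

def conv2d_params(kernel_h, kernel_w, in_channels, out_channels):
    """Convolutional layer: each filter reuses the same weights at every image position."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels

# Hypothetical sizes: a 28 x 28 single-channel image.
dense = dense_params(28 * 28, 100)   # 100 fully connected units
conv = conv2d_params(5, 5, 1, 6)     # six 5 x 5 filters
```

Because a filter is shared across all positions, the convolutional count does not depend on the image size, whereas the fully connected count grows with it.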
Table 1 Search criteria in the different scientific databases
Keywords / Databases | Google Scholar | IEEE Xplore | PubMed | Science Direct | Scopus | Web of Science
Techniques OR Algorithms AND Data Mining AND Mental Health | “abstract, title, keywords” | “abstract” | “title, abstract” | “abstract, title, keywords” | “abstract, title, keywords” | “title, abstract”
Techniques OR Algorithms AND Data Mining AND Dementia | “abstract, title, keywords” | “abstract” | “title, abstract” | “abstract, title, keywords” | “abstract, title, keywords” | “title, abstract”
Techniques OR Algorithms AND Data Mining AND Depression | “abstract, title, keywords” | “abstract” | “title, abstract” | “abstract, title, keywords” | “abstract, title, keywords” | “title, abstract”
Techniques OR Algorithms AND Data Mining AND Alzheimer | “abstract, title, keywords” | “abstract” | “title, abstract” | “abstract, title, keywords” | “abstract, title, keywords” | “title, abstract”
Techniques OR Algorithms AND Data Mining AND Schizophrenia | “abstract, title, keywords” | “abstract” | “title, abstract” | “abstract, title, keywords” | “abstract, title, keywords” | “title, abstract”
Fig. 1 Flow diagram used in the literature review
DTs (for example, C4.5) are used to model sequential decision problems. They are composed of nodes and edges: internal nodes represent a predicate over the objects in the data set, while each edge represents a division rule over an attribute (typically, a binary division rule). Every internal node has two (or more) outgoing branches: one is associated with objects whose attributes satisfy the predicate, whereas the other is associated with those that do not. Generalized DT classifiers, such as C4.5, rely on an entropy rule or information gain rule, which finds at each node a predicate that optimizes an entropy function of the defined partition [18].
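The information gain rule can be sketched as follows. This is a minimal illustration in plain Python, assuming categorical attributes and a made-up four-record data set; it computes plain information gain, whereas C4.5 itself normalizes the gain (gain ratio) and adds pruning, which are omitted here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """Attribute whose split yields the largest information gain."""
    return max(rows[0], key=lambda a: information_gain(rows, labels, a))

# Hypothetical toy records; attribute names and labels are illustrative only.
rows = [
    {"memory_loss": "yes", "smoker": "yes"},
    {"memory_loss": "yes", "smoker": "no"},
    {"memory_loss": "no", "smoker": "yes"},
    {"memory_loss": "no", "smoker": "no"},
]
labels = ["case", "case", "control", "control"]
root = best_attribute(rows, labels)
```

In this toy set the split on `memory_loss` separates the classes perfectly, so it would be chosen as the predicate at the root node.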
There are classification methods such as those defined
below:
SVM builds a separating hyperplane, which maximizes the minimum distance between data of different classes in a new space that has been obtained by applying a kernel function to the original data. SVMs are particularly suitable for binary classification tasks; in this case, the input data are two sets of n-dimensional vectors [18].
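The quantity that an SVM maximizes, the minimum distance from the hyperplane to the data, can be illustrated with a short sketch. It only compares two hand-picked candidate hyperplanes on made-up points in the original space; an actual SVM solves an optimization problem over all hyperplanes and may apply a kernel first.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points, labels):
    """Smallest label-signed distance; positive only if every point is on its own side."""
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))

# Hypothetical two-class data with labels +1 and -1.
points = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -1.0)]
labels = [1, 1, -1, -1]

# Two candidate hyperplanes given as (w, b); both separate the data.
candidates = {"diagonal": ((1.0, 1.0), 0.0), "vertical": ((1.0, 0.0), 0.0)}
best = max(candidates, key=lambda name: margin(*candidates[name], points, labels))
```

Both candidates classify every point correctly, but the diagonal one leaves a wider gap to the nearest point, which is the criterion the paragraph above describes.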
Rule-based classifiers assign a given class to each object according to a specific function r: condition → c (called a classification rule), such that the rule r covers an object x if the attributes of x satisfy the condition of r. Therefore, in this type of classification, the classifier uses logical propositional formulas in disjunctive or conjunctive normal form (“if-then rules”) for classifying the given samples [18].
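A rule of the form condition → c and the notion of a rule covering an object can be sketched as below. The rules, attribute names and class labels are hypothetical and carry no clinical meaning; the ordered evaluation (first covering rule wins) is one common convention, assumed here for simplicity.

```python
def covers(condition, obj):
    """A rule covers an object when every attribute test in its condition holds."""
    return all(obj.get(attr) == value for attr, value in condition.items())

def classify(rules, obj, default="unknown"):
    """Return the class of the first rule that covers obj, or the default class."""
    for condition, label in rules:
        if covers(condition, obj):
            return label
    return default

# Hypothetical ordered rule list: (condition, class).
rules = [
    ({"memory_complaint": True, "daily_life_impaired": True}, "refer"),
    ({"memory_complaint": True}, "monitor"),
]

first = classify(rules, {"memory_complaint": True, "daily_life_impaired": True})
second = classify(rules, {"memory_complaint": True, "daily_life_impaired": False})
third = classify(rules, {})
```

Each condition is a conjunction of attribute tests, and the list of rules as a whole acts as a disjunction.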
Among the different variants of classifiers, Ensemble and Random Forest classifiers have attracted the attention of researchers due to their handling of missing values and noisy data, and their ranking and selection of features to form tree nodes [19]. Random Forest is a variant of ensemble classifier consisting of a collection of tree-structured classifiers {h(x, Θk), k = 1, 2, ...}, where the {Θk} are independent identically distributed random vectors. Each tree casts a unit vote for the most popular class at input x. It is a popular supervised classification and regression method that uses the concept of random feature selection for building decision trees [19].
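The voting step, in which each tree h(x, Θk) casts a unit vote, can be sketched as follows. The three "trees" are hypothetical hand-written functions standing in for trained trees; building them from bootstrap samples with random feature selection is not shown.

```python
from collections import Counter

def forest_predict(trees, x):
    """Each tree casts one vote; the forest returns the most voted class."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained trees, each using a different feature or threshold.
trees = [
    lambda x: "AD" if x["score"] < 24 else "control",
    lambda x: "AD" if x["age"] > 75 else "control",
    lambda x: "AD" if x["score"] < 20 else "control",
]

a = forest_predict(trees, {"score": 22, "age": 80})
b = forest_predict(trees, {"score": 28, "age": 80})
```

An odd number of trees is used so that a two-class vote cannot tie.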
The Naïve Bayesian classifier is a selective classifier which calculates a set of probabilities by counting the frequency and combinations of values in a given data set. It assumes that all the variables which contribute towards classification are mutually independent given the class. The Naïve Bayesian classifier is based on Bayes’ theorem and the theorem of total probability [7].
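The frequency counting described above can be sketched in a few lines. The data set, attribute names and labels are made up for illustration, and no smoothing is applied, so a value never seen with a class gives that class a score of zero.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and, per class and attribute, value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, attribute) -> Counter of values
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            value_counts[(label, attr)][value] += 1
    return class_counts, value_counts

def predict(model, row):
    """Score each class as P(class) times the product of P(value | class)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total
        for attr, value in row.items():
            score *= value_counts[(label, attr)][value] / n
        scores[label] = score
    return max(scores, key=scores.get), scores

# Hypothetical toy data.
rows = [
    {"sleep": "poor", "mood": "low"}, {"sleep": "poor", "mood": "low"},
    {"sleep": "good", "mood": "low"}, {"sleep": "good", "mood": "ok"},
    {"sleep": "good", "mood": "ok"}, {"sleep": "poor", "mood": "ok"},
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
label, scores = predict(train(rows, labels), {"sleep": "poor", "mood": "low"})
```

The scores are unnormalized; dividing each by their sum (the total probability of the observed values) would turn them into posterior probabilities.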
JRip (RIPPER) is one of the basic and most popular algorithms. Classes are examined in increasing size and an initial set of rules for the class is generated using incremental reduced error pruning [7]. It proceeds by treating all the examples of a particular judgment in the training data as a class and finding a set of rules that covers all members of that class. It then proceeds to the next class and does the same, repeating this until all classes have been covered.
K-means is a basic clustering technique in biocomputing. The goal is to find K patterns by calculating the distance between each sample value [20].
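A minimal sketch of the procedure, assuming squared Euclidean distance, a fixed number of iterations and hand-picked starting centroids (practical implementations usually pick random starting points and stop on convergence):

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # New centroid = mean of its cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D samples forming two visible groups, with K = 2.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

Each of the K centroids ends up at the mean of the samples closest to it, which is the "pattern" that represents the group.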
Apriori is an algorithm that determines the associations between data by checking frequencies. It is one of the major types of algorithm based on association rules. The main purpose of using the Apriori algorithm in Data Mining is to find patterns in a data set. It calculates the conditional probability of each case and returns the most probable item set, that is, the one which appears most frequently in the input data [21].
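The frequency checking and the conditional probability (usually called confidence) can be sketched as follows. The transactions and the minimum support of 2 are hypothetical; the sketch grows item sets level by level and discards any candidate with an infrequent subset, which is the core idea of Apriori.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join frequent sets, keeping only candidates whose every subset is frequent.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, size - 1))]
    return frequent

def confidence(frequent, antecedent, consequent):
    """Conditional frequency of the consequent given the antecedent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical symptom sets, one per record.
transactions = [
    {"insomnia", "fatigue"},
    {"insomnia", "fatigue", "anxiety"},
    {"insomnia", "anxiety"},
    {"fatigue"},
]
frequent = apriori(transactions, min_support=2)
conf = confidence(frequent, {"insomnia"}, {"fatigue"})
```

Here the rule insomnia → fatigue holds in two of the three records that contain insomnia, so its confidence is 2/3.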
The Data Mining approach can significantly help the re-
search into mental illness, to find patterns and knowledge
embedded into the data. It requires exploration and analysis
of large quantities of data for the purpose of better understand-
ing and deriving knowledge regarding the problem at hand
[22]. Below we show the results obtained.
Results
In the literature review we found a large number of studies that base their research on the use of Data Mining techniques and algorithms applied to Mental Health diseases. Figure 2 shows the statistics of the relevant papers found in the last 10 years. From a total of 72 papers found, 35 belong to journals while 38 are conference papers.
Alzheimer
Alzheimer’s is a multifaceted disease in which the accumulated cerebral pathology produces a progressive cognitive deterioration that finally leads to dementia [23, 24]. It is characterized by gradual deterioration of memory and other cognitive functions, including language skills, attention, understanding, judgment and abstract thinking. Patients eventually lose mental function, leaving them completely dependent on others for care [25].
Therefore, the analysis of AD and the knowledge derived from the available data are important to understand the disease, alleviate its effects and smooth the way to a cure [26]. In Table 2 we present the main studies found in the review regarding the use of Data Mining techniques and algorithms in AD, while Fig. 3 shows the percentages of the main Data Mining techniques applied to this disease, DT being the most used, followed by SVM and Naïve Bayes.
In [9] the authors study the relationship and distribution between immune-phenotypic and oxidative stress biomarkers in AD and mild cognitive impairment (MCI) through the application of ANNs, evaluating the predictive capacity of the TWIST algorithm. The Auto-Contractive Map algorithm (Auto-CM) was applied to generate a graph that reveals the most important links among variables. Applying these complex mathematical tools, they succeeded in obtaining an algorithm able to distinguish these neurological conditions from biological parameters with good accuracy. Moreover, a global immune deficit emerged from this analysis in pathological individuals compared to healthy controls, providing new knowledge about the pathogenesis of dementia.
In [36] the authors present the first database that has been extracted to investigate the protein features common to AD. This database can be used to perform additional investigations in the area of identification of causes, symptoms and possible treatment for neuro-degeneration through computational methods. A combination of brain imaging and clinical assessment checks for signs of memory impairment and movement problems is used to identify patients with AD. Definitive diagnosis can only be obtained after the patient’s autopsy by examining brain tissues. There is a clear need for tangible advances in the area of biomarkers for assessment of risk, diagnosis and monitoring disease progression.
In [37] the authors explore the possibility of analyzing subjects with AD from non-image data using a supervised learning approach, and quantify their abnormality using measurable similarity indexes. As a result, the proposed model works with numerical input values. Researchers can continue to use MRI-based methods as a preprocessing step for the proposed model, in which useful information is extracted from images in the form of numerical values and used as inputs for the model, obtaining better results.
Fig. 2 Relevant papers statistics found in the last 10 years
Table 2 Studies of the bibliographic review related to Data Mining techniques and algorithms applied to patients with Alzheimer’s
Authors | Year of publication | Study proposal | Techniques and Algorithms | Results
Qu, Yuan, & Liu [8] | 2009 | Predictive model to identify possible conversions from MCI to AD based on the ADNI database. | Naïve Bayes | - Naïve Bayes can predict the conversion with a reasonably good AUC value after feature selection.
Joshi et al. [27] | 2010 | They propose a new model for AD classification that considers the most influential risk factors. Different models were developed for AD classification. | Neural Networks and Machine Learning: Random Forest tree, Multilayer Perceptron | - It was discovered that some specific genetic factors, diabetes, age and smoking were the strongest risk factors for AD. - The classification model was validated with the test cases and achieved an average classification accuracy of 99.25% with Random Forest tree and the Multilayer Perceptron.
Plant et al. [28] | 2010 | They develop a new data mining framework in combination with three different classifiers to derive a quantitative index of pattern matching for the prediction of the conversion from MCI to AD. | SVM, Bayes statistics, voting feature intervals (VFI) | - Bayes and VFI yielded superior results compared to the SVM approach, showing a novel approach to identify regions of high discriminatory power for AD identification and for prediction of the conversion from MCI to AD.
Chaves et al. [29] | 2011 | They propose to discover associations between the attributes that characterize perfusion patterns of normal subjects and make use of them for AD early diagnosis. | Association rules | - The proposed method yields up to 94.87% classification accuracy (sensitivity = 91.07%, specificity = 100%), outperforming the methods developed up to that year for AD early diagnosis.
Plant et al. [30] | 2011 | Introduce a bootstrapping-based feature extraction technique to identify early-stage AD from resting-state functional resonance images. | SVM | - They show that subjects with early-stage AD can be distinguished from healthy subjects of the same age with an accuracy of 79%.
Al-Dlaeen & Alashqur [31] | 2014 | They present an approach towards a prediction model for AD. | DT | - Using Information Gain makes it possible to construct an optimal or near optimal tree with fewer nodes and branches than with a random selection of attributes. - The system can be used to predict the situation of new patients with regard to AD.
Bhagya Shree & Sheshadri [7] | 2014 | Diagnosis of the disease using various machine learning techniques of Data Mining. | Naïve Bayes, DT J48, Random Forest, JRip | - When evaluating performance in terms of classification accuracy, precision, recall and execution time of each technique, Naïve Bayes is the best of the four techniques.
Fiscon et al. [18] | 2014 | They propose the automatic classification of patients from the Electroencephalogram (EEG) biomedical signals involved in AD and MCI in order to support physicians in formulating the right diagnosis. | DT J48, SVM | - In the identification of MCI and CT, DT J48 outperforms SVM in the correct classification percentage, achieving 90% accuracy and 87% specificity, as well as high performance in all other metrics, using a leave-one-out cross validation (51 folds). - In the identification of AD and CT, J48 achieves better classification results (73% accuracy) than SVM (68% accuracy) in leave-one-out cross validation (63-fold). - When distinguishing AD from patients with MCI, J48 achieves the best results (80% accuracy, 79% specificity) with respect to the SVM classifier.
Queau, Shafiq, & Alhajj [20] | 2014 | They present several Data Mining techniques to analyze the data set of AD gene expression. | Association Rule Mining, Clustering | - It helped to discover interesting and useful patterns in the data set, such as the correlation between the different characteristics of the genes that are healthy or not in different stages of AD.
Zhang, Wang, & Dong [32] | 2014 | Novel classification system to distinguish between elderly subjects with AD, mild cognitive impairment, and normal controls (NC). | Kernel support vector machine decision tree (kSVM-DT) | - The kSVM-DT method gathered the main components of magnetic resonance imaging (MRI) data and additional data, and the final classification accuracy is 80%.
Pachange, Joglekar, & Kulkarni [19] | 2015 | They use data classification techniques related to Data Mining, in which the decisions of multiple base classifiers are combined for accurate prediction of the presence or absence of AD anomalies. | Ensemble, Random Forest | - Both techniques have proven efficient for the classification of medical datasets, due to their ability to handle noisy data, missing values and unbalanced datasets.
Sheshadri, Shree, & Krishna [33] | 2015 | New emerging approach in AD diagnosis. | Naïve Bayes, JRip, Random Forest | - The JRip and Random Forest techniques show better results compared to Naïve Bayes.
Sarraf & Tofighi [17] | 2016 | They use a Convolutional Neural Network to distinguish an Alzheimer’s brain from a normal, healthy brain. | CNN | - AD data were successfully distinguished from normal control data with 96.86% accuracy using a CNN deep learning architecture (LeNet).
Martínez-Ballesteros et al. [34] | 2017 | To provide gene expression patterns and a deeper knowledge of biological functions with greater relevance. | DT, quantitative rules, hierarchical cluster | - The results have shown that the obtained rules successfully characterize the underlying information, grouping relevant genes for the problem under study and agreeing with prior biological knowledge. - They found 90 genes that were significantly altered in AD patients.
Tejeswinee, Shomona, & Athilakshmi [35] | 2017 | New dataset consisting of genetic information pertaining to a neuro-degenerative disorder: AD. | SVM, Random Forest, DT, Naïve Bayes, Adaboost, k-Nearest-Neighbour (kNN) | - Before the feature selection, Random Forest and kNN classifiers predicted the diagnostic classes with high accuracy (~82%) when compared with the other classification techniques. SVM gave the best accuracy (~94%) with CFS subset evaluation. - In the Gain Ratio method, Random Forest showed impressive results (~85%). It was followed by SVM and DT classifiers. - The selection of optimal features by CFS followed by the implementation of DFS, thus creating a hybrid feature selection technique, improves the classification accuracy and helps in early disease diagnosis.
Fig. 3 Percentages of data mining techniques and algorithms applied to Alzheimer’s studies
Table 3 Studies of the bibliographic review related to the Data Mining techniques and algorithms applied to patients with Dementia
Authors | Year of publication | Study proposal | Techniques and Algorithms | Results
Wen et al. [40] | 2008 | Investigated the potential of parametric images from FDG-PET studies to aid the classification of dementia using Data Mining techniques. | Logistic regression | - The results show that the cerebral metabolic rate of glucose consumption (CMRGlc) was efficient in the classification of dementia, and that Data Mining using voxel-level features with PCA and the logistic regression model achieved the best classification.
S. Zhang et al. [41] | 2013 | Predictive model for the technology adoption of a video transmission solution based on mobile phones developed for people with dementia, taking into account individual features. | DT C4.5 algorithm, Logistic regression | - The statistical tests do not show a significant difference between the performance of the DT and logistic regression models (p = 0.894). - The advantage of DT models is that they are easy to use and interpret from the perspective of a non-technical user.
S. Zhang et al. [42] | 2014 | Predictive adoption model for a video transmission system based on mobile phones, developed for people with dementia. | Neural networks, DT C4.5, SVM, Naïve Bayes, adaptive boosting, CART, kNN | - The k-Nearest-Neighbor algorithm using seven characteristics proved to be the optimal classifier of assistive technology adoption for people with dementia (prediction accuracy 0.84 ± 0.0242).
Byeon [38] | 2015 | Mild cognitive impairment prediction model for older people in Korean local communities. | Random Forest, Logistic regression, DT | - The significant predictors of MCI were age, gender, level of education, income level, subjective health, marital status, smoking, drinking, regular exercise and hypertension. - The Random Forest model was more accurate than the Logistic regression model and the Decision Tree.
Bang et al. [43] | 2017 | Systematic and global data mining model to improve the process of dementia assessment. | SVM, ANN, DT | - The results show that SVM, with a high performance of AUC 0.96, was configured as the model of subsidiary decision making for a clinician. - Therefore, it is an intelligent system that allows an intuitive collaboration between the CAD system and doctors.
Moon et al. [21] | 2017 | They propose the construction of a sorting machine with the aim of distinguishing two dementia diseases: Lewy bodies and Parkinson’s disease. | Apriori, DT, SVM, K-means, Neural Network | - To obtain greater precision, they used k-means first to discover where the Sum of Squared Error (SSE) of a cluster becomes larger and then used a Neural Network considering the SS-means of k-means.
Dementia
As the aged population increases worldwide with the development of science, technology and medicine, the number of geriatric diseases also increases radically. According to the World Alzheimer Report in 2015, the worldwide dementia population was 44 million in 2013 and will increase more than 3-fold to 135 million in 2050 [38]. It has been proposed to apply specific motivation protocols adapted to the special features of individual subjects, which help motivate people to change disordered behavior and promote healthy behavior patterns [39].
Next, we present the studies found (see Table 3) regarding the application of Data Mining techniques and algorithms in patients with dementia. Figure 4 shows the percentages of the main techniques applied to dementia, DT being the most used, followed by ANN, Logistic regression and SVM.
In [44] the authors use Machine Learning applications with Neural Networks together with the Mini Mental State Examination (MMSE) and Functional Activities Questionnaire (FAQ) methods to classify states of dementia and improve accuracy. This analysis shows that the MMSE and FAQ tests, used in conjunction with the Machine Learning and Neural Networks methods, improve the correct classification of cognitively impaired and demented subjects by 20.84% to 26.84% over the use of the MMSE and FAQ tests. These studies can be beneficial to understand the disease progression and to assess prophylactic strategies that can distinguish the different states of dementia, so that suitable measures can be taken to manage dementia in general and AD in particular.
In [45] the authors propose an approach based on analyzing networks of functional brain activity of healthy subjects and patients with mild cognitive impairment, an intermediate stage between the expected cognitive decline of normal aging and the more pronounced decline of dementia. The effectiveness of the proposed approach is confirmed by the score achieved in the classification task, close to 95%. The method was applied to a biomedical classification task, but its validity is general, since it can be applied to any type of raw multivariate time series.
Depression
Nowadays, the main challenge in studying emotional disor-
ders is to master the patient’s emotional changes. The research
on user emotion detection mainly includes three categories:
emotions recognition based on audio-visual signals, physio-
logical signals and multimodal data [46]. Continuous moni-
toring of a person’s stress levels is essential to understand and
manage personal stress [47]. A series of physiological markers
are widely used for stress assessment, including galvanic skin
response, various characteristics of heartbeat patterns, blood
pressure and respiratory activity [48].
Suicide is another of the most feared consequences of
mental illness. While there are significant differences
in suicide rates between countries, suicide ranks
among the top 15 causes of death around the world [49].
Early recognition and accurate diagnosis of depression are
essential criteria to optimize treatment selection and improve
outcomes, thus reducing the economic and psychosocial
burdens that result from hospitalization, lost work
productivity and suicide [50].
Table 4 shows the studies found related to Data
Mining techniques and algorithms in patients with depression,
while Fig. 5 shows the percentages of the main techniques
applied to this disease, with SVM being the most used, followed
by Naïve Bayes.
In [60] the authors proposed a new approach using
Data Mining techniques to predict the stress level of a patient
using a logistic model tree and to identify efficiently the different
factors that affect the Mental Health of the patient. Stress
prediction and the generated rules will act as a support tool to assist
medical experts in providing treatment and in advising patients to
take precautions to prevent future complications. It will also
reduce the cost of several medical tests and help patients
to take preventive measures well in advance.
Fig. 4 Percentages of data mining techniques and algorithms applied to dementia
Table 4 Studies of the bibliographic review related to the Data Mining techniques and algorithms applied to patients with Depression
| Authors | Year of publication | Study proposal | Techniques and Algorithms | Results |
|---|---|---|---|---|
| Chang, Hung, & Juang [51] | 2013 | They use ontologies and Data Mining techniques to construct the depression terminology and infer depression probability. | Ontologies, Bayesian networks | The results represent the organizational structure of large complex domains very well, while Bayesian probability allows probabilities to be assigned to any other type of statement. |
| Thanathamathee [52] | 2014 | They developed and evaluated a model that uses adolescent depression data to assess and predict depression based on boosting, with feature selection techniques. | AdaBoost based on DT classifier | The results are particularly useful for depression screening in adolescents and provide confirmation to doctors who ask questions and clearly observe depression symptoms. Diagnosis time is reduced during the completion of questionnaires by patients. |
| Ghafoor, Huang, & Liu [53] | 2015 | Application of Data Mining techniques to a depression database containing 5964 records. These techniques are used together to extract efficient results. | Association analysis, frequent pattern tree | The analysis results establish the most common symptoms of depressed patients, as well as their scenarios. |
| Hou et al. [54] | 2016 | They propose the correlation between reading habits and depressive tendency of university students based on a dataset of university library records and results of Mental Health questionnaires. | kNN, SVM, Naïve Bayesian, Linear regression, Logistic regression | They tested and compared the linear regression and logistic regression algorithms to construct prediction models and finally selected logistic regression for its highest prediction accuracy at a lower relative error. |
| Husain et al. [55] | 2016 | Predict generalized anxiety disorder among women. | Random Forest | The prediction accuracy is above 0.9, which indicates that the Random Forest approach is able to accurately predict Generalized Anxiety Disorder (GAD). For the specificity evaluation, Random Forest shows consistency by accurately predicting people who do not have GAD. |
| Li et al. [56] | 2016 | EEG-based research to find the prominent frequency bands and brain regions that are most related to mild depression. | Bayes Net, SVM, Logistic Regression, kNN, Random Forest. Best First (BF), Greedy Stepwise (GSW), Genetic Search (GS), Linear Forward Selection (LFS) and Rank Search (RS) based on Correlation Features Selection (CFS) were applied for feature selection. | GSW based on CFS and kNN had the optimal performance, and the beta frequency band played a more important role in the detection of mild depression than the alpha and theta frequency bands, with classification accuracy higher than 92% and AUC above 0.950 for beta frequency bands of Emo_block and Neu_block. |
| Nie, Gong, & Ye [57] | 2016 | They propose a censored regression approach to predict the risk of patient relapse after initial remission from one or multiple stages of antidepressant treatment. | Regression tree, linear combination of covariates | They show that the main risk factors identified by the multistage linear method are not only consistent with the findings from some of the recent research about relapse among patients with MDD who had initially achieved remission, but also provide some insights on how to develop therapies for prevention of relapse. |
| Spyrou et al. [58] | 2016 | To evaluate the neurophysiological features of elderly participants suffering from depression and neurodegeneration. The work focuses on the identification of depression symptoms that coexist with cognitive deterioration, and on the correlation of the examined neurophysiological features with geriatric depression combined with cognitive impairment. | Random Forest, Random Tree, Multilayer Perceptron (MLP Network), SVM | The efficiency of classifiers varied from 92.42 to 95.45%, with Random Forest being the most accurate (95.5%). |
| Kim et al. [59] | 2017 | They propose a simple and discreet detection system that uses passive infrared sensors to monitor the daily life activities of elderly people who live alone. | Neural networks, DT C4.5, Bayesian networks, SVM | Neural networks surpass the other algorithms, followed by C4.5 DT, and are effective in detecting normal conditions and mild depression with up to 96% accuracy. |
Fig. 5 Percentages of data mining techniques and algorithms applied to depression
Schizophrenia and bipolar disorders
Especially in psychiatry, technology and science have made
available new computational methods to help develop predictive
models and identify diseases with greater precision [61]. Among
mental illnesses, disorders related to autism, bipolar disorder and
schizophrenia have a particularly high impact on affected individuals
and their families, and they represent a heavy economic
burden for the health care system [62]. Related anxiety disorders include
generalized anxiety disorder (GAD), posttraumatic stress disorder
(PTSD), panic disorder (PD), social phobia and specific
phobias, among others [63]. Below we show the main studies
found on schizophrenia and bipolar disorders (see Table 5), and
Fig. 6 shows the percentages of the main Data Mining techniques
and algorithms applied to these diseases, with DT being the most
used, followed by Random Forest and kNN.
Table 5 Studies of the bibliographic review related to the Data Mining techniques and algorithms applied to patients with Schizophrenia and Bipolar disorders
| Authors | Year of publication | Study proposal | Techniques and Algorithms | Results |
|---|---|---|---|---|
| Ince et al. [64] | 2008 | Framework for schizophrenia diagnosis based on the spectro-temporal patterns selected by a SVM from multichannel magnetoencephalogram (MEG) recordings in a verbal working memory task. | Recursive feature elimination technique (SVM-RFE) | The SVM-RFE algorithm can successfully select features from a large prediction space associated with neuronal activity in a functional task, and these features can be used effectively in recognizing patients with schizophrenia. |
| Gangwar, Mishra, & Yadav [65] | 2014 | Analyze through Data Mining algorithms different parameters for detection and diagnosis of neuropsychiatric diseases, including Schizophrenia. | DT C5.0 | The results show that algorithm C5.0 has an accuracy of 90% and provides a quick and easy way for doctors to make a decision regarding disease diagnosis. |
| GeethaRamani & Sivaselvi [66] | 2014 | They investigate the resting-state fMRI images of 15 normal controls and 12 schizophrenia patients by constructing a functional connectome using image preprocessing techniques, specifically realignment, temporal correction, filtering, etc. | Random Forest, DT C4.5, Regression tree, k-Nearest Neighbour | These algorithms produced classification rules that are used in the prediction of schizophrenia disorder; algorithm C4.5 achieved the highest predictive accuracy, with 93%. |
| Lanata et al. [67] | 2014 | They propose the application of a pattern recognition technique to classify the pathological mental states of bipolar disorders using the information collected from electrodermal EC response. | k-Nearest Neighbor | The results showed that, using a convolution-based approach to estimate sympathetic ANS markers and simple k-Nearest Neighbor algorithms, the proposed methodology is able to discern up to three mood states (depression, hypomania and euthymia) with an average intra-subject accuracy greater than 98% and inter-subject accuracy greater than 82%. |
| Thongkam & Sukmak [68] | 2014 | The objective is to develop and investigate prediction models of readmission of patients with schizophrenia using Data Mining techniques. | DT, Random Tree, Random Forests, AdaBoost, Bagging, AdaBoost with DT, AdaBoost with Random Tree, AdaBoost with Random Forests, Bagging with DT, Bagging with Random Tree, Bagging with Random Forest | The experimental results showed that AdaBoost with DT has the highest accuracy, recall and F-measure, with 98.11%, 98.79% and 98.41%, respectively. |
| Castaldo et al. [69] | 2016 | They aim to detect mental stress using linear and non-linear Heart Rate Variability (HRV) features extracted from 3 min ECG excerpts recorded from 42 university students, during oral examination (stress) and at rest after a vacation. | DT C4.5 | The best-performing machine learning method was the DT C4.5 algorithm, which discriminated between stress and rest with a sensitivity, specificity and precision of 78%, 80% and 79%, respectively. |
In [70] a semiautomatic system is proposed that helps in the
preliminary diagnosis of patients with psychological disorders.
The goal is not to fully automate the classification process of
mentally ill individuals, but to ensure that a classifier is aware of
all possible Mental Health illnesses that could match the patient’s
symptoms. The results show that the system improves the organizational
capacity to collect information faster and at a lower cost, and
to make accurate decisions. The orchestration of genetic algorithms
through the implementation of Business Process Execution
Language allows flexible service workflows to be adjusted
immediately to modifications and makes systems smarter.
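Several of the results reported in Tables 4 and 5 are expressed as accuracy, sensitivity (recall), specificity, precision and F-measure. As a reminder of how these figures relate to one another, the sketch below derives them from the counts of a binary confusion matrix. The function names and the label vectors are hypothetical and are not taken from any of the reviewed studies.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard metrics; a ratio with an empty denominator is reported as 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)      # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical labels: 1 = patient, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(*confusion_counts(y_true, y_pred))
```

Because accuracy depends on the class balance of the sample, figures from different studies are easier to compare when sensitivity and specificity are reported alongside it.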
Discussion
Frequent pattern analysis has been a focus of study in
Data Mining, and many algorithms and methods have been
developed to mine frequent sequential and structural patterns.
Data Mining algorithms have great potential to expose
the patterns in data, facilitate the search for the combinations
of genetic and environmental factors involved and provide an
indication of their influence [71].
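As an illustration of what mining frequent patterns means in practice, the following minimal sketch counts the support of every item combination up to a given size and keeps those that reach a minimum support. It is a brute-force enumeration written for clarity, not the frequent pattern tree or any other algorithm used in the reviewed studies, and the symptom records are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items.

    Support is the fraction of transactions that contain the whole itemset.
    """
    n = len(transactions)
    items = sorted({item for record in transactions for item in record})
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = sum(1 for record in transactions if set(candidate) <= record)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Hypothetical records: the set of symptoms noted for each patient.
records = [
    {"insomnia", "fatigue", "low_mood"},
    {"insomnia", "low_mood"},
    {"fatigue", "low_mood"},
    {"insomnia", "fatigue"},
]
patterns = frequent_itemsets(records, min_support=0.5)
```

Exhaustive enumeration grows exponentially with the number of items, which is why dedicated algorithms such as Apriori or FP-growth prune or compress the search space on real datasets.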
The approach of using Data Mining techniques in psychi-
atry has the potential to open a completely new area of re-
search in the detection, diagnosis and classification of psychi-
atric disorders such as schizophrenia, dementia, depression,
anxiety and alcohol abuse.
Today, depression occurs in adolescents and the number of cases of
suicidal depression keeps increasing. Depression is therefore known
to be a contributor to morbidity, mortality and
economic loss. Although effective treatments for Mental
Health conditions such as depression and anxiety have been
available for some time, less than half of people with a mental
disorder seek help from a primary care physician or a psychiatrist.
The results in [72] highlight the advantageous applicability
of machine learning for psychiatric research. By observing
and interpreting the use of online communities, researchers
become better placed to offer suggestions as to how commu-
nities can be cultivated for the maintenance and well-being of
people with a lived experience of depression.
Understanding the factors that predict mental
healthcare-seeking behaviors is crucial for the formulation of
health policies and the design of interventions to address in-
equities in access to Mental Health services [73]. In addition,
it will be the second cause of death in 2020 due to complica-
tions derived from stress and the cardiovascular system.
Suicide rates have increased significantly worldwide in re-
cent years. This phenomenon is very complex and includes
biological, psychological and social variables. A large propor-
tion of people who have attempted suicide present with psy-
chiatric conditions, such as mood disorders (specifically, de-
pressive disorders), psychosis and substance abuse. Among
the social factors known to date, unemployment and social
isolation have been associated with high suicide rates [74].
A challenge when developing predictive clinical tools is to
establish what information should be used. Genetic and brain
imaging measures are possible sources of information and have
generated interest. However, even if effective, the cost and time
of data collection and processing may not be practical. Previous
attempts to identify clinical predictors of treatment outcome
usually selected some predictors based on clinical experience, and
investigated their overall effect gradually [75].
It has been demonstrated [50] that Data Mining applied to
EEG signals can be a useful tool to discriminate between
depressed and healthy people. Given the questionable reliability
of diagnoses based on clinical symptoms, this quantitative
methodology may be a useful adjunctive clinical decision
support for identifying depression, and it supports independent
studies confirming the potential clinical utility of
computer-assisted diagnosis of depression using EEG signals.
Fig. 6 Percentages of data mining techniques and algorithms applied to schizophrenia and bipolar disorders
Information technologies have the power to positively transform
the way patients are treated, and help us advance knowledge
more quickly [76]. Patients can receive highly personalized
treatments, therapists will receive help in making evidence-based
decisions, and scientists will be able to search for new
knowledge that reveals the true causes of Mental Health illnesses
while developing more effective treatment approaches.
Conclusion
The purpose of this review was to provide a state-of-the-art
overview of research on Data Mining techniques and algorithms
applied to Mental Health diseases. The selected studies
addressed the main Mental Health diseases using predictive
techniques applied to different study features, and were
able to detect the risk factors for most diseases. The predictive
models and the binary classifiers can be trained according to
the features obtained from all these techniques. Having reviewed
the literature published in the last 10 years on the use of
Data Mining techniques and algorithms applied to Mental
Health diseases, together with the reported results, we propose as a
future line and continuation of this research to apply the main
techniques found to a database of patients with schizophrenia,
compare the techniques and evaluate the results
in terms of performance and accuracy, and obtain
common patterns among patients with the disease. The authors
of this paper are already working on the aforementioned
database. Another proposed future line is the development
of a prediction model of patient cognitive impairment
(dementia) based on the proposed algorithms, in order to discover the
significant predictors that lead to the disease.
Acknowledgements This research has been partially supported by the
European Commission and the Ministry of Industry, Energy and
Tourism under the project AAL-20125036 named “Wetake Care: ICT-based
Solution for (Self-) Management of Daily Living”.
Compliance with Ethical Standards
Conflict of Interest The authors declare that they have no conflict of
interest.
Ethical Approval This article does not contain any studies with human
participants or animals performed by any of the authors.
References
1. Dhaka, P., and Johari, R., Big data application: Study and archival
of mental health data, using MongoDB. Int. Conf. Electr. Electron.
Optim. Tech.:3228–3232, 2016.
2. Dipnall, J. F., Pasco, J. A., Berk, M., Williams, L. J., Dodd, S.,
Jacka, F. N. et al., Fusing data mining, machine learning and tradi-
tional statistics to detect biomarkers associated with depression.
PLoS One. 11(2):1–23, 2016.
3. Pirooznia, M., Seifuddin, F., Judy, J., Mahon, P., Potash, J., and
Zandi, P., Data mining approaches for genome-wide association
of mood disorders. Psychiatr. Genet. 22(2):55–61, 2012.
4. Ni, H., Yang, X., Fang, C., Guo, Y., Xu, M., and He, Y., Data
mining-based study on sub-mentally healthy state among residents
in eight provinces and cities in China. J. Tradit. Chinese Med.
34(4):511–517, 2014.
5. (WHO) WHO. Trastornos mentales. 2018; Available from: http://
www.who.int/mediacentre/factsheets/fs396/es/ (last accessed April
2018).
6. Mathew, J., Mekkayil, L., Ramasangu, H., Karthikeyan, B. R., and
Manjunath, A. G., Robust algorithm for early detection of
Alzheimer’s disease using multiple feature extractions. IEEE
Annu. India Conf. 2016:1–6, 2016.
7. Bhagya Shree, S. R., and Sheshadri, H. S., An initial investigation
in the diagnosis of Alzheimer’s disease using various classification
techniques. 2014 IEEE Int Conf Comput Intell Comput Res.;1–5,
2014.
8. Qu, X., Yuan, B., and Liu, W., A predictive model for identifying
possible MCI to AD Conversions in the ADNI database. 2009 2nd
Int. Symp. Knowl. Acquis. Model KAM 3:102–105, 2009.
9. Gironi, M., Borgiani, B., Farina, E., Mariani, E., Cursano, C.,
Alberoni, M. et al., A global immune deficit in Alzheimer’s disease
and mild cognitive impairment disclosed by a novel data mining
process. J. Alzheimers Dis. 43(4):1199–1213, 2015.
10. Yoon, S., Taha, B., and Bakken, S., Using a data mining approach
to discover behavior correlates of chronic disease: A case study of
depression. Stud. Health Technol. Inform. 201:71–78, 2014.
11. Lee, C., Lam, C. P., and Masek, M., Rough-fuzzy hybrid approach
for identification of bio-markers and classification on Alzheimer’s
disease data. Proc. 2011 11th IEEE Int Conf Bioinforma Bioeng
BIBE.;84–91, 2011.
12. Alonso, S. G., de la Torre Díez, I., Rodrigues, J. J. P. C., Hamrioui,
S., and López-Coronado, M., A systematic review of techniques
and sources of big data in the healthcare sector. J. Med. Syst.
41(11):183, 2017.
13. Khan, A., and Usman, M., Early diagnosis of Alzheimer’s disease
using machine learning techniques. 2015 7th Int Jt Conf. Knowl.
Discov. Knowl. Eng. Knowl. Manag. (IC3K) 1:380–387, 2015.
14. Wongkoblap, A., Vadillo, M. A., and Curcin, V., Researching men-
tal health disorders in the era of social media: Systematic review. J.
Med. Internet Res. 19(6):e228, 2017.
15. Yuan, C., Data mining techniques with its application to the dataset
of mental health of college students. IEEE Work Adv. Res. Technol.
Ind. Appl. WARTIA. 2014:391–393, 2014.
16. Yoo, I., Alafaireet, P., Marinov, M., Pena-Hernandez, K., Gopidi,
R., Chang, J.F. et al., Data mining in healthcare and biomedicine: A
survey of the literature. J. Med. Syst. 36(4):2431–2448, 2012.
17. Sarraf, S., and Tofighi, G., Deep learning-based pipeline to recog-
nize Alzheimer’s disease using fMRI data. 2016 Future
Technologies Conference (FTC), IEEE. 816–820, 2016.
18. Fiscon, G., Weitschek, E., Felici, G., Bertolazzi, P., De Salvo, S.,
and Bramanti, P, et al., Alzheimer’s disease patients classification
through EEG signals processing. 2014 IEEE Symp Comput Intell
Data Min (CIDM).;105–112, 2014.
19. Pachange, S., Joglekar, B., and Kulkarni, P., An ensemble classifier
approach for disease diagnosis using random forest. 2015 Annu.
IEEE Ind. Conf.;1–5, 2015.
20. Le, Q. B., Shafiq, O., and Alhajj, R., Analyzing Alzheimer’s
disease gene expression dataset using clustering and association rule
mining. Proc. 2014 IEEE 15th Int Conf Inf Reuse Integr IEEE IRI.
;283–290, 2014.
21. Moon, S., Choi, B., An, J., and Yoon, T., Constructing a sorting
machine for degenerative cerebropathia. Int. Conf. Adv. Commun.
Technol. ICACT.;800–804, 2017.
22. Hadzic, M., Hadzic, F., and Dillon, T., Tree mining in mental health
domain. Proc. 41st Annu. Hawaii Int. Conf. Syst. Sci. ;1–8, 2008.
23. Simon, G. J., Li, P. W., Jack, Jr. C. R., and Vemuri, P.,
Understanding atrophy trajectories in Alzheimer’s disease using
association rules on MRI images. Proc. 17th ACM SIGKDD Int.
Conf. Knowl. Discov. Data Min. ;369–376, 2011.
24. Payandeh, S., Recursive Bayesian tracking for smart elderly living.
7th IEEE Annu Inf Technol Electron Mob Commun Conf IEEE
IEMCON.;1–7, 2016.
25. Chiang, H.-S., and Pao, S.-C., An EEG-based fuzzy probability
model for early diagnosis of Alzheimer’s disease. J. Med. Syst.
40(5):125, 2016.
26. Ertek, G., Tokdil, B., and Günaydın, İ., Risk factors and identifiers
for Alzheimer’s disease: A data mining analysis. Ind. Conf. Data
Min.;1–11, 2014.
27. Joshi, S., Shenoy, D, G.G. VS, Rrashmi, P. L., Venugopal, K. R.,
and Patnaik, L. M., Classification of Alzheimer’s disease and
Parkinson’s disease by using machine learning and neural network
methods. 2010 Second Int. Conf. Mach. Learn. Comput. ;218–222,
2010.
28. Plant, C., Teipel, S. J., Oswald, A., Böhm, C., Meindl, T.,
Mourao-Miranda, J. et al., Automated detection of brain atrophy patterns
based on MRI for the prediction of Alzheimer’s disease.
Neuroimage. 50(1):162–174, 2010.
29. Chaves, R., Górriz, J. M., Ramírez, J., Illán, I. A., Salas-Gonzalez,
D., and Gómez-Río, M., Efficient mining of association rules for
the early diagnosis of Alzheimer’s disease. Phys. Med. Biol. 56(18):
6047–6063, 2011.
30. Plant, C., Sorg, C., Riedl, V., and Wohlschläger, A., Homogeneity-based
feature extraction for classification of early-stage Alzheimer’s
disease from functional magnetic resonance images. Proc. 2011 Work
Data Min. Med. Healthc - DMMH. ;33–41, 2011.
31. Al-Dlaeen, D., and Alashqur, A., Using decision tree classification
to assist in the prediction of Alzheimer’s disease. 2014 6th Int.
Conf. Comput. Sci. Inf. Technol. (CSIT).;122–126, 2014.
32. Zhang, Y., Wang, S., and Dong, Z., Classification of Alzheimer
disease based on structural magnetic resonance imaging by kernel
support vector machine decision tree. Prog. Electromagn. Res. 144:
171–184, 2014.
33. Sheshadri, H. S., Shree, S. R. B., and Krishna, M., Diagnosis of
Alzheimer’s disease employing neuropsychological and classification
techniques. Proc 2015 5th Int. Conf. IT Converg. Sec. ICITCS.
;1–6, 2015.
34. Martínez-Ballesteros, M., García-Heredia, J. M., Nepomuceno-
Chamorro, I. A., and Riquelme-Santos, J. C., Machine learning
techniques to discover genes with potential prognosis role in
Alzheimer’s disease using different biological sources. Inf.
Fusion. 36:114–129, 2017.
35. Tejeswinee, K., Shomona, G. J., and Athilakshmi, R., Feature
selection techniques for prediction of neuro-degenerative disorders: A
case-study with Alzheimer’s and Parkinson’s disease. Proc.
Comput. Sci. 115:188–194, 2017.
36. Jacob, S. G., and Athilakshmi, R., Extraction of protein sequence
features for prediction of neuro-degenerative brain disorders:
Pioneering the CGAP database. Proc Int Conf Informatics Anal -
ICIA-16.;30, 2016.
37. Aditya, C. R., and Pande, M. B. S., Devising an interpretable cal-
ibrated scale to quantitatively assess the dementia stage of subjects
with alzheimer’s disease: A machine learning approach. Inform.
Med. Unlocked. 6:28–35, 2017.
38. Byeon, H. A., Prediction model for mild cognitive impairment
using random forests. Int. J. Adv. Comput. Sci. Appl. 6(12):8–12,
2015.
39. Fernández-Llatas, C., García-Gomez, J. M., Vicente, J., Naranjo, J.
C., Robles, M., and Benedí, J. M., et al., Behaviour patterns detec-
tion for persuasive design in nursing homes to help dementia pa-
tients. Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc.
EMBS.;6413–6417, 2011.
40. Wen, L., Bewley, M., Eberl, S., Fulham, M., and Feng, D. D.,
Classification of dementia from fdg-pet parametric images using
data mining. 5th IEEE Int. Symp. Biomed. Imaging From Nano to
Macro, ISBI. ;412–415, 2008.
41. Zhang, S., Mcclean, S., Nugent, C., Neill, S. O., Donnelly, M., and
Galway, L., et al., Prediction of assistive technology adoption for
people with dementia. Int. Conf. Heal. Inf. Sci. ;160–171, 2013.
42. Zhang, S., McClean, S. I., Nugent, C. D., Donnelly, M. P., Galway, L.,
Scotney, B. W. et al., A predictive model for assistive technology
adoption for people with dementia. IEEE J. Biomed. Heal.
Inform. 18(1):375–383, 2014.
43. Bang, S., Son, S., Roh, H., Lee, J., Bae, S., Lee, K. et al., Quad-
phased data mining modeling for dementia diagnosis. BMC Med.
Inform. Decis. Mak. 17(1):60, 2017.
44. Joshi, S., Shenoy, P. D., Venugopal, K. R., and Patnaik, L. M.,
Evaluation of different stages of dementia employing neuropsycho-
logical and machine learning techniques. BT - 2009 1st Int. Conf.
Adv. Comput. ICAC. ;154–160, 2009.
45. Zanin, M., Sousa, P., Papo, D., Bajo, R., García-Prieto, J., Del Pozo,
F. et al., Optimizing functional network representation of multivariate
time series. Sci Rep. 2:630, 2012.
46. Yang, S., Zhou, P., Duan, K., Hossain, M. S., and Alhamid, M. F.,
emHealth: Towards emotion health through depression prediction
and intelligent health recommender system. Mob. Netw. Appl.;1–11,
2017.
47. Jena, L., and Kamila, N. K., A model for prediction of human
depression using Apriori algorithm. 2014 Int. Conf. Inf. Technol.
;240–244, 2014.
48. Jung, Y., and Yoon, Y. I., Multi-level assessment model for wellness
service based on human mental stress level. Multimed. Tools Appl.
76(9):11305–11317, 2017.
49. Morales, S., Barros, J., Echávarri, O., García, F., Osses, A., Moya,
C. et al., Acute mental discomfort associated with suicide behavior
in a clinical sample of patients with affective disorders:
Ascertaining critical variables using artificial intelligence tools.
Front. Psychiatr. 8:7, 2017.
50. Mohammadi, M., Al-Azab, F., Raahemi, B., Richards, G.,
Jaworska, N., Smith, D. et al., Data mining EEG signals in depres-
sion for their diagnostic value. BMC Med. Inform. Decis. Mak.
15(1):108, 2015.
51. Chang, Y. S., Hung, W. C., and Juang, T. Y., Depression diagnosis
based on ontologies and bayesian networks. Proc - 2013 IEEE Int. Conf.
Syst. Man., Cybern. SMC. ;3452–3457, 2013.
52. Thanathamathee, P., Boosting with feature selection technique for
screening and predicting adolescents depression. 2014 4th Int Conf
Digit Inf Commun Technol Its Appl DICTAP.;23–27, 2014.
53. Ghafoor, Y., Huang, Y. P., and Liu, S. I., An intelligent approach to
discovering common symptoms among depressed patients. Soft.
Comput. 19(4):819–827, 2015.
54. Hou, Y., Xu, J., Huang, Y, and Ma, X., A big data application to
predict depression in the university based on the reading habits.
2016 3rd Int. Conf. Syst. Inform., ICSAI. ;1085–1089, 2016.
55. Husain, W., Xin, L. K., Rashid, N. A., and Jothi, N., Predicting
generalized anxiety disorder among women using random forest
approach. 2016 3rd Int Conf Comput Inf Sci. ;37–42, 2016.
56. Li, X., Hu, B., Sun, S., and Cai, H., EEG-based mild depressive
detection using feature selection methods and classifiers. Comput.
Methods Programs Biomed. 136:151–161, 2016.
57. Nie, Z., Gong, P., and Ye, J., Predict risk of relapse for patients with
multiple stages of treatment of depression. Proc. 22Nd ACM
SIGKDD Int. Conf. Knowl. Discov. Data Min.;1795–1804, 2016.
58. Spyrou, I. M., Frantzidis, C., Bratsas, C., Antoniou, I., and Bamidis,
P. D., Geriatric depression symptoms coexisting with cognitive de-
cline: A comparison of classification methodologies. Biomed. Sign.
Process Contrl. 25:118–129, 2016.
59. Kim, J. Y., Liu, N., Tan, H. X., and Chu, C. H., Unobtrusive mon-
itoring to detect depression for elderly with chronic illnesses. IEEE
Sens. J. 17(17):5694–5704, 2017.
60. D’monte, S., and Panchal, D., Data mining approach for diagnose
of anxiety disorder. Int. Conf. Comput. Commun. Autom.
(ICCCA).;124–127, 2015.
61. Tovar, D., Cornejo, E., Xanthopoulos, P., Guarracino, M. R., and
Pardalos, P. M., Data mining in psychiatric research. Psychiatr.
Disord. 829:593–603, 2012.
62. Lyalina, S., Percha, B., Lependu, P., Iyer, S. V., Altman, R. B., and
Shah, N. H., Identifying phenotypic signatures of neuropsychiatric
disorders from electronic medical records. J. Am. Med. Inform.
Assoc. 20:297–305, 2013.
63. Panagiotakopoulos, T. C., Lyras, D. P., Livaditis, M., Sgarbas, K.
N., Anastassopoulos, G. C., and Lymberopoulos, D. K., A contex-
tual data mining approach toward assisting the treatment of anxiety
disorders. IEEE Trans. Inf. Technol. Biomed. 14(3):567–581, 2010.
64. Ince, N. F., Goksu, F., Pellizzer, G., Tewfik, A., and Stephane, M.,
Selection of spectro-temporal patterns in multichannel MEG with
support vector machines for schizophrenia classification. 2008 30th
Annu. Int. Conf. IEEE Eng. Med. Biol Soc. ;3554–3557, 2008.
65. Gangwar, M., Mishra, R. B., and Yadav, R. S., Application of de-
cision tree method in the diagnosis of neuropsychiatric diseases.
Asia-Pacific World Congr Comput Sci Eng. ;1–8, 2014.
66. GeethaRamani, R., and Sivaselvi, K., Data mining technique for
identification of diagnostic biomarker to predict schizophrenia dis-
order. 2014 IEEE Int. Conf. Comput. Intell. Comput. Res. ;1–8,
2014.
67. Lanata, A., Greco, A., Valenza, G., and Scilingo, E. P., A pattern
recognition approach based on electrodermal response for
pathological mood identification in bipolar disorders. ICASSP,
IEEE Int. Conf. Acoust. Speech Sign. Process ;3601–3605, 2014.
68. Thongkam, J., and Sukmak, V., Enhancing decision tree with
adaboost for predicting schizophrenia readmission. Adv. Mater.
Res. 931:1467–1471, 2014.
69. Castaldo, R., Xu, W., Melillo, P., Pecchia, L., Santamaria, L., and
James, C., Detection of mental stress due to oral academic exami-
nation via ultra-short-term HRV analysis. Proc. Annu. Int. Conf.
IEEE Eng. Med. Biol. Soc. EMBS. ;3805–3808, 2016.
70. Azar, G., Gloster, C., El-Bathy, N., Yu, S., Neela, R. H, and
Alothman, I., Intelligent data mining and machine learning for
mental health diagnosis using genetic algorithm. IEEE Int. Conf.
Electro. Inf. Technol.;201–206, 2015.
71. Hadzic, M., Hadzic, F., and Dillon, T. S., Domain driven tree
mining of semi-structured mental health information. Data Min. Bus.
Appl. :127–141, 2009.
72. Nguyen, T., O’Dea, B., Larsen, M., Phung, D., Venkatesh, S., and
Christensen, H., Using linguistic and topic analysis to classify sub-
groups of online depression communities. Multimed. Tools Appl.
76(8):10653–10676, 2017.
73. Cairney, J., Veldhuizen, S., Vigod, S., Streiner, D. L., Wade, T. J.,
and Kurdyak, P., Exploring the social determinants of mental health
service use using intersectionality theory and CART analysis. J.
Epidemiol. Commun. Health. 68(2):145–150, 2014.
74. Barros, J., Morales, S., Echávarri, O., García, A., Ortega, J., Asahi,
T. et al., Suicide detection in Chile: Proposing a predictive model
for suicide risk in a clinical sample of patients with mood disorders.
Rev. Bras. Psiquiatr. 39(1):1–11, 2017.
75. Chekroud, A. M., Zotti, R. J., Shehzad, Z., Gueorguieva, R., Johnson,
M. K., Trivedi, M. H. et al., Cross-trial prediction of treatment outcome
in depression: A machine learning approach. Lancet Psychiatr.
3(3):243–250, 2016.
76. Hadzic, M., Hadzic, F., and Dillon, T. S., Mining of patient data:
Towards better treatment strategies for depression. Int. J. Funct.
Inform. Personal. Med. 3(2):122–143, 2010.