Data Mining Algorithms and Techniques in Mental Health: A Systematic Review

Background: Data Mining in medicine is an emerging field of great importance to provide a prognosis and deeper understanding of disease classification, specifically in Mental Health areas. Objective: The main objective of this paper is to present a review of the existing research works in the literature, referring to the techniques and algorithms of Data Mining in Mental Health, specifically in the most prevalent diseases such as: Dementia, Alzheimer, Schizophrenia and Depression. Methods: Academic databases that were used to perform the searches are Google Scholar, IEEE Xplore, PubMed, Science Direct, Scopus and Web of Science, taking into account as date of publication the last 10 years, from 2008 to the present. Several search criteria were established such as 'techniques' AND 'Data Mining' AND 'Mental Health', 'algorithms' AND 'Data Mining' AND 'dementia' AND 'schizophrenia' AND 'depression', etc. selecting the papers of greatest interest. Results: A total of 211 articles were found related to techniques and algorithms of Data Mining applied to the main Mental Health diseases. 72 articles have been identified as relevant works of which 32% are Alzheimer's, 22% dementia, 24% depression, 14% schizophrenia and 8% bipolar disorders. Many of the papers show the prediction of risk factors in these diseases. Conclusion: From the review of the research articles analyzed, it can be said that use of Data Mining techniques applied to diseases such as dementia, schizophrenia, depression, etc. can be of great help to the clinical decision, diagnosis prediction and improve the patient's quality of life.
SYSTEMS-LEVEL QUALITY IMPROVEMENT
Susel Góngora Alonso (1), Isabel de la Torre-Díez (1), Sofiane Hamrioui (2), Miguel López-Coronado (1), Diego Calvo Barreno (1), Lola Morón Nozaleda (3), Manuel Franco (4)
Received: 17 May 2018 /Accepted: 16 July 2018 /Published online: 21 July 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018
Keywords: Algorithms · Data mining · Mental health · Techniques
This article is part of the Topical Collection on Systems-Level Quality Improvement
* Isabel de la Torre-Díez, isator@tel.uva.es
Susel Góngora Alonso, suselgongoraalonso@gmail.com
Sofiane Hamrioui, Sofiane.Hamrioui@univ-nantes.fr
Miguel López-Coronado, miglop@tel.uva.es
Diego Calvo Barreno, info@diegocalvo.es
Lola Morón Nozaleda, lolamoron@gmail.com
Manuel Franco, mfm@intras.es
1 Department of Signal Theory and Communications, and Telematics Engineering, University of Valladolid, Paseo de Belén, 15, 47011 Valladolid, Spain
2 Bretagne Loire and Nantes Universities, UMR 6164, IETR Polytech Nantes, Nantes, France
3 Nozaleda and Lafora Mental Health Clinic, C/ José Ortega Y Gasset, 44, 28006 Madrid, Spain
4 Psychiatry Service, Hospital Zamora, Hernán Cortés, Zamora, Spain
Journal of Medical Systems (2018) 42: 161
https://doi.org/10.1007/s10916-018-1018-2
Introduction
Mental Health problems are characterized by a high degree of impairment, as in affective disorders that result in depression and in the different anxiety disorders. Worldwide, 25% of people suffer from Mental Health problems, in both developed and developing countries. The associated data are growing into terabytes and petabytes, 80% of which are unstructured, so they are difficult to process with database management tools and other traditional techniques. The global cost of Mental Health treatment is about $2.3 trillion. Improving the quality of treatments can reduce costs significantly, and this quality can be improved by introducing Data Mining tools and techniques in Mental Health [1].
In the last two decades there has been a steady increase in the use of Data Mining techniques in various disciplines [2]. Data Mining offers a path to knowledge discovery: it is a process for discovering patterns by exploring and modeling large amounts of data. It incorporates machine learning algorithms to learn, extract and identify useful information and subsequent knowledge from large databases [3]. In the last 10 years Data Mining techniques have been used in medical research, mainly in neuroscience and biomedicine. More recently, psychiatry has begun to use these techniques to gain a better understanding of the genetic composition of mental diseases [4].
According to the World Health Organization (WHO) [5], there are a variety of mental disorders, the main ones being dementia, schizophrenia, depression, bipolar disorders and Alzheimer's as a dementia-derived disease.
Currently most people suffer from neurodegenerative disorders related to the brain [6]. These disorders lead to various diseases. Dementia, in this case, is a general term for a decrease in mental capacity severe enough to interfere with daily life [7]. Alzheimer's Disease (AD) is the most common type of dementia and represents 60–80% of mental disorders [8]. Diagnosing the disease at an early stage is a crucial task; it is therefore of medical interest to develop predictive tools to evaluate this risk [9].
The objective of predictive data extraction in this area is to
build models from high-dimensional medical information and
use them to predict diagnostic results on unseen medical data
in order to support clinical decision making [10]. Approaches
in predictive data extraction can be applied to the construction
of decision models for medical procedures, such as prognosis,
diagnosis and treatment planning, which can be embedded
into clinical systems as systematic support components [11].
In this paper we pose the following research question: Are there works related to Data Mining techniques and algorithms applied to Mental Health with the purpose of obtaining predictions of diseases in this pathology? The aim of our paper is therefore to present a review of the state of the art of Data Mining techniques and algorithms in the prevalent diseases of Mental Health. This exhaustive study is the main contribution of our paper and allows us to direct future research towards the creation of new prediction algorithms. This paper gives continuity to a first review [12], which focused on analyzing sources and techniques of Big Data in the health sector and on identifying which of these techniques are the most used in the prediction of chronic diseases.
Existing reviews address the review, analysis and evaluation of Machine Learning techniques for the early detection of Alzheimer's disease [13], as well as the scope and limits of Data Mining techniques for predictive analysis in Mental Health [14].
The methodology used in this review is described below, followed by the results obtained, their discussion and the final conclusions of this research work.
Methodology
In this paper we have carried out a review of the published works related to techniques and algorithms of Data Mining in Mental Health until March 2018. To carry out the review, the following scientific databases were used: Google Scholar, IEEE Xplore, PubMed, Science Direct, Scopus and Web of Science. These databases cover most of the scientific information in multidisciplinary fields, engineering and medicine, and give access to articles in scientific and academic journals, repositories, archives and other collections of scientific texts. The key terms entered in the search engines of these databases are: "Techniques" AND "Algorithms" AND "Data Mining" AND ("dementia" OR "depression" OR "Alzheimer" OR "schizophrenia" OR "mental health"), both in Spanish and English. Those terms are searched in "Abstract/Title/Keywords", from 2008 to the present. The search criteria shown in Table 1 are those provided specifically by the search engine of each database.
The papers were selected by reading the titles and abstracts of the results obtained, and the full article when necessary. The selection criteria used to classify the papers were the following: 1) Studies of Data Mining techniques applied to the main Mental Health diseases. 2) Studies of Data Mining algorithms applied to the main Mental Health diseases. 3) Studies aimed at diseases not related to Mental Health were excluded. Articles repeated in more than one database were removed. Fig. 1 shows the flow diagram used in the review.
Of the 211 publications found, 89 were duplicates or had a title irrelevant to this research. The abstracts of the remaining 122 studies were read and analyzed to see which were of interest, resulting in 72 documents with relevant contributions. The following section presents the most relevant works and analyzes the main techniques and algorithms found in the review.
Main techniques and algorithms of data mining used in the review
Data Mining techniques have recently become a predominant field of research with wide applications in medical healthcare, financial services, telecommunications, natural sciences, etc. Data Mining is a process for discovering useful models in data, with the aim of interpreting existing behaviors or predicting future results [15].
The Data Mining algorithms are classified into two catego-
ries: descriptive (or unsupervised learning) and predictive (or
supervised learning). Descriptive data mining clusters data by
measuring the similarity between objects (or records) and dis-
covers unknown patterns or relationships in data while predic-
tive learning infers prediction rules (classification / prediction
models) from (training) data and applies the rules to
unpredicted / unclassified data [16].
The algorithms used in the prognosis and diagnosis of Mental Health diseases are supervised learning algorithms that include Artificial Neural Networks (ANNs), Decision Trees (DT), genetic algorithms and linear discriminant analysis. Other techniques in general use are Support Vector Machines (SVM), Association Rules (ARs) mining and Ensemble methods [12].
ANNs are computational models inspired by networks of the central nervous system, capable of machine learning and pattern recognition. In general, they are presented as systems of interconnected "neurons" that can compute values from inputs by feeding information through their network [9].
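As an illustration of how interconnected "neurons" compute values from inputs, the following minimal sketch feeds an input vector through a small fully connected network. The weights, biases, layer sizes and the sigmoid activation are hypothetical choices made for this example and are not taken from any of the reviewed studies.

```python
import math

def sigmoid(z):
    """Logistic activation: maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each neuron sums its weighted inputs,
    adds its bias and applies the activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, network):
    """Feed the inputs through every layer of the network in turn."""
    values = inputs
    for weights, biases in network:
        values = layer(values, weights, biases)
    return values

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),
    ([[1.2, -0.7]], [0.05]),
]
output = forward([1.0, 0.0], network)
```

Training (for example by back-propagation) is deliberately left out; the sketch only shows the forward computation.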
Convolutional Neural Networks (CNNs/ConvNets) are inspired by the human visual system and are similar to classic neural networks. This architecture has been specifically designed on the explicit assumption that the raw data consist of two-dimensional images, which allows certain properties to be encoded and also reduces the number of hyperparameters. The CNN topology uses spatial relationships to reduce the number of parameters that must be learned and, therefore, improves upon general feed-forward back-propagation training [17].
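The reduction in learned parameters can be made concrete with a short calculation. The sketch below compares a fully connected layer with a convolutional layer on the same image; the image size (28 x 28, one channel), the 5 x 5 kernel and the six output units or filters are assumptions chosen for illustration only.

```python
def dense_params(height, width, units):
    """Fully connected layer: one weight per pixel per unit, plus one bias per unit."""
    return height * width * units + units

def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Convolutional layer: each filter shares one small kernel across the
    whole image, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

dense = dense_params(28, 28, 6)  # 28*28*6 + 6 = 4710
conv = conv_params(5, 5, 1, 6)   # 5*5*1*6 + 6 = 156
```

Because the kernel weights are shared across spatial positions, the convolutional layer's parameter count does not grow with the image size.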
Table 1 Search criteria in the different scientific databases
Keywords | Google Scholar | IEEE Xplore | PubMed | Science Direct | Scopus | Web of Science
Techniques OR Algorithms AND Data Mining AND Mental Health | "abstract, title, keywords" | "abstract" | "title, abstract" | "abstract, title, keywords" | "abstract, title, keywords" | "title, abstract"
Techniques OR Algorithms AND Data Mining AND Dementia | "abstract, title, keywords" | "abstract" | "title, abstract" | "abstract, title, keywords" | "abstract, title, keywords" | "title, abstract"
Techniques OR Algorithms AND Data Mining AND Depression | "abstract, title, keywords" | "abstract" | "title, abstract" | "abstract, title, keywords" | "abstract, title, keywords" | "title, abstract"
Techniques OR Algorithms AND Data Mining AND Alzheimer | "abstract, title, keywords" | "abstract" | "title, abstract" | "abstract, title, keywords" | "abstract, title, keywords" | "title, abstract"
Techniques OR Algorithms AND Data Mining AND Schizophrenia | "abstract, title, keywords" | "abstract" | "title, abstract" | "abstract, title, keywords" | "abstract, title, keywords" | "title, abstract"
Fig. 1 Flow diagram used in the literature review
DTs (for example, C4.5) are used to model sequential decision problems. They are composed of nodes and edges: internal nodes represent a predicate over the objects in the data set, while each edge represents a division rule over an attribute (typically, a binary division rule). Indeed, every node has two (or more) outgoing branches: one is associated with objects whose attributes satisfy the predicate, whereas the other is associated with those that do not. Generalized DT classifiers, such as C4.5, rely on an entropy rule or information gain rule that finds at each node a predicate that optimizes an entropy function of the defined partition [18].
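The entropy and information gain rule can be sketched as follows. This is a simplified illustration using plain information gain (C4.5 itself refines this criterion), and the attribute names, values and class labels are hypothetical toy data, not drawn from any reviewed study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy of the partitions
    obtained by splitting the records on the given attribute."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical records: the attribute that best separates the classes
# is the one with the highest information gain.
rows = [
    {"memory_loss": "yes", "smoker": "no"},
    {"memory_loss": "yes", "smoker": "yes"},
    {"memory_loss": "no", "smoker": "yes"},
    {"memory_loss": "no", "smoker": "no"},
]
labels = ["AD", "AD", "control", "control"]
best = max(rows[0], key=lambda a: information_gain(rows, labels, a))
```

A tree builder would apply this selection recursively at each node, splitting on the best attribute until the partitions are pure or no attributes remain.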
There are classification methods such as those defined
below:
SVM builds a separating hyperplane that maximizes the minimum distance between data of different classes in a new space obtained by applying a kernel function to the original data. SVMs are particularly suitable for binary classification tasks; in this case, the input data are two sets of n-dimensional vectors [18].
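To illustrate what "maximizing the minimum distance" means, the sketch below computes the margin of two candidate hyperplanes over a small labeled data set and keeps the one with the larger margin. It performs no training and applies no kernel; the points, labels and candidate hyperplanes are hypothetical values chosen for the example.

```python
import math

def margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.
    A negative value means at least one point lies on the wrong side."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
               for x, y in zip(points, labels))

# Two classes of 2-dimensional vectors, labeled -1 and +1.
points = [(1.0, 1.0), (2.0, 2.0), (4.0, 4.0), (5.0, 5.0)]
labels = [-1, -1, 1, 1]

# Two candidate separating hyperplanes given as (w, b).
candidates = {"A": ((1.0, 1.0), -6.0), "B": ((1.0, 0.0), -2.5)}
margins = {name: margin(w, b, points, labels)
           for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
```

Both candidates separate the classes, but an SVM would prefer the one with the larger margin; an actual SVM solver searches over all hyperplanes rather than a fixed list.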
Rule-based classifiers assign a given class to each object according to a specific function r: condition → c (called a classification rule), such that the rule r covers an object x if the attributes of x satisfy the condition of r. Therefore, in this type of classification, the classifier uses logical propositional formulas in disjunctive or conjunctive normal form ("if-then rules") to classify the given samples [18].
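A minimal sketch of such if-then rules is shown below. Each rule is a conjunctive condition paired with a class, and the first rule that covers a record decides its class. The attributes, values, class names and the first-match policy are assumptions made for illustration.

```python
def covers(rule, record):
    """A rule covers a record if every attribute test in its condition holds."""
    condition, _ = rule
    return all(record.get(attr) == value for attr, value in condition.items())

def classify(rules, record, default="unknown"):
    """Return the class of the first rule that covers the record."""
    for rule in rules:
        if covers(rule, record):
            return rule[1]
    return default

# Hypothetical rule list: (condition, class).
rules = [
    ({"memory_loss": "yes", "age_group": "65+"}, "at_risk"),
    ({"memory_loss": "no"}, "control"),
]
result = classify(rules, {"memory_loss": "yes", "age_group": "65+"})
```

Records covered by no rule fall back to the default class, which is one common way to handle incomplete rule sets.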
Among the different classifier variants, Ensemble and Random Forest classifiers have attracted the attention of researchers due to their handling of missing values and noisy data, their classification of features and their feature selection to form tree nodes [19]. Random Forest is a variant of ensemble classifier consisting of a collection of tree-structured classifiers {h(x, Θk), k = 1, 2, ...}, where {Θk} are independent identically distributed random vectors. Each tree casts a unit vote for the most popular class at input x. It is a popular supervised classification and regression method that uses the concept of random feature selection to build decision trees [19].
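The voting step can be sketched in a few lines. Here each "tree" is a hand-written stand-in function rather than a tree grown from bootstrap samples with random feature selection, and the feature names and thresholds are hypothetical.

```python
from collections import Counter

def forest_predict(trees, x):
    """Each tree casts one vote for a class; the most voted class wins."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Stand-ins for trained trees (illustrative thresholds only).
trees = [
    lambda x: "AD" if x["mmse"] < 24 else "control",
    lambda x: "AD" if x["age"] > 75 else "control",
    lambda x: "AD" if x["mmse"] < 20 else "control",
]
prediction = forest_predict(trees, {"mmse": 22, "age": 80})
```

With an odd number of trees and two classes, the vote can never tie, which keeps the example deterministic.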
The Naïve Bayesian classifier is a selective classifier that calculates a set of probabilities by counting the frequency and combinations of values in a given data set. It assumes that all the variables that contribute to the classification are mutually independent. The Naïve Bayesian classifier is based on Bayes' theorem and the theorem of total probability [7].
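The frequency counting and the independence assumption can be sketched as follows. The toy records, the binary attributes and the use of Laplace smoothing are assumptions introduced for this example.

```python
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and, per class and attribute, value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute) -> value counts
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            value_counts[(label, attr)][value] += 1
    return class_counts, value_counts

def predict(model, row):
    """Pick the class maximizing P(class) * product of P(value | class)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for attr, value in row.items():
            # Laplace smoothing; assumes every attribute has 2 possible values.
            score *= (value_counts[(label, attr)][value] + 1) / (count + 2)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training records.
rows = [
    {"memory_loss": "yes", "smoker": "yes"},
    {"memory_loss": "yes", "smoker": "no"},
    {"memory_loss": "no", "smoker": "yes"},
    {"memory_loss": "no", "smoker": "no"},
]
labels = ["AD", "AD", "control", "control"]
model = train(rows, labels)
prediction = predict(model, {"memory_loss": "yes", "smoker": "no"})
```

Multiplying the per-attribute probabilities is exactly where the mutual independence assumption enters.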
JRip (RIPPER) is one of the basic and most popular algorithms. Classes are examined in increasing size and an initial set of rules for each class is generated using incremental reduced-error pruning [7]. The algorithm proceeds by treating all the examples of a particular judgment in the training data as a class and finding a set of rules that covers all members of that class. It then proceeds to the next class and does the same, repeating this until all classes have been covered.
K-means is a basic clustering technique in biocomputing. The goal is to find K patterns by calculating the distance between each sample value [20].
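A minimal one-dimensional sketch of the procedure is given below: each value is assigned to its nearest centroid, and each centroid is then moved to the mean of its cluster. The sample values, the initial centroids and the fixed number of iterations are assumptions made for the example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning values to the nearest centroid and
    recomputing each centroid as the mean of its cluster."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical measurements forming two visible groups.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 6.0])
```

Real implementations usually stop when the assignments no longer change and use a distance defined over several features instead of a single value.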
Apriori is an algorithm that determines the associations between data by checking frequencies. It is one of the major algorithms based on association rules. The main purpose of using the Apriori algorithm in Data Mining is to find patterns in a data set. It calculates the conditional probability of each case and returns the most probable set of data, namely the one that appears most frequently in the input data [21].
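The frequency checking can be sketched as a level-wise search for frequent item sets, followed by the conditional probability (confidence) of one rule. The transactions, item names and the minimum support of 0.5 are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)

    def support(items):
        return sum(items <= t for t in transactions) / n

    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    size = 1
    while current:
        kept = {c: support(c) for c in current if support(c) >= min_support}
        frequent.update(kept)
        size += 1
        # Next candidates: unions of frequent itemsets of the previous size
        # whose subsets of that size are all frequent.
        current = {a | b for a in kept for b in kept if len(a | b) == size}
        current = {c for c in current
                   if all(frozenset(s) in kept
                          for s in combinations(c, size - 1))}
    return frequent

# Hypothetical symptom records, one set per patient.
transactions = [
    frozenset({"insomnia", "fatigue", "low_mood"}),
    frozenset({"insomnia", "low_mood"}),
    frozenset({"fatigue", "low_mood"}),
    frozenset({"insomnia", "fatigue"}),
]
frequent = apriori(transactions, min_support=0.5)

# Confidence of the rule insomnia -> low_mood, i.e. P(low_mood | insomnia).
confidence = (frequent[frozenset({"insomnia", "low_mood"})]
              / frequent[frozenset({"insomnia"})])
```

Pruning candidates whose subsets are not frequent is what keeps the search tractable on larger data sets.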
The Data Mining approach can significantly help the re-
search into mental illness, to find patterns and knowledge
embedded into the data. It requires exploration and analysis
of large quantities of data for the purpose of better understand-
ing and deriving knowledge regarding the problem at hand
[22]. Below we show the results obtained.
Results
In the literature review we found a large number of studies that base their research on the use of Data Mining techniques and algorithms applied to Mental Health diseases. Figure 2 shows the statistics of the relevant papers found in the last 10 years. From a total of 72 papers found, 35 belong to journals while 38 are conference papers.
Fig. 2 Relevant papers statistics found in the last 10 years
Alzheimer
Alzheimer's is a multifaceted disease in which the accumulated cerebral pathology produces a progressive cognitive deterioration that finally leads to dementia [23, 24]. It is characterized by gradual deterioration of memory and other cognitive functions, including language skills, attention, understanding, judgment and abstract thinking. Patients eventually lose mental function, leaving them completely dependent on others for care [25].
Therefore, the analysis of AD and the knowledge derived from the available data are important to understand the disease, alleviate its effects and smooth the way to a cure [26]. Table 2 presents the main studies found in the review regarding the use of Data Mining techniques and algorithms in AD, while Fig. 3 shows the percentages of the main Data Mining techniques applied to this disease, DT being the most used, followed by SVM and Naïve Bayes.
Table 2 Studies of the bibliographic review related to Data Mining techniques and algorithms applied to patients with Alzheimer's
Authors | Year of publication | Study proposal | Techniques and algorithms | Results
Qu, Yuan, & Liu [8] | 2009 | Predictive model to identify possible conversions from MCI to AD based on the ADNI database. | Naïve Bayes | - Naïve Bayes can predict the conversion with a reasonably good AUC value after feature selection.
Joshi et al. [27] | 2010 | They propose a new model for AD classification that considers the most influential risk factors. Different models were developed for AD classification. | Neural Networks and Machine Learning: Random Forest tree, Multilayer Perceptron | - It was discovered that some specific genetic factors, diabetes, age and smoking were the strongest risk factors for AD. - The classification model was validated with the test cases and achieved an average classification accuracy of 99.25% with Random Forest tree and the Multilayer Perceptron.
Plant et al. [28] | 2010 | They develop a new data mining framework in combination with three different classifiers to derive a quantitative index of pattern matching for the prediction of the conversion from MCI to AD. | SVM, Bayes statistics, voting feature intervals (VFI) | - Bayes and VFI yielded superior results compared to the SVM approach, showing a novel approach to identify regions of high discriminatory power for AD identification and for the prediction of conversion from MCI to AD.
Chaves et al. [29] | 2011 | They propose to discover associations between the attributes that characterize perfusion patterns of normal subjects and make use of them for AD early diagnosis. | Association rules | - The proposed method yields up to 94.87% classification accuracy (sensitivity = 91.07%, specificity = 100%), outperforming the methods developed up to that year for early AD diagnosis.
Plant et al. [30] | 2011 | Introduce a bootstrapping-based feature extraction technique to identify early-stage AD from resting-state functional resonance images. | SVM | - They show that subjects with early-stage AD can be distinguished from healthy subjects of the same age with an accuracy of 79%.
Al-Dlaeen & Alashqur [31] | 2014 | They present an approach towards a prediction model for AD. | DT | - Using Information Gain makes it possible to construct an optimal or near-optimal tree with fewer nodes and branches than a random selection of attributes. - The system can be used to predict the situation of new patients with regard to AD.
Bhagya Shree & Sheshadri [7] | 2014 | Diagnosis of the disease using various machine learning techniques of Data Mining. | Naïve Bayes, DT J48, Random Forest, JRip | - When performance is evaluated in terms of classification accuracy, precision, recall and execution time of each technique, Naïve Bayes is the best of the four techniques.
Fiscon et al. [18] | 2014 | They propose the automatic classification of patients from the Electroencephalogram (EEG) biomedical signals involved in AD and MCI in order to support medical staff in formulating the right diagnosis. | DT J48, SVM | - In the identification of MCI and CT, DT J48 outperforms SVM in the correct classification percentage, achieving 90% accuracy and 87% specificity, as well as high performance in all other metrics, using leave-one-out cross validation (51 folds). - In the identification of AD and CT, J48 achieves better classification results (73% accuracy) than SVM (68% accuracy) in leave-one-out cross validation (63 folds). - When distinguishing AD from patients with MCI, J48 achieves the best results (80% accuracy, 79% specificity) with respect to the SVM classifier.
Queau, Shafiq, & Alhajj [20] | 2014 | They present several Data Mining techniques to analyze the data set of AD gene expression. | Association Rule Mining, Clustering | - It helped to discover interesting and useful patterns in the data set, such as deducing the correlation between the different characteristics of the genes that are healthy or not in different stages of AD.
Zhang, Wang, & Dong [32] | 2014 | Novel classification system to distinguish between elderly subjects with AD, mild cognitive impairment, and normal controls (NC). | Kernel support vector machine decision tree (kSVM-DT) | - The kSVM-DT method gathered the main components of magnetic resonance imaging (MRI) data and additional data, and the final classification accuracy is 80%.
Pachange, Joglekar, & Kulkarni [19] | 2015 | They use data classification techniques related to Data Mining, in which the decisions of multiple base classifiers are combined for accurate prediction of the presence or absence of AD anomalies. | Ensemble, Random Forest | - Both techniques have proven efficient for medical dataset classification, due to their ability to handle noisy data, missing values and unbalanced datasets.
Sheshadri, Shree, & Krishna [33] | 2015 | New emerging approach in AD diagnosis. | Naïve Bayes, JRip, Random Forest | - The JRip and Random Forest techniques show better results compared to Naïve Bayes.
Sarraf & Tofighi [17] | 2016 | They use a Convolutional Neural Network to distinguish an Alzheimer's brain from a normal, healthy brain. | CNN | - AD data were successfully distinguished from normal control data with 96.86% accuracy using a CNN deep learning architecture (LeNet).
Martínez-Ballesteros et al. [34] | 2017 | To provide gene expression patterns and a deeper knowledge of biological functions with greater relevance. | DT, quantitative rules, hierarchical cluster | - The results have shown that the obtained rules successfully characterize the underlying information, grouping relevant genes for the problem under study and agreeing with prior biological knowledge. - 90 genes were found to be significantly altered in AD patients.
Tejeswinee, Shomona, & Athilakshmi [35] | 2017 | New dataset consisting of genetic information pertaining to a neurodegenerative disorder: AD. | SVM, Random Forest, DT, Naïve Bayes, Adaboost, k-Nearest-Neighbour (kNN) | - Before feature selection, Random Forest and kNN classifiers predicted the diagnostic classes with high accuracy (~82%) compared with the other classification techniques. SVM gave the best accuracy (~94%) with CFS subset evaluation. - With the Gain Ratio method, Random Forest showed impressive results (~85%), followed by SVM and DT classifiers. - The selection of optimal features by CFS followed by the implementation of DFS, thus creating a hybrid feature selection technique, improves the classification accuracy and helps in early disease diagnosis.
Fig. 3 Percentages of data mining techniques and algorithms applied to Alzheimer's studies
In [9] the authors studied the relationship and distribution between immune-phenotypic and oxidative stress biomarkers in AD and mild cognitive impairment (MCI) through the application of ANNs, evaluating the predictive capacity of the TWIST algorithm. The Auto-Contractive Map algorithm (Auto-CM) was applied to generate a graph that reveals the most important links among variables. Applying these complex mathematical tools, they succeeded in obtaining an algorithm able to distinguish these neurological conditions from biological parameters with good accuracy. Moreover, a global immune deficit emerged from this analysis in pathological individuals compared to healthy controls, providing new knowledge about the pathogenesis of dementia.
The authors of [36] present the first database that has been extracted to investigate the protein features common to AD. This database can be used to perform additional investigations into the identification of causes, symptoms and possible treatments for neurodegeneration through computational methods. A combination of brain imaging and clinical assessment that checks for signs of memory impairment and movement problems is used to identify patients with AD. A definitive diagnosis can only be obtained after the patient's autopsy by examining brain tissues. There is a clear need for tangible advances in the area of biomarkers for the assessment of risk, diagnosis and monitoring of disease progression.
The authors of [37] explore the possibility of analyzing subjects with AD from non-image data using a supervised learning approach, and quantify their abnormality using measurable similarity indexes. As a result, the proposed model works with numerical input values. Researchers can continue to use MRI-based methods as a preprocessing step for the proposed model, extracting useful information from images in the form of numerical values to be used as inputs for the model, obtaining better results.
Table 3 Studies of the bibliographic review related to the Data Mining techniques and algorithms applied to patients with Dementia
Authors | Year of publication | Study proposal | Techniques and algorithms | Results
Wen et al. [40] | 2008 | Investigated the potential of parametric images from FDG-PET studies to aid dementia classification using Data Mining techniques. | Logistic regression | - The results show that the cerebral metabolic rate of glucose consumption (CMRGlc) was efficient in the classification of dementia, with Data Mining using voxel-level features with PCA and the logistic regression model achieving the best classification.
S. Zhang et al. [41] | 2013 | Predictive model for the technology adoption of a video transmission solution based on mobile phones developed for people with dementia, taking into account individual features. | DT C4.5 algorithm, Logistic regression | - The statistical tests do not show a significant difference between the performance of the DT and logistic regression models (p = 0.894). - The advantage of DT models is that they are easy to use and interpret from the perspective of a non-technical user.
S. Zhang et al. [42] | 2014 | Predictive adoption model for a video transmission system based on mobile phones, developed for people with dementia. | Neural networks, DT C4.5, SVM, Naïve Bayes, adaptive boosting, CART, kNN | - The k-Nearest-Neighbor algorithm using seven characteristics proved to be the optimal classifier of assistive technology adoption for people with dementia (prediction accuracy 0.84 ± 0.0242).
Byeon [38] | 2015 | Mild cognitive impairment prediction model for older people in Korean local communities. | Random Forest, Logistic regression, DT | - The significant predictors of MCI were age, gender, level of education, income level, subjective health, marital status, smoking, drinking, regular exercise and hypertension. - The Random Forest model was more accurate than the Logistic regression model and the Decision Tree.
Bang et al. [43] | 2017 | Systematic and global data mining model to improve the process of dementia assessment. | SVM, ANN, DT | - The results show that SVM, with a high performance of AUC 0.96, was configured as the model of subsidiary decision making for a clinician. - Therefore, it is an intelligent system that allows an intuitive collaboration between the CAD system and doctors.
Moon et al. [21] | 2017 | They propose the construction of a sorting machine with the aim of distinguishing two dementia diseases: Lewy bodies and Parkinson's disease. | Apriori, DT, SVM, K-means, Neural Network | - To obtain greater precision, they first used k-means to discover where the Sum of Squared Error (SSE) of the clusters becomes larger and then used a Neural Network considering the SS-means of k-means.
Dementia
As the worldwide aged population increases with the development of science, technology and medicine, the number of geriatric diseases also increases radically. According to the World Alzheimer Report in 2015, the worldwide dementia population was 44 million in 2013 and will increase more than 3-fold to 135 million in 2050 [38]. It has been proposed to apply specific motivation protocols adapted to the special features of individual subjects, which make it possible to motivate people to change behavioral disorders and promote healthy behavior patterns [39].
Next, we present the studies found (see Table 3) regarding the application of Data Mining techniques and algorithms in patients with dementia. Figure 4 shows the percentages of the main techniques applied to dementia, DT being the most used, followed by ANN, Logistic regression and SVM.
In [44] the authors use Machine Learning applications with Neural
Networks together with the Mini Mental State Examination (MMSE) and
Functional Activities Questionnaire (FAQ) to classify
states of dementia and improve accuracy. This analysis shows
that the MMSE and FAQ tests, used in conjunction with the
Machine Learning and Neural Networks methods, improve
the correct classification of cognitively impaired and demented
subjects by 20.84% - 26.84% over the use of the MMSE and
FAQ tests alone. These studies can be beneficial for understanding
disease progression and for assessing prophylactic strategies that
can distinguish the different states of dementia, so that suitable
measures can be taken to manage dementia in general and AD
in particular.
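As a rough illustration of how two test scores can feed a learned classifier, the sketch below trains a single logistic neuron (the simplest neural unit) on MMSE and FAQ scores. It is not the network used in [44]: the training data are invented for illustration, the 0-30 scaling of both scores is an assumption, and the names `TRAINING_DATA`, `scale`, `train` and `predict` are hypothetical.

```python
import math

# Synthetic (MMSE, FAQ) scores, invented for illustration only.
# Label 0 = cognitively normal, 1 = cognitively impaired.
TRAINING_DATA = [
    ((29, 0), 0), ((28, 2), 0), ((27, 1), 0), ((30, 3), 0), ((26, 4), 0),
    ((18, 15), 1), ((15, 20), 1), ((20, 12), 1), ((12, 25), 1), ((10, 28), 1),
]


def scale(mmse, faq):
    """Map both scores (assumed to range from 0 to 30) into [0, 1]."""
    return mmse / 30.0, faq / 30.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, learning_rate=0.5, epochs=5000):
    """Fit one logistic neuron by batch gradient descent on the log-loss."""
    w_mmse, w_faq, bias = 0.0, 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        g_mmse = g_faq = g_bias = 0.0
        for (mmse, faq), label in data:
            x1, x2 = scale(mmse, faq)
            error = sigmoid(w_mmse * x1 + w_faq * x2 + bias) - label
            g_mmse += error * x1
            g_faq += error * x2
            g_bias += error
        w_mmse -= learning_rate * g_mmse / n
        w_faq -= learning_rate * g_faq / n
        bias -= learning_rate * g_bias / n
    return w_mmse, w_faq, bias


def predict(model, mmse, faq):
    """Return 1 (impaired) when the estimated probability reaches 0.5."""
    w_mmse, w_faq, bias = model
    x1, x2 = scale(mmse, faq)
    return 1 if sigmoid(w_mmse * x1 + w_faq * x2 + bias) >= 0.5 else 0


model = train(TRAINING_DATA)
print(predict(model, 29, 1), predict(model, 11, 27))
```

A real study would use clinically collected scores, a held-out test set and a multi-layer network; the point here is only that the two questionnaire scores become numerical inputs whose weights are learned from labeled cases.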
In [45] the authors propose a method based on analyzing
networks of functional brain activity of healthy subjects and
patients with mild cognitive impairment, an intermediate stage
between the expected cognitive decline of normal aging and
the more pronounced decline of dementia. The effectiveness
of the proposed approach is confirmed by the score
achieved in the classification task, close to 95%. The method
was applied to a biomedical classification task, but it
is fully general, since it can be applied to any type of
raw multivariate time series.
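One common way to turn a raw multivariate time series into a network, sketched below, is to link two channels whenever their correlation is strong enough. This is only an assumed, simplified construction: the use of Pearson correlation, the fixed threshold of 0.8, the toy `channels` data and the function names are illustrative choices, not the procedure of [45].

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def functional_network(series, threshold):
    """Link two channels when |correlation| reaches the threshold."""
    n = len(series)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(series[i], series[j])) >= threshold:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


def link_density(adjacency):
    """Fraction of possible links that are present (a simple network feature)."""
    n = len(adjacency)
    links = sum(adjacency[i][j] for i in range(n) for j in range(i + 1, n))
    return links / (n * (n - 1) / 2)


# Three hypothetical channels with 8 samples each.
channels = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [2.1, 3.9, 6.2, 8.0, 9.9, 12.1, 14.0, 16.2],   # closely tracks channel 0
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],  # alternating signal
]

network = functional_network(channels, threshold=0.8)
print(network, link_density(network))
```

Features such as the link density can then be passed to any classifier; varying the threshold changes the network and therefore the classification score.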
Depression
Nowadays, the main challenge in studying emotional disorders
is to grasp the patient's emotional changes. Research
on user emotion detection mainly falls into three categories:
emotion recognition based on audio-visual signals, physiological
signals and multimodal data [46]. Continuous monitoring
of a person's stress levels is essential to understand and
manage personal stress [47]. A series of physiological markers
is widely used for stress assessment, including galvanic skin
response, various characteristics of heartbeat patterns,
blood pressure and respiratory activity [48].
Suicide is another of the most feared consequences of
mental illness. While there are significant differences
Fig. 4 Percentages of data mining
techniques and algorithms applied
to dementia
161 Page 8 of 15 J Med Syst (2018) 42: 161
Table 4 Studies of the bibliographic review related to the Data Mining techniques and algorithms applied to patients with Depression

| Authors | Year of publication | Study proposal | Techniques and Algorithms | Results |
| --- | --- | --- | --- | --- |
| Chang, Hung, & Juang [51] | 2013 | They use ontologies and Data Mining techniques to construct the depression terminology and infer depression probability. | Ontologies, Bayesian networks | The results represent excellently the organization structure of large complex domains, while the Bayesian probability allows the assignment of probabilities to any other statement type. |
| Thanathamathee [52] | 2014 | They developed and evaluated a model that uses adolescent depression data to assess and predict depression based on boosting, with feature selection techniques. | AdaBoost based on DT classifier | The results are particularly useful for depression screening in adolescents and provide confirmation to doctors who ask questions and clearly observe depression symptoms. They reduce diagnosis time during the completion of questionnaires by patients. |
| Ghafoor, Huang, & Liu [53] | 2015 | Application of Data Mining techniques to a depression database containing 5964 records. These techniques are used together to extract efficient results. | Association analysis, frequent pattern tree | The analysis results establish the most common symptoms of depressed patients, as well as their scenarios. |
| Hou et al. [54] | 2016 | They propose the correlation between reading habits and depressive tendency of university students based on a dataset of university library records and results of Mental Health questionnaires. | kNN, SVM, Naïve Bayesian, Linear regression, Logistic regression | They tested and compared the linear regression and logistic regression algorithms to construct prediction models and finally selected logistic regression for its highest prediction accuracy at a lower relative error. |
| Husain et al. [55] | 2016 | Predict generalized anxiety disorder among women. | Random Forest | The prediction accuracy is above 0.9, which indicates that the Random Forest approach is able to accurately predict Generalized Anxiety Disorder (GAD). For the specificity evaluation, Random Forest shows consistency by accurately predicting people who do not have GAD. |
| Li et al. [56] | 2016 | Research based on EEG that seeks to find the prominent frequency bands and brain regions that are more related to mild depression. | Bayes Net, SVM, Logistic Regression, kNN, Random Forest. Best First (BF), Greedy Stepwise (GSW), Genetic Search (GS), Linear Forward Selection (LFS) and Rank Search (RS) based on Correlation Features Selection (CFS) were applied for feature selection. | GSW based on CFS and kNN had the optimal performance, and the beta frequency band played a more important role in the detection of mild depression than the alpha and theta frequency bands, with classification accuracy higher than 92% and AUC above 0.950 for beta frequency bands of Emo_block and Neu_block. |
| Nie, Gong, & Ye [57] | 2016 | They propose a censored regression approach to predict the risk of patient relapse after initial remission from one or multiple stages of antidepressant treatment. | Regression tree, linear combination of covariates | They show that the main risk factors identified by the multistage linear method are not only consistent with the findings from some of the recent research about relapse among patients with MDD who had initially achieved remission, but also provide some insights on how to develop therapies for prevention of relapse. |
| Spyrou et al. [58] | 2016 | To evaluate the neurophysiological features of elderly participants suffering from depression and | Random Forest, Random Tree, Multilayer Perceptron (MLP Network), SVM | The efficiency of classifiers varied from 92.42 to 95.45%, with |
between suicide rates in many countries, suicide ranks
among the top 15 causes of death around the world [49].
Early recognition and accurate diagnosis of depression are
essential criteria to optimize treatment selection and improve
outcomes, thus reducing the economic and psychosocial
burdens that result from hospitalization, lost work
productivity and suicide [50].
Table 4 shows the studies found related to the Data
Mining techniques and algorithms in patients with depression,
while Fig. 5 shows the percentages of the main techniques
applied to this disease, with SVM being the most used,
followed by Naïve Bayes.
In [60] the authors propose a new approach using
Data Mining techniques to predict the stress level of a patient
using logistic model trees and to identify the different factors that
affect the patient's Mental Health. Stress prediction
and the generated rules will act as a support tool to help
medical experts provide treatment and advise patients to
take precautions to prevent future complications. It will also
reduce the cost of several medical tests and allow patients
to take preventive measures well in advance.
Schizophrenia and bipolar disorders
Especially in psychiatry, technology and science have made
available new computational methods to help develop predictive
models and identify diseases with greater precision [61]. Among
mental illnesses, disorders related to autism, bipolar disorder and
schizophrenia have a particularly high impact on affected individuals
and their families, and they represent a heavy economic
burden for the health care system [62]. Anxiety disorders include
generalized anxiety disorder (GAD), posttraumatic stress disorder
(PTSD), panic disorder (PD), social phobia and specific
phobias, among others [63]. Below we show the main studies
found on schizophrenia and bipolar disorders (see Table 5) and
Fig. 5 Percentages of data mining
techniques and algorithms applied
to depression
Table 4 (continued)

| Authors | Year of publication | Study proposal | Techniques and Algorithms | Results |
| --- | --- | --- | --- | --- |
| Spyrou et al. [58] (continued) | | neurodegeneration. The work focuses on the identification of depression symptoms that coexist with cognitive deterioration, and on the correlation of the examined neurophysiological features with geriatric depression combined with cognitive impairment. | | Random Forest being the most accurate (95.5%). |
| Kim et al. [59] | 2017 | They propose a simple and discreet detection system that uses passive infrared sensors to monitor the daily life activities of elderly people who live alone. | Neural networks, DT C4.5, Bayesian networks, SVM | Neural networks surpass the other algorithms, followed by C4.5 DT, and are effective in detecting normal conditions and mild depression with up to 96% accuracy. |
Table 5 Studies of the bibliographic review related to the Data Mining techniques and algorithms applied to patients with Schizophrenia and Bipolar disorders

| Authors | Year of publication | Study proposal | Techniques and Algorithms | Results |
| --- | --- | --- | --- | --- |
| Ince et al. [64] | 2008 | Framework for schizophrenia diagnosis based on the spectro-temporal patterns selected by an SVM from multichannel magnetoencephalogram (MEG) recordings in a verbal working memory task. | Recursive feature elimination technique (SVM-RFE) | The SVM-RFE algorithm can successfully select features from a large prediction space associated with neuronal activity in a functional task, and these features can be used effectively in recognizing patients with schizophrenia. |
| Gangwar, Mishra, & Yadav [65] | 2014 | Analyze through Data Mining algorithms different parameters for detection and diagnosis of neuropsychiatric diseases, including Schizophrenia. | DT C5.0 | The results show that algorithm C5.0 has an accuracy of 90% and provides a quick and easy way for doctors to make a decision regarding disease diagnosis. |
| GeethaRamani & Sivaselvi [66] | 2014 | They investigate the resting state fMRI images of 15 normal controls and 12 schizophrenia patients by constructing a functional connectome using image preprocessing techniques, specifically realignment, temporal correction, filtering, etc. | Random Forest, DT C4.5, Regression tree, k-Nearest Neighbour | These algorithms have produced classification rules that are used in the prediction of schizophrenia disorder; algorithm C4.5 achieved the highest predictive accuracy, with 93%. |
| Lanata et al. [67] | 2014 | They propose the application of a pattern recognition technique to classify the pathological mental states of bipolar disorders using the information collected from electrodermal EC response. | k-Nearest Neighbor | The results showed that, using a convolution-based approach to estimate sympathetic ANS markers and simple k-Nearest Neighbor algorithms, the proposed methodology is able to discern up to three mood states (depression, hypomania and euthymia) with an average intra-subject accuracy greater than 98% and inter-subject accuracy greater than 82%. |
| Thongkam & Sukmak [68] | 2014 | The objective is to develop and investigate prediction models of readmission of patients with schizophrenia using Data Mining techniques. | DT, Random Tree, Random Forests, AdaBoost, Bagging, AdaBoost with DT, AdaBoost with Random Tree, AdaBoost with Random Forests, Bagging with DT, Bagging with Random Tree, Bagging with Random Forest | The experimental results showed that AdaBoost with DT has the highest accuracy, recall and F-measure, with 98.11%, 98.79% and 98.41%, respectively. |
| Castaldo et al. [69] | 2016 | They propose to detect mental stress using linear and non-linear Heart Rate Variability (HRV) features extracted from 3 min ECG excerpts recorded from 42 university students during an oral examination (stress) and at rest after a vacation. | DT C4.5 | The best performing machine learning method was the DT C4.5 algorithm, which discriminated between stress and rest with a sensitivity, specificity and precision rate of 78%, 80% and 79%, respectively. |
Fig. 6 shows the percentages of the main Data Mining techniques
and algorithms applied to this disease, with DT being the most
used, followed by Random Forest and kNN.
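The studies in Table 5 report accuracy, recall (sensitivity), specificity, precision and F-measure. As a reminder of how these figures relate to one another, the following sketch derives them from the four cells of a binary confusion matrix. The counts are hypothetical and the function name `classification_metrics` is an assumption introduced for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common evaluation measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall: share of true cases found
    specificity = tn / (tn + fp)            # share of healthy subjects cleared
    precision = tp / (tp + fp)              # share of positive calls that are right
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical counts for 100 subjects (55 patients, 45 controls).
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy alone can hide poor detection of the minority class, comparing studies on sensitivity and specificity together is usually more informative when patient and control groups differ in size.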
In [70] a semiautomatic system is proposed that helps in the
preliminary diagnosis of patients with psychological disorders.
The goal is not to fully automate the classification of
mentally ill individuals, but to ensure that a classifier is aware of
all possible Mental Health illnesses that could match a patient's
symptoms. The results show that the system improves the organizational
capacity to collect information faster, at a lower cost, and to
make accurate decisions. The orchestration of genetic algorithms
through the implementation of Business Process Execution
Language allows flexible service workflows to be adjusted
immediately to modifications and makes systems smarter.
Discussion
Frequent pattern analysis has been a focal topic of study in
Data Mining, and many algorithms and methods have been
developed to mine frequent sequential and structural patterns.
Data Mining algorithms have great potential to expose
the patterns in data, facilitate the search for the combinations
of genetic and environmental factors involved and provide an
indication of their influence [71].
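To make the idea of frequent pattern mining concrete, here is a minimal level-wise (Apriori-style) sketch that finds sets of symptoms occurring together in at least a given fraction of patient records. The records, the symptom names and the support threshold are all hypothetical, and the function `frequent_itemsets` is an illustrative simplification rather than the algorithm of any reviewed study.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # size-1 candidates
    result = {}
    size = 1
    while current:
        # Count how many records contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(frequent)
        size += 1
        # Join frequent sets into larger candidates, then prune any candidate
        # that has an infrequent subset (the Apriori property).
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == size}
        current = [c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, size - 1))]
    return result


# Hypothetical symptom records, one set per patient.
records = [
    {"insomnia", "fatigue", "low_mood"},
    {"insomnia", "fatigue"},
    {"fatigue", "low_mood"},
    {"insomnia", "fatigue", "low_mood"},
    {"appetite_loss", "low_mood"},
]

patterns = frequent_itemsets(records, min_support=0.6)
for itemset, support in sorted(patterns.items(), key=lambda p: sorted(p[0])):
    print(sorted(itemset), support)
```

With a support threshold of 0.6, the pairs insomnia with fatigue and fatigue with low mood are retained, while insomnia with low mood (2 of 5 records) is discarded, so no three-symptom pattern is even counted.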
The approach of using Data Mining techniques in psychi-
atry has the potential to open a completely new area of re-
search in the detection, diagnosis and classification of psychi-
atric disorders such as schizophrenia, dementia, depression,
anxiety and alcohol abuse.
Today, depression occurs in adolescents, and the number of cases of
suicidal depression keeps increasing. Depression is therefore known
to be a determinant of morbidity, mortality and
economic loss. Although effective treatments for Mental
Health conditions such as depression and anxiety have been
available for some time, less than half of people with a mental
disorder seek care from a primary care physician or psychiatrist.
The results in [72] highlight the advantageous applicability
of machine learning for psychiatric research. By observing
and interpreting the use of online communities, researchers
become better placed to offer suggestions as to how commu-
nities can be cultivated for the maintenance and well-being of
people with a lived experience of depression.
Understanding the factors that predict mental
healthcare-seeking behaviors is crucial for the formulation of
health policies and the design of interventions to address
inequities in access to Mental Health services [73]. In addition,
depression is expected to be the second cause of death in 2020 due to
complications derived from stress and the cardiovascular system.
Suicide rates have increased significantly worldwide in re-
cent years. This phenomenon is very complex and includes
biological, psychological and social variables. A large propor-
tion of people who have attempted suicide present with psy-
chiatric conditions, such as mood disorders (specifically, de-
pressive disorders), psychosis and substance abuse. Among
the social factors known to date, unemployment and social
isolation have been associated with high suicide rates [74].
A challenge when developing predictive clinical tools is to
establish what information should be used. Genetic and brain
imaging measures are possible sources of information and have
generated interest. However, even if effective, the cost and time
of data collection and processing may not be practical. Previous
attempts to identify clinical predictors of treatment outcome
have usually selected some predictors based on clinical experience, and have
investigated their overall effect gradually [75].
It has been demonstrated [50] that Data Mining applied to
EEG signals can be a useful tool to discriminate between
depressed and healthy people. Given the questionable reliabil-
ity of diagnoses based on clinical symptoms, this quantitative
methodology may be a useful adjunctive clinical decision
support for identifying depression and supports independent
Fig. 6 Percentages of data mining
techniques and algorithms applied
to schizophrenia and bipolar
disorders
studies confirming the potential clinical utility of computer-
assisted diagnosis of depression using EEG signals.
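EEG-based studies such as [50] and [56] typically start by converting the raw signal into the power carried by each frequency band, which then serves as a feature for the classifier. The sketch below computes theta, alpha and beta band power with a naive discrete Fourier transform on a synthetic signal. The band limits (which vary between studies), the 128 Hz sampling rate and the test signal are assumptions made for illustration only.

```python
import cmath
import math

# Assumed band limits in Hz; different studies use slightly different cut-offs.
BANDS = {"theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_powers(signal, sampling_rate):
    """Sum the DFT power falling inside each band (naive O(n^2) transform)."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * sampling_rate / n
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        power = abs(coeff) ** 2 / n
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += power
    return powers


# Synthetic one-second signal: strong 20 Hz (beta) plus weak 10 Hz (alpha).
fs = 128
signal = [math.sin(2 * math.pi * 20 * t / fs)
          + 0.3 * math.sin(2 * math.pi * 10 * t / fs)
          for t in range(fs)]

powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
print(powers, dominant)
```

In practice a fast Fourier transform or Welch's method on windowed, artifact-cleaned recordings would replace the naive transform, but the resulting per-band powers play the same role as classifier inputs.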
Information technologies have the power to positively transform
the way patients are treated, and help us advance knowledge
more quickly [76]. Patients can receive highly personalized
treatments, therapists will receive help in making evidence-based
decisions, and scientists will be able to search for new
knowledge that reveals the true causes of Mental Health illnesses
while developing more effective treatment approaches.
Conclusion
The purpose of this review was to provide a state-of-the-art overview
of research on Data Mining techniques and algorithms
applied to Mental Health diseases. The selected studies
addressed the main Mental Health diseases using predictive
techniques applied to different study features, and were
able to detect the risk factors for most diseases. The predictive
models and the binary classifiers can be trained according to
the features obtained from all these techniques. Having reviewed
the literature published in the last 10 years on the use of
Data Mining techniques and algorithms applied to Mental
Health diseases, together with the reported results, we propose, as a future
line and continuation of this research, to apply the main techniques
found to a database of patients with schizophrenia, to make
a comparison between the techniques and evaluate the results
in terms of performance and accuracy, and to obtain
common patterns among patients with the disease. The authors
of this paper are already working on the aforementioned
database. Another future line is the development
of a prediction model of patient cognitive impairment
(dementia) based on the proposed algorithms, in order to discover the
significant predictors that lead to the disease.
Acknowledgements This research has been partially supported by the
European Commission and the Ministry of Industry, Energy and
Tourism under the project AAL-20125036 named "Wetake Care: ICT-based
Solution for (Self-) Management of Daily Living".
Compliance with Ethical Standards
Conflict of Interest The authors declare that they have no conflict of
interest.
Ethical Approval This article does not contain any studies with human
participants or animals performed by any of the authors.
References
1. Dhaka, P., and Johari, R., Big data application: Study and archival of mental health data, using MongoDB. Int. Conf. Electr. Electron. Optim. Tech.:3228-3232, 2016.
2. Dipnall, J. F., Pasco, J. A., Berk, M., Williams, L. J., Dodd, S., Jacka, F. N. et al., Fusing data mining, machine learning and traditional statistics to detect biomarkers associated with depression. PLoS One. 11(2):1-23, 2016.
3. Pirooznia, M., Seifuddin, F., Judy, J., Mahon, P., Potash, J., and Zandi, P., Data mining approaches for genome-wide association of mood disorders. Psychiatr. Genet. 22(2):55-61, 2012.
4. Ni, H., Yang, X., Fang, C., Guo, Y., Xu, M., and He, Y., Data mining-based study on sub-mentally healthy state among residents in eight provinces and cities in China. J. Tradit. Chinese Med. 34(4):511-517, 2014.
5. World Health Organization (WHO), Trastornos mentales. 2018. Available from: http://www.who.int/mediacentre/factsheets/fs396/es/ (last accessed April 2018).
6. Mathew, J., Mekkayil, L., Ramasangu, H., Karthikeyan, B. R., and Manjunath, A. G., Robust algorithm for early detection of Alzheimer's disease using multiple feature extractions. IEEE Annu. India Conf. 2016:1-6, 2016.
7. Bhagya Shree, S. R., and Sheshadri, H. S., An initial investigation in the diagnosis of Alzheimer's disease using various classification techniques. 2014 IEEE Int. Conf. Comput. Intell. Comput. Res.:1-5, 2014.
8. Qu, X., Yuan, B., and Liu, W., A predictive model for identifying possible MCI to AD Conversions in the ADNI database. 2009 2nd Int. Symp. Knowl. Acquis. Model KAM 3:102-105, 2009.
9. Gironi, M., Borgiani, B., Farina, E., Mariani, E., Cursano, C., Alberoni, M. et al., A global immune deficit in Alzheimer's disease and mild cognitive impairment disclosed by a novel data mining process. J. Alzheimers Dis. 43(4):1199-1213, 2015.
10. Yoon, S., Taha, B., and Bakken, S., Using a data mining approach to discover behavior correlates of chronic disease: A case study of depression. Stud. Health Technol. Inform. 201:71-78, 2014.
11. Lee, C., Lam, C. P., and Masek, M., Rough-fuzzy hybrid approach for identification of bio-markers and classification on Alzheimer's disease data. Proc. 2011 11th IEEE Int. Conf. Bioinforma. Bioeng. BIBE.:84-91, 2011.
12. Alonso, S. G., de la Torre Díez, I., Rodrigues, J. J. P. C., Hamrioui, S., and López-Coronado, M., A systematic review of techniques and sources of big data in the healthcare sector. J. Med. Syst. 41(11):183, 2017.
13. Khan, A., and Usman, M., Early diagnosis of Alzheimer's disease using machine learning techniques. 2015 7th Int. Jt. Conf. Knowl. Discov. Knowl. Eng. Knowl. Manag. (IC3K) 1:380-387, 2015.
14. Wongkoblap, A., Vadillo, M. A., and Curcin, V., Researching mental health disorders in the era of social media: Systematic review. J. Med. Internet Res. 19(6):e228, 2017.
15. Yuan, C., Data mining techniques with its application to the dataset of mental health of college students. IEEE Work Adv. Res. Technol. Ind. Appl. WARTIA. 2014:391-393, 2014.
16. Yoo, I., Alafaireet, P., Marinov, M., Pena-Hernandez, K., Gopidi, R., Chang, J. F. et al., Data mining in healthcare and biomedicine: A survey of the literature. J. Med. Syst. 36(4):2431-2448, 2012.
17. Sarraf, S., and Tofighi, G., Deep learning-based pipeline to recognize Alzheimer's disease using fMRI data. 2016 Future Technologies Conference (FTC), IEEE.:816-820, 2016.
18. Fiscon, G., Weitschek, E., Felici, G., Bertolazzi, P., De Salvo, S., Bramanti, P. et al., Alzheimer's disease patients classification through EEG signals processing. 2014 IEEE Symp. Comput. Intell. Data Min. (CIDM).:105-112, 2014.
19. Pachange, S., Joglekar, B., and Kulkarni, P., An ensemble classifier approach for disease diagnosis using random forest. 2015 Annu. IEEE Ind. Conf.:1-5, 2015.
20. Le, Q. B., Shafiq, O., and Alhajj, R., Analyzing Alzheimer's disease gene expression dataset using clustering and association rule mining. Proc. 2014 IEEE 15th Int. Conf. Inf. Reuse Integr. IEEE IRI.:283-290, 2014.
21. Moon, S., Choi, B., An, J., and Yoon, T., Constructing a sorting machine for degenerative cerebropathia. Int. Conf. Adv. Commun. Technol. ICACT.:800-804, 2017.
22. Hadzic, M., Hadzic, F., and Dillon, T., Tree mining in mental health domain. Proc. 41st Annu. Hawaii Int. Conf. Syst. Sci.:1-8, 2008.
23. Simon, G. J., Li, P. W., Jack, Jr. C. R., and Vemuri, P., Understanding atrophy trajectories in Alzheimer's disease using association rules on MRI images. Proc. 17th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min.:369-376, 2011.
24. Payandeh, S., Recursive Bayesian tracking for smart elderly living. 7th IEEE Annu. Inf. Technol. Electron. Mob. Commun. Conf. IEEE IEMCON.:1-7, 2016.
25. Chiang, H.-S., and Pao, S.-C., An EEG-based fuzzy probability model for early diagnosis of Alzheimer's disease. J. Med. Syst. 40(5):125, 2016.
26. Ertek, G., Tokdil, B., and Günaydın, İ., Risk factors and identifiers for Alzheimer's disease: A data mining analysis. Ind. Conf. Data Min.:1-11, 2014.
27. Joshi, S., Shenoy, D, G.G. VS, Rrashmi, P. L., Venugopal, K. R., and Patnaik, L. M., Classification of Alzheimer's disease and Parkinson's disease by using machine learning and neural network methods. 2010 Second Int. Conf. Mach. Learn. Comput.:218-222, 2010.
28. Plant, C., Teipel, S. J., Oswald, A., Böhm, C., Meindl, T., Mourao-Miranda, J. et al., Automated detection of brain atrophy patterns based on MRI for the prediction of Alzheimer's disease. Neuroimage. 50(1):162-174, 2010.
29. Chaves, R., Górriz, J. M., Ramírez, J., Illán, I. A., Salas-Gonzalez, D., and Gómez-Río, M., Efficient mining of association rules for the early diagnosis of Alzheimer's disease. Phys. Med. Biol. 56(18):6047-6063, 2011.
30. Plant, C., Sorg, C., Riedl, V., and Wohlschläger, A., Homogeneity-based feature extraction for classification of early-stage Alzheimer's disease from functional magnetic resonance images. Proc. 2011 Work Data Min. Med. Healthc - DMMH.:33-41, 2011.
31. Al-Dlaeen, D., and Alashqur, A., Using decision tree classification to assist in the prediction of Alzheimer's disease. 2014 6th Int. Conf. Comput. Sci. Inf. Technol. (CSIT).:122-126, 2014.
32. Zhang, Y., Wang, S., and Dong, Z., Classification of Alzheimer disease based on structural magnetic resonance imaging by kernel support vector machine decision tree. Prog. Electromagn. Res. 144:171-184, 2014.
33. Sheshadri, H. S., Shree, S. R. B., and Krishna, M., Diagnosis of Alzheimer's disease employing neuropsychological and classification techniques. Proc. 2015 5th Int. Conf. IT Converg. Sec. ICITCS.:1-6, 2015.
34. Martínez-Ballesteros, M., García-Heredia, J. M., Nepomuceno-Chamorro, I. A., and Riquelme-Santos, J. C., Machine learning techniques to discover genes with potential prognosis role in Alzheimer's disease using different biological sources. Inf. Fusion. 36:114-129, 2017.
35. Tejeswinee, K., Shomona, G. J., and Athilakshmi, R., Feature selection techniques for prediction of neuro-degenerative disorders: A case-study with Alzheimer's and Parkinson's disease. Proc. Comput. Sci. 115:188-194, 2017.
36. Jacob, S. G., and Athilakshmi, R., Extraction of protein sequence features for prediction of neuro-degenerative brain disorders: Pioneering the CGAP database. Proc. Int. Conf. Informatics Anal - ICIA-16.:30, 2016.
37. Aditya, C. R., and Pande, M. B. S., Devising an interpretable calibrated scale to quantitatively assess the dementia stage of subjects with Alzheimer's disease: A machine learning approach. Inform. Med. Unlocked. 6:28-35, 2017.
38. Byeon, H., A prediction model for mild cognitive impairment using random forests. Int. J. Adv. Comput. Sci. Appl. 6(12):8-12, 2015.
39. Fernández-Llatas, C., García-Gomez, J. M., Vicente, J., Naranjo, J. C., Robles, M., Benedí, J. M. et al., Behaviour patterns detection for persuasive design in nursing homes to help dementia patients. Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. EMBS.:6413-6417, 2011.
40. Wen, L., Bewley, M., Eberl, S., Fulham, M., and Feng, D. D., Classification of dementia from FDG-PET parametric images using data mining. 5th IEEE Int. Symp. Biomed. Imaging From Nano to Macro, ISBI.:412-415, 2008.
41. Zhang, S., Mcclean, S., Nugent, C., Neill, S. O., Donnelly, M., Galway, L. et al., Prediction of assistive technology adoption for people with dementia. Int. Conf. Heal. Inf. Sci.:160-171, 2013.
42. Zhang, S., McClean, S. I., Nugent, C. D., Donnelly, M. P., Galway, L., Scotney, B. W. et al., A predictive model for assistive technology adoption for people with dementia. IEEE J. Biomed. Heal. Inform. 18(1):375-383, 2014.
43. Bang, S., Son, S., Roh, H., Lee, J., Bae, S., Lee, K. et al., Quad-phased data mining modeling for dementia diagnosis. BMC Med. Inform. Decis. Mak. 17(1):60, 2017.
44. Joshi, S., Shenoy, P. D., Venugopal, K. R., and Patnaik, L. M., Evaluation of different stages of dementia employing neuropsychological and machine learning techniques. 2009 1st Int. Conf. Adv. Comput. ICAC.:154-160, 2009.
45. Zanin, M., Sousa, P., Papo, D., Bajo, R., García-Prieto, J., Del Pozo, F. et al., Optimizing functional network representation of multivariate time series. Sci. Rep. 2:630, 2012.
46. Yang, S., Zhou, P., Duan, K., Hossain, M. S., and Alhamid, M. F., emHealth: Towards emotion health through depression prediction and intelligent health recommender system. Mob. Netw. Appl.:1-11, 2017.
47. Jena, L., and Kamila, N. K., A model for prediction of human depression using Apriori algorithm. 2014 Int. Conf. Inf. Technol.:240-244, 2014.
48. Jung, Y., and Yoon, Y. I., Multi-level assessment model for wellness service based on human mental stress level. Multimed. Tools Appl. 76(9):11305-11317, 2017.
49. Morales, S., Barros, J., Echávarri, O., García, F., Osses, A., Moya, C. et al., Acute mental discomfort associated with suicide behavior in a clinical sample of patients with affective disorders: Ascertaining critical variables using artificial intelligence tools. Front. Psychiatr. 8:7, 2017.
50. Mohammadi, M., Al-Azab, F., Raahemi, B., Richards, G., Jaworska, N., Smith, D. et al., Data mining EEG signals in depression for their diagnostic value. BMC Med. Inform. Decis. Mak. 15(1):108, 2015.
51. Chang, Y. S., Hung, W. C., and Juang, T. Y., Depression diagnosis based on ontologies and bayesian networks. Proc - 2013 IEEE Int. Conf. Syst. Man., Cybern. SMC.:3452-3457, 2013.
52. Thanathamathee, P., Boosting with feature selection technique for screening and predicting adolescents depression. 2014 4th Int. Conf. Digit. Inf. Commun. Technol. Its Appl. DICTAP.:23-27, 2014.
53. Ghafoor, Y., Huang, Y. P., and Liu, S. I., An intelligent approach to discovering common symptoms among depressed patients. Soft. Comput. 19(4):819-827, 2015.
54. Hou, Y., Xu, J., Huang, Y., and Ma, X., A big data application to predict depression in the university based on the reading habits. 2016 3rd Int. Conf. Syst. Inform., ICSAI.:1085-1089, 2016.
55. Husain, W., Xin, L. K., Rashid, N. A., and Jothi, N., Predicting generalized anxiety disorder among women using random forest approach. 2016 3rd Int. Conf. Comput. Inf. Sci.:37-42, 2016.
56. Li, X., Hu, B., Sun, S., and Cai, H., EEG-based mild depressive detection using feature selection methods and classifiers. Comput. Methods Programs Biomed. 136:151-161, 2016.
57. Nie, Z., Gong, P., and Ye, J., Predict risk of relapse for patients with multiple stages of treatment of depression. Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discov. Data Min.:1795-1804, 2016.
58. Spyrou, I. M., Frantzidis, C., Bratsas, C., Antoniou, I., and Bamidis,
P. D., Geriatric depression symptoms coexisting with cognitive de-
cline: A comparison of classification methodologies. Biomed. Sign.
Process Contrl. 25:118129, 2016.
59. Kim, J. Y., Liu, N., Tan, H. X., and Chu, C. H., Unobtrusive mon-
itoring to detect depression for elderly with chronic illnesses. IEEE
Sens. J. 17(17):56945704, 2017.
60. Dmonte, S., and Panchal, D., Data mining approach for diagnose
of anxiety disorder. Int. Conf. Comput. Commun. Autom.
(ICCCA).;124127, 2015.
61. Tovar, D., Cornejo, E., Xanthopoulos, P., Guarracino, M. R., and
Pardalos, P. M., Data mining in psychiatric research. Psychiatr.
Disord. 829:593603, 2012.
62. Lyalina, S., Percha, B., Lependu, P., Iyer, S. V., Altman, R. B., and
Shah, N. H., Identifying phenotypic signatures of neuropsychiatric
disorders from electronic medical records. J. Am. Med. Inform.
Assoc. 20:297305, 2013.
63. Panagiotakopoulos, T. C., Lyras, D. P., Livaditis, M., Sgarbas, K.
N., Anastassopoulos, G. C., and Lymberopoulos, D. K., A contex-
tual data mining approach toward assisting the treatment of anxiety
disorders. IEEE Trans. Inf. Technol. Biomed. 14(3):567581, 2010.
64. Ince, N. F., Goksu, F., Pellizzer, G., Tewfik, A., and Stephane, M.,
Selection of spectro-temporal patterns in multichannel MEG with
support vector machines for schizophrenia classification. 2008 30th
Annu. Int. Conf. IEEE Eng. Med. Biol Soc. ;35543557, 2008.
65. Gangwar, M., Mishra, R. B., and Yadav, R. S., Application of de-
cision tree method in the diagnosis of neuropsychiatric diseases.
Asia-Pacific World Congr Comput Sci Eng. ;18, 2014.
66. GeethaRamani, R., and Sivaselvi, K., Data mining technique for
identification of diagnostic biomarker to predict schizophrenia dis-
order. 2014 IEEE Int. Conf. Comput. Intell. Comput. Res. ;18,
2014.
67. Lanata, A., Greco, A., Valenza, G., and Scilingo, E. P., A pattern
recognition approach based on electrodermal response for
pathological mood identification in bipolar disorders. ICASSP,
IEEE Int. Conf. Acoust. Speech Sign. Process ;36013605, 2014.
68. Thongkam, J., and Sukmak, V., Enhancing decision tree with
adaboost for predicting schizophrenia readmission. Adv. Mater.
Res. 931:14671471, 2014.
69. Castaldo, R., Xu, W., Melillo, P., Pecchia, L., Santamaria, L., and
James, C., Detection of mental stress due to oral academic exami-
nation via ultra-short-term HRV analysis. Proc. Annu. Int. Conf.
IEEE Eng. Med. Biol. Soc. EMBS. ;38053808, 2016.
70. Azar, G., Gloster, C., El-Bathy, N., Yu, S., Neela, R. H, and
Alothman, I., Intelligent data mining and machine learning for
mental health diagnosis using genetic algorithm. IEEE Int. Conf.
Electro. Inf. Technol.;201206, 2015.
71. Hadzic, M., Hadzic, F., and Dillon, T. S., Domain driven tree
mining of semi-structured mental health information. Data Min. Bus.
Appl.:127–141, 2009.
72. Nguyen, T., O'Dea, B., Larsen, M., Phung, D., Venkatesh, S., and
Christensen, H., Using linguistic and topic analysis to classify
subgroups of online depression communities. Multimed. Tools Appl.
76(8):10653–10676, 2017.
73. Cairney, J., Veldhuizen, S., Vigod, S., Streiner, D. L., Wade, T. J.,
and Kurdyak, P., Exploring the social determinants of mental health
service use using intersectionality theory and CART analysis. J.
Epidemiol. Commun. Health. 68(2):145–150, 2014.
74. Barros, J., Morales, S., Echávarri, O., García, A., Ortega, J., Asahi,
T. et al., Suicide detection in Chile: Proposing a predictive model
for suicide risk in a clinical sample of patients with mood disorders.
Rev. Bras. Psiquiatr. 39(1):1–11, 2017.
75. Chekroud, A. M., Zotti, R. J., Shehzad, Z., Gueorguieva, R., Johnson,
M. K., Trivedi, M. H. et al., Cross-trial prediction of treatment
outcome in depression: A machine learning approach. Lancet Psychiatr.
3(3):243–250, 2016.
76. Hadzic, M., Hadzic, F., and Dillon, T. S., Mining of patient data:
Towards better treatment strategies for depression. Int. J. Funct.
Inform. Personal. Med. 3(2):122–143, 2010.
... Data mining can be useful for extracting a small subset of information that represents large volumes of data. Techniques such as clustering, classification, and association are considered data mining techniques [32][33][34]. Clustering is a technique for detecting patterns by segmenting the data into clusters. The data within a cluster is considered indistinguishable from each other and distinguishable from other clusters [33]. ...
The Munsell and Natural Color Systems, as well as the World Color Survey, are standard sets of colors used in many practical and scientific applications. However, the colors of natural scenes exhibit a bias in color and do not have a uniform distribution, making it difficult for these sets to represent natural colors accurately. We derived sets of colors with a small number of samples that are better at representing natural colors than any of these standard sets. Hyperspectral images of natural scenes and a k-medoids clustering algorithm were used to derive representative colors. For the same number of samples, the set of colors obtained by k-medoids is better at representing natural colors than the standard sets. These optimized sets are important for applications that require precise representation of natural colors.
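The k-medoids clustering mentioned in this abstract can be illustrated with a minimal sketch. It assumes small 2-D points, Euclidean distance, and a naive initialization that takes the first `k` points as medoids; the function name `k_medoids` and the toy data are hypothetical and are not taken from the cited study, which worked with hyperspectral images.

```python
import math

def k_medoids(points, k, max_iter=100):
    """Alternating k-medoids: assign each point to its nearest medoid, then
    re-pick each cluster's medoid as the member with the smallest total
    distance to the other members. Stops when the medoids no longer change."""
    medoids = list(points[:k])  # naive initialization (an assumption of this sketch)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: group points by nearest medoid.
        clusters = [[] for _ in medoids]
        for p in points:
            nearest = min(range(len(medoids)),
                          key=lambda i: math.dist(p, medoids[i]))
            clusters[nearest].append(p)
        # Update step: the new medoid is an actual data point of the cluster.
        new_medoids = [
            min(c, key=lambda cand: sum(math.dist(cand, q) for q in c)) if c else m
            for c, m in zip(clusters, medoids)
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, clusters

# Hypothetical toy data: two well-separated groups of points.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, clusters = k_medoids(toy, k=2)
```

Unlike k-means, each cluster representative here is always one of the observed samples, which is why the method suits tasks that need real, representative items rather than averaged ones.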
... S. Alonso et al., 2018 [10] Mental Disorders Data Mining techniques in the diagnosis of Dementia, Depression, etc. can perform better in clinical diagnosis for improving the patient's life. E.G. Pintelas et al., 2018 [11] Anxiety Disorders A comparative analysis was performed for the diagnosis of different types of anxiety disorders using machine learning techniques. ...
INTRODUCTION: Psychological disorders are a critical issue in today’s modern society, yet they remain continuously neglected. Anxiety and depression are prevalent psychological disorders that affect a large number of people across the world and are regarded as global problems. METHODS: A three-step methodology is employed in this study to determine the diagnosis of anxiety and depressive disorders. In this survey, a methodical review of ninety-nine articles related to depression and anxiety disorders using different traditional classifiers, metaheuristics and deep learning techniques was done. RESULTS: The best performance and publication trend of traditional classifiers, metaheuristic and deep learning techniques have also been presented. Eventually, a comparison of these three techniques in the diagnosis of anxiety and depression disorders has been appraised. CONCLUSION: There is further scope in the diagnosis of anxiety disorders such as social anxiety disorder, phobia disorder, panic disorder, generalized anxiety, and obsessive-compulsive disorders. A lot of work has already been done on conventional approaches to the prognosis of these disorders. So, there is a need to scrutinize the prognosis of depression and anxiety disorders using the hybridization of metaheuristic and deep learning techniques. Also, the diagnosis of these two disorders among the academic fraternity using metaheuristic and deep learning techniques needs to be explored.
... Like data mining, machine learning extracts crucial clinical information from vast datasets by classifying individuals into subgroups. This approach has become a valuable tool in understanding various psychiatric disorders, including Alzheimer's disease, dementia, depression, and schizophrenia (Alonso et al., 2018). Its applications extend beyond basic research. ...
Individuals with co-occurring psychiatric and substance use disorders (COD) face challenges, including accessing treatment, accurate diagnoses, and effective treatment for both disorders. This study aimed to develop a COD prediction model by examining the intersectionality of COD with race/ethnicity, age, gender identity, pandemic year, and behavioral health needs and strengths. Individuals aged 18 or older who participated in publicly funded behavioral health services (N = 22,629) were selected. Participants completed at least two Adult Needs and Strengths Assessments during 2019 and 2020, respectively. A chi-squared automatic interaction detection (CHAID) decision tree analysis was conducted to identify patterns that increased the likelihood of having COD. Among the decision tree analysis predictors, Involvement in Recovery emerged as the most critical factor influencing COD, with a predictor importance value (PIV) of 0.46. Other factors like Legal Involvement (PIV = 0.12), Decision-Making (PIV = 0.12), Parental/Caregiver Role (PIV = 0.11), Other Self-Harm (PIV = 0.10), and Criminal Behavior (PIV = 0.09) had progressively lower PIVs. Age, gender, race/ethnicity, and pandemic year did not show statistically significant associations with COD. The CHAID decision tree analysis provided insights into the dynamics of COD. It revealed that legal involvement played a crucial role in treatment engagement. Individuals with legal challenges were less likely to be involved in treatment. Individuals with COD displayed more complex behavioral health needs that significantly impaired their functioning compared to individuals with psychiatric disorders; these findings can inform the development of targeted interventions.
This research investigates machine learning models for predicting mental health consequences using survey data. The study employs a two-phase approach: first, it utilizes TensorFlow for initial Deep Neural Network (DNN) model building, and then it applies Random Forest (RF), Naive Bayes classifier, and decision tree methods for comparative analysis. The DNN model demonstrates strong performance, achieving high accuracy in mental health prediction. Metrics such as testing time, precision, mean absolute error, and accuracy are compared to provide insight into the advantages and disadvantages of each model. While the DNN model excels in accuracy and precision, other models offer trade-offs in computational efficiency. The results clarify the role of machine learning in mental wellness evaluation and intervention, providing guidance for further research and real-world applications. This research enhances the discourse on predictive modeling for mental health outcomes, facilitating advancements in leveraging machine learning to improve mental health assessment and intervention strategies.
Atherosclerosis (AS) causes thickening and hardening of the arterial wall due to accumulation of extracellular matrix, cholesterol, and cells. In this study, we used comprehensive bioinformatics tools and machine learning approaches to explore key genes and molecular network mechanisms underlying AS in multiple data sets. Next, we analyzed the correlation between AS and immune cell infiltration, and finally performed drug prediction for the disease. We downloaded GSE20129 and GSE90074 datasets from the Gene Expression Omnibus database, then employed the Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts algorithm to analyze 22 immune cells. To enrich for functional characteristics, the black module correlated most strongly with T cells was screened with weighted gene co-expression network analysis. Functional enrichment analysis revealed that the genes were mainly enriched in cell adhesion and T-cell-related pathways, as well as NF-κB signaling. We employed the Lasso regression and random forest algorithms to screen out 5 intersection genes (CCDC106, RASL11A, RIC3, SPON1, and TMEM144). Pathway analysis in gene set variation analysis and gene set enrichment analysis revealed that the key genes were mainly enriched in inflammation and immunity, among others. The selected key genes were analyzed by single-cell RNA sequencing technology. We also analyzed differential expression between these 5 key genes and those involved in ferroptosis. We found that ferroptosis genes ACSL4, CBS, FTH1 and TFRC were differentially expressed between AS and the control groups, RIC3 and FTH1 were significantly negatively correlated, whereas SPON1 and VDAC3 were significantly positively correlated. Finally, we used the Connectivity Map database for drug prediction. These results provide new insights into AS genetic regulation.
The mental well-being of a person is their mental state. Chemical abnormalities in the brain cause mental health problems. It is important to monitor the mental health of different groups in order to predict health-related disorders. The community consists of working professionals and college students. It is widely believed that stress and grief affect people of all ages and backgrounds. Some serious mental health disorders, such as anxiety, bipolar disorder, and schizophrenia, often evolve and produce symptoms that can be recognized early. Such mental disorders could be avoided more successfully if abnormal mental states are detected in the early stages of the disease, allowing for additional care and treatment. This study analyzed the accuracy of four data mining techniques and introduced a new ensemble technique to improve their accuracy in identifying mental health issues. The data mining techniques are Logistic Regression, KNN Classifier, Decision Tree Classifier, and Random Forest. This paper provides scope for other researchers and practitioners seeking to achieve higher accuracy in identifying mental health issues using enhanced data mining algorithms to meet several accuracy criteria.
Alzheimer’s and Parkinson’s disease are the most common forms of dementia that degenerate neurons in the brain cells. This paper targets a comparative study on the performance of data mining techniques in neuro-degenerative data. The existing data mining algorithms give classification accuracy ~93% with Correlation-based feature subset selection method. The proposed Decremental Feature Selection Method has yielded a more optimal feature subset that gives higher accuracy in prediction. Further exploration of computational methods to investigate the role of such genetic variants will aid in identifying the genetic cause of these diseases and design suitable drugs to target the gene property.
The main objective of this paper is to present a review of existing research in the literature, referring to Big Data sources and techniques in the health sector, and to identify which of these techniques are the most used in the prediction of chronic diseases. Academic databases and systems such as IEEE Xplore, Scopus, PubMed and Science Direct were searched, considering the date of publication from 2006 until the present time. Several search criteria were established, such as ‘techniques’ OR ‘sources’ AND ‘Big Data’ AND ‘medicine’ OR ‘health’, ‘techniques’ AND ‘Big Data’ AND ‘chronic diseases’, etc., selecting the papers considered of interest regarding the description of the techniques and sources of Big Data in healthcare. A total of 110 articles on techniques and sources of Big Data in health were found, of which only 32 have been identified as relevant work. Many of the articles show the platforms of Big Data, sources, databases used and identify the techniques most used in the prediction of chronic diseases. From the review of the analyzed research articles, it can be noticed that the sources and techniques of Big Data used in the health sector represent a relevant factor in terms of effectiveness, since they allow the application of predictive analysis techniques in tasks such as: identification of patients at risk of reentry or prevention of hospital or chronic diseases infections, obtaining predictive models of quality.
Depression is an important mental disease of global concern. Its complicated etiology and chronic clinical features make it difficult for users to be conscious of their own depressive emotion and seriously threaten the patient’s life safety. With the development of e-commerce, intelligent recommender systems have brought new opportunities to personalized health monitoring for users with emotional distress. Therefore, this paper puts forward the emHealth system, which is an intelligent health recommendation system with depression prediction for emotion health. This paper explores the monitoring and improvement of users’ psychological and physiological conditions by pushing personalized therapy solutions to patients with emotional distress. Specifically, this paper first proposes the system architecture of emHealth. Then, we design personalized mobile phone Apps to collect emotional data of users with tendentious depressive mood, and find the five main external characteristics of depression by Pearson correlation analysis. We divide the data of 1047 volunteers into a training set and a test set, and construct a prediction model of depression using decision tree and support vector machine algorithms. For the different external factors that lead to depression, we give personalized recommendation and intelligent decision-making solutions, and push related emotional improvement suggestions to guide users’ behavior. Finally, a specific application scene is demonstrated where a patient’s family member carries out psychological counseling for the patient, to verify the practicability and validity of the system. The beneficial effects of this system can meet the needs of the electronic market and can be promoted and popularized.
Background: The number of people with dementia is increasing along with the worldwide trend of population ageing. Therefore, there is a range of research aiming to improve the dementia diagnosis process in the field of computer-aided diagnosis (CAD) technology. The most significant issue is that the evaluation processes performed by physicians, which are based on medical information for patients and questionnaires from their guardians, are time consuming, subjective and prone to error. This problem can be solved by an overall data mining modeling, which supports an intuitive decision of clinicians. Methods: In this paper we propose a quad-phased data mining modeling consisting of 4 modules. In the Proposer Module, significant diagnostic criteria are selected that are effective for diagnostics. Then, in the Predictor Module, a model is constructed to predict and diagnose dementia based on a machine learning algorithm. To help clinical physicians understand results of the predictive model better, in the Descriptor Module, we interpret causes of diagnostics by profiling patient groups. Lastly, in the Visualization Module, we provide visualization to effectively explore characteristics of patient groups. Results: The proposed model is applied to the CREDOS study, which contains clinical data collected from 37 university-affiliated hospitals in the Republic of Korea from 2005 to 2013. Conclusions: This research is an intelligent system enabling intuitive collaboration between the CAD system and physicians. In addition, the improved evaluation process is able to effectively reduce time and cost for clinicians and patients. Electronic supplementary material: The online version of this article (doi:10.1186/s12911-017-0451-3) contains supplementary material, which is available to authorized users.
Background: Mental illness is quickly becoming one of the most prevalent public health problems worldwide. Social network platforms, where users can express their emotions, feelings, and thoughts, are a valuable source of data for researching mental health, and techniques based on machine learning are increasingly used for this purpose. Objective: The objective of this review was to explore the scope and limits of cutting-edge techniques that researchers are using for predictive analytics in mental health and to review associated issues, such as ethical concerns, in this area of research. Methods: We performed a systematic literature review in March 2017, using keywords to search articles on data mining of social network data in the context of common mental health disorders, published between 2010 and March 8, 2017 in medical and computer science journals. Results: The initial search returned a total of 5386 articles. Following a careful analysis of the titles, abstracts, and main texts, we selected 48 articles for review. We coded the articles according to key characteristics, techniques used for data collection, data preprocessing, feature extraction, feature selection, model construction, and model verification. The most common analytical method was text analysis, with several studies using different flavors of image analysis and social interaction graph analysis. Conclusions: Despite an increasing number of studies investigating mental health issues using social network data, some common problems persist. Assembling large, high-quality datasets of social media users with mental disorder is problematic, not only due to biases associated with the collection methods, but also with regard to managing consent and selecting appropriate analytics techniques.
Aim: In efforts to develop reliable methods to detect the likelihood of impending suicidal behaviors, we have proposed the following. Objective: To gain a deeper understanding of the state of suicide risk by determining the combination of variables that distinguishes between groups with and without suicide risk. Method: A study involving 707 patients consulting for mental health issues in three health centers in Greater Santiago, Chile. Using 345 variables, an analysis was carried out with artificial intelligence tools, Cross Industry Standard Process for Data Mining processes, and decision tree techniques. The basic algorithm was top-down, and the most suitable division produced by the tree was selected by using the lowest Gini index as a criterion and by looping it until the condition of belonging to the group with suicidal behavior was fulfilled. Results: Four trees distinguishing the groups were obtained, of which the elements of one were analyzed in greater detail, since this tree included both clinical and personality variables. This specific tree consists of six nodes without suicide risk and eight nodes with suicide risk (tree decision 01, accuracy 0.674, precision 0.652, recall 0.678, specificity 0.670, F measure 0.665, receiver operating characteristic (ROC) area under the curve (AUC) 73.35%; tree decision 02, accuracy 0.669, precision 0.642, recall 0.694, specificity 0.647, F measure 0.667, ROC AUC 68.91%; tree decision 03, accuracy 0.681, precision 0.675, recall 0.638, specificity 0.721, F measure 0.656, ROC AUC 65.86%; tree decision 04, accuracy 0.714, precision 0.734, recall 0.628, specificity 0.792, F measure 0.677, ROC AUC 58.85%).
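The split criterion described in this abstract, choosing the division with the lowest Gini index, can be sketched as follows. This is a minimal illustration under stated assumptions: a single numeric feature, binary splits of the form `value <= threshold`, and hypothetical helper names `gini` and `best_split`. It is not the implementation used in the cited study, and the toy data are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the binary split
    `value <= threshold` with the lowest size-weighted Gini impurity."""
    n = len(values)
    best_threshold, best_score = None, float("inf")
    # Every distinct value except the largest is a candidate threshold.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score

# Hypothetical toy feature and binary outcome (1 = at risk, 0 = not at risk).
scores = [1, 2, 3, 4, 5, 6]
at_risk = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(scores, at_risk)
```

A top-down tree builder would apply `best_split` to each feature at a node, keep the feature and threshold with the lowest score, and recurse on the two resulting subsets until a stopping condition is met.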
Mental health related disorders are common diseases, especially among the elderly. Among the various mental health diseases, one potential threat to ageing-in-place is the risk of depression. In this paper, we propose a simple unobtrusive sensing system using Passive Infra-Red (PIR) motion sensors to monitor the Activities of Daily Living (ADLs) of elderly people who are living alone. A feature extraction module comprising three layers (states, events and activities) and the corresponding algorithms are proposed to extract features. Four popular classification models (neural network, C4.5 decision tree, Bayesian network, and Support Vector Machine (SVM)) are then applied to detect the severity of depression. We implement and test the algorithms on sensor data collected over three months from 20 elderly people, each in different daily living conditions. Our evaluation shows that the proposed algorithms are effective in detecting both normal condition and mild depression with up to 96% accuracy, using a neural network as the classification algorithm. The sensing system is non-intrusive and cost-effective, with the potential of use for long-term depression monitoring and detection of early symptoms of mental health related disorders. This enables caregivers to provide timely interventions to elderly people who are at risk of depression.
Dementia with Lewy bodies and Parkinson's Disease have a common cause: mutation of the gene CYP2D6. Thus, it is hard to distinguish the two diseases. In order to classify them clearly, construction of a sorting machine is required. Various algorithms including Apriori, Decision tree, Support Vector Machine, K-means and Neural Network were used in the construction of the sorting machine. Two kinds of sorting machines were constructed: one classifying whether the patient has Dementia with Lewy bodies or Parkinson's Disease, the other classifying whether the gene is normal or has undergone mutation.