Electronics 2022, 11, 3782. https://doi.org/10.3390/electronics11223782 www.mdpi.com/journal/electronics
Article
An Efficient Machine Learning Approach for Diagnosing
Parkinson’s Disease by Utilizing Voice Features
Arti Rana 1, Ankur Dumka 2,3, Rajesh Singh 4,5, Mamoon Rashid 6,7,*, Nazir Ahmad 8 and Manoj Kumar Panda 9
1 Computer Science & Engineering, Veer Madho Singh Bhandari Uttarakhand Technical University,
Dehradun 248007, India
2 Department of Computer Science and Engineering, Women Institute of Technology,
Dehradun 248007, India
3 Department of Computer Science & Engineering, Graphic Era Deemed to be University,
Dehradun 248001, India
4 Division of Research and Innovation, Uttaranchal Institute of Technology, Uttaranchal University,
Dehradun 248007, India
5 Department of Project Management, Universidad Internacional Iberoamericana, Campeche 24560, Mexico
6 Department of Computer Engineering, Faculty of Science and Technology, Vishwakarma University,
Pune 411048, India
7 Research Center of Excellence for Health Informatics, Vishwakarma University, Pune 411048, India
8 Department of Information System, College of Applied Sciences, King Khalid University,
Muhayel 61913, Saudi Arabia
9 Department of Electrical Engineering, G.B. Pant Institute of Engineering and Technology,
Pauri 246194, India
* Correspondence: mamoon.rashid@vupune.ac.in
Abstract: Parkinson’s disease (PD) is a neurodegenerative disease that impacts the neural,
physiological, and behavioral systems of the brain, in which mild variations in the initial phases of
the disease make precise diagnosis difficult. A general symptom of this disease is slowness of
movement, known as ‘bradykinesia’. The symptoms of this disease appear in middle age, and their
severity increases as one gets older. One of the earliest signs of PD is a speech disorder. This research
evaluates the effectiveness of supervised classification algorithms, namely support vector
machine (SVM), naïve Bayes, k-nearest neighbor (KNN), and artificial neural network (ANN), for
diagnosing the disease. The proposed diagnosis method consists of feature selection based on
the filter method and the wrapper method, followed by classification. Since just a few clinical test
features would be required for the diagnosis, a method such as this might reduce the time and
expense associated with PD screening. The suggested strategy was compared to previously proposed
PD diagnostic techniques and to well-known classifiers. The experimental outcomes show
that the accuracy of SVM is 87.17%, naïve Bayes is 74.11%, ANN is 96.7%, and KNN is 87.17%, so
the ANN is the most accurate of the four classifiers. The obtained results
were compared with those of previous studies, and the proposed work was observed to
offer comparable or better results.
Keywords: ANN; KNN; machine learning (ML); naïve Bayes classification; Parkinson’s disease;
SVM
Citation: Rana, A.; Dumka, A.; Singh, R.; Rashid, M.; Ahmad, N.; Panda, M.K. An Efficient Machine
Learning Approach for Diagnosing Parkinson’s Disease by Utilizing Voice Features. Electronics 2022, 11,
3782. https://doi.org/10.3390/electronics11223782
Academic Editors: Gabriella Olmo, Florenc Demrozi and Yu Zhang
Received: 13 October 2022
Accepted: 16 November 2022
Published: 17 November 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license
(https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Parkinson’s disease, commonly associated with tremor, is caused by a reduction in
dopamine levels in the brain, which damages a person’s motor functions, or physical
functioning. It is one of the world’s most common diseases. Intermittent neurological
signs and symptoms result from these lesions, which get worse as the disease progresses
[1]. Because aging causes changes in our brains, such as loss of synaptic connections and
changes in neurotransmitters and neurohormones, this condition is more frequent among
the elderly. With the passage of time, the neurons in a person’s body begin to die and
cannot be replaced. The consequences of neurological problems and the falling dopamine
levels in the patient’s body appear gradually, making them difficult to detect until the
patient’s condition requires medical treatment [2]. However, the symptoms and severity
levels differ between individuals. Major symptoms of this disease are speech
impairment, short-term memory loss, loss of balance, and unbalanced posture [1].
Every year, 8.5 million individual cases of this disease are registered worldwide, as
per the World Health Organization (WHO) report in 2019 [3]. The chance of developing
this disease rises with age; currently, about 4% of sufferers worldwide are under 50 years
of age. This disease is the most widespread neurodegenerative disease in the world after
Alzheimer’s disease, impacting millions of people [4,5]. Therapy for this disease is still in
its initial stages, and doctors can only assist patients in alleviating the symptoms of the
disease [6]. Moreover, there is no definitive diagnostic test for this disease, and the diagnosis
is largely dependent on the medical history of the patient [1]. As the procedures
typically used for diagnosis and therapy are invasive, expensive, and demanding [7], a
reasonably straightforward and accurate way to diagnose this disease is highly desirable.
1.1. Machine Learning-Based Detection of Parkinson’s Disease
Over the past few decades, researchers have looked at a new way of detecting this
disease through ML techniques, a subset of artificial intelligence (AI). Clinical personnel
might better recognize patients with this disease by combining traditional diagnostic
indications with ML.
As walking is the most common activity in every person’s day-to-day life, it has been
linked to physical as well as neurological disorders. This disease, for example, has been
identified using gait (mobility) data. Gait analysis approaches offer advantages such as
being non-intrusive and having the potential to be extensively used in residential settings
[8]. A few researchers have attempted to combine ML methods to make the procedure
autonomous and possible to perform offline [9].
Furthermore, persons with the disease in its early stages might experience
speech problems [10]. These include dysphonia (impaired voice production), monotonous
speech (a narrow range of pitch variation), and hypophonia (abnormally reduced vocal volume)
[7,11]. Information in human speech signals can be captured and evaluated using a
computing unit [12,13].
1.2. Research Problem and Motivation
Early detection of PD is a crucial challenge. Even if their health
deteriorates, people can enhance their quality of life if they receive an early diagnosis.
Another issue is that the diagnosis of PD requires a number of steps, including gathering
a thorough neurological history from the patient and examining their motor abilities in
various environments.
The majority of recent studies deal with a homogeneous dataset of a single type (text, speech,
video, or image). Problems with dataset modification and multi-data handling procedures have
been highlighted in the suggested study. The effectiveness of disease prediction is
constrained as a result of examining only one particular dataset. More real-time solutions
are made possible by the use of machine learning-based techniques for multivariate data
processing. The multi-variate vocal data analysis (MVDA) is designed to provide
Parkinson’s disease identification based on multiple dataset attributes, utilizing machine learning
approaches. This study examines the potential for improving multivariate and
multimodal data processing, which aids in raising the disease detection rate. The existing
research concentrates on various ML-based techniques, such as support
vector machines, naïve Bayes, KNN, and artificial neural networks, for evaluating
Parkinson’s data based on voice features. Building on these works, the MVDA employs
extensive datasets and machine learning approaches to improve disease identification. The
proposed MVDA incorporates the multivariate acoustic characteristics of numerous patients.
The disease is diagnosed with the help of the
proposed machine learning techniques under the MVDA system.
1.3. Contribution
This research article covers the machine learning techniques that are
applied to the acoustic analysis of speech to diagnose this disease. The benefits and
shortcomings of these algorithms in detecting the disease are thoroughly contrasted, and
the potential drawbacks of existing comparative studies are explored. The accuracy of ANN in
speech analysis for diagnosis is the highest among the classifiers considered; however, the
model may still need to be enhanced and adapted to difficulties that arise from the data. Using
the naïve Bayes classifier with suitable pre-processing might result in greater average
accuracy. The main contributions of this paper are as follows:
a. To identify which machine learning algorithms, among SVM, KNN, naïve Bayes, and
ANN, offer the most accurate classification and diagnosis of Parkinson’s disease.
b. To develop statistical evaluations for the diagnosis of Parkinson’s disease in order to
identify the frequency at which the best training and test results are obtained, and
consequently to assist upcoming literature-based research.
c. The proposed system uses an ANN classifier to attain the maximum
classification accuracy when compared to the approaches used in earlier research.
d. In order to improve the prediction of PD, a comprehensive methodology was
employed to explore the effectiveness and efficiency of various feature selection
approaches.
e. The proposed model is examined with four machine learning methods, namely
SVM, naïve Bayes, KNN, and ANN, and is compared with earlier and more recent studies
on PD detection.
1.4. Structure of Proposed Work
The structure of the study is as follows: Section 2 describes the related research
survey. Section 3 discusses the methodology used to achieve the proposed objective.
Section 4 defines the materials and methods. Section 5 examines the experiment and
results. Section 6 discusses the comparative study and discussion. Finally, Section 7
concludes the proposed work.
2. Related Works
In order to distinguish PD cases from healthy controls, a variety of modern machine
learning algorithms, including support vector machines, artificial neural networks,
logistic regression, naïve Bayes, etc., have been successfully used. In this study, numerous
databases, including Web of Science, Elsevier, MDPI, Scopus, Science Direct, IEEE Xplore,
Springer, and Google Scholar, were utilized to survey relevant papers on Parkinson’s
disease.
In a survey by [14], the authors used KNN, SVM, and discrimination-function-based
(DBF) classifiers for the diagnosis of PD. In their study, they used several parameters such
as jitter, fundamental frequency, pitch, shimmer, and other statistical measures. The best
accuracy among these classifiers was obtained from KNN with a 93.83% accuracy rate and
it also provided good performance in other parameters, such as sensitivity, specificity,
and error rate.
The authors in [15] applied a convolutional neural network classifier to speech
classification datasets. The accuracy reached during the training phase, which was
over 77%, makes the results promising. In line with the works mentioned above,
[16] examined a variety of classifiers to identify individuals who were likely to have
Parkinson’s disease. They used 40 participants for their investigation, including 20 PD
patients and 20 healthy controls. According to the experimental findings, the naïve Bayes
classifier has a detection accuracy of 65%, with a sensitivity rate of 63.6% and a specificity
rate of 66.6%. In [17], the authors used three types of classifiers based on
KNN, SVM, and multilayer perceptron (MLP) to diagnose Parkinson’s disease. Among
these ML classifiers, SVM with an RBF kernel performed best, with an overall classification
accuracy rate of 85.294%.
A summary of the most recent deep learning methods for audio signal processing is
given in another work by [18]. The works that have been examined include convolutional
neural networks as well as long short-term memory architectures and audio-
specific neural network models. Similar to the previous studies, [19] detected PD using
naïve Bayes and other machine learning approaches. In their method, relevant features
were extracted from the voice signals of PD patients and healthy control subjects using
signal processing techniques. The naïve Bayes algorithm showed a 69.24% detection
accuracy and a 96.02% precision rate for the 22 voice characteristics. In [20], the authors
suggested a technique for detecting Parkinson’s disease using SVM on shifted delta
cepstral (SDC) and single frequency filtering cepstral coefficients (SFFCC) features
extracted from speech signals of PD patients and healthy controls. Comparing the
standard MFCC + SDC features to the SDC + SFFCC features, performance increases of
9% were observed. The conventional SVM on SDC + SFFCC features achieved a 73.33%
detection accuracy with a 73.32% F1-score. In addition to the naïve Bayes classifier,
several additional supervised methods, including but not restricted to well-known deep
learning methods, have been suggested to identify PD patients among healthy controls.
In a survey conducted by [21], the authors examined two decision forest algorithms,
i.e., SysFor and ForestPA, along with the most widely used random forest classifier, as
Parkinson’s detectors. In their study, as compared to SysFor and
ForestPA, random forest showed an average detection accuracy of
93.58% on incremental trees. For the purpose of classifying Parkinson’s disease through sets
of acoustic vocal (voice) characteristics, the authors of [22] suggested two frameworks based
on CNN. Both frameworks are used for the mixing of different feature sets, although they combine
feature sets in different ways. The first framework
first combines several feature sets before passing them as inputs to the nine-layered CNN,
while the second framework provides feature sets to
parallel input layers that are directly connected to convolution layers.
AI is assisting physicians in better diagnosing and treating conditions such as
postoperative hypotension, and more advanced future models may have even more
widespread medical uses. Machine learning is an evolutionary step in the creation of
therapeutic pathways and adherence. The real benefit of machine learning, however, is that
it enables provider organizations to use information about the patient population from
their own systems of record to create therapeutic pathways that are unique to their
procedures, clientele, and physicians [23].
The vocal biomarkers and the description of the Aachen aphasia database, which
contains recordings and transcriptions of therapy sessions, were covered in [24]. The
authors also discussed how the biomarkers and the database could be used to build a
recognition system that automatically maps pathological speech to aphasia type and
severity.
In [25], the authors examined the suggested technique using a dataset of 288 audio
files from 96 patients, including 48 healthy controls and 48 participants with cognitive
impairment. The suggested method outperformed techniques based on manual
transcription and speech annotation, with classification results that were comparable to
those of the most advanced neuropsychological screening tests and an accuracy rate of
90.57%.
In [26], the authors aimed to shed light on the early indicators of major depressive
relapse, which were discreetly measured using remote measurement technologies (RMT).
RMT has the potential to alter how depression and other long-term disorders are
evaluated and handled if it is found to be acceptable to patients and other important
stakeholders and capable of providing clinically meaningful information predicting
future deterioration.
It can be seen from the reviews above that the research carried out so far
is restricted to a small number of datasets. These previous works inspired us to
try a new methodology. In this study, we experimented with several feature selection
methods before comparing the results across various machine learning classifiers. Table 1
reviews the ML techniques used to diagnose PD from its major symptoms, i.e.,
speech recordings, handwriting patterns, and gait features, where data were collected from
the UCI machine learning repository, the University of Oxford (UO), and other resources
for 20 studies.
Table 1. Comparative Studies of Machine Learning Approaches to diagnose Parkinson’s Disease.
| Reference | Feature | Machine Learning Algorithms Used | Objective | Source of Data | No. of Subjects | Outcomes |
|---|---|---|---|---|---|---|
| Sakar et al., 2019 [27] | Speech | Naïve Bayes, Logistic Regression, SVM (RBF and Linear), KNN, Random Forest, MLP | Classification of PD from HC | Collected from participants | 252 (188 PD + 64 HC) | Highest accuracy obtained from SVM (RBF): 86% |
| Yasar A. et al., 2019 [28] | Speech | Artificial Neural Network | Classification of PD from HC | Collected from participants | 80 (40 PD + 40 HC) | Accuracy of ANN: 94.93% |
| Avuçlu, E., Elen, A, 2020 [29] | Speech | KNN, Random Forest, Naïve Bayes, SVM | Classification of PD from HC | UCI machine learning repository | 31 (23 PD + 8 HC) | Accuracy from Naïve Bayes: 70.26% |
| Marar et al., 2018 [30] | Speech | Naïve Bayes, ANN, KNN, Random Forest, SVM, Logistic Regression, Decision Tree (DT) | Classification of PD from HC | Collected from participants | 31 (23 PD + 8 HC) | Highest accuracy obtained from ANN: 94.87% |
| Sheibani R et al., 2019 [31] | Speech | Ensemble-Based Method | Classification of PD from HC | UCI machine learning repository | 31 (23 PD + 8 HC) | Accuracy obtained from ensemble learning: 90.6% |
| John M. Tracy et al., 2020 [32] | Speech | Logistic Regression (L2-Regularized), Random Forest, Gradient Boosted Trees | Classification of PD from HC | mPower database | 2289 (246 PD + 2023 HC) | Highest accuracy obtained from gradient boosted trees: recall 79.7%, precision 90.1%, F1-score 83.6% |
| Cibulka et al., 2019 [33] | Handwriting Patterns | Random Forest | Classification of PD from HC | Collected from participants | 270 (150 PD + 120 HC) | Classification error for rs11240569, rs708727, rs823156 is 49.6%, 44.8%, 49.3%, respectively |
| Hsu S-Y et al., 2019 [34] | Handwriting Patterns | SVM with RBF Kernel, Logistic Regression | Classification of PD from HC | PACS | 202 (94 severe PD + 102 mild PD + 6 HC) | Highest accuracy obtained from SVM-RBF: 83.2%, with sensitivity 82.8% and specificity 100% |
| Drotár, P et al., 2016 [35] | Handwriting Patterns | K-NN, Ensemble AdaBoost Classifier, Support Vector Machine | Classification of PD from HC | PaHaW database | 37 PD + 38 HC | Accuracy: 81.3% |
| Fabian Maass et al., 2020 [36] | Handwriting Patterns | SVM | Classification of PD from HC | UCI machine learning repository | 157 (82 PD + 68 HC + 7 Normal Pressure Hydrocephalus (NPH)) | Sensitivity 80%, specificity 83% |
| J. Mucha et al., 2018 [37] | Handwriting Patterns | Random Forest Classifier | Classification of PD from HC | PaHaW database | 69 (33 PD + 36 HC) | Classification accuracy: 90%, with sensitivity 89% and specificity 91% |
| Wenzel et al., 2019 [38] | Handwriting Patterns | CNN | Classification of PD from HC | PPMI database | 645 (438 PD + 207 HC) | Accuracy: 97.2% |
| Segovia, F. et al., 2019 [39] | Handwriting Patterns | SVM with 10-fold cross-validation | Classification of PD from HC | Virgen De La Victoria Hospital, Malaga, Spain | 189 (95 PD + 94 HC) | Accuracy: 94.25% |
| Ye, Q. et al., 2018 [40] | Gait | Least Square (LS) SVM, Particle Swarm Optimization (PSO) | Classification of PD, ALS, HD from HC | Neurology Outpatient Clinic at Massachusetts General Hospital, Boston, MA, USA | 64 (15 PD + 16 HC + 13 amyotrophic lateral sclerosis (ALS) + 20 Huntington’s disease (HD)) | Accuracy for PD from HC: 90.32%; HD from HC: 94.44%; ALS from HC: 93.10% |
| Klomsae, A et al., 2018 [41] | Gait | Fuzzy KNN | Classification of PD, ALS, HD from HC | Neurology Outpatient Clinic at Massachusetts General Hospital, Boston, MA, USA | 64 (15 PD + 20 HD + 13 ALS + 16 HC) | Accuracy for PD from HC: 96.43%; HD from HC: 97.22%; ALS from HC: 96.88% |
| J. P. Félix et al., 2019 [42] | Gait | SVM, KNN, Naïve Bayes, LDA, Decision Tree | Classification of PD from HC | Neurology Outpatient Clinic at Massachusetts General Hospital, Boston, MA, USA | 31 (15 PD + 16 HC) | Highest accuracy obtained from SVM, KNN, and decision tree: 96.8% |
| Andrei et al., 2019 [43] | Gait | SVM | Classification of PD from HC | Laboratory for Gait and Neurodynamics | 166 (93 PD + 73 HC) | Accuracy: 100% |
| Priya SJ et al., 2021 [44] | Gait | ANN | Classification of PD from HC | Laboratory for Gait and Neurodynamics | 166 (93 PD + 73 HC) | Accuracy: 96.28% |
| Oğul, et al., 2020 [45] | Gait | ANN | Classification of PD from HC | Laboratory for Gait and Neurodynamics | 166 (93 PD + 73 HC) | Classification accuracy: 98.3% |
| Li B et al., 2020 [46] | Gait | Deep CNN | Classification of PD from HC | Collected from participants | 20 (10 PD + 10 HC) | Accuracy: 91.9% |
3. Proposed Work
The proposed ML model uses the SVM, naïve Bayes, KNN, and ANN algorithms at
its core. These algorithms are widely used in the literature since they are easy to use and
only need a small number of parameters to be tuned. There are several processes involved
in developing a model to detect PD from voice recordings. In the first phase, relevant
features are extracted from the dataset for better understanding. In the second phase,
machine learning techniques are applied to the acoustic features to classify healthy persons
and PD patients, and the outputs are presented as graphs
and tables of accuracy scores. Finally, in the third
phase, the machine learning classifier models are compared to
identify the one with the best accuracy score. The complete technical process of the proposed
work is represented in Figure 1. The proposed methodology is shown to be better than the other
methodologies with respect to computational cost, since a few voice features were used
instead of heavy feature extraction processes such as MRI, motion sensors, or handwriting
assessments. Additionally, the performances of different popular classifiers were
evaluated, and the best classifier for the PD diagnosis problem was found to be the ANN.
[Figure 1: flowchart. The voice feature dataset is divided into a training dataset and a test dataset.
Each passes through data preprocessing, feature selection techniques (filter method, wrapper method),
and the four classifiers (support vector machine, naïve Bayes classifier, artificial neural network,
k-nearest neighbour). The training branch yields the trained model; the test branch yields the final
classification, followed by model evaluation (accuracy rate, F1-score, execution time, MCC) and
comparison of results.]
Figure 1. Diagram of the flowchart of the proposed work.
Feature Selection
Because many features are available, feature selection is frequently used to
reduce the dimensionality of the data in machine learning based on voice analysis. As
demonstrated in Figure 2, all feature selection algorithms have the same aim of reducing
redundancy and increasing relevance, which improves the accuracy of the disease’s
diagnosis. Prior to supplying the data to the classifier, a variety of feature selection
strategies were used. The filter-based strategies take into account the importance of the
characteristics. As a result, they are stable and scalable and have a low level of complexity
[47,48]. The major drawback of this method is that, especially when the data arrive
as a stream, it may overlook certain useful features [49]. Filter-based techniques can be
either univariate or multivariate [50]. Univariate approaches analyze attributes according to
statistical criteria such as information gain (IG) [51–53].
Multivariate approaches calculate feature dependence before ranking the features. In
addition, a widely utilized statistical technique for data analysis is principal component
analysis (PCA). By constructing a small set of components that accurately reflects the entire data
set, PCA can reduce the dimensionality of the data. Because PCA is a transformation technique,
the first principal component is the component with the largest variance.
The other principal components follow in descending order of
variance [54]. Additionally, the wrapper-based algorithms assess the quality of the
chosen features based on the learning classifier’s performance.
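As a concrete illustration of a univariate filter criterion such as information gain, the following minimal sketch scores one feature at a time by how much a single threshold split reduces the entropy of the class label. The feature names, values, and threshold are hypothetical toy data, not taken from the dataset used in this work, and the single-threshold split is an assumption made for brevity.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting one feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


# Hypothetical toy data: 1 = PD, 0 = healthy (illustrative only).
status = [1, 1, 1, 1, 0, 0, 0, 0]
jitter = [0.9, 0.8, 0.7, 0.6, 0.2, 0.3, 0.1, 0.4]  # separates the two classes
noise = [0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.8]   # unrelated to the label

ig_jitter = information_gain(jitter, status, threshold=0.5)
ig_noise = information_gain(noise, status, threshold=0.5)
print(f"IG(jitter) = {ig_jitter:.3f}, IG(noise) = {ig_noise:.3f}")
```

A filter method would rank the features by such scores and keep the highest-ranked ones, without training any classifier.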
[Figure 2: process diagram. Dataset, selection, target dataset, pre-processing and cleaning,
processed data, feature selection and feature extraction, transformed data, data mining, patterns,
interpretation and evaluation.]
Figure 2. Feature Selection and Feature Extraction from Dataset.
For filter techniques, the whole procedure takes place in the pre-processing stage,
independent of the model; no classifier is involved in the filtering. Filter methods primarily
consider the data’s distribution, correlation, and internal relationships. As a result,
filter techniques have the advantage of being simple and quick to compute, which is why
they are commonly used in the
diagnosis of this disease. One popular filtering method is the minimum
redundancy and maximum relevancy (mRMR) method, which selects features that are far
apart from each other (minimally redundant) but have a strong correlation with the
classification variable.
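The idea behind mRMR can be sketched as a greedy loop that, at each step, picks the feature with the best trade-off between relevance to the label and redundancy with the features already chosen. The sketch below is a simplified variant that uses absolute Pearson correlation for both terms, which is an assumption made here for brevity (mRMR is commonly formulated with mutual information); the feature names and values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_a * var_b)


def mrmr_select(features, target, k):
    """Greedily pick k feature names by relevance minus mean redundancy."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = abs(pearson(features[name], target))
            if not selected:
                return relevance
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 1 = PD, 0 = healthy (illustrative only).
status = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "jitter": [0.9, 0.8, 0.7, 0.6, 0.2, 0.3, 0.1, 0.4],
    "jitter_pct": [0.9, 0.8, 0.7, 0.6, 0.2, 0.3, 0.1, 0.5],  # near-duplicate of jitter
    "shimmer": [0.6, 0.2, 0.7, 0.3, 0.4, 0.1, 0.5, 0.2],     # weaker but not redundant
}
chosen = mrmr_select(features, status, k=2)
print(chosen)
```

In this toy example the near-duplicate feature is passed over in the second step even though it is more relevant on its own, because it adds little beyond the feature already selected.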
The wrapper method decides whether to keep or reject a feature depending on the resulting
change in a classifier’s performance [55]. The wrapper method takes a specific classifier into
account and provides a subset well tailored to it. As a result, wrapper methods have a lower
chance of becoming trapped in a local maximum. Due to its large gain in performance, the
wrapper approach is popular in ML-based diagnostics. However, it has drawbacks, such as being
prone to overfitting and being computationally costly. Wrapper-based feature selection techniques
use a classifier to build ML models with different subsets of predictor variables and select the
variable subset that leads to the best model.
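A wrapper search of this kind can be sketched as forward selection: start with no features, repeatedly add the feature whose inclusion gives the best classifier score, and stop when no candidate improves it. The sketch below assumes a 1-nearest-neighbour classifier scored by leave-one-out accuracy; both choices, and the toy data, are illustrative assumptions rather than the configuration used in this work.

```python
def loo_accuracy(rows, labels, columns):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `columns`."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never compare a sample with itself
            dist = sum((row[c] - other[c]) ** 2 for c in columns)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_select(rows, labels, n_features):
    """Add one feature at a time, keeping the one that raises accuracy most."""
    chosen, best_score = [], 0.0
    candidates = list(range(len(rows[0])))
    while candidates and len(chosen) < n_features:
        scored = [(loo_accuracy(rows, labels, chosen + [c]), c) for c in candidates]
        score, column = max(scored, key=lambda item: item[0])
        if score <= best_score:
            break  # no remaining feature improves the classifier
        chosen.append(column)
        candidates.remove(column)
        best_score = score
    return chosen, best_score


# Hypothetical toy data: column 1 is informative, columns 0 and 2 are not.
rows = [
    (0.1, 0.9, 0.5), (0.9, 0.8, 0.1), (0.2, 0.7, 0.9), (0.8, 0.6, 0.3),
    (0.1, 0.2, 0.6), (0.9, 0.3, 0.2), (0.2, 0.1, 0.8), (0.8, 0.4, 0.4),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
selected, accuracy = forward_select(rows, labels, n_features=2)
print(selected, accuracy)
```

Because every candidate subset requires re-evaluating the classifier, the cost grows quickly with the number of features, which is the computational drawback noted above.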
In contrast, filter-based methods are statistical techniques, independent of any learning
algorithm, that compute the correlation between the predictor variables and the target
(dependent) variable. The predictor variables are scored according to their relevance to the target
variable. The variables with higher scores are then used to build the ML model. Therefore,
this research aims to use a filter-based feature selection method to identify the most
relevant features for improved PD detection.
4. Materials and Methods
4.1. Dataset
The dataset of recorded speech signals was obtained from Max Little of the
University of Oxford [56,57]. Table 2 contains the details of the dataset. This dataset has
an assortment of acoustic speech measures from 195 persons, of whom 147 have
Parkinson’s disease. Each attribute in the dataset characterizes an individual voice
measure, and each tuple corresponds to a voice recording made by these
people. The objective of the dataset is to differentiate healthy persons from those with the
disease using the status column, which is set to 0 (negative) for healthy persons and 1 (positive)
for those having the disease.
Table 2. Detail of Parkinson’s Dataset.
| Property | Value |
|---|---|
| Dataset Characteristic | Multivariate |
| No. of Instances | 197 |
| Attributes Characteristic | Real |
| No. of Attributes | 23 |
| Missing Values | N/A |
| Made by | Max Little of the University of Oxford |
| Associated Tasks | Classification |
| Types of Classification | Binary {0 for healthy and 1 for PD patient} |
4.2. Parkinson’s Disease Diagnosis Based on Voice Analysis and Machine Learning
Some studies have concentrated on the acoustic level or the fluctuations in
fundamental frequency (F0) caused by vocal activities. The effects of power spectral
analysis of F0 phonation in persons with sensorineural hearing loss and the disease have
been examined in [58–60]. The rhythm of F0 was distinctive in its rate and amplitude for each
disease. Further, the study demonstrated that F0 analysis can be a useful tool for
investigating neurological diseases. The autocorrelation function approach was
used to find the fundamental frequencies of the speech signals. According to this concept,
Parkinsonian dysprosody is frequently described as a simple neuro-motor disorder.
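The autocorrelation approach to finding the fundamental frequency can be illustrated with a minimal sketch: the signal is compared with delayed copies of itself, and the delay (lag) giving the strongest match is taken as the pitch period. The sampling rate, the 50 to 400 Hz search range, and the synthetic 200 Hz tone below are assumptions chosen for the illustration, not parameters reported by the cited studies.

```python
import math


def estimate_f0(signal, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak."""
    min_lag = int(sample_rate / f_max)  # shortest pitch period considered
    max_lag = int(sample_rate / f_min)  # longest pitch period considered
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the signal with itself at this lag.
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 50 ms long, sampled at 8 kHz (illustrative only).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
f0 = estimate_f0(tone, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

Real speech is noisier than a pure tone, so practical pitch trackers usually add windowing, normalization, and voicing decisions on top of this basic idea.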
The understanding and generation of pitch characteristics in a group of patients were
examined to confirm the idea. Regarding conventional medications, L-DOPA is reported to be
a very effective treatment in the early stages of PD [61]. In
[62], the authors use deep learning to categorize the patients’ speech data as severe or
not severe. The evaluation measure employed in that study was the Unified
Parkinson’s Disease Rating Scale (UPDRS). The motor UPDRS assesses the patient’s
motor ability on a 0–108 scale, while the total UPDRS provides a range of scores from 0
to 176.
4.3. Classification of Parkinson’s Disease with ML Classifier
In this technique, we use an ML classifier to classify the disease. First, we select
patient health status as the target variable and count the patients in each class.
We then visualize the data graphically. The dataset was divided into two
subsets: 80% of the data was used for training and 20% for
testing. In Figure 3, the score of 0 represents the healthy persons in
the sample, whose count is 48, and 1 represents the patients with Parkinson’s disease,
whose count is 147. Parkinson’s disease patients thus account for 147 of the 195 records
(75.38%), and healthy persons for 48 of 195 (24.62%).
Figure 3. Health Status of PD Patient (pie chart: Healthy Person 24.6%, PD Patient 75.4%).
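The class shares and the 80/20 split sizes can be reproduced with a few lines of plain Python. This is a minimal sketch, not the authors' code: the helper `stratified_split` is hypothetical, and stratifying by class is an assumption, since the paper does not say how the records were assigned to the two parts.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Class counts stated in the text: 48 healthy (0) and 147 PD (1) records.
labels = [0] * 48 + [1] * 147
share_pd = 100 * labels.count(1) / len(labels)
share_healthy = 100 * labels.count(0) / len(labels)
train_idx, test_idx = stratified_split(labels)

print(f"PD: {share_pd:.2f}%, healthy: {share_healthy:.2f}%")
print(len(train_idx), len(test_idx))
```

With 195 records, a 20% test part holds 39 records, which leaves 156 for training.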
4.4. Building of Machine Learning Techniques with Classifier Evaluation Metrics
Evaluating several types of classifiers makes it easier to identify the one best suited to
detecting the disease. Sensitivity, the Matthews correlation coefficient (MCC), accuracy, specificity,
F-score (F-measure), and other measures are used to compare the classifiers. Each
of these measures has a formula for calculating it, and together they determine which
classifier is the most appropriate for the analysis. Because all of these measures are derived
from the confusion matrix, it must be considered first [63]. The confusion matrix for the
two-class case is shown in Figure 4.
| Actual Class \ Predicted Class | Positive | Negative | Measure |
| --- | --- | --- | --- |
| Positive | True Positive (TP) | False Negative (FN), Type II Error | Sensitivity = TP / (TP + FN) |
| Negative | False Positive (FP), Type I Error | True Negative (TN) | Specificity = TN / (TN + FP) |
| Measure | Precision = TP / (TP + FP) | Negative Predictive Value = TN / (TN + FN) | Accuracy = (TP + TN) / (TP + TN + FP + FN) |

Figure 4. Confusion Matrix with Sensitivity, Specificity, Accuracy, and Precision value.
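The measures named in Figure 4 follow directly from the four cell counts of a binary confusion matrix. The sketch below is a plain-Python illustration; the `ConfusionCounts` class and the example counts are hypothetical and are not taken from the paper's code or data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a binary confusion matrix (hypothetical helper)."""
    tp: int  # actual positive, predicted positive
    fn: int  # actual positive, predicted negative (Type II error)
    fp: int  # actual negative, predicted positive (Type I error)
    tn: int  # actual negative, predicted negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def negative_predictive_value(self) -> float:
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        # Both kinds of correct prediction count towards accuracy.
        total = self.tp + self.fn + self.fp + self.tn
        return (self.tp + self.tn) / total

# Made-up counts, chosen only to keep the arithmetic easy to follow.
counts = ConfusionCounts(tp=40, fn=10, fp=5, tn=45)
print(counts.sensitivity())   # 40 / 50
print(counts.specificity())   # 45 / 50
print(counts.accuracy())      # 85 / 100
```

Keeping the four counts together in one object makes it harder to swap FP and FN by accident, which is the most common slip when these formulas are coded by hand.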
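F1-Score: Also known as the F-score, it is the harmonic mean of precision and recall
(sensitivity) and summarizes the accuracy of a model on a given dataset, as shown in Equation (1):
$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{1}$$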
MCC: It is used to evaluate the quality of binary and multi-class classifications, as
shown in Equation (2). It is based on the true-positive, true-negative,
false-positive, and false-negative counts. It lies between −1 and 1 and is defined as follows:
$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{2}$$
(−1): Contradiction between prediction and observation
(0): No better than random prediction
(1): Perfect classifier (accurate prediction).
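The F1-score and MCC can be computed from the same four counts. The following sketch uses the standard definitions, with F1 written in the equivalent form 2TP / (2TP + FP + FN); returning 0.0 when a denominator is zero is a convention assumed here, not something the paper specifies. The toy matrices illustrate the three landmark values of MCC listed above.

```python
import math

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when a marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(mcc(tp=5, tn=5, fp=0, fn=0))   # every prediction correct
print(mcc(tp=0, tn=0, fp=5, fn=5))   # every prediction wrong
print(mcc(tp=5, tn=5, fp=5, fn=5))   # predictions unrelated to the labels
print(f1_score(tp=8, fp=2, fn=2))
```

Unlike accuracy, MCC uses all four cells symmetrically, which makes it informative on imbalanced data such as a sample with roughly three PD records for every healthy one.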
5. Experiments and Results
The proposed work is implemented in Python 3.7 using JupyterLab. Here we detail the
experimental setup and the results of the four machine learning classification methods.
5.1. SVM-Classifier
SVM is one of the most prevalent classifier models because it provides accurate and
highly robust results. The fundamental goal of SVM is to separate the classes in the
training data, including in multi-class learning tasks. It allows for
the best classification performance on training data and accurately classifies patterns from
the data [64]. The training procedure uses a sequential minimization strategy, and
SVM has been shown to reach higher classification accuracy owing to its greater generalization
ability [65]. The linear SVM is given by Equation (3):
$$y = \operatorname{sign}\left(w^{T} x + b\right) \tag{3}$$
where x represents the data, y represents the class label, w represents the weight vector
orthogonal to the decision hyperplane, b represents the offset of the hyperplane, and T
denotes the transpose operator [66].
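To make the symbols concrete, the sketch below evaluates a linear decision rule for a single sample. It assumes Equation (3) takes the common form sign(w^T x + b); the function name `linear_svm_predict` and the parameter values are hypothetical and were not fitted to any data.

```python
from typing import Sequence

def linear_svm_predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the side of the hyperplane w^T x + b = 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical parameters for a two-feature problem.
w = [2.0, -1.0]
b = -0.5
print(linear_svm_predict(w, b, [1.0, 0.5]))   # score = 2.0 - 0.5 - 0.5 = 1.0
print(linear_svm_predict(w, b, [0.0, 1.0]))   # score = -1.0 - 0.5 = -1.5
```

Training an SVM amounts to choosing w and b so that the margin between the two classes is as wide as possible; prediction only needs the dot product shown here.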
In this study, we use the SVM classifier module of the sklearn library to
classify the given dataset. Table 3 presents the results generated by
the SVM classifier, which are plotted in Figure 5. Figure 6 shows the confusion matrix with the true
positive, true negative, false positive, and false negative counts obtained with the
SVM classifier.
Table 3. SVM Classifier.
| Name | Results |
| --- | --- |
| Accuracy Score of test data | 87.17% |
| Accuracy Score of training data | 88.46% |
| Execution Time | 0.03111 s |
| F1-score | 66.19% |
| MCC | 56.59% |
Figure 5. Results obtained by SVM.
| | Predicted Healthy | Predicted Parkinson |
| --- | --- | --- |
| True Healthy | 5 | 3 |
| True Parkinson | 2 | 29 |

Figure 6. Confusion Matrix and Heatmap of SVM Classifier.
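As a consistency check, the counts in Figure 6 can be related to the SVM results reported in the text. The calculation below assumes that the healthy class is treated as the positive class (TP = 5, FN = 3, FP = 2, TN = 29); this reading is an assumption, adopted because it reproduces the sensitivity (62.5%) and specificity (93.54%) reported for SVM in Section 5.5.

```latex
\begin{aligned}
\text{Accuracy}    &= \frac{5 + 29}{5 + 3 + 2 + 29} = \frac{34}{39} \approx 0.8718,\\
\text{Sensitivity} &= \frac{5}{5 + 3} = 0.625,\\
\text{Specificity} &= \frac{29}{29 + 2} = \frac{29}{31} \approx 0.9355.
\end{aligned}
```

The 39 test records correspond to 20% of the 195 records in the dataset. The reported values of 87.17% and 93.54% appear to be truncated rather than rounded.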
5.2. Naive Bayes Classifier
The naïve Bayes classifier is another essential ML classification
technique. It provides effective classification and learning, and good results are
frequently obtained with it [67]. Naïve Bayes, based on Bayes’ theorem,
determines the likelihood of an event occurring given the circumstances related to that event.
For instance, variations in the voice are common in people with the disease; hence, these
symptoms are linked to the prediction for diagnosis of this disease. The naive variant
simplifies the original Bayes’ theorem by assuming that the features are independent of one
another given the class, which gives a tractable mechanism
for determining the probability of a target occurrence. To estimate the likelihood of the
medical condition, the data comprise numerous speech signal variants. The Gaussian naive
Bayes classifier module of sklearn is used to carry out
the naïve Bayes classification. The result of the classifier is shown in Table 4 and the
graphical representation is illustrated in Figure 7.
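The idea behind Gaussian naive Bayes can be sketched without any library: each feature is modelled by one normal distribution per class, and the class with the highest posterior wins. The code below is an illustrative re-implementation, not the sklearn module used in the study; the function names, the variance floor of 1e-9, and the toy data are all assumptions.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means, and feature variances."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive (assumption).
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for sample x."""
    def log_posterior(params):
        prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
            for xi, m, var in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=lambda label: log_posterior(model[label]))

# Toy two-feature data, invented for illustration (0 = healthy, 1 = PD).
X = [[0.10, 1.0], [0.20, 1.1], [0.15, 0.9], [0.80, 2.0], [0.90, 2.2], [0.85, 1.9]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, [0.12, 1.0]))
print(predict_gaussian_nb(model, [0.88, 2.1]))
```

Summing log-likelihoods per feature is exactly where the independence assumption enters: the joint likelihood is taken to be the product of the per-feature likelihoods.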
Table 4. Naïve Bayes Classifier Results.
| Name | Results |
| --- | --- |
| Accuracy Rate of test data | 74.11% |
| Accuracy Rate of training data | 76.23% |
| Execution Time | 0.0323 s |
| F1-score | 86.74% |
| MCC | 66.56% |
Figure 7. Results obtained by Naïve Bayes.
5.3. Artificial Neural Network
An ANN is a computational model inspired by the way the human brain works.
In general, there are significant distinctions between the human brain and an ANN. The brain
has a very large number of neurons operating in parallel, whereas a machine has only a limited
number of processors. Additionally, neurons are simpler and slower than computer
processors. Another major disparity between computer systems and the brain is the ability
to process information on a larger scale. Neurons are connected by synapses and
operate together as networks [64, 68]. In this article, the main aim is to assess how well
ANN techniques perform in the early detection of this disease, which is built on the following
phases:
i. Identifying the responsibility and function of ANN in the detection of this disease.
ii. Making observations on labels and features of datasets.
iii. Grouping the types of the studied disease based on their symptoms.
iv. Examining the accuracy of the outcomes.
These outcomes can further serve the medical sector as guidance for developers
considering ANN deployment to enhance public health capacity in response to the
studied disease [69].
In the ANN experiment, the dataset was split into two parts,
i.e., the training dataset (80%) and the test dataset (20%). The ANN achieved the highest
average accuracy among all the classification methods, i.e., 96.7%, as shown in
Table 5; the graphical representation is shown in Figure 8.
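The paper does not state the network architecture, so the following is only a generic sketch of how a small feed-forward network maps a feature vector to a class probability. The layer sizes, the sigmoid activation, the weights, and the 0.5 decision threshold are all hypothetical choices made for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer with sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input -> hidden layer -> single output probability."""
    hidden = dense(x, hidden_w, hidden_b)
    return dense(hidden, [out_w], [out_b])[0]

# Hypothetical, untrained parameters: 3 inputs, 2 hidden neurons, 1 output.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [1.2, -0.7]
out_b = 0.05

probability = forward([0.2, 0.4, 0.6], hidden_w, hidden_b, out_w, out_b)
label = 1 if probability >= 0.5 else 0   # 1 = PD, 0 = healthy (assumed coding)
print(round(probability, 3), label)
```

Training would adjust the weights and biases so that the output probability matches the known labels on the training part of the data; only the forward computation is shown here.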
Table 5. Artificial Neural Network Classifier Outcome.
| Title | Results |
| --- | --- |
| Accuracy Rate of test data | 96.7% |
| Accuracy Rate of training data | 97.4% |
| Execution Time | 0.025 s |
| F1-Score | 87.01% |
| MCC | 70.11% |
Figure 8. Results obtained by ANN.
5.4. K-Nearest Neighbor
KNN has been widely used in pattern recognition, but it becomes costly when presented
with a large training dataset. KNN is based on learning
by analogy: a given test tuple is classified by comparing it with the
training tuples that are most similar to it. The training tuples are described by n
attributes, so each tuple corresponds to a distinct point
in an n-dimensional space. When given an
unlabeled tuple, the KNN classifier searches the pattern space for the k training tuples that are
closest to it [64]. This study aims to identify the accuracy rate of detecting the subject disease.
The KNN algorithm is used to distinguish affected patients from healthy persons.
In terms of accuracy, the experimental data reveal that the ANN classifier
outperformed the KNN classifier on average. The results of the KNN classifier, namely
the accuracy rate on the training and test datasets, F1-score, and MCC, are shown
in Table 6 and illustrated in Figure 9.
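The nearest-neighbour search described above can be written in a few lines. This sketch assumes Euclidean distance and a simple majority vote with k = 3; the paper does not state the distance metric or the value of k, and the toy points are invented.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training tuples."""
    neighbours = sorted(
        (euclidean(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Toy points in a 2-dimensional pattern space (0 = healthy, 1 = PD).
train_X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [3.0, 3.2], [3.1, 2.9], [2.8, 3.0]]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_X, train_y, [1.1, 1.0]))
print(knn_predict(train_X, train_y, [2.9, 3.1]))
```

Every prediction scans the whole training set, which is why the method becomes expensive as the training data grows.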
Table 6. KNN Classifier Results.
| Name | Results |
| --- | --- |
| Accuracy Rate of test data | 87.17% |
| Accuracy Rate of training data | 88.46% |
| Execution Time | 0.03111 s |
| F1-score | 71% |
| MCC | 65.02% |
Figure 9. Results obtained by KNN.
5.5. Summary of Evaluation Results
The performance of all the classifier models used in the experiment for the disease’s
prediction is summarized in Table 7. The artificial neural network classifier scores the
highest test accuracy, followed by SVM and KNN, which are tied, and then naïve Bayes. Figure 10 shows the
graphical representation of the results obtained by these four ML classifiers based on
various parameters. Table 7 shows that SVM attained accuracies of 88.46% and 87.17% on the
training and test datasets, respectively, an F1-score of 66.19%,
an MCC of 56.59%, and sensitivity and specificity of 62.5% and 93.54%, respectively. In addition,
naïve Bayes achieved training accuracy, test accuracy, F1-score,
MCC, sensitivity, and specificity of 76.23%, 74.11%, 86.74%, 66.56%, 84%, and
79.76%, respectively.
Table 7. An overview of evaluation results.
| Performance Measure | Accuracy (Training Dataset) | Accuracy (Test Dataset) | F1-Score | MCC | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- | --- |
| SVM | 88.46% | 87.17% | 66.19% | 56.59% | 62.5% | 93.54% |
| Naïve Bayes | 76.23% | 74.11% | 86.74% | 66.56% | 84% | 79.76% |
| KNN | 88.46% | 87.17% | 71% | 65.02% | 60.0% | 93.54% |
| ANN | 97.4% | 96.7% | 87.01% | 70.11% | 92.42% | 91.25% |
It has been observed that SVM and KNN obtain the same
values for all the measures except F1-score (71% for KNN), MCC (65.02%), and sensitivity (60%). Finally, the best
accuracy was obtained by the ANN, whose training accuracy,
test accuracy, F1-score, MCC, sensitivity, and specificity are 97.4%, 96.7%,
87.01%, 70.11%, 92.42%, and 91.25%, respectively. Overall, the results of our
experiments show that ANN outperforms SVM, naive Bayes, and KNN.
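The ordering of the classifiers, including the tie between SVM and KNN, can be read off mechanically from the test accuracies in Table 7. The snippet below is a small illustration; the variable names are arbitrary.

```python
# Test-set accuracies (%) as reported in Table 7.
test_accuracy = {"SVM": 87.17, "Naive Bayes": 74.11, "KNN": 87.17, "ANN": 96.7}

# Group classifiers by accuracy so that tied classifiers share a rank.
by_score = {}
for name, score in test_accuracy.items():
    by_score.setdefault(score, []).append(name)

ranking = [
    (score, sorted(names))
    for score, names in sorted(by_score.items(), reverse=True)
]
for rank, (score, names) in enumerate(ranking, start=1):
    print(rank, f"{score:.2f}%", ", ".join(names))
```

Ranking on a single measure hides differences elsewhere: naïve Bayes has the lowest accuracy here but a higher F1-score than both SVM and KNN in the same table.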
Figure 10. Graphical representation of distributions of performance measures for all classifiers.
6. Comparative Study and Discussion
This section compares the results of the proposed
technique with those of other conventional machine learning techniques. The comparison of the
proposed study with previously published research is shown in Table 8.
Table 8. Performance Comparison with previous studies.
| Reference | Basis | Machine Learning Classifier | Accuracy | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- |
| Sakar et al. [70] | Speech | SVM and KNN | 68.45% | 60% | 50% |
| Vadovský and Paralič [71] | Speech | C4.5 + C5.0 + random forest + CART | 66.5% | NA | NA |
| Ouhmida et al. [72] | Speech | SVM, K-NN, Decision Tree | 98.26% (AUC) | NA | NA |
| Mabrouk et al. [73] | Speech | Random forest, SVM, MLP, KNN | 78.4% (SVM), 82.2% (KNN) | NA | NA |
| Benba et al. [74] | Speech | HFCC-SVM | 87.5% | 90% | 85% |
| Proposed Work | Speech | SVM, naïve Bayes, KNN and ANN | 87.17%, 74.11%, 87.17%, and 96.7% | 62.5%, 84%, 60%, and 92.42% | 93.54%, 79.76%, 93.54%, and 91.25% |
As per the comparative analysis, the proposed model (using four machine learning
algorithms) yields results that are comparable to or better than those of the other experimental machine
learning models and the existing state of the art. In the proposed study, the best result was
achieved by ANN with 96.7% accuracy, which is higher than the other experimental
algorithms. The authors of [70] collected speech datasets from 20 PD patients and 20 HC using
high-quality recording equipment and used KNN and SVM to analyze the datasets in order to
detect PD. The KNN and SVM classifiers achieved accuracy rates of 59.52% (LOSO) and
68.45% (LOSO), respectively. In [71], the authors used various decision-tree-based algorithms
such as C4.5, C5.0, random forest, and CART. The authors
experimented on the records of 40 individuals, of whom 50% were affected by the
disease and 50% were HC. In this study, the highest average model accuracy attained
was 66.5%. ANN was used by [51] to identify PD. The dataset was obtained from the
University of California, Irvine’s machine learning repository. A total of 45 attributes were
chosen as input values and one output for the classification, using the MATLAB tool.
With an accuracy of 94.93%, their suggested model was able to differentiate healthy
individuals from PD subjects. In [73], the authors used random forest, SVM, MLP, and
KNN classifiers to distinguish PD patients from HC. The accuracies obtained in this
study were 78.4% and 82.2% for the SVM and KNN classifiers, respectively. In
[74], the authors compared patients with PD (PWP) and
healthy controls (HC) based on a variety of speech samples. In their study, human factor
cepstral coefficients (HFCC) were applied. The extracted HFCC were used to generate the
average voice print for each voice recording. For the classification, SVM was used with a
variety of kernels, including RBF, polynomial, linear, and MLP. The SVM’s linear kernel
yielded the highest accuracy of 87.5%.
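Two of the accuracies above are labelled LOSO. Assuming this denotes leave-one-subject-out validation, every fold holds out all recordings of one subject and trains on the rest, so recordings of the same person never appear on both sides of a split. The generator below is a minimal sketch; the function name and the subject layout are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx

# Hypothetical layout: 3 subjects with 2 recordings each.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Accuracies obtained this way are usually lower than those from a random record-level split, which should be kept in mind when comparing figures across studies.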
In addition to the comparisons mentioned above, the performance of the proposed
methodology is compared with related ML methods for PD analysis in various scenarios
and with various types of evaluated PD datasets. As seen in Table 8, the accuracy of the
proposed ANN-based technique is higher than the accuracies reported for the other ML
contributions to diagnosing PD.
7. Conclusions
Automated ML techniques can distinguish PD from HC and predict the outcome using
non-invasive speech biomarkers as features. Our
study compares the performance of multiple machine learning classifiers for disease
detection on noisy, high-dimensional data. Accuracy at the clinical level is feasible with careful feature selection. In this
paper, we compared four ML classifiers: SVM with an accuracy of 87.17%, the naïve Bayes
classifier with an accuracy of 74.11%, ANN with an accuracy of 96.7%, and KNN with an
accuracy of 87.17%. We used these techniques to distinguish between affected patients
and healthy people. The disease is diagnosed using human speech signals. The
results demonstrate that feature selection techniques work well with ML classifiers,
especially when working with voice data, from which a large number of
phonetic characteristics can be extracted. The proposed early diagnosis approach makes it possible to
detect PD with high accuracy in its early stages, so that the disease’s severe
symptoms can be prevented. Many classification algorithms are being used in the
medical imaging area to obtain the best level of accuracy. This research may be extended to
different machine learning methods and datasets to improve classifier performance and
reach the maximum accuracy score. In order to improve the accuracy of the models
created, future efforts will make use of the already-existing recordings and increase the
number of attributes. In order to compare the collected data, the various
record-processing software packages that are available online may also be used.
Author Contributions: Conceptualization, A.R.; methodology, A.R. and A.D.; validation, A.R. and
M.R.; formal analysis, A.R. and N.A.; writing – original draft preparation, A.R.; writing – review
and editing, M.K.P. and M.R.; supervision, A.D. and R.S. All authors have read and agreed to the
published version of the manuscript.
Funding: There was no external funding received for this article.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data in this research paper will be shared upon request made to the
first author.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. DeMaagd, G.; Philip, A. Parkinson’s Disease and Its Management: Part 1: Disease Entity, Risk Factors, Pathophysiology, Clinical
Presentation, and Diagnosis. Pharm. Ther. 2015, 40, 504–532.
2. Rizek, P.; Kumar, N.; Jog, M.S. An update on the diagnosis and treatment of Parkinson disease. CMAJ 2016, 188, 1157–1165.
3. Available online: https://www.who.int/news-room/fact-sheets/detail/parkinson-disease (accessed on 30 October 2022).
4. de Rijk, M.C.; Launer, L.J.; Berger, K.; Breteler, M.M.; Dartigues, J.F.; Baldereschi, M.; Fratiglioni, L.; Lobo, A.; Martinez-Lage,
J.; Trenkwalder, C.; et al. Prevalence of Parkinson’s disease in Europe: A collaborative study of population-based cohorts.
Neurologic Diseases in the Elderly Research Group. Neurology 2000, 54 (Suppl. 5), S21–S23.
5. Canturk, İ.; Karabiber, F. A machine learning system for the diagnosis of Parkinson’s disease from speech signals and its
application to multiple speech signal types. Arab. J. Sci. Eng. 2016, 41, 5049–5059.
6. Singh, N.; Pillay, V.; Choonara, Y.E. Advances in the treatment of Parkinson’s disease. Prog. Neurobiol. 2007, 81, 29–44.
7. Rana, A.; Rawat, A.S.; Bijalwan, A.; Bahuguna, H. Application of multi-layer (perceptron) artificial neural network in the
diagnosis system: A systematic review. In Proceedings of the 2018 International Conference on Research in Intelligent and
Computing in Engineering (RICE), San Salvador, El Salvador, 22–24 August 2018; pp. 1–6.
8. Lakany, H. Extracting a diagnostic gait signature. Pattern Recognit. 2008, 41, 1627–1637.
9. Figueiredo, J.; Santos, C.P.; Moreno, J.C. Automatic recognition of gait patterns in human motor disorders using machine
learning: A review. Med. Eng. Phys. 2018, 53, 1–12. https://doi.org/10.1016/j.medengphy.2017.12.006.
10. Hazan, H.; Hilu, D.; Manevitz, L.; Ramig, L.O.; Sapir, S. Early diagnosis of Parkinson’s disease via machine learning on speech
data. In Proceedings of the 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, Eilat, Israel, 14–17
November 2012; pp. 1–4. https://doi.org/10.1109/eeei.2012.6377065.
11. Karan, B.; Sahu, S.S.; Mahto, K. Parkinson disease prediction using intrinsic mode function based features from speech signal.
Biocybern. Biomed. Eng. 2019, 40, 249–264. https://doi.org/10.1016/j.bbe.2019.05.005.
12. Frid, A.; Safra, E.J.; Hazan, H.; Lokey, L.L.; Hilu, D.; Manevitz, L.; Ramig, L.O.; Sapir, S. Computational diagnosis of Parkinson’s
Disease directly from natural speech using machine learning techniques. In Proceedings of the 2014 IEEE International
Conference on Software Science, Technology and Engineering, Washington, DC, USA, 11–12 June 2014; pp. 50–53.
13. Rawat, A.S.; Rana, A.; Kumar, A.; Bagwari, A. Application of multi layer artificial neural network in the diagnosis
system: A systematic review. IAES Int. J. Artif. Intell. 2018, 7, 138.
14. Karimi Rouzbahani, H.; Daliri, M.R. Diagnosis of Parkinson’s Disease in Human Using Voice Signals. BCN 2011, 2, 12–20.
15. Khamparia, A.; Gupta, D.; Nguyen, N.G.; Khanna, A.; Pandey, B.; Tiwari, P. Sound Classification Using Convolutional Neural
Network and Tensor Deep Stacking Network. IEEE Access 2019, 7, 7717–7727. https://doi.org/10.1109/access.2018.2888882.
16. Bourouhou, A.; Jilbab, A.; Nacir, C.; Hammouch, A. Comparison of classification methods to detect the parkinson disease. In
Proceedings of the 2016 International Conference on Electrical and Information Technologies (ICEIT), Tangiers, Morocco, 4–7
May 2016; pp. 421–424.
17. Sharma, A.; Giri, R.N. Automatic Recognition of Parkinson’s Disease via Artificial Neural Network and Support Vector
Machine. Int. J. Innov. Technol. Explor. Eng. (IJITEE) 2014, 4, 7.
18. Purwins, H.; Li, B.; Virtanen, T.; Schluter, J.; Chang, S.-Y.; Sainath, T.N. Deep Learning for Audio Signal Processing. IEEE J. Sel.
Top. Signal Process. 2019, 13, 206–219. https://doi.org/10.1109/jstsp.2019.2908700.
19. Zhang, L.; Qu, Y.; Jin, B.; Jing, L.; Gao, Z.; Liang, Z. An Intelligent Mobile-Enabled System for Diagnosing Parkinson Disease:
Development and Validation of a Speech Impairment Detection System. JMIR Public Health Surveill. 2020, 8, e18689.
https://doi.org/10.2196/18689.
20. Kadiri, S.R.; Kethireddy, R.; Alku, P. Parkinson’s Disease Detection from Speech Using Single Frequency Filtering Cepstral
Coefficients. In Proceedings of the Interspeech 2020, Shanghai, China, 25–29 October 2020.
https://doi.org/10.21437/interspeech.2020-3197.
21. Pramanik, M.; Pradhan, R.; Nandy, P.; Bhoi, A.K.; Barsocchi, P. Machine Learning Methods with Decision Forests for
Parkinson’s Detection. Appl. Sci. 2021, 11, 581. https://doi.org/10.3390/app11020581.
22. Gunduz, H. Deep Learning-Based Parkinson’s Disease Classification Using Vocal Feature Sets. IEEE Access 2019, 7,
115540–115551. https://doi.org/10.1109/access.2019.2936564.
23. Available online: https://www.dataversity.net/improving-clinical-insights-machine-learning/# (accessed on 27 August 2022).
24. Kohlschein, C.; Schmitt, M.; Schuller, B.; Jeschke, S.; Werner, C.J. A machine learning based system for the automatic evaluation
of aphasia speech. In Proceedings of the 2017 IEEE 19th International Conference on e-Health Networking, Applications and
Services (Healthcom), Dalian, China, 12–15 October 2017; pp. 1–6. https://doi.org/10.1109/healthcom.2017.8210766.
25. Bertini, F.; Allevi, D.; Lutero, G.; Montesi, D.; Calzà, L. Automatic Speech Classifier for Mild Cognitive Impairment and Early
Dementia. ACM Trans. Comput. Healthc. (HEALTH) 2022, 3, 1–11. https://doi.org/10.1145/3469089.
26. Matcham, F.; On Behalf of the RADAR-CNS Consortium; Pietro, C.B.D.S.; Bulgari, V.; de Girolamo, G.; Dobson, R.; Eriksson,
H.; Folarin, A.A.; Haro, J.M.; Kerz, M.; et al. Remote assessment of disease and relapse in major depressive disorder
(RADAR-MDD): A multi-centre prospective cohort study protocol. BMC Psychiatry 2019, 19, 72.
https://doi.org/10.1186/s12888-019-2049-z.
27. Sakar, C.O.; Serbes, G.; Gunduz, A.; Tunc, H.C.; Nizam, H.; Sakar, B.E.; Tutuncu, M.; Aydin, T.; Isenkul, M.E.; Apaydin, H. A
comparative analysis of speech signal processing algorithms for Parkinson’s disease classification and the use of the tunable
Q-factor wavelet transform. Appl. Soft Comput. 2019, 74, 255–263.
28. Yasar, A.; Saritas, I.; Sahman, M.A.; Cinar, A.C. Classification of Parkinson disease data with artificial neural networks. In
Proceedings of the IOP Conference Series: Materials Science and Engineering, Wuhan, China, 10–12 October 2019; Volume 675,
p. 012031.
29. Avuçlu, E.; Elen, A. Evaluation of train and test performance of machine learning algorithms and Parkinson diagnosis with
statistical measurements. Med. Biol. Eng. Comput. 2020, 58, 2775–2788. https://doi.org/10.1007/s11517-020-02260-3.
30. Marar, S.; Swain, D.; Hiwarkar, V.; Motwani, N.; Awari, A. Predicting the occurrence of Parkinson’s Disease using various
Classification Models. In Proceedings of the 2018 International Conference on Advanced Computation and Telecommunication
(ICACAT), Bhopal, India, 28–29 December 2018; pp. 1–5.
31. Nikookar, E.; Sheibani, R.; Alavi, S.E. An ensemble method for diagnosis of Parkinson’s disease based on voice measurements.
J. Med. Signals Sens. 2019, 9, 221–226. https://doi.org/10.4103/jmss.jmss_57_18.
32. Tracy, J.M.; Özkanca, Y.; Atkins, D.C.; Ghomi, R.H. Investigating voice as a biomarker: Deep phenotyping methods for early
detection of Parkinson’s disease. J. Biomed. Inform. 2019, 104, 103362. https://doi.org/10.1016/j.jbi.2019.103362.
33. Cibulka, M.; Brodnanova, M.; Grendar, M.; Grofik, M.; Kurca, E.; Pilchova, I.; Osina, O.; Tatarkova, Z.; Dobrota, D.; Kolisek, M.
SNPs rs11240569, rs708727, and rs823156 in SLC41A1 Do Not Discriminate Between Slovak Patients with Idiopathic Parkinson’s
Disease and Healthy Controls: Statistics and Machine-Learning Evidence. Int. J. Mol. Sci. 2019, 20, 4688.
https://doi.org/10.3390/ijms20194688.
34. Hsu, S.-Y.; Lin, H.-C.; Chen, T.-B.; Du, W.-C.; Hsu, Y.-H.; Wu, Y.-C.; Tu, P.-W.; Huang, Y.-H.; Chen, H.-Y. Feasible Classified
Models for Parkinson Disease from 99mTc-TRODAT-1 SPECT Imaging. Sensors 2019, 19, 1740.
https://doi.org/10.3390/s19071740.
35. Drotár, P.; Mekyska, J.; Rektorová, I.; Masarová, L.; Smékal, Z.; Faundez-Zanuy, M. Evaluation of handwriting
kinematics and pressure for differential diagnosis of Parkinson’s disease. Artif. Intell. Med. 2016, 67, 39–46.
36. Maass, F.; Michalke, B.; Willkommen, D.; Leha, A.; Schulte, C.; Tönges, L.; Mollenhauer, B.; Trenkwalder, C.; Rückamp, D.;
Börger, M.; et al. Elemental fingerprint: Reassessment of a cerebrospinal fluid biomarker for Parkinson’s disease. Neurobiol. Dis.
2019, 134, 104677. https://doi.org/10.1016/j.nbd.2019.104677.
37. Mucha, J.; Mekyska, J.; Faundez-Zanuy, M.; Lopez-De-Ipina, K.; Zvoncak, V.; Galaz, Z.; Kiska, T.; Smekal, Z.; Brabenec, L.;
Rektorova, I. Advanced Parkinson’s Disease Dysgraphia Analysis Based on Fractional Derivatives of Online Handwriting. In
Proceedings of the 2018 10th International Congress on Ultra Modern Telecommunications and Control Systems and
Workshops (ICUMT), Moscow, Russia, 5–9 November 2018; pp. 1–6. https://doi.org/10.1109/icumt.2018.8631265.
38. Wenzel, M.; Milletari, F.; Krüger, J.; Lange, C.; Schenk, M.; Apostolova, I.; Klutmann, S.; Ehrenburg, M.; Buchert, R. Automatic
classification of dopamine transporter SPECT: Deep convolutional neural networks can be trained to be robust with respect to
variable image characteristics. Eur. J. Nucl. Med. Mol. Imaging 2019, 46, 2800–2811. https://doi.org/10.1007/s00259-019-04502-5.
39. Segovia, F.; Gorriz, J.M.; Ramirez, J.; Martinez-Murcia, F.J.; Castillo-Barnes, D. Assisted Diagnosis of Parkinsonism Based on
the Striatal Morphology. Int. J. Neural Syst. 2019, 29, 1950011. https://doi.org/10.1142/s0129065719500114.
40. Ye, Q.; Xia, Y.; Yao, Z. Classification of Gait Patterns in Patients with Neurodegenerative Disease Using Adaptive Neuro-Fuzzy
Inference System. Comput. Math. Methods Med. 2018, 2018, 9831252. https://doi.org/10.1155/2018/9831252.
41. Klomsae, A.; Auephanwiriyakul, S.; Theera-Umpon, N. String grammar unsupervised possibilistic fuzzy c-medians for
gait pattern classification in patients with neurodegenerative diseases. Comput. Intell. Neurosci. 2018, 2018, 1869565.
42. Felix, J.P.; Vieira, F.H.T.; Cardoso, A.A.; Ferreira, M.V.G.; Franco, R.A.P.; Ribeiro, M.A.; Araujo, S.G.; Correa, H.P.; Carneiro,
M.L. A Parkinson’s Disease Classification Method: An Approach Using Gait Dynamics and Detrended Fluctuation Analysis. In
Proceedings of the 2019 IEEE Canadian Conference of Electrical and Computer Engineering (CCECE), Edmonton, AB, Canada,
5–8 May 2019. https://doi.org/10.1109/ccece.2019.8861759.
43. Andrei, A.-G.; Tautan, A.-M.; Ionescu, B. Parkinson’s Disease Detection from Gait Patterns. In Proceedings of the 2019 E-Health
and Bioengineering Conference (EHB), Iasi, Romania, 21–23 November 2019; pp. 1–4.
https://doi.org/10.1109/ehb47216.2019.8969942.
44. Priya, S.J.; Rani, A.J.; Subathra, M.S.P.; Mohammed, M.A.; Damaševičius, R.; Ubendran, N. Local Pattern Transformation Based
Feature Extraction for Recognition of Parkinson’s Disease Based on Gait Signals. Diagnostics 2021, 11, 1395.
https://doi.org/10.3390/diagnostics11081395.
45. Yurdakul, O.C.; Subathra, M.; George, S.T. Detection of Parkinson’s Disease from gait using Neighborhood Representation
Local Binary Patterns. Biomed. Signal Process. Control 2020, 62, 102070. https://doi.org/10.1016/j.bspc.2020.102070.
46. Li, B.; Yao, Z.; Wang, J.; Wang, S.; Yang, X.; Sun, Y. Improved Deep Learning Technique to Detect Freezing of Gait in Parkinson’s
Disease Based on Wearable Sensors. Electronics 2020, 9, 1919. https://doi.org/10.3390/electronics9111919.
47. Rana, A.; Dumka, A.; Singh, R.; Panda, M.K.; Priyadarshi, N.; Twala, B. Imperative Role of Machine Learning Algorithm for
Detection of Parkinson’s Disease: Review, Challenges and Recommendations. Diagnostics 2022, 12, 2003.
https://doi.org/10.3390/diagnostics12082003.
48. Masoudi-Sobhanzadeh, Y.; MotieGhader, H.; Masoudi-Nejad, A. FeatureSelect: A software for feature selection based on
machine learning approaches. BMC Bioinform. 2019, 20, 107. https://doi.org/10.1186/s12859-019-2754-0.
49. Rahmaninia, M.; Moradi, P. OSFSMI: Online stream feature selection method based on mutual information. Appl. Soft Comput.
2018, 68, 733–746. https://doi.org/10.1016/j.asoc.2017.08.034.
50. Pourbahrami, S. Improving PSO global method for feature selection according to iterations global search and chaotic theory.
arXiv 2018, preprint. arXiv:1811.08701.
51. Yu, L.; Liu, H. Feature selection for high-dimensional data: A fast correlation-based filter solution. In Proceedings of the 20th
International Conference on Machine Learning (ICML-03), Washington, DC, USA, 21–24 August 2003; pp. 856–863.
52. Blum, A.L.; Langley, P. Selection of relevant features and examples in machine learning. Artif. Intell. 1997, 97, 245–271.
https://doi.org/10.1016/s0004-3702(97)00063-5.
53. Raileanu, L.E.; Stoffel, K. Theoretical Comparison between the Gini Index and Information Gain Criteria. Ann. Math. Artif. Intell.
2004, 41, 77–93. https://doi.org/10.1023/b:amai.0000018580.96245.c6.
54. Jolliffe, I.T. Principal Component Analysis, 2nd ed.; Springer: New York, NY, USA, 2002.
55. Miao, Y.; Lou, X.; Wu, H. The Diagnosis of Parkinson’s Disease Based on Gait, Speech Analysis and Machine Learning
Techniques. In Proceedings of the 2021 International Conference on Bioinformatics and Intelligent Computing (BIC 2021).
Association for Computing Machinery, New York, NY, USA, 22–24 January 2021; pp. 358–371.
56. Little, M.; McSharry, P.; Hunter, E.; Spielman, J.; Ramig, L. Suitability of dysphonia measurements for telemonitoring of
Parkinson’s disease. Nat. Preced. 2008, 1-1. https://doi.org/10.1038/npre.2008.2298.1.
57. Lichman, M. UCI Machine Learning Repository; University of California, School of Information and Computer Science: Irvine,
CA, USA. Available online: http://archive.ics.uci.edu/ml (accessed on 25 September 2022).
58. Rewar, S. A systematic review on Parkinson’s disease (PD). Indian J. Res. Pharm. Biotechnol. 2015, 3, 176.
59. Arora, S.; Venkataraman, V.; Zhan, A.; Donohue, S.; Biglan, K.; Dorsey, E.; Little, M. Detecting and monitoring the symptoms
of Parkinson’s disease using smartphones: A pilot study. Park. Relat. Disord. 2015, 21, 650–653.
https://doi.org/10.1016/j.parkreldis.2015.02.026.
60. Miljkovic, D.; Aleksovski, D.; Podpečan, V.; Lavrač, N.; Malle, B.; Holzinger, A. Machine Learning and Data Mining Methods
for Managing Parkinson’s Disease. In Machine Learning for Health Informatics, Springer: Cham, Switzerland, 2016; pp. 209–
220. https://doi.org/10.1007/978-3-319-50478-0_10.
61. Challa, K.N.R.; Pagolu, V.S.; Panda, G.; Majhi, B. An improved approach for prediction of Parkinson’s disease using machine
learning techniques. In Proceedings of the 2016 International Conference on Signal Processing, Communication, Power and
Embedded System (SCOPES), Odisha, India, 35 October 2016; pp. 14461451. https://doi.org/10.1109/scopes.2016.7955679.
62. Lee, G.S.; Lin, S.H. Changes of rhythm of vocal fundamental frequency in sensorineural hearing loss and in Parkinson’s disease.
Chin. J. Physiol. 2009, 52, 446450.
63. Asmae, O.; Abdelhadi, R.; Bouchaib, C.; Sara, S.; Tajeddine, K. Parkinson’s Disease Identification using KNN and ANN
Algorithms based on Voice Disorder. In Proceedings of the 2020 1st International Conference on Innovative Research in Applied
Science, Engineering and Technology (IRASET), Meknes, Morocco, 16–19 April 2020; Institute of Electrical and Electronics
Engineers (IEEE): New York, NY, USA, 2020; pp. 1–6.
64. Wu, X.; Kumar, V.; Ross Quinlan, J.; Ghosh, J.; Yang, Q.; Motoda, H.; Steinberg, D. Top 10 algorithms in data mining. Knowl. Inf.
Syst. 2008, 14, 1–37.
65. Ray, P.K.; Mohanty, A.; Panigrahi, T. Power quality analysis in solar PV integrated microgrid using independent component
analysis and support vector machine. Optik 2019, 180, 691–698.
66. Lahmiri, S.; Shmuel, A. Detection of Parkinson’s disease based on voice patterns ranking and optimized support vector machine.
Biomed. Signal Process. Control 2018, 49, 427–433. https://doi.org/10.1016/j.bspc.2018.08.029.
67. Bhatia, A.; Sulekh, R. Predictive Model for Parkinson’s disease through Naïve Bayes Classification. Int. J. Comput. Sci. Commun.
2017, 9, 194–202.
68. Rana, A.; Bahuguna, H.; Bijalwan, A. Artificial Neural Network based Diagnosis System. International Journal of Computer
Trends and Technology 2017, 3, 48, 189-191.
69. Alzubaidi, M.S.; Shah, U.; DhiaZubaydi, H.; Dolaat, K.; Abd-Alrazaq, A.A.; Ahmed, A.; Househ, M. The Role of Neural Network
for the Detection of Parkinson’s Disease: A Scoping Review. Healthcare 2021, 9, 740.
70. Sakar, B.E.; Isenkul, M.E.; Sakar, C.O.; Sertbas, A.; Gurgen, F.; Delil, S.; Apaydin, H.; Kursun, O. Collection and Analysis of a
Parkinson Speech Dataset With Multiple Types of Sound Recordings. IEEE J. Biomed. Health Inform. 2013, 17, 828–834.
https://doi.org/10.1109/jbhi.2013.2245674.
71. Vadovský, M.; Paralič, J. Parkinson's disease patients classification based on the speech signals. In Proceedings of the 2017 IEEE
15th International Symposium on Applied Machine Intelligence and Informatics (SAMI), Herlany, Slovakia, 26–28 January 2017;
pp. 000321–000326.
72. Ouhmida, A.; Raihani, A.; Cherradi, B.; Terrada, O. A Novel Approach for Parkinson’s Disease Detection Based on
Voice Classification and Features Selection Techniques. International Journal of Online & Biomedical Engineering 2021, 17(10).
73. Mabrouk, R.; Chikhaoui, B.; Bentabet, L. Machine Learning Based Classification Using Clinical and DaTSCAN SPECT Imaging
Features: A Study on Parkinson’s Disease and SWEDD. IEEE Trans. Radiat. Plasma Med. Sci. 2018, 3, 170–177.
https://doi.org/10.1109/trpms.2018.2877754.
74. Benba, A.; Jilbab, A.; Hammouch, A. Using Human Factor Cepstral Coefficient on Multiple Types of Voice Recordings for
Detecting Patients with Parkinson’s Disease. Irbm 2017, 38, 346–351. https://doi.org/10.1016/j.irbm.2017.10.002.
... Among all these techniques, SMOTE yielded the best results, with an accuracy of 98% for the detection of PD. The research by Arti et al. [12] utilized three different machine learning classifiers, namely SVM, KNN, and Naive Bayes, together with Artificial Neural Networks, to diagnose Parkinson's disease through the examination of speech patterns. To improve the models further, they enriched the dataset using wrapper and filtering techniques. ...
The rapid advancements in artificial intelligence (AI) and data analytics have created significant opportunities in fields such as healthcare and intelligent transportation. As the volume of complex data continues to grow, there is an increasing demand for analytical models capable of extracting meaningful patterns and generating accurate predictions. This study focuses on enhancing Parkinson’s disease (PD) detection by using the Harris Hawk Optimization (HHO) for feature selection to improve classifier performance on the UCI Parkinson's disease dataset. We evaluated four classifiers: Decision Tree (DT), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF), under two scenarios: without feature selection and with HHO-based feature selection. The results reveal substantial performance improvements with HHO, with RF achieving the highest accuracy of 98.33%. Comparisons with recent studies highlight the effectiveness of our approach, establishing it as a new benchmark in PD detection accuracy. This research underscores the essential role of optimized feature selection in enhancing classifier accuracy and reliability, especially for early diagnosis through voice-based data.
... Ali et al. [6] introduced a hybrid framework that combines L1-regularized SVM with deep neural networks to refine feature sets and enhance accuracy. Rana et al. [7] concentrated on voice characteristics for PD diagnosis, demonstrating the effectiveness of algorithms such as Random Forest and Gradient Boosting in managing voice datasets. Gullapalli and Mittal [8] examined speech features and machine learning approaches, highlighting the promise of deep learning for early identification. ...
Parkinson's Disease (PD) is a progressive neurodegenerative disorder that impairs motor function, causing tremors, bradykinesia, and rigidity, and affects millions globally. Early diagnosis is essential for effective treatment yet remains challenging because the symptoms overlap with other conditions and conventional diagnostic methods are limited. This study presents a machine learning diagnostic tool that employs a Support Vector Machine (SVM) classifier for PD prediction from biomedical voice data. The system uses the UCI Parkinson’s dataset, where pre-processing tasks such as feature standardization, a train-test split (80-20 ratio), and Recursive Feature Elimination (RFE) enhance model accuracy by identifying significant features. An easy-to-use Streamlit web application was developed to enable real-time predictions, permitting users to input voice parameters and receive instant diagnostic results. The SVM classifier achieved a precision of 92%, showing its effectiveness in distinguishing PD from non-affected cases. By providing a scalable, cost-effective, and non-invasive approach, this tool bridges advanced computational techniques with real-world healthcare needs. Future enhancements will focus on integrating multimodal data, such as neuroimaging and wearable sensor data, as well as employing deep learning models to improve diagnostic accuracy and expand clinical applicability.
... There is no cure for PD, so patients depend on early detection and personalized treatments to slow the progression of the disease (7) and ensure a better quality of life. In this regard, acoustic data have been used to describe the vocal characteristics of individuals with PD (7), and several works propose the use of machine learning to aid in the diagnosis of PD through the classification of voice signals (7,8,9,10,11). ...
Objective: This study investigates whether the possible bias in oversampling via windowing of gait data from individuals with Parkinson's disease (PD) also occurs in voice signals. A previous study hypothesized that distinct samples from the same individual should not be treated independently, given the risk of biasing the models. Method: We used voice signals from 24 individuals with PD and 8 healthy individuals, and the K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest (RF) algorithms. Cross-validation was performed with leave-one-out (LOOCV), adapted for scenarios with and without bias in the training data. Results: Models evaluated without accounting for the bias showed inflated performance, while the rigorous approach showed more modest results. Conclusion: Samples from the same individual in both training and testing can inflate model performance. Correct application of oversampling is crucial for developing reliable models for PD diagnosis.
... Furthermore, recent years have witnessed a surge in research dedicated to diagnosing PD through voice-based analysis, emphasizing the importance of feature selection methodologies. Various strategies have been employed, including Feature Importance [37], Filter & Wrapper feature selection [39,40], mRMR, ReliefF [5,36], Chi-square [35], PCA [41], NCA, and the generation of high-quality features using deep feature learning through an embedded stack group sparse autoencoder, combined with L1 regularization for fusing the deep features with the original speech features [18]. These methods play a pivotal role in selecting the relevant features from vast datasets in order to enhance the speed, performance, robustness, and reliability of diagnostic tools. ...
This study aims to build a pre-diagnosis tool for predicting Parkinson's disease based on a speech disorder which appears as a symptom in approximately 90 % of people with this disease. Recently, some technologies such as AIoT and IoMT aim to integrate Artificial Intelligence and the Internet of Things or Internet of Medical Things to provide an intelligent remote diagnosis for enhancing medical services. Thus, the classification speed and reliability of the systems in these fields are highly recommended. In this work, we compared five ML algorithms (LR, RF, XGB, SVM, KNN) based on their performance, classification speed and reliability. We employed the sequential forward feature selection in order to select the optimal relevant feature for reducing the dimensionality of the used acoustic dataset to enhance both the performance and computation cost for the proposed system. Furthermore, the stratified cross-validation approach has been used to obtain a fair estimation for the proposed system across each point in the dataset. In this paper, we used a vocal dataset of Parkinson's disease consisting of 195 samples and 22 features. We found that 10 features provide the optimal performance. So, we proposed the K-Nearest Neighbours algorithm as a classifier for our system. It reached 98.46 %, 99.33 % and 98.67 % of the accuracy, sensitivity and precision respectively. Moreover, this work provides a detailed explanation of the employed techniques and the obtained results. The novelty of this work, compared to the existing literature, is to enhance both computation cost and performance for building a real-world system to diagnose Parkinson's disease through speech disorders.
Parkinson’s disease (PD) is a neurodegenerative disease that affects the neural, behavioral, and physiological systems of the brain. This disease is also known as tremor. The common symptoms of this disease are a slowness of movement known as ‘bradykinesia’, loss of automatic movements, speech/writing changes, and difficulty with walking at early stages. To solve these issues and to enhance the diagnostic process of PD, machine learning (ML) algorithms have been implemented for the categorization of subjective disease and healthy controls (HC) with comparable medical appearances. To provide a far-reaching outline of data modalities and artificial intelligence techniques that have been utilized in the analysis and diagnosis of PD, we conducted a literature analysis of research papers published up until 2022. A total of 112 research papers were included in this study, with an examination of their targets, data sources and different types of datasets, ML algorithms, and associated outcomes. The results showed that ML approaches and new biomarkers have a lot of promise for being used in clinical decision-making, resulting in a more systematic and informed diagnosis of PD. In this study, some major challenges were addressed along with a future recommendation.
Parkinson’s disease (PD) is one of the most widespread diseases that primarily affects the motor system of the central nervous system. PD is characterized by tremors, stiffness of the muscles, imprecise gait movements, and vocal impairment. An accurate diagnosis of Parkinson’s disease is usually based on many neurological, psychological, and physical investigations, even though its main symptoms cannot be easily distinguished from those of other diseases. As such, many automatic diagnostic support systems based on machine learning approaches have recently been employed to assist the assessment of PD patients. In the current paper, a comparative analysis was performed on machine learning (ML) techniques for PD identification based on voice disorder analysis. These ML methods included the Support Vector Machine (SVM), K-Nearest-Neighbors (KNN), and Decision Tree (DT) algorithms. In addition, two feature selection techniques, mRMR and ReliefF, are used to further improve the performance of the proposed classifiers. The efficiency of the developed model has been evaluated based on accuracy, sensitivity, specificity, and AUC metrics, and it is higher than existing approaches. The simulation results show that the KNN algorithm yielded the best classifier performance in terms of accuracy and reached an AUC of 98.26%.
Parkinson’s disease (PD) is a neuro-degenerative disorder primarily triggered due to the deterioration of dopamine-producing neurons in the substantia nigra of the human brain. The early detection of Parkinson’s disease can assist in preventing deteriorating health. This paper analyzes human gait signals using Local Binary Pattern (LBP) techniques during feature extraction before classification. Supplementary to the LBP techniques, Local Gradient Pattern (LGP), Local Neighbour Descriptive Pattern (LNDP), and Local Neighbour Gradient Pattern (LNGP) were utilized to extract features from gait signals. The statistical features were derived and analyzed, and the statistical Kruskal–Wallis test was carried out for the selection of an optimal feature set. The classification was then carried out by an Artificial Neural Network (ANN) for the identified feature set. The proposed Symmetrically Weighted Local Neighbour Gradient Pattern (SWLNGP) method achieves a better performance, with 96.28% accuracy, 96.57% sensitivity, and 95.94% specificity. This study suggests that SWLNGP could be an effective feature extraction technique for the recognition of Parkinsonian gait.
Background: Parkinson’s Disease (PD) is a chronic neurodegenerative disorder that has been ranked second after Alzheimer’s disease worldwide. Early diagnosis is crucial for combating PD and allowing patients to deal with it properly. However, there is no medical test available to diagnose PD conclusively. Therefore, computer-aided diagnosis (CAD) systems offer a better solution for making the necessary data-driven decisions and assisting the physician. Numerous studies have proposed CAD systems to diagnose PD in the early stages. No comprehensive reviews have been conducted to summarize the role of AI tools in combating PD. Objective: The study aimed to explore and summarize the applications of neural networks to diagnose PD. Methods: PRISMA Extension for Scoping Reviews (PRISMA-ScR) was followed to conduct this scoping review. To identify the relevant studies, both medical databases (e.g., PubMed) and technical databases (IEEE) were searched. Three reviewers carried out the study selection and extracted the data from the included studies independently. Then, the narrative approach was adopted to synthesize the extracted data. Results: Out of 1061 studies, 91 studies satisfied the eligibility criteria in this review. About half of the included studies implemented artificial neural networks to diagnose PD. Numerous included studies focused on the freezing of gait (FoG). Biomedical voice and signal datasets were the most commonly used data types to develop and validate these models. However, MRI and CT scan images were also utilized in the included studies. Conclusion: Neural networks play an integral and substantial role in combating PD. Many possible applications of neural networks were identified in this review; however, most of them are limited to research purposes.
Biomedical engineers prefer decision forests over traditional decision trees to design state-of-the-art Parkinson’s Detection Systems (PDS) on massive acoustic signal data. However, the challenge that researchers face with decision forests is identifying the minimum number of decision trees required to achieve maximum detection accuracy with the lowest error rate. This article examines two recent decision forest algorithms, Systematically Developed Forest (SysFor) and Decision Forest by Penalizing Attributes (ForestPA), along with the popular Random Forest to design three distinct Parkinson’s detection schemes with an optimum number of decision trees. The proposed approach uses the minimum number of decision trees needed to achieve maximum detection accuracy. The training and testing samples and the density of trees in the forest are kept dynamic and incremental to obtain decision forests with maximum capability for detecting Parkinson’s Disease (PD). The incremental tree densities with dynamic training and testing of decision forests proved to be a better approach for detection of PD. The proposed approaches are examined along with other state-of-the-art classifiers, including modern deep learning techniques, to observe the detection capability. The article also provides a guideline for generating an ideal training and testing split of two modern acoustic datasets of Parkinson’s and control subjects donated by the Department of Neurology in Cerrahpaşa, Istanbul and Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain. Among the three proposed detection schemes, the Forest by Penalizing Attributes (ForestPA) proved to be a promising Parkinson’s disease detector, scoring the highest detection accuracy of 94.12% to 95.00% with a small number of decision trees in the forest.
Freezing of gait (FOG) is a paroxysmal dyskinesia, which is common in patients with advanced Parkinson’s disease (PD). It is an important cause of falls in PD patients and is associated with serious disability. In this study, we implemented a novel FOG detection system using deep learning technology. The system takes multi-channel acceleration signals as input, uses one-dimensional deep convolutional neural network to automatically learn feature representations, and uses recurrent neural network to model the temporal dependencies between feature activations. In order to improve the detection performance, we introduced squeeze-and-excitation blocks and attention mechanism into the system, and used data augmentation to eliminate the impact of imbalanced datasets on model training. Experimental results show that, compared with the previous best results, the sensitivity and specificity obtained in 10-fold cross-validation evaluation were increased by 0.017 and 0.045, respectively, and the equal error rate obtained in leave-one-subject-out cross-validation evaluation was decreased by 1.9%. The time for detection of a 256 data segment is only 0.52 ms. These results indicate that the proposed system has high operating efficiency and excellent detection performance, and is expected to be applied to FOG detection to improve the automation of Parkinson’s disease diagnosis and treatment.
The World Health Organization estimates that 50 million people are currently living with dementia worldwide and that this figure will almost triple by 2050. Current pharmacological treatments are only symptomatic, and drugs or other therapies are ineffective in slowing down or curing the neurodegenerative process underlying dementia. Therefore, early detection of cognitive decline is of the utmost importance for responding effectively and delivering preventive interventions. Recently, researchers showed that speech alterations might be one of the earliest signs of cognitive decline, observable well before other cognitive deficits become manifest. In this article, we propose a fully automated method able to classify the audio files of the subjects according to the progress level of the pathology. In particular, we trained a specific type of artificial neural network, called an autoencoder, using the visual representation of the audio signal of the subjects, that is, the spectrogram. Moreover, we used a data augmentation approach to overcome the problem of the large amount of annotated data usually required during the training phase, which represents one of the major obstacles in deep learning. We evaluated the proposed method using a dataset of 288 audio files from 96 subjects: 48 healthy controls and 48 cognitively impaired participants. The proposed method obtained good classification results compared to the state-of-the-art neuropsychological screening tests and, with an accuracy of 90.57%, outperformed the methods based on manual transcription and annotation of speech.