Liver Disease Diagnosis Using Machine Learning
Manas Minnoor
Department of Computer Science and Information Systems
BITS Pilani KK Birla Goa Campus
Goa, India
manas.minnoor@gmail.com
Veeky Baths
Cognitive Neuroscience Lab
BITS Pilani KK Birla Goa Campus
Goa, India
veeky@goa.bits-pilani.ac.in
Abstract— This paper evaluates the performance of various
supervised machine learning algorithms such as Logistic
Regression, K-Nearest Neighbors (KNN), Extra Trees,
LightGBM as well as a Multilayer Perceptron (MLP) neural
network in the detection and diagnosis of liver disease. Existing
methods for diagnosis tend to be highly invasive and time-
consuming. A lack of qualified experts exacerbates these issues.
Since blood tests, known as liver function tests, are a standard
method of assessing liver health, these models utilize blood markers
such as bilirubin, albumin, alanine aminotransferase (SGPT),
and aspartate aminotransferase (SGOT) to diagnose liver
disease in patients. A total of 11 attributes are used to train the
models. The algorithms are compared using metrics including,
but not limited to, F1 score, accuracy, and precision. The Extra
Trees classifier is shown to provide the highest accuracy of 0.89
as well as an F1 score of 0.88. Thus, it appears to be the best
method for the timely and accurate diagnosis of liver disease
using blood biomarkers. In addition, the usage of machine
learning algorithms alongside human medical expertise may
help drastically reduce errors in clinical diagnosis. This paper
establishes the feasibility of applying machine learning in
various medical fields including the diagnosis of other diseases.
Keywords—Liver disease prediction, machine learning,
multilayer perceptron (MLP), classification, k-nearest neighbors
I. INTRODUCTION
The liver is the largest internal organ in the human body,
in terms of size as well as weight. It accounts for between 2
and 3% of body weight and is located below the diaphragm
[1]. It is responsible for a multitude of functions, ranging from
the synthesis of bile to the breakdown and processing of toxins
in blood [2]. Bile is a secretion essential to human survival, as
it is the primary medium for the expulsion of cholesterol and
other toxic substances that the kidneys are incapable of
processing [3]. From this, we may conclude that the efficient
and comprehensive functioning of the liver is a requisite for a
healthy and productive life.
As an indispensable organ, the liver is a target for a profuse
number of viruses that have evolved to infect it. The most common
of these are the hepatitis viruses, which derive their name
from the condition they cause: inflammation of the liver.
Various non-infectious diseases such as primary biliary
cholangitis (PBC) and nonalcoholic fatty liver disease
(NAFLD) also affect the liver [2].
The approach described in this paper seeks to
comprehensively detect and diagnose liver patients in the
early stages of the disease. Existing methods rely on human
analysis of liver function tests (LFTs) as well as imaging such
as computed tomography (CT) for detecting lesions.
Ultrasonography performs well, with a sensitivity of 84.8%
[4], but is more invasive and thus less preferred. Innovative
techniques such as the RPR, which measures the ratio between
red blood cell (RBC) volume distribution width and platelet
count, remain of questionable accuracy, with an area under the
curve (AUC) of only 0.73 [5]. Routine liver function tests remain
the most common method of diagnosing liver disease, yet they
yield strikingly low accuracies when subjected to human
analysis. Doctors have been shown to arrive at contradictory
conclusions, with reported sensitivities as low as 22% [6].
Considering these revelations, the approach in this paper
seeks to develop an accurate and automated model based on
machine learning to complement human expertise in
diagnosing liver disease. A reliable model would result in
drastically lower missed diagnoses, as well as earlier detection
of these conditions. It would also help overcome the abysmal
doctor-to-patient ratio in many developing countries, including
India. However, many medical professionals remain skeptical
of the reliability of machine learning in diagnosing disease,
and the results presented in this paper attempt to alleviate
these concerns.
This paper aims to benchmark the performance of a
diverse set of machine learning algorithms for the detection of
liver disease. These algorithms include Logistic Regression,
Support Vector Machine (SVM), Random Forest, Multilayer
Perceptron (MLP), K-nearest neighbors, XGBoost Classifier,
Extra Trees, and LightGBM Classifier.
The layout of the further sections of the paper is detailed
below. Section II provides a survey of relevant literature and
a brief overview of their results. Following this, section III
describes the machine learning workflow. Section IV serves
to delineate the data cleaning and preprocessing. Section V
details the process of feature selection for the models. A
concise review of relevant machine learning models may be
found in section VI. Section VII describes the methodology of
the model implementations, while section VIII summarizes
their results. Section IX provides a comparison between
model performances, and section X chronicles the paper’s
conclusion. The final section XI outlines possibilities for
future work.
II. LITERATURE SURVEY
As seen in [8]-[10] and [22], machine learning may be
harnessed to detect cardiovascular disease in an accurate and
reliable manner. Shah et al. [9] use algorithms including, but
not limited to, K-Nearest Neighbours, Decision Tree, and
Random Forest to diagnose heart disease. On comparison of
these models, they conclude that K-nearest neighbours is the
best-suited supervised algorithm for this use case with an
accuracy of over 90%.
A review of [11]-[13] provides evidence for the efficacy
of machine learning in diagnosing Parkinson’s disease as well.
Wroge et al. [12] obtain a maximum accuracy of 85% using
audio recordings and supervised algorithms. This is shown to
be a superior method when compared to the mean accuracy of
experts, which maxes out at 79.6% when patients do not return
for a follow-up appointment.
As shown by [14] and [15], risk assessment for diabetes
mellitus has been successfully conducted using machine
learning algorithms. Lai et al. [14] develop models to assess
the risk of diabetes in Canadian patients. They utilize Logistic
Regression and Gradient Boosting algorithms, obtaining Area
Under ROC (AROC) values of 84% and 8.47% respectively.
Thus, gradient boosting and logistic regression both prove to
be viable methods for disease diagnosis.
As is evidenced by [16]-[21], liver disease detection is
well complemented by machine learning techniques. Ghosh et
al. [16] utilize multiple models including Support Vector
Machine (SVM), XGBoost, Random Forest, and AdaBoost
classifiers. They note that Random Forest outperforms the
remaining models with an accuracy of 83.70%. Rabbi et al.
[18] introduce a less common supervised algorithm, Extra
Trees, and find that a boosted variant of this model delivers
an accuracy of 92.19%. Dandan et al. [21] report the ability to
diagnose diffuse liver disease using a LightGBM classifier,
obtaining an accuracy improvement of 5.4% over the
referenced methods.
Based on the papers reviewed in the literature survey, we
conclude that neural networks such as the MLP [17], along
with other supervised algorithms, can be exploited to perform
liver disease prediction with notable accuracy.
III. MACHINE LEARNING WORKFLOW
We implement a machine learning workflow as described in
Fig. 1.
Fig. 1. Machine learning workflow
The workflow begins with the collection of the requisite data. We
then pre-process the data, which involves replacing null
values and standardizing and normalizing the attributes. Using a
correlation matrix, we carry out feature selection and
identify the relevant features. This is followed by training the
models and validating them, iterating until the best
model is obtained. We then compare the performances of the
final models.
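The following is a minimal sketch of this iterate-and-validate loop, assuming a preprocessed feature matrix X and label vector y; the candidate models and the five-fold split shown here are illustrative choices, not the exact configuration used in this work.

# Illustrative train/validate/compare loop (not the paper's exact configuration).
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(),
    "Extra Trees": ExtraTreesClassifier(),
}

# Score each candidate with 5-fold cross-validation and keep the best performer.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
print(scores, "-> best:", best_name)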
IV. DATA PREPROCESSING
The data is sourced from the Indian Liver Patient Dataset
(ILPD), belonging to the UCI Machine Learning Repository
[7]. The dataset consists of 583 data points, 416 of which are
liver patients and 167 of which are healthy individuals. The
dataset contains a total of 11 attributes, including the diagnosis,
as detailed below:
1) Age: Age of the patient
2) Gender: Assigned sex at birth
3) Total Bilirubin: A byproduct during the breakdown of
Red Blood Cells (RBCs). The value of total bilirubin is
calculated as direct bilirubin + indirect bilirubin.
4) Direct Bilirubin: Also known as conjugated bilirubin,
it is the form of bilirubin that can be expelled from the body.
5) Alkaline phosphatase: An enzyme that plays an
important role in the protection of the intestine.
6) Alanine aminotransferase: A catalyst in the alanine
cycle.
7) Aspartate aminotransferase: A catalyst in the
metabolism of amino acids.
8) Total Protein: A measure of both albumin and
globulin.
9) Albumin: A transport protein also responsible for
regulating colloid osmotic pressure in the blood.
10) Albumin/Globulin Ratio: The ratio between albumin
and globulin values in the blood.
11) Diagnosis: Denotes whether the patient suffers from
a liver condition.
A. Null value imputation
The Albumin/Globulin Ratio attribute has null values, which
we replace with the mean value of the attribute. The remaining
attributes contain no null values. The Gender attribute is
converted to numerical values using one-hot encoding.
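A minimal sketch of these two steps is given below, assuming the ILPD data has been loaded into a pandas DataFrame named df; the file path and column names are assumptions, since the header of the ILPD CSV varies between copies.

import pandas as pd

# Load the ILPD data (path and column names are assumed for illustration).
df = pd.read_csv("indian_liver_patient.csv")

# Replace null Albumin/Globulin ratio values with the attribute mean.
ag = "Albumin_and_Globulin_Ratio"
df[ag] = df[ag].fillna(df[ag].mean())

# Convert the Gender attribute to numerical values via one-hot encoding.
df = pd.get_dummies(df, columns=["Gender"], drop_first=True)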
B. Log Transformation
Multiple attributes in the dataset appear to be right-skewed.
We apply a log transformation to the following columns to
normalize the data: Total Bilirubin, Direct Bilirubin, Alkaline
phosphatase, Alanine aminotransferase, Aspartate
aminotransferase, and Albumin/Globulin Ratio. The
transformation is necessary because we do not wish to filter
outliers out of the dataset. To standardize the data, we utilize a
Robust Scaler. This scaler is preferred because it scales data
without being unduly influenced by outliers: it uses the
interquartile range (IQR) rather than the mean and variance,
which are negatively affected by outliers.
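The transformation and scaling can be sketched as follows, continuing from the DataFrame above; np.log1p is used so that zero-valued entries remain defined, and the column names are again assumptions.

import numpy as np
from sklearn.preprocessing import RobustScaler

skewed = ["Total_Bilirubin", "Direct_Bilirubin", "Alkaline_Phosphatase",
          "Alanine_Aminotransferase", "Aspartate_Aminotransferase",
          "Albumin_and_Globulin_Ratio"]  # assumed column names

# log(1 + x) reduces the right skew without filtering out outliers.
df[skewed] = np.log1p(df[skewed])

# RobustScaler centres on the median and scales by the IQR,
# so extreme values do not dominate the scaling.
feature_cols = [c for c in df.columns if c != "Diagnosis"]  # assumed label name
df[feature_cols] = RobustScaler().fit_transform(df[feature_cols])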
C. Up-sampling
As less than 30% of the instances are healthy individuals, the
dataset is imbalanced. To combat this issue, we up-sample the
minority class (healthy individuals). This prevents the models
from simply predicting the majority class every time. The final
dataset has 832 records after the up-sampling of the minority
class. The up-sampling is done using the resample method of
scikit-learn, which randomly samples the minority class with
replacement until it matches the size of the majority class.
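A sketch of this step with sklearn.utils.resample is given below, assuming the label column Diagnosis holds 1 for liver patients and 0 for healthy individuals (the label assignment described in Section VII).

import pandas as pd
from sklearn.utils import resample

majority = df[df["Diagnosis"] == 1]   # liver patients (majority class)
minority = df[df["Diagnosis"] == 0]   # healthy individuals (minority class)

# Randomly sample the minority class with replacement up to the majority size.
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=42)

df_balanced = pd.concat([majority, minority_up]).sample(frac=1, random_state=42)
print(len(df_balanced))  # 416 + 416 = 832 records, matching the text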
V. FEATURE SELECTION
We investigate the correlation between features to identify
those sufficient for predictions. Linearly dependent attributes
are omitted to avoid misprediction due to multicollinearity.
The original attributes are as follows: Age, Gender, Total
Bilirubin (BR), Direct Bilirubin (Direct_BR), Alkaline
phosphatase (ALP), Alanine aminotransferase (ALT),
Aspartate aminotransferase (AST), Total Protein, Albumin,
Albumin/Globulin Ratio, and Diagnosis. The correlation
matrix represented in the form of a heat map is as shown in
Fig 2.
Fig. 2. Correlation heat map
Based on the correlation matrix, the features dropped from the
dataset are Direct Bilirubin, Albumin, and AST.
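The correlation-based selection can be sketched as follows; the seaborn heat map corresponds to Fig. 2, and the dropped columns mirror those listed above (column names are assumptions).

import seaborn as sns
import matplotlib.pyplot as plt

corr = df_balanced.corr(numeric_only=True)

# Render the correlation matrix as a heat map (cf. Fig. 2).
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm")
plt.tight_layout()
plt.show()

# Drop one attribute from each strongly correlated pair.
df_final = df_balanced.drop(columns=["Direct_Bilirubin", "Albumin",
                                     "Aspartate_Aminotransferase"])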
VI. MACHINE LEARNING MODELS
We seek to investigate the performance of eight different
machine learning algorithms at liver disease detection and
diagnosis. Their characteristics are described below, followed
by a brief instantiation sketch:
1) Logistic Regression Classifier: It models probability
based on the sigmoid function and classifies data by
comparing it to a decision threshold.
2) Support Vector Machine Classifier: It produces a
hyperplane using the provided features which then separates
the plotted data into two distinct classes.
3) Random Forest Classifier: It is an ensemble algorithm
that constructs multiple decision trees at once to classify the
data.
4) Multilayer Perceptron Classifier: It is an artificial
neural network that learns the weights of its neurons by
training on the dataset.
5) K-Nearest Neighbors Classifier: It constructs class
boundaries using the assumption that data points close
together usually belong to the same class.
6) XGBoost Classifier: It is an ensemble algorithm that
combines the results of many base learners.
7) Extra Trees Classifier: It is also an ensemble
algorithm that uses noncorrelated decision trees to classify
data.
8) LightGBM Classifier: It is a lightweight gradient
boosting model.
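For reference, the eight classifiers can be instantiated as sketched below; default hyperparameters are shown because the paper does not report specific tuning, and the XGBoost and LightGBM classifiers come from their own packages rather than scikit-learn.

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier      # xgboost package
from lightgbm import LGBMClassifier    # lightgbm package

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "Random Forest": RandomForestClassifier(),
    "Multilayer Perceptron": MLPClassifier(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "XGBoost": XGBClassifier(),
    "Extra Trees": ExtraTreesClassifier(),
    "LightGBM": LGBMClassifier(),
}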
VII. METHODOLOGY
Eight models are developed and trained on this dataset,
primarily using the scikit-learn toolkit in Python (the XGBoost
and LightGBM classifiers are provided by their respective
libraries). These are Logistic Regression, Support Vector
Machine (SVM), Random Forest, Multilayer Perceptron,
K-Nearest Neighbors, XGBoost, Extra Trees, and LightGBM
classifiers. The final attributes passed to the models for
prediction are Age, Gender, Total Bilirubin, Alkaline
phosphatase, Alanine aminotransferase, Total Protein, and
Albumin/Globulin Ratio. The Diagnosis feature has a value of
'0' for healthy individuals and '1' for liver patients.
The models are evaluated using the test dataset, and their
results are compared based on the following metrics. These
metrics are chosen because they are widely understood and
describe the performance of medical models well. The recall
metric is given extra attention, as discussed later. A short
computation sketch follows the metric definitions.
1) Accuracy: The number of correct predictions divided by
the total number of predictions.
2) Precision: The number of true positives divided by the
number of true positives plus false positives.
3) Recall: The number of true positives divided by the
number of true positives plus false negatives.
4) F1 Score: 2 × Precision × Recall / (Precision + Recall)
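A minimal sketch of the evaluation is given below, assuming the models dictionary from Section VI, the final feature matrix X and label vector y from Section V, and an 80/20 train-test split; the split ratio is an assumption, as the paper does not state it.

from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # assumed split ratio

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, pred):.4f} "
          f"precision={precision_score(y_test, pred):.2f} "
          f"recall={recall_score(y_test, pred):.2f} "
          f"F1={f1_score(y_test, pred):.2f}")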
VIII. MODEL EVALUATION RESULTS
The model results are outlined in this section. The models
are evaluated on a multitude of metrics: accuracy, F1 score,
precision, and recall. We then compare the models further to
identify the best method for diagnosis.
A. Logistic Regression Classifier
The classifier predicts liver disease with a comparatively low
accuracy of 63.94% and an F1 score of 0.57. The overall metrics
are shown in Table I.
TABLE I. LOGISTIC REGRESSION METRICS
Metric       Value
Accuracy     63.94%
Precision    0.70
Recall       0.48
F1 Score     0.57
B. Support Vector Machine Classifier
The classifier predicts liver disease with the lowest
accuracy of 62.02% and an F1 score of 0.52. The overall
metrics are shown in Table II.
TABLE II. SUPPORT VECTOR MACHINE METRICS
Metric       Value
Accuracy     62.02%
Precision    0.69
Recall       0.42
F1 Score     0.52
C. Random Forest Classifier
The classifier predicts liver disease with the second-highest
accuracy of 84.62% and an F1 score of 0.83. The overall
metrics are shown in Table III.
TABLE III. RANDOM FOREST METRICS
Metric       Value
Accuracy     84.62%
Precision    0.90
Recall       0.78
F1 Score     0.83
D. Multilayer Perceptron Classifier
The classifier predicts liver disease with a moderate
accuracy of 65.87% and an F1 score of 0.57. The overall
metrics are shown in Table IV.
TABLE IV. MULTILAYER PERCEPTRON METRICS
Metric       Value
Accuracy     65.87%
Precision    0.75
Recall       0.47
F1 Score     0.57
E. K-Nearest Neighbours Classifier
The classifier predicts liver disease with a promising
accuracy of 72.60% and an F1 score of 0.67. The overall
metrics are shown in Table V.
TABLE V. K-NEAREST NEIGHBORS METRICS
Metric       Value
Accuracy     72.60%
Precision    0.83
Recall       0.56
F1 Score     0.67
F. XGBoost Classifier
The classifier predicts liver disease with a high accuracy
of 83.17% and an F1 score of 0.81. The overall metrics are
shown in Table VI.
TABLE VI. XGBOOST METRICS
Metric       Value
Accuracy     83.17%
Precision    0.91
Recall       0.73
F1 Score     0.81
G. Extra Trees Classifier
The classifier predicts liver disease with the highest
accuracy of 88.94% and an F1 score of 0.88. The overall
metrics are shown in Table VII.
TABLE VII. EXTRA TREES METRICS
Metric       Value
Accuracy     88.94%
Precision    0.93
Recall       0.84
F1 Score     0.88
H. LightGBM Classifier
The classifier predicts liver disease with a reliable
accuracy of 80.29% and an F1 score of 0.77. The overall
metrics are shown in Table VIII.
TABLE VIII. LIGHTGBM METRICS
Metric       Value
Accuracy     80.29%
Precision    0.90
Recall       0.68
F1 Score     0.77
IX. COMPARATIVE STUDY
The results of the comparison between the eight trained
models are presented in this section.
A. Accuracy
A comparison of the accuracies of the eight ML models is
presented below. Table IX details the accuracies of the various
models, while Fig. 3 provides a pictorial representation of the
same.
TABLE IX. COMPARATIVE STUDY: ACCURACY
Model                     Accuracy
Logistic Regression       63.94%
Support Vector Machine    62.02%
Random Forest             84.62%
Multilayer Perceptron     65.87%
K-Nearest Neighbors       72.60%
XGBoost                   83.17%
Extra Trees               88.94%
LightGBM                  80.29%
Fig. 3. Comparison of model accuracies
From the above data, we can conclude that the accuracy of
the Extra Trees classifier is the highest, with the Random Forest
model having the second-highest value. The Support Vector
Machine model has the lowest accuracy. We also see that the
Logistic Regression and Multilayer Perceptron (MLP) models
perform poorly; hence, these algorithms are not well suited to
liver disease diagnosis. The gradient boosting algorithms,
LightGBM and XGBoost, show promising performance with
accuracies higher than 80%.
B. Precision
A comparison of the precision values of the eight ML models
is presented below. Table X details the precision values of the
various models, while Fig. 4 provides a pictorial representation
of the same.
TABLE X. COMPARATIVE STUDY: PRECISION
Model                     Precision
Logistic Regression       0.70
Support Vector Machine    0.69
Random Forest             0.90
Multilayer Perceptron     0.75
K-Nearest Neighbors       0.83
XGBoost                   0.91
Extra Trees               0.93
LightGBM                  0.90
Fig. 4. Comparison of model precision values
From the above data, we can conclude that the precision
of the Extra Trees classifier is again the highest, with the
XGBoost model having the second-highest value. The
Support Vector Machine model has the lowest precision as
well. We also see that the Logistic Regression and Multilayer
Perceptron (MLP) models continue to perform poorly in
comparison to the other algorithms. We note that the high
precision values from some of the algorithms imply that the
quality of diagnosis is high, with fewer false positives.
C. Recall
A comparison of the recall values of the eight ML models is
presented below. Table XI details the recall values of the various
models, while Fig. 5 provides a pictorial representation of the
same.
TABLE XI. COMPARATIVE STUDY: RECALL
Model                     Recall
Logistic Regression       0.48
Support Vector Machine    0.42
Random Forest             0.78
Multilayer Perceptron     0.47
K-Nearest Neighbors       0.56
XGBoost                   0.73
Extra Trees               0.84
LightGBM                  0.68
Fig. 5. Comparison of model recall values
From the above data, we can conclude that the recall of the
Extra Trees classifier is the highest, thus outperforming all
other models in every metric so far. The Random Forest model
has the second-highest value, while the Support Vector Machine
model continues to perform the worst. We may observe,
however, that the average recall value across models is quite low.
Recall is an important metric for medical models, as
missing a positive case has a far more perilous outcome than
falsely predicting a case to be positive. Disease models must
be trained with this in mind and may adopt methods to
maximize recall, such as the threshold adjustment sketched below.
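One hedged illustration of such a recall-oriented adjustment is to lower the decision threshold of a probabilistic classifier; the 0.3 threshold below is purely illustrative and is not a setting used in this paper.

from sklearn.metrics import precision_score, recall_score

# Assumes a fitted probabilistic classifier `clf` and a held-out test set.
proba = clf.predict_proba(X_test)[:, 1]

threshold = 0.3   # illustrative; lowering it trades precision for recall
pred = (proba >= threshold).astype(int)
print("recall:", recall_score(y_test, pred),
      "precision:", precision_score(y_test, pred))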
D. F1 Score
A comparison of the F1 scores of the eight ML models is
presented below. Table XII details the F1 scores of the various
models, while Fig. 6 provides a pictorial representation of the
same.
TABLE XII. COMPARATIVE STUDY: F1 SCORE
Model                     F1 Score
Logistic Regression       0.57
Support Vector Machine    0.52
Random Forest             0.83
Multilayer Perceptron     0.57
K-Nearest Neighbors       0.67
XGBoost                   0.81
Extra Trees               0.88
LightGBM                  0.77
Fig. 6. Comparison of model F1 scores
From the above data, we can conclude that the F1 score of
the Extra Trees classifier is also the highest, establishing its
relative superiority in diagnosing liver disease. The Random
Forest model again has the second-highest value, proving
itself a reliable alternative.
From the tables and graphs, we conclude that the Extra
Trees classifier appears superior to all other studied methods
for liver disease diagnosis, with an accuracy of 88.94% and an
F1 score of 0.88, while the Logistic Regression, Multilayer
Perceptron (MLP), and Support Vector Machine (SVM)
classifiers perform poorly, with accuracies below 70%.
X. CONCLUSION
The machine learning models developed show promising
results in the diagnosis of liver disease. In particular, the Extra
Trees classifier provides the best accuracy, at 88.94%, with an
F1 score of 0.88. These models are thus well suited to
complementing human expertise in this field. Doctors may use
such models to automate as well as confirm diagnoses, reducing
human error and therefore the number of missed cases. Early
detection of liver disease leads to a better prognosis in most
cases, and hence it is of utmost importance that machine
learning be widely utilized for detection.
XI. FUTURE WORK
Future developments may include wider diversity in data,
as the current dataset is geographically biased. To ensure the
models work in more general use cases, more training data is
necessary. This may be achieved by reaching out to hospitals
in various regions for more comprehensive data collection.
Deep learning models developed in tandem with medical
imaging such as computed tomography (CT) may provide an
alternate approach to the diagnosis of some liver conditions as
well. This would provide an additional confirmation, thus
improving the reliability of automated detection of liver
disease.
ACKNOWLEDGMENT
No funding to declare.
REFERENCES
[1] S. R. Z. Abdel-Misih and M. Bloomston, “Liver Anatomy,” Surg Clin
North Am, vol. 90, no. 4, pp. 643–653, Aug. 2010, doi:
10.1016/j.suc.2010.04.017.
[2] A. Kalra, E. Yetiskul, C. J. Wehrle, and F. Tuma, “Physiology, Liver,”
in StatPearls, Treasure Island (FL): StatPearls Publishing, 2022.
Accessed: Jan. 30, 2022. [Online]. Available:
http://www.ncbi.nlm.nih.gov/books/NBK535438/
[3] J. L. Boyer, “Bile Formation and Secretion,” Compr Physiol, vol. 3,
no. 3, pp. 1035–1078, Jul. 2013, doi: 10.1002/cphy.c120027.
[4] R. Hernaez et al., “Diagnostic accuracy and reliability of
ultrasonography for the detection of fatty liver: A meta-analysis,”
Hepatology, vol. 54, no. 3, pp. 1082–1090, 2011, doi:
10.1002/hep.24452.
[5] Y. Cai et al., “Diagnostic accuracy of red blood cell distribution width
to platelet ratio for predicting staging liver fibrosis in chronic liver
disease patients,” Medicine (Baltimore), vol. 98, no. 14, p. e15096,
Apr. 2019, doi: 10.1097/MD.0000000000015096.
[6] K. P. Lindvig et al., “Diagnostic accuracy of routine liver function tests
to identify patients with significant and advanced alcohol-related liver
fibrosis,” Scandinavian Journal of Gastroenterology, vol. 56, no. 9, pp.
1088–1095, Sep. 2021, doi: 10.1080/00365521.2021.1929450.
[7] “UCI Machine Learning Repository: ILPD (Indian Liver Patient
Dataset) Data Set.”
https://archive.ics.uci.edu/ml/datasets/ILPD+(Indian+Liver+Patient+
Dataset) (accessed Jan. 30, 2022).
[8] V. V. Ramalingam, A. Dandapath, and M. K. Raja, “Heart disease
prediction using machine learning techniques: a survey," International
Journal of Engineering & Technology, vol. 7, no. 2.8, Art. no. 2.8, Mar.
2018, doi: 10.14419/ijet.v7i2.8.10557.
[9] D. Shah, S. Patel, and S. K. Bharti, “Heart Disease Prediction using
Machine Learning Techniques,” SN COMPUT. SCI., vol. 1, no. 6, p.
345, Oct. 2020, doi: 10.1007/s42979-020-00365-y.
[10] J. Patel, S. Tejalupadhyay, and S. Patel, Heart Disease prediction using
Machine learning and Data Mining Technique. 2016. doi:
10.090592/IJCSC.2016.018.
[11] E. Avuçlu and A. Elen, “Evaluation of train and test performance of
machine learning algorithms and Parkinson diagnosis with statistical
measurements,” Med Biol Eng Comput, vol. 58, no. 11, pp. 2775–
2788, Nov. 2020, doi: 10.1007/s11517-020-02260-3.
[12] T. J. Wroge, Y. Özkanca, C. Demiroglu, D. Si, D. C. Atkins, and R. H.
Ghomi, “Parkinson’s Disease Diagnosis Using Machine Learning and
Voice,” in 2018 IEEE Signal Processing in Medicine and Biology
Symposium (SPMB), Dec. 2018, pp. 1–7. doi:
10.1109/SPMB.2018.8615607.
[13] W. Wang, J. Lee, F. Harrou, and Y. Sun, “Early Detection of
Parkinson’s Disease Using Deep Learning and Machine Learning,”
IEEE Access, vol. 8, pp. 147635–147646, 2020, doi:
10.1109/ACCESS.2020.3016062.
[14] H. Lai, H. Huang, K. Keshavjee, A. Guergachi, and X. Gao, “Predictive
models for diabetes mellitus using machine learning techniques,” BMC
Endocrine Disorders, vol. 19, no. 1, p. 101, Oct. 2019, doi:
10.1186/s12902-019-0436-6.
[15] L. Zhang, Y. Wang, M. Niu, C. Wang, and Z. Wang, “Machine
learning for characterizing risk of type 2 diabetes mellitus in a rural
Chinese population: the Henan Rural Cohort Study,” Sci Rep, vol. 10,
no. 1, Art. no. 1, Mar. 2020, doi: 10.1038/s41598-020-61123-x.
[16] M. Ghosh et al., “A Comparative Analysis of Machine Learning
Algorithms to Predict Liver Disease,” Intelligent Automation & Soft
Computing, vol. 30, no. 3, pp. 917–928, 2021, doi:
10.32604/iasc.2021.017989.
[17] X. Pei, Q. Deng, Z. Liu, X. Yan, and W. Sun, “Machine Learning
Algorithms for Predicting Fatty Liver Disease,” ANM, vol. 77, no. 1,
pp. 38–45, 2021, doi: 10.1159/000513654.
[18] Md. F. Rabbi, S. M. Mahedy Hasan, A. I. Champa, Md. Asif Zaman,
and Md. K. Hasan, “Prediction of Liver Disorders using Machine
Learning Algorithms: A Comparative Study,” in 2020 2nd
International Conference on Advanced Information and
Communication Technology (ICAICT), Nov. 2020, pp. 111–116. doi:
10.1109/ICAICT51780.2020.9333528.
[19] N. Atabaki-Pasdar et al., “Predicting and elucidating the etiology of
fatty liver disease: A machine learning modeling and validation study
in the IMI DIRECT cohorts,” PLOS Medicine, vol. 17, no. 6, p.
e1003149, Jun. 2020, doi: 10.1371/journal.pmed.1003149.
[20] J. Singh, S. Bagga, and R. Kaur, “Software-based Prediction of Liver
Disease with Feature Selection and Classification Techniques,”
Procedia Computer Science, vol. 167, pp. 1970–1980, Jan. 2020, doi:
10.1016/j.procs.2020.03.226.
[21] L. Dandan, M. Huanhuan, L. Xiang, J. Yu, J. Jing, and S. Yi,
“Classification of diffuse liver diseases based on ultrasound images
with multimodal features,” in 2019 IEEE International Instrumentation
and Measurement Technology Conference (I2MTC), May 2019, pp. 1–
5. doi: 10.1109/I2MTC.2019.8827174.
[22] D. Khanna, R. Sahu, V. Baths, and B. Deshpande, “Comparative Study
of Classification Techniques (SVM, Logistic Regression and Neural
Networks) to Predict the Prevalence of Heart Disease,” IJMLC, vol. 5,
no. 5, pp. 414–419, Oct. 2015, doi: 10.7763/IJMLC.2015.V5.544.