Fig 1 - available via license: CC BY
General architecture of the proposed AE-DNNs method. DAE, deep autoencoder; DNN, deep neural network; RE, reconstruction error; CHD, coronary heart disease. https://doi.org/10.1371/journal.pone.0225991.g001

Source publication
Article
Coronary heart disease (CHD) is one of the leading causes of death worldwide; if suffering from CHD and being in its end-stage, the most advanced treatments are required, such as heart surgery and heart transplant. Moreover, it is not easy to diagnose CHD at the earlier stage; hospitals diagnose it based on various types of medical tests. Thus, by...

Contexts in source publication

Context 1
... this study, we used two deep AE models, DAE-general and DAE-risky, as shown in Fig 1. The DAE-risky model learns only from the high-risk subset of the training dataset for RE-based feature extraction. ...
Context 2
... data that differ from most of the dataset give a higher RE than common data on the DAE-general model. As shown in Fig 1, we used two independent deep AE models. The DAE-general model is used for data grouping and selection of the CHD risk prediction model. ...
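The grouping idea in Context 2 (unusual records give a higher RE on the general model than common records) can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_reconstruct` is a hypothetical stand-in for a trained autoencoder, the mean-squared form of the RE is an assumption, and the threshold of 0.1 is chosen only for the example.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared difference between an input and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def assign_group(x: Vector,
                 reconstruct: Callable[[Vector], Vector],
                 threshold: float) -> str:
    """Label a record by its RE on the general model (hypothetical rule)."""
    re = reconstruction_error(x, reconstruct(x))
    return "high_re" if re > threshold else "common"


def toy_reconstruct(x: Vector) -> Vector:
    """Stand-in for a trained autoencoder: always returns a 'typical' record."""
    return [0.5] * len(x)


typical = [0.5, 0.4, 0.6]   # close to what the toy model reproduces
unusual = [0.0, 1.0, 1.0]   # far from it, so the RE is larger

print(assign_group(typical, toy_reconstruct, threshold=0.1))
print(assign_group(unusual, toy_reconstruct, threshold=0.1))
```

In a real pipeline the group label would decide which downstream classifier handles the record; how the threshold is set is left open here.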

Similar publications

Article
Background Hereditary hearing loss (HHL) is the most common sensory deficit, which highly afflicts humans. With gene sequencing technology development, more variants will be identified and support genetic diagnoses, which is difficult for human experts to diagnose. This study aims to develop a machine learning-based genetic diagnosis model of HHL-r...

Citations

... In our proposed architecture, there are three distinct layers that make up the generator network. An input of random noise is provided to the first hidden layer, which consists of 21 neurons and an Elu activation function which is given by Equation (2) [34]. The subsequent hidden layer is the second one, and it consists of a total of 24 neurons and an Elu activation function. ...
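For readers unfamiliar with the activation named in this excerpt, the following is a small sketch of the ELU function in its standard form. The citing paper's Equation (2) is not reproduced on this page, so assuming it matches the standard definition, and using `alpha = 1.0`, are both assumptions.

```python
import math


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    # math.expm1 computes exp(x) - 1 accurately for x near zero.
    return x if x > 0 else alpha * math.expm1(x)


for value in (-2.0, -0.5, 0.0, 1.5):
    print(value, elu(value))
```

Positive inputs pass through unchanged, while negative inputs saturate smoothly toward `-alpha`.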
Article
Biomarkers including fasting blood sugar, heart rate, electrocardiogram (ECG), blood pressure, etc. are essential in diagnosing heart disease (HD). Using wearable sensors, these measures are collected and applied as inputs to a deep learning (DL) model for HD diagnosis. However, it is observed that model accuracy weakens when the data gathered are scarce or imbalanced. Therefore, this work proposes two DL-based frameworks, GAN-1D-CNN, and GAN-Bi-LSTM. These frameworks contain: (1) a generative adversarial network (GAN) and (2) a one-dimensional convolutional neural network (1D-CNN) or bi-directional long short-term memory (Bi-LSTM). The GAN model is utilized to augment the small and imbalanced dataset, which is the Cleveland dataset. The 1D-CNN and Bi-LSTM models are then trained using the enlarged dataset to diagnose HD. Unlike previous works, the proposed frameworks increase the dataset first to avoid the prediction bias caused by the limited data. The GAN-1D-CNN achieved 99.1% accuracy, specificity, sensitivity, F1-score, and 100% area under the curve (AUC). Similarly, the GAN-Bi-LSTM obtained 99.3% accuracy, 99.2% specificity, 99.3% sensitivity, 99.2% F1-score, and 100% AUC. Furthermore, the time complexity of the proposed frameworks is investigated with and without principal component analysis (PCA). The PCA method reduced prediction times for 61 samples using GAN-1D-CNN and GAN-Bi-LSTM to 68.8 and 74.8 ms, respectively. These results show that it is reliable to use our frameworks for augmenting limited data and predicting heart disease.
... Recent neural-network research employed a Multi-Layer Perceptron (MLP) trained by backpropagation with a sigmoid activation function, 1000 epochs, a learning rate of 0.25, and 25 neurons, achieving an accuracy of 80.66% in predicting patients with CHD [14]. Moreover, a deep neural network utilizing a reconstruction error (RE) algorithm performed better on the KNHANES dataset, with 86.34%, 91.37%, 82.90%, 86.91%, and 86.66% for accuracy, precision, recall, F-measure, and area under the curve (AUC), respectively [15]. In addition, the neural network model combined with feature correlation analysis (NN-FCA), applied to the KNHANES dataset, produced an area under the ROC curve of 0.749. ...
Conference Paper
Coronary heart disease (CHD), alternatively known as cardiovascular disease (CVD) is the number one cause of death in the world. Each person may develop different symptoms or no symptoms of CHD, which means that they do not know they have CHD until they experience a chest pain, a heart attack or cardiac arrest. These situations may be avoided if we are able to predict the early diagnosis of the heart disease and determine the most important risk factors associated with the disease. Currently, the accuracy of the prediction has remained inadequate and the most important risk factors have remained elusive. This research paper discusses many risk factors associated with the disease and presents the prediction models of coronary heart disease using supervised machine learning algorithms, namely Random Forest, XGBoost algorithms, as well as using Artificial Neural Network (ANN), a deep-learning-based algorithm. It uses the public dataset from the Cleveland database of UCI repository of coronary heart disease patients. The models are further optimized using Grid Search optimization algorithm. The results show that the Random Forest, XGBoost and ANN algorithms have accuracies of 81.11%, 82.22% and 86.67%, respectively. Equally important, the results of the feature importance signify the importance of maximum heart rate and nuclear stress test in predicting the early diagnosis of the disease.
... Nonetheless, the implementation of AI to NGS and genomics has already been shown to accurately predict the consequences of genetic risk factors in CVDs [25,26], show the noncoding-variant effects in CVDs [27,28], find patients with cardiac amyloidosis [29,30], and initiate specific therapies from tumor sequencing [31] by integrating with electronic health records (EHRs) in several academic and medical institutions. Additionally, there are several direct-to-consumer genomics companies that use AI along with WGS and WES; however, to date, these applications have been limited by a lack of transparency in the algorithms they utilize due to their proprietary nature and commercial competition, as well as a lack of a consistent validation cohort, genomic guided clinical trials, and high-quality phenotype data that are consistently encoded and managed (Table 1). ...
Article
Polygenic diseases, which are genetic disorders caused by the combined action of multiple genes, pose unique and significant challenges for the diagnosis and management of affected patients. A major goal of cardiovascular medicine has been to understand how genetic variation leads to the clinical heterogeneity seen in polygenic cardiovascular diseases (CVDs). Recent advances and emerging technologies in artificial intelligence (AI), coupled with the ever-increasing availability of next generation sequencing (NGS) technologies, now provide researchers with unprecedented possibilities for dynamic and complex biological genomic analyses. Combining these technologies may lead to a deeper understanding of heterogeneous polygenic CVDs, better prognostic guidance, and, ultimately, greater personalized medicine. Advances will likely be achieved through increasingly frequent and robust genomic characterization of patients, as well as the integration of genomic data with other clinical data, such as cardiac imaging, coronary angiography, and clinical biomarkers. This review discusses the current opportunities and limitations of genomics; provides a brief overview of AI; and identifies the current applications, limitations, and future directions of AI in genomics.
... In addition to classical machine learning algorithms, deep learning algorithms have also been used in the diagnosis of heart diseases [18]. In one of these studies, coronary artery disease was diagnosed with the help of deep learning algorithms [19]. In that study, a deep autoencoder was used to estimate the reconstruction error, and 86.34% accuracy, 91.37% precision, and 82.90% recall were obtained. ...
Article
Heart diseases rank first among all diseases in terms of mortality. Although there is no definitive cure for the disease, accurate diagnosis positively affects patients' survival time and quality of life. To date, various clinical methods have been used to diagnose heart diseases. Recently, machine learning algorithms have also been used for diagnosis. In this study, a KNN classifier was used to diagnose heart disease. To improve the classification performance of the algorithm, we sought the optimal parameters. The first parameter of the KNN algorithm is the distance method, for which the Manhattan, Euclidean, Chebyshev, and Cosine measures were chosen. The other parameter is the number of neighbors; to determine the most suitable number, the odd numbers between 1 and 15 were tried. The KNN algorithm we used to classify heart diseases was coded and run in the C++ programming language. The UCI Statlog (Heart) dataset was used in the model evaluation stage, and results were obtained based on accuracy and ROC analysis. The highest classification accuracy obtained with the KNN algorithm was 100%, and the highest AUC value was 1.00. This value was obtained with the Chebyshev distance measure and 7 neighbors.
... In other words, instead of directly training classifiers, we remove outliers from the training dataset using the Mahalanobis (Mah) distance. After that, the RF, KNN, XGBoost, DT, and NB algorithms are applied to the prepared training dataset [8][9][10][11]. ...
... This paper implemented popular classification algorithms such as Random Forest (RF), k-nearest neighbors (KNN), XGBoost, Decision Tree (DT), and Naïve Bayes classifier (NB) [11][12][13][14][15][16]. For the compared algorithms, we chose the best input parameter values by varying them until performance improved. ...
... The performance evaluation of this paper was completed using accuracy, AUC, F1-score, and MSE. We can find precision and recall as follows [9,11]: $\mathrm{Precision} = \frac{TP}{TP + FP}$ and $\mathrm{Recall} = \frac{TP}{TP + FN}$ (3) ...
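The precision and recall definitions in Equation (3) can be checked with a short worked example. The confusion-matrix counts below are made up for illustration, and returning 0.0 when a denominator is zero is a convention assumed here, not something stated in the excerpt.

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    denom = tp + fp
    return tp / denom if denom else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN); returns 0.0 when there are no actual positives."""
    denom = tp + fn
    return tp / denom if denom else 0.0


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
tp, fp, fn = 8, 2, 8
print(precision(tp, fp))  # 8 / (8 + 2)
print(recall(tp, fn))     # 8 / (8 + 8)
```

With these counts the classifier is right about most of its positive calls but misses half of the actual positives, which is why the two measures are usually reported together.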
Article
In recent years, the incidence of hypertension has increased dramatically, not only among the elderly but also among young people. Accordingly, the use of machine learning methods to diagnose the causes of hypertension has also increased. In this article, we improve hypertension prediction using Mahalanobis distance-based multivariate outlier removal on Korean national health data from the KNHANES database. The study identified a variety of risk factors associated with chronic hypertension. Chronic disease is often caused by many factors, not just one. Therefore, it is necessary to study the detection of the disease taking into account complex factors. The paper is divided into two modules. The first is the data preprocessing step, which uses tree classifier-based feature selection and removes multivariate outliers from the KNHANES data using Mahalanobis distance. The next module applies the predictive analysis step to detect and predict hypertension. In this study, we compare the accuracy, mean standard error (MSE), F1-score, and area under the ROC curve (AUC) for each classification model. The test results show that the proposed RF-MAH algorithm has accuracy, F1-score, MSE, and AUC outcomes of 99.48%, 99.62%, 0.0025, and 99.61%, respectively. The second-best outcomes, an accuracy of 99.51%, MSE of 0.0028, F1-score of 99.58%, and AUC of 99.65%, were achieved by XGBoost with the MAH model. The proposed method can be used not only for hypertension but also for the detection of various diseases, such as stroke and cardiovascular disease. It is planned to support the identification and decision-making of high-risk patients with various diseases.
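The Mahalanobis-based outlier removal described in this abstract can be sketched for two features in plain Python. This is an illustration only, not the paper's code: the points, the function names, and the cutoff of 3.0 are all hypothetical, and the abstract does not say how the cutoff was chosen.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mahalanobis_sq_2d(points: List[Point]) -> List[float]:
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] with n - 1 denominator.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] inv(S) [dx dy]^T, with the 2x2 inverse written out.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def remove_outliers(points: List[Point], cutoff: float) -> List[Point]:
    """Keep only points whose squared distance does not exceed the cutoff."""
    d2 = mahalanobis_sq_2d(points)
    return [p for p, d in zip(points, d2) if d <= cutoff]


# Five roughly collinear points plus one record far off the trend.
points = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8), (3.0, 9.0)]
cleaned = remove_outliers(points, cutoff=3.0)
print(len(points), len(cleaned))
```

Because the distance accounts for the correlation between features, the off-trend point stands out even though its first coordinate equals the mean. A classifier would then be trained on `cleaned` rather than on the full set.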
... Therefore, we focused on this problem by using distinct predictive models for the regular and biased inputs. In our previous study [20], the proposed method consisted of four deep learning models, including two Stacked Autoencoder (SAE) models and two deep neural network (DNN) models. First, we divided a training dataset into two groups based on their reconstruction errors given from the first SAE model. ...
... As a result, the Statistical-DBN outperformed NB, LR, SVM, RF, and DBN, and its accuracy and AUC reached 83.9% and 79.0%, respectively. The authors of [20] proposed a CHD risk prediction model using Autoencoder and DNN models. The first Autoencoder model was trained on a dataset labeled as risky for feature extraction. ...
... From them, 15,796 records were high risk, and 9,544 records were normal. Risk factors including age, knee joint pain status, waist circumference, neutral fat, BMI, weight change in one-year status, SBP, TC, obesity status, frequency of eating out, HDL, and marital status were used to predict CHD risk [20]. The general descriptions of the risk factors used in the experimental study are shown in TABLE II. ...
Article
This study proposes an efficient prediction method for coronary heart disease risk based on two deep neural networks trained on well-ordered training datasets. Most real datasets include an irregular subset with higher variance than most data, and predictive models do not learn well from these datasets. While most existing prediction models learned from the whole or randomly sampled training datasets, our suggested method draws up training datasets by separating regular and highly biased subsets to build accurate prediction models. We use a two-step approach to prepare the training dataset: (1) divide the initial training dataset into two groups, commonly distributed and highly biased using Principal Component Analysis, (2) enrich the highly biased group by Variational Autoencoders. Then, two deep neural network classifiers learn from the isolated training groups separately. The well-organized training groups make it possible to build more accurate prediction models. When predicting the risk of coronary heart disease from the given input, only one appropriate model is selected based on the reconstruction error on the Principal Component Analysis model. The dataset used in this study was collected from the Korean National Health and Nutritional Examination Survey. We have conducted two types of experiments on the dataset. The first one proved how the Principal Component Analysis and Variational Autoencoder models of the proposed method improve the performance of a single deep neural network. The second experiment compared the proposed method with existing machine learning algorithms, including Naïve Bayes, Random Forest, K-Nearest Neighbor, Decision Tree, Support Vector Machine, and Adaptive Boosting. The experimental results show that the proposed method outperformed conventional machine learning algorithms by giving the accuracy of 0.892, specificity of 0.840, precision of 0.911, recall of 0.920, f-measure of 0.915, and AUC of 0.882.
... Much of the application of ML to echocardiography thus far has concerned automation and accuracy improvement of tasks such as image structure segmentation, left-sided chamber size calculation and estimation of systolic function metrics like ejection fraction (38)(39)(40). Prediction of the development of relevant pathology from images, such as clinically significant coronary artery disease, is also a focus of efforts given the prognostic implications (41). Attention is now increasingly being afforded to diastolic function, given the aforementioned challenges of contemporary assessment methodology, diastolic dysfunction prevalence, and suitability of echocardiographic data for training ML models. ...
Article
Cardiac diastolic dysfunction is prevalent and is a diagnostic criterion for heart failure with preserved ejection fraction—a burgeoning global health issue. As gold-standard invasive haemodynamic assessment of diastolic function is not routinely performed, clinical guidelines advise using echocardiography measures to determine the grade of diastolic function. However, the current process has suboptimal accuracy, regular indeterminate classifications and is susceptible to confounding from comorbidities. Advances in artificial intelligence in recent years have created revolutionary ways to evaluate and integrate large quantities of cardiology data. Imaging is an area of particular strength for the sub-field of machine-learning, with evidence that trained algorithms can accurately discern cardiac structures, reliably estimate chamber volumes, and output systolic function metrics from echocardiographic images. In this review, we present the emerging field of machine-learning based echocardiographic diastolic function assessment. We summarise how machine-learning has made use of diastolic parameters to accurately differentiate pathology, to identify novel phenotypes within diastolic disease, and to grade diastolic function. Perspectives are given about how these innovations could be used to augment clinical practice, whilst areas for future investigation are identified.
... 73.3, and 75.3% for all classes on the OE_PCA_NB model, which is better than the other methods. We provided multi-class ROC [15] curves for each compared model on the experimental dataset in Fig. 3. As mentioned before, we proposed to find a better model performance to predict medium- and high-level classes for the experimental datasets. ...
... In medicine and genomics, numerous studies are conducted using machine learning models [25][26][27][28][29][30][31][32][33][34][35][36][37]. However, research that compares and analyzes anthropometric factors, blood parameters, urinary parameters, and spirometric factor anomalies for diagnosis of MS is lacking. ...
Article
Metabolic syndrome (MS) is an aggregation of coexisting conditions that can indicate an individual’s high risk of major diseases, including cardiovascular disease, stroke, cancer, and type 2 diabetes. We conducted a cross-sectional survey to evaluate potential risk factor indicators by identifying relationships between MS and anthropometric and spirometric factors along with blood parameters among Korean adults. A total of 13,978 subjects were enrolled from the Korea National Health and Nutrition Examination Survey. Statistical analysis was performed using a complex sampling design to represent the entire Korean population. We conducted binary logistic regression analysis to evaluate and compare potential associations of all included factors. We constructed prediction models based on Naïve Bayes and logistic regression algorithms. The performance evaluation of the prediction model improved the accuracy with area under the curve (AUC) and calibration curve. Among all factors, triglyceride exhibited a strong association with MS in both men (odds ratio (OR) = 2.711, 95% confidence interval (CI) [2.328–3.158]) and women (OR = 3.515 [3.042–4.062]). Regarding anthropometric factors, the waist-to-height ratio demonstrated a strong association in men (OR = 1.511 [1.311–1.742]), whereas waist circumference was the strongest indicator in women (OR = 2.847 [2.447–3.313]). Forced expiratory volume in 6s and forced expiratory flow 25–75% strongly associated with MS in both men (OR = 0.822 [0.749–0.903]) and women (OR = 1.150 [1.060–1.246]). Wrapper-based logistic regression prediction model showed the highest predictive power in both men and women (AUC = 0.868 and 0.932, respectively). Our findings revealed that several factors were associated with MS and suggested the potential of employing machine learning models to support the diagnosis of MS.