p-Values corresponding to all the genes in the training set for the dataset "human skeletal muscle-type II diabetes".

Source publication
Article
Type II diabetes is a chronic condition that affects the way the body metabolizes sugar, an important source of fuel for the body, and it is becoming increasingly common all over the world. It is now necessary to identify new potential drug targets that can not only control the disease but also treat it. Support vector machines are the clas...

Citations

... They have used T2D-associated genes, ATAC sequences, T2D variants, and mitochondrial DNA (mtDNA) sequences in their studies. In addition, in other studies in the literature, deep learning-based models [12][13][14][15], pathway analysis [16], CNN models, statistical analysis, the Support Vector Machine Recursive Feature Elimination (SVM-RFE) approach [17], and some machine learning methods have been used to diagnose T2D from genomic signals [19][20][21][22][23]. Moreover, ensemble-based methods have been used for the prediction of diabetes [24][25][26][27][28]. Table 1 lists current studies for the detection of T2D-related genes. ...
Article
In Genome-Wide Association Studies (GWAS), detection of T2D-related variants in genome sequences and accurate modeling of the complex structure of the relevant gene are of great importance for the diagnosis of diabetes. For this purpose, this paper presents a novel algorithm to accurately and effectively identify Type 2 Diabetes (T2D) risk variants. The proposed algorithm consists of five stages. The first stage is to collect T2D-associated DNA sequences and to digitize them with the entropy-based technique. The second stage is to transform these digitized DNA sequences into 224 × 224 pixel spectrum images. The third is to extract a distinctive feature set from these spectrum images using the ResNet and VGG19 architectures. The fourth is to classify the feature set using SVM and k-NN methods. The last stage is to evaluate the system with k-fold cross-validation. Using the developed algorithm, the performances of the Convolutional Neural Network (CNN) methods, the entropy-based technique, and the classifiers were compared. The combination of the proposed entropy-based technique, ResNet, and Support Vector Machine (SVM) achieved the highest accuracy, 99.09%. With this study, the performance of the system in the extraction of epigenetic features and prediction of T2D from spectrogram images was investigated. The results show that the system will contribute to the identification of all genes in diabetes-related tissue and to studies on new drug targets.
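The last stage named in this abstract, k-fold cross-validation, can be illustrated with a minimal sketch. It only shows how sample indices might be split into folds. The function name `k_fold_indices`, the shuffling seed, and the choice of 10 samples and 3 folds are assumptions made for illustration, since the abstract does not state the value of k or any implementation detail.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: the abstract does not say how folds were built.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Example: 10 samples split into 3 folds (both numbers are illustrative).
folds = list(k_fold_indices(10, 3))
```

Each sample lands in exactly one test fold, so every feature vector is used for evaluation once and for training in the remaining folds.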
... Reference [21] proposed an SVM-RFE model by modifying SVM. It ranks the genes of the data on the basis of discriminatory power, and the genes not participating are removed. ...
Article
At present, the prevalence of diabetes is increasing because the human body cannot properly metabolize glucose. Accurate prediction of diabetes patients is an important research area. Many researchers have proposed techniques to predict this disease through data mining and machine learning methods. In prediction, feature selection is a key concept in preprocessing: only the features that are relevant to the disease are used for prediction, which improves the prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are concentrating on it to produce a predictive model with high accuracy. In this work, a wrapper-based feature selection method called recursive feature elimination is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm for overcoming the overfitting problem of the data set. Overfitting is a major problem in feature selection, where the model fits new data poorly because the training data set is small. Ridge regression is mainly used to overcome the overfitting problem. The features are selected by using the proposed feature selection method, and a random forest classifier is used to classify the data on the basis of the selected features. This work uses the Pima Indians Diabetes data set, and the evaluated results are compared with the existing algorithms to demonstrate the accuracy of the proposed algorithm. The accuracy of the proposed algorithm in predicting diabetes is 100%, and its area under the curve is 97%. The proposed algorithm outperforms existing algorithms.
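For readers unfamiliar with ridge regression, the textbook form of its objective and closed-form solution is given below. This is the standard formulation only; the abstract does not say which variant the authors used or how it is coupled to recursive feature elimination. The symbols X (feature matrix), y (targets), w (coefficients), and λ (penalty strength) are notation introduced here.

```latex
\hat{w} \;=\; \arg\min_{w} \; \lVert y - Xw \rVert_2^2 \;+\; \lambda \lVert w \rVert_2^2,
\qquad
\hat{w} \;=\; \left( X^\top X + \lambda I \right)^{-1} X^\top y,
\quad \lambda > 0 .
```

The penalty term shrinks the coefficients toward zero, which is what limits overfitting on small training sets. In an elimination scheme, the magnitudes of the fitted coefficients could plausibly serve as the ranking score, although the abstract does not confirm this.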
... Kumar et al. [29] developed an SVM-based technique to classify the most discriminatory gene target for diabetes mellitus. Barkana et al. [30] analyzed the performance of descriptive statistical features for retinal vessel segmentation in diabetic retinopathy, a complication of diabetes mellitus. ...
... The proposed work obtained higher accuracy than the others. Paper [13] identified insulin resistance using non-invasive machine learning approaches. The work was evaluated on the CALERIE data set with 18 parameters such as age, gender, and height. The attributes chosen by feature selection were given as input to classification algorithms such as logistic regression, CART, SVM, LDA, and KNN; the analysis showed a high accuracy of 97% in identifying insulin resistance when using logistic regression and SVM. Paper [14] proposes an SVM-RFE model by modifying SVM. It ranks the genes in the data by discriminatory power and removes the genes that do not contribute. ...
Preprint
Diabetes is becoming increasingly common because the body cannot properly metabolize glucose. Accurately predicting which patients have diabetes is an important research area, and many researchers have proposed techniques to predict the disease through data mining and machine learning methods. In prediction, feature selection is one of the key concepts in preprocessing, so that only the features relevant to the disease are used for prediction. This improves the prediction accuracy. Selecting the right features from the whole feature set is a complicated process, and many researchers are concentrating on it to produce predictive models with high accuracy. In this work, the wrapper-based feature selection method called Recursive Feature Elimination (RFE) is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that overcomes the overfitting problem of the data set. Overfitting is a major problem in feature selection: the model fits new data poorly because the training data set is small. Ridge regression is mainly used to overcome the overfitting problem. Once the features are selected using the proposed feature selection method, a random forest classifier is used to classify the data based on the selected features. The proposed work is evaluated on the PIDD data set, and the results are compared with existing algorithms to demonstrate the accuracy of the proposed algorithm. The results show that the proposed algorithm predicts diabetes with higher accuracy than the existing algorithms.
... DNNs have also been trained to successfully predict diabetic retinopathy based on retinal fundus images [24][25][26][27][28][29]. Furthermore, previous studies have established positive impacts from support vector machine-recursive feature elimination (SVM-RFE) as the feature selection algorithm on improving classification accuracy [30][31][32][33][34][35][36][37], especially for DNNs [38,39]. ...
... The algorithm calculates a rank score and eliminates the lowest-ranking features. Previous studies showed significant performance improvements by employing RFE, including in predicting mental states (brain activity) [31,32], Parkinson's disease [33], skin disease [34], autism [35], Alzheimer's disease [36], and T2D [37]. They showed that SVM-RFE outperformed several comparison methods. ...
... To the best of our knowledge, only Kumar et al. [37] considered RFE for diabetes prediction. Kumar et al. used SVM-RFE to identify the most discriminatory gene target for T2D. ...
Article
Extracting information from individual risk factors provides an effective way to identify diabetes risk and associated complications, such as retinopathy, at an early stage. Deep learning and machine learning algorithms are being utilized to extract information from individual risk factors to improve early-stage diagnosis. This study proposes a deep neural network (DNN) combined with recursive feature elimination (RFE) to provide early prediction of diabetic retinopathy (DR) based on individual risk factors. The proposed model uses RFE to remove irrelevant features and DNN to classify the diseases. A publicly available dataset was utilized to predict DR during initial stages, for the proposed and several current best-practice models. The proposed model achieved 82.033% prediction accuracy, which was a significantly better performance than the current models. Thus, important risk factors for retinopathy can be successfully extracted using RFE. In addition, to evaluate the proposed prediction model robustness and generalization, we compared it with other machine learning models and datasets (nephropathy and hypertension–diabetes). The proposed prediction model will help improve early-stage retinopathy diagnosis based on individual risk factors.
... It can remove insignificant features in order to achieve higher classification performance. SVM-RFE was first proposed for gene selection [39], and has been widely applied to many other fields [40][41][42]. ...
Article
Functional connectivity derived from functional magnetic resonance imaging (fMRI) is used as an effective way to assess brain architecture. There has been a growing interest in its application to the study of intrinsic connectivity networks (ICNs) during different brain development stages. fMRI data are of high dimension but small sample size, and it is crucial to perform dimension reduction before pattern analysis of ICNs. Feature selection is thus used to reduce redundancy, lower the complexity of learning, and enhance the interpretability. To study the varying patterns of ICNs in different brain development stages, we propose a two-step feature selection method. First, an improved support vector machine based recursive feature elimination method is utilized to study the differences of connectivity during development. To further reduce the highly correlated features, a combination of F-score and correlation score is applied. This method was then applied to analysis of the Philadelphia Neurodevelopmental Cohort (PNC) data. The two-step feature selection was randomly performed 20 times, and those features that showed up consistently in the experiments were chosen as the essential ICN differences between different brain ages. Our results indicate that ICN differences exist in brain development, and they are related to task control, cognition, information processing, attention, and other brain functions. In particular, compared with children, young adults exhibit increasing functional connectivity in the sensory/somatomotor network, cingulo-opercular task control network, visual network, and some other subnetworks. In addition, the connectivity in young adults decreases between the default mode network and other subnetworks such as the fronto-parietal task control network. The results are coincident with the fact that the connectivity within the brain alters from segregation to integration as an individual grows.
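The abstract mentions an F-score as part of its second selection step but does not define it. The sketch below uses one common definition for two-class data (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances). Treat the formula, the function name `f_score`, and the toy values as assumptions, not as the authors' exact procedure.

```python
from statistics import mean, variance


def f_score(pos_values, neg_values):
    """F-score of a single feature for a two-class problem.

    pos_values / neg_values hold the feature's values in each class.
    Each class needs at least two values for the sample variance.
    """
    overall_mean = mean(pos_values + neg_values)
    pos_mean, neg_mean = mean(pos_values), mean(neg_values)
    # Between-class separation of the means.
    between = (pos_mean - overall_mean) ** 2 + (neg_mean - overall_mean) ** 2
    # Within-class spread (sample variances, n - 1 in the denominator).
    within = variance(pos_values) + variance(neg_values)
    return between / within if within > 0 else float("inf")


# Toy data: the first feature separates the classes, the second does not.
separating = f_score([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
uninformative = f_score([1.0, 5.0, 3.0], [2.0, 4.0, 3.0])
```

A larger score indicates a feature whose class means are far apart relative to its within-class spread, which is why such a score can be used to filter features that survive an earlier elimination step.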
... Thus, features that contribute the most to discriminating the two classes are represented by |ω| with the highest values, and features with small scores are generally considered as noise, redundant or irrelevant to the problem. Therefore, eliminating features with smaller scores does not bring about great changes of the optimization problem, which is the essence of the algorithm [37,38]. The SVMRFE algorithm is briefly described as below. ...
Article
Medical imaging data, especially magnetic resonance imaging (MRI) data, usually have a small sample size but a large number of features. Effectively reducing the data dimension and accurately locating biomarkers in such data are crucial for diagnosis and for precision medicine. In this paper, we propose a hybrid feature selection method based on machine learning and traditional statistical approaches and explore the brain abnormalities of schizophrenia by using functional and structural MRI data. The results show that the abnormal brain regions are mainly distributed in the supramarginal gyrus, cingulate gyrus, frontal gyrus, precuneus and caudate, and the abnormal functional connections are related to the caudate nucleus, insula and rolandic operculum. In addition, complex network analyses based on graph theory are applied to the functional connection data, and the results demonstrate that the located abnormal functional connections in the brain can distinguish schizophrenia patients from healthy controls. The abnormalities identified by the proposed hybrid feature selection method show that schizophrenia involves abnormal brain regions and abnormal disruption of network segregation and network integration; these changes may lead to inaccurate and inefficient information processing and synthesis in the brain, which provides further evidence for the cognitive dysmetria of schizophrenia.
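The citation context for this article describes the core of SVM-RFE: features with large |ω| contribute most to separating the two classes, and the feature with the smallest score is eliminated at each round. The following is a minimal sketch of that loop, not the implementation of any cited paper. It assumes a linear SVM without a bias term, trained by plain subgradient descent on the hinge loss, and uses ω² as the ranking score. The helper names, the hyperparameters, and the toy data are all illustrative.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights w of a bias-free linear SVM by subgradient descent.

    Minimizes lam/2 * ||w||^2 + mean hinge loss; labels must be +1 or -1.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        # Drop the feature whose squared weight is smallest in this round.
        worst = min(range(len(remaining)), key=lambda j: w[j] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining + eliminated[::-1]


# Toy data: feature 0 is strongly informative, feature 1 weakly, feature 2 is noise.
X = [[2.0, 0.5, -0.2], [1.5, 0.3, 0.1], [2.5, 0.4, 0.2], [1.8, 0.6, -0.1],
     [-2.0, -0.4, 0.2], [-1.5, -0.6, -0.1], [-2.5, -0.3, 0.1], [-1.8, -0.5, -0.2]]
y = [1, 1, 1, 1, -1, -1, -1, -1]
ranking = svm_rfe(X, y)
```

Retraining after every elimination is the point of the recursive scheme: the weights of the surviving features are re-estimated without the removed feature, so the ranking is not fixed by a single fit.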
Article
It has been demonstrated that melatonin influences the developmental competence of both in vivo and in vitro matured oocytes. It modulates oocyte-specific gene expression patterns among mammalian species. Due to differences among study systems, the identification of classifier orthologs (homologous genes related among mammals that could universally categorize oocytes matured in environments with varied melatonin levels) remains little studied. To gain insight into such orthologs, a cross-species transcription profiling meta-analysis of in vitro matured bovine oocytes and in vivo matured human oocytes in low and high melatonin environments was performed in the current study. RNA-Seq data of bovine and human oocytes were retrieved from the Sequence Read Archive database and pre-processed. Datasets of bovine oocytes cultured in the absence of melatonin and of human oocytes from old patients were regarded as oocytes in the low melatonin environment (Low). Datasets of bovine oocytes cultured in 10⁻⁹ M melatonin and of human oocytes from young patients were considered as oocytes in the high melatonin environment (High). Candidate orthologs differentially expressed between Low and High melatonin environments were selected by a linear model and were further verified by zero-inflated regression analysis. A Support Vector Machine (SVM) was applied to determine the potential of the verified orthologs as classifiers of melatonin environments. Linear model analysis identified 284 candidate orthologs differentially expressed between Low and High melatonin environments. Among them, only 15 candidate orthologs were verified by zero-inflated regression analysis (FDR ≤ 0.05). Using the verified orthologs as classifiers in the SVM resulted in precise classification of the oocyte learning datasets according to their melatonin environments (misclassification rates < 0.18, areas under the curve > 0.9).
In conclusion, a cross-species RNA-Seq meta-analysis to identify novel classifier orthologs of matured oocytes under different melatonin environments was successfully demonstrated in this study, delivering candidate orthologs for future studies at the biological level. Such verified orthologs might provide valuable evidence about melatonin sufficiency in target oocytes, which could inform the decision on melatonin supplementation.
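The two evaluation measures quoted in this abstract, misclassification rate and area under the curve, can be computed as in the sketch below. The labels (1 for High, 0 for Low), the scores, and the 0.5 decision threshold are made-up illustration values, not data from the study. The AUC is computed here with the pairwise ranking definition, which is only one of several equivalent ways to obtain it.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the share of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = High, 0 = Low) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

error = misclassification_rate(y_true, y_pred)
area = auc(y_true, scores)
```

The misclassification rate depends on the chosen threshold, while the AUC summarizes ranking quality over all thresholds, which is why the two figures are usually reported together.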