Conference Paper

Clinical resource effective classification of remote complications of gestational diabetes



Gestational diabetes mellitus (GDM) increases a woman's risk of developing diabetes mellitus (DM). GDM is defined as glucose intolerance that is first diagnosed during pregnancy, and it can lead to complications for both mother and baby during pregnancy. Limited data are available on biomarkers and methods that indicate the development of diabetes in the long term after GDM is diagnosed. The aim of this study was to build an objective method for evaluating remote GDM complications and to find the most informative indicators of developing DM. Clinical data more than 15 years old, including demographic, lifestyle, clinical, genetic and pregnancy-related data, were collected; these data carry valuable information about complications that may develop. Patients were clinically tested repeatedly over the years to see whether DM or abnormal metabolism had developed after pregnancy. The research steps were preprocessing the data to evaluate missing values, finding the most informative attributes, and testing standard classification algorithms. Initially, attributes and records with a large number of missing values were rejected. Only a small percentage of values (2.04%) was imputed, using regression-based methods. This approach ensures a minimal impact on the final error. The data set was divided into two classes (1-healthy; 2-metabolic abnormal) and into three classes (1-healthy; 2-metabolic abnormal; 3-diabetes mellitus) for different test scenarios. In the two-class test scenario, the best-performing algorithm was logistic regression, with an accuracy of 78.54% and AUC = 0.85. The number of selected most informative attributes was reduced to 15. In the three-class scenario, the best-performing algorithm was the Hoeffding tree, with a classification accuracy of 63.57% and AUC = 0.77, using the 8 most informative attributes out of 44. A novel attribute ranking method was applied, based on majority voting across all tested algorithms: the rank of an attribute depends on how frequently it is selected by the standard classification algorithms.
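The frequency-based ranking described above can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the algorithm labels and the attribute names are all hypothetical, and the sketch assumes that each algorithm simply reports the set of attributes it selected and that frequency is the share of algorithms selecting an attribute.

```python
from collections import Counter


def rank_by_selection_frequency(selections):
    """Rank attributes by the share of algorithms that selected them.

    `selections` maps a (hypothetical) algorithm label to the attributes
    that algorithm selected. Returns (attribute, frequency) pairs sorted
    by descending frequency, with ties broken alphabetically.
    """
    n_algorithms = len(selections)
    if n_algorithms == 0:
        return []
    counts = Counter()
    for attributes in selections.values():
        # Each algorithm votes at most once per attribute.
        counts.update(set(attributes))
    ranked = [(attr, count / n_algorithms) for attr, count in counts.items()]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


def keep_above(ranked, threshold):
    """Keep attributes whose selection frequency exceeds `threshold`."""
    return [attr for attr, freq in ranked if freq > threshold]


if __name__ == "__main__":
    # Hypothetical selections, for illustration only (not study data).
    example = {
        "algorithm_1": ["attr_a", "attr_b"],
        "algorithm_2": ["attr_a"],
        "algorithm_3": ["attr_a", "attr_c"],
        "algorithm_4": ["attr_b"],
    }
    ranking = rank_by_selection_frequency(example)
    print(ranking)
    print(keep_above(ranking, 0.3))
```

The cut-off passed to `keep_above` is a free parameter in this sketch. Choosing it would require evaluating a classifier on each candidate attribute subset.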
To find the optimal number of attributes from the frequency list, a voting-based classification meta-algorithm was constructed. It combines the best-performing algorithms from different groups, and the final result is obtained using the average-of-probabilities rule. The analysis showed that attributes with a frequency value above 20% give the best classification results. The leading attributes in the two-class test scenario were income level, sugar consumption, the genetic factors rs7903146_CC and rs12255372_GG, and increased consumption of wheat products. In the three-class scenario top
