Article

Differential Diagnostics of Thalassemia Minor by Artificial Neural Networks Model

Wiley
Journal of Clinical Laboratory Analysis

Abstract

Current methods for diagnosing thalassemia minor (TM) require high-cost assays, while broader screening based on routine blood counts has limited specificity and sensitivity. This study developed a new screening technique for diagnosing TM. The study used a database of 526 patients that included 185 verified α- and β-TM cases; the control group consisted of patients with iron-deficiency anemia (IDA) or myelodysplastic syndrome (MDS) and healthy individuals. More than 1,500 artificial neural network (ANN) models were created, and the networks that gave high accuracy were selected for the study. TM patients were identified from the general database using the best-optimized ANNs. A comparison of models using three or six routine blood count parameters showed slightly higher accuracy for the three-parameter scheme, which comprised mean corpuscular volume, red blood cell distribution width, and red blood cell count. Based on these parameters, we were able to separate TM patients from the control group and MDS group with a specificity of 0.967 and a sensitivity of 1. Including IDA patients in the comparison gave lower but still very good values: a specificity of 0.968 and a sensitivity of 0.9. ANN-based TM diagnostics should be used for broad automatic screening of the general population prior to diagnosis with high-cost tests.
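The sensitivity and specificity figures quoted in the abstract follow from a confusion matrix. The sketch below shows that calculation for a binary TM / non-TM screen; the function name and the example labels are illustrative assumptions, not data or code from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Convention assumed here: 1 = thalassemia minor (TM), 0 = non-TM.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # TM correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # TM missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-TM correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-TM wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels for illustration only (not the study's patients).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

For a screening test, a sensitivity of 1 means no TM case in the evaluated set was missed, which is why the abstract emphasizes it for the comparison without IDA patients.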


... The most individually studied diseases were thalassemia [20,28,29,47], Down syndrome [21,30,34], cystic fibrosis [46,48,52], Marfan syndrome [25,27] and Huntington disease [19,33]. ...
... The latter part could be either machine learning models [28,29,37,41,45] or data-driven processes for feature weighting or dimension reduction to create scores [25,27,39]. Two studies [73,78] used a combination of similarity calculation between the patient's phenotype concepts and the knowledge-based description of a disease on the one hand and machine learning on the other hand. ...
... Another type of combination consisted of combining similarity between patients and text-mined literature on the one hand and similarity metrics among patients on the other hand [75]. In these frameworks, deep learning was used in 4 studies [28,29,45,74] and SVM in one study [41]. One study [73] used a fusion algorithm. ...
Article
Full-text available
Introduction: Rare diseases affect approximately 350 million people worldwide. Delayed diagnosis is frequent due to lack of knowledge of most clinicians and a small number of expert centers. Consequently, computerized diagnosis support systems have been developed to address these issues, with many relying on rare disease expertise and taking advantage of the increasing volume of generated and accessible health-related data. Our objective is to perform a review of all initiatives aiming to support the diagnosis of rare diseases. Methods: A scoping review was conducted based on methods proposed by Arksey and O'Malley. A charting form for relevant study analysis was developed and used to categorize data. Results: Sixty-eight studies were retained at the end of the charting process. Diagnosis targets varied from 1 rare disease to all rare diseases. Material used for diagnosis support consisted mostly of phenotype concepts, images or fluids. Fifty-seven percent of the studies used expert knowledge. Two-thirds of the studies relied on machine learning algorithms, and one-third used simple similarities. Manual algorithms were encountered as well. Most of the studies presented satisfying performance of evaluation by comparison with references or with external validation. Fourteen studies provided online tools, most of which aimed to support the diagnosis of all rare diseases by considering queries based on phenotype concepts. Conclusion: Numerous solutions relying on different materials and use of various methodologies are emerging with satisfying preliminary results. However, the variability of approaches and evaluation processes complicates the comparison of results. Efforts should be made to adequately validate these tools and guarantee reproducibility and explicability.
... Thalassemia is the most common single-gene disorder throughout the world and represents a major public health problem; it is especially widespread throughout the Mediterranean region, the Middle East, Southeast Asia, and some parts of Africa. There are different types of thalassemia, characterized by abnormal hemoglobin production; the most common are α-thalassemia and β-thalassemia (8). ...
... Most current hemoglobinopathy screening methods rely on high-performance liquid chromatography (HPLC), hemoglobin electrophoresis, PCR-based mutation screening, and DNA tests. All of these methods are costly and require specialized instrumentation and trained technicians (8). ...
... Similarly, a study that used an ANN to predict premature birth in women who became pregnant through assisted reproductive technologies concluded that the designed neural network can help prevent the complications of premature birth (14). Studies have shown that ANNs can accurately detect IDA and thalassemia with a significant degree of precision and sensitivity (8,14). ...
Article
Full-text available
Introduction: Iron deficiency anemia (IDA) and β-thalassemia trait (β-TT) are the most common types of microcytic hypochromic anemia. The similarity and nature of their anemia-related symptoms pose a major challenge for discriminating between IDA and β-TT. Advances in technology have given rise to computer-based decision-making systems, and advances in artificial intelligence have led to intelligent systems and tools that can assist physicians in diagnosis and decision-making. Aim: The aim of the present study was to develop an artificial neural network (ANN) model for accurate and timely differential diagnosis of IDA and β-TT in comparison with traditional methods. Methods: In this study, an ANN model was developed as the first precise intelligent method for differential diagnosis of IDA and β-TT. The data set was retrieved from the Complete Blood Count (CBC) test factors of 268 individuals referred to Padad private clinical laboratory in Ahvaz, Iran, in 2018. ANN models with different topologies were developed, and CBC indices were examined for diagnosis of IDA and β-TT. The proposed model was simulated using the MATLAB software package, version 2018. The results showed the best network architecture based on the advanced multilayer algorithm (4 input factors, 70 neurons with acceptable sensitivity, specificity, and accuracy). Finally, the results obtained from the ANN diagnostic model were compared with existing discriminating indexes. Result: The results of this model showed that the specificity, sensitivity, and accuracy of the proposed diagnostic system were 92.33%, 93.13%, and 92.5%, respectively; i.e. the model could diagnose frequent occurrence of IDA in patients with β-TT.
Conclusion: The results and evaluation of the developed model showed that the proposed neural network model has proper accuracy and generalizability based on the initial factors of CBC testing compared to existing methods. This model can replace the high-cost methods and discriminating indices to distinguish IDA from β-TT and assist in accurate and timely diagnosis.
... Artificial Neural Networks (ANNs), which mimic the human brain's ability to process information, have shown even higher accuracy in diagnosing thalassemia. A study reported that ANNs achieved a sensitivity of 93.13% and specificity of 92.33% in discriminating between IDA and beta thalassemia [81][82][83]. ...
Chapter
Full-text available
Beta Thalassemia Major is a severe inherited blood disorder caused by mutations in the HBB gene, resulting in reduced or absent production of beta-globin chains. This condition leads to chronic anemia, requiring regular blood transfusions and iron chelation therapy. The disorder is prevalent in regions such as the Mediterranean, Middle East, South Asia, and Southeast Asia. Advances in molecular diagnostics, including PCR and non-invasive prenatal testing, have significantly improved early detection and treatment outcomes. Screening and prevention programs in high-risk areas have reduced the number of affected births. The use of artificial intelligence in specific diagnostic areas, particularly in managing iron overload, is also being explored to enhance patient care. This chapter covers the genetic structure, clinical manifestations, diagnostic methods, and iron overload management in Beta Thalassemia Major.
... 27 The conclusion of this study also asserts that sensitivity and specificity vary among ethnic and age groups. 13,28 The study's limitations include the exclusive focus on individuals suspected of having hemoglobinopathies and the absence of information on coexisting iron deficiency anemia. The lack of genetic analysis precludes evaluation of the use of indices such as Matos and Carvalho in screening for conditions such as alpha thalassemia trait (ATT). ...
Article
Full-text available
Background Microcytic hypochromic anemia is the most common type of anemia found in beta thalassemia trait (BTT) and iron deficiency anemia (IDA), and their similar presentations pose a diagnostic challenge. Diagnostic errors between them can lead to incorrect treatment, potentially resulting in fatal outcomes. In Indonesia, the use of high-performance liquid chromatography (HPLC) as the gold standard for discriminating between these diseases is expensive. Discrimination indices offer a cheaper and effective alternative for initial screening. However, a comprehensive performance evaluation of those indices, such as the Matos and Carvalho indices, alongside the Mentzer index, Green and King index, England and Fraser index, RBC index, Shine and Lal index, and Srivastava index, has not been conducted in Indonesia. This study aims to determine the index with the best discriminative performance, based on the highest Youden’s index value among those seven indices. Methods The study consisted of 30 subjects with beta thalassemia trait and 35 subjects with iron deficiency anemia. Index calculations were performed using blood profile formulas and compared with gold standard test results to find each index's sensitivity, specificity, and Youden’s index value. Results The Matos and Carvalho indices exhibited superior discriminatory performance, achieving 80% sensitivity, 77.1% specificity, and a Youden's index of 57.14%. Among the other indices, the RBC index demonstrated the highest sensitivity (90%), while the Green and King index excelled in specificity (97.14%). MCV and MCH values did not differ significantly between the BTT and IDA groups. Conclusions The study's findings underscore the efficacy of the Matos and Carvalho indices in discriminating BTT from IDA in this study population, highlighting their potential as valuable tools in initial screening efforts.
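Youden's index is sensitivity + specificity − 1, and the abstract's figures can be checked with a few lines of arithmetic. In the sketch below, the counts 24 of 30 and 27 of 35 are assumptions chosen to reproduce the reported 80% and 77.1%; they are not stated in the abstract. The Mentzer index is shown as one example of a discrimination index, using its conventionally cited definition (MCV divided by RBC count, with values below 13 suggesting thalassemia trait); that cutoff is general background, not a value taken from this study.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1 (inputs as fractions)."""
    return sensitivity + specificity - 1.0


def mentzer_index(mcv_fl, rbc_millions_per_ul):
    """Mentzer index: MCV (fL) divided by RBC count (10^6 cells/uL)."""
    return mcv_fl / rbc_millions_per_ul


# Assumed counts that reproduce the reported percentages:
# 24 of 30 BTT subjects detected, 27 of 35 IDA subjects correctly excluded.
sens = 24 / 30
spec = 27 / 35
print(f"Youden's index = {100 * youden_index(sens, spec):.2f}%")

# Hypothetical blood counts, for illustration only.
value = mentzer_index(mcv_fl=65.0, rbc_millions_per_ul=5.8)
print("suggests thalassemia trait" if value < 13 else "suggests IDA")
```

Under these assumed counts the index works out to about 57.14%, which agrees with the value reported for the Matos and Carvalho indices.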
... However, existing rule-based formulas and ML algorithms for identifying hemoglobinopathies have limitations. The majority of these rule-based formulas and ML models are tailored to specific subcategories or subgoals, such as identifying one specific thalassemia (19, 23–26) and distinguishing these from IDA (15, 17, 27–32). Moreover, these models are primarily designed using small single-center data sets lacking independent validation (15, 16, 18, 33–36). ...
Article
Full-text available
Background Hemoglobinopathies, the most common inherited blood disorder, are frequently underdiagnosed. Early identification of carriers is important for genetic counseling of couples at risk. The aim of this study was to develop and validate a novel machine learning model on a multicenter data set, covering a wide spectrum of hemoglobinopathies based on routine complete blood count (CBC) testing. Methods Hemoglobinopathy test results from 10 322 adults were extracted retrospectively from 8 Dutch laboratories. eXtreme Gradient Boosting (XGB) and logistic regression models were developed to differentiate negative from positive hemoglobinopathy cases, using 7 routine CBC parameters. External validation was conducted on a data set from an independent Dutch laboratory, with an additional external validation on a Spanish data set (n = 2629) specifically for differentiating thalassemia from iron deficiency anemia (IDA). Results The XGB and logistic regression models achieved an area under the receiver operating characteristic (AUROC) of 0.88 and 0.84, respectively, in distinguishing negative from positive hemoglobinopathy cases in the independent external validation set. Subclass analysis showed that the XGB model reached an AUROC of 0.97 for β-thalassemia, 0.98 for α0-thalassemia, 0.95 for homozygous α+-thalassemia, 0.78 for heterozygous α+-thalassemia, and 0.94 for the structural hemoglobin variants Hemoglobin C, Hemoglobin D, Hemoglobin E. Both models attained AUROCs of 0.95 in differentiating IDA from thalassemia. Conclusions Both the XGB and logistic regression model demonstrate high accuracy in predicting a broad range of hemoglobinopathies and are effective in differentiating hemoglobinopathies from IDA. Integration of these models into the laboratory information system facilitates automated hemoglobinopathy detection using routine CBC parameters.
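The AUROC values reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes AUROC directly from that pairwise definition; the function name and the example scores are illustrative assumptions and are unrelated to the study's XGB or logistic regression models.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: 1 = hemoglobinopathy positive, 0 = negative (assumed convention).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for illustration only.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0]
print(round(auroc(example_scores, example_labels), 3))
```

The pairwise form is quadratic in the number of samples, so it is meant for clarity rather than for data sets of the size used in the study.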
... Approximately 5% of the world's population are carriers of β-thalassemia genes, particularly in the Mediterranean countries, south-east Europe, Arab nations, Asia, and parts of Africa [12][13][14][15]. In India, approximately 10,000 children are born with β-TM every year, and there are nearly 42 million carriers of BTT, with some communities like Sindhis, Gujaratis, Mahars, Kolis, Saraswats, Lohanas, and Gaurs exhibiting high prevalence [16,17]. ...
Article
Full-text available
Background India has the most significant number of children with thalassemia major worldwide, and about 10,000-15,000 children with the disease are born yearly. Scaling up e-health initiatives in rural areas using a cost-effective digital tool to provide healthcare access for all sections of people remains a challenge for government or semi-governmental institutions and agencies. Methods We compared the performance of a recently developed formula SCS_BTT and its web application SUSOKA with 42 discrimination formulae presently available in the literature. 6,388 samples were collected from the Postgraduate Institute of Medical Education and Research, Chandigarh, in North-Western India. Performances of the formulae were evaluated by eight different measures: sensitivity, specificity, Youden’s Index, AUC-ROC, accuracy, positive predictive value, negative predictive value, and false omission rate. Three multi-criteria decision-making (MCDM) methods, TOPSIS, COPRAS, and SECA, were implemented to rank formulae by ensuring a trade-off among the eight measures. Results MCDM methods revealed that the Shine & Lal and SCS_BTT were the best-performing formulae. Further, a modification of the SCS_BTT formula was proposed, and validation was conducted with a data set containing 939 samples collected from Nil Ratan Sircar (NRS) Medical College and Hospital, Kolkata, in Eastern India. Our two-step approach emphasized the necessity of a molecular diagnosis for a lower number of the population. SCS_BTT along with the condition MCV ≤ 80 fl was recommended for a higher heterogeneous population set. It was found that SCS_BTT can classify all BTT samples with 100% sensitivity when MCV ≤ 80 fl. Conclusions We addressed the issue of how to integrate the higher-ranked formulae in mass screening to ensure higher performance through the MCDM approach.
In real-life practice, it is sufficient for a screening algorithm to flag a particular sample as requiring or not requiring further specific confirmatory testing. Implementing discriminate functions in routine screening programs allows early identification; consequently, the cost will decrease, and the turnaround time in everyday workflows will also increase. Our proposed two-step procedure expedites such a process. It is concluded that for mass screening of BTT in a heterogeneous set of data, SCS_BTT and its web application SUSOKA can provide 100% sensitivity when MCV ≤ 80 fl.
... Moreover, new molecular genetic technologies to map and detect mutation and deletion by sequencing analysis prenatally [10,12,13]. Most of these technologies include PCR screening for mutations, electrophoresis, high-performance liquid chromatography (HPLC), and DNA test [10,14]. Clinically, thalassemia classification depends on the variant type (deletion and non-deletion variants) and the location of the globin within the genes. ...
... ANN was used in the differential diagnosis of beta-thalassemia and iron deficiency. Both diseases were accurately predicted, with a specificity of 0.96 and a sensitivity of 0.99, and the authors stated that population screening could be performed easily using ANN [25]. ...
Article
Full-text available
Objectives This article presents the use of machine learning techniques such as artificial neural networks, K-nearest neighbors (KNN), naive Bayes, and decision trees in the prediction of hemoglobin variants. To the best of our knowledge, this is the first study using machine learning models to predict suspicious cases with HbS or HbD Los Angeles carriers state. Methods We had a dataset of 238 observations, of which 128 were HbD carriers, and 110 were HbS carriers. The features were age, sex, RBC, Hb, HTC, MCV, MCH, RDW, serum iron, TIBC, ferritin, HbA2, HbF, HbA0, retention time (RT) of the abnormal peak, and the area under the peak of the abnormal peak. KNN, naive Bayes, decision tree models, and artificial neural network models were trained. Model performances were estimated using 7-fold cross-validation. Results When RT, the key point of differentiation used in high-performance liquid chromatography (HPLC), was included as a feature, all models performed well. When RT was excluded (eliminated), the deep learning model performed the best (Accuracy: 0.99; Specificity: 0.99; Sensitivity: 0.99; F1 score: 0.99), while the naive Bayes model performed the worst (Accuracy: 0.94; Specificity: 0.97; Sensitivity: 0.90; F1 score: 0.93). Conclusions Deep learning and decision tree models have demonstrated high performance and have the potential to be integrated into medical laboratory work practices as a tool for hemoglobinopathy detection. These outcomes suggest that when machine learning models are fed enough data, they can detect a wide range of hemoglobin variants. However, more comprehensive studies with data from a larger number of patients and hemoglobinopathies will be useful for validating our models.
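The 7-fold cross-validation mentioned in the Methods can be sketched as an index split: the 238 observations are shuffled and divided into seven folds, and each fold serves once as the held-out test set. The function below is a minimal standard-library illustration; the name `k_fold_indices`, the fixed seed, and the shuffling strategy are assumptions, since the abstract does not describe how the folds were formed.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# 238 observations and 7 folds, as in the abstract: 34 test samples per fold.
for fold_no, (train_idx, test_idx) in enumerate(k_fold_indices(238, 7), start=1):
    print(fold_no, len(train_idx), len(test_idx))
```

A model would be trained on `train_idx` and scored on `test_idx` in each round, and the seven scores averaged to estimate performance.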
... Recently, the accessibility of powerful statistical software programs has paved the way for the application of advanced statistical models such as data mining techniques in the differential diagnosis of IDA from βTT. However, few studies have already employed such advanced statistical methods and data mining techniques for differential diagnosis of hematological data [40, 43–52]. Therefore, this study was intended to compare tree algorithms as powerful machine-learning methods and support vector machines (SVM) with hematological indices in differentiation between IDA and βTT. ...
Article
Full-text available
Background Several hematological indices have been already proposed to discriminate between iron deficiency anemia (IDA) and β‐thalassemia trait (βTT). This study compared the diagnostic performance of different hematological discrimination indices with decision trees and support vector machines, so as to discriminate IDA from βTT using multidimensional scaling and cluster analysis. In addition, decision trees were used to determine the diagnostic classification scheme of patients. Methods Consisting of 1178 patients with hypochromic microcytic anemia (708 patients with βTT and 470 patients with IDA), this cross-sectional study compared the diagnostic performance of 43 hematological discrimination indices with classification tree algorithms and support vector machines in order to discriminate IDA from βTT. Moreover, multidimensional scaling and cluster analysis were used to identify the homogeneous subgroups of discrimination methods with similar performance. Results All the classification tree algorithms except the LOTUS tree algorithm showed acceptable accuracy measures for discrimination between IDA and βTT in comparison with other hematological discrimination indices. The results indicated that the CRUISE and C5.0 tree algorithms had better diagnostic performance and efficiency among other discrimination methods. Moreover, the AUC of CRUISE and C5.0 tree algorithms indicated more precise classification with values of 0.940 and 0.999, indicating excellent diagnostic accuracy of such models. Moreover, the CRUISE and C5.0 tree algorithms showed that mean corpuscular volume can be considered as the main variable in discrimination between IDA and βTT. Conclusions CRUISE and C5.0 tree algorithms as powerful methods in data mining techniques can be used to develop accurate differential methods along with other laboratory parameters for the discrimination of IDA and βTT. 
In addition, the multidimensional scaling method and cluster analysis can be considered as the most appropriate techniques to determine the discrimination indices with similar performance for future hematological studies.
... Neural networks have received particular attention because of their ability to learn complex problems and to maintain accuracy even when some information is missing [24,25]; for example, they have been used alongside statistical methods to predict death rates [26]. Based on previous studies, the ability of neural networks to differentiate, classify, and detect disease appears desirable and useful [27,28,29,30,31,32]. Boulain et al., in a study that attempted to predict ABGs from venous blood gases, did not obtain favorable results [33]. ...
Preprint
Full-text available
Background Trauma is the third leading cause of death in the world and the first cause of death among people younger than 44 years. In trauma patients, especially those who are injured early in the day, arterial blood gas (ABG) is considered a gold standard because it can provide physicians with important information such as the extent of internal injury, especially in the lung. However, measuring these gases by laboratory methods is time-consuming, and sampling the patient is difficult. The equipment needed to measure these gases is also expensive, which is why most hospitals do not have it. Therefore, estimating these gases without clinical trials can save the lives of trauma patients and accelerate their recovery. Methods In this study, a method based on artificial neural networks for estimating and predicting arterial blood gas is presented, using information collected from 2280 trauma patients. In the proposed method, a feed-forward backpropagation neural network (FBPNN) is trained to predict the amount of these gases from the patient’s initial information alone. The proposed method has been implemented in MATLAB, its accuracy has been tested on the collected data, and its results are presented. Results The results show 87.92% accuracy in predicting arterial blood gas. The predicted arterial blood gas values included pH, PCO2, and HCO3, with reported accuracies of 99.06%, 80.27%, and 84.43%, respectively. Therefore, the proposed method has relatively good accuracy in predicting arterial blood gas. Conclusions Given that this is the first study to predict arterial blood gas using initial patient information (systolic blood pressure (SBP), diastolic blood pressure (DBP), pulse rate (PR), respiratory rate (RR), and age), and based on the results, the proposed method could be a useful tool for assisting hospital and laboratory specialists.
... Even more complex approaches, including combinations of different simple indices, multivariate discriminant analysis, or artificial neural network computing, are unable to reach absolute sensitivity and specificity. [7] Hoffman reported that, although the reason is not apparent, genetic differences between ethnic groups may play a role in the fact that no screening index is superior at detecting TM. [4] According to this article, especially in countries with limited resources, the physician should use the available facilities to reach a diagnosis. Red blood cell (RBC) indices, which can be obtained from a Coulter blood counter, are essential for orienting the physician toward differentiating between beta TM and other microcytic anemias. ...
Article
Full-text available
BACKGROUND: Thalassemia trait and other low red cell index (LRCI) diseases commonly share the same presentation of microcytic hypochromic anemia. Most people with beta thalassemia minor (TM) are subclinical and, without specific investigation, may go undiagnosed or be treated for iron-deficiency anemia. Thalassemia carriers may be undiagnosed, which in turn leads to severe forms of thalassemia syndromes where premarital counseling is poor in high-prevalence areas. Many trials have tried to find simple diagnostic tools to differentiate between thalassemia traits and other microcytic anemias using blood discriminative indices that can be obtained in limited-resource settings and routine clinics from blood cell count parameters. The aim was to assess the value of the Matos and Carvalho index (MCI) in detecting TM in patients presenting with microcytic anemia. PATIENTS AND METHODS: The study was carried out on 171 patients who were diagnosed with hypochromic microcytic anemia in Kut Hemato-oncology Center. Hematological parameters were measured using five automated red cell discriminative indices (red blood cell (RBC) count, RBC distribution width, Shine and Lal index, MCI, and Mentzer index [MI]), and hemoglobin (Hb) A2 levels were measured using Hb variant B thalassemia short arm program. RESULTS: Of 171 patients screened for TM, 108 patients were diagnosed with TM by Hb electrophoresis. Patients with TM had a mean age of 25.3 years, while the mean age of patients with other LRCI anemia was 6.2 years. RBC count was the best index, correctly identifying 84% of patients, followed by MI and MCI with 74% and 72%, respectively. Furthermore, the RBC count had the best Youden's index (58.2), with high sensitivity for BT (96.3%), followed by MI with a Youden’s index of 38. The wide range of thalassemia mutations plays an important role in this issue.
CONCLUSION: RBC count is a simply accessible and dependable way of identifying beta thalassemia trait, but no red cell index or method has 100% specificity, efficacy, and sensitivity for differentiating beta TM from other hypochromic microcytic anemias, which may be due to the wide range of thalassemia mutations. Keywords: Anemia, beta thalassemia minor, Matos and Carvalho index
... Barnhart-Magen et al. in 2013 [5] detected thalassemia minor patients using 1500 artificial neural networks for pattern recognition on a data set consisting of clinical features such as HB, MCV, MCH, RDW, RBC, and platelet count (PLT) of 526 patients. Two models were used, one with all six clinical features and the other using only MCV, RDW, and RBC as clinical features. ...
Article
Full-text available
Diagnosis of microcytic hypochromia is done by measuring certain characteristics changes in the count of blood cell and related indices. Complete blood count test (CBC) is the common process for measuring these characteristic changes. However, the CBC test cannot be completely relied upon since there are chances of false diagnosis as these characteristics are also related to other disorders. In order to rectify the same, other expensive and lengthy tests need to be done which leads to further delay in accurate diagnosis and which may prove detrimental. In an attempt to find the solution to this problem, this paper proposes a method that uses feature fusion for classification of microcytic hypochromia. Feature fusion means combining blood smear image features extracted by the deep convolutional neural network (CNN) and clinical features from CBC test. This fused data-set is further used to predict microcytic hypochromia. After obtaining fused data set we use linear discriminant analysis (LDA) and principal component analysis (PCA) to reduce data set dimensions which further results in less computational overhead. To differentiate between microcytic hypochromia patients and normal persons, k-nearest neighbors (k-NN), support vector machine (SVM), and neural network classification models are used. In order to check the performance of the above model, various evaluation metrics are used. Results achieved from the proposed method reflect that fused data set can effectively improve the identification ratio with a very limited number of patients diagnostic images and clinical data (10 for normal and 10 for β-thalassemia) and feed-forward back-propagation neural network on this data set achieved accuracy, sensitivity, and specificity of 99%, 1.00, and 0.98, respectively. The limited number of patients reduces the system complexity and researcher’s time for getting data from different hospital to train the network.
... Barnhart-Magen et al. [36] developed a new screening technique for diagnosing thalassemia minor (TM) patients. The authors created 1500 artificial neural networks using MATLAB to perform the task of pattern recognition in the data. ...
Article
Thalassemia is considered one of the most common genetic blood disorders and has received considerable attention in medical research worldwide. In this context, one of the greatest challenges for healthcare professionals is to correctly differentiate normal individuals from asymptomatic thalassemia carriers. Usually, thalassemia diagnosis is based on certain measurable characteristic changes to blood cell counts and related indices. These characteristic changes can be derived easily when performing a complete blood count (CBC) test using a fully automated blood analyzer or counter. However, the reliability of the CBC test alone is questionable, because the candidate characteristics can also be seen in other disorders, leading to misdiagnosis of thalassemia. Therefore, other costly and time-consuming tests must be performed, and the resulting delay in correct diagnosis may have serious consequences. To help overcome these diagnostic issues, this work presents a novel dataset collected from Palestine Avenir Foundation for persons tested for thalassemia. We aim to compile a gold standard dataset for thalassemia and make it available for researchers in this field. Moreover, we use this dataset to predict the specific type of thalassemia known as beta thalassemia (β-thalassemia) based on a hybrid data mining model. The proposed model consists of two main steps. First, to overcome the problem of the highly imbalanced class distribution in the dataset, a balancing technique called SMOTE is applied. In the second step, four classification models, namely k-nearest neighbors (k-NN), naïve Bayesian (NB), decision tree (DT) and the multilayer perceptron (MLP) neural network, are used to differentiate between normal persons and patients carrying β-thalassemia. Different evaluation metrics are used to assess the performance of the proposed model.
The experimental results show that the SMOTE oversampling method can effectively improve the identification ratio of β-thalassemia carriers in a highly imbalanced class distribution. The results reveal also that the NB classifier achieved the best performance in differentiating between normal and β-thalassemia carriers at oversampling SMOTE ratio of 400%. This combination shows a specificity of 99.47% and a sensitivity of 98.81%.
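The SMOTE step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, the neighbour count `k`, the seed, and the toy two-feature records are all assumptions made for illustration. The sketch only shows the core idea, which is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. A ratio of 400% is read here as four synthetic samples per original minority sample.

```python
import math
import random


def smote(minority, ratio_percent=400, k=5, seed=0):
    """Return synthetic minority samples (illustrative SMOTE sketch).

    minority      : list of feature vectors (lists of floats), at least 2 rows
    ratio_percent : 400 means 4 synthetic samples per original sample
    k             : number of nearest minority neighbours to draw from (assumed)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    per_sample = ratio_percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        # Rank the other minority samples by Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(x, m))
        neighbours = others[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position on the segment from x to nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Hypothetical minority-class records: [MCV in fL, RBC in 10^12/L].
carriers = [[60.0, 5.9], [62.0, 6.1], [64.0, 5.7]]
new_samples = smote(carriers, ratio_percent=400, k=2)
print(len(new_samples))  # 12
```

The synthetic samples would be added to the training set only, before fitting a classifier such as naïve Bayes; the test set should keep its original class distribution.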
... Therefore, their use in low-resource environments can be problematic. This is equally true for expert systems and neural network approaches, which are difficult to reproduce in the field [36][37][38][39][40][41][42][43][44]. ...
Article
Full-text available
Background Many discriminant formulas have been reported for distinguishing thalassemia trait from iron deficiency in patients with microcytic anemia. Independent verification of several discriminant formulas is deficient or even lacking. Therefore, we have retrospectively investigated discriminant formulas in a large, well-characterized patient population. Methods The investigational population consisted of 2664 patients with microcytic anemia: 1259 had iron deficiency, 1196 ‘pure’ thalassemia trait (877 β- and 319 α-thalassemia), 150 had thalassemia trait with concomitant iron deficiency or anemia of chronic disease, and 36 had other diseases. We investigated 25 discriminant formulas that only use hematologic parameters available on all analyzers; formulas with more advanced parameters were disregarded. The diagnostic performance was investigated using ROC analysis. Results The three best performing formulas were the Jayabose (RDW index), Janel (11T), and Green and King formulas. The differences between them were not statistically significant (p>0.333), but each of them had significantly higher area under the ROC curve than any other formula. The Jayabose and Green and King formulas had the highest sensitivities: 0.917 both. The highest specificity, 0.925, was found for the Janel formula, which is a composite score of 11 other formulas. All investigated formulas performed significantly better in distinguishing β- than α-thalassemia from iron deficiency. Conclusions In our patient population, the Jayabose RDW index, the Green and King formula and the Janel 11T score are superior to all other formulas examined for distinguishing between thalassemia trait and iron deficiency anemia. We confirmed that all formulas perform much better in β- than in α-thalassemia carriers and also that they incorrectly classify approximately 30% of thalassemia carriers with concomitant other anemia as not having thalassemia. 
The diagnostic performance of even the best formulas is not high enough for making a final thalassemia diagnosis, but in countries with limited resources, they can be helpful in identifying those patients who need further examinations for genetic anemia.
... It is widely agreed that none of these indices is 100% sensitive or 100% specific. Even more complex approaches including combinations of different simple indices, multivariate discriminant analysis or artificial neural network computing are unable to reach absolute sensitivity and specificity [15][16][17][18][19][20][21][22][23]. It is somewhat surprising that comparative studies of these screening indices do not show a consistent picture: discriminant indices that are superior in one study may perform less well in another study. ...
Article
Full-text available
Background: More than 40 mathematical indices have been proposed in the hematological literature for discriminating between iron deficiency anemia and thalassemia trait in subjects with microcytic red blood cells (RBCs). None of these discriminant indices is 100% sensitive and specific and also the ranking of the discriminant indices is not consistent. Therefore, we decided to conduct the first meta-analysis of the most frequently used discriminant indices. Methods: An extensive literature search yielded 99 articles dealing with 12 indices that were investigated five or more times. For each discriminant index we calculated the diagnostic odds ratio (DOR) and summary ROC analysis was done for comparing the performance of the indices. Results: The ratio of microcytic to hypochromic RBCs (M/H ratio) showed the best performance, DOR = 100.8. This was significantly higher than that of all other indices investigated. The RBC index scored second (DOR = 47.0), closely followed by the Sirdah index (DOR = 46.7) and the Ehsani index (DOR = 44.7). Subsequently, there was a group of four indices with intermediate and three with lower DOR. The lowest performance (DOR = 6.8) was found for the RDW (Bessman index). Overall, the indices performed better for adults than for children.
Preprint
Full-text available
Hemoglobinopathies are a group of disorders in which the hemoglobin molecule has abnormal production or structure. Sickle cell disease (SCD) is a blood disease that affects the hemoglobin molecules in red blood cells (RBCs), and thalassemia is one of the major monogenic disorders that reduce hemoglobin production (Kohne et al., 2011). This disorder results in a large number of red blood cells being destroyed, leading to anemia. India bears a huge burden of hemoglobinopathies; thalassemia is the most prevalent (Mondal SK et al., 2016). A key component of thalassemia prevention is a successful screening procedure to identify thalassemia carriers. Effective screening programs face numerous obstacles, especially in environments with limited resources. Machine learning (ML) has been used to solve technical and domain-specific problems in a variety of prognostic and diagnostic medical tasks. In this work, we aimed to identify and analyze the most common mutation of β-thalassemia and sickle cell disease in the north Indian population by employing machine learning-based algorithms. These results demonstrate the application of integrated bioinformatics and machine learning approaches to accurately predict the carrier state from a simple blood test and to predict pathogenic hemoglobin variants in a group of individuals. This study contributes to the validation of the models based on data from several individuals and hemoglobinopathies.
Article
Full-text available
Routine blood tests drive diagnosis, prognosis, and monitoring in traditional clinical decision support systems. As a routine diagnostic tool with standardized laboratory workflows, clinical blood analysis offers superior accessibility to a comprehensive assessment of physiological parameters. These parameters can be integrated and automated at scale, allowing for in-depth clinical inference and cost-effectiveness compared to other modalities such as imaging, genetic testing, or histopathology. Herein, we extensively review the analytical value of routine blood tests leveraged by artificial intelligence (AI), using the ICD-10 classification as a reference. A significant gap exists between standard disease-associated features and those selected by machine learning models. This suggests an amount of non-perceived information in traditional decision support systems that AI could leverage with improved performance metrics. Nonetheless, AI-derived support for clinical decisions must still be harmonized regarding external validation studies, regulatory approvals, and clinical deployment strategies. Still, as we discuss, the path is drawn for the future application of scalable artificial intelligence (AI) to enhance, extract, and classify patterns potentially correlated with pathological states with restricted limitations in terms of bias and representativeness.
Article
Full-text available
Artificial intelligence (AI) and machine learning (ML) are becoming vital in laboratory medicine and the broader context of healthcare. In this review article, we summarized the development of ML models and how they contribute to clinical laboratory workflow and improve patient outcomes. The process of ML model development involves data collection, data cleansing, feature engineering, model development, and optimization. These models, once finalized, are subjected to thorough performance assessments and validations. Recently, due to the complexity inherent in model development, automated ML tools were also introduced to streamline the process, enabling non-experts to create models. Clinical Decision Support Systems (CDSS) use ML techniques on large datasets to aid healthcare professionals in test result interpretation. They are revolutionizing laboratory medicine, enabling labs to work more efficiently with less human supervision across pre-analytical, analytical, and post-analytical phases. Despite contributions of the ML tools at all analytical phases, their integration presents challenges like potential model uncertainties, black-box algorithms, and deskilling of professionals. Additionally, acquiring diverse datasets is hard, and models’ complexity can limit clinical use. In conclusion, ML-based CDSS in healthcare can greatly enhance clinical decision-making. However, successful adoption demands collaboration among professionals and stakeholders, utilizing hybrid intelligence, external validation, and performance assessments.
Article
Background and objective: The traditional statistical screening method for thalassemia based on red blood cell (RBC) indices is being replaced by machine learning. Here, we developed deep neural networks (DNNs) that outperformed the traditional method for predicting thalassemia. Method: Using a dataset of 8693 records comprising genetic tests and 11 other features, we constructed 11 DNN models and 4 traditional statistical models, compared their performances, and analysed feature importance to interpret the DNN models. Results: The area under the receiver operating characteristic curve, accuracy, Youden's index, F1 score, sensitivity, specificity, positive predictive value and negative predictive value were 0.960, 0.897, 0.794, 0.897, 0.883, 0.911, 0.914, and 0.882, respectively, for our best model. Compared with the traditional statistical model based on the mean corpuscular volume, these values were increased by 10.22%, 10.09%, 26.55%, 8.92%, 4.13%, 16.90%, 13.86% and 6.07%, respectively, and compared with the mean cellular haemoglobin model, by 15.38%, 11.70%, 31.70%, 9.89%, 3.05%, 22.13%, 17.11% and 5.94%, respectively. The DNN model performance decreased without age, RBC distribution width (RDW), sex, or both WBC and PLT. Conclusions: Our DNN model outperformed the current screening model. Of the 8 features, RDW and age were the most useful, followed by sex and the combination of WBC and PLT; the remaining features were nearly useless.
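The Youden's index reported in this abstract follows directly from the reported sensitivity and specificity. The short sketch below reproduces that arithmetic using the standard definition J = sensitivity + specificity - 1; the function name `youden_index` is chosen here for illustration only.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity reported for the best DNN model in the abstract.
j = youden_index(0.883, 0.911)
print(round(j, 3))  # 0.794, matching the reported Youden's index
```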
Article
Background Currently, more than forty discrimination formulae based on red blood cell (RBC) parameters and some supervised machine learning algorithms (MLAs) have been recommended for β-thalassemia trait (BTT) screening. The present study aimed to evaluate and compare the performance of 26 such formulae and 13 MLAs on antenatal women's data against a recently developed formula, SCSBTT, which is available for evaluation in over seventy countries as an Android app called SUSOKA [16]. Methods A diagnostic database of 2942 antenatal females was collected from PGIMER, Chandigarh, India, and used for this analysis. The data set consists of hypochromic microcytic anemia, BTT, Hemoglobin E trait, double heterozygote for Hemoglobin S and BTT, heterozygote for Hemoglobin D Punjab, and normal subjects. The performance of the formulae and the MLAs was assessed by sensitivity, specificity, Youden’s index, and AUC-ROC. A final recommendation was made from the rankings obtained through two Multiple Criteria Decision-Making (MCDM) techniques, Simultaneous Evaluation of Criteria and Alternatives (SECA) and TOPSIS. Results Extreme Learning Machine (ELM) and Gradient Boosting Classifier (GBC) showed the highest Youden’s index and AUC-ROC compared to all discriminating formulae. Sensitivity was highest for SCSBTT. K-means clustering and the ranking from the MCDM methods show that the SCSBTT, Shine & Lal, and Ravanbakhsh-F4 formulae ensure higher performance among all formulae. The discriminant power of some MLAs and formulae was found to be considerably lower than that reported in the original studies. Conclusion Comparative information on MLAs can aid researchers in developing new discriminating formulae that simultaneously ensure higher sensitivity and specificity. More multi-centric verification of the formulae on heterogeneous data is indispensable. The SCSBTT and Shine & Lal formulae, and ELM and GBC, are recommended for screening BTT based on MCDM. SCSBTT can be used with certainty as a tangible cost-saving tool for mass screening of antenatal women in India and other countries.
Article
Objective As the storage of clinical data has transitioned into electronic formats, medical informatics has become increasingly relevant in providing diagnostic aid. The purpose of this review is to evaluate machine learning models that use text data for diagnosis and to assess the diversity of the included study populations. Methods We conducted a systematic literature review on three public databases. Two authors reviewed every abstract for inclusion. Articles were included if they used or developed machine learning algorithms to aid in diagnosis. Articles focusing on imaging informatics were excluded. Results From 2,260 identified papers, we included 78. Of the machine learning models used, neural networks were relied upon most frequently (44.9%). Studies had a median population of 661.5 patients, and diseases and disorders of 10 different body systems were studied. Of the 35.9% (N = 28) of papers that included race data, 57.1% (N = 16) of study populations were majority White, 14.3% were majority Asian, and 7.1% were majority Black. In 75% (N = 21) of papers, White was the largest racial group represented. Of the papers included, 43.6% (N = 34) included the sex ratio of the patient population. Discussion With the power to build robust algorithms supported by massive quantities of clinical data, machine learning is shaping the future of diagnostics. Limitations of the underlying data create potential biases, especially if patient demographics are unknown or not included in the training. Conclusion As the movement toward clinical reliance on machine learning accelerates, both recording demographic information and using diverse training sets should be emphasized. Extrapolating algorithms to demographics beyond the original study population leaves large gaps for potential biases.
Article
Antenatal screening for beta thalassemia trait (BTT) followed by counseling of couples is an efficient way of thalassemia control. Since high performance liquid chromatography (HPLC) is costly, other cost-effective screening methods need to be devised for this purpose. The present study aimed to evaluate the utility of red cell indices and machine learning algorithms, including an artificial neural network (ANN), in the detection of BTT among antenatal women. This cross-sectional study included all antenatal women undergoing thalassemia screening at a tertiary care hospital. Complete blood count followed by HPLC was performed. Receiver operating characteristic (ROC) curve analysis was performed to obtain the optimal cutoff for each of the indices and to determine the test characteristics for detection of BTT. Machine learning algorithms, including the C4.5 and Naïve Bayes (NB) classifiers and a back-propagation type ANN using the red cell indices, were designed and tested. Over a period of 15 months, 3947 patients underwent thalassemia screening. BTT was diagnosed in 5.98% of women on the basis of HPLC. ROC analysis yielded a maximum accuracy of 63.8%, with sensitivity and specificity of 66.2% and 63.7%, respectively, for mean corpuscular hemoglobin concentration (MCHC). The C4.5 and NB classifiers had accuracies of 88.56% and 82.49%, respectively, while the ANN had an overall accuracy of 85.95%, sensitivity of 83.81%, and specificity of 88.10% in detection of BTT. The present study highlights that no single red cell parameter is useful on its own for BTT screening. However, an ANN combining all the red cell indices had appreciable sensitivity and specificity for this purpose. Further refinements of the neural network can provide an appropriate tool for use in peripheral settings for thalassemia screening.
Article
Full-text available
More than 40 mathematical indices have been proposed in the hematological literature for discriminating between iron deficiency anemia and thalassemia trait in subjects with microcytic red blood cells (RBCs). None of these discriminant indices is 100% sensitive and specific and also the ranking of the discriminant indices is not consistent. Therefore, we decided to conduct the first meta-analysis of the most frequently used discriminant indices. An extensive literature search yielded 99 articles dealing with 12 indices that were investigated five or more times. For each discriminant index we calculated the diagnostic odds ratio (DOR) and summary ROC analysis was done for comparing the performance of the indices. The ratio of microcytic to hypochromic RBCs (M/H ratio) showed the best performance, DOR=100.8. This was significantly higher than that of all other indices investigated. The RBC index scored second (DOR=47.0), closely followed by the Sirdah index (DOR=46.7) and the Ehsani index (DOR=44.7). Subsequently, there was a group of four indices with intermediate and three with lower DOR. The lowest performance (DOR=6.8) was found for the RDW (Bessman index). Overall, the indices performed better for adults than for children. The M/H ratio outperformed all other discriminant indices for discriminating between iron deficiency anemia and thalassemia trait. Although its sensitivity and specificity are not high enough for making a definitive diagnosis, it is certainly of value for identifying those subjects with microcytic RBC in whom diagnostic tests for confirming thalassemia are indicated.
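The diagnostic odds ratio (DOR) used in this meta-analysis to rank the indices has a simple closed form. The sketch below uses the standard definition, DOR = (TP × TN) / (FP × FN), and the equivalent expression in terms of sensitivity and specificity. The function names and the example counts are hypothetical and are not taken from the meta-analysis.

```python
def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """DOR from a 2x2 confusion matrix: (TP * TN) / (FP * FN)."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


def dor_from_rates(sensitivity: float, specificity: float) -> float:
    """Equivalent form: odds of a positive test in diseased vs non-diseased."""
    return (sensitivity / (1.0 - sensitivity)) / ((1.0 - specificity) / specificity)


# Hypothetical counts: 100 thalassemia-trait carriers, 100 iron-deficient patients.
print(diagnostic_odds_ratio(tp=90, fp=5, fn=10, tn=95))  # 171.0
```

A DOR of 1 means the index does not discriminate at all; larger values indicate better discrimination, which is why the reported range of 6.8 to 100.8 separates the indices so clearly.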
Article
Full-text available
Red blood cells (RBCs) extended parameters or erythrocyte subsets are now reported by the new Sysmex XE 5000 analyzer. This study was aimed at establishing a characteristic analytical feature, including the new erythrocyte and reticulocyte parameters, in case of thalassemia trait and iron deficiency (IDA). Ninety healthy individuals, 136 β-thalassemia carriers, 121 mild IDA, and 126 severe IDA patients were analyzed. The values obtained for the RBC extended parameters were significantly different (P<0.0001) in the groups; the only exception was %Hypo-He in the case of mild IDA and thalassemia (P=0.6226). %Hypo-He was considerably greater in severe IDA (23.4%) than in mild cases (12.4%), P<0.0001. %MicroR was more increased in thalassemia (38.6 %) than in the mild IDA (16.5%, P<0.001) and in severe IDA (21.6%, P<0.001). Immature reticulocyte fraction (IRF) mean values in the groups were statistically different; the thalassemia group had an intermediate value (8.7%) between healthy (4.4%) and IDA (16.7 and 12.9%). Erythrocytosis and severe microcytosis, together with a high percentage of microcytes and a moderate increase in IRF, is the profile of β-thalassemia carriers, whereas anisocytosis and the hypochromic subset correlates with the severity of the anemia in iron-deficient patients.
Article
Full-text available
Cell counter-based formulas have been used in the differential diagnosis of microcytic anemia. The measurement of RBC subpopulations is now available on the Sysmex XE 5000 analyzer (Sysmex, Kobe, Japan). We describe the new formulas: % microcytic - % hypochromic; and % microcytic - % hypochromic - red cell distribution width (RDW), derived from the percentages of microcytic and hypochromic RBCs. The present study aimed to prospectively evaluate the reliability of these new formulas in the differential diagnosis of microcytosis and β-thalassemia screening compared with already published indices. The indices were calculated for a set of 250 iron-deficient patients and 270 β-thalassemia carriers. Independent samples t test and receiver-operating characteristics analysis were applied. The % microcytic - % hypochromic - RDW, % microcytic - % hypochromic, and Green and King indices provided higher areas under the curve. The % microcytic - % hypochromic - RDW was the most reliable index evaluated, with 100% sensitivity and 92.6% specificity. This index can be used to efficiently screen patients with microcytosis for further hematologic studies to confirm β-thalassemia.
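The two formulas described in this abstract are plain differences of analyzer outputs, so they can be written down directly. In the sketch below, the function names and the sample values are illustrative assumptions; the abstract does not state the decision cutoffs, so none are included here.

```python
def micro_minus_hypo(pct_microcytic: float, pct_hypochromic: float) -> float:
    """% microcytic - % hypochromic RBCs."""
    return pct_microcytic - pct_hypochromic


def micro_minus_hypo_minus_rdw(pct_microcytic: float,
                               pct_hypochromic: float,
                               rdw: float) -> float:
    """% microcytic - % hypochromic - RDW (RDW in %)."""
    return pct_microcytic - pct_hypochromic - rdw


# Hypothetical analyzer readings for a single sample.
print(micro_minus_hypo(38.0, 12.0))                  # 26.0
print(micro_minus_hypo_minus_rdw(38.0, 12.0, 15.0))  # 11.0
```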
Article
Full-text available
The present study reports the results in 284 patients of applying a recently developed index, MCV-(10 x RBC), for discrimination between beta-thalassemia trait (beta-TT) and Iron Deficiency Anemia (IDA), the two most common causes of microcytic hypochromic anemias. A total of 284 carefully selected patients (130 patients with IDA and 154 with beta-TT) were studied. Sensitivity, specificity and Youden's index were compared between the proposed index and four other indices, namely England-Fraser, Mentzer, Srivastava and RBC count. The new index correctly identified 263 (92.96%) patients, standing inferior only to Mentzer which correctly diagnosed 269 (94.71%) patients. The best discrimination index according to Youden's criteria was Mentzer (Youden's index = 90.1) followed by the new index (Youden's index = 85.5). There are remarkable inconsistencies among the results obtained in different studies. Larger studies are needed to establish the optimal discrimination index as well as to confirm the results obtained in the present study. Nevertheless, the epidemiological indices of the proposed discrimination index and the simplicity of its calculation make it acceptable for use in Iran.
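The proposed index, MCV - (10 × RBC), and the Mentzer index it is compared with can be sketched as follows. The Mentzer index is taken here in its usual form, MCV / RBC, which is not spelled out in the abstract. The units (MCV in fL, RBC in 10^12/L), the function names, and the sample values are assumptions for illustration, and no cutoffs are included because the abstract does not report them.

```python
def mcv_minus_10_rbc(mcv_fl: float, rbc: float) -> float:
    """Index studied in the abstract: MCV - (10 x RBC)."""
    return mcv_fl - 10.0 * rbc


def mentzer_index(mcv_fl: float, rbc: float) -> float:
    """Mentzer index in its usual form: MCV / RBC."""
    if rbc <= 0:
        raise ValueError("RBC count must be positive")
    return mcv_fl / rbc


# Hypothetical microcytic sample: MCV 65 fL, RBC 5.8 x 10^12/L.
print(mcv_minus_10_rbc(65.0, 5.8))         # 7.0
print(round(mentzer_index(65.0, 5.8), 2))  # 11.21
```

Both indices use the same observation: carriers tend to have a relatively high RBC count for a given MCV, so lower index values point toward thalassemia trait rather than iron deficiency.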
Article
Full-text available
This study examined the diagnostic accuracy of nine indices to discriminate between patients with mild-to-moderate (haemoglobin 8.5 - 11 g/dl) or moderate-to-severe (haemoglobin < 8.5 g/dl) iron deficiency anaemia (IDA) from those with beta-thalassaemia (beta-TT) (n = 100 per group). Indices examined were red blood cell (RBC) count, RBC distribution width (RDW), Mentzer index (MI), Shine and Lal index (S&L), England and Fraser index (E&F), Srivastava index (S), Green and King index (G&K), RDW index (RDWI), and Ricerca index (R). Index sensitivity, specificity, and positive and negative prognostic values were examined. Youden's indices were calculated and showed: S&L > G&K > E&F > RBC = RDWI > MI > S > R > RDW to differentiate between beta-TT and mild-to-moderate IDA; and S&L > G&K > E&F = RDWI > RBC > R > MI > S > RDW to differentiate between beta-TT or moderate-to-severe IDA. For both groups, S&L and G&K offered the best discrimination and RDW the worst. S&L showed the highest Youden index for beta-TT and IDA discrimination, but sensitivity and specificity were not 100%. In both mild and severe IDA, the S&L index may be used to differentiate cases of beta-TT from IDA cases, but large clinical trials are needed to explore this further.
Article
Full-text available
The clinical differentiation of the causes of microcytosis is difficult because of the lack of a method for the diagnosis of alpha thalassemia. A number of laboratory tests have been proposed for the differentiation of alpha thalassemia from iron deficiency, including decision functions based on the red blood cell indices generated by electronic cell counters. The accuracy of these screening methods was assessed in 93 patients with microcytosis known to be secondary to either iron deficiency or beta thalassemia minor and, prospectively, in 26 patients with microcytosis in whom globin chain synthesis ratio was used to diagnose thalassemia. The functions evaluated were: RBC volume distribution curve; osmotic fragility; erythrocyte count; discriminant function = MCV - (5 × Hgb) - RBC - 8.4; ratio of MCH/RBC; ratio of MCV/RBC; and 0.01 × MCH × (MCV)². A simplified method of measuring anisocytosis using the RBC volume distribution curve was significantly more accurate (P < 0.01) in distinguishing iron deficiency from thalassemia than any of the other decision functions. Analysis of red blood cell volume distribution, although not sufficiently accurate for definitive diagnosis, appears to be a useful technique in the initial screening of patients with microcytosis and in determining which additional testing should be done.
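The four arithmetic decision functions listed in this abstract can be computed from a routine blood count as in the sketch below. The formulas are taken from the abstract; the units (MCV in fL, Hgb in g/dL, RBC in 10^12/L, MCH in pg), the function name, the dictionary keys, and the sample values are assumptions for illustration. Cutoffs are omitted because the abstract does not give them.

```python
def decision_functions(mcv: float, hgb: float, rbc: float, mch: float) -> dict:
    """Compute the arithmetic decision functions named in the abstract."""
    if rbc <= 0:
        raise ValueError("RBC count must be positive")
    return {
        # discriminant function = MCV - (5 x Hgb) - RBC - 8.4
        "discriminant_function": mcv - 5.0 * hgb - rbc - 8.4,
        "mch_over_rbc": mch / rbc,
        "mcv_over_rbc": mcv / rbc,
        # 0.01 x MCH x (MCV)^2
        "mch_mcv_squared": 0.01 * mch * mcv ** 2,
    }


# Hypothetical microcytic sample.
values = decision_functions(mcv=65.0, hgb=11.5, rbc=5.8, mch=20.0)
for name, value in values.items():
    print(f"{name}: {value:.2f}")
```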
Article
In this paper, we investigate the feasibility of two typical techniques of Pattern Recognition in the classification for Thalassemia screening. They are the Support Vector Machine (SVM) and the K-Nearest Neighbour (KNN). We compare SVM and KNN with a Multi-Layer Perceptron (MLP) classifier. We propose a two-classifier system based on SVM. The first layer is used to differentiate between pathological and non-pathological cases while the second layer is used to discriminate between two different pathologies (α-thalassemia carrier against β-thalassemia carrier) from the first output layer (pathological cases). Using the parameters sensitivity (percentage of pathologic cases correctly classified) and specificity (percentage of non-pathologic cases correctly classified), the results obtained with this analysis show that the MLP classifier gives slightly better results than SVM although the amount of data available is limited. Both techniques enable thalassemia carriers to be discriminated from healthy subjects with 95% specificity, although the sensitivity of MLP is 92% while that of SVM is 83%.
Article
Over the past three years 25 302 adults in Kentucky have been tested for hæmoglobinopathies, and of these, hæmoglobin A2 was measured on 3734, 1973 with microcytosis and 1761 within the normal range. The best methods of detecting β-thalassæmia minor using red-blood-cell indices were compared. No method detected all heterozygotes. A new method was devised consisting of three parts: (1) hæmoglobin electrophoresis, (2) calculation of the product of the square of the mean corpuscular volume (M.C.V.) multiplied by the mean corpuscular hæmoglobin (M.C.H.) measured in units of one hundred, (3) A2 determination on all AA samples with (M.C.V.)² × M.C.H. <1530 and on those with variant genotypes consistent with thalassæmia. In this series this new method detected 137 out of 138 heterozygotes with 4·4% false-positives.
Article
This article presents an application of a neural network and decision trees in thalassaemia screening. The aim is to classify thirteen classes of thalassaemia abnormality and one control class by inspecting the distribution of multiple types of haemoglobin in blood specimens, which are identified via high performance liquid chromatography (HPLC). C4.5 and random forests are the chosen architecture for decision tree implementation. For comparison, multilayer perceptrons are explored in classification via a neural network. The stratified 10-fold cross-validation results indicate that the best classification performance with overall accuracy of 97.2% (sensitivity = 97.2% and specificity = 99.8%) is achieved when C4.5 is used in conjunction with samples which have been pre-processed with input attribute discretisation and redundant attribute removal. Subsequently, C4.5 is applied to an additional sample set in a clinical trial which results in overall accuracy of 93.1% (sensitivity = 93.1% and specificity = 99.5%). These results suggest that a combination of C4.5 with haemoglobin typing analysis via HPLC may give rise to a guideline for further investigation of thalassaemia classification.
Article
Principal component analysis (PCA) is a multivariate technique that analyzes a data table in which observations are described by several inter‐correlated quantitative dependent variables. Its goal is to extract the important information from the table, to represent it as a set of new orthogonal variables called principal components, and to display the pattern of similarity of the observations and of the variables as points in maps. The quality of the PCA model can be evaluated using cross‐validation techniques such as the bootstrap and the jackknife. PCA can be generalized as correspondence analysis (CA) in order to handle qualitative variables and as multiple factor analysis (MFA) in order to handle heterogeneous sets of variables. Mathematically, PCA depends upon the eigen‐decomposition of positive semi‐definite matrices and upon the singular value decomposition (SVD) of rectangular matrices. Copyright © 2010 John Wiley & Sons, Inc. This article is categorized under: Statistical and Graphical Methods of Data Analysis > Multivariate Analysis Statistical and Graphical Methods of Data Analysis > Dimension Reduction
Article
This paper presents the use of a neural network and a decision tree, which is evolved by genetic programming (GP), in thalassaemia classification. The aim is to differentiate between thalassaemic patients, persons with thalassaemia trait and normal subjects by inspecting characteristics of red blood cells, reticulocytes and platelets. A structured representation on genetic algorithms for non-linear function fitting or STROGANOFF is the chosen architecture for genetic programming implementation. For comparison, multilayer perceptrons are explored in classification via a neural network. The classification results indicate that the performance of the GP-based decision tree is approximately equal to that of the multilayer perceptron with one hidden layer. But the multilayer perceptron with two hidden layers, which is proven to have the most suitable architecture among networks with different number of hidden layers, outperforms the GP-based decision tree. Nonetheless, the structure of the decision tree reveals that some input features have no effects on the classification performance. The results confirm that the classification accuracy of the multilayer perceptron with two hidden layers can still be maintained after the removal of the redundant input features. Detailed analysis of the classification errors of the multilayer perceptron with two hidden layers, in which a reduced feature set is used as the network input, is also included. The analysis reveals that the classification ambiguity and misclassification among persons with minor thalassaemia trait and normal subjects is the main cause of classification errors. These results suggest that a combination of a multilayer perceptron with a blood cell analysis may give rise to a guideline/hint for further investigation of thalassaemia classification.
Article
Anemia is the most common hematological disorder. The complete blood count (CBC) is used to identify anemia and other hematological disorders. However, discriminating between iron deficiency anemia (IDA) and thalassemia (THA) relies on a mean cell volume (MCV) below 80 fL (femtoliters), a criterion that is imprecise and uncertain. Recently, a growing body of literature has applied soft computing methods to classification problems under imprecision and uncertainty. This paper proposes a new rule-based approach derived from soft computing, namely hierarchical soft computing (HSC). HSC is suited to discriminating microcytic anemia. Evaluating its diagnostic performance after ANFIS rule pruning showed that: (1) ANFIS with 50 patterns achieved 96% accuracy, which is more accurate than traditional experience; (2) both sensitivity (90.1%) and specificity (95.8%) are higher than those of the discriminant function, which is higher in only one of the two; (3) the area under the receiver operating characteristic curve (AUC) is 0.954, which the authors report as 95.4% accuracy, when the inference value is revised to 13.6. HSC thus improves on the performance of the discriminant function in discriminating microcytic anemia.
Article
The purpose of this article is to set forth our approach to diagnosing and managing the thalassemias, including β-thalassemia intermedia and β-thalassemia major. The article begins by briefly describing recent advances in our understanding of the pathophysiology of thalassemia. In the discussion on diagnosing the condition, we cover the development of improved diagnostic tools, including the use of very small fetal DNA samples to detect single point mutations with great reliability for prenatal diagnosis of homozygous thalassemia. In our description of treatment strategies, we focus on how we deal with clinical manifestations and long-term complications using the most effective current treatment methods for β-thalassemia. The discussion of disease management focuses on our use of transfusion therapy and the newly developed oral iron chelators, deferiprone and deferasirox. We also deal with splenectomy and how we manage endocrinopathies and cardiac complications. In addition, we describe our use of hematopoietic stem cell transplantation, which has produced cure rates as high as 97%, and the use of cord blood transplantation. Finally, we briefly touch on therapies that might be effective in the near future, including new fetal hemoglobin inducers and gene therapy.
Article
The thalassemias are a group of inherited hematologic disorders caused by defects in the synthesis of one or more of the hemoglobin chains. Alpha thalassemia is caused by reduced or absent synthesis of alpha globin chains, and beta thalassemia is caused by reduced or absent synthesis of beta globin chains. Imbalances of globin chains cause hemolysis and impair erythropoiesis. Silent carriers of alpha thalassemia and persons with alpha or beta thalassemia trait are asymptomatic and require no treatment. Alpha thalassemia intermedia, or hemoglobin H disease, causes hemolytic anemia. Alpha thalassemia major with hemoglobin Bart's usually results in fatal hydrops fetalis. Beta thalassemia major causes hemolytic anemia, poor growth, and skeletal abnormalities during infancy. Affected children will require regular lifelong blood transfusions. Beta thalassemia intermedia is less severe than beta thalassemia major and may require episodic blood transfusions. Transfusion-dependent patients will develop iron overload and require chelation therapy to remove the excess iron. Bone marrow transplants can be curative for some children with beta thalassemia major. Persons with thalassemia should be referred for preconception genetic counseling, and persons with alpha thalassemia trait should consider chorionic villus sampling to diagnose infants with hemoglobin Bart's, which increases the risk of toxemia and postpartum bleeding. Persons with the thalassemia trait have a normal life expectancy. Persons with beta thalassemia major often die from cardiac complications of iron overload by 30 years of age.
Article
Thalassemia involves a gene mutation that causes the production of an insufficient amount of structurally normal globin chains, while an Hb variant involves a gene mutation that changes the type or number of amino acids in the globin chain. It has been reported that some 200 million people worldwide had hemoglobinopathies of some sort. Attempts to develop effective and economical techniques for screening and analysis of thalassemia and Hb variants have become very important. In this review, we report the different techniques available, ranging from initial screening to extensive analysis, comparing advantages and disadvantages. Some indirect studies related to thalassemia indication and treatment follow-up are also included. We hope that information on these various techniques will be useful to scientists who are working on developing new techniques or improving existing ones.
Article
Computer technology has advanced tremendously, and interest in the potential use of 'Artificial Intelligence (AI)' in medicine and biological research has grown. One of the most interesting and extensively studied branches of AI is 'Artificial Neural Networks (ANNs)'. Basically, ANNs are mathematical algorithms generated by computers. ANNs learn from standard data and capture the knowledge contained in the data. Trained ANNs approach the functionality of a small biological neural cluster in a very fundamental manner. They are a digitized model of the biological brain and can detect complex nonlinear relationships between dependent and independent variables in data that the human brain may fail to detect. Nowadays, ANNs are widely used for medical applications in various disciplines of medicine, especially cardiology. ANNs have been extensively applied in diagnosis, electronic signal analysis, medical image analysis and radiology. ANNs have been used by many authors for modeling in medicine and clinical research. Applications of ANNs are increasing in pharmacoepidemiology and medical data mining. In this paper, the authors summarize various applications of ANNs in medical science. Full text (free) available at https://in.booksc.eu/dl/55097223/a0a523
Article
Over the past three years 25 302 adults in Kentucky have been tested for haemoglobinopathies, and of these, haemoglobin A2 was measured on 3734, 1973 with microcytosis and 1761 within the normal range. The best methods of detecting beta-thalassaemia minor using red-blood-cell indices were compared. No method detected all heterozygotes. A new method was devised consisting of three parts: (1) haemoglobin electrophoresis, (2) calculation of the product of the square of the mean corpuscular volume (M.C.V.) multiplied by the mean corpuscular haemoglobin (M.C.H.) measured in units of one hundred, (3) A2 determination on all AA samples with (M.C.V.)² × M.C.H. less than 1530 and on those with variant genotypes consistent with thalassaemia. In this series this new method detected 137 out of 138 heterozygotes with 4·4% false-positives.
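Step (2) and the cut-off in step (3) can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes M.C.V. in fL and M.C.H. in pg, interprets "units of one hundred" as division by 100, and uses hypothetical function names and sample values.

```python
def mcv2_mch_index(mcv_fl, mch_pg):
    """(M.C.V.)^2 x M.C.H., expressed in units of one hundred (assumed: divide by 100)."""
    return mcv_fl ** 2 * mch_pg / 100.0

def flag_for_a2(mcv_fl, mch_pg, cutoff=1530.0):
    """True when the index falls below the cut-off quoted in the abstract."""
    return mcv2_mch_index(mcv_fl, mch_pg) < cutoff

# Hypothetical samples: one microcytic, one with normal indices.
print(mcv2_mch_index(65, 20), flag_for_a2(65, 20))
print(mcv2_mch_index(90, 30), flag_for_a2(90, 30))
```

The sketch covers only the arithmetic screen. Electrophoresis and the handling of variant genotypes, which the method also requires, are outside its scope.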
Article
A novel red cell discriminant function [MCV² × RDW/(Hgb × 100)] was compared to six other discriminants in 102 patients with established mild iron deficiency anemia and 33 patients with beta-thalassemia minor. The discriminant incorporates the two key measurements of erythrocyte cell volume distribution, namely the mean (MCV) and standard deviation (RDW), which are known to be helpful for distinguishing between these two frequent causes of microcytic hypochromic anemia. Data used for the learning set to develop the new discriminant were obtained using an electrical impedance automated whole blood analyzer (Coulter S + IV) and were applied as a validation set for six other discriminants. The discriminant was also tested on smaller subsets of the patient groups using data obtained on either an alternate electrical impedance instrument (Sysmex E-5000) or a laser light scattering based system (Technicon H*1). From the comparison it was concluded that use of a discriminant function that incorporates a measurement of red cell volume dispersion results in enhanced accuracy for distinguishing iron deficiency anemia from thalassemia minor.
Article
Some cases of iron deficiency and heterozygous β-thalassæmia (thalassæmia trait) can be differentiated on the basis of the red-blood-cell count (R.B.C.) alone. However, in 72 patients presenting with microcytosis, a simple discriminant function derived from the mean cell volume (M.C.V. in fl.), the R.B.C. (×10⁶ per μl.), and the hæmoglobin concentration (Hb in g. per 100 ml.) correctly identified all but one of the cases studied (99%). The function took the following form: D.F. = M.C.V. − R.B.C. − (5 × Hb) − 3·4. A positive value indicated iron deficiency and a negative value indicated thalassæmia trait. This function is not applicable in pregnancy.
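The discriminant function above is simple enough to write out directly. The sketch below is a hedged illustration, not the authors' implementation: the function names and sample values are hypothetical, and the handling of a value of exactly zero is an assumption, since the abstract does not address it.

```python
def discriminant_function(mcv_fl, rbc_millions_per_ul, hb_g_per_dl):
    """D.F. = M.C.V. - R.B.C. - (5 x Hb) - 3.4, with units as stated in the abstract."""
    return mcv_fl - rbc_millions_per_ul - 5 * hb_g_per_dl - 3.4

def interpret(df_value):
    """Positive suggests iron deficiency, negative suggests thalassaemia trait."""
    if df_value > 0:
        return "iron deficiency"
    if df_value < 0:
        return "thalassaemia trait"
    return "indeterminate"  # assumption: the source does not cover D.F. == 0

# Hypothetical microcytic samples (not patient data).
for mcv, rbc, hb in [(62, 5.8, 11.0), (72, 4.2, 9.5)]:
    df = discriminant_function(mcv, rbc, hb)
    print(round(df, 1), interpret(df))
```

As the abstract notes, the function is not applicable in pregnancy, so any real use would need that exclusion applied before the calculation.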
Article
New automated blood cell analyzers provide an index of red cell volume distribution width (RDW) or heterogeneity and a histogram display of red cell volume distribution. We have developed a classification of red cell disorders, based on mean corpuscular volume (MCV) or red cell size, heterogeneity, and histograms, to guide diagnosis from the peripheral blood analysis. The distinction of iron deficiency anemia from heterozygous thalassemia or the anemia of chronic disease and the detection of early iron and folate deficiency is improved. Red cell volume distribution histograms identify red cell fragmentation or agglutination, dimorphic populations, and artifactual counting of lymphocytes as red cells. We recommend the use of these new variables in the initial classification of anemia by the practicing physician.
Article
The differentiation between thalassemic and non-thalassemic microcytosis has important clinical implications in hematology and medicine. A simplified index, based on red cell parameters derived from automated blood cell analyzers, which could be used to discriminate between microcytic patients with a high probability of thalassemia minor and those with a low probability, would be an extremely useful tool. Five mathematical indices have been proposed as useful for this purpose. These are the Bessman index, Shine and Lal index, England index, Mentzer index, and mean cell volume (MCV) alone. This study was designed to prospectively evaluate the efficacy of these indices. Patient samples were chosen every fourth day from all patient samples referred to the hematology laboratory at St. Joseph's Hospital over a 6-month period. All patient samples with an MCV < 80 fL and age ≥ 18 years were considered eligible for the study. After enrollment and laboratory analysis were complete, sensitivities and specificities were calculated for each of the indices using a variety of cut-off values, and receiver operator characteristic (ROC) curves were constructed. Based on statistical analysis of the area under these curves, the authors conclude that MCV alone is as effective as the Mentzer and Shine and Lal indices in selecting microcytic patient samples with a high probability of thalassemia minor for thalassemia testing. They also conclude that the Bessman index and the England index are ineffective indices for this purpose.
Article
Thalassemias are pathologies that derive from genetic defects of the globin genes. The most common defects among the population affect the genes that are involved in the synthesis of alpha and beta chains. The main aspects of these pathologies are well explained from a biochemical and genetic point of view. The diagnosis is fundamentally based on hematologic and genetic tests. A genetic analysis is particularly important to determine the carriers of alpha-thalassemia, whose identification by means of the hematologic parameters is more difficult in comparison with heterozygotes for beta-thalassemia. This work investigates the use of artificial neural networks (ANNs) for the classification of thalassemic pathologies using the hematologic parameters resulting from hemochromocytometric analysis only. Different combinations of ANNs are reported, which allow thalassemia carriers to be discriminated from normals with 94% classification accuracy, 92% sensitivity, and 95% specificity. On the basis of these results, an automated system that allows real-time support for diagnoses is proposed. The automated system interfaces a hemochromo analyzer to a simple PC.
Article
Iran has a high prevalence of beta-thalassemia trait, about 5-10%. The prevalence of Cooley's anemia has declined from 11.6 in 10000 population to 7.2 in 10000 in a five-year period due to a premarital screening program for beta-thalassemia trait. This study was conducted to compare the sensitivity of mean corpuscular hemoglobin (MCH) < 27 pg and mean corpuscular volume (MCV) < 80 fl as screening tests in the first step of screening for beta-thalassemia trait. From 2449 couples (4898 cases) participating in premarital screening at our clinic, 902 cases with either MCH < 27 pg, MCV < 80 fl, anemia, pallor or family history of beta-thalassemia were enrolled in the study. MCV, MCH and Hb A2 were measured in all cases. MCH and MCV had sensitivities of 98.5% and 97.6% for the diagnosis of beta-thalassemia trait, respectively. The false-negative rate of MCH is about 1% lower than that of MCV. MCH is a more sensitive screening test for detecting beta-thalassemia minor before marriage.
Article
Hypochromic microcytic anemias are frequently encountered in clinical practice. In most cases they are due to iron deficiency, but in some geographical areas other disorders, such as the β-thalassemia trait (β-TT), must be considered [1, 2]. β-Thalassemia is a genetically determined disorder in which the defect in the β-globin gene results in a decreased production of hemoglobin (Hb) A1. Clinically, the disease exists in two forms: β-thalassemia minor (heterozygous β-TT) and β-thalassemia major (homozygous β-TT). β-Thalassemia is frequently seen in Mediterranean countries, Africa and Asia and it is an important public health problem [3]; its prevalence was found to be 10.2% in Antalya [4].
Article
Although neutrophils are essential components of the natural immune system, they have also been implicated in the pathogenesis of tissue injuries. We assessed the clinical significance of neutrophil apoptosis in the peripheral blood of patients with type 2 diabetes mellitus (T2DM). Patients and methods: The study included 52 patients with T2DM (30 men, 22 women). Control subjects were 16 healthy volunteers without diabetes (7 men, 9 women). Neutrophil apoptosis levels were measured as the active caspase-3 positive rate by flow cytometry. Results: The mean rate of neutrophil apoptosis in patients with T2DM was 15.0% (95% confidence interval [CI]: 9.5%-20.5%), while that in the control group was 5.8% (95% CI: 1.6%-10.0%). There were significant negative correlations between neutrophil apoptosis rate and hemoglobin (Hb) A1c levels (r = -0.352, P < .01). The mean rate of neutrophil apoptosis in the patient group with the 3 major complications (diabetic retinopathy, nephropathy, and neuropathy) was 11.1% (95% CI: 5.5%-16.7%, n = 36) and that of the group without complications was 23.8% (95% CI: 11.4%-36.2%, n = 16). There was a significant difference between these 2 groups (P < .05). Conclusions: The neutrophil apoptosis rate in patients with T2DM was significantly correlated with HbA1c levels. The mean rate of neutrophil apoptosis in the patient group with 3 major diabetic complications remained lower than that in the patient group without complications. The inhibition of neutrophil apoptosis by chronic hyperglycemia is thought to promote tissue injury and to enhance the risk of microangiopathy.
Article
Beta-thalassemia screening is primarily limited to pregnant women. The ratio of the mean corpuscular volume (MCV) to the red blood cell count (RBC) can be automatically calculated with any of the newer hematology analyzers. The results of 398 patient screens were collected. Data from the set were divided into training and validation subsets. The decision point of the Mentzer ratio was determined through a receiver operating characteristic (ROC) curve on the first subset, and the second subset was then screened for thalassemia. HgbA2 levels were used to confirm beta-thalassemia. We determined the correct decision point of the Mentzer index to be a ratio of 20. Physicians can screen patients using this index before further evaluation for beta-thalassemia (P < .05). The proposed method can be implemented by hospitals and laboratories to flag positive matches for further definitive evaluation, and will enable beta-thalassemia screening of a much larger population at little to no additional cost.
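The train-then-validate procedure described here can be sketched in a few lines. This is an assumption-laden illustration, not the study's method: the abstract does not say how the decision point was read off the ROC curve, so the sketch uses Youden's J (sensitivity + specificity − 1) as one common choice, assumes that a lower MCV/RBC ratio suggests thalassemia, and runs on made-up values that will not reproduce the reported cut-off of 20.

```python
def mentzer_index(mcv_fl, rbc_millions_per_ul):
    """Mentzer ratio: MCV divided by RBC count."""
    return mcv_fl / rbc_millions_per_ul

def sens_spec(ratios, is_thal, cutoff):
    """Sensitivity and specificity when ratio < cutoff is called screen-positive."""
    tp = sum(1 for r, y in zip(ratios, is_thal) if y and r < cutoff)
    fn = sum(1 for r, y in zip(ratios, is_thal) if y and r >= cutoff)
    tn = sum(1 for r, y in zip(ratios, is_thal) if not y and r >= cutoff)
    fp = sum(1 for r, y in zip(ratios, is_thal) if not y and r < cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(ratios, is_thal, candidates):
    """Pick the candidate cut-off maximizing Youden's J (assumed criterion)."""
    return max(candidates, key=lambda c: sum(sens_spec(ratios, is_thal, c)) - 1)

# Hypothetical (MCV, RBC, confirmed-thalassemia) records, not study data.
train = [(62, 5.9, True), (65, 5.6, True), (68, 5.4, True), (70, 5.0, True),
         (72, 4.5, False), (76, 4.2, False), (85, 4.6, False), (90, 4.5, False)]
valid = [(64, 5.7, True), (71, 5.2, True), (80, 4.4, False), (74, 4.8, False)]

train_ratios = [mentzer_index(m, r) for m, r, _ in train]
train_labels = [y for _, _, y in train]
cutoff = best_cutoff(train_ratios, train_labels, range(10, 25))

valid_ratios = [mentzer_index(m, r) for m, r, _ in valid]
valid_labels = [y for _, _, y in valid]
print(cutoff, sens_spec(valid_ratios, valid_labels, cutoff))
```

Choosing the cut-off on one subset and reporting sensitivity and specificity on the other keeps the performance estimate from being tuned to the same samples.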
Article
Beta-thalassaemia minor and iron deficiency are the most common causes of microcytosis and/or hypochromasia. The present study evaluates the diagnostic reliability of different RBC indices and formulas, as well as our proposed formula, in the differentiation of beta-thalassaemia minor from iron deficiency in the Palestinian population. Complete blood count (CBC) parameters of 2196 definitively diagnosed samples (1272 beta-thalassaemia minor and 924 iron deficiency) were used to evaluate the following indices and formulas: Bessman index (RDW), Mentzer formula (MCV/RBC), England and Fraser formula (MCV − RBC − 5 × Hb − 3.4), Shine and Lal formula (MCV² × MCH/100), Ehsani formula (MCV − 10 × RBC), Srivastava formula (MCH/RBC), Green and King formula (MCV² × RDW/(Hb × 100)), red cell distribution width index, RDWI (RDW × MCV/RBC), RDW/RBC, as well as our formula (MCV − RBC − 3 × Hb). For each index and formula, the receiver operating characteristic (ROC) curve was constructed to calculate the area under the curve (AUC); in addition, sensitivity, specificity, and likelihood ratios were calculated. No significant differences were found between our formula, the Green-King formula and the RDWI (P > 0.05) in discriminating beta-thalassaemia minor from iron deficiency (AUC = 0.914, 0.909 and 0.907 respectively). However, these three indices and formulas showed the highest efficiencies and were significantly (P < 0.05) better than the others in discrimination efficiency. It was concluded that our formula, the Green-King formula and the RDWI provided the highest reliabilities in differentiating beta-thalassaemia minor from iron deficiency in the Palestinian population, while the Bessman index was poor and ineffective for that purpose.
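The AUC comparison for the three best-performing formulas can be illustrated with a small sketch. It is not the study's analysis: the records are hypothetical, the rank-based AUC estimator is one standard choice that the abstract does not specify, and the sketch assumes that beta-thalassaemia minor yields lower values than iron deficiency for all three formulas.

```python
# Each formula takes a dict with MCV (fL), RBC (10^6/uL), Hb (g/dL), RDW (%).
FORMULAS = {
    "authors' formula": lambda p: p["MCV"] - p["RBC"] - 3 * p["Hb"],
    "Green-King": lambda p: p["MCV"] ** 2 * p["RDW"] / (p["Hb"] * 100),
    "RDWI": lambda p: p["RDW"] * p["MCV"] / p["RBC"],
}

def auc_lower_is_positive(pos_scores, neg_scores):
    """Rank-based AUC: chance that a positive case scores below a negative one.

    Ties count as one half. Equivalent to the normalized Mann-Whitney U statistic.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical records for illustration only (not data from the study).
thal = [{"MCV": 62, "RBC": 5.9, "Hb": 11.5, "RDW": 14.5},
        {"MCV": 65, "RBC": 5.6, "Hb": 11.8, "RDW": 15.0},
        {"MCV": 68, "RBC": 5.4, "Hb": 12.0, "RDW": 14.0}]
ida = [{"MCV": 72, "RBC": 4.3, "Hb": 9.5, "RDW": 18.0},
       {"MCV": 75, "RBC": 4.5, "Hb": 10.2, "RDW": 17.0},
       {"MCV": 70, "RBC": 4.0, "Hb": 9.0, "RDW": 19.0}]

for name, f in FORMULAS.items():
    score = auc_lower_is_positive([f(p) for p in thal], [f(p) for p in ida])
    print(name, score)
```

On real data the groups overlap, so the AUC falls below 1 and the formulas can be ranked against each other, which is the comparison the abstract reports.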
Sivanandam SN, Sumathi S, Deepa SN. Introduction to Neural Networks Using Matlab 6.0. Tata McGraw-Hill, Noida, UP, India; 2006.
Urrechaga E, Borkue L, Escanero JF. Erythrocyte and reticulocyte parameters in iron deficiency and thalassemia. J Clin Lab Anal 2011;25:223-228.