International Journal of Electrical and Computer Engineering (IJECE)
Vol. 5, No. 6, December 2015, pp. 1569~1576
ISSN: 2088-8708 1569
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Comparing Performance of Data Mining Algorithms in
Prediction of Heart Diseases
Moloud Abdar¹, Sharareh R. Niakan Kalhori², Tole Sutikno³, Imam Much Ibnu Subroto⁴, Goli Arji⁵
¹ Department of Engineering, Damghan University, Iran
² Department of Health Information Management, Tehran University of Medical Sciences, Iran
³ Department of Electrical Engineering, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
⁴ Department of Informatics Engineering, Universitas Islam Sultan Agung, Semarang, Indonesia
⁵ Health Information Management, Tehran University of Medical Sciences, Iran
Article Info
Article history:
Received Aug 4, 2015
Revised Oct 11, 2015
Accepted Oct 27, 2015
ABSTRACT
Heart diseases are among the leading causes of mortality and morbidity. Data
mining techniques can predict the likelihood that a patient will develop heart disease. The
purpose of this study is to compare different data mining algorithms for the prediction of
heart diseases. After feature analysis, models based on six algorithms, including decision
tree, neural network, support vector machine and k-nearest neighbor, were developed and
validated. The C5.0 decision tree built the model with the greatest accuracy,
93.02%; KNN, SVM and neural network achieved 88.37%, 86.05% and 80.23%,
respectively. The results produced by the decision tree are easy to interpret and
apply; its rules can be readily understood by clinical practitioners.
Keyword:
C5.0 Algorithm
Data Mining
Heart Disease
Neural Network
Copyright © 2015 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Goli Arji,
Faculty of Allied Medical Sciences
Tehran University of Medical Sciences
Tehran, Iran
Email: Goliarji@ymail.com
1. INTRODUCTION
According to the latest statistics from the World Health Organization (WHO), heart diseases receive a
great deal of attention in medical research because of their impact on human health [1]. Cardiovascular disease is
the number one cause of death in industrialized countries; it has a major impact not only on individuals
and their quality of life, but also on public health costs and national economies. Diagnosing
heart disease is among the more costly diagnostic decisions, and Artificial Intelligence (AI) techniques have been widely used
in medical diagnosis. With the advancement of science, the volume of data accumulated in various fields has
grown so much that it is known as the information explosion [2]. Analyzing the accumulated data
can reveal useful hidden information. Data mining, a relatively new science, makes it
possible to extract this hidden knowledge. It reveals useful relationships
among the data, and the resulting rules can support sound decision making [3],[4]. Classification is one of the subdivisions
of data mining and can be expressed in the form of If-Then rules. Its purpose is to predict a target variable from other
features, known as predictors. Neural network, support vector machine and decision tree are
different forms of classification algorithms [5-9]. The purpose of this study is to compare different machine
learning algorithms for the prediction of heart diseases.
The remainder of this section summarises various technical articles on the KDD process and data mining classification
techniques applied to heart disease datasets:
Ram Bilas Pachori and his colleagues [10] studied the diagnosis of heart disease using the
tunable-Q wavelet transform applied to heart rate signals. Since manual data entry is error-prone and
time-consuming, the Tunable-Q Wavelet Transform (TQWT) method is recommended in that study.
Using the least squares support vector machine (LS-SVM), they reported an accuracy of 96.8%,
a sensitivity of 100% and a specificity of 93.7%.
Another study, conducted by Yongqiang Lyu et al. [11], proposed an evaluation model of
coronary heart disease (CHD) using a data mining algorithm. This research suggests a new dynamic model, based on a
linear time-invariant approach, that makes lifetime assessment of CHD possible. The model results based
on SYNTAX scores indicate a possible error of 5%. The authors of [12] used the J4.8 decision tree
method and reported a precision of 84.1%.
In another study, Sumit Bhatia et al. [13] used a genetic algorithm, SVM and SSVM for the
classification of cardiac patients. The features were selected by the genetic algorithm to give the SSVM
the best set of inputs. The precision obtained was 72.55%, whereas GA-SSVM
improved the result to a precision of 90.57%. Peter C. Austin and colleagues [14]
discuss heart failure in their paper. The associated physicians divided the patients into two groups,
"with" and "without" disease. They found that decision trees in data mining gave better
results than the regression model. Saba Bashir et al. [15] applied the MV5 algorithm and obtained a precision
of 88.52%.
Another study, by Jesmin Nahar et al. [16], sought relationships between heart disease
risk factors in men and women. It reports that the risk of coronary heart disease is lower in women than in
men, and that through exercise both men and women can easily overcome their chest pain. One of the findings of this
paper is that "Rest ECG", in both its normal and hyper forms, and a flat "Slope" are
risk factors. However, the research results indicate that for men Rest ECG is a risk factor only in
its hyper form. The study concludes that Rest ECG should be considered an important factor for predicting heart
disease in women. The techniques Apriori, Predictive Apriori and Tertius were compared
with each other, and the precision of Predictive Apriori was 90%.
Kyle Walker et al. [17] note that heart disease is
the principal cause of death in Texas, USA. They therefore performed a study on different areas of Texas
using cluster analysis; the results show that factors such as poor hygiene, economic deprivation and other
conditions affect the prevalence of the disease.
In the paper presented by K. Rajeswari and colleagues [18], heart disease is studied using a
neural network. They examined the influence of feature selection on a neural network algorithm for
identifying patients with ischemic heart disease, using 12 features. The results of their
study show that when all the features (attributes) are applied, the precision is 89.4% in training and
82.2% in testing. An interesting point in their conclusion is that any reduction in the input features decreases the
precision in both training and testing. A. V. Senthil Kumar [19] applied a fuzzy mechanism to
cardiac patients; the precision calculated in that paper was 94.11%. Some examples of research done on
cardiac patients with different techniques have been briefly mentioned above.
2. RESEARCH METHOD
The present study was conducted using data from the University of California, Irvine (UCI) repository. The data set
includes 13 features and 2 classes, "with" and "without" heart disease. After feature analysis,
models based on six algorithms, including decision tree, neural network, support vector machine and k-nearest
neighbor, were developed and validated.
2.1. C5.0 Algorithm
The C5.0 algorithm, developed from the C4.5 algorithm, is one of the most important and widely used
algorithms in data mining. C4.5 itself is an extension of the ID3 algorithm. C5.0 can be
applied for classification either as a decision tree or as a set of rules. Because their rule sets are easy to understand,
such models are preferred in many applications. The strengths of the algorithm are its handling of missing values and of
large numbers of inputs, as well as its short training time [20], [21], [22], [23].
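Let S be the training set and let X be a test on an attribute that divides S into n subsets
(S_1, S_2, ..., S_n). To evaluate candidate attributes, the algorithm uses a criterion called the gain ratio [24].
Let freq(C_i, S) denote the number of samples in S that belong to class C_i (i = 1, 2, ..., N), and let |S| be the
number of samples in S. The probability that a sample belongs to C_i is then freq(C_i, S)/|S|.
The information of the training set is calculated according to the formula:

$$\mathrm{Info}(S) = -\sum_{i=1}^{N} \frac{\mathrm{freq}(C_i, S)}{|S|} \log_2 \frac{\mathrm{freq}(C_i, S)}{|S|} \tag{1}$$

This is the information needed to identify the class of a sample in S. After the division of S into
its subsets, the gain ratio is calculated as follows:

$$\mathrm{Info}_X(S) = \sum_{j=1}^{n} \frac{|S_j|}{|S|}\, \mathrm{Info}(S_j) \tag{2}$$

$$\mathrm{Gain}(X) = \mathrm{Info}(S) - \mathrm{Info}_X(S) \tag{3}$$

$$\mathrm{SplitInfo}(X) = -\sum_{j=1}^{n} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|} \tag{4}$$

$$\mathrm{GainRatio}(X) = \frac{\mathrm{Gain}(X)}{\mathrm{SplitInfo}(X)} \tag{5}$$

The evaluation indices used in this study are defined from the confusion matrix counts TP, TN, FP and FN (Section 2.5):

$$\mathrm{Specificity} = \frac{TN}{FP + TN} \tag{6}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{7}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{8}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{9}$$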
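The gain ratio criterion described in this section can be illustrated with a short sketch. It assumes the standard C4.5-style definitions (class entropy, information after the split, split information) and a categorical attribute; the function names are hypothetical and this is not the C5.0 implementation used in the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Info(S): class entropy of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio obtained by splitting `labels` on the attribute `values`."""
    total = len(labels)
    # Group the class labels by attribute value: one subset S_j per value.
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # Info_X(S): weighted entropy of the subsets after the split.
    info_x = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - info_x
    # SplitInfo(X): entropy of the partition sizes themselves.
    split_info = -sum(len(s) / total * math.log2(len(s) / total)
                      for s in subsets.values())
    # A split that leaves everything in one subset carries no information.
    return gain / split_info if split_info > 0 else 0.0


# Toy data: the attribute separates the two classes perfectly.
print(gain_ratio([0, 0, 1, 1], [1, 1, 2, 2]))
```

An attribute that is unrelated to the class (for example values `[0, 1, 0, 1]` against the same labels) yields a gain of zero, so it would never be preferred for a split.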
2.2. SVM Algorithm
Support Vector Machine (SVM) is a supervised learning algorithm introduced by Vapnik in 1995. The
algorithm is based on controlling the generalization error by maximizing the margin. It constructs a "hyperplane" that
divides the data into classes so that all samples belonging to one class fall on one side and the
rest on the other side. A linear SVM classifier performs this task with a separating line, which is
chosen so that it has the largest margin to the nearest samples of each class [13], [25].
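The role of the hyperplane and its margin can be sketched in a few lines. The sketch assumes a linear classifier whose weight vector `w` and bias `b` have already been learned (training is not shown) and that the hyperplane is in canonical form, so the margin width is 2/||w||; all names and numbers are hypothetical.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b of sample x relative to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Assign +1 to samples on one side of the hyperplane and -1 to the rest."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Distance between the two margin boundaries w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane x1 - x2 = 0 in a two-feature space.
w, b = [1.0, -1.0], 0.0
print(classify(w, b, [2.0, 1.0]), classify(w, b, [1.0, 2.0]), margin_width(w))
```

Among all hyperplanes that separate the classes, an SVM prefers the one with the smallest ||w||, which is the one with the widest margin.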
2.3. KNN Algorithm
The k-nearest neighbor algorithm is a method for classification based on similarity to other cases. Cases that are
close to each other are called "neighbors". When a new case is presented, its distance from each of the cases in the model
is calculated. The cases with the smallest distances are its nearest neighbors, that is, the most
similar cases, and the new case is placed in the group that contains most of these nearest neighbors. The algorithm is also
able to predict a continuous target. In this situation, the average or the median target value of
the nearest neighbors is used as the predicted value of the new case [26].
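The procedure described above can be sketched as follows. The sketch assumes Euclidean distance and a majority vote among the k nearest cases, with the mean used for a continuous target; the distance measure and the value of k used in the study are not stated, so these choices and all names are illustrative.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3, regression=False):
    """Predict the target of `query` from its k nearest training cases."""
    # Rank the training cases by Euclidean distance to the query.
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    neighbors = [y for _, y in ranked[:k]]
    if regression:
        # Continuous target: average the neighbors' target values.
        return sum(neighbors) / len(neighbors)
    # Categorical target: majority vote among the neighbors.
    return Counter(neighbors).most_common(1)[0][0]


# Toy data with two well separated groups labeled 1 and 2.
train_x = [(0, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_y = [1, 1, 2, 2, 2]
print(knn_predict(train_x, train_y, (0.2, 0.1), k=3))
```

In practice the features are usually rescaled first, since a variable with a wide range (such as serum cholesterol) would otherwise dominate the distance.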
2.4. Neural Network Algorithm
An Artificial Neural Network is a data processing algorithm inspired by the human brain. The system
consists of a large number of small processing units. These units operate in parallel as an
interconnected network to solve a problem. In such
networks, a data structure is programmed to act like a biological neuron; this data structure is itself called a neuron [27],
[28], [29], [30].
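A single artificial neuron and a small network built from such neurons can be sketched as below. The sketch assumes a feed-forward layout with one hidden layer and sigmoid activations; the network architecture used in the study is not described, so the structure, weights and names here are purely illustrative.

```python
import math


def sigmoid(z):
    """Squash a real-valued input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One neuron: weighted sum of its inputs followed by an activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def forward(inputs, hidden_layer, output_neuron):
    """Propagate inputs through one hidden layer and a single output neuron.

    `hidden_layer` is a list of (weights, bias) pairs, one per hidden neuron;
    `output_neuron` is a single (weights, bias) pair.
    """
    hidden = [neuron(inputs, w, b) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return neuron(hidden, w_out, b_out)


# Hypothetical weights: two inputs, two hidden neurons, one output.
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_neuron = ([1.0, -1.0], 0.0)
print(forward([1.0, 0.0], hidden_layer, output_neuron))
```

Training consists of adjusting the weights and biases so that the output matches the known class labels; that step is omitted here.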
2.5. Accuracy Measurement
To evaluate the prediction rate, several indices such as specificity, sensitivity,
precision and accuracy are used to assess the models' validity. These indices (equations 6-9) are calculated from
the confusion matrix (Figure 1). This matrix is a useful tool for analyzing how well a classification
method assigns observations to the various categories. In the ideal state, most of the
observations are located on the main diagonal of the matrix, and the remaining values of the
matrix are zero or near zero [31], [32].
FN = the number of positively labeled data that have been falsely classified as "Negative".
TN = the number of negatively labeled data that have been correctly classified as "Negative".
TP = the number of positively labeled data that have been correctly classified as "Positive".
FP = the number of negatively labeled data that have been falsely classified as "Positive".
Figure 1. Confusion matrix
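The four indices follow directly from the confusion matrix counts, as the sketch below shows. It uses the standard definition of sensitivity, TP/(TP+FN). The counts in the example are hypothetical and are not taken from the paper; they were chosen only so that the resulting rates fall close to those reported for the C5.0 test set.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the four evaluation indices from confusion matrix counts."""
    return {
        "specificity": tn / (fp + tn),              # true negative rate
        "sensitivity": tp / (tp + fn),              # true positive rate
        "precision": tp / (tp + fp),                # positive predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical counts for a test set of 86 cases (illustrative only).
metrics = classification_metrics(tp=40, fn=2, fp=4, tn=40)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f} %")
```

Sensitivity and precision share the same numerator but differ in the denominator: sensitivity divides by all truly positive cases, precision by all cases predicted positive.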
2.6. Data Set
In this study, 270 records with 13 features were used [33]. The patients' attributes applied for
modeling, their definitions and their ranges of values are presented in Table 1.
Table 1. Patients' attributes applied for modeling, their definitions and their ranges of values
| Variable | Variable Definition | Categories of Values |
|---|---|---|
| Age | Age of the patient | [29-77] |
| Sex | Gender of the patient | 1 = male; 0 = female |
| CP | Chest pain type | [1-4] |
| RBP | Resting blood pressure | [94-200] |
| SC | Serum cholesterol in mg/dl | [126-564] |
| FBS | Fasting blood sugar > 120 mg/dl | [0-1] |
| RER | Resting electrocardiographic results | [0-2] |
| MHRA | Maximum heart rate achieved | [71-202] |
| EIA | Exercise induced angina | [0-1] |
| Oldpeak | ST depression induced by exercise relative to rest | [0-6.2] |
| Slope | The slope of the peak exercise ST segment | [1-3] |
| NUM | Number of major vessels (0-3) colored by fluoroscopy | [0-3] |
| Thal | Normal, fixed defect, reversible defect | [3, 6, 7] |
| Variable to be predicted | Class of heart disease | Absence (1) or presence (2) of heart disease |
Using logistic regression, the variables that are significantly correlated with the target variable were
selected as predictors (P <= 0.05). They are presented and defined in Table 2.
Table 2. Variables significantly correlated with the target variable according to logistic regression
| Variable | Variable Definition | Categories of Values | B | Wald | Sig. | Exp(B) |
|---|---|---|---|---|---|---|
| Sex | Gender | 1 = male; 0 = female | 1.104 | 6.337 | 0.012 | 3.018 |
| CP | Chest pain type | [1-4] | 0.731 | 13.648 | 0.000 | 2.077 |
| RBP | Resting blood pressure | [94-200] | 0.023 | 5.238 | 0.022 | 1.023 |
| EIA | Exercise induced angina | [0-1] | 1.236 | 10.182 | 0.001 | 3.442 |
| NUM | Number of major vessels (0-3) colored by fluoroscopy | [0-3] | 1.133 | 25.224 | 0.000 | 3.106 |
| Thal | Normal, fixed defect, reversible defect | [3, 6, 7] | 0.397 | 16.848 | 0.000 | 1.488 |
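The last column of Table 2 can be read as the odds ratio of each predictor, assuming it holds the exponential of the coefficient B, as is usual in logistic regression output. The short check below recomputes exp(B) from the B values in the table; the dictionary and function names are hypothetical.

```python
import math

# Coefficients B and reported Exp values copied from Table 2.
coefficients = {"Sex": 1.104, "CP": 0.731, "RBP": 0.023,
                "EIA": 1.236, "NUM": 1.133, "Thal": 0.397}
reported_exp = {"Sex": 3.018, "CP": 2.077, "RBP": 1.023,
                "EIA": 3.442, "NUM": 3.106, "Thal": 1.488}


def odds_ratio(b):
    """Multiplicative change in the odds for a one-unit increase of the predictor."""
    return math.exp(b)


for name, b in coefficients.items():
    print(f"{name}: exp({b}) = {odds_ratio(b):.3f} (table: {reported_exp[name]})")
```

The recomputed values agree with the table to within rounding of the coefficients. Under this reading, a one-unit increase in resting blood pressure multiplies the odds of the positive class by about 1.02, while the presence of exercise induced angina multiplies them by about 3.4.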
3. RESULTS AND ANALYSIS
This section presents the experimental results and analysis of this study. Four
classifiers were applied: C5.0, SVM, KNN and Neural Network. The data were divided into a training set and a test set (70% and
30%, respectively). The training set is used to build the classifier and the test set is used to validate it. Model
development was evaluated in two main steps: model fitness and model accuracy. The
model fitness criteria were calculated on the training set, whereas the model accuracy
measurements were computed on the test set, which is much more valuable for judging the accuracy of our
models. The results of these experiments are shown in Table 3.
Table 3. Comparison of model fitness (training set) and model accuracy (testing set) of the applied machine learning
algorithms
| Algorithm | Specificity (training) | Sensitivity (training) | Precision (training) | Training Accuracy | Specificity (testing) | Sensitivity (testing) | Precision (testing) | Testing Accuracy |
|---|---|---|---|---|---|---|---|---|
| C5.0 | 89.62 % | 84.61 % | 85.71 % | 87.50 % | 90.90 % | 95.23 % | 90.90 % | 93.02 % |
| SVM | 84.90 % | 79.48 % | 79.48 % | 82.61 % | 90.90 % | 80.95 % | 89.47 % | 86.05 % |
| KNN | 91.50 % | 79.48 % | 87.32 % | 86.41 % | 88.63 % | 88.09 % | 88.09 % | 88.37 % |
| Neural Network | 91.50 % | 78.20 % | 87.14 % | 85.87 % | 86.36 % | 73.80 % | 83.78 % | 80.23 % |
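The partition into training and test sets described in this section can be sketched as a random split. The software and the exact partitioning procedure used in the study are not stated, so the function below, its seed and its exact 70/30 cut are assumptions made for illustration.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and cut them into a training part and a test part."""
    shuffled = list(records)
    # A fixed seed keeps the partition reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 270 record indices stand in for the 270 patient records.
train, test = train_test_split(range(270))
print(len(train), len(test))
```

Fitness measures are then computed on `train` and accuracy measures on `test`, so that the reported accuracy reflects cases the model has not seen.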
The C5.0 decision tree built the model with the greatest accuracy, with a prediction
accuracy of 93.02%. The accuracies obtained by the other classifiers,
KNN, SVM and Neural Network, were 88.37%, 86.05% and 80.23%, respectively. Analysis of variable
importance in the C5.0 model shows that features such as Thal, CP and Slope are very important in
the prediction of heart diseases (Figure 2).
Figure 2. Variable importance for heart disease prediction based on the C5.0 model
Figures 3 and 4 are comparative ROC curves for the risk of heart diseases. These figures show the
ROC curves for logistic regression and the C5.0 decision tree. C5.0 outperformed logistic regression, with an
area under the curve (AUC) of 0.869; the AUC for logistic regression was 0.835. Overall, these AUC
results reveal the better performance of the C5.0 decision tree classification algorithm.
Figure 3. ROC curve for logistic regression
Figure 4. ROC curve for C5.0 decision tree
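The area under a ROC curve can be computed without drawing the curve, using its equivalence with the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below assumes labels coded 1 (positive) and 0 (negative) and counts ties as one half; the scores are made up and the function name is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up classifier scores for four cases (two positive, two negative).
print(auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for comparing the values reported for the two models.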
In studies comparing data mining tools on heart disease data sets [34],
[35], variables such as blood pressure, blood sugar, age and sex showed a significant association with heart
diseases. The study conducted by Jesmin Nahar and her colleagues [16] also pointed out that sex was highly
important in predicting heart disease, whereas in this study features such as resting blood pressure, sex, chest
pain type, exercise induced angina and number of major vessels played a major role. Zahra
Alizadeh Sani et al. [36] used the C4.5 and Bagging algorithms to diagnose coronary heart disease and
reported the best accuracy rate for the C4.5 algorithm. K. Rajeswari et al. [18] applied a neural network to
ischemic heart disease; the accuracy obtained for training and testing was 89.4% and 82.2%,
respectively. T. John Peter and K. Somasundaram [37] used a hybrid attribute selection method for the
prediction of heart disease; the accuracy obtained by this model was 83.62%. Kemal Polat and Salih Gunes
[38] obtained 92.59% accuracy using the C4.5 decision tree algorithm.
4. CONCLUSION
In this study, KNN, SVM, C5.0, Logistic Regression and Neural Network were implemented on the
UCI dataset. Among the investigated methods, the decision tree achieved the best performance. Different
issues influence the performance of the applied models, including the type of problem and the type of input
data (discrete or continuous). The dataset was mainly discrete, and the decision tree is also able to handle
numerical data. Because the output variable is labeled with two classes, 'with' and 'without' heart disease, the decision
tree yielded better performance than the other algorithms.
Decision trees are able to generate understandable
rules, perform classification without requiring much computation, and clearly indicate which
fields are most important for prediction or classification.
REFERENCES
[1] WHO Report, the Top 10 Causes of Death, last accessed 12/9/2013 from
http://who.int/mediacentre/factsheets/fs310/en/ (accessed 01.04.2015).
[2] Hamid Bagheri, Abdusalam Abdullah Shaltooki. Big Data: Challenges, Opportunities and Cloud Based Solutions.
International Journal of Electrical and Computer Engineering (IJECE), 2014; 5(2): 340-343.
[3] Vijayajothi P, Tan SY, Sarinder KD, Amandeep SS. A methodological review of data mining techniques in
predictive medicine: An application in hemodynamic prediction for abdominal aortic aneurysm disease. Published by
Elsevier, Biocybernetics and Biomedical Engineering, 2014; 34(3):139-145.
[4] K.C. Tan, E.J. Teoh, Q. Yu, K.C. Goh. A hybrid evolutionary algorithm for attribute selection in data mining. Expert
Systems with Applications, 2009; 36: 8616–8630.
[5] Nikola K, Elisa C. Spiking neural network methodology for modelling classification and understanding of EEG
spatio-temporal data measuring cognitive processes. Information Sciences, 2015; 294: 565–575.
[6] F. Lotte, M. Congedo, A. Lécuyer, F. Lamarche, B.A. Arnaldi. Review of classification algorithms for EEG-based
brain–computer interfaces. J. Neural Eng. 2007; 4(2):1-25.
[7] C. Anderson, D. Peterson. Recent advances in EEG signal analysis and classification, in: R. Dybowski, V. Gant
(Eds.). Clinical Applications of Artificial Neural Networks, Cambridge University Press, UK. 2001: 175–191
(Chapter 8).
[8] C. Anderson, E. Stolz, S. Shamsunder. Multivariate autoregressive models for classification of spontaneous
electroencephalogram during mental tasks. IEEE Trans. Biomed. Eng. 1998; 45 (3): 277–286.
[9] K. Padmavathi, K. Sri Ramakrishna. Detection of Atrial Fibrillation using Autoregressive modeling. International
Journal of Electrical and Computer Engineering (IJECE), 2015; 5(1): 64-70.
[10] Shivnarayan P, Ram BP, U. Rajendra A. Automated diagnosis of coronary artery disease using tunable-Q wavelet
transform applied on heart rate signals. Knowledge-Based Systems, 2015; 82: 1-10.
[11] Yongqiang L, Jiaming H, Yiran W, Jijiang Y , Yida T, Wenyao W, Nazim A. Dynamic evaluation model of coronary
heart disease for ubiquitous healthcare. Computers in Industry, 2015; 69: 35-44.
[12] Mai Sh, Tim T, Rob S. Using Decision Tree for Diagnosing Heart Disease Patients. AusDM'11, Proceedings of the
9-th Australasian Data Mining Conference, Ballarat, Australia, 2011.
[13] Sumit B, Praveen P, G.N. Pillai. SVM Based Decision Support System for Heart Disease Classification with Integer-
Coded Genetic Algorithm to Select Critical Features. WCECS. Proceedings of the World Congress on Engineering
and Computer Science, San Francisco, USA, October 22 – 24, 2008.
[14] Peter C. Austin, Jack V. Tu, Jennifer E. Ho, Daniel Levy, Douglas S. Lee. Using methods from the data-mining and
machine-learning literature for disease classification and prediction: a case study examining classification of heart
failure subtypes. Journal of Clinical Epidemiology, 2013; 66(4): 398-407.
[15] Saba B, Usman Q, Farhan HK, M. Younus J. MV5: A Clinical Decision Support Framework for Heart Disease
Prediction Using Majority Vote Based Classifier Ensemble. Arab J Sci Eng, 2014; 39(11): 7771-7783.
[16] Jesmin N, Tasadduq I, Kevin ST, Yi-Ping Ph Ch. Association rule mining to detect factors which contribute to heart
disease in males and females. Expert Systems with Application, 2013; 40(4): 1086–1093.
[17] Kyle E. Walker, Sean M. Crotty. Classifying high-prevalence neighborhoods for cardiovascular disease in Texas.
Applied Geography, 2014; 57: 22-31.
[18] K.Rajeswari, V.Vaithiyanathan, T.R. Neelakantan. Feature Selection in Ischemic Heart Disease Identification using
Feed Forward Neural Networks. International Symposium on Robotics and Intelligent Sensors 2012 (IRIS 2012),
Procedia Engineering, 2012; 41: 1818–1823.
[19] A.V Senthil Kumar. Generating Rules for Advanced Fuzzy Resolution Mechanism to Diagnosis Heart Disease.
International Journal of Computer Applications, 2013; 77(11): 6-12.
[20] Quinlan J R. Induction of decision trees. Machine Learning, 1986; 4: 81–106.
[21] Quinlan J R. C4.5: Programs for machine learning. Machine Learning, 1994; 3:235–240.
[22] Quinlan J R. Bagging, Boosting and C4.5. Proceedings of 14th National Conference on Artificial Intelligence, 1996:
725–730.
[23] Xindong W , Vipin K , J. Ross Q , Joydeep Gh, Qiang Y, Hiroshi M , Geoffrey J. M, Angus Ng, Bing L, Philip S.
Yu, Zhi-Hua Z, Michael S, David JH, Dan S. Top 10 algorithms in data mining. Springer, 2008; 14(1): 1-37.
[24] Shuonan H, Rongtao H, Xinming S, Jun W, Chengshang Y. Research on C5.0 Algorithm Improvement and the
Test in Lightning Disaster Statistics. International Journal of Control and Automation, 2014; 7(1): 181-190.
[25] Vapnik, V. N. The nature of statistical learning theory. New York:Springer, 1995.
[26] Yazdani A, Ebrahimi T, Hoffmann U. Classification of EEG signals using Dempster Shafer theory and a K-nearest
neighbor classifier. IEEE. In: Proc of the 4th int EMBS conf on neural engineering, 2009: 327–30.
[27] Daubechies I. The wavelet transform, time-frequency localization and signal analysis. IEEE. Trans Inform Theor,
1990; 36: 961–1005.
[28] Demuth H, Beale M, Hagan M. Neural network Toolbox™ user’s guide. The MathWorks, Inc.; 2009.
[29] Leng, G., McGinnity, T.M., Prasad, G. Design for self-organizing fuzzy neural networks based on genetic
algorithms. IEEE. Trans. Fuzzy Syst. 2006; 14 (6): 755–766.
[30] Frank H. F. Leung, H. K. Lam, S. H. Ling, Peter K. S. Tam . Tuning of the structure and parameters of a neural
network using an improved genetic algorithm. IEEE. Trans. Neural Networks, 2003; 14 (1): 79–88.
[31] Alizadeh S, Ghazanfari M, Teimorpour B. Data Mining and Knowledge Discovery. Publication of Iran University
of Science and Technology. 2nd ed, 2011. [Persian].
[32] Han J, Kamber M. Chapter 1: Introduction. Data Mining: Concepts and Techniques. Morgan Kaufmann Publishers. 2nd
ed, 2006.
[33] UCI Archive, Machine Learning Repository, https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/heart/
(accessed 02.05.2015).
[34] G.Subbalakshmi, K. Ramesh, M. Chinna Rao. Decision Support in Heart Disease Prediction System using Naive
Bayes. Indian Journal of Computer Science and Engineering (IJCSE). 2011; 2(2): 170-176.
[35] Aditya M, Prince K, Himanshu A, Pankaj K. Early Heart Disease Prediction Using Data Mining Techniques.
Computer Science & Information Technology (CS & IT). 2014: 53-59.
[36] Roohallah A, Jafar H, Zahra A, Hoda M, Reihane B, Asma Gh, Fahime Kh, Fariba A. Diagnosing Coronary Artery
Disease via Data Mining Algorithms by Considering Laboratory and Echocardiography Features. Official Journal of
Rajaie Cardiovascular Medical and Research Center. 2013; 2(3): 133-139.
[37] T. John Peter, K. Somasundaram. Study and Development of Novel Feature Selection Framework for Heart Disease
Prediction. International Journal of Scientific and Research Publications. 2012; 2(10): 1-7.
[38] Kemal Polat, Salih Gunes. A hybrid approach to medical decision support systems: Combining feature selection,
fuzzy weighted pre-processing and AIRS. computer methods and programs in biomedicine. 2007; 88 :164–174.
BIOGRAPHIES OF AUTHORS
Moloud Abdar. He received his undergraduate (Bachelor) degree in Computer
Engineering (Software Engineering) from the University of Damghan, Iran, in 2015. He
has more than 7 conference and journal papers on data mining. Currently, his
research interests include data mining, web and text mining, artificial intelligence
and image processing.
Goli Arji. She is a PhD student in health information management at Tehran
University of Medical Sciences. She is interested in data mining, fuzzy logic,
clinical decision support systems, telemedicine and consumer health informatics.