
RESEARCH ARTICLE
Advanced Science Letters, Vol. 23, No. 11, 11593–11598, 2017
doi: 10.1166/asl.2017.10335
Copyright © 2017 American Scientific Publishers. All rights reserved.
Printed in the United States of America

Improved Boosting Algorithms by Pre-Pruning and Associative Rule Mining on Decision Trees for Predicting Obstructive Sleep Apnea

Doreen Ying Ying Sim1*, Chee Siong Teh1, Ahmad Izuanuddin Ismail2

1Faculty of Cognitive Sciences and Human Development, Universiti Malaysia Sarawak, Kuching, Sarawak, Malaysia
2Respiratory Medicine Unit, Department of Respiratory Medicine, UiTM Medical Specialist Centre, Faculty of Medicine, Universiti Teknologi MARA, Selangor, Malaysia

An improved boosting algorithm, named Boosted PARM-DT, was developed by applying pre-pruning techniques and Associative Rule Mining (ARM) to decision trees built from clinical datasets** collected for Obstructive Sleep Apnea (OSA). The Pruned-Associative-Rule-Mined Decision Trees (PARM-DT), developed by adopting pre-pruning techniques on tree depth, minimum leaf- and/or parent-node size observations, and maximum number of tree splits, based on the Apriori and/or Adaptive Apriori (AA) frameworks, are boosted to achieve better predictive accuracy. The improved algorithms were applied to the OSA dataset and to UCI online databases for comparison. Better predictive accuracies were achieved on all the applied datasets/databases when comparing the classical algorithm, i.e. Boosted DT, with the improved one, i.e. Boosted PARM-DT.

Keywords: pre-pruning techniques, Associative Rule Mining, Apriori, Adaptive Apriori (AA), Boosted PARM-DT

1. INTRODUCTION

Obstructive Sleep Apnea (OSA), like some other diseases and medical illnesses, usually has an attribute or a set of attributes that can perfectly or almost perfectly confirm the medical diagnosis [2-4]. This attribute or set of attributes, however, usually has low support threshold(s). Boosting algorithms such as AdaBoost are white-box methods, and the characteristics of the raw data** collected from OSA patients' records** in this research are fully known and understood; combining pre-pruning techniques, Associative Rule Mining (ARM), and the Apriori/Adaptive Apriori frameworks is therefore a great advantage.

Sleep apnea affects both adults and children and can result in as many as around 30 breathless episodes per night. Untreated sleep apnea can cause death during sleep or incur serious health problems such as diabetes, hypertension, stroke, and other cardiovascular diseases [2-4]. If a person has OSA, his or her tongue and throat muscles may become so relaxed and floppy during sleep that they cause a narrowing or even complete blockage of the airway(s) [2-5]. Narrowing or complete blockage of the airway(s) can be caused by cephalometric anatomical abnormalities or morphological defects (here we concentrated only on retrognathia, micrognathia and posterior pillar webbing) and/or other anatomical defects, such as throat and/or tongue muscles falling back due to poor blood circulation, incurring muscle floppiness and relaxation [2,4,5].

*Email: dsdoreenyy@gmail.com  **see Acknowledgments

Table 1 shows the minimum support and minimum confidence thresholds of the eight OSA variables: (1) bilateral Tonsil Size or TS (size ranges from 0 to 4, i.e. from normal to the worst case); (2) crowding of the oropharynx, i.e. MP (Mallampati score, ranging from 1 to 4); (3) Neck Circumference or NC (greater than or equal to 40 cm); (4) Epworth Sleepiness Scale or ESS (ranging from 0 to 24); (5) Morbid Obesity or MO (BMI greater than or equal to 40); (6) Posterior Pillar Webbing or PPW; (7) Retrognathia / retro-positioned maxilla or RN (over-slung or jutting lower jaw); and (8) Micrognathia / receding lower jaw or receding chin or MN (short mentohyoid distance or inferiorly displaced hyoid bone).

This paper is organized as follows: Section 2 deploys ARM and pre-pruning techniques on decision trees for the OSA dataset; in Section 3, the improved algorithm, Boosted PARM-DT, is presented. Experimental results comparing the proposed algorithm with classical approaches are shown and analyzed in Section 4. Conclusions and discussions are summarized in Section 5.


2. ASSOCIATIVE-RULE MINING AND PRE-PRUNING TECHNIQUES ON DECISION TREES

Apriori relies on a uniform minimum support and is commonly used as a basic pruning strategy [9-12], although in most datasets collected for medical research purposes the minimum support and minimum confidence are user-specified from the research findings [1]. Interesting patterns often occur at various levels of support [8-10]. The improved algorithm proposed here is mainly based on Adaptive Apriori, which uses a non-uniform minimum support.

Only pre-pruning techniques are used as part of the improved algorithm, Boosted PARM-DT, in this research, because all datasets and databases applied have well-known characteristics. The decision trees developed for the OSA dataset** and the UCI online databases are pre-pruned in the following four ways, based on Apriori and Adaptive Apriori properties: (1) pre-pruning depends on a stopping criterion set to control the tree depth, done by merging the leaves or working out the parent-node size observations (see Fig. 1); (2) trees are "pre-pruned" by halting construction early, i.e. taking the item or itemset with the largest confidence threshold but the lowest support threshold as the minimum leaf-node size observation for the decision trees developed, thereby setting tree-depth controller(s) that minimize entropy impurity (Eq. 1) [7,8]; (3) since pruning is the inverse of splitting [1,9,11], "pre-pruning" is done by deciding not to split the tree further, i.e. setting the maximum number of tree splits starting from 4 (the number of splits of a perfect binary tree of level 2) (Eq. 3) [7-9,11-12]; (4) the subset of tuples at a given node is partitioned so that each leaf node has minimum impurity (Eq. 1) [7,8]. Pre-pruning of decision trees is done in a top-down fashion, i.e. from the root (see Eq. 2) [7,8] to branch nodes and then to leaf node(s). Under Eq. 1, decision trees are grown until each leaf node has the lowest impurity [7,8,10]. P(ωj) is the fraction of patterns at node N in category ωj [8,11-12].

Entropy impurity:  i(N) = −Σj P(ωj) log2 P(ωj)   (Eq. 1)

i(Nroot) = −Σ(i=1..2) P(ωi) log2 P(ωi) = 1.0   (Eq. 2)

Gini impurity:  i(N) = Σ(i≠j) P(ωi) P(ωj) = ½ [1 − Σj P²(ωj)]   (Eq. 3)

Post-pruning is usually applied to datasets whose characteristics are not well-known [6-8,10]. Since the characteristics of the OSA dataset and the applied databases are well-known, no post-pruning technique is applied in this research. Only pre-pruning techniques, Apriori, and AA are applied to develop Boosted PARM-DT.

DEFINITION 1 [1] (BINARY RELATIONS THROUGHOUT): Let C be the set of confident rules plus the default rule, and consider two rules r and r' in C. Then r is ranked higher than r', denoted in this research as r >R r', if any of the following criteria holds:

Criterion 1: conf(r) > conf(r')
Criterion 2: conf(r) = conf(r'), but sup(r) > sup(r')
Criterion 3: sup(r) = sup(r'), but size(r) < size(r')
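The three criteria form a lexicographic comparison, which can be sketched as a comparator (a Python illustration under our own naming; the RN/MN rules and their values are taken from Table 1):

```python
from functools import cmp_to_key

def rank_cmp(r, r2):
    """Negative if rule r ranks higher than r2 (r >R r2) under Definition 1."""
    # Criterion 1: higher confidence wins.
    if r["conf"] != r2["conf"]:
        return -1 if r["conf"] > r2["conf"] else 1
    # Criterion 2: equal confidence -> higher support wins.
    if r["sup"] != r2["sup"]:
        return -1 if r["sup"] > r2["sup"] else 1
    # Criterion 3: equal support -> smaller (more general) rule wins.
    return -1 if r["size"] < r2["size"] else (1 if r["size"] > r2["size"] else 0)

rules = [
    {"name": "RN -> OSA", "conf": 1.00, "sup": 0.04, "size": 1},
    {"name": "MN -> OSA", "conf": 1.00, "sup": 0.06, "size": 1},
]
ranked = sorted(rules, key=cmp_to_key(rank_cmp))
print(ranked[0]["name"])  # MN -> OSA (Criterion 2: equal confidence, higher support)
```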

Referring to Table 1 for the OSA dataset, for certain features such as the associating relationship between Micrognathia (MN) and Retrognathia (RN), this research adopts Criterion 2 of Definition 1 above. This is because MN can be ranked higher than RN when their low minimum support and perfect minimum confidence thresholds are taken into account. In other words, MN can be considered more general than RN, i.e. rMN >G rRN.

DEFINITION 2 [1,8-11] (ASSOCIATION RULE): An association rule is an implication of the form X → Y, where X ⊂ I, Y ⊂ I, and X ∩ Y = ∅.

The optimization criterion for the tree-splitting decisions is the default setting, i.e. Gini's diversity index ('gdi'). To grow decision trees that fit the characteristics of the OSA dataset, pre-pruning after ARM based on Apriori and/or AA has to be analyzed.

conf(X → Y, D) = sup(X ∪ Y, D) / sup(X, D)   (Eq. 4)
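Support and confidence as used in Eq. 4 can be computed over a transaction database; the following Python sketch uses a toy transaction list (hypothetical, not the paper's patient records):

```python
def support(itemset, transactions):
    """sup(X, D): fraction of transactions in D containing every item of X."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(x, y, transactions):
    """conf(X -> Y, D) = sup(X u Y, D) / sup(X, D)  (Eq. 4)."""
    return support(x | y, transactions) / support(x, transactions)

# Toy transactions; "OSA" marks an OSA-positive record:
D = [{"MN", "OSA"}, {"MN", "OSA"}, {"PPW", "OSA"}, {"PPW"}, {"NC"}]
print(confidence({"MN"}, {"OSA"}, D))   # 1.0
print(confidence({"PPW"}, {"OSA"}, D))  # 0.5
```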

Eq. 4 underlies the support-driven and confidence-driven pruning of the pre-pruning techniques on decision trees, where ARM is used to work out (1) the minimum leaf-node size observations, (2) the minimum parent-node size observations, and (3) the maximum number of tree splits. This is based on association-based classification, which uses the minimum support thresholds derived from the attribute(s) having the highest confidence threshold(s) so as to prune over-fitting rules. Pruning of over-fitting rules suffers from the dilemma that rules of high support tend to have low confidence [7,11-12]. However, prediction often depends on high confidence [1,8-12]. High-confidence features, as shown in Table 1, usually have low support thresholds [11-12].

Since the OSA dataset has two attributes, RN and MN, with 100% confidence thresholds but low support thresholds of 8/200 (i.e. 0.04) and 12/200 (i.e. 0.06) respectively, it is good to use one of these attributes (having a k-itemset) as the minimum leaf-node size observation, or 'MinLeafSize', while growing the decision trees. With the classical boosting approach, Boosted DT, no pruning technique or ARM is used. So, in Boosted DT, the default values of the tree-depth controllers apply, as follows:

(1) n − 1 for the maximum number of decision splits, or 'MaxNumSplits' (n = training sample size); i.e. the maximum number of tree splits is size(X, 1) − 1, the number of training samples minus 1;
(2) 1 for the minimum number of leaf-node observations, or 'MinLeafSize';
(3) 10 for the minimum number of parent- or branch-node observations, or 'MinParentSize'.
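The default controllers above can be summarized in a small sketch (Python for illustration only; the paper's implementation is in MATLAB, and the PARM-DT values shown for the OSA dataset are those reported in Sections 2 and 4):

```python
# Default tree-depth controllers for classical Boosted DT, for n training samples:
def boosted_dt_defaults(n):
    return {
        "MaxNumSplits": n - 1,  # size(X, 1) - 1
        "MinLeafSize": 1,
        "MinParentSize": 10,
    }

# Pre-pruned PARM-DT settings for the OSA dataset (MinLeafSize from MN's
# support count in Table 1; MinParentSize range reported in Section 4):
parm_dt_osa = {"MinLeafSize": 12, "MinParentSize": (67, 74), "MaxNumSplits": None}

print(boosted_dt_defaults(200))
# {'MaxNumSplits': 199, 'MinLeafSize': 1, 'MinParentSize': 10}
```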


Table 1. Minimum support and minimum confidence thresholds for each OSA variable in 200 patients' records, to be input to PARM-DT.

OSA variable                        Minimum Support P(X ∩ Y)   Minimum Confidence P(X ∩ Y)/P(X)
1. Tonsil Size (TS)                 0.70 (140/200)             0.76 (106/140)
2. Mallampati Score (MP)            0.75 (150/200)             0.81 (122/150)
3. Neck Circumference (NC)          0.55 (109/200)             0.81 (88/109)
4. Epworth Sleepiness Scale (ESS)   0.78 (155/200)             0.85 (132/155)
5. Morbid Obesity (MO)              0.40 (79/200)              0.91 (72/79)
6. Posterior Pillar Webbing (PPW)   0.165 (33/200)             0.909 (30/33)
7. Retrognathia (RN)                0.04 (8/200)               1.00 (8/8)
8. Micrognathia (MN)                0.06 (12/200)              1.00 (12/12)

For 'MinLeafSize', the attribute having the highest confidence threshold and the lowest support threshold is given the highest priority. For the minimum parent-node size observation, 'MinParentSize', the (k−1)-itemset having a one-level-higher minimum support threshold (but larger than 2 times the 'MinLeafSize') should be used [2,3,9,12].

3. BOOSTED PARM-DT ALGORITHM BASED ON APRIORI FRAMEWORK

Decision trees for the OSA dataset and the other applied UCI databases were developed based on the Apriori and/or AA frameworks, and these trees were boosted by GentleBoost (for two-class datasets) or AdaBoostM2 (for non-two-class datasets), with 200 iterations and 15-fold cross-validation, using MATLAB (R2016a).

DEFINITION 3 [7,8] (MONOTONICITY): Support is anti-monotone: if X is a subset of Y, sup(Y) must not exceed sup(X). That is, for all X, Y ⊆ J: (X ⊆ Y) → sup(Y) ≤ sup(X).

DEFINITION 4 [1] (MCF PRINCIPLE): If there are choices, the rule of the highest rank has the top priority, and a specific rule that does not rank higher than all general rules is never used. Such a specific rule is deemed redundant and is pruned.

For the OSA dataset, Definitions 3 and 4 above were adopted so as to prune redundant rules and to perform certain pre-pruning before the decision tree is fully grown. The minimum leaf size chosen for the OSA dataset is 12 rather than 8 because, although both RN and MN are 100% confidence attributes, under the MCF Principle MN has a higher minimum support count (12) than RN in incurring OSA positives. In Table 1, MN has a support count of 12 while RN has only 8. Setting the minimum leaf size to 12 controls the optimal tree depth and avoids over-fitting rules (since, by default, the number of splits for n levels of trees is n − 1).
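The MCF-based choice of MinLeafSize can be sketched from the Table 1 values (a Python illustration with our own variable names; the counts are transcribed from Table 1):

```python
# Support counts (out of 200) and confidence thresholds transcribed from Table 1.
table1 = {
    "TS": (140, 0.76), "MP": (150, 0.81), "NC": (109, 0.81), "ESS": (155, 0.85),
    "MO": (79, 0.91),  "PPW": (33, 0.909), "RN": (8, 1.00),  "MN": (12, 1.00),
}

# Among the attributes tied at the highest confidence (RN and MN, both 1.00),
# the MCF Principle keeps the one with the higher support count, so MN's
# support count of 12 becomes the MinLeafSize.
best_conf = max(conf for _, conf in table1.values())
tied = {name: sup for name, (sup, conf) in table1.items() if conf == best_conf}
min_leaf = max(tied.values())
print(min_leaf)  # 12
```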

The downward closure property states that any subset of a frequent itemset is also a frequent itemset [1,6,8]. When applying this property to the datasets used, we go from general to specific rules, and the property is applicable only when there is an improvement in the minimum confidence threshold for a certain attribute or set of attributes. As shown in Table 1, Fig. 1 and Fig. 2, an example is PPW ∧ MN → OSA positive, whose minconf is 0.909, versus MN → OSA positive, whose minconf is 1.00; so the upward closure property is not applicable, but the downward closure property is. The improved boosting algorithm, Boosted PARM-DT, is given below first in narrative form and then in pseudo-code.

Q: Does the dataset have an attribute or set of attributes with a significantly high or perfect confidence threshold? Yes → Boosted PARM-DT; No → Boosted DT.

Improved algorithm, Boosted PARM-DT (narrative form):
OSA dataset (or UCI online databases) → further Associative Rule Mining (ARM) on decision trees → Decision Tree (DT) development and completion → Boosted PARM-DT, or Boosted PARM Decision Trees.

Boosted PARM-DT (in pseudo-code):

1. Input variables: a set of OSA data (or data from the applied UCI databases) with labels {(x1, y1), …, (xN, yN)}, where xi ∈ X, yi ∈ Y = {−1, +1}; the initial setting for the minimal parent-node observation is δ, i.e. δminparent, with stepwise increase δminparent + 1; for tree leaves, the initial setting for the minimal leaf-node observation is ζ, i.e. ζminleaf, with stepwise increase ζminleaf + 1; for tree splits, the initial setting is 4 (i.e. a simple tree), the maximum is ϗMaxSplit, with stepwise increase ϗinitial + 1.

2. Initialize: the weights of the training OSA dataset, wi¹ = 1/N, for all i = 1, 2, …, N.

3. Do for t = 1, 2, …, T (where T = number of iterations)
   Do while (ζ >= ζminleaf)
   Use support-confidence, Apriori and AA to work out MinLeafSize, MinParentSize and MaxNumSplits. Apply pre-pruning techniques by controlling the tree depth, maximum number of splits, categorical predictor(s), surrogate splits and/or tree stopping criteria (since all are white-box methods). No post-pruning technique is used, since the characteristics of the OSA dataset and the applied databases are all well-known. Boosting is done after the pre-pruning techniques and ARM on the DT.


   (a) Train an Associative-Rule-Mined Decision Tree (ARM-DT) component classifier ht on the weighted training OSA dataset (t = index of the weak classifier);
   (b) Calculate the training error of ht:  εt = Σ(i=1..N) wi^t, summed over all i with yi ≠ ht(xi);
   (c) If εt > 0.5, go directly to (g); else ζminleaf = ζminleaf + 1, then proceed normally to (d);
   (d) Set the weight of the ARM-DT component classifier ht:  αt = ½ ln((1 − εt)/εt);
   (e) Update the weights of the OSA training samples:  wi^(t+1) = wi^t exp(−αt yi ht(xi)) / Ct, where Ct is the normalization constant such that Σ(i=1..N) wi^(t+1) = 1;
   (f) Go directly to Step 4 for output;

   Do while (δ >= δminparent)
   (g) Train the ARM-DT component classifier ht on the weighted training OSA data;
   (h) Calculate the training error of ht:  εt = Σ(i=1..N) wi^t, summed over all i with yi ≠ ht(xi);
   (i) If εt > 0.5, go directly to (j); else δminparent = δminparent + 1, then proceed normally to (d);

   Do while (ϗinitial >= 4 AND ϗinitial <= ϗMaxSplit)
   (j) Train the ARM-DT component classifier ht on the weighted training OSA dataset;
   (k) Calculate the training error of ht:  εt = Σ(i=1..N) wi^t, summed over all i with yi ≠ ht(xi);
   (l) If εt > 0.5, halt the loop; else ϗinitial = ϗinitial + 1, proceed to (d);

4. Output: the largest-weighted classifier from the associatively mined decision trees is chosen:
   f(x) = sign(Σ(t=1..T) αt ht(x))

END (Algorithm complete)
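The boosting core of the pseudo-code, steps (a)-(e) and Step 4, follows the standard AdaBoost weight-update scheme, which can be sketched as a runnable Python toy (an assumption for illustration: a decision stump stands in for the ARM-DT component classifier, and the ζ/δ/ϗ pre-pruning loops are omitted; the paper's actual implementation is in MATLAB):

```python
from math import exp, log

def stump_learner(X, y, w):
    """Pick the (feature, threshold, polarity) stump minimizing weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                pred = [pol if x[f] >= thr else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def boost(X, y, T=10):
    n = len(X)
    w = [1.0 / n] * n                                # Step 2: uniform weights
    ensemble = []
    for _ in range(T):
        err, f, thr, pol = stump_learner(X, y, w)    # (a) train weak learner
        if err > 0.5:                                # (c) reject too-weak learner
            break
        err = max(err, 1e-10)                        # guard against log(0)
        alpha = 0.5 * log((1 - err) / err)           # (d) classifier weight
        pred = [pol if x[f] >= thr else -pol for x in X]
        w = [wi * exp(-alpha * yi * p) for wi, yi, p in zip(w, y, pred)]
        Z = sum(w)                                   # (e) normalization C_t
        w = [wi / Z for wi in w]
        ensemble.append((alpha, f, thr, pol))
    return ensemble

def predict(ensemble, x):                            # Step 4: weighted sign vote
    s = sum(a * (pol if x[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if s >= 0 else -1

X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1, -1, 1, 1]
ens = boost(X, y)
print([predict(ens, x) for x in X])  # [-1, -1, 1, 1]
```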

4. IMPLEMENTATION OUTCOME

The Apriori and Adaptive Apriori frameworks and the downward closure properties of Boosted PARM-DT on the OSA dataset are shown in Fig. 1 and Fig. 2. For the pre-pruning techniques, the minimum numbers of parent-node and/or leaf-node observations were set after applying ARM to the OSA dataset: a minimum parent-node size between 67 and 74 and a minimum leaf-node size between 12 and 20 yield the highest prediction accuracies. These findings are derived from the support-confidence thresholds of the attributes and the minimum child- and branch-node size observations in the OSA dataset.

Fig. 1. Support-based and confidence-based pruning of the pre-pruning on trees developed for the OSA dataset to derive MinLeafSize and MinParentSize, based on Apriori.

Fig. 2. Pre-pruning techniques on decision trees for the OSA dataset, based on the Adaptive Apriori framework.

Table 2. Three pre-pruning techniques, i.e. MinLeafSize, MinParentSize, and MaxNumSplits, implemented after ARM as parts of Boosted PARM-DT, applied to the OSA dataset and another eight UCI online databases.

OSA dataset / UCI database   Min. leaf size   Min. parent size   Max. no. of splits
1. OSA (Ada. Apriori)        12               67-74              (default)
2. Breast Cancer             31               132                (default)
3. Diabetes (Pima)           285-300          (default)          (default)
4. Heart Disease             51               (default)          (default)
5. Liver Disorders           (default)        (default)          18-29
6. Credit Screening          150-160          350                (default)
7. Ionosphere                (default)        (default)          4 or 5
8. Hepatitis                 (default)        (default)          4-9
9. Iris                      (default)        (default)          4-9
10. OSA (Apriori)            18               74                 (default)

Comparisons of Boosted PARM-DT with Boosted DT were conducted on all databases and the OSA dataset, as shown in Table 3. Fig. 3 then shows that Boosted PARM-DT on the AA framework produces the best ROC curve for the OSA data, compared with Boosted PARM-DT on the Apriori framework or with the classical Boosted DT algorithm.

Table 3. Predictive accuracies of Boosted DT and Boosted PARM-DT on the OSA dataset (based on AA and Apriori) and on all online databases from UCI (based on AA only).

OSA dataset / UCI database   Boosted DT   Boosted PARM-DT   Improvement achieved
1. OSA (AA)                  0.9250       0.9850            6.00% a
2. Breast Cancer (W.)        0.9599       1.0000            4.01% c
3. Diabetes (Pima)           0.6440       0.7578            11.38% c
4. Heart Disease             0.6490       0.8074            15.84% c
5. Liver Disorders           0.5770       0.7159            13.89% b
6. Credit Screening          0.7768       0.8550            7.82% b
7. Monks-2 (tr. set)         0.5444       0.8994            35.50% c
8. Hepatitis                 0.7580       0.8581            10.01% a
9. Monks-2 (te. set)         0.6019       1.0000            39.81% c
10. OSA (Apriori)            0.9250       0.9767            5.17% a

a significant at P < 0.05; b significant at P < 0.001; c significant at P < 0.0001

Table 4. Results on the scientific significance of the improvements in predictive accuracies shown in Table 3.

OSA / UCI dataset            Instances   95% CI      p-value
1. OSA (AA)                  200         1.7-10.7    0.0038
2. Breast Cancer (W.)        699         2.6-5.7     <0.0001
3. Diabetes (Pima)           768         6.7-16.0    <0.0001
4. Heart Disease             270         8.1-23.4    <0.0001
5. Liver Disorders           345         6.6-21.1    0.0001
6. Credit Screening          690         3.6-12.0    0.0002
7. Monks-2 (tr. set)         169         26.0-44.3   <0.0001
8. Hepatitis                 155         0.8-19.1    0.0255
9. Monks-2 (te. set)         432         35.1-44.6   <0.0001
10. OSA (Apriori)            301         1.5-9.0     0.0034

95% confidence intervals and p-values from a one-tailed t-test; "N-1" Chi-squared test.

Fig. 3. Receiver Operating Characteristic (ROC) curves for the OSA dataset based on the 3 algorithms applied: the green and blue curves respectively indicate the ROC curves of Boosted PARM-DT based on the AA and Apriori frameworks, while the light grey curve indicates that of the classical algorithm, Boosted DT.

5. CONCLUSIONS AND DISCUSSIONS

Tables 3 and 4 above show the scientific reliability and statistical significance of applying the improved algorithm to the datasets, as assessed by analyzing the improvements achieved through one-tailed t-tests, p-values and 95% confidence intervals. With Boosted PARM-DT, after pre-pruning augmented by Apriori and/or AA and ARM, the predictive accuracies are better than with the classical boosting algorithms. Boosted PARM-DT is a refined white-box method that gives better predictive accuracies on all the databases applied. The most prominent characteristic of the OSA dataset is that it has two 100%-confidence attributes, RN and MN; although these have low support thresholds, this makes it very conducive to apply Apriori and Adaptive Apriori before using ARM to 'cast on' the relationships for the pre-pruning of decision trees to take place. By adopting Apriori to derive the minimum support threshold of MN as the MinLeafSize, and AA with the pushed minimum support of MO as the MinParentSize, for developing decision trees for the OSA dataset, the Boosted PARM-DT algorithm can prune over-fitting rules, which can help medical doctors make more accurate OSA clinical diagnoses.

ACKNOWLEDGMENTS

**To obtain the raw data of OSA patients' records, formal Research and Ethics Committee approval on medical grounds was acquired from Universiti Teknologi MARA (Reference: 600-RMI (5/1/6)). This research is fully supported by the Fundamental Research Grant Scheme (FRGS), UNIMAS (Reference: FRGS/ICT02(01)/1077/2013(23)).

REFERENCES

[1] R. Agrawal, T. Imielinski, A. Swami. Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D.C., USA: ACM (1993) 207-216.
[2] D. Y. Y. Sim, C. S. Teh, P. K. Banerjee. Prediction model by using Bayesian and cognition-driven techniques: a study in the context of Obstructive Sleep Apnea. Proceedings of the 9th International Conference on Cognitive Science, Malaysia, Procedia - Social and Behavioral Sciences, 97 (2013) 528-537.
[3] D. Y. Y. Sim, C. S. Teh, A. I. Ismail. Adaptive Apriori and weighted association rule mining on visual inspected variables for predicting Obstructive Sleep Apnea (OSA), Australian Journal of Intelligent Information Processing Systems, 14(2) (2014) 39-45.
[4] P. C. Deegan, W. T. McNicholas. Predictive value of clinical features for the obstructive sleep apnoea syndrome, European Respiratory Journal, ERS Journals Ltd., UK, 9(1) (1996) 117-124.
[5] T. I. Morgenthaler, R. N. Aurora, T. Brown. Practice parameters for the use of auto-titrating continuous positive airway pressure devices for titrating pressures and treating adult patients with obstructive sleep apnea syndrome: an update for 2007, Sleep, 31 (2007) 141-147.
[6] A. K. Das. Mining rare item sets using both top down and bottom up approach, International Journal of Computer Science and Information Technologies, 7(3) (2016) 1607-1614.
[7] J. Han, M. Kamber, J. Pei. Data Mining: Concepts and Techniques (3rd ed.), Elsevier, Morgan Kaufmann, USA (2012) 17-27, 248-273, 461-488.
[8] J. Han, J. Pei, Y. Yin. Mining frequent patterns without candidate generation. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, New York, NY, USA: ACM Press (2000) 1-12.
[9] K. Wang, Y. He, J. Han. Pushing support constraints into association rules mining, IEEE Transactions on Knowledge and Data Engineering, 15(3) (2003) 642-658.
[10] K. Wang, S. Zhou, S. Liew. Building hierarchical classifiers using class proximity. Proceedings of the 25th International Conference on Very Large Data Bases, San Francisco, CA, USA: Morgan Kaufmann (1999) 363-374.
[11] S. K. Pal, P. Mitra. Pattern Recognition Algorithms for Data Mining, Chapman & Hall/CRC Press, Florida, USA (2004) 165-168, 170-174.
[12] G. Hari Prasad, J. Nagamuneiah. A strategy for initiate support check into frequent itemset mining, International Journal of Advanced Research in Computer Science and Software Engineering, 2(7) (2012) 43-48.