Conference Paper

Comparing pattern discovery and back-propagation classifiers

Syst. Design Eng., Waterloo Univ., Ont., Canada
DOI: 10.1109/IJCNN.2005.1556039
Conference: Neural Networks, 2005. IJCNN '05. Proceedings. 2005 IEEE International Joint Conference on, Volume: 2
Source: IEEE Xplore


The pattern discovery (PD) algorithm of Wang and Wong was applied as a classifier to several continuous-valued data sets generated to explore performance across a selection of interesting linearly and non-linearly separable class distributions. The performance of several configurations of PD and back-propagation (BP) neural network classifiers and of a minimum inter-class distance (MICD) classifier was quantified and compared. The best performance of the PD and BP classifiers was found to be similar for all class distributions studied and close to the optimal MICD performance for linearly separable class distributions. The performance of both PD and BP classifiers depended on the classifier configuration. PD classifier performance depended on the number of intervals used to quantize the continuous data in a predictable, class-distribution-independent way. BP performance depended on the number of hidden nodes in a way that was class-distribution dependent and difficult to determine a priori. The transparency and statistical validity of the patterns used and the decisions made by PD classifiers make them highly suitable for problems in which the rationale and confidence of classifications are required, so that multiple classifications can be effectively combined to support decisions in a broader context such as medical diagnosis. The strong absolute and relative performance of PD classifiers and the relative simplicity of their implementation when applied to continuous-valued data suggest that they can be effectively utilized in decision support systems in which the underlying data is continuous or discrete valued.
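
The abstract notes that PD performance depends on the number of intervals used to quantize continuous data, but it does not say how the intervals are chosen. The following is a minimal sketch of one common choice, equal-frequency binning, and is an assumption rather than the authors' method. The function names `equal_frequency_edges` and `quantize` and the synthetic `feature` data are hypothetical and used only for illustration.

```python
import numpy as np

def equal_frequency_edges(values, n_intervals):
    """Return the interior cut points that split `values` into
    `n_intervals` bins holding roughly the same number of samples.
    (Assumed binning scheme; the abstract does not specify one.)"""
    probs = np.linspace(0.0, 1.0, n_intervals + 1)[1:-1]
    return np.quantile(values, probs)

def quantize(values, edges):
    """Map each continuous value to an interval index in 0..len(edges)."""
    return np.digitize(values, edges)

# Hypothetical continuous-valued feature: 1000 samples from a normal distribution.
rng = np.random.default_rng(0)
feature = rng.normal(size=1000)

n_intervals = 5  # the configuration parameter the abstract refers to
edges = equal_frequency_edges(feature, n_intervals)
codes = quantize(feature, edges)

# Number of samples that fall in each interval.
counts = np.bincount(codes, minlength=n_intervals)
```

Each sample is replaced by a discrete interval index, which is the form of input a discrete-event algorithm such as PD operates on. Changing `n_intervals` is the only tuning step in this sketch.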

  • Source
    • "Specifically, patterns are discovered by considering the normalized information content of each m-order event, as described previously [16], [17], [18], [19], calculated using the adjusted residual [22] "
    ABSTRACT: A framework for the development of a decision support system (DSS) that exhibits uncommonly transparent rule-based inference logic is introduced. A DSS is constructed by marrying a statistically based fuzzy inference system (FIS) with a user interface, allowing drill-down exploration of the underlying statistical support, providing transparent access to both the rule-based inference as well as the underlying statistical basis for the rules. The FIS is constructed through a "pattern discovery" based analysis of training data. Such an analysis yields a rule base characterized by simple explanations for any rule or data division in the extracted knowledge base. The reliability of a fuzzy inference is well predicted by a confidence measure that determines the probability of a correct suggestion by examination of values produced within the inference calculation. The combination of these components provides a means of constructing decision support systems that exhibit a degree of transparency beyond that commonly observed in supervised-learning-based methods. A prototype DSS is analyzed in terms of its workflow and usability, outlining the insight derived through use of the framework. This is demonstrated by considering a simple synthetic data example and a more interesting real-world example application with the goal of characterizing patients with respect to risk of heart disease. Specific input data samples and corresponding output suggestions created by the system are presented and discussed. The means by which the suggestions made by the system may be used in a larger decision context is evaluated.
    IEEE Transactions on Knowledge and Data Engineering 09/2006; 18(8):1125-1137. DOI:10.1109/TKDE.2006.132 · 2.07 Impact Factor
  • Source
    • "The PD algorithm uses the observed probability of occurrence to infer rules through analysis of discrete data values [13] [14] [15] [16]; this algorithm has been adapted through the use of quantization to function in continuous data domains, where the classification performance remains high [4] [5] [6] [7]. "
    ABSTRACT: Using a calculation of maximum a priori conditional probability, a confidence measure providing an estimate of the reliability of the 'Pattern Discovery' system based on its own measure of discriminative ability is produced. This confidence measure is evaluated in terms of its ability to predict true reliability as examined over a variety of synthetic data type distributions. Artifacts arising from this confidence value are examined, and the reasons for their existence are discussed. An explanation of the relationship between the observed confidence values and the error in their estimation of the true conditional probability is provided.
  • ABSTRACT: Pattern discovery (PD), an algorithm which discovers patterns based on a statistical analysis of training data, was used to generate rules for a fuzzy rule-based classification system (FRBCS). The classification performance of the FRBCS using rules discovered by the PD algorithm was compared with that of the PD algorithm functioning as a classifier, with both applied to a number of linearly and non-linearly separable continuous-valued data sets. The results indicate an increased performance for the FRBCS. The improvement comes through both an increase in correct classifications and a decrease in the error rate in the class distributions studied. The use of trapezoidal input membership functions applied to the input data values allowed vagueness in the input events to be modelled and resulted in a more robust determination of the characteristics of the input data, which in turn resulted in more accurate classification. In addition, the standard use of a co-occurrence based weighting of the rules by the FRBCS outperformed the weight-of-evidence based selection and use of input patterns by the PD classifier.
    Fuzzy Information Processing Society, 2005. NAFIPS 2005. Annual Meeting of the North American; 07/2005
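
The first excerpt above mentions that patterns are assessed using the adjusted residual. As a rough illustration only, the sketch below computes adjusted residuals in the standard Haberman form for a two-way table of counts (for example, feature interval by class). This is a simplification of the m-order events the excerpt describes, and the function name `adjusted_residuals` and the example counts are hypothetical.

```python
import numpy as np

def adjusted_residuals(table):
    """Adjusted residuals for a two-way contingency table of counts.
    Assumes the standard Haberman form: the standardized residual
    (observed - expected) / sqrt(expected), divided by the square root
    of its estimated variance (1 - row_i/N) * (1 - col_j/N)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)   # row totals, shape (r, 1)
    col = table.sum(axis=0, keepdims=True)   # column totals, shape (1, c)
    expected = row * col / n                 # counts expected under independence
    standardized = (table - expected) / np.sqrt(expected)
    variance = (1 - row / n) * (1 - col / n)
    return standardized / np.sqrt(variance)

# Hypothetical counts: rows are two feature intervals, columns are two classes.
counts = np.array([[30, 10],
                   [10, 30]])
d = adjusted_residuals(counts)
```

Under independence each adjusted residual is approximately standard normal, so a cell whose magnitude exceeds a chosen threshold (1.96 is a common choice at the 95% level) would be flagged as a statistically significant association in this sketch.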