Conference Paper

Comparing pattern discovery and back-propagation classifiers

Syst. Design Eng., Waterloo Univ., Ont., Canada
DOI: 10.1109/IJCNN.2005.1556039
Conference: Neural Networks, 2005. IJCNN '05. Proceedings. 2005 IEEE International Joint Conference on, Volume: 2
Source: IEEE Xplore

ABSTRACT: The pattern discovery (PD) algorithm of Wang and Wong was applied as a classifier to several continuous-valued data sets generated to explore performance across a selection of interesting linearly and non-linearly separable class distributions. The performance of several configurations of PD and back-propagation (BP) neural network classifiers, and of a minimum inter-class distance (MICD) classifier, was quantified and compared. The best performance of the PD and BP classifiers was found to be similar for all class distributions studied and close to the optimal MICD performance for linearly separable class distributions. The performance of both PD and BP classifiers depended on the classifier configuration. PD classifier performance depended on the number of intervals used to quantize the continuous data in a predictable, class-distribution-independent way. BP performance depended on the number of hidden nodes in a way that was class-distribution dependent and difficult to determine a priori. The transparency and statistical validity of the patterns used and the decisions made by PD classifiers make them highly suitable for problems in which the rationale and confidence of classifications are required so that multiple classifications can be effectively combined to support decisions in a broader context, such as medical diagnosis. The strong absolute and relative performance of PD classifiers and the relative simplicity of their implementation when applied to continuous-valued data suggest that they can be effectively utilized in decision support systems in which the underlying data are continuous or discrete valued.
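The abstract notes that PD performance depends on the number of intervals used to quantize continuous data. As a rough illustration only, the sketch below quantizes one continuous feature into a chosen number of intervals. It assumes equal-frequency binning, which the abstract does not specify; the function names `equal_frequency_edges` and `quantize` and the sample values are hypothetical.

```python
from bisect import bisect_right


def equal_frequency_edges(values, n_intervals):
    """Return the n_intervals - 1 interior cut points that split the
    sorted values into bins holding roughly equal numbers of samples.
    Assumption: equal-frequency binning; the paper's scheme may differ."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_intervals] for k in range(1, n_intervals)]


def quantize(value, edges):
    """Map a continuous value to a bin index in 0 .. len(edges).
    A value equal to a cut point falls into the higher bin."""
    return bisect_right(edges, value)


# Hypothetical samples of a single continuous feature.
samples = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.2, 2.8, 3.1, 3.6, 4.0, 4.5]
edges = equal_frequency_edges(samples, n_intervals=4)
bins = [quantize(v, edges) for v in samples]
print(edges)
print(bins)
```

Changing `n_intervals` is the single configuration knob here, which mirrors the abstract's point that the PD classifier has one quantization parameter to tune.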

  • ABSTRACT: Using a calculation of maximum a priori conditional probability, a confidence measure providing an estimate of the reliability of the 'Pattern Discovery' system based on its own measure of discriminative ability is produced. This confidence measure is evaluated in terms of its ability to predict true reliability as examined over a variety of synthetic data type distributions. Artifacts arising from this confidence value are examined, and the reasons for their existence are discussed. An explanation of the relationship between the observed confidence values and the error in their estimation of the true conditional probability is provided.
  • ABSTRACT: The aim is to design and develop a non-fuzzy classification paradigm from a statistical data set. Event association patterns of different orders are detected, providing a probabilistic inference mechanism that achieves flexible classification and prediction. To detect significant event associations, residual analysis from statistics is used. Patterns are detected and rules are generated based on the deviations of the observed patterns from a default model. The discriminative power of each generated rule is described using the Weight of Evidence (WOE) statistic. Classification decisions are made using WOE-based estimates of the relative likelihood of each possible labeling. Estimates are calculated using the set of rules triggered by matching input values. Experimental results are discussed towards the end of the paper.
    01/2010; DOI:10.1109/ICETET.2010.114
  • ABSTRACT: A statistically based pattern discovery tool is presented that produces a rule-based description of complex data through the set of its statistically significant associations. The rules resulting from this analysis capture all the patterns observable within a data set for which a statistically sound rationale is available. The validity of such patterns recommends their use in cases where the rationale underlying a decision must be understood. High-risk decision-making systems, a milieu familiar to many biologically related problem domains, are the likely area of application for this technique. An analysis of the performance of this technique on a series of biologically relevant data distributions is presented, and the relative merits and weaknesses of this technique are discussed.
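The abstracts above describe detecting significant associations by residual analysis and scoring rules with a Weight of Evidence (WOE) statistic. The sketch below shows the textbook forms of both quantities for a single pattern/class pair: the adjusted residual of a contingency-table cell and the log ratio of P(pattern | class) to P(pattern | other classes). These standard definitions are assumed here and may differ in detail from the ones used in the listed papers; the function names, the counts, and the 1.96 threshold (a two-sided 5% level) are illustrative.

```python
import math


def adjusted_residual(observed, pattern_total, class_total, n):
    """Adjusted residual of one contingency-table cell.
    Expected count comes from the independence (default) model."""
    expected = pattern_total * class_total / n
    standardized = (observed - expected) / math.sqrt(expected)
    variance = (1 - pattern_total / n) * (1 - class_total / n)
    return standardized / math.sqrt(variance)


def weight_of_evidence(count_xy, count_y, count_x, n):
    """log2 of P(x | y) / P(x | not y), in bits.
    Zero counts are not handled in this sketch."""
    p_x_given_y = count_xy / count_y
    p_x_given_not_y = (count_x - count_xy) / (n - count_y)
    return math.log2(p_x_given_y / p_x_given_not_y)


# Hypothetical counts: 200 samples, 100 in class y,
# pattern x seen 60 times, 48 of them together with class y.
d = adjusted_residual(observed=48, pattern_total=60, class_total=100, n=200)
woe = weight_of_evidence(count_xy=48, count_y=100, count_x=60, n=200)
is_significant = abs(d) > 1.96  # assumed two-sided 5% threshold
print(d, woe, is_significant)
```

With these counts the pattern occurs four times as often inside class y as outside it, so the WOE is about 2 bits in favor of y, and the adjusted residual is well above the assumed threshold.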