Conference Paper

Addressing Class Distribution Issues of the Drawing vs Writing Classification in an Ink Stroke Sequence.


Abstract

Complicated by temporal correlations among the strokes and varying distributions of the underlying classes, the drawing/writing classification of ink strokes in a digital ink file poses interesting challenges. In this paper, we present our efforts in addressing some of these issues. First, we describe how we adjust the outputs of the neural network to the a priori probabilities of new observations to produce more accurate estimates of the posterior probabilities. Second, we describe how to adapt the parameters of the HMM to new data sets. Although the emission probabilities of the HMM are computed indirectly from the outputs of the neural network, our modified Baum-Welch algorithm still finds the correct estimates for the HMM's parameters. We also present experimental results of our new algorithms on six real-world data sets. The results show that our methods increase the F-measures of both the drawing and the writing classes on the more "drawing-intensive" data sets, which have stronger temporal correlations, but they do not perform well on the more "writing-intensive" data sets.
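The prior adjustment mentioned in the abstract can be illustrated with the standard Bayes-rule correction for a change in class priors. The sketch below is not taken from the paper: the function names, the class order (drawing, writing), and the example numbers are assumptions made for illustration. The second function shows the common hybrid NN/HMM convention of dividing posteriors by training priors to obtain quantities proportional to emission likelihoods, which may differ from the authors' exact formulation.

```python
def adjust_posteriors(posteriors, train_priors, new_priors):
    """Rescale classifier posteriors from the training priors to new priors.

    Each posterior is multiplied by new_prior / train_prior and the result
    is renormalized so the adjusted values sum to one.
    """
    weighted = [p * new / old
                for p, old, new in zip(posteriors, train_priors, new_priors)]
    total = sum(weighted)
    return [w / total for w in weighted]


def scaled_likelihoods(posteriors, train_priors):
    """Divide posteriors by training priors.

    The result is proportional to p(x | class), a common way to derive HMM
    emission scores from network outputs (an assumption here, not necessarily
    the paper's method).
    """
    return [p / old for p, old in zip(posteriors, train_priors)]


# Hypothetical example: a network trained on balanced data sees a stroke and
# outputs P(drawing) = 0.6, P(writing) = 0.4. The new file is assumed to be
# writing-intensive, with priors 0.2 (drawing) and 0.8 (writing).
adjusted = adjust_posteriors([0.6, 0.4], [0.5, 0.5], [0.2, 0.8])
emission_scores = scaled_likelihoods([0.6, 0.4], [0.5, 0.5])
```

With these assumed numbers the adjusted decision flips from drawing to writing, which is the kind of effect a prior correction is meant to capture when the class distribution of a new file differs from the training set.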


... Addressing a similar problem, Bishop et al. (2004), Patel et al. (2007), and Bhat and Hammond (2009) present methods that integrate shape and temporal information for classifying individual strokes as either text or drawing strokes. Wang et al. (2007) improve on Bishop et al.'s method. The goal of most previous single-stroke classification techniques is to identify the text strokes so they can be sent to a character recognizer, while the shape strokes (i.e., strokes comprising graphic objects) are left ungrouped. ...
Article
Objects in freely-drawn sketches often have no spatial or temporal separation, making object recognition difficult. We present a two-step stroke-grouping algorithm that first classifies individual strokes according to the type of object to which they belong, then groups strokes with like classifications into clusters representing individual objects. The first step facilitates clustering by naturally separating the strokes, and both steps fluidly integrate spatial and temporal information. Our approach to grouping is unique in its formulation as an efficient classification task rather than, for example, an expensive search task. Our single-stroke classifier performs at least as well as existing single-stroke classifiers on text vs. nontext classification, and we present the first three-way single-stroke classification results. Our stroke grouping results are the first reported of their kind; our grouping algorithm correctly groups between 86% and 91% of the ink in diagrams from two domains, with between 69% and 79% of shapes being perfectly clustered.
... Bishop et al. [6] trained and evaluated a classification algorithm using a Hidden Markov Model. Wang et al. [7] extend Bishop's approach by integrating a neural network. Gennari et al. [8] segmented pen strokes and then used properties of the pen stroke segments to interpret hand-drawn diagrams. ...
... The system should be capable of detecting text or unrecognized strokes and processing them using a different recognition technique or leaving them as unrecognized. Separating text from diagrams is a challenging problem that we do not address here, although recent approaches have proven quite successful at this task [6, 58]. ...
Chapter
In recent years there has been an increasing interest in sketch-based user interfaces, but the problem of robust free-sketch recognition remains largely unsolved. This chapter presents a graphical-model-based approach to free-sketch recognition that uses context to improve recognition accuracy without placing unnatural constraints on the way the user draws. Our approach uses context to guide the search for possible interpretations and uses a novel form of dynamically constructed Bayesian networks to evaluate these interpretations. An evaluation of this approach on two domains—family trees and circuit diagrams—reveals that in both domains the use of context to reclassify low-level shapes significantly reduces recognition error over a baseline system that does not reinterpret low-level classifications. Finally, we discuss an emerging technique to solve a major remaining challenge for multi-domain sketch recognition revealed by our evaluation: the problem of grouping strokes into individual symbols reliably and efficiently, without placing unnatural constraints on the user’s drawing style.
... or using conditional random fields to classify strokes in organizational chart diagrams as either connectors or boxes. Addressing a similar problem, Bishop et al. (2004), Patel et al. (2007), and Bhat and Hammond (2009) present methods that integrate shape and temporal information for classifying individual strokes as either text or drawing strokes. Wang et al. (2007) improve on Bishop et al.'s method. The goal of most previous single-stroke classification techniques is to identify the text strokes so they can be sent to a character recognizer, while the shape strokes (i.e., strokes comprising graphic objects) are left ungrouped. Our approach goes further, grouping the shape strokes as well. We aim t ...
Conference Paper
Objects in freely-drawn sketches often have no spatial or temporal separation, making object recognition difficult. We present a two-step stroke-grouping algorithm that first classifies individual strokes according to the type of object to which they belong, then groups strokes with like classifications into clusters representing individual objects. The first step facilitates clustering by naturally separating the strokes, and both steps fluidly integrate spatial and temporal information. Our approach to grouping is unique in its formulation as an efficient classification task rather than, for example, an expensive search task. Our single-stroke classifier performs at least as well as existing single-stroke classifiers on text vs. nontext classification, and we present the first three-way single-stroke classification results. Our stroke grouping results are the first reported of their kind; our grouping algorithm correctly groups between 86% and 91% of the ink in diagrams from two domains, with between 69% and 79% of shapes being perfectly clustered. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Article
Data mining has become an increasingly important tool for education researchers and practitioners. However, work in this field has focused on data from online educational systems. Here, we present techniques to enable data mining of handwritten coursework, which is an essential component of instruction in many disciplines. Our techniques include methods for classifying pen strokes as diagram, equation, and cross-out strokes. The latter are used to strike out erroneous work. We have also created techniques for grouping equation strokes into equation groups and then individual characters. Our results demonstrate that our classification and grouping techniques are more accurate than prior techniques for this task. We also demonstrate applications of our techniques for automated assessment of student competence. We present a novel approach for measuring the correctness of exam solutions from an analysis of lexical features of handwritten equations. This analysis demonstrates, for example, that the number of equation groups correlates positively with grade. We also use our techniques to extend graphical protocol analysis to free-form, handwritten problem solutions. While prior work in a laboratory setting suggests that long pauses are indicative of low competence, our work shows that the frequency of long pauses during exams correlates positively with competence.
Article
Objects in freely drawn sketches often have no spatial or temporal separation, making object identification difficult. We present a two-step stroke-grouping algorithm that first classifies individual strokes according to the type of object to which they belong, and then groups strokes with like classifications into clusters representing individual objects. The first step facilitates clustering by naturally separating the strokes, and both steps fluidly integrate spatial and temporal information. Our single-stroke classifier has comparable accuracy to an existing state-of-the-art single-stroke classifier on text vs. non-text classification, and is significantly more efficient. Furthermore, our classifier is also suitable for applications with more than two classes of strokes. Our approach to grouping is unique in its formulation as an efficient classification task rather than, for example, an expensive search task. In experiments on several types of sketches, our grouping method performed accurately, correctly grouping up to 92% of the ink, with up to 79% of the shapes being perfectly clustered. (C) 2014 Published by Elsevier Ltd.
Article
Full-text available
Many neural network classifiers provide outputs which estimate Bayesian a posteriori probabilities. When the estimation is accurate, network outputs can be treated as probabilities and sum to one. Simple proofs show that Bayesian probabilities are estimated when desired network outputs are 1 of M (one output unity, all others zero) and a squared-error or cross-entropy cost function is used. Results of Monte Carlo simulations performed using multilayer perceptron (MLP) networks trained with backpropagation, radial basis function (RBF) networks, and high-order polynomial networks graphically demonstrate that network outputs provide good estimates of Bayesian probabilities. Estimation accuracy depends on network complexity, the amount of training data, and the degree to which training data reflect true likelihood distributions and a priori class probabilities. Interpretation of network outputs as Bayesian probabilities allows outputs from multiple networks to be combined for higher level decision making, simplifies creation of rejection thresholds, makes it possible to compensate for differences between pattern class probabilities in training and test data, allows outputs to be used to minimize alternative risk functions, and suggests alternative measures of network performance.
Article
Full-text available
In real-world environments it usually is difficult to specify target operating conditions precisely, for example, target misclassification costs. This uncertainty makes building robust classification systems problematic. We show that it is possible to build a hybrid classifier that will perform at least as well as the best available classifier for any target conditions. In some cases, the performance of the hybrid actually can surpass that of the best known classifier. This robust performance extends across a wide variety of comparison frameworks, including the optimization of metrics such as accuracy, expected cost, lift, precision, recall, and workforce utilization. The hybrid also is efficient to build, to store, and to update. The hybrid is based on a method for the comparison of classifier performance that is robust to imprecise class distributions and misclassification costs. The ROC convex hull (ROCCH) method combines techniques from ROC analysis, decision analysis and computational geometry, and adapts them to the particulars of analyzing learned classifiers. The method is efficient and incremental, minimizes the management of classifier performance data, and allows for clear visual comparisons and sensitivity analyses. Finally, we point to empirical evidence that a robust hybrid classifier indeed is needed for many real-world problems.
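The geometric core of the ROC convex hull idea described above can be sketched as an upper convex hull over classifier operating points in ROC space. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the function names and the sample (false positive rate, true positive rate) points are hypothetical, and the trivial classifiers (0, 0) and (1, 1) are always included.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Return the upper convex hull of (FPR, TPR) points, left to right.

    Points strictly below the hull correspond to classifiers that are not
    optimal for any combination of class distribution and error costs.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Keep only clockwise turns; pop points that fall on or under
        # the segment joining their neighbours.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Hypothetical operating points of four classifiers.
classifiers = [(0.1, 0.5), (0.3, 0.6), (0.4, 0.9), (0.5, 0.5)]
hull = roc_convex_hull(classifiers)
```

In this example the points (0.3, 0.6) and (0.5, 0.5) lie under the hull, so only the remaining classifiers would need to be kept when the target class distribution or costs are uncertain.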
Conference Paper
Full-text available
Many different metrics are used in machine learning and data mining to build and evaluate models. However, there is no general theory of machine learning metrics that could answer questions such as: When we simultaneously want to optimise two criteria, how can or should they be traded off? Some metrics are inherently independent of class and misclassification cost distributions, while others are not; can this be made more precise? This paper provides a derivation of ROC space from first principles through 3D ROC space and the skew ratio, and redefines metrics in these dimensions. The paper demonstrates that the graphical depiction of machine learning metrics by means of ROC isometrics gives many useful insights into the characteristics of these metrics, and provides a foundation on which a theory of machine learning metrics can be built.
Article
Full-text available
Rare objects are often of great interest and great value. Until recently, however, rarity has not received much attention in the context of data mining. Now, as increasingly complex real-world problems are addressed, rarity, and the related problem of ...
Article
Full-text available
It sometimes happens (for instance in case control studies) that a classifier is trained on a data set that does not reflect the true a priori probabilities of the target classes on real-world data. This may have a negative effect on the classification accuracy obtained on the real-world data set, especially when the classifier's decisions are based on the a posteriori probabilities of class membership. Indeed, in this case, the trained classifier provides estimates of the a posteriori probabilities that are not valid for this real-world data set (they rely on the a priori probabilities of the training set). Applying the classifier as is (without correcting its outputs with respect to these new conditions) on this new data set may thus be suboptimal. In this note, we present a simple iterative procedure for adjusting the outputs of the trained classifier with respect to these new a priori probabilities without having to refit the model, even when these probabilities are not known in advance. As a by-product, estimates of the new a priori probabilities are also obtained. This iterative algorithm is a straightforward instance of the expectation-maximization (EM) algorithm and is shown to maximize the likelihood of the new data. Thereafter, we discuss a statistical test that can be applied to decide if the a priori class probabilities have changed from the training set to the real-world data. The procedure is illustrated on different classification problems involving a multilayer neural network, and comparisons with a standard procedure for a priori probability estimation are provided. Our original method, based on the EM algorithm, is shown to be superior to the standard one for a priori probability estimation. Experimental results also indicate that the classifier with adjusted outputs always performs better than the original one in terms of classification accuracy, when the a priori probability conditions differ from the training set to the real-world data. The gain in classification accuracy can be significant.
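The iterative procedure described in this abstract can be sketched as a short EM loop: the E-step rescales each stored posterior by the ratio of the current prior estimate to the training prior, and the M-step sets the new prior estimate to the mean of the rescaled posteriors. The code below is a minimal reading of that description, not the cited authors' code; the function name, the stopping tolerance, and the sample posteriors are assumptions.

```python
def em_prior_estimate(posteriors, train_priors, n_iter=100, tol=1e-9):
    """Estimate new class priors from classifier outputs on unlabeled data.

    posteriors: list of rows, each row holding the classifier's posterior
    for every class on one new observation (computed under train_priors).
    Returns (estimated_priors, adjusted_posteriors from the last E-step).
    """
    n_classes = len(train_priors)
    priors = list(train_priors)          # start from the training priors
    adjusted = [list(row) for row in posteriors]
    for _ in range(n_iter):
        # E-step: reweight each posterior by current_prior / train_prior.
        adjusted = []
        for row in posteriors:
            w = [p * new / old
                 for p, new, old in zip(row, priors, train_priors)]
            s = sum(w)
            adjusted.append([v / s for v in w])
        # M-step: the new prior is the average adjusted posterior.
        updated = [sum(r[c] for r in adjusted) / len(adjusted)
                   for c in range(n_classes)]
        converged = max(abs(u - p) for u, p in zip(updated, priors)) < tol
        priors = updated
        if converged:
            break
    return priors, adjusted


# Hypothetical outputs of a classifier trained with balanced priors,
# applied to four new observations (class order: drawing, writing).
outputs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
est_priors, adj = em_prior_estimate(outputs, [0.5, 0.5])
```

The procedure needs no labels for the new data and leaves the trained model untouched; only its outputs are rescaled.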
Conference Paper
Full-text available
We present a system that separates text from graphics strokes in handwritten digital ink. It utilizes not just the characteristics of the strokes, but also the information provided by the gaps between the strokes, as well as the temporal characteristics of the stroke sequence. It is built using machine learning techniques that infer the internal parameters of the system from real digital ink, collected using a tablet PC.
Article
In machine learning problems, differences in prior class probabilities -- or class imbalances -- have been reported to hinder the performance of some standard classifiers, such as decision trees. This paper presents a systematic study aimed at answering three different questions. First, we attempt to understand the nature of the class imbalance problem by establishing a relationship between concept complexity, size of the training set and class imbalance level. Second, we discuss several basic re-sampling or cost-modifying methods previously proposed to deal with the class imbalance problem and compare their effectiveness. The results obtained by such methods on artificial domains are linked to results in real-world domains. Finally, we investigate the assumption that the class imbalance problem does not only affect decision tree systems but also affects other classification systems such as Neural Networks and Support Vector Machines.
Conference Paper
Hand-drawn diagrams present a complex recognition problem. Elements of the diagram are often individually ambiguous, and require context to be interpreted. We present a recognition method based on Bayesian conditional random fields (BCRFs) that jointly analyzes all drawing elements in order to incorporate contextual cues. The classification of each object affects the classification of its neighbors. BCRFs allow flexible and correlated features, and take both spatial and temporal information into account. BCRFs estimate the posterior distribution of parameters during training, and average predictions over the posterior for testing. As a result of model averaging, BCRFs avoid the overfitting problems associated with maximum likelihood training. We also incorporate automatic relevance determination (ARD), a Bayesian feature selection technique, into BCRFs. The result is significantly lower error rates compared to ML- and MAP-trained CRFs.