ABSTRACT: We propose a novel method for recognizing sequential patterns such as motion trajectories of biological objects (e.g., cells, organelles, and protein molecules), human behavior motion, and meteorological data. In the proposed method, a local classifier is prepared for every point (or timing or frame), and the whole pattern is then recognized by majority voting over the recognition results of the local classifiers. The voting strategy has a strong benefit: even if an input pattern deviates greatly from a prototype locally at several points, those points do not severely influence the recognition result; they are treated as just a few incorrect votes and are thus neglected through the majority voting. To regularize the recognition result, we introduce partial dependency between local classifiers. An important point is that this dependency is introduced not only between local classifiers at neighboring point pairs but also between those at distant point pairs. Although the dependency makes the problem non-Markovian (i.e., higher-order Markovian), it can still be solved efficiently by a graph cut algorithm with polynomial-order computations. The experimental results revealed that the proposed method can achieve better recognition accuracy by exploiting these characteristics.
PLoS ONE 01/2013; 8(10):e76980. · 3.73 Impact Factor
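The voting step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-point nearest-prototype rule, the function name `classify_by_majority_vote`, and the toy data are assumptions made for illustration, and the dependency between local classifiers (which the paper solves by graph cut) is omitted entirely.

```python
from collections import Counter


def classify_by_majority_vote(sequence, prototypes):
    """Classify a 1-D sequence by majority voting over per-point decisions.

    sequence   : list of floats, one value per point (frame).
    prototypes : dict mapping class label -> list of floats of the same length.
    Both the data layout and the nearest-prototype rule are illustrative
    assumptions, not details taken from the paper.
    """
    votes = []
    for t, x in enumerate(sequence):
        # Local classifier at point t: pick the class whose prototype value
        # at the same point is closest to the observed value.
        label = min(prototypes, key=lambda c: abs(x - prototypes[c][t]))
        votes.append(label)
    # The whole pattern takes the label that received the most votes.
    winner = Counter(votes).most_common(1)[0][0]
    return winner, votes


if __name__ == "__main__":
    prototypes = {"A": [0.0] * 5, "B": [1.0] * 5}
    # The third point deviates strongly and casts one incorrect vote.
    sequence = [0.1, 0.2, 5.0, 0.0, 0.1]
    print(classify_by_majority_vote(sequence, prototypes))
    # prints ('A', ['A', 'A', 'B', 'A', 'A'])
```

The single outlier at the third point only changes one vote, so the overall decision is unaffected, which is the robustness property the abstract emphasizes.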
ABSTRACT: Applying mobile sensing technology to cognitive tasks will enable novel forms of activity recognition. Physical activity recognition technology has become mainstream: many dedicated mobile devices and smartphone apps count the steps we climb or the miles we run. What if devices and apps were also available that could count the words we read and how far we've progressed in our learning? The authors of this article demonstrate that mobile eye tracking can be used to do just that. Focusing on reading habits, they've prototyped cognitive activity recognition systems that monitor what and how much users read as well as how much they understand. Such systems could revolutionize teaching, learning, and assessment both inside and outside the classroom. Further, as sensing technology improves, activity recognition could be extended to other cognitive tasks including concentrating, retaining information, and auditory or visual processing. While this research is extremely exciting, it also raises numerous ethical questions: for example, who should know what we read or how much we understand?
ABSTRACT: Document image decoding (DID) is an attempt to understand the contents of a whole document without any reference information about font, language, etc. Typically, DID approaches assume the correct segmentation of the document and some a priori knowledge about the language or the script. Unfortunately, this assumption will not hold if we deal with various documents, such as documents with various sized fonts, camera-captured documents, free-layout documents, or historical documents. In this paper, we propose a part-based character identification method where no segmentation into characters is necessary and no a priori information about the document is needed. The approach clusters similar keypoints and groups frequent neighboring keypoint clusters. Then a second iteration is performed, i.e., the groups are again clustered and, optionally, pairs of frequent group clusters are detected. Our first experimental results on multi-font-size documents already look very promising. We could find nearly perfect correspondences between characters and detected group clusters.
ABSTRACT: In this paper we propose a part-based skew estimation method which is more robust to larger varieties of text images, such as camera-captured scene images. Specifically, the skew angle at each local part of the input image is estimated independently by referring to the local parts of upright character images stored in a database. Then the global skew angle is estimated by aggregating the estimated local skews. The proposed method does not assume that characters are laid out in straight lines and is thus more robust to the varieties of text images than conventional methods. The experimental results show the advantage of the proposed method over the conventional methods under several conditions.
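The abstract above does not say how the local skews are aggregated into a global skew. As one possible illustration only, the sketch below uses a circular mean, which handles angles that wrap around 0°/360°. The function name `aggregate_local_skews` and the choice of a circular mean are assumptions, not the method reported in the paper.

```python
import math


def aggregate_local_skews(local_angles_deg):
    """Combine per-part skew estimates (in degrees) into one global angle.

    Uses the circular mean: each angle becomes a unit vector, the vectors
    are summed, and the direction of the sum is returned in [0, 360].
    This aggregation rule is an illustrative assumption.
    """
    if not local_angles_deg:
        raise ValueError("at least one local skew estimate is required")
    s = sum(math.sin(math.radians(a)) for a in local_angles_deg)
    c = sum(math.cos(math.radians(a)) for a in local_angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


if __name__ == "__main__":
    # Local estimates scattered around 10 degrees give a global skew near 10.
    print(aggregate_local_skews([8.0, 9.0, 10.0, 11.0, 12.0]))
```

A plain arithmetic mean would average 350° and 10° to 180°, whereas the circular mean places the result at the wrap-around point, which is why it is used in this sketch.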
ABSTRACT: In this paper, we report recent work on the data-embedding pen, which adds an ink-dot sequence along a handwritten pattern during writing. The ink-dot sequence represents some information, such as the writer’s name, the date of writing, and a URL. This information drastically increases the value of handwriting on paper. The embedded information can be extracted from the handwritten pattern by image processing techniques and a stroke recovery technique. Consequently, the data-embedding pen can augment a handwritten pattern so that it carries arbitrary information.
ABSTRACT: The purpose of this paper is to analyze how image patterns distribute inside their feature space. For this purpose, 832,612 manually ground-truthed handwritten digit patterns are used. Using character patterns instead of general visual object patterns is essential for our purpose. First, since there are only 10 classes for digits, it is possible to have a sufficient number of patterns per class. Second, since the feature space of small binary character images is rather compact, it is easier to observe the precise pattern distribution with a fixed number of patterns. Third, the classes of character patterns can be defined far more clearly than those of visual objects. Through nearest neighbor analysis on the 832,612 patterns, their distribution in the 32 × 32 binary feature space is observed quantitatively and qualitatively. For example, the visual similarity of nearest neighbors and the existence of outliers, which are surrounded by patterns from different classes, are observed.
ABSTRACT: This paper reports a new method for visual tracking of humans using active RFID technology. Previous studies were based on the assumption that the radio intensity from an RFID tag will be linearly proportional to the distance between the tag and the antenna or will remain unchanged; however, in reality, the intensity fluctuates significantly and changes drastically with a small change in the environment. The proposed method helps to overcome this problem by using only accurate binary information that reveals whether the target person is close to the antenna. Several experimental results have shown that the information from the RFID tag was useful for reliable tracking of
IEEJ Transactions on Industry Applications 01/2011; 131(4):441-447.
ABSTRACT: In character recognition, multiple prototype classifiers, where multiple patterns are prepared as representative patterns of each class, have often been employed to improve recognition accuracy. Our question is how much the recognition accuracy can be improved by massively increasing the number of prototypes in the multiple prototype classifier. In this paper, we answer this question through several experimental analyses, using a simple 1-nearest neighbor (1-NN) classifier and about 550,000 manually labeled handwritten numeral patterns. The analysis results under leave-one-out evaluation showed not only the simple fact that more prototypes yield fewer recognition errors, but also the more important fact that the error rate decreases to approximately 40% when the number of prototypes is increased tenfold. The analysis results also revealed other phenomena in massive character recognition, for example that the NN prototypes become visually closer to the input pattern as the number of prototypes increases.
Proceedings of the 2011 ACM Symposium on Applied Computing (SAC), TaiChung, Taiwan, March 21 - 24, 2011; 01/2011
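The evaluation protocol named in the abstract above, a 1-NN classifier under leave-one-out evaluation, can be sketched as follows. This is a toy illustration rather than the authors' code: the Hamming distance on small binary tuples, the function names, and the five example patterns are all assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def leave_one_out_error(patterns, labels):
    """Leave-one-out error rate of a 1-NN classifier.

    Each pattern is classified by its nearest neighbor among all *other*
    patterns; the error rate is the fraction whose neighbor has a
    different label. Hamming distance is an illustrative choice.
    """
    errors = 0
    for i, (p, y) in enumerate(zip(patterns, labels)):
        # Nearest prototype, excluding the held-out pattern itself.
        nearest = min(
            (j for j in range(len(patterns)) if j != i),
            key=lambda j: hamming(p, patterns[j]),
        )
        if labels[nearest] != y:
            errors += 1
    return errors / len(patterns)


if __name__ == "__main__":
    patterns = [(0, 0, 0, 0), (0, 0, 0, 1), (1, 1, 1, 1), (1, 1, 1, 0), (0, 1, 1, 1)]
    labels = [0, 0, 1, 1, 0]
    # Only the last pattern has a nearest neighbor from another class.
    print(leave_one_out_error(patterns, labels))  # prints 0.2
```

The last pattern plays the role of an outlier surrounded by another class; with more prototypes of its own class nearby, its nearest neighbor would be more likely to carry the correct label.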
ABSTRACT: This paper presents a method of recovering digital ink for an intelligent camera pen, which is characterized by two functions: (1) it works on ordinary paper, and (2) if an electronic document is printed on the paper, the recovered digital ink is associated with the document. Two technologies, called paper fingerprint and document image retrieval, are integrated to realize these functions. The key to the integration is the introduction of image mosaicing and fast retrieval of previously seen fingerprints based on hashing of SURF local features. From experimental results on 50 handwriting samples, we have confirmed that the proposed method is effective for recovering and locating the digital ink from handwriting on physical paper.
Frontiers in Handwriting Recognition (ICFHR), 2010 International Conference on; 12/2010
ABSTRACT: This paper is concerned with an automatic construction algorithm for gesture networks. A gesture network is a network model of gestures for gesture recognition, especially early recognition and motion prediction. Manual construction of a gesture network is inefficient because the network has to be reconstructed whenever the target gestures are changed, and thus an automatic construction method is desired. This paper proposes an automatic construction algorithm for gesture networks based on logical DP matching. An experiment was conducted to evaluate the performance of the automatically constructed gesture network. The experimental result indicated that the proposed automatic construction algorithm can be an alternative to manual construction.
ABSTRACT: In this paper, a more compact and more reliable coding scheme for the data-embedding pen is proposed. The data-embedding pen produces an additional ink-dot sequence along a handwritten pattern during writing. The ink-dot sequence represents, for example, meta-information (such as the writer's name and the date of writing) and thus drastically increases the value of the handwriting on physical paper. There is no need to access any memory on the pen to recover the information, which is especially useful in multi-writer or multi-pen scenarios. In this paper we focus on the compactness of the encoded information. The aim is to encode as much information as possible in short stroke sequences. In our experiments we show that we can embed more information in shorter strokes than in previous work. In straight lines as short as 5 cm, 32 bits can successfully be embedded. Furthermore, the new encoding scheme also works reliably on more complex patterns.
International Conference on Frontiers in Handwriting Recognition, ICFHR 2010, Kolkata, India, 16-18 November 2010; 01/2010
ABSTRACT: This paper addresses the problem of how to extract, describe, and evaluate handwriting deformation from a deterministic viewpoint in order to improve recognition accuracy. The key ideas are threefold. The first is to extract a handwriting deformation vector field (DVF) between a pair of input and target images by 2D warping. The second is to hierarchically decompose the DVF by a parametric deformation model of global/local affine transformation, where the local affine transformation is iteratively applied to the DVF with decreasing window sizes. The third is to accept only low-order deformation components as natural, within-class handwriting deformation. Experiments using the handwritten numeral database IPTP CDROM1B show that correlation-based matching that absorbs the components of global affine transformation and local affine transformation up to the 3rd order achieved a recognition rate of 92.1%, higher than the 87.0% obtained by the original 2D warping.
20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010; 01/2010