
A Search Technique for Pattern Recognition Using Relative Distances

Abstract

A technique for creating and searching a tree of patterns using relative distances is presented. The search is conducted to find patterns which are nearest neighbors of a given test pattern. The structure of the tree is such that the search time is proportional to the distance between the test pattern and its nearest neighbor, which suggests the anomalous possibility that a larger tree, which can be expected on average to contain closer neighbors, can be searched faster than a smaller tree. The technique has been used to recognize OCR digit samples derived from NIST data at an accuracy rate of 97% using a tree of 7,000 patterns.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 17, NO. 9, SEPTEMBER 1995
Thomas E. Portegys
Index Terms: Pattern recognition, optical character recognition, nearest neighbor, distance metric, branch and bound, NIST digit samples.
I. INTRODUCTION
A technique for creating and searching a tree of patterns using relative distances is presented. The search is conducted to find patterns which are nearest neighbors of a given test pattern. The structure of the tree is such that the search time is proportional to the distance between the test pattern and its nearest neighbor, which suggests the anomalous possibility that a larger tree, which can be expected on average to contain closer neighbors, can be searched faster than a smaller tree. The technique has been used to recognize Optical Character Recognition (OCR) digit samples derived from National Institute of Standards and Technology (NIST) data [11] at an accuracy rate of 97% using a tree of 7,000 patterns.
The task of recognizing handwritten characters is an active area of research, especially in neural networks [4], [8], [10]. This paper is an investigation of how a particular memory-based, nearest neighbor search technique behaves when applied to a large number of patterns. As a nearest neighbor scheme, the search technique attempts to address problems encountered in dealing with many-dimensional objects [3], [5], [9], which in this case are represented by OCR patterns. The technique is intended to be applicable not only to character recognition, but to pattern recognition tasks in general.
The paper is organized as follows: First, a description of the tech-
nique is presented, including the definition of distance, the insertion
algorithm, and the search algorithm. The results of the NIST and
other tests are then presented. Finally, a proposal is made for a device
to improve the speed of searching.
II. DESCRIPTION
A. Distance Formula
The distance between patterns P1 and P2 was chosen to be the city-block distance:
$$\mathrm{dist}(P1, P2) = \sum_{i=1}^{N} |p1_i - p2_i| \qquad (1)$$
where
N = number of pixels in the patterns
p1, p2 = pixel values
Manuscript received Dec. 20, 1993; revised Mar. 17, 1995.
The author is with AT&T Bell Laboratories, Naperville, IL 60566; e-mail: t.e.pertegys@att.com.
IEEECS Log Number P95091.
The city-block distance, which is the Hamming distance for binary pixel values, is possibly not the best choice; it was chosen for the value of its fast computation. Any distance formula is suitable if it conforms to these conditions:
$$\mathrm{dist}(P1, P2) \ge 0 \qquad \text{(i)}$$
$$\mathrm{dist}(P1, P2) = \mathrm{dist}(P2, P1) \qquad \text{(ii)}$$
$$\mathrm{dist}(P1, P3) \le \mathrm{dist}(P1, P2) + \mathrm{dist}(P2, P3) \qquad \text{(iii)}$$
Conditions (i) and (ii) hold for the city-block distance due to the ab-
solute value operator. Condition (iii) is the well-known triangle ine-
quality [6]. This condition must hold for patterns containing single
pixels since it must be true for the distances between any three scalar
numbers. It must then hold for multiple pixel patterns since inequal-
ity relationships are preserved when summing.
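The distance formula and its three conditions can be illustrated with a short sketch. Python is used here for brevity; it is not the author's C code, and the function names and the random sample data are assumptions made for the example.

```python
import random
from itertools import permutations

def dist(p1, p2):
    """City-block distance: sum of absolute pixel differences, as in (1)."""
    if len(p1) != len(p2):
        raise ValueError("patterns must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(p1, p2))

def satisfies_conditions(patterns):
    """Check conditions (i)-(iii) on every ordered triple of the given patterns."""
    for a, b, c in permutations(patterns, 3):
        if dist(a, b) < 0:                        # (i) non-negativity
            return False
        if dist(a, b) != dist(b, a):              # (ii) symmetry
            return False
        if dist(a, c) > dist(a, b) + dist(b, c):  # (iii) triangle inequality
            return False
    return True

rng = random.Random(0)
# Hypothetical data: 6 random patterns of 16 pixels with four gray levels (0-3).
samples = [[rng.randrange(4) for _ in range(16)] for _ in range(6)]
print(dist([0, 1, 3], [3, 1, 0]))
print(satisfies_conditions(samples))
```

A check over sample triples is only a sanity test; the argument in the text is what establishes the triangle inequality for all patterns.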
B. Pattern Insertion
Patterns are stored in a tree structure according to their relative distances for efficient searching. They are inserted into the tree by a recursive procedure starting at the root, which is the first prospective parent pattern. A decision is made whether to link the new pattern directly to the parent pattern or to pass it on to the first child of the parent which it fits. An inserted pattern fits a child pattern if the distance between it and the child is less than the distance between the parent and the child multiplied by a constant. If the constant is .5, for example, the pattern is passed to the child if it is within a radius of half the distance between the parent and the child patterns. Once the pattern is passed to a child, the child becomes the parent for the next iteration of link checking.
The algorithm used for insertion, written in C, using RADIUS as
the link control constant, is given in Appendix A.
The purpose of the RADIUS constant is to control the degree (bushiness) of the tree. At the extremes, when RADIUS is set to 0, all children are linked to the root pattern; when set to 2, no pattern will have more than one child, i.e., the tree will be a linked list. For the NIST tests described later, RADIUS was set to .7, which yielded an average node degree of 3.6.
One feature of the algorithm is that the order of a pattern's subtree branches is important: A new pattern is always inserted in the first fitting branch (even though there can be more than one such branch). This feature allows the search of a tree which contains a duplicate of a given pattern to proceed with maximum efficiency: The duplicate will always be found on the first branch which fits the pattern.
Another feature is a tree reorganization procedure which prevents excessive children from accumulating on a parent pattern. When a new child is linked to a parent pattern, every other previously inserted child is checked to determine if it should be linked to the new child instead of the parent. For each such child, the child subtree is severed from the parent and each pattern in the subtree is inserted at the current parent pattern. Note that they are not inserted into the new child since not all patterns in the subtree necessarily fit there.
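The link decision described in this section can be sketched as follows. This is a minimal Python illustration, not the C routine of Appendix A: the `Node` class, its field names, and the iterative form are assumptions, and the tree reorganization step is deliberately left out to keep the sketch short.

```python
RADIUS = 0.7  # link control constant; .7 is the value reported for the NIST tests

def dist(p1, p2):
    """City-block distance between two equal-length patterns."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

class Node:
    """Hypothetical tree node: a pattern, its distance to its parent, its children."""
    def __init__(self, pattern, parent_dist=0):
        self.pattern = pattern
        self.parent_dist = parent_dist  # dist(parent, this pattern)
        self.children = []              # order matters: the first fitting child wins

def insert(root, pattern):
    """Pass the pattern down to the first fitting child; link it where none fits."""
    parent = root
    while True:
        for child in parent.children:
            # The pattern fits the child if it lies within RADIUS times the
            # parent-child distance of that child.
            if dist(pattern, child.pattern) < child.parent_dist * RADIUS:
                parent = child  # the child becomes the next prospective parent
                break
        else:
            node = Node(pattern, dist(pattern, parent.pattern))
            parent.children.append(node)
            return node

# One-pixel patterns keep the distances easy to follow.
root = Node([0])
for p in ([10], [12], [2]):
    insert(root, p)
print([c.pattern for c in root.children])              # children of the root
print([c.pattern for c in root.children[0].children])  # children of [10]
```

Here [12] lies within 0.7 x 10 of [10] and is passed down to it, while [2] does not fit [10] and is linked to the root.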
C. Pattern Searching
The purpose of the pattern search algorithm is to efficiently find
patterns in the tree which are nearest neighbors of a given test pat-
tern. Searching is done in a best-first manner, that is, the pattern
whose subtree could contain the pattern least distant from the search
pattern is searched next.
The essence of the search decision procedure is illustrated in Fig. 1. When the search pattern S arrives at pattern A, it must be determined whether to search pattern B or D next. This is done by calculating the least distances between S and the patterns which can potentially appear within the subtrees of B and D. The nearest such potential positions are labeled b and d, respectively. This distance, called the search distance, is defined as follows for S and B:
0162-8828/95$04.00 © 1995 IEEE
$$\mathrm{search}(S, B) = \mathrm{dist}(S, B) - (\mathrm{dist}(A, B) \times \mathrm{RADIUS}) = \mathrm{dist}(S, b) \qquad (2)$$
RADIUS is the link control constant (see the pattern insertion algo-
rithm). If (2) results in a value less than zero, the search distance is
set to zero. After calculating the search distances for B and D, the
search proceeds to the pattern having the least search distance.
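As a small worked example of the search distance, the following Python sketch evaluates it for two candidate children. The function name and the two-pixel patterns are made up for illustration, and the formula is read as the distance from S to the child minus the child's fit radius, clamped at zero.

```python
RADIUS = 0.7  # link control constant, as used for insertion

def dist(p1, p2):
    """City-block distance between two equal-length patterns."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

def search_distance(s, parent, child, radius=RADIUS):
    """Least possible distance from s to any pattern that fits under child."""
    return max(0, dist(s, child) - dist(parent, child) * radius)

# Hypothetical two-pixel patterns: A is the parent, B and D are its children.
A, B, D = [0, 0], [10, 0], [0, 4]
S = [8, 1]
to_b = search_distance(S, A, B)  # S lies inside B's radius, so this clamps to 0
to_d = search_distance(S, A, D)
print(to_b, to_d)
print("search B next" if to_b <= to_d else "search D next")
```

With these values the search would proceed to B, since S falls inside the region that patterns under B may occupy.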
Fig. 1. Search distance.
The search algorithm is given in Appendix B. The SEARCH-
WORK structures contain temporary data and are configured into a
copy of the searched portion of the pattern tree during the search. To prevent searching the entire tree, a maximum number of patterns to search can be provided as a parameter. The value of this parameter can be changed dynamically to allow for more or less searching under various circumstances, such as time constraints. A list of similar patterns is returned by the algorithm.
The algorithm features a branch and bound capability which al-
lows portions of the search tree to be cut-off during the search: As
less distant patterns are found, greater cut-off is achieved. This is be-
cause a pattern does not have to be searched if its search distance is
greater than the distance of a previously found pattern.
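The best-first order and the cut-off rule can be sketched with a priority queue. This Python sketch is an assumption-laden illustration rather than the Appendix B routine: it uses a simplified insertion without reorganization, returns only the single best pattern instead of a list, and counts a pattern as searched when it is taken off the queue. The names `search`, `max_searched`, and `Node` are hypothetical.

```python
import heapq
import random
from itertools import count

RADIUS = 0.7  # link control constant (value reported for the NIST tests)

def dist(p1, p2):
    """City-block distance between two equal-length patterns."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

class Node:
    def __init__(self, pattern, parent_dist=0):
        self.pattern = pattern
        self.parent_dist = parent_dist  # dist(parent, this pattern)
        self.children = []

def insert(root, pattern):
    """Simplified insertion: descend into the first fitting child, else link here."""
    parent = root
    while True:
        for child in parent.children:
            if dist(pattern, child.pattern) < child.parent_dist * RADIUS:
                parent = child
                break
        else:
            parent.children.append(Node(pattern, dist(pattern, parent.pattern)))
            return

def search(root, s, max_searched=None):
    """Best-first search; returns (best pattern, its distance, patterns searched)."""
    best, best_dist = root.pattern, dist(s, root.pattern)
    searched, tie = 1, count()
    frontier = []  # heap of (search distance, tie-breaker, distance to s, node)

    def push_children(node):
        for child in node.children:
            d = dist(s, child.pattern)
            sd = max(0, d - child.parent_dist * RADIUS)  # search distance
            heapq.heappush(frontier, (sd, next(tie), d, child))

    push_children(root)
    while frontier:
        sd, _, d, node = heapq.heappop(frontier)
        if sd > best_dist:
            break  # bound: no remaining subtree can hold a closer pattern
        if max_searched is not None and searched >= max_searched:
            break  # forced cut-off on the extent of the search
        searched += 1
        if d < best_dist:
            best, best_dist = node.pattern, d
        push_children(node)
    return best, best_dist, searched

rng = random.Random(1)
# Hypothetical data: 300 random 16-pixel patterns with four gray levels.
patterns = [tuple(rng.randrange(4) for _ in range(16)) for _ in range(300)]
root = Node(patterns[0])
for p in patterns[1:]:
    insert(root, p)
query = tuple(rng.randrange(4) for _ in range(16))
best, best_dist, searched = search(root, query)
print(best_dist, searched)
```

Because every pattern in a child's subtree passed that child's fit check on the way down, the triangle inequality makes the search distance a valid lower bound, so the unlimited search returns a true nearest neighbor; passing `max_searched` trades that guarantee for a bounded amount of work.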
III. DIGIT RECOGNITION TEST RESULTS
The patterns were derived from NIST digits 0-9, and were obtained through an internal company source. They were size normalized to fit in a 20 x 20 pixel box, and were then centered in a 28 x 28 image using center of gravity. The pixels were scaled to four levels of gray value. Fig. 2 shows an example of a pattern for the digit 6. The comparisons were done without any translation or rotation.
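The preprocessing is described only briefly, so the following Python sketch shows one plausible reading of it rather than the procedure actually used. The 0-255 input range, the clamping that keeps the box inside the frame, and the function names are all assumptions.

```python
def quantize(value, levels=4, max_value=255):
    """Map a raw pixel value in 0..max_value onto gray levels 0..levels-1."""
    return min(levels - 1, value * levels // (max_value + 1))

def center_in_frame(box, frame_size=28):
    """Place a size-normalized box in a square frame, centered on its center of gravity."""
    h, w = len(box), len(box[0])
    total = sum(sum(row) for row in box)
    if total == 0:
        cy, cx = (h - 1) / 2, (w - 1) / 2  # blank box: fall back to the geometric center
    else:
        cy = sum(y * v for y, row in enumerate(box) for v in row) / total
        cx = sum(x * v for row in box for x, v in enumerate(row)) / total
    mid = (frame_size - 1) / 2
    # Shift so the center of gravity lands near the frame center,
    # clamped so that the whole box stays inside the frame.
    top = min(max(round(mid - cy), 0), frame_size - h)
    left = min(max(round(mid - cx), 0), frame_size - w)
    frame = [[0] * frame_size for _ in range(frame_size)]
    for y, row in enumerate(box):
        for x, v in enumerate(row):
            frame[top + y][left + x] = v
    return frame

# A uniformly filled 20 x 20 box ends up with a 4-pixel margin on every side.
uniform = [[quantize(255)] * 20 for _ in range(20)]
frame = center_in_frame(uniform)
print(len(frame), len(frame[0]), frame[4][4], frame[3][4])
```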
A. Test 1
For the first test, approximately 7,000 digit samples were inserted
into a tree, and a different set of 1,000 patterns was selected as test
patterns. 97.1% of the test samples were identified correctly, which is
comparable with other pattern recognizers. In addition, on the aver-
age, the nearest neighbor was found after searching 337 patterns in
the search tree, and 6,000 patterns were not searched due to cut-off
conditions.
B. Test 2
The next series of tests were an attempt to simulate a search tree
large enough to presumably contain patterns that are very similar to a
set of search patterns. The question of what happens to the reliability
and extent of the search of such trees in relation to their size was the
focus of these tests.
Fig. 2. Example NIST digit.
The initial search tree contained 1,000 patterns, and these same
1,000 patterns also comprised the search set, meaning the search was
to find identical patterns in the tree. Following this, an additional
1,000 patterns were inserted into the tree, and a random set of 1,000
out of the 2,000 accumulated patterns was selected as the search set.
This procedure was repeated until 10,000 patterns were inserted into
the tree.
Fig. 3 shows the average number of patterns searched before finding the identical pattern as a function of search tree size. In all cases, the identical pattern was found. It can be seen that only about 25 additional patterns were searched as the tree grew by 9,000 patterns. The data also roughly conform to the function $\log_{3.6}(x) \times 10$, where x is the tree size and 3.6 was found to be the average degree of the search tree. This suggests that the search effort is proportional to some logarithmic function of the tree size.
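To get a feel for the fitted function, the following Python sketch evaluates it at a few tree sizes. It only computes the curve stated in the text; it does not reproduce the measured data, and the function name is an assumption.

```python
import math

AVG_DEGREE = 3.6  # average node degree reported for the search tree

def fitted_patterns_searched(tree_size):
    """Value of the fitted curve: 10 times the base-3.6 logarithm of the tree size."""
    return 10 * math.log(tree_size, AVG_DEGREE)

for size in (1000, 4000, 7000, 10000):
    print(size, round(fitted_patterns_searched(size), 1))
```

The curve rises from roughly 54 at 1,000 patterns to roughly 72 at 10,000, a shallow growth of the same order as the increase reported above.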
Fig. 3. Patterns searched to find identical pattern.
C. Test 3
As a check on the validity of using identical search patterns instead of similar ones, noise was randomly introduced into the search patterns such that they were closely similar to, but not identical to, patterns in the tree. The noise was chosen such that the average distance between a stored pattern and a modified search pattern was 10% of the average distance between the stored pattern and a random pattern. Searching a tree containing 10,000 patterns resulted in a 100% identification rate, and an average search of 196 patterns to find the most similar. In addition, 9,065 patterns were not searched due to cut-off conditions.
A forced cut-off at 200 patterns was introduced in the above
search to determine the effect of limiting the extent of the search.
This resulted in an identification rate of 96%, with the most similar
pattern being found after an average search of 79 patterns. 9,801
patterns were not searched due to cut-off conditions. The primary
reason for the effectiveness of the limit was that it curtailed the rela-
tively few searches of excessive extent. In most cases, these excessive
searches were not necessary since a correctly identifying pattern was
found early on, even though it was not the most similar one.
D. Test 4
To look further at the effect of increasing the distance between the search patterns and the stored patterns in a larger tree, $2^{17}$ (131,072) patterns were created by generating all possible combinations of the pixel values of 0 and 10, thus ensuring that the stored patterns have a minimum distance of 10 between them. These were then added to the search tree in random order. A set of search patterns was created by randomly selecting 100 stored patterns. This formed the distance 0 search set. The distance 1 search set was formed by copying the distance 0 set, randomly selecting a pixel in each pattern and modifying its value by 1. Distance 2-5 search sets were created in the same successive manner.
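The construction of the Test 4 data can be sketched in Python as below. Several details are assumptions made so that the example is well defined: each pattern is taken to have 17 pixels (so that the two pixel values give 131,072 combinations), a modified pixel moves by 1 toward the other level, and each distance-k pattern is built by modifying k distinct pixels of the original rather than by copying the previous set.

```python
import random
from itertools import product

N_PIXELS = 17  # assumed: two values per pixel give 2**17 = 131,072 patterns

def dist(p1, p2):
    """City-block distance between two equal-length patterns."""
    return sum(abs(a - b) for a, b in zip(p1, p2))

# All combinations of the pixel values 0 and 10.
stored = list(product((0, 10), repeat=N_PIXELS))

def perturb(pattern, k, rng):
    """Return a copy at city-block distance k: k distinct pixels each moved by 1."""
    pixels = list(pattern)
    for i in rng.sample(range(N_PIXELS), k):
        pixels[i] += 1 if pixels[i] == 0 else -1
    return tuple(pixels)

rng = random.Random(0)
distance0 = rng.sample(stored, 100)  # the distance 0 search set
search_sets = {k: [perturb(p, k, rng) for p in distance0] for k in range(6)}
print(len(stored), [len(search_sets[k]) for k in range(6)])
```

Since any two distinct stored patterns differ by at least 10, a search pattern at distance 5 or less still has its original as the unique nearest stored pattern.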
Fig. 4 plots the number of patterns searched to find the most
similar stored pattern as a function of the distance of the search pat-
terns. The function appears to be a linear one, especially looking at
the distance 2-5 points.
Fig. 4. Patterns searched to find similar pattern in 131,072 stored patterns.
IV. A PATTERN COMPARATOR DEVICE
The search mechanism demands the repetitive computation of (1), which for the 784 pixel NIST pattern size requires a little over 5 ms per invocation on a SUN SPARC workstation. This is the time to compare two patterns during a search which involves multiple comparisons. In approximately the same time other systems are able to complete a recognition computation using specialized hardware [12]. It seems clear that there is a need for a faster means of computing the distance. A vector processor would be able to compute the difference terms in parallel, but the summation of these terms would remain as a bottleneck.
An optical device may hold the answer as a means of performing
the summation. Consider the device shown in Fig. 5. The difference
terms are transduced into optical signals whose intensities are pro-
portional to the sizes of the differences. A lens is used to focus these
signals onto a detector which is capable of responding in proportion
to the sum of the signal intensities, thus achieving a naturally parallel
summation of the terms.
It is likely that this device could be made to compute a sum in a few microseconds, given the known performance of optical devices [1], [7].
Fig. 5. Optical summation.
V. CONCLUSION
Much like chess playing machines, which have gained high rankings largely by relying on speed [2], the findings presented here suggest that a brute force approach, in the form of storing a large number of patterns, may be effective for pattern recognition.