PATSEEK: Content Based Image Retrieval System for Patent Database
Avinash Tiwari1, Veena Bansal2
1,2 Industrial and Management Engineering Department
IIT Kanpur 208016 India
2veena@iitk.ac.in
ABSTRACT
A patent always contains images along with the text. Many text-based systems have been developed to search the patent database. In this paper, we describe PATSEEK, an image-based search system for the US patent database. The objective is to let the user check the similarity of a query image with the images that exist in US patents. The user can specify a set of keywords that must occur in the text of the patents whose images will be searched for similarity. PATSEEK automatically grabs images from the US patent database on the request of the user and represents them through an edge orientation autocorrelogram. L1 and L2 distance measures are used to compute the distance between the images. A recall rate of 100% for 61% of the query images and an average recall rate of 32% for the rest has been observed.
Keywords: Patent Search, Content Based Image Retrieval, Recall rate, Precision
1. INTRODUCTION
The sheer size of the information available about patents has led many researchers to develop efficient and effective retrieval techniques for patent databases.
The Web Patent Full-Text Database (PatFT) of
United States Patent Office (USPTO) contains the full-
text of over 3,000,000 patents from 1976 to the present
and it provides links to the Web Patent Full-Page
Images Database (PatImg), which contains over
70,000,000 images. The volume of this repository makes it prohibitive for humans to find similar images in it. This has motivated us to develop a content-based image retrieval system for patent databases.
Initial image retrieval techniques were text-based: they associated textual information such as filenames, captions and keywords with every image in the repository, and keyword-based matching was employed to find relevant images. The manual annotation required a prohibitive amount of labor. Moreover, it was difficult to capture the rich content of images using a small number of keywords, apart from being an unnatural way of describing images.
Soon it was realized that retrieval based on image content is a natural and effective way of retrieving images. This led to the development of Content-Based Image Retrieval (CBIR) [8], which aims at efficient retrieval of relevant images from large image databases based on automatically derived image features. The original Query by Image Content (QBIC) system [3] allowed the user to select the relative importance of color, texture, and shape. The Virage system [1] allows queries to be built by combining color, composition (color layout), texture, and structure (object boundary information).
Moments [6], Fourier descriptors [5], [11], and chain codes [4], [12] have also been used as features. Jain and Vailaya [7] introduced the edge direction histogram (EDH) using the Canny edge operator [2].
The edge orientation autocorrelogram (EOAC)
classifies edges based on their orientations and
correlation between neighboring edges in a window
around the kernel edge [9].
2. SYSTEM ARCHITECTURE
PATSEEK consists of two subsystems: one for creating the feature vector and image database, and another for retrieving images similar to the query image. The components of both subsystems and their interaction are shown in figure 1.
Figure 1. System Architecture of PATSEEK (components: image grabber, image segmentation, feature extraction, image and feature vector databases, feature matching, and result display)
In order to add the feature vectors and images to the
database, the user interacts with the system through its
graphical user interface and provides a set of
keywords. A snapshot of the user interface for
specifying the search criteria for the patents to be
grabbed is shown in figure 2. The patents are grabbed
from the USPTO website. The image grabber searches
the patent database and grabs the image pages from the
patents that satisfy the search criteria.
Figure 2. Patent Grabber
A page image may contain more than one individual
image. To extract the individual images from these
page images, we need to identify the connected
components or blocks. In these pages, the connectivity
is present at a very gross level and an individual image
is well separated in both directions from other images.
The occasional captions are insignificant compared to
the images and can be treated as noise. Our experiment
shows that these captions do not change the feature
vector of an image in any significant way. To identify
the connected components, we start scanning a page
image from the first pixel row. If a row has no black
pixel, the row is discarded from further consideration.
If we find a row that contains at least one black pixel,
the position of the row is recorded as the potential start
of a connected component. We continue the horizontal
scan as long as we continue to find rows that contain at
least one black pixel.
If we find enough (a design parameter) contiguous horizontal rows that contain no black pixels, we have reached the end of the current block, if any. We continue the horizontal scan till the end of the page image. Each horizontal block identified is then scanned and segmented vertically using a similar threshold. These thresholds are set to 1 mm. A block smaller than 5 square cm is discarded as noise.
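The scan described above can be sketched as follows, assuming a binary page array (`True` = black pixel). The function name and the pixel thresholds (`gap_rows`, `min_area_px`) are hypothetical stand-ins for the paper's 1 mm gap and 5 square cm area limits, which depend on the scan resolution:

```python
import numpy as np

def segment_page(page, gap_rows=20, min_area_px=5000):
    """Split a binary page image into blocks: rows (then columns) with no
    black pixels separate blocks; a run of at least gap_rows empty lines
    ends the current block, and blocks below min_area_px are noise."""

    def runs(mask, gap):
        # Group True indices into runs separated by >= gap False entries.
        blocks, start, empty = [], None, 0
        for i, filled in enumerate(mask):
            if filled:
                if start is None:
                    start = i          # potential start of a block
                empty = 0
            elif start is not None:
                empty += 1
                if empty >= gap:       # enough empty lines: block ended
                    blocks.append((start, i - empty + 1))
                    start = None
        if start is not None:          # block running to the page edge
            blocks.append((start, len(mask)))
        return blocks

    boxes = []
    row_mask = page.any(axis=1)        # horizontal scan over pixel rows
    for r0, r1 in runs(row_mask, gap_rows):
        col_mask = page[r0:r1].any(axis=0)   # vertical scan of the band
        for c0, c1 in runs(col_mask, gap_rows):
            if (r1 - r0) * (c1 - c0) >= min_area_px:
                boxes.append((r0, r1, c0, c1))
    return boxes
```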
A document containing three images, and the corresponding blocks identified for them, is shown in figure 3.
Figure 3. Blocks Identified for Individual Images
The separated images are stored in the image database along with the patent number and the page number within the patent where each image was found.
The graphic content is then used to compute the image feature vector. We selected the EOAC for our work because it is computationally inexpensive and invariant to translation and scaling. Also, the feature vector is small: 144 real numbers. We computed the magnitude and gradient of the edges using the Canny edge operator. Edges with less than 10 percent of the maximum possible intensity are excluded from further consideration.
Figure 4. Query Image Selection
The edge orientations are then quantized into 36 bins of 5 degrees each. The edge orientation autocorrelogram, a matrix of 36 rows and 4 columns, is then formed. Each element of this matrix counts the number of edges with similar orientation: columns 1, 2, 3 and 4 count edges that are 1, 3, 5 and 7 pixels apart, and each row corresponds to one 5-degree bin.
Two edges that are k pixels apart are said to be similar if the absolute values of their orientation and amplitude differences are less than an angle threshold and an amplitude threshold, respectively [9]. These are user-defined thresholds. The edge orientation autocorrelogram is stored in the database along with the patent number and the page number within the patent where the image was found.
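A sketch of the feature computation described above. This is not the authors' implementation: central-difference gradients stand in for the Canny operator, the angle and amplitude thresholds are illustrative defaults, and the histogram is normalized here to sum to one (one plausible way to obtain scale invariance). The 144 real numbers mentioned in the text are exactly the 36 x 4 entries of this matrix:

```python
import numpy as np

def eoac(gray, dists=(1, 3, 5, 7), n_bins=36,
         angle_thresh=5.0, amp_frac=0.1):
    """Edge orientation autocorrelogram: a 36 x 4 matrix whose row i,
    column k counts pairs of similar edges (orientation difference below
    angle_thresh degrees, amplitude difference below amp_frac of the
    maximum) that are dists[k] pixels apart, with the kernel edge
    falling in 5-degree orientation bin i."""
    g = gray.astype(float)
    gx = np.zeros_like(g)
    gy = np.zeros_like(g)
    gx[:, 1:-1] = g[:, 2:] - g[:, :-2]   # central-difference gradients
    gy[1:-1, :] = g[2:, :] - g[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # orientation in [0, 180)

    strong = mag >= 0.10 * mag.max()     # drop edges under 10% of maximum
    hist = np.zeros((n_bins, len(dists)))
    for y, x in zip(*np.nonzero(strong)):
        b = min(int(ang[y, x] // 5), n_bins - 1)
        for k, d in enumerate(dists):
            # look d pixels away in the four axis directions
            for dy, dx in ((d, 0), (-d, 0), (0, d), (0, -d)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < g.shape[0] and 0 <= nx < g.shape[1]
                        and strong[ny, nx]
                        and abs(ang[ny, nx] - ang[y, x]) < angle_thresh
                        and abs(mag[ny, nx] - mag[y, x]) < amp_frac * mag.max()):
                    hist[b, k] += 1
    return hist / max(hist.sum(), 1.0)   # normalize so the vector sums to 1
```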
Feature vectors for the images are stored in an RDBMS table. We have created the feature vector database on Oracle as well as on MySQL.
3. THE DATABASE RETRIEVAL SYSTEM
For querying the database for images similar to a query
image, the user interacts with the system through a
graphical user interface (shown in figure 4). The user
can specify the query image by providing its name and
path. The image can be in any popular format such as TIFF, GIF, or JPEG.
PATSEEK gives the user the option to specify a rotation angle for the query image, ranging from 0 to 180 degrees.
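The paper does not spell out how the rotation option is applied; one natural reading, given that image rotation shifts every edge orientation by the same angle, is that it amounts to a circular shift of the EOAC's 5-degree rows. A hypothetical sketch of that interpretation:

```python
import numpy as np

def rotate_eoac(hist, angle_deg):
    """Approximate the EOAC of a rotated image by circularly shifting the
    orientation rows; angle_deg / 5 gives the number of 5-degree bins."""
    shift = int(round(angle_deg / 5.0)) % hist.shape[0]
    return np.roll(hist, shift, axis=0)
```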
We have used the L1 and L2 distance measures to select the 12 nearest neighbors, and both give almost identical results. The top twelve images, ranked by distance, are displayed as thumbnails along with their respective distances.
Figure 5. The User Interface for query result navigation
The graphical user interface displays the query image and the results to the user for browsing. A snapshot of this interface is shown in figure 5.
4. EXPERIMENTS
In this section, we report experimental results. All experiments were performed on an Intel Pentium IV processor at 2.4 GHz with 512 MB of RAM. The system was implemented in Java (Sun JDK 1.4.1). The feature vector database was initially created on Oracle and later moved to MySQL Ver 12.21. We did not use the Oracle client; we implemented the front-end using Java and its utilities.
Our database contains approximately 200 images picked from the patent database of the United States Patent Office. For the performance evaluation, we have arbitrarily chosen 15 images from our collection. For each query image, a set of relevant images in the database has been manually identified.
An ideal image retrieval system is expected to retrieve all the relevant images. One popular measure is the recall rate [10], the ratio of the number of relevant images retrieved to the total number of relevant images in the collection. The precision rate is the ratio of the number of relevant images retrieved to the total number of images retrieved.
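A minimal sketch of the two measures, with precision taken relative to the number of images retrieved (the standard definition):

```python
def recall_precision(retrieved, relevant):
    """Recall: fraction of the relevant images that were retrieved.
    Precision: fraction of the retrieved images that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(relevant), hits / len(retrieved)
```

For example, if 4 of 5 relevant images appear in a 12-image answer set, recall is 0.8 and precision is 1/3.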
The Minkowski-form distance is used, assuming that each dimension of the image feature vector is independent of the others and of equal importance. L1 and L2 (the latter also called the Euclidean distance) are two of the most widely used Minkowski-form distance measures.
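A sketch of the Minkowski-form distance and the 12-nearest-neighbor selection over stored feature vectors (the function names are ours, not from the paper):

```python
import numpy as np

def minkowski(a, b, p):
    """Minkowski-form distance: p = 1 gives L1 (city-block),
    p = 2 gives L2 (Euclidean)."""
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def nearest(query, db, p=1, k=12):
    """Rank the database of feature vectors by distance to the query and
    return the k nearest as (index, distance) pairs."""
    scored = [(i, minkowski(query, v, p)) for i, v in enumerate(db)]
    return sorted(scored, key=lambda t: t[1])[:k]
```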
We have compared the performance of the two similarity measures, L1 and L2 distance, in our experiments, calculating the precision and recall rates for each query image. A recall rate of 100% for 61% of the query images and an average recall rate of 32% for the rest were observed. The precision rate varied between 10% and 35%; it depends greatly on the number of images in the database. Graphs 1 and 2 show the precision and recall rates for the 15 query images. The image shown in figure 6 is one of the images selected for the query image shown on the left in figure 4, but it is not among the top-ranking results. For the same query image, when we specify a rotation angle of 180 degrees, the ranking of this image improves considerably.
Figure 6. An image obtained by rotating the query
image
Graph 1. Precision rates for the 15 query images (precision rate versus image identifier)
Graph 2. Recall rates for the 15 query images (recall rate versus image identifier)
5. CONCLUSION
In this paper, we have described PATSEEK, an image retrieval system for the US patent database. The system has shown good performance and can be used effectively by the patent office to locate similar patents before issuing a new one. A researcher or developer can likewise locate all the images similar to an image in his document before filing for a patent. PATSEEK can complement a text-based search system.
The image feature vector database for PATSEEK is expected to grow, and the retrieval system must give real-time performance even when the database is large. At present, we have made no effort to optimize for speed: the total time elapsed from the moment a query image is given to the moment relevant images are retrieved is about 90 seconds. We plan to use multidimensional indexing to cut down the retrieval time.
REFERENCES
[1] Bach, J. R., et al., “The Virage image search engine: an open framework for image management”, Proc. SPIE: Storage and Retrieval for Still Image and Video Databases IV, vol 2670, pp 76-87, 1996.
[2] Canny, J., “A computational approach to edge detection”, IEEE Trans. Pattern Analysis and Machine Intelligence, PAMI-8, pp 679-698, 1986.
[3] Flickner, M., et al., “Query by image and video content: the QBIC system”, IEEE Computer, vol 28, pp 23-32, 1995.
[4] Freeman, H., “On the encoding of arbitrary geometric configurations”, IRE Trans. on Electronic Computers, vol EC-10, pp 260-268, 1961.
[5] Gonzalez, R.C. and Wintz, P., Digital Image Processing, Addison-Wesley, Reading, MA, 1992.
[6] Hu, M. K., “Visual pattern recognition by moment invariants”, IRE Transactions on Information Theory, IT-8, pp 179-187, 1962.
[7] Jain, A.K. and Vailaya, A., “Image Retrieval using
Color and Shape”, Pattern Recognition, vol 29, pp
1233-1244, August 1996.
[8] Kato, T., “Database architecture for content-based image retrieval”, in Image Storage and Retrieval Systems (Jambardino, A. and Niblack, W., eds), Proc SPIE 2185, pp 112-123, 1992.
[9] Mahmoudi, F., Shanbehzadeh J., Eftekhari A.M.,
and Soltanian-Zadeh H., “Image retrieval based on
shape similarity by edge orientation
autocorrelogram”, Pattern Recognition, vol 36, pp
1725-1736, 2003.
[10] Muller, H., Muller, W., Squire, D. McG., Marchand-Maillet, S., and Pun, T., “Performance evaluation in content-based image retrieval: overview and proposals”, Pattern Recognition Letters, vol 22, pp 593-601, 2001.
[11] Persoon, E. and Fu, K. S., “Shape discrimination
using Fourier descriptors”, IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol 8,
pp 388-397, 1986.
[12] Zhang, D. and Lu, G., “Review of shape representation and description techniques”, Pattern Recognition, vol 37, pp 1-19, 2004.