Article

An effective solution for trademark image retrieval by combining shape description and feature matching


Abstract

Trademark image retrieval (TIR), a branch of content-based image retrieval (CBIR), is playing an important role in multimedia information retrieval. This paper proposes an effective solution for TIR by combining shape description and feature matching. We first present an effective shape description method which includes two shape descriptors. Second, we propose an effective feature matching strategy to compute the dissimilarity value between the feature vectors extracted from images. Finally, we combine the shape description method and the feature matching strategy to realize our solution. We conduct a large number of experiments on a standard image set to evaluate our solution and the existing solutions. By comparison of their experimental results, we can see that the proposed solution outperforms existing solutions for the widely used performance metrics.
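
As a rough illustration of the pipeline the abstract outlines (describe each shape as a feature vector, then rank database images by a dissimilarity value), the sketch below uses a histogram of centroid distances, a contour descriptor that citing works on this page associate with this paper. The bin count, the normalization, the L1 dissimilarity and all function names are assumptions made for illustration; this is not the authors' actual descriptor or matching strategy.

```python
import math


def centroid_distance_histogram(points, bins=8):
    """Histogram of distances from boundary points to their centroid.

    Distances are divided by the largest one (for scale invariance) and the
    histogram is normalized to sum to 1. `bins` is an illustrative choice.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    d_max = max(dists)
    hist = [0.0] * bins
    for d in dists:
        # Map d / d_max in [0, 1] to a bin index; the value 1.0 goes to the last bin.
        idx = min(int(d / d_max * bins), bins - 1) if d_max > 0 else 0
        hist[idx] += 1.0 / n
    return hist


def l1_dissimilarity(h1, h2):
    """Sum of absolute bin differences (an assumed, simple dissimilarity)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rank_by_dissimilarity(query_hist, database):
    """Return database keys ordered from most to least similar to the query."""
    return sorted(database, key=lambda k: l1_dissimilarity(query_hist, database[k]))


# Toy contours: a square, the same square scaled and shifted, and a thin rectangle.
square = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 0), (2, 1), (1, 2), (0, 1)]
scaled_square = [(2 * x + 10, 2 * y + 10) for x, y in square]
rectangle = [(0, 0), (8, 0), (8, 2), (0, 2), (4, 0), (8, 1), (4, 2), (0, 1)]

database = {
    "rectangle": centroid_distance_histogram(rectangle),
    "scaled_square": centroid_distance_histogram(scaled_square),
}
print(rank_by_dissimilarity(centroid_distance_histogram(square), database))
```

Because the distances are measured from the centroid and divided by the maximum, the toy descriptor is unchanged by translation and uniform scaling, which is why the scaled square ranks ahead of the rectangle.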


... This is because shape retrieval is based on whether shape features are extracted from the contour only or from the whole shape interior. Each type of method is further divided into local and global feature extraction approaches [2,6]. These approaches are based on whether the shape is represented as a whole or by segments. ...
... Moreover, using a threshold value made the system database dependent. An effective solution for those drawbacks was proposed by Heng et al. [6] by combining two shape features for representing and matching. In their work, contour-based descriptor included the histogram of centroid distances and represented the relationship among two adjacent boundary points and the centroid. ...
... According to the type of tangency and the number of touching points on the boundary, a shock point can be of first, second, third or fourth order. The loci of all the shock points in Fig. 2 give the Blum's medial axis and also the idea of the whole shock graph [12,6]. The second and fourth order shocks are the generic cases of shock orders that are involved in the occurrence of instability in shape retrieval. The second order shocks are the sources of flow while the fourth order shocks are termination points of flow, which represent branch and end points, respectively. ...
... Colors may sometimes be ignored as far as the unique identity of a logo (represented as an intrinsic graphic pattern) is concerned [43]. Trademark image retrieval (TIR) is a branch of CBIR [26] and many studies have been carried out in this particular area. Review of techniques related to trademarks is out of the scope of this review paper and is not included here. ...
... In both cases a primary logo detection procedure is necessary to provide the required information for logo recognition. In the literature, there are many research works on the recognition of logos [75] and trademarks [26,30-34]. Since trademark recognition is relatively close to the logo recognition problem, a small number of papers from the trademark recognition literature are also reviewed here. ...
... Local features are summarized according to their extraction for each point of input domain, whereas global features are extracted based on sets of pixels, on a region or even on the whole document. Based on this categorization outline, local features used for logo recognition include: features extracted from local zone [2], differential invariants [8], negative shape features [9], primitives (line segments) [10], curvature and distance from centroid point [26,30], SIFT and SURF descriptors derived from hessian-affine interest points [12,15,29], horizontal gaps per total area, vertical gaps per total area, ratio of hole area to total area [13,17], color [16], Delaunay triangulation of components/local features [16,17], bag-of-words features [17], edge based features extracted using GHT [21], Fourier coefficients of segmented boundary curves [25], rectangle features extracted from integral image [27], etc. ...
Article
Full-text available
With the advance of technology, business offices and organizations together with their clients create a massive amount of administrative documents every day. Administrative documents commonly contain some salient entities such as logos, stamps or seals as the means of their authentication and proprietorship. These salient entities provide quite discriminative information, which can effectively be used for different tasks of document image retrieval, classification and recognition in document-based applications. Thus, proper detection/recognition of these entities in document images increases the performance of such applications in terms of document retrieval, classification, and recognition. To present the state-of-the-art research on the retrieval of administrative document images, this paper deals with a survey of administrative document image retrieval in relation to seals and logos. All the available datasets, feature extraction and classification techniques for logo and seal detection/recognition are discussed systematically. The shortcomings of the present technologies on logo and seal based document processing are also highlighted. Avenues of the future works are further given for the benefit of readers. To the best of authors' knowledge, there is no survey on administrative document image retrieval and hence the authors hope that this work will be helpful to the researchers of the document analysis community.
... However, it can only depict global shape properties. In this case, most trademark retrieval studies use the integration of global and local descriptors to represent shape (Anuar et al. 2013; Qi et al. 2010; Wei et al. 2009). In these studies, the researchers use ZM as a global feature and a contour-based shape descriptor to extract local features. ...
... In previous work, the appropriate value for the database was determined, and so there was good performance from the retrieval result. But if the database is changed, the result might not be ideal, because it is hard to obtain these values empirically (Qi et al. 2010). In order to improve the matching strategy, Anuar et al. (2013) proposed a novel retrieval technique which split the matching process into two stages. ...
... For performance evaluation of an image retrieval system, the database is one of the significant issues. Although MPEG-7 based datasets were popularly used in some trademark retrieval systems (Anuar et al. 2013; Qi et al. 2010), the images are not deliberately designed for performance evaluation of color logo image retrieval systems. Therefore, in this paper, a new color logo image database, which contains over 2300 original images and a number of distorted versions, was created. ...
Article
Full-text available
Due to their uniqueness and high value commercially, logos/trademarks play a key role in e-business based global marketing. However, existing trademark/logo retrieval techniques and content-based image retrieval methods are mostly designed for generic images, which cannot provide effective retrieval of trademarks/logos. Although color and spatial features have been intensively investigated for logo image retrieval, in most cases they were applied separately. When these are combined in a fused manner, a fixed weighting is normally used between them which cannot reflect the significance of these features in the images. When the image quality is degraded by various reasons such as noise, the reliability of color and spatial features may change in different ways, such that the weights between them should be adapted to such changes. In this paper, adaptive fusion of color and spatial descriptors is proposed for colored logo/trademark image retrieval. First, color quantization and k-means are combined for effective dominant color extraction. For each extracted dominant color, a component-based spatial descriptor is derived for local features. By analyzing the image histogram, an adaptive fusion of these two features is achieved for more effective logo abstraction and more accurate image retrieval. The proposed approach has been tested on a database containing over 2300 logo/trademark images. Experimental results have shown that the proposed methodology yields improved retrieval precision and outperforms three state-of-the-art techniques even with added Gaussian, salt and pepper, and speckle noise.
... Most traditional methods have addressed TIR by extracting a series of handcrafted features and using them to feed a k-Nearest Neighbor (kNN) [8] in order to obtain a ranking of the most similar logos. Some of the features used for this comparison include methods based on color histograms [9], shape [10], local descriptors such as SIFT [11], or a combination of them [12,13]. In most cases, the dimensionality of these features is reduced by using Bag of Words, as occurs in the work by [14]. ...
... 9 Textiles, clothing, sewing accessories, headwear, footwear. 10 Tobacco, smokers' requisites, matches, travel goods, fans, toilet articles. 11 Household utensils. ...
Preprint
Full-text available
Logo classification is a particular case of image classification, since these may contain only text, images, or a combination of both. In this work, we propose a system for the multi-label classification and similarity search of logo images. The method allows obtaining the most similar logos on the basis of their shape, color, business sector, semantics, general characteristics, or a combination of such features established by the user. This is done by employing a set of multi-label networks specialized in certain characteristics of logos. The features extracted from these networks are combined to perform the similarity search according to the search criteria established. Since the text of logos is sometimes irrelevant for the classification, a preprocessing stage is carried out to remove it, thus improving the overall performance. The proposed approach is evaluated using the European Union Trademark (EUTM) dataset, structured with the hierarchical Vienna classification system, which includes a series of metadata with which to index trademarks. We also make a comparison between well known logo topologies and Vienna in order to help designers understand their correspondences. The experimentation carried out attained reliable performance results, both quantitatively and qualitatively, which outperformed the state-of-the-art results. In addition, since the semantics and classification of brands can often be subjective, we also surveyed graphic design students and professionals in order to assess the reliability of the proposed method.
... Traditional methods addressed TIR by extracting hand-crafted features and then matching them with the prototypes of a dataset by using kNN to obtain a ranking of the most similar ones. Features used to represent logos include color histograms [3], texture descriptors, shape [10], a combination of them [6,4], or local descriptors such as SIFT [1]. In most cases, the feature dimensionality is reduced with a clustering method such as Bag of Words [5]. ...
... In the scope of this work, we refer to the codes between 1 and 24 as figurative designs, as they are related to the particular objects that can be found in the logo image. The figurative subcategories were not used since they are too specific (for example, 10 [14]. In the scope of this work, figurative elements are those with codes from 1 to 24. ...
Chapter
Full-text available
The classification of logos is a particular case within computer vision since they have their own characteristics. Logos can contain only text, iconic images or a combination of both, and they usually include figurative symbols designed by experts that vary substantially besides they may share the same semantics. This work presents a method for multi-label classification and retrieval of logo images. For this, Convolutional Neural Networks (CNN) are trained to classify logos from the European Union TradeMark (EUTM) dataset according to their colors, shapes, sectors and figurative designs. An auto-encoder is also trained to learn representations of the input images. Once trained, the neural codes from the last convolutional layers in the CNN and the central layer of the auto-encoder can be used to perform similarity search through kNN, allowing us to obtain the most similar logos based on their color, shape, sector, figurative elements, overall features, or a weighted combination of them provided by the user. To the best of our knowledge, this is the first multi-label classification method for logos, and the only one that allows retrieving a ranking of images with these criteria provided by the user.
... All these artificially-produced images are designed to have a visual impact and consist of multiple elements, which may be closed regions, lines, or areas of texture. Existing TIR systems, however, typically treat trademark images as indivisible structures by computing descriptors integrating global and local image features [1]- [6] or by partitioning the image [7]- [9] without considering the distribution of their component shapes. Such a practice has been successful in retrieving near-duplicated images but may fail in detecting similar instances that preserve the topology of their components without conserving the relative location of their elements. ...
... The later includes Fourier descriptor of the shape, moment invariants, and grey [5] also use ZM, but they employ the edge-gradient cooccurrence matrix derived from the contour information as the local descriptor. Qi et al. [6] use the histogram of centroid distances and a region descriptor based on improved feature points matching and the spatial distribution of feature points to avoid the calculation of ZM. It is important to emphasize that, in contrast to those approaches, the final descriptor produced by HoVW does not encode local and global information separately. ...
Preprint
Full-text available
In this paper, we present the Hierarchy-of-Visual-Words (HoVW), a novel trademark image retrieval (TIR) method that decomposes images into simpler geometric shapes and defines a descriptor for binary trademark image representation by encoding the hierarchical arrangement of component shapes. The proposed hierarchical organization of visual data stores each component shape as a visual word. It is capable of representing the geometry of individual elements and the topology of the trademark image, making the descriptor robust against linear as well as to some level of nonlinear transformation. Experiments show that HoVW outperforms previous TIR methods on the MPEG-7 CE-1 and MPEG-7 CE-2 image databases.
... Although an effective approach has been employed for feature representation, an inappropriate choice of feature similarity measure leads to poor retrieval results in a TIR system. To achieve both the advantage of the region and geometric-based features, a two-component solution (TCS) has been proposed with region and geometric-based features [15,19,20]. Wei et al. [20] present two-component feature matching, which employs the Euclidean distance with a threshold and penalty value. ...
... There are many techniques that aim to avoid these selection problems such as Machine Learning techniques [19,23,24] and Relevance Feedback techniques [25]. The property of the image retrieval requires that the computer retrieves images that are similar to human perception and does not depend on rigid distance metrics to measure feature similarity. ...
Article
Full-text available
The existing trademark image retrieval (TIR) approaches mostly use complex image features, the integration of multi features, a tree structure, etc. to enable highly accurate retrieval. However, there is the heavy computational burden for complex image features and maximum similarity subtree isomorphism (MSSI) measurement. This paper aims to provide an efficient solution for TIR in real-time applications, especially in measuring the similarity between multi-object trademark images. In particular, we propose a novel algorithm for tree similarity measurement based on the fuzzy inference system (FIS) to improve retrieval efficiency. Furthermore, the integration of global and local geometric descriptors is used to enable accurate retrieval. The global descriptor is computed by employing the Hu moments, while the local descriptors are generated by using a tree structure based on the five geometric features: convexity, eccentricity, compactness, circle variance, and elliptic variance. During the retrieval process, the similarity coefficient between the query and the database image is obtained from the similarity of the global and local descriptors. The proposed technique is evaluated using 1800 trademark images, including 12 different classes and 416 trademark images. Additionally, the three common indices, the precision/recall rate, the Bull's eye score, and the average normalized modified retrieval rank (ANMRR) are used as the performance indices. The experimental results show that the proposed technique is superior to the other two competitive approaches. It shows 19.43% and 26.78% precision/recall improvement, 19.56% and 30.58% improvement in the average Bull's eye score, and 0.167 and 0.236 improvement in the ANMRR score, respectively, for the 416 query images. It can be concluded from the experimental analysis that the proposed technique not only provides reliable retrieval results but also improves the retrieval efficiency by 151 times in the retrieval process.
... Most traditional methods have addressed TIR by extracting a series of handcrafted features and using them to feed a k-nearest neighbour (kNN) (Duda et al., 2001) to obtain a ranking of the most similar logos. Some of the features used for this comparison include methods based on colour histograms (Ghosh & Parekh, 2015), shape (Qi et al., 2010), local descriptors such as SIFT (Chiam, 2015), or a combination of them (Guru & Kumar, 2018; Kumar et al., 2016). In some cases, the dimensionality of these features is reduced with Bags of Words (Iandola et al., 2015). ...
Article
Full-text available
Classifying logo images is a challenging task as they contain elements such as text or shapes that can represent anything from known objects to abstract shapes. While the current state of the art for logo classification addresses the problem as a multi‐class task focusing on a single characteristic, logos can have several simultaneous labels, such as different colours. This work proposes a method that allows visually similar logos to be classified and searched from a set of data according to their shape, colour, commercial sector, semantics, general characteristics, or a combination of features selected by the user. Unlike previous approaches, the proposal employs a series of multi‐label deep neural networks specialized in specific attributes and combines the obtained features to perform the similarity search. To delve into the classification system, different existing logo topologies are compared and some of their problems are analysed, such as the incomplete labelling that trademark registration databases usually contain. The proposal is evaluated considering 76,000 logos (seven times more than previous approaches) from the European Union Trademarks dataset, which is organized hierarchically using the Vienna ontology. Overall, experimentation attains reliable quantitative and qualitative results, reducing the normalized average rank error of the state‐of‐the‐art from 0.040 to 0.018 for the Trademark Image Retrieval task. Finally, given that the semantics of logos can often be subjective, graphic design students and professionals were surveyed. Results show that the proposed methodology provides better labelling than a human expert operator, improving the label ranking average precision from 0.53 to 0.68.
... Otherwise, coarse alignment is required to provide an initial pose for them. In the subsequent work, much attention will be devoted to finding more robust correspondence, such as point-to-point, face-to-face, and feature-to-feature correspondence [39,40]. By utilizing these correspondences, alignment parameters are estimated, thereby improving the accuracy of point cloud alignment and reducing the reliance on the correct initial poses. ...
Article
Full-text available
In the present day, 3D point clouds are considered to be an important form of representing the 3D world. In computer vision, mobile robotics, and computer graphics, point cloud registration is a basic task, and it is widely used in 3D reconstruction, reverse engineering, among other applications. However, the mainstream method of point cloud registration is subject to the problems of a long registration time as well as a poor modeling effect, and these two factors cannot be balanced. To address this issue, we propose an adaptive registration mechanism based on a multi-dimensional analysis of practical application scenarios. Through the use of laser point clouds and RGB images, we are able to obtain geometric and photometric information, thus improving the data dimension. By adding target scene classification information to the RANSAC algorithm, combined with geometric matching and photometric matching, we are able to complete the adaptive estimation of the transformation matrix. We demonstrate via extensive experiments that our method achieves a state-of-the-art performance in terms of point cloud registration accuracy and time compared with other mainstream algorithms, striking a balance between expected performance and time cost.
... However, the Hough transform-based methods require large storage space and high computational complexity, which results in low processing efficiency. Furthermore, template matching [13] and least square-based methods have been proposed [14] for the discriminative-based approach. For the stochastic-based methods, Dehmeshki and Ye [15] proposed a Genetic Algorithm (GA)-based method for shape recognition that effectively detects regular shapes. ...
... Previous works addressing trademark similarity have focused on visual comparison and on developing systems capable of retrieving visually similar trademarks [2,18,27,14,4,3,15,38,42]. These approaches are mainly limited to figurative trademarks, which account for only about 30% of all trademarks [21]. ...
... Wei et al. [13] proposed synthetic features to describe interior structure and global shape for the trademark image retrieval. Qi et al. [14] combined shape description and feature matching to retrieve trademark images. Anuar et al. [15] presented an integration of global-local descriptors to extract the features through Zernike's moments coefficients and edge gradient co-occurrence matrix. ...
Article
Full-text available
Trademark materials such as symbols, text, logos, images, designs or phrases are used to uniquely represent an organization. Retrieving similar trademark images is important for protecting a new trademark image that is to be registered. In this paper, an approach is presented to retrieve the most similar trademark images so that a unique trademark image can be registered. The Zernike moments of the query image and the dataset images are computed, and the most similar images from the dataset are retrieved in the first layer of refinement. In the second layer, texture features of the query image and the refined dataset images are extracted to retrieve the most similar images. Zernike moments are applied to extract global shape features, and the Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) are applied to extract texture features based on a few key-points of the trademark images. A weighted average of both key-point feature vectors is computed for retrieving the rank1, rank5, rank10, rank15 and rank20 most similar images using Euclidean distance. Experiments performed on a proposed dataset show that the proposed work performs better and improves accuracy.
... In traditional trademark retrieval methods, people are more inclined to extract features through the shallow visual features of images. Qi et al. [2] combined shape description and feature matching, and applied it to trademark retrieval. Anuar et al. [3] improved the performance of trademark retrieval by integrating global descriptors and local descriptors. ...
Article
Full-text available
Aiming at the high cost of data labeling and ignoring the internal relevance of features in existing trademark retrieval methods, this paper proposes an unsupervised trademark retrieval method based on attention mechanism. In the proposed method, the instance discrimination framework is adopted and a lightweight attention mechanism is introduced to allocate a more reasonable learning weight to key features. With an unsupervised way, this proposed method can obtain good feature representation of trademarks and improve the performance of trademark retrieval. Extensive comparative experiments on the METU trademark dataset are conducted. The experimental results show that the proposed method is significantly better than traditional trademark retrieval methods and most existing supervised learning methods. The proposed method obtained a smaller value of NAR (Normalized Average Rank) at 0.051, which verifies the effectiveness of the proposed method in trademark retrieval.
... As a result, logos similar to the logo of Starbucks were extracted. In the studies by Wei et al. [14] and Qi et al. [15], elements such as curvature, distance from the center point, etc., were quantified to determine the shapes appearing in the logos in order to determine similarity between various logos. However, the above studies have been carried out to measure the characteristics and shapes of logos and to classify the characteristics based on abstract representation, disregarding the basic information such as style, pattern, and characteristics of the transformations (distortion, repetition, and reflection) used by the shape elements. ...
Article
Full-text available
A logo is an effective way of expressing a brand’s identity and an essential element in conveying the values and image of the company. The development process of a competitive logo should be based on a design that is future-proof in a rapidly changing global market; hence, understanding the design trends for successful logo design is key. In this study, the design shape elements of logo trend models were analyzed and made into a database. Then, a trend analysis system was produced using radial visualization (RadViz) and circular parallel coordinates data visualization techniques. RadViz allows observation of clusters of logos that have similar shape elements, whereas with circular parallel coordinates plots, detailed information of the shape elements of each logo trend can be seen. Using the system, it was confirmed that shape elements—such as transformation to surface, overlapping, artificiality, concept of color and rhythm—play a major role in driving a trend. It was observed that trends change over time as various shape elements are added or removed. In addition, our study is expected to help predict the logo trend models that will come into style in the future. While similar efforts have been made in the past, our proposed system improves upon them by utilizing standard design elements as the categorizing criteria, using a unique combination of RadViz and circular parallel coordinates data visualization techniques. Using our system as a guideline, many users would be able to create logos that reflect what is trending.
... It performs satisfactory and better results. But the system cannot be used for calculating rotation invariant image features accurately [9]. M. -Y. ...
... A method that uses shape features and a deformable template matching process to discard spurious matches was introduced in [10]. The method presented in [11] combines a contour-based shape descriptor and a region-based shape descriptor for TIR. In [12], a comparative study of several shape features and matching techniques for trademark images was presented. ...
... Prominent techniques discussed by them are edge histogram density, autocorrelation function for quantifying texture and RGB colour histogram. Similar work on finding similar images in a large database is described in Qi et al. (2010), which focuses on trademark images. Two descriptors are used to define shape: RAPC-HCD, which is contour based, and SDFP-FPM, which is region based. ...
Article
Full-text available
This paper describes an effective tool to automate the process of trademark similarity checking at the time of registration. A combination of shape and colour feature has been illustrated here so that images taken from different vantage points or levels fall into the same group. This was particularly important for trademark images to curb attempts at trademark infringement through object transformations. SIFT has been used as a powerful technique that is unaffected by various transformations like rotation, scaling, translation, change in view etc. Correlogram on the other hand has been used to describe the colour content of the image which is then combined with SIFT to completely represent a trademark image.
... Academia has put forward a variety of schemes for shape recognition in image retrieval, such as using the histogram of adjacent edges, the circumcircle's radius and the feature point distance (Qi et al., 2010), using wavelet modulus maxima and invariant moments for edge location (Cao et al., 2006), using sub-image shape and spatial structure (Guo et al., 2005), and combining global features and internal structure (Wei et al., 2009). ...
Article
In response to the demands of logistics industry applications, a retrieval algorithm for logistics bills is proposed that combines the local and global features of images. It solves the problem of rotation positioning and has been applied in a practical courier receipt retrieval system. By using the scale invariance of the local features combined with Zernike invariant moments as global features, we can quickly calculate the image rotation angle and make an exact match. Experimental results show that this method not only retains the good precision and recall of SIFT features, but also reduces the number of computations required by fine matching.
... are compared with the results produced in (Qi et al., 2010) using the methods developed by Wei in (Wei et al., 2009). As we can see, the retrieval results of the method proposed by Qi et al., illustrated in Fig. 8(b), provide a 50% precision rate for the top ten retrievals. However, the method of Jain and Vailaya achieves 60% and the proposed approach 70%. ...
Article
In this paper, we propose an approach for two-dimensional shape representation and matching using B-spline modelling and Dynamic Programming (DP), which is robust with respect to affine transformations such as translation, rotation, scale change and some distortions. The boundary shape is first split into distinct parts based on the curvature. Curvature points are critical attributes for shape description, allowing the concave and convex parts of an object to be represented; they are obtained by the polygonal approximation algorithm in our approach. After that, each part is approximated by a normalized B-spline curve using some global features including the arc length, the centroid of the shape and moments. Finally, matching and retrieval of similar shapes are obtained using a similarity measure defined on their normalized curves with Dynamic Programming. Dynamic programming not only recovers the best matching, but also identifies the most similar boundary parts. The experimental results on some benchmark databases validate the proposed approach. Copyright © 2014 SCITEPRESS - Science and Technology Publications. All rights reserved.
... The omparison of experimental trademark retrieval system, sues. Although MPEG-7 is eval research [21,22], the esigned for performance rieval system. Therefore, in lour logo image database, and a number of distorted collected from online and o images, 50 football team rams, where some of them whole database will make ive feedback can also be th the system [23]. ...
Conference Paper
Due to their uniqueness and high commercial value, logos and trademarks play a key role in e-business based global marketing. Detecting misused and faked logos needs dedicated and accurate image processing and retrieval techniques. However, existing colour and shape based retrieval techniques, which are mainly designed for natural images, cannot provide effective retrieval of logo images. In this paper, an effective approach is proposed for content-based image retrieval of coloured logos and trademarks. By extracting the dominant colour from colour quantization and measuring the spatial similarity, fusion of colour and spatial layout features is achieved. The proposed approach has been tested on a database containing over 250 logo images. Experimental results show that the proposed methodology yields more accurate results in retrieving relevant images than conventional approaches, even with added Gaussian and salt-and-pepper noise.
... Wei et al. [38] have used ZMs as global features and centroid distance along with contour curvatures as local features. Qi et al. [39] have given a technique for trademark image retrieval along with effective feature matching strategy. Shu and Wu [40] have introduced contour points distribution histogram with earth mover distance scheme for shape matching. ...
...
• web search [5][6][7]
• mobile application [8,9]
• arts and museums [10,11]
• medical imaging [12,13]
• geoscience [14,15]
• business (trademark) [16][17][18][19]
• intelligent transportation [20]
• criminal prevention [21,22]
With this variety of applications, the core or the key problem of CBIR is the same: in order to find images that are visually similar to a given query, it should have both a proper representation of the images by compacting visual features and a measure that can determine how similar or dissimilar the different images are from the query. Comprehensive reviews could be found in [1,23-27]. ...
Article
This thesis addresses content-based image retrieval by constructing feature vectors directly from the transform domain. In particular, two kinds of transforms are considered: the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT), which are used in the JPEG and JPEG2000 compression standards. Based on the properties of transform coefficients, various feature vectors in the DCT domain and DWT domain are proposed and applied in face recognition and color texture retrieval. The thesis proposes four kinds of feature vectors in the DCT domain: Zigzag-Pattern, Sum-Pattern, Texture-Pattern and Color-Pattern. The first is an improved method based on an existing approach. The last three are based on the capability of DCT coefficients for compacting energy and the fact that some coefficients hold the directional information of images. The histogram of these patterns is chosen as the descriptor of images. While constructing the histogram, with the objective of reducing the dimension of the descriptor, either adjacent patterns are defined and merged or the more frequent patterns are selected. These approaches are evaluated on widely used face databases and texture databases. In the DWT domain, two kinds of approaches for color texture retrieval are proposed. In the first, a color-vector and a multiresolution texture-vector are constructed, which places this approach in the context of extracting color and texture features separately. In contrast, the second approach extracts color and texture features jointly: multiresolution feature vectors are extracted from the luminance and chrominance components of the color texture. A histogram of vectors is again chosen as the descriptor, and the k-means algorithm is used to divide the feature vectors into partitions corresponding to the bins of the histogram. For histogram generation, two methods are used. The first is the classical method, in which the number of vectors that fall into the corresponding partition is counted. The second is a proposed sparse-representation-based histogram in which a bin value represents the total weight of the corresponding basis vector in the sparse representation.
... As a branch of shape-based CBIR, the research of trademark image retrieval (TIR) [6] is of great practical significance. For example, if a company wants to register a new trademark, it must check whether any similar trademark exists in the existing database. ...
Article
A trademark carries the prestige of a particular company, so it is very important to distinguish it from trademarks used in a similar context. In this research paper, we propose an efficient Trademark Image Retrieval (TIR) model, which is a branch of Content Based Image Retrieval (CBIR). In the proposed system we extract the edge points of an image, and the edge points are then evaluated to find the corner pixels among them. For the performance evaluation of the system we use the most commonly used method, namely precision-recall. From the experimental results we conclude that TIR based on shape features performs better and gives satisfactory results.
... An important problem for image analysis is the detection of circular shapes, in particular for industrial applications such as automatic inspection of manufactured products and components, aided vectorization of drawings, target detection, etc. [5,38]. Two sorts of techniques are commonly applied to solve the object location challenge: on the one hand, deterministic techniques including the application of Hough transform based methods [50], geometric hashing and template or model matching techniques [1,32]. On the other hand, stochastic techniques including random sample consensus techniques [12], simulated annealing [8] and Genetic Algorithms (GA) [37] have also been used. ...
Chapter
Optimization approaches, inspired by different metaphors, have recently attracted the interest of the scientific community. On the other hand, circle detection over digital images has received considerable attention from the computer vision community over the last few years, as tremendous efforts have been directed towards seeking an optimal detector. This chapter presents an algorithm for the automatic detection of circular shapes embedded into cluttered and noisy images with no consideration of conventional Hough transform techniques. The approach is based on a physics-inspired technique known as Electromagnetism-like Optimization (EMO). It follows the electromagnetism principle of an attraction-repulsion mechanism which moves particles towards an optimal solution. Each particle represents a solution by holding a charge which is related to the objective function to be optimized. The algorithm uses the encoding of three non-collinear points embedded into the edge map as candidate circles. Guided by the values of the objective function, the set of encoded candidate circles (charged particles) is evolved using the EMO algorithm so that they can fit into actual circular shapes over the edge map. Experimental evidence from several tests on synthetic and natural images of varying complexity validates the efficiency of our approach regarding accuracy, speed and robustness.
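The candidate-circle encoding in the abstract above relies on the fact that three non-collinear points determine exactly one circle. The following is a minimal sketch of that geometric step only, not of the chapter's EMO algorithm; the function name `circle_from_points` and the collinearity tolerance are assumptions made for illustration.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def circle_from_points(p1: Point, p2: Point, p3: Point) -> Optional[Tuple[float, float, float]]:
    """Return (cx, cy, r) of the circle through three points, or None if they are collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # d is proportional to the signed area of the triangle; zero means collinear points.
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:  # hypothetical tolerance, chosen for this sketch
        return None
    # Squared distances of each point from the origin.
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    # Standard circumcenter formula.
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    r = ((x1 - cx) ** 2 + (y1 - cy) ** 2) ** 0.5
    return cx, cy, r
```

In an optimizer of the kind described, each candidate would presumably be scored by how many edge pixels lie on the resulting circle; that objective function is not specified in the abstract and is left out here.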
... However, this global descriptor was not suitable to deal with incomplete information or transformed versions of the original logos, and it was not proper to exactly describe the locality of logo traits. Qi et al. [10] proposed an effective solution for trademark image retrieval by combining shape description and feature matching. This method can be generalized to other applications. ...
Article
Logo recognition is an important issue in document images, advertisement, and intelligent transportation. Although there are many approaches to studying logos in these fields, logo recognition remains an essential subprocess. Among the components of logo recognition methods, the descriptor is vital. Moments are powerful descriptors, but their results have not previously been discussed in terms of logo recognition, so it is unclear which moments are more appropriate for recognizing which kind of logos. In this paper we investigate the relations between moments and logos under different transforms, that is, which moments suit logos under which transforms. Open datasets from the University of Maryland are employed. The comparisons based on moments are carried out for logos with noise, rotation, scaling, and combined rotation and scaling.
Article
Self-organizing neural networks have been widely used to extract the topological structure of image shapes because they feature topological preservation, dynamic adaptation, clustering and dimensionality reduction. However, it is difficult to automatically extract the topological structure with an appropriate number of neurons from complex and diverse data. In this paper, a novel self-organizing neural network called the self-adaptive growing neural network (SAGNN) is proposed, which can generate an appropriate number of neurons autonomously according to the size of the input data without setting the total number of neurons in advance. Firstly, a Similarity Evaluation Index (SEI) is proposed to evaluate the similarity between the output network and the input space. Then, on the basis of the growing neural gas (GNG) network, the SEI is introduced into the SAGNN as a network growth control condition, so that the SAGNN can grow neurons on demand until the expected quantization error no longer improves significantly. Experiments involving both artificial and real data sets show that SAGNN can extract the topological structure from unsupervised data without any prior conditions (including the appropriate number of neurons).
Article
Batik as a traditional art is well regarded due to its high aesthetic quality and cultural heritage values. It is not uncommon to reuse versatile decorative shape patterns across batiks. General-purpose image retrieval methods often fail to pay sufficient attention to such a frequent reuse of shape patterns in the graphical compositions of batiks, leading to suboptimal retrieval results, in particular for identifying batiks that use copyrighted shape patterns without proper authorization for law-enforcement purposes. To address the lack of an optimized image retrieval method suited for batiks, this study proposes a new method for retrieving salient shape patterns in batiks using a rich combination of global and local features. The global features deployed were extracted according to the Zernike moments (ZMs); the local features adopted were extracted through curvelet transformations that characterize shape contours embedded in batiks. The method subsequently incorporated both types of features via matching a weighted bipartite graph to measure the visual similarity between any pair of batik shape patterns through supervised distance metric learning. The derived similarity metric can then be used to detect and retrieve similar shape patterns appearing across batiks, which in turn can be employed as a reliable similarity metric for retrieving batiks. To explore the usefulness of the proposed method, the performance of the new retrieval method is compared against that of three peer methods as well as two variants of the proposed method. The experimental results consistently and convincingly demonstrate that the new method indeed outperforms the state-of-the-art methods in retrieving salient shape patterns in batiks.
Article
The segmentation of a shape into a series of meaningful parts is a fundamental problem in shape analysis and part-based object representation. However, it is difficult to make the result of shape segmentation accord with the expectations of humans performing the same task. There is still a need for an effective way to segment the shape although a variety of methods have been proposed. In this paper, we present a novel shape decomposition algorithm, which is implemented in a coarse-to-fine manner, taking into account the critical points on the silhouette. First, a part-cut hypotheses candidate set is generated and classified into 2 categories. Then, the hypotheses with adjacent endpoints are determined first, and later, the other kinds of hypotheses are finely determined by our presented measures such as chord arc ratio and inner angle. We note that the proposed coarse-to-fine decomposition conforms to the mechanism of human vision. The extensive experimental results on a large set of shapes show that our algorithm can generate shape decomposition results that better accord with human intuition compared to competing algorithms.
Article
In this research, five steps are proposed for building a trademark map: (1) deciding the sample range of trademarks, (2) analyzing the first-time information, (3) analyzing the second-time information, (4) building the trademark map, and (5) analyzing the trademark map. This standard procedure can help enterprises create their trademark maps efficiently. A multi-dimensional scale is used for analyzing and building the trademark map of the one hundred most famous brands, and 86 consumers are asked to take part in a brand identification experiment. The results are as follows. (1) By displaying the distribution of trademark samples clearly in a visualized map, the level of trademark similarity between samples can be understood. (2) Enterprises can then apply the trademark map to judge the identification and feasibility of their trademarks, so that they are capable of avoiding tort and creating their own unique brand image.
Article
Trademark retrieval (TR) has become an important yet challenging problem due to an ever increasing trend in trademark applications and infringement incidents. There have been many promising attempts at the TR problem, which, however, proved impractical since they were evaluated with limited and mostly trivial datasets. In this paper, we provide a large-scale dataset with benchmark queries with which different TR approaches can be evaluated systematically. Moreover, we provide a baseline on this benchmark using the widely-used methods applied to TR in the literature. Furthermore, we identify and correct two important issues in TR approaches that were not addressed before: reversal of contrast and the presence of irrelevant text in trademarks severely affect TR methods. Lastly, we applied deep learning, namely several popular Convolutional Neural Network models, to the TR problem. To the best of the authors' knowledge, this is the first attempt to do so.
Article
The objective of this paper is the identification and analysis of policies for preventing trademark infringement and of the challenges in finding similarity between trademarks. Additionally, this paper proposes an approach to enhance a semantic retrieval system for conceptually similar trademarks using machine learning algorithms such as Naive Bayes (NB), Artificial Neural Network (ANN) and Support Vector Machine (SVM). Similarity of trademarks is calculated using the Tversky index, cosine similarity, the Jaccard coefficient, etc. The performance of the classification algorithms is compared in terms of accuracy on the same set of trademarks representing real trademark infringement cases. The proposed approach is the first step towards automating the process of finding conceptually similar trademarks.
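The three similarity measures named in the abstract above have standard textbook definitions, sketched below for token sets and token lists. How the paper derives its tokens or weights the measures is not stated, so the function names, the default `alpha`/`beta` values and the empty-input conventions are all assumptions of this sketch.

```python
import math
from collections import Counter
from typing import Iterable, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B| (1.0 for two empty sets, by convention here)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def tversky(a: Set[str], b: Set[str], alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index: |A ∩ B| / (|A ∩ B| + alpha*|A - B| + beta*|B - A|)."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 1.0


def cosine(a: Iterable[str], b: Iterable[str]) -> float:
    """Cosine similarity between term-frequency vectors of two token sequences."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0
```

With `alpha = beta = 1` the Tversky index reduces to the Jaccard coefficient, and with `alpha = beta = 0.5` it reduces to the Dice coefficient, which is why it is often treated as the more general measure.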
Article
While providing relevance feedback (RF) by users proves to be an effective method for content-based image retrieval, how to interpret and learn from the user-provided feedback, however, remains an unsolved problem. In this paper, we propose an integrated users-feedback and learning algorithm by screening individual elements of content features and driving a group of swarmed particles inside the feature space to provide a possible solution. In comparison with the existing approaches, the proposed algorithm achieves a number of advantages, which can be highlighted as: (i) interpretation of users’ feedback is independent of both the content features and relevance feedback schemes, and hence the proposed algorithm can be applicable to any content features and relevance feedback methods; (ii) the RF interpretation is followed by a group of swarmed particles, acting as multiple agents rather than a single query image in searching for the desirable images; (iii) the proposed RF interpretation and learning is exploited not only in reweighting the content similarity measurement, but also in regrouping the database images. Extensive experiments support that our proposed algorithm outperforms the existing representative techniques, providing good potential for further research and development for a wide range of content-based image retrieval applications.
Article
In this paper, a logo classification system based on the appearance of logo images is proposed. The proposed classification system makes use of global characteristics of logo images for classification. Color, texture, and shape together describe the global characteristics of logo images. Various combinations of these characteristics are used for classification: a single feature, a fusion of two features, or a fusion of all three features at a time. Further, the system categorizes a logo image as consisting entirely of text, entirely of symbols, or of both symbols and text. The K-Nearest Neighbour (K-NN) classifier is used for classification. Due to the lack of a color logo image dataset in the literature, one consisting of 5044 color logo images was created. Finally, the performance of the classification system is evaluated through accuracy, precision, recall and F-measure computed from the confusion matrix. The experimental results show that the most promising results are obtained for fusion of features.
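The evaluation measures listed in the abstract above can all be read off a confusion matrix. A minimal sketch follows, assuming rows index the true class and columns the predicted class; the function names are hypothetical and no numbers from the paper are reproduced.

```python
from typing import List, Tuple


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


def per_class_metrics(cm: List[List[int]]) -> List[Tuple[float, float, float]]:
    """Return (precision, recall, F-measure) for each class.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[k]) - tp                       # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        results.append((precision, recall, f_measure))
    return results
```

For a three-way split such as text, symbol and mixed logos, the per-class values are typically averaged (macro-averaging) to give a single figure; whether the paper does so is not stated.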
Conference Paper
The performance of different feature extraction and shape description methods in trademark image recognition systems has been studied by several researchers. However, the potential improvement in classification through feature fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of three classifiers, each trained on a different feature set. Three promising shape description techniques, namely Zernike moments, generic Fourier descriptors, and shape signature, are used to extract informative features from logo images, and each set of features is fed into an individual classifier. In order to reduce recognition error, a powerful combination strategy based on the Dempster-Shafer theory is utilized to fuse the three classifiers trained on different sources of information. This combination strategy can effectively make use of the diversity of base learners generated with different sets of features. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers' outputs, showing significant performance improvements for the proposed methodology.
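The fusion step in the abstract above is based on Dempster-Shafer theory. As a rough illustration of the core operation only, the sketch below applies Dempster's rule of combination to two mass functions; how the paper turns classifier outputs into masses is not described, so the representation (focal sets as `frozenset` keys) and the name `combine` are assumptions.

```python
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses of intersecting focal sets, renormalize by 1 - conflict."""
    combined: Mass = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                # Mass assigned to incompatible hypotheses counts as conflict.
                conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}
```

Because the rule is associative, a third classifier can be fused by calling `combine(combine(m1, m2), m3)`.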
Article
Nonnegative matrix factorization based on graph regularization has shown great performance improvement compared with traditional nonnegative matrix factorization methods by utilizing the geometric structure of the data. However, the existing methods based on graph regularization are very sensitive to the choice of graph parameters. In order to solve this problem, we propose a novel matrix decomposition method, called Multiple Graph Regularization Constrained Nonnegative Matrix Factorization (MGNMF), which can simultaneously handle nonnegative matrix factorization with label constraints and multiple graph learning. We derive an efficient multiplicative updating procedure for the proposed model and prove the convergence of the algorithm. Experiments on benchmark face recognition data sets demonstrate the effectiveness of our proposed algorithm in comparison to the state-of-the-art approaches.
Article
Trademarks are visual symbols with high reputational value, which requires protection. This paper proposes an algorithm to retrieve phonetically similar trademarks that can be used as a means for supporting trademark examination during the registration process. The algorithm employs a phonology based string similarity algorithm together with a typography mapping and token rearrangement to compute a phonetic similarity between trademarks. The trademark phonetic similarity score is then computed from the employed phonetic similarity algorithm. The proposed algorithm advances the state-of-the-art in trademark retrieval by providing a mechanism to compare trademarks with special characters or symbols phonetically. The proposed algorithm is tested on 1,400 trademarks obtained from real court cases between 1999 and 2012. The proposed algorithm improves the R-precision score by 14% and 17% compared with two state-of-the-art methods.
Article
Most of the existing shape retrieval methods need a one-to-one shape descriptor matching procedure to achieve a high retrieval rate. However, high performance shape matching methods are usually computationally demanding, which makes them unsuitable for large shape databases. Shapes should be indexed for efficient retrieval. In this paper, we propose a simple but efficient shape descriptor, ROMS, and index shapes via the Bag-of-Words (BoW) framework. ROMS is a multi-scale descriptor defined by the ratio of a triangle middle and side line in each scale. In order to deal with articulation, a part-aware metric is also introduced. These strategies make ROMS invariant to translation, rotation, scale and articulation, while capturing both the local curvature information and the part structure of the shape. Furthermore, we present a symmetry detection method based on ROMS. Owing to the above distinguishing characteristics and advantages, the method can detect both extrinsic and intrinsic symmetries. Extensive experiments have been performed on several public databases including the MPEG7 CE-shape-1, the Kimia database and the ETH-80 database. The experiments show that ROMS achieves better results than the state-of-the-art methods and scales up to large databases via the BoW framework.
Article
New computing technologies, media acquisition/storage devices, and multimedia compression standards have increased the amount of digital data generated and stored by computer users. Nowadays, it is easy to access electronic books, patents and trademarks, which contain a tremendous number of graphics. Hence, it is imperative to develop an effective method for retrieving images by using graphics as query keywords. Although many content-based retrieval methods have been proposed, few are specifically designed for graphics. Moreover, most existing graphics retrieval methods adopt contour-based rather than pixel-based approaches. A contour-based method is concerned with many lines, curves and components that must be correspondingly matched, which provides accurate results but requires intensive computation. Thus, the objective of this study was to develop a simple yet effective pixel-based graphics retrieval method. The proposed method adopts histograms of oriented gradients (HOG) as graphics features. The graphics are first divided into small spatial regions, i.e., blocks, from which HOGs are computed. The HOGs, one for each block, are then concatenated to form the representation for the graphics. Finally, the similarity between the query and database graphics can be computed with the χ² distance based on HOG. The retrieved list includes similar graphics in order of increasing χ² distance. Experimental results using a patent database confirm that the proposed method has higher retrieval accuracy compared to existing pixel-based methods.
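The ranking step described in the abstract above can be sketched with one common form of the χ² histogram distance, χ²(h, g) = ½ Σᵢ (hᵢ − gᵢ)² / (hᵢ + gᵢ). The abstract does not say which variant the authors use, and the HOG extraction itself is omitted; the function names, the `eps` guard and the dictionary-based database are assumptions of this sketch.

```python
from typing import Dict, List, Sequence, Tuple


def chi2_distance(h: Sequence[float], g: Sequence[float], eps: float = 1e-12) -> float:
    """Chi-square distance between two histograms of equal length.

    eps avoids division by zero when both bins are empty (an assumption of this sketch).
    """
    if len(h) != len(g):
        raise ValueError("histograms must have the same number of bins")
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h, g))


def rank_by_chi2(query: Sequence[float],
                 database: Dict[str, Sequence[float]]) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs sorted by increasing distance to the query."""
    scored = [(name, chi2_distance(query, feat)) for name, feat in database.items()]
    return sorted(scored, key=lambda item: item[1])
```

Here each database value stands in for a concatenated block-wise HOG vector; smaller distances mean more similar graphics, so the first entries of the returned list form the retrieval result.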
Article
Event extraction has a broad range of applications in systems biology, ranging from support for the creation and annotation of pathways to automatic population or enrichment of databases. In this task, trigger detection, in which we assign the event type to each token, plays a critical role. However, word sense ambiguity makes trigger detection challenging. In this paper, we explore some new features to solve this problem. Trigger detection is addressed with a multi-class SVM classifier that assigns event classes to individual tokens. Furthermore, we have reviewed the features that have been proposed so far in order to analyze the effect of each feature. Compared with the previous approach, the system achieved an F-score of 66.3% on trigger detection in the BioNLP 2011 shared task corpus.
Article
Motivated by Weber's Law, this paper proposes an efficient and robust shape descriptor based on the perceptual stimulus model, called the Weber's Law Shape Descriptor (WLSD). It is based on the theory that human perception of a pattern depends not only on the change of stimulus intensity, but also on the original stimulus intensity. Invariance to scale and rotation is an intrinsic property of WLSD. As a global shape descriptor, WLSD has far lower computational complexity while being as discriminative as state-of-the-art shape descriptors. Experimental results demonstrate the strong capability of the proposed method in handling shape retrieval.
Article
This paper presents a series of new image descriptors based on statistical thermodynamics and discusses their application in content-based image retrieval and image clustering. The paper puts forward image descriptors which represent macro-visual characteristics such as "image energy," "image pressure," "image mass," and "image temperature" according to the analysis of a localized sub-system within statistical thermodynamic theory. Many mathematical laws can be found by applying statistical thermodynamic theory to digital image processing. The proposed method is fast to compute. Experiments verify the rationality and effectiveness of the proposed method.
Article
Trademarks are signs of high reputational value. Thus, they require protection. This paper studies conceptual similarities between trademarks, which occur when two or more trademarks evoke identical or analogous semantic content. This paper advances the state-of-the-art by proposing a computational approach based on semantics that can be used to compare trademarks for conceptual similarity. A trademark retrieval algorithm is developed that employs natural language processing techniques and an external knowledge source in the form of a lexical ontology. The search and indexing technique developed uses similarity distance, which is derived using Tversky's theory of similarity. The proposed retrieval algorithm is validated using two resources: a trademark database of 1400 disputed cases and a database of 378,943 company names. The accuracy of the algorithm is estimated using measures from two different domains: the R-precision score, which is commonly used in information retrieval, and human judgment/collective human opinion, which is used in human-machine systems.
Article
Image retrieval methods based on the annular histogram of feature points are computationally efficient and invariant to image rotation and translation. However, these methods have two main disadvantages. One is that the annular histogram cannot describe the spatial distribution of feature points accurately, so different images may have similar annular histograms. The other is that methods based on feature points cannot describe the shape information of an object. These disadvantages affect the retrieval accuracy to a certain extent. In this paper, an image retrieval method based on Hu invariant moments and an improved annular histogram is proposed. Firstly, the edges of the image are detected and feature points are calculated based on the edge curvature. Then, image features are described based on both the edges and the points. The annular histogram combined with the standard deviation ellipse method is used to describe the spatial distribution of feature points. The Hu invariant moments of the edges are used to represent the object's shape information. Finally, the similarity is measured based on both the point features and the shape features. Experimental results show that the proposed method can improve the image retrieval precision effectively.
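For orientation, the plain annular histogram that the abstract above takes as its starting point can be sketched as follows: feature points are counted in concentric rings around their centroid. This is only the baseline, not the improved method (no standard deviation ellipse, no Hu moments), and the function name, ring count and normalization by the maximum radius are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def annular_histogram(points: Sequence[Tuple[float, float]], n_rings: int = 4) -> List[float]:
    """Normalized count of points falling in each concentric ring around the centroid."""
    if not points:
        return [0.0] * n_rings
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # Distances to the centroid do not change under rotation or translation.
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(dists)
    hist = [0] * n_rings
    for d in dists:
        # Dividing by r_max also removes the effect of uniform scaling;
        # the farthest point is clamped into the outermost ring.
        idx = min(int(n_rings * d / r_max), n_rings - 1) if r_max > 0 else 0
        hist[idx] += 1
    return [c / len(points) for c in hist]
```

The first weakness named in the abstract is visible directly in the code: only the distance to the centroid is kept, so two point sets with different angular layouts can produce the same histogram.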
Conference Paper
In the medical domain, experts usually look at specific anatomical structures to identify the cause of a pathology, and therefore they can largely benefit from automated tools that retrieve relevant slice(s) from a patient's image volume in diagnosis. Accordingly, this paper introduces a novel search and retrieval work for finding relevant slices in brain MR (magnetic resonance) volumes. As intensity is non-standard in MR we explore performance of two complementary intensity invariant features, local binary patterns and Kanade-Lucas-Tomasi feature points, their extended versions with spatial context, and a simple edge descriptor with spatial context. Experiments on real and simulated data showed that the local binary patterns with spatial context is fast, highly accurate, and robust to geometric deformations and intensity variations.
Conference Paper
In this paper, we outline some of the main challenges facing trademark searchers today, and discuss the extent to which current automated systems are meeting those challenges.
Article
A Modified Direct Method for the computation of the Zernike moments is presented in this paper. The presence of many factorial terms in the direct method for computing the Zernike moments makes their computation a very time-consuming task. Although the computational power of modern computers is increasing impressively, the calculation of the factorial of a big number is still an inaccurate numerical procedure. The main concept of the present paper is that, by using Stirling's Approximation formula for the factorial and by applying some suitable mathematical properties, a novel, factorial-free direct method can be developed. The resulting moments are not equal to those computed by the original direct method, but they are a sufficiently accurate approximation of them. Besides, their variability does not affect their ability to describe uniquely and distinguish the objects they represent. This is verified by pattern recognition simulation examples.
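To make the role of Stirling's approximation concrete, the following sketch compares the basic formula n! ≈ sqrt(2πn)(n/e)^n with the exact factorial. This is only the textbook approximation; the additional mathematical properties the paper applies to obtain its factorial-free method are not described in the abstract and are not reproduced here. The function names are hypothetical.

```python
import math


def stirling_factorial(n):
    """Approximate n! with the basic Stirling formula sqrt(2*pi*n) * (n/e)**n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 1.0  # the formula degenerates at n = 0, so handle it directly
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n


def relative_error(n):
    """Relative error of the approximation against the exact factorial."""
    exact = math.factorial(n)
    return abs(stirling_factorial(n) - exact) / exact
```

The relative error of this basic form shrinks roughly like 1/(12n), which is consistent with the abstract's remark that the resulting moments approximate, rather than equal, those of the original direct method.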
Article
Extending beyond the boundaries of science, art, and culture, content-based multimedia information retrieval provides new paradigms and methods for searching through the myriad variety of media over the world. This survey reviews 100+ recent articles on content-based multimedia information retrieval and discusses their role in current research directions which include browsing and search paradigms, user studies, affective computing, learning, semantic queries, new features and media types, high performance indexing, and evaluation techniques. Based on the current state of the art, we discuss the major challenges for the future.
Article
The paper presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.
Article
Various types of moments have been used to recognize image patterns in a number of applications. A number of moments are evaluated and some fundamental questions are addressed, such as image-representation ability, noise sensitivity, and information redundancy. Moments considered include regular moments, Legendre moments, Zernike moments, pseudo-Zernike moments, rotational moments, and complex moments. Properties of these moments are examined in detail and the interrelationships among them are discussed. Both theoretical and experimental results are presented.
Conference Paper
Various types of moments have been used to recognize image patterns in a number of applications. The authors evaluate a number of moments and address some fundamental questions, such as image representation ability, noise sensitivity, and information redundancy. Moments considered include regular moments, Legendre moments, Zernike moments, pseudo-Zernike moments, rotational moments and complex moments. Properties of these moments are examined in detail, and the interrelationships among them are discussed. Both theoretical and experimental results are presented.
Article
This paper presents a new multiscale curvature-based shape representation technique with application to curve data compression using B-spline wavelets. The evolution of the curve is implemented in the B-spline scale-space, which enjoys a number of advantages over the classical Gaussian scale-space, for instance, the availability of fast algorithms. The B-spline wavelet transforms are used to efficiently estimate the multiscale curvature functions. Based on the curvature scale-space image, we introduce a coarse-to-fine matching algorithm which automatically detects the dominant points and uses them as knots for curve interpolation.
Article
Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap
Article
No feature-based vision system can work unless good features can be identified and tracked from frame to frame. Although tracking itself is by and large a solved problem, selecting features that can be tracked well and correspond to physical points in the world is still hard. We propose a feature selection criterion that is optimal by construction because it is based on how the tracker works, and a feature monitoring method that can detect occlusions, disocclusions, and features that do not correspond to points in the world. These methods are based on a new tracking algorithm that extends previous Newton-Raphson style search methods to work under affine image transformations. We test performance with several simulations and experiments. (IEEE Conference on Computer Vision and Pattern Recognition (CVPR94), Seattle, June 1994.) From the introduction: Is feature tracking a solved problem? The extensive studies of image correlation [4], [3], [15], [18], [7], [17] and sum-of-squared-difference (SSD...
Conference Paper
Toward the development of an object recognition and positioning system, able to deal with arbitrary shaped objects in cluttered environments, methods for matching two arbitrarily shaped regions of different objects are introduced, and how to efficiently compute the coordinate transformation which makes two matching regions coincide is shown. In both cases, matching and positioning, the results are invariant with respect to viewer coordinate system, and these techniques apply to both 2-D and 3-D problems, under either Euclidean or affine coordinate transformations. The 3-D Euclidean case is useful for the recognition and positioning of solid objects from range data, and the 2-D affine case for the recognition and positioning of solid objects from projections, e.g., from curves in a single image, and in motion estimation. The matching of arbitrarily shaped regions is done by computing for each region a vector of centered moments. These vectors are viewpoint-dependent, but the dependence on the viewpoint is algebraic and well known. This paper presents a new family of computationally efficient algorithms based on matrix computations, for the evaluation of both Euclidean and affine algebraic moment invariants of data sets. The use of algebraic moment invariants greatly reduces the computation required for the matching and, hence, initial object recognition. The approach to determining and computing these moment invariants is different than those used by the vision community previously. The method for computing the coordinate transformation which makes the two matching regions coincide provides an estimate of object position. The estimation of the matching transformation is based on the same matrix computation techniques introduced for the computation of invariants. It involves simple manipulations of the moment vectors. It neither requires costly iterative methods, nor going back to the data set. 
These geometric invariant methods appear to be very important for dealing with a large number of different possible objects in the presence of occlusion and clutter, and the approach to computing these moment invariants differs from those previously used by the vision community.
Article
Shape is one of the primary low-level image features in Content Based Image Retrieval (CBIR). Many shape representations and retrieval methods exist. However, most of those methods either do not capture shape features well or are difficult to normalize (making matching difficult). Among them, methods based on Fourier descriptors (FDs) achieve both good representation (perceptually meaningful) and easy normalization. Besides, FDs are easy to derive and compact in terms of representation. The design of FDs focuses on how to derive Fourier invariants from Fourier coefficients and how to obtain Fourier coefficients from shape signatures. Different Fourier invariants and shape signatures have been exploited to derive FDs. In this paper, we study different FDs and build a Java retrieval framework to compare shape retrieval performance using different FDs in terms of computational complexity, robustness, convergence speed and retrieval performance. The retrieval performance of the different FDs is compared using a standard shape database.
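One way to picture how Fourier invariants are derived from a shape signature is sketched below, using the centroid distance as the signature. This is a minimal illustration under stated assumptions, not the framework described in the abstract: the centroid is taken as the mean of the boundary samples, the contour is assumed to be uniformly sampled, and the names `centroid_distance_signature` and `fourier_descriptors` are hypothetical.

```python
import cmath
import math


def centroid_distance_signature(points):
    """Distance of each boundary sample to the mean of all boundary samples."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return [math.hypot(x - cx, y - cy) for x, y in points]


def fourier_descriptors(points, count=8):
    """Return `count` Fourier descriptors of a closed contour.

    The magnitudes of DFT coefficients 1..count are divided by the magnitude
    of coefficient 0. Taking magnitudes discards the phase, which removes the
    dependence on the starting point; dividing by the DC term removes scale.
    """
    signature = centroid_distance_signature(points)
    n = len(signature)
    if count >= n:
        raise ValueError("count must be smaller than the number of points")
    magnitudes = []
    for k in range(count + 1):
        # Plain O(n) sum per coefficient; an FFT would be used for long contours.
        coeff = sum(
            r * cmath.exp(-2j * math.pi * k * t / n)
            for t, r in enumerate(signature)
        )
        magnitudes.append(abs(coeff))
    if magnitudes[0] == 0:
        raise ValueError("degenerate contour: all points coincide")
    return [m / magnitudes[0] for m in magnitudes[1:]]
```

Because the centroid distance is already unaffected by translation and rotation, the resulting vector can be compared between shapes with an ordinary distance such as the Euclidean one.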
Article
Although many systems for optical reading of printed matter have been developed and are now in wide use, comparatively little success has been achieved in the automatic interpretation of optical images of three-dimensional scenes. This paper is addressed to the latter problem and is specifically concerned with automatic recognition of aircraft types from optical images. An experimental system is described in which certain features called moment invariants are extracted from binary television images and are then used for automatic classification. This experimental system has exhibited a significantly lower error rate than human observers in a limited laboratory test involving 132 images of six aircraft types. Preliminary indications are that this performance can be extended to a wider class of objects and that identification can be accomplished in one second or less with a small computer.
Article
Zernike moments (ZMs) have been successfully used in pattern recognition and image analysis due to their good properties of orthogonality and rotation invariance. However, their computation by a direct method is too expensive, which limits the application of ZMs. In this paper, we present a novel algorithm for fast computation of Zernike moments. By using the recursive property of Zernike polynomials, the inter-relationship of the Zernike moments can be established. As a result, the Zernike moment of order n with repetition m, Znm, can be expressed as a combination of Zn−2,m and Zn−4,m. Based on this relationship, the Zernike moment Znm, for n>m, can be deduced from Zmm. To reduce the computational complexity, we adopt an algorithm known as systolic array for computing these latter moments. Using such a strategy, the multiplication number required in the moment calculation of Zmm can be decreased significantly. Comparison with known methods shows that our algorithm is as accurate as the existing methods, but is more efficient.
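For context on why the direct method is expensive, the sketch below evaluates the Zernike radial polynomial R_nm(ρ) straight from its standard factorial definition. It is the baseline the abstract contrasts against, not the recursive or systolic-array algorithm the paper proposes; the function name `radial_poly` is an assumption of this example.

```python
from math import factorial


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho), computed directly from factorials.

    Requires n >= |m| and n - |m| even. Every term needs four factorials,
    which is what makes the direct method costly for high orders.
    """
    m = abs(m)
    if n < 0 or m > n or (n - m) % 2 != 0:
        raise ValueError("need n >= |m| >= 0 with n - |m| even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        numerator = (-1) ** s * factorial(n - s)
        denominator = (
            factorial(s)
            * factorial((n + m) // 2 - s)
            * factorial((n - m) // 2 - s)
        )
        # The coefficient is an integer (a signed multinomial coefficient),
        # so integer division is exact here.
        total += (numerator // denominator) * rho ** (n - 2 * s)
    return total
```

For example, the definition gives R_20(ρ) = 2ρ² − 1 and R_40(ρ) = 6ρ⁴ − 6ρ² + 1, and R_nm(1) = 1 for every valid pair (n, m).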
Article
The set of Zernike moments belongs to the class of continuous orthogonal moments which is defined over a unit disk in a polar coordinate system. The approximation error of Zernike moments limits their application to real discrete-space images. The approximation error of Zernike moments is divided into geometrical and numerical errors. In this paper, the geometrical and numerical errors of Zernike moments are explored and methods are proposed to minimize them. The geometrical error is minimized by mapping all the pixels of the discrete image inside the unit disk. The numerical error is eliminated using the proposed exact Zernike moments, where the Zernike polynomials are integrated mathematically over the corresponding intervals of the image pixels. The proposed methods also overcome the numerical instability problem for high-order Zernike moments. Experimental results prove the superiority and reliability of the proposed methods in providing better image representation and reconstruction capabilities. The proposed methods also preserve the scale and translation invariance properties of Zernike moments.
Article
Since the number of registered trademarks is increasing rapidly, the job of identifying infringement of similar trademarks by human inspection becomes laborious and time-consuming. To deal with the problem, we propose an automatic content-based trademark retrieval method. The proposed method automatically selects appropriate features based on feature selection principles to discriminate trademarks. The database trademarks are softly clustered into classes using a fuzzy approach to increase the retrieval speed. The user can submit a query through trademark examples to get a list of database trademarks ordered by similarity ranks. The query results can be iteratively refined by the feedback presented by the user until the trademarks of interest are retrieved. Experiments are conducted on a trademark database containing 1000 images and the retrieval results are very encouraging.
Article
A trademark image retrieval (TIR) system is proposed in this work to deal with the vast number of trademark images in the trademark registration system. The proposed approach commences with the extraction of edges using the Canny edge detector, performs a shape normalisation procedure, and then extracts the global and local features. The global features capture the gross essence of the shapes while the local features describe the interior details of the trademarks. A two-component feature matching strategy is used to measure the similarity between the query and database images. The performance of the proposed algorithm is compared against four other algorithms.
Article
In this paper, we present a shape retrieval method using triangle-area representation for nonrigid shapes with closed contours. The representation utilizes the areas of the triangles formed by the boundary points to measure the convexity/concavity of each point at different scales (or triangle side lengths). This representation is effective in capturing both local and global characteristics of a shape, invariant to translation, rotation, and scaling, and robust against noise and moderate amounts of occlusion. In the matching stage, a dynamic space warping (DSW) algorithm is employed to search efficiently for the optimal (least cost) correspondence between the points of two shapes. Then, a distance is derived based on the optimal correspondence. The performance of our method is demonstrated using four standard tests on two well-known shape databases. The results show the superiority of our method over other recent methods in the literature.
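The core quantity of the triangle-area representation can be sketched as the signed area of the triangle formed by a boundary point and its two neighbours at a given separation. The code below is a minimal illustration under assumptions: the contour is traversed counter-clockwise, so that positive areas mark convex points and negative areas mark concave ones, and no normalization or multi-scale aggregation is performed. The name `triangle_area_signature` and the parameter `ts` (the separation, standing in for the triangle side length) are hypothetical.

```python
def triangle_area_signature(points, ts):
    """Signed triangle area at every boundary point for separation `ts`.

    `points` is a closed contour given as (x, y) tuples in counter-clockwise
    order (an assumption of this sketch). For point i the triangle is formed
    by points i - ts, i and i + ts, with indices wrapping around.
    """
    n = len(points)
    if not 1 <= ts <= (n - 1) // 2:
        raise ValueError("ts must be between 1 and (n - 1) // 2")
    areas = []
    for i in range(n):
        x1, y1 = points[(i - ts) % n]
        x2, y2 = points[i]
        x3, y3 = points[(i + ts) % n]
        # Half the 2-D cross product of the two edge vectors from point 1.
        areas.append(0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)))
    return areas
```

Repeating the computation for several values of `ts` gives the multi-scale view the abstract refers to: small separations respond to local detail, large ones to the global outline.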
Article
More and more images have been generated in digital form around the world. There is a growing interest in finding images in large collections or from remote databases. In order to find an image, the image has to be described or represented by certain features. Shape is an important visual feature of an image. Searching for images using shape features has attracted much attention. There are many shape representation and description techniques in the literature. In this paper, we classify and review these important techniques. We examine implementation procedures for each technique and discuss its advantages and disadvantages. Some recent research results are also included and discussed in this paper. Finally, we identify some promising techniques for image retrieval according to standard principles.
Article
This paper deals with efficient retrieval of images from large databases based on the color and shape content in images. With the increasing popularity of the use of large-volume image databases in various applications, it becomes imperative to build an automatic and efficient retrieval system to browse through the entire database. Techniques using textual attributes for annotations are limited in their applications. Our approach relies on image features that exploit visual cues such as color and shape. Unlike previous approaches which concentrate on extracting a single concise feature, our technique combines features that represent both the color and shape in images. Experimental results on a database of 400 trademark images show that an integrated color- and shape-based feature representation results in 99% of the images being retrieved within the top two positions. Additional results demonstrate that a combination of clustering and a branch and bound-based matching scheme aids in improving the speed of the retrievals.
Article
This paper details a comparative analysis of the time taken by the existing and proposed methods to compute the Zernike moments, Zpq. The existing methods comprise the Direct, Belkasim's, Prata's, Kintner's and Coefficient methods. We propose a new technique, denoted as the q-recursive method, specifically for fast computation of Zernike moments. It uses radial polynomials of fixed order p with a varying index q to compute Zernike moments. Fast computation is achieved because it uses polynomials of higher index q to derive the polynomials of lower index q and it does not use any factorial terms. Individual orders of moments can be calculated independently without employing lower- or higher-order moments. This is especially useful in cases where only selected orders of Zernike moments are needed as pattern features. The performance of the existing and proposed methods is experimentally analyzed by calculating Zernike moments of orders 0 to p and of a specific order p using binary and grayscale images. In both cases, the q-recursive method takes the shortest time to compute Zernike moments.
Article
The purpose of content based image retrieval (CBIR) systems is to allow users to retrieve pictures from large image repositories. In a CBIR system, an image is usually represented as a set of low level descriptors from which a series of underlying similarity or distance functions are used to conveniently drive the different types of queries. Recent work deals with the combination of distances or scores from different and usually independent representations in an attempt to induce high level semantics from the low level descriptors of the images. Choosing the best method to combine these results requires a careful analysis and, in most cases, the use of ad-hoc strategies. Combinations based on or derived from product and sum rules are common approaches. In this paper we propose a method to combine a given set of dissimilarity functions. For each similarity function, a probability distribution is built. Assuming statistical independence, these are used to design a new similarity measure which combines the results obtained with each independent function.
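The sum and product rules that the abstract names as common approaches can be sketched as follows. This illustrates those baseline rules only, not the probabilistic combination the paper proposes. The min-max normalization, the optional weights, and the way the product rule is applied to similarities (one minus the normalized distance) are all assumptions of this example, as are the function names.

```python
def min_max_normalize(values):
    """Rescale a list of distances to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sum_rule(distance_lists, weights=None):
    """Weighted average of normalized distances, one list per descriptor."""
    if weights is None:
        weights = [1.0] * len(distance_lists)
    if len(weights) != len(distance_lists):
        raise ValueError("one weight per descriptor is required")
    total_weight = sum(weights)
    normalized = [min_max_normalize(d) for d in distance_lists]
    return [
        sum(w * column[i] for w, column in zip(weights, normalized)) / total_weight
        for i in range(len(normalized[0]))
    ]


def product_rule(distance_lists):
    """Multiply per-descriptor similarities and convert back to a dissimilarity."""
    normalized = [min_max_normalize(d) for d in distance_lists]
    combined = []
    for i in range(len(normalized[0])):
        similarity = 1.0
        for column in normalized:
            similarity *= 1.0 - column[i]
        combined.append(1.0 - similarity)
    return combined
```

Each inner list holds the distances from one query to every database image under a single descriptor, so all lists must have the same length and ordering. Normalizing first keeps a descriptor with a large numeric range from dominating the combined score.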
Conference Paper
This paper presents a novel approach using combined features to retrieve images containing specific objects, scenes or buildings. The content of an image is characterized by two kinds of features: Harris-Laplace interest points described by the SIFT descriptor and edges described by the edge color histogram. Edges and corners contain the maximal amount of information necessary for image retrieval. The feature detection in this work is an integrated process: edges are detected directly based on the Harris function; Harris interest points are detected at several scales and Harris-Laplace interest points are found using the Laplace function. The combination of edges and interest points brings efficient feature detection and high recognition ratio to the image retrieval system. Experimental results show this system has good performance.
Article
Available in film copy from University Microfilms International. Computer-produced copy. Thesis (Ph. D.)--Brown University, 1991. Vita. Includes bibliographical references (leaves 217-234).
Article
The Artisan system retrieves abstract trademark images by shape similarity. It analyzes each image to characterize key shape components, grouping image regions into families that potentially mirror human image perception, and then derives characteristic indexing features from these families and from the image as a whole. We have evaluated the retrieval effectiveness of our prototype system on more than 10,000 images from the UK Trade Marks Registry.
Article
Retrieval efficiency and accuracy are two important issues in designing a content-based database retrieval system. We propose a method for trademark image database retrieval based on object shape information that would supplement traditional text-based retrieval systems. This system achieves both the desired efficiency and accuracy using a two-stage hierarchy: in the first stage, simple and easily computable shape features are used to quickly browse through the database to generate a moderate number of plausible retrievals when a query is presented; in the second stage, the candidates from the first stage are screened using a deformable template matching process to discard spurious matches. We have tested the algorithm using hand-drawn queries on a trademark database containing 1,100 images. Each retrieval takes a reasonable amount of computation time (approximately 4-5 seconds on a Sun Sparc 20 workstation). The topmost image retrieved by the system agrees with that obtained by human subjects, ...
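The two-stage hierarchy described in this abstract follows a general filter-then-rerank pattern, which can be sketched as below. The distance functions are placeholders supplied by the caller: the paper's actual first-stage shape features and its deformable template matching are not reproduced, and the function and parameter names are hypothetical.

```python
def two_stage_retrieve(query, database, coarse_distance, fine_distance,
                       shortlist_size, top_k):
    """Filter with a cheap distance, then rerank the shortlist with a costly one.

    `coarse_distance` and `fine_distance` are caller-supplied functions of
    (query, item) returning a number; smaller means more similar.
    """
    # Stage 1: the cheap distance is evaluated on every database item.
    shortlist = sorted(
        database, key=lambda item: coarse_distance(query, item)
    )[:shortlist_size]
    # Stage 2: the expensive distance is evaluated only on the shortlist.
    reranked = sorted(shortlist, key=lambda item: fine_distance(query, item))
    return reranked[:top_k]
```

The saving comes from the expensive comparison running on `shortlist_size` candidates rather than on the whole database. The trade-off is that a relevant image rejected in the first stage cannot be recovered in the second.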
In WBS, we use DD-MSDCD and ZM to represent the contour-based shape feature and the region-based shape feature, respectively, and the weight-based feature matching strategy is used to calculate the dissimilarity value between any two images.