Are Two Pictures Better Than One?

S.M.M. Tahaghoghi    James A. Thom    Hugh E. Williams
Department of Computer Science, RMIT University,
GPO Box 2476V, Melbourne 3001, Australia
E-mail: {stahagho, jat, hugh}@cs.rmit.edu.au
Abstract

A major hurdle in practical Content-Based Image Retrieval (CBIR) is conveying the user's information need to the system. One common method of query specification is to express the query using one or more example images. In this work, we consider whether using more examples improves the effectiveness of CBIR in meeting a user's information need. We show that using multiple examples improves retrieval effectiveness by around 9%-20% over single-example queries, but that further improvements in using more than two examples may not justify the added processing required.

Keywords: Image retrieval, multimedia databases, multiple queries, recall and precision.
1. Introduction

Without pictorial material, less value is placed on published material, be it a magazine article, reference software, or an instruction manual. Unfortunately, although we have more digitised images than ever, finding an image that meets an information need is becoming more difficult; searching through large numbers of images is not a trivial task.
There are two well-known methods of finding an image in an image database. One is to manually associate a textual annotation with each image before the image is added to the database: these captions are indexed, and at query time, they are searched. This method suffers from a lack of scalability, and the caption reflects the annotator's interpretation of the image, not the image content itself. The second method is to use machine vision techniques to automatically recognise pre-defined objects. This method falters when applied to unconstrained domains, where the images are not limited to a fixed number of known categories.
A practical alternative to these traditional approaches is Content-Based Image Retrieval (CBIR). In CBIR, the system produces and stores a summary of each image in the collection, usually as it is added to the database, by extracting feature data from it. Similar data extracted from the user's query is compared with this stored data, and a list of images is presented to the user, sorted by increasing statistical difference to the query. The most common features used in CBIR are those based on colour, texture, and shape.
It seems intuitive that if we are presenting a sample image as the query, using more example images should lead to better retrieval results. For example, if we present the system with an image of a red rose, it is possible that a large number of the top-ranking results will be red objects that are not flowers. However, if we present the system with three images of red roses, we might speculate that the system may extract more features of red roses and more effectively present red roses as high-ranking answers.
We have experimented with multiple-image querying in CBIR on a medium-size image collection using different features, methods of combining results, and techniques for calculating the distance between images. We have found that multiple-image querying with two examples improves retrieval effectiveness by around 9%-20% over single-image querying for closely-matching answers. Importantly, adding more than two images to the query produces only modest further improvement.
2. Background

Content-Based Image Retrieval (CBIR) allows users to pose image queries to a database of images, with the goal of retrieving relevant images that satisfy the user's information need. In a similar way to Information Retrieval (IR) practice, likely relevance is approximated by statistical similarity, where images returned have the highest estimated statistical likelihood of being perceived as relevant to the query [15, 20]. An answer to an image query is usually an ordered list of images.
A CBIR system stores summaries of each image in the collection, as well as the images themselves. The summaries are usually a representation of one or more features extracted from the images stored in the database. When searching for matching images, the same features are extracted from queries. The query features are then statistically compared with the stored feature data, and a list of images is presented to the user, sorted by similarity. The most common features used in CBIR are those based on colour, texture, and shape.
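To make this pipeline concrete, the following Python sketch (an illustration only, not the system described in this paper; the bin count and the image-loading helper are assumptions) summarises an image as a coarse, normalised colour histogram of the kind a CBIR system might store:

    import numpy as np
    from PIL import Image

    def colour_histogram(path, bins=4):
        # Summarise an image as a normalised colour histogram: each RGB
        # channel is quantised into `bins` levels, giving a bins**3-element
        # feature vector that is independent of image size.
        pixels = np.asarray(Image.open(path).convert("RGB")).reshape(-1, 3)
        q = pixels.astype(int) * bins // 256                 # 0 .. bins-1 per channel
        codes = (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]  # one bin code per pixel
        hist = np.bincount(codes, minlength=bins ** 3).astype(float)
        return hist / hist.sum()                             # normalise by pixel count

At query time, the same function would be applied to the example image, and the stored histograms ranked by their distance to the query histogram.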
Several existing CBIR systems allow the user to present a single example image as a query. Among these systems are QBIC [4, 5], Virage [6], VisualSEEk [19], and NETRA [7]. Pisaro [18] supports a multiple-example query paradigm, where a user can select small tiles of colour and texture to build up an example mosaic for the query.

The CHITRA system supports multiple-example queries [10, 9], where the user can select any number of example images to pose as a query. (The CHITRA system can be accessed on the world-wide web at http://kroid.mds.rmit.edu.au/~stahagho/cbir/.) Analysis of the effectiveness of multiple-example querying, as embodied in the CHITRA system, is the subject of this paper.
2.1. Feature Spaces

Careful choice of the features used in a CBIR system is crucial to system effectiveness. Indeed, the combination of several features to represent an image is also important. The possible features that can be used fall into three primary categories: colour, texture, and shape.

Colour features used to abstract or summarise images include the RGB, Munsell, HSV, L*a*b* (LAB), and L*u*v* (LUV) spaces [12]. Among colour spaces, the simplest and best-known is Red, Green, Blue (RGB); this is largely due to its direct relation to the method of image representation in computer monitors. However, while it is conceptually simple, there is no simple mapping between RGB and human perception of colour.
Human perception of colour is more complex than simple RGB, since we deal with around a dozen colours, both when observing sights and discussing them [1, 3]. This observation permits classification of colours into a small number of perceptually significant colours.
An ideal colour space feature would accurately map to human perception and allow us to accurately estimate how different two images are in terms of colour. Such a colour space has several characteristics. First, it will be linear in terms of human perception; a unit change in the value of one of the colour space components will be equally perceptible across the range of values of this component. Second, it will separate the brightness component from chromaticity to avoid the effects of varying lighting conditions on perceived colour. Last, a good colour space avoids combinations of opposing colours; some colours (white and black, red and green, and blue and yellow) are diametrically opposed in terms of human perception; we never talk of a "reddish-green" or a "bluish-yellow" [17].
The Munsell colour space is widely acknowledged to be the closest to human perception, but has the disadvantages of being non-linear and difficult to transform from other colour spaces. An approximate, fuzzy version of the Munsell colour space has been developed for application to CBIR [17].
The International Commission on Illumination (CIE, from the French Commission Internationale de l'Éclairage) has progressively developed several colour spaces that are practical for CBIR. In particular, the L*a*b* (LAB) and L*u*v* (LUV) colour spaces have been designed to better match the characteristics of human perception, while remaining mathematically tractable [11]. They have similar characteristics, and largely coexist through lack of agreement between their developers.
The YUV colour space separates the luminance (Y) from chrominance. The U and V components are often subsampled at half the rate of that for luminance, since the human eye is not as sensitive to colour variations as it is to variations in luminance [13].
A second feature class used in CBIR is texture. While it is not as effective as colour in most cases, it does provide added discrimination where colour alone does not suffice [14]. We do not discuss texture features in detail here.
The third feature class used in CBIR is shape. Images may be partitioned into regions, and the shape, colour, and texture of these regions can be used for retrieval. Shape features are mostly derived from the moments of the image regions [2]. As with texture, we do not discuss shape-based CBIR in detail here.
We have compared nine feature spaces using single-image queries with the techniques described later in Section 4. In the interest of brevity, results for six of the feature spaces are not reported in this paper. The three best-performing spaces (LAB, LUV, and YUV) were retained for continued investigation. Results with these schemes using single and multiple-example querying are reported in Section 4.
2.2. Distance measures

The calculation of statistical similarity between a query feature and the features of each database image requires a distance or similarity measure. Two common methods for calculating the distance between two images are the Manhattan (sum of absolute distances) and the Euclidean (sum of squares of distances) distance measures. Other distance measures have been developed, each with its own advantages [16].
In addition to comparing the YUV, LUV, and LAB feature spaces, we compare the Manhattan and Euclidean distance measures in this paper.
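The two measures differ only in how per-component differences are aggregated; a minimal, illustrative sketch over feature vectors:

    import numpy as np

    def manhattan(x, y):
        # Sum of absolute component differences (L1 norm).
        return np.abs(x - y).sum()

    def euclidean(x, y):
        # Square root of the sum of squared component differences (L2 norm).
        return np.sqrt(((x - y) ** 2).sum())

Note that for ranking purposes the square root in the Euclidean measure is optional, since it is a monotonic transformation of the summed squared differences.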
3. Multiple Example Queries

Almost all CBIR systems only allow single-example queries; that is, database images are retrieved after comparison to a single query image. The results are returned in ranked order of increasing distance from the example image in one or more feature spaces.

In this paper, we investigate whether providing more example images as a query can improve the effectiveness of CBIR. Multiple-example querying provides users with an additional, alternative querying mode permitting expression of different features of an information need. For example, a user who wishes to find images of red roses may not be concerned whether the images returned show a bunch of roses, a flowerbed of roses, or a single rose. Accordingly, the user may present three images as a single query that illustrates these different groupings of red roses and conveys a broad information need.
When a user cannot find a single image that expresses an information need, multiple-image querying provides a powerful alternative. Consider a case where the user wishes to retrieve images of a red rose but does not have a representative query image. In this case, the user may select two images that together convey the concept: a white rose and a red carnation. While the results are unlikely to solely consist of red roses (we would also expect to see white carnations, white roses, and red carnations), in this case multiple-example querying allows querying to proceed.
In our comparison of possible multiple-example querying schemes, we have restricted ourselves to multiple-example, single-feature queries; that is, distance measurement is based on a single feature. As discussed previously, we additionally restrict the comparison to three colour features and two distance measures.
There are two possible approaches to multiple-example querying. First, when multiple query images are presented, we can combine the image features to form a composite feature, and execute a single query. Second, when multiple query images are presented, we can execute multiple queries (one database query per query image) and combine the ranked answer lists. Composites of these approaches are also possible. We investigate the latter approach of combining ranked result sets in this paper.
To illustrate multiple-example querying, consider Figure 1, which shows two example images as points in a two-dimensional query space, along with three candidate database images. Which of the three candidate images (A, B, and C) best matches a query represented by Examples 1 and 2? Three simple approaches would be to select:

[Figure 1. Image points in a two-dimensional query space]

- Image A, since it is close to one of the examples, Example 1
- Image B, since it is not too distant from either example
- Image C, since it has the smallest total distance from the examples
Finding effective combining functions for multiple-image querying is a difficult problem. In this paper, we consider three approaches: the Sum, Minimum, and Maximum functions. These determine the distance of a particular collection image from the specified multiple example images to be the sum, the minimum, and the maximum of the individual distances respectively.
To process a two-example query, the distance of the candidate image to each example is calculated. Then, the combining function is applied to reduce the multiple distances to a single aggregate value. When this has been performed for all images in the collection, the user is presented with a list of the images, arranged in order of increasing aggregate value.
For the example in Figure 1, these functions would return the best matches as shown below:

    Rank   Minimum   Maximum   Sum
    1      A         B         C
    2      C         C         B
    3      B         A         A
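These orderings can be reproduced in a few lines of Python. The coordinates below are hypothetical (Figure 1 gives no numeric positions); they are chosen only so that A is close to Example 1, B is moderately distant from both examples, and C has the smallest total distance:

    import numpy as np

    examples = [np.array([0.0, 0.0]),          # Example 1
                np.array([8.0, 0.0])]          # Example 2
    candidates = {"A": np.array([-1.0, 0.0]),  # close to Example 1
                  "B": np.array([4.0, 2.3]),   # not too distant from either
                  "C": np.array([3.0, 0.0])}   # smallest total distance

    for name, combine in [("Minimum", min), ("Maximum", max), ("Sum", sum)]:
        # Aggregate the per-example Euclidean distances for each candidate,
        # then rank candidates by increasing aggregate value.
        ranking = sorted(candidates, key=lambda c: combine(
            np.linalg.norm(candidates[c] - e) for e in examples))
        print(name, ranking)

Run as written, this prints the same three orderings as in the table above.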
4. Do More Examples Help?

In our experiments, we used a collection of one thousand assorted images. These images are categorised into one of ten concepts: Buildings, Fish, Flowerbeds, Flowers, Greenbeds, Mountains, People, Plants, Sea, and Sunsets. The number of images for each concept varied from 35 (for Mountains) to 112 (for Sunsets). We selected and removed 21 images at random from each concept to form a query set of 210 images, leaving 790 images in the database as a test set.
To experiment with multiple-example querying, we partitioned our 210-image query set. First, we began by selecting one image from each of the 10 concept sets of 21 images each. This is a single-example query, and the retrieval performance with this query is recorded; we discuss performance measurement below. Second, another example was extracted from each concept and paired with each of our single examples, forming a two-example query. Last, another example can be extracted from each concept and triples made from each pair. For each query concept set of 21 images, we can produce either 10 independent two-example queries or 7 independent three-example queries. This process of extracting images and producing independent sets can be generalised for four-example and larger queries, although the number of independent queries is dramatically reduced.
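A minimal sketch of this partitioning, assuming each concept's 21 query images arrive as a list (a hypothetical helper, not the authors' code):

    def independent_queries(concept_images, k):
        # Split a concept's query images into disjoint k-example queries;
        # with 21 images this yields 10 two-example or 7 three-example
        # independent queries, as described above.
        n = len(concept_images) // k
        return [concept_images[i * k:(i + 1) * k] for i in range(n)]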
To measure retrieval effectiveness, we use recall-precision as often used in information retrieval [20]. Recall-precision measurement requires that each image in the database can be classified as either "relevant" or "not relevant" to each query that is posed. In most practical applications, this assessment is impractical, since it is not feasible to assess each image in the database for relevance to each query. However, by making a simplistic assumption that only the images that are members of a concept are relevant answers to a query extracted from that concept, we can approximate recall-precision measurement. For example, for a fish query, only fish concept images are deemed relevant, and all images from other concepts are judged as irrelevant answers. This somewhat restrictive assumption allows the practical calculation of effectiveness performance values.
Precision is a measure of the fraction of relevant answers retrieved at a particular point, that is,

    P = (Concept images retrieved) / (Total images retrieved)

Recall, in contrast, measures the fraction of the relevant answers that have been retrieved at a particular point, or

    R = (Concept images retrieved) / (Total concept images)
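Under the concept-based relevance assumption above, both measures are straightforward to compute at any cutoff in the ranked answer list; a small sketch (the names are illustrative):

    def precision_recall(ranked, relevant, cutoff):
        # `ranked` is the answer list, `relevant` the set of images in the
        # query's concept; both measures are evaluated after `cutoff` answers.
        retrieved = ranked[:cutoff]
        hits = sum(1 for image in retrieved if image in relevant)
        return hits / len(retrieved), hits / len(relevant)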
Conventionally, precision is reported for each of eleven recall values, at 10% intervals from 0% to 100%. Interpolated precision values are often used when discussing average effectiveness. The interpolated precision at a given recall value R is the maximum actual precision that appears at any recall value equal to or greater than R. Interpolated values are also commonly calculated at 10% increments of recall. Thus, the interpolated precision at 0% recall is the highest precision obtained at any recall value.
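The interpolation rule translates directly into code; a sketch under the same assumptions as the previous block:

    def interpolated_precision(ranked, relevant, levels=11):
        # Record (recall, precision) at each relevant answer in the ranking,
        # then take, for each standard recall level, the maximum precision
        # observed at any recall value equal to or greater than that level.
        points, hits = [], 0
        for rank, image in enumerate(ranked, start=1):
            if image in relevant:
                hits += 1
                points.append((hits / len(relevant), hits / rank))
        return [max((p for r, p in points if r >= level / (levels - 1)), default=0.0)
                for level in range(levels)]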
Table 1 shows the precision at four recall levels (0%, 10%, 20%, and 30%) using single-example queries and the LAB, LUV, and YUV features. We also show the results of two-example queries with the three combining functions (minimum, maximum, and sum), with precision values shown relative to the one-example figure.

With two examples and the sum combining function, there are significant improvements in retrieval performance. Using the YUV colour space, with the Euclidean distance and the sum combining function, adding a second example image to a query improves precision by between 8.7% and 20.3% at low recall levels. We have calculated confidence levels of above 99%. At higher recall levels, gains in precision are more modest; however, it is frequently argued that users are more concerned with high precision at lower recall levels [20].
The improvement in effectiveness with the LAB and LUV features is less than that with YUV. While YUV is slightly less effective as a single-example scheme, two examples with YUV are more effective than any other single or multiple-example scheme. Interestingly, the sum combining function is the only consistently effective scheme, suggesting that images that are close to both examples in the query are more likely to be relevant than images that are close to only one of the examples.
Figure 2 shows the effect of adding a third example to each of our 70 independent two-example queries. The improvement shown is much less striking than that of two-example queries over one-example queries. The same trend continues as more examples are added to each independent query, as shown in Figure 3. This figure shows the results of a much smaller experiment, with different distance functions and combining measures; however, we have observed the same trend in increasing the number of examples with all such variations in query parameters. Because of the smaller query set, the confidence interval for these results is much lower, and they are indicative only of the performance trend. We conclude that adding a second example can significantly improve retrieval effectiveness, but that adding more examples offers little additional improvement.

[Figure 2. The performance of three examples, using the sum combining function, the YUV colour space, and Euclidean distance, is not markedly better than that of two-example querying. Curves: Three Examples (Sum), Two Examples (Sum), One Example; precision against recall (%).]

[Figure 3. Effect of increasing the number of query examples using the LAB feature space, Manhattan distance, and the sum combining function. Curves: Ten examples, Five examples, One example; precision against recall (%).]
In the results presented so far, we have compared different feature spaces and combining functions. Figure 4 shows a comparison of the Manhattan and Euclidean distance measurement schemes using the HSV feature space and the sum combining function. Using the Manhattan distance improves effectiveness by around 2%-3% over the Euclidean distance.

[Figure 4. The Manhattan distance measure performs better than the Euclidean distance using the HSV feature space; precision against recall (%).]
We have also experimented with the perceptual colour categories, but obtained poor results that we do not report in detail here. We believe that partitioning colours in this manner is not particularly effective from a CBIR perspective. We also used a 48-dimensional Gabor texture vector [8]. While not performing well individually, we empirically observed that it did prove useful in special cases for differentiating images with no identifying colour. For example, we noted that in our experiments, texture produced effective results for the Buildings concept, where the colour features performed poorly.
Table 1. Average retrieval effectiveness of 70 one-image and two-image queries. Precision is shown for 0%, 10%, 20%, and 30% recall and the LAB, LUV, and YUV features. For two-image queries, results of the minimum, maximum, and sum combining functions are shown. The Euclidean distance function is used for similarity calculation.

    Recall (%)   Feature   One Image       Two Images (% precision change)
                           (Precision %)   Sum      Minimum   Maximum
    0            LAB       51.5            + 5.5    - 5.6     +0.9
                 LUV       51.3            - 0.5    - 0.4     -4.5
                 YUV       50.7            +10.7    + 0.8     +0.2
    10           LAB       32.0            + 1.0    -10.8     -5.1
                 LUV       30.6            + 6.6    - 8.6     +2.5
                 YUV       31.8            +14.2    -10.3     -3.8
    20           LAB       24.5            + 2.8    -12.0     -2.0
                 LUV       23.3            + 9.6    - 4.1     +4.5
                 YUV       24.5            + 8.7    - 5.0     -4.5
    30           LAB       18.9            + 7.5    - 5.7     -5.3
                 LUV       18.0            +20.1    + 0.6     +2.0
                 YUV       17.9            +20.3    + 9.1     +2.1

5. Conclusions
In this paper, we have examined whether presenting multiple image queries to a content-based image retrieval system improves the retrieval effectiveness over single-image querying. We have shown that for selected parameters, using more example images improves retrieval performance. Two-example queries with selected parameters improve retrieval effectiveness by between 9% and 20% over single-example queries. We have also found that using more than three examples in a query is unlikely to improve retrieval significantly.

In future work, we aim to develop heuristic methods to capture the user's requirements. If a system is presented with two different examples (one of a red rose and another of a yellow daffodil), should the system judge that we are interested in only red or yellow objects, red or yellow flowers, or any type of flower? Most humans would not agree on any one solution to this problem, and it is probable that iterative feedback methods must be incorporated into the system to continually re-evaluate the user's opinion of the results.

Content-based image retrieval is becoming more important with the increasing size and prevalence of image repositories. We have shown that multiple-example querying can improve retrieval effectiveness in searching these databases and better meet user information needs.
Acknowledgments

This work was supported by the Australian Research Council and the Multimedia Database Systems group at RMIT University. We thank M.V. (Rania) Ramakrishna and Surya Nepal for their contribution to the CHITRA project. We also express our appreciation to the anonymous referees for their helpful comments.
References

[1] B. Berlin and P. Kay. Basic Color Terms: Their Universality and Evolution. University of California Press, Berkeley, California, USA, 1969.
[2] C. Carson, S. Belongie, H. Greenspan, and J. Malik. Region-based image querying. In Proc. of IEEE Workshop on Content-Based Access of Image and Video Libraries, 1997. In conjunction with IEEE CVPR '97.
[3] C. Carson and V. E. Ogle. Storage and retrieval of feature data for a very large online image collection. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 19(4):19-27, December 1996.
[4] C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, and W. Equitz. Efficient and effective querying by image content. Journal of Intelligent Information Systems, 3(3&4):231-262, July 1994.
[5] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker. Query by image and video content: The QBIC system. Computer, 28(9):23-32, September 1995.
[6] A. Gupta. Visual information retrieval: A Virage perspective. Technical Report Revision 4, Virage Inc., 9605 Scranton Road, Suite 240, San Diego, CA 92121, 1997. URL: http://www.virage.com/wpaper/.
[7] W. Y. Ma and B. S. Manjunath. NETRA: A toolbox for navigating large image databases. In Proc. IEEE International Conference on Image Processing, pages 568-571, Santa Barbara, California, October 1997.
[8] B. S. Manjunath and W. Y. Ma. Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(8):837-842, August 1996.
[9] S. Nepal and M. Ramakrishna. A generalized test bed for image databases. In 10th International Conference of the Information Resources Management Association, pages 926-928, Hershey, Pennsylvania, USA, May 1999.
[10] S. Nepal, M. V. Ramakrishna, and J. A. Thom. Four layer schema for image data modelling. In Australian Computer Science Communications, Vol. 20, No. 2, Proceedings of the 9th Australasian Database Conference, ADC'98, pages 189-200, Perth, Australia, 2-3 February 1998.
[11] International Commission on Illumination. Publication CIE No. 17.4, International Lighting Vocabulary.
[12] C. Poynton. Frequently asked questions about color, 1997. URL: http://home.inforamp.net/~poynton/colorfaq.html.
[13] W. K. Pratt. Digital Image Processing. Wiley, New York, USA, second edition, 1999.
[14] M. Ramakrishna and J. Lin. Gabor histogram feature for content-based image retrieval. In Proc. Fifth International Conference on Digital Image Computing, Perth, Australia, December 1999.
[15] G. Salton. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley, Reading, Massachusetts, 1989.
[16] S. Santini and R. Jain. Similarity matching. In Proceedings of the Second Asian Conference on Computer Vision, ACCV '95, Invited Paper, pages 571-580, Singapore, December 1995.
[17] M. Seaborn, L. Hepplewhite, and J. Stonham. Fuzzy colour category map for content-based image retrieval. Technical Report 701, Department of Electrical and Electronic Engineering, Brunel University, Uxbridge, Middlesex, UB8 3PH, UK, 1999.
[18] M. Seaborn, L. Hepplewhite, and J. Stonham. Pisaro: Perceptual colour and texture queries using stackable mosaics. In Proceedings of the International Conference on Multimedia Computing and Systems, 1999.
[19] J. R. Smith and S.-F. Chang. VisualSEEk: A fully automated content-based image query system. In Proc. ACM Multimedia 96, pages 87-98, Boston, MA, November 1996.
[20] I. Witten, A. Moffat, and T. Bell. Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann Publishers, Los Altos, CA 94022, USA, second edition, 1999.
Article
In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images'' content as the basis of the queries. Examples of the content we use include color, texture, shape, position, and dominant edges of image objects and regions. Potential applications include medical (Give me other images that contain a tumor with a texture like this one), photo-journalism (Give me images that have blue at the top and red at the bottom), and many others in art, fashion, cataloging, retailing, and industry. We describe a set of novel features and similarity measures allowing query by image content, together with the QBIC system we implemented. We demonstrate the effectiveness of our system with normalized precision and recall experiments on test databases containing over 1000 images and 1000 objects populated from commercially available photo clip art images, and of images of airplane silhouettes. We also present new methods for efficient processing of QBIC queries that consist of filtering and indexing steps. We specifically address two problems: (a) non Euclidean distance measures; and (b) the high dimensionality of feature vectors. For the first problem, we introduce a new theorem that makes efficient filtering possible by bounding the non-Euclidean, full cross-term quadratic distance expression with a simple Euclidean distance. For the second, we illustrate how orthogonal transforms, such as Karhunen Loeve, can help reduce the dimensionality of the search space. Our methods are general and allow some false hits but no false dismissals. The resulting QBIC system offers effective retrieval using image content, and for large image databases significant speedup over straightforward indexing alternatives. The system is implemented in X/Motif and C running on an RS/6000.