A Colour Object Search Algorithm
Paul A. Walcott and Tim J. Ellis
Centre for Information Engineering
Department of Electrical, Electronic and Information Engineering,
City University, London EC1V 0HB, UK
[p.a.walcott|t.j.ellis]@city.ac.uk
Abstract
In this paper a colour object search algorithm is presented. Given an image,
areas of interest are generated (for each database model) by isolating regions
whose colours are similar to model colours. Since the model may exist at one
or more of these region locations, each is examined individually. At each
region location the object size is estimated and a growing process initiated to
include all pixels with model colours. Growing is terminated when a match
measure (based on object size and the number of pixels with each model
colour) is maximised; if it exceeds a predefined threshold then the object is
assumed to be present. Several experiments are presented which demonstrate
the algorithm’s robustness to scale, affine object distortion, varying
illumination, image clutter and occlusion.
1 Introduction
Object search requires two capabilities: the ability to recognise an object when it comes
into view and a mechanism that brings the object into view. The first of these problems
(object recognition) has received a great deal of attention, but geometric solutions, e.g.
the interpretation tree [1] and geometric invariance [7] have dominated. More recently,
colour-based recognition algorithms e.g. colour histogram intersection [10], the colour
region adjacency graph [6] and methods which use the statistics of colour space
components [9] have become more popular. The second problem (bringing the object
into view) has received less attention. The most straightforward solution is a linear search which examines the entire search space at a high spatial resolution; however, this process is time-consuming. An alternative is the use of a visual cue such as colour, which is: (a) used by the human visual system; (b) salient [3]; and (c) resilient to changes in spatial resolution [2], whereas object geometry is less reliable at low spatial resolutions.
There are several methods of performing object search using colour. Wixson and
Ballard [15] used a camera mounted on a robot arm to examine the walls of a room (in
real time); and for each camera gaze a confidence, based on the ratio of colour
populations, was calculated to determine object presence. Swain and Ballard [10]
proposed a method called histogram backprojection which determined a confidence
value for each image pixel; peaks in the confidence space after smoothing
corresponded to object hypotheses. Vinod et al. [11] used histogram backprojection to
identify object hypotheses and colour histogram intersection [10] to verify object
presence. Schettini [8] first performed a search for an object with a similar shape then
used colour histogram intersection for object colour match verification. Finally, Matas
et al. [6] used a colour adjacency graph (whose nodes represent model colours and
British Machine Vision Conference
297
edges encode information about the adjacency of colours and their reflectance ratios) to
identify object hypotheses and the colour region adjacency graph for object match
verification.
All of these techniques suffer from limitations. Wixson and Ballard’s colour
proportion ratios are not robust to object occlusion. Histogram backprojection does not
rank object hypotheses and the object size must be known prior to search. Since
illumination changes shift values in the colour histogram bins, histogram intersection
is unreliable; therefore Vinod’s method is unstable. Schettini’s shape-based method
deteriorates at low spatial resolution because shape is unreliable. Matas et al.'s graph search process is computationally expensive, but it can represent 3-dimensional deformable objects with perspective distortions; all of the other algorithms can only represent 2-dimensional objects under affine distortion.
The proposed algorithm identifies image regions with colours that are similar to the model's and uses these as cues. The object size, estimated from the image data, is used to calculate a match measure which ranks the cues. This method can represent both 2- and 3-dimensional planar objects in cluttered scenes and is invariant to moderate occlusion.
The remaining Sections of this paper are organised as follows: Section 2 describes
the object search methodology adopted; Section 3 details the object search algorithm;
Section 4 presents the experimental results and Section 5 the conclusions.
2 Colour Object Search
In earlier work [12], object cues were generated by locating spatially-close regions with
colours that were similar to salient model colours. A match measure was then used to
determine object presence. Since no region growing was incorporated, only part of the object was found; also, the match measure used was not robust to occlusion. To improve this method a new algorithm is proposed which simplifies cue generation, incorporates region growing and provides hypotheses ranking with a robust match measure.
In the new algorithm, object search is completed in three stages: cue generation,
region growing and hypotheses ranking. In the cue generation phase, image regions
with colours that are similar to the colours of the given model are identified; these are
cue locations because if the object is present it should be at one of these locations.
The region growing phase is necessary to identify object pixels. Growing begins at
the cue region and includes neighbouring pixels with colours which are similar to
model colours. To prevent the premature halting of the growing process due to gaps
between regions, the image is divided into windows. Growing starts at the window
containing the centroid of the cue region and includes windows in the 8-neighbourhood
whose pixels increase the object match measure. Growing is terminated when: no more windows contain pixels with colours that are similar to model colours; the match measure has been maximised; or the object size has been reached. Since the object size is not known a priori, object size bounds are calculated from the cue region's size. The growing process is repeated for different object sizes between these bounds.
Finally, for a given cue a match measure is calculated for each object size. The cue's rank is the maximum match measure, and if it exceeds a predefined threshold the object is assumed to be present. The match measure used is colour histogram intersection, where each bin colour is defined by the model regions' chromaticity co-ordinates.
Before object search a model database must be generated. For each model colour five parameters are stored: the mean region chromaticity (r, g), the minimum and maximum region area percentages for the given colour, and the sum of the region area percentages for the given colour. This process is detailed in Algorithm 1.
Algorithm 1: Model Parameter Generation
1. Segment the model image and ignore regions that are smaller than the predefined area threshold.
2. Determine the regions found in Step 1 with similar colours and calculate the total area of these regions.
3. For each model colour store in the model database: the chromaticity co-ordinates (r, g) of the model colour, the percentage of the total model area (percentage coverage) with the given model colour, and the percentage coverage of the smallest and largest image regions with the given model colour.
4. Repeat Steps 1-3 for each database model.
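As a concrete sketch of Algorithm 1, the following assumes a segmentation step has already produced, for each model region, its mean chromaticity (r, g) and pixel area; the function name and threshold defaults are illustrative, not from the paper:

```python
# Sketch of Algorithm 1 (model parameter generation). Assumes a prior
# segmentation yielding ((r, g), area) pairs per region; names and
# thresholds are illustrative.

def build_model_entry(regions, area_threshold=50, colour_threshold=0.05):
    """Return one row per model colour:
    (r, g, total % coverage, % of smallest region, % of largest region)."""
    # Step 1: discard regions below the area threshold.
    regions = [reg for reg in regions if reg[1] >= area_threshold]
    total_area = sum(area for _, area in regions)

    # Step 2: group regions whose mean colours are similar
    # (Euclidean distance in chromaticity space).
    groups = []  # each entry: [(r, g), [areas of regions with that colour]]
    for (r, g), area in regions:
        for group in groups:
            gr, gg = group[0]
            if ((r - gr) ** 2 + (g - gg) ** 2) ** 0.5 < colour_threshold:
                group[1].append(area)
                break
        else:
            groups.append([(r, g), [area]])

    # Step 3: store the five parameters per model colour.
    entries = []
    for (r, g), areas in groups:
        pct = lambda a: 100.0 * a / total_area
        entries.append((r, g, pct(sum(areas)), pct(min(areas)), pct(max(areas))))
    return entries
```

Feeding in the four regions of model7 (yellow areas of 72 and 16, blue areas of 10 and 2, out of 100 pixels of model area) reproduces the rows of Table 1.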
Database model7 (cf. Figure 1) is represented by four regions: two yellow and two blue. The largest yellow region is on the right and occupies 72% of the model area, while the smallest yellow region (on the far left) occupies 16%. Similarly, the largest and smallest blue regions occupy 10% and 2% of the total model area, respectively. The total percentage coverage for yellow is 88% and for blue 12% (cf. Table 1).
Figure 1: A database model (model7).
The pre-processing requirements for both model and test images are colour constancy
and colour image segmentation. Colour constancy is required because the model and
test images were not captured under the same illuminant. The colour constancy algorithm of Hung and Ellis [4] was used, as it was developed in-house. Image segmentation was achieved using the software colour filter described in Walcott [13].
In this algorithm Khotanzad et al.’s [5] hill climbing clustering algorithm is used to
cluster the colour histogram of the image. The pixels belonging to each identified
cluster are backprojected into the image and the resulting connected components are
treated as regions of constant reflectance.
r     g     Percentage   Percentage coverage   Percentage coverage
            coverage     of smallest region    of largest region
0.48  0.46  88.0         16.0                  72.0
0.14  0.24  12.0         2.0                   10.0
Table 1: The parameters for the model in Figure 1.
3 The Object Search Algorithm
The object search algorithm is detailed in Algorithm 2.
Algorithm 2: Colour Object Search
1. Segment the image and remove regions with areas that are less than the minimum area threshold or greater than 50% of the image area. Divide the image into N windows, each of size n × m.
2. Repeat Steps 3 - 7 for each database model:
3. Cue generation: Cue locations are image regions with colours that are similar to model colours. If (μ_r, μ_g) is a model colour and (μ′_r, μ′_g) an image region colour, then a colour match is recorded if
C_c = √((μ_r − μ′_r)² + (μ_g − μ′_g)²) < c_threshold. (1)
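The colour match test of Step 3 can be sketched in a few lines. The chromaticity conversion r = R/(R+G+B), g = G/(R+G+B) is the standard normalised-colour definition; the threshold default here is illustrative rather than taken from the paper:

```python
# Sketch of the Step 3 colour match: Euclidean distance between model
# and region chromaticities. Threshold value is illustrative.

def chromaticity(R, G, B):
    """Convert an RGB triple to normalised (r, g) chromaticity."""
    s = float(R + G + B)
    return (R / s, G / s)

def colour_match(model_rg, region_rg, c_threshold=0.05):
    """True if the region colour lies within c_threshold of the model colour."""
    dr = model_rg[0] - region_rg[0]
    dg = model_rg[1] - region_rg[1]
    return (dr * dr + dg * dg) ** 0.5 < c_threshold
```

A region whose chromaticity is close to a stored model colour (e.g. the yellow of Table 1) would pass this test and become a cue location.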
4. Repeat Steps 5 - 7 for each cue:
5. Object size determination: By assuming that the cue region is part of the model and that not more than half of it is occluded, a minimum (min_size) and maximum (max_size) object size bound can be calculated:
min_size = (number_of_pixels_in_cue_region / PCLMR) × 100 (2)
max_size = 2 × (number_of_pixels_in_cue_region / PCSMR) × 100 (3)
where PCSMR and PCLMR are the percentage coverage of the smallest and largest model regions with a similar colour to the cue region, respectively.
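Equations (2) and (3) amount to the following (the function name is illustrative):

```python
# Object size bounds from Step 5, equations (2) and (3). Assumes the cue
# region belongs to the model and is at most half occluded.

def size_bounds(cue_pixels, pclmr, pcsmr):
    """cue_pixels: pixel count of the cue region.
    pclmr / pcsmr: percentage coverage of the largest / smallest model
    region with a colour similar to the cue."""
    min_size = cue_pixels / pclmr * 100.0        # equation (2)
    max_size = 2.0 * cue_pixels / pcsmr * 100.0  # equation (3)
    return min_size, max_size
```

The worked example later in this section (a 20-pixel red cue, with matching model regions covering 50% and 25% of the object) gives bounds of 40 and 160 pixels.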
6. Region Growing: If s object sizes are used between (and including) min_size and max_size, then the object size increment is k = (max_size − min_size) / (s − 1).
for (object_size = min_size; object_size <= max_size; object_size = object_size + k)
(a) no_object_pixels = 0;
(b) Determine the window corresponding to the centroid of the cue region.
(c) Grow into a neighbouring window (8-neighbourhood) if the colours in that window increase HIM(I, M), equation (4).
(d) If HIM(I, M) increases then no_object_pixels = no_object_pixels + the number of pixels in the window with model colours.
(e) Repeat Steps (c) and (d) until: no_object_pixels >= object_size; there are no more neighbouring windows containing model colours; or HIM(I, M) is maximised.
7. Match measure: The maximum HIM(I, M) over all object sizes is the cue rank. If this value exceeds match_threshold then the model is assumed to exist at this cue.
HIM(I, M) = Σ_{j=1}^{n} min(I_j, M_j) (4)
where M_j = (number of model pixels with colour j) / model_size and I_j = (number of object pixels with colour j) / object_size.
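Equation (4) is a histogram intersection over the model colours. A minimal sketch, assuming per-colour pixel counts are available as dictionaries (all names illustrative):

```python
# Sketch of the histogram intersection match of equation (4). M_j and I_j
# are the per-colour pixel fractions of the model and the grown region.

def him(model_counts, object_counts, model_size, object_size):
    """model_counts/object_counts: dicts mapping colour index j to pixel
    counts; model_size/object_size: total pixel counts used to normalise."""
    total = 0.0
    for j in model_counts:
        m_j = model_counts[j] / float(model_size)
        i_j = object_counts.get(j, 0) / float(object_size)
        total += min(m_j, i_j)  # intersection of the two histogram bins
    return total
```

A grown region containing exactly half of each model colour would score 0.5, and a perfect match scores 1.0.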
In Step 1, any region larger than 50% of the image area is considered a background region; even if this assumption is incorrect, the object will still be identified through its other regions. The object size calculation in Step 5 assumes that the cue region is part of the object and that not more than half of it is occluded. For example, consider a model with three regions: two red, occupying 25% and 50% of the total object area, and one green (occupying the remaining 25%). If a red region containing 20 pixels is found in the image then min_size = (20/50) × 100 = 40 and max_size = 2 × (20/25) × 100 = 160 are the minimum and maximum object sizes in pixels, respectively. There are four important parameters in this algorithm: window size, colour threshold, match percentage threshold, and minimum region area. The window size must be large enough to span
gaps between regions due to unclassified pixels; if it is too large, the search speeds up but the accuracy of the object location is lost. The colour threshold selected depends on the colour constancy algorithm used: the better the algorithm, the smaller the threshold. The match percentage threshold is based on the allowed amount of object occlusion, and the minimum region area is selected arbitrarily; however, it must be small enough to include the smallest object region that needs to be recognised.
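The growing loop of Step 6 can be sketched for a single object size. This simplification assumes the image has been pre-binned into a `grid` mapping window coordinates to per-colour counts of pixels with model colours (windows with no such pixels are absent), and that `model_fracs` holds the M_j fractions of equation (4); all names are illustrative, and the real algorithm additionally iterates over the s object sizes:

```python
# Simplified sketch of the Step 6 growing loop for one object size:
# greedily absorb 8-neighbouring windows while they raise the histogram
# intersection match. Names and data layout are illustrative.

def grow(grid, seed, model_fracs, object_size):
    """Return (best HIM score, set of windows included in the object)."""
    counts = {}        # accumulated per-colour pixel counts
    grown = set()
    frontier = [seed]  # windows still to be considered
    best = 0.0
    while frontier:
        w = frontier.pop()
        if w in grown or w not in grid:
            continue
        # Tentative score with this window's pixels included.
        trial = dict(counts)
        for j, n in grid[w].items():
            trial[j] = trial.get(j, 0) + n
        score = sum(min(model_fracs[j], trial.get(j, 0) / float(object_size))
                    for j in model_fracs)
        if score > best or w == seed:   # keep the seed window regardless
            counts = trial
            best = max(best, score)
            grown = grown | {w}
            x, y = w                     # enqueue the 8-neighbourhood
            frontier += [(x + dx, y + dy) for dx in (-1, 0, 1)
                         for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
        if sum(counts.values()) >= object_size:
            break                        # object size reached
    return best, grown
```

Starting from a seed window and one adjacent window that together supply all the expected model-colour pixels, the sketch grows to cover both and reaches a perfect match.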
4 Results
The 25-model database used in these experiments is illustrated in Figure 2. The database contains books, cereal boxes, playing cards and a Christmas card box. Many of these models have similar geometry, e.g. models 1, 2, 4, 5, 6, 7 and 15; models 12,
13 and 14; models 20, 21, 22, 23, 24 and 25; and models 9, 10 and 11. The playing card models (9, 10 and 11) have only two colours, white and red, and models 10 and 11 have similar colour proportions. Models 13 and 14 have practically the same representative colour regions, other than the text 'Debugger' and 'Assembler'.
The test images used in these experiments are illustrated in Figure 3 (available in
colour on the proceedings CD-ROM). Figure 3(a) contains occluded models 6 and 14; Figure 3(b) contains model 14. Figures 3(c), (d), (e) and (f) contain models 1 and 2; models 9 and 10 (both occluded); model 2; and models 5, 7 and 12, respectively. These
images highlight the performance of the algorithm in conditions such as: cluttered
scenes, affine object distortion and object occlusion. Because the model and test images
were captured under different illuminants, a colour constancy algorithm [4] was
applied to all images before processing.
Figure 2: The model database.
Image  Models     Correct match placement      False      Percentage reduction
       in image   1st   2nd   3rd   >3rd       positives  in search space
(a)    2          -     2     -     -          9          80.7
(b)    1          1     -     -     -          9          78.7
(c)    2          1     1     -     -          11         58.3
(d)    2          2     -     -     -          5          0.0
(e)    1          -     1     -     -          9          60.1
(f)    3          1     1     1     -          8          41.6
Table 2: A summary of the results of applying Algorithm 2 to the images of Figure 3.
Table 2 presents a summary of the results of applying Algorithm 2 to the six images of Figure 3. The first column of this table contains the image identifier (Figure 3(a)-(f)); the second column contains the number of database models that are present in the given image. The third column contains the placement of each match, that is, whether the cue with the best match value (1st) represents the object, or the second best (2nd), the third best (3rd), or worse (>3rd). The fourth column contains the total number of false positives that occurred, and finally column five gives the percentage reduction in the search space (which is defined as the number of image windows containing the localised model (and the false positives) over the total number of image windows).
(a) (b)
(c) (d)
(e) (f)
Figure 3: The images used in the object search experiments.
In Figure 3(a) both models 6 and 14 had the second-best rank at their correct image locations; however, there were 9 false positives. The percentage reduction in the search space was 80.7% and the percentage reduction of the model database for this image was 44%. The remainder of Table 2 is interpreted in the same way. It is important to note, however, that there was no appreciable reduction in the search space for Figure 3(d).
[Figure 4 comprises six plots, (a)-(f), of match percentage against object size increment (0 to 10) for the cue locations (506,412), (301,156), (193,184), (555,168), (248,451) and (291,442), respectively.]
Figure 4: The match percentages at increasing object size increments (at the cue locations indicated) when searching for model6 in the image of Figure 3(a). In this example the location which yields a match percentage of 93%, (a), contains model6.
To illustrate the search process, consider the search for model6 in Figure 3(a). The parameters used in all the experiments were: a window size of 10 × 10 pixels, a minimum object match percentage of 89.5%, and 11 object sizes selected between min_size and max_size (all equally spaced) for each region cue found. In Step 3 of Algorithm 2, six cue regions were identified with centroids at (x, y) co-ordinates: (506,412), (301,156), (193,184), (555,168), (248,451), and (291,442). At each cue, the growing process was initiated for each of the 11 object sizes (cf. Figure 4) and the match measure calculated. The only cue which generated a match percentage greater than 89.5% is at co-ordinates (506,412) (cf. Figures 4(a) and 5(a)), where a match percentage of 93% was calculated for object size increment 0 (i.e. min_size); therefore this is the solution.
(a) (b)
(c) (d)
(e) (f)
Figure 5: The six cue regions.
5 Conclusions
In this paper a colour object search algorithm capable of locating 2- or 3-dimensional planar objects was presented. Successful searches were performed in complex, cluttered scenes, at high and low spatial resolutions of the object, under affine object distortion, and with up to 50% occlusion of the object area. The recognition rate for these experiments was 45% at both rank 1 and rank 2, and 10% at rank 3. There were no false negatives, but there were 51 false positives. The average reductions in the model database and the object search space were 68% and 53%, respectively.
This algorithm outperforms all of the colour object localisation/search algorithms discussed in this paper [8] [9] [10] [11] [12] [15] except Matas et al.'s colour adjacency graph [6]; these methods are 2-dimensional planar (except Matas et al.'s, which is 2- or 3-dimensional) while the adopted method is 2- or 3-dimensional planar. The colour histogram intersection metric used by this algorithm is more robust to occlusion than Wixson and Ballard's [15] and our [12] colour ratio measures, and more stable under changes in illumination and the position of the light source than Vinod's [11] (since shifts in histogram bin values disrupt that metric), because a given model colour falls into only one bin; also, the normalised colour space is independent of image geometry. In histogram backprojection, determining peaks in the confidence space of complex scenes with several false positives is non-trivial. In addition, in the adopted algorithm the object size is calculated from image data and fewer floating point numbers are required to represent the model (5 numbers per model colour).
The algorithm is incapable of representing perspectively distorted 3-dimensional objects or objects with similar colour proportions but different topologies (whereas Matas et al.'s can). Possible improvements include the calculation of a more accurate object size (for example, based on the area of a non-occluded region) and the improvement of the cue generation process by using, for example, pairs of related (e.g. adjacent) regions.
References
[1] Grimson, W.E.L., "Object Recognition by Computer: The Role of Geometric Constraints", MIT Press, Cambridge, Massachusetts, 1990.
[2] Healey, G.E., Binford, T.O., "The Role and Use of Colour in a General Vision System", Proc. DARPA IV Workshop, USC, CA, USA, 1987, pp 599-613.
[3] Hilbert, D.R., "Color and Color Perception: A Study in Anthropocentric Realism", Center for the Study of Language and Information (CSLI), 1986.
[4] Hung, T.W.R., Ellis, T., "Spectral Adaptation with Uncertainty using Matching", IEE Proc. of the 5th Int. Conf. on Image Processing and its Applications, Scotland, 1995, pp 786-790.
[5] Khotanzad, A., Bouarfa, A., "Image Segmentation by a Parallel, Non-Parametric Histogram Based Clustering Algorithm", Pattern Recognition, 23(9), 1990, pp 961-973.
[6] Matas, J., Marik, R., Kittler, J., "On Representation and Matching of Multi-coloured Objects", Proc. of ICCV, Boston, 1995, pp 726-732.
[7] Mundy, J.L., Zisserman, A., "Geometric Invariance in Computer Vision", MIT Press, Cambridge, Massachusetts, 1992.
[8] Schettini, R., "Multicolored Object Recognition and Location", Pattern Recognition Letters, 15, 1994, pp 1089-1097.
[9] Stricker, M., Orengo, M., "Similarity of Color Images", in Storage and Retrieval for Image and Video Databases III, SPIE Proc. Series, 2402, Feb. 1995, pp 381-392.
[10] Swain, M.J., Ballard, D.H., "Indexing via Color Histograms", Proc. ICCV, 1990, pp 390-393.
[11] Vinod, V.V., Murase, H., "Object Location Using Complementary Color Features: Histogram and DCT", Proc. of ICPR '96, 1996, pp 554-559.
[12] Walcott, P., Ellis, T., "The Localisation of Objects in Real World Scenes Using Colour", Proc. of 2nd ACCV, Singapore, December 1995, pp 243-247.
[13] Walcott, P., "Object Recognition Using Colour, Shape and Affine Invariant Ratios", BMVC96, Edinburgh, Scotland, September 1996, pp 273-282.
[14] Walcott, P., "Colour Object Search", PhD thesis, City University, July 1998.
[15] Wixson, L., Ballard, D., "Real-time Detection of Multi-coloured Objects", SPIE Sensor Fusion II: Human and Machine Strategies, 1198, Nov. 1989, pp 435-446.