Object Detection in Color Image
M. Kharinov 1), A. Buslavsky 1)
1) St. Petersburg Institute for Informatics and Automation of RAS,
14_liniya Vasil’evskogo ostrova 39, St. Petersburg, 199178 Russia,
khar@iias.spb.su, www.spiiras.nw.ru
Abstract: This paper addresses the problem of automated
object detection in a color image. A solution based on
classical pixel clustering methods is developed. A
parameter describing the heterogeneity of image areas is
introduced. A method for marking up an image with
automatically produced object names is proposed. The
data structure is described. The computational complexity
of the clustering algorithms is estimated.
Keywords: pixel clustering, standard deviation, quasi-
optimal approximations, hierarchy.
1. INTRODUCTION
A few decades ago, the creation of software and
hardware image processing systems was mainly limited to
the development of the user interface, which occupied
most programmers in each firm. The situation changed
significantly with the advent of the Windows operating
system, when the majority of developers switched to
solving the problems of image processing itself. However,
this has not yet led to radical progress in solving typical
tasks such as recognizing faces, license plates and road
signs, or analyzing remote sensing and medical images.
Each of these "eternal" problems is solved by trial and
error through the efforts of numerous groups of engineers
and scientists. Since modern technical solutions turn out
to be excessively expensive, the task of automating the
creation of software tools for solving intellectual problems
has been formulated and is being intensively pursued
abroad [1]. In the field of image processing, the required
toolkit should support the analysis and recognition of
images of previously unknown content and ensure the
effective development of applications by ordinary
programmers, just as the Windows toolkit supports the
creation of interfaces for solving various applied problems.
2. PROBLEM STATEMENT
The processing of images of arbitrary content is
formulated in the context of detecting the most noticeable
areas in the image. In the original works this image
processing domain is referred to as “salient region
detection” [2,3]. In these works, a so-called Saliency Map
of visible areas is constructed for an image. Then, the
Saliency Map is usually converted to a black-and-white
object-to-background mask by thresholding. In the pixel
clustering model discussed here, the problem statement of
[2,3] is generalized and developed. The problem considered
is that of ordering sets of pixels according to a
heterogeneity (local nonhomogeneity) parameter, which
serves as a quantitative measure. Like the number of
pixels, the heterogeneity decreases, or at least does not
increase, when a set of pixels is split into subsets.
Usually, a heterogeneity-type parameter, which
expresses the contrast or the complexity of an image
region, is determined for a given pixel by evaluating its
characteristics with respect to the remaining pixels of the
image [2,3]. In this paper, to introduce the target
heterogeneity parameter, a special sequence of piecewise
constant image approximations is calculated. It is
assumed that the approximations constitute a binary
hierarchy and have the special properties that are
necessary for the correct interpretation of the trivially
defined heterogeneity parameter. Up to some refinements,
the hierarchical sequence of approximations is produced
by classical cluster analysis methods [4,5].
Figuratively speaking, the task is to model the visual
perception of a newly born living creature, for example,
a man or a fly, that has no experience but is able to see
the scene in ordered colors and to distinguish objects,
which are represented by clusters of pixels and ranked by
size. In this case, computer simulation is not associated
with imitation of biological visual perception, but relies
on optimization of piecewise constant approximations of
the image by the standard deviation $\sigma$ or the
approximation error, i.e. the total squared error
$E = 3N\sigma^2$, where $N$ is the number of pixels in the
image.
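The relation between the total squared error and the standard deviation can be illustrated by a minimal sketch. It assumes that pixels are given as three-component tuples and that a piecewise constant approximation is described by one integer cluster label per pixel; the function name `approximation_error` is hypothetical and not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Color = Tuple[float, float, float]


def approximation_error(pixels: Sequence[Color],
                        labels: Sequence[int]) -> Tuple[float, float]:
    """Return (E, sigma) for the approximation in which every pixel
    is replaced by the mean color of its cluster (assumed layout)."""
    sums: Dict[int, List[float]] = {}   # label -> per-channel sums
    counts: Dict[int, int] = {}         # label -> number of pixels
    for p, k in zip(pixels, labels):
        s = sums.setdefault(k, [0.0, 0.0, 0.0])
        for c in range(3):
            s[c] += p[c]
        counts[k] = counts.get(k, 0) + 1

    # Total squared error over all pixels and all three components.
    E = 0.0
    for p, k in zip(pixels, labels):
        for c in range(3):
            E += (p[c] - sums[k][c] / counts[k]) ** 2

    # Standard deviation defined through E = 3 * N * sigma**2.
    sigma = (E / (3 * len(pixels))) ** 0.5
    return E, sigma
```

For a fixed image, minimizing $E$ and minimizing $\sigma$ are equivalent, since $N$ is constant.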
3. HETEROGENEITY PARAMETER
It is known that the sequence of optimal piecewise-
constant approximations of an image, depending on the
number $g$ of pixel clusters, is described by a
monotonically increasing sequence of non-positive
increments $\Delta E \le 0$ of the approximation error,
$\Delta E_g = E_g - E_{g-1}$:
$$\Delta E_2 \le \Delta E_3 \le \dots \le \Delta E_N \le 0$$
or by a convex sequence of the values $E$ themselves:
$$E_g \le \frac{E_{g-1} + E_{g+1}}{2}, \quad g = 2, 3, \dots, N-1. \qquad (1)$$
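Condition (1) is easy to check numerically. The sketch below assumes that the error values are stored in a list whose first element is $E_1$ and whose last element is $E_N$, and that an increment is taken as the difference between neighbouring values; the helper names are hypothetical.

```python
from typing import List, Sequence


def increments(E: Sequence[float]) -> List[float]:
    """Differences E_g - E_(g-1) for g = 2..N (assumed indexing)."""
    return [E[k] - E[k - 1] for k in range(1, len(E))]


def is_convex(E: Sequence[float], tol: float = 1e-9) -> bool:
    """Check E_g <= (E_(g-1) + E_(g+1)) / 2 for all interior g."""
    return all(E[k] <= (E[k - 1] + E[k + 1]) / 2 + tol
               for k in range(1, len(E) - 1))
```

A sequence is convex exactly when its increments do not decrease, so either form of the condition can be tested.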
Since, in the general case, the sequence of optimal
approximations is not hierarchical, the problem arises of
approximating the sequence of optimal approximations
by a hierarchical sequence of quasi-optimal
approximations. Quasi-optimal approximations are
described by a convex sequence of $E_g$ values and contain
an optimal approximation with a fixed number of clusters
$g_0$ (Fig. 1).
When the approximating curve is convex, the
approximation error $E_g$ is guaranteed to stay within a
certain threshold (the downward dotted straight line in
Fig. 1):
$$E_g \le E_1 \, \frac{N - g}{N - 1}. \qquad (2)$$
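Why convexity yields a bound of the form (2) can be sketched in one line, assuming that $E_N = 0$ because every pixel is then its own cluster. A convex sequence lies on or below the chord through its end points $(1, E_1)$ and $(N, E_N)$:

```latex
E_g \;\le\; \frac{N-g}{N-1}\,E_1 + \frac{g-1}{N-1}\,E_N
    \;=\; E_1\,\frac{N-g}{N-1},
\qquad g = 1, 2, \dots, N .
```

The right-hand side is the straight line that falls from $E_1$ at $g = 1$ to zero at $g = N$.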
The heterogeneity parameter $|\Delta E_{split}(i \cup j)|$
is defined for each cluster $i \cup j$ as the absolute value
of the approximation error increment
$\Delta E_{split}(i \cup j)$ caused by the division of
cluster $i \cup j$ into the pair of nested clusters $i$ and
$j$ that is specified in the considered binary hierarchy.
[Figure 1: plot of $E_g$ versus $g$, with axis marks $g_0$, $N$, $E_1$, 0 and 1.]
Fig. 1 – Approximation of optimal approximations (gray curve)
by quasi-optimal approximations (solid black curve).
From the convexity of the curve for quasi-optimal
approximations in Fig. 1 it follows that the heterogeneity
parameter does not increase when a cluster is replaced by
its nested clusters:
$$|\Delta E_{split}(i \cup j)| \ge |\Delta E_{split}(i)|, \ |\Delta E_{split}(j)|. \qquad (3)$$
Therefore, it can be considered as a quantitative
measure of the heterogeneity of pixel clusters.
The numerical value of the heterogeneity parameter
$|\Delta E_{split}(i \cup j)|$ is expressed via the numbers
of pixels $n_i$, $n_j$ and their three-component values
$I_i$, $I_j$ averaged within the clusters $i$, $j$ as:
$$|\Delta E_{split}(i \cup j)| = \Delta E_{merge}(i, j) = \frac{n_i n_j}{n_i + n_j} \, \|I_i - I_j\|^2, \qquad (4)$$
where $\Delta E_{merge}(i, j) \ge 0$ is the non-negative
increment of the approximation error caused by merging
the two clusters $i$ and $j$ into the cluster $i \cup j$.
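Formula (4) can be checked against a direct computation of squared errors. The sketch below is only an illustration; `merge_increment` and `squared_error` are hypothetical helper names, and clusters are assumed to be given as lists of color tuples.

```python
from typing import Sequence


def merge_increment(n_i: int, mean_i: Sequence[float],
                    n_j: int, mean_j: Sequence[float]) -> float:
    """Increment of E caused by merging two clusters, as in (4)."""
    dist2 = sum((a - b) ** 2 for a, b in zip(mean_i, mean_j))
    return n_i * n_j / (n_i + n_j) * dist2


def squared_error(cluster: Sequence[Sequence[float]]) -> float:
    """Total squared deviation of a cluster from its own mean color."""
    n = len(cluster)
    dims = len(cluster[0])
    mean = [sum(p[c] for p in cluster) / n for c in range(dims)]
    return sum((p[c] - mean[c]) ** 2 for p in cluster for c in range(dims))
```

For example, for the clusters `[(0, 0, 0), (2, 2, 2)]` and `[(10, 10, 10), (10, 10, 10)]` the difference between the squared error of the union and the sum of the two separate squared errors coincides with `merge_increment(2, (1, 1, 1), 2, (10, 10, 10))`.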
4. HIERARCHY OF PIXEL CLUSTERS
A posteriori, any convex sequence $E_1, E_2, \dots, E_N$
(Fig. 1) can be obtained by enlarging clusters of pixels
using Ward's method [4-6]. In this method, at the
beginning, each pixel is treated as an independent cluster.
Then, at each step, the pair of clusters $i$, $j$ that
provides the minimum increment $\Delta E_{merge}(i, j)$ of
the approximation error is merged:
$$i, j \to i \cup j: \quad (i, j) = \arg\min_{i, j = 0, 1, 2, \dots, g-1} \Delta E_{merge}(i, j), \qquad (5)$$
where the number of clusters $g$ decreases from $N$ to 1.
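The merging rule can be sketched as a deliberately naive implementation that scans all cluster pairs at every step. It is meant only to show the rule on a handful of pixels, not to reproduce the authors' software; the function name is hypothetical, and the pair search makes this version far slower than any practical one.

```python
from typing import List, Sequence, Tuple


def ward_error_sequence(pixels: Sequence[Sequence[float]]) -> List[float]:
    """Naive Ward clustering. Returns [E_N, E_(N-1), ..., E_1]."""
    # Each cluster is stored as (number of pixels, mean value).
    clusters: List[Tuple[int, Tuple[float, ...]]] = [
        (1, tuple(float(c) for c in p)) for p in pixels]
    E = 0.0
    errors = [E]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (na, ma), (nb, mb) = clusters[a], clusters[b]
                cost = na * nb / (na + nb) * sum(
                    (x - y) ** 2 for x, y in zip(ma, mb))
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        (na, ma), (nb, mb) = clusters[a], clusters[b]
        merged = (na + nb, tuple((na * x + nb * y) / (na + nb)
                                 for x, y in zip(ma, mb)))
        # Replace the two merged clusters by their union.
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (a, b)] + [merged]
        E += cost
        errors.append(E)
    return errors
```

Ties between equal minimum increments are resolved here by taking the first pair found, which is an arbitrary choice.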
For gray images, along with Ward's iterative pixel
clustering, the hierarchical Otsu method [7] of iteratively
dividing pixel clusters in two is also applicable. In
combination with the multi-threshold Otsu method [8,9],
it provides the simplest software implementation of the
target pixel clustering according to Fig. 1.
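One division step for gray values can be sketched as follows. The sketch uses an exhaustive search over thresholds, which is equivalent to maximizing the between-class variance in Otsu's criterion but is not the histogram-based formulation of [8,9]; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def sse(values: Sequence[float]) -> float:
    """Total squared deviation of the values from their mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_binary_split(values: Sequence[float]
                      ) -> Tuple[float, List[float], List[float]]:
    """Return (drop, low, high): the threshold split of a gray-value
    cluster that gives the largest drop of the total squared error."""
    ordered = sorted(values)
    whole = sse(ordered)
    best = (0.0, ordered, [])
    for cut in range(1, len(ordered)):
        if ordered[cut] == ordered[cut - 1]:
            continue  # a threshold cannot separate equal intensities
        low, high = ordered[:cut], ordered[cut:]
        drop = whole - sse(low) - sse(high)
        if drop > best[0]:
            best = (drop, low, high)
    return best
```

Applying the function repeatedly to the resulting parts gives a top-down binary hierarchy of gray-value clusters.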
At the output of iterative clustering, a binary cluster
hierarchy is generated by an algorithm of iterative merging
or splitting of pixel sets. It contains $N$ image
approximations. These approximations contain only
$2N - 1$ different pixel clusters. Of these, $N$ clusters
are indivisible, since they consist of individual pixels.
For each of the other $N - 1$ clusters, the split operation
is maintained, which restores the pair of pixel clusters 1
and 2 that were merged with each other to form cluster 3
in Ward’s method: $3 = 1 \cup 2: \ 3 \to 1, 2$.
5. SEQUENCE OF OBJECTS
The detection of a sequence of geometrically non-
intersecting objects is performed by thresholding the
heterogeneity or the area, i.e. the number of pixels in a
cluster. In this case, a subsequence of approximations
consisting of pixel clusters with heterogeneity values
higher and lower than the established threshold is selected
from the full hierarchical sequence of approximations.
Thus, the image field is divided into regions of objects
and background. The sequence of objects revealed in the
image field is encoded in the object rating map by
numbers in the order of their detection (Fig. 2).
Fig. 2 – Encoding of revealed objects in the standard color
Lenna image.
Fig. 2 shows the original image on the left and the
object rating map in 5 tones with 14132 segments on the
top right. This map is calculated for a
$|\Delta E_{split}|$ threshold equal to 1% of the maximum
value $E_1$ marked in Fig. 1. The bottom-right
representation presents the image with 13151 colors,
which is obtained by averaging the pixels inside the
segments of the object rating map.
Objects that are revealed first are marked in black on
the top-right object rating map. The objects revealed last
are marked in white. The values of heterogeneity and the
values of the marked area increase with increasing
intensity of gray tones. The resulting pixel intensities for
a given threshold are treated as automatically calculated
object identifiers.
Note that to get the top-right representation of the
image in several colors it is enough to specify a single
threshold value of the heterogeneity parameter or a single
area threshold value.
Object rating maps obtained for several thresholds of
heterogeneity and/or area describe image points marked
by vectors of object sequence numbers, which are
analyzed as automatically generated names of objects and
intended for further image recognition.
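How a single heterogeneity threshold selects clusters from a binary hierarchy can be sketched as follows. This is one possible reading of the procedure, not the authors' algorithm: the hierarchy is assumed to be given by a dictionary `children` (internal node to its two nested clusters) and a dictionary `heterogeneity` (node to its heterogeneity value), and clusters are divided in order of decreasing heterogeneity. All names are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def detect_clusters(root: int,
                    children: Dict[int, Tuple[int, int]],
                    heterogeneity: Dict[int, float],
                    threshold: float) -> List[int]:
    """Divide clusters top-down, largest heterogeneity first, while the
    heterogeneity exceeds the threshold. Returns the ids of the clusters
    that remain undivided, in order of decreasing heterogeneity."""
    heap = [(-heterogeneity.get(root, 0.0), root)]
    detected: List[int] = []
    while heap:
        neg_h, node = heapq.heappop(heap)
        if node in children and -neg_h > threshold:
            # The cluster is too heterogeneous: replace it by its parts.
            for child in children[node]:
                heapq.heappush(heap, (-heterogeneity.get(child, 0.0), child))
        else:
            detected.append(node)
    return detected
```

With several thresholds, calling the function once per threshold gives, for every pixel, a vector of cluster numbers of the kind described above.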
6. COMPUTATIONAL COMPLEXITY
The computational complexity of the discussed image
processing is determined by the iterative generation of a
hierarchy of image approximations that satisfies the
conditions (3) of order preservation in the dichotomous
division of clusters. Owing to the peculiarities of image
data, the fact that this generation can be performed by
the classical Ward’s method [4-6] does not allow one to
predict the result of calculations accurately. Images are
characterized by repeated minimum values of
$\Delta E_{merge}$ at the initial iterations of pixel
merging. Therefore, the result of the calculations is
affected by the order in which cluster pairs are merged,
so the target hierarchical sequence of approximations
satisfying (3) is not constructed uniquely.
The original Ward’s method is used extremely rarely
in image processing because of its high computational
complexity, which increases quadratically with the number
$N$ of pixels in the image. However, obtaining image
approximations in ordered colors (3) can be significantly
accelerated by applying Ward’s method to parts of an
image containing a limited number of pixels. In this case,
the processing is divided into three stages.
At the first stage, the image is divided into $g_0$
clusters of pixels, in particular, into $g_0$ segments,
which are processed by Ward's method as independent
images. At this stage, a hierarchical sequence of
approximations is constructed for each cluster.
At the second stage, the quality of the image partition
into $g_0$ clusters is optimized by the approximation
error $E$. As a result, the image is subdivided into $g_0$
superpixels (elementary pixel clusters), which are
characterized by a minimized value of $E$.
At the final stage, Ward's clustering of the superpixels
is performed. The processing finishes when all $g_0$
superpixels merge into one cluster and the complete
hierarchical sequence of $N$ image approximations is
calculated.
The total computational complexity of the first and the
third stages of processing by Ward's method, depending
on the number $N$ of image pixels, is estimated in order
of magnitude as the function $f(N)$:
$$f(N) \sim \frac{N^2}{g_0} + g_0^2, \qquad (6)$$
which has a minimum at $g_0$:
$$g_0 = \sqrt[3]{\frac{N^2}{2}}. \qquad (7)$$
Then, when the number of superpixels $g_0$ is chosen
according to (7), the computational complexity $f(N)$ is
expressed as:
$$f(N) \sim \frac{3}{2} N \sqrt[3]{2N}. \qquad (8)$$
Thus, for an increasing number of pixels $N$, the
computational complexity $f(N)$ grows as $N^{4/3}$, which
ensures the applicability of Ward's method to images of
practical dimensions.
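The estimates (6)-(8) can be checked numerically. The sketch below simply evaluates the order-of-magnitude expressions; the function names are hypothetical, and the values are operation-count estimates rather than measured running times.

```python
def ward_cost(N: float, g0: float) -> float:
    """Estimate (6): Ward's method inside g0 segments of N / g0 pixels
    each, followed by Ward's clustering of the g0 superpixels."""
    return N ** 2 / g0 + g0 ** 2


def optimal_superpixels(N: float) -> float:
    """Number of superpixels (7) that minimizes the estimate (6)."""
    return (N ** 2 / 2) ** (1 / 3)


def minimal_cost(N: float) -> float:
    """Closed form (8) of the estimate at the optimal g0."""
    return 1.5 * N * (2 * N) ** (1 / 3)
```

For a 512 x 512 image, `optimal_superpixels(512 * 512)` is roughly 3.25 thousand, and `ward_cost` evaluated at that value agrees with `minimal_cost`.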
If hierarchical Ward's pixel clustering in each of the
$g_0$ clusters of pixels at the first stage of processing
results in an image approximation for which the minimum
increment of the approximation error $\Delta E_{merge}$
is not less than the maximum heterogeneity value
$|\Delta E_{split}|$:
$$\min \Delta E_{merge} \ge \max |\Delta E_{split}|, \qquad (9)$$
then the second stage of generating ordered colors is
omitted. In this case, the estimate (8) describes the
computational complexity of the algorithm as a whole.
This is so because (9) is the criterion for preserving the
color order when the hierarchy of $g_0$ superpixels is
joined with the internal hierarchies of pixel sets of each
superpixel: it is the convexity condition (1) at
$g = g_0$.
If criterion (9) is violated for the $g_0$ structured
sets of pixels after the first stage, all violations are
eliminated at the intermediate stage of superpixel
formation by the so-called SI method, which consists in
iteratively dividing one cluster while merging another
pair of clusters [10,11].
The idea of the SI method is straightforward. If
condition (9) is violated, we look for the cluster whose
division into two subclusters gives the maximum decrease
in the approximation error, $\max |\Delta E_{split}|$.
Along with the division of the found cluster, the pair of
clusters with the minimum increment
$\min \Delta E_{merge}$ is found and merged.
If, instead of a simple merging of clusters, the
sequence of enlarged approximations is updated after the
merging, then the efficiency of minimizing the
approximation error increases. This is so because in this
case the maximum drop of the approximation error
$\max |\Delta E_{split}|$ is calculated on the set of
maximal values for each cluster.
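The choice made at one SI step can be sketched as follows. The sketch assumes that the drop of the error for dividing each cluster and the increment for merging each pair of clusters have already been computed and are passed in as dictionaries; the function name and the rule that the merged pair must not contain the divided cluster are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


def si_step(split_drop: Dict[int, float],
            merge_cost: Dict[Tuple[int, int], float]
            ) -> Optional[Tuple[int, Tuple[int, int], float]]:
    """Choose one divide-and-merge move that lowers E while keeping
    the number of clusters unchanged.

    split_drop[c]      : decrease of E when cluster c is divided in two
    merge_cost[(a, b)] : increase of E when clusters a and b are merged
    Returns (cluster to divide, pair to merge, gain) or None."""
    if not split_drop:
        return None
    c = max(split_drop, key=split_drop.get)
    # Assumption: the merged pair must not involve the divided cluster.
    candidates = {pair: cost for pair, cost in merge_cost.items()
                  if c not in pair}
    if not candidates:
        return None
    pair = min(candidates, key=candidates.get)
    gain = split_drop[c] - candidates[pair]
    return (c, pair, gain) if gain > 0 else None
```

A move is returned only when the drop from the division exceeds the increment from the merging, so each accepted move decreases the approximation error.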
7. DATA STRUCTURE
High-speed operations with hierarchically structured
clusters of image pixels are supported in terms of trees.
In this case, it is more convenient to use Sleator-Tarjan
dynamic trees [12,13] instead of conventional trees such
as dendrograms.
In the conventional interpretation of a tree, a new node
is generated when sets of pixels are merged. In the
Sleator-Tarjan interpretation, the merging of pixel sets is
described by establishing an arc between the root nodes of
trees. Thus, the sets of pixels themselves are ordered in
the tree structure (Fig. 3).
Fig. 3 explains the difference between the
interpretations of trees using the example of an image
consisting of four pixels.
A characteristic feature of the software toolbox being
developed for the formation and ordering of averaged
colors in an image is the use of reversible merging of
pixel clusters. In reversible merging, for each cluster
containing more than one pixel, the two clusters that were
united to obtain the given cluster are memorized. In this
case, iterative merging of cluster pairs is performed “from
pixels” in some calculated order. The merging order is
stored and reversed when the clusters are split.
Fig. 3 – Formation of a binary hierarchy of pixel clusters in
terms of conventional trees (above) and dynamic Sleator-
Tarjan trees (below).
In addition to reversing the order of cluster merging,
an arbitrary choice of the cluster to be divided in two is
also supported in the process of cluster division. In this
case, the initial order of cluster merging is modified, and
this modification is supported automatically.
Thus, reversible calculations are not limited to simple
data recovery at any step, as in [14,15], but are
implemented in a generalized sense. It becomes possible
to reduce the approximation error and improve the
quality of image approximations by combining the
operations of merging pixel clusters and dividing them in
two. For generalized reversible calculations, Sleator-Tarjan
dynamic trees (acyclic graphs) are supplemented with
cycles (cyclic graphs) in the form of linked lists.
Sleator-Tarjan dynamic trees together with linked lists
make up a network connecting the image pixels (Fig. 4).
Fig. 4 illustrates an image matrix of 25 pixels
interconnected by the arcs of a Sleator-Tarjan tree, which
are shown by solid lines. The tree has a single root node,
which coincides with the first image pixel and is treated
as an identifier for combining all the pixels of the image
into a single cluster. When an arc incident to the root
node is broken, the whole tree splits into two trees, and
the whole set of image pixels is divided into two clusters,
which are further considered as separate images. For each
node of the tree in Fig. 4, the incoming arcs are combined
into cycles, shown by thin dashed lines. The cycles define
the order in which the arcs have been established; this is
provided by additionally indicating, for each cycle, either
a start or an end node by means of pointers, shown by
bold dashed lines. The merging of clusters is performed
by establishing arcs between root nodes, and the inverse
operation of dividing a cluster in two is performed by
breaking arcs. When the process of cluster merging is
reversed for a given root node, the arcs are broken in the
reverse order.
Fig. 4 – The scheme of reversible cluster merging.
The data structure illustrated by the scheme in Fig. 4
is given by three arrays: an array of dynamic trees, an
array of cycles, and an array of pointers to the initial or
end nodes in the cycles. The real data structure includes
a number of additional arrays used to speed up
calculations, which are described by less complex schemes.
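The three arrays can be illustrated by a small sketch. It is a simplified model written for this explanation and not the authors' implementation: the class name, the array names `parent`, `cycle` and `last`, and the linear search for the previous arc are all assumptions. Nodes are the pixels themselves, merging establishes an arc between two root nodes, and splitting breaks the most recently established arc of a root.

```python
from typing import List


class ReversibleForest:
    """Pixels linked by tree arcs, with incoming arcs kept in cycles."""

    def __init__(self, n: int) -> None:
        self.parent: List[int] = list(range(n))  # tree arcs; a root points to itself
        self.cycle: List[int] = list(range(n))   # circular list of sibling arcs
        self.last: List[int] = [-1] * n          # most recently attached child, or -1

    def find(self, x: int) -> int:
        """Root pixel that identifies the cluster containing x."""
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def merge(self, a: int, b: int) -> int:
        """Establish an arc from the root of b to the root of a."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        self.parent[rb] = ra
        tail = self.last[ra]
        if tail != -1:
            self.cycle[rb] = self.cycle[tail]  # close the cycle on the oldest arc
            self.cycle[tail] = rb
        self.last[ra] = rb
        return ra

    def split(self, root: int) -> int:
        """Break the most recently established arc entering a root."""
        child = self.last[root]
        if self.parent[root] != root or child == -1:
            raise ValueError("node is not a root with incoming arcs")
        prev = child
        while self.cycle[prev] != child:   # find the previously attached arc
            prev = self.cycle[prev]
        self.cycle[prev] = self.cycle[child]
        self.last[root] = prev if prev != child else -1
        self.parent[child] = child
        self.cycle[child] = child
        return child
```

No additional nodes are created by `merge`, and repeated calls to `split` on the same root undo its merges in reverse order. Path compression is deliberately not used in `find`, because it would destroy the arcs needed for reversal.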
The Sleator-Tarjan dynamic tree and the cycles of Fig. 4
constitute a typical network in which the incoming arcs of
a given node are indexed by the values of heterogeneity
(the drop of the approximation error)
$|\Delta E_{split}|$ caused by the division of the pixel
cluster specified by this node. The considered network is
called dynamic, since it is dynamically rebuilt in the
course of computations, and algebraic, since it is
obtained by merging trees and cycles according to the
established rules. In this case, condition (3) provides a
hierarchical sequence of image partitions, which is
described by a convex sequence of approximation error
values. In the network, this condition means that the
weights of the arcs decrease weakly monotonically when
traversing the cycles from the end to the initial element
and from the root to the periphery of the Sleator-Tarjan
dynamic tree (Fig. 4).
To obtain a hierarchical sequence of image partitions
corresponding to a convex sequence of approximation
error values, the pixel clusters are calculated by
breaking, at each step, the arc that provides the maximum
value of $|\Delta E_{split}|$. On the other hand, if the
original image is divided into $g$ independent images
each containing more than one pixel, then there are $g$
options for dividing it into $g + 1$ independent images.
Thus, in addition to the single binary hierarchy of image
partitions for object detection in descending order, a
variety of different sequences of partitions of the original
image into independent structured images that represent
objects in various combinations is provided.
Unlike conventional trees, Sleator-Tarjan dynamic
trees are built directly on the set of image pixels, without
specifying additional nodes. Contrary to conventional
trees, the binary hierarchy of pixel clusters in terms of
Sleator-Tarjan dynamic trees is specified by an irregular
tree structure. But the visual interpretation of the
calculations is clearly preserved (Fig. 4).
Compared to conventional trees, the main features of
the Sleator-Tarjan dynamic trees are that:
1. Metadata describing the hierarchy of pixel clusters and
the clusters themselves are supported on the original
set of coordinates.
2. To minimize $\sigma$ or $E = 3N\sigma^2$, a generalized
mode of reversible computations is implemented. In this
mode, the state of the computing system at a given step
is not necessarily restored exactly. Owing to this,
minimization of $E$ is carried out not only in the direct
but also in the reverse course of calculations.
Thus, Sleator-Tarjan dynamic trees provide all the
capabilities of conventional trees with minimal memory
costs. Thanks to the first of the listed properties,
Sleator-Tarjan dynamic trees are quite convenient for
providing the simplest implementation of reversible
calculations. In the developed data structure, dynamic
trees (acyclic graphs) are constructed in several forms
and supplemented with cycles (cyclic graphs). In
combination, they form a dynamic algebraic network that
supports high-speed computation, storage and conversion
of millions of sets of pixels in the computer’s RAM. At
the same time, as experience shows, mastering the toolbox
of dynamic networks in order to solve image processing
problems of a particular type presents a certain difficulty
for programmers, which hinders its introduction into the
practice of image processing. A feature of the software
implementation of the model is the repeated calculation of
the extreme values of array elements while data are
modified, which requires time-consuming acceleration of
the algorithms by routine and special programming
techniques. Therefore, for the implementation of a
computational model, freely available ready-made
programs are preferable.
8. CONCLUSION
In the field of image recognition, a key problem is the
creation of software tools for developing solutions to
specific engineering problems, e.g. face recognition, road
condition analysis, estimation of distances to objects from
stereo pairs, etc. The urgency of the problem is currently
dictated primarily by financial considerations.
Moreover, solving the problem requires a model of
the image (a model of objects in the image) implemented
in a package of application programs that are freely
available and are intended to automate the creation of
specific applications.
The development of the required package has long
been under way in the United States (the PPAML project
of the DARPA agency, 2013-2017 [1]). In Russia, such
projects have not yet been carried out, and the theory of
image processing is mainly developed by generalizing
solutions to specific applied problems [16].
Perhaps, in addition to the inductive development of
image processing software toolboxes, special attention
should be paid to the deductive development of
application development toolboxes [17], including the
discussed model of quasi-optimal image approximations,
which is being developed at SPIIRAS [7,13].
9. REFERENCES
[1] PPAML (Probabilistic Programming for Advanced
Machine Learning), DARPA project, 2013-2017,
https://galois.com/project/probabilistic-programming-for-
advanced-machine-learning/.
[2] Achanta R., Hemami S., Estrada F., Susstrunk S.
Frequency-tuned salient region detection, Computer
vision and pattern recognition (CVPR), IEEE
conference, 2009. pp. 1597-1604.
[3] Cheng M.M., Mitra N.J., Huang X., Torr P.H., Hu
S.M. Global contrast based salient region detection,
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 2015. Vol. 37, No. 3. pp. 569-582.
[4] Ajvazyan S.A., Buhshtaber V.M., Enyukov I.S.,
Meshalkin L.D., Applied Statistics: Classification and
Dimension Reduction, Moscow: Finance and
Statistics, 1989. 607 pp.
[5] Mandel I.D. Cluster Analysis, Moscow: Finance and
Statistics, 1988. 176 pp.
[6] Ward J.H., Jr. Hierarchical grouping to optimize an
objective function, J. Am. Stat. Assoc. 1963. Vol. 58,
Issue 301. pp. 236-244.
[7] Kharinov M.V. Pixel Clustering for Color Image
Segmentation, Programming and Computer Software,
2015, Vol. 41, No. 5, pp. 258–266, DOI:
10.1134/S0361768815050047
[8] Otsu N. A Threshold Selection Method from Gray-
Level Histograms, IEEE Trans. on Systems, Man,
and Cybernetics, January 1979. Vol. SMC-9,
No. 1. pp. 62-66.
[9] Ping-Sung Liao, Tse-Sheng Chen, Pau-Choo Chung A
Fast Algorithm for Multilevel Thresholding, J. Inf.
Sci. Eng., 2001. 17, pp. 713-727.
[10] Kharinov M.V., Khanykov I.G. Optimization of
a piecewise constant approximation of a segmented
image, Proceedings of SPIIRAS, Vol. 3(40), 2015.
pp. 183-202.
[11] Kharinov M.V. Reversible Image Merging for
Low-level Machine Vision, arXiv preprint,
arXiv: 1604.03832, 2016. 5 pp.
[12] Nock R., Nielsen F. Statistical Region Merging,
IEEE Trans. Pattern Anal. Mach. Intell. 2004. 26(11),
pp. 1452–1458.
[13] Kharinov M.V. Data structures of learning
system for automatic image recognition, PhD thesis,
St.-Pt. Institute for Informatics and Automation of
Russian Academy of Sciences, 1993. 20 pp.
[14] Toffoli T. Reversible computing, In International
Colloquium on Automata, Languages, and
Programming, Springer Berlin Heidelberg. 1980. –
pp. 632-644.
[15] Zongxiang Yan Reversible Three-Dimensional
Image Segmentation. US Patent 20110158503 A1.
2009. 10 pp.
[16] Chochia P.A. Theory and methods of processing
video information on the basis of a two-scale image
model, Dis. Doct. technical. Sciences, Moscow: IPPI
RAS, 2016. 302 pp.
[17] Gurevich I.B., Zhuravlev Y.I. Computer science:
Subject, fundamental research problems,
methodology, structure, and applied problems, Pattern
recognition and image analysis, 2014. Vol. 24, No. 3.
pp. 333-346.