Superpixels Optimized by Color and Shape

Authors:
Vitaliy Kurlin, Donald Harvey
Department of Computer Science, University of Liverpool, UK
Abstract. Image over-segmentation is formalized as an approximation problem: a large image is segmented into a small number of connected superpixels with best-fitting colors. The approximation quality is measured by an energy whose main term is the sum of squared color deviations over all pixels, while a regularizer encourages round shapes. The first novelty is a coarse initialization of a non-uniform superpixel mesh based on selecting the most persistent edge segments. The second novelty is a scale-invariant regularizer based on the isoperimetric quotient. The third novelty is an improved coarse-to-fine optimization in which local moves are ordered by their energy improvements. The algorithm beats the state-of-the-art on the objective reconstruction error and performs similarly to other superpixel algorithms on the BSD500 benchmarks.
Keywords: superpixel, segmentation, approximation, boundary recall,
reconstruction error, energy minimization, coarse-to-fine optimization
1 Introduction: motivations, problem and contributions
1.1 Motivations: superpixels speed up higher level processing
Modern cameras produce images containing millions of pixels in a rectangular grid. This pixel grid is neither the most natural nor the most efficient representation, because not all these pixels are needed to correctly understand an image. Moreover, processing a large image pixel by pixel is slow, and many important algorithms have running time O(n^2) in the number n of pixels. However, we know of a smart vision system (called a human brain) that quickly extracts key elements of complicated scenes by skipping the vast majority of incoming light signals.
The main challenge of low-level vision is to represent a large image in a less redundant form that can speed up higher-level processing. The central problem is unsupervised over-segmentation: a pixel-based image is segmented into superpixels (unions of square pixels), which are perceptually meaningful atomic regions with consistent features such as color or texture.
Our motivations are to address the following key challenges of superpixels:
– rigorously state over-segmentation as an approximation problem in which a large pixel-based image is approximated by a mesh of fewer superpixels;
– add constraints so that superpixels are connected and have no inner holes;
– optimize superpixels in a data-driven way, e.g. by smartly choosing an initial configuration and attempting steps in a good order according to their costs;
– avoid parameters whose influence on superpixels is hard to describe.
Fig. 1. Odd rows: superpixel meshes by algorithms SLIC [1], SEEDS [2], ETPS [3],
ours. Even rows: Reconstructed images with the average color for every superpixel.
Blue rectangles show the areas where our compact superpixels better capture details.
1.2 Oversegmentation by superpixels is an approximation problem
The aim is to segment an image of n pixels into at most k < n superpixels that are connected unions of pixels satisfying conditions (1.2a)–(1.2d) below.
(1.2a) The resulting superpixels with best constant colors approximate the image well, i.e. the difference between the image and its approximation is minimized.
(1.2b) By construction the superpixels are connected and have no inner holes.
(1.2c) Superpixels adhere well to object boundaries, e.g. in comparison with human-drawn contours in the Berkeley Segmentation Database BSD500 [4].
(1.2d) The only parameters are the number of superpixels and a shape coefficient for a trade-off between the accuracy of boundaries and the shapes of superpixels.
Since images are often replaced by their superpixel meshes, condition (1.2a) highlights the importance of measuring the quality of such an approximation. The pixelwise sum of squared differences is the standard statistical mean error and can be based on colors (as in Definition 1), on texture information, or on other pixel features. Condition (1.2b) guarantees that no post-processing is needed, so a superpixel mesh can be represented by a simple graph (instead of a much larger regular grid) whose nodes are superpixels and whose links connect adjacent superpixels. Condition (1.2c) follows the tradition of evaluating superpixels on BSD benchmarks. Condition (1.2d) restricts manually chosen parameters.
1.3 Contributions to the state-of-the-art for superpixels
First, we introduce an adaptive initialization of a superpixel mesh, whose main idea of persistent edges from Definition 3 can be used in any hot spot analysis. Second, the new regularizer in Definition 2 is scale-invariant, hence the superpixels are truly optimized by shape. Third, the optimization is improved in subsection 4.2 and its running time is justified for the first time in Theorem 12. Here are the stages of the algorithm SOCS: Superpixels Optimized by Color and Shape.
Stage 1: detecting persistent horizontal and vertical edges along object bound-
aries to form a non-uniform grid of rectangular blocks, see subsection 3.2.
Stage 2: merging blocks in a grid when a reconstruction error is minimally
increased to get a non-regular initial mesh that is quickly adapted to a given
image and contains a required number of superpixels, see subsection 3.4.
Stage 3: subdividing rectangular blocks within every superpixel into sub-blocks
going from a coarse level to a finer level of optimization in subsection 4.1.
Stage 4: a new way to choose boundary blocks for moving to adjacent super-
pixels, then repeat Stage 3 until all blocks become pixels, see subsection 4.2.
2 A review of past superpixel algorithms
The excellent survey by D. Stutz et al. [5, table 3 in section 8] recommends 6 algorithms, which are reviewed below in addition to a few other good methods.
A pixel-based image is represented by a graph G whose nodes are in a 1–1 correspondence with all pixels, while the edges of G represent adjacency relations between pixels, where each pixel is connected to its closest 4 or 8 neighbors.
The seminal Normalized Cuts algorithm by Shi and Malik [6] finds an optimal partition of G into connected components, which minimizes an energy taking into account all nodes of G. The Entropy Rate Superpixels (ERS) of Liu et al. [7] minimize the entropy rate of a random walk on a graph. Based on Compact Superpixels by Veksler and Boykov [8], the faster algorithm by Zhang et al. [9] processes an average image from BSD500 in 0.5 sec. The Contour Relaxed Superpixels (CRS) by Conrad et al. [10] optimize a cost depending on texture.
The Simple Linear Iterative Clustering (SLIC) algorithm by Achanta et al. [1] forms superpixels by k-means clustering in a 5-dimensional space using 3 colors in the CIELAB space and 2 coordinates per pixel. Because the search is restricted to a neighborhood of a given size, the complexity is O(kmn), where n is the number of pixels and m is the number of iterations. This gives an average running time of about 0.2 s per image in BSD500. If a final cluster of pixels is disconnected or contains holes, post-processing is possible, but it increases the runtime.
The recent improvements of SLIC are the Linear Spectral Clustering (LSC) by Li et al. [11], based on a weighted k-means clustering in a 10-dimensional space, and the Eikonal-based Region Growing Clustering (ERGC) by Buyssens et al. [12].
The coarse-to-fine optimization progressively approximates a superpixel segmentation. At the initial coarse level, each superpixel consists of large rectangular blocks of pixels. At the next level, all blocks are subdivided into 4 rectangles and the blocks are rearranged to find a better approximation depending on a cost function; this continues until all blocks become pixels.
SEEDS (Superpixels Extracted via Energy-Driven Sampling) by Van den Bergh et al. [2] seems to be the first superpixel algorithm to use a coarse-to-fine optimization. The colors of all pixels within each fixed superpixel are put in bins, usually 5 bins per channel. Each superpixel has the associated sum of deviations of all bins from an average bin within the superpixel. This sum is maximal for a superpixel whose pixels have colors in one bin. SEEDS iteratively maximizes the sum of deviations by shrinking or expanding superpixels.
The ETPS algorithm (Extended Topology Preserving Superpixels) by Yao et al. [3] minimizes a different cost function, which is the reconstruction error RE of subsection 3.1 plus the deviation of the pixels within a superpixel from its geometric center, along with a cost proportional to the boundary lengths of superpixels. This regularizer encourages superpixels of small sizes; however, the benchmarks on BSD500 are computed [3, Fig. 4] without the regularizer (as for SEEDS).
SEEDS and ETPS satisfy the topological condition (1.2b) by construction. ETPS was highlighted as the best algorithm by D. Stutz et al. [5, table 3 in section 8].
3 Energy-based superpixels formed by coarse blocks
This section explains the new adaptive initialization of coarse superpixels that are better than the uniform grid used in most past superpixel algorithms. Persistent edges in a given image generate a non-uniform mesh of rectangular blocks. These blocks are iteratively merged so that the energy function grows as little as possible, until we reach the required number of superpixels.
3.1 The energy is a reconstruction error of approximation
An image I can be considered as a function from pixels to a space of colors. We consider I(p) as the vector (L, a, b) of 3 color components in the CIELAB space, which is more perceptually uniform than the RGB space with red, green and blue components. In the CIELAB space, L is the lightness and a represents the opponent colors from green to red (the lowest value of a means green, the highest value of a means red). The component b similarly represents the opponent colors from blue to yellow. The OpenCV function cvtColor outputs each Lab channel in the range [0, 255].
Definition 1 Let an image I of n pixels be segmented into k superpixels. For every pixel p, denote by S(p) the superpixel containing p. Then S(p) has the mean color(S(p)) = \frac{1}{|S(p)|} \sum_{q \in S(p)} I(q). Since I is approximated by superpixels with mean colors, the natural measure of quality is the Reconstruction Error

RE = sum of squared color deviations = \sum_{p=1}^{n} \| I(p) - color(S(p)) \|^2.   (3.1a)
Each of the 3 channels in the Lab space has the range [0, 255]. Hence the following normalized Root Mean Square of the color error, in percent, is shown in Fig. 6:

nRMS = \sqrt{ \frac{RE}{3n} } \times \frac{100\%}{255} = \sqrt{ \frac{1}{3n} \sum_{p=1}^{n} \| I(p) - color(S(p)) \|^2 } \times \frac{100\%}{255}.   (3.1b)
The Reconstruction Error RE can be written similarly to (3.1a) for other pixel properties instead of colors, e.g. for texture. The main objective advantage of nRMS in (3.1b) is its independence from any subjective ground truth.
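As a concrete illustration, RE in (3.1a) and nRMS in (3.1b) can be computed directly from a label map. The following NumPy sketch uses our own helper names (the paper's implementation is in C++), assuming each channel lies in [0, 255]:

```python
import numpy as np

def reconstruction_error(image, labels):
    """RE from (3.1a): sum over pixels of the squared deviation from the
    mean color of the pixel's superpixel. `image` is (h, w, 3) float,
    `labels` is an (h, w) map of superpixel indices 0..k-1."""
    flat = image.reshape(-1, 3).astype(float)
    lab = labels.reshape(-1)
    k = lab.max() + 1
    # per-superpixel color sums and pixel counts -> mean colors
    sums = np.zeros((k, 3))
    np.add.at(sums, lab, flat)
    counts = np.bincount(lab, minlength=k).astype(float)
    means = sums / counts[:, None]
    return float(((flat - means[lab]) ** 2).sum())

def nrms(image, labels):
    """Normalized root-mean-square color error in percent, as in (3.1b)."""
    n = labels.size
    return float(np.sqrt(reconstruction_error(image, labels) / (3 * n)) * 100.0 / 255.0)
```

A segmentation whose superpixels are constant-colored regions of the image gives RE = 0; any color variation inside a superpixel increases RE.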
A color term proportional to RE was used by Yao et al. [3] with the regularizer PD = sum of squared pixel deviations = \sum_{p} \| p - center(S(p)) \|^2, where center(S(p)) = \frac{1}{|S(p)|} \sum_{q \in S(p)} q is the geometric center of the superpixel S(p). The term PD is not invariant under scaling and penalizes large superpixels, which has motivated us to introduce the scale-invariant regularizer below.
Definition 2 The isoperimetric quotient IQ(S) = \frac{4\pi \, area(S)}{perimeter^2(S)} of a superpixel S is a scale-invariant shape characteristic having the maximum value 1 for a round disk S. The IQ measure of an image over-segmentation I = \cup_{i=1}^{k} S_i is the average

IQ = \frac{1}{\#superpixels} \sum_{superpixels \, S} IQ(S) = \frac{4\pi}{k} \sum_{i=1}^{k} \frac{area(S_i)}{perimeter^2(S_i)}.   (3.1c)

The SOCS algorithm will minimize the energy equal to the weighted sum

Energy = \frac{RE}{n} + c \times IQ,  where RE is defined in (3.1a) and c is a shape coefficient.   (3.1d)
Schick et al. [13] suggested another weighted average of isoperimetric quotients,

CO = \sum_{superpixels \, S} \frac{area(S)}{\#superpixels} IQ(S) = \frac{4\pi}{k} \sum_{i=1}^{k} \frac{area^2(S_i)}{perimeter^2(S_i)},

in which larger superpixels are forced to have rounder shapes, see the experiments in Fig. 6.
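Both shape measures are easy to evaluate on a discrete label map if the perimeter is approximated by the number of unit edges between a pixel and a 4-neighbor with a different label (or the image border). A NumPy sketch with our own helper names, not taken from the paper's code:

```python
import numpy as np

def perimeters_and_areas(labels):
    """Discrete area (pixel count) and perimeter (count of unit edges to a
    4-neighbor with a different label, or to the image border) per superpixel."""
    k = labels.max() + 1
    areas = np.bincount(labels.ravel(), minlength=k).astype(float)
    per = np.zeros(k)
    padded = np.pad(labels, 1, constant_values=-1)   # -1 marks the outside
    h, w = labels.shape
    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nb = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]   # shifted neighbor labels
        diff = labels != nb
        np.add.at(per, labels[diff], 1)
    return areas, per

def average_iq(labels):
    """Average isoperimetric quotient, formula (3.1c)."""
    areas, per = perimeters_and_areas(labels)
    return float(np.mean(4 * np.pi * areas / per ** 2))

def co(labels):
    """Area-weighted variant of Schick et al. as stated in the text:
    CO = (4*pi/k) * sum of area^2 / perimeter^2."""
    areas, per = perimeters_and_areas(labels)
    return float(np.sum(4 * np.pi * areas ** 2 / per ** 2) / len(areas))
```

Note that with this discrete perimeter a single square pixel already attains IQ = pi/4, so the measure rewards axis-aligned compact shapes rather than literal disks.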
3.2 Stage 1: detection of persistent horizontal and vertical edges
The SOCS algorithm starts by finding persistent edges along horizontal and vertical lines of the pixel grid, see Definition 3. The first step is to apply to a given image I the bilateral filter from OpenCV with a size of 5 pixels and sigma values of 100 for deviations in both the color and coordinate spaces. The second step is to compute the image gradients d_x I and d_y I using the standard 2 × 2 masks. For every row j = 1, ..., rows(I) of the image I, we get a graph of gradient magnitudes |d_y I| over 1 ≤ i ≤ columns(I). A similar graph of magnitudes |d_x I| is computed over every column of I. For any such graph of discrete values f(1), ..., f(l), Definition 3 formalizes an automatic method to detect continuous intervals a ≤ t ≤ b where the graph f has persistently high values.
Fig. 2. The effect of c in (3.1d): 1st: c = 0, 2nd: c = 1, 3rd: c = 10, 4th: c = 100.
Definition 3 For a function f : R → R discretely sampled at t = 1, ..., l, the strength of a line edge L = [a, b] is the sum \sum_{t \in [a,b]} f(t). Fig. 3 visualises the strength of an edge L as the area under a continuous graph f(t) over L. For any threshold v, the superlevel set f^{-1}[v, +∞) = { t ∈ R : f(t) ≥ v } consists of several edges L_i. When v is decreasing, the edges L_i grow and merge with each other until we get a single edge covering all points t = 1, ..., l. For any fixed v, we compute the widest gap between the strengths of the edges that form the superlevel set f^{-1}[v, +∞). We find the critical level v, between the median and the maximum value of f, at which this widest gap is maximal, see Fig. 3. At this critical value v, the edges whose strengths are above the widest gap are called persistent.
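The thresholding procedure of Definition 3 can be sketched naively as follows. This is a quadratic-time illustration with our own function names, not the O(l log l) union-find implementation analyzed in Lemma 4:

```python
import numpy as np

def _edges_of_superlevel_set(f, v):
    """Maximal runs of consecutive samples with f(t) >= v."""
    edges, start = [], None
    for t, high in enumerate(f >= v):
        if high and start is None:
            start = t
        elif not high and start is not None:
            edges.append((start, t - 1)); start = None
    if start is not None:
        edges.append((start, len(f) - 1))
    return edges

def persistent_edges(f):
    """Scan thresholds v over the sampled values of f between its median
    and maximum, pick the level where the widest gap between edge
    strengths is largest, and keep the edges above that gap."""
    f = np.asarray(f, dtype=float)
    best_gap, best_v, best_cut = -1.0, None, None
    for v in np.unique(f):
        if not (np.median(f) <= v <= f.max()):
            continue
        strengths = sorted(f[a:b + 1].sum()
                           for a, b in _edges_of_superlevel_set(f, v))
        gaps = [hi - lo for lo, hi in zip(strengths, strengths[1:])]
        if gaps and max(gaps) > best_gap:
            best_gap = max(gaps)
            best_v = v
            best_cut = strengths[gaps.index(max(gaps))]  # strength just below the gap
    return [(a, b) for a, b in _edges_of_superlevel_set(f, best_v)
            if f[a:b + 1].sum() > best_cut]
```

On a signal with two strong bumps and one weak bump, only the two strong intervals survive the widest gap.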
Proofs of claims can be replaced by more image experiments in a final version.
Lemma 4 For any image I of size w × h pixels, the persistent edges in all w + h horizontal and vertical lines of I can be found in time O(w log h + h log w).

Proof. Assuming that the points t = 1, ..., l form a connected interval graph, the segments in Definition 3 are connected components of a superlevel set f^{-1}[v, +∞). These components are maintained by a union-find structure, which requires O(log l) operations per update (creating a new segment, adding a new node to an old segment, or merging 2 segments). Every update changes the strengths of at most 2 segments, hence takes O(log l) operations if we keep the ordered set of strengths in a binary tree. The time is O(w log h + h log w) for w columns (vertical lines) of length h and h rows (horizontal lines) of length w. □
Fig. 3. Left: a superlevel set has 4 edges (green) with their strengths highlighted as yellow areas at the median value of f. Right: the strengths of the edges are analyzed while the threshold v decreases; the widest gap between strengths is shown in red.
Given an expected number k of superpixels, the average area of a single superpixel is n/k. If such a superpixel were a square, its side would be s = \sqrt{n/k}. If an image I has size w × h, we build a non-uniform grid of 2[w/s] × 2[h/s] rectangular blocks. We select the 2[w/s] columns and 2[h/s] rows that have the maximum strengths of their persistent edges from Definition 3. To avoid close edges, after selecting a current maximum along a line x = const or y = const, we ignore the neighboring lines at a distance of less than 4 pixels. By extending persistent edges to the boundary of I, we get a non-uniform edge grid, see Fig. 4.
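The selection of the strongest rows and columns with suppression of near-duplicates within 4 pixels can be sketched as a simple greedy pick (an illustrative helper of our own, not the paper's code):

```python
import numpy as np

def select_lines(strengths, count, min_gap=4):
    """Greedily pick `count` line positions with the largest persistent-edge
    strengths, skipping positions within `min_gap` pixels of an earlier pick."""
    order = np.argsort(strengths)[::-1]          # positions by decreasing strength
    picked = []
    for pos in order:
        if all(abs(pos - q) >= min_gap for q in picked):
            picked.append(int(pos))
        if len(picked) == count:
            break
    return sorted(picked)
```

For example, with strengths [9, 8, 7, 0, 6, 0, 0, 5] and count 2, position 0 is picked first, positions 1 and 2 are suppressed as too close, and position 4 is picked next.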
3.3 Cost of merging superpixels and the superpixel structure
The edge grid is already adapted to the image I better than the standard uniform grid used in other algorithms. However, large regions of almost constant colors such as sky can be cut by extended edges into unnecessarily small blocks. Stage 2 in subsection 3.4 will merge rectangular blocks into a smaller number of larger superpixels without increasing the Reconstruction Error too much.
If an image is segmented into superpixels I = \cup_{i=1}^{k} S_i, the Reconstruction Error in formula (3.1a) decomposes as a sum of energies over all superpixels:

RE = \sum_{i=1}^{k} E(S_i),  where E(S_i) = \sum_{p \in S_i} \| I(p) - color(S_i) \|^2.   (3.3a)

The cost of merging S_i, S_j is E(S_i, S_j) = E(S_i \cup S_j) - E(S_i) - E(S_j) ≥ 0.   (3.3b)

This cost can be 0 only if S_i, S_j have exactly the same mean color(S_i) = color(S_j). Technically, two superpixels S_i, S_j may share more than one edge, e.g. a connected chain of edges. If the intersection S_i ∩ S_j is disconnected, e.g. one edge e and a vertex v ∉ e, we set E(S_i, S_j) = +∞, so the superpixels S_i, S_j will not merge, which avoids the harder cases when a superpixel may touch itself.
To prepare for the coarse-to-fine optimization in section 4, where superpixels are iteratively improved, we introduce the superpixel structure from our implementation with sums and pointers to 4 sub-blocks for each rectangular block.
Fig. 4. Left: red persistent edges generate the blue edge grid at Stage 1. Middle:
initial mesh after Stage 2. Right: final mesh with 99 superpixels after Stages 3-4.
We split any rectangular block from the non-uniform grid into four smaller rectangular sub-blocks by subdividing each side into 2 almost equal parts whose lengths differ by at most 1 pixel. We do not subdivide 1-pixel sides, so 1-pixel wide blocks are subdivided into only 2 sub-blocks. Since each block keeps pointers to its 4 sub-blocks, the superpixel structure looks like a large tree whose root points to the coarsest blocks, each of which points to its 4 sub-blocks, and so on.
Definition 5 The superpixel structure of S contains the number |S| of pixels in a superpixel S, sum(S) = \sum_{p \in S} I(p), sum2(S) = \sum_{p \in S} \| I(p) \|^2, and the list of (x, y) indices in the block grid of the blocks covered by S. Each block B has the index of its superpixel S, the similar sums |B|, sum(B), sum2(B), and pointers to its 4 sub-blocks.
The color sums of a superpixel S in Definition 5 are justified by Lemma 6.

Lemma 6 In (3.3b) the cost E(S_i, S_j) of merging superpixels S_i, S_j can be computed by using the structures of S_i, S_j from Definition 5 in constant time.

Proof. Since sum(S) = |S| color(S), the energy in (3.3b) becomes

E(S) = \sum_{p \in S} ( \| I(p) \|^2 - 2 \, color(S) \cdot I(p) + \| color(S) \|^2 ) = \sum_{p \in S} \| I(p) \|^2 - 2 \, color(S) \cdot \sum_{p \in S} I(p) + |S| \, \| color(S) \|^2 = sum2(S) - \frac{\| sum(S) \|^2}{|S|}.   (3.3c)

Since the union S_i ∪ S_j nicely affects the area and sums, i.e. |S_i ∪ S_j| = |S_i| + |S_j|, sum(S_i ∪ S_j) = sum(S_i) + sum(S_j), sum2(S_i ∪ S_j) = sum2(S_i) + sum2(S_j), (3.3c) implies that the computation of E(S_i, S_j) is independent of |S_i ∪ S_j|. □
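Lemma 6 in code: with the statistics of Definition 5 stored per superpixel, the merge cost (3.3b) needs only additions of counts and sums. A minimal Python sketch (class and function names are ours, not from the paper's C++ implementation):

```python
import numpy as np

class SP:
    """Per-superpixel statistics of Definition 5: pixel count and the
    per-channel sums of colors and of squared colors."""
    def __init__(self, pixels):                  # pixels: (m, 3) array of colors
        p = np.asarray(pixels, dtype=float)
        self.n = len(p)
        self.sum = p.sum(axis=0)
        self.sum2 = float((p ** 2).sum())

def energy(s):
    """E(S) = sum2(S) - |sum(S)|^2 / |S|, formula (3.3c)."""
    return s.sum2 - (s.sum @ s.sum) / s.n

def merge_cost(a, b):
    """Cost (3.3b) of merging two superpixels in constant time: only the
    counts and sums of the union are needed, never its pixel list."""
    u = SP.__new__(SP)                           # statistics of the union
    u.n, u.sum, u.sum2 = a.n + b.n, a.sum + b.sum, a.sum2 + b.sum2
    return energy(u) - energy(a) - energy(b)
```

For instance, merging two pixels of color 0 and 2 (energy 6) with a single pixel of color 4 (energy 0) yields a union energy of 24, so the merge cost is 18.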
3.4 Stage 2: merging adjacent superpixels with minimum energy
At Stage 2 adjacent superpixels are iteratively merged, starting from the pairs with a minimum cost E(S_i, S_j) in (3.3b). Since superpixels may share more than one edge, we associate the cost E(S_i, S_j) with pairs of adjacent superpixels. Each unordered pair (S_i, S_j) has a unique key(S_i, S_j), e.g. formed by the indices of the superpixels in the edge grid. Stage 2 finishes when the number of superpixels drops to a given maximum m. Fig. 6 shows experiments where the number of superpixels can go down to 0.25m if nRMS jumps by not more than 2%.
Lemma 7 If an image I = \cup_{i=1}^{k} S_i is segmented into k superpixels, there are at most O(k) pairs (S_i, S_j) of adjacent superpixels. In time O(k log k) one can find and merge a pair (S_i, S_j) with a minimal cost E(S_i, S_j), updating the costs of all pairs.

Proof. Since the common boundary of S_i, S_j grows over time, we keep the list of all common edges in a binary edge tree indexed by key(S_i, S_j), which allows fast insertion and deletion of new pairs of adjacent superpixels. To quickly find key(S_i, S_j) and the corresponding pair of adjacent superpixels with a minimum cost E(S_i, S_j), we put all keys into a binary cost tree indexed by E(S_i, S_j).

All k superpixels form a planar network with f bounded faces and g edges, where each pair of adjacent superpixels is represented by one edge. Since each bounded face has at least 3 edges, the doubled number of edges 2g is at least 3f, so f ≤ (2/3) g. The Euler formula k - g + f = 1 gives 1 ≤ k - g + (2/3) g, hence g ≤ 3(k - 1).

Then both binary trees above have size O(k). The first element in the cost tree has the minimum cost E(S_i, S_j) and can be found and removed in constant time. The search for the corresponding key(S_i, S_j) in the edge tree in time O(log k) leads to the list of common edges of the superpixels S_i, S_j.

The edge grid from Stage 1 is converted into a polygonal mesh using the OpenMesh library. Then each common edge is removed by the collapse and remove-edge operations from OpenMesh, each taking constant time. For each of the remaining O(k) edges of S_i ∪ S_j on the boundary of another superpixel S, the cost E(S, S_i ∪ S_j) is computed by Lemma 6 and is added to the cost tree in time O(log k). □
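Stage 2 can be sketched with a lazy-deletion heap standing in for the binary cost tree of Lemma 7. Each region carries the (count, sum, sum2) statistics of Definition 5, and merged regions receive fresh ids so that stale heap entries are detected and skipped. All names below are ours; the sketch also ignores the edge-grid geometry and the +∞ costs for disconnected intersections:

```python
import heapq
import numpy as np

def greedy_merge(regions, adjacency, target):
    """`regions[i] = (n, sum, sum2)` holds the color statistics of region i,
    `adjacency[i]` is its set of neighbor ids. Pairs are merged in order of
    increasing cost (3.3b) until `target` regions remain; returns the
    surviving region dict (mutated in place)."""
    def energy(r):
        n, s, s2 = r
        return s2 - (s @ s) / n                  # formula (3.3c)
    def union(ra, rb):
        return (ra[0] + rb[0], ra[1] + rb[1], ra[2] + rb[2])
    def cost(i, j):
        return (energy(union(regions[i], regions[j]))
                - energy(regions[i]) - energy(regions[j]))
    heap = [(cost(i, j), i, j) for i in adjacency for j in adjacency[i] if i < j]
    heapq.heapify(heap)
    fresh = max(regions) + 1                     # merged regions get brand-new ids
    while len(regions) > target and heap:
        _, i, j = heapq.heappop(heap)
        if i not in regions or j not in regions:
            continue                             # stale pair: an endpoint was merged away
        regions[fresh] = union(regions.pop(i), regions.pop(j))
        adjacency[fresh] = (adjacency.pop(i) | adjacency.pop(j)) - {i, j}
        for m in adjacency[fresh]:
            adjacency[m].discard(i); adjacency[m].discard(j)
            adjacency[m].add(fresh)
            heapq.heappush(heap, (cost(fresh, m), min(fresh, m), max(fresh, m)))
        fresh += 1
    return regions
```

In a chain of three regions where two share the same mean color, the zero-cost pair is merged first and the high-cost pair is left intact.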
4 The coarse-to-fine optimization for superpixels
This section carefully analyzes the coarse-to-fine optimization used by Yao et al. [3]. At Stage 3 each rectangular block in the current grid is subdivided as explained before Definition 5. At Stage 4 each boundary block that belongs to a superpixel S_i and is adjacent to another superpixel S_j is checked for a potential move from S_i to S_j. After completing this optimization for all boundary blocks, Stages 3 and 4 are repeated at the next finer level until all blocks become pixels.
4.1 Stage 3: subdividing rectangular blocks into four sub-blocks
Lemma 8 explains how the superpixel structure from Definition 5 helps us to
quickly compute color sums for all superpixels and subdivide all superpixels.
Lemma 8 Let an image I = \cup_{i=1}^{k} S_i of n pixels be segmented into k superpixels. Then all |S_i|, sum(S_i), sum2(S_i) can be found in time O(n) independent of k.

Proof. We recursively compute all sums for each block B by adding the corresponding sums from each of the 4 sub-blocks of B. Since, for each single-pixel block B = p, we have |B| = 1, sum(B) = I(p), sum2(B) = \| I(p) \|^2, we need only O(n) + O(n/4) + O(n/16) + ··· = O(n) additions to compute the sums for all blocks. For each superpixel S_i, we find |S_i|, sum(S_i), sum2(S_i) by adding the sums from all blocks in S_i in time O(|S_i|), so the total time is O(n). □
Lemma 9 When blocks are subdivided going from a coarse to a finer level, each superpixel S containing b blocks larger than 1 × 1 can be updated in time O(b).

Proof. By Definition 5, for each superpixel S we only need to replace the list of blocks in the current grid by a longer list of blocks in the refined grid, which is done by merging the lists from the 4 sub-blocks of each block covered by S. The index of S is copied to every new sub-block, which takes O(b) time. □
4.2 Tree of boundary blocks and local connectivity of superpixels
A block B in a superpixel S is called a boundary block if one of its 4 side neighbors belongs to a different superpixel, which can be quickly checked by comparing the superpixel indices of the blocks. The ETPS algorithm puts all boundary blocks into a priority queue and adds any new boundary blocks to the end of this queue. We have replaced this queue by a binary tree where blocks are ordered by the costs of their moves, so that moves are attempted according to their costs, not row by row.
Fig. 5. Left: allowed moves preserve the local connectivity. Right: a forbidden move.
Blocks in the tree are tested one by one for a potential move to an adjacent superpixel. Such a move was called forbidden in [3, section 3] if S becomes disconnected after removing B. However, the global connectivity of S − B is slow to check. A removal of a boundary block B from a superpixel S respects the local connectivity of S if the 8-neighborhood of B within S − B is connected, see Fig. 5. The 3 pictures in [3, Fig. 3] show some (but not all) forbidden moves, so we justify below why the local connectivity can be checked in constant time.
Lemma 10 For any boundary block B moving from a superpixel S_i to another superpixel S_j, the local connectivity of S_i − B can be checked in constant time.

Proof. We go around the circular 8-neighborhood N_8(B), consider all blocks of S − B as isolated vertices, and add an edge between vertices u, v if the corresponding blocks in N_8(B) share a common side. Then S − B is locally connected around B if and only if the resulting graph on at most 8 vertices is connected. □
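Assuming equal-size blocks, the ring cells of N_8(B) are side-adjacent exactly when they are consecutive around the ring, so the check of Lemma 10 reduces to counting False-to-True transitions around a cyclic list. A sketch with our own naming:

```python
def locally_connected(neigh):
    """`neigh` is the 3x3 boolean neighborhood of a boundary block B
    (center entry ignored): True where the cell still belongs to S - B.
    With equal-size cells, two ring cells share a side exactly when they
    are consecutive around the ring, so S - B stays locally connected iff
    the True cells form a single contiguous arc of that cycle."""
    ring = [neigh[0][0], neigh[0][1], neigh[0][2], neigh[1][2],
            neigh[2][2], neigh[2][1], neigh[2][0], neigh[1][0]]
    if not any(ring):
        return False
    # count False -> True transitions around the cyclic ring
    changes = sum(1 for i in range(8) if not ring[i - 1] and ring[i])
    return changes == 1
```

A single arc (even one that wraps around the start of the list) gives exactly one transition; two separate arcs give two, so the move would be forbidden.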
4.3 Stage 4: updating superpixels in a constant time per move
Let the move of B ⊂ S_i to another superpixel S_j keep S_i − B locally connected. If the Reconstruction Error in (3.1a) is decreased, we move B to S_j and add the new boundary blocks to the cost tree; otherwise we remove B from the tree.
Lemma 11 For any block B moving from a superpixel S_i to another superpixel S_j, the structures of both superpixels S_i, S_j can be updated in constant time.

Proof. All sums of colors over the block B are subtracted from the corresponding sums of S_i and added to the sums of S_j. We change the superpixel index of B from i to j. After B has moved, only its 4 neighboring blocks can change their boundary status, which is checked in constant time by comparing superpixel indices. Any new boundary blocks are added to the binary tree of blocks. □
The time of Stage 4 essentially depends on the number q of boundary blocks that are processed in the cost tree. Stage 4 finishes when the tree is empty or q exceeds the upper bound of n given pixels, which never happened for BSD500.
Theorem 12 The SOCS algorithm segmenting an image of n pixels into k superpixels has the asymptotic computational complexity O(n + k^2 log k + q).

Proof. Stage 1 takes about O(n log n) time by Lemma 4. At Stage 2 we merge at most O(k) pairs of superpixels, each pair in time O(k log k) by Lemma 7. By Lemma 8 all superpixels are subdivided in time O(b) for a grid with b blocks larger than 1 × 1. The number of such blocks increases to n by a factor of at least 2 per level, hence the total time for Stage 3 is O(n). By Lemmas 10 and 11 the time for Stage 4 is proportional to the number q of processed blocks in the cost tree, because each boundary block is adjacent to at most 3 other superpixels. □
5 Comparisons with other algorithms on BSD500
The Berkeley Segmentation Database BSD500 [4] contains 500 natural images with human-drawn closed contours around object boundaries. All pixels in every image are thus split into disjoint segments, which are large unions of pixels comprising a single object. So every pixel has the index of its segment and some pixels are also labeled as boundary pixels. Every image has about 5 human drawings, which vary significantly and are called the ground-truth segmentations.

For an image I, let I = \cup_j G_j be a segmentation into ground-truth segments and I = \cup_{i=1}^{k} S_i be an over-segmentation into superpixels produced by an algorithm. Each quality measure below compares the superpixels S_1, ..., S_k with the best suitable ground truth from the BSD500 database for every image.

Let G(I) = \cup_j ∂G_j be the union of ground-truth boundary pixels and B(I) be the boundary pixels produced by a superpixel algorithm. For a distance ε in pixels, the Boundary Recall BR(ε) is the ratio of ground-truth boundary pixels p ∈ G(I) that are within distance ε of the superpixel boundary B(I).
The Undersegmentation Error

UE = \frac{1}{n} \sum_{j} \sum_{S_i \cap G_j \ne \emptyset} | S_i \setminus G_j |   (5a)

was often used in the past, where | S_i \setminus G_j | is the number of pixels that are in S_i but not in G_j. However, a superpixel is fully penalized even when S_i ∩ G_j is 1 pixel, which required ad hoc thresholds, e.g. the 5% threshold | S_i ∩ G_j | ≥ 0.05 |S_i| by Achanta et al. [1], or ignoring boundary pixels of S_i by Liu et al. [7].

Van den Bergh et al. [2] suggested the more accurate measure, namely the Corrected Undersegmentation Error

CUE = \frac{1}{n} \sum_{i} | S_i \setminus G_{max}(S_i) |,   (5b)

where G_{max}(S_i) is the ground-truth segment having the largest overlap with S_i. Neubert and Protzel [14] introduced the Undersegmentation Symmetric Error

USE = \frac{1}{n} \sum_{j} \sum_{S_i \cap G_j \ne \emptyset} \min\{ in(S_i), out(S_i) \},   (5c)

where in(S_i) is the area of S_i inside G_j and out(S_i) is the area of S_i outside G_j. To keep the graphs readable, Fig. 6 compares SOCS to the 3 past algorithms ETPS, SEEDS, SLIC, which come on top of the others in the evaluations by Stutz et al. [5, Table 3].
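All three undersegmentation measures can be computed from one contingency table of superpixel/ground-truth overlaps. A NumPy sketch with our own function name:

```python
import numpy as np

def under_segmentation_errors(sp, gt):
    """UE (5a), CUE (5b) and USE (5c) from integer label maps `sp`
    (superpixels) and `gt` (ground-truth segments) of the same shape."""
    n = sp.size
    # contingency table: overlap[i, j] = |S_i intersect G_j|
    ks, kg = sp.max() + 1, gt.max() + 1
    overlap = np.zeros((ks, kg), dtype=np.int64)
    np.add.at(overlap, (sp.ravel(), gt.ravel()), 1)
    sizes = overlap.sum(axis=1)                  # |S_i|
    out = sizes[:, None] - overlap               # part of S_i outside G_j
    # UE: for every overlapping pair (i, j), add the part of S_i outside G_j
    ue = out[overlap > 0].sum() / n
    # CUE: leak of S_i outside its best-matching ground-truth segment
    cue = (sizes - overlap.max(axis=1)).sum() / n
    # USE: min(area inside, area outside) per overlapping pair
    use = np.minimum(overlap, out)[overlap > 0].sum() / n
    return ue, cue, use
```

For example, if a 1 × 4 image is split into superpixels [0, 0, 1, 1] against ground truth [0, 0, 0, 1], then UE = 0.5, CUE = 0.25 and USE = 0.5.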
As suggested by Theorem 12, the running time of SOCS is similar to ETPS, at about 1 s on average per BSD image on a laptop with a 2.6 GHz CPU and 8 GB RAM.
6 Summary and discussion of the new SOCS algorithm
The SOCS algorithm has a fast adaptive initialization that is based on persistent edges in an image and can substantially reduce the number of superpixels without compromising the quality of approximation. The new coarse-to-fine optimization quickly converges to a minimum by moving boundary blocks of large sizes and then by subdividing them into smaller blocks, many of which remain stable.
The first theoretical contribution is the formal statement of image over-segmentation as an approximation problem by superpixels in subsection 1.2. The adaptive initialization of superpixels consisting of large rectangular blocks can be used in many other algorithms that start from a coarse uniform grid. The coarse-to-fine optimization has been substantially improved by keeping boundary blocks sorted in a binary tree instead of a linear queue.
The SOCS algorithm outperforms the state-of-the-art on the approximation error (nRMS) and the undersegmentation errors CUE/USE on BSD500 images.
Fig. 6. Each dot is (average number of superpixels, average benchmark) on BSD500.
SOCS0 in red has shape coefficient c= 0, SOCS10 in orange has c= 10, see (3.1d).
Here are the practical advantages of the new SOCS algorithm.
– The output superpixels are connected, because the connectivity is checked in Stage 4 when boundary blocks are updated, which gives an overall speed-up.
– The SOCS algorithm can be stopped at any time after Stage 1, e.g. at any optimization step, because each update needs constant time by Lemma 11.
– The only essential input parameters are the maximum number k of superpixels and the shape coefficient for a trade-off between accuracy and compactness.
– The SOCS algorithm is modular and allows improvements in different parts, e.g. the persistent edges in Stage 1 can be found in another way, and the merging of blocks in Stage 2 can follow another strategy with the same Reconstruction Error.
We are happy to publish the C++ code based on OpenCV and OpenMesh in
September 2017, and thank all anonymous reviewers for their helpful suggestions.
References
1. Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to the state-of-the-art. Transactions PAMI 34 (2012) 2274–2282
2. Van den Bergh, M., Boix, X., Roig, G., Van Gool, L.: SEEDS: superpixels extracted via energy-driven sampling. Int J Computer Vision 111 (2015) 298–314
3. Yao, J., Boben, M., Fidler, S., Urtasun, R.: Real-time coarse-to-fine topologically
preserving segmentation. In: Proceedings of CVPR. (2015) 216–225
4. Arbelaez, P., Maire, M., Fowlkes, C., Malik, J.: Contour detection and hierarchical image segmentation. Transactions PAMI 33 (2011) 898–916
5. Stutz, D., Hermans, A., Leibe, B.: Superpixels: An evaluation of the state-of-the-
art. Computer Vision and Image Understanding (2017)
6. Shi, J., Malik, J.: Normalized cuts and image segmentation. Transactions PAMI
22 (2000) 888–905
7. Liu, M.Y., Tuzel, O., Ramalingam, S., Chellappa, R.: Entropy rate superpixel segmentation. In: Proceedings of CVPR. (2011) 2097–2104
8. Veksler, O., Boykov, Y., Mehrani, P.: Superpixels and supervoxels in an energy
optimization framework. In: Proceedings of ECCV. (2010) 211–224
9. Zhang, Y., Hartley, R., Mashford, J., Burn, S.: Superpixels via pseudo-boolean
optimization. In: Proceedings of ICCV. (2011) 211–224
10. Conrad, C., Mertz, M., Mester, R.: Contour-relaxed superpixels. In: Proc. Energy
Minimization Methods in Computer Vision & Pattern Recognition. (2013) 280–293
11. Li, Z., Chen, J.: Superpixel segmentation using linear spectral clustering. In:
Proceedings of CVPR. (2015) 1356–1363
12. Buyssens, P., Toutain, M., Elmoataz, A., Lézoray, O.: Eikonal-based vertices growing and iterative seeding for efficient graph-based segmentation. In: Int. Conf. Image Processing (ICIP). (2014) 4368–4372
13. Schick, A., Fischer, M., Stiefelhagen, R.: Measuring and evaluating the compactness
of superpixels. In: Proceedings of ICPR. (2012) 930–934
14. Neubert, P., Protzel, P.: Compact watershed and preemptive slic: On improving
trade-offs of superpixel segmentation algorithms. In: Proc. ICPR. (2014) 996–1001
... More recent persistence-based algorithms for graph reconstruction [16,18,19] and image segmentation [10,11,20,21] essentially find most persistent cycles hidden in a cloud, hence go beyond the tree reconstruction problem in section 1. ...
... ASk(C) has little dependence on the branching factor β, e.g. all values in [20, 50] produced almost identical results in Table 10 and Fig. 9. ...
Preprint
The tree reconstruction problem is to find an embedded straight-line tree that approximates a given cloud of unorganized points in ℝ^m up to a certain error. A practical solution to this problem will accelerate the discovery of new colloidal products with desired physical properties such as viscosity. We define the Approximate Skeleton of any finite point cloud C in a Euclidean space with theoretical guarantees. The Approximate Skeleton ASk(C) always belongs to a given offset of C, i.e. the maximum distance from C to ASk(C) is bounded by a given maximum error. The number of vertices in the Approximate Skeleton is within a factor of 2 of the minimum number in an optimal tree. The new Approximate Skeleton of any unorganized point cloud C is computed in near-linear time in the number of points in C. Finally, the Approximate Skeleton outperforms past skeletonization algorithms on the size and accuracy of reconstruction for a large dataset of real micelles and random clouds.
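The Approximate Skeleton itself is beyond a short snippet, but the baseline it improves on is easy to sketch: a Euclidean minimum spanning tree is also an embedded straight-line tree through the cloud, just without the vertex-count and error guarantees above. A minimal O(n²) Prim's algorithm (the function name `euclidean_mst` is ours, not from the paper):

```python
import math

def euclidean_mst(points):
    """Prim's algorithm: Euclidean minimum spanning tree of a 2D cloud.

    Returns a list of edges (i, j) forming a straight-line tree that
    touches every point -- a naive baseline for tree reconstruction.
    """
    n = len(points)
    if n == 0:
        return []
    dist = [math.inf] * n    # cheapest connection cost to the tree
    parent = [-1] * n
    in_tree = [False] * n
    dist[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the closest point not yet in the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: dist[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        ux, uy = points[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.hypot(points[v][0] - ux, points[v][1] - uy)
                if d < dist[v]:
                    dist[v] = d
                    parent[v] = u
    return edges
```

For collinear points the MST is the expected path; on noisy clouds it typically has many spurious branches, which is exactly what the vertex-reduction guarantee of ASk(C) addresses.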
... The traditional over-segmentation problem is to group square pixels into superpixels that adhere to object boundaries. The boundaries of these pixel-based superpixels consist of short horizontal and vertical edges restricted to a given grid, see [17] . ...
Article
The over-segmentation problem is to split a pixel-based image into a smaller number of superpixels that can be treated as indecomposable regions to speed up higher level image processing such as segmentation or object detection. A traditional superpixel is a potentially disconnected union of square pixels, which can have complicated topology (with holes) and geometry (highly zigzag boundaries). This paper contributes new resolution-independent superpixels modeled as convex polygons with straight-line edges and vertices with real coordinates not restricted to a fixed pixel grid. Any such convex polygon can be rendered at any resolution higher than that of the original image, hence the superpixels are resolution-independent. The key difficulty in obtaining resolution-independent superpixels is to find continuous straight-line edges, while classical edge detection focuses on extracting only discrete edge pixels. The recent Persistent Line Segment Detector (PLSD) avoids intersections and small angles between line segments, which are hard to fix before a proper polygonal mesh can be constructed. The key novelty is an automatic selection of the strongest straight-line segments by using the concept of persistence from Topological Data Analysis, which allows segments to be ranked by their strength. The PLSD performed well in comparison with the only past Line Segment Detector Algorithm (LSDA) on the Berkeley Segmentation Database of 500 real-life images. The PLSD is now extended to the Persistent Resolution-Independent Mesh (PRIM).
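The persistence machinery of the PLSD is out of scope here, but its final selection step — keep the strongest segments while discarding any that would cross one already kept — can be sketched with a standard orientation test. The strength scores below are arbitrary inputs standing in for computed persistence, not the paper's actual ranking:

```python
def _orient(p, q, r):
    # sign of the cross product (q - p) x (r - p)
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def _cross(s1, s2):
    # True if segments s1 and s2 properly intersect (cross in their interiors)
    a, b = s1
    c, d = s2
    return (_orient(a, b, c) * _orient(a, b, d) < 0 and
            _orient(c, d, a) * _orient(c, d, b) < 0)

def select_strongest(segments, strengths):
    """Greedily keep segments in decreasing strength, rejecting any
    segment that properly crosses an already kept one."""
    order = sorted(range(len(segments)), key=lambda i: -strengths[i])
    kept = []
    for i in order:
        if all(not _cross(segments[i], segments[k]) for k in kept):
            kept.append(i)
    return kept
```

For example, of two crossing diagonals the weaker one is dropped, while a disjoint third segment survives regardless of its score.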
... The similar Coarse-to-Fine (CtF) algorithm by Yao et al. [19] minimizes the discrete Reconstruction Error from section 4. The recent improvement of this Coarse-to-Fine approach [9] allows the user to make shapes of superpixels more round by giving more weight to an isoperimetric quotient. ...
Chapter
The over-segmentation into superpixels is an important pre-processing step to smartly compress the input size and speed up higher level tasks. A superpixel was traditionally considered as a small cluster of square-based pixels that have similar color intensities and are closely located to each other. In this discrete model the boundaries of superpixels often have irregular zigzags consisting of horizontal or vertical edges from a given pixel grid. However, digital images represent a continuous world, hence the following continuous model in the resolution-independent formulation can be more suitable for the reconstruction problem. Instead of uniting squares in a grid, a resolution-independent superpixel is defined as a polygon that has straight edges with any possible slope at subpixel resolution. The harder continuous version of the over-segmentation problem is to split an image into polygons and find a best (say, constant) color of each polygon so that the resulting colored mesh well approximates the given image. Such a mesh of polygons can be rendered at any higher resolution with all edges kept straight. We propose a fast conversion of any traditional superpixels into polygons and guarantee that their straight edges do not intersect. The meshes based on the superpixels SEEDS (Superpixels Extracted via Energy-Driven Sampling) and SLIC (Simple Linear Iterative Clustering) are compared with past meshes based on the Line Segment Detector. The experiments on the Berkeley Segmentation Database confirm that the new superpixels have more compact shapes than pixel-based superpixels.
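The compactness claim can be made concrete with the isoperimetric quotient 4πA/P² used as the shape regularizer in the main paper. Below is a minimal discrete version, taking area as the pixel count and perimeter as the number of exposed grid edges — one common convention, not necessarily the exact definition used in the experiments:

```python
import math
import numpy as np

def isoperimetric_quotient(labels):
    """Discrete compactness 4*pi*A / P^2 per superpixel label.

    A = pixel count of the region, P = number of unit boundary edges
    (grid edges between the region and a different label or the image
    border). A perfect disk has quotient 1; zigzag shapes score lower.
    """
    labels = np.asarray(labels)
    out = {}
    for lab in np.unique(labels):
        mask = labels == lab
        area = int(mask.sum())
        padded = np.pad(mask, 1, constant_values=False)
        inner = padded[1:-1, 1:-1]
        # count exposed edges in the 4 grid directions
        perim = int((inner & ~padded[:-2, 1:-1]).sum()
                    + (inner & ~padded[2:, 1:-1]).sum()
                    + (inner & ~padded[1:-1, :-2]).sum()
                    + (inner & ~padded[1:-1, 2:]).sum())
        out[int(lab)] = 4 * math.pi * area / perim ** 2
    return out
```

A 4x4 square superpixel has A = 16 and P = 16, giving the quotient π/4 ≈ 0.785 — the discrete maximum for axis-aligned squares, which is why polygonal superpixels with sloped edges can score closer to 1.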
Article
Superpixels group perceptually similar pixels to create visually meaningful entities while heavily reducing the number of primitives. Owing to these properties, superpixel algorithms have received much attention since their naming in 2003. Today, publicly available and well-understood superpixel algorithms have turned into standard tools in low-level vision. As such, and due to their quick adoption in a wide range of applications, appropriate benchmarks are crucial for algorithm selection and comparison. Until now, the rapidly growing number of algorithms as well as varying experimental setups hindered the development of a unifying benchmark. We present a comprehensive evaluation of 28 state-of-the-art superpixel algorithms utilizing a benchmark focusing on fair comparison and designed to provide new and relevant insights. To this end, we explicitly discuss parameter optimization and the importance of strictly enforcing connectivity. Furthermore, by extending well-known metrics, we are able to summarize algorithm performance independent of the number of generated superpixels, thereby overcoming a major limitation of available benchmarks. We also discuss runtime, robustness against noise, blur and affine transformations, implementation details, as well as aspects of visual quality. Finally, we present an overall ranking of superpixel algorithms which redefines the state-of-the-art and enables researchers to easily select appropriate algorithms and the corresponding implementations, which themselves are made publicly available as part of our benchmark at davidstutz.de/projects/superpixel-benchmark/.
Article
We design a new fast algorithm to automatically segment a 2D cloud of points into persistent regions. The only input is a dotted image without any extra parameters, say a scanned black-and-white map with almost closed curves or any image with detected edge points. The output is a hierarchy of segmentations into regions whose boundaries have a long enough life span (persistence) in a sequence of nested neighborhoods of the input points. We give conditions on a noisy sample of a graph under which the boundaries of the resulting regions are geometrically close to the original cycles in the unknown graph.
Conference Paper
Preprocessing a 2D image often produces a noisy cloud of interest points. We study the problem of counting holes in noisy clouds in the plane. The holes in a given cloud are quantified by the topological persistence of their boundary contours when the cloud is analyzed at all possible scales. We design an algorithm to count the holes that are most persistent in the filtration of offsets (neighborhoods) around the given points. The input is a cloud of n points in the plane without any user-defined parameters. The algorithm runs in near-linear time and linear space O(n). The output is the array (number of holes, relative persistence in the filtration). We prove theoretical guarantees for when the algorithm finds the correct number of holes (components in the complement) of an unknown shape approximated by a cloud.
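Counting holes requires 1-dimensional persistence, which is too involved to sketch here, but the same filtration-of-offsets idea in dimension 0 — components merging as disks around the points grow — fits in a short union-find routine. This is an illustration of the persistence concept, not the paper's hole-counting algorithm, and it is O(n²) rather than near-linear:

```python
import math
from itertools import combinations

def component_persistence(points):
    """Scales at which components of a point cloud merge.

    Grow a disk of radius r around every point; two components merge
    when r reaches half the distance between their closest points
    (single linkage). Returns the n - 1 merge scales in increasing
    order. Large gaps between consecutive scales reveal clusters
    with a long life span (persistence).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]) / 2, i, j)
                   for i, j in combinations(range(n), 2))
    scales = []
    for r, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            scales.append(r)  # one component dies at radius r
    return scales
```

Two tight pairs placed far apart merge internally at a small scale and only join each other at a much larger one; the gap in the merge scales is the persistence of the two clusters.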
Article
We design a new fast algorithm to automatically complete closed contours in a finite point cloud on the plane. The only input can be a scanned map with almost closed curves, a hand-drawn artistic sketch or any sparse dotted image in 2D without any extra parameters. The output is a hierarchy of closed contours that have a long enough life span (persistence) in a sequence of nested neighborhoods of the input points. We prove theoretical guarantees when, for a given noisy sample of a graph in the plane, the output contours geometrically approximate the original contours in the unknown graph.
Conference Paper
The over-segmentation problem for images is studied in the new resolution-independent formulation when a large image is approximated by a small number of convex polygons with straight edges at subpixel precision. These polygonal superpixels are obtained by refining and extending subpixel edge segments to a full mesh of convex polygons without small angles and with approximation guarantees. Another novelty is the objective error difference between an original pixel-based image and the reconstructed image with a best constant color over each superpixel, which does not need human segmentations. The experiments on images from the Berkeley Segmentation Database show that new meshes are smaller and provide better approximations than the state-of-the-art.
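The objective error above has a simple closed form once a partition is fixed: the best constant color of each superpixel is its mean color, and the error is the total squared deviation from that reconstruction — the main term of the energy in the abstract. A minimal sketch for pixel-based partitions (function name ours):

```python
import numpy as np

def reconstruction_error(image, labels):
    """Sum of squared color deviations when every superpixel is filled
    with its best constant color (the mean over that superpixel).

    image:  H x W x C float array of colors.
    labels: H x W integer array assigning each pixel to a superpixel.
    Returns (error, reconstructed image).
    """
    image = np.asarray(image, dtype=float)
    recon = np.empty_like(image)
    for lab in np.unique(labels):
        mask = labels == lab
        recon[mask] = image[mask].mean(axis=0)  # best constant color
    error = float(((image - recon) ** 2).sum())
    return error, recon
```

Because the mean minimizes squared deviation, any other constant color per superpixel can only increase the error, which makes this a human-annotation-free objective for comparing over-segmentations.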
Conference Paper
A major insight from our previous work on extensive comparison of superpixel segmentation algorithms is the existence of several trade-offs for such algorithms. The most intuitive is the trade-off between segmentation quality and runtime. However, there exist many more between these two and a multitude of other performance measures. In this work, we present two new superpixel segmentation algorithms, based on existing algorithms, that provide better balanced trade-offs. Better balanced means that we increase one performance measure by a large amount at the cost of slightly decreasing another. The proposed new algorithms are expected to be more appropriate for many real-time computer vision tasks. The first proposed algorithm, Preemptive SLIC, is a faster version of SLIC, running at frame rate (30 Hz for image size 481x321) on a standard desktop CPU. The speed-up comes at the cost of slightly worse segmentation quality. The second proposed algorithm is Compact Watershed. It is based on Seeded Watershed segmentation, but creates uniformly shaped superpixels similar to SLIC in about 10 ms per image. We extensively evaluate the influence of the proposed algorithmic changes on the trade-offs between various performance measures.