Appears in ACM Transactions on Graphics (special issue for SIGGRAPH Asia 2009)
Patch-Based Image Vectorization with Automatic Curvilinear Feature Alignment
Tian Xia    Binbin Liao    Yizhou Yu
University of Illinois at Urbana-Champaign
Figure 1: MAGNOLIA. Left: original image. Mid left: reconstructed image from our vector-based representation. Mid right: magnification
(×4) of the enclosed area in mid left. Right top: magnification of the enclosed area in mid right (original image ×8). Right bottom:
magnification (×8) using bicubic interpolation.
Abstract
Raster image vectorization is increasingly important since vector-
based graphical contents have been adopted in personal computers
and on the Internet. In this paper, we introduce an effective vector-
based representation and its associated vectorization algorithm for
full-color raster images. There are two important characteristics
of our representation. First, the image plane is decomposed into
nonoverlapping parametric triangular patches with curved bound-
aries. Such a simplicial layout supports a flexible topology and fa-
cilitates adaptive patch distribution. Second, a subset of the curved
patch boundaries are dedicated to faithfully representing curvilin-
ear features. They are automatically aligned with the features. Be-
cause of this, patches are expected to have moderate internal vari-
ations that can be well approximated using smooth functions. We
have developed effective techniques for patch boundary optimiza-
tion and patch color fitting to accurately and compactly approxi-
mate raster images with both smooth variations and curvilinear fea-
tures. A real-time GPU-accelerated parallel algorithm based on re-
cursive patch subdivision has also been developed for rasterizing a
vectorized image. Experiments and comparisons indicate our im-
age vectorization algorithm achieves a more accurate and compact
vector-based representation than existing ones do.
CR Categories: I.3.3 [Computer Graphics]: Picture/Image Gen-
eration, Graphics Utilities; I.3.5 [Computer Graphics]: Computa-
tional Geometry and Object Modeling—Boundary representations
Keywords: vector graphics, curvilinear features, mesh simplifica-
tion, thin-plate splines
e-mail: {tianxia2, liao17, yyz}@illinois.edu
1 Introduction
Vector-based graphical contents have been increasingly adopted in
personal computers and on the Internet. This is witnessed by re-
cent desktop operating systems, such as Windows Vista, and mul-
timedia frameworks for rich internet applications, such as Adobe
Flash. Needless to say, vector-based drawing tools, such as
Adobe Illustrator and CorelDraw, continue to enjoy immense popu-
larity. Such a wide adoption is due to the fact that vector graphics is
compact, scalable, editable and easy to animate. Compactness and
scalability also make vector graphics well suited for high-definition
displays.
Since raster images are likely to remain the dominant format for
raw data acquired by imaging devices, raster image vectorization
is going to be of increasing importance. A powerful vector-based
image representation and its associated operations need to exhibit
the following traits:
Representation Power Vector graphics have been typically
used for encoding abstract visual forms such as fonts, charts, maps
and cartoon arts. There has been a recent trend to enhance the repre-
sentation power of vector graphics so that they can more faithfully
represent full-color raster images. Such images have regions with
smooth color and shading variations as well as curvilinear edges
(features) across which there exist rapid color or intensity changes.
A primary goal is to design powerful vector primitives that can ac-
curately approximate raster images with as few vector primitives as
possible. Since curvilinear features are important visual cues de-
lineating silhouettes and occluding contours, they help users better
interpret images. It is essential to use dedicated vector primitives,
such as curves, to represent such features. In addition, dedicated
primitives can significantly alleviate aliasing along curvilinear fea-
tures, and make it more convenient to create layers or boundaries
by cutting the image open along these features.
Automatic Vectorization From a user’s perspective, vectoriz-
ing an existing raster image should be hassle free. Almost all the
details should be taken care of automatically except for very few
tunable parameters.
Responsive Rasterization On the other hand, rasterizing a
vector-based image on display devices should not be much slower
than opening a regular raster image.
In this paper, we introduce an effective vector-based image
Figure 2: Vectorization pipeline. (a) Original image. (b) Automatically detected curvilinear features. (c) Part of the triangular surface mesh constructed from (a) and (b), corresponding to the enclosed part in (a). (d) The coarsened mesh. (e) 2D projection of (d), serving as the base domain. (f) The set of initial 2D triangular Bézier patches. (g) Non-overlapping patches optimized from (f). (h) The final vector-based reconstruction by thin-plate spline color fitting on each patch.
representation and its associated image vectorization algorithm that
meets the aforementioned requirements. There are two impor-
tant characteristics of our representation. First, the image plane
is decomposed into a set of nonoverlapping parametric triangu-
lar patches with curved boundaries. Such a simplicial layout of
patches facilitates adaptive patch distribution and supports a flex-
ible topology with multiple boundaries. Second, a subset of the
curved patch boundaries in the decomposition are dedicated to rep-
resenting curvilinear features. They are automatically aligned with
the features. Because of this, patches are expected to have moder-
ate internal variations that can be well approximated using smooth
functions. Thus, our vector-based representation can accurately and
compactly approximate raster images with both smooth variations
and curvilinear features.
We have developed fully automatic algorithms for raster image vec-
torization and vectorized image rasterization for our vector-based
representation. There are a few notable aspects of these algorithms.
First, each color channel of the input image is represented geomet-
rically as a triangle mesh. Automatically detected curvilinear fea-
tures are explicitly represented as boundaries in the mesh by cutting
the mesh open along them. Second, a compact vector-based repre-
sentation needs to have as few patches as possible. We achieve this
goal with an adapted feature-preserving mesh simplification algo-
rithm. Mesh simplification is also capable of preserving weak fea-
tures that have been missed during automatic feature detection and
aligning them with patch boundaries. Relying on both feature de-
tection and mesh simplification, our method achieves robust feature
alignment with curved patch boundaries. Third, we develop an ef-
fective nonlinear optimization method for computing B´ezier curves
as patch boundaries with respect to an additional constraint that
prevents intersections among different B´ezier curves. We further
perform thin-plate spline fitting to accurately represent the color
variations within each patch. Fourth, a real-time GPU-based par-
allel algorithm is developed for rasterizing vectorized images. It is
based on recursive subdivision of the 2D B´ezier patches. Experi-
ments and comparisons indicate our image vectorization algorithm
can achieve a more accurate and compact vector-based representa-
tion than existing ones.
2 Related Work
There is a large body of literature on vectorization of non-
photographic images [Chang and Hong 1998; Zou and Yan 2001;
Hilaire and Tombre 2006]. These images, often in the form of car-
toon arts, maps, engineering and hand drawings, are line- or curve-
based and regions in-between curvilinear features are filled with
uniform colors or color gradients. Because of this, algorithms are
mainly designed for contour tracing, line pattern recognition and
curve fitting. A novel representation for random-access rendering
of antialiased vector graphics on the GPU has been introduced in
[Nehab and Hoppe 2008]. It has the ability to map vector graphics
onto arbitrary surfaces, or under arbitrary deformations.
Recent work sees an increasing interest in vectorization of full-
color raster images. Compared to line drawings, these images need
a more powerful vector representation that accounts for color vari-
ations across the image space in addition to curvilinear features.
Several software packages, such as VectorEye, Vector Magic, and AutoTrace,
have been developed for automatic conversion from bitmap to vec-
tor graphics. Also available are commercial tools (CorelDRAW,
Adobe Live Trace, etc.) that help the user design and edit vector-
based images.
Existing vectorization techniques roughly fall into three categories.
A few algorithms are based on constrained Delaunay triangulation.
For example, [Lecot and Levy 2006] developed the ArDeco sys-
tem for image vectorization and stylization. It decomposes an image
into a set of triangles, and each triangle is filled with pre-integrated
gradient. [Swaminarayan and Prasad 2006] triangulates an image
using only pixels on detected edges. Each triangle is then assigned
a color by sparse sampling, resulting in blurred, stylish regions.
Delaunay triangulation is also used for the image compression al-
gorithm in [Demaret et al. 2006], where an image is approximated
using a linear spline over an adapted triangulation. The adap-
tive approach well preserves features of an image. Overall, these
triangulation-based algorithms require a large number of triangles
to approximate detailed color variations due to the limited repre-
sentation power of the color functions defined over each triangle.
In addition, each curvilinear feature needs to be approximated by
a large number of short line segments. Our technique overcomes
these problems and achieves similar or better reconstruction re-
sults with far fewer vector primitives by using more powerful spline
patches with curved boundaries.
The second category of techniques aims for a more editable and
flexible vector representation. These techniques normally use a
higher-order parametric surface mesh. [Price and Barrett 2006]
presented an object-based vectorization method, in which a cubic
Bézier grid is generated for each selected image object by recursive
graph-cut segmentation and an error-driven subdivision. The data
fitting capability of cubic B´ezier patches prevents a very sparse rep-
resentation. There are typically many tiny patches in rapidly chang-
ing regions. The idea of optimized gradient mesh was introduced
in [Sun et al. 2007]. A gradient mesh consists of a rectangular grid
of Ferguson patches. An input image is first segmented into several
sub-objects, and a gradient mesh is then optimized to fit each sub-
object. The technique in [Sun et al. 2007] involves manual mesh
initialization which aligns mesh boundaries with salient image fea-
tures. Such user-assisted mesh placement can be time-consuming
for an image with a relatively large number of features. The most
recent work in [Lai et al. 2009] proposes an automatic technique to
align the boundary of an entire gradient mesh with the boundary of
an object layer. However, the rectangular arrangement of patches
still imposes unnecessary restrictions to achieve a highly adaptive
spatial layout. As a result, it is still challenging to automatically
align internal patch boundaries with detailed features inside the ob-
ject layer. In comparison, our technique automatically aligns patch
boundaries with all curvilinear features. In addition, a triangular de-
composition offers a flexible simplicial layout that makes it easier
to adaptively distribute patches.
Another class of techniques adopts a mesh-free representation and
differs from the first two categories in that it treats the color varia-
tion as a propagation from detected curvilinear features. Diffusion
curves, proposed in [Orzan et al. 2008], creates curves with color
and blur attributes, and models the color variation as a diffusion
from these curves by solving a Poisson equation. This technique is
particularly well suited for interactive drawing of full-color graph-
ics and is a significant leap from other vector-based drawing tools.
However, it has limitations when vectorizing raster images. First,
it is well known that any solution of the Poisson equation performs
membrane interpolation while color variations in raster images may
not satisfy this condition especially in regions with relatively sparse
features. Second, automatically detected edges have undesirable
spatial distributions. While some regions may have overly dense
edges, others may have too few edges to accurately approximate
the original image. Third, edge-based representation does not guar-
antee closed regions, making it infeasible to perform region-based
color or shape editing. In comparison, our technique is less depen-
dent on edge detection and fits a thin-plate spline to pixel colors
within a patch to achieve a more faithful reconstruction.
3 Vector-Based Image Representation
In our vector-based representation, the entire image plane is decom-
posed into a set of non-overlapping triangular regions with curved
boundaries. We model every triangular region as a 2D triangular
Bézier patch. Note that any boundary of a 2D triangular Bézier
patch is a 2D B´ezier curve. Every color channel over each 2D
Bézier patch is represented separately as a scalar function using a
distinct analytical formulation. We chose to adopt thin-plate splines
to represent these color channels. The thin-plate splines are defined
over the parametric domain of their corresponding B´ezier patch.
Curved boundaries of 2D B´ezier patches lend significant model-
ing power to the vector-based representation. Compared to straight
line segments used in existing triangulation-based approaches, its
advantage is twofold: a smaller number of patches and a more
Figure 3: (a) Triangulation and feature detection at pixel resolution: black dots are pixels on a detected feature. (b) Candidates at subpixel resolution: one of the two candidate lines sandwiching the detected feature is where the true discontinuity lies. (c) Re-triangulation of the affected area (blue triangles): new subpixels (yellow dots) along the selected candidate are inserted. (d) Subpixel re-triangulation patterns: gray dots are pixels; yellow and orange dots are newly inserted subpixels belonging to different features.
efficient color fitting within each patch. When directly applying
constrained Delaunay triangulation, a considerably larger number
of line segments are needed to approximate a curvilinear feature,
resulting in a larger number of triangles. Our method can approxi-
mate the same feature at the same precision with much fewer curved
segments. Since color discontinuities are detected as features
and automatically aligned with patch boundaries in our method,
smoother gradients and transitions are left inside patches. Exempt
from approximating high frequency information, our patch-wise
color fitting scheme, i.e., thin-plate splines, proves to be very ef-
fective. In the absence of a feature at a patch boundary, the two
neighboring patches sharing the boundary have a continuous color
transition across the boundary.
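To make the primitive concrete, the following is a minimal sketch (in Python, with data-layout assumptions of our own rather than code from the paper) of evaluating a cubic 2D triangular Bézier patch from its ten control points; every boundary of such a patch is automatically a cubic Bézier curve.

```python
# Illustrative sketch (not the paper's code): evaluating a cubic 2D triangular
# Bezier patch B(u, v, w), u + v + w = 1, from its 10 control points.
# Control points are indexed by (i, j, k) with i + j + k = 3.
from math import factorial
import numpy as np

def bernstein(i, j, k, u, v, w):
    """Trivariate Bernstein polynomial of degree i + j + k."""
    n = i + j + k
    coef = factorial(n) // (factorial(i) * factorial(j) * factorial(k))
    return coef * (u ** i) * (v ** j) * (w ** k)

def eval_patch(ctrl, u, v):
    """ctrl: dict {(i, j, k): np.array([x, y])} with i + j + k = 3."""
    w = 1.0 - u - v
    point = np.zeros(2)
    for (i, j, k), p in ctrl.items():
        point += bernstein(i, j, k, u, v, w) * p
    return point

# A patch whose control points lie on a triangle reproduces that triangle.
corners = {(3, 0, 0): np.array([0.0, 0.0]),
           (0, 3, 0): np.array([1.0, 0.0]),
           (0, 0, 3): np.array([0.0, 1.0])}
ctrl = {(i, j, k): (i * corners[(3, 0, 0)] + j * corners[(0, 3, 0)] +
                    k * corners[(0, 0, 3)]) / 3.0
        for i in range(4) for j in range(4 - i) for k in [3 - i - j]}
print(eval_patch(ctrl, 1/3, 1/3))   # -> the triangle centroid (1/3, 1/3)
```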
4 Image Vectorization Based on Triangular
Decomposition
This section describes our image vectorization technique, and we
choose the RGB color space for the vectorization in this paper. For
a full-color raster image, every color channel is considered as a
height field and converted into a distinct 3D triangle mesh, which
we call a channel mesh. A feature-preserving mesh simplification is
then performed collectively on all channel meshes, so their resultant
coarse meshes always have the same topology, i.e. the same num-
ber of vertices and edges, and project to the same configuration of
non-overlapping 2D triangles on the original image plane. The 2D
projection of the triangles in the simplified mesh serves as the base
domain for B´ezier patch formulation and optimization. 2D triangu-
lar B´ezier patches are computed for every triangle in the base do-
main. We approximate the color variations over each B´ezier patch
with a thin-plate spline for every color channel. The main proce-
dure is sketched in Fig. 2.
4.1 Initial Mesh Construction
An image is first triangulated at pixel resolution on the 2D image
plane. Pixels are connected row- and column-wise resulting in an
image grid. Every rectangular cell in the grid is divided into two
triangles by either of the two diagonals. Since we would also like
to preserve important curvilinear image features as much as possi-
ble during vectorization, it is necessary to describe the features at
subpixel resolution in the triangulation. We start by delineating im-
portant image features, i.e. color discontinuities, with image-space
Figure 4: (a) Triangulated 2D image plane; (b) channel mesh. A pixel in (a) (magenta dot) is lifted to a vertex in (b). A subpixel in (a) (blue dot) is split into a pair of dual vertices in (b). A subpixel feature in (a) (green line) becomes a pair of dual features in (b) (red and yellow lines), forming a hole in the mesh.
edge detection using the Canny detector [Canny 1986]. For most of
our experiments, we use a low threshold of 0.05 and a high thresh-
old of 0.125. Detected features are thinned to one pixel wide, linked,
and stripped of T-junctions [Kovesi]. Features shorter than 10 pix-
els are discarded. Since a color discontinuity is associated with at
least two neighboring pixels with very different color intensities, a
detected feature pixel could be located either at the high end or low
end of the discontinuity, and a true subpixel feature should be some-
where between the two extremes. In other words, the true subpixel
feature could be on either side of the detected feature as shown in
Fig. 3(b). We compare image gradients on both sides of the detected
feature and, on the side with the larger gradient, a true feature edge
at subpixel resolution is traced half-pixel away from the detected
feature (Fig. 3(c)). The process of inserting subpixels is similar to
that of the marching cubes algorithm [Lorensen and Cline 1987].
Once a new subpixel is created in the middle of a triangle edge, any
triangle affected by such edge subdivision is re-triangulated (Fig.
3(d)).
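As a rough illustration of the detection step described above, the sketch below uses scikit-image as a stand-in (the paper relies on the Canny detector plus Kovesi's MATLAB functions for linking and T-junction removal, which are omitted here). Threshold semantics differ between implementations, so the 0.05/0.125 values may need rescaling, and the input file name is hypothetical.

```python
# A minimal sketch of the feature-detection step: Canny detection, thinning to
# one-pixel width, and discarding features shorter than 10 pixels.
import numpy as np
from skimage import io, color, feature, morphology, measure

def detect_features(path, low=0.05, high=0.125, min_len=10):
    gray = color.rgb2gray(io.imread(path))
    edges = feature.canny(gray, sigma=1.0,
                          low_threshold=low, high_threshold=high)
    edges = morphology.skeletonize(edges)          # thin to 1-pixel width
    labels = measure.label(edges, connectivity=2)  # group pixels into chains
    keep = np.zeros_like(edges, dtype=bool)
    for region in measure.regionprops(labels):
        if region.area >= min_len:                 # drop short features
            keep[labels == region.label] = True
    return keep   # binary mask of retained curvilinear features

# features = detect_features("input.png")   # hypothetical usage
```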
Once we have a 2D triangulation with both pixels and subpixels,
we elevate it to a triangular surface mesh, called a channel mesh,
for every color channel. Every pixel in the image space is lifted to a
3D vertex on the mesh. The image coordinates of the pixel are interpreted
as the x- and y-coordinates, and the color value at the pixel as the
z-coordinate. Every subpixel, however, is split into a pair of dis-
connected dual vertices, which have the same x and y coordinates
but are assigned different z coordinates. As depicted in Fig. 4, the
split virtually transforms a subpixel feature in 2D triangulation into
a pair of dual features, each representing one side of the original
feature. Reflected in the channel mesh, the split cuts the mesh open
and leaves a hole in the mesh for every feature. Each dual vertex
of the pair carries different attributes associated with one side of
the feature. More precisely, each dual vertex gets half of the orig-
inal connectivity and is assigned the color of its nearest pixel on
the same side. To smooth out noise, an optional fairing operation
[Taubin 1995] on the z-coordinate can be performed within a nar-
row neighborhood (normally 1-2 pixels) of dual features.
Vertex split is crucial in avoiding undesired blurriness along edges
during vectorization. It effectively preserves the color discontinu-
ities by converting detected features into mesh boundaries, which
are prevented from collapsing during base domain computation
(Section 4.2). Also worth noting is that, throughout the initial mesh
construction and base domain computation, we ensure all channel
meshes differ only in the z coordinate at each corresponding vertex.
Their projections on the original image plane always correspond to
the same 2D triangulation. This important property guarantees a
consistent color estimation across different color channels during patch
fitting (Section 4.4).
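A minimal sketch of the lifting step, under our own data-layout assumptions: each color channel becomes a height field whose pixels are lifted to 3D vertices over a grid triangulation. The subpixel dual-vertex split that cuts the mesh open along features is omitted for brevity.

```python
# Sketch (not the authors' code): lifting one color channel of an H x W image
# to a "channel mesh". Each pixel (x, y) becomes a vertex (x, y, value); each
# grid cell is split into two triangles along one diagonal.
import numpy as np

def build_channel_mesh(channel):
    h, w = channel.shape
    ys, xs = np.mgrid[0:h, 0:w]
    vertices = np.stack([xs.ravel(), ys.ravel(), channel.ravel()], axis=1)

    idx = lambda y, x: y * w + x
    faces = []
    for y in range(h - 1):
        for x in range(w - 1):
            a, b = idx(y, x), idx(y, x + 1)
            c, d = idx(y + 1, x), idx(y + 1, x + 1)
            faces.append((a, b, c))   # one diagonal choice per cell
            faces.append((b, d, c))
    return vertices.astype(float), np.array(faces)

# verts, tris = build_channel_mesh(image[:, :, 0])  # e.g. the red channel
```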
4.2 Base Domain Computation
We aim to approximate the entire image with a sparse set of patches,
so channel meshes must be simplified to a coarser resolution. Mesh
simplification [Hoppe et al. 1993; Garland and Heckbert 1997] can
be thought of as a process that clusters nearby triangles in the orig-
inal mesh into regions and represents each of the regions using one
triangle in the simplified mesh. In the same vein as [Garland and
Heckbert 1997], we apply the highly efficient simplification algo-
rithm based on the quadric error metric. Even though the quadric
error metric is not specifically related to thin-plate splines that we
use for color fitting, it measures accumulated squared distances to
a set of planes and only merges triangles in the same flat region on
the mesh. Thus, it is capable of preserving weak features that have
been missed during automatic feature detection and aligning them
with region boundaries.
(a) dual edge collapse (b) left: valid; right: flipped (c) intersection
Figure 5: (a) The pair of orange edges are contracted at the same time. When v_i (blue dot) collapses into v_j (the magenta dot), the dual vertex of v_i (the other blue dot) also collapses into the other magenta dot. (b) The right mesh is a flipped configuration of the left one, with a fold-over of the dashed triangle. (c) An intersection between a boundary Bézier curve and its immediate neighbor edge, rendering the mesh invalid.
Recall that when constructing initial channel meshes, automatically
detected features in image space are incorporated as mesh bound-
aries. We therefore tailor the simplification algorithm as follows.
1. All channel meshes are coarsened in a synchronized manner
by applying synchronized edge contractions on all channel meshes.
We define the error associated with an edge as the accumulated
quadric error over that corresponding edge of all channel meshes.
2. When two boundary vertices contract, their corresponding dual
vertices must contract at the same time (Fig. 5(a)). We define the
error of either edge as the average of the pair. This is to ensure a
seamless projection of the mesh onto the image plane.
3. Important features should be kept intact during simplification.
For an edge contraction, we use one of the edge’s two original ver-
tices as the new vertex position. Interior vertices are allowed to col-
lapse both among themselves and into boundary vertices. Bound-
ary vertices, however, are prevented from being absorbed by interior
vertices. Furthermore, vertices on the same mesh boundary are al-
lowed to collapse among themselves while contracting to vertices
on other boundaries is disabled.
4. At any step during simplification, the 2D projection of any chan-
nel mesh should be free of fold-overs. In addition, 2D projections of
mesh boundaries are automatically fitted with Bézier curves, and
non-boundary edges should not intersect with these Bézier curves
when projected onto the image plane. These requirements guar-
antee the resulting triangular decomposition of the image plane is
valid. The use of mesh simplification with additional modifica-
tions to generate a triangular base domain is primarily motivated
by the fact that constrained Delaunay triangulation (CDT) cannot
directly handle B´ezier curves as constraints. CDT typically dis-
cretizes curves into short line segments.
We impose extra checks when each edge contraction is carried out
to enforce the above requirements. After a tentative edge contrac-
tion, we project relevant neighborhoods in the resulting mesh onto
the image plane. If a reversed ordering (a flip) is detected in the
1-ring neighborhood of a vertex (Fig. 5(b)), we roll back the con-
traction, penalize the edge by adding an extra error to the origi-
nal quadric error, and resume from the next edge with the mini-
mum error. The same measure also applies to planar B´ezier curve
fitting of subpixel image features. As illustrated in Fig. 6, if the
edge contraction occurs on mesh boundaries, we re-fit a single new
planar curve to the x and y coordinates of the boundary vertices
previously represented using two adjacent curves. If any newly fit
Bézier curve fails to approximate the original polyline feature seg-
ment within a predefined error threshold (our choice is 1 pixel), or it
intersects with any of its immediate neighboring edges (Fig. 5(c)),
we roll back the tentative contraction and restart at the next edge
with the smallest error. For better magnification results, control
points may be slightly adjusted to enforce G¹ continuity between
adjacent curves. In all our experiments, we use planar cubic B´ezier
curves for boundary fitting.
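The fold-over test can be illustrated as follows (a sketch of our own, not the authors' code): after a tentative contraction, every triangle in the projected 1-ring of the surviving vertex must keep a positive signed area, i.e. its original orientation.

```python
# Fold-over (flip) check on the 2D projection after a tentative contraction.
import numpy as np

def signed_area(a, b, c):
    """Twice the signed area of the 2D triangle (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def contraction_causes_flip(one_ring_triangles):
    """one_ring_triangles: list of 2D triangles (3x2 arrays) in the projected
    1-ring after the tentative contraction has been applied."""
    return any(signed_area(*np.asarray(tri)) <= 0.0
               for tri in one_ring_triangles)

# If this returns True, the contraction is rolled back, the edge is penalized,
# and simplification resumes from the next-cheapest edge.
```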
(a) before (b) after
Figure 6: The magenta polyline denotes the shape of the original boundary from v_A to v_C. (a) Before v_B collapses into v_C, there are three vertices (v_A, v_B, and v_C) on the boundary. It is fitted with two separate Bézier curves, one from v_A to v_B and the other from v_B to v_C. (b) After the collapse, the boundary is fitted with a single new curve from v_A to v_C.
During simplification we also keep track of vertices being contracted by associating each edge with an absorbed list. Consider an arbitrary edge e_{ij} = (v_i, v_j) and the 1-ring neighborhood of v_i, N(v_i). Initially, the absorbed list L_{ij} of e_{ij} consists of its two endpoints, L_{ij} = {v_i, v_j}. As sketched in Fig. 7, once e_{ij} is contracted, i.e. v_i collapses into v_j, all other incident edges of the disappearing vertex v_i, {e_{ki} : v_k ∈ N(v_i), v_k ∉ N(v_j), k ≠ j}, need to be updated by replacing v_i with v_j as their new endpoint. e_{ki} thus becomes e_{kj}, and L_{ij} is appended to L_{ki}, becoming the new absorbed list L_{kj} := L_{ki} + L_{ij}. Note that the new list is always ordered such that its first and last entries are the two endpoints of the edge, L_{jk} = {v_j, ..., v_i, ..., v_k}. By the time the algorithm terminates, each existing edge will have collected a set of contracted vertices.
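A small sketch of this bookkeeping, with a data layout of our own choosing (edges as frozensets, neighbor sets per vertex); edges whose endpoints are shared neighbors of both v_i and v_j are skipped, since they merge with existing edges and are handled separately.

```python
# Absorbed-list bookkeeping during an edge contraction (illustrative only).
def orient(lst, first):
    """Return lst oriented so that lst[0] == first."""
    return lst if lst[0] == first else lst[::-1]

def contract_edge(vi, vj, neighbors, absorbed):
    """Collapse vi into vj. neighbors: dict vertex -> set of adjacent vertices.
    absorbed: dict frozenset({a, b}) -> ordered list of absorbed vertices."""
    lij = orient(absorbed.pop(frozenset((vi, vj)), [vi, vj]), vi)  # [vi, ..., vj]
    for vk in list(neighbors[vi]):
        if vk == vj or vk in neighbors[vj]:
            continue                                   # skip shared neighbors
        lki = orient(absorbed.pop(frozenset((vk, vi)), [vk, vi]), vk)  # [vk, ..., vi]
        # e_ki becomes e_kj; its list becomes {v_k, ..., v_i, ..., v_j}
        absorbed[frozenset((vk, vj))] = lki + lij[1:]
        neighbors[vk].discard(vi); neighbors[vk].add(vj)
        neighbors[vj].add(vk)
    neighbors[vj].discard(vi)
    del neighbors[vi]
    return neighbors, absorbed
```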
Final coarse meshes are projected back to the original image plane. The set of projected triangles on the original image plane is the base domain M_b = (V_b, E_b, F_b). For an edge e_b ∈ E_b with endpoints v^b_i, v^b_j ∈ V_b, there must exist a path connecting these endpoints on the original mesh. Instead of directly searching for such a path, we make use of the ordered vertices in the absorbed list of e_b to generate an initial path between v^b_i and v^b_j. This initial path serves as the starting point for Bézier curve optimization. For every base triangle f_b ∈ F_b, we will formulate a triangular Bézier patch.
(a) before contraction (b) after contraction
Figure 7: Edge contraction. The edge e_{ij} = (v_i, v_j) (black) in (a) is contracted, causing v_i to collapse into v_j. The three surviving edges (colored in blue) originally incident to v_i update their absorbed lists by appending the absorbed list of e_{ij} to their own.
4.3 Bézier Patch Optimization
The base domain, M_b = (V_b, E_b, F_b), covers the entire image
plane. The image will finally be decomposed into a set of non-
plane. The image will finally be decomposed into a set of non-
overlapping triangular B´ezier patches. There is a one-to-one cor-
respondence between the base triangles and B´ezier patches. For
each B´ezier patch, the location of its three corners coincides with
the location of the three vertices of its corresponding base triangle.
Curved patch boundaries, however, are yet to be computed from
edges in the base domain. Note that some of these edges corre-
spond to mesh boundaries. Since mesh boundaries are fitted with
Bézier curves on the image plane during mesh simplification, these
curves are B´ezier patch boundaries already and do not need further
processing. Other edges, along with their associated traced paths
(Fig. 8(a)), will be used for computing B´ezier patch boundaries.
We compute patch boundaries by formulating it as an optimization
problem. Before being used for optimization, the traced path of an
edge in the base domain has to be pruned clean of branches or small
loops (Fig. 8(b)). Then it is fitted with a B´ezier curve that specifies
the initial boundary position. These curves, as shown in Fig. 8(c),
normally intersect with each other, resulting in a set of folding and
overlapping B´ezier patches. Our goal is to improve upon them to
generate a set of non-overlapping patches while boundaries are kept
as close as possible to where originally traced paths reside. Inspired
by [Branets and Carey 2005], this overall goal is formulated as a
nonlinear optimization which is solved iteratively. Within each iter-
ation, we optimize every patch boundary in a sequential order. The
optimization of a single patch boundary is elaborated as follows.
Consider a boundary Bézier curve C shared by two triangular patches B_1 and B_2. When optimizing C, we seek the optimal positions of C's control points (excluding the two endpoints) as well as the control points over the interiors of B_1 and B_2, while fixing the control points of the other two boundaries of these patches. If there are intersections between C and other boundary curves of B_1 and B_2, then there exists a point p_i ∈ B_1 ∪ B_2 with |J(p_i)| ≤ 0, where J(p_i) is the Jacobian at point p_i (see the footnote below). On the other hand, the Jacobian at any point within a base triangle is always the same and its determinant is always positive. We regard a Bézier patch's corresponding base triangle as its "ideal" reference and try to optimize the determinant of its Jacobian towards that of the base triangle. To make the optimization more tractable, instead of enforcing a positive determinant of the Jacobian everywhere in B_1 ∪ B_2, we try to enforce a positive determinant of the Jacobian at a dense set of sample points in B_1 ∪ B_2. The sample points are uniformly drawn within the parametric domains of B_1 and B_2 as well as on the curve C.
Another factor to integrate into the objective function is the distance between the current boundary C and the initially traced path C_0. We would like to keep C close to C_0. C and C_0 are both uniformly sampled, and the distance is estimated as the sum of distances between every pair of corresponding sample points. We summarize the objective function as
\[
\sum_{k=1,2} \sum_{p_i \in B_k} w_i \left( |J_{B_k}(p_i)| - |J^b_{B_k}| \right)^2
\;+\; \sum_{p_j \in C,\; p^0_j \in C_0} w_d \,\| p_j - p^0_j \|^2 , \qquad (1)
\]
where we denote by J^b_{B_k} the Jacobian of the base triangle associated with patch B_k, and p_i represents a sample point over the patch B_k. w_i = w_p if |J_{B_k}(p_i)| > 0, and w_i = w_n if |J_{B_k}(p_i)| ≤ 0.

¹ Let B(u, v, w) = (x(u, v, w), y(u, v, w))^T, where u + v + w = 1, be a 2D triangular Bézier patch. The Jacobian of B is defined as the 2 × 2 matrix
\[
J = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix}.
\]
(a) traced paths (b) path pruning (c) initial fitting
Figure 8: An initial Bézier patch. (a) The base triangle. Every edge is associated with a traced path. (b) Traced paths are pruned clean of branches and self-loops. (c) Pruned paths are fitted with Bézier curves, which may introduce inter-curve intersections and neighboring patch overlaps.
(a) features (b) lines (c) zoom in: feature blurred
(d) curves (e) zoom in: feature preserved
Figure 9: Curved boundaries vs. line segments. (a) Part of the
object silhouette is not detected due to a weak contrast. (b) Use
triangle domain directly. The straight line segments do not align
with the object boundary. ( c) The result is a blurred boundary re-
gion. ( d) Boundaries are optimized from traced paths into curves,
roughly following the object boundaries. (e) The undetected fea-
tures are preserved in the result.
w_p, w_n, and w_d are the weighting factors for a Jacobian with a positive determinant, a Jacobian with a negative determinant, and the distance term, respectively. In all our experiments, w_p = 0.2, w_n = 1, and w_d = 5. As mentioned earlier, the unknowns in the above objective function include C's control points (excluding the two endpoints) as well as the control points over the interiors of B_1 and B_2. Every sample position p_i is expressed in terms of these unknown control points, and the determinant of the Jacobian is a function of the same set of unknowns.
Since we define two different Jacobians for any sample point on a
boundary curve, one for each adjacent patch, the objective function
(Eq. 1) is C² and eligible to be solved by the BFGS algorithm, a
quasi-Newton method that approximates the second derivatives of
the function using the difference between successive gradient vec-
tors. In practice, we choose the relatively faster implementation
bfgs2 in GSL [Fletcher 1987]. In all our experiments, we have suc-
cessfully resolved all boundary intersections using 2D cubic B´ezier
curves as patch boundaries.
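The sketch below illustrates the structure of this optimization under several simplifying assumptions of our own: only one of the two patches sharing the curve is shown, the free control points are packed into a flat vector, and scipy's BFGS stands in for GSL's bfgs2. It is meant to show how the Jacobian-determinant and distance terms of Eq. (1) are assembled, not to reproduce the authors' implementation.

```python
# Simplified sketch of the Eq. (1) energy for one cubic triangular Bezier patch.
import numpy as np
from math import factorial
from scipy.optimize import minimize

DEG = 3
IDX = [(i, j, DEG - i - j) for i in range(DEG + 1) for j in range(DEG + 1 - i)]
COEF = [factorial(DEG) / (factorial(i) * factorial(j) * factorial(k))
        for i, j, k in IDX]

def patch_point(ctrl, u, v):
    """Evaluate the cubic patch (ctrl: (10, 2) array ordered like IDX)."""
    w = 1.0 - u - v
    return sum(c * u**i * v**j * w**k * p
               for c, (i, j, k), p in zip(COEF, IDX, ctrl))

def jac_det(ctrl, u, v):
    """Determinant of the 2x2 Jacobian [dB/du, dB/dv] at (u, v)."""
    w = 1.0 - u - v
    J = np.zeros((2, 2))
    for c, (i, j, k), p in zip(COEF, IDX, ctrl):
        du = (i * u**(i-1) * v**j * w**k if i else 0.0) - \
             (k * u**i * v**j * w**(k-1) if k else 0.0)
        dv = (j * u**i * v**(j-1) * w**k if j else 0.0) - \
             (k * u**i * v**j * w**(k-1) if k else 0.0)
        J[:, 0] += c * du * p
        J[:, 1] += c * dv * p
    return np.linalg.det(J)

def energy(free, free_rows, ctrl0, base_det, interior_uv, curve_uv, path_pts,
           wp=0.2, wn=1.0, wd=5.0):
    ctrl = ctrl0.copy()
    ctrl[free_rows] = free.reshape(-1, 2)            # plug in the unknowns
    e = 0.0
    for u, v in interior_uv:                         # Jacobian term of Eq. (1)
        d = jac_det(ctrl, u, v)
        e += (wp if d > 0 else wn) * (d - base_det) ** 2
    for (u, v), p0 in zip(curve_uv, path_pts):       # distance-to-path term
        e += wd * np.sum((patch_point(ctrl, u, v) - p0) ** 2)
    return e

# Hypothetical driver: free_rows marks the curve's interior control points and
# the patch's interior control point; base_det is the constant Jacobian
# determinant of the base triangle.
# res = minimize(energy, x0, args=(free_rows, ctrl0, base_det, samples,
#                                  curve_uv, path_pts), method='BFGS')
```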
The example in Fig. 9 demonstrates why curved boundaries are
necessary at places with no detected features. Edge detection nor-
mally fails at features with a weak color contrast. Fortunately, mesh
simplification exhibits a behavior similar to region growing and
tends not to grow across such weak features unless there are no
other smoother regions left. Thus, when mesh simplification termi-
nates, weak features would follow region boundaries and are often
identified as the initial traced paths. Since our patch boundary op-
timization encourages an optimized boundary curve to follow the
initial path, the final optimized boundary is most likely to follow
the original weak feature. If we directly took those straight edges
from the base domain as the final patch boundaries, a weak curved
feature would fall into the interior of a patch and adversely affect
color fitting.
4.4 Patch Color Fitting
Given a 2D triangular B´ezier patch, we need to approximate color
variations over the region covered by the patch. Intuitively we
would define the color field using the B´ezier patch itself, storing
a color value as an additional coordinate at every control point.
However, low-degree triangular B´ezier patches have few internal
control points (1 in the case of cubic). Unless a very large number
of patches are used, the limited degree of freedom does a poor job
in color fitting inside the patch while maintaining continuity across
patch boundaries, leading to severe color distortion. Instead, we
apply thin-plate spline fitting in the parametric domain of the patch
to achieve the goal. Thin-plate spline (TPS) interpolation [Pow-
ell 1995] is a widely used method for scattered data interpolation.
Given the parameters of a set of N points, {(u_i, v_i)}, in the 2D parametric domain of the patch, with each pair of parameters (u_i, v_i) associated with a color value h_i, TPS interpolation attempts to construct a smooth function f(u, v) that satisfies all constraints f(u_i, v_i) = h_i. The solution f minimizes the bending energy
\[
I(f) = \iint \left( f_{uu}^2 + 2 f_{uv}^2 + f_{vv}^2 \right) du\, dv
\]
and is expressed as
\[
f(u, v) = \sum_{i=1}^{N} \alpha_i \,\phi\!\left( \| (u_i, v_i) - (u, v) \| \right) + b_0 + b_1 u + b_2 v, \qquad (2)
\]
where \(\sum_{i=1}^{N} \alpha_i = 0\), \(\sum_{i=1}^{N} \alpha_i u_i = \sum_{i=1}^{N} \alpha_i v_i = 0\), and the (u_i, v_i)'s serve as the center locations of the thin-plate radial basis function
\[
\phi(s) = s^2 \log s. \qquad (3)
\]
Combined with the constraints f(u_i, v_i) = h_i, 1 ≤ i ≤ N, one solves for the TPS coefficients {α_i} via the linear system
\[
\begin{pmatrix} K & P \\ P^T & O \end{pmatrix}
\begin{pmatrix} \alpha \\ b \end{pmatrix}
=
\begin{pmatrix} h \\ o \end{pmatrix}, \qquad (4)
\]
where K_{ij} = φ(‖(u_i, v_i) − (u_j, v_j)‖), the i-th row of P is [1, u_i, v_i], O is a 3 × 3 zero matrix, o is a 3 × 1 zero vector, α = [α_1, α_2, ..., α_N]^T, h = [h_1, h_2, ..., h_N]^T, and b = [b_0, b_1, b_2]^T. Once α and b are solved, we are able to compute the color, i.e. the height value, corresponding to any point in the parametric domain using Equation (2).
One drawback of TPS interpolation is that it is sensitive to the constraints. If there is noise in the constraints, the reconstruction result tends to have undesirable undulations. Instead of interpolation, we opt for TPS fitting by adding a number of extra constraints at {(u'_i, v'_i)} with color values {h'_i}. TPS fitting essentially solves for the same number of coefficients in the least-squares sense. The formulation of the linear system is almost intact except for the newly added rows. The overdetermined system is written as
\[
\begin{pmatrix} K & P \\ K' & P' \\ P^T & O \end{pmatrix}
\begin{pmatrix} \alpha \\ b \end{pmatrix}
=
\begin{pmatrix} h \\ h' \\ o \end{pmatrix}, \qquad (5)
\]
where K'_{ij} = φ(‖(u'_i, v'_i) − (u_j, v_j)‖), the i-th row of P' is [1, u'_i, v'_i], and h' = [h'_1, h'_2, ..., h'_K]^T. The fitting version of TPS effectively smooths out noise and renders a more robust approximation to the given data points.
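A compact sketch of the fitting version (Eqs. 2-5) follows, with matrix-layout assumptions of our own; numpy's lstsq solves the overdetermined system in the least-squares sense.

```python
# Thin-plate spline fitting and evaluation (illustrative sketch).
import numpy as np

def tps_phi(r):
    """Thin-plate radial basis phi(s) = s^2 log s, with phi(0) = 0."""
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    mask = r > 0
    out[mask] = r[mask] ** 2 * np.log(r[mask])
    return out

def tps_fit(centers, center_h, extra_uv, extra_h):
    """centers: (N, 2) basis centers with colors center_h; extra_uv: (K, 2)
    extra fitting constraints with colors extra_h. Returns (alpha, b)."""
    centers = np.asarray(centers, float)
    N = len(centers)
    def rows(pts):
        pts = np.asarray(pts, float)
        K = tps_phi(np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2))
        P = np.hstack([np.ones((len(pts), 1)), pts])
        return np.hstack([K, P])
    side = np.hstack([np.vstack([np.ones((1, N)), centers.T]),   # P^T block
                      np.zeros((3, 3))])                         # O block
    A = np.vstack([rows(centers), rows(extra_uv), side])
    rhs = np.concatenate([center_h, extra_h, np.zeros(3)])
    sol, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return sol[:N], sol[N:]

def tps_eval(alpha, b, centers, u, v):
    """Evaluate Eq. (2) at parameters (u, v)."""
    r = np.linalg.norm(np.array([u, v]) - np.asarray(centers, float), axis=1)
    return alpha @ tps_phi(r) + b[0] + b[1] * u + b[2] * v
```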
TPS does not have any restrictions on the 2D parametric domain over
which it is defined. The color variations over a patch can therefore
be accurately approximated if we sample constraints from both the
interior and the edges of the triangular parametric domain. We treat each
(a) base triangle (b) Bézier patch (c) inter-patch consistency
Figure 10: (a) In a base triangle, Barycentric coordinates of all
centers (yellow dots), corners (blue dots), and additional points
(orange dots) sampled near edges, are computed. (b) These co-
ordinates are used for the evaluation of corresponding fitting con-
straints on the B´ezier patch. (c) Additional points (orange dots)
serve as constraints for color continuity across patch boundaries.
color channel as a height field over the 2D parametric domain of the Bézier patch. In Fig. 10, we illustrate how extra fitting constraints are sampled. We simply take the base triangle as the parametric domain of its corresponding patch. Every edge of the base triangle is evenly divided into n segments, and the base triangle is partitioned into n² smaller triangles, where n is typically set to 10. For the center and corners of every subdivided triangle, we compute their barycentric coordinates and evaluate their corresponding points on the Bézier patch. These points on the Bézier patch are actually image-space locations where color values should be sampled. The color values at any constraint are computed from the input image using bilinear interpolation.
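The sampling scheme can be sketched as follows (our own arrangement; `patch_map` is assumed to be the Bézier patch evaluation mapping parameters to image coordinates, and the free-form-deformation correction described next is not included).

```python
# Constraint sampling: corners and centers of the n^2 sub-triangles of the
# parametric domain, mapped into image space and looked up bilinearly.
import numpy as np
from scipy.ndimage import map_coordinates

def barycentric_samples(n=10):
    """Corners and centers of the n^2 sub-triangles of the unit simplex."""
    pts = set()
    for i in range(n + 1):
        for j in range(n + 1 - i):
            pts.add((i / n, j / n))                    # sub-triangle corners
    for i in range(n):
        for j in range(n - i):
            pts.add(((i + 1/3) / n, (j + 1/3) / n))    # "upward" centers
            if i + j < n - 1:
                pts.add(((i + 2/3) / n, (j + 2/3) / n))  # "downward" centers
    return np.array(sorted(pts))                       # (u, v); w = 1 - u - v

def sample_colors(channel, patch_map, uv):
    """channel: 2D array; patch_map: (u, v) -> (x, y) image position (assumed
    to be the Bezier patch evaluation). Returns bilinear color samples."""
    xy = np.array([patch_map(u, v) for u, v in uv])
    # map_coordinates expects (row, col) = (y, x); order=1 is bilinear.
    return map_coordinates(channel, [xy[:, 1], xy[:, 0]], order=1)
```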
This method may induce missampled color values in regions near
detected features. In the image space, a feature is a polyline in
subpixel resolution (Fig. 3(c)). In the parametric domain, however,
the polyline is fitted with a B´ezier curve. The polyline and the curve
generally intersect with each other. As illustrated in Fig. 11(a), a
sample point on the left hand side of the B´ezier curve is supposed
to be blue. But in the image space, it is actually located on the right
hand side of the polyline (i.e. the detected feature), and is therefore
missampled as a green point.
Figure 11: Free-form deformation to eliminate missampling.
We resolve this missampling problem with a free-form deforma-
tion [Lee et al. 1995] from the parametric domain to the image do-
main. As shown in Fig. 11(b), for each B´ezier curve and its cor-
responding polyline, we define a set of corresponding point pairs
by sampling the curve (arc-length) and the polyline (chord-length).
As points on the curve move towards their correspondences on the
polyline, the curve eventually deforms into its corresponding poly-
line (Fig. 11(c)). From these feature constraints, a continuous and
one-to-one warping from the parametric domain to the image do-
main is derived. Sample points near B´ezier curves will be associ-
ated with correct color values once being warped (Fig. 11(d)).
To guarantee the continuity of derivatives across patch boundaries,
we sample additional constraints near the edges of the parametric
domain. For every constraint on an edge, we add a pair of extra
constraints slightly away from the edge in the direction normal to
the edge (Fig. 10(a)). Note that patch boundaries that model color
discontinuities have only one incident patch, and therefore do not require such additional constraints.

(a) 3-way (b) 4-way
Figure 12: Evaluation of control points by four passes of de Casteljau's algorithm in a uniform 4-way subdivision. (c) 1st pass with t = (0.5, 0.5, 0). (d) 2nd and 3rd passes, both with t = (0.5, 0.5, 0). (e) Last pass with t = (1, −1, 1).
The center locations of the thin-plate basis functions are sampled in
the parametric domain in the same manner, only at a much sparser
rate. Our experiments show that in the presence of extra fitting
constraints, 16 basis functions per patch (with at most 9 bases at
the interior of the patch) suffice for a reasonable approximation.
5 Vector Image Rasterization
To visualize the vectorized image from the previous section, we
need to render the triangular patches and their associated color in-
formation onto a discrete image plane. We have developed a real-
time GPU-based parallel algorithm for rendering vectorized im-
ages. It can achieve more than 30 frames per second on a 768x512
resolution. This algorithm relies on 4-way subdivision of triangu-
lar B´ezier patches. In the following, we briefly introduce this type
of subdivision first, and then present the GPU-based rasterization
algorithm.
Conventional 3-way subdivision by de Casteljau's algorithm does not apply in the case of a triangular patch. For example, Fig. 12(a) shows a uniform 3-way subdivision with parameters t = (1/3, 1/3, 1/3), where patch boundaries are never subdivided. Patches only become thinner and thinner but never converge to the desired
resolution. Instead, we uniformly subdivide Bézier patch B_abc into four smaller Bézier patches B_aik, B_ibj, B_kjc, and B_ijk. As illustrated in Fig. 12(c)-12(e), we start by partitioning B_abc into two halves, B_aic and B_ibc, using parameters t = (0.5, 0.5, 0) for de Casteljau's algorithm. We perform a similar subdivision on B_aic and B_ibc, respectively. By now, all control points of B_aik and B_ibj (shaded in blue) are available. The last pass of de Casteljau's algorithm is performed on patch B_ijc (in green) with parameters t = (1, −1, 1). This gives us the control points for B_kjc and B_ijk. This last pass performs extrapolation using a negative parameter. Since de Casteljau's algorithm only performs reparameterization without altering the underlying geometry, control points resulting from such extrapolation can exactly reproduce the geometry of the original subpatches B_kjc and B_ijk.
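The core operation composed by this scheme is a single de Casteljau splitting pass of a triangular patch. The sketch below (our own indexing, following the standard de Casteljau construction for triangular patches; not the authors' GPU code) returns all three subpatch control nets for a split at barycentric parameter t: for a split on edge ab, e.g. t = (0.5, 0.5, 0), only the first two nets are used, while the extrapolation pass with t = (1, −1, 1) uses the first and the third.

```python
# De Casteljau splitting of a degree-n triangular Bezier patch (sketch).
import numpy as np

def de_casteljau_array(ctrl, t, n=3):
    """ctrl: dict {(i, j, k): np.array([x, y])}, i + j + k = n. Returns the
    intermediate points levels[r][(i, j, k)] with i + j + k = n - r."""
    levels = [dict(ctrl)]
    for r in range(1, n + 1):
        prev, cur = levels[-1], {}
        for i in range(n - r + 1):
            for j in range(n - r + 1 - i):
                k = n - r - i - j
                cur[(i, j, k)] = (t[0] * prev[(i + 1, j, k)] +
                                  t[1] * prev[(i, j + 1, k)] +
                                  t[2] * prev[(i, j, k + 1)])
        levels.append(cur)
    return levels

def subdivide(ctrl, t, n=3):
    """Control nets of the subpatches over (t,b,c), (a,t,c), and (a,b,t)."""
    lv = de_casteljau_array(ctrl, t, n)
    tbc = {(r, j, n - r - j): lv[r][(0, j, n - r - j)]
           for r in range(n + 1) for j in range(n + 1 - r)}
    atc = {(i, r, n - r - i): lv[r][(i, 0, n - r - i)]
           for r in range(n + 1) for i in range(n + 1 - r)}
    abt = {(i, n - r - i, r): lv[r][(i, n - r - i, 0)]
           for r in range(n + 1) for i in range(n + 1 - r)}
    return tbc, atc, abt
```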
Our GPU-based rasterization algorithm was developed using
NVidia CUDA [NVidia 2008] on a GeForce 8800 GTX GPU. The
algorithm has three major stages. First, recursively perform 4-
way subdivision (Fig. 12(b)) on every original 2D triangular B´ezier
patch B_i until the bounding box of each resulting patch becomes
smaller than a pixel. This stage is highly parallelizable because
the subdivision of every patch is independent of each other [Patney
and Owens 2008]. We perform in parallel one level of subdivision
across all patches that need to be further subdivided. Each thread
block has 64 threads and processes 16 patches in parallel because
of the limited size of the shared memory. Thus, every 4 threads
in the same block perform the 4-way subdivision of a single patch
in parallel. Such parallel subdivision is repeated until all resulting
patches have become sufficiently small.
The second stage involves three substeps: (i) approximate every resulting small patch F_j from the previous stage using the triangle
Figure 13: Left: original image. Middle: 380 triangular B´ezier
patches for flower petals. Right: reconstructed result by our repre-
sentation with 0.98 per pixel mean reconstruction error.
T_j formed by its three corners, and compute the 2D bounding box of T_j; (ii) for every pixel inside the bounding box, perform a point-in-triangle test using barycentric coordinates to identify the pixels covered by T_j; (iii) for every pixel covered by T_j, compute its parametric values with respect to patch B_i using the barycentric coordinates of the pixel within T_j and the parametric values of T_j's vertices with respect to B_i. By the end of this stage, we will have
figured out for every pixel which original 2D B´ezier patch covers
it. We parallelize the second stage over the set of finest patches
from the previous stage. Each thread block has 256 threads each of
which is responsible for executing all three substeps on one distinct
patch.
In the third and last stage, for every pixel, we retrieve the thin-plate
spline computed for the 2D B´ezier patch that covers the pixel, and
substitute the parameters of the pixel into Equation (2) to obtain
its color. We parallelize this last step over the set of pixels. Each
thread block has 512 threads each of which takes care of thin-plate
spline evaluation for one distinct pixel.
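A CPU reference sketch of stages two and three is given below (the paper runs these as CUDA kernels on the GPU): each sufficiently small subpatch is approximated by the triangle of its three corners, covered pixels are found with a barycentric point-in-triangle test, the parametric coordinates are carried back to the original patch, and that patch's thin-plate spline is evaluated there. `eval_color` is assumed to be an Eq. (2) evaluator keyed by patch id; degenerate triangles are not handled.

```python
# CPU reference for the second and third rasterization stages (illustrative).
import numpy as np

def barycentric(p, a, b, c):
    m = np.array([[b[0] - a[0], c[0] - a[0]],
                  [b[1] - a[1], c[1] - a[1]]])
    lam = np.linalg.solve(m, np.asarray(p, float) - a)
    return np.array([1.0 - lam.sum(), lam[0], lam[1]])   # weights for a, b, c

def rasterize(subpatches, eval_color, height, width):
    """subpatches: iterable of (corners_xy, corners_uv, patch_id) where
    corners_xy are the three image-space corners of a tiny subpatch and
    corners_uv their (u, v) parameters in the original patch."""
    out = np.zeros((height, width))
    for corners_xy, corners_uv, patch_id in subpatches:
        a, b, c = (np.asarray(q, float) for q in corners_xy)
        lo = np.floor(np.minimum(np.minimum(a, b), c)).astype(int)
        hi = np.ceil(np.maximum(np.maximum(a, b), c)).astype(int)
        for y in range(max(lo[1], 0), min(hi[1] + 1, height)):
            for x in range(max(lo[0], 0), min(hi[0] + 1, width)):
                w = barycentric((x, y), a, b, c)
                if np.all(w >= -1e-9):                    # pixel inside triangle
                    u, v = w @ np.asarray(corners_uv, float)
                    out[y, x] = eval_color(patch_id, u, v)  # per-channel color
    return out
```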
6 Results and Discussions
We have fully implemented our image vectorization algorithm and
successfully tested it on a large number of raster images. Thanks
to feature-preserving mesh simplification and thin-plate spline fit-
ting, our algorithm is not sensitive to edge detection results, and can
achieve high-quality reconstruction with relatively few perceptually
important curvilinear features.
Image vectorization results from our algorithm can be found in
Figs. 1, 13-19 and the supplemental materials. We adopt the same
parameter setting in most of our experiments and have discussed
them in various subsections in Section 4. The parameters for Canny
edge detection were given at the beginning of Section 4.1. The
weights for patch boundary optimization were given after Eq. 1 in
Section 4.3. And the sampling scheme of the constraints for thin-
plate spline fitting has been discussed at the end of Section 4.4. Ev-
ery B´ezier patch uses at most 19 3D TPS terms for color variations
and an amortized cost of 4.5 2D control points for its geometry,
summing up to 66 scalar coefficients for a complete representation.
The number of triangular patches varies from image to image and
has been summarized in Table 1 along with mean reconstruction
errors. The number of patches to achieve a similar level of recon-
struction accuracy increases with the complexity of edge structures
in the raster image. Our algorithm typically takes 1-2 minutes on
an Intel Core 2 Duo 3.0GHz processor to vectorize a 512x512 im-
age. Our real-time GPU-based rasterization algorithm achieves 60
frames per second on a 384x512 resolution and 32 frames per sec-
ond on a 768x512 resolution using a GeForce 8800 GTX GPU.
We have compared our vectorization algorithm with three auto-
matic ones in the literature [Lecot and Levy 2006; Orzan et al.
2008; Lai et al. 2009]. Fig. 13 shows our image-space triangu-
lar decomposition and vector-based reconstruction of a foreground
layer used in [Lai et al. 2009] (Fig. 9). Compared with their au-
tomatic gradient mesh generation technique, our automatic method
still achieves the same level of reconstruction quality with far fewer
patches and a comparable total degree of freedom. We consider a
small number of patches as “being compact”. When the number
of patches decreases, each patch inevitably covers a larger region
of the image and needs a more complex model to approximate its
color variations. This tradeoff is indeed necessary because a smaller
number of patches better supports standard applications such as edit-
ing, where fewer patches mean less user manipulation.
The technique in [Lai et al. 2009] only aligns the boundary of the
gradient mesh with the outermost boundary of the foreground layer. It
is hard to perform detailed feature alignment within the foreground
layer using gradient meshes. As a result, curvilinear features sepa-
rating petals of different flowers are not well preserved, which intro-
duces visual artifacts. In comparison, our method easily performs
such internal feature alignment and achieves better visual quality.
A side-by-side comparison and a cost analysis can be found in the
supplemental materials.
Figure 14: Comparison with diffusion curves [Orzan et al. 2008].
Left column: (from top down) original image, reconstruction re-
sult using diffusion curves, and using our method, both achieving a
reconstruction error around 1.0 per pixel. Mid column: automati-
cally detected feature edges (top) and reconstruction error (ampli-
fied by 4) (bottom) of diffusion curves. Right column: feature edges
and reconstruction error (amplified by 4) of our method.
Fig. 14 shows a comparison with diffusion curves [Orzan et al.
2008]. Our method requires significantly fewer edge features to
achieve the same level of mean reconstruction error. Diffusion
curves require more edges and therefore have to lower edge detec-
tion thresholds. At low thresholds, the detected edges have an un-
desirable spatial distribution. Some smooth regions are filled with
overly dense edges while others have an insufficient number of edges,
such as the central highlight area in the teapot image. Because
shading variations in these undersampled areas do not closely fol-
low membrane interpolation, color approximation using diffusion
gives rise to relatively large errors. Because of triangular decom-
position and thin-plate spline fitting, our method achieves better
reconstruction quality in regions with an insufficient number of edges.
To achieve the same level of reconstruction error, diffusion curves
require 36.46K coefficients in total (B´ezier curves for edge repre-
sentation and polyline fitting for colors and blur values) while our
representation has 66 × 525 = 34.65K degrees of freedom.
Fig. 15 shows our vector-based reconstruction of Lena. Our method
successfully preserves a large number of tiny details and achieves
far better visual quality than the ArDeco system (Fig. 6 in [Lecot
and Levy 2006]).
In addition to successful reconstructions at the original resolution,
our representation also makes it easy to perform standard vector
graphics operations such as magnification or editing. Our vectoriza-
tion is a lossy procedure because thin-plate spline fitting removes
details with extremely high frequencies. It should therefore be
Figure 15: Left: Original Image. Right: Reconstructed image by
our representation. We achieve a mean reconstruction error = 2.4
with 2011 patches.
Figure 16: Magnification: original resolution ×16. Left: an ac-
curate magnification by precise feature alignment. Middle: color
bleeding caused by misalignment of vector representation to the
feature. Right: bicubic interpolation.
considered a type of image stylization, where important image fea-
tures are preserved while certain internal high-frequency variations
cannot be recovered during magnification. Magnification can be
performed by rasterizing the vector representation on an image grid
of a higher resolution. Editing first requires an interactive selection
of patches in regions of interest. This is followed by modifying
TPS basis values or geometry control points in regions of interest.
As shown in Fig. 1 and 18, both color discontinuities and smooth
variations are preserved in zoomed-in images. This is largely attributable
to the precise feature alignment, the absence of which leads to mag-
nification artifacts such as the color bleeding in Fig. 16 caused by
misalignment of vector representations to image features. Without
precise alignment, color values of a pixel cannot be guaranteed to
be derived only from the correct side of the true image feature when
an image is magnified. Editing suffers from the same problem as mag-
nification, as shown in Fig. 17.
Figure 17: Top left: original image. Top right: color editing re-
sult. Bottom left: closeup (×4) of color editing with misalignment
of vector representation to image features. Bottom right: closeup
(×4) of color editing with precise feature alignment.
Limitations. There are certain aspects of our algorithm that can
be further improved. Currently there is no scale information asso-
ciated with the detected edge features. Since the scale of an edge
signifies its importance, we could perform scale-space edge detec-
tion [Lindeberg 1998] and set a scale threshold to prune less
            Image       # patches   Mean error
Fig. 2(h)   PEPPERS        219         0.87
Fig. 13     FLOWERS        380         0.98
Fig. 14     TEAPOT         525         0.98
Fig. 18     DOLL           661         1.45
Fig. 19     FACE           782         0.77
Fig. 19     GOLDFISH       925         1.61
Fig. 1      MAGNOLIA      1093         1.23
Fig. 18     BUDDHA        1687         1.28
Fig. 15     LENA          2011         2.40
Table 1: Statistics for image vectorization.
important edges before initial mesh construction. We could also relate the
quality of the vectorized image with the scale threshold. A higher
quality vectorization is associated with a lower scale threshold to
preserve more curvilinear features. Second, our thin-plate spline
fitting has not been fully optimized. Currently, the center locations
of the basis functions are determined by uniform subdivision of the
parametric domain of the patch, and fixed thereafter during patch
color fitting. We could formulate a nonlinear optimization that re-
places the current linear solver to search for optimal center locations
of the bases to further reduce fitting errors. Such nonlinear opti-
mization would be more expensive though. The number of bases
for each patch could also be adaptively determined according to the
magnitude of fitting errors. Also, there is still room to accelerate
our vectorization algorithm using multi-core processors.
7 Conclusions
In this paper, we have introduced an effective vector-based repre-
sentation and its associated vectorization algorithm for full-color
raster images. Our representation is based on a triangular decom-
position of the image plane. The boundaries of a triangular patch
in this decomposition are represented using B´ezier curves and in-
ternal color variations of the patch are approximated using thin-
plate splines. Experiments and comparisons have indicated that our
representation and associated vectorization algorithm can achieve a
more accurate and compact vector-based representation than exist-
ing ones. A real-time GPU-based algorithm has also been devel-
oped for rendering vectorized images.
Acknowledgments
Thanks to the anonymous reviewers for valuable comments and
suggestions, and to Xiaomin Jiao for helpful discussions. This work
was partially supported by NSF (IIS 09-14631).
References
BRANETS, L., AND CAREY, G. F. 2005. Extension of a mesh quality metric for elements with a curved boundary edge or surface. Transactions of the ASME 5 (December).
CANNY, J. 1986. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8, 6, 679–698.
CHANG, H.-H., AND HONG, Y. 1998. Vectorization of hand-drawn image using piecewise cubic Bézier curves fitting. Pattern Recognition 31, 11, 1747–1755.
DEMARET, L., DYN, N., AND ISKE, A. 2006. Image compression by linear splines over adaptive triangulations. Signal Processing 86, 7, 1604–1616.
FLETCHER, R. 1987. Practical Methods of Optimization, second ed. John Wiley & Sons.
9
Appears in ACM Transactions on Graphics (special issue for SIGGRAPH Asia 2009)
Figure 18: Far left and far right: original image. Mid left and mid right: reconstructed image (×1). Middle: magnification (×4).
Figure 19: Additional vectorization results. The first and the third are original images, followed by their reconstructed results respectively.
G
ARLAND, M., AND HECKBERT, P. 1997. Surface simplification
using quadric error metrics. In SIGGRAPH 1997, 209–216.
H
ILAIRE, X., AND TOMBRE, K. 2006. Robust and accurate vec-
torization of line drawings. IEEE Trans. Pattern Anal. Mach.
Intell. 28, 6, 890–904.
H
OPPE, H., DEROSE,T.,DUCHAMP,T.,MCDONALD, J., AND
STUETZLE, W. 1993. Mesh optimization. In Computer Graph-
ics (SIGGRAPH Proceedings), 19–26.
K
OVESI, P. D. MATLAB and Octave functions for computer vision
and image processing.
L
AI, Y.-K., HU, S.-M., AND MARTIN, R. 2009. Automatic and
topology-preserving gradient mesh generation for image vector-
ization. ACM Trans. Graph. 28, 3, Article 85.
L
ECOT,G., AND LEVY, B. 2006. Ardeco: Automatic region detec-
tion and conversion. In Proceedings of Eurographics Symposium
on Rendering, 349–360.
L
EE, S.-Y., CHWA, K.-Y., AND SHIN, S. Y. 1995. Image meta-
morphosis using snakes and free-form deformations. In SIG-
GRAPH ’95, ACM, 439–448.
L
INDEBERG, T. 1998. Feature detection with automatic scale se-
lection. Int’l Journal of Computer Vision 30, 2, 77–116.
L
ORENSEN, W. E., AND CLINE, H. E. 1987. Marching cubes:
A high resolution 3d surface construction algorithm. Computer
Graphics 21,4.
N
EHAB, D., AND HOPPE, H. 2008. Random-access rendering of
general vector graphics. ACM Trans. Graph. 27, 5, Article 135.
NV
IDIA, 2008. Nvidia CUDA programming guide 2.0.
http://developer.nvidia.com/object/cuda.html.
O
RZAN, A., BOUSSEAU, A., WINNEM
¨
OLLER, H., BARLA,P.,
T
HOLLOT, J., AND SALESIN, D. 2008. Diffusion curves: a
vector representation for smooth-shaded images. ACM Trans.
Graph. 27, 3, Article 92.
P
ATNEY, A., AND OWENS, J. D. 2008. Real-time reyes-style
adaptive surface subdivision. ACM Trans. Graph. 27, 5, Article
143.
P
OWELL, M. 1995. A thin plate spline method for mapping curves
into curves in two dimensions. In Computational Techniques and
Applications.
P
RICE, B., AND BARRETT, W. 2006. Object-based vectorization
for interactive image editing. The Visual Computer 22, 9 (Sept.),
661–670.
S
UN, J., LIANG, L., WEN,F.,AND SHUM, H.-Y. 2007. Im-
age vectorization using optimized gradient meshes. ACM Trans.
Graph. 26, 3, Article 11.
S
WAMINARAYAN, S., AND PRASAD, L. 2006. Rapid automated
polygonal image decomposition. In AIPR ’06: Proceedings of
the 35th Applied Imagery and Pattern Recognition Workshop,
IEEE Computer Society, 28.
T
AUBIN, G. 1995. A signal processing approach to fair surface
design. In Proc. SIGGRAPH’95, 351–358.
Z
OU, J. J., AND YAN, H. 2001. Cartoon image vectorization
based on shape subdivision. In CGI ’01: Computer Graphics
International 2001, IEEE Computer Society, 225–231.
We present RaveGrid, a software that efficiently converts a raster image to a scalable vector image comprised of polygons whose boundaries conform to the edges in the image. The resulting vector image has good visual quality and fidelity and can be displayed at various sizes and on various display screen resolutions. The software can render vector images in the SVG (scalable vector graphics) format or in EPS (Encapsulated Postscript). The ubiquity of image data in graphics, on the Web, and in communications, as well as the wide range of devices, from big screen TVs to hand-held cellular phones that support image display, calls for a scalable and more manipulable representation of imagery. Moreover, with the growing need for automating image-based search, object recognition, and image understanding, it is desirable to represent image content at a semantically higher level by means of tokens that support computer vision tasks.