3D Screen-space Widgets for Non-linear Projection
Patrick Coleman, Karan Singh∗
Univ. of Toronto
Nisha Sudarsanam, Cindy Grimm‡
Washington Univ. in St. Louis
Figure 1: Defining a nonlinear projection of a 3D scene. Left: Original scene from a default view showing sketched 3D features. Right: The
new, nonlinear projection. Two curve constraints are used to bow the side walls, and a point constraint is used to warp the back wall.
Linear perspective is a good approximation to the format in which
Artists expressing 3D scenes, however, create nonlinear projections
that balance their linear perspective view of a scene with elements
of aesthetic style, layout and relative importance of scene objects.
Manipulating the many parameters of a linear perspective camera to
achieve a desired view is not easy. Controlling and combining mul-
tiple such cameras to specify a nonlinear projection is an even more
cumbersome task. This paper presents a direct interface, where an
artist manipulates in 2D the desired projection of a few features of
the 3D scene. The features represent a rich set of constraints which
define the overall projection of the 3D scene. Desirable proper-
ties of local linear perspective and global scene coherence drive a
heuristic algorithm that attempts to interactively satisfy the given
constraints as a weight-averaged projection of a minimal set of lin-
ear perspective cameras. This paper shows that 2D feature con-
straints are a direct and effective approach to control both the 2D
layout of scene objects and the conceptually complex, high dimen-
sional parameter space of nonlinear scene projection.
Graphics—Computational Geometry and Object Modeling
I.3.5 [Computing Methodologies]: Computer
Keywords: Camera control, Projection, Sketch interface, Perspec-
tive, Rendering, Non-linear perspective
∗e-mail: patrick, firstname.lastname@example.org
Projection, together with occlusion in a 2D image, provide a viewer
with information about the 3D spatial relationship between scene
objects. Linear perspective, besides being a good approximation to
the human visual system, is able to provide us with depth informa-
tion in 2D without the need for occlusion. Artists expressing 3D
scenes balance linear perspective with elements of style and the rel-
ative importance of scene objects to create projections that are both
informative and aesthetically pleasing.
Creating informative and appealing 2D projections of 3D scenes is
a challenging task not only in Computer Graphics but in traditional
artistic media as well. Even within the bounds of realism, artists
very often deviate from strict linear perspective for two chief rea-
sons. First, linear perspective can be too restrictive to display all of
the necessary parts of a scene. It also may not be able to meet all
of the desired 2D framing and layout constraints. While this can be
fixed somewhat by rearranging objects in a staged scene, often that
is not a feasible option. Second, viewers are not overly critical of
deviations from a linear perspective as long as the overall projection
jection (preferably linear perspective) [Kubovy 1986]. Perspective
distortions can also be used to convey mood and vary the relative
importance of objects [Singh 2002].
Current camera control techniques offer the user direct control
over the various parameters of a linear perspective camera. Con-
trolling these parameters to obtain a desired projection of even a
single object can be difficult. The global nature in which linear
perspective camera parameters affect the projection can make the
specification of a desired projection for multiple scene objects al-
most intractable. While there has been some recent work on the
ear perspectives [Agrawala et al. 2000; Singh 2002; Coleman and
Singh2004; Grimm2001], thesesystemssimilarlyrequiretheusers
to achieve the desired 2D projection indirectly, by controlling and
combining a large number of camera parameters.
In this paper we thus explore a direct interface to camera control
for scene projection.
We start out with a small number (typically one) of default view-
points (see Figure 1a), that are defined using a conventional Com-
puter Graphics camera. The artist now places simple 3D geometric
proxies, such as points, lines, and boxes, directly into the scene.
These geometric proxies reflect primitive shapes that artists tradi-
tionally use to lay out a 3D scene on a canvas. The 3D geometric
proxies are then projected into 2D to become the visual handles
which the artist can manipulate to change the current projection of
the 3D features (see Figure 1b).
The altered projections of the 3D geometric proxies onto the 2D
canvas become constraints that, along with other desirable projec-
tion properties, interactively define an overall projection of the 3D
Each of the 2D features represents a desired deviation from
the corresponding default view. Our features are designed so that
changes to them map naturally to perceived changes in the pro-
jection. To demonstrate the flexibility of the feature set we show
how an arbitrary set of features, plus the default view, can be com-
bined to control a single, linear perspective camera. The solution to
this smaller problem is found using a nonlinear constraint satisfac-
tion algorithm, which takes as input the features and the allowable
changes to the default view, and returns a set of camera parameters
which best meets these constraints.
We build upon this single camera solver to define nonlinear pro-
jections using multiple cameras. A nonlinear projection is repre-
sented as a weighted average of multiple linear perspectives as pro-
posed by Singh [Singh 2002]. We look for the smallest set of single
cameras that satisfies the 2D features and has the property of local
linear perspective. The algorithm attempts to group features that
are proximal in image space and fit each group with a single linear
perspective camera. Once the group of cameras is determined, we
define weighting functions based on the corresponding 3D features.
There are a small number of nonlinear projections, for example
panoramas and fish-eye views, that arise often enough to warrant a
specific feature of their own. Each of these features creates two or
more cameras which are then incorporated into the multiple camera
algorithm as a unit.
This paper contributes the first direct interface to controlling the 2D
layout and projection of a 3D scene. Specifically, the contributions
of this interface are threefold. First, we describe a rich set of feature
primitives that constrain different parameters of a linear perspective
camera. These features can be used individually or in combination
to form a compelling widget set for interactive linear perspective
camera control. Unlike previous constraint-based approaches [Gle-
icher and Witkin 1992] we define a family of explicit behaviour
sets for the under-constrained case. Second, we describe a novel
algorithm for the construction of continuous nonlinear projections
from sketched feature primitives, with properties of global scene
coherence and local linear perspective. Thirdly, we define complex
feature primitives that are specifically designed to constrain popular
nonlinear projections such as panoramas and fish-eyes.
The paper is organized as follows. Section 2 positions this paper
relative to previous research in nonlinear projection, camera con-
trol and sketch interfaces. Section 3 elaborates on the system inter-
face and workflow. Section 4 defines our feature set and discusses
how changes to the 2D features are reflected in the allowable cam-
era changes. Section 5 describes our approach to finding a single
linear perspective camera that satisfies the 2D feature changes. In
Section 6 we extend the single camera solution to multiple cameras,
define weight functions, and how to interpolate between cameras.
Section 7 describes the design of complex features to capture cer-
tain popular nonlinear projections. Section 8 provides the conclu-
Image-space constraints have been used to control camera anima-
tions [Blinn 1988; Gleicher and Witkin 1992], automatic camera
control for teleconferencing-type applications [Drucker and Zeltzer
1995], and automatic composition [Tomlinson et al. 2000; Gooch
et al. 2001]. Gleicher [Gleicher and Witkin 1992], Blinn [Blinn
1988], and Tomlinson [Tomlinson et al. 2000] all used image-space
constraints and a general purpose solver to create a camera anima-
tion. All of these approaches used only seven of the eleven camera
parameters (position, orientation, and focal length), a small vocab-
ulary of image constraints (point constraints, size in image, and a
notion of “up”) and a simple heuristic for any unconstrained para-
meters (keep them the same as before). We extend image-space
constraints to all eleven parameters, introduce a comprehensive
family of image constraints, and, most importantly, introduce a set
of heuristics for managing the unconstrained parameters. Because
changes in the 2D projection of 3D geometry can be accounted for
by a variety of camera parameter changes, a naive implementation
of a constraint system can result in (from the user’s point of view)
very unexpected behavior. Our heuristics constrain the remaining
degrees of freedom by taking into account how the user is manip-
ulating the 2D projections; this results in a much more stable, but
not restrictive, system.
There are a variety of ways to solve for the camera parameters
given a set of constraints. Gleicher [Gleicher and Witkin 1992] in-
troduced a version that uses a standard Inverse Kinematics solution
to solve for the change in camera parameters that will reduce the
constraint error. The system iterates until it converges. The advan-
tage of this approach is that it reduces to a least-squares problem
for which solutions are well-understood; the disadvantage is that it
requires a fair amount of machinery to create new constraints and
these constraints must be expressible as quadratic constraints on
the camera parameters or image points. We have experimented with
this solver and have found that, for even relatively simple point con-
straints it can get stuck in local minima [Grimm and Barrett 2005].
Both Agrawala et al. [Agrawala et al. 2000] and Grimm [Grimm
2001] present a multi-projection approach where each object in the
scene is assigned to some camera and rendered based on the linear
perspective of that camera. The multiple renderings are compos-
ited to generate the final image using a visibility ordering of the
objects from some master camera view. This approach provides
good results for multiple discrete projections but does not handle
projections continuously varying over objects as seen in Figure 5
or Figure 1. Agrawala also encapsulates a small number of camera
manipulations (dolly-in plus zoom, fixed view, fixed position, and
orientation) to make it easier for the user to create certain effects.
Our system also supports these operations through the perspective,
orientation, and size constraints.
The idea of constructing a nonlinear projection as a combination
of multiple linear perspectives was presented by Singhand
Coleman. While Singh’s approach did not have ways to con-
trol global scene coherence Coleman’s approach did employ con-
straints to control the composition of the final nonlinear projection.
However, both approachesrequireduserstospecifyindividualcam-
era parameters of several cameras in the scene. We, on the other
hand, introduce an interface based on placing simple features in the
scene, for specifying multiple cameras.
3D Exploratory view
Non linear solver
•Change 2D component
•Change 3D component
•Change default view
•Change feature influence
Figure 2: System Architecture.
3 System Interface and Workflow
The user-centric workflow in this paper ultimately drives the under-
lying framework for defining the overall scene projection. In the
physical world an artist has their own view of the 3D scene and a
2D canvas upon which to render it. Similarly, the artist’s view of
the 3D scene is captured in our setting by a conventional linear per-
spective camera which we call the exploratory view (see Figure 1a)
and a 2D canvas (see Figure 1b). The 2D canvas allows interac-
tive manipulation of the projected 3D constraints and is where the
modified 3D scene projection appears. The exploratory view can be
controlled using current techniques or those described in this paper,
and is used to define the default camera.
In the physical world an artist first picks one or more viewpoints
with which to render parts of the scene. While viewing the scene
from one of these viewpoints, the artist begins sketching a few key
features of the scene. This sketch defines the overall layout and
projection of the scene in 2D. The sketched features are placed to
balance the perceived view(s) of the scene with artistic intent.
Similarly, within our framework the artist first finds one or more
viewpoints of interest by directly manipulating the exploratory
view. They next define 3D geometric features of interest. These
are points, lines, boxes or curves that are placed on objects (or their
bounding boxes) in the 3D scene. These 3D proxies correspond to
the set of features artists use to simplify the geometry of the scene.
Now we bring to bear some of the advantages of working in the dig-
ital realm. The 3D geometric proxies are automatically projected
based on the artist’s viewpoint of interest. The artist can then edit
these features, leave them unchanged, or override them entirely by
sketching them afresh. The artist can delete existing scene features
to unconstrain the scene projection or add new ones.
Unlike a traditional artist who has to fill the details of the over-
all scene projection after sketching the features, in our system the
overall scene is always projected onto the canvas and interactively
changes as the user changes the proxies. Figure 2 shows the ele-
ments of user interaction as part of the overall system architecture.
A typical user interaction session is as follows. As the user ma-
nipulates the exploratory view it is projected simultaneously on the
2D canvas. The user can bookmark a view and label any view as
a default view. 3D features can be interactively added, removed,
and edited in the exploratory view. Each proxy is matched with a
default view; typically the geometry is sketched in the correspond-
ing default view, but it doesn’t have to be. This allows the user to
introduce constraints on features that are not visible in the default
As the user creates 3D features, their projections appear in the
2D canvas. The user can then edit the projected features in the 2D
canvas, and our system will interactively compute a new projection
to match the requested changes, while maintaining, as appropriate,
fidelity to the corresponding default view.
At some point the user will introduce constraints that cannot be
satisfied with a single linear camera. At this stage the can toggle be-
tween viewing the scene non-linearly with all constraints enabled,
or continue to use linear projection with a user-selected subset of
the constraints enabled (the system shows the disabled constraints
in their manipulated, not projected, locations). Non-linear render-
ing is, in general, not real-time; the locally linear mode allows the
user to fine-tune the linear projection in one area, while using the
2D proxies to indicate what is happening in the remainder of the
The user can also use the exploratory view to change the 3D
drop-off functions associated with each of the 3D features. These
drop-off functions determine the contribution of each local linear
camera to the overall scene projection at arbitrary points in space.
Although our system supports multiple default views, for sim-
plicity most of this paper assumes a single, default view.
4 Feature Primitives
We now describe the feature primitives that a user manipulates to
control the overall projection.
From an artistic standpoint the perspective projection of an ob-
ject can be thought of as having four components: Position, size,
orientation and perspective. The first two relate to the location and
size of the object on the canvas. The next two define the viewing
transform relative to the center of the object. Orientation can be
broken into three components, rotation in the image plane (“spin”),
left-right rotation, and up-down rotation. Perspective is related to
the view distance from the object and is artistically conceptualized
by defining vanishing points for families of parallel lines [o’Connor
Jr. et al. 1998]. While there may be a large number of linear pro-
jections that satisfy a given feature constraint, the type of feature
and how it has been changed suggests an order of preference of the
feature’s control over the four different components.
As an example, the simplest feature primitive is a point. Fig-
ure 3a shows the default view of a scene with a specified point on
the table. We intuitively expect the camera to pan (translate per-
pendicular to the viewing axis) when the point is moved elsewhere
on the canvas, rather than changing the size, orientation, or per-
spective. Similarly, if we have a line in the scene, and the line has
only been translated, we expect only the position component of the
camera to change. If, however, we rotated the line about its center,
we would expect the orientation to change. Scaling the line would
produce a size change. Details of the implementation can be found
Our set of feature primitives are based on three principles.
• The feature primitive shapes are simple and traditionally used
by artists to lay out scene projections.
• The feature set provides successive coverage of the four com-
ponents of linear perspective.
• Some feature primitives that control different components of
linear perspective combine effectively to define new, more
4.1Anatomy of a Feature
Each feature has a 3D component and a corresponding 2D compo-
nent. Where appropriate, the 2D component has handles for mov-
ing the entire feature, rotating it in the image plane, and scaling
A single point feature
Changing the wedge’s
position, size, and orientation
Full wedge constraint
Two lines: Position, size,
and orientation only
Two lines plus point: No
aspect ratio change
Two lines plus two points:
Changing position and
Cube edge: Full camera
Wedge constraint, initial
Box plus orientation wedge
Cube edge, initial position
Line feature, initial
Changing orientation Changing size
Figure 3: A taxonomy of feature primitives.
it (either uniformly or in a single direction). Depending on how
the 2D feature has been altered, we unconstrain the corresponding
camera component. Each feature generates a constraint (Section 5)
which is sent to the constraint solver (Section 6) along with which
camera components are unconstrained. The system determines, for
the entire set of features, which camera components are still con-
strained; each component generates a constraint which is also sent
to the solver (Section 5.2).
We now describe our set of feature primitives:
Point: The simplest feature primitive. The point feature uncon-
strains the position component of the camera. If more than one
point feature is present the size, orientation, and perspective com-
ponents are unconstrained, in that order.
Line: A line is essentially two point features, but the relationship
between the points is important. The line unconstrains position,
orientation in the plane, and/or size, depending on the how the line
has changed (see Figure 3d).
Wedge: A wedge is built from three points and is useful for speci-
fying the position, full orientation, and size. Changing the position,
size, and orientation of the wedge unconstrain the corresponding
camera components. Changing the angle of the wedge and the rela-
tive lengths of the edges changes the left-right and up-down orien-
tation of the object. A variation of this feature is the Orientation
wedge, in which the position of the wedge is ignored.
Two lines: This feature is built from four points, usually arranged
as two parallel lines, although this is not required. This feature
behaves as the line does for affine transformations. Full orienta-
tion and perspective are unconstrained when the relative angles and
sizes of the lines change. This constraint is motivated by the de-
finition of vanishing points in art and is thus useful with objects
that have natural parallel or orthogonal features, such as a table.
The Vanishing point is a restricted form of this constraint, defined
using two parallel lines, that only controls perspective.
Cube edge: This constraint is built from six point constraints and
represents the edge of a cube. The user positions a cube around the
object and chooses which edge to manipulate. This feature is the
most general, and can be used to control all of the camera parame-
ters. Again, affine transformations of the feature act to unconstrain
position, size, and image space orientation, while changing the rel-
ative angles completely unconstrains the camera.
Bounding box: The 3D component of this feature is a 3D plane
centered within the bounding box of an object, aligned to be per-
pendicular to the current camera. The 2D component is a 2D plane.
The position and size components of the camera are changed so that
the 3D component projects inside of the 2D plane. A variation of
this is the size box which only constrains the projected size of the
Rotation: The 3D component of this feature is a set of orthogonal
axes; the 2D component is the projection of those axes. This feature
strictly controls the orientation of the object and allows a default
view orientation to be defined for an object. Often, objects like a
table or chair only define a partial default orientation in terms of an
up vector feature.
We describe two complementary combinations of the above fea-
tures that can be used to completely specify a linear projection.
Bounding box-Rotation: These two features completely control
the size, position, and orientation of the object, but not the perspec-
tive (the constraint solver will use the perspective of the default
Bounding box-Orientation wedge-Vanishing-point: These three
constraints together control all four projection components.
In addition to the projection constraints themselves, the user can
ing, or very narrow and wide projections.
5Constraining a Single Camera
Once the user has specified one or more features, the system must
find the camera that best meets those feature constraints. This prob-
lem is very closely related to the one of camera calibration in com-
puter vision [Zhang 2000]. In computer vision, the problem is usu-
tures (usually points, lines and sometimes conics), find the extrinsic
(position and orientation) and intrinsic (focal length, aspect ratio,
skew, center of projection) camera parameters that map the 3D fea-
tures to their 2D image locations. Unfortunately, for a perspective
camera with 3D features in general position there is no closed-form,
linear solution. Additionally, there are 11 camera parameters (6 ex-
trinsic and 5 intrinsic) so there must be at least as many independent
constraints as there are camera parameters to fully specify the cam-
era. We restrict the parameter search space from this general setting
with feature primitive definitions that carry knowledge of the user’s
intended change in projection from the default view. For example,
we can assume the default view’s center of projection and skew un-
less the user draws a feature that is explicitly designed to control
Our basic approach is to use a general-purpose, non-linear solver
to satisfy the feature constraints. Each feature f produces a con-
straint in the form of an error equation Efto be minimized. Each
constrained component of the camera produces another error equa-
tion, Ed. Each error equation is weighted then summed to produce
the complete error equation:
The system provides default weights which can be over-ridden
by the user if desired. The solver searches over the space of camera
parameters to find the set that minimizes this equation.
We use the standard method for building a perspective ma-
trix [Foley et al. 1990] from a rotation (3dof), translation (3dof),
focal length (1dof), aspect ratio (1dof), center of projection (2dof),
and skew(1dof). Each feature has access to its 3D components, 2D
components, the default camera D, and the camera currently under
consideration C. Some notation:
• C(P) = p is the projection of a 3D point P into a 2D screen
point p by the camera C. This encapsulates the matrix mul-
tiplication, the homogeneous point normalization, dropping
the depth component, and scaling to the width and height
of the screen. C(?V) =? v will be the projection of the vector
?V = P−Q, found by taking C(P)−C(Q).
• We will use upper case for 3D points, and lower case for 2D
points. The subscript d indicates the desired screen-space lo-
cation of the 2D feature.
5.1Feature Constraint Equations
For each feature in Section 4 we need an equation that measures
how well the constraint is satisfied. Note that the magnitude of the
constraints is normalized to correspond roughly to pixel error, i.e.,
if a point projects one pixel away from its desired location then the
error function returns one.
Point: This equation measures the distance between p =C(P), the
projected 3D feature point, and the desired location pd:
Line: Let P and Q be the end points of the line. We measure the
difference in the projected end points.
Wedge: The wedge constraint is the sum of two line constraints,
each scaled by half.
Orientation wedge: This constraint uses only the angle between
the projected line and the sketched line, and the lengths. Scaling by
360 equates a one degree error with one pixel.
Two lines: We could use four point constraints in this case, but we
have found that the constraint captures the preference of camera pa-
rameters better if we include the difference in directions and lengths
of each line pair as well. This prevents slight inconsistencies in the
end point locations from dramatically changing the perspective. If
P0 and P1 are the end points of one line then the equation for the
first line is:
and similarly for the second line.
Cube edge: This constraint is implemented as the sum of six point
Rotation: The rotation constraint specifies a desired orientation for
a coordinate frame e1,e2,e3centered in the middle of the object. If
we take just the rotation matrix of the camera and multiply it by the
x, y, and z axes, we get the coordinate frame E1,E2,E3. We measure
the difference in these coordinate frames by taking the individual
The Up constraint is a special case of the rotation constraint,
where we only consider e2(the y axis).
Bounding box: To measure the error of the bounding box con-
straint, we use a plane located at the center of the bounding box.
The feature equation is the difference between the projected plane
and the desired plane as specified by the user. Let Pibe the ithcor-
ner of the projected plane. Let pibe the corner of the desired plane.
If aspect ratio changes are not allowed, then we constrain only one
of the width or the height, whichever is less.
Size box: The size box equation is similar to the bounding box,
except we measure the differences in the width and height only and
ignore the center.
5.2Default view constraint
In this section we define the default camera error function, Ed, for
each of the camera components. Which equations are active de-
pends on which camera components are constrained, whether or not
the user is allowing center-of-projection and skew changes, and as-
pect ratio changes. To determine which camera components are un-
constrained we initially set all components to be constrained, then
walk through the current list of features and unconstrain the com-
ponents the feature is editing.
For the following discussion, let C be the camera under consid-
eration, and D be the default camera.
The following is a general-purpose way of measuring how much
an input value v varies from a desired value vd?= 0. This equation
equally punishes smaller and larger deviations and has a maximum
magnitude of one:
Center of projection: The default value for the center of projection
is (0,0). To measure the deviation we simply measure the size of
the current camera’s center of projection u,v:
|v/vd| < 1
|v/vd| ≥ 1
where W,H is the size of the screen. If we wish to constrain the
center of projection to a different value we use Equation 1. Note
that changing the center of projection by 1 corresponds (roughly)
to moving the center of projection to the top of the screen.
Skew: We use Equation 1, scaled by (W +H)/2 to measure the
skew. The default value for skew is 1.
Position: Translation and center of projection are the primary pa-
rameters that influence position, but if an object is off the viewing
axis then rotation will have an effect as well. Taking any point P on
the optical axis of the default camera:
which is essentially a point constraint. This will allow the camera
to rotate around the optical axis.
The focal length and translation along the view direction
are the primary parameters influencing size. Usually, we prefer to
change size by changing the focal length, leaving the translation,
which also affects perspective, unchanged. Let Pc= eye + look,
Pu= eye + look + (H/f) up, and Pr= eye + look + (H/f)
right be the points at the center, and center-top, and center-right
of the film plane of the default camera (f is the focal length). Then
the projected sizes of these vectors should be the same:
To measure the aspect ratio we use the ratio of the two film-plane
Orientation: The parameters primarily influencing orientation are
the rotations. The skew parameter can also influence the orien-
tation, although in general allowing this parameter to vary results
in unexpected camera views. Measuring orientation differences is
identical to the rotation constraint described above. Note that if
we use the “up”, “look”, and “right” vectors of the camera that we
can separate orientation into rotation in the image plane, left-right
rotation, and up-down rotation.
Perspective: Perspective depends primarily on the orientation and
position of the object relative to the camera, although changing the
center of projection and skew also play a role. We measure the
change in perspective distortion by measuring the angle change of
the edges of a cube placed a unit distance along the look vector.
The final optimization equation is a weighted sum of all active con-
straint equations plus the default camera equations. We use the
weights in two ways. First, changing the weight of an individual
feature causes that feature to become “stronger”. This provides the
user with additional control in the case where not all constraints
are satisfied. Second, changing all of the weights for the default
camera constraints simultaneously allows the user to indicate how
important the default camera is. The initial values for all feature
constraints is one, while the default camera constraint weights are
100 because we want these to over-ride the feature constraints.
5.4 The Solver
We use a simplex, or amoeba [Nelder and Mead 1965] solver. This
solver deals well with large numbers of parameters, parameters that
have unequal effects, and does not require explicit derivatives. The
solver requires the number of parameters, an initial starting condi-
tion, and an equation to minimize. It searches through parameter
space until the decrease in the error function falls below a given
threshold. We have found this solver to be more stable, less likely
to get stuck in local minima, and an order of magnitude faster than
the Jacobian gradient approach [Gleicher and Witkin 1992] used by
Gleicher [Grimm and Barrett 2005].
The solver’s initial conditions are the current camera parameters.
If the user has explicitly forbidden oblique projections we can leave
the corresponding parameters out. For computational efficiency
reasons, we usually solve for the best translation and rotation pa-
rameters, then use those parameter values to initialize a search over
the remaining parameters. Except for dramatic constraint changes,
the solver usually iterates 100-400 times before stabilizing, with
average solve times under 14 milliseconds per change.
Weighted constraint solvers are notoriously difficult to manage,
can be very sensitive to perturbations in the weights, and can suffer
from local minima. We are insulated from many of these problems
for several reasons. We have a user in the loop who can (indirectly)
nudge and guide the solver by making small changes to the image-
space constraints. It is also visually clear which constraints are not
being met and what happens as constraints are changed. In the
following section it will be important to be able to determine if the
constraints were actually met; because we know a mapping from
the magnitude of each error constraint to approximate pixel error,
we can determine appropriate thresholds for success.
One problem with the simplex solver is that it moves around er-
ratically in the solution space. Thus viewing the intermediate steps
of the solver is disconcerting for the user. Hence for visualization
purposes we solve for the desired camera and then interpolate from
the current camera to the desired camera.
6 Constraining Multiple Cameras
We represent continuous nonlinear projections as a blend of multi-
ple linear perspective cameras [Singh 2002]. Given a set of features
we now need to compute a minimal set of linear cameras to fit the
feature constraints and a corresponding set of weight functions that
Figure 4: Changing the camera influence for the point constraint in
describe how much each camera contributes to the projection of any
point in the 3D scene.
We do this as a two step process. In the first step we find groups
of features such that all of the feature’s constraints can be satisfied,
within a specified tolerance, by a single linear perspective camera.
We take the smallest group such that each constraint is covered by
at least one camera. We use a heuristic to drive a greedy search
that captures local linear perspective in the image by attempting
to group features that are sketched close to each other on the 2D
canvas. Once the set of cameras has been determined the 3D fea-
ture components in each camera’s group is used to define the cam-
era’s contribution to the overall projection of points in space. More
specifically, the weight function for any camera is a summed dis-
tance surface implicit function [Singh 2002] defined around each
3D feature component that the constraint satisfies (see Figure 4).
We now look at the grouping algorithm in detail.
6.1 Determining the Cameras
We take a greedy approach to determining the grouping of features
which are satisfied by a single camera. The input to the algorithm
is a set of n features F1...Fnand their corresponding desired 2D
projections p1...n. We define a set of n possible groups G1...Gn
(and corresponding cameras) as follows:
For i = 1 to n
Sort features j ?= i by ?pi− pj?
for k = minjto maxj
if satisfied( Gi
We now cull any duplicate groups or any group which is a subset
of another. We illustrate this algorithm in Figure 5.
Scene with four
Try A with B, C, and D,
in that order. Groups
with C, but not B.
Try B with A, C, and
D, in that order.
Groups with C.
Try C with B, D,
and A, in that order.
Groups with B.
Try D with C, B, and A,
in that order. Groups
with C, but not B.
Final groups: A
and C, B and C, C
Figure 5: An illustration of the constraint grouping algorithm.
While this algorithm does not guarantee a minimum number of
cameras it is simple and robust and does a good job of capturing lo-
cal linear perspective in the image by first satisfying constraints of
features that are close to each other with the same linear perspec-
tive. By covering features with as many cameras as possible, we
ensure better interpolation results.
Once the set of output camerasC1...Cmhave been defined we need
to the projection of all points in space.
It is important that the regions proximal to the various 3D fea-
ture components be projected by the cameras that satisfy that fea-
ture constraint. The influence of one camera on the overall projec-
tion then falls off spatially from its 3D feature components at a rate
that reflects how local the camera’s projection is. Implicit surface
primitives lend themselves perfectly to capturing such weighting
functions. Each 3D feature component Pjdefines a distance-based
implicit function in 3D, called fj. We use two parameters, rin,rout,
for each feature that determine the area influenced by Pj. We set fj
to be one at any distance r <rin, and zero past rout. Between rinand
routthe function falls-off in a typical bell-shaped blend function g1
For each output camera Ci, the weight function wiis a simple
summation of the weight functions of all of the features that are
in the group Gito which the camera Ciis fit. Given a point Q
in the scene, we first calculate the weight for each camera. The
weights are normalized if they sum to greater than 1. Points where
the weights sum to less than 1 fall outside the locality of any of the
fitted cameras. These points are blended in with their projection
in the default camera view. The user can easily vary the influence
different feature constraints have on the overall scene projection
(see Figure 4) by controlling their drop-off distances directly in the
The nonlinear projection for any point in space Q is now com-
puted as a simple blend of projections of the cameras C1...Cmand
the default view D. The projection q of a point Q is now simply
||Q−P|| < rin
rin≤ ?Q−P? ≤ rout
?Q−P? > rout
1g(x) = (x2−1)2,x ∈ [0,1] is an example of such a function.
Outer box Download full-text
Original inner box
Figure 6: The 2-box, or fish-eye feature primitive.
7 Nonlinear Feature Primitives
Thus far our nonlinear projection algorithm takes a number of fea-
tures, constructs groups that can be satisfied by a single perspective
camera, and blends these cameras continuously. While the result-
ing nonlinear projection is likely to satisfy the specified constraints
there is no mechanism for the user to define particular nonlinear
projections. In keeping with our user-centric approach we would
like the user to be able to specify specific types of nonlinear projec-
tions within our system framework. We accomplish this by defining
complex feature primitives that must be satisfied by a pre-specified
nonlinear projection. We demonstrate two feature primitives that
correspond to a panorama view and a fish-eye projection.
Panorama: A panorama can be generated by either spinning the
camera in place or spinning it around an object.
panorama we draw a line in the scene and a corresponding curve
(which starts off as a line) in 2D. The center point of the curve can
be moved to create an arc in the image. This arc is approximated
by a small number of line constraints. We then generate one cam-
era for each line constraint. We either constrain the camera to spin
around its axis to spin around the center point of the arc.
The panorama feature is always grouped only with itself. The
columns of Figure 1 are distorted using a panorama.
2-Box or fish-eye: The 2-box feature is an extension of the
bounding-box feature. It is defined using two concentric boxes.
The outer box is treated as the usual bounding-box feature and par-
ticipates in the specification of a nonlinear projection as described
in Sections 3-5. Once the outer box has been satisfied by some
camera Cout, we create an additional camera Cinthat uses Coutas
the default view. The camera Cinis allowed to translate along the
viewing axis of the Coutcamera, and if the center of the inner box
has changed, perform a pan as well. The inner box is not included
in the feature groups. The weight function for Cinis computed as a
fall-off from the inner box to the outer box. As shown in Figure 6
the 2-box is an effective feature with which a user can specify and
control a fish-eye or telescoping projection.
To specify a
It is worth noting that a user can composite multiple 3D scenes or
objects onto the same canvas in 2D without having to actually ad-
dress their placement relative to each other in a common 3D scene.
Each 3D scene would have its own exploratory view but the overall
projection would be controlled and viewed on a single 2D canvas.
We have presented an interactive technique for specifying non-
linear projections. Our approach addresses the high-dimensional
space problem by making a series of heuristic decisions based on
expected image-space behavior. These heuristics are visually en-
capsulated in a rich set of 2D primitives. By using a non-linear
solver and a default camera we can easily allow arbitrary combina-
tions of feature primitives. We then combine features into coherent
groups, using a set of heuristics based on desirable properties of
projections. This combination results in a very flexible system that
provides as much control as possible to the user while still allowing
them to interactively control nonlinear projections.
The authors would like to thank Alias Wavefront for donating their
Maya Software. This research is funded in part by NSF Grants CCF
0238062 and CNS 0139576.
AGRAWALA, M., ZORIN, D., AND MUNZNER, T. 2000. Artistic multi-
projection rendering. In Eurographics Rendering Workshop 2000, Euro-
BLINN, J. 1988. Where am i? what am i looking at? In IEEE Computer
Graphics and Applications, vol. 22, 179–188.
COLEMAN, P., AND SINGH, K. 2004. Ryan: rendering your animation
nonlinearlyprojected. InNPAR’04: Proceedingsofthe3rdinternational
symposium on Non-photorealistic animation and rendering, ACM Press,
New York, NY, USA, 129–156.
DRUCKER, S. M., AND ZELTZER, D. 1995. Camdroid: A system for im-
3D Graphics, ACM SIGGRAPH, 139–144. ISBN 0-89791-736-7.
FOLEY, J., VAN DAM, A., FEINER, S., AND HUGHES, J. 1990. Computer
Graphics : Principles and Practice. Addison Wesley.
GLEICHER, M., AND WITKIN, A. 1992. Through-the-lens camera control.
Siggraph 26, 2 (July), 331–340. ISBN 0-201-51585-7. Held in Chicago,
GOOCH, B., REINHARD, E., MOULDING, C., AND SHIRLEY, P. 2001.
GRIMM, C., AND BARRETT, L. 2005. Camera interpolation using screen-
space constraints. Tech. Rep. 7, Washington university in St. Louis.
GRIMM, C. 2001. Post-rendering composition for 3d scenes. Eurographics
short papers 20, 3.
KUBOVY, M. 1986. The Psychology of Perspective and Renaissance Art.
Cambridge University Press.
NELDER, J. A., AND MEAD, R. 1965. A simplex method for function
minimization. In Computer Journal, vol. 7, 308–313.
O’CONNOR JR., C., KIER, T., AND BURGHY, D. 1998. Perspective Draw-
ing and Application. Prentice Hall.
SINGH, K. 2002. A fresh perspective. In Graphics Interface 2002, 17–24.
TOMLINSON, B., BLUMBERG, B., AND NAIN, D. 2000. Expressive au-
tonomous cinematography for interactive virtual environments. In Proc.
of the 4th International Conference on Autonomous Agents, ACM Press,
ZHANG, Z. 2000. A flexible new technique for camera calibration. In IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 22.