in: Proceedings of the Tenth IEEE International Conference on Computer Vision, pp. 1292-1299, Beijing, China,
October 15-21, 2005.
Squaring the Circle in Panoramas
1. Dept. of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
2. Informatik VII (Graphische Systeme), Universität Dortmund, Dortmund, Germany
Pictures taken by a rotating camera cover the viewing
sphere surrounding the center of rotation. Having a set of
images registered and blended on the sphere what is left to
be done, in order to obtain a ﬂat panorama, is projecting
the spherical image onto a picture plane. This step is unfor-
tunately not obvious – the surface of the sphere may not be
ﬂattened onto a page without some form of distortion. The
objective of this paper is discussing the difﬁculties and op-
portunities that are connected to the projection from view-
ing sphere to image plane. We ﬁrst explore a number of al-
ternatives to the commonly used linear perspective projec-
tion. These are ‘global’ projections and do not depend on
image content. We then show that multiple projections may
coexist successfully in the same mosaic: these projections
are chosen locally and depend on what is present in the pic-
tures. We show that such multi-view projections can pro-
duce more compelling results than the global projections.
1 Introduction
As we explore a scene we turn our eyes and head and cap-
ture images in a wide ﬁeld of view. For millennia painters
and (more recently) photographers have grappled with the
problem of creating pictures that render the visual impres-
sion of ‘being there’. Recent advances in storage, com-
putation and display technology have made it possible to
develop ‘virtual reality’ environments where the user feels
‘immersed’ in a virtual scene and can explore it by mov-
ing within it. However, the humble still picture, painted
or printed on a ﬂat surface, is still a popular medium: it
is inexpensive to reproduce, easy and convenient to carry,
store and display. Even more importantly, it has unrivaled
size, resolution and contrast. Furthermore, the advent of in-
expensive digital cameras, their seamless integration with
computers, and recent progress in detecting and matching
informative image features  together with the develop-
ment of good blending techniques [7, 5] have made it possi-
ble for any amateur photographer to produce automatically
mosaics of photographs covering very wide ﬁelds of view
and conveying the vivid visual impression of large panora-
mas, something that so far was the exclusive preserve of
the artist. Such mosaics are superior to panoramic pictures
taken with conventional ﬁsh-eye lenses in many respects:
they may span wider ﬁelds of view, they have unlimited
resolution, they make use of cheaper optics and they are not
restricted to the projection geometry imposed by the lens.
The geometry of single view point panoramas has long
been well understood [12, 21]. This has been used for mo-
saicing of video sequences (e.g., [13, 20]) as well as for ob-
taining super-resolution images (e.g., [6, 23]). By contrast
when the point of view changes the mosaic is ‘impossible’
unless the structure of the scene is very special. Let’s ex-
plore for a moment the ‘easy’ case, where all pictures share
the same center of projection C. If we consider the viewing
sphere, i.e. the unit sphere centered in C, we may identify
each pixel in each picture with the ray connecting C with
that pixel and passing through the surface of the viewing
sphere, as well as through the physical point in the scene
that is imaged by that pixel. By detecting and matching vi-
sual features in different images we may register automat-
ically the images with respect to each other. We may then
map every pixel of every image we collected to the
corresponding point of the viewing sphere and obtain a
spherical image that summarizes all our information on the scene.
This spherical image is the most natural representation: we
may represent this way a scene of arbitrary angular width
and if we place our head in C, the center of the sphere, we
may rotate it around and capture the same images as if we
were in the scene.
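The mapping from pixels to the viewing sphere can be sketched as follows. This is a minimal illustration, not the registration code used in the paper: it assumes an ideal pinhole camera with the principal point at the image centre, a focal length `focal_px` given in pixels, and an optional camera-to-world rotation `R`; all function and parameter names are hypothetical.

```python
import math

def pixel_to_ray(u, v, width, height, focal_px, R=None):
    """Map pixel (u, v) to a unit vector on the viewing sphere.

    Assumed convention: +x right, +y down, +z along the optical axis.
    R is an optional 3x3 rotation (nested lists) from camera to world.
    """
    x = u - (width - 1) / 2.0
    y = v - (height - 1) / 2.0
    d = (x, y, float(focal_px))
    if R is not None:
        d = tuple(sum(R[i][j] * d[j] for j in range(3)) for i in range(3))
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def ray_to_lonlat(ray):
    """Longitude and latitude (radians) of a unit ray on the sphere."""
    x, y, z = ray
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, -y)))  # image y grows downward
    return lon, lat
```

Every projection discussed later can then be written as a function of the longitude and latitude returned by `ray_to_lonlat`.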
What is left to be done, in order to obtain our panorama-
on-a-page, is projecting the spherical image onto a picture
plane. This step is unfortunately not obvious – the surface
of the sphere may not be ﬂattened onto a page without some
form of distortion. The choice of projection from the sphere
to the plane has been dealt with extensively by painters and
cartographers. An excellent review is provided in .
The best known projection is linear perspective (also
called ‘gnomonic’ and ‘rectilinear’). It may be obtained by
projecting the relevant points of the viewing sphere onto a
tangent plane, by means of rays emanating from the cen-
ter of the sphere C. Linear perspective became popular
amongst painters during the Renaissance. Brunelleschi is
credited with being the ﬁrst to use correct linear perspec-
tive. Alberti wrote the ﬁrst textbook on linear perspective
describing the main construction methods . It is believed
by many to be the only ‘correct’ projection because it maps
lines in 3D space to lines on the 2D image plane and because
when the picture is viewed from one special point, the
‘center of projection’ of the picture, the retinal image that is
obtained is the same as when observing the original scene.
A further, somewhat unexpected, virtue is that perspective
pictures look ‘correct’ even if the viewer moves away from
the center of projection, a very useful phenomenon called
‘robustness of perspective’ [18, 22].
Unfortunately, linear perspective has a number of draw-
backs. First of all: it may only represent scenes that are
at most 180° wide: as the field of view becomes wider,
the area of the tangent plane dedicated to representing one
degree of visual angle in the peripheral portion of the pic-
ture becomes very large compared to the center, and even-
tually becomes unbounded. Second, there is an even more
stringent limit to the size of the visual ﬁeld that may be
represented successfully using linear perspective: beyond
widths of 30°, architectural structures (parallelepipeds)
appear to be distorted, despite the fact that their edges are
straight [18, 14]. Furthermore, spheres that are not in the
center of the viewing ﬁeld project to ellipses onto the image
plane and appear unnatural and distorted  (see Fig 1). A
similar phenomenon affects cylinders. Renaissance painters
knew of these shortcomings and adopted a number of cor-
rective measures , some of which we will discuss later.
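Both drawbacks can be quantified with a short calculation. Under linear perspective a direction at eccentricity θ from the line of sight lands at radius tan θ on the tangent plane, so the local magnification is 1/cos² θ along the radial direction and 1/cos θ along the tangential direction. The sketch below simply tabulates these two factors; the function name is hypothetical, and the elongation it reports is the small-object approximation of how a sphere is stretched into an ellipse.

```python
import math

def perspective_scales(theta_deg):
    """Local magnification of linear perspective at eccentricity theta.

    Returns (radial, tangential) scale relative to the image centre.
    """
    theta = math.radians(theta_deg)
    radial = 1.0 / math.cos(theta) ** 2
    tangential = 1.0 / math.cos(theta)
    return radial, tangential

for theta_deg in (0, 15, 20, 45, 80):
    radial, tangential = perspective_scales(theta_deg)
    elongation = radial / tangential  # axis ratio of a small projected circle
    print(f"{theta_deg:2d} deg: radial x{radial:.2f}, "
          f"tangential x{tangential:.2f}, elongation {elongation:.2f}")
```

The radial factor grows without bound as θ approaches 90°, which is the unbounded growth described above, while the elongation is already noticeable at moderate eccentricities.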
The objective of this paper is discussing the difﬁculties
and opportunities that are connected to the projection from
viewing sphere to image plane, in the context of digital im-
age mosaics. We ﬁrst explore a number of alternatives to
linear perspective which were developed by painters and
cartographers. These are ‘global’ projections and do not
depend on image content. We explore experimentally the
tradeoffs of these projections: how they distort architecture
and people and how well they tolerate wide fields
of view. We then show that multiple projections may co-
exist successfully in the same mosaic: these projections are
chosen locally and depend on what is seen in the pictures
that form the mosaic. We conclude with a discussion of the
work that lies ahead.
In this paper we do not address issues of image regis-
tration and image blending and instead rely on the code by
Brown and Lowe [4, 2] for our experiments.
Figure 1: Perspective distortions. Left: Five photographs
of the same person taken by a rotating camera, after rec-
tiﬁcation (removing spherical lens distortion). Right: An
overlay of the ﬁve photographs after blackening everything
but the person’s face. This shows that spherical objects look
distorted under perspective projection even at mild viewing
angles. For example, in the above ﬁgure, the centers of the
faces in the corners are at ∼ 20°.
2 Global Projections
What are the alternatives to linear perspective?
An important drawback of linear perspective is the ex-
cessive scaling of sizes at high eccentricities. Consider
a painter taking measurements in the scene by using her
thumb and using these measurements to scale objects on
the canvas. She takes angular measurements in the scene
and translates them into linear measurements onto the can-
vas. This construction is called Postel projection . It
avoids the ‘explosion’ of sizes in the periphery of the pic-
ture. Along lines radiating from the point where the picture
plane touches the viewing sphere, it actually maps lengths
on the sphere to equal lengths in the image. Lines that run
orthogonal to those (i.e., concentric circles around the tan-
gent point) will be magniﬁed at higher eccentricities, but
much less than by linear perspective. The Postel projection
is close to the cartographic stereographic projection. The
stereographic projection is obtained by using the pole oppo-
site to the point of tangency as the center of projection.
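The three projections discussed so far differ only in how the eccentricity θ (the angle between a ray and the line of sight) is converted into a radius on the picture plane. As a minimal sketch, assuming a unit viewing sphere and a picture plane tangent at the +z direction, the radii are tan θ for linear perspective, θ for Postel, and 2 tan(θ/2) for the stereographic projection; the names below are hypothetical.

```python
import math

# Radius on the picture plane as a function of eccentricity theta (radians).
RADIAL = {
    "perspective": lambda t: math.tan(t),
    "postel": lambda t: t,
    "stereographic": lambda t: 2.0 * math.tan(t / 2.0),
}

def project_azimuthal(ray, kind):
    """Project a unit ray (x, y, z) onto the plane tangent at +z."""
    x, y, z = ray
    theta = math.acos(max(-1.0, min(1.0, z)))
    s = math.hypot(x, y)
    if s == 0.0:
        return 0.0, 0.0  # the tangent point itself
    r = RADIAL[kind](theta)
    return r * x / s, r * y / s
```

For any θ between 0 and 90° the stereographic radius lies between the Postel and perspective radii, which matches the statement that the two are close while both avoid the explosion of sizes in the periphery.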
Consider now the situation in which we wish to repre-
sent a very wide ﬁeld of view. A viewer contemplating a
wide panorama will rotate his head around a vertical axis in
order to take in the full view. Suppose now that the view
has been transformed into a ﬂat picture hanging on a wall
and consider a viewer exploring that picture: the viewer will
walk in front of the picture with a translatory motion that is
parallel to the wall. If we replace rotation around a vertical
axis with sideways translation in front of the picture we ob-
tain a family of projections which are popular with cartog-
raphers. Wrap a sheet of paper around the viewing sphere
forming a cylinder that touches the sphere at the equator.
One may project the meridians onto the cylinder by main-
taining lengths along vertical lines, thus obtaining the ge-
ographic projection. Alternatively, one may want to vary
locally the scale of the meridians so that they keep in
proportion with the parallels. This is the Mercator projection
(for mathematical definitions of these projections see ).
Figure 2: Spherical projections (panels: Perspective, Geographic, Mercator, Transverse Mercator, Stereographic).
Figures taken from Matlab’s help pages visualizing the distortions of various projections.
Grid lines correspond to longitude and latitude lines. Small circles are placed at regular intervals across the globe. After
projection, the small circles appear as ellipses (called Tissot indicatrices) of various sizes, elongations, and orientations.
The sizes and shapes of the ellipses reflect the projection distortions.
Figure 2 visualizes the properties of these projections. In
this visualization grid lines correspond to longitude and lat-
itude lines. When projecting images onto the sphere, verti-
cal lines are projected onto longitude lines. Horizontal lines
are not projected onto latitude lines but rather onto tilted
great circles, thus the visualization of the latitude lines does
not convey what happens to horizontal image lines. All of
these projections are global and are independent of the image content.
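The two cylindrical projections described above can be written directly in terms of longitude and latitude on the viewing sphere. The sketch below uses the standard spherical formulas with the cylinder touching the sphere at the equator; the function names are hypothetical.

```python
import math

def geographic(lon, lat):
    """Geographic projection: lengths along meridians are preserved."""
    return lon, lat

def mercator(lon, lat):
    """Mercator projection: the meridian scale is stretched by 1/cos(lat)
    so that it matches the stretch of the parallels."""
    return lon, math.asinh(math.tan(lat))
```

In both cases the horizontal coordinate equals the longitude, so a meridian (and hence a vertical line in the scene, once the equator is chosen correctly) maps to a vertical line in the picture.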
Figure 3 illustrates the above projections on a panorama
constructed of images taken at an indoor scene. This is a
typical example of panoramas of man-made environments
which usually contain many straight lines. Selecting from
the above projections implies bending either the horizon-
tal lines, the vertical lines, or both. In most cases a bet-
ter choice is to keep vertical lines straight as this results in
a panorama where narrow vertical slits look correct. This
matches the observations in , which shows that our perception
of a picture is affected by the fact that normally people
shift their gaze horizontally and rarely shift it vertically.
Shifting one’s gaze horizontally across a panorama looks
best when vertical lines are not bent. This motivates the
use of either the Geographic or the Mercator projections, as
both keep vertical lines straight. In both these projections
the rotation of the camera is transformed into sideways mo-
tion of the observer.
When the camera performs mostly pan motion, i.e.,
when the vertical angle is small, both projections produce
practically the same result. However, for larger tilt an-
gles the Geographic projection distorts circles, i.e., it does
not maintain correct proportions, while the Mercator does
maintain conformality, thus the Mercator projection is a better
option (see Figure 4). Note that conformality implies
that in the Mercator projection spherical and cylindrical
objects, such as people, are not distorted but the
background is; see for example Figure 8.
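The difference between the two projections at large tilt angles can be checked numerically by estimating the two axes of the Tissot indicatrix, i.e. the local scale along the parallel and along the meridian. The following is a sketch using finite differences; the helper `tissot_axes` and its step size are assumptions made for illustration only.

```python
import math

def geographic(lon, lat):
    return lon, lat

def mercator(lon, lat):
    return lon, math.asinh(math.tan(lat))

def tissot_axes(project, lat, eps=1e-6):
    """Return (k, h): local scale along the parallel and along the meridian."""
    x0, y0 = project(0.0, lat)
    x1, y1 = project(eps, lat)        # small step east, true length eps*cos(lat)
    x2, y2 = project(0.0, lat + eps)  # small step north, true length eps
    k = math.hypot(x1 - x0, y1 - y0) / (eps * math.cos(lat))
    h = math.hypot(x2 - x0, y2 - y0) / eps
    return k, h

for name, project in (("geographic", geographic), ("mercator", mercator)):
    k, h = tissot_axes(project, math.radians(60))
    print(f"{name}: parallel scale {k:.3f}, meridian scale {h:.3f}")
```

A projection is conformal where the two scales are equal. In the Geographic projection the parallel scale grows as 1/cos(latitude) while the meridian scale stays at 1, so circles become ellipses; in the Mercator projection both scales grow together, so small circles stay circular but become larger.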
An important issue in all cylindrical projections is the
choice of equator. Once the images are on the sphere one
can rotate the sphere in any desired way before projecting to
the plane. In other words, the cylinder wrapping the sphere
can touch the sphere along an equator of choice. When a
wrong equator is selected, vertical lines in 3D space will
not be projected onto vertical lines in the panorama (see left
panel of Figure 5). Finding the correct equator is easy. The
user is requested to mark a single vertical line and a horizon
point in one (or two) of the input images. The sphere is
then rotated so that projection of the marked vertical line
aligns with a longitude line and the equator goes through
the selected horizon point. This results in a straightened
panorama, see for example, right panel of Figure 5.
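One possible way to compute the straightening rotation from the two user marks is sketched below. It is our own reading of the procedure, not necessarily the authors' implementation: the marked vertical line is given by the rays through two of its points (`line_top`, `line_bottom`), and the horizon point by a single ray (`horizon`). The scene's up direction must lie in the plane spanned by the two line rays and be perpendicular to the horizon ray, which determines it up to sign. All names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def straightening_rotation(line_top, line_bottom, horizon):
    """Rows (right, up, forward) of a rotation into the straightened frame.

    Assumed output convention: +x right, +y up, +z toward the horizon point.
    """
    n = cross(line_top, line_bottom)      # normal of the great circle of the line
    up = normalize(cross(n, horizon))     # in that circle's plane, perpendicular to horizon
    if dot(up, tuple(a - b for a, b in zip(line_top, line_bottom))) < 0:
        up = tuple(-c for c in up)        # make 'up' point from bottom to top
    forward = normalize(horizon)
    right = cross(up, forward)
    return [right, up, forward]

def rotate(R, v):
    return tuple(dot(row, v) for row in R)
```

After applying `rotate(R, ray)` to every ray, the marked line lies on a single longitude line and the horizon point has latitude zero, so any of the cylindrical projections will render the marked line as vertical.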
Should other projections be considered? Yes, we think
so. The Transverse Mercator projection is known in
the mapping world as an excellent choice for mapping ar-
eas that are elongated north-to-south. This corresponds to
panoramas with little pan motion and large tilt motion. The
bending of vertical lines is small near the meridian, thus,
when the pan angle is small we are better off using the
Transverse Mercator projection which keeps the horizontal
lines straight. This is illustrated in Figures 4, 6.
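For reference, the spherical form of the Transverse Mercator projection with the central meridian at longitude zero can be sketched as follows. These are the standard cartographic formulas for a unit sphere, not code from the paper, and the function name is hypothetical.

```python
import math

def transverse_mercator(lon, lat):
    """Spherical Transverse Mercator, central meridian at lon = 0.

    The horizontal coordinate diverges as the direction approaches the
    left or right pole of the horizontal cylinder (lon = +/-90 deg, lat = 0).
    """
    b = math.cos(lat) * math.sin(lon)
    x = math.atanh(b)
    y = math.atan2(math.tan(lat), math.cos(lon))
    return x, y
```

Great circles through the left and right poles, which are the images of scene lines parallel to the horizontal axis, all map to a constant `y`. This is the sense in which horizontal lines stay straight, while vertical lines bend increasingly as the pan angle grows.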
For far away outdoor scenes almost any projection looks
good as the scenes rarely contain any straight lines.
Nevertheless, too much bending might disturb the eye even on
free form objects like clouds. This suggests using the
stereographic projection, which bends both vertical and
horizontal lines but less than the cylindrical projections.
3 Multi View Projection
The projections explored in Section 2 are ‘global’, in that
once a tangent point or a tangent line is chosen, the pro-
jection is completely determined by this parameter. This is
by no means a necessary property for a good projection. We
may instead tailor the projection locally to the content of the
images in order to improve the ﬁnal effect. We next explore
a few options for such multi-view projections.
Figure 3: Spherical projections (recovered panel titles: Perspective, Transverse Mercator). There are many spherical projections. Each has its pros and cons.
Figure 4: Preserving proportions. In the Geographic pro-
jection the circular pot at the bottom of the panorama is
distorted into an ellipse. In the Mercator projection this
does not happen.
3.1 Multi-Plane Perspective Projection
As was shown in Section 2, a global projection of wide
panoramas bends lines, which is unpleasant to the eye. To
obtain both a rectilinear appearance and a large field of view
we suggest using a multi-plane perspective projection. Such
multi-plane projections were suggested by Greene  for
rendering textured surfaces. Rather than projecting the
sphere onto a single plane, multiple tangent planes to the
sphere are used. Each projection is linear perspective. The
tangent planes have to be arranged so that they may be un-
folded into a ﬂat surface without distortion, e.g., the points
of tangency belong to a maximal circle. One may think
of the intersections of the tangent planes being ﬁtted with
hinges that allow ﬂattening. The projection onto each plane
is perspective and covers only a limited ﬁeld of view, thus it
is pleasant to the eye.
This process introduces large orientation discontinuities
at the intersection between the projection planes. However,
in many man-made environments these discontinuities will
not be noticed if they occur along natural discontinuities.
The tangent planes must therefore be chosen in a way that
ﬁts the geometry of the scene, e.g. so that the vertical edges
of a room project onto the seams and each projection plane
corresponds to a single wall. Orientation discontinuities
caused by the projection this way co-occur with orientation
discontinuities in the scene and therefore they are visually
unnoticeable (see Figures 3, 8, 6). Sometimes no seam may
be found that completely corresponds to discontinuities in
the scene: for example in Figure 9 the chair on the right
is clearly distorted. Another caveat is that some arrangements
will cause a loss in the impression of depth: for example,
when projecting a panorama of a standard room onto
a square prism (see left panel of Figure 7). Most often the
sensation of depth can be maintained by appropriate choice
of the projection planes (see right panel of Figure 7).
Figure 5: Choice of equator (panels: Mercator with wrong equator, Mercator with correct equator). Panoramas of the Pantheon. A wrong choice of the equator results in tilted vertical lines. The
columns on the right and left appear converging. Correcting the equator selection results in columns standing upright.
Figure 6: Vertical panoramas (panels: Perspective, Geographic, Mercator, Transverse Mercator, Multi-Plane). Left and right panels show results before and after cropping (see Section 4 for further details).
For wide angle panoramas, perspective cannot capture the full range, thus the photographer’s legs are excluded. Geographic
distorts proportions (see how squashed the legs look). Mercator stretches the legs across the bottom. Transverse Mercator
captures both the sculpture and the photographer, which suggests it is the best global projection option for narrow vertical
panoramas. Multi-Plane does even better.
We have currently implemented a simple user interface
to allow choosing the position of the multiple tangent
planes. We assume that the hinges between tangent
planes are associated with either vertical or horizontal lines:
the user is presented with the Geographic projection of the
panorama and clicks once anywhere on a single vertical line
to choose a seam and once again to choose the point of tan-
gency of each projection plane. Automating this operation
is an interesting exercise which we leave for the future.
3.2 Preserving Foreground Objects
The multi-plane perspective projection takes us back to the
second challenge presented in Section 1. Recall that even
for small ﬁelds of view nearby (foreground) objects are of-
ten perceived as distorted. Our solution to this problem
draws its inspiration from the Renaissance artists.
During the Renaissance the rules of perspective were
understood, and linear perspective was used to produce
pictures that had a realistic look. Painters noticed early on
that spheres and cylinders (and therefore people) would ap-
pear distorted if they were painted according to the rules of
a global perspective projection (a sphere will project to an
ellipse). It thus became common practice to paint people,
spheres and cylinders by using linear perspective centered
around each object (see for example The School of
Athens by Raphael [18, 14]). This results in paintings with
multiple view points. There is one global view point used
for the background and an additional view point for each object.
Renaissance paintings look good precisely because they
are constructed using a multiplicity of projections. Each
projection is chosen in order to minimize the apparent dis-
tortion of either the ambient architecture, or of a speciﬁc
person/object. We follow this example and adopt the multi-
view point approach to construct realistic looking panora-
mas. We ﬁrst separate the background and foreground ob-
jects. A panorama is constructed from the background by
using a global projection: perspective for ﬁelds of view that
are narrower than, say, 40° and Multi-Plane otherwise. The
foreground objects are projected using a ‘local’ perspec-
tive projection, with a central line of sight going through
the center of each object, and then they are pasted onto the
background. More in detail:
(1) Obtain a foreground-background segmentation for
each image and cut out the foreground objects [15, 19].
Currently we use the GIMP  implementation of
Intelligent Scissors , which requires manual interaction;
we found it to take less than a minute per image.
(2) Fill in the holes in the background caused by cutting
out the foreground objects using a texture propagation
technique (e.g., [8, 3]). We used our implementation of
. Note that the hole filling need not be perfect as most
of it will be covered eventually by the repasting of the
foreground objects. As we are most sensitive to people’s
distortions, one could acquire each picture containing a
person a second time, once the person moved. In that
case hole ﬁlling won’t be required.
(3) Construct a panorama of the filled background images.
(4) Overlay foreground objects on top of the background
panorama. For each foreground object, find its bounding
box in the original image and in the panorama if it were
projected along with the background. Rescale the cut-
out object to have the same height as its projection (note,
that the width will be different). Paste the object so that
the centers of the bounding boxes align.
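The geometry of step (4) can be sketched as follows, assuming axis-aligned bounding boxes given as (left, top, right, bottom) in pixels and a uniform rescaling of the cut-out, which is our reading of the remark that the width will differ from that of the projected object. The function name and return layout are hypothetical.

```python
def paste_geometry(orig_box, proj_box):
    """Placement of a foreground cut-out in the panorama.

    orig_box: bounding box of the cut-out in its source image.
    proj_box: bounding box the object would occupy in the panorama if it
              were projected along with the background.
    Returns (scale, (new_width, new_height), (left, top)).
    """
    orig_w = orig_box[2] - orig_box[0]
    orig_h = orig_box[3] - orig_box[1]
    proj_h = proj_box[3] - proj_box[1]
    scale = proj_h / orig_h                      # match heights only
    new_w, new_h = orig_w * scale, orig_h * scale
    cx = (proj_box[0] + proj_box[2]) / 2.0       # align box centers
    cy = (proj_box[1] + proj_box[3]) / 2.0
    return scale, (new_w, new_h), (cx - new_w / 2.0, cy - new_h / 2.0)
```

For example, a 100 by 200 pixel cut-out whose projection in the panorama would be 180 by 300 pixels is scaled by 1.5 to 150 by 300 pixels, keeping its original proportions rather than the stretched width of the projection.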
This process is illustrated in Figure 10. Five frames were
taken out of a video sequence showing a child walking from
right to left, while facing the camera. The child was cut-out
from each image, texture propagation was used to ﬁll in the
holes and a perspective panorama of the background was
constructed (see Figure 10 top). The cut-outs of the child
were then pasted onto the background in two ways. First,
we applied the same perspective projection used for the
background, which distorted the child’s head into a
variety of ellipsoidal shapes (see Figure 10 middle). Second,
we used the multi-view approach described above, which
produced a significantly better looking result, removing all the
head distortions (see Figure 10 bottom). Another example
is displayed in Figure 11 (for this example we had avail-
able clear background images so hole ﬁlling was not re-
quired). Figure 9 displays our full solution including both
multi-plane projection for the background and multi-view
projection to correct the chair in the foreground.
In all the experiments displayed in this paper the compu-
tation of the transformations of the input images to the
sphere was done using Matthew Brown’s Autostitch soft-
ware [4, 2].
When the images do not cover the full viewing sphere,
the boundaries of the panorama can have all sorts of shapes,
depending on the projection, e.g., see left panel of Fig-
ure 6. Thus, for visualization purposes, the panoramas were
cropped to display a complete rectangular portion. This re-
sults in different coverage areas for each projection. The
uncropped panoramas, as well as more results are provided
in the attached supplemental material.
5 Discussion & Conclusions
The challenge of constructing panoramas from images
taken from a single viewpoint goes beyond image matching
and image blending. The choice of the mapping between
the viewing sphere and the image plane is an interesting
problem in itself. Artists and cartographers have explored
this problem and have proposed a number of useful global
projections. Additionally, artists have developed a practice
to use multiple local projections which are guided by the
content of the images. Inspired by the artists we have pro-
posed a new set of projections which incorporate multiple
local projections with multiple view points into the same
panorama to produce more compelling results. Further
automating this process is a worthwhile challenge for machine
This research was supported by MURI award number
AS3318 and the Center of Neuromorphic Systems
Engineering award EEC-9402726. We also wish to acknowledge
our useful conversations with Pat Hanrahan, Jan Koenderink,
Marty Banks, Bill Freeman, Ged Ridgway and
David Lowe and to thank Matthew Brown for providing his
References
[1] Leon Battista Alberti. On Painting. First appeared 1435-36.
Translated with Introduction and Notes by John R. Spencer.
New Haven: Yale University Press. 1970.
[2] Autostitch. http://www.autostitch.net/.
[3] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image
inpainting. In Proceedings of SIGGRAPH, New Orleans,
USA, July 2000.
[4] M. Brown and D. Lowe. Recognising panoramas. In
Proceedings of the 9th International Conference on Computer
Vision, volume 2, pages 1218–1225, Nice, October 2003.
[5] P. J. Burt and Edward H. Adelson. A multiresolution spline
with application to image mosaics. ACM Trans. Graph.,
[6] D. Capel and A. Zisserman. Automatic mosaicing with
super-resolution zoom. In CVPR ’98: Proceedings of the
IEEE Computer Society Conference on Computer Vision
and Pattern Recognition, page 885. IEEE Computer Society,
[7] P. E. Debevec and J. Malik. Recovering high dynamic range
radiance maps from photographs. In Proceedings of
SIGGRAPH, August 1997.
[8] A.A. Efros and Thomas K. Leung. Texture synthesis by
non-parametric sampling. In IEEE International Conference on
Computer Vision, pages 1033–1038, Corfu, Greece, September
[9] A. Flocon and A. Barre. Curvilinear Perspective, From
Visual Space to the Constructed Image. University of
California Press, 1987.
[10] The GIMP. http://www.gimp.org/.
[11] N. Greene. Environment mapping and other applications of
world projections. IEEE Computer Graphics and
Applications, 6(11):21–29, November 1986.
[12] R. I. Hartley and A. Zisserman. Multiple View Geometry in
Computer Vision. Cambridge University Press, 2000.
[13] M. Irani, B. Rousso, and S. Peleg. Computing occluding
and transparent motions. Int. J. Comput. Vision, 12(1):5–16,
[14] M. Kubovy. The Psychology of Perspective and Renaissance
Art. Cambridge University Press, 1986.
[15] Y. Li, J. Sun, C.K. Tang, and H. Shum. Lazy snapping. In
Proceedings of SIGGRAPH, 2004.
[16] MathWorld. http://mathworld.wolfram.com/.
[17] E.N. Mortensen and W.A. Barrett. Intelligent scissors for
image composition. In SIGGRAPH ’95: Proceedings of the
22nd annual conference on Computer graphics and
interactive techniques, pages 191–198. ACM Press, 1995.
[18] M. H. Pirenne. Optics, Painting & Photography. Cambridge
University Press, 1970.
[19] C. Rother, V. Kolmogorov, and A. Blake. Grabcut -
interactive foreground extraction using iterated graph cuts. Proc.
ACM Siggraph, 2004.
[20] H. S. Sawhney and R. Kumar. True multi-image alignment
and its application to mosaicing and lens distortion
correction. IEEE Trans. Pattern Anal. Mach. Intell., 21(3):235–
[21] R. Szeliski and H. Shum. Creating full view panoramic
image mosaics and environment maps. Computer Graphics,
31(Annual Conference Series):251–258, 1997.
[22] D. Vishwanath, A. R. Girshick, and M. S. Banks. Why
pictures look right when viewed from the wrong place.
Personal communication. (Manuscript accepted for publication).
[23] A. Zomet and S. Peleg. Applying super-resolution to
panoramic mosaics. In WACV ’98: Proceedings of the
4th IEEE Workshop on Applications of Computer Vision
(WACV’98), page 286. IEEE Computer Society, 1998.
Figure 7: Multi-Plane projection (panels: Straight Projection, Oblique Projection). In each panel the top figure displays the geographic projection and the interaction
required by the user - definition of the intersection lines between the tangent planes (marked in blue) and the center of
projection for each tangent plane (marked in green and red). The middle panel displays a top view of the projection. The
bottom panel displays the final result.
Figure 8: Architecture vs. spherical objects. The perspective projection distorts people at large viewing angles. The
Mercator projection keeps the people undistorted, but distorts the wall and white-board at the background. The Multi-Plane
projection provides the most compelling result with no noticeable distortions in both background and people.
Figure 10: Correcting perspective distortions. Top:
Panorama of the background only. Artifacts in the hole
ﬁlling are visible, but are inessential as they will be even-
tually covered by the foreground object. Center: A global
perspective projection of both background and foreground.
The child’s head appears distorted. Bottom: A multi-view
point panorama providing the most compelling look with no
Figure 9: Multi-Plane Multi-View (panels: Mercator, Multi-Plane, Multi-Plane Multi-View). The multi-plane projection rectified the background but the chair on the right got
distorted. Using the Multi-View approach the chair is undistorted.
Figure 11: Correcting perspective distortions. In the Perspective panorama the person’s head is highly distorted. A
Multi-view panorama provides a more compelling look, removing all distortions.