Feature Article

Digital Route Panoramas
Jiang Yu Zheng
Indiana University–Purdue University, Indianapolis
Route panorama is a new image medium for digitally archiving and visualizing scenes along a route. It's suitable for registration, transmission, and visualization of route scenes. This article explores route panorama projection, resolution, and shape distortion. It also discusses how to improve quality, achieve real-time transmission, and display long route panoramas on the Internet.
In ancient western art, compositions showed the perspective of a scene's projection toward a 2D view plane. In contrast, an ancient oriental painting technique exposed scenes of interest at different locations in a large field of view, which let viewers explore different segments by moving their focus. A typical and famous example is a long painting scroll named A Cathay City (see Figure 1), painted 900 years ago. It recorded the prosperity of the capital city of ancient China in the Song Dynasty by drawing scenes and events along a route from a suburb to the inner city.

Figure 1. A Cathay City painted on an 11 m scroll in the Song Dynasty 900 years ago. (From the collection of the National Palace Museum, Taipei, Taiwan, China.)
The invention of paper in ancient China allowed most paintings to be created on foldable and scrollable paper instead of canvas. One benefit of this scrollable style is that a path-oriented projection can display more detailed visual information in an extended visual field than a single-focus perspective projection at one end of a street or a bird's-eye view from the top.
Today we can realize a composition approach similar to A Cathay City by creating a new image medium—the route panorama, an image that contains an entire scene sequence along a route. We generate long route panoramas by using digital image-processing techniques and render route views continuously on the Internet. We can now capture, register, and display route panoramas many miles long, taken from vehicles, trains, or ships. We can use this approach for many practical purposes, including profiles of cities for visitors or introductory indexes of local hometowns. We could probably even use it as part of something like an enhanced version of Mapquest for people navigating through large cities like Los Angeles or Beijing. Figure 2 shows an example of a route panorama from a one-hour video in Venice, a place where people often get lost.

Figure 2. Segment of route panorama recording canal scenes in Venice.
Route scenes on the Internet
There are many things to consider when creating a quality route panorama. To begin with, we create a route panorama by scanning scenes continuously with a virtual slit in the camera frame to form image memories. We call it a virtual slit because it isn't an actual lens cover with a slit in it; rather, for each camera image in the video sequence, we copy a vertical pixel line at a fixed position.
We paste these consecutive slit views (or image memories) together to form a long, seamless 2D image belt. We can then transmit the 2D image belt via the Internet, enabling end users to easily scroll back and forth along a route. The process of capturing a route panorama is as simple as recording a video on a moving vehicle. We can create the route panorama in real time with a portable PC inside the vehicle, or by replaying a recorded video taken during the vehicle's movement and inputting it into a computer later. Nevertheless, the generated image belt with its consecutive slit views pieced together has much less data than a continuous video sequence.
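To make this copy-and-paste scanning concrete, here is a minimal sketch using Python with OpenCV and NumPy. The fixed slit column, the file names, and the sideways camera mounting are illustrative assumptions rather than details taken from the article.

```python
import cv2
import numpy as np

def build_route_panorama(video_path, slit_x=None, out_path="route_panorama.png"):
    """Copy one vertical pixel line (the 'virtual slit') from every frame and
    paste the lines side by side to form the 2D image belt."""
    cap = cv2.VideoCapture(video_path)
    columns = []
    while True:
        ok, frame = cap.read()                     # H x W x 3 (BGR) frame
        if not ok:
            break
        x = frame.shape[1] // 2 if slit_x is None else slit_x
        columns.append(frame[:, x, :].copy())      # one H x 3 slit view
    cap.release()
    if not columns:
        raise ValueError("no frames decoded from " + video_path)
    panorama = np.stack(columns, axis=1)           # H x T x 3 image belt
    cv2.imwrite(out_path, panorama)
    return panorama

# Example (hypothetical file): a one-hour, 25-fps video yields a belt
# roughly 90,000 columns long.
# pano = build_route_panorama("venice.avi", slit_x=160)
```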
My colleagues and I first invented the route panorama 10 years ago for mobile robot navigation [1-3]. We called it a generalized panoramic view because we discovered it while creating the first digital panoramic view. In this early study, we mounted a camera on a moving vehicle, and it constantly captured slit views orthogonal to the moving direction. The route panorama is a special case of a more general representation called a dynamic projection image [4], which forms the 2D image using a nonfixed, temporal projection. Figure 3 illustrates how a vertical slit can scan the street scenes displayed in Figure 2 when the camera is positioned sideways along a smooth path. This viewing scheme is an orthogonal-perspective projection of scenes—orthogonal toward the camera path and perspective along the vertical slit. Generally speaking, common modes of transportation such as a four-wheeled vehicle, ship, train, or airplane can provide a smooth path for the camera.

Figure 3. Route scene scanning by constantly cutting a vertical pixel line in the image frame and pasting it into another continuous image memory when the camera moves along a smooth path on the horizontal plane.
Compared with existing approaches that model a route using graphics models [5,6], our route panorama has an advantage in capturing scenes. It doesn't require taking discrete images by manual operation or texture mapping onto geometry models. A route panorama can be ready after driving a vehicle around town for a while. It yields a continuous image scroll that other image stitching or mosaicing approaches [7-9] cannot, in principle, produce. A mosaicing approach works well for stitching images taken by a camera rotating at a static position. For a translating camera, however, scenes at different depths have different disparities (or optical flow vectors) in consecutive images. Overlapping scenes at one depth will add layers and create dissonant scenes at other depths, much like overlapping stereo images.
A route panorama requires only a small fraction of the data of a video sequence and has a continuous format when accessed. If we pile a sequence of video images together along the time axis, we obtain a 3D data volume called the spatial–temporal volume that's full of pixels (see Figure 4). The route panorama comprises pixel lines in consecutive image frames, which correspond to a 2D data sheet in the spatial–temporal volume. Ideally, if the image frame has a width w (in pixels), a route panorama has only 1/w of the data size of the entire video sequence (w is typically 200 to 300), since we extract only one pixel line from each frame when viewing through the slit. The route panorama neglects redundant scenes in the consecutive video frames, which we can observe from the traces of patterns in the epipolar plane images (Figure 4 shows one EPI). This shows a promising property of the route panorama as an Internet medium: it can deliver a large amount of information with little data. The missing scenes are objects under occlusion when exposed to the slit. Also, like any 2D image capturing dynamic objects, a route panorama only freezes instantaneous slit views rather than capturing object movements as video does.

Figure 4. Data size of a route panorama is one sheet out of a volume of video images.
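As a rough, illustrative check of the 1/w figure, the frame size, frame rate, and duration below are assumed values, not numbers from the article.

```python
# Back-of-the-envelope data-size comparison for a 10-minute capture.
fps, minutes = 25, 10
frames = fps * minutes * 60                  # 15,000 frames
w, h, channels = 320, 240, 3                 # small DV-style frame

video_bytes = frames * w * h * channels      # ~3.5 GB uncompressed
panorama_bytes = frames * 1 * h * channels   # one pixel line per frame: ~10.8 MB

print(video_bytes / panorama_bytes)          # equals w = 320, i.e. the 1/w ratio
```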
Figure 5 is another example of a route panorama. It displays a segment of the First Main Street of China. We used a small digital video camera on a double-decker bus to record the 5-km route panorama. No image device other than an image grabber is required in processing.

Figure 5. A segment of route panorama generated from a video taken on a bus (before removing shaking components).
Projecting scenes
Compared with existing media such as photos, video, and panoramic views, the route panorama has its own projection properties. For instance, much can depend on the smoothness of the ride while the camera is recording. In an ideal situation, roads are flat and the vehicle has a solid suspension, a relatively long wheelbase, and travels at a slow speed. Under these conditions, the camera has less up-and-down translation and can traverse a smooth path along a street.
In this article, we assume the camera axis is set horizontally for simplicity. We define a path-oriented coordinate system SYD for the projection of scenes toward a smooth camera trace S on the horizontal plane. We denote a point in such a coordinate system as P(S, Y, D), where S is the length the camera has traveled from a local starting point (see Figure 3b), Y is the height of the point above the plane containing the camera path, and D is the depth of the point from the camera focus. Assume t and y are the horizontal and vertical axes of the route panorama, respectively. The projection of P in the route panorama, denoted by p(t, y), is

t = S / r,   y = fY / D   (1)

where r (m/frame) is the slit sampling interval on the camera trace, equal to the vehicle's speed V (m/s) divided by the camera's frame rate (frames per second). The camera's focal length f (in pixels) is known in advance after calibration.
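Equation 1 is straightforward to transcribe; the speed, frame rate, and focal length in this sketch are made-up example values.

```python
def project_to_route_panorama(S, Y, D, V=10.0, fps=25.0, f=300.0):
    """Equation 1: t = S / r, y = f * Y / D, with r = V / fps (meters per frame)."""
    r = V / fps                 # slit sampling interval along the camera trace
    t = S / r                   # horizontal (path) coordinate, in slit columns
    y = f * Y / D               # vertical coordinate, perspective along the slit
    return t, y

# A point 30 m along the path, 5 m high, 20 m deep from the camera path:
# print(project_to_route_panorama(30.0, 5.0, 20.0))   # -> (75.0, 75.0)
```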
Because the route panorama employs an orthogonal-perspective projection, the aspect ratio of an object depends on its distance from the path. Figure 6 displays a comparison of views between an ordinary perspective image and a route panorama. The farther the object, the lower the object's height in the route panorama. Object widths in the route panorama are proportional to their real widths facing the street. In this sense, distant objects are extended horizontally in the route panorama. Thus, route panoramas aren't likely to miss large architectures. Small objects such as trees, poles, and signboards are usually narrower than buildings along the t axis and might disappear after squeezing in the t direction, while buildings won't disappear. In a normal perspective projection image, a small tree close to the camera may occasionally occlude a large building behind it.

Figure 6. 2D projections of 3D objects in a route panorama compared with a perspective projection image. W stands for the object width, H is object height, and D is object depth. (a) Ordinary perspective projection. (b) Orthogonal-perspective projection. (c) A typical object in perspective projection and (d) in an orthogonal-perspective projection image.
We examine several sets of structure lines typically appearing on 3D architectures (see Figures 6c and 6d) and find their shapes in the route panorama from a linear path. Assuming a linear vector V = (a, b, c) in the global coordinate system with its X axis parallel to the camera path, the line sets are

- L1 {V | c = 0}: lines on vertical planes parallel to the camera path. These lines may appear on the front walls of buildings.
- L2 {V | a = c = 0} ⊂ L1: vertical lines in the 3D space. These lines are vertical rims on architectures.
- L3 {V | c ≠ 0}: lines stretching in depth from the camera path.
- L4 {V | c ≠ 0, b = 0} ⊂ L3: horizontal 3D lines nonparallel to the camera path.
Obviously, lines in L2 are projected as vertical lines in the route panorama through the vertical slit. Denote two points P1(S1, Y1, D1) and P2(S2, Y2, D2) on a line, where P1P2 ∥ V; their projections in the route panorama are p1 and p2. The projection of the line is then

v = p2 − p1 = ( (S2 − S1)/r,  fY2/D2 − fY1/D1 )

according to the projection model in Equation 1. For a line in L1, where D2 − D1 = 0 (constant depth D) and ΔY/ΔS equals the constant b/a, writing P2 = P1 + τV gives

v = ( aτ/r,  fbτ/D ) = τ ( a/r,  fb/D ),

which is linear in the route panorama. Therefore, a line in L1 is still projected as a line in the route panorama for a linear path.
The most significant difference from perspective projection is a curving effect on line set L3 (Figure 6d). For a line in L3, its projection in the route panorama is a curve, because point P2 = P1 + τV projects to

p2(t, y) = ( (S1 + aτ)/r,  f(Y1 + bτ)/(D1 + cτ) ),

which is a hyperbolic function of τ. This curve approaches a horizontal asymptotic line y = fb/c from p1 as τ → ∞. In particular, for lines in L4, their projections are curves approaching the projection of the horizon (y = 0) in the route panorama.
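The curving effect is easy to verify numerically. This sketch, with arbitrary example numbers, samples a line of direction (a, b, c) with c ≠ 0, projects it with Equation 1, and shows y approaching the asymptote fb/c.

```python
import numpy as np

def project_line_L3(P1, V, f=300.0, r=0.4, taus=np.linspace(0.0, 200.0, 9)):
    """Project points P1 + tau*V (V = (a, b, c), c != 0) into the route panorama."""
    S1, Y1, D1 = P1
    a, b, c = V
    t = (S1 + a * taus) / r
    y = f * (Y1 + b * taus) / (D1 + c * taus)
    return t, y

t, y = project_line_L3(P1=(0.0, 2.0, 10.0), V=(1.0, 0.5, 1.0))
print(y)                     # rises from 60 toward the asymptote
print(300.0 * 0.5 / 1.0)     # f*b/c = 150, shared by all parallel L3 lines
```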
The path curvature also affects the lines' curving effect if the camera moves on a curved trace. Because of space limitations, we omit the analysis here. Nevertheless, we can obtain reasonably good visual indexes of route scenes as long as we allow for smooth curves in the route panorama.
Another interesting property of the route panorama is the common asymptote for a set of parallel lines stretching in depth (parallel lines in L3). Under perspective projection, parallel lines with a depth change in 3D space project onto the image plane as nonparallel lines, and their extensions on the image plane cross at a common point called the vanishing point (according to the principle in computer vision). In the route panorama obtained from a linear camera path, however, a set of 3D parallel lines stretching in depth has a common asymptotic line. This is because parallel lines in L3 have the same direction (a, b, c), and their projections in the route panorama all approach the same horizontal asymptotic line y = fb/c as τ → ∞.
If we fix the camera axis so that the plane of sight through the slit isn't perpendicular to the camera path, we obtain a parallel-perspective projection along a linear path, because all the perspective planes of sight are parallel. We can further extend this to a bended-parallel-perspective projection when the camera moves along a curved path. Most properties of the orthogonal-perspective projection extend similarly.
Stationary image blur and close object filtering

When we're actually recording a route panorama, we obtain slit views by cutting a pixel line in the image frame of a video camera. Every slit view itself is therefore a narrow perspective projection. The slit's sampling rate is limited by the video frame rate. If the vehicle isn't moving slowly enough, this shows up in the route panorama, because the panorama is actually the connection of narrow perspective projections at discrete positions along the route (as Figure 7 depicts). Scenes contained in a narrow wedge are projected onto the one-pixel line at each time instance. We examine surfaces that can appear in three depth ranges from the camera path. First, at the depth where surface 2 in Figure 7 is located, each part of the surface is taken into consecutive slit views without overlapping, just as in a normal perspective projection. We call this the just-sampling range (depth).
Second, for a surface closer than the just-sampling range, the consecutive slit views don't cover every fine part of the surface (surface 3 in Figure 7). Surfaces in this range are undersampled in the route panorama. If the spatial frequency of the intensity distribution on the surface is low—that is, the surface has relatively homogeneous intensity—we can recover the original intensity distribution from the sampled slit views (the route panorama), according to the Nyquist theorem in digital signal processing. Otherwise, we may lose some details within that range. Therefore, the route panorama has the effect of filtering out close objects such as trees, poles, people, and so forth. By reducing the camera's sampling rate, this filtering effect becomes clearer and more distinct. This is helpful when we're mainly interested in the architectures along a street.
Third, if a surface is farther than the just-sampling range, the camera oversamples the surface points. Because a slit view accumulates intensities in its narrow perspective wedge, a point on surface 1 may be counted in the overlapping wedges of consecutive slit views. Therefore, a distant object point that stays at roughly the same image position may cause a horizontal blur in the route panorama. We call this phenomenon stationary blur, since it's the converse of the motion blur effect in a dynamic image, where a fast translating point wipes across several pixels during the image exposure.

Figure 7. Oversampling range, just-sampling depth, and undersampling range of the route panorama.
We can give the just-sampling depth's numerical computation in a more general form to include bended parallel-perspective projection. If we set a global coordinate system O-XYZ, we can describe the smooth camera path by S[X(t), Z(t)]. If the vehicle is moving on a straight lane without obvious turns, the camera path has almost zero curvature (κ ≅ 0). Selecting a camera observing side, we can divide a path roughly into linear, concave, or convex segments, depending on the sign of curvature. For simplicity, we assume the radius of curvature R(t) of a curved camera path is constant between two consecutive sampling positions S1 and S2, where R(t) > 0 for a concave path, R(t) < 0 for a convex path, and R(t) = ∞ for a linear path. The curve length between S1 and S2 is r. The wedge's angle is 2θ, where f tan θ = 1/2, because we cut one pixel as the slit width (see Figure 8).
For bended parallel-perspective projection, the plane of sight—which is the wedge's central plane—has an angle α from the camera translation direction, that is, the curve's tangent vector. The two consecutive wedges of thin perspective projection meet at a vertical line through point Pj in Figure 8. On the horizontal plane we have the vector relation S1Pj = S1S2 + S2Pj in triangle S1S2Pj. Using the sine theorem, we then obtain

|S1Pj| = 2R(t) sin(r/(2R(t))) · sin(α + θ + r/(2R(t))) / sin(2θ + r/R(t)),
|S2Pj| = 2R(t) sin(r/(2R(t))) · sin(α − θ − r/(2R(t))) / sin(2θ + r/R(t)).

It isn't difficult to further calculate the just-sampling depth Dj in the plane of sight to the just-sampled surface; Figure 9 shows the resulting equation.

It's important to note that the just-sampling range relies not only on the camera's sampling rate, the vehicle speed, the image resolution, and the camera's focal length, but also on the camera path's curvature. The just-sampling range tends to be close when the camera moves on a concave path and far on a linear path. When the camera path changes from linear to convex (R(t) varies from −∞ to 0), Dj extends to infinity (Dj → +∞) and then starts to yield negative values (Pj flips to the path's other side). This means that the consecutive perspective wedges won't intersect when the convex path reaches a high curvature, and the entire depth range toward infinity is undersampled.

For a simple orthogonal-perspective projection along a linear path, we can simplify the equation in Figure 9 to

Dj = r / (2 tan θ)

by setting α = π/2 and R(t) → ∞. Overall, there are differently sampled regions in a route panorama depending on the subjects' depths. We can select a sampling rate so that the just-sampling range falls approximately at the front surfaces of the buildings of interest.

Figure 8. Just-sampling range for a curved camera path. (a) Curved camera path and a physical sampling wedge. (b) Two consecutive slit views and the just-sampling depth.

Figure 9. Calculating the just-sampling depth Dj in the plane of sight to the just-sampled surface.
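The simplified linear-path case is easy to evaluate. In this sketch, the vehicle speed, frame rate, and focal length are assumed example values; with f tan θ = 1/2, the formula reduces to Dj = r·f.

```python
def just_sampling_depth_linear(V=10.0, fps=25.0, f=300.0):
    """Linear path, orthogonal slit: D_j = r / (2 tan(theta)), where
    f * tan(theta) = 1/2 because the slit is one pixel wide."""
    r = V / fps                     # slit sampling interval (m/frame)
    tan_theta = 1.0 / (2.0 * f)     # half-angle of the one-pixel wedge
    return r / (2.0 * tan_theta)    # equals r * f

# At 10 m/s, 25 fps, and f = 300 pixels, D_j = 0.4 * 300 = 120 m.
# Halving the speed (or doubling the frame rate) halves D_j, pulling the
# just-sampling range in toward nearby facades.
print(just_sampling_depth_linear())
```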
The collection of slit views is a process of smoothing the spatial intensity distribution: the output value at a point is the average of intensities over a neighborhood around it. We can estimate the degree of stationary blur as follows. At depth D, the width of the perspective wedge is W = D tan θ. Averaging the color distribution at depth D over W to produce a pixel value for the slit view is the convolution between the intensity distribution and a rectangular pulse function with width W and height 1/W. If we set a standard test pattern at depth D that's a step edge with unit contrast or sharpness, we can easily verify that the convolved result is a ramp spread over W, with the sharpness reduced to 1/(D tan θ). Therefore, an edge's sharpness is inversely proportional to its depth. This is important when estimating objects' depth in a route panorama.
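The stationary-blur model is a box-filter convolution, which the following sketch applies to a unit-contrast step edge; the sampling grid and widths are assumptions for the demonstration, and the transition region widens roughly in proportion to W = D tan θ.

```python
import numpy as np

def blur_step_edge(W):
    """Convolve a unit step edge with a rectangular pulse of width W, height 1/W."""
    edge = np.concatenate([np.zeros(200), np.ones(200)])   # unit-contrast step
    box = np.ones(W) / W                                    # averaging wedge
    return np.convolve(edge, box, mode="same")

for W in (2, 8, 32):              # W grows with the depth D of the surface
    blurred = blur_step_edge(W)
    ramp = np.count_nonzero((blurred > 0.01) & (blurred < 0.99))
    print(W, ramp)                # the transition widens roughly with W
```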
If a segment of route panorama mainly contains objects far away, we can squeeze it along the t axis to reduce the stationary blurring effect and, at the same time, reduce shape distortion. This scaling may visually improve the objects' appearance in the route panorama if there's no requirement to keep the exact horizontal scale or resolution of the route panorama.
Dealing with camera shaking
Improving image quality is a crucial step toward the real application of route panoramas in multimedia and the Internet. In Figure 5, we pressed a small video camera firmly against a window frame of the bus to avoid uncontrolled, accidental camera movement. We can still observe severe zigzags on horizontal structural lines in the route panorama. This is because the camera shook when the vehicle moved over an uneven road. To cope with the camera shaking, some have tried to compensate by using a gyroscope. However, adding special devices might make the technology harder to spread. Our approach was to develop an algorithm that rectifies the distortion according to constraints from scenes and motion.
We can describe the camera movement at any instance by three degrees of translation and three degrees of rotation (as Figure 10 displays). Among them, the camera pitch caused by left-and-right sway and the up-and-down translation caused by the vehicle shaking on uneven roads are the most significant components affecting the image quality. The latter only yields a small up-and-down optical flow in the image if the vehicle bumping is less than several inches.

Figure 10. Vehicle and camera model in taking a route panorama. Note that (Tx, Ty, Tz, Rx, Ry, Rz) = (forward translation, up-and-down translation, translation sideways, pitch, pan, roll), respectively. Translation sideways doesn't occur for a vehicle movement.
Overall, camera pitch influences image quality the most, and luckily we can compensate for it with an algorithm that reduces camera shaking (many algorithms along these lines exist [10,11]). Most of these algorithms work by detecting a dominant motion component between consecutive frames in the image sequence. For a route panorama, however, we only need to deal with shaking components between consecutive slit lines. The idea is to filter the horizontal zigzagged lines in the route panorama to make them smooth, which involves spatial processing. We do this by using two criteria: smooth structural lines in the 3D space should also be smooth in a route panorama, and vertical camera shaking (in pitch) joggles the slit view instantly in the opposite direction.
As Figure 11 illustrates, we estimate the camera motion from the joggled curves in the route panorama and then align vertical pixel lines accordingly to recover the original straight structural lines. The way to find an instant camera shake is to check whether all the straight lines joggle simultaneously at that position. We track line segments horizontally in the route panorama after edge detection and calculate their consecutive vertical deviations along the t axis. At each t position, we use a median filter to obtain the common vertical deviation of all lines, which yields the camera's shaking component. The median filter prevents taking an original curve in the scene as a camera-joggled line and producing a wrong pitch value. After obtaining the sequence of camera parameters along the t axis, we shift a window along the horizontal axis and apply another median filter to the camera sequence in the window, which eliminates disturbances from abrupt vehicle shaking.
Suppose the original structure lines in the scenes are horizontal in an ideal route panorama (Figure 11a). The camera, however, shakes vertically over time (Figure 11b), which joggles the structure lines inversely in the captured route panorama (Figure 11c). By shifting all the vertical pixel lines according to the estimated camera motion sequence, we can align the curved lines properly to make horizontal structure lines smooth in the route panorama (Figure 11d). Figure 12 shows the route panorama of Figure 5 after removing the camera shakes in pitch. This algorithm is good at removing small zigzags on the structure lines to produce smooth curves. Once we apply the algorithm, it's easy to modify the route panorama to make major lines smooth and straight.

Figure 11. Recovering straight structure lines from camera-joggled curves in a route panorama.

Figure 12. Recovering smooth structure lines in the route panorama by removing the camera-shaking components.
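A compact sketch of one plausible reading of this pitch-compensation scheme: take the row positions of tracked horizontal structure lines, separate their common high-frequency wobble with median filters, and shift each pixel column to cancel it. The article's actual edge tracker and filter ordering differ in detail; the array shapes and window sizes here are assumptions.

```python
import numpy as np

def sliding_median(x, window=31):
    """Median-filter a 1D sequence with edge padding."""
    pad = window // 2
    xp = np.pad(x, pad, mode="edge")
    return np.array([np.median(xp[i:i + window]) for i in range(x.size)])

def estimate_pitch(track_rows, window=31):
    """track_rows: L x T array, row position of each tracked horizontal
    structure line in every column t of the route panorama.  The common
    high-frequency wobble shared by all lines is taken as camera pitch."""
    smooth = np.array([sliding_median(line, window) for line in track_rows])
    residuals = track_rows - smooth            # per-line joggle around its trend
    return np.median(residuals, axis=0)        # median vote across lines at each t

def compensate(panorama, pitch):
    """Shift every column opposite to the estimated joggle (wrap-around ignored)."""
    out = np.empty_like(panorama)
    for t in range(panorama.shape[1]):
        out[:, t] = np.roll(panorama[:, t], -int(round(pitch[t])), axis=0)
    return out
```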
Real-time transmission and display
Our next step is to transmit a long route panorama on the Internet and to seamlessly scroll it back and forth. Displaying and streaming route panoramas gives users the freedom to easily maneuver along the route.
We developed three kinds of route panorama displays to augment a virtual tour. The first type is a long, opened form of route panorama (see Figure 2). The second type is a side-window view (see Figure 13) that continuously reveals a section of the route panorama back and forth. We call it a route image scroll. You can control the direction of movement and the scrolling speed with a mouse. The third type is a forward view of the vehicle for street traversing; we call it a traversing window (see Figure 14).

We can combine these different displays in various ways. In rendering a traversing window, we map both side-route panoramas onto two sidewalls along the street and then project them to a local panoramic view (a cylindrical image frame). We then display the opened 2D form of this panoramic view so that users can observe the street stretching forward as well as the architectures passing by while traversing the route. We render the traversing window continuously according to the moving speed specified by the mouse. Although the traversing window isn't a true 3D display, major portions in the traversing window have an optical flow that resembles real 3D scenes. As another form of use, it's even possible to display these pseudo-3D routes within a car's navigation system.

Figure 13. Real-time streaming data transmission over the Internet to show route scenes.

Figure 14. The traversing window dynamically displays two sides of route panoramas in an open, cylindrical panoramic view for virtual navigation along a street.
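The sidewall-to-cylinder mapping can be sketched as an inverse warp: for each azimuth in the output view, intersect the viewing ray with a wall at an assumed distance B and resample the route panorama there. The geometry, parameter names, and nearest-neighbor sampling below are simplifying assumptions, not the article's renderer (which was written in Java); only the right-side wall is shown, and the left side is symmetric.

```python
import numpy as np

def traversing_view_right(rp, s0, r, f_rp, B,
                          out_w=360, out_h=240, f_c=120.0, horizon=None):
    """Warp the right-side route panorama rp (H x T, grayscale or H x T x 3)
    onto a wall at lateral distance B (meters) and resample it into the right
    half of an opened cylindrical view centered at path position s0 (meters).
    r    : slit sampling interval (meters per panorama column)
    f_rp : focal length (pixels) used when the panorama was captured
    f_c  : focal length of the synthetic cylindrical view."""
    H, T = rp.shape[:2]
    horizon = H // 2 if horizon is None else horizon
    out = np.zeros((out_h, out_w) + rp.shape[2:], dtype=rp.dtype)

    phis = np.linspace(0.05, np.pi - 0.05, out_w)    # azimuth: forward -> backward
    for j, phi in enumerate(phis):
        z = B / np.tan(phi)                          # wall point ahead/behind camera
        col = int(round((s0 + z) / r))               # panorama column hit by this ray
        if col < 0 or col >= T:
            continue
        scale = (f_rp / f_c) / np.sin(phi)           # vertical magnification of the wall
        dv = np.arange(out_h) - out_h // 2
        rows = np.clip(np.round(horizon + dv * scale).astype(int), 0, H - 1)
        out[:, j] = rp[rows, col]
    return out
```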
As a route panorama extends to several miles, it's unwise to download the whole image and then display it. We developed a streaming data transmission function in Java that can display route image scrolls and traversing windows during download. Because of the route panoramas' small data sizes, we achieved much faster transmission of street views than video.
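The article's streaming viewer was implemented in Java; purely as an illustration of the idea in the same Python used above, this sketch slices the image belt into fixed-width tiles delivered in path order, so a viewer can begin scrolling before the whole belt arrives. The tile width is an arbitrary choice.

```python
def panorama_tiles(panorama, tile_width=512):
    """Yield (start_column, tile) pairs in path order so a viewer can start
    scrolling as soon as the first tiles have been transmitted."""
    T = panorama.shape[1]
    for start in range(0, T, tile_width):
        yield start, panorama[:, start:start + tile_width]

# A client-side buffer only needs the tiles covering the visible window plus a
# small look-ahead margin, keeping memory constant for arbitrarily long routes.
```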
The image belt provides a visual profile of a long street thanks to its compactness. By adding clickable regions in the image, the route panorama becomes a flexible visual index of routes for Web page linking. On the other hand, we can automatically scroll a route panorama in a small window to give viewers the feeling that they're viewing architectures and shops from a sightseeing bus. With two cameras pointing to the left and right sides of the vehicle, we can establish two side views of a route by synchronizing the route panoramas. If we drive vehicles along every street in a town to capture all the route panoramas, we can generate a visual map of the town for virtual tourism on the Internet.
With the tools discussed here, we can register and visualize an urban area using panoramic views, route panoramas, route image scrolls, and maps. All these images have much less data than video and look more realistic than 3D computer-aided design models. We can link areas within a city map on the Web to corresponding areas in the route panoramas so that clicking a spot on the map updates the route image scroll accordingly (and vice versa). Eventually, we can use these tools to visualize large-scale spaces such as a facility, district, town, or even city.
Conclusion
Streets have existed for thousands of years, and there are millions of streets in the world now. These streets hold a tremendous amount of information, including rich visual contexts that are closely related to our lifestyles and reflect human civilization and history. Registering and visualizing streets in an effective way is important to our culture and commercial life. With the Route Panorama software, when you click a map to follow a certain route, the route panorama scrolls accordingly, showing real scenes along the route. This will greatly enhance the visualization of Geographic Information Systems (GIS).
Because of the 2D characteristics of captured route panoramas, we can use them in a variety of ways. For instance, we can display and scroll them on wireless phone screens or handheld terminals for navigation in cities or facilities. By connecting a route panorama database with the Global Positioning System, we can locate our position in the city and display the corresponding segment of a route panorama on a liquid crystal display. Displaying route panoramas and panoramic images is basically a raster copy of sections of images. Hence, the proposed techniques are even applicable to 2D animation and game applications, potentially providing richer, more realistic content than before.
Currently, we're working on several innovations for our Route Panorama software. These include constructing 3D streets from route panoramas, sharpening distant scenes degraded by stationary blur, establishing route panoramas with a flexible camera setting, capturing high-rises in route panoramas, combining route panoramas with local panoramic views in cityscape visualization, and linking route panoramas to existing GIS databases.
References

1. J.Y. Zheng, S. Tsuji, and M. Asada, "Color-Based Panoramic Representation of Outdoor Environment for a Mobile Robot," Proc. 9th Int'l Conf. Pattern Recognition, IEEE CS Press, vol. 2, 1988, pp. 801-803.
2. J.Y. Zheng and S. Tsuji, "Panoramic Representation of Scenes for Route Understanding," Proc. 10th Int'l Conf. Pattern Recognition, IEEE CS Press, vol. 1, 1990, pp. 161-167.
3. J.Y. Zheng and S. Tsuji, "Panoramic Representation for Route Recognition by a Mobile Robot," Int'l J. Computer Vision, Kluwer, vol. 9, no. 1, 1992, pp. 55-76.
4. J.Y. Zheng and S. Tsuji, "Generating Dynamic Projection Images for Scene Representation and Recognition," Computer Vision and Image Understanding, vol. 72, no. 3, Dec. 1998, pp. 237-256.
5. T. Ishida, "Digital City Kyoto: Social Information Infrastructure for Everyday Life," Comm. ACM, vol. 45, no. 7, July 2002, pp. 76-81.
6. G. Ennis and M. Lindsay, "VRML Possibilities: The Evolution of the Glasgow Model," IEEE MultiMedia, vol. 7, no. 2, Apr.–June 2000, pp. 48-51.
7. Z. Zhu, E.M. Riseman, and A.R. Hanson, "Parallel-Perspective Stereo Mosaics," Proc. Eighth IEEE Int'l Conf. Computer Vision, IEEE CS Press, vol. I, 2001, pp. 345-352.
8. H.S. Sawhney, S. Ayer, and M. Gorkani, "Model-Based 2D and 3D Motion Estimation for Mosaicing and Video Representation," Proc. 5th Int'l Conf. Computer Vision, IEEE CS Press, 1995, pp. 583-590.
9. S.E. Chen and L. Williams, "QuickTime VR: An Image-Based Approach to Virtual Environment Navigation," Proc. Siggraph 95, ACM Press, 1995, pp. 29-38.
10. Z. Zhu et al., "Camera Stabilization Based on 2.5D Motion Estimation and Inertial Motion Filtering," Proc. IEEE Int'l Conf. Intelligent Vehicles, IEEE CS Press, vol. 2, 1998, pp. 329-334.
11. Y.S. Yao and R. Chellappa, "Selective Stabilization of Images Acquired by Unmanned Ground Vehicles," IEEE Trans. Robotics and Automation, vol. RA-13, 1997, pp. 693-708.
Jiang Yu Zheng is an associate professor in the Department of Computer and Information Science at Indiana University–Purdue University, Indianapolis. His current research interests include 3D modeling, dynamic image processing, scene representation, digital museums, and combining vision, graphics, and human interfaces. Zheng received a BS degree from Fudan University, China, and MS and PhD degrees from Osaka University, Japan. He received the 1991 Best Paper Award from the Information Processing Society of Japan for generating the first digital panoramic image.

Readers may contact Jiang Yu Zheng at jzheng@cs.iupui.edu.