Airborne Optical Sectioning
Indrajit Kurmi, David C. Schedl and Oliver Bimber *
Institute of Computer Graphics, Johannes Kepler University Linz, 4040 Linz, Austria;
indrajit.kurmi@jku.at (I.K.); david.schedl@jku.at (D.C.S.)
*Correspondence: oliver.bimber@jku.at; Tel.: +43-732-2468-6631
Received: 4 July 2018; Accepted: 11 August 2018; Published: 13 August 2018


Abstract:
Drones are becoming increasingly popular for remote sensing of landscapes in archeology,
cultural heritage, forestry, and other disciplines. They are more efficient than airplanes for capturing
small areas, of up to several hundred square meters. LiDAR (light detection and ranging) and
photogrammetry have been applied together with drones to achieve 3D reconstruction. With airborne
optical sectioning (AOS), we present a radically different approach that is based on an old idea:
synthetic aperture imaging. Rather than measuring, computing, and rendering 3D point clouds
or triangulated 3D meshes, we apply image-based rendering for 3D visualization. In contrast to
photogrammetry, AOS does not suffer from inaccurate correspondence matches and long processing
times. It is cheaper than LiDAR, delivers surface color information, and has the potential to achieve
high sampling resolutions. AOS samples the optical signal of wide synthetic apertures (30–100 m
diameter) with unstructured video images recorded from a low-cost camera drone to support optical
sectioning by image integration. The wide aperture signal results in a shallow depth of field and
consequently in a strong blur of out-of-focus occluders, while images of points in focus remain clearly
visible. Shifting focus computationally towards the ground allows optical slicing through dense
occluder structures (such as leaves, tree branches, and coniferous trees), and discovery and inspection
of concealed artifacts on the surface.
Keywords: computational imaging; image-based rendering; light fields; synthetic apertures
1. Introduction
Airborne laser scanning (ALS) [1–4] utilizes LiDAR (light detection and ranging) [5–8] for remote sensing of landscapes. In addition to many other applications, ALS is used to make archaeological discoveries in areas concealed by trees [9–11], to support forestry in forest inventory and ecology [12], and to acquire precise digital terrain models (DTMs) [13,14]. LiDAR measures the round-trip travel time of reflected laser pulses to estimate distances at frequencies of several hundred kilohertz, and for ALS is operated from airplanes, helicopters, balloons, or drones [15–19]. It delivers high-resolution point clouds that can be filtered to remove vegetation or trees when inspecting the ground surface [20,21].
Although LiDAR has clear advantages over photogrammetry when it comes to partially occluded
surfaces, it also has limitations: for small areas the operating cost is disproportionally high; the huge
amount of 3D point data requires massive processing time for registration, triangulation and
classification; it does not, per se, provide surface color information; and its sampling resolution
is limited by the speed and other mechanical constraints of laser scanners and recording systems.
Here, we present a different approach to revealing and inspecting artifacts occluded by moderately
dense structures, such as forests, that uses low-cost, off-the-shelf camera drones. Since it is entirely
image-based and does not reconstruct 3D points, it has very low processing demands and can
therefore deliver real-time 3D visualization results almost instantly after short image preprocessing.
Like drone-based LiDAR, it is particularly efficient for capturing small areas of several hundred square
meters, but it has the potential to provide a high sampling resolution and surface color information.
We call this approach airborne optical sectioning (AOS) and envision applications where fast visual
inspections of partially occluded areas are desired at low costs and effort. Besides archaeology,
forestry and agriculture might offer additional use cases. Application examples in forestry include the
visual inspection of tree population (e.g., examining trunk thicknesses), pest infestation (with infrared
cameras), and forest road conditions underneath the tree crowns.
2. Materials and Methods
Cameras integrated in conventional drones apply lenses of several millimeters in diameter.
The relatively large f-number (i.e., the ratio of the focal length to the lens diameter) of such cameras
yields a large depth of field that makes it possible to capture the scene in focus over large distances.
Decreasing the depth of field requires a lower f-number that can be achieved, for instance, with a
wider effective aperture (i.e., a wider lens). Equipping a camera drone with a lens several meters rather
than millimeters wide is clearly infeasible, but would enable the kind of optical sectioning used in
traditional light microscopes, which applies high numerical aperture optics [22–24].
For cases in which wide apertures cannot be realized, the concept of synthetic apertures has been introduced [25]. Synthetic apertures sample the signal of wide apertures with arrays of smaller sub-apertures whose individual signals are computationally combined. This principle has been used for radar [26–29], telescopes [30], microscopes [31,32], and camera arrays [33].
We sample the optical signal of wide synthetic apertures (30–100 m diameter) with unstructured video images recorded by a camera drone to support optical sectioning computationally with synthetic aperture rendering [34,35]. The following section explains the sampling principles and the visualization techniques of our approach.
2.1. Wide Synthetic Aperture Sampling
Let us consider a target on the ground surface covered by a volumetric structure of random
occluders, as illustrated in Figure 1. The advantage of LiDAR (cf. Figure 1a) is that the depth of a
surface point on the target can theoretically be recovered if a laser pulse from only one direction is
reflected by that point without being entirely occluded. In the case of photogrammetry (cf. Figure 1b),
a theoretical minimum of two non-occluded directions are required for stereoscopic depth estimation
(in practice, many more directions are necessary to minimize measurement inaccuracies). In contrast to LiDAR, photogrammetry relies on correspondence search, which requires distinguishable feature structures and fails for uniform image features.
Figure 1. For light detection and ranging (LiDAR) (a) a single ray is sufficient to sample the depth of
a point on the ground surface, while photometric stereo (b) requires a minimum of two views and
relies on a dense set of matchable image features. In airborne optical sectioning (AOS), for each point
on the synthetic focal plane we directionally integrate many rays over a wide synthetic aperture to
support optical sectioning with a low synthetic f-number (c). The spatial sampling density at the focal
plane depends on the field of view, the resolution of the drone’s camera, and the height of the drone
relative to the focal plane (d).
Our approach computes 3D visualizations entirely by image-based rendering [35]. However, it does not reconstruct 3D point clouds and therefore does not suffer from inaccurate correspondence matches and long processing times. As illustrated in Figure 1c, we sample the area of the synthetic aperture by means of multiple unstructured sub-apertures (the physical apertures of the camera drone). Each sample corresponds to one geo-referenced (pose and orientation) video image of the drone. Each pixel in such an image corresponds to one discrete ray r captured through the synthetic aperture that intersects the aperture plane at coordinates u,v and the freely adjustable synthetic focal plane at coordinates s,t. To compute the surface color R_{s,t} of each point on the focal plane, we computationally integrate and normalize all sampled rays r_{s,t,u,v} of the synthetic aperture (with directional resolution U,V) that geometrically intersect at s,t:

$$R_{s,t} = \frac{1}{UV}\sum_{u,v}^{U,V} r_{s,t,u,v}. \qquad (1)$$
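To make the integration concrete, the sketch below approximates Equation (1) with per-image homographies that map rectified drone images onto a common focal-plane grid. It is a minimal illustration, not the published implementation; the function name, the homography inputs, and the array layout are assumptions.

```python
import numpy as np
import cv2

def integrate_focal_plane(images, homographies, out_size):
    """Approximate Equation (1): warp every geo-referenced, rectified drone
    image onto a common focal-plane grid and average the contributions.

    images       : list of rectified drone images (H x W x 3, uint8)
    homographies : list of 3x3 arrays mapping image pixel coordinates to
                   focal-plane coordinates (s, t) for the drone pose (u, v)
    out_size     : (width, height) of the rendered integral image
    """
    w, h = out_size
    accum = np.zeros((h, w, 3), dtype=np.float64)
    weight = np.zeros((h, w, 1), dtype=np.float64)

    for img, H in zip(images, homographies):
        # Resample the rays r_{s,t,u,v} of this sub-aperture onto the s,t grid.
        warped = cv2.warpPerspective(img, H, out_size)
        mask = cv2.warpPerspective(np.ones(img.shape[:2], np.float32), H, out_size)
        accum += warped.astype(np.float64) * mask[..., None]
        weight += mask[..., None]

    # Normalize by the number of rays that actually reached each s,t point.
    return (accum / np.maximum(weight, 1e-6)).astype(np.uint8)
```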
Some of these directionally sampled rays might be blocked partially or fully by occluders (red in Figure 1c). However, if the surface color dominates in the integral, it can be visualized in the rendered image. This is the case if the occluding structure is sufficiently sparse and the synthetic aperture wide and sufficiently densely sampled. The wider the synthetic aperture, the lower its synthetic f-number (N_s), which in this case is the ratio of the height of the drone h (distance between the aperture plane and focal plane) to the diameter of the synthetic aperture (size of the sampled area).

A low N_s results in a shallow depth of field and consequently in a strong point spread of out-of-focus occluders over a large region of the entire integral image. Images of focused points remain concentrated in small spatial regions that require an adequate spatial resolution to be resolved.
The spatial sampling resolution f_s (number of sample points per unit area) at the focal plane can be determined by back-projecting the focal plane onto the imaging plane of the drone's camera, as illustrated in Figure 1d. Assuming a pinhole camera model and parallel focal and image planes, this leads to:

$$f_s = \frac{n}{4h^2\tan^2(\mathrm{FOV}_c/2)} \qquad (2)$$

where FOV_c is the field of view of the drone's camera, n is its resolution (number of pixels on the image sensor), and h is the height of the drone. Note that, in practice, the spatial sampling resolution is not uniform, as the focal plane and image plane are not necessarily parallel, and imaging with omnidirectional cameras is not uniform on the image sensor. Furthermore, the sampling resolution is affected by the quality of the drone's pose estimation. Appendix A explains how we estimate the effective spatial sampling resolution that considers omnidirectional images and pose-estimation errors.
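As a quick numerical illustration of Equation (2) and of the synthetic f-number N_s, the helper below restates both formulas; the printed example values (experiment 1 for N_s, experiment 2 for f_s) are taken from the text, and the function names are ours.

```python
import math

def synthetic_f_number(height_m, aperture_diameter_m):
    # N_s = h / D: ratio of drone height to synthetic aperture diameter.
    return height_m / aperture_diameter_m

def spatial_sampling_resolution(n_pixels, height_m, fov_deg):
    # Equation (2): f_s = n / (4 h^2 tan^2(FOV_c / 2)), assuming a pinhole
    # camera with the image plane parallel to the focal plane.
    return n_pixels / (4.0 * height_m ** 2 * math.tan(math.radians(fov_deg) / 2.0) ** 2)

# Worked examples with values reported in the paper:
print(synthetic_f_number(20.0, 30.0))                  # ~0.67 for experiment 1 (20 m altitude, 30 m aperture)
print(spatial_sampling_resolution(3.5e6, 35.0, 80.0))  # ~1015 samples/m^2 for experiment 2; the
                                                       # back-projection estimate in Appendix A gives 985
```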
2.2. Wide Synthetic Aperture Visualization
The data basis for visualization is the set of geo-referenced video images recorded by the drone during synthetic aperture sampling. They must be rectified to eliminate lens distortion. Figure 2a illustrates how a novel image is computed for a virtual camera by means of image-based rendering [35]. The parameters of the virtual camera (position, orientation, focal plane, field of view, and aperture radius) are interactively defined in the same three-dimensional coordinate system as the drone poses.
Figure 2. For visualization, we interactively define a virtual camera (green triangle) by its pose, size
of its aperture, field of view, and its focal plane. Novel images are rendered by ray integration
(Equation (1)) for points s,t at the focal plane (black circle) that are determined by the intersection of
rays (blue) through the camera’s projection center (grey circle) and pixels at x,y in the camera’s image
plane (blue circle) within its field of view FOVv. Only the rays (green) through u,v at the synthetic
aperture plane that pass through the virtual camera’s aperture Av, are integrated. While (a) illustrates
the visualization for a single focal plane, (b) shows the focal slicing being applied to increase the depth
of field computationally.
Given the virtual camera’s pose (position and orientation) and field of view FOVv, a ray through
its projection center and a pixel at coordinates x,y in its image plane can be defined (blue in Figure 2a).
This ray intersects the adjusted focal plane at coordinates s,t. With Equation (1), we now integrate all
rays rs,t,u,v from s,t through each sample at coordinates u,v in the synthetic aperture plane that intersect
the virtual camera’s circular aperture area Av of defined radius. The resulting surface color integral
Rs,t is used for the corresponding pixel’s color. This is repeated for all pixels in the virtual camera’s
image plane to render the entire image. Interactive control over the virtual camera’s parameters
allows real-time renderings of the captured scene by adjusting perspective, focus, and depth of field.
The extremely shallow depth of field that results from a wide aperture blurs not only occluders,
such as trees, but also points of the ground target that are not located in the focal plane. To visualize
non-planar targets entirely in focus, we adjust two boundary focal planes (dashed black lines in
Figure 2b) that enclose the ground target, and interpolate their plane parameters to determine a fine
intermediate focal slicing (dotted black lines in Figure 2b). Repeating the rendering for each focal
plane, as explained above, leads to a focal stack (i.e., a stack of x,y-registered, shallow depth-of-field
images with varying focus). For every pixel at x,y in the virtual camera’s image plane, we now
determine the optimal focus through all slices of the focal stack by maximizing the Sobel magnitude
computed within a 3 × 3 pixel neighborhood. The surface color at the optimal focus is then used for
the pixel’s color. Repeating this for all pixels leads to an image that renders the ground target within
the desired focal range with a large depth of field, while occluders outside are rendered with a
shallow depth of field.
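A simplified sketch of this focus-selection step is given below; it assumes the focal stack has already been rendered and x,y-registered, uses the 3 × 3 Sobel magnitude as the focus measure as described, and omits the treatment of the banding artifacts discussed in Section 4.

```python
import numpy as np
import cv2

def all_in_focus(focal_stack):
    """Select, per pixel, the focal slice with the largest Sobel magnitude.

    focal_stack : list of x,y-registered BGR images (one per focal plane),
                  rendered with Equation (1) for interpolated focal planes.
    Returns an image in which the ground target appears in focus over the
    whole focal range, while out-of-range occluders stay strongly blurred.
    """
    stack = np.stack(focal_stack)                        # (slices, H, W, 3)
    sharpness = []
    for img in focal_stack:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).astype(np.float32)
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)  # 3x3 Sobel in x
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)  # 3x3 Sobel in y
        sharpness.append(cv2.magnitude(gx, gy))
    best = np.argmax(np.stack(sharpness), axis=0)        # (H, W) winning slice index
    rows, cols = np.indices(best.shape)
    return stack[best, rows, cols]                       # pick that slice per pixel
```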
3. Results
We carried out two field experiments for proof of concept. In experiment 1 (cf. Figure 3) we
recovered an artificial ground target, and in experiment 2 (cf. Figure 4) the ruins of an early 19th
century fortification tower. Both were concealed by dense forest and vegetation—being entirely
invisible from the air.
Figure 3. A densely forested patch is captured with a circular synthetic aperture of 30 m diameter (a), sampled with 231 video images (b) from a camera drone at an altitude of 20 m above ground. A dense 3D point cloud reconstruction from these images (c) took 8 h to compute with photogrammetry. (d) A single raw video image captured through the narrow-aperture omnidirectional lens of the drone. After image rectification, the large depth of field does not reveal features on the ground that are occluded by the trees (e). Increasing the aperture computationally decreases the depth of field and significantly blurs out-of-focus features (f). Shifting focus computationally towards the ground slices optically through the tree branches (f) and makes a hidden ground target visible (g,h). Synthetic aperture rendering was possible in real-time after 15 min of preprocessing (image rectification and pose estimation). 3D visualization results are shown in Supplementary Materials Video S1.
Figure 4. Ruins of an early 19th century fortification tower near Linz, Austria, as seen from the air (a). The structure of the remaining inner and outer ring walls and a trench become visible after optical sectioning (b). Image preprocessing (rectification and pose estimation) for AOS took 23 min. Photometric 3D reconstruction took 15 h (without tree removal) and does not capture the site structures well (c). Remains of the site as seen from the ground (d). A LiDAR scan with a resolution of 8 samples/m² (e) does not deliver surface color but provides consistent depth. The synthetic aperture for AOS in this experiment was a 90° sector of a circle with 50 m radius and was sampled with 505 images at an altitude of 35 m above the ground (f,g). For our camera drone, this leads to effectively 74 samples/m² on the ground. 3D visualization results are shown in Supplementary Materials Video S2.
In both experiments, we used a low-cost off-the-shelf quadcopter (Parrot Bebop 2) equipped with a fixed 14 MP Complementary Metal-Oxide-Semiconductor (CMOS) camera and a 178° FOV fisheye lens. For path planning, we first outlined the dimensions and shapes of the synthetic apertures in Google Maps and then used a custom Python script to compute intermediate geo-coordinates on the flying path that samples the desired area on the synthetic aperture plane. Parrot's ARDroneSDK3 API (developer.parrot.com/docs/SDK3/) was used to control the drone autonomously along the determined path and to capture images.
For experiments 1 and 2, the distance between two neighboring recordings was 1 m and 2–4 m, respectively (denser in the center of the aperture sector). In both cases, directional sampling followed a continuous scan-line path, the ground speed of the drone was 1 m/s, and the time for stabilizing the drone at a sample position and recording an image was 1 s. This led to sampling rates of about 30 frames per minute and 12–20 frames per minute for experiments 1 and 2, respectively.
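The paper describes the flight-path generation only at this level; purely as an illustration, the sketch below computes a boustrophedon (scan-line) grid of waypoints covering a circular synthetic aperture. The spacing argument, the flat-earth degree conversion, and the example coordinates are assumptions and not the authors' script.

```python
import math

def scanline_waypoints(center_lat, center_lon, diameter_m, spacing_m):
    """Generate a boustrophedon (scan-line) grid of GPS waypoints covering a
    circular synthetic aperture of the given diameter (hypothetical helper)."""
    r = diameter_m / 2.0
    lat_per_m = 1.0 / 111_320.0                                  # rough flat-earth conversion
    lon_per_m = 1.0 / (111_320.0 * math.cos(math.radians(center_lat)))

    steps = int(r // spacing_m)
    waypoints = []
    for j, y in enumerate(range(-steps, steps + 1)):
        xs = [x for x in range(-steps, steps + 1)
              if (x * spacing_m) ** 2 + (y * spacing_m) ** 2 <= r * r]
        if j % 2:                                                # reverse every other scan line
            xs = xs[::-1]
        for x in xs:
            waypoints.append((center_lat + y * spacing_m * lat_per_m,
                              center_lon + x * spacing_m * lon_per_m))
    return waypoints

# e.g., a 30 m aperture sampled every 1 m (illustrative coordinates near Linz):
path = scanline_waypoints(48.33, 14.32, 30.0, 1.0)
```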
We applied OpenCV’s (opencv.org) omnidirectional camera model for camera calibration
and image rectification and implemented a GPU visualization framework based on Nvidia’s
CUDA (NVIDIA, Santa Clara, CA, USA) for image-based rendering as explained in Section 2.2.
The general purpose structure-from-motion and multi-view stereo pipeline, COLMAP [
36
], was used
for photometric 3D reconstruction, and for pose estimation of the drone.
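A rectification step of this kind could look roughly as follows with OpenCV's omnidirectional model (cv2.omnidir, available in the opencv-contrib package). The calibration values below are placeholders and not the calibration of the Bebop 2 camera; the actual parameters would come from a prior cv2.omnidir.calibrate() run on checkerboard images.

```python
import numpy as np
import cv2  # cv2.omnidir requires the opencv-contrib-python package

# Placeholder calibration of the unified omnidirectional camera model; real
# values would come from cv2.omnidir.calibrate() on checkerboard recordings.
K = np.array([[450.0, 0.0, 960.0],
              [0.0, 450.0, 540.0],
              [0.0, 0.0, 1.0]])
D = np.array([[-0.1, 0.01, 0.0, 0.0]])   # distortion coefficients (k1, k2, p1, p2)
xi = np.array([[1.2]])                   # mirror parameter of the unified model

def rectify(frame, out_size=(1920, 1080), fov_scale=0.5):
    """Rectify one omnidirectional video frame to a perspective image."""
    w, h = out_size
    # New pinhole intrinsics of the rectified (perspective) view.
    Knew = np.array([[w * fov_scale, 0.0, w / 2.0],
                     [0.0, h * fov_scale, h / 2.0],
                     [0.0, 0.0, 1.0]])
    return cv2.omnidir.undistortImage(frame, K, D, xi,
                                      cv2.omnidir.RECTIFY_PERSPECTIVE,
                                      Knew=Knew, new_size=out_size)
```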
For experiment 2, we achieved an effective spatial sampling resolution of 74 samples/m² on the ground (see Appendix A for details) from an altitude of 35 m with a low-cost drone that uses a fixed omnidirectional camera. Since only 80° of the camera's 178° FOV was used, just a fraction of the sensor resolution was utilized (3.5 MP of a total of 14 MP in our example). In the 60 × 60 m region of interest shown in Figure 4b (corresponding to FOV_c = 80° and n = 3.5 MP), we find a ten times denser spatial sampling resolution than in the LiDAR example shown in Figure 4e (i.e., 74 samples/m² compared to 8 samples/m²). A total of 100 focal slices were computed and processed for achieving the results shown in Figure 4b.
Photometric 3D reconstruction from the captured images requires hours (8–15 h in our
experiments) of preprocessing and leads in most cases to unusable results for complex scenes, such as
forests (cf. Figures 3c and 4c). Currently, processing for AOS requires minutes (15–23 min in our
experiments), but this can be sped up significantly with a more efficient implementation.
4. Discussion and Conclusions
Drones are becoming increasingly popular for remote sensing of landscapes. They are
more efficient than airplanes for capturing small areas of up to several hundred square meters.
LiDAR and photogrammetry have been applied together with drones for airborne 3D reconstruction.
AOS represents a very different approach. Instead of measuring, computing, and rendering 3D point
clouds or triangulated 3D meshes, it applies image-based rendering for 3D visualization. In contrast
to photogrammetry, it does not suffer from inaccurate correspondence matches and long processing
times. It is cheaper than LiDAR, delivers surface color information, and has the potential to achieve
high spatial sampling resolutions.
In contrast to CMOS or Charge-Coupled Device (CCD) cameras, the spatial sampling resolution
of LiDAR is limited by the speed and other mechanical constraints of laser deflection. Top-end
drone-based LiDAR systems (e.g., RIEGL VUX-1UAV: RIEGL Laser Measurement Systems GmbH,
Horn, Austria) achieve resolutions of up to 2000 samples/m² from an altitude of 40 m. The sampling
precision is affected by the precision of the drone’s inertial measurement unit (IMU) and Global
Positioning System (GPS) signal, which is required to integrate the LiDAR scan lines during flight.
While IMUs suffer from drift over time, GPS is only moderately precise. In contrast, pose estimation
for AOS is entirely based on computer vision and a dense set of visually registered multi-view images
of many hundreds of overlapping perspectives. It is therefore more stable and precise. Since AOS
processes unstructured records of perspective images, GPS and IMU data is only used for navigating
the drone and does not affect the precision of pose estimation.
More advanced drones allow external cameras to be attached to a rotatable gimbal (e.g., DJI M600:
DJI, Shenzhen, China). If equipped with state-of-the-art 100 MP aerial cameras (e.g., Phase One iXU-RS
1000, Phase One, Copenhagen, Denmark) and appropriate objective lenses, we estimate that these can
reach spatial sampling resolutions one to two orders of magnitude higher than those of state-of-the-art
LiDAR drones.
However, AOS also has several limitations: First, it does not provide useful depth information, as is the case for LiDAR and photogrammetry. Depth maps that could be derived from our focal stacks by depth-from-defocus techniques [37] are as unreliable and imprecise as those gained from photogrammetry. Second, ground surface features occluded by extremely dense structures might not be resolved well—not even with very wide synthetic apertures and very high directional sampling rates (see Appendix B). LiDAR might be of advantage in these cases, as (in theory) only one non-occluded sample is required to resolve a surface point. This, however, also requires rigorously robust classification techniques that are able to correctly filter out vegetation under these conditions. Third, AOS relies on an adequate amount of sunlight being reflected from the ground surface, while LiDAR benefits from active laser illumination.
In conclusion, we believe that AOS has potential to support use cases that benefit from visual
inspections of partially occluded areas at low costs and little effort. It is easy to use and provides
real-time high-resolution 3D color visualizations almost instantly. For applications that require
quantitative 3D measurements, however, LiDAR or photogrammetry still remain methods of first
choice. We intend to explore optimal directional and spatial sampling strategies for AOS. Improved
drone path planning, for instance, would lead to better usage of the limited flying time and to reduced
image data. Furthermore, the fact that recorded images spatially overlap in the focal plane suggests that
the actual spatial sampling resolution is higher than for a single image (as estimated in Appendix A).
Thus, the potential advantage of computational super-resolution methods should be investigated.
Maximizing the Sobel magnitude for determining the optimal focus through the focal stack slices is
a simple first approach that leads to banding artifacts and does not entirely remove occluders (e.g.,
some tree trunks, see Supplementary Video S2). More advanced image filters that are based on defocus,
color, and other context-aware image features will most likely provide better results.
Supplementary Materials: The following supplementary videos are available online at zenodo.org/record/1304029: Supplementary Video S1: 3D visualization of experiment 1 (Figure 3) and Supplementary Video S2: 3D visualization of experiment 2 (Figure 4). The raw and processed (rectified and cropped) recordings and pose data for experiments 1 and 2 can be found at zenodo.org/record/1227183 and zenodo.org/record/1227246, respectively.
Author Contributions:
Conceptualization, O.B.; Investigation, I.K., D.C.S. and O.B.; Project administration,
O.B.; Software, I.K. and D.C.S.; Supervision, O.B.; Visualization, I.K. and D.C.S.; Writing—original draft, O.B.;
Writing—review & editing, I.K., D.C.S. and O.B.
Funding: This research was funded by Austrian Science Fund (FWF) under grant number [P 28581-N33].
Acknowledgments:
We thank Stefan Traxler, Christian Steingruber, and Christian Greifeneder of the OOE
Landesmuseum (State Museum of Upper Austria) and OOE Landesregierung (State Government Upper Austria)
for insightful discussions and for providing the LiDAR scans shown in Figure 4e.
Conflicts of Interest: The authors declare no conflict of interest.
Appendix A
In the following, we explain how the effective spatial sampling resolution is estimated for omnidirectional cameras and in the presence of pose-estimation errors.
By back-projecting the slanted focal plane at the ground surface onto the camera’s image plane
using the calibrated omnidirectional camera model and the pose estimation results, we count the
number of pixels that sample the focal plane per square meter. This approximates Equation (2) for the
non-uniform image resolutions of omnidirectional cameras.
For experiment 2, this was f_s = 985 pixel/m² for an altitude of h = 35 m and the n = 3.5 MP fraction of the image sensor that was affected by the scene portion of interest (FOV_c = 80°). This implies that the size of each pixel on the focal plane was p = 3.18 × 3.18 cm. The average lateral pose-estimation error e is determined by COLMAP as the average difference between back-projected image features after pose estimation and their original counterparts in the recorded images. For experiment 2, this was 1.33 pixels on the image plane (i.e., e = 4.23 cm on the focal plane). Thus, the effective size p′ of a sample on the focal plane that considers both the projected pixel size p and a given pose-estimation error e is p′ = p + 2e (cf. Figure A1). For experiment 2, this leads to p′ = 3.18 cm + 2 × 4.23 cm = 11.64 cm, which is equivalent to an effective spatial sampling resolution of f′_s = 74 samples/m².
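The arithmetic above can be checked directly; the helper below simply restates p′ = p + 2e and converts the effective sample size into samples per square meter (values copied from the text, function name ours).

```python
def effective_sampling_resolution(pixel_size_m, pose_error_m):
    # p' = p + 2e; one effective sample covers an area of p' x p'.
    p_eff = pixel_size_m + 2.0 * pose_error_m
    return p_eff, 1.0 / (p_eff ** 2)

p_eff, f_eff = effective_sampling_resolution(0.0318, 0.0423)
print(p_eff, f_eff)   # 0.1164 m and ~73.8 samples/m^2 (reported as 74 samples/m^2)
```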
Figure A1. The size p of a pixel projected onto the focal plane at the ground surface increases to p′ by twice the average pose-estimation error e in both lateral directions. This reduces the effective spatial sampling resolution.
Figure A2 illustrates AOS results of a known ground truth target (a multi-resolution chart of six checkerboard patterns with varying checker sizes). As in experiment 1, it was captured with a 30 m diameter synthetic aperture (vertically aligned, sampled with 135 images) from a distance of 20 m. In this case, the pixel size projected onto the focal plane was p = 1.74 × 1.74 cm (assuming a 30 × 30 m region of interest) and the pose-estimation error was e = 1.6 cm (0.94 pixels on the image plane). This leads to an effective sample size of p′ = 5.03 cm and an effective spatial sampling resolution of f′_s = 396 samples/m². As expected, the checker size of 3 × 3 cm can still be resolved in single raw images captured by the drone, but only the 5 × 5 cm checker size can be resolved in the full-aperture AOS visualization, which combines 135 raw images at a certain pose-estimation error.

Note that the effective spatial sampling resolution does not, in general, depend on the directional sampling resolution (i.e., the number of recorded images). However, it can be expected that it increases if the pose-estimation error decreases, since more images are used for structure-from-motion reconstruction.
Figure A2. A multi-resolution chart (a) attached to a building and captured with a 30 m diameter
synthetic aperture from a distance of 20 m. (b) A single raw video image captured with the drone. (c)
A contrast enhanced target close-up captured in the raw image. (d) A full-aperture AOS visualization
focusing on the target. (e) A contrast enhanced target close-up in the AOS visualization. All images
are converted to grayscale for better comparison.
Appendix B
In the following we illustrate the influence of occlusion density, aperture size, and directional
sampling resolution with respect to reconstruction quality, based on simulated data that allows for a
quantitative comparison to the ground truth.
The simulation applies the same recording parameters as used for experiment 1 (Figure 3): a ground target with dimensions of 2.4 × 0.8 m, a recording altitude of 20 m, a full synthetic aperture of 30 m diameter, and an occlusion volume of 10 m³ around and above the target filled with a uniform and random distribution of 512³ (maximum) opaque voxels at various densities (simulating the occlusion by trees and other vegetation). The simulated spatial resolution (i.e., image resolution) was 512² pixels. The simulated directional resolution (i.e., number of recorded images) was either 3², 9², or 27².
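The full simulation uses a 512³ voxel occlusion volume; as a much simplified, hedged illustration of the same statistical effect, the sketch below integrates a single focal-plane point seen through independently placed opaque occluders and shows how the variance of the integral drops as the number of perspectives grows.

```python
import numpy as np

def simulate_integral(target_value, occluder_value, density, n_views, rng):
    """Integrate one focal-plane point over n_views directions; each ray is
    independently blocked by an occluder with probability `density`."""
    blocked = rng.random(n_views) < density
    rays = np.where(blocked, occluder_value, target_value)
    return rays.mean()                      # Equation (1) for a single point s,t

rng = np.random.default_rng(0)
for n in (3 ** 2, 9 ** 2, 27 ** 2):         # directional resolutions of the simulation
    estimates = [simulate_integral(1.0, 0.0, 0.5, n, rng) for _ in range(1000)]
    print(n, np.mean(estimates), np.std(estimates))
# The mean stays near 0.5 (the target attenuated by 50% occlusion), while the
# spread of the integral shrinks with more perspectives, which is why wider and
# more densely sampled apertures suppress occluders more effectively.
```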
Figure A3 illustrates the influence of an increasing occlusion density at a constant spatial and directional sampling resolution (512² × 9² in this example) and at full synthetic aperture (i.e., 30 m diameter).
Figure A4 illustrates the impact of various synthetic aperture diameters (30 m, 15 m, and pinhole, which approximates a single recording of the drone) and directional sampling resolutions within the selected synthetic aperture (3², 9², and 27² perspectives) at a constant occlusion density (50% in this example).
Figure A3. AOS simulation of increasing occlusion density above the ground target at a 512² × 9² sampling resolution and a full synthetic aperture of 30 m diameter. Projected occlusion voxels are green. The numbers indicate the structural similarity index (Quaternion SSIM [38]) and the PSNR with respect to the ground truth (gt).
Figure A4. AOS simulation of various synthetic aperture diameters (pinhole = single drone recording, 15 m, and 30 m) and directional sampling resolutions within the selected synthetic aperture (3², 9², and 27² perspectives). The occlusion density is 50%. Projected occlusion voxels are green. The numbers indicate the structural similarity index (Quaternion SSIM [38]) and the PSNR with respect to the ground truth.
From the simulation results presented in Figures A3 and A4 it can be seen that an increasing
occlusion density leads to a degradation of reconstruction quality. However, a wider synthetic aperture
diameter and a higher number of perspective recordings within this aperture always lead to an
improvement of reconstruction quality as occluders are blurred more efficiently. Consequently,
densely occluded scenes have to be captured with wide synthetic apertures at high directional
sampling rates.
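The PSNR reported in Figures A3 and A4 can be computed as sketched below (the Quaternion SSIM of [38] needs its own implementation and is not reproduced here); the array names are illustrative.

```python
import numpy as np

def psnr(ground_truth, integral, max_value=255.0):
    """Peak signal-to-noise ratio between the ground-truth target image and an
    AOS integral image of identical shape."""
    diff = ground_truth.astype(np.float64) - integral.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```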
References

1. Rempel, R.C.; Parker, A.K. An information note on an airborne laser terrain profiler for micro-relief studies. In Proceedings of the Symposium Remote Sensing Environment, 3rd ed.; University of Michigan Institute of Science and Technology: Ann Arbor, MI, USA, 1964; pp. 321–337.
2. Nelson, R. How did we get here? An early history of forestry lidar. Can. J. Remote Sens. 2013, 39, S6–S17. [CrossRef]
3. Sabatini, R.; Richardson, M.A.; Gardi, A.; Ramasamy, S. Airborne laser sensors and integrated systems. Prog. Aerosp. Sci. 2015, 79, 15–63. [CrossRef]
4. Kulawardhana, R.W.; Popescu, S.C.; Feagin, R.A. Airborne lidar remote sensing applications in non-forested short stature environments: A review. Ann. For. Res. 2017, 60, 173–196. [CrossRef]
5. Synge, E. XCI. A method of investigating the higher atmosphere. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1930, 9, 1014–1020. [CrossRef]
6. Vasyl, M.; Paul, F.M.; Ove, S.; Takao, K.; Chen, W.B. Laser radar: Historical prospective—From the East to the West. Opt. Eng. 2016, 56, 031220. [CrossRef]
7. Behroozpour, B.; Sandborn, P.A.M.; Wu, M.C.; Boser, B.E. Lidar System Architectures and Circuits. IEEE Commun. Mag. 2017, 55, 135–142. [CrossRef]
8. Du, B.; Pang, C.; Wu, D.; Li, Z.; Peng, H.; Tao, Y.; Wu, E.; Wu, G. High-speed photon-counting laser ranging for broad range of distances. Sci. Rep. 2018, 8, 4198. [CrossRef] [PubMed]
9. Chase, A.F.; Chase, D.Z.; Weishampel, J.F.; Drake, J.B.; Shrestha, R.L.; Slatton, K.C.; Awe, J.J.; Carter, W.E. Airborne LiDAR, archaeology, and the ancient Maya landscape at Caracol, Belize. J. Archaeol. Sci. 2011, 38, 387–398. [CrossRef]
10. Khan, S.; Aragão, L.; Iriarte, J. A UAV–lidar system to map Amazonian rainforest and its ancient landscape transformations. Int. J. Remote Sens. 2017, 38, 2313–2330. [CrossRef]
11. Inomata, T.; Triadan, D.; Pinzón, F.; Burham, M.; Ranchos, J.L.; Aoyama, K.; Haraguchi, T. Archaeological application of airborne LiDAR to examine social changes in the Ceibal region of the Maya lowlands. PLoS ONE 2018, 13, 1–37. [CrossRef] [PubMed]
12. Maltamo, M.; Naesset, E.; Vauhkonen, J. Forestry Applications of Airborne Laser Scanning: Concepts and Case Studies. In Managing Forest Ecosystems; Springer: Amsterdam, The Netherlands, 2014.
13. Sterenczak, K.; Ciesielski, M.; Balazy, R.; Zawiła-Niedzwiecki, T. Comparison of various algorithms for DTM interpolation from LIDAR data in dense mountain forests. Eur. J. Remote Sens. 2016, 49, 599–621. [CrossRef]
14. Chen, Z.; Gao, B.; Devereux, B. State-of-the-Art: DTM Generation Using Airborne LIDAR Data. Sensors 2017, 17, 150. [CrossRef] [PubMed]
15. Nagai, M.; Chen, T.; Shibasaki, R.; Kumagai, H.; Ahmed, A. UAV-Borne 3-D Mapping System by Multisensor Integration. IEEE Trans. Geosci. Remote Sens. 2009, 47, 701–708. [CrossRef]
16. Lin, Y.; Hyyppa, J.; Jaakkola, A. Mini-UAV-Borne LIDAR for Fine-Scale Mapping. IEEE Geosci. Remote Sens. Lett. 2011, 8, 426–430. [CrossRef]
17. Favorskaya, M.; Jain, L. Handbook on Advances in Remote Sensing and Geographic Information Systems: Paradigms and Applications in Forest Landscape Modeling. In Intelligent Systems Reference Library; Springer International Publishing: New York, NY, USA, 2017.
18. Kwon, S.; Park, J.W.; Moon, D.; Jung, S.; Park, H. Smart Merging Method for Hybrid Point Cloud Data using UAV and LIDAR in Earthwork Construction. Procedia Eng. 2017, 196, 21–28. [CrossRef]
19. Chiang, K.W.; Tsai, G.J.; Li, Y.H.; El-Sheimy, N. Development of LiDAR-Based UAV System for Environment Reconstruction. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1790–1794. [CrossRef]
20. Wallace, L.; Lucieer, A.; Malenovský, Z.; Turner, D.; Vopenka, P. Assessment of Forest Structure Using Two UAV Techniques: A Comparison of Airborne Laser Scanning and Structure from Motion (SfM) Point Clouds. Forests 2016, 7, 62. [CrossRef]
21. Guo, Q.; Su, Y.; Hu, T.; Zhao, X.; Wu, F.; Li, Y.; Liu, J.; Chen, L.; Xu, G.; Lin, G.; et al. An integrated UAV-borne lidar system for 3D habitat mapping in three forest ecosystems across China. Int. J. Remote Sens. 2017, 38, 2954–2972. [CrossRef]
22. Streibl, N. Three-dimensional imaging by a microscope. J. Opt. Soc. Am. A 1985, 2, 121–127. [CrossRef]
23. Conchello, J.A.; Lichtman, J.W. Optical sectioning microscopy. Nat. Methods 2005, 2, 920–931. [CrossRef] [PubMed]
24. Qian, J.; Lei, M.; Dan, D.; Yao, B.; Zhou, X.; Yang, Y.; Yan, S.; Min, J.; Yu, X. Full-color structured illumination optical sectioning microscopy. Sci. Rep. 2015, 5, 14513. [CrossRef] [PubMed]
25. Ryle, M.; Vonberg, D.D. Solar Radiation on 175 Mc./s. Nature 1946, 158, 339. [CrossRef]
26. Wiley, C.A. Synthetic aperture radars. IEEE Trans. Aerosp. Electron. Syst. 1985, AES-21, 440–443. [CrossRef]
27. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A tutorial on synthetic aperture radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43. [CrossRef]
28. Ouchi, K. Recent Trend and Advance of Synthetic Aperture Radar with Selected Topics. Remote Sens. 2013, 5, 716–807. [CrossRef]
29. Li, C.J.; Ling, H. Synthetic aperture radar imaging using a small consumer drone. In Proceedings of the 2015 IEEE International Symposium on Antennas and Propagation USNC/URSI National Radio Science Meeting, Vancouver, BC, Canada, 19–25 July 2015; pp. 685–686.
30. Baldwin, J.E.; Beckett, M.G.; Boysen, R.C.; Burns, D.; Buscher, D.; Cox, G.; Haniff, C.A.; Mackay, C.D.; Nightingale, N.S.; Rogers, J.; et al. The first images from an optical aperture synthesis array: Mapping of Capella with COAST at two epochs. Astron. Astrophys. 1996, 306, L13–L16.
31. Turpin, T.M.; Gesell, L.H.; Lapides, J.; Price, C.H. Theory of the synthetic aperture microscope. In Proceedings of the SPIE's 1995 International Symposium on Optical Science, Engineering, and Instrumentation, San Diego, CA, USA, 9–14 July 1995; Volume 2566.
32. Levoy, M.; Zhang, Z.; Mcdowall, I. Recording and controlling the 4D light field in a microscope using microlens arrays. J. Microsc. 2009, 235, 144–162. [CrossRef] [PubMed]
33. Vaish, V.; Wilburn, B.; Joshi, N.; Levoy, M. Using plane + parallax for calibrating dense camera arrays. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 27 June–2 July 2004; Volume 1, pp. 2–9.
34. Levoy, M.; Hanrahan, P. Light Field Rendering. In Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '96, New Orleans, LA, USA, 4–9 August 1996; ACM: New York, NY, USA, 1996; pp. 31–42.
35. Isaksen, A.; McMillan, L.; Gortler, S.J. Dynamically reparameterized light fields. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '00, New Orleans, LA, USA, 23–28 July 2000; ACM: New York, NY, USA, 2000; pp. 297–306.
36. Schoenberger, J.L.; Frahm, J.-M. Structure-from-Motion Revisited. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016. [CrossRef]
37. Pentland, A.P. A new sense for depth of field. IEEE Trans. Pattern Anal. Mach. Intell. 1987, 4, 523–531. [CrossRef]
38. Kolaman, A.; Yadid-Pecht, O. Quaternion Structural Similarity: A New Quality Index for Color Images. IEEE Trans. Image Process. 2012, 21, 1526–1536. [CrossRef] [PubMed]
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
... The integral images can be further analyzed to support, for instance, automated person classification with advanced deep neural-networks. In Schedl et al. (2020b), we have shown that integrating single images before classification rather than combining classification results of single images is significantly more effective when classifying partially occluded persons in aerial thermal images (92% vs. 25% average precision). In Schedl et al. (2021), we demonstrate a first fully autonomous drone for search and rescue based on AOS. ...
... Moving targets, such as walking people or running animals lead to motion blur in integral images that are nearly impossible to detect and track. Applying AOS to far infrared (thermal imaging), as in Schedl et al. (2020bSchedl et al. ( , 2021, restricts it to cold environment temperatures, while using it in the visible range (RGB imaging) as in Kurmi et al. (2018), often suffers from too little sunlight penetrating through dense vegetation. ...
... Therefore, integrating multiple rays (i.e., averaging their corresponding pixels) results in focus of the target and defocus of the occluders 1 . This increases the probability of detecting the target reliably under strong occlusion conditions Schedl et al. (2020b). Figure 1(b) illustrates our current prototype. ...
Preprint
Full-text available
Detecting and tracking moving targets through foliage is difficult, and for many cases even impossible in regular aerial images and videos. We present an initial light-weight and drone-operated 1D camera array that supports parallel synthetic aperture aerial imaging. Our main finding is that color anomaly detection benefits significantly from image integration when compared to conventional single images or video frames (on average 97% vs. 42% in precision in our field experiments). We demonstrate, that these two contributions can lead to the detection and tracking of moving people through densely occluding forest
... With airborne optical sectioning (AOS) (28)(29)(30)(31)(32)(33)(34), we have introduced a wide SA imaging technique that uses crewed or uncrewed aircraft, such as drones (Fig. 1A), to sample images within large (SA) areas from above occluded volumes, such as forests. On the basis of the poses of the aircraft during capturing, these images are computationally combined to integrate images. ...
... In (30), we presented a statistical model to explain the efficiency of AOS with respect to occlusion density, occluder sizes, number of integrated samples, and size of the SA. Although the SA in (30) and (34) are 2D, the equations and AOS principles still hold in the case of 1D apertures as the ones being applied in this work. The main advantages of AOS over alternatives, such as LIDAR (37)(38)(39) or synthetic aperture radar (8)(9)(10), are its computational performance in real-time occlusion removal and its independence to wavelength. ...
... The main advantages of AOS over alternatives, such as LIDAR (37)(38)(39) or synthetic aperture radar (8)(9)(10), are its computational performance in real-time occlusion removal and its independence to wavelength. AOS has been applied in the visible spectrum (28) and in the far-infrared (thermal) spectrum (31) for wildlife observations (32) and search and rescue (SAR) (34). In addition, it can be applied to near-infrared wavelengths, for example, to address applications in agriculture or forestry. ...
Article
Autonomous drones will play an essential role in human-machine teaming in future search and rescue (SAR) missions. We present a prototype that finds people fully autonomously in densely occluded forests. In the course of 17 field experiments conducted over various forest types and under different flying conditions, our drone found, in total, 38 of 42 hidden persons. For experiments with predefined flight paths, the average precision was 86%, and we found 30 of 34 cases. For adaptive sampling experiments (where potential findings are double-checked on the basis of initial classification confidences), all eight hidden persons were found, leading to an average precision of 100%, whereas classification confidence was increased on average by 15%. Thermal image processing, classification, and dynamic flight path adaptation are computed on-board in real time and while flying. We show that deep learning–based person classification is unaffected by sparse and error-prone sampling within straight flight path segments. This finding allows search missions to be substantially shortened and reduces the image complexity to 1/10th when compared with previous approaches. The goal of our adaptive online sampling technique is to find people as reliably and quickly as possible, which is essential in time-critical applications, such as SAR. Our drone enables SAR operations in remote areas without stable network coverage, because it transmits to the rescue team only classification results that indicate detections and can thus operate with intermittent minimal-bandwidth connections (e.g., by satellite). Once received, these results can be visually enhanced for interpretation on remote mobile devices.
... Optical Sectioning (AOS) [23][24][25][26][27][28][29][30][31] is an effective wide-synthetic-aperture aerial imaging technique that can be deployed using camera drones. It virtually mimics a wide-aperture optic of the shape and size of the scan area (possibly hundreds to thousands of square meters), which generates images with an extremely shallow depth of field above an occluding structure, such as a forest. ...
... Compared to alternative airborne scanning technologies, such as LiDAR [32][33][34] and synthetic aperture radar [1][2][3], AOS is cheaper, wavelength independent, and offers real-time computational performance for occlusion removal. We have applied AOS within the visible [23] and the thermal [26] spectra, and demonstrated its usefulness in archeology [23], wildlife observation [27], and search and rescue (SAR) [30][31]. By employing the randomly distributed statistical model in [25], we explained the efficiency of AOS with respect to occlusion density, occluder sizes, number of integrated samples, and size of the synthetic aperture. ...
Preprint
Full-text available
Fully autonomous drones have been demonstrated to find lost or injured persons under strongly occluding forest canopy. Airborne Optical Sectioning (AOS), a novel synthetic aperture imaging technique, together with deep-learning-based classification enables high detection rates under realistic search-and-rescue conditions. We demonstrate that false detections can be significantly suppressed and true detections boosted by combining classifications from multiple AOS integral images rather than from single ones. This improves classification rates, especially in the presence of occlusion. To make this possible, we modified the AOS imaging process to support large overlaps between subsequent integrals, enabling real-time, on-board scanning and processing at groundspeeds of up to 10 m/s.
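The gain from combining classifications over multiple integral images can be sketched with a simple fusion rule. Averaging per-integral confidence maps registered to the same ground grid, as below, is an assumed stand-in for the paper's actual combination scheme; the point is that a false detection appearing in only one integral is suppressed, while a true detection supported by many integrals survives.

```python
# Sketch of fusing classifications from several overlapping integral images.
# Averaging per-integral confidences for the same ground cell is an assumed
# fusion rule used here for illustration only.
import numpy as np

def fuse_confidences(conf_maps, threshold=0.5):
    """conf_maps: list of H x W arrays with per-pixel person-classification
    confidence, all registered to the same ground grid. Returns a boolean
    detection mask based on the fused confidence."""
    fused = np.mean(np.stack(conf_maps, axis=0), axis=0)
    return fused > threshold

# Example: a detection supported by many integrals keeps a high fused score,
# while one appearing in a single integral is averaged down below threshold.
maps = [np.random.rand(64, 64) for _ in range(5)]
mask = fuse_confidences(maps)
```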
... With Airborne Optical Sectioning (AOS) [21][22][23][24][25][26][27][28][29][30][31] we have introduced an optical synthetic aperture imaging technique that computationally removes occlusion caused by dense vegetation, such as forest, in real-time. It utilizes manually, automatically [21][22][23][24][25][26][27][28][30], or fully autonomously [29] operated camera drones that sample multispectral (RGB and thermal) images within a certain (synthetic aperture) area above forest (cf. Fig. 1a,b). ...
Preprint
Full-text available
Occlusion caused by vegetation is an essential problem for remote sensing applications in areas such as search and rescue, wildfire detection, wildlife observation, surveillance, border control, and others. Airborne Optical Sectioning (AOS) is an optical, wavelength-independent synthetic aperture imaging technique that supports computational occlusion removal in real time. It can be applied with manned or unmanned aircraft, such as drones. In this article, we demonstrate a relationship between forest density and the field of view (FOV) of the applied imaging system. This finding was made with the help of a simulated procedural forest model, which allows more realistic occlusion properties to be considered than our previous statistical model. While AOS has been explored with automatic and autonomous research prototypes in the past, we present a free AOS integration for DJI systems. It enables blue-light organizations and others to use and explore AOS with compatible, manually operated, off-the-shelf drones. The (digitally cropped) default FOV for this implementation was chosen based on our new finding.
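Digitally cropping to a default FOV can be realized, under a simple pinhole-camera assumption, by cutting out the central image region whose extent corresponds to the target FOV. The sketch below illustrates that idea only; the angle values are placeholders, and the paper's default FOV itself follows from the forest-density analysis.

```python
# Sketch of digitally cropping a frame to a narrower field of view (pinhole model).
# The angles used here are placeholders, not the values chosen in the paper.
import math
import numpy as np

def crop_to_fov(image: np.ndarray, full_hfov_deg: float, target_hfov_deg: float) -> np.ndarray:
    """Crop the central region of `image` so that it spans `target_hfov_deg`
    horizontally, assuming the full frame spans `full_hfov_deg`."""
    h, w = image.shape[:2]
    scale = math.tan(math.radians(target_hfov_deg) / 2) / math.tan(math.radians(full_hfov_deg) / 2)
    new_w, new_h = int(round(w * scale)), int(round(h * scale))
    x0, y0 = (w - new_w) // 2, (h - new_h) // 2
    return image[y0:y0 + new_h, x0:x0 + new_w]

frame = np.zeros((1080, 1920, 3), np.uint8)          # placeholder video frame
cropped = crop_to_fov(frame, full_hfov_deg=84.0, target_hfov_deg=40.0)
```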
... To limit the scope of this paper, we focus only on human-centered, security-related surveillance tasks. Aerial surveillance empowers a wide range of human-centric applications, including border patrol [52], search and rescue [15], [121], maritime surveillance [128], protest monitoring, drug trafficking monitoring [145], military IED tracking [93], and crime fighting [160]. Owing to these burgeoning applications, corporate aerial surveillance is growing rapidly. ...
... Portman et al. [115] and Ma et al. [88] showed that handcrafted approaches struggle with aerial thermal human detection due to the large number of variations. Schedl et al. showed that aerial thermal person detection under occlusion conditions can be notably improved by combining multi-perspective images before classification [121]. Using a synthetic aperture imaging technique, they achieve this with a precision and recall of 96% and 93%, respectively, on their own dataset. ...
Preprint
Full-text available
The rapid emergence of airborne platforms and imaging sensors is enabling new forms of aerial surveillance due to their unprecedented advantages in scale, mobility, deployment, and covert observation capabilities. This paper provides a comprehensive overview of human-centric aerial surveillance tasks from a computer vision and pattern recognition perspective. It aims to provide readers with an in-depth systematic review and technical analysis of the current state of aerial surveillance tasks using drones, UAVs, and other airborne platforms. The main objects of interest are humans, where single or multiple subjects are to be detected, identified, tracked, re-identified, and have their behavior analyzed. More specifically, for each of these four tasks, we first discuss unique challenges in performing them in an aerial setting compared to a ground-based setting. We then review and analyze the aerial datasets publicly available for each task, and delve deep into the approaches in the aerial literature and investigate how they presently address the aerial challenges. We conclude the paper with a discussion of the missing gaps and open research questions to inform future research avenues.
... In such systems, synthetic apertures are constrained mainly by the physical size of the camera array used. With Airborne Optical Sectioning (AOS) (39)(40)(41)(42)(43)(44)(45), we have introduced a wide synthetic-aperture imaging technique that employs manned or unmanned aircraft, such as drones (Fig. 1A), to sample images within large (synthetic aperture) areas from above occluded volumes, such as forests. Based on the poses of the aircraft during capturing, these images are computationally combined into integral images. ...
... The main software running on the SoCC was implemented using Python, while submodules, such as the drone communication protocol and the integral computation, were implemented using C, OpenGL, and C++ and integrated using Cython. How integral images are computed was explained in detail in (39,42,45). This work differs in that we did not assume the ground surface to be planar, but approximated it with a digital elevation model (DEM). ...
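Replacing the planar-ground assumption with a digital elevation model means that, for focusing, each camera ray must be intersected with the terrain surface instead of a plane. The simple ray-marching sketch below illustrates this idea only; it is not the on-board implementation, and the names, step size, and coordinate conventions are assumptions.

```python
# Sketch of replacing the planar-ground assumption with a digital elevation model
# (DEM): a camera ray is marched until it drops below the terrain height.
import numpy as np

def intersect_ray_with_dem(origin, direction, dem, cell_size, step=0.5, max_dist=500.0):
    """origin: (x, y, z) in DEM coordinates; direction: unit vector pointing down
    towards the terrain; dem: 2D array of terrain heights; cell_size: meters per
    DEM cell. Returns the 3D intersection point or None if the ray leaves the DEM."""
    p = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    for _ in range(int(max_dist / step)):
        p = p + d * step
        i, j = int(p[1] / cell_size), int(p[0] / cell_size)
        if not (0 <= i < dem.shape[0] and 0 <= j < dem.shape[1]):
            return None
        if p[2] <= dem[i, j]:        # ray has reached the terrain surface
            return p
    return None
```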
Preprint
Full-text available
Drones will play an essential role in human-machine teaming in future search and rescue (SAR) missions. We present a first prototype that finds people fully autonomously in densely occluded forests. In the course of 17 field experiments conducted over various forest types and under different flying conditions, our drone found 38 out of 42 hidden persons; average precision was 86% for predefined flight paths, while adaptive path planning (where potential findings are double-checked) increased confidence by 15%. Image processing, classification, and dynamic flight-path adaptation are computed onboard in real-time and while flying. Our finding that deep-learning-based person classification is unaffected by sparse and error-prone sampling within one-dimensional synthetic apertures allows flights to be shortened and reduces recording requirements to one-tenth of the number of images needed for sampling using two-dimensional synthetic apertures. The goal of our adaptive path planning is to find people as reliably and quickly as possible, which is essential in time-critical applications, such as SAR. Our drone enables SAR operations in remote areas without stable network coverage, as it transmits to the rescue team only classification results that indicate detections and can thus operate with intermittent minimal-bandwidth connections (e.g., by satellite). Once received, these results can be visually enhanced for interpretation on remote mobile devices.
... The use of thermal cameras is therefore a standard technique for response operations [3]. The interesting aspect of the work of Schedl and colleagues is accordingly the AOS [4], which achieves something that may sound a bit like a miracle, namely removing the occluding forest such that persons on the ground become clearly visible, or at least sufficiently visible to be reliably detectable by machine vision (Fig. 1c). AOS can be seen as a form of Synthetic Aperture (SA) sensing, which is a technology that originated in the 1950s to improve the resolution of radar as a remote sensing tool [5]. ...
Article
Signal processing of thermal images that are autonomously collected by a drone detects people in densely occluded forests.
Article
Detecting and tracking moving targets through foliage is difficult, and in many cases even impossible, in regular aerial images and videos. We present an initial lightweight, drone-operated 1D camera array that supports parallel synthetic aperture aerial imaging. Our main finding is that color anomaly detection benefits significantly from image integration when compared to conventional raw images or video frames (on average 97% vs. 42% precision in our field experiments). We demonstrate that these two contributions can lead to the detection and tracking of moving people through densely occluding forest.
Article
Full-text available
Fully autonomous drones have been demonstrated to find lost or injured persons under strongly occluding forest canopy. Airborne optical sectioning (AOS), a novel synthetic aperture imaging technique, together with deep-learning-based classification enables high detection rates under realistic search-and-rescue conditions. We demonstrate that false detections can be significantly suppressed and true detections boosted by combining classifications from multiple AOS integral images rather than from single ones. This improves classification rates, especially in the presence of occlusion. To make this possible, we modified the AOS imaging process to support large overlaps between subsequent integrals, enabling real-time, on-board scanning and processing at groundspeeds of up to 10 m/s.
Article
In this article, we demonstrate that the acceleration and deceleration of direction-turning drones at waypoints have a significant influence on path planning, which is important to consider for time-critical applications, such as drone-supported search and rescue. We present a new path planning approach that takes acceleration and deceleration into account. It follows a local gradient ascent strategy that locally minimizes turns while maximizing search probability accumulation. Our approach outperforms classic coverage-based path planning algorithms, such as spiral and grid search, as well as potential field methods that consider search probability distributions. We apply this method in the context of autonomous search and rescue drones and in combination with a novel synthetic aperture imaging technique, called Airborne Optical Sectioning (AOS), which removes occlusion caused by vegetation and forest in real time.
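The local gradient ascent that minimizes turns while maximizing search probability accumulation can be illustrated with a greedy one-step rule: among reachable waypoints, prefer those with high remaining search probability and a small required heading change, because sharp turns force deceleration and re-acceleration. The scoring and penalty weight below are assumptions for illustration, not the paper's actual objective.

```python
# Sketch of a greedy waypoint choice that trades accumulated search probability
# against the time cost of turning (deceleration + acceleration).
import math

def next_waypoint(current, heading, candidates, prob, turn_penalty=0.5):
    """current: (x, y); heading: current flight direction in radians;
    candidates: list of (x, y) reachable waypoints; prob: dict mapping a waypoint
    to its remaining search-probability mass."""
    best, best_score = None, -math.inf
    for c in candidates:
        direction = math.atan2(c[1] - current[1], c[0] - current[0])
        turn = abs((direction - heading + math.pi) % (2 * math.pi) - math.pi)
        score = prob.get(c, 0.0) - turn_penalty * turn   # prefer straight, probable cells
        if score > best_score:
            best, best_score = c, score
    return best
```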
Article
Full-text available
We demonstrate a high-speed photon-counting laser ranging system that uses laser pulses of multiple repetition rates to extend the unambiguous range. In the experiment, laser pulses of three different repetition rates around 10 MHz were employed to enlarge the maximum unambiguous range from 15 m to 165 km. Moreover, the measurable range of distances was increased as well, enabling measurements on targets separated by large distances with high depth resolution. Outdoor photon-counting laser ranging up to 21 km was realized at a high repetition rate, which is beneficial for airborne and satellite-based topographic mapping.
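The 15 m figure quoted above follows directly from the unambiguous range of a pulsed ranging system, R_ua = c / (2 f_rep). The short calculation below reproduces it for a 10 MHz repetition rate; how the three slightly different rates are combined to reach 165 km (a coincidence- or Chinese-remainder-style disambiguation) is only indicated conceptually here and is an assumption of this note.

```python
# Unambiguous range of a pulsed ranging system: R_ua = c / (2 * f_rep).
C = 299_792_458.0                      # speed of light in m/s

def unambiguous_range(rep_rate_hz: float) -> float:
    return C / (2.0 * rep_rate_hz)

print(unambiguous_range(10e6))         # ~15 m for a single 10 MHz pulse train
```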
Article
Full-text available
Although the application of LiDAR has made significant contributions to archaeology, LiDAR only provides a synchronic view of the current topography. An important challenge for researchers is to extract diachronic information over typically extensive LiDAR-surveyed areas in an efficient manner. By applying an architectural chronology obtained from intensive excavations at the site center and by complementing it with surface collection and test excavations in peripheral zones, we analyze LiDAR data over an area of 470 km² to trace social changes through time in the Ceibal region, Guatemala, of the Maya lowlands. We refine estimates of structure counts and populations by applying commission and omission error rates calculated from the results of ground-truthing. Although the results of our study need to be tested and refined with additional research in the future, they provide an initial understanding of social processes over a wide area. Ceibal appears to have served as the only ceremonial complex in the region during the transition to sedentism at the beginning of the Middle Preclassic period (c. 1000 BC). As a more sedentary way of life was accepted during the late part of the Middle Preclassic period and the initial Late Preclassic period (600–300 BC), more ceremonial assemblages were constructed outside the Ceibal center, possibly symbolizing the local groups’ claim to surrounding agricultural lands. From the middle Late Preclassic to the initial Early Classic period (300 BC-AD 300), a significant number of pyramidal complexes were probably built. Their high concentration in the Ceibal center probably reflects increasing political centralization. After a demographic decline during the rest of the Early Classic period, the population in the Ceibal region reached the highest level during the Late and Terminal Classic periods, when dynastic rule was well established (AD 600–950).
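Refining structure counts with commission and omission error rates, as described above, is commonly done by first removing the expected false positives and then scaling up for missed structures. The function below shows this standard adjustment; the exact correction used in the study and the example rates are assumptions.

```python
# Common adjustment of a mapped structure count using ground-truthed error rates:
# remove expected false positives, then scale up for structures that were missed.
def corrected_count(detected: int, commission_rate: float, omission_rate: float) -> float:
    """commission_rate: fraction of detections that were false positives;
    omission_rate: fraction of true structures that were not detected."""
    true_positives = detected * (1.0 - commission_rate)
    return true_positives / (1.0 - omission_rate)

print(corrected_count(1000, commission_rate=0.12, omission_rate=0.08))  # ~956.5
```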
Article
Full-text available
In recent decades, global biodiversity has gradually diminished due to increasing pressure from anthropogenic activities and climatic change. Accurate estimations of spatially continuous three-dimensional (3D) vegetation structures and terrain information are prerequisites for biodiversity studies, but are usually unavailable in current ecosystem-wide studies. Although the airborne lidar technique has been successfully used for mapping 3D vegetation structures at landscape and regional scales, the relatively high cost of airborne lidar flight missions has significantly limited its applications. The unmanned aerial vehicle (UAV) provides an alternative platform for lidar data acquisition, which can greatly lower the cost and provide denser lidar points compared with airborne lidar. In this study, we implemented a low-cost UAV-borne lidar system, including both a hardware system and a software system, to collect and process lidar data for biodiversity studies. The implemented UAV-borne lidar system was tested in three different ecosystems across China, including a needle-leaf–broadleaf mixed forest, an evergreen broadleaf forest, and a mangrove forest. Various 3D vegetation structure parameters (e.g., canopy height model, canopy cover, leaf area index, aboveground biomass) were derived from the UAV-borne lidar data. The results show that the implemented UAV-borne lidar system can generate very high resolution 3D terrain and vegetation information. The developed UAV-based hardware and software systems provide a turn-key solution for the use of UAV-borne lidar data in biodiversity studies.
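The vegetation structure parameters listed above are typically derived from rasterized surface and terrain models. The sketch below shows a minimal canopy height model (highest return minus lowest ground return per raster cell) and a simple canopy-cover estimate; it assumes ground-classified returns and is only an illustration of these products, not the paper's processing chain.

```python
# Minimal sketch: canopy height model (CHM) and canopy cover from a classified
# lidar point cloud. Assumes ground returns are already labeled.
import numpy as np

def canopy_products(xyz, is_ground, cell=1.0, cover_threshold=2.0):
    """xyz: N x 3 point array; is_ground: boolean mask of ground returns;
    cell: raster resolution in meters."""
    xy = xyz[:, :2]
    ij = np.floor((xy - xy.min(axis=0)) / cell).astype(int)
    shape = ij.max(axis=0) + 1
    dsm = np.full(shape, -np.inf)               # digital surface model (highest return)
    dtm = np.full(shape, np.inf)                # digital terrain model (lowest ground return)
    for (i, j), z, g in zip(ij, xyz[:, 2], is_ground):
        dsm[i, j] = max(dsm[i, j], z)
        if g:
            dtm[i, j] = min(dtm[i, j], z)
    valid = np.isfinite(dsm) & np.isfinite(dtm)
    chm = np.where(valid, dsm - dtm, np.nan)    # canopy height model
    cover = np.mean(chm[valid] > cover_threshold)  # fraction of cells with canopy > 2 m
    return chm, cover
```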
Article
Full-text available
Lidar (light detection and ranging) remote sensing technology provides promising tools for 3D characterization of the earth's surface. In ecosystem studies, lidar-derived structural parameters relating to vegetation and terrain have been used extensively in many applications, and their use is rapidly expanding. Yet, most lidar applications have focused on tall, woody vegetation in forested environments, and less research attention has been given to non-forest ecosystems dominated by short-stature vegetation. Similar to the lidar developments in forestry, novel methodological approaches and algorithm developments will be necessary to improve estimates of structural and biophysical properties (i.e., biomass and carbon storage) in non-forested, short-stature environments. Under changing climate scenarios, the latter is particularly useful for improving our understanding of the future role of these ecosystems as terrestrial carbon sinks. In an attempt to identify research gaps in airborne lidar remote sensing applied to short-stature vegetation, this review article provides a comprehensive overview of the current state of airborne lidar applications. Our focus is mainly on the levels of accuracy and error reported, as well as the potentials and limitations of the methods applied in these studies. We also provide insights into future research needs and applications in these environments.
Book
Airborne laser scanning (ALS) has emerged as one of the most promising remote sensing technologies for providing data for research and operational applications in a wide range of disciplines related to the management of forest ecosystems. This book provides a comprehensive, state-of-the-art review of the research and application of ALS in a broad range of forest-related disciplines. However, this book is more than just a collection of individual contributions: it consists of a well-composed blend of chapters dealing with fundamental methodological issues and contributions reviewing and illustrating the use of ALS within various domains of application. The main aim of this book is to provide the scientific and technical background of ALS with a particular focus on applicability in operational forestry. Most of the chapters are devoted to applications in forest inventory and forest ecology, such as forest management inventory and assessments of canopy cover, habitats, and organism-habitat relationships. Many of the chapters focus on boreal forests, simply because the methods were initially developed for boreal conditions. However, examples show the most common applications of ALS at various geographical scales: from individual trees to forest stands, regions, and nations. The reviews provide a comprehensive and unique overview of recent research and applications that researchers, students, and practitioners of forest remote sensing and forest ecosystem assessment should consider a useful reference text.
Article
3D imaging technologies are applied in numerous areas, including self-driving cars, drones, and robots, and in advanced industrial, medical, scientific, and consumer applications. 3D imaging is usually accomplished by finding the distance to multiple points on an object or in a scene, and then creating a point cloud of those range measurements. Different methods can be used for the ranging. Some of these methods, such as stereovision, rely on processing 2D images. Other techniques estimate the distance more directly by measuring the round-trip delay of an ultrasonic or electromagnetic wave to the object. Ultrasonic waves suffer large losses in air and cannot reach distances beyond a few meters. Radars and lidars use electromagnetic waves in the radio and optical spectra, respectively. The shorter wavelengths of optical waves compared to radio-frequency waves translate into better resolution and a more favorable choice for 3D imaging. The integration of lidars on electronic and photonic chips can lower their cost, size, and power consumption, making them affordable and accessible for all the abovementioned applications. This review article explains different lidar aspects and design choices, such as optical modulation and detection techniques, and point cloud generation by means of beam steering or flashing an entire scene. Popular lidar architectures and circuits are presented, and the superiority of the FMCW lidar is discussed in terms of range resolution, receiver sensitivity, and compatibility with emerging technologies. At the end, an electronic-photonic integrated circuit for a micro-imaging FMCW lidar is presented as an example.
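The range-resolution argument for FMCW lidar rests on the relation dR = c / (2 B), where B is the optical chirp bandwidth. The small calculation below illustrates it with an assumed 1 GHz bandwidth, which is an example value and not taken from the article.

```python
# Range resolution of an FMCW lidar is set by the chirp bandwidth B: dR = c / (2 * B).
C = 299_792_458.0                      # speed of light in m/s

def range_resolution(bandwidth_hz: float) -> float:
    return C / (2.0 * bandwidth_hz)

print(range_resolution(1e9))           # ~0.15 m for an assumed 1 GHz optical chirp
```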
Chapter
Appleton [1] and Hey [2] have directed attention to the fact that radio-frequency energy, with some of the characteristics of random 'noise', is emitted with greatly increased intensity from the sun under the conditions of violent disturbance associated with a large sunspot. These observations were confined mainly to the region of frequencies near 60 Mc./s.
Article
In disaster management, reconstructing the environment and quickly collecting geospatial data of the impacted areas in a short time are crucial. In this letter, a light detection and ranging (LiDAR)-based unmanned aerial vehicle (UAV) is proposed to complete the reconstruction task. The UAV integrates an inertial navigation system (INS), a global navigation satellite system (GNSS) receiver, and a low-cost LiDAR. An unmanned helicopter is introduced, and the multisensor payload architecture for direct georeferencing is designed to improve the capabilities of the vehicle. In addition, a new iterative closest point (ICP) registration strategy is proposed to solve the registration problems in the sparse and inhomogeneous derived point cloud. The proposed registration algorithm addresses the local-minima problem through the use of direct-georeferenced points and a novel hierarchical structure, as well as by feeding the estimated bias back into the INS/GNSS. Using real flight data, the generated point cloud is compared with a more accurate one derived from a high-grade terrestrial LiDAR. The results indicate that the proposed UAV system achieves meter-level accuracy and reconstructs the environment with a dense point cloud.
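The registration strategy builds on the iterative closest point (ICP) algorithm. A bare-bones point-to-point ICP iteration (nearest neighbours plus an SVD-based rigid fit) is sketched below for reference; the paper's hierarchical, direct-georeferencing-aided variant adds further machinery on top of this basic loop, which is not reproduced here.

```python
# Bare-bones point-to-point ICP: nearest-neighbour matching followed by a
# Kabsch/SVD rigid fit, iterated. Illustrative only; not the paper's variant.
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iterations=20):
    """source, target: N x 3 and M x 3 point clouds. Returns aligned source points."""
    src = source.copy()
    tree = cKDTree(target)
    for _ in range(iterations):
        _, idx = tree.query(src)                 # closest target point per source point
        matched = target[idx]
        mu_s, mu_t = src.mean(axis=0), matched.mean(axis=0)
        H = (src - mu_s).T @ (matched - mu_t)    # 3 x 3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T                           # optimal rotation (Kabsch)
        if np.linalg.det(R) < 0:                 # fix a possible reflection
            Vt[-1, :] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        src = src @ R.T + t                      # apply the incremental rigid transform
    return src
```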
Article
In this article, a robust unmanned aerial remote-sensing system, equipped with a survey-grade LiDAR scanner and a multispectral camera system and assembled to study pre-Columbian Amazonian archaeology, is presented. The data collected from this system will be utilized in a novel interdisciplinary way by combining these data with in situ data collected by archaeologists, archaeobotanists, paleoecologists, soil scientists, and landscape ecologists to study the nature and scale of the impact of pre-Columbian humans in transforming the landscapes of the Amazonian rainforest. The outputs of this research will also inform future policy on the conservation, sustainability, and ecological state of the forest.
Book
This book presents the latest advances in remote-sensing and geographic information systems and applications. It is divided into four parts, focusing on Airborne Light Detection and Ranging (LiDAR) and Optical Measurements of Forests; Individual Tree Modelling; Landscape Scene Modelling; and Forest Eco-system Modelling. Given the scope of its coverage, the book offers a valuable resource for students, researchers, practitioners, and educators interested in remote sensing and geographic information systems and applications.