Probabilistic Target Detection by Camera-Equipped UAVs
Andrew Symington, Sonia Waharte, Simon Julier, Niki Trigoni
Abstract— This paper is motivated by the real world problem of search and rescue by unmanned aerial vehicles (UAVs). We consider the problem of tracking a static target from a bird's-eye view camera mounted to the underside of a quadrotor UAV. We begin by proposing a target detection algorithm, which we then execute on a collection of video frames acquired from four different experiments. We show how the efficacy of the target detection algorithm changes as a function of altitude. We summarise this efficacy into a table which we denote the observation model. We then run the target detection algorithm on a sequence of video frames and use parameters from the observation model to update a recursive Bayesian estimator. The estimator keeps track of the probability that a target is currently in view of the camera, which we refer to more simply as target presence. Between each target detection event the UAV changes position and so the sensing region changes. Under certain assumptions regarding the movement of the UAV, the proportion of new information may be approximated to a value, which we then use to weight the prior in each iteration of the estimator. Through a series of experiments we show how the value of the prior for unseen regions, the altitude of the UAV and the camera sampling rate affect the accuracy of the estimator. Our results indicate that there is no single optimal sampling rate for all tested scenarios. We also show how the prior may be used as a mechanism for tuning the estimator according to whether a high false positive or high false negative probability is preferable.
Index Terms— UAV, vision, target detection, sensing
I. INTRODUCTION
Unmanned Aerial Vehicles (UAVs) are typically commissioned for tasks that are perceived as too dull or dangerous for a human pilot to carry out [13]. In the past, research involving UAVs was constrained by large and expensive flight platforms which offered greater payloads. However, recent advances in embedded computing and sensors have made small, low-cost autonomous systems accessible to the broader research community.
In this paper we consider a single UAV with a bird's-eye view video camera, which it uses to sense the world beneath it. Computer vision is an active research field that has already seen various applications to UAVs, such as navigation [12], stabilisation and localisation [11], feature tracking [7], SLAM [5], collision avoidance [15] and autonomous landing [3]. The work in this paper was inspired by Mondragon et al [7] who use vision-based target tracking to estimate the velocity of the UAV relative to a global visual reference frame. In this work we adopt a similar vision-based algorithm, except we use it to track whether or not a particular target is in view of the camera. We then measure the effect of the altitude of the UAV and the frame rate of the camera on the accuracy of the tracker over time.

This work was supported by the SUAAVE project. Further information about this project can be found at http://www.suaave.org.

Andrew Symington, Niki Trigoni and Sonia Waharte are with the Oxford University Computing Laboratory, Wolfson Building, Oxford, OX1 3QD, UK. niki.trigoni@comlab.ox.ac.uk

Simon Julier is with the Department of Computer Science, University College London, Malet Place, London, WC1E 6BT, UK. s.julier@cs.ucl.ac.uk
We begin by proposing a target detection algorithm that acts as a binary classifier or, put more simply, it determines whether or not a target object exists in a video frame. One would expect that the efficacy of such a classifier varies according to the physical appearance of the target, camera resolution, lighting conditions, etc. However, in our work we will assume that these factors remain constant and the efficacy is simply a function of the UAV's altitude: the further the camera from the target, the less information is available to the target detection algorithm, which causes it to miss targets. To this end, we run the target detection algorithm on a series of video frames and tabulate its efficacy as a function of altitude, which we denote the observation model.
The target detection algorithm treats each frame independently, so we introduce a recursive Bayesian estimator to fuse a series of noise-affected detection events (observations) over time. The estimator takes this series to track the probability of target presence (whether the target is in view of the camera), taking into account the efficacy of the target detection algorithm at the current altitude. The values it uses to quantify this efficacy are drawn directly from the observation model. Our estimator also takes into account the fact that the camera view changes over time. Under certain assumptions regarding the movement of the UAV, the proportion of new information is given by the sampling rate of the detector. We therefore introduce a term that decays the estimate according to the proportion of new information, which we call the exploration ratio, that is added between successive observations.
The remainder of this paper is organised as follows. In Section II we describe the series of experiments that were conducted to obtain the video data used in this paper and show the post-processing steps that we followed to label the images with a ground truth. In Section III we begin by describing the target detection algorithm. We then measure the efficacy of this algorithm against a set of images in order to construct the observation model. In Section IV we firstly show the methodology behind the calculation of the exploration ratio for our UAV platform and then introduce the recursive Bayesian estimator. In Section V we first measure the accuracy of our estimator by comparing the probability of target presence against the ground truth over time, varying the UAV altitude, sampling rate and prior. We then comment on our findings. Section VI concludes this paper.

2010 IEEE International Conference on Robotics and Automation, Anchorage Convention District, May 3-8, 2010, Anchorage, Alaska, USA. 978-1-4244-5040-4/10/$26.00 ©2010 IEEE
II. DATA ACQUISITION AND PREPARATION
In this section we discuss how our video data was acquired and then prepared for use by the target detection algorithm.
A. Acquisition
In order to obtain the video data used in this paper we fixed a FlyCamOne2 video camera to the underside of an Ascending Technologies Hummingbird quadrotor UAV. A number of targets were positioned in a 20m × 20m grid on a flat grass field. We flew the UAV over the grid at fixed altitudes of 5m, 10m, 15m and 20m. In addition to capturing video data at 27 frames per second, the UAV also recorded GPS and inertial data at around 10Hz. We found that the human target depicted in Fig. 2 consistently yielded a sufficient amount of information to train an effective target detection algorithm, so we used it as the target in our model.
B. Preparation
The first objective of data preparation is to isolate a set of example video frames that we will use to train the target detection algorithm. Each of these video frames must contain an unoccluded example image of the target. We draw a rectangle around the target and only the information within that rectangle is used to train the target detection algorithm. For our data set we used ten such frames for each altitude. These frames constitute the training set, while the remainder constitute the evaluation set.

The second objective is to label each frame in the evaluation set with a ground truth. The ground truth is effectively a binary flag which tells us whether the frame contains either (i) some or all of the target, or (ii) none of the target at all. The efficacy of the target detection algorithm will be measured by comparing the result of the detector against the ground truth, for all frames in the evaluation set.
III. TARGET DETECTION
This section begins by describing the target detection algorithm that we used. Our goal is not to present a novel and provably superior algorithm, but rather to leverage existing techniques to create a realistic system, which we can use to generate meaningful results. In essence, one could use an alternative method such as Viola and Jones's [14] boosting to achieve exactly the same outcome, perhaps with greater accuracy. However, regardless of the algorithm that is chosen, it will act as a binary classifier for unseen images. In the second part of this section we show how to measure the efficacy of the target detection algorithm and summarise it in the form of an observation model, which we will then use in the next section to update our belief of target presence.
A. The target detection algorithm
For our application we require a target detection algorithm that determines whether or not a single image contains an instance of the target. In order to achieve this we use Bay, Tuytelaars and Van Gool's [1] Speeded-Up Robust Features (SURF)¹. The features produced by SURF are, essentially, keypoints on a 2D image that are robust to slight changes in scale, rotation and perspective. Each SURF keypoint has an associated multidimensional descriptor² that characterises the grayscale image gradient in the region surrounding the keypoint. The similarity between two keypoints is calculated by measuring the distance between their descriptors with the n-dimensional Euclidean metric. The Hessian determinant threshold governs the sensitivity of the SURF detector and, hence, the number of features that are returned. We determined empirically that a threshold of 500 culled many of the weaker (and often background) keypoints, while maintaining an acceptable number of keypoints on objects at high altitudes.

¹The algorithm that we implemented is based on the find_obj.cpp sample code in the OpenCV pre-2.0 distribution.
²Usually a 64 or 128 double vector, depending on the required resolution.

Fig. 1. This diagram provides a high-level summary of the target detection algorithm and the method by which it was evaluated. Recall that the video frames are split into a training set and evaluation set. For each frame in the evaluation set, the target detection algorithm loops through the template examples and determines whether it contains the target object: the algorithm uses SURF keypoint matching with FLANN to find the most likely homographic projection of the training image into the evaluation image. The result of the target detection algorithm is compared to the ground truth label to form the observation model.

Recall that in the previous section we divided all the video frames in our data set into a training set and an evaluation set. For clarity, let us assume that we are working with the data from one altitude only. Now, assume that we are given some arbitrary image which might or might not contain the target, or part of a target. We refer to this as the unseen image. The target detection algorithm simply loops over all images in the training set and attempts to locate each one of these images within the unseen image. If a location is
found for any of the training images the algorithm returns a positive detection event, which signals that the target was found. Otherwise, a negative detection event is returned.
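The loop over the training set can be sketched as follows; `detect_target` and the stand-in `locate` predicate are our own illustrative names, with `locate` standing in for the SURF/FLANN/RANSAC localisation described next:

```python
def detect_target(unseen_image, training_images, locate):
    """Return a positive detection event (True) if any training image
    can be located in the unseen image, otherwise a negative event."""
    return any(locate(train, unseen_image) for train in training_images)

# Illustrative stub: 'locate' succeeds when the training patch occurs
# verbatim in the unseen image (the real system matches SURF features).
contains = lambda train, unseen: train in unseen
print(detect_target("xxTARGETxx", ["TARGET", "OTHER"], contains))  # True
print(detect_target("xxxxxxxxxx", ["TARGET", "OTHER"], contains))  # False
```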
What remains to be explained is how the training image is located in an unseen image. This is where the SURF features are used. The detection algorithm begins by calculating the SURF keypoints for both images. The aim is to find a correspondence between the keypoints in the training image and the keypoints in the unseen image. In order to do so the Fast Library for Approximate Nearest Neighbors (FLANN) [8] algorithm is used. This algorithm provides a fast approximation to k nearest-neighbour classification. The result is that each keypoint in the unseen image is paired with its closest keypoint in the training image. Recall that the similarity of two keypoints is calculated by measuring the distance between their two associated SURF descriptors.
In the next stage of the detection phase we remove all of the weak correspondences. Weak correspondences usually occur when some keypoint located on the background clutter in the unseen image is incorrectly paired with a keypoint in the training image; FLANN always maps to the nearest neighbour, regardless of the distance to it. The intuitive way to do this would be to threshold the correspondences based on the distance between each pair of keypoints. In practice, this heuristic fails. A superior approach involves thresholding based on the distance ratio between each keypoint in the unseen image and its two closest matching neighbours. That is, if the ratio of the distances between the two closest matches is greater than some threshold we cull the correspondence — the idea being that if one keypoint in the unseen image maps to two keypoints in the template image with equal strength, it is unlikely that it describes some particular feature uniquely [6]. Rather, it is more likely that the feature is background clutter being arbitrarily mapped to close neighbours in descriptor space. Ramisa et al [10] discuss the selection of this threshold for a variety of keypoint detectors. Their research shows that good results are typically obtained for SURF using a threshold between 0.6 and 0.8. Through experimentation we found that a value of 0.6 was best for our application: many background keypoints were discarded, while meaningful keypoints were preserved.
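The distance-ratio test can be sketched in a few lines of pure Python. This is an illustration under our own names and a brute-force neighbour search; the actual system applies the test to SURF descriptors matched via FLANN, with a threshold of 0.6:

```python
import math

def two_nearest(query, descriptors):
    """Distances from a query descriptor to its two closest descriptors
    (n-dimensional Euclidean metric)."""
    dists = sorted(math.dist(query, d) for d in descriptors)
    return dists[0], dists[1]

def keep_correspondence(query, descriptors, threshold=0.6):
    """Lowe's distance-ratio test: keep the correspondence only if the
    nearest neighbour is clearly closer than the second nearest."""
    d1, d2 = two_nearest(query, descriptors)
    return d1 / d2 <= threshold

# A distinctive match is kept; an ambiguous near-duplicate pair is culled.
print(keep_correspondence((0.0, 0.0), [(0.1, 0.0), (5.0, 0.0)]))  # True
print(keep_correspondence((0.0, 0.0), [(1.0, 0.0), (1.2, 0.0)]))  # False
```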
Finally, the correspondence set is passed to the RANdom SAmple Consensus (RANSAC) [4] algorithm, which determines the most likely projection of the template image into the unseen image, given the presence of some statistical outliers. Figure 2 illustrates this process: it shows the correspondence set as a collection of white lines connecting the template image keypoints to the unseen image keypoints. The projection is shown as a bounding polygon in the scene. If the RANSAC algorithm finds a projection and it has greater than five correspondences we assume that the scene contains the target. Through experimentation we found that if the threshold is set any lower the target detection algorithm returns significantly more false detections at lower altitudes. Conversely, if we set the threshold any higher, the target detection algorithm shows significantly more false negatives at higher altitudes.
Fig. 2. At the top left of this figure is an example training image and beneath it is an unseen image containing the target. SURF keypoints for both the target image and scene are calculated. FLANN is used to find mappings from keypoints in the unseen image to keypoints in the template image (shown as lines in the figure). For clarity, we have not drawn any weak mappings – those which have a distance ratio above the threshold. The RANSAC algorithm uses the mappings to calculate the most likely projection of the training image into the unseen image (shown as a polygon).
B. Observation model
The performance of a machine learning algorithm is measured by counting the number of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). For a binary classifier these values are typically expressed in the form of a 2x2 confusion matrix [9].

To evaluate the performance of the target detection algorithm we executed it on every frame in the evaluation set and compared the detection result to the ground truth. If the target detection algorithm agreed with the ground truth, we incremented the TP and TN count for positive and negative detection events respectively. If the target detection algorithm detected a target incorrectly, the false positive count was incremented. On the other hand, if the target detection algorithm failed to detect a target that was there, the false negative count was incremented. We repeated this process for all four altitudes and the resultant confusion matrix is listed in Table I.

The observation model is derived directly from the confusion matrix. In essence, the observation model is simply the false positive probability α_h and false negative probability β_h as a function of the UAV's altitude h. We calculate values for α_h and β_h using Eqn. 1 and Eqn. 2 respectively. The
values for our data set are also listed in Table I.

α_h = FP / (FP + TN)    (1)

β_h = FN / (FN + TP)    (2)
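Eqns. 1 and 2 are simple ratios over the confusion-matrix counts. As a sketch (function and variable names are ours), the following reproduces the 5m row of Table I:

```python
def observation_model(tp, fn, fp, tn):
    """False positive probability alpha (Eqn. 1) and false negative
    probability beta (Eqn. 2) from confusion-matrix counts."""
    alpha = fp / (fp + tn)  # Eqn. 1
    beta = fn / (fn + tp)   # Eqn. 2
    return alpha, beta

# 5m row of Table I: TP = 88, FN = 24, FP = 685, TN = 2103
alpha_5m, beta_5m = observation_model(tp=88, fn=24, fp=685, tn=2103)
print(alpha_5m, beta_5m)  # approx. 0.24569 and 0.21428, as in Table I
```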
IV. RECURSIVE BAYESIAN ESTIMATOR
In this section we firstly describe how to measure the amount of new information that is introduced between two observations, which we call the exploration ratio. We then present the recursive Bayesian estimator, which uses both the exploration ratio and the observation model for a given altitude to maintain a best estimate of target presence.
TABLE I
CONFUSION MATRICES AND OBSERVATION MODEL

Altitude  Truth    Detected  Not Detected  α_h      β_h
5m        Present  88        24            0.24569  0.21428
          Absent   685       2103
10m       Present  100       25            0.06286  0.20000
          Absent   87        1296
15m       Present  516       258           0.03107  0.33333
          Absent   38        1185
20m       Present  571       302           0.00130  0.34593
          Absent   2         1526
A. Calculating the exploration ratio

The role of the exploration ratio is to measure the proportion of new information that is introduced at each observation, resulting from the movement of the UAV, as a function of the UAV's altitude and the sampling rate of the sensor. To simplify the calculation of the exploration ratio we will make the following assumption regarding the movement of the UAV: it always moves at a constant speed in a single direction. Although one could use a more accurate method that takes into account the attitude and velocity of the UAV, for this initial study we use a simpler approximation.

Let x_h and y_h be the length and width of the camera sensing region at some altitude h, both of which are given in meters. x_h and y_h are related to one another according to the aspect ratio of the camera. In Fig. 3 we show the camera sensor coverage after the UAV displaces some distance d as a result of moving in the specified direction. The new, shared and lost areas are clearly marked in the figure. The exploration ratio e is simply the ratio of new area to entire observation area. In order to calculate this ratio we first need a value for d. The value for d is calculated by dividing the constant velocity v of the UAV (in meters per second) by the sampling rate r (in frames per second).
Eqn. 3 shows the full equation for calculating e.
e=dxh
yhxh
(3)
=v
ryh
(4)
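Eqn. 4 is easy to check against Table II. The sketch below (function name ours) uses v = 1 m/s, consistent with the velocity of "slightly over one meter per second" measured later in this section:

```python
def exploration_ratio(v, r, y_h):
    """Proportion of the sensing region that is new between frames:
    e = d / y_h with d = v / r  (Eqn. 4)."""
    return v / (r * y_h)

# Reproducing the 10m row of Table II (y_h = 7m) with v = 1 m/s:
for r in (1, 5, 10):
    print(round(exploration_ratio(v=1.0, r=r, y_h=7.0), 6))
# 0.142857, 0.028571 and 0.014286, matching Table II
```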
Fig. 3. The coverage or observation region of the video camera sensor is given by a rectangle of length x_h and width y_h. Between each observation the UAV moves some small distance d and the observation region changes accordingly. The ratio of new area (d × x_h) over the total sensing area (x_h × y_h) is referred to as the exploration ratio and it varies as a function of altitude and sampling rate.
We determined the x_h and y_h values for each altitude in our video data set using the one meter interval ticks on the star-shaped calibration pattern that we laid on the ground. The calibration pattern is clearly visible in the sample frame shown in Fig. 2. We then measured the average velocity of the UAV by integrating the acceleration readings from the inertial data to determine a reasonable value for v, which turned out to be slightly over one meter per second. Finally, we calculated the e value for all combinations of the four altitudes (5m, 10m, 15m and 20m) and sampling rates (1, 5 and 10 frames per second) that we chose to test. The results of our calculations are listed in Table II.
TABLE II
EXPLORATION RATIO FOR VARIOUS ALTITUDES AND SAMPLING RATES

Altitude h  y_h   r = 1 FPS  r = 5 FPS  r = 10 FPS
5m          4m    0.250000   0.050000   0.025000
10m         7m    0.142857   0.028571   0.014286
15m         10m   0.100000   0.020000   0.010000
20m         13m   0.076923   0.015385   0.007692
B. The recursive Bayesian estimator
The role of the recursive Bayesian estimator is to take a series of observations and maintain the probability of target presence. Moreover, the estimator takes into account the fact that the camera view changes over time and also that there is some error associated with the observation, both of which vary with altitude.

The observation model parameters are the probability of false positive and the probability of false negative, which we defined earlier as α_h and β_h respectively for some altitude h. Let us assume that the camera sensor has an observation region O(k_t) which is visible from the camera when the UAV is located at position k_t at time t. Let x_T represent the target.
We use Chung's [2] error model, where d_t = 0 and d_t = 1 denote negative and positive target detection events:

Pr_h(d_t = 1 | x_T ∈ O(k_t)) = 1 − β_h
Pr_h(d_t = 0 | x_T ∈ O(k_t)) = β_h
Pr_h(d_t = 0 | x_T ∉ O(k_t)) = 1 − α_h
Pr_h(d_t = 1 | x_T ∉ O(k_t)) = α_h
Let d_t be the t-th observation, D_t be the set of t observations and let x_T = 1 be the event that a target exists in a particular frame. The probability that the target is present in the frame at time t is computed using Bayes' rule:

Pr(x_T = 1 | D_t) = Pr(d_t | x_T = 1) Pr(x_T = 1 | D_{t−1}) / Pr(d_t | D_{t−1})    (5)
The update equation for the recursive Bayesian estimator is conditional on whether a positive or negative detection event is encountered. Eqn. 6 shows this update equation.

P_t = (1 − β_h) P_{t−1} / [(1 − β_h) P_{t−1} + α_h (1 − P_{t−1})],   if d_t = 1
P_t = β_h P_{t−1} / [β_h P_{t−1} + (1 − α_h)(1 − P_{t−1})],          if d_t = 0    (6)
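Eqn. 6 can be written out directly as a small update function (names ours; a minimal sketch rather than the authors' implementation). Using the 5m parameters from Table I, repeated positive detections drive the estimate upward:

```python
def bayes_update(p_prev, detected, alpha, beta):
    """One measurement update of the recursive Bayesian estimator (Eqn. 6)."""
    if detected:  # d_t = 1
        num = (1 - beta) * p_prev
        den = (1 - beta) * p_prev + alpha * (1 - p_prev)
    else:         # d_t = 0
        num = beta * p_prev
        den = beta * p_prev + (1 - alpha) * (1 - p_prev)
    return num / den

# Three consecutive positive detections at 5m (alpha, beta from Table I):
p = 0.5
for _ in range(3):
    p = bayes_update(p, detected=True, alpha=0.24569, beta=0.21428)
print(p)  # the posterior rises above 0.9
```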
So far the estimator has implicitly assumed that the camera view does not change between observations. However, since the UAV is moving this is not the case. We therefore introduce a state transition term to the estimator that takes into account that a portion of the current frame contains new information. The probability P_0 represents our prior belief of target presence for some unexplored region. In each iteration the refactored prior, given in Eqn. 7, is a weighted combination of the previous step's posterior and P_0.

P_{t−1} ← e P_0 + (1 − e) P_{t−1}    (7)
This weighted update equation causes the estimate to converge exponentially to P_0 over time. This is useful for two reasons. Firstly, it takes into account the fact that the camera changes position over time and, hence, objects may appear and disappear from view. Secondly, it provides a method of ensuring that the estimate never converges to zero or one after a series of positive or negative detection events, which would otherwise happen as a result of the finite precision of floating point data types.
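Combining Eqn. 7 with Eqn. 6, one full estimator iteration might look like the sketch below (names ours, illustrative only). With no detections the estimate decays toward the region of the prior, as described above:

```python
def estimator_step(p_prev, detected, alpha, beta, e, p0):
    """One full iteration: decay the prior toward p0 by the exploration
    ratio e (Eqn. 7), then apply the measurement update (Eqn. 6)."""
    p = e * p0 + (1 - e) * p_prev  # Eqn. 7: refactored prior
    if detected:
        return (1 - beta) * p / ((1 - beta) * p + alpha * (1 - p))
    return beta * p / (beta * p + (1 - alpha) * (1 - p))

# A run of negative detections at 10m (alpha, beta from Table I;
# e from Table II at 1 FPS) pulls a high estimate down toward p0:
p = 0.9
for _ in range(50):
    p = estimator_step(p, detected=False, alpha=0.06286, beta=0.2,
                       e=0.142857, p0=0.05)
print(p)  # small, but bounded away from zero by the decay term
```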
V. EXPERIMENTS AND RESULTS
Weconducted a series of experiments inorder to mea-
sure theeffect of altitude, sampling rateand prior on the
performance of theestimator. We used real streams of
video frames, taken from four different altitudes (5m, 10m,
15m and 20m)3. For example, Fig. 4 shows the result of
running the Bayesian estimator on a video stream taken
from an altitude of 5m. If after anobservation the posterior
probabilityof theestimator exceeds 0.5 (see the dashed line
in the bottom three graphs in Fig. 4) weconsider this tobea
positive detection event. By comparing the ground truth with
3Recall thattheaccuracyof the target detection algorithm dependson
thealtitude, asshown in Table I.
theestimator’s predictions, wecan measure the probability
of the detector making a false positive prediction and that of
making a false negative prediction. We use these two metrics
to assess theestimator’saccuracy4.
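The two estimator-level metrics can be computed by thresholding the posterior at 0.5 in each frame and comparing against the ground truth, e.g. (a sketch with names of our own):

```python
def estimator_error_rates(posteriors, ground_truth, threshold=0.5):
    """Estimator-level false positive and false negative probabilities:
    a posterior above the threshold counts as a positive prediction."""
    fp = fn = pos = neg = 0
    for p, present in zip(posteriors, ground_truth):
        if present:
            pos += 1
            if p <= threshold:
                fn += 1  # target present but not predicted
        else:
            neg += 1
            if p > threshold:
                fp += 1  # target absent but predicted
    return fp / neg, fn / pos

# Toy sequence of four frames (posterior, ground truth):
fp_rate, fn_rate = estimator_error_rates(
    [0.9, 0.2, 0.7, 0.4], [True, False, False, True])
print(fp_rate, fn_rate)  # 0.5 0.5
```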
The graphs in Fig. 5 show the accuracy of the estimator when the altitude is 5m and 20m, and for different prior (P_0) values and sampling rates. Our first observation is that as the prior increases, so the false positive probability increases, whereas the false negative probability decreases. Our second observation is that the lower the sampling rate, the higher the exploration ratio, and thus the higher the impact of the prior on the false positive and false negative probabilities. These two observations held for all four altitudes tested, although not all graphs are included for space reasons.

Our third observation relates to the effect of altitude on the two estimator metrics. When the altitude changes the estimator uses a new set of parameters from the observation model and a different exploration ratio. The effect of the exploration ratio is relatively straightforward and we discussed it in the previous paragraph. However, the effect of the observation model parameters is less obvious, despite there being a general trend in the parameters themselves — in Table I we see that α_h and β_h decrease and increase respectively with altitude. Moreover, any trend that might exist may be further obfuscated by the fact that different video sequences were used to evaluate the four different altitudes. Therefore, we cannot draw any conclusive evidence from our results that suggests a trend based on altitude.

Finally, our experimental results show that there is no single best sampling rate for all scenarios. The optimal sampling rate depends on the altitude, on the prior, as well as on whether the application is more interested in reducing false positives or false negatives.
VI. CONCLUSION AND FUTURE WORK
In this paper we used video data to train a target detection algorithm and measured parameters for an observation model that describes its efficacy. We then implemented a recursive Bayesian estimator to fuse a series of detections over time, taking into account the observation model and exploration ratio associated with the altitude at which the observations occur. Finally, we conducted a series of experiments to test the impact of the prior, altitude and sampling rate on the performance of the estimator, compared to the ground truth. While our results show that sampling rate has a significant effect on the estimator's performance, it is clear that there is no optimal sampling rate that fits all scenarios. The prior should be chosen in conjunction with the application requirements — in the case of search and rescue one would seek to minimize the false negative probability, while for situations where there are energy or resource constraints one would seek to minimize the false positive probability.

In future work we plan to run the full estimator online. We also plan to conduct a detailed study of how altitude affects the performance of the estimator.
⁴These probabilities must not be confused with α_h and β_h, which measure the accuracy of the target detection algorithm.
Fig. 4. The top graph shows the ground truth for the video data captured at 5m. The three graphs below show the evolution of the probability of target presence for the same altitude and period for 1 FPS, 5 FPS and 10 FPS. The dashed line at 0.5 is the threshold for a positive detection. [Graphs not reproduced; each panel plots detection probability against time in seconds, with prior = 0.05.]
VII. ACKNOWLEDGMENTS

This research was supported by the Sensing Unmanned Autonomous Aerial Vehicles (SUAAVE) project under grants EP/F064217/1, EP/F064179/1 and EP/F06358X/1. Specifically, we would like to thank Stephen Hailes, Renzo de Nardi, Graeme McPhillips, Mohib Wallizada and Dietmar Backes for assisting with the acquisition of the video data.
REFERENCES

[1] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "Speeded-up robust features (SURF)," Comput. Vis. Image Underst., vol. 110, no. 3, pp. 346–359, 2008.
[2] T. Chung and J. Burdick, "A decision-making framework for control strategies in probabilistic search," in IEEE International Conference on Robotics and Automation, 2007, pp. 4386–4393.
[3] Y. Fan, S. Haiqing, and W. Hong, "A vision-based algorithm for landing unmanned aerial vehicles," in Computer Science and Software Engineering, 2008 Intl. Conf. on, vol. 1, Dec. 2008, pp. 993–996.
[4] M. Fischler and R. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, no. 6, pp. 381–395, 1981.
[5] T. Lemaire, S. Lacroix, and J. Sola, "A practical 3D bearing-only SLAM algorithm," in Intelligent Robots and Systems, 2005 IEEE/RSJ International Conference on, Aug. 2005, pp. 2449–2454.
[6] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Intl. Journal of Computer Vision, vol. 60, pp. 91–110, 2004.
[7] I. Mondragon, P. Campoy, J. Correa, and L. Mejias, "Visual model feature tracking for UAV control," in Intelligent Signal Processing, 2007 IEEE International Symposium on, Oct. 2007, pp. 1–6.
Fig. 5. The four graphs show how the value of the prior affects the performance of the estimator, in terms of the false positive and false negative probabilities for various sampling rates. The top and bottom two graphs were generated from 5m and 20m data respectively. [Graphs not reproduced; each panel plots the probability of FP or FN against the prior for 1, 5 and 10 FPS, at altitudes of 5m and 20m.]
[8] M. Muja and D. G. Lowe, "Fast approximate nearest neighbors with automatic algorithm configuration," in International Conference on Computer Vision Theory and Applications (VISAPP'09), Lisboa, Portugal, May 2009.
[9] F. Provost and R. Kohavi, "On applied research in machine learning," in Machine Learning, 1998, pp. 127–132.
[10] A. Ramisa, S. Vasudevan, D. Aldavert, R. Toledo, and R. Lopez de Mantaras, "Evaluation of the SIFT object recognition method in mobile robots," in Proceedings of the 2009 Conference on Artificial Intelligence Research and Development. Amsterdam, The Netherlands: IOS Press, 2009, pp. 9–18.
[11] H. Romero, R. Benosman, and R. Lozano, "Stabilization and location of a four rotor helicopter applying vision," in American Control Conference, 2006, June 2006.
[12] B. Sinopoli, M. Micheli, G. Donato, and T. Koo, "Vision based navigation for an unmanned aerial vehicle," in Robotics and Automation, 2001. Proceedings 2001 ICRA. IEEE International Conference on, vol. 2, 2001, pp. 1757–1764.
[13] P. van Blyenburgh, "UAVs: an overview," Air & Space Europe, vol. 1, no. 5, pp. 43–47, 1999.
[14] P. Viola and M. J. Jones, "Robust real-time face detection," Intl. Journal of Computer Vision, vol. 57, no. 2, pp. 137–154, May 2004.
[15] J.-C. Zufferey and D. Floreano, "Toward 30-gram autonomous indoor aircraft: Vision-based obstacle avoidance and altitude control," in IEEE International Conference on Robotics and Automation, 2005, April 2005, pp. 2594–2599.