Artificial intelligence for automated detection of large mammals creates path to upscale drone surveys

Scientific Reports (2023) 13:947 | https://doi.org/10.1038/s41598-023-28240-9
Articial intelligence for automated
detection of large mammals
creates path to upscale drone
surveys
Javier Lenzi1*, Andrew F. Barnas1,2, Abdelrahman A. ElSaid3, Travis Desell4, Robert F. Rockwell5 & Susan N. Ellis-Felege1

1Department of Biology, University of North Dakota, Grand Forks, ND 58202, USA. 2School of Environmental Studies, University of Victoria, Victoria, BC V8W 2Y2, Canada. 3Department of Computer Science, University of North Carolina Wilmington, Wilmington, NC, USA. 4Department of Software Engineering, Rochester Institute of Technology, Rochester, NY, USA. 5Vertebrate Zoology, American Museum of Natural History, New York, NY 10024, USA. *email: javier.lenzi@und.edu
Imagery from drones is becoming common in wildlife research and management, but processing data efficiently remains a challenge. We developed a methodology for training a convolutional neural network model on large-scale mosaic imagery to detect and count caribou (Rangifer tarandus), compare model performance with an experienced observer and a group of naïve observers, and discuss the use of aerial imagery and automated methods for large mammal surveys. Combining images taken at 75 m and 120 m above ground level, a faster region-based convolutional neural network (Faster-RCNN) model was trained using annotated imagery with the labels "adult caribou", "calf caribou", and "ghost caribou" (animals moving between images, producing blurred individuals during the photogrammetry processing). Accuracy, precision, and recall of the model were 80%, 90%, and 88%, respectively. Detections between the model and the experienced observer were highly correlated (Pearson: 0.96–0.99, P value < 0.05). The model was generally more effective in detecting adults, calves, and ghosts than naïve observers at both altitudes. We also discuss the need to improve the consistency of observers' annotations if manual review will be used to train models accurately. Generalization of automated methods for large mammal detection will be necessary for large-scale studies with diverse platforms, airspace restrictions, and sensor capabilities.
Drones oer a variety of advantages that make them a powerful tool for wildlife ecologists1,2. In the past, it has
been challenging to obtain data of animal counts spatially and temporally because aircra missions and satel-
lite images are expensive, and ground-based surveys in many cases are restrictive in terms of accessibility to
sites, the areas that could be covered, and the low cost-eectiveness ratio. More recently, drones have emerged
as a highly cost-eective tool that allows researchers to reduce survey costs, notably increasing the amount of
high-quality information35. Additionally, drones can oer a non-invasive technique that reduces disturbance
in comparison with traditional approaches6,7. For these reasons, this technology is being increasingly adopted
by wildlife ecologists.
Studies of species detection, abundance, distribution, behavior, and reproduction of terrestrial and marine vertebrates using drone technology have been growing in the past two decades3,4,8,9. In particular, studies using drones have been carried out on terrestrial mammals, mostly on large herbivores10–18. Most of these studies have been conducted in African ecosystems, like the savannas; however, studies on wild terrestrial mammalian herbivores are still lacking in Arctic and sub-Arctic ecosystems. These are logistically and financially challenging regions19, where the survey area that needs to be covered is usually very large, and weather conditions make occupied survey flights dangerous20 and satellites unreliable21. Fortunately, drones can ameliorate all three of those problems and could be used for conservation, research, and monitoring in these challenging environments.
One species of conservation interest is caribou (Rangifer tarandus), where population declines appear associated with human activities along its distributional range22. Available methodologies used to monitor caribou populations, such as collaring or monitoring with occupied aircraft, although useful, can be disruptive to individuals, financially challenging, and logistically intensive23. The use of alternative methodologies,
such as drones, to study caribou in sub-Arctic habitats of northern North America has been assessed by Patterson et al.24. These authors evaluated the use of drones as a methodology to manually detect and count surrogate caribou in natural habitats. However, to our knowledge drone technology has not been empirically evaluated with wild caribou. Additionally, the generation and management of large amounts of raw data and manual counting from imagery are still time consuming, inefficient, and error-prone, decreasing the benefits of the technology25,26. Therefore, efficient and cost-effective approaches to detect and count wild caribou could be advantageous, because the species' broad distribution requires access to remote locations and coverage of extensive (in the range of millions of km2) sections of land27. This situation imposes a challenge for researchers in their efforts to acquire and rapidly analyze data. As a result, monitoring that incorporates drones also requires the development of automated procedures to provide accurate and timely information for wildlife managers23.
One approach to facilitate detection and counting of individuals from aerial imagery is machine learning, and in particular the development of convolutional neural networks (CNNs), which are highly successful for computer vision tasks. CNNs are a type of deep neural network useful for image classification and recognition, and they are composed of two elements: feature extraction and classification. The purpose of feature extraction is to produce feature maps, which is carried out by operations called convolutions28. A convolution applies a filter that slides over an input image (or other feature map), combining the input values and the filter values to produce another feature map. The process is repeated multiple times with different layers of convolutional filters, resulting in multiple layers of feature maps of progressively smaller sizes, where the final layer is a vector of single values, as opposed to tensors of feature maps29. Then, the classification part takes this final layer and adds a small number of fully connected layers, similar to a regular feed-forward neural network28,30. The classification part ends with an output layer, typically a softmax for classification tasks, which provides a predicted probability for each of the target objects. Applications of CNNs to drone imagery have been growing during the past decade31, for instance in koalas (Phascolarctos cinereus)32, cetaceans33, olive ridley sea turtles (Lepidochelys olivacea)34, kiang (Equus kiang)35, birds36–40, and a set of African terrestrial mammals41–43. Depending on the quality of the imagery and the amount of training data, evidence shows that the precision and accuracy of detections using CNNs can be high, in some cases better than human counts25. As a result, there are opportunities to develop CNNs for a host of different wildlife surveys, including methods to count large mammals in remote locations, such as the challenge caribou pose.
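To make the feature-extraction and classification structure described above concrete, the following is a minimal, hypothetical sketch of a small image classifier in Python with Keras. The layer sizes, tile size, and three-class output (adult, calf, ghost) are illustrative assumptions only; they are not the architecture used in this study, which relies on the Faster-RCNN detector described in the Methods.

```python
# Minimal illustrative CNN classifier (not the Faster-RCNN used in this study).
# Assumes three output classes (adult, calf, ghost) and small RGB input tiles.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),          # hypothetical tile size
    layers.Conv2D(16, 3, activation="relu"),    # convolution: filter slides over the image
    layers.MaxPooling2D(),                      # feature maps shrink progressively
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),                           # final feature maps flattened to a vector
    layers.Dense(64, activation="relu"),        # fully connected classification layers
    layers.Dense(3, activation="softmax"),      # predicted probability per class
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```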
The objectives of this study were to train a CNN to detect and classify caribou from large-scale drone imagery, as most modern CNN architectures are not capable of dealing with huge input images (e.g., mosaics exceeding sizes of 50k by 50k pixels). Our aim was to develop an efficient and cost-effective approach to provide accurate and timely information for researchers and wildlife managers. Additionally, in studies where automatic detection and classification algorithms are developed, manual classification is employed for two reasons: first, to train and develop algorithms, and second, for validation44. Both processes could be carried out by expert and/or naïve observers (besides citizen science ventures). In this study, we used an expert observer (who was involved in the field data collection) and a team of qualified naïve observers (some of whom are experienced image analysts in other contexts) to manually classify detections of different types of caribou (see "Manual counts" section for caribou type details). The experienced observer's classifications were used to train and test the CNN model. In addition, annotations of naïve observers were used to mimic a lifelike scenario, where a qualified team of volunteers is employed to generate training data for algorithms in large-scale contexts, from detections and classifications of terrestrial mammals in drone imagery. Thus, our second objective was to compare the CNN model's detections and classifications to the detections and classifications provided by our team of naïve observers. Finally, we discuss the limitations and what is needed to scale up our approach for large-scale studies addressing populations of large terrestrial mammals.
Methods
Study area. We conducted drone surveys on 18 July 2016 within the braided delta of the Mast River in Wapusk National Park, Manitoba, Canada (Supplementary Fig. S1.1 online). The study area where imagery was collected is 1 km2 (Supplementary Information). It consists primarily of small dwarf willow-dominated (Salix sp.) islands (approximately 1–300 m2), open graminoid meadows, and river habitat. For an in-depth geophysical and biological description of the study area see45–48. The Cape Churchill sub-population present in the study area was estimated at 2937 individuals in 2007 and is part of the Eastern Migratory caribou population, recently designated as Endangered27.
Drone surveys. During drone surveys of nesting common eiders (Somateria mollissima)49, a herd of caribou moved into the study area and remained bedded down or mostly sedentary for several hours. We opportunistically collected imagery of the entire herd during our eider survey. Flights were performed with a small fixed-wing drone (Trimble UX5), which carried a 16.1 MP optical camera in the nadir position. Images of caribou were collected during four flights between 09:08 and 12:41, at altitudes of 120 m (2 flights) and 75 m (2 flights) above ground level (AGL). Following surveys, individual images from each flight were stitched together using Pix4D v. 3.1.22 to create four georeferenced orthomosaic TIFF images (ground sampling distance: 3.7 cm at 120 m and 2.4 cm at 75 m), which were subsequently used to perform manual and automated counts. For further details such as payload, sensor, data collection, data post-processing, permits, regulations, training, and quality reports of this study, see the Drone Reporting Protocol in the Supplementary Information50.
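As a rough check of how the reported ground sampling distances scale with altitude, the sketch below computes GSD from nominal camera parameters. The sensor width, focal length, and image width used here are plausible values for the class of camera carried by the UX5, but they are assumptions for illustration rather than specifications taken from this study.

```python
# Hypothetical ground sampling distance (GSD) calculation.
# GSD (m/pixel) = altitude * sensor_width / (focal_length * image_width_px)
def gsd_cm(altitude_m, sensor_width_m=0.0235, focal_length_m=0.015, image_width_px=4912):
    """Return approximate GSD in cm/pixel; the camera parameters are assumed values."""
    return 100 * altitude_m * sensor_width_m / (focal_length_m * image_width_px)

for alt in (75, 120):
    print(f"{alt} m AGL -> ~{gsd_cm(alt):.1f} cm/pixel")
# With these assumed parameters the result is roughly 2.4 cm at 75 m and 3.8 cm at 120 m,
# close to the 2.4 cm and 3.7 cm reported above.
```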
Methods were planned in accordance with the relevant guidelines and regulations. The study was designed considering the potential impacts on target and non-target species. Thus, we flew no lower than 75 m above ground level to reduce disturbance to caribou and other biodiversity; this was also the lowest altitude threshold at which the Trimble fixed-wing drone could fly. Also, according to national regulations for drone operations, 122 m is the
maximum height that we were authorized to fly, so to stay below this threshold we restricted our maximum altitude to 120 m. Data collection and field procedures were authorized by Canadian Wildlife Service Research and Collection Permit 16-MB-SC001, Wapusk National Park WAP-2005-18760 and WAP-2015-18846, UND's Institutional Animal Care and Use Committee #A3917-01 (protocol 1505-2), UND's Unmanned Aircraft Systems Research Compliance Committee (which reviewed human privacy and data management; approved April 10, 2015), and a Special Flight Operations Certificate (File: 5812-11-302, ATS: 17-18-00008,40, RDIMS: 13138829).
Manual counts. Manual counts of caribou on each of the four mosaics were performed by six observers. One (AFB) is an experienced observer who participated in the field work and is acquainted with the behavior and spatial distribution of this caribou herd. The remaining five observers (naïve observers) lacked experience with the species, although some are experienced image analysts in other settings. All naïve observers were specifically trained in the counting procedure. To perform the identification and classification, all observers used the platform Open UAS Repository—OUR (https://digitalag.org/our/). OUR is a web-based infrastructure that allows users to upload, share, and collaborate in the analysis of large-scale UAS imagery. Mosaics were uploaded to this repository for the observers to search for and count caribou individuals with the aid of a 50 × 50 m grid overlay across the image.
The counting procedure involved the identification of three types of targets: "adult caribou", "calf caribou", and "ghost caribou", the last being a product of the image mosaic processing. Although adult caribou were dominant in the images and their body size was variable, calves (smaller individuals) could be distinguished based on their size (Fig. 1a). "Ghosts", however, could be of either size and appeared blurred or even transparent in the images (Fig. 1a). Because individuals move during image collection, they become visible in multiple images as "ghosts" that appear in one image but are absent from an overlapping image, which causes a challenge for the mosaicking process and ultimately for the automated recognition algorithms. Thus, we decided to include this category in the classification in order to account for this potential source of error.
During the classification process, the observer used the labeling tool of the Open UAS Repository to draw a rectangle or bounding box surrounding each individual identified (Fig. 1b). Each rectangle contains the actual caribou or ghost, including all its pixels and the least possible amount of background (Fig. 1b). After the labeling process, each classified bounding box was logged into a text file containing the type of label and a list of vertex coordinates (pixel coordinates) of the rectangles for all classified caribou. In addition to labeling, observers were asked to record the time they spent processing each image, for later comparison with the automatic detection algorithm.
Figure 1. (A) Examples of adult (red circle), calf (blue circle), and ghost (yellow circle) caribou that observers classified in the Open UAS Repository—OUR (https://digitalag.org/our/). (B) Adults (red rectangles), calves (blue rectangles), and ghosts (yellow rectangles) after classification in the repository. Figure was assembled using Affinity Designer v 1.9.1.979 (https://affinity.serif.com/).
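Annotation files of this kind are straightforward to read back for training or comparison. The exact file layout produced by the Open UAS Repository is not specified here, so the comma-separated format below (label plus corner pixel coordinates) and the file name are hypothetical examples used only to illustrate the idea.

```python
# Illustrative parser for exported bounding-box annotations.
# Assumed (hypothetical) line format: label,x_min,y_min,x_max,y_max in pixel coordinates.
from collections import Counter

def load_annotations(path):
    boxes = []
    with open(path) as fh:
        for line in fh:
            label, x_min, y_min, x_max, y_max = line.strip().split(",")
            boxes.append((label, int(x_min), int(y_min), int(x_max), int(y_max)))
    return boxes

boxes = load_annotations("mosaic1_labels.txt")   # hypothetical file name
print(Counter(label for label, *_ in boxes))     # e.g. counts of adult/calf/ghost labels
```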
Automatic detections: Faster-RCNN. We trained a Faster-RCNN (faster region-based convolutional neural network) model51 with the goal of being independent of the image resolutions of the sensors or flight characteristics, such as altitude, that impact pixel sizes and ultimately ground sampling distance. To utilize the captured large-scale mosaics as training data, these mosaic images were cut into smaller pieces or "tiles" to train and test the model52,53. Tiling is a useful method when computer memory limits the training of large data sets. Thus, the four orthomosaic files (120 m AGL: 845 MB and 980 MB; 75 m AGL: 1.77 GB and 1.88 GB) were tiled into 1000 pixel × 1000 pixel images. Occasionally, as a product of the tiling process, individual caribou (adults, calves, or ghosts) could be cut off and split across two consecutive tiles. To avoid losing or double counting split animals, tiles were overlapped by 100 pixels on the right and lower borders, so that if an animal was located on an edge, it was counted in the following tile (see Supplementary Fig. S2.1 online). Finally, to evaluate the performance of
Figure1. (A) Examples of adult (red circle), calf (blue circle), and ghost (yellow circle) caribou that observers
classied at the Open UAS Repository—OUR (https:// digit alag. org/ our/). (B) Adults (red rectangles), calves
(blue rectangles), and ghosts (yellow rectangles) aer the classication in the repository. Figure was assembled
using Anity Designer v 1.9.1.979 (https:// an ity. serif. com/).
the Faster-RCNN independent of the ground sampling distance, the tiles of the four orthomosaics were shuffled for training.
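A minimal sketch of the tiling step described above, assuming the mosaic is available as a NumPy array: it cuts the image into 1000 × 1000 pixel tiles whose origins are spaced 900 pixels apart, so consecutive tiles share a 100-pixel overlap on the right and lower borders. The tile size and overlap follow the values in the text; the function and variable names are illustrative.

```python
# Cut a large orthomosaic array into overlapping tiles (sketch).
# tile = 1000 px, step = 900 px, so adjacent tiles overlap by 100 px as described above.
import numpy as np

def tile_mosaic(mosaic, tile=1000, overlap=100):
    step = tile - overlap
    tiles = []
    h, w = mosaic.shape[:2]
    for y in range(0, max(h - overlap, 1), step):
        for x in range(0, max(w - overlap, 1), step):
            # edge tiles may be smaller than tile x tile pixels
            tiles.append(((x, y), mosaic[y:y + tile, x:x + tile]))
    return tiles

mosaic = np.zeros((5000, 4000, 3), dtype=np.uint8)   # placeholder for a real mosaic
tiles = tile_mosaic(mosaic)
print(len(tiles), tiles[0][1].shape)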
A total of 2148 tiles were produced and split into a training data set of 1607 tiles (75%; Mosaic 1 at 120 m AGL: 271 tiles, Mosaic 2 at 120 m AGL: 288 tiles, Mosaic 3 at 75 m AGL: 548 tiles, and Mosaic 4 at 75 m AGL: 500 tiles) and a testing data set of 541 tiles (25%, including tiles with no caribou on them as negative examples; Mosaic 1 at 120 m AGL: 80 tiles, Mosaic 2 at 120 m AGL: 105 tiles, Mosaic 3 at 75 m AGL: 173 tiles, Mosaic 4 at 75 m AGL: 183 tiles; Fig. 2). The training data set was employed to train the Faster R-CNN model (Fig. 2) using TensorFlow54. One NVIDIA GPU and one Intel(R) Xeon(R) CPU at 2.2 GHz were used for training and testing of the model. The training took a week and performed 20,000 epochs of backpropagation using stochastic gradient descent. During model training, and at fixed intervals (30 training epochs), the model was assessed using the testing data set. When the training learning curve of the model55,56 was flat, the threshold was reached and the training was concluded (see Fig. 2). The model was compared against our experienced observer to determine its performance. Using this approach, we assumed that the experienced observer did not miss any individual and correctly classified all the adults, calves, and ghosts.
The performance of the Faster-RCNN model was evaluated by estimating accuracy, precision, and recall. Accuracy was defined as the proportion of true positives in relation to the experienced observer. Precision was defined as the proportion of true positives predicted by the model that were actually correct. Recall was defined as the proportion of true positives in relation to all the relevant elements. Accuracy, precision, and recall were estimated as follows:

$$\text{Accuracy} = \frac{\text{true positives}}{\text{true positives} + \text{false positives} + \text{false negatives}} \quad (1)$$

$$\text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}} \quad (2)$$

$$\text{Recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \quad (3)$$
Figure2. Flowchart of the steps followed to train and test the Faster R-CNN model from the expert observer
counts. Results of the trained detection model were then used for comparisons with the experienced observer
counts.
Accuracy, precision, and recall were estimated for caribou as a species, for the caribou categories (adults, calves, and ghosts), and for each of the four mosaics at 120 m and 75 m AGL. Using the testing data set and comparing the tiles classified by the Faster-RCNN model with those classified by our experienced observer, we proceeded to estimate true positives, false positives, and false negatives as follows. First, we counted all the individual caribou identified by the Faster-RCNN model that matched our experienced observer and classified them as true positives. Then, in relation to our experienced observer, we counted the caribou that were missed as false negatives, and those that were misclassified (birds, rocks, trees, and trunks classified as caribou) as false positives (Fig. 3). Second, we counted all the caribou categories separately: adults, calves, and ghosts identified by the Faster-RCNN model that matched our experienced observer were classified as true positives. Then, we counted the adults, calves, and ghosts missed by our model as false negatives, and those that were misclassified (calves classified as adults or ghosts, adults as calves or ghosts, ghosts as adults or calves; birds, rocks, trees, and trunks classified as any of these caribou categories) as false positives (Fig. 3). To estimate accuracy, we did not use true negatives because our approach did not consider classifying the absence of caribou in the images, and
Figure3. Example of comparison of experienced observer counts (A, C) with the Faster-RCNN (B) and naïve
observers (D). Blue arrows indicate misclassications and yellow arrows missed individuals. Note that at the
Faster-RCNN image, detection of an adult and a ghost are overlapped. Counted adults are the true positives,
misclassications are false positives, and missed individuals are false negatives. Figure was assembled using
Anity Designer v 1.9.1.979 (https:// an ity. serif. com/).
Faster-RCNN is an object detection model which only draws bounding boxes (detections) on classes of interest. For such models, true negatives are typically not calculated.
Figure 3. Example comparison of experienced observer counts (A, C) with the Faster-RCNN (B) and naïve observers (D). Blue arrows indicate misclassifications and yellow arrows missed individuals. Note that in the Faster-RCNN image, the detections of an adult and a ghost overlap. Counted adults are the true positives, misclassifications are false positives, and missed individuals are false negatives. Figure was assembled using Affinity Designer v 1.9.1.979 (https://affinity.serif.com/).
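A short helper implementing Eqs. (1)–(3) from true positive, false positive, and false negative counts; the counts passed in the example are placeholders, not values from this study's tables.

```python
# Accuracy, precision, and recall as defined in Eqs. (1)-(3); true negatives are not used.
def detection_metrics(tp, fp, fn):
    accuracy = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Placeholder counts for illustration only.
acc, prec, rec = detection_metrics(tp=80, fp=10, fn=15)
print(f"accuracy={acc:.2f}, precision={prec:.2f}, recall={rec:.2f}")
```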
Comparison between the Faster-RCNN model and naïve observers' classifications. Comparisons between the Faster-RCNN model counts and naïve observers' counts were established in relation to our experienced observer. First, from each tile used for testing, we counted all the individual caribou predicted by the Faster-RCNN model output that matched the experienced observer (true positives). Then, we counted the caribou that were missed (false negatives) and those that were misclassified (rocks, birds, trees, and tree trunks; Fig. 3). Second, we repeated this procedure incorporating the categories adults, calves, and ghosts. To estimate accuracy, precision, and recall we employed Eqs. (1)–(3), respectively. In addition, we estimated the percentage difference (i.e., the proportion of detections and classifications in relation to the true positives) to evaluate how the Faster-RCNN model and naïve observers' counts compared with the experienced observer. A graphical comparison of the counts in each tile between the experienced observer and the Faster-RCNN model was implemented, and Pearson correlation coefficients were estimated. Also, we assessed the amount of time allocated to image classification by all the observers and the time the Faster-RCNN took to produce the outputs, to compare how time-consuming both approaches are. Finally, we evaluated the proportion of misclassifications and missed individuals of the Faster-RCNN model and naïve observers in relation to the true positives for each of the mosaics and caribou classes (adults, calves, and ghosts).
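A sketch of the per-tile comparison described above, assuming the per-tile counts are available as two aligned lists. The sample counts are made up, and the percentage difference is computed here as the relative difference from the experienced observer's total count, which is an assumed reading of the definition given in the text.

```python
# Per-tile comparison of model counts against the experienced observer (sketch).
# The per-tile counts below are placeholders, not data from this study.
from scipy.stats import pearsonr

observer = [3, 0, 5, 2, 7, 1, 0, 4]     # experienced observer counts per tile (placeholder)
model    = [4, 0, 5, 3, 8, 1, 1, 4]     # Faster-RCNN counts per tile (placeholder)

r, p_value = pearsonr(observer, model)
pct_diff = 100 * (sum(model) - sum(observer)) / sum(observer)   # assumed definition
print(f"Pearson r={r:.2f}, p={p_value:.3f}, percentage difference={pct_diff:+.1f}%")
```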
Results
Faster-RCNN performance. Overall accuracy of the Faster-RCNN model was 80%, precision 90%, and recall 88%. The model performed better at the higher altitude, with accuracies between 80 and 88% at 120 m AGL, and between 75 and 76% at 75 m AGL (Table 1). When the performance of the Faster-RCNN model was analyzed by caribou class and altitude, the model was more efficient in detecting adults and calves than ghosts in both 120 m AGL mosaics (Table 2). At 75 m AGL, accuracy and precision for adults were higher than for calves and ghosts in the first mosaic, although recall for adults was lower than for calves and ghosts. In the second 75 m mosaic, the highest accuracy was for calves, followed by adults, and it decreased markedly for ghosts (Table 2). However, precision for ghosts was higher than for adults, although lower than for calves in this mosaic, and recall was highest for adults, followed by calves and ghosts (Table 2).
Faster-RCNN versus manual counts. The counts of the Faster-RCNN and the experienced observer per tile for each of the mosaics showed high correlation at 120 m and 75 m (Fig. 4). Overall, most of the detections of the Faster-RCNN model were overestimates in relation to the experienced observer (Fig. 4), and the percentage difference between the two confirms an overestimation between a minimum of + 8.2% (at 120 m) and a maximum of + 63.5% (at 75 m, Table 1).
The Faster-RCNN and experienced observer counts per tile and per caribou class for each of the mosaics showed in general high correlations for adults, calves, and ghosts (Fig. 4). In most cases, the Faster-RCNN overestimated the count of adults, calves, and ghosts (Fig. 4, Table 2). The percentage difference for adults ranged between + 19.5% (at 120 m) and + 36.2% (at 75 m). For calves, this percentage oscillated between − 7.4% (at 120 m) and + 52.6% (at 75 m), and for ghosts, between − 12.0% (at 120 m) and + 140.0% (at 120 m) (Table 2).
Figure 4. Comparison between (A) the individual caribou counts and (B) the adult, calf, and ghost caribou counts of the experienced observer and the Faster-RCNN per tile for each mosaic. The diagonal 1:1 line indicates a perfect fit between the experienced observer and Faster-RCNN counts. Plots were created using R v 4.1.3 (R Core Team, 2022) and the figure was assembled using Affinity Designer v 1.9.1.979 (https://affinity.serif.com/).
Table 1. Comparison between the Faster-RCNN model, expert observer (AFB), and naïve observers' counts (raw counts, true positives: TP, false positives: FP, false negatives: FN, accuracy, precision, recall, and percentage difference: % diff.) of caribou for each of the mosaics at 120 m and 75 m AGL in Wapusk National Park, Manitoba, Canada.
Mosaic Raw counts TP FP FN Accuracy Precision Recall % diff
Experienced observer
Mosaic 1 (120m) 95
Mosaic 2 (120m) 109
Mosaic 3 (75m) 63
Mosaic 4 (75m) 136
Model
Mosaic 1 (120m) 114 91 16 3 0.88 0.91 0.97 + 20.0
Mosaic 2 (120m) 118 101 24 20 0.80 0.95 0.83 + 8.2
Mosaic 3 (75m) 103 56 22 1 0.75 0.76 0.98 + 63.5
Mosaic 4 (75m) 168 129 24 28 0.76 0.91 0.82 + 23.5
All observers
Avg ± S.D
Mosaic 1 (120m) 99.2 ± 9.7 28.0 ± 20.9 3.6 ± 4.1 2.5 ± 2.6 0.81 0.88 0.92 + 4.0
Mosaic 2 (120m) 114.4 ± 8.2 33.9 ± 16.6 4.3 ± 3.4 2.1 ± 1.5 0.84 0.89 0.94 + 4.3
Mosaic 3 (75m) 67.7 ± 15.9 18.3 ± 16.0 1.9 ± 1.9 1.4 ± 1.7 0.83 0.89 0.93 + 5.9
Mosaic 4 (75m) 150.1 ± 14.8 54.1 ± 18.6 4.4 ± 3.7 3.6 ± 5.0 0.84 0.90 0.92 + 9.3
Overall, naïve observers showed relatively high accuracy and precision in each of the mosaics (Table 1); when adults, calves, and ghosts were analyzed independently, accuracy and precision ranges were more variable, especially because ghosts were detected and classified with higher error (Table 2). Percentage difference at 120 m ranged between − 9.7% and + 63.6%, and at 75 m varied between − 12.9% and + 38.2% (Table 2). Adults' percentage difference varied between − 12.9% at 75 m and + 1.6% at 120 m, while calves varied between − 6.9% at 120 m and + 18.2% at 75 m. Ghosts were the most variable category, with percentage differences between − 10.0% at 75 m and + 63.6% at 120 m.
In relation to the time spent by the human observers on the classification of the images, it took on average 121.6 ± 45.9 min for the first 75 m mosaic, 103.8 ± 66.1 min for the second 75 m mosaic, 59.2 ± 20.3 min for the third mosaic at 120 m, and 51.9 ± 42.5 min for the last 120 m mosaic. Not considering the training phase of the Faster-RCNN model, which took 1 week, the time it took the model to process the testing set was 19.8 min.
Table 2. Comparison between the Faster-RCNN model, expert observer (AFB), and naïve observers' counts (raw counts, true positives: TP, false positives: FP, false negatives: FN, accuracy, precision, recall, and percentage difference: % diff.) of adult, calf, and ghost caribou for each of the mosaics at 120 m and 75 m AGL in Wapusk National Park, Manitoba, Canada.
Mosaic Class Raw counts TP FP FN Accuracy Precision Recall % diff
Experienced observer
Mosaic 1 (120m)
Adults 62
Calves 29
Ghosts 4
Mosaic 2 (120m)
Adults 59
Calves 25
Ghosts 25
Mosaic 3 (75m)
Adults 44
Calves 9
Ghosts 10
Mosaic 4 (75m)
Adults 77
Calves 25
Ghosts 34
Faster RCNN model
Mosaic 1 (120m)
Adults 77 65 10 0 0.87 0.87 1.00 + 19.5
Calves 27 24 1 2 0.89 0.96 0.92 − 7.4
Ghosts 10 6 5 1 0.50 0.55 0.86 + 140.0
Mosaic 2 (120m)
Adults 70 51 13 5 0.74 0.80 0.91 + 15.7
Calves 26 19 5 3 0.70 0.79 0.86 + 3.8
Ghosts 22 16 6 12 0.47 0.73 0.57 − 12.0
Mosaic 3 (75m)
Adults 69 48 12 1 0.79 0.80 0.98 + 36.2
Calves 19 11 6 0 0.65 0.65 1.00 + 52.6
Ghosts 15 12 4 0 0.75 0.75 1.00 + 33.3
Mosaic 4 (75m)
Adults 102 78 17 0 0.82 0.82 1.00 + 24.5
Calves 29 22 2 2 0.85 0.92 0.92 + 13.8
Ghosts 37 31 5 26 0.50 0.86 0.54 + 8.1
All observers
(Avg ± S.D.)
Mosaic 1 (120m)
Adults 56.6 ± 8.6 56.4 ± 8.7 3.2 ± 1.8 3.8 ± 3.8 0.89 0.94 0.93 − 9.7
Calves 27.6 ± 3.0 24.8 ± 2.9 2.8 ± 2.4 2.6 ± 2.3 0.83 0.89 0.91 − 6.9
Ghosts 11.2 ± 7.9 5.8 ± 3.1 5.4 ± 6.5 1.2 ± 1.1 0.50 0.59 0.83 + 63.6
Mosaic 2 (120m)
Adults 60.6 ± 5.9 55.0 ± 6.5 6.0 ± 1.9 2.4 ± 2.1 0.87 0.90 0.96 + 1.6
Calves 25.0 ± 3.9 21.2 ± 2.7 3.8 ± 2.8 1.6 ± 1.3 0.81 0.85 0.93 0.0
Ghosts 28.8 ± 10.1 25.6 ± 8.2 3.2 ± 4.5 2.4 ± 1.1 0.83 0.91 0.91 + 13.7
Mosaic 3 (75m)
Adults 41.5 ± 4.4 39.8 ± 4.5 1.2 ± 0.8 1.4 ± 1.3 0.94 0.97 0.97 − 6.8
Calves 11.4 ± 3.0 7.4 ± 1.5 4.0 ± 2.6 0.4 ± 0.5 0.64 0.67 0.95 + 18.2
Ghosts 9.2 ± 3.5 7.6 ± 2.5 1.6 ± 2.1 2.4 ± 2.3 0.67 0.86 0.76 − 10.0
Mosaic 4 (75m)
Adults 66.8 ± 8.3 63.0 ± 7.6 3.8 ± 2.5 6.2 ± 6.5 0.86 0.94 0.91 − 12.9
Calves 27.8 ± 3.9 24.6 ± 2.6 3.2 ± 2.9 1.0 ± 1.0 0.86 0.89 0.96 + 10.7
Ghosts 54.8 ± 16.6 47.8 ± 14.7 7.0 ± 4.8 3.8 ± 5.2 0.80 0.87 0.93 + 38.2
Misclassications and missing individuals. In the rst 120m mosaic, the Faster-RCNN missed less
individuals and proportions of adults, calves, and ghosts than the naïve observers (Table3, Supplementary Fig.
S2.2 online). In the second 120m mosaic, however, naïve observers detected more ghosts than the Faster-RCNN
model, although calves and adults were better detected by the model (Table3, Supplementary Fig. S2.2 online).
In the rst mosaic at 75m, the Faster-RCNN model did not miss calves and ghosts and missed only one adult
(Table3, Supplementary Fig. S2.2 online). In the second mosaic at 75m, the Faster-RCNN model did not miss
adults, detected more calves and a much lower proportion of ghosts than observers (Table3, Supplementary Fig.
S2.2 online).
Faster-RCNN classied caribou better than the pool of observers in the rst 120m mosaic. Adults classied
as ghosts, was the misclassication with the highest proportion for both, the Faster-RCNN model and observers
(Table4, Supplementary Fig. S2.3 online). In the second 120m mosaic, Faster-RCNN classied equally or better
than the pool of naïve observers, except for ghosts classied as adults that was higher for the model (Table4,
Supplementary Fig. S2.3 online). At 75m, misclassications of the Faster-RCNN model were overall lower than
Figure4. Comparison between (A) the individual caribou counts and (B) caribou adults, calves, and ghosts of
the experienced observer and the Faster-RCNN per tile for each mosaic. e diagonal 1:1 line indicates a perfect
t of the experienced observer and Faster-RCNN counts. Plots were created using R v 4.1.3 (R Core Team, 2022)
and gure assembled using Anity Designer v 1.9.1.979 (https:// an ity. serif. com/).
the observers' misclassifications (Table 4, Supplementary Fig. S2.3 online). However, in the first mosaic the model misclassified more calves as adults and ghosts as calves. In the second 75 m mosaic, the model misclassified more calves as ghosts, and ghosts as adults (Table 4, Supplementary Fig. S2.3 online).
Naïve observers did not misclassify objects (i.e., trees, trunks, rocks, or birds) as caribou. However, the Faster-RCNN model did misclassify objects as caribou in all the mosaics. At 120 m, 9 objects were misclassified as adults, 1 as a calf, and 1 as a ghost. At 75 m, 22 objects were misclassified as adults, 5 as calves, and 4 as ghosts.
Discussion
To the best of our knowledge, we present one of the first attempts to employ automated detection of large mammals from drone-based imagery in North America. We developed a method for training Faster-RCNN models on large-scale mosaic imagery to classify and count caribou adults, calves, and ghosts independent of the altitude and ground sampling distance of the collected imagery. After comparing the detections and classifications of the Faster-RCNN model with those of an experienced observer on the same images, we found that the automatic detection and classification performance is promising for future implementations. When the analysis was performed by mosaic and caribou class, the Faster-RCNN model performance was also promising, and in some cases it accomplished better outcomes than the naïve observers. However, ghosts were the category whose detection and classification were challenging for both the Faster-RCNN and the naïve observers. Adults and calves in some cases were better detected and classified by the Faster-RCNN model than by the naïve observers. However, a high proportion of adults were misclassified as calves in all the mosaics, mostly by some naïve observers rather than by the Faster-RCNN model. This study suggests that there is a need to improve consistency among observers in classifying groups, which is required to train models accurately in large-scale studies. These types of studies are also challenged by double counting of individuals, a problem that needs to be overcome. Our study found that, having trained the model on images with different ground sampling distances, detection and classification of caribou is satisfactory, opening promising new avenues for the implementation of large-scale studies and monitoring.
In applied contexts, the benets of drone technology could be challenged by the high amount of informa-
tion collected by the sensors, which articial intelligence is attempting to unravel. Given these extensive data
sets, especially from highly mobile species as the one analyzed in this study, practitioners could be benetted by
using teams of human observers as ground truth for labeling and model training. In this scenario, a minimum
level of consistency is desirable for successfully training algorithms from multiple operators57, because accuracy
of a model could be undermined by the high uncertainty of the observer annotations58. Our study found that
Table 3. Number and proportion of caribou adults, calves, and ghosts that were missed by the Faster-RCNN
and the pool of naïve observers (mean ± SD) per mosaic.
Mosaic Class Missing Proportion missing
Faster-RCNN model
Mosaic 1 (120m)
Adults 0 0.00
Calves 2 0.08
Ghosts 1 0.14
Mosaic 2 (120m)
Adults 5 0.09
Calves 3 0.14
Ghosts 12 0.43
Mosaic 3 (75m)
Adults 1 0.02
Calves 0 0.00
Ghosts 0 0.00
Mosaic 4 (75m)
Adults 0 0.00
Calves 2 0.08
Ghosts 26 0.46
Naïve observers
Mosaic 1 (120m)
Adults 3.8 ± 3.7 0.07
Calves 2.6 ± 2.3 0.09
Ghosts 1.2 ± 1.1 0.20
Mosaic 2 (120m)
Adults 2.4 ± 2.1 0.04
Calves 1.6 ± 1.3 0.07
Ghosts 2.4 ± 1.1 0.09
Mosaic 3 (75m)
Adults 1.4 ± 1.3 0.03
Calves 0.4 ± 0.5 0.05
Ghosts 2.4 ± 2.3 0.24
Mosaic 4 (75m)
Adults 6.2 ± 6.5 0.09
Calves 1.0 ± 1.0 0.04
Ghosts 3.8 ± 5.3 0.09
detection and classication varied among observers, which opens questions on how to minimize this variability
for further implementation of articial intelligence in large-scale applied settings. Further, we propose the fol-
lowing actions when designing surveys to improve the quality of training data: (a) the expansion of the training
phase and regular assessments of observers’ asymptotic learning curves; (b) working as a group to allow for
collective and common learning conducive to a more standardized experience; (c) to provide observers with
the same set of tools, such as screens sizes and resolutions, similar timelines with appropriate resting times, and
other standard working conditions like light, temperature, and ventilation; and nally (d) aid observers with
alternative tools such as GPS or radio tracked caribou to verify true positive detections. is way we might be
able to account for inter-observer variation and train models from multiple observers in large-scale situations.
One benet of mosaicking is the potential reduction of double counting when animals move across the land-
scape, since overlapped areas in the imagery are excluded8,59; but the process is not perfect, and ghosts emerge
are one drawback that we aimed to account for with our classication system. It is noteworthy that our Faster-
RCNN model had issues detecting ghosts in two mosaics, one at 120m and one at 75m AGL, which similarly
happened with the naïve observers. e model had also diculties to correctly classify ghosts that were mostly
confounded with adults, although did better than some naïve observers. Considering that the movements of the
caribou herd analyzed in this study were relatively slow, the question of if mosaicking reduce double counting
would work with highly mobile species, arise and might be further assessed. To avoid the presence of ghosts in
mosaics, it could be useful to use the original raw images or strip sampling as an alternative59, although additional
eorts should be allocated to reduce the number of double-counts. For instance, ying more than one drone
simultaneously, similar to employ multiple eld observers, could reduce double counts, although it could be
costly. It may also be helpful to incorporate object tracking components from video footage into the CNN analysis
method, to reduce double counts of the same individuals. In any case, it is important that ight plans consider
minimizing behavioral responses to reduce the chances that animals do not move in reaction to the aircra59,60.
Moreover, if we could design surveys to reduce double counting close to zero, we could be able to explore the use
of hierarchical models to detect and classify individuals. It has been proposed that N-mixture models is a good
method to estimate abundance from aerial imagery59, although these models are very sensitive to violations of
assumptions, i.e., observers do not double count individuals, counts are independent binomial random variables,
detection probabilities are constant, abundance is random and independent, and the population is closed61,62.
Other approaches like a modication of the Horvitz–ompson method63 that combines generalized linear and
additive models for overall probability of detection, false detection, and duplicate detection have been proposed
as better alternatives to N-mixture models64. is could be a promising avenue to couple models that account
Table 4. Counts and proportions of misclassifications of adults, calves, and ghosts by the Faster-RCNN and naïve observers (mean ± SD) per mosaic.
Class Mosaic Faster-RCNN counts Faster-RCNN proportions Naïve observers counts Naïve observers proportions
Adult as calf Mosaic 1 120m AGL 1 0.04 2.6 ± 1.9 0.09
Adult as ghost Mosaic 1 120m AGL 3 0.33 4.2 ± 5.5 0.34
Calf as adult Mosaic 1 120m AGL 2 0.03 2.4 ± 1.3 0.04
Calf as ghost Mosaic 1 120m AGL 1 0.14 1.2 ± 1.3 0.15
Ghost as adult Mosaic 1 120m AGL 0 0.00 0.8 ± 8.3 0.02
Ghost as calf Mosaic 1 120m AGL 0 0.00 0.2 ± 0.4 0.01
Adult as calf Mosaic 2 120m AGL 3 0.14 3.2 ± 2.4 0.13
Adult as ghost Mosaic 2 120m AGL 2 0.11 2.2 ± 2.8 0.06
Calf as adult Mosaic 2 120m AGL 3 0.06 2.8 ± 0.8 0.05
Calf as ghost Mosaic 2 120m AGL 1 0.06 1.0 ± 1.7 0.03
Ghost as adult Mosaic 2 120m AGL 9 0.15 3.2 ± 1.9 0.06
Ghost as calf Mosaic 2 120m AGL 1 0.05 0.6 ± 0.5 0.03
Adult as calf Mosaic 3 75m AGL 0 0.00 3.4 ± 2.4 0.29
Adult as ghost Mosaic 3 75m AGL 1 0.08 1.2 ± 1.8 0.10
Calf as ghost Mosaic 3 75m AGL 0 0.00 0.4 ± 0.5 0.05
Calf as adult Mosaic 3 75m AGL 2 0.04 0.2 ± 0.4 0.0
Ghost as calf Mosaic 3 75m AGL 1 0.08 0.6 ± 0.9 0.08
Ghost as adult Mosaic 3 75m AGL 0 0.00 1.0 ± 1.0 0.03
Adult as calf Mosaic 4 75m AGL 1 0.04 2.0 ± 1.2 0.07
Adult as ghost Mosaic 4 75m AGL 3 0.09 6.2 ± 4.1 0.10
Calf as adult Mosaic 4 75m AGL 2 0.03 1.8 ± 1.9 0.03
Calf as ghost Mosaic 4 75m AGL 1 0.03 0.8 ± 1.3 0.01
Ghost as adult Mosaic 4 75m AGL 4 0.05 2.0 ± 0.7 0.03
Ghost as calf Mosaic 4 75m AGL 1 0.04 1.2 ± 1.8 0.04
for imperfect detection to train neural networks, especially in contexts where data sets are becoming large and difficult to process by human observers.
An algorithm able to learn and classify animals from imagery taken at different altitudes or ground sampling distances could be an advantage for generalizable models65, especially useful for researchers and practitioners. Our Faster-RCNN model was able to satisfactorily detect and classify caribou at both altitudes, with different ground sampling distances. This might open further avenues to overcome difficulties that prevent combining different sources of data, especially when dealing with broadly distributed species. For example, the types of airspace that constrain flights at certain altitudes can vary between countries and regions. Additionally, access to standardized platforms and sensors for long-term and large-scale studies is a challenge, which could be overcome with algorithms like ours that are potentially independent of ground sampling distances. Some successful examples of this approach are present in the literature for the detection of conifers66, crops67, and large mammals15. To achieve that, inter alia, we need to assess the limits within which algorithms can be trained with a range of ground sampling distances and still accurately classify targets, in addition to evaluations under different weather conditions, backgrounds, and species. Ultimately, we may be able to generalize and optimize resources and data, to leverage the application of this technology for studying and managing wildlife.
To successfully apply drone technology to large-scale studies of large mammals, we need to scale up flights to larger landscapes rather than smaller areas. However, there are still technical and logistic limitations related to the use of beyond visual line of sight (BVLOS) platforms that facilitate surveys of larger areas. A few examples of BVLOS use have been carried out in wildlife ecology, mostly in marine settings. For instance, Hodgson et al.68 assessed the detection probability of a BVLOS drone platform used to detect humpback whales (Megaptera novaeangliae) in Australia. Similarly, Ferguson et al.69 evaluated the performance of images taken from a drone platform in relation to direct surveys and imagery from manned aircraft, to detect and count marine mammals in Alaska; the authors concluded that BVLOS platforms are promising, although still expensive and less efficient than human observers onboard occupied aircraft. Isolated marine biodiversity such as marine mammals, seabirds, and tundra communities have been successfully surveyed on King George Island in Antarctica using BVLOS technology70. Nevertheless, a bigger problem for BVLOS surveys would be the sheer amount of data collected; in accordance with our findings, manual counts of wildlife are not scalable due to time restrictions71.
Data availability
The data that support the findings of this study are available from the corresponding author upon request.
Received: 29 September 2022; Accepted: 16 January 2023
References
1. Chapman, A. It’s okay to call them drones. J. Unmanned Veh. Syst. 2, iii–v (2014).
2. Chabot, D., Hodgson, A. J., Hodgson, J. C. & Anderson, K. ‘Drone’: Technically correct, popularly accepted, socially acceptable.
Drone Syst. Appl. 10, 399–405 (2022).
3. Chabot, D. & Bird, D. M. Wildlife research and management methods in the 21st century: Where do unmanned aircraft fit in?. J. Unmanned Veh. Syst. 3, 137–155 (2015).
4. Christie, K. S., Gilbert, S. L., Brown, C. L., Hatfield, M. & Hanson, L. Unmanned aircraft systems in wildlife research: Current and future applications of a transformative technology. Front. Ecol. Environ. 14, 241–251 (2016).
5. Whitehead, K. & Hugenholtz, C. H. Remote sensing of the environment with small unmanned aircraft systems (UASs), part 1: A review of progress and challenges. J. Unmanned Veh. Syst. 2, 69–85 (2014).
6. Barnas, A. et al. Evaluating behavioral responses of nesting lesser snow geese to unmanned aircraft surveys. Ecol. Evol. 8, 1328–1338 (2018).
7. Mulero-Pázmány, M. et al. Unmanned aircraft systems as a new source of disturbance for wildlife: A systematic review. PLoS ONE 12, e0178448 (2017).
8. Linchant, J., Lisein, J., Semeki, J., Lejeune, P. & Vermeulen, C. Are unmanned aircraft systems (UASs) the future of wildlife monitoring? A review of accomplishments and challenges. Mammal Rev. 45, 239–252 (2015).
9. Whitehead, K. et al. Remote sensing of the environment with small unmanned aircraft systems (UASs), part 2: Scientific and commercial applications. J. Unmanned Veh. Syst. 2, 86–102 (2014).
10. Barasona, J. A. et al. Unmanned aircraft systems for studying spatial abundance of ungulates: Relevance to spatial epidemiology. PLoS ONE 9, e115608 (2014).
11. Chrétien, L. P., Théau, J. & Ménard, P. Wildlife multispecies remote sensing using visible and thermal infrared imagery acquired from an unmanned aerial vehicle (UAV). Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 40, 241 (2015).
12. Guo, X. et al. Application of UAV remote sensing for a population census of large wild herbivores—Taking the headwater region
of the yellow river as an example. Remote Sens. 10, 1041 (2018).
13. Hu, J., Wu, X. & Dai, M. Estimating the population size of migrating Tibetan antelopes Pantholops hodgsonii with unmanned aerial
vehicles. Oryx 54, 101–109 (2020).
14. Mulero-Pázmány, M., Stolper, R., Van Essen, L. D., Negro, J. J. & Sassen, T. Remotely piloted aircraft systems as a rhinoceros anti-poaching tool in Africa. PLoS ONE 9, e83873 (2014).
15. Rey, N., Volpi, M., Joost, S. & Tuia, D. Detecting animals in African Savanna with UAVs and the crowds. Remote Sens. Environ.
200, 341–351 (2017).
16. Schroeder, N. M., Panebianco, A., Gonzalez Musso, R. & Carmanchahi, P. An experimental approach to evaluate the potential of
drones in terrestrial mammal research: A gregarious ungulate as a study model. R. Soc. Open Sci. 7, 191482 (2020).
17. Su, X. et al. Using an unmanned aerial vehicle (UAV) to study wild yak in the highest desert in the world. Int. J. Remote Sens. 39,
5490–5503 (2018).
18. Vermeulen, C., Lejeune, P., Lisein, J., Sawadogo, P. & Bouché, P. Unmanned aerial survey of elephants. PLoS ONE 8, e54700 (2013).
19. Mallory, M. L. et al. Financial costs of conducting science in the Arctic: Examples from seabird research. Arct. Sci. 4, 624–633
(2018).
20. Sasse, D. B. Job-related mortality of wildlife workers in the United States, 1937–2000. Wildl. Soc. Bull. 31, 1015–1020 (2003).
21. Loarie, S. R., Joppa, L. N. & Pimm, S. L. Satellites miss environmental priorities. Trends Ecol. Evol. 22, 630–632 (2007).
22. IUCN. The IUCN Red List of Threatened Species. IUCN Red List of Threatened Species https://www.iucnredlist.org/en (2021).
23. Mech, L. D. & Barber, S. M. A critique of wildlife radio-tracking and its use in National Parks: a report to the National Park Service.
(2002).
24. Patterson, C., Koski, W., Pace, P., McLuckie, B. & Bird, D. M. Evaluation of an unmanned aircraft system for detecting surrogate caribou targets in Labrador. J. Unmanned Veh. Syst. 4, 53–69 (2015).
25. Hodgson, J. C. et al. Drones count wildlife more accurately and precisely than humans. Methods Ecol. Evol. 9, 1160–1167 (2018).
26. Seymour, A. C., Dale, J., Hammill, M., Halpin, P. N. & Johnston, D. W. Automated detection and enumeration of marine wildlife using unmanned aircraft systems (UAS) and thermal imagery. Sci. Rep. 7, 1–10 (2017).
27. COSEWIC. COSEWIC assessment and status report on the caribou (Rangifer tarandus) eastern migratory population, Torngat
mountain population in Canada. (COSEWIC, Committee on the Status of Endangered Wildlife in Canada, 2017).
28. Albawi, S., Mohammed, T. A. & Al-Zawi, S. Understanding of a convolutional neural network. in 2017 international conference on
engineering and technology (ICET) 1–6 (IEEE, 2017).
29. Gu, J. et al. Recent advances in convolutional neural networks. Pattern Recognit. 77, 354–377 (2018).
30. Teuwen, J. & Moriakov, N. Convolutional neural networks. in Handbook of Medical Image Computing and Computer Assisted Intervention 481–501 (Elsevier, 2020).
31. Corcoran, E., Winsen, M., Sudholz, A. & Hamilton, G. Automated detection of wildlife using drones: Synthesis, opportunities and
constraints. Methods Ecol. Evol. 12, 1103–1114 (2021).
32. Corcoran, E., Denman, S., Hanger, J., Wilson, B. & Hamilton, G. Automated detection of koalas using low-level aerial surveillance
and machine learning. Sci. Rep. 9, 3208 (2019).
33. Gray, P. C. et al. Drones and convolutional neural networks facilitate automated and accurate cetacean species identification and photogrammetry. Methods Ecol. Evol. 10, 1490–1500 (2019).
34. Gray, P. C. et al. A convolutional neural network for detecting sea turtles in drone imagery. Methods Ecol. Evol. 10, 345–355 (2019).
35. Peng, J. et al. Wild animal survey using UAS imagery and deep learning: Modified Faster R-CNN for kiang detection in Tibetan Plateau. ISPRS J. Photogramm. Remote Sens. 169, 364–376 (2020).
36. Borowicz, A. et al. Multi-modal survey of Adélie penguin mega-colonies reveals the Danger Islands as a seabird hotspot. Sci. Rep.
8, 3926 (2018).
37. Francis, R. J., Lyons, M. B., Kingsford, R. T. & Brandis, K. J. Counting mixed breeding aggregations of animal species using drones:
Lessons from waterbirds on semi-automation. Remote Sens. 12, 1185 (2020).
38. Santangeli, A. et al. Integrating drone-borne thermal imaging with artificial intelligence to locate bird nests on agricultural land. Sci. Rep. 10, 1–8 (2020).
39. Bowley, C., Mattingly, M., Barnas, A., Ellis-Felege, S. & Desell, T. An analysis of altitude, citizen science and a convolutional neural
network feedback loop on object detection in unmanned aerial systems. J. Comput. Sci. 34, 102–116 (2019).
40. Bowley, C., Mattingly, M., Barnas, A., Ellis-Felege, S. & Desell, T. Detecting wildlife in unmanned aerial systems imagery using
convolutional neural networks trained with an automated feedback loop. in International Conference on Computational Science
69–82 (Springer, 2018).
41. Delplanque, A., Foucher, S., Lejeune, P., Linchant, J. & Théau, J. Multispecies detection and identification of African mammals in aerial imagery using convolutional neural networks. Remote Sens. Ecol. Conserv. 8, 166–179 (2021).
42. Eikelboom, J. A. J. et al. Improving the precision and accuracy of animal population estimates with aerial image object detection.
Methods Ecol. Evol. 10, 1875–1887 (2019).
43. Kellenberger, B., Marcos, D. & Tuia, D. Detecting mammals in UAV images: Best practices to address a substantially imbalanced
dataset with deep learning. Remote Sens. Environ. 216, 139–153 (2018).
44. Hooge, I. T. C., Niehorster, D. C., Nyström, M., Andersson, R. & Hessels, R. S. Is human classification by experienced untrained observers a gold standard in fixation detection?. Behav. Res. Methods 50, 1864–1881 (2018).
45. Barnas, A. F., Darby, B. J., Vandeberg, G. S., Rockwell, R. F. & Ellis-Felege, S. N. A comparison of drone imagery and ground-based
methods for estimating the extent of habitat destruction by lesser snow geese (Anser caerulescens caerulescens) in La Pérouse Bay.
PLoS ONE 14, e0217049 (2019).
46. Brook, R. K. & Kenkel, N. C. A multivariate approach to vegetation mapping of Manitoba’s Hudson Bay Lowlands. Int. J. Remote
Sens. 23, 4761–4776 (2002).
47. Shilts, W. W., Aylsworth, J. M., Kaszycki, C. A., Klassen, R. A. & Graf, W. L. Canadian shield. in Geomorphic Systems of North
America vol. 2 119–161 (Geological Society of America Boulder, Colorado, 1987).
48. Barnas, A. F., Felege, C. J., Rockwell, R. F. & Ellis-Felege, S. N. A pilot(less) study on the use of an unmanned aircraft system for studying polar bears (Ursus maritimus). Polar Biol. 41, 1055–1062 (2018).
49. Ellis-Felege, S. N. et al. Nesting common eiders (Somateria mollissima) show little behavioral response to fixed-wing drone surveys. J. Unmanned Veh. Syst. 10, 1–4 (2021).
50. Barnas, A. F. et al. A standardized protocol for reporting methods when using drones for wildlife research. J. Unmanned Veh. Syst.
8, 89–98 (2020).
51. Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural
Inf. Process. Syst. 28, 91–99 (2016).
52. Chen, T., Xu, B., Zhang, C. & Guestrin, C. Training deep nets with sublinear memory cost. arXiv:1604.06174 (2016).
53. Pinckaers, H. & Litjens, G. Training convolutional neural networks with megapixel images. arXiv:1804.05712 (2018).
54. Abadi, M. et al. TensorFlow: Large-scale machine learning on heterogeneous systems. (2015).
55. Janocha, K. & Czarnecki, W. M. On loss functions for deep neural networks in classification. arXiv:1702.05659 (2017).
56. Murata, N., Yoshizawa, S. & Amari, S. Learning curves, model selection and complexity of neural networks. Adv. Neural Inf. Process.
Syst. 5, 607–614 (1992).
57. Hänsch, R. & Hellwich, O. The truth about ground truth: Label noise in human-generated reference data. in IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium 5594–5597 (IEEE, 2019).
58. Bowler, E., Fretwell, P. T., French, G. & Mackiewicz, M. Using deep learning to count albatrosses from space: Assessing results in
light of ground truth uncertainty. Remote Sens. 12, 2026 (2020).
59. Brack, I. V., Kindel, A. & Oliveira, L. F. B. Detection errors in wildlife abundance estimates from Unmanned Aerial Systems (UAS)
surveys: Synthesis, solutions, and challenges. Methods Ecol. Evol. 9, 1864–1873 (2018).
60. Jagielski, P. M. et al. The utility of drones for studying polar bear behaviour in the Canadian Arctic: Opportunities and recommendations. Drone Syst. Appl. 10, 97–110 (2022).
61. Williams, P. J., Hooten, M. B., Womble, J. N. & Bower, M. R. Estimating occupancy and abundance using aerial images with
imperfect detection. Methods Ecol. Evol. 8, 1679–1689 (2017).
62. Link, W. A., Schofield, M. R., Barker, R. J. & Sauer, J. R. On the robustness of N-mixture models. Ecology 99, 1547–1551 (2018).
63. Horvitz, D. G. & Thompson, D. J. A generalization of sampling without replacement from a finite universe. J. Am. Stat. Assoc. 47, 663–685 (1952).
64. Corcoran, E., Denman, S. & Hamilton, G. New technologies in the mix: Assessing N-mixture models for abundance estimation
using automated detection data from drone surveys. Ecol. Evol. 10, 8176–8185 (2020).
65. Lunga, D., Arndt, J., Gerrand, J. & Stewart, R. ReSFlow: A remote sensing imagery data-ow for improved model generalization.
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 14, 10468–10483 (2021).
Acknowledgements
Special thanks to the volunteers who participated in this study: Cailey Isaacson (CI), Lindsey Kallis (LK), Seth Owens (SO), and Sara Yannuzzi (SY). Thanks to Ryan Brook for sharing his knowledge of the caribou population status in the study area. Funding for this project was provided by the University of North Dakota College of Arts and Sciences, UND Postdoctoral Seed Funding Program, UND Biology, National Science Foundation (#13197000 awarded to SNE), North Dakota EPSCoR Infrastructure Improvement Program—Doctoral Dissertation Assistantship #OIA-1355466, Wapusk National Park, Arctic Goose Joint Venture, the Central and Mississippi Flyway Councils, North Dakota View Scholarship, Anne Via, and in-kind assistance and guidance from Parks Canada, Wapusk National Park Management Board, and the community of Churchill, Manitoba, Canada. We are grateful for cooperation and flight coordination from Hudson Bay Helicopters. We thank Mike Corcoran for assistance with drone flights, and Will Beaton for assistance in the field.
Author contributions
J.L.: conceptualization, data curation, formal analysis, investigation, methodology, visualization, writing—original draft, writing—review and editing. A.F.B.: conceptualization, data acquisition, investigation, methodology, writing—original draft, writing—review and editing. A.A.E.: conceptualization, formal analysis, investigation, methodology, writing—review and editing. T.D.: conceptualization, formal analysis, investigation, methodology, writing—review and editing. R.F.R.: data acquisition, writing—review and editing. S.N.E.-F.: conceptualization, data acquisition, funding acquisition, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing—original draft, writing—review and editing.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1038/s41598-023-28240-9.
Correspondence and requests for materials should be addressed to J.L.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2023
Accurate detection of individual animals is integral to the management of vulnerable wildlife species, but often difficult and costly to achieve for species that occur over wide or inaccessible areas or engage in cryptic behaviours. There is a growing acceptance of the use of drones (also known as unmanned aerial vehicles, UAVs and remotely piloted aircraft systems, RPAS) to detect wildlife, largely because of the capacity for drones to rapidly cover large areas compared to ground survey methods. While drones can aid the capture of large amounts of imagery, detection requires either manual evaluation of the imagery or automated detection using machine learning algorithms. While manual evaluation of drone‐acquired imagery is possible and sometimes necessary, the powerful combination of drones with automated detection of wildlife in this imagery is much faster and, in some cases, more accurate than using human observers. Despite the great potential of this emerging approach, most attention to date has been paid to the development of algorithms, and little is known about the constraints around successful detection (P. W. J. Baxter, and G. Hamilton, 2018, Ecosphere , 9 , e02194). We reviewed studies that were conducted over the last 5 years in which wildlife species were detected automatically in drone‐acquired imagery to understand how technological constraints, environmental conditions and ecological traits of target species impact detection with automated methods. From this review, we found that automated detection could be achieved for a wider range of species and under a greater variety of environmental conditions than reported in previous reviews of automated and manual detection in drone‐acquired imagery. A high probability of automated detection could be achieved efficiently using fixed‐wing platforms and RGB sensors for species that were large and occurred in open and homogeneous environments with little vegetation or variation in topography while infrared sensors and multirotor platforms were necessary to successfully detect small, elusive species in complex habitats. The insight gained in this review could allow conservation managers to use drones and machine learning algorithms more accurately and efficiently to conduct abundance data on vulnerable populations that is critical to their conservation.