Video Data Collection
3D Map Generation
Detection and Motion Tracking
Looking forward
In the field we collected ground-level images of
zebras. In collaboration with WildBook and Dan
Rubenstein (Princeton University), we will
compare the unique stripe patterns of zebras in
our photos to a database of all individuals in
the population. Individual identification will
provide valuable biological and life-history
context for our behavioral data.
We can use our posture keypoints to estimate head and eye locations in
3D space. We will then use ray casting algorithms to reconstruct visual
fields for each individual. By examining where these rays intersect with
vegetation structures or conspecifics, we can determine what environmental
and social information each individual has visual access to at any
given point. We can thus explicitly account for this information and
examine its effect on behavioral patterns and decision-making.
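A minimal 2D sketch of this kind of line-of-sight test: cast a ray from the estimated eye position toward a target, check that the target falls inside the field of view, and test whether any obstacle blocks the ray. The field-of-view angle and the circular obstacles are illustrative assumptions, not measured parameters from this project.

```python
import math

def visible(origin, target, obstacles, fov_deg=215.0, heading=0.0):
    """True if `target` is inside the field of view from `origin` and the
    line of sight is not blocked by any circular obstacle.

    All geometry is 2D; angles in degrees, heading 0 = facing +x.
    The 215-degree field of view is an illustrative value for a
    laterally-eyed ungulate, not a parameter from this project.
    Obstacles are (cx, cy, radius) circles standing in for vegetation
    or conspecifics.
    """
    dx, dy = target[0] - origin[0], target[1] - origin[1]
    bearing = math.degrees(math.atan2(dy, dx))
    # Angular offset between gaze heading and target, wrapped to [-180, 180]
    off = (bearing - heading + 180.0) % 360.0 - 180.0
    if abs(off) > fov_deg / 2.0:
        return False
    # Segment-circle intersection test for each obstacle
    seg_len2 = dx * dx + dy * dy
    for (cx, cy, r) in obstacles:
        fx, fy = cx - origin[0], cy - origin[1]
        # Closest point on the origin->target segment to the obstacle centre
        t = max(0.0, min(1.0, (fx * dx + fy * dy) / seg_len2))
        px, py = origin[0] + t * dx, origin[1] + t * dy
        if (px - cx) ** 2 + (py - cy) ** 2 < r * r:
            return False  # line of sight blocked
    return True
```

In a full 3D reconstruction the same idea applies, with rays tested against the photogrammetric terrain mesh instead of circles.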
Visual Field Reconstruction
Collective detection in
free-ranging African ungulates
We map each observation location using a senseFly eBee+
fixed-wing drone, which captures an overlapping grid of
still photos. We can then use photogrammetry to create
detailed 3D models of the environmental structure. The
resolution of these 3D maps is approximately 2.5 cm per pixel.
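The map resolution follows from the standard ground sampling distance (GSD) relation between flight altitude, focal length, sensor width, and image width. The camera and flight values below are illustrative figures for a typical 1-inch-sensor mapping camera, not this project's actual survey parameters.

```python
def ground_sampling_distance(altitude_m, focal_mm, sensor_width_mm, image_width_px):
    """Ground sampling distance (metres per pixel) of a nadir photo:
    GSD = altitude * sensor_width / (focal_length * image_width)."""
    return altitude_m * sensor_width_mm / (focal_mm * image_width_px)

# Illustrative values: 120 m altitude, 10.6 mm focal length,
# 13.2 mm sensor width, 5472-pixel-wide images (assumed, not measured).
gsd_cm = 100 * ground_sampling_distance(120.0, 10.6, 13.2, 5472)
```

With these assumed values the GSD comes out in the same few-centimetres-per-pixel range as the maps described above.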
We use two deep convolutional neural networks (ResNet-101 and Faster R-CNN) to detect
individual animals in the video frames and determine their species. Normally these very
large networks would require millions of annotated training examples to learn their
parameters. However, we use a transfer learning approach in which we pre-train the
networks on a publicly available image set (COCO) before training them on several
thousand annotations from our videos. Transfer learning lets us harness the power of
these large networks with minimal annotation effort. After detecting animals in each
frame, we link the detections across frames to generate continuous movement tracks for
the animals.
We use a deep convolutional neural network to estimate the posture of each animal in
each frame. This network takes as input raw images of individual animals and outputs
estimates of the locations of eight keypoints corresponding to parts of the animal’s body
(e.g. head, nose, tail), as well as a confidence level for each estimate. We also make the
network predict lines connecting the keypoints, thereby forcing it to learn the spatial
relationships between the points, which increases accuracy. The next step is to use a
clustering algorithm to classify the keypoint configurations as behavioral states (e.g.
grazing, vigilant, running).
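The clustering step might look like the following sketch: normalize each frame's keypoint configuration for position and scale, then cluster the flattened postures. K-means and three states are illustrative choices here, not necessarily the algorithm or state count used in the project.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_postures(keypoints, n_states=3, seed=0):
    """Cluster keypoint configurations into candidate behavioral states.

    `keypoints` has shape (n_frames, 8, 2): eight (x, y) body keypoints
    per frame. Each posture is centred on its centroid and scaled to
    unit size, so clusters reflect body posture rather than where the
    animal sits in the frame. K-means with n_states=3 (e.g. grazing /
    vigilant / running) is an illustrative choice.
    """
    pts = np.asarray(keypoints, dtype=float)
    pts = pts - pts.mean(axis=1, keepdims=True)           # translation-invariant
    scale = np.linalg.norm(pts, axis=(1, 2), keepdims=True)
    pts = pts / np.where(scale == 0, 1.0, scale)          # scale-invariant
    flat = pts.reshape(len(pts), -1)
    km = KMeans(n_clusters=n_states, n_init=10, random_state=seed)
    return km.fit_predict(flat)
```

In practice one would validate the resulting clusters against human-labeled behavior before treating them as behavioral states.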
We film zebra, buffalo and impala herds using DJI Phantom 4 Pro drones. The drones
fly for approximately 20 minutes on a single battery, but we use two drones in relay to
achieve longer observation times. After collecting baseline footage, we approach the
group on foot, eliciting a detection and escape response. Footage is recorded in 4K
resolution at 60 frames per second.
Blair R. Costelloe, Benjamin Koger, Jacob M. Graving, Iain D. Couzin
Max Planck Institute for Ornithology & University of Konstanz
Generating dynamic behavioral datasets with drones and computer vision
Groups of prey are often better at detecting predators than solitary individuals, a
phenomenon known as collective detection. Collective detection is a two-stage process:
first, one or more individuals within the group must detect the threat and then this
information must be transferred to unaware group members. Effective study of these
processes requires continuous behavioral data for all individuals in a group at high temporal
resolution. We are harnessing new technologies and computer vision techniques to generate
these datasets for wild ungulates in complex environments. Using these methods, we will
quantify individual behavioral patterns and examine the effects of various social, ecological and
biological factors on vigilance behavior. We will then explore the relationship between individual
strategies and group-level phenomena such as collective vigilance and information transfer.
This project has received funding from the European Union’s
Horizon 2020 research and innovation programme under the
Marie Sklodowska-Curie grant agreement No. 748549.
Poster design by Mike Costelloe.