
Automated detection of sea pens in video footage: Applications for time-series monitoring of Vulnerable Marine Ecosystems (VME)

Abstract

Sea pens are functionally important species of conservation value, which are also vulnerable to human impacts, such as bottom contact fishing. A recent publication by Downie et al. (2021) has indicated that sea pen species have potential as indicators of conservation status for mud habitats, both inside and beyond Marine Protected Areas (MPAs). In UK waters this is particularly relevant for ‘Sea pens and burrowing megafauna’, a Habitat Feature of Conservation Interest for which a number of UK MPAs have been designated. Despite the importance of these species, there are currently few data available on sea pen density through time. This limits our ability to explore natural variability in sea pen populations and to establish links with human pressures, ultimately inhibiting the development of a robust condition indicator. This project develops a machine learning algorithm to facilitate efficient and cost-effective extraction of sea pen data from ‘non-traditional’ sources, in this case a 15-year time series of video footage collected by Cefas for Nephrops stock assessment. On development of the algorithm, data are automatically extracted from historic video footage, enabling exploration of sea pen density through time in relation to natural environmental parameters and fishing pressure. The results provide important evidence on which to base assessments of natural and fishing-related change in future monitoring programmes (and to inform appropriate management measures), both within and beyond MPAs. The objectives of this project are:
1) To develop and operationalise a machine learning tool to enable quick and cost-effective analysis of underwater video survey footage for extracting counts of the sea pens Pennatula phosphorea and Virgularia mirabilis.
2) On development of a successful algorithm, to apply it to historic towed sledge camera footage available from Nephrops stock assessment surveys at Farne Deeps (FU6) with repeated sampling of stations over a 15-year period.
3) To assess the algorithm’s ability to generalise to footage from different camera platforms and with different quality and visibility.
4) To extract a time series for P. phosphorea and V. mirabilis density.
5) To investigate spatio-temporal trends in the P. phosphorea and V. mirabilis populations in the study area through spatial modelling of habitat suitability and time-series analysis of sea pen density.
6) To link the spatial and temporal trends to changes in the environment and human impacts.
Anna Downie, Tammy Noble-James, John Sperry and
Shannon White
March 2022
© Crown copyright 2022
This information is licensed under the Open Government Licence v3.0. To view this licence,
visit www.nationalarchives.gov.uk/doc/open-government-licence/
This publication is available at www.gov.uk/government/publications
www.cefas.co.uk
Cefas Document Control
Submitted to: Callum Hobbs, Defra
Date submitted: 27/03/2022
Project Manager: Gary Saggers
Report compiled by: Anna Downie, Tammy Noble-James, John Sperry and Shannon White, Cefas
Quality control by: Joe Ribeiro and James Bell, Cefas
Approved by and date:
Version: 2
Recommended citation for this report: Downie et al. (2022). Automated detection of sea pens in video footage - Applications for time-series monitoring of Vulnerable Marine Ecosystems (VME). Cefas Project Report for Defra, vi + 51 pp.
Version control history
Version   Author            Date         Comment
0.1       AD, TNJ, JS, SW   01/03/2022   First Draft
1         AD                02/03/2022   Revised draft
1.1       JR                25/03/2022   QC
2         AD, TNJ           27/03/2022   Finalised following QC
Contents
1. Executive summary
2. Introduction
2.1. Background
2.1.1. Marine imagery in monitoring
2.1.2. Machine learning in marine imagery
2.1.3. Sea pens: case study taxa for machine learning
2.2. Aims and objectives
3. Methodology
3.1. Video footage
3.1.1. Nephrops norvegicus Underwater TV Surveys at Farne Deeps
3.1.2. Camera and video storage specifications
3.2. Environmental data
3.3. Software and model algorithms
3.4. Video annotation
3.4.1. Previous point annotations
3.4.2. Preparation of annotations for training the deep learning model
3.4.3. Annotations for testing the deep learning model
3.5. Model training and accuracy evaluation
3.5.1. Deep learning model training and outputs
3.5.2. Validation of deep learning model
3.6. Spatio-temporal analysis of sea pen density
3.6.1. Spatial distribution models
3.6.2. Analysis of temporal trends in density
4. Results and Discussion
4.1. Accuracy of the deep learning model
4.2. Spatial distribution of sea pens at Farne Deeps
4.3. Temporal analysis of sea pen density
5. Conclusion
5.1. Applicability of deep learning to evaluating sea pen density for research and monitoring
5.2. Spatio-temporal trends in sea pen density - implications for monitoring
5.3. Future recommendations
6. References
Annex 1. Conditional Inference trees
Annex 2. Random Forest partial dependence plots
List of Figures
Figure 1. Location of the Farne Deeps Nephrops ground (FU6) and the random stratified grid of 110 annually visited UWTV stations.
Figure 2. Example screen images from each survey, showing the changes in aspect ratio, and camera placement on the towing sledge.
Figure 3. Illustration of the process of converting VIAME generic object proposals into training annotations for the deep learning model (a) and added annotations where BIIGLE annotations were absent (b).
Figure 4. Examples of issues identified for sea pen track production and editing presented for Virgularia. A and B show consecutive sea pens, which resulted in production of overlapping tracks. A and B also show a sea pen crossing the laser (bottom right), which resulted in the laser’s inclusion in the annotated feature for model training. C and D show consecutive frames with contrasting sea pen visibilities. In C, the whole sea pen is not readily distinguishable, representing a low confidence frame, whereas in D the full sea pen is readily distinguishable. These issues could each affect the suitability of a track and associated frames for model training.
Figure 5. Validation data table format.
Figure 6. Example images from video with good visibility with different lighting from station DM in 2017 (a) and 2021 (b).
Figure 7. Scatter plots with linear regression fits of the number of Pennatula phosphorea and Virgularia mirabilis analyst observations and model detections in the validation dataset (2014-2021 data filtered with thresholds). Spearman's rank correlation coefficients and their p-values are given for each species.
Figure 8. Scatter plot with linear regression fits of the number of Pennatula phosphorea (a) and Virgularia mirabilis (b) analyst annotations made in BIIGLE and model detections in the 2016 full video dataset filtered with thresholds. Spearman's rank correlation coefficients (ρ) and their p-values are given for each species.
Figure 9. Predictor contributions to the spatial distribution of P. phosphorea and V. mirabilis density. The bars show the mean of percent increase in mean square error (MSE) across 10 Random Forest model runs and the lines show standard error. SAR = Swept Area Ratio.
Figure 10. Partial dependence plots for the most influential environmental variables included in Random Forest models for Pennatula phosphorea and Virgularia mirabilis density. Values are means of the 10 repeated split sample runs of the model.
Figure 11. Predicted density (mean of 10 repeated split sample RF runs) of Pennatula phosphorea and Virgularia mirabilis at Farne Deeps. The coastal strip shallower than the lowest depth in the dataset is masked to prevent predictions outside the known environmental conditions. Observed density is shown at sample locations. Crosses show stations where no sea pens were found.
Figure 12. Spatial distribution over mean annual SAR 2009-2020 (a) and range of sea pen density values over 2014-2021 (b) at the selected time-series validation stations. Mean and standard deviation are shown by the black circles/triangles and lines.
Figure 13. Coefficient of Variation (CV) of sea pen density at time-series stations overlaid on the mean annual subsurface SAR 2009-2020.
Figure 14. GAM response plots for Pennatula phosphorea coefficient of variation (CV) at stations over 2014-2021 against (a) suspended particulate matter in the water column in winter and (b) mean annual subsurface swept area ratio (SAR) for 2009-2020.
Figure 15. Density of sea pens observed at the 7 time-series stations with the annual mean subsurface swept area ratio (SAR) between 2014-2021.
Figure 16. Density of sea pens observed at the 7 time-series stations with the monthly maximum significant wave height from the Tyne and Tees Waverider between 2014-2021.
Figure 17. Density of sea pens observed at the 7 time-series stations with the monthly maximum sea surface temperature from the Tyne and Tees Waverider between 2014-2021.
Figure 18. Conditional Inference tree used to determine filtering thresholds for Pennatula phosphorea model predictions. Det.frames = the number of sea pen detection frames forming each track; model.conf = the maximum label prediction confidence score (MaxLPC).
Figure 19. Conditional Inference tree used to determine filtering thresholds for Virgularia mirabilis model predictions. Det.frames = the number of sea pen detection frames forming each track; model.conf = the maximum label prediction confidence score (MaxLPC).
Figure 20. Partial dependence plots for environmental variables included in Random Forest models for Pennatula phosphorea density. Thin blue lines show the response for each of the 10 repeated split sample runs of the model and the red line is the mean across all runs.
Figure 21. Partial dependence plots for environmental variables included in Random Forest models for Virgularia mirabilis density. Thin blue lines show the response for each of the 10 repeated split sample runs of the model and the red line is the mean across all runs.
List of Tables
Table 1. Summary of the differing camera systems used on each survey.
Table 2. Environmental raster layers used to investigate spatial and temporal trends in sea pen density at Farne Deeps.
Table 3. Issues that arose and approach taken for sea pen track editing.
Table 4. Numbers of annotations used in training iterative versions of the deep learning model. All videos are from 2016. Video indicates the station code of the video used for training. No. in BIIGLE refers to the number of sea pens annotated in BIIGLE for the video. Number of tracks refers to the number of individual sea pens for which track annotations were made. Number of detections is given where some or all of the annotations are annotation boxes that have not been combined into tracks. Number of annotations is the total number of individual annotation boxes for the model to train on.
Table 5. Accuracy statistics for model prediction of Pennatula phosphorea and Virgularia mirabilis, based on predictions filtered with the thresholds detailed in Section 3.4.
Table 6. Spearman’s rank correlation of 2016 BIIGLE observations and corresponding model predictions.
Table 7. Validation statistics for the 10 repeated split sample Random Forest model runs. N = number of samples, RMSE = root mean squared error, Relative RMSE = RMSE as a proportion of the range of values in data, SD = standard deviation.
1. Executive summary
In UK waters, the collection of seabed imagery has become widespread as an efficient and
non-invasive means of data collection for monitoring habitats and species. The resulting
data contribute to the evidence base against which habitat condition is evaluated, ultimately
informing assessments of whether domestic conservation targets have been met, and
whether progress towards international goals has been made. To maximise the great
potential of seabed imagery data, the UK benthic monitoring community has recognised the
need for new techniques to optimise cost efficiency, accuracy, repeatability and accessibility
of imagery data products. One fundamental area for development is the identification and
annotation of fauna.
In recent years, the potential of Machine Learning (ML) methods for annotating imagery
datasets has been demonstrated for a range of marine fauna, with this technology becoming
more accessible to ecologists via ‘user-friendly’ interfaces. Sea pens (Pennatulacea) are
slow-growing and long-lived soft corals of international conservation importance, for which
UK MPAs are designated. These species are excellent candidates for developing ML
algorithms, due to their distinct body forms and their potential for aggregating in high
densities. They also show potential for development as indicators of habitat condition,
pending a greater understanding of their ecological requirements, population dynamics and
responses to pressures.
This study uses time-series video footage initially gathered for stock assessments of the
Farne Deeps Norway lobster fishing ground to develop and validate ML algorithms for
density counts of two UK sea pen species: the phosphorescent sea pen (Pennatula
phosphorea) and the slender sea pen (Virgularia mirabilis). The VIAME ML platform was
used to train a Cascade Faster R-CNN deep learning model on annotations of the
occurrence of P. phosphorea and V. mirabilis in 2016 videos from six stations. Seven
other stations were then used to validate model detections in footage covering 2014-21. The
annotated data were then explored in the context of anthropogenic and environmental
variables, used to generate predictions of sea pen distribution via Random Forest models,
and to investigate drivers of sea pen density over time, using Generalised Additive Models
(GAMs).
In general, the algorithms detected P. phosphorea with notably higher accuracy than
V. mirabilis. Unsurprisingly, for both species the model detections were most accurate for
the 2016 data (on which the algorithm was trained). For the remaining years, the accuracy
of the algorithms for both species was substantially affected by variation in the camera set-
ups and lighting configurations between years. The spatial analysis of the 2016 data
confirmed that the two species occupy different parts of the Farne Deeps sandy mud habitat,
with V. mirabilis being more tolerant of turbid conditions than P. phosphorea. High densities
of either species occurred only where it was the sole sea pen species present. The highest
densities of P. phosphorea were associated with areas of low demersal fishing intensity,
whilst the density of V. mirabilis did not appear to be greatly influenced by this pressure.
Temporal analysis revealed substantial variation in P. phosphorea density, even in areas
where fishing pressure was extremely low or non-existent. Density of V. mirabilis also varied
through time, showing a notable reduction at two stations since 2014.
The results of this study encourage further development of automated
detection algorithms for sea pens and other taxa with an appearance distinct from their
surroundings. P. phosphorea has shown potential here as an indicator species for
assessments of habitat condition, being well detected by ML algorithms and showing a
strong negative response to fishing intensity. V. mirabilis, conversely, is not as well detected
and does not appear to have a strong negative response to fishing pressure. The time and
cost saving implications of ML for extracting P. phosphorea data are considerable. We
recommend that further sampling is conducted at the same fixed stations to enhance
development of the ML algorithm and to provide additional evidence for indicator
development, in the context of natural variability across space and time.
2. Introduction
2.1. Background
2.1.1. Marine imagery in monitoring
Collection of seabed imagery is an efficient and non-invasive means of obtaining data to
monitor marine benthic habitats and species. In the UK, the use of imagery focused on the
seabed, or ‘benthic imagery’, for monitoring has become widespread, with imagery being
acquired from almost every marine habitat and by a diverse range of users (van Rein, 2020).
These seabed imagery data contribute to the evidence base against which the condition of
habitats and species of conservation importance are evaluated. This evidence ultimately
feeds into assessments of whether domestic conservation targets have been met under the
UK Marine Strategy and Environment Act 2021, and whether progress towards international
goals has been made (e.g., through the Oslo-Paris Convention).
Despite the widespread application and versatility of benthic imagery, significant research
and development work is needed to ensure that the data generated from seabed imagery
are of high quality, quantitative and suitable for comparison between surveys, allowing
changes in benthic habitats and communities over time to be detected. This need has been
widely recognised by imagery analysts, monitoring scientists and policy makers across the
UK, resulting in the Big Picture initiative, led by the Joint Nature Conservation Committee
(JNCC) under the North-East Atlantic Marine Biological Analytical Quality Control Scheme
(NMBAQC). This collaborative project has brought together a wide range of organisations
and disciplines, to develop and execute a Benthic Imagery Action Plan (BIAP; van Rein,
2020).
Identification and annotation of benthic fauna from seabed imagery are highlighted as
priority areas in the Big Picture BIAP, being fundamental to generating accurate datasets
for detecting change over time. Alongside developing the more ‘traditional’ aspects of
manual data annotation (e.g., epifaunal identification protocols and the use of manual
annotation platforms), the BIAP details the need to explore automated methods of imagery
annotation (i.e., via machine learning). These methods, although at a relatively early stage
of development for some marine benthic ecosystems, have great potential for monitoring in
the UK, with benefits including increased efficiency and cost-effectiveness, improved
comparability between different imagery datasets over time, and movement towards
realising the potential of large imagery datasets (such as those generated by Autonomous
Underwater Vehicles; AUVs).
2.1.2. Machine learning in marine imagery
Machine learning (ML) is a form of Artificial Intelligence (AI) which uses data and algorithms
to imitate the way that humans learn by gradually improving its accuracy over time, after
experiencing a greater range of input data. Neural network ML seeks to simulate in an
algorithm how the human brain makes neural connections to learn and understand the
information received via the visual cortex. Annotation of marine images for environmental
monitoring can be time consuming, laborious, and prone to bias introduced by varying
levels of analyst experience and judgement. ML algorithms have been developed to reduce
the time taken to perform image classification, and have reached a level at which they can
potentially outperform human experts (French et al., 2020).
As the rise of digital media has led to an increase in the availability and distribution of
underwater imagery, the potential of analytical tools to quantify and extract data on marine
organisms has also been demonstrated. ML has been successfully employed in the
identification and classification of fish, plankton, and corals (Moniruzzaman et al., 2017).
Scallop detections derived from a Convolutional Neural Network (CNN) have been shown to
reduce the time required for human annotation by providing detections at pre-selected
confidence thresholds, so that only a few scallops must be manually added, or a few false
positives removed (Rasmussen et al., 2017).
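As a minimal illustration of this kind of confidence thresholding (a sketch only, not the procedure of Rasmussen et al. (2017) or of the present study), detections can be split into those accepted automatically and those passed to an analyst for review. The Detection class, the function name and the threshold value below are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One model detection (hypothetical structure for illustration)."""
    label: str
    confidence: float  # model confidence score between 0 and 1


def split_by_confidence(detections, threshold):
    """Split detections into (accepted, for_review) at a confidence threshold.

    Detections at or above the threshold are accepted automatically;
    the remainder are left for a human analyst to confirm or reject.
    """
    accepted = [d for d in detections if d.confidence >= threshold]
    for_review = [d for d in detections if d.confidence < threshold]
    return accepted, for_review


# Hypothetical detections and threshold, not values from any survey
detections = [
    Detection("scallop", 0.95),
    Detection("scallop", 0.40),
    Detection("scallop", 0.80),
]
accepted, for_review = split_by_confidence(detections, threshold=0.75)
```

Raising the threshold reduces the number of false positives an analyst must remove but increases the number of true objects that must be added manually, so in practice the threshold is a trade-off between the two.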
A number of applications of ML in image analysis demonstrate the ability to process large
image collections in a fast and efficient manner. For example, the Machine learning Assisted
Image Annotation method (MAIA) (Zurowietz et al., 2018) uses a combination of
autoencoder networks and Mask Region-based Convolutional Neural Network (Mask R-
CNN), allowing human observers to annotate large image collections much faster than
before.
One significant difficulty in marine image applications of ML is that many marine species
share similar morphological characteristics, therefore differentiation at the species level is
challenging. This is also true for human expert annotation, but significant steps have been
taken in deriving ML techniques that provide accurate results in classifying coral texture
images (Gómez-Ríos et al., 2019b, 2019a). Successful training of an ML algorithm which is
robust and provides accurate classifications relies on a large and efficient training dataset.
Species level classification certainty has been proven to increase with training data size and
the diversity of species labels used (Durden et al., 2021), with a concurrent reduction in bias.
Automated classification of distinct and discrete still images provides annotations for each
individual image, allowing area-based counts on image-by-image basis or cover of larger
areas through pooling of non-overlapping images. Challenges arise when automated
detection is used on video files. Detections in each frame lead to multiple detections of a
single individual. Robust techniques and methodologies are needed to ensure accurate
tracking of detections between frames, and to minimise the effects of non-tracking frames,
which can lead to inflated abundance estimates, as detection events corresponding to one
individual are split into several tracking events.
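The effect of non-tracking frames on counts can be shown with a short sketch. The code below links per-frame bounding boxes into tracks using a simple greedy overlap (intersection over union) rule. This is a generic illustration under stated assumptions, not the tracker used by any particular platform; the function names, the overlap threshold and the example boxes are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tracks(frames, min_iou=0.3, max_gap=0):
    """Greedily link per-frame boxes into tracks.

    frames: list with one list of boxes per video frame.
    max_gap: number of consecutive frames a track may go undetected
             before it is closed (0 means no missed frames allowed).
    Returns a list of tracks, each a list of (frame_index, box).
    """
    tracks = []
    for i, boxes in enumerate(frames):
        unmatched = list(boxes)
        for track in tracks:
            last_i, last_box = track[-1]
            if i - last_i > max_gap + 1:
                continue  # track has been lost for too long
            best = max(unmatched, key=lambda b: iou(last_box, b), default=None)
            if best is not None and iou(last_box, best) >= min_iou:
                track.append((i, best))
                unmatched.remove(best)
        # Any box not linked to an existing track starts a new one
        tracks.extend([(i, box)] for box in unmatched)
    return tracks


# One hypothetical sea pen drifting through four frames, missed in frame 2
frames = [[(10, 10, 30, 50)], [(12, 14, 32, 54)], [], [(16, 22, 36, 62)]]
print(len(link_tracks(frames, max_gap=0)))  # 2: the individual is counted twice
print(len(link_tracks(frames, max_gap=1)))  # 1: the gap is bridged
```

With no tolerance for missed frames, the single individual is split into two tracks and would be counted twice; allowing a one-frame gap recovers a single track. Tolerating longer gaps reduces this inflation but increases the risk of merging neighbouring individuals, which matters where sea pens occur in dense aggregations.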
Many sensitive but sparse and patchily distributed large epifauna, such as gorgonian and
soft corals and sea pens, require more sampling effort to detect, and hence are more likely
to be captured by video transects than still images. At the same time, they possess
characteristics, such as longevity and fragility which make them good candidates for
monitoring of habitat condition and impacts, and distinct body types making them good
candidates for automated detection by AI. Video-based counts of these taxa give the best
estimate of their abundance and consequently effort needs to be put into developing their
detection in video footage.
2.1.3. Sea pens: case study taxa for machine learning
Sea pens (Pennatulacea) are colonial soft corals which stand erect from muddy and sandy
sediments (often in dense aggregations), providing three-dimensional structure and creating
microhabitats on otherwise homogenous areas of seabed (Buhl-Mortensen and Buhl-
Mortensen, 2014; De Clippele et al. 2015). Sea pens are relatively slow-growing and long-
lived, and these life history traits make them particularly vulnerable to damage, displacement
or removal by demersal fishing activities (Hixon and Tissot, 2007; Malecha and Stone, 2009;
Lauria et al. 2017). In recognition of their sensitivity and functional importance, ‘Vulnerable
Marine Ecosystem’ (VME) status has been conferred on sea pen communities by the United
Nations General Assembly (Rogers and Gianni, 2010) and ‘Sea pens and burrowing
megafauna communities’ is included on the Oslo-Paris Commission (OSPAR) list of
threatened and/or declining habitats.
In the UK, sea pen conservation is implemented through the designation and management
of Marine Protected Areas (MPAs) for sea pen habitats, extending to the muddy sediments
they inhabit and their associated burrowing megafauna communities. Eighteen MPAs in the
UK network are currently designated for sea pen habitats (nine of which occur within
Secretary of State waters). Several UK studies have proposed that sea pen presence or
density (depending on the species) could serve as indicators of condition in mud habitats
(Greathead et al., 2007; Murray et al., 2015; Downie et al., 2021); however, it was beyond
the scope of these studies to investigate natural spatio-temporal variations in sea pen
communities, or responses to pressures at local scales (e.g., at the scale of individual
MPAs).
To develop a robust indicator of condition for sea pen habitats, a link between the pressure
to be managed (e.g., physical disturbance by demersal towed fishing gears) and the status
of the at-risk species (e.g., density of sea pens) must be categorically demonstrated, with
respect to the natural variation to be expected in any benthic population where the
distribution of individuals is not uniform. At present, there is insufficient evidence to
understand natural spatio-temporal variability. Therefore, we cannot predict the level of
change to be expected in sea pen communities following the application of management
measures. This is a significant challenge for assessment of habitat condition in MPAs
designated for sea pen communities. However, reanalysis of the long time series of video
footage collected in mud basins around the UK as part of Nephrops stock assessment would
present the opportunity to address the limited understanding of UK sea pen population
dynamics, whilst developing capability in machine learning methods of imagery annotation.
Sea pens typically inhabit muddy sediments with low levels of natural hydrodynamic
disturbance (Greathead et al. 2007, 2015; Downie et al. 2021) which are also targeted by
demersal Nephrops norvegicus fisheries, particularly in the North Sea. Cefas routinely
acquire video data from fixed stations in the North Sea Farne Deeps fishing ground
(Nephrops Functional Unit (FU) 6) to inform the annual Nephrops stock assessments. These
videos capture sea pen density alongside the target Nephrops burrows but have not yet
been annotated to enable extraction of sea pen data. Two sea pen species observed from
the Nephrops video data - the phosphorescent sea pen Pennatula phosphorea (Linnaeus,
1758) and the slender sea pen Virgularia mirabilis (Müller, 1776) - make excellent case study
taxa for developing ML algorithms to extract density data. Both species have distinct and
consistent body forms (in contrast to taxa such as sponges which exhibit wide within-species
morphological plasticity), which stand out against a relatively uniform muddy seabed.
P. phosphorea and V. mirabilis are the most prevalent sea pen species on the UK
continental shelf, and can occur in high densities, providing plenty of occurrences for training
and testing ML algorithms. A single 10-minute video transect towed through a sea pen field
at ~0.5 knots can contain many hundreds of sea pens.
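To put these figures in context, the distance covered by such a transect, and the density implied by a given count, can be worked through as follows. This is an illustrative calculation only: the count of 300 sea pens and the 1 m field-of-view width are hypothetical values, not measurements from the surveys.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one knot is 1852 m per hour


def transect_length_m(speed_knots, duration_min):
    """Distance covered (m) at a constant tow speed."""
    return speed_knots * KNOT_IN_M_PER_S * duration_min * 60.0


def density_per_m2(count, length_m, view_width_m):
    """Individuals per square metre over the swept area (length x width)."""
    return count / (length_m * view_width_m)


# A 10-minute tow at 0.5 knots covers roughly 154 m of seabed
length = transect_length_m(speed_knots=0.5, duration_min=10)

# Hypothetical count and field-of-view width, for illustration only
density = density_per_m2(count=300, length_m=length, view_width_m=1.0)

print(f"{length:.1f} m, {density:.2f} sea pens per square metre")
```

Under these assumed values the transect covers about 154 m and the density is a little under two sea pens per square metre, which indicates how many training and testing occurrences a single tow can provide.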
Development of a successful algorithm for automated detection would substantially reduce
the staff hours required to process video data for sea pen density, resulting in large savings
for future monitoring. The Cefas Nephrops time-series video archive provided an opportunity
to develop a method of automated detection to extract sea pen density data from existing
footage, in line with the ‘collect once, use many times’ principle. Subsequently these data
were used to improve our understanding of spatio-temporal trends in sea pen populations
in relation to demersal fishing effort and changes in natural conditions, providing valuable
evidence to inform monitoring and assessment of sea pen habitats (both within and beyond
MPAs), and to support the future development of a sea pen condition indicator.
2.2. Aims and objectives
The aim of this project is to use existing Cefas Nephrops survey video footage to develop,
test and operationalise machine learning algorithms for P. phosphorea and V. mirabilis, thus
generating density data to explore spatial and temporal dynamics of sea pen communities.
The specific project objectives are to:
1) Develop and operationalise a machine learning tool to enable quick and cost-
effective analysis of underwater video survey footage for extracting counts of the
sea pens P. phosphorea and V. mirabilis.
2) Following development of a successful algorithm, apply it to historic towed sledge
camera footage available from Nephrops stock assessment surveys at Farne
Deeps (FU6) with repeated sampling of stations over a 15-year period.
3) Assess the algorithm’s ability to generalise to footage from different camera
platforms and with different quality and visibility over the legacy video footage.
4) Extract a time series for P. phosphorea and V. mirabilis density.
5) Investigate spatio-temporal trends in the P. phosphorea and V. mirabilis populations
in the study area through spatial modelling and temporal analysis of sea pen
density.
6) Link the spatial and temporal trends to changes in the environment and human
impacts.
3. Methodology
3.1. Video footage
3.1.1. Nephrops norvegicus Underwater TV Surveys at Farne Deeps
Norway lobster (Nephrops norvegicus L. 1758) fisheries are managed within the scope of
the International Council for the Exploration of the Sea (ICES). Nephrops stock assessments
are conducted annually for each ‘Functional Unit’ (FU; specified Nephrops fishing grounds)
and are used by the European Commission to set annual Total Allowable Catches (TACs)
at the ICES sub-area level. The ICES Working Group on Nephrops Surveys (WGNEPS)
coordinates underwater television (UWTV) and trawl surveys that are conducted regularly
in 24 ICES FUs and three additional Nephrops grounds, covering the North Sea, Celtic Sea,
Irish Sea, East Atlantic and the Mediterranean Sea (ICES 2020). Cefas has performed
annual UWTV surveys in the Farne Deeps ICES FU6 since 1996. The Nephrops ground
extent, which forms the survey target area, was defined based on the extent of the Nephrops
fishery (from Vessel Monitoring System (VMS) data) and British Geological Survey (BGS)
sediment maps. A randomised fixed grid of 110 survey stations was positioned over the
Nephrops ground, with stations visited annually in June (Figure 1). The standard survey
methodology involves the use of a sledge mounted camera to conduct 10-minute video tows
at a speed of 0.7 knots. Vessel position during the tow is recorded every 10 seconds using
a differential global positioning system (DGPS) and sledge position using an ultra-short
baseline (USBL) transponder. Two fan lasers with a known distance are used to delimit the
field of view.
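As a worked illustration of the tow geometry described above, the following minimal Python sketch converts the nominal tow speed (0.7 knots) and duration (10 minutes) into a transect length, and shows how a count could be turned into a density. It assumes constant speed; in practice the tow length would come from the recorded DGPS/USBL positions. The field-of-view width and the count used in the example are made-up placeholders, not values from the survey.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one knot is one nautical mile (1852 m) per hour


def tow_length_m(speed_knots: float, duration_min: float) -> float:
    """Nominal distance covered by the sledge, assuming constant speed."""
    return speed_knots * KNOT_IN_M_PER_S * duration_min * 60.0


def density_per_m2(count: int, length_m: float, view_width_m: float) -> float:
    """Individuals per square metre of seabed swept by the field of view."""
    return count / (length_m * view_width_m)


# Nominal tow from the survey description: 10 minutes at 0.7 knots.
nominal_length = tow_length_m(speed_knots=0.7, duration_min=10.0)

# Hypothetical values for illustration only: 120 sea pens and a 0.8 m wide
# field of view (the real laser spacing is not given in this section).
example_density = density_per_m2(count=120, length_m=nominal_length, view_width_m=0.8)
```

Under these assumptions a standard tow covers roughly 216 m of seabed, so the area swept depends mainly on the width delimited by the fan lasers.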
The aim of these surveys is to identify and count the number of Nephrops burrow systems
falling within a fixed field of view, along transects of known length. Additional information
recorded for each tow includes start and end coordinates, depth at start and end of tow,
visibility along the tow (Good/Moderate/Poor), the presence of trawl marks in the substrate
and whether any other fauna (such as fish, crabs, sea pens and fan worms) are present.
Non-target fauna are not counted during the routine analysis for Nephrops stock
assessment.
In the last 15 years tows have been recorded digitally, first onto DV-tape and DVD. From
2016 onwards they have been recorded onto hard drives in MP4 video file format (see
Section 3.1.2). For this project, footage from 2014 and 2015 has also been converted from DVD
format to MP4.
Figure 1. Location of the Farne Deeps Nephrops ground (FU6) and the random stratified grid
of 110 annually visited UWTV stations.
3.1.2. Camera and video storage specifications
Between 2014 and 2021 various camera configurations were used for the Nephrops survey
and are summarised below in Table 1. Before 2016 a Kongsberg/Simrad 14-408 camera
system was used. The Phase Alternating Line (PAL) format of the Kongsberg/Simrad camera
used an interlaced video format where alternate lines use the preceding frame lines to
reduce motion blur. This interlacing alters the individual frames when compared
to the progressive scan format of the camera systems used from 2016 onwards. The DVD
storage format pre-2016 also required transcoding the DVD media files into the MP4 format
used 2016 onwards, which is compatible with digital video analysis software. For the 2016
to 2019 surveys a Subsea Technology and Rentals (STR) high-definition Internet Protocol
(IP) camera was used and for the 2020 and 2021 surveys an STR SeaSpyder High Definition
(HD) camera was used, further increasing the achievable pixel resolution.
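The report does not state which tool was used to transcode the DVD media. As one possible approach, the sketch below builds an ffmpeg command that deinterlaces an interlaced source with ffmpeg's `yadif` filter and re-encodes it as H.264 in an MP4 container. It assumes ffmpeg is installed; the file names are hypothetical, and the function only assembles the command so that it can be inspected before being run.

```python
from typing import List


def build_transcode_command(src: str, dst: str, deinterlace: bool = True) -> List[str]:
    """Assemble an ffmpeg command that converts a DVD video file to MP4.

    This is an assumed workflow, not the one used in the report.
    """
    cmd = ["ffmpeg", "-i", src]
    if deinterlace:
        # yadif is ffmpeg's deinterlacing filter; it rebuilds full progressive
        # frames from the two interlaced fields of each PAL frame.
        cmd += ["-vf", "yadif"]
    # Encode the video stream as H.264; the .mp4 extension selects the container.
    cmd += ["-c:v", "libx264", dst]
    return cmd


# Hypothetical file names for illustration.
command = build_transcode_command("station_01.vob", "station_01.mp4")
# To execute: import subprocess; subprocess.run(command, check=True)
```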
Table 1. Summary of the differing camera systems used on each survey.
Survey | Camera | Video format | Aspect ratio | File format
CEND1214 | Kongsberg/Simrad 14-408 | PAL (576 lines) | 4:3 | Optical storage media
CEND1215 | Kongsberg/Simrad 14-408 | PAL (576 lines) | 4:3 | Optical storage media
CEND1216 | STR SeaSpyder IP | 720 HD | 14:9 | Digital storage
CEND1217 | STR SeaSpyder IP | 720 HD | 14:9 | Digital storage
CEND1018 | STR SeaSpyder IP | 720 HD | 14:9 | Digital storage
CEND0919 | STR SeaSpyder IP | 720 HD | 14:9 | Digital storage
CEND0920 | STR SeaSpyder HD | 1080 HD | 14:9 | Digital storage
CEND0821 | STR SeaSpyder HD | 1080 HD | 14:9 | Digital storage
As the camera system has changed over the various surveys, this also intrinsically changed
the video imagery available for analysis (Figure 2). The camera and PAL format used for
2014 and 2015 has a width-to-height aspect ratio of 4:3, which enabled a close-up view of
the seabed, with a small amount of coverage outside of the fan laser lines. The IP and HD
camera both have an aspect ratio of 14:9, extending the area captured outside of the laser
lines and bringing the sledge frame into view. There are also extrinsic parameters which
have altered between surveys. There have been changes in both the camera mounting height
in the frame and the angle pointing towards the seabed. This is particularly noticeable in the
2019 data, where the camera angle is more vertical than the other surveys, and the sledge
skids are only just visible in the frame. Due to the increase in mounting height, there has
only been a slight increase in pixel resolution on the seabed.
Figure 2. Example screen images from each survey (panels labelled 2014 to 2021), showing
the changes in aspect ratio and camera placement on the towing sledge.
3.2. Environmental data
Environmental data were collated from online sources to support analysis of the spatial distribution
and temporal trends in sea pen density across Farne Deeps. Environmental raster layers used by
Downie et al. (2021) to model sea pen distribution on the UK continental shelf were used to
investigate the influence of depth, substrate, bottom topography, energy at the seabed, temperature
and water clarity on sea pen density. An additional layer summarising the cumulative subsurface
swept area ratio (SAR) over 2009-2016 was included to investigate the impact of bottom fishing on
sea pen density (Table 2).
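The cumulative SAR layer is described only as being calculated from the annual layers. A minimal sketch of one plausible calculation, assuming the cumulative value is the cell-wise sum of the annual SAR grids for the chosen years, is given below. The grids are represented as plain nested lists, and the tiny example values are invented for illustration.

```python
from typing import Dict, List

Grid = List[List[float]]  # rows of raster cell values


def cumulative_sar(annual_layers: Dict[int, Grid], first_year: int, last_year: int) -> Grid:
    """Cell-wise sum of annual SAR grids from first_year to last_year inclusive.

    Assumes all grids share the same shape and alignment.
    """
    grids = [annual_layers[year] for year in range(first_year, last_year + 1)]
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [
        [sum(grid[r][c] for grid in grids) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Invented two-year, one-row example.
example_layers = {2009: [[0.5, 1.0]], 2010: [[0.25, 0.0]]}
example_total = cumulative_sar(example_layers, 2009, 2010)
```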
Table 2. Environmental raster layers used to investigate spatial and temporal trends in sea
pen density at Farne Deeps.
Variable | Unit | Source resolution | Source
Annual subsurface swept area ratio (2009-2020) | Annual SAR | 0.05 deg | ICES/OSPAR, ICES (2021)
Cumulative subsurface swept area ratio (2009-2016) | Cumulative annual SAR | 0.05 deg | Calculated from annual subsurface SAR layers.
Bathymetry | m | 0.002 deg | EMODnet Digital Bathymetry (2016)
Valley Depth | m | 0.002 deg | Calculated from Bathymetry with SAGA for QGIS Basic terrain analysis tools
Relative Slope Position | 0-1 | as above | as above
Distance from Channel Network | m | as above | as above
Standardised height | m | as above | as above
Channel Network Baseline | m | as above | as above
Closed Depressions | ? | as above | as above
Current Velocity | m/s | 0.002 deg | Mitchell et al. (2019) https://doi.org/10.14466/CefasDataHub.62.
Wave velocity | m/s | 0.002 deg | as above
Winter suspended particulate matter | g/m3 | 0.002 deg | as above
Sand fraction | % | 0.002 deg | Mitchell et al. (2019) https://doi.org/10.14466/CefasDataHub.63.
Mud fraction | % | 0.002 deg | as above
Gravel fraction | % | 0.002 deg | as above
Sand to gravel log ratio | ratio | 0.002 deg | as above
Mud to gravel log ratio | ratio | 0.002 deg | as above
Annual Range in Bottom Temperature (2017-2019) | Deg C | 1.5 km | NORTHWESTSHELF_ANALYSIS_FORECAST_PHY_004_013 from http://marine.copernicus.eu/
Maximum Bottom Temperature (2017-2019) | Deg C | 1.5 km | as above
Mean Bottom Temperature (2017-2019) | Deg C | 1.5 km | as above
Minimum Bottom Temperature (2017-2019) | Deg C | 1.5 km | as above
Additional data on sea surface temperature and significant wave height between 2009-2021 were
obtained for the study area from the Cefas WaveNet strategic wave monitoring network Tyne and
Tees Waverider Buoy (located at 54°55'.14N, 000°44'.93W; available from
https://wavenet.cefas.co.uk).
3.3. Software and model algorithms
Detection and enumeration of stationary species such as sea pens in video footage is far more
complex than the detection of sea pens from single images. The ML algorithm must learn to detect
sea pens throughout a moving field of view (from top to bottom as the camera moves over them), in
different lighting conditions and, with an oblique camera angle such as in the Nephrops survey
footage, at varying distances from the camera. The ML process must incorporate a tracker algorithm to connect all
detections of the same sea pen into one track, so that one individual is not counted multiple times.
To achieve this, a very large number of training annotations are needed, with sea pens represented
in all parts of a single video frame. The easiest way to provide the ML algorithm with such a training
dataset was determined to be using track annotations. Consequently, the existing sea pen point
annotations required conversion into tracks of tightly framed rectangle annotations.
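To illustrate why a tracker is needed for counting, the sketch below links per-frame detection boxes into tracks with a simple greedy rule: a box joins the track whose box in the previous frame overlaps it most, provided the intersection-over-union (IoU) reaches a threshold. This is a deliberately simplified stand-in, not the tracker used in VIAME, and the `min_iou` value of 0.3 is an arbitrary assumption.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Box]], min_iou: float = 0.3) -> Dict[int, List[Tuple[int, Box]]]:
    """Greedily link boxes in consecutive frames into tracks.

    Returns a mapping of track id to a list of (frame index, box).
    """
    tracks: Dict[int, List[Tuple[int, Box]]] = {}
    last_seen: Dict[int, Tuple[int, Box]] = {}  # most recent (frame, box) per track
    next_id = 0
    for frame_idx, boxes in enumerate(frames):
        claimed = set()  # tracks already extended in this frame
        for box in boxes:
            best_id, best_score = None, min_iou
            for track_id, (seen_frame, seen_box) in last_seen.items():
                if track_id in claimed or seen_frame != frame_idx - 1:
                    continue
                score = iou(seen_box, box)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:  # no suitable track: start a new one
                best_id = next_id
                next_id += 1
                tracks[best_id] = []
            tracks[best_id].append((frame_idx, box))
            last_seen[best_id] = (frame_idx, box)
            claimed.add(best_id)
    return tracks


# One object drifting down the frame for three frames, plus a second object
# appearing in the last frame: four detections but only two individuals.
example_frames = [
    [(0, 0, 10, 10)],
    [(0, 2, 10, 12)],
    [(0, 4, 10, 14), (50, 50, 60, 60)],
]
example_tracks = link_detections(example_frames)
```

Counting tracks rather than detections is what prevents the same individual being counted in every frame it passes through.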
There is an ever-expanding suite of ML and AI tools available to consider; ranging from those purely
utilising command line programming to those with at least parts of the annotation process and
training algorithms accessible through a graphical user interface (GUI). Many of these tools only
address one step in the process of transcoding video, annotation, extraction of video frames, training
algorithms and displaying their results. The software toolkit, Video and Image Analytics for Marine
Environments (VIAME [1]), developed in cooperation between the US National Oceanic and
Atmospheric Administration’s (NOAA) Automated Image Analysis Strategic Initiative (AIASI), the
software developer Kitware and its partners, is an open-source computer vision software platform
targeting marine species analytics and combining tools for all the analysis stages through GUIs and
pre-existing processing pipelines, whilst being fully customisable (Dawkins et al., 2017). DIVE [2], the
open-source GUI, incorporates annotation tools and execution of pre-prepared analysis pipelines. It
can be run either as a web or desktop application. The desktop installation comes with many example
analysis pipelines for batch processing various stages of data preparation, algorithm training,
generation of detections and tracks from trained or pre-existing algorithms and post-processing
outputs.
The main machine learning methods implemented as standard in VIAME are: 1) Support Vector
Machines (SVM), 2) Cascade Faster R-CNN, 3) Mask R-CNN, 4) ResNet [3] (Residual Network), and
5) YOLO. The SVM configurations are the least processing-heavy and can be trained quickly with a
much smaller amount of training data in comparison to the other methods, which are based on deep
learning. Deep learning is a group of machine learning methods based on artificial neural networks.
They are the most effective for complex learning tasks but are very data intensive. They require input
data in the order of thousands of training annotations and can take days or weeks to run depending
on the data, algorithm, and available processing power. Of the deep learning methods, Cascade
Faster R-CNN, Mask R-CNN and ResNet are run using NetHarn [4], a parameterized fit harness deep
learning framework for PyTorch [5], and YOLO is run using Darknet [6].

[1] https://www.viametoolkit.org/
[2] https://kitware.github.io/dive/
[3] https://pytorch.org/hub/pytorch_vision_resnet/
VIAME was selected as the analysis platform for this study for three reasons: 1) its
GUI, which eases manual annotation of moving targets with tracks that can be
interpolated from non-consecutive frames, 2) the ability to use an existing general object detector
model on the platform to produce tracks for any objects to assist the conversion of point annotations
to tracks, and 3) the ability to run batch process pipelines with easy visualisation of results in the
GUI.
3.4. Video annotation
3.4.1. Previous point annotations
Previous annotations of sea pens done in the BIIGLE [7] annotation software on videos from Nephrops
surveys conducted in 2016, 2018 and 2020 were available to assist training annotation and for use
in validation of the model outputs. BIIGLE annotations were in the form of point annotations (by
species), of each sea pen within the field of view delimited by the fan lasers. As such they were not
directly applicable for training deep learning algorithms, which require large numbers of well framed
box annotations identifying each occurrence of the target species in an image or video frame. The
BIIGLE annotations were often of a single sea pen in a frame, with other sea pens visible in the same
frame annotated in previous or subsequent frames where they were more visible. The exclusion of
sea pens outside of the lasers also reduced the number of directly useful training frames. The point
annotations did, however, provide a starting point for generating a set of box track annotations for
training. Videos from 2016 contained the highest number of sea pen BIIGLE annotations and were
selected for the first stage of model training. BIIGLE annotations were exported as csv files
containing the annotated label (species), video file name, frame time from the beginning of the video
in decimal seconds and the point x-y coordinates in video frame pixels.
3.4.2. Preparation of annotations for training the deep learning model
A generic object tracker algorithm available in VIAME was applied to the 2016 videos. This detects
anything that is a distinct entity separate from the background. It was run using a batch processing
file which combines analysis pipelines for reducing the videos to 10 frames per second (fps), splitting
each video into individual frames, running a generic object detector on those frames, and compiling
tracks from the detections in consecutive individual frames. Output from the process is a csv file in
the native VIAME annotation format, which contains a unique track ID, the frame number from the
beginning of the video, the corner coordinates of a box surrounding the object in frame pixels and the
detection confidence (0-1) for each individual detection in each video frame.

[4] https://gitlab.kitware.com/computer-vision/netharn
[5] https://pytorch.org/
[6] https://github.com/AlexeyAB/darknet
[7] https://biigle.de/
The generic object tracks were matched with point annotations by using a spatial overlay analysis of
the point coordinates from BIIGLE and polygons created from the corner coordinates of detection
boxes in each video frame. Detection boxes within 20 pixels of a BIIGLE sea pen annotation, and all
other detections included in the same track, were assigned the species label of that BIIGLE
annotation. The generic object tracks that were not matched with a sea pen were filtered out and the
confidence of matched tracks changed to 1. The BIIGLE point annotations were further converted
into square box annotations by buffering them by 30 pixels. These boxes were given alternate labels
to indicate they were BIIGLE annotations and added to the matched and filtered VIAME annotation
file before it was exported. The matching was done in R (R Core Team, 2021) using the sf package
(Pebesma, 2018). The R code used to match, convert and export these annotations is available at
https://github.com/annadownie/Sea-pen-detection.git.
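The matching logic described above can be sketched in a few lines. The authors' implementation is in R with the sf package; the Python below is an independent illustration of the same idea, not a port of their code. The record layout (dictionaries with `track_id`, `frame`, `box`, `label`, `x`, `y`) is hypothetical, the conversion from BIIGLE time stamps to frame numbers assumes the 10 fps rate used in the pipeline, and the 30 pixel buffer is interpreted here as a half-width, giving a 60 by 60 pixel square.

```python
import math
from collections import defaultdict

FPS = 10  # frame rate the videos were reduced to


def frame_index(time_s, fps=FPS):
    """Convert a time stamp in decimal seconds to the nearest frame number."""
    return int(round(time_s * fps))


def point_to_box_distance(px, py, box):
    """Shortest distance in pixels from a point to a box (0 if inside)."""
    x_min, y_min, x_max, y_max = box
    dx = max(x_min - px, 0.0, px - x_max)
    dy = max(y_min - py, 0.0, py - y_max)
    return math.hypot(dx, dy)


def label_tracks(detections, points, max_dist=20.0):
    """Map track id to species label for tracks passing near a point annotation."""
    points_by_frame = defaultdict(list)
    for point in points:
        points_by_frame[point["frame"]].append(point)
    labels = {}
    for det in detections:
        for point in points_by_frame.get(det["frame"], []):
            if point_to_box_distance(point["x"], point["y"], det["box"]) <= max_dist:
                # The first matching annotation labels the whole track.
                labels.setdefault(det["track_id"], point["label"])
    return labels


def relabel_and_filter(detections, labels):
    """Keep only detections in matched tracks; set their label and confidence to 1."""
    return [
        {**det, "label": labels[det["track_id"]], "confidence": 1.0}
        for det in detections
        if det["track_id"] in labels
    ]


def point_to_square(px, py, half_width=30.0):
    """Buffer a point annotation into a square box (assumed half-width of 30 px)."""
    return (px - half_width, py - half_width, px + half_width, py + half_width)


# Invented example: track 1 passes within 20 px of a point annotation, track 2 does not.
example_detections = [
    {"track_id": 1, "frame": 123, "box": (100, 100, 140, 180), "confidence": 0.4},
    {"track_id": 1, "frame": 124, "box": (100, 110, 140, 190), "confidence": 0.5},
    {"track_id": 2, "frame": 123, "box": (400, 400, 420, 420), "confidence": 0.9},
]
example_points = [{"label": "Pennatula", "frame": frame_index(12.3), "x": 150.0, "y": 150.0}]
example_labels = label_tracks(example_detections, example_points)
example_kept = relabel_and_filter(example_detections, example_labels)
```

Note that a single matched frame is enough to label every detection in the track, which is how one point annotation yields many training boxes.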
Figure 3. Illustration of the process of converting VIAME generic object proposals into
training annotations for the deep learning model (a) and added annotations where BIIGLE
annotations were absent (b).
The annotations from this process were then used by analysts to help create detailed track
annotations for a subset of videos used for training the deep learning model (Figure 3). Analysts
used the BIIGLE annotations as a guide, edited the generic object-derived tracks where needed,
and created new tracks for any sea pens that were missing (including those outside of the laser
gates, Figure 3 b). Tracks produced by a generic object tracker and overlapping with single-frame
BIIGLE point annotations were checked and edited or re-made as required, using the DIVE
annotation software to ensure suitability for use in model training. This meant ensuring that the track
followed the sea pen across all frames, and that the boxes corresponding with the track fit and
followed the sea pen and not another ‘object’ (e.g. shadow or a shell). New tracks were also
produced if there was no overlap of a BIIGLE point annotation with the generic object tracker, or if
there was no BIIGLE annotation for a sea pen in the vicinity of one that had been annotated (the
video footage was not reviewed in full). Video quality was assessed based on the NMBAQC
categories (Turner et al., 2016).
Table 3. Issues that arose and approach taken for sea pen track editing.
Issue | Approach
Determination of first frame for annotation | When full length of sea pen is visible
Frames to incorporate for track creation | Try to capture side view and top view of the sea pen
Partial visibility of the sea pen in a frame ending the track | Retain partial sea pen in track, at least to half visible
Fit of the box to the sea pen | Aim for generally good fit to the whole sea pen as long as it is tracking the sea pen and not something else (versus perfect fit). If it is requiring a lot of editing it is better to do large boxes to capture the sea pen than to spend too long on it. Can zoom in to help make tight annotation.
White text at top of the image | The text is part of the image, do not annotate sea pen when obscured by the text
Sea pen crosses the laser beam | Include if it looks like it is able to track across the laser
Two consecutive sea pens and no way to distinguish them until nearly directly over them | Apply two boxes, even if they overlap. Where one of them is entirely obstructing the other one, it won’t be possible to do the whole track, but do it from where it is visible.
Shadow influences visibility | Include sea pens in shadows and make note of the frames
Visibility of sea pen changes frame to frame, limiting placement/assessment of box and track fit | Make note of low confidence frames. Don’t try to frame too tightly with the box if you can’t see it (if part has disappeared, draw where you think it is rather than only the parts visible). If you can see it when zoomed in, do not mark as low confidence.
When importing videos and annotations to the DIVE software, the frame rate was set to 10 frames
per second. Manual editing of tracks using the DIVE software was achieved by adjusting box fit to a
feature of interest and interpolating between frames with manually fitted boxes. For V. mirabilis,
substantial manual analyst intervention was required to produce or edit tracks, which highlighted key
considerations for developing the approach (Table 3). These included issues such as the presence
of consecutive sea pens in the footage (potential for overlapping tracks) and changes in the visibility
of a sea pen from frame to frame (Figure 4).
Figure 4. Examples of issues identified for sea pen track production and editing presented
for Virgularia. A and B show consecutive sea pens, which resulted in production of
overlapping tracks. A and B also show a sea pen crossing the laser (bottom right), which
resulted in the laser’s inclusion in the annotated feature for model training. C and D show
consecutive frames with contrasting sea pen visibilities. In C, the whole sea pen is not readily
distinguishable, representing a low confidence frame, whereas in D the full sea pen is readily
distinguishable. These issues could each affect the suitability of a track and associated
frames for model training.
With each track frame representing a replicate for use in model training and taking these
factors (Table 3) into consideration, a workflow was developed for track editing:
1. Use DIVE zoom feature to produce boxes with good fit to the sea pen for the
interpolation between frames (may want to note frames with good views/different
perspectives to guide this process)
2. Frame by frame
a. Check fit of the track zoomed in. Check if it is a generally good fit to the whole
sea pen and not capturing something else.
b. At the same time as checking fit, make note of whether the sea pen disappears
in the frame. If you would struggle to put a box around it in any independent
frame, make note of these frames as low confidence.
c. If edits are made, click through again for a final check of fit.
3. Check the track at normal zoom to make sure the full track is captured
a. If edits are made, click through again for a final check of fit.
For P. phosphorea, much less manual intervention was required, as the generic object
tracker more successfully tracked this species compared to Virgularia. Where detected by
the generic object algorithm, Pennatula were generally well framed by the boxes and very
little editing of the boxes was needed. Tracks needed to be added outside the laser gates
and some boxes interpolated or added where missing from intervening frames, but the
process was generally much simpler.
3.4.3. Annotations for testing the deep learning model
Seven stations were selected for testing how well the deep learning model was able to
generalise detections over the different image resolution and lighting conditions across the
Nephrops video footage time-series (i.e. 2014-2021). The annotations from these videos
were also used to investigate the temporal variability within these stations over the eight-
year period. For this second objective, the stations were selected on the basis of
geographical coverage across the study area, numbers of Pennatula and Virgularia present
at the stations in the BIIGLE counts and potential fishing impact level, extracted from a GIS
layer of mean annual Subsurface Swept Area Ratio (SAR) from 2009 to 2020. For speed of
annotation, all videos from these stations were annotated by analysts using single frame
box annotations for each sea pen, inside and outside of the laser gates.
3.5. Model training and accuracy evaluation
3.5.1. Deep learning model training and outputs
Whilst VIAME is open source and all parts of the analysis are customisable, it provides
pre-prepared analysis pipelines which can be run via batch command files containing the input
parameters required for the desired processes. For this project we selected the Cascade
Faster R-CNN (CF R-CNN) deep learning model.
In the first instance, detailed sea pen track annotations were available for four videos with a
total of 10,787 box annotations for Pennatula, representing 358 individuals in one video and
5,713 box annotations for Virgularia representing 157 individuals with full tracks and a further
333 individuals with single frame annotations across 3 videos (Table 4). None of the three
Virgularia videos were fully annotated due to the issues described in section 3.4.2. These
annotations were used to train a preliminary CF R-CNN model which was then applied to
one of the Virgularia videos and a further 2 Pennatula videos to create detections that could
be used as further annotations. The model detections were then inspected by an analyst
and used as a basis of 11,903 new box annotations for Pennatula and 12,909 for Virgularia,
bringing the total number of box annotations used in the final model to 41,312.
Table 4. Numbers of annotations used in training iterative versions of the deep learning
model. All videos are from 2016. Video indicates the station code of the video used for
training. No. in BIIGLE refers to the number of sea pens annotated in BIIGLE for the video.
Number of tracks refers to the number of individual sea pens for which track annotations were
made. Number of detections (in parentheses) is given where some or all of the annotations are
annotation boxes that have not been combined into tracks. Number of annotations is the total
number of individual annotation boxes for the model to train on.
Video | Species | No. in BIIGLE | Tracks (detections) annotated, prelim. model | Tracks (detections) annotated, final model | Annotations, prelim. model | Annotations, final model
6-DW | Pennatula | 235 | 358 | 358 | 10787 | 10787
6-DM | Pennatula | 115 | - | - (6131) | - | 6131
6-B | Pennatula | 111 | - | - (5772) | - | 5772
Total | Pennatula | 461 | 358 | 358 (11903) | 10787 | 22690
2-IFCA | Virgularia | 393 | 55 (281) | 55 (281) | 1865 | 1865
6-DD | Virgularia | 338 | 75 | 644 | 2115 | 15024
6-AT | Virgularia | 236 | 74 (68) | 74 (68) | 1733 | 1733
Total | Virgularia | 967 | 157 (333) | 773 (333) | 5713 | 18622
Total | Both | 1428 | 515 (333) | 1131 (12236) | 16500 | 41312
The final deep learning model was used to detect sea pens and form tracks in two sets of
videos: 1) videos from 2014-2021 from the seven time-series stations and 2) all of the videos
from 2016. The detections from VIAME were output in the VIAME native annotation csv
format. The annotations for the first set of videos included all sea pens including outside of
the laser gates. These annotations were directly matched to model generated detection
tracks and the outputs were used to firstly determine appropriate thresholds for detection
confidence by the model to be considered for sea pen counts. The second dataset was used
for a larger correlative comparison of numbers of sea pens detected by the deep learning
model in the full 130 videos from 2016 to those annotated in BIIGLE. As the BIIGLE
annotations were only done within the laser gates, the analysis must look at the strength of
correlation in the general pattern of abundance between stations.
3.5.2. Validation of deep learning model
Whereas the accuracy with which the deep learning model detects sea pens in individual
video frames is assessed by comparing each detection box with a corresponding annotation
box, reporting the numbers of correct and incorrect classifications of image sections, the
practical usefulness of the model outputs needs to be determined based on the final
resulting counts of individuals. In a perfect model, one model detection track would
correspond to one sea pen observed by an analyst. The detection tracks in the raw output,
however, are of highly variable quality. The deep learning model assigns each detection box
a Detection Label Confidence (DLC) metric varying from 0-1, where 1 is a ‘certain’
detection. The length of detection tracks also varied, from a single frame or very few
detections to hundreds of frames. On visual inspection of the raw outputs in the DIVE
annotation GUI, it was obvious that detections with very low DLC or a very low (or very high)
number of frames were unlikely to be actual sea pens. A track with many frames, following
the sea pen across the field of view, also had higher confidence, whilst very long tracks were
observed to be following parts of the camera frame. To improve accuracy and minimise the
occurrence of non-sea-pen objects in the detection track dataset (e.g., model detections of
camera sledge rails as V. mirabilis) a statistical approach was taken to determine
appropriate thresholds for DLC and the length of the track to be applied to the model detections.
For this purpose, each model detection track predicted for the set of videos with single frame
analyst annotations, was overlaid with the known instances of sea pens, to be classified
either as matching a true sea pen or not. The matching was done using a modified version
of the same code used earlier to match generic object detections with BIIGLE annotations.
Each detection in the frames with an analyst annotation and the frames directly before and
after were overlaid with the single frame analyst annotations to both classify a detection
intersecting an analyst annotation with the same species label as a match, and to calculate
the area of overlap. The model detection tracks are not always made up of detections in
every single frame that the object passes through. Hence, matching was allowed in the
previous and following frames to avoid missed matches with corresponding observed sea
pens, where a single frame is missing from a track. The matching process output a table
with each detection box in each track attributed with: 1) whether it intersected an analyst
annotation, 2) the proportion of its area intersected, 3) the species label assigned to the
intersecting analyst annotation, and 4) whether that label is a match to the model detection
label. Each model output track is made up of detections of varying confidence and that
information was also retained in the table.
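The per-detection matching step can be sketched as follows. This is a hedged Python illustration of the logic described above rather than the authors' R code: a model detection is compared with analyst boxes in the same frame and in the frames directly before and after, and the overlap is expressed as a proportion of the detection's own area. The dictionary keys are hypothetical.

```python
def intersection_area(a, b):
    """Area shared by two boxes given as (x_min, y_min, x_max, y_max)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def box_area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def match_detection(det, annotations, frame_tolerance=1):
    """Find the analyst annotation that best overlaps one model detection.

    Annotations up to frame_tolerance frames away are considered, so a single
    frame missing from a track does not cause a missed match.
    """
    result = {"intersects": False, "overlap": 0.0,
              "annotation_label": None, "label_match": False}
    det_area = box_area(det["box"])
    if det_area <= 0.0:
        return result
    for ann in annotations:
        if abs(ann["frame"] - det["frame"]) > frame_tolerance:
            continue
        overlap = intersection_area(det["box"], ann["box"]) / det_area
        if overlap > result["overlap"]:
            result = {"intersects": True, "overlap": overlap,
                      "annotation_label": ann["label"],
                      "label_match": ann["label"] == det["label"]}
    return result


# Invented example: the analyst box is one frame later and covers half the detection.
example_det = {"frame": 50, "label": "Virgularia", "box": (0, 0, 10, 20)}
example_anns = [{"frame": 51, "label": "Virgularia", "box": (5, 0, 15, 20)}]
example_match = match_detection(example_det, example_anns)
```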
Finally, the information was summarised by tabulating, for each model detection track:
• the video name,
• unique track ID,
• model predicted sea pen label (Species),
• the maximum detection label confidence score for detections that form the track,
• the number of detections forming the track,
• whether any detection in the track intersects an analyst annotation of the same species (yes/no),
• the maximum proportion of the area of model detections that overlaps with an analyst annotation,
• the station code and
• the year sampled.
The R code for completing this step can be found at
https://github.com/annadownie/Sea-pen-detection.git.
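The track-level summary can be illustrated with a short aggregation. The sketch below is a Python approximation of the tabulation described above, not the authors' R code; the input rows are assumed to be per-detection records carrying the hypothetical keys `video`, `track_id`, `label`, `confidence`, `label_match` and `overlap`. Station code and year are omitted because how they are encoded in the file names is not stated here.

```python
def summarise_tracks(rows):
    """Collapse per-detection rows into one summary record per track."""
    summary = {}
    for row in rows:
        key = (row["video"], row["track_id"])
        if key not in summary:
            summary[key] = {
                "video": row["video"],
                "track_id": row["track_id"],
                "species": row["label"],
                "max_dlc": 0.0,        # maximum detection label confidence
                "n_detections": 0,     # number of detections forming the track
                "matched": False,      # any detection matches a same-species annotation
                "max_overlap": 0.0,    # largest proportion of detection area overlapped
            }
        track = summary[key]
        track["max_dlc"] = max(track["max_dlc"], row["confidence"])
        track["n_detections"] += 1
        track["matched"] = track["matched"] or row["label_match"]
        track["max_overlap"] = max(track["max_overlap"], row["overlap"])
    return list(summary.values())


# Invented example: one track of two detections and one single-detection track.
example_rows = [
    {"video": "A", "track_id": 1, "label": "Pennatula", "confidence": 0.90,
     "label_match": False, "overlap": 0.0},
    {"video": "A", "track_id": 1, "label": "Pennatula", "confidence": 0.99,
     "label_match": True, "overlap": 0.8},
    {"video": "A", "track_id": 2, "label": "Virgularia", "confidence": 0.40,
     "label_match": False, "overlap": 0.0},
]
example_summary = summarise_tracks(example_rows)
```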
The data on the maximum detection label confidence score (MaxDLC), the number of sea
pen detection frames forming the track and whether the track was matched to an observed
sea pen were used to determine the optimal thresholds for the detection track confidence
and length to maximise the number of sea pens detected whilst minimising the inclusion of
false detections. Conditional inference (CI) trees were built for each sea pen species using
the R package ‘party’ (Hothorn et al. 2006), with the classification of the track as matched
to a sea pen or not (Matched) as the dependent variable and MaxDLC and number of frames
as the independent variables (see Annex 1; Figure 18 and Figure 19). Optimal thresholds
were determined separately for P. phosphorea and V. mirabilis due to detection confidence
for P. phosphorea being generally higher. The acquisition year was also taken into account
as, although all videos were sampled to the same frame rate, the field of view was smaller
in 2014 and 2015 (and less so in 2019) and it took far fewer frames for a sea pen to travel
across the field of view.
For P. phosphorea, which was more often tracked consistently through the field of view,
optimum thresholds varied dependent on the year of video acquisition. For 2014 and 2015,
matches between model detections and annotations were maximised when the number of
sea pen detection frames forming the track was 14 and MaxDLC was 0.999. For the
remaining years (2016-2021), matches were optimised when detection frames were 21 and
the MaxDLC was 0.998. For V. mirabilis, which was not as consistently visible throughout
crossing the field of view and had more missed frames, model detections and annotation
matches were optimised when the number of detection frames forming the track was 15
and the MaxDLC was 0.854.
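The filtering rule implied by these thresholds can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the threshold values are those quoted above, but treating them as inclusive lower bounds, the field names (`species`, `year`, `n_frames`, `max_dlc`) and the example tracks are assumptions made for the sketch.

```python
# Thresholds quoted in the text: (minimum detection frames, minimum MaxDLC).
# Treating them as inclusive lower bounds is an assumption of this sketch.
THRESHOLDS = {
    ("P. phosphorea", "2014-2015"): (14, 0.999),
    ("P. phosphorea", "2016-2021"): (21, 0.998),
    ("V. mirabilis", "all"): (15, 0.854),
}


def threshold_for(species, year):
    """Look up the (frames, confidence) thresholds for a species and year."""
    if species == "P. phosphorea":
        period = "2014-2015" if year in (2014, 2015) else "2016-2021"
        return THRESHOLDS[(species, period)]
    return THRESHOLDS[(species, "all")]


def keep_track(track):
    """Return True if a summarised detection track passes both thresholds."""
    min_frames, min_conf = threshold_for(track["species"], track["year"])
    return track["n_frames"] >= min_frames and track["max_dlc"] >= min_conf


# Hypothetical summarised tracks, for illustration only.
tracks = [
    {"species": "P. phosphorea", "year": 2014, "n_frames": 16, "max_dlc": 0.9995},
    {"species": "P. phosphorea", "year": 2017, "n_frames": 16, "max_dlc": 0.9995},
    {"species": "V. mirabilis", "year": 2016, "n_frames": 20, "max_dlc": 0.90},
    {"species": "V. mirabilis", "year": 2016, "n_frames": 10, "max_dlc": 0.95},
]
filtered = [t for t in tracks if keep_track(t)]
```

In this example the same 16-frame P. phosphorea track is retained in 2014 but dropped in 2017, which reflects the narrower field of view in the earlier footage.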
The full summarised detection track datasets for P. phosphorea and V. mirabilis were filtered
to exclude tracks below the thresholds generated by the CI trees. This retained both
matched and unmatched tracks resulting in a validation dataset containing one column with
the presence of model detection tracks of sea pens and another with analyst annotations
of sea pens. The number of filtered tracks per video were then compared to the number of
annotations per video to determine the number of actual presences not detected by the
model. These additional presences were added to the filtered dataset and noted as not
detected by the model (Figure 5).
Figure 5. Validation data table format.
This dataset can be used to calculate validation statistics which rely on True Positives, False
Positives and False Negatives, but it cannot be used for determining True Negatives, and
hence calculating validation statistics that rely on this category (such as Specificity). Model
accuracy statistics were generated for the combined predictions and observation data using
the presence.absence.accuracy function in the R package PresenceAbsence (Freeman
and Moisen, 2008). Only PCC (Percent Correctly Classified) and Sensitivity are reported
because of the aforementioned limitations. The data were also used to calculate and
compare the total numbers of sea pen detections by the model and sea pens observed by
the analysts in the validation dataset.
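As a rough illustration of why only these two statistics are available, the sketch below computes them from counts of True Positives, False Positives and False Negatives alone. It is written in Python rather than with the PresenceAbsence package, and it assumes that, with no True Negatives in the table, PCC reduces to TP / (TP + FP + FN). The counts in the example are hypothetical.

```python
def pcc_and_sensitivity(true_pos, false_pos, false_neg):
    """Return (PCC, sensitivity) when no true negatives can be counted.

    Assumption of this sketch: with zero true negatives, PCC reduces to
    TP / (TP + FP + FN). Sensitivity is TP / (TP + FN).
    """
    total = true_pos + false_pos + false_neg
    observed = true_pos + false_neg
    pcc = true_pos / total if total else float("nan")
    sensitivity = true_pos / observed if observed else float("nan")
    return pcc, sensitivity


# Hypothetical counts, not taken from the report.
pcc, sens = pcc_and_sensitivity(true_pos=30, false_pos=10, false_neg=10)
print(round(pcc, 2), round(sens, 2))  # 0.6 0.75
```

Under this definition PCC can never exceed sensitivity, because false detections enter only the PCC denominator.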
The thresholds were then applied to model detections for all of 2016 and Spearman’s rank
correlations were performed for both species to investigate the relationship between number
of 2016 BIIGLE annotations and the corresponding model detections in each video.
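For readers unfamiliar with the statistic, Spearman's rank correlation is the Pearson correlation of the ranks of the two count series. The sketch below is a generic Python illustration using average ranks for ties. It is not the R routine used in the study, and the per-video counts are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-video counts, for illustration only.
analyst_counts = [0, 2, 5, 9, 14]
model_counts = [1, 2, 4, 12, 11]
rho = spearman_rho(analyst_counts, model_counts)
```

Because the statistic depends only on ranks, a model that consistently over- or undercounts can still show a strong correlation with the analyst counts.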
3.6. Spatio-temporal analysis of sea pen density
3.6.1. Spatial distribution models
The BIIGLE annotated sea pen counts from 2016 were used to characterise the spatial
distribution of Pennatula and Virgularia at Farne Deeps in relation to environmental
gradients and bottom fishing intensity. The predictor layers and their sources are
summarised in Table 2. Fishing intensity is represented as cumulative subsurface SAR,
calculated by adding up the annual SAR for 2009-2016. Counts were converted into an
estimated density (individuals/m2) using the observed area derived from the width of the
laser gates (0.81 m) and the length of each camera sledge tow (mean 219 m, SD 18 m).
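The conversion from count to density is a simple division by the observed area, sketched here in Python. The laser gate width and the mean tow length come from the text above. The function name and the count of 40 individuals are hypothetical.

```python
LASER_GATE_WIDTH_M = 0.81  # width of the laser gates, from the text


def density_per_m2(count, tow_length_m, width_m=LASER_GATE_WIDTH_M):
    """Convert a sea pen count into individuals per square metre."""
    area_m2 = width_m * tow_length_m
    if area_m2 <= 0:
        raise ValueError("observed area must be positive")
    return count / area_m2


# A hypothetical count of 40 individuals over a tow of mean length (219 m).
example_density = density_per_m2(40, 219.0)
```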
Density of both species was modelled using Random Forest with regression trees (Cutler et
al. 2007). The models were built in R (v.4.1.2, R Core Team, 2021), using the
implementation of Random Forests in the ‘randomForest’ package (Liaw and Wiener, 2002).
The models were run on the default settings, except that 1000 trees were used. Preliminary
single models using all environmental variables were run first to select the best variables and remove variables
with high covariance. Variables were dropped from the models based on redundancy (high
correlation with a more important variable), or poorly defined relationship with the response
variable (based on the response curves).
The final random forest models were built with the subset of predictor variables defined in
the previous step. Variable importance was determined through a multiple permutation
procedure during the model run. Cross-validation via repeated sub-sampling was done to
evaluate the robustness of the model estimate and predictions to data sub-setting. The
cross-validation was done using 10 random split samples with 75% used to train and 25%
to test models. The final model outputs were plotted as the mean of all 10 runs.
Model performance was evaluated using both R2 values and Root Mean Squared Error
(RMSE), which for convenience of interpretation was also calculated as proportion of the
range of values in the input data (Relative RMSE). All accuracy statistics are presented as
means and standard deviations of the scores from the 10 model runs.
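The accuracy statistics and the split sampling described above can be sketched as follows. This is a Python illustration only: R2 is taken here to be the squared Pearson correlation between observed and predicted values, which is an assumption, and the helper names and the fixed random seed are hypothetical. No Random Forest is fitted in the sketch.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def relative_rmse(observed, predicted):
    """RMSE as a proportion of the range of the observed values."""
    return rmse(observed, predicted) / (max(observed) - min(observed))


def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of R2)."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def split_sample(n_samples, train_fraction=0.75, seed=0):
    """Randomly split sample indices into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Ten random 75/25 splits; each would be used to fit and score one model run.
splits = [split_sample(91, seed=run) for run in range(10)]
```

Reporting the relative RMSE alongside R2 matters because a model can have a small error relative to the data range while still failing to reproduce the pattern of high and low values.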
3.6.2. Analysis of temporal trends in density
Temporal variability and trends were investigated at the seven stations with analyst
annotations covering 2014-2021. As the annotations for model validation had to be done for
the full view, the laser gates could not be used to determine the width of the video transect.
The field of view widths for all years were instead calculated from the known width of the
laser gates in pixels relative to width of the visible area. The visible area for 2014 and 2015
is the full width of the frame, which corresponds to 1.1 m. After the change of camera system
in 2016 the width of view was limited by the sledge rails at 1.4 m, with the exception of 2019
when the view was zoomed in slightly closer, and the width of view was determined to be
1.3 m. For the estimates used here, distortion in the camera lens towards the edges makes the
widths less precise than when using the laser gates, but the variability introduced into the
total area cover of the video tows is very low. Where light is not equally distributed across
the image, the reduction of light towards the edges of the field of view can also lower counts
along the edge of the image. Calculations can be made to account for the lens distortion,
but a model to detect the laser lines is likely to be a better solution. Ultimately, the
inability to measure the exact distance covered by the camera sledge has a much larger
impact on the imprecision of the density estimate.
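The width calculation can be illustrated with a short Python sketch that scales the known laser gate width (0.81 m) by the ratio of the visible width to the gate width in pixels. The pixel values are hypothetical and were chosen only so that the result matches the 1.4 m quoted above. Lens distortion is ignored, as in the estimates used here.

```python
LASER_GATE_WIDTH_M = 0.81  # known width of the laser gates


def view_width_m(view_width_px, gate_width_px, gate_width_m=LASER_GATE_WIDTH_M):
    """Estimate the field of view width from the laser gate width in pixels.

    Assumes a linear pixel-to-metre scale across the frame.
    """
    return gate_width_m * view_width_px / gate_width_px


# Hypothetical pixel measurements for illustration.
width = view_width_m(view_width_px=1400, gate_width_px=810)  # about 1.4 m
```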
The time-series of sea pen densities were visualised in plots with corresponding time-series
of fishing intensity (annual SAR) and monthly maximum significant wave height and sea
surface temperature. The plots were used to interpret any temporal trends at the stations in
sea pens and the environment and fishing that could be of further interest for statistical
analysis. No clear correspondence in trends was identified and no further analysis was
done. Differences in interannual variability between the stations were investigated using the
coefficient of variation (CV). The relationship between fishing intensity and interannual
variability in sea pen density was investigated using Generalised Additive Models (GAMs).
GAMs were built for both species using subsurface SAR and depth as covariates using
penalised thin plate regression splines and a quasi-poisson family in the ‘gam’ function in
the mgcv package (Wood, 2011) in R.
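The coefficient of variation used to compare interannual variability between stations is the standard deviation of the annual densities divided by their mean. The Python sketch below uses the sample standard deviation, which is an assumption, and hypothetical density values.

```python
import statistics


def coefficient_of_variation(densities):
    """Return SD / mean of a station's annual densities (NaN if the mean is zero)."""
    mean = statistics.mean(densities)
    if mean == 0:
        return float("nan")
    return statistics.stdev(densities) / mean


# Hypothetical annual densities (individuals/m2) at one station.
station_cv = coefficient_of_variation([0.2, 0.4, 0.6])
```

A station where the species is always absent has an undefined CV, which the sketch returns as NaN rather than zero.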
4. Results and Discussion
4.1. Accuracy of the deep learning model
The results of the model detection accuracy statistics are presented in Table 5. One or more
detection tracks exceeded the filter thresholds for both species in all years, with the
exception of V. mirabilis in 2014. The vast majority of videos for which filtering thresholds
were exceeded were classed as having ‘Good’ visibility, however it should be noted that
there was a degree of variation in video quality within this category.
As expected, given the deep learning model was trained using annotations from the 2016
data, the accuracy of the P. phosphorea 2016 ‘Good’ visibility model detections was higher
than in any other year, with the model correctly predicting 78% of the time (PCC = 0.78) with
a sensitivity of 0.83. The sensitivity value corresponds to the proportion of all sea pens
present in the videos (as counted by the analysts) that were correctly captured by the deep
learning model. The lower number for the PCC indicates that some detections above the
applied thresholds were false positives and not matched by an analyst observation. It must
be noted, however, when interpreting these results that the matching process is somewhat
fallible. Detection tracks were matched to an analyst observation when a detection box in
the same or one preceding or following frame intersected an analyst annotation box. There
are instances where a track following a sea pen has missed more than one frame, or the
detection boxes in the previous or following frames do not overlap the analyst annotation
box. It is therefore likely that some model detection tracks were falsely identified as not
matching a sea pen, understating the model performance.
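The matching rule described above, and the way it can miss a valid track, can be sketched in Python as follows. The box format (x_min, y_min, x_max, y_max), the dictionary layout of a track and the example coordinates are all hypothetical. The authors' actual matching was done in R.

```python
def boxes_intersect(a, b):
    """Return True if two (x_min, y_min, x_max, y_max) boxes overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def track_matches_annotation(track, annotation, frame_tolerance=1):
    """Match a track to an annotation in the same, preceding or following frame.

    track: dict mapping frame number -> detection box.
    annotation: (frame number, annotation box).
    """
    ann_frame, ann_box = annotation
    for frame in range(ann_frame - frame_tolerance, ann_frame + frame_tolerance + 1):
        box = track.get(frame)
        if box is not None and boxes_intersect(box, ann_box):
            return True
    return False


# Hypothetical track that missed frames 11 and 12 while following a sea pen.
track = {9: (100, 50, 140, 120), 10: (100, 60, 140, 130), 13: (100, 90, 140, 160)}
annotation_near = (11, (105, 70, 135, 135))  # one frame after a detection
annotation_gap = (12, (105, 80, 135, 145))   # within one frame of frame 13 only
```

Widening `frame_tolerance` would recover some of the tracks that are currently counted as unmatched, at the cost of more coincidental matches.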
Table 5. Accuracy statistics for model prediction of Pennatula phosphorea and Virgularia
mirabilis, based on predictions filtered with the thresholds detailed in Section 3.5.
Species               Year  Visibility  n    PCC            Sensitivity
                                             Mean   S.D.    Mean   S.D.
Pennatula phosphorea  2014  Good        106  0.085  0.027   0.093  0.030
                      2015  Good        175  0.389  0.037   0.557  0.045
                      2016  Good        346  0.780  0.022   0.833  0.021
                      2017  Good        319  0.711  0.030   0.723  0.030
                      2017  Moderate     89  0.191  0.042   0.191  0.042
                      2018  Good        158  0.772  0.033   0.841  0.030
                      2019  Good        173  0.599  0.038   0.680  0.039
                      2020  Good        192  0.550  0.036   0.755  0.037
                      2021  Good        140  0.000  0.000   0.000  0.000
Virgularia mirabilis  2015  Good        196  0.051  0.016   0.052  0.016
                      2016  Good        224  0.509  0.033   0.927  0.024
                      2017  Good         55  0.357  0.075   0.455  0.088
                      2017  Poor         13  0.000  0.000   0.000  0.000
                      2018  Good         58  0.155  0.048   0.161  0.050
                      2018  Poor         27  0.407  0.096   1.000  0.000
                      2019  Moderate    127  0.441  0.061   0.455  0.062
                      2020  Good        102  0.029  0.017   0.033  0.019
                      2021  Good         72  0.000  0.000   0.000  0.000
n = number of combined model detections and observations; PCC = Percent Correct Classification; S.D. = standard
deviation. Year and visibility combinations where n 6 have been excluded.
Slightly lower but comparable accuracies were observed for the 2017 and 2018 model
detections in ‘Good’ visibility where >70% of detections were correct, with >0.72 sensitivity.
These two years have the most similar footage quality to 2016. The camera and zoom level
used are the same, with only minor differences in lighting. Moderate levels of accuracy were
observed for 2015, 2019 and 2020 in ‘Good’ visibility, where 39-60% of model tracks were
correct (sensitivity between 0.56 and 0.68). The camera set up in 2019 was also the same
as 2016-2018, but a closer zoom was used than in the other years and the camera angle is
slightly different. The camera set ups in 2015 and 2020 were both different from the year
used to train the models (2016). Pre-2016 a PAL ‘TV’ camera with 576 lines and a 4:3 aspect
ratio (see 0 for more detail) was used, whilst 2020 saw a change from HD Ready (720i) to
Full HD (1080i) and a different lighting configuration. Accuracy in the 2014 ‘Good’ visibility
and 2017 ‘Moderate’ visibility categories was extremely low, with just 9 and 19% of
detections correct, with sensitivities of 0.09 and 0.19. For ‘Good’ visibility in 2021 the model
detections were entirely inaccurate, with none of the 82 detections corresponding to any of
the 58 observed P. phosphorea. Lighting in the 2021 footage was extremely bright, reflecting
from suspended matter in the water column making it difficult to see the sea pens for
computer and human alike.
As observed for P. phosphorea, the 2016 ‘Good’ visibility model predictions for V. mirabilis
were the most accurate. However, the degree of accuracy was notably lower, with just 51%
of detection tracks correctly classified. The high sensitivity value (0.93) indicates that most
sea pens annotated by analysts were also detected by the model. The low PCC is a result
of overprediction, with false detections exceeding the thresholds used to filter them. The higher
number of False Positives in V. mirabilis is not entirely unexpected because of the lower
detection confidence threshold used to filter the detection tracks. It is also likely due to the
light colour and slim form of V. mirabilis being much less distinct from the image background
and consequently much more difficult to correctly classify. The 2017 ‘Good’ visibility, 2018
‘Poor’ visibility and 2019 ‘Moderate’ visibility models returned low to moderate accuracy
results, with 35-44% of detections correctly classified (with variable sensitivities ranging
from 0.46 to 1). In contrast to the results for P. phosphorea, the 2015 ‘Good’ visibility, 2018
‘Good’ visibility and 2020 ‘Good’ visibility model accuracies were extremely poor, with just 5
and 3% of detections correctly classified, with corresponding low sensitivities of 0.05 and
0.03. The 2017 ‘Poor’ visibility and the 2021 ‘Good’ visibility models were entirely inaccurate.
None of the 13 observed V. mirabilis were detected by the 2017 model, whilst none of the
22 detected by the 2021 model corresponded to the 60 observed.
The visibility categories assigned by the original data analysts did not consistently influence
model accuracy for either species. The 2021 videos with model detections retained following
the threshold filtering were described as having ‘Good’ visibility, yet none of the detections
for either species corresponded to actual observations. Conversely, for V. mirabilis the
second and third highest PCC values were observed for videos described as having
‘Moderate’ or ‘Poor’ visibility. The assignation of ‘Good’, ‘Moderate’ or ‘Poor’ visibility hinges
on whether, and for how much of the video tow, the view is obstructed by sediment (either
suspended by tide or waves, or in plumes from the movement of the camera sledge on the
sea floor). Within these parameters there is still variation in the camera altitude, zoom level
and lighting, which all have an impact on how distinct from the background the sea pens
appear. A good example is the difference between videos collected at station DM in 2017
and 2021 (Figure 6), where visibility in both is ‘Good’, but the different colour and orientation
of the lights makes it much harder to differentiate sea pens from the background in 2021.
The good match between human observers and the deep learning model in videos
categorised as ‘Moderate’ or ‘Poor’ visibility may arise because in those videos it is
equally difficult for human and computer vision to detect sea pens, especially if the lower
visibility category is caused by intermittent plumes of sediment.
Figure 6. Example images from video with good visibility with different lighting from station
DM in 2017 (a) and 2021 (b).
Other factors that reduce detectability are shadows and the presence of the laser lines. Both
affect the outward appearance of the sea pens and without the understanding of context a
human has, these will confuse any ML model. The presence of the laser lines in the video
can also lead to false positives where sea pens crossed by the laser beam are included in
the training data. If the laser is part of the annotated feature used for training, this can cause
the model algorithm to focus on the features of the laser line instead of the sea pen and
potentially cause false positives by detecting the laser lines as sea pens. A few occurrences
of laser lines identified as V. mirabilis were seen on visual inspection of selected model
detections. Their representation in the annotated training dataset could thus influence model
success.
An investigation of the correlation between numbers of sea pen detections by the model
after thresholds have been applied and the number of sea pens observed by
analysts in the validation dataset confirms the differences observed between species and
years in the model accuracy (Figure 7). The good regression fits, especially for
P. phosphorea in 2016 and 2018, do also point to the conclusion that some model detection
tracks were falsely identified as not matching a sea pen and the actual accuracy of the model
is better than the validation statistics suggest. The slopes of the regression fits also confirm
overestimation by the model of sea pen numbers for V. mirabilis in 2016 and for
P. phosphorea in 2021, whilst P. phosphorea numbers in 2014 and V. mirabilis numbers in
2015 are underestimated.
Figure 7. Scatter plots with linear regression fits of the number of Pennatula phosphorea and
Virgularia mirabilis analyst observations and model detections in the validation dataset
(2014-2021 data filtered with thresholds). Spearman's rank correlation coefficients and their
p-values are given for each species.
The Spearman’s rank correlations between the sea pen counts from 2016 BIIGLE point
annotations and corresponding model detections showed strong and highly statistically
significant positive relationships for both sea pen species (Table 6). The relationship is
further illustrated in Figure 8, which plots the counts from both with a regression fit line.
Table 6. Spearman’s rank correlation of 2016 BIIGLE observations and corresponding model
predictions.
Species               n    S       Rho   p
Pennatula phosphorea  130  58830   0.84  <0.001
Virgularia mirabilis  130  106579  0.70  <0.001
Figure 8. Scatter plot with linear regression fits of the number of Pennatula phosphorea (a)
and Virgularia mirabilis (b) analyst annotations made in BIIGLE and model detections in the
2016 full video dataset filtered with thresholds. Spearman's rank correlation coefficients (ρ)
and their p-values are given for each species.
4.2. Spatial distribution of sea pens at Farne Deeps
The Random Forest results of a spatial modelling analysis on the 2016 data support the
observations from the video analysis that P. phosphorea and V. mirabilis occupy different
parts of the sandy mud habitat at Farne Deeps. Where sea pens occur in large numbers, the
population is made up of just one of the species. There are areas of overlap where both occur
in low numbers. P. phosphorea yielded a good model of density (ind./m2) with a mean R2 of 0.48
and mean relative RMSE of 0.17 over the 10 split sample cross-validation runs (Table 7).
The R2 indicates how well the observed and predicted values correlate, whilst the RMSE
and the relative RMSE represent the deviation of predicted values from those observed on
in absolute terms and in proportion to the range of observed values, respectively. A relative
RMSE of < 25% and an R2 >0.3 should be attained for a model to be judged fairly good.
Whilst the mean relative RMSE (0.12) for V. mirabilis is also low, the low mean R2 (0.13)
indicates that the good RMSE values are the result of the many small values of an
unbalanced dataset correctly predicted whilst the model performs poorly on the larger
values, leading to poor correlation between predicted and observed values. In other words,
the model can differentiate between presence and absence, but not the range of densities.
This is supported by V. mirabilis presence/absence being successfully modelled using many
of the same environmental raster layers by Downie et al. (2021).
There were also differences in the variables influencing density between the two sea pen
species. Figure 9 shows the importance of each of the predictor variables to the outcome of
the Random Forest models for both sea pen species. Variable importance is a measure of
how much information the predictor variable contributes to the model, in the presence of all
of the other variables. For a regression tree model the importance is estimated by the
percent increase in mean squared error (% MSE) when a variable is permuted in successive
Monte Carlo permutations inside the model algorithm. The concentration of suspended
particulate matter was the most influential predictor variable for both species, with opposite
effects for P. phosphorea which was more abundant in low concentrations and V. mirabilis,
which is abundant in the more turbid coastal waters (Figure 10). This again echoes the
findings in Downie et al. (2021) that V. mirabilis is more tolerant of turbid conditions. The
second-most important predictor variable for P. phosphorea density was the cumulative
subsurface SAR over 2009-2016. The highest density of P. phosphorea is found in the
lowest impacted areas and only a few individuals are encountered over a cumulative SAR
of 10. V. mirabilis density, on the other hand, is not greatly influenced by fishing pressure,
which concurs with it being the most robust against disturbance of the three sea pen
species that occur on the UK continental shelf (Greathead et al., 2007; Downie et al., 2021).
Both species prefer sandy sediments (muddy sand and sandy mud) but V. mirabilis occurs
in somewhat higher densities where the mud/gravel content log ratio is higher (Figure 9 and
Figure 10).
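The idea behind permutation-based importance can be illustrated with a generic Python sketch: shuffle one predictor, re-score the model, and record the percent increase in MSE. This is not the out-of-bag procedure carried out inside the randomForest package, and the toy model, data and function names are hypothetical.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a prediction function over a dataset."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def percent_increase_mse(model, rows, targets, column, n_repeats=10, seed=0):
    """Mean percent increase in MSE when one predictor column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
        increases.append(100 * (mse(model, permuted, targets) - baseline) / baseline)
    return sum(increases) / n_repeats


def toy_model(row):
    """Hypothetical fitted model that uses only the first predictor."""
    return 3 * row[0]


# Toy data: the response depends on predictor 0 plus noise; predictor 1 is irrelevant.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(50)]
targets = [3 * r[0] + data_rng.gauss(0, 0.1) for r in rows]

importance = [percent_increase_mse(toy_model, rows, targets, c) for c in (0, 1)]
```

Shuffling the predictor that the model relies on inflates the error, whereas shuffling the irrelevant predictor leaves it unchanged.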
Figure 11 shows the predicted distribution of the density of P. phosphorea and V. mirabilis
in the Farne Deeps, overlaid with the observed values. The shallow coastal strip has no sea
pen density data available and is excluded from prediction to avoid extrapolating outside of
the known environmental envelope. The distributions clearly reflect an inshore-offshore
distinction between the two species. Both sea pens, although consistently present, have low
densities in the muddy basin, which is the main target of the Farne Deeps Nephrops fishery.
Table 7. Validation statistics for the 10 repeated split sample Random Forest model runs.
N = number of samples, RMSE = root mean squared error, Relative RMSE = RMSE as a
proportion of the range of values in data, SD = standard deviation.
Species               N   RMSE          Relative RMSE  R2
                          (Mean ± SD)   (Mean ± SD)    (Mean ± SD)
Pennatula phosphorea  91  0.14 ± 0.07   0.17 ± 0.15    0.48 ± 0.22
Virgularia mirabilis  91  0.23 ± 0.07   0.12 ± 0.05    0.13 ± 0.2
Figure 9. Predictor contributions to the spatial distribution of P. phosphorea and V. mirabilis
density. The bars show the mean of percent increase in mean square error (MSE) across 10
Random Forest model runs and the lines show standard error. SAR = Swept Area Ratio.
Figure 10. Partial dependence plots for the most influential environmental variables included
in Random Forest models for Pennatula phosphorea and Virgularia mirabilis density. Values
are means of the 10 repeated split sample runs of the model.
Figure 11. Predicted density (mean of 10 repeated split sample RF runs) of Pennatula
phosphorea and Virgularia mirabilis at Farne Deeps. The coastal strip shallower than the
lowest depth in the dataset is masked to prevent predictions outside the known
environmental conditions. Observed density is shown at sample locations. Crosses show
stations where no sea pens were found.
4.3. Temporal analysis of sea pen density
The seven stations selected for investigating temporal variability and trends are distributed
across the Farne Deeps basin, representing different levels of fishing intensity (mean annual
SAR 2009-2020) and different densities of P. phosphorea and V. mirabilis (Figure 12). All
but one of the stations had a presence of each sea pen species in at least one of the years
under study (2014-2021) and showed interannual variability with coefficients of variation
(CV) ranging from 0.4-2.8. The exception was DL, where V. mirabilis was always absent.
The density of P. phosphorea was consistently lower and more variable at the stations
impacted by bottom trawling (Figure 13). The effect of fishing intensity on the within station
variability was confirmed with the Generalised Additive Model (GAM) fit to the data. Both
suspended particulate matter in winter (SPMW) and fishing intensity (SAR) were significant
terms in the model (SPMW: Est. df = 1.9, p = 0.003; SAR: Est. df = 1, p = 0.005). The
Deviance Explained (DE) for the model was 98%, but this will be artificially inflated because
of the very small sample size. The small sample size should lead to cautious interpretation
of the results in general. Figure 14 shows the plotted GAM fit for P. phosphorea. Neither
suspended matter nor fishing intensity had a significant effect on the within station variability
(CV) of V. mirabilis (SPMW: Est. df = 1, p = 0.5; SAR: Est. df = 1.5, p = 0.7; DE 33%).
Figure 12. Spatial distribution over mean annual SAR 2009-2020 (a) and range of sea pen
density values over 2014-2021 (b) at the selected time-series validation stations. Mean and
standard deviation are shown by the black circles/triangles and lines.
Figure 13. Coefficient of Variation (CV) of sea pen density at time-series stations overlaid on
the mean annual subsurface SAR 2009-2020.
Figure 14. GAM response plots for Pennatula phosphorea coefficient of variation (CV) at
stations over 2014-2021 against (a) suspended particulate matter in the water column in
winter and (b) mean annual subsurface swept area ratio (SAR) for 2009-2020.
The densities of the sea pens were plotted as time-series to get an overview of any
increasing or decreasing temporal trends. Three plots overlaying the sea pen density time-
series on corresponding time-series for annual SAR (Figure 15), monthly maximum
significant wave height (Figure 16) and monthly maximum sea surface temperature (Figure
17) were also inspected for any co-occurring temporal patterns between years. Stations AK
and DE are both dominated by V. mirabilis with fluctuating low numbers of P. phosphorea.
Both stations show a downward trend in the density of V. mirabilis from 2014-2021, with
numbers at zero in 2021. The decreasing trend did not show any clear association with
fishing intensity, wave height or temperature. The fishing layer used, however, is at a very
coarse scale and is more indicative of the level of fishing in the general vicinity of the station.
A SAR >1, which is interpreted as the whole area of the raster cell being swept at least once,
does not necessarily correspond to an impact at a specific location because fishing may not
be uniformly distributed across the cell. SAR should only be used to estimate the probability
of being hit by a trawl in a given year. Only a longer time-series of the station under the
same environmental and fishing impact conditions can tell if the trend is part of a longer
cyclical pattern or how quickly numbers can recover. V. mirabilis numbers at stations DU
and L as well as P. phosphorea numbers at all stations seem to fluctuate in a more waveform
pattern. Stations DM and DL can be used as a reference for variability of a P. phosphorea
field in un-impacted conditions. A longer timeseries would, however, allow for a proper time-
series analysis of the variability. Un-impacted stations with high numbers of V. mirabilis are
also needed for comparison.
Figure 15. Density of sea pens observed at the 7 time-series stations with the annual mean
subsurface swept area ratio (SAR) between 2014-2021.
Figure 16. Density of sea pens observed at the 7 time-series stations with the monthly
maximum significant wave height from the Tyne and Tees Waverider between 2014-2021.
Figure 17. Density of sea pens observed at the 7 time-series stations with the monthly
maximum sea surface temperature from the Tyne and Tees Waverider between 2014-2021.
5. Conclusion
5.1. Applicability of deep learning to evaluating sea pen density for research and monitoring
The study achieved its first three objectives, namely, to develop and operationalise a
machine learning tool to enable quick and cost-effective analysis of underwater video survey
footage for extracting counts of the sea pens P. phosphorea and V. mirabilis, apply it to a
timeseries of Nephrops video footage and assess the algorithm’s ability to generalise to
footage from different camera platforms and with different quality and visibility over the
legacy video footage. The results were encouraging for the future development of automated
detection algorithms for sea pens, and other similar taxa with an appearance distinct from
their surroundings. Whilst exact accuracy in sea pen counts was not achieved, the
automatically detected sea pen numbers generally followed the same pattern as those
counted by a human analyst. Counts from the deep learning model detections and human
analyst annotations had good correlation, especially in videos from the year from which the
videos used for model training were selected. More detailed investigation is needed into the levels
of additional variability and artefactual noise that are added by the imprecise counts, but the
results suggest that automatically detected numbers of sea pens can differentiate between
high and low numbers of sea pens and could be used to detect trends.
The different camera specifications, lighting and rigging conditions and image resolutions
affected the transferability of the trained model to those other platforms and types of imagery
more than expected. Light conditions were observed to be an important factor in affecting
the usefulness of the model. Although this was not specifically tested, light that is lower in
brightness and warmer in hue seems to make detecting sea pens easier. V. mirabilis in
particular, with its light colouring, becomes difficult to differentiate from the background in
very bright light, whereas in lower light it appears to have a more yellow tone which is more
conspicuous against the sediment.
Transferability of the model could be improved upon by expanding the training base to a
wider range of different sources of video, or potentially by using a smaller set of annotations
created for each camera set up at a time to tune the existing model by further training it to
detect in those specific conditions before applying it to the remaining videos. There is merit
in trialling both approaches. Preprocessing of video footage to e.g. standardise mean
brightness, adjust colour balance and apply filters to even up illumination also has potential
to improve performance across video platforms.
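One of the simplest preprocessing steps mentioned, standardising mean brightness, could look something like the Python sketch below. It operates on a greyscale frame held as nested lists and applies a single gain so that the frame mean moves towards a target value. The target of 128, the function name and the example frame are hypothetical, and this step was not tested in the study.

```python
def standardise_mean_brightness(frame, target_mean=128.0):
    """Scale a greyscale frame (rows of 0-255 values) towards a target mean.

    Values are clipped at 255, so very bright frames may end up slightly
    below the target mean.
    """
    pixels = [p for row in frame for p in row]
    current_mean = sum(pixels) / len(pixels)
    if current_mean == 0:
        return [list(row) for row in frame]  # nothing to scale in a black frame
    gain = target_mean / current_mean
    return [[min(255, round(p * gain)) for p in row] for row in frame]


# Hypothetical 2 x 2 frame with a mean brightness of 100.
adjusted = standardise_mean_brightness([[50, 150], [100, 100]])
```

A global gain of this kind does not correct uneven illumination across the frame, which would need a spatially varying filter.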
In the final model our study used all annotations, including those interpolated along the track
where the sea pen (particularly V. mirabilis) was not detectable to the human eye. This may
have caused the model to pick out other features from the background and contribute to the
overprediction of V. mirabilis. If a model trained with fewer but clearer annotations
were equally able to detect sea pens where they are more visible, whilst producing fewer
low confidence detections, the training annotation process could be made faster. The
potential downside is losing the ability to track the sea pens where they are not well
discernible in intervening frames between better examples.
Whilst the time taken to annotate videos for training was extensive, the current model can
now be used as a template from which to train new models with much less input data. If
consistency across years can be achieved with similar success as within the 2016 footage
in the current model, the method can be used to extract time-series from legacy data and,
with regular tuning, be applied on the current platform in the future. The potential
time saving for extracting data on a benthic taxon, P. phosphorea, that has in the preliminary
analysis here been shown to respond to a pressure gradient and as such has promise as a
condition indicator for sublittoral muddy sand and mud habitats, is considerable.
The approach used here for sea pens can easily be extended to other prominent taxa, such
as anemones, tubeworms, and crustaceans as well as burrow openings, in the Nephrops
footage. As studies such as Durden et al. (2021) have shown, including a broader range of
taxa also improves the detection accuracy for the target taxa. Similarly, the algorithm can
be used as a basis to train detection algorithms with fewer annotations for other video
collected with camera sledges or other video platforms which maintain a steady altitude and
speed. Whilst the detection algorithm can also be used to detect target species in footage
from varying altitude platforms such as drop-frames, the nature of the footage makes it more
appropriate to do this for individual frames rather than continuous video.
5.2. Spatio-temporal trends in sea pen density: implications for monitoring
Further objectives of the study included the extraction of a time series for P. phosphorea
and V. mirabilis density, investigation of spatio-temporal trends in their populations in the
study area and linking them to changes in the environment and human impacts. The
incongruent detection success of the developed algorithm prevented the extraction of a
timeseries including all stations. Instead, the second part of the study was completed on the
fully annotated 2016 data set and a smaller number of stations with analyst annotations of
time series used for model validation.
P. phosphorea was found to have a negative response to fishing pressure both in its wider
spatial distribution and temporal variability. In areas subject to higher fishing intensity P.
phosphorea is present in low numbers with high local and temporal variability. Overall, the
stations with higher numbers of P. phosphorea have more stable numbers. The high
variability at low densities means it may be more difficult to detect change where densities
are low. The rolling multi-year increasing and decreasing trends at the unimpacted stations
give an indication of natural variability and highlight the need to develop longer term time-
series, instead of comparing two distinct time points out of context and inferring impact.
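The idea of rolling multi-year trends can be illustrated with a minimal sketch. It assumes a simple least-squares slope over a sliding window of consecutive survey years; the window length, the function names and the density values are hypothetical and are not the method or data used in this report.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def rolling_trends(years, densities, window=5):
    """Return (start_year, end_year, slope) for each run of `window` records."""
    out = []
    for i in range(len(years) - window + 1):
        ys = years[i:i + window]
        ds = densities[i:i + window]
        out.append((ys[0], ys[-1], ols_slope(ys, ds)))
    return out


# Hypothetical densities (individuals per square metre) at one station.
years = [2010, 2011, 2012, 2013, 2014, 2015, 2016]
density = [0.10, 0.12, 0.14, 0.13, 0.11, 0.09, 0.10]

trends = rolling_trends(years, density, window=3)
for start, end, slope in trends:
    print(f"{start}-{end}: {slope:+.3f} per year")
```

In this made-up series the sign of the slope changes between windows, which is the kind of pattern that a comparison of only two time points would miss.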
V. mirabilis, whilst showing the largest drop in numbers in the time-series, did not appear to
be influenced by fishing activity. Its robustness to turbid conditions, warmer water and
physical abrasion makes it a less useful indicator of habitat condition. Extending the time
series further into the past at the stations with declining V. mirabilis numbers, and monitoring
numbers in future years, could give further insight into the permanence of the change and the
recovery potential. Higher resolution spatial data on fishing activities could also help narrow
down potential sources of local impact. The ratio of P. phosphorea to V. mirabilis where they
co-occur could also be an interesting metric to investigate. Overall, more
stations with longer time-series are needed for comparison to confirm the findings.
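As a minimal sketch of the suggested ratio metric, the snippet below computes the P. phosphorea to V. mirabilis ratio only where both species were recorded. The station labels, counts and the function name are hypothetical, and treating non-co-occurrence as "no value" is an assumption of this example.

```python
def pen_ratio(p_count, v_count):
    """P. phosphorea : V. mirabilis ratio, or None where the two do not co-occur."""
    if p_count > 0 and v_count > 0:
        return p_count / v_count
    return None


# Hypothetical (station, year, P. phosphorea count, V. mirabilis count) records.
records = [
    ("S1", 2015, 12, 4),
    ("S1", 2016, 9, 6),
    ("S2", 2016, 0, 7),   # only V. mirabilis present: no ratio
    ("S3", 2016, 5, 0),   # only P. phosphorea present: no ratio
]

ratios = {(station, year): pen_ratio(p, v) for station, year, p, v in records}
for key, value in ratios.items():
    print(key, value)
```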
5.3. Future recommendations
The deep learning model should be developed further, especially investigating the effects of
further tuning it to the footage from years other than 2016.
More comparative analysis is needed on the ability of the model detection count dataset to
distinguish statistically significant differences in sea pen density and how this differs from
count data produced by a human analyst.
The effect of training box annotation clarity (how clearly the target is visible) on the extent to
which a model over- or underpredicts should be investigated.
A way to share model training annotations and imagery would make it easier to achieve the
large numbers of training annotations that are needed and to widen the image variability
available to train models with.
A standard dataset for validation of new iterations of algorithms would allow comparison and
quality control between algorithms and video platforms. Accuracy of both single detections
and numbers of individuals counted should be reported for each application of automated
detection.
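The two levels of reporting recommended above could be sketched as follows. This is an illustrative example only: precision, recall and F1 are assumed as the single-detection metrics, and mean signed error and mean absolute error as the count-level metrics; all numbers and function names are hypothetical.

```python
def detection_scores(tp, fp, fn):
    """Precision, recall and F1 for single detections matched to annotations."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def count_error(model_counts, analyst_counts):
    """Mean signed error (bias) and mean absolute error of per-station counts."""
    diffs = [m - a for m, a in zip(model_counts, analyst_counts)]
    bias = sum(diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return bias, mae


# Hypothetical detection outcomes: true positives, false positives, false negatives.
precision, recall, f1 = detection_scores(tp=80, fp=20, fn=20)

# Hypothetical numbers of individuals counted per station.
bias, mae = count_error(model_counts=[10, 7, 3], analyst_counts=[12, 7, 2])

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
print(f"count bias={bias:+.2f} MAE={mae:.2f}")
```

Reporting both levels matters because over- and under-detections can cancel out in the counts, so a small count bias does not by itself imply accurate single detections.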
Some manual annotation should be done each time an algorithm is applied to footage
from a new platform, or with new lighting conditions, to ensure a comparable level of
accuracy, if the results from different platforms are to be used in comparative analysis.
Longer time series from more stations at Farnes Deep, and potentially other Nephrops FUs,
should be extracted and analysed for more statistically robust information on the natural and
human-induced variability in the sea pens, particularly P. phosphorea.
Sea pen densities at the stations AK, DE, DM, DL, DU, L and X, which form the current
time series, should be extracted in future years and added to the time series to increase its value.
Further strategic investments could be made in expanding the ‘standard’ geographical scope
of the existing Nephrops surveys to include additional selected stations, for example
locations in the nearby Marine Conservation Zones (MCZs) Farnes East and North East of
Farnes Deep. Alongside further spatio-temporal investigations of sea pen communities
within and beyond these MCZs, this additional information could inform the development of a
fully validated sea pen condition indicator, which could be used in Environment Act 2021
and OSPAR assessments.
6. References
Buhl-Mortensen, P. and Buhl-Mortensen, L. (2014). Diverse and vulnerable deep-water biotopes in the
Hardangerfjord. Marine Biology Research 10, 253–267. https://doi.org/10.1080/17451000.2013.810759.
Cutler, D.R., Edwards, T.C., Beard, K.H., Cutler, A., Hess, K.T., Gibson, J. and Lawler, J.J. (2007).
Random Forests for classification in ecology. Ecology 88, 2783–2792.
http://dx.doi.org/10.1890/07-0539.1.
De Clippele, L.H., Buhl-Mortensen, P. and Buhl-Mortensen, L. (2015). Fauna associated with cold water
gorgonians and sea pens. Continental Shelf Research 105, 67–78. https://doi.org/10.1016/j.csr.2015.06.007.
Dawkins, M., Sherrill, L., Fieldhouse, K., Hoogs, A., Richards, B., Zhang, D., Prasad, L., et al. (2017). An
open-source platform for underwater image & video analytics. Proceedings - 2017 IEEE Winter Conference
on Applications of Computer Vision, WACV 2017, 898–906. Institute of Electrical and Electronics Engineers
Inc.
Downie, A.L., Noble-James, T., Chaverra, A. and Howell, K.L. (2021). Predicting sea pen (Pennatulacea)
distribution on the UK continental shelf: evidence of range modification by benthic trawling. Marine Ecology
Progress Series 670, 75–91. https://doi.org/10.3354/meps13744.
Durden, J., Hosking, B., Bett, B., Cline, D., Ruhl, H. (2021). Automated classification of fauna in seabed
photographs: The impact of training and validation dataset size, with considerations for the class imbalance.
Progress in Oceanography 196, 102612. https://doi.org/10.1016/j.pocean.2021.102612
Freeman, E.A. and Moisen, G. (2008). PresenceAbsence: An R package for presence absence
analysis. Journal of Statistical Software 23 (11), 1–31. https://doi.org/10.18637/jss.v023.i11.
French, G., Mackiewicz, M., Fisher, M., Holah, H., Kilburn, R., Campbell, N., & Needle, C. (2020). Deep
neural networks for analysis of fisheries surveillance video and automated monitoring of fish discards. ICES
Journal of Marine Science 77(4), 1340–1353. https://doi.org/10.1093/ICESJMS/FSZ149
Gomez-Rios, A., Tabik, S., Luengo, J., Shihavuddin, A., Herrera, F. (2019). Coral species identification with
texture or structure images using a two-level classifier based on Convolutional Neural Networks.
Knowledge-Based Systems 184, 104891. https://doi.org/10.1016/J.KNOSYS.2019.104891
Gomez-Rios, A., Tabik, S., Luengo, J., Shihavuddin, A., Krawczyk, B., Herrera, F. (2019). Towards highly
accurate coral texture images classification using deep convolutional neural networks and data
augmentation. Expert Systems with Applications 118, 315–328. https://doi.org/10.1016/J.ESWA.2018.10.010
Greathead, C.F., Donnan, D.W., Mair, J.M. and Saunders, G.R. (2007). The sea pens Virgularia mirabilis,
Pennatula phosphorea and Funiculina quadrangularis: distribution and conservation issues in Scottish
waters. Journal of the Marine Biological Association of the United Kingdom 87, 1095–1103.
https://doi.org/10.1093/icesjms/fsu129.
Greathead, C., González-Irusta, J.M., Clarke, J., Boulcott, P., Blackadder, L., Weetman, A. and Wright, P.J.
(2015). Environmental requirements for three sea pen species: relevance to distribution and conservation.
ICES Journal of Marine Science 72, 576–586. https://doi.org/10.1093/icesjms/fsu129.
Hothorn, T., Hornik, K. and Zeileis, A. (2006). Unbiased Recursive Partitioning: A Conditional Inference
Framework. Journal of Computational and Graphical Statistics 15 (3), 651–674.
https://doi.org/10.1198/106186006X133933.
Hixon, M.A. and Tissot, B.N. (2007). Comparison of trawled vs untrawled mud seafloor assemblages of
fishes and macroinvertebrates at Coquille Bank, Oregon. Journal of Experimental Marine Biology and
Ecology 344, 23–34. https://doi.org/10.1016/j.jembe.2006.12.026.
ICES. 2020. Working Group on Nephrops Surveys (WGNEPS; outputs from 2019). ICES Scientific Reports.
2:16. 85 pp. http://doi.org/10.17895/ices.pub.5968.
ICES. 2021. OSPAR request on the production of spatial data layers of fishing intensity/pressure. In Report
of the ICES Advisory Committee, 2021. ICES Advice 2021, sr.2021.12.
https://doi.org/10.17895/ices.advice.8297.
Lauria, V., Garofalo, G., Fiorentino, F., Massi, D., Milisenda, G., Piraino, S., Russo, T. and Gristina, M.
(2017). Species distribution models of two critically endangered deep-sea octocorals reveal fishing impacts
on vulnerable marine ecosystems in central Mediterranean Sea. Scientific Reports 7 (1), 8049.
https://doi.org/10.1038/s41598-017-08386-z.
Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News 2, 18–22.
http://cran.r-project.org/doc/Rnews/.
Malecha, P. and Stone, R. (2009). Response of the sea whip Halipteris willemoesi to simulated trawl
disturbance and its vulnerability to subsequent predation. Marine Ecology Progress Series 388, 197–206.
https://doi.org/10.3354/meps08145.
Mitchell, P. J., Aldridge, J., & Diesing, M. (2019). Legacy Data: How Decades of Seabed Sampling can
Produce Robust Predictions and Versatile Products. Geosciences, 9(4), 182.
https://doi.org/10.3390/geosciences9040182.
Moniruzzaman, M., Islam, S., Bennamoun, M., Lavery, P. (2017). Deep Learning on Underwater Marine
Object Detection: A Survey. Lecture Notes in Computer Science (including subseries Lecture Notes in
Artificial Intelligence and Lecture Notes in Bioinformatics), 150–160.
http://link.springer.com/10.1007/978-3-319-70353-4_13
Murray, J., Jenkins, C., Eggleton, J., Whomersley, P., Robson, L., Flavell, B. and Hinchen H. (2015). The
development of monitoring options for UK MPAs: Fladen Grounds R&D case study. Joint Nature
Conservation Committee/Cefas Partnership Report Series No. 9. Peterborough, UK. ISSN 2051-6711.
https://hub.jncc.gov.uk/assets/2d594d86-06f1-419d-8cb6-db2f61b5be9c.
Pebesma, E. (2018). Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal
10(1), 439–446. https://doi.org/10.32614/RJ-2018-009.
Rasmussen, C., Zhao, J., Ferraro, D., Trembanis, A. (2017). 2017 IEEE International Conference on
Computer Vision Workshops (ICCVW), 2865–2873. http://ieeexplore.ieee.org/document/8265549/
R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical
Computing, Vienna, Austria. URL https://www.R-project.org/.
Rogers, A.D. and Gianni, M. (2010). The Implementation of the UNGA Resolutions 61/105 and 64/72 in the
Management of Deep-Sea Fisheries on the High Seas. Report prepared for the Deep-Sea Conservation
Coalition, International Programme on the State of the Ocean, London. Available from:
http://www.savethehighseas.org/publicdocs/61105-Implemention-finalreport.pdf (Accessed 21 February
2022).
Turner, J.A., Hitchin, R., Verling, E., van Rein, H. (2016). Epibiota remote monitoring from
digital imagery: Interpretation guidelines. Available from:
http://www.nmbaqcs.org/media/1643/nmbaqc_epibiota_interpretation_guidelines_final.pdf (Accessed 23
February 2022).
van Rein, H., Hinchen, H., Hawes, J., Durden, J.C., Benson, A.D., Lindenbaum, C. E., Boulcott, P. F. and
Webb, K. (2020). Development of a Benthic Imagery Action Plan for the United Kingdom, in: The Big Picture
Benthic Imagery Workshop 2019. JNCC, Peterborough, UK. Available from:
http://www.nmbaqcs.org/media/1793/benthic-imagery-action-plan-v11-amended-henrik-sept-20.pdf
(Accessed 21 February 2022).
Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of
semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36.
Zurowietz, M., Langenkämper, D., Hosking, B., Ruhl, H., Nattkemper, T. (2018). MAIA – A machine learning
assisted image annotation method for environmental monitoring and exploration. PLOS ONE 13(11),
e0207498. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0207498
Annex 1. Conditional Inference trees
Figure 18. Conditional Inference tree used to determine filtering thresholds for Pennatula phosphorea
model predictions. Det.frames = the number of sea pen detection frames forming each track;
model.conf = the maximum label prediction confidence score (MaxLPC).
Figure 19. Conditional Inference tree used to determine filtering thresholds for Virgularia mirabilis
model predictions. Det.frames = the number of sea pen detection frames forming each track;
model.conf = the maximum label prediction confidence score (MaxLPC).
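How thresholds on Det.frames and model.conf might be applied to filter model predictions can be sketched as below. The threshold values, the `Track` class and the simple "both conditions must hold" rule are assumptions made for illustration; the actual splits are those shown in the conditional inference trees and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    det_frames: int    # number of detection frames forming the track
    model_conf: float  # maximum label prediction confidence score (MaxLPC)


def keep_track(track, min_frames, min_conf):
    """Keep a track only if it is long enough and confident enough."""
    return track.det_frames >= min_frames and track.model_conf >= min_conf


# Placeholder thresholds, NOT the values derived in the report.
MIN_FRAMES = 3
MIN_CONF = 0.6

# Hypothetical model output.
tracks = [
    Track(1, 5, 0.92),
    Track(2, 1, 0.95),   # too few detection frames
    Track(3, 8, 0.40),   # confidence too low
    Track(4, 3, 0.60),   # exactly on both thresholds: kept
]

kept = [t.track_id for t in tracks if keep_track(t, MIN_FRAMES, MIN_CONF)]
print(kept)
```

A real tree may nest the two variables (for example, a different confidence threshold for short and long tracks), in which case the rule would need one condition per terminal node rather than a single pair of cut-offs.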
Annex 2. Random Forest partial dependence plots
Figure 20. Partial dependence plots for environmental variables included in Random Forest
models for Pennatula phosphorea density. Thin blue lines show the response for each of the
10 repeated split sample runs of the model and the red line is the mean across all runs.
Figure 21. Partial dependence plots for environmental variables included in Random Forest
models for Virgularia mirabilis density. Thin blue lines show the response for each of the 10
repeated split sample runs of the model and the red line is the mean across all runs.
© Crown copyright 2021
Pakefield Road, Lowestoft, Suffolk, NR33 0HT
The Nothe, Barrack Road, Weymouth DT4 8UB
www.cefas.co.uk | +44 (0) 1502 562244