Wiley

Methods in Ecology and Evolution

Published by Wiley and British Ecological Society

Online ISSN: 2041-210X

Disciplines: Methods & statistics in ecology


Top read articles

893 reads in the past 30 days

Quantitative structure models showing the vertical profiles of the example study trees, (a) Celtis occidentalis L. (common hackberry) and (b) Ulmus americana L. (American elm), alongside the horizontal canopy profiles for the (c) hackberry and (d) elm.
Conceptual schematic demonstrating the pruning operation employed here to delineate stemflow and dripflow components in a tree graph model, G. An inset is provided showing the angle used for analysis.
Scatterplots showing results from a sensitivity analysis for the cut‐off grade within the stemflow and throughfall delineation algorithm, applied to two tree models: (a) Celtis occidentalis and (b) Ulmus americana. The analysis tracks the stability and subsequent decline of stemflow perimeter length (open circles) as the cut‐off angle increases, with a notable decrease occurring beyond −10°. Concurrently, the mean contributing surface area to throughfall drip nodes, shown by filled circles, escalates with wider cut‐off angles and plateaus near the same −10° threshold. This plateau signifies a state where the increase in contributing throughfall areas stabilises, illustrating the algorithm's capability to identify a cut‐off angle that optimises the demarcation between branches affecting stemflow and throughfall. Example outputs below and above the cut‐off angle selected for our demonstration are provided in Figure S1 (stemflow areas and throughfall drip maps).
Comparison of the total projected canopy area (light grey) and the projected stemflow‐contributing branch area (dark grey) for (a) Celtis occidentalis L. (common hackberry) and (b) Ulmus americana L. (American elm). The areas shown in panels (a and b) are reported in Table 1. In the stemflow watershed, the distributions of (c) branch angles were nearly identical between the two trees; however, (d) the branch radii distributions clearly differ.
Panels showing drip point maps for (a) Celtis occidentalis L. (common hackberry) and (b) Ulmus americana L. (American elm), where the shade of each dot indicates the amount of contributing projected branch area. Plotted drip points are the top 98th percentile of all drip nodes.
A LiDAR‐driven pruning algorithm to delineate canopy drainage areas of stemflow and throughfall drip points

October 2024

·

896 Reads

·

Travis E. Swanson

·

[...]

Aims and scope


Methods in Ecology and Evolution promotes the development of new methods in ecology and evolution, and facilitates their dissemination and uptake by the research community.
We publish papers across a wide range of subdisciplines and provide a single forum for publishing analytical, practical, or conceptual methodological developments in ecology and evolutionary biology. Methods in Ecology and Evolution is fully open access and part of the prestigious British Ecological Society portfolio.

Recent articles


Generating spatially realistic environmental null models with the shift‐&‐rotate approach helps evaluate false positives in species distribution modelling

November 2024

·

7 Reads

To circumvent reporting spurious correlations, species distribution models often explicitly account for spatial autocorrelation, for example by including spatially structured random effects. The validity of statistical inference derived from such models has been tested by simulations using null environmental predictors that do not have any causal dependency with the response. Such null environmental predictors can be obtained by permutations of the original predictors or by simulating spatial structures resembling the original predictors. In such approaches, it is important that the permuted or simulated predictors reflect the nature of spatial variation present in the original predictors. Here we present a novel approach for generating realistic null predictors by a shift‐&‐rotate (S&R) approach: we extract environmental variables after randomly translating and rotating the sampling area within a window of defined environmental layers. In this way, the null environmental variables have fully realistic spatial variation and covariation, but no relationship to the response variable. We implement the S&R approach in three main R functions and demonstrate with a simulation study how they can be used to untangle causal versus non‐causal relationships within species distribution modelling. These methods allow us to quantify the predictive power that models gain from non‐causal correlations generated by the realistic structure of the environmental covariates. In our case study, we identify when a model incorrectly estimates parameter values, yet still has high predictive power due to the structured nature of the predictor variables. The use of null models is imperative in ecological modelling for testing the accuracy of statistical inference in complex ecological systems and the choice of these null models is far from trivial.
Here we provide R functions for generating spatially realistic null models to use in species distribution modelling as well as other spatially explicit fields such as landscape genetics.
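The shift‐&‐rotate idea described above can be illustrated with a minimal Python sketch. It assumes sampling sites are (x, y) pairs and that the environmental layer can be queried at any coordinate; `shift_and_rotate` and `toy_layer` are hypothetical names chosen for illustration and are not the authors' R functions.

```python
import math
import random

def shift_and_rotate(points, window, rng):
    """Rotate points about their centroid by a random angle, then translate
    them by a random offset chosen so that all points stay inside `window`.

    points: list of (x, y); window: (xmin, ymin, xmax, ymax).
    The relative layout of the sites (pairwise distances) is preserved.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [
        (cx + (x - cx) * cos_t - (y - cy) * sin_t,
         cy + (x - cx) * sin_t + (y - cy) * cos_t)
        for x, y in points
    ]
    xs = [x for x, _ in rotated]
    ys = [y for _, y in rotated]
    xmin, ymin, xmax, ymax = window
    # Range of offsets that keeps the rotated configuration inside the window.
    dx_lo, dx_hi = xmin - min(xs), xmax - max(xs)
    dy_lo, dy_hi = ymin - min(ys), ymax - max(ys)
    if dx_lo > dx_hi or dy_lo > dy_hi:
        raise ValueError("window too small for the rotated sampling area")
    dx = rng.uniform(dx_lo, dx_hi)
    dy = rng.uniform(dy_lo, dy_hi)
    return [(x + dx, y + dy) for x, y in rotated]

def toy_layer(x, y):
    # Hypothetical smooth environmental surface standing in for a raster layer.
    return math.sin(x / 10.0) + math.cos(y / 15.0)

rng = random.Random(42)
sites = [(40.0, 40.0), (45.0, 42.0), (50.0, 47.0), (43.0, 52.0)]
window = (0.0, 0.0, 100.0, 100.0)

# Null predictor: the real layer sampled at shifted-and-rotated locations.
null_sites = shift_and_rotate(sites, window, rng)
null_predictor = [toy_layer(x, y) for x, y in null_sites]
```

Because the transformation is rigid, the null predictor keeps the spatial structure of the real layer at the scale of the sampling design, while any link to the response recorded at the original sites is broken.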


A comprehensive framework to assess multi‐species landscape connectivity

October 2024

·

114 Reads

Due to the central role of landscape connectivity in many ecological processes, evaluating and accounting for it has gained attention in both theoretical and applied ecological sciences. To address this challenge, researchers often use generic species to simplify multi‐species connectivity assessments. Yet, this approach tends to oversimplify movement behaviour, likely reducing realism and precision of connectivity model outputs. Also, the most widely used methods and theories for assessing landscape connectivity, namely circuit and network theories, have strong limitations. Finally, uncertainty or robustness estimates are rarely integrated in connectivity assessments. Here, we propose a versatile framework, which, instead of using arbitrarily defined generic species, first identifies species groups based on species' environmental niches and morphological, biological, and ecological traits. Second, it combines circuit and network theories to take the best of the two methods to assess landscape connectivity for those groups, while integrating uncertainties in modelling choices. Specifically, ecological continuities (i.e. landscape elements contributing to connectivity) are calculated for these groups and used together with group dispersal capacities to derive network‐based connectivity metrics for conservation areas. We detail our framework through a case study where we assess the connectivity of 1619 protected areas in metropolitan France for 193 vertebrate species. Our study revealed that both the protection of ecological continuities and the connectivity of protected areas for 11 mammal and 19 bird groups, respectively, were quite low, with variations among groups. Different protection types (i.e. national parks, reserves or prefectural orders) contributed unequally to the overall connectivity of group‐specific suitable habitats. Considering uncertainty propagation was crucial, as many connectivity metrics varied among repetitions.
The proposed framework combines different connectivity tools to provide a more relevant and comprehensive assessment of landscape connectivity. It can be used to inform the decision‐making process for spatial planning, particularly in the context of connectivity conservation and management, or support theoretical studies to better understand the ecological role of landscape connectivity. Its flexibility allows easy application under various environmental conditions, including future scenarios.


The outputs from each step of the BATS workflow. (1) Illustrates the raw data downloaded (Base Reflectivity in dBZ). (2) Shows the pixel‐wise classification of FTB occurrence as a binary 0–1 presence–absence. (3) Shows the final, aggregated FTB distribution across the study area within a 2‐week timeframe of June 2018, identifying where FTB spent time.
The data pipeline of the BATS algorithm highlights the various steps involved in downloading, processing, and classifying the radar data to the final aggregated model output.
Initial FTB emergence from multiple major colonies (red stars) and subsequent dispersion over a 19‐min period. The timezone is Pacific Standard Time. White pixels denote the probability of FTB above a threshold of 90% using the ANN. These images were classified using BATS from four radar scenes collected on August 19, 2020 from tower KDAX.
Aggregated FTB occurrence data over the central portion of the San Joaquin Valley, California. The aggregated data are scaled from 0 to 1 in terms of presence over the period of a week in August, 2017. The vertical scale is exaggerated to highlight areas of high occurrence.
BATS: Bat‐aggregated time series—A python‐based toolkit for landscape‐level monitoring of free‐tailed bats via weather radar

October 2024

·

18 Reads

The US operates a system of 160 S‐band Doppler weather radars known as NEXRAD (NEXt generation weather RADar) that continuously monitors the airspace around the majority of the United States and outlying territories. These radars detect and track birds, insects, and bats. Free‐tailed bats (genus Tadarida) provide considerable ecosystem services through their voracious insect consumption, but their movements and ecosystem service provision have historically been difficult to track and study in space and time. We introduce ‘BATS’, a Python toolkit that streamlines the process of downloading, classifying, and aggregating time series of free‐tailed bats across large landscapes. BATS retrieves data from NOAA's weather radar data repositories and classifies the processed radar data using a pre‐trained machine learning model that detects and classifies radar echoes associated with free‐tailed bats. We trained various machine learning approaches to classify pixels containing free‐tailed bats and compared their effectiveness. With an AUC of 0.963, the neural network approach is highly effective in identifying free‐tailed bats in NEXRAD data over our study sites in California and Texas. Furthermore, BATS is capable of quickly distilling 6 months of radar data from a single tower (3.5 Tb) into a single 15 Mb‐sized map of bat occurrence, contingent on available computing resources. BATS will help scientists and stakeholders identify areas of high bat occupancy at the landscape level over long periods of time. This ability has the potential to increase our understanding of the economic and agricultural value of these species.
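The classify‐then‐aggregate step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BATS API: `aggregate_occurrence` is a hypothetical helper, the per‐pixel probabilities are made up, the 0.9 threshold follows the 90% threshold mentioned in the figure captions, and averaging the binary maps is one plausible way to obtain a 0 to 1 occurrence scale.

```python
def aggregate_occurrence(prob_scans, threshold=0.9):
    """Threshold each scan's per-pixel probabilities into presence/absence,
    then average the binary maps over scans to get occurrence in [0, 1]."""
    if not prob_scans:
        raise ValueError("need at least one scan")
    n_rows, n_cols = len(prob_scans[0]), len(prob_scans[0][0])
    counts = [[0] * n_cols for _ in range(n_rows)]
    for scan in prob_scans:
        for r in range(n_rows):
            for c in range(n_cols):
                if scan[r][c] > threshold:
                    counts[r][c] += 1
    n_scans = len(prob_scans)
    return [[count / n_scans for count in row] for row in counts]

# Four hypothetical 2 x 2 radar scenes of classifier probabilities.
scans = [
    [[0.95, 0.10], [0.20, 0.99]],
    [[0.97, 0.05], [0.92, 0.30]],
    [[0.40, 0.02], [0.91, 0.96]],
    [[0.93, 0.01], [0.10, 0.98]],
]
occurrence = aggregate_occurrence(scans)
print(occurrence)  # [[0.75, 0.0], [0.5, 0.75]]
```

Each output pixel is the fraction of scans in which that pixel exceeded the threshold, which is how a long radar time series can collapse into a single compact map.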


The ecoEye camera design (commercial and open‐design versions). (a) Front view of the commercial version: Interchangeable lens, window for the internal light shield (not pictured), built‐in LED light pipes, power button, external sensor and power connectors. (b) Inner view of the commercial version: Main OpenMV and auxiliary boards (power management system at bottom, connector shield stacked on OpenMV board), lithium 18650 batteries. (c) Block diagram of the commercial version's system design, including further extensions not pictured in the previous panels, such as the WLAN and light shields. (d) The open‐design CAD case model (version 49) used in use cases A to E. Open‐design CAD standalone lens mount model shown at bottom, provided for DIY applications allowing the placement of the interchangeable lenses outside off‐the‐shelf boxes in a waterproof manner.
Six use cases depicting field applications of the ecoEye embedded vision camera. Middle panels for (b and c) were made from multiple pictures to depict the relative sizes of the diverse targets. Padding was added to the bounding boxes of the detections for better visibility of the targets. In left panels, calendar icons indicate deployment durations, and in middle panels, crosshairs represent field accuracies, measured with F1 scores. Accuracy values in (e) are based on the first deployment only. In the right panel of (d), the red line indicates the predicted temperature effect on bee detections, and the shaded grey ribbon depicts the upper and lower bounds of its 95% confidence interval. In the rightmost pane of (f), the blue line represents the smoothed relationship between time and flower counts (loess smoother, evaluated at 80 points with a span of 0.75), with its 95% confidence interval depicted as a grey ribbon.
Eyes on nature: Embedded vision cameras for terrestrial biodiversity monitoring

October 2024

·

112 Reads

We need comprehensive information to manage and protect biodiversity in the face of global environmental challenges, and artificial intelligence is required to generate that information from vast amounts of biodiversity data. Currently, vision‐based monitoring methods are heterogeneous; they poorly cover spatial and temporal dimensions, overly depend on humans, and are not reactive enough for adaptive management. To mitigate these issues, we present a portable, modular, affordable and low‐power device with embedded vision for biodiversity monitoring of a wide range of terrestrial taxa. Our camera uses interchangeable lenses to resolve barely visible and remote targets, as well as customisable algorithms for blob detection, region‐of‐interest classification and object detection to automatically identify them. We showcase our system in six use cases from ethology, landscape ecology, agronomy, pollination ecology, conservation biology and phenology disciplines. Using the same devices with different setups, we discovered bats feeding on durian tree flowers, monitored flying bats and their insect prey, identified nocturnal insect pests in paddy fields, detected bees visiting rapeseed crop flowers, triggered real‐time alerts for waterfowl and tracked flower phenology over months. We measured classification accuracies (i.e. F1‐scores) between 55% and 95% in our field surveys and used them to standardise observations over highly resolved time scales. Our cameras are amenable to situations where automated vision‐based monitoring is required off the grid, in natural and agricultural ecosystems, and in particular for quantifying species interactions. Embedded vision devices such as this will help address global biodiversity challenges and facilitate a technology‐aided agricultural systems transformation.
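For readers unfamiliar with the accuracy measure quoted above, the F1‐score is the harmonic mean of precision and recall. The short Python sketch below uses the standard definition; the detection counts are invented for illustration and do not come from the study.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        raise ValueError("F1 is undefined when there are no positives at all")
    return 2 * true_pos / denominator

# Hypothetical field check: 80 correct detections, 10 false alarms, 30 misses.
f1 = f1_score(true_pos=80, false_pos=10, false_neg=30)
print(f"{f1:.2f}")  # 0.80
```

Unlike raw accuracy, F1 ignores true negatives, which suits detection tasks where most image frames contain no target.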


Histogram of evolutionary covariance estimates for 1000 simulated datasets. Estimates of evolutionary covariance were obtained using (1) species means, (2) within‐species independent contrasts and (3) E‐PGLS. Inset is a dot plot between estimates from within‐species independent contrasts and E‐PGLS for all 1000 datasets.
Results from stochastic sampling experiments of multivariate phenotypic data containing structured patterns of intraspecific variation (with balanced data and $n_i = 4$) that differed across species and were thus species‐specific. (a) Plot of rejection rates for the intraspecific group effect ($F_{Gp}$) and the species‐specific intraspecific group effect ($F_{S \times Gp}$). Rejection rates represent type I error when input signal = 0, and statistical power when input signal > 0. Empirical sampling distributions for F‐statistics obtained via RRPP for 1000 simulations representing (b) intraspecific group effects ($F_{Gp}$), and (c) species‐specific intraspecific group effects ($F_{S \times Gp}$).
Results of comparisons of multivariate phenotypic vectors of sexual dimorphism in Cyprinodon pupfish. (a) One example of landmarks and Bézier curves from which semilandmarks were sampled, plus the aligned coordinates for the entire data set, representing shape variation. (b) Time‐calibrated phylogeny for the seven species. Horizontal bars represent the magnitude of sexual dimorphism displayed by each species. (c) PC plot of shape variation. Evolutionary trajectories of sexual dimorphism for all species are superimposed in the plot. (d) Thin‐plate spline deformation grids describing shape differences between males and females in C. nevadensis and C. variegatus (shape differences displayed at 2× magnification to facilitate visual interpretation of observed patterns).
Extending phylogenetic regression models for comparing within‐species patterns across the tree of life

October 2024

·

97 Reads

Evolutionary biologists characterize macroevolutionary trends of phenotypic change across the tree of life using phylogenetic comparative methods. However, within‐species variation can complicate such investigations. For this reason, procedures for incorporating nonstructured (random) intraspecific variation have been developed. Likewise, evolutionary biologists seek to understand microevolutionary patterns of phenotypic variation within species, such as sex‐specific differences or allometric trends. Additionally, there is a desire to compare such within‐species patterns across taxa, but current analytical approaches cannot be used to interrogate within‐species patterns while simultaneously accounting for phylogenetic non‐independence. Consequently, deciphering how intraspecific trends evolve remains a challenge. Here we introduce an extended phylogenetic generalized least squares (E‐PGLS) procedure which facilitates comparisons of within‐species patterns across species while simultaneously accounting for phylogenetic non‐independence. Our method uses an expanded phylogenetic covariance matrix, a hierarchical linear model, and permutation methods to obtain empirical sampling distributions and effect sizes for model effects that can evaluate differences in intraspecific trends across species for both univariate and multivariate data, while conditioning them on the phylogeny. The method has appropriate statistical properties for both balanced and imbalanced data. Additionally, the procedure obtains evolutionary covariance estimates that reflect those from existing approaches for nonstructured intraspecific variation. Importantly, E‐PGLS can detect differences in structured (i.e. microevolutionary) intraspecific patterns across species when such trends are present. 
Thus, E‐PGLS extends the reach of phylogenetic comparative methods into the intraspecific comparative realm, by providing the ability to compare within‐species trends across species while simultaneously accounting for shared evolutionary history.


Empirical datasets and their global distributions, including four BOLD‐downloaded datasets: (a) Spirurida (nematodes), (b) Scincidae (reptiles, Squamata), (c) Limacodidae (insects, Lepidoptera) and (e) Theretra (insects, Lepidoptera Sphingidae); and two field‐sampled or literature‐published datasets: (d) Zhejiang hawk‐moths (insects, Lepidoptera), (f) Dendrolimus (insects, Lepidoptera Lasiocampidae; Dai et al., 2012). Points in different colours represent different species.
Success rates of NBSI (blue bars) compared with traditional DNA barcoding (red bars) when testing (A) completely simulated datasets and (B) their corresponding ambiguous sequence datasets. The colour gradient of the bars, from light to dark, represents the thresholds of acceptance for identification results, set at 0.90, 0.95 and 0.99. The X‐axis lists the nine simulated datasets, each configured with a virtual species count of 20 (5 individuals per species), 50 (2 individuals per species) or 100 (1 individual per species), and further specified by the molecular mutation rate, with coalescent parameters (θ) set at 0.05, 0.1 and 0.2; the Y‐axis indicates the percentage of successful identifications (true positive and true negative assignments) among 1000 leave‐one‐out cross‐validation tests. The error bars indicate 95% confidence intervals, and different lowercase letters suggest significant differences at the level of 0.05 using Tukey's ‘Honest Significant Differences’ test (Miller, 1981; Yandell, 1997).
Success rates of NBSI (blue bars) compared with traditional DNA barcoding (coral bars) when testing (A) completely empirical datasets and (B) their corresponding ambiguous sequence datasets. The colour gradient of the bars, from light to dark, represents the thresholds of acceptance for identification results, set at 0.90, 0.95 and 0.99. The X‐axis lists the six empirical datasets, including four BOLD‐downloaded datasets, one field‐sampled dataset and one literature‐published dataset; the Y‐axis indicates the percentage of successful identifications (true positive and true negative assignments) among 1000 leave‐one‐out cross‐validation tests. The error bars indicate 95% confidence intervals, and different lowercase letters suggest significant differences at the level of 0.05 using Tukey's ‘Honest Significant Differences’ test (Miller, 1981; Yandell, 1997).
Improvements of success rates from traditional DNA barcoding to the NBSI framework in the (a) simulated and (b) empirical datasets. The X‐axis indicates proportions of ambiguous sequences and the Y‐axis indicates improvements of success rates. Correlations were measured by Spearman's rank correlation coefficient (Hollander & Wolfe, 1973).
Ecological space of some of the genetically closely related species in the empirical datasets: (a) Spirurida (nematodes), (b) Scincidae (reptiles, Squamata), (c) Limacodidae (insects, Lepidoptera), and (e) Theretra (insects, Lepidoptera Sphingidae); and two field‐sampled or literature‐published datasets: (d) Zhejiang hawk‐moths (insects, Lepidoptera), (f) Dendrolimus (insects, Lepidoptera Lasiocampidae; Dai et al., 2012).
Environmental niche models improve species identification in DNA barcoding

October 2024

·

90 Reads

Advances in DNA barcoding have greatly benefited global biodiversity research in the last two decades. However, inherent limitations in barcode sequences, such as hybridization, introgression or incomplete lineage sorting, can lead to misidentifications when relying solely on barcode sequences. Here, we propose a new Niche‐model‐Based Species Identification (NBSI) method based on the idea that species distribution information is a potential complement to DNA barcoding species identifications. NBSI performs species membership inference by incorporating niche modelling predictions and traditional DNA barcoding identifications. Systematic tests across diverse scenarios show significant improvements in species identification success rates under the newly proposed NBSI framework, where the largest increase is from 4.7% (95% CI: 3.51%–6.25%) to 94.8% (95% CI: 93.19%–96.06%). Additionally, clear improvements were observed when using NBSI on potentially ambiguous sequences whose genetic nearest neighbours belong to another species or more than two species, which occurs commonly with species represented by single or short DNA barcodes. These results support our assertion that environmental variables are valuable complements to DNA sequence data for species identification by avoiding potential misidentifications inferred from genetic information alone. The NBSI framework is currently implemented as a new R package, ‘NicheBarcoding’, that is open source under GNU General Public Licence and freely available from https://CRAN.R‐project.org/package=NicheBarcoding.


Model‐based unconstrained ordination plots based on (a) ordered beta, and (b) hurdle beta generalised linear latent variable models (GLLVMs) fitted to the vascular plant cover dataset. Sites are coloured according to their peatland type. These clear clusters in the latent variable scores would dissipate if the peatland type was included in the GLLVM as a covariate.
Means and standard deviations of the Procrustes errors between the predicted and the true latent variable scores, where multivariate percent cover data were generated using the: (a) ordered beta generalised linear latent variable model (GLLVM), and (b) hurdle beta GLLVM. A trimming factor of 5% was used to remove effects of the most extreme values resulting from extremely volatile fits. The points are slightly jittered to avoid visual overlap.
Mean absolute error of prediction as a function of mean species prevalence for beta, hurdle beta, and ordered beta generalised linear latent variable models, across the four real multivariate percent cover datasets.
Root mean square error of prediction as a function of mean species prevalence for beta, hurdle beta, and ordered beta generalised linear latent variable models, across the four real multivariate percent cover datasets.
Area under the curve (top row) and Tjur's R² (bottom row) as a function of recorded group mean prevalence for the four real multivariate percent cover datasets. Recorded group mean prevalence was obtained by clustering species based on their recorded prevalences in the complete dataset into a small number of groups, and then calculating the mean prevalence of each group. The y‐axes present the corresponding metric for each group.
A comparison of joint species distribution models for percent cover data

October 2024

·

37 Reads

Joint species distribution models (JSDMs) have gained considerable traction among ecologists over the past decade, due to their capacity to answer a wide range of questions at both the species and the community level. The family of generalised linear latent variable models in particular has proven popular for building JSDMs, being able to handle many response types including presence‐absence data, biomass, overdispersed and/or zero‐inflated counts. We extend latent variable models to handle percent cover response variables, with vegetation, sessile invertebrate and macroalgal cover data representing the prime examples of such data arising in community ecology. Sparsity is a commonly encountered challenge with percent cover data. Responses are typically recorded as percentages covered per plot, though some species may be completely absent or present, that is, have 0% or 100% cover, respectively, rendering the beta distribution alone inadequate. We propose two JSDMs suitable for percent cover data, namely a hurdle beta model and an ordered beta model. We compare the two proposed approaches to a beta distribution for shifted responses, transformed presence‐absence data and an ordinal model for percent cover classes. Results demonstrate that the hurdle beta JSDM was generally the most accurate at retrieving the latent variables and predicting ecological percent cover data.
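To make the hurdle idea concrete, the following Python sketch evaluates a hurdle beta log‐likelihood for a single species: point masses handle exact 0 and 1 cover, and a beta density handles everything in between. The mean/precision parameterisation (`mu`, `phi`), the constant hurdle probabilities and the function names are assumptions made for illustration; the models in the paper additionally involve latent variables and may be parameterised differently.

```python
import math

def beta_logpdf(y, mu, phi):
    """Log-density of a beta distribution with mean mu and precision phi,
    i.e. shape parameters a = mu * phi and b = (1 - mu) * phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)

def hurdle_beta_loglik(cover, p_zero, p_one, mu, phi):
    """Log-likelihood of proportions in [0, 1] under a hurdle beta model:
    P(y = 0) = p_zero, P(y = 1) = p_one, and otherwise a beta density
    weighted by the remaining probability 1 - p_zero - p_one."""
    p_between = 1.0 - p_zero - p_one
    total = 0.0
    for y in cover:
        if y == 0.0:
            total += math.log(p_zero)
        elif y == 1.0:
            total += math.log(p_one)
        else:
            total += math.log(p_between) + beta_logpdf(y, mu, phi)
    return total

# Hypothetical cover proportions for one species across five plots.
cover = [0.0, 0.25, 0.6, 1.0, 0.0]
loglik = hurdle_beta_loglik(cover, p_zero=0.4, p_one=0.1, mu=0.4, phi=5.0)
```

A plain beta likelihood would fail on this data because `log(0)` appears at the boundaries, which is the inadequacy the hurdle component addresses.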


Illustration showing how biologging measurements and animal welfare assessment criteria may affect the balance between the advantages and limitations of biologging assessment of animal welfare. Advantages and limitations have been categorised following the welfare assessment criteria described in Browning (2022) and Browning et al. (2024). The expected relative difference between wild and non‐wild animals is shown in grey for each category. The definitions of welfare assessment criteria (correctness, usefulness, feasibility) are provided in Table 3.
Theoretical scenarios explaining the relationship between an injury to the locomotory system and reduced activity in a wild boar, who was injured while being monitored with an accelerometer. Depending on its location and intensity, this injury could have led to no change in the activity of the animal (a) even though it might be painful (b). Alternatively, this injury could reduce the activity of the animal by increasing the pain experienced by the animal (c) or by reducing its physical capacity (d). These effects may be interrelated (e) and result in other negative states (e.g. hunger, fear; f) and may vary across time (e.g. initial pain progressively disappearing with time).
Example of a biologging measurement illustrating the contrasting behavioural responses to disturbance, that is, fireworks, observed in a population of wild boar living in suburban areas. The upper panels (a–c) represent the diurnal activity (hourly means ± SE, centred on midnight) of three individual female wild boars, as measured by accelerometers over a period including days without (blue line; activity averaged over the first 5 days of December) and with fireworks (New Year's Eve; pink line). Mean activity scores were calculated as hourly averages of activity scores of the X‐axis of dual‐axis accelerometers recording activity on a continuous range (0–255) every five minutes. In the upper panels, the grey windows highlight the 3 h after midnight when firing was considered to be maximal. During these three hours, the relative difference in activity of 14 wild boars equipped with accelerometers (including the three individuals represented in a–c) was calculated between nights without fireworks and New Year's Eve. The distribution of activity changes is represented in panel (d) and shows a diversity of behavioural responses between the different individuals being studied (blue bars: Decreased activity (as in a); purple bar: Stable activity (as in b); pink bars: Increased activity (as in c)). Two‐axis accelerometers together with GPS units (Vectronic Aerospace GmbH) were deployed as part of a broader study examining the movement and activity of wild boar in the Czech Republic (Project ‘Behavioural reaction of free‐living wild boar on measures realised against spreading of African swine fever virus’; trapping and handling approved by Ministry of the Environment of the Czech Republic; number MZP/2019/630/361). More information on the equipment procedure can be found in Supplementary Material.
Plugging biologging into animal welfare: An opportunity for advancing wild animal welfare science

October 2024

·

114 Reads

Animal welfare science is currently expanding beyond its traditional boundaries, from captive animals to those living in the wild. This current development is conceptually and methodologically challenging, but it could benefit from adjacent and more established research fields. Among these fields, biologging appears to be a strong candidate, as most intrinsic, location and environmental variables collected through biologging approaches could be used to assess animal welfare in the wild. To provide an objective view of the suitability of biologging to assess wild animal welfare, biologging was evaluated against the criteria that are currently recommended to assess animal welfare. This evaluation shows that biologging approaches could enhance animal welfare assessments in terms of completeness, informativeness and feasibility in the wild. However, their full implementation may be complicated by limitations in terms of validity, representativeness and disturbance, and by the different welfare perspectives taken by wildlife biologists using biologging approaches and animal welfare biologists. To exploit the full potential that biologging approaches could offer to assess wild animal welfare, their current limitations need to be overcome. Towards this end, recommendations are explicitly provided to enhance the validity and the representativeness of biologging measurements as welfare indicators, while reducing disturbance. To increase the visibility and the impact of biologging studies examining wild animal welfare, we also encourage wildlife biologists using biologging approaches to adopt the same language and perspectives as those used by animal welfare biologists. If current limitations are overcome, biologging is likely to be instrumental for the future study of animal welfare in the wild. 
Reciprocally, integrating animal welfare in biologging studies is expected to have a great impact on the whole biologging field by extending its current scope to a new and promising research area.


Methodological overview within TreeCompR. Left: Functions and competition indices (CIs) related to input of original point clouds from close‐range remote sensing such as terrestrial or mobile laser scanning (TLS/MLS). Right: Functions and CIs related to input of inventory data that can be derived by various data collection methods such as ground‐based inventory, TLS/MLS or airborne laser scanning. R package recommendations for preprocessing the point clouds are written in italics.
Functionality of compete_pc() with filtering of the competing neighbourhood (left) and the ‘cone’ or ‘cylinder’ method (right).
(a) Distribution of the distance‐dependent competition indices (CIs) derived from mobile laser scanning and three different search radii (1×, 2×, 3× average crown radius of 4.5 m) for the 307 target trees. (b) Comparison of CI values for cone and cylinder method with different cone heights (50% and 60% of target tree's height) and different cylinder radii (1× and 2× crown radius) and the tree position methods base position (bp) and crown position (cp).
Correlation plot of target trees' heights (h, m), diameter at breast height (DBH, m), crown projection area (CPA, m²), box dimension (Db) and the four competition indices (CIs) quantified using the TreeCompR package. CIs are log‐transformed, all variables centred and scaled by standard deviation. Shown are the Pearson correlation coefficients (upper triangle, shaded by correlation strength), density plots of the variables (diagonal) and scatter plots overlaid with a linear model fit (blue line) with its 95% confidence intervals (grey ribbon). Figure created using corrmorant v0.0.0.9007 (Link, 2020).
(a) Distribution of height‐distance‐dependent CIBraathe derived from airborne (ALS) and mobile laser scanning (MLS) and three different search radii (1×, 2×, 3× average crown radius of 4.5 m) for the 307 target trees. A paired t‐test at a significance level of 0.05 shows differences between ALS‐derived (dark blue) and MLS‐derived (light blue) CIs. (b) Comparison of number of competitors and (c) tree heights (in m) within search radius of 13.5 m around the target trees derived from ALS versus MLS data. Different colours indicate different forest sites.
TreeCompR: Tree competition indices for inventory data and 3D point clouds

October 2024

·

228 Reads

In times of more frequent global change‐type droughts and associated tree mortality events, competition release is one silvicultural measure discussed to have an impact on the resilience of managed forest stands. Understanding how trees compete with each other is therefore crucial, but different measurement options and competition indices (CI) leave users with a difficult choice, as no single competition index has proven universally superior. To help users with the choice and computation of appropriate indices, we present the open‐source TreeCompR package, which handles 3D point clouds and classical forest inventory data, enabling the calculation of both innovative point cloud‐based indices and traditional distance‐dependent indices. It serves as a centralized platform for exploring and comparing different CIs, allowing users to test and select the most suitable CI for their specific research questions within a common interface. To evaluate the package, we used TreeCompR to quantify the competition situation of 307 European beech trees from 13 sites in Central Europe. Based on this dataset, we discuss the interpretation, comparability and sensitivity of the different indices to their parameterization and identify possible sources of uncertainty and ways to minimize them. The compatibility of TreeCompR with different data formats and different data collection methods makes it accessible and useful for a wide range of users, specifically ecologists and foresters. Due to the flexibility in the choice of input formats as well as the emphasis on tidy, well‐structured output, our package can easily be integrated into existing data‐analysis workflows both for 3D point cloud and classical forest inventory data.
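To illustrate what a distance‐dependent index measures, the following sketch computes two classical indices for a single target tree from inventory‐style data. It is written in Python rather than R and does not use or reproduce the TreeCompR interface: the function name `competition_indices`, the dictionary layout and the commonly cited forms of the Hegyi and Braathe indices (size ratio divided by distance, summed over neighbours) are assumptions made for this example.

```python
import math

def competition_indices(target, neighbours, search_radius):
    """Return (hegyi, braathe) for one target tree.

    target and neighbours are dicts with x, y (m), dbh (m) and height (m).
    Assumed formulas (not taken from the abstract):
      Hegyi   = sum(dbh_j / (dbh_i * dist_ij))
      Braathe = sum(h_j   / (h_i   * dist_ij))
    over all neighbours j within search_radius of target i.
    """
    hegyi = 0.0
    braathe = 0.0
    for nb in neighbours:
        dist = math.hypot(nb["x"] - target["x"], nb["y"] - target["y"])
        if dist == 0.0 or dist > search_radius:
            continue  # skip the target itself and trees outside the radius
        hegyi += nb["dbh"] / (target["dbh"] * dist)
        braathe += nb["height"] / (target["height"] * dist)
    return hegyi, braathe

# Hypothetical trees; 13.5 m is three times the 4.5 m average crown radius
target = {"x": 0.0, "y": 0.0, "dbh": 0.40, "height": 30.0}
neighbours = [
    {"x": 3.0, "y": 4.0, "dbh": 0.20, "height": 15.0},   # 5 m away
    {"x": 10.0, "y": 0.0, "dbh": 0.40, "height": 30.0},  # 10 m away
    {"x": 30.0, "y": 0.0, "dbh": 0.80, "height": 35.0},  # outside the radius
]
hegyi, braathe = competition_indices(target, neighbours, search_radius=13.5)
```

Widening the search radius can only add neighbours, so both indices grow with it, which is one reason such indices are sensitive to their parameterization.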


Low‐cost animal tracking using Bluetooth low energy beacons on a crowd‐sourced network

October 2024

·

49 Reads

Animal tracking has opened the door to address many fundamental questions in ecology and conservation. Whilst historically animals have been tracked as a means to understand their large‐scale movements, such as migration, there is now a greater focus on using tracking to study movements over smaller scales, individual variation in movement or how movements shape social network structure. With this shift in focus also comes different tracking needs, including the need to track larger numbers of individuals. Tracking studies all face some technological limitations. For example, GPS and other active tracking solutions can collect fine‐scale movement data, but have a high cost per tag, limiting the number of individuals that can be followed. They also have high energy costs of data acquisition and download, limiting the time periods over which data can be collected. Low‐energy passive (e.g. PIT) or active (e.g. reverse GPS) tags can overcome these limitations, but instead require animals to remain within a bounded study area or to come into close proximity to detectors. Here we describe one solution that can overcome many current limitations by employing the massive global network of personal mobile phones as gateways for tracking animals using Bluetooth low‐energy (BLE) beacons. In areas with medium to high density of people, these simple‐to‐make beacons can provide regular updates of position over long time periods (battery life 1–3 years). We describe how to use off‐the‐shelf components to produce BLE beacons that weigh c. 5–6 g and cost <US$7. Using field‐testing, we then show that beacons are capable of producing high‐frequency tracking data that can be used to build home ranges or to detect spatiotemporal co‐occurrences among individuals. BLE beacons are a low‐cost, low‐energy solution for studying organisms (e.g. birds, mammals and reptiles) living and moving in urban landscapes. 
Their low weight and small size make them particularly well‐suited for tracking smaller species. When combined with fixed gateways, their use can also be extended to non‐urban habitats. Their high accessibility is likely to make them an attractive solution for many research projects.


Estimated locations (points) for a Wood Thrush male tracked for a 6‐week period in southeastern Virginia, USA using radio telemetry. Sequentially recorded points are connected by lines. Results from a mechanistic range shift analysis indicated the spatial locations recorded were from two unique territories, with 2 points recorded during a transitional period between territories. Routine movements within the first territory led to temporary emigration from the hypothetical sampling site (represented by a 100 m radius point count circle), while the later‐season territory shift led to a more permanent emigration from the sampling site. Imagery shown is publicly available from the 2020 National Agriculture Inventory Program (U.S. Department of Agriculture Farm Production and Conservation—Business Center, Geospatial Enterprise Operations).
True occupancy rates for our simulations were calculated based on the distribution of birds in space and time relative to sampling sites. This simplified diagram shows locations of two birds at six snapshots in time (minutes) over 2 days. Here, we assume these six snapshots represent the entirety of the simulation and demonstrate how we calculated these true occupancy values (instantaneous, daily and seasonal occupancy) as an emergent property from the simulation.
Violin plots showing the distribution of mean occupancy estimates (averaged across 1000 simulations) and variance in occupancy estimates (among 1000 simulations) stratified across (a) spatial, and (b) temporal survey protocol characteristics. Each grey dot represents an estimate generated by one of the 162 sampling protocols, and lines connect estimates from protocols that are identical except for the characteristic displayed on the x‐axis. Results represent sampling a Wood Thrush population at an intermediate density of 0.1 males/ha. See Figure 5 for estimates of bias and mean‐squared error.
Violin plots showing the distribution of mean occupancy estimates (averaged across 1000 simulations) and variance in estimates (among 1000 simulations) between occupancy models and logistic regression models fit to data collected on intermediate density (0.1 males/ha) Wood Thrush populations. Data were collected using 162 sampling protocols (grey dots) that varied in spatial and temporal sampling characteristics. Lines connect estimates from different types of models fit to the same dataset.
Bias (95% confidence interval, black) and mean‐squared error (red) in estimates from fitted occupancy and logistic regression models (n = 1000 simulations) across protocols with variable temporal survey characteristics. We calculated bias relative to true instantaneous occupancy (a), daily occupancy (b) and seasonal occupancy (c). Patterns in bias varied little with spatial sampling characteristics (i.e. point placement, survey site radius or Wood Thrush density), so we report results from systematically placed 100 m radius survey sites sampling a Wood Thrush population at an intermediate density of 0.1 males/ha.
Thinking beyond the closure assumption: Designing surveys for estimating biological truth with occupancy models

October 2024

·

60 Reads

Occupancy models estimate distributions of imperfectly detected species, but violations of the closure assumption can bias results. However, researchers working with mobile animals may find it impossible to eliminate such violations. Here, we tested the hypothesis that occupancy models fit to realistic sampling data can generate unbiased occupancy estimates for an itinerant Wood Thrush (Hylocichla mustelina) population. In 2013 and 2014, we tracked movements of 41 breeding Wood Thrush males. We modelled territory shift probabilities using logistic exposure models and within‐territory movements using continuous‐time stochastic process models. We then constructed an individual‐based model, simulated (1000 iterations) spatiotemporal locations for individuals and simulated sampling these populations using 162 different point count protocols with variable spatial (sampling radius and point placement method), and temporal (survey length, between‐survey intervals and number of surveys) characteristics. We compared occupancy estimates with true values of instantaneous, daily and seasonal occupancy from the simulations. We parameterized continuous time stochastic process models based on movements within 34 unique territories and estimated a daily territory shift probability of 0.0099 (95% CI: 0.0060, 0.0152). Simulated data indicated that estimates of occupancy ranged from 0.18 (0.06, 1.00) to 0.80 (0.71, 0.89) depending on protocol characteristics. Occupancy estimates increased with increasing survey radius, survey length and between‐survey interval. Protocols using shorter surveys and between‐survey intervals were good estimators for instantaneous occupancy (low bias and mean‐squared error) but poor estimators for daily and seasonal occupancy; longer surveys and intervals generated unbiased estimators of daily occupancy but underestimated seasonal occupancy. 
Logistic regression models that ignored imperfect detection outperformed occupancy models for estimating instantaneous occupancy but not daily or seasonal occupancy. For mobile animals, occupancy of sampling sites changes in space and time. Consequently, the spatial and temporal aspects of a sampling protocol have strong, but predictable, effects on occupancy model parameter estimates. Our results demonstrate that how these factors interact is critical for designing surveys that produce occupancy estimates representative of the biological process of interest to a researcher.
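The three notions of true occupancy compared above can be made concrete with a small calculation in the spirit of the six‐snapshot diagram. This is only a sketch under assumed definitions (stated in the docstring); the function `true_occupancy` and the presence flags are hypothetical and are not taken from the authors' simulation code.

```python
def true_occupancy(presence):
    """presence[site][day] is a list of 0/1 flags, one per snapshot in that day.

    Assumed definitions for this sketch:
      instantaneous: mean over snapshots of the share of sites occupied
      daily:         mean over days of the share of sites occupied
                     at least once during that day
      seasonal:      share of sites occupied at least once in the whole period
    All sites are assumed to have the same days and snapshots.
    """
    n_sites = len(presence)
    n_days = len(presence[0])
    n_snaps = len(presence[0][0])

    instantaneous = sum(
        flag for site in presence for day in site for flag in day
    ) / (n_sites * n_days * n_snaps)
    daily = sum(
        any(day) for site in presence for day in site
    ) / (n_sites * n_days)
    seasonal = sum(
        any(any(day) for day in site) for site in presence
    ) / n_sites
    return instantaneous, daily, seasonal

# Two hypothetical sites, 2 days, 3 snapshots per day (six snapshots in total)
presence = [
    [[1, 0, 0], [0, 0, 0]],  # site A: a bird is present in one snapshot on day 1
    [[0, 0, 0], [0, 0, 0]],  # site B: never occupied
]
instantaneous, daily, seasonal = true_occupancy(presence)
```

In this toy case the same movement data give an instantaneous occupancy of 1/12, a daily occupancy of 1/4 and a seasonal occupancy of 1/2, which shows why the target quantity has to be chosen before a survey protocol is judged biased or unbiased.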


Environmental DNA sample mixture mapping at large scale. (a) Bar plots representing fish samples (horizontal lines) for varying numbers of sources (K) from 3 to 9. Each source is depicted with a different colour. (b) World map with a pie plot illustrating the sample mixture for K = 6. Samples cluster into six pools corresponding to distinct biogeographic regions: the Tropical southwestern Pacific (yellow), the Western Coral Triangle (pink), the Mediterranean Sea (green), the Tropical northwestern Atlantic (orange), the western Indian Ocean (purple) and the Scotia Sea (blue).
Frequencies of molecular operational taxonomic units from environmental DNA samples in each source for the large‐scale dataset: (a) Bar plot representing the frequencies of 2888 fish molecular operational taxonomic units (MOTUs) in each source for K = 6. (b) Bar plot for a selection of MOTUs (56 out of 2888) with a cumulative frequency exceeding 30% in (a). For these selected MOTUs, we provide the best taxonomic assignments. (c) The colour association with sources is derived from the assemblage results presented in Figure 1, for which we report the bar plot for K = 6.
Most differentiating molecular operational taxonomic units (MOTUs) detected in the large‐scale environmental DNA dataset: (a) Plot of −log10(p‐values) for the large‐scale dataset. The horizontal line represents an expected false discovery rate of q = 10⁻³⁰. The taxonomy of the nine most differentiated MOTUs is reported. (b) Table reporting the 29 MOTUs above the q threshold and their corresponding taxonomic identification. (c) Representation of the top three −log10(p‐values) and the corresponding fish distribution based on FishBase and Aquamaps. The geographic areas corresponding to our samples are highlighted in red rectangles.
Mixture mapping and molecular operational taxonomic unit (MOTU) composition at a local scale: the Mediterranean Sea. (a) Bar plots illustrating the composition of fish samples for varying numbers of sources (K) from 3 to 7 in the Mediterranean Sea. As K increases, the different pools are associated with the continent‐island gradient (K = 3) and the latitudinal and longitudinal gradients (K = 4–6). (b) At K = 5, distinct regions are identified: Balearic Islands (orange), Banyuls (purple), Carry‐le‐Rouet and Riou (yellow), Porquerolles (green), Cap Roux (pink) and Corsica (blue). (c) At K = 7, the effect of protection within each specific geographic region is observed, with local variation in composition between samples from no‐take reserves and those from fished areas within each region. On the left, in the geographic map, reserve samples are represented by larger circles, and their coordinates are indicated by asterisks. On the right, a bar plot for K = 7 shows samples sorted by latitude. Bars corresponding to samples from reserves are indicated by an arrow and an asterisk.
A spatial matrix factorization method to characterize ecological assemblages as a mixture of unobserved sources: An application to fish eDNA surveys

October 2024

·

79 Reads

Understanding how ecological assemblages vary in space and time is essential for advancing our knowledge of biodiversity dynamics and ecosystem functioning. Metabarcoding of environmental DNA (eDNA) is an efficient method for documenting biodiversity changes in both marine and terrestrial ecosystems. However, current methods fail to detect and display the biodiversity structure within and between eDNA samples, limiting ecological and biogeographical interpretations. We present a spatial matrix factorization method that identifies optimal eDNA sample assemblages, called pools, assuming that taxonomic unit composition is based on a fixed number of unknown sources. These sources, in turn, represent taxonomic units sharing similar habitat properties or characteristics. The method aims to reduce the multi‐taxa composition structure into a low number of dimensions defined by these sources. This method is inspired by admixture analysis in population genetics. Using a marine fish eDNA survey on 263 sampling stations detecting 2888 molecular operational taxonomic units (MOTUs), we apply this method to analyse the biogeography and mixing patterns of fish assemblages at regional and large scales. At large scale, our analysis reveals six primary pools of fish samples characterized by distinct biogeographic patterns, with some mixtures between these pools. We identify pools composed of unique sources, corresponding to distinct and more isolated regions such as the Mediterranean and Scotia Seas. We also identify pools composed of a greater mix of sources, corresponding to geographically connected areas, such as tropical regions. Additionally, we identify the taxa underpinning the formation of each pool. In the regional analysis of Mediterranean eDNA samples, our method successfully identifies different pools, allowing the detection of not only geographic gradients but also human‐induced gradients corresponding to protection levels. 
Spatial matrix factorization adds a new method in community ecology, where each sample is considered as a mixture of K unobserved sources, to assess the dissimilarity of ecological assemblages revealing environmental and human‐induced gradients. Beyond the study of fish eDNA samples, this method has the potential to shed new light on any biodiversity survey and provide new bioindicators of global change.
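The idea that each sample is a mixture of K unobserved sources can be illustrated with the forward model alone: given mixture weights and per‐source taxon frequencies, the expected composition of a sample is their weighted sum. The sketch below shows only this step, with hypothetical numbers and a hypothetical helper `mix`; it does not estimate the weights or sources from data and is not the spatial factorization algorithm described in the abstract.

```python
def mix(weights, sources):
    """Expected taxon frequencies of one sample.

    weights[k]    : share of source k in the sample (must sum to 1)
    sources[k][t] : frequency of taxon t in source k (hypothetical values)
    Returns sum_k weights[k] * sources[k][t] for every taxon t.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    n_taxa = len(sources[0])
    return [
        sum(w * src[t] for w, src in zip(weights, sources))
        for t in range(n_taxa)
    ]

# K = 2 hypothetical sources described by the frequencies of 3 taxa
sources = [
    [0.7, 0.3, 0.0],
    [0.0, 0.2, 0.8],
]
pure_sample = mix([1.0, 0.0], sources)   # sample drawn from a single source
mixed_sample = mix([0.5, 0.5], sources)  # equal mixture of both sources
```

A sample with all its weight on one source reproduces that source's profile, whereas a sample from a geographically connected area would be expected to carry intermediate weights; fitting a model of this kind amounts to recovering both the weights and the source profiles from the observed compositions.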


An aerial view of the Forest and Biodiversity (FAB2) experiment, showing the combination of 100 and 400 m² plots, adjacent to the smaller FAB1 experiment and the BioDIV grassland experiment in October 2023 at the Cedar Creek Ecosystem Science Reserve. Differences in colour and size of evergreen conifers and deciduous angiosperms in FAB2 are evident.
The Forest and Biodiversity (FAB2) experiment is designed to test consequences of (a) tree species and lineages and of (b) multiple dimensions of forest diversity for ecosystem functions and for other trophic levels, and (c) to uncover the mechanisms underlying these effects. (a) The monocultures in FAB2 enable tests of species (and lineage) effects. Species vary in nutrient acquisition and use strategies, litter properties, wood and hydraulic properties, defence chemistry, symbiotic relationships, growth rates and resistance to stress, all of which may influence their fitness (response traits) as well as their effect on the environment around them (effect traits, sensu Lavorel & Garnier, 2002). We thus expect forest plots that differ in which species are planted to vary in productivity, microclimate, soil texture and pH, decomposition rates and carbon/nutrient cycling, dominant foliar herbivores, and soil biota such as worms, bacteria and fungi, including dominant mycorrhizal types associated with tree roots. This conceptual figure illustrates the predicted influence on the environment of species belonging to one of three lineages, indicated by clusters of similar colours. Greater variation in ecosystem processes is expected among lineages that are phylogenetically or functionally dissimilar (plot colours are more distinct) than among closely related species that have more shared ancestry and/or species that are more functionally similar (plot colours are more similar). Hence, we expect closely related and/or functionally similar species to exhibit similar patterns in productivity, decomposition and carbon/nutrient cycling, dominant herbivores, mycorrhizal type and woody hydraulic properties important for productivity and resistance to drought. These hypotheses are illustrated with different coloured bars underneath the plots to indicate variation in the ecosystem and trophic‐level consequences of different plant species and lineages. 
Grayscale bars indicate a gradient. (b) Through its nested dimensions of diversity (e.g. variation in functional and phylogenetic diversity within species richness levels), FAB2 also enables tests of forest diversity effects on above‐ and belowground productivity; herbivore composition, diversity and abundance, and their feedbacks to ecosystem productivity; and resistance and resilience of ecosystems through time. Various monocultures (left), bicultures of different phylogenetic and functional similarity (middle) and higher diversity treatments (right) are shown in (b) as examples to illustrate the consequences of tree diversity treatments, indicated below with directional arrows. With greater tree diversity, mean productivity is expected to increase, variance in decomposition and nutrient cycling is expected to decrease, local insect diversity is expected to increase and resistance to drought and biomass stability are expected to increase. (c and d) The replicated monocultures and range of mixtures provide a means to decipher potential roles of complementarity and selection effects as mechanisms by which diversity influences ecosystem functions and to examine how they may emerge from shifts in species interactions, including facilitation and niche partitioning. Included in the tree diversity experiment are twelve tree species native to Minnesota that span a wide range of lineages and functional traits. From left to right: Quercus macrocarpa, Q. alba, Q. rubra, Q. ellipsoidalis, Betula papyrifera, Acer rubrum, A. negundo, Juniperus virginiana, Pinus resinosa, P. banksiana, P. strobus. Some of the potential mechanisms underlying species and lineage effects associated with the experimental design are illustrated in (c), which shows foliar and wood traits differing among species and lineages with consequences for ecosystem processes and other trophic levels, including soil biota. 
The experimental design enables the study of (i) species differences in plant function and intrinsic growth rates, (ii) host specificity and co‐evolutionary acquisition of symbionts, including bacterial and fungal partners, and (iii) the deep evolutionary divergence in wood and leaf structural properties and their consequences for ecosystem processes. (d) depicts some of the potential mechanisms underlying tree diversity effects that can be studied in the experiment. These effects include (iv) dilution effects, (v) phenological offsetting in light and nutrient use, and (vi) facilitation through shading and soil moisture maintenance. Detailed hypotheses and definitions are provided in the Supporting Information.
Experimental design of the Forest and Biodiversity Experiment 2 at the Cedar Creek Ecosystem Science Reserve. (a) A LiDAR image during summer 2022. Colour images show the spatial arrangement of (b) species composition of plots with monocultures, (c) monocultures, PV‐FV mixtures, 12‐species mixtures and the oak species mixtures within the Oak‐DIV nested experiment, (d) phylogenetic variability within plots, (e) number of species in each plot and (f) functional variability within plots.
Percent mortality of trees. (a) Average mortality by species in all plots. Mortality by species in (b) 100 m² monoculture plots, (c) 400 m² monoculture plots, (d) 100 m² mixtures (any plot with more than one species) and (e) 400 m² mixtures. Mean values per species are shown for each year, with standard error confidence intervals. Mortality percentages were calculated to include any trees newly or previously planted in the plots. The 100 m² plots were planted in 2016 and the 400 m² plots in 2017; these were replanted as necessary for 3 years.
Tree mortality by species or plot composition in relation to plot diversity. In the top three panels, percent mortality per species averaged over time is shown in relation to (a) species richness level, (b) phylogenetic diversity, calculated as the sum of the phylogenetic branch lengths of all species in the assemblage (Faith's PD) with monocultures shown as zero phylogenetic diversity and (c) functional diversity calculated as the sum of functional distances for a suite of eight traits, with monocultures shown as zero functional diversity. Linear models are fit to each species, shown as different coloured lines. Solid lines are angiosperms and dashed lines are gymnosperms. In the bottom three panels, mean percent mortality by assemblage averaged across time is shown in relation to (d) species richness, (e) phylogenetic diversity and (f) functional diversity. The proportion of angiosperm tree species in each plot is colour indicated, with increasing proportion of angiosperms shown as more orange and greater proportion of conifers as more green.
Forest and Biodiversity 2: A tree diversity experiment to understand the consequences of multiple dimensions of diversity and composition for long‐term ecosystem function and resilience

October 2024

·

140 Reads

We introduce a new “ecosystem‐scale” experiment at the Cedar Creek Ecosystem Science Reserve in central Minnesota, USA to test long‐term ecosystem consequences of tree diversity and composition. The experiment—the largest of its kind in North America—was designed to provide guidance on forest restoration efforts that will advance carbon sequestration goals and contribute to biodiversity conservation and sustainability. The new Forest and Biodiversity (FAB2) experiment uses native tree species in varying levels of species richness, phylogenetic diversity and functional diversity planted in 100 m² and 400 m² plots at 1 m spacing, appropriate for testing long‐term ecosystem consequences. FAB2 was designed and established in conjunction with a prior experiment (FAB1) in which the same set of 12 species was planted in 16 m² plots at 0.5 m spacing. Both are adjacent to the BioDIV prairie‐grassland diversity experiment, enabling comparative investigations of diversity and ecosystem function relationships between experimental grasslands and forests at different planting densities and plot sizes. Within the first 6 years, mortality in 400 m² monoculture plots was higher than in 100 m² plots. The highest mortality occurred in Tilia americana and Acer negundo monocultures, but mortality for both species decreased with increasing plot diversity. These results demonstrate the importance of forest diversity in reducing mortality in some species and point to potential mechanisms, including light and drought stress, that cause tree mortality in vulnerable monocultures. The experiment highlights challenges to maintaining monoculture and low‐diversity treatments in tree mixture experiments of large extent. FAB2 provides a long‐term platform to test the mechanisms and processes that contribute to forest stability and ecosystem productivity in changing environments. 
Its ecosystem‐scale design, and accompanying R package, are designed to discern species and lineage effects and multiple dimensions of diversity to inform restoration of ecosystem functions and services from forests. It also provides a platform for improving remote sensing approaches, including Uncrewed Aerial Vehicles (UAVs) equipped with LiDAR, multispectral and hyperspectral sensors, to complement ground‐based monitoring. We aim for the experiment to contribute to international efforts to monitor and manage forests in the face of global change.


good: An R package for modelling count data

October 2024

·

40 Reads

Organism‐related data often appear as counts. The Poisson distribution is the most popular choice for modelling count data, but this distribution assumes equidispersion, which is usually not satisfied in real‐world data. Deviations from the Poisson assumption call for discrete distributions that can accommodate over‐ and/or underdispersion. Although models for count data with over‐dispersion have been widely considered in the literature, models for underdispersion (the opposite phenomenon) have received less attention because underdispersion is relatively common only in certain research fields, including ecology. The Good distribution is a flexible option for modelling count data with over‐dispersion or underdispersion, but so far no R package has offered functionality such as calculating quantiles and probabilities of a Good distribution or fitting a regression for a Good‐distributed response based on a number of potential predictors. This paper presents the R package good, which computes the standard probabilistic functions, generates random samples from a population following a Good distribution and fits Good regression models.
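Whether a count variable departs from equidispersion can be checked with the ratio of sample variance to sample mean, which is close to 1 for Poisson‐like data. The sketch below is a generic Python illustration of that check with invented counts; the helper `dispersion_index` is hypothetical and is unrelated to the functions provided by the good package.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Values near 1 are consistent with equidispersion (Poisson-like data),
    values below 1 suggest underdispersion, values above 1 overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m

clutch_sizes = [3, 4, 4, 4, 5, 4, 3, 4]      # hypothetical, tightly clustered counts
parasite_loads = [0, 0, 1, 0, 12, 0, 3, 25]  # hypothetical, highly variable counts

under = dispersion_index(clutch_sizes)
over = dispersion_index(parasite_loads)
```

With these invented data the first index falls well below 1 and the second well above it, the two situations in which a Poisson model would be a poor fit and a more flexible count distribution is worth considering.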


Interaction functions: Individuals in continuous‐space models typically interact with one another according to an interaction strength function. Several interaction functions are depicted in the left panel. The right panel depicts competition between two individuals (blue and green diamonds with white outlines) using the circle intersection function; the circle intersection function was chosen for illustration because it is the most similar to the interaction strengths produced by our resource‐explicit models. The foraging areas of the individuals are represented by blue and green shading; competition strength is proportional to the amount by which these foraging areas overlap (orange shaded area).
Visualization of resource‐explicit interaction algorithms, square tiling: Competition between two individuals (blue and green diamonds) is determined by the portion of their foraging area that overlaps and by the interaction algorithm used in the model. The foraging areas are represented by blue and green shading; the overlapping area is shaded orange. In the inelastic model (left), individuals forage from resource nodes (dots at the centre of each square) within their foraging radius. In the elastic model (right), individuals forage from as many nodes as necessary to maintain a nominally sized foraging area (in this case comprising 50 nodes). Away from the edges of the landscape (top row), the two models behave similarly but not identically: In the inelastic model (upper left panel), the blue individual happens to forage from 51 nodes (due to its precise spatial position within the grid), resulting in slightly greater competition. The difference between the two models is much greater in the corner of the landscape (bottom row): The blue individual has a much smaller area in the inelastic model, whereas the blue individual in the elastic model forages from much further away to maintain a full‐sized foraging area, resulting in greater competition.
Differences in interaction strength between resource‐explicit models and the circle intersection function, square‐tiled: Two million pairwise interaction strengths between randomly placed individuals as measured by the circle intersection function were subtracted from those measured in resource‐explicit models to yield a distribution of the deviation from the circle intersection function for each model. As node density increases, the standard deviation decreases. These distributions all have a distinctive peak just below 0 due to cases where pairs of individuals have a very small but non‐zero interaction strength when using the circle intersection function, but the small overlapping portion of their foraging areas does not include any resource nodes.
Differences in survival rate between resource‐explicit models and a direct interaction model using the circle intersection function, square‐tiled: The survival rates of 100,000 individuals were measured using the inelastic method, the ‘fair’ inelastic method, the elastic method and the circle intersection function. Survival rates measured using the circle intersection function were subtracted from those measured using each resource‐explicit method for each individual to yield distributions of differences.
Resource‐explicit interactions in spatial population models

October 2024

·

17 Reads

Continuous‐space population models can yield significantly different results from their panmictic counterparts when assessing evolutionary, ecological or population genetic processes. However, the computational burden of spatial models is typically much greater than that of panmictic models due to the overhead of determining which individuals interact with one another and how strongly they interact. While these calculations are necessary to model local competition that regulates the population density, they can lead to prohibitively long runtimes. Here, we present a novel modelling method in which the resources available to a population are abstractly represented as an additional layer of the simulation. Instead of interacting directly with one another, individuals interact indirectly via this resource layer. We find that this method closely matches other spatial models, yet can dramatically increase the speed of the model, allowing the simulation of much larger populations. In addition to improved runtimes, models structured in this manner exhibit other desirable characteristics, including more explicit control over population density near the edge of the simulated area, and an efficient route for modelling complex heterogeneous landscapes.
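The contrast between a direct interaction function and an indirect, resource‐mediated one can be sketched for a single pair of individuals. The first function below is the standard overlap area of two equal circles, normalised by the area of one circle; the second counts shared resource nodes on a square tiling, loosely in the spirit of a fixed‐radius (inelastic) scheme. Function names, the normalisation and the grid parameters are assumptions made for illustration and may differ from the definitions used in the paper.

```python
import math

def circle_overlap_fraction(d, r):
    """Overlap area of two circles of radius r whose centres are d apart,
    expressed as a fraction of one circle's area."""
    if d >= 2 * r:
        return 0.0
    lens = (2 * r * r * math.acos(d / (2 * r))
            - 0.5 * d * math.sqrt(4 * r * r - d * d))
    return lens / (math.pi * r * r)

def node_overlap_fraction(a, b, r, spacing, extent):
    """Share of the resource nodes within r of individual a that also lie
    within r of individual b. Nodes sit at the centres of square tiles of
    side `spacing` covering [0, extent) x [0, extent)."""
    n = round(extent / spacing)
    own = shared = 0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * spacing, (j + 0.5) * spacing
            if math.hypot(x - a[0], y - a[1]) <= r:
                own += 1
                if math.hypot(x - b[0], y - b[1]) <= r:
                    shared += 1
    return shared / own if own else 0.0

# Two hypothetical individuals one radius apart, far from the landscape edge
direct = circle_overlap_fraction(d=1.0, r=1.0)
via_nodes = node_overlap_fraction((5.0, 5.0), (6.0, 5.0), r=1.0,
                                  spacing=0.05, extent=10.0)
```

With a dense grid the node count approximates the geometric overlap, while a coarse grid makes the estimate noisier, consistent with the caption's observation that the deviation from the circle intersection function shrinks as node density increases. The brute‐force loop over all nodes is used here only for clarity and would not deliver the runtime gains that the abstract reports.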


Sampling design (a, b) used in the statistical model for imperfect detection. Purple segments represent plant crowns. Starting from the top, (a) an unmanned aircraft systems (UAS) image capturing a landscape partially burnt in a wildfire collected as part of the study, with crowns identified as big sagebrush (Artemisia tridentata) shaded in purple. Magnified sections of the UAS orthomosaic (b), showing the automatically delineated shrub crowns (outlined in purple) overlaid on top of the field GPS points that mark ground‐mapped sagebrush (black points). The labels indicate whether a field identified plant was detected in high‐resolution imagery (true positive, TP), a field identified plant was not detected in high‐resolution imagery (false negative, FN), or if a field identified plant was either double‐counted or if a non‐target entity was misclassified as big sagebrush in the high‐resolution imagery (false positive, FP).
The estimated detection probabilities of sagebrush from high‐resolution unmanned aircraft systems imagery across 10 sites in SW Idaho, USA (a) and factors explaining detection variability across sites (b). Points indicate mean parameter estimates, and the error bars correspond to 95% credibility intervals. Colour of points indicates size class: Blue points are adults and green points are juveniles. The effect size (right x‐axis) is shown on the logit scale relative to the mean detection by size class, indicated by zero.
The effect of surface topography on the abundance of sagebrush (a) and false positive detections (b) from unmanned aircraft systems imagery across 10 sites in SW Idaho, USA. The variables (y‐axis) summarize the effect of TRI (topographic roughness index), TPI (topographic position index), and HLI (heat load index) derived from Digital Terrain Models (DTM) at 5 m resolution.
The spatial effect of a wildfire edge on the abundance of sagebrush in partially burnt areas across 10 sites in SW Idaho, USA. The y‐axis indicates the marginal effect of unburnt vegetation relative to the corresponding means for the two size classes. The left panel (a) shows the predicted effect of unburnt vegetation on the average expected count. The right panel (b) shows the same effect using posterior prediction to simulate counts. Thick lines depict means and shadowed regions indicate the central 95% CI of the predicted effect.
Predictive maps of sagebrush abundance within a single landscape (see aerial image in Figure 1). Top row shows model predictions of true abundance. The bottom row shows the uncertainty (the standard deviation of the posterior distribution). Cells with SD = 0 represent cells where field‐validated counts occurred (under the assumption that all plants were detected). The north–south white line indicates the boundary of a wildfire that occurred 26 years before the data collection east of the line, with adults more abundant in the unburnt section (west) and recruits more abundant near the edge of the burn (east).
Propagating observation errors to enable scalable and rigorous enumeration of plant population abundance with aerial imagery

October 2024

·

45 Reads

Estimating and monitoring plant population size is fundamental for ecological research, as well as conservation and restoration programs. High‐resolution imagery has potential to facilitate such estimation and monitoring. However, remotely sensed estimates typically have higher uncertainty than field measurements, risking biased inference on population status. We present a model that accounts for false negative (missed plants) and false positive (misclassified or double‐counted plants) error in counts from high‐resolution imagery via integration with ground data. We apply it to estimate the abundance of a foundational shrub species in post‐wildfire landscapes in the western United States. In these landscapes, plant recruitment is crucial for ecological recovery but locally patchy, motivating the use of spatially extensive measurements from unoccupied aerial systems (UAS). Integrating >16 ha of UAS imagery with >700 georeferenced field plots, we fit our model to generate insights into the prevalence and drivers of observation errors associated with classification algorithms used to distinguish individual plants, relationships between abundance and landscape context, and to generate spatially explicit maps of shrub abundance. Raw counts of plant abundance in high‐resolution imagery resulted in substantial false negative and false positive observation errors. The probability of detecting (p) adult plants (≥0.25 m tall) varied between sites within 0.52 < $\hat{p}_{\mathrm{adult}}$ < 0.82, whereas the detection of smaller plants (<0.25 m) was lower, 0.03 < $\hat{p}_{\mathrm{small}}$ < 0.3. On average, we estimate that 19% of all detected plants were false positive errors, which varied spatially in relation to topographic predictors. Abundance declined toward the interior of previous wildfires and was positively associated with terrain roughness.
Our study demonstrates that integrated models accounting for imperfect detection improve estimates of plant population abundance derived from inherently imperfect UAS imagery. We believe such models will further improve inference on plant population dynamics—relevant to restoration, wildlife habitat and related objectives—and echo previous calls for remote sensing applications to better differentiate between ecological and observational processes.
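To make the two error types concrete, the arithmetic below shows how a raw image count could be adjusted for both false positives and imperfect detection. This is a deliberately simple point correction, assuming a single detection probability and a single false positive fraction. It is not the integrated model described in the abstract, and the function name `corrected_abundance` is hypothetical.

```python
def corrected_abundance(raw_count, detection_prob, false_positive_fraction):
    """Adjust a raw image count for false positives, then for missed plants.

    raw_count: number of crowns detected in the imagery
    detection_prob: probability that a real plant is detected (0 < p <= 1)
    false_positive_fraction: share of detections that are not real plants (0 <= f < 1)
    """
    if not 0 < detection_prob <= 1:
        raise ValueError("detection_prob must be in (0, 1]")
    if not 0 <= false_positive_fraction < 1:
        raise ValueError("false_positive_fraction must be in [0, 1)")
    true_detections = raw_count * (1 - false_positive_fraction)  # drop false positives
    return true_detections / detection_prob  # scale up for plants that were missed
```

For example, 100 detections with 19% false positives and a detection probability of 0.5 would correspond to roughly 162 plants under these assumptions, which shows how far a raw count can sit from the underlying abundance.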


Distance‐weighting functions whose potential for spatially explicit modelling was assessed in this study.
Maps of substrate types, shrub density and topography outlined from Borcard and Legendre (1994, their fig. 1), together with a principal component analysis (PCA) biplot showing the sampling sites (markers) and Oribatid mites (arrows). The maps also feature the sampling points (black dots), open waters (blue area) and flooded areas (circumscribed by dotted curves). The labels at the tip of the PCA biplot arrows are the names of the eight mites with the largest axis loadings in their vicinity. The two PCA axes represent approximately a fourth of the total variation among the sites.
Results of the spatially explicit models predicting substrate density and water content of the peat, which are defined as the mass (in grams) of solids and water per litre of uncompacted peat. Predictions are presented on the maps as rainbow colours and observed values at the sampling locations are presented with dots using the same rainbow colour scale as for the model predictions. The substrate density model ($P^2 = 0.088$) is much weaker than the water content model ($P^2 = 0.25$).
Site loadings of a two‐axis principal component analysis (PCA) over the Oribatid mite study area. These axes represent the two main components of the Oribatid mite community structure (representing approximately 25% of the among‐site variability). Rainbow‐coloured pixels on the surface of the study area are the values obtained from predicted counts, whereas the background colour of the markers corresponds to the observed PCA axis loadings.
Spatially explicit predictions using spatial eigenvector maps

October 2024

·

108 Reads

In this paper, we explain how to obtain sets of descriptors of the spatial variation, which we call “predictive Moran's eigenvector maps” (pMEM), that can be used to make spatially explicit predictions for any environmental variables, biotic or abiotic. The approach unites features of a method called “Moran's eigenvector maps” (MEM) and those of spatial interpolation, and produces sets of descriptors that can be used with any other modelling method, such as regressions, support vector machines, regression trees, artificial neural networks and so on. The pMEM are the predictive eigenvectors produced by using a distance‐weighting function (DWF) in the construction of MEM. Seven types of pMEM, each associated with one of seven different DWFs, were defined and studied. We performed a simulation study to determine the power of different types of pMEM eigenfunctions at making accurate predictions for spatially structured variables. We exemplified the application of the method to the prediction of the spatial distribution of 35 Oribatid mites living in a peat moss (Sphagnum) mat on the shore of a Laurentian lake. We also provide an R language package called pMEM to make calculations easily available to end users. The results indicate that any one of the pMEMs obtained from the different DWFs could be the best suited to predict spatial variability in a given data set. Their application to the prediction of mite distributions highlights the capability of pMEMs for predicting distributions, and for providing spatially explicit estimates of environmental variables that are useful for predicting distributions.
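As a rough illustration of where a distance‐weighting function enters the construction, the sketch below builds a matrix of pairwise weights from site coordinates and double‐centres it. In MEM‐style methods the eigenvectors of such a centred matrix serve as spatial descriptors, but the eigendecomposition is omitted here. The linear weighting function, its `dwf_range` parameter and both function names are hypothetical choices for this example and do not reproduce the pMEM package or any of the seven DWFs studied.

```python
import math


def linear_dwf(distance, dwf_range):
    """Hypothetical linear DWF: weight 1 at distance 0, falling to 0 at dwf_range."""
    return max(0.0, 1.0 - distance / dwf_range)


def centred_weight_matrix(coords, dwf_range):
    """Pairwise DWF weights between sites, double-centred so every row and column sums to 0."""
    n = len(coords)
    w = [[linear_dwf(math.dist(a, b), dwf_range) for b in coords] for a in coords]
    row_means = [sum(row) / n for row in w]
    grand_mean = sum(row_means) / n
    # w is symmetric, so the column means equal the row means.
    return [
        [w[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]
```

Other implementations may treat the diagonal of the weight matrix differently (for instance, setting self‐weights to zero). The sketch keeps the value the weighting function returns at distance 0.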


Diagram of the SimpleMetaPipeline workflow. Ovals represent the different steps in the pipeline and the order in which they occur—either in series or in parallel. The table on the right represents the format of the output ‘Sequence Data Table’ (as shown in Table 1) in simplified graphical form. Arrows indicate the step in the pipeline where each set of information in the Sequence Data Table is generated.
Varieties of multi‐algorithm agreement. Only two‐way algorithm agreements are visualised; three‐way and four‐way algorithm agreement tests are also possible by combining the two‐way varieties visualised here. (a) Agreement between assignment and clustering algorithms. Three clusters are shown, with the proportion of component ASVs assigned to each taxon at each rank visualised, with taxonomic assignments in large blue circles representing those received by all component ASVs. For example, Cluster1 contains three ASVs all assigned to the phylum Arthropoda and class Malacostraca, but they are assigned to different orders (Decapoda and Euphausiacea). A conservative approach would therefore be to assign the cluster to the class Malacostraca but leave it unidentified at lower ranks. (b) Agreement between clustering algorithms. Two alternative clustering outputs are shown (red and blue ovals containing ASVs represented by black bars). For example, the blue Cluster1 contains two red clusters containing three and four ASVs each. In this case, agreement and disagreement between clustering algorithms provides additional information to interrogate the internal structure of, or potential relationships between, specific clusters of interest. (c) Agreement between assignment methods. Two ASVs are shown, each receiving an assignment from both IDTAXA and BLAST. ASV1 receives diverging assignments at lower ranks (family and genus), while ASV2 receives the same assignment from both algorithms at all ranks. A conservative approach would therefore assign ASV1 to the order Carcharhiniformes but leave it unidentified at lower ranks.
SimpleMetaPipeline: Breaking the bioinformatics bottleneck in metabarcoding

October 2024

·

61 Reads

The democratisation of next‐generation sequencing has vastly increased the availability of sequencing data from metabarcoding. However, to effectively prepare these metabarcoding data for subsequent analysis, researchers must consistently apply several different bioinformatic tools—including those which denoise reads, cluster sequences and assign taxonomic identities. This often creates a bioinformatics bottleneck in workflows for non‐specialists due to obstacles around: (a) integrating different tools, (b) the inability to easily modify and rerun bioinformatic pipelines involving non‐scripted (‘point‐and‐click’) elements and (c) the multiple outputs that may be required of a single dataset (e.g. amplicon sequence variants [ASVs] and operational taxonomic units [OTUs]), which often results in users running pipelines multiple times. Here, we introduce SimpleMetaPipeline, an open‐source bioinformatics pipeline implemented in R, which addresses these obstacles. SimpleMetaPipeline integrates the most robust and commonly used existing bioinformatic tools in a single reproducible pipeline, with a streamlined choice of parameters, to generate a sequence data table containing alternative clustering and assignment options. SimpleMetaPipeline accepts demultiplexed paired‐end and single reads from multiple sequencing runs. We describe the pipeline and demonstrate how alternative annotations enable the easy implementation of multi‐algorithm agreement tests to strengthen inferences. SimpleMetaPipeline represents a valuable addition to the existing library of pipelines, providing easy and reproducible bioinformatics, including a range of commonly desired clustering and assignment options, such as OTUs and ASVs.
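The conservative rule described in the figure caption above (keep a taxonomic assignment only down to the lowest rank on which all sources agree) can be sketched in a few lines. The rank list, the dictionary layout and the function name `consensus_assignment` are assumptions for this illustration and are not part of SimpleMetaPipeline's output format or API.

```python
RANKS = ["phylum", "class", "order", "family", "genus"]


def consensus_assignment(assignments):
    """Return the ranks, from the top down, on which every assignment agrees.

    assignments: list of dicts mapping rank name to taxon name; a missing rank
    counts as disagreement. Stops at the first rank where sources diverge.
    """
    consensus = {}
    for rank in RANKS:
        values = {a.get(rank) for a in assignments}
        if len(values) == 1 and None not in values:
            consensus[rank] = values.pop()
        else:
            break  # lower ranks are left unidentified
    return consensus
```

Applied to the Cluster1 example from the caption, three ASVs that share phylum Arthropoda and class Malacostraca but differ at order level would be reported at class level only. The same function works whether the inputs are ASVs within a cluster or two assignment methods applied to one ASV.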


Quantitative structure models showing the vertical profiles of the example study trees, (a) Celtis occidentalis L. (common hackberry) and (b) Ulmus americana L. (American elm), alongside the horizontal canopy profiles for the (c) hackberry and (d) elm.
Conceptual schematic demonstrating the pruning operation employed here to delineate stemflow and dripflow components in a tree graph model, G. An inset is provided showing the angle used for analysis.
Scatterplots showing results from a sensitivity analysis for the cut‐off grade within the stemflow and throughfall delineation algorithm, applied to two tree models: (a) Celtis occidentalis and (b) Ulmus americana. The analysis tracks the stability and subsequent decline of stemflow perimeter length (open circles) as the cut‐off angle increases, with a notable decrease occurring beyond −10°. Concurrently, the mean contributing surface area to throughfall drip nodes, shown by filled circles, escalates with wider cut‐off angles and plateaus near the same −10° threshold. This plateau signifies a state where the increase in contributing throughfall areas stabilises, illustrating the algorithm's capability to identify a cut‐off angle that optimises the demarcation between branches affecting stemflow and throughfall. Example outputs below and above the cut‐off angle selected for our demonstration are provided in Figure S1 (stemflow areas and throughfall drip maps).
Comparison of the total projected canopy area (light grey) and the projected stemflow‐contributing branch area (dark grey) for (a) Celtis occidentalis L. (common hackberry) and (b) Ulmus americana L. (American elm). The areas shown in panels (a and b) are reported in Table 1. In the stemflow watershed, the distributions of (c) branch angles were nearly identical; however, (d) the branch radii distributions clearly differ.
Panels showing drip point maps for (a) Celtis occidentalis L. (common hackberry) and (b) Ulmus americana L. (American elm), where the shade of each dot indicates the amount of contributing projected branch area. Plotted drip points are the top 98th percentile of all drip nodes.
A LiDAR‐driven pruning algorithm to delineate canopy drainage areas of stemflow and throughfall drip points

October 2024

·

896 Reads

Precipitation channelled down tree stems (stemflow) or into drip points of ‘throughfall’ beneath trees results in spatially concentrated inputs of water and chemicals to the ground. Currently, these flows are poorly characterised due to uncertainties about which branches redirect rainfall to stemflow or throughfall drip points. We introduce a graph theoretic algorithm that ‘prunes’ quantitative structural models of trees (derived from terrestrial LiDAR) to identify branches contributing to stemflow and those contributing to throughfall drip points. To demonstrate the method's utility, we analysed two trees with similar canopy sizes but contrasting canopy architecture and rainfall partitioning behaviours. For both trees, the branch ‘watershed’ area contributing to stemflow (under conditions assumed to represent moderate precipitation intensity) was found to be only half of the total ground area covered by the canopy. The study also revealed significant variations between trees in the number and median contribution areas of modelled throughfall drip points (69 vs. 94 drip points tree⁻¹, with contributing projected areas of 28.6 vs. 7.8 m² tree⁻¹, respectively). Branch diameter, surface area, volumes and woody area index of components contributing to stemflow and throughfall drip points may play a role in the trees' differing rainfall partitioning behaviours. Our pruning algorithm, enabled by the proliferation of LiDAR observations of canopy structure, promises to enhance studies of canopy hydrology. It offers a novel approach to refine our understanding of how trees interact with rainfall, thereby broadening the utility of existing LiDAR data in environmental research.
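A minimal sketch of the pruning idea follows, under strong simplifying assumptions: the tree is stored as a parent map, each branch segment has a single inclination angle in degrees, and a segment drains to the stem only if it and every segment between it and the trunk meet the cut‐off angle. The data layout, the sign convention for angles and the function name `stemflow_nodes` are hypothetical. The −10° default simply mirrors the cut‐off discussed in the sensitivity analysis caption and is not a recommendation.

```python
def stemflow_nodes(parent, angle, cutoff=-10.0):
    """Return the set of segments assumed to drain to the trunk.

    parent: dict mapping each segment to its parent segment (None for the trunk base)
    angle: dict mapping each segment to its inclination in degrees (assumed convention:
           values below the cut-off shed water away from the stem)
    """
    status = {}
    for start in parent:
        path = []
        n = start
        # Walk towards the trunk until reaching a segment whose status is known.
        while n not in status:
            if parent[n] is None:
                status[n] = True  # the trunk base always counts as stemflow
                break
            path.append(n)
            n = parent[n]
        ok = status[n]
        # Propagate back out: one failing segment prunes everything beyond it.
        for m in reversed(path):
            ok = ok and angle[m] >= cutoff
            status[m] = ok
    return {n for n, ok in status.items() if ok}
```

Segments excluded from the returned set are the ones that would feed throughfall drip points in this simplified picture. Locating the drip points themselves and summing their contributing areas would need additional geometry that is not sketched here.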


Conceptual and analytical framework for the Life on the edge toolbox, incorporating ‘Exposure’ (current and projected future species distribution models (SDMs) and their dissimilarity), ‘Sensitivity’ (adaptive and neutral sensitivity) and ‘Landscape barriers’ (predicted population connectivity) to predict a final ‘Population vulnerability’ metric for each population (which is a weighted combination of the other metrics). Software packages used are denoted in blue text (LFMM, latent factor mixed models; RDA, redundancy analysis).
Main inputs and data (yellow boxes), analyses (blue box) and outputs (green boxes) of the LotE toolbox. ‘Species_binomial’ is the name of the analysis for any given species, using genus name followed by species name separated by an underscore. Directory names are highlighted in bold, ‘Exposure’, ‘Sensitivity’ (including neutral and adaptive sensitivity) and ‘Landscape barriers’ become populated with the relevant output files for each analysis upon running LotE, which are then used to calculate output metrics per population. Information on specific R functions within the blue box and how they interact with the output directories can be found in Figure S1. The ‐scripts‐ and R_functions folders contain all the toolbox scripts and functions, and the ‐outputs‐ folder stores all output files in relevant subdirectories when running the toolbox. Blue lines represent locations for input files, dotted blue lines represent locations for input files if the user wants to circumvent the full toolbox workflow with their own input data (e.g. pre‐prepared SDMs, a list of adaptive SNPs so that GEA analysis is unnecessary, imputed missing genotype data, or an already parameterised circuitscape input layer).
Results generated using the LotE toolbox (a–d: Afrixalus fornasini, e: Multi‐species population vulnerability). Sampling locations with genomic data are represented over maps as dots; legends within each panel and plot provide information on the scale of variables. (a) Exposure: SDM dissimilarity between current and future conditions (−1, orange = range loss; 1, green = range expansion). (b) Sensitivity: neutral genetic diversity (nucleotide diversity, left panel) and genomic offset per population (right panel). Genomic offset predictions are clipped to a 2 degree buffer around presence points. (c) Landscape barriers: parameterised cumulative resistance surface (left panel, ranging from 0, no resistance, to 100, complete barrier) and predicted movement density (right panel) between populations based on Circuitscape analysis. (d) Population vulnerability, calculated as the mean of the exposure, adaptive and neutral sensitivity and landscape barriers metrics (all ranging between 1 (low vulnerability) and 10 (high vulnerability)). (e) Multi‐species population vulnerability for Afrixalus fornasini, Afrixalus delicatus and Afrixalus sylvaticus running LotE for all three species. For output summaries from the three full species analyses described above see Appendices S2–S4.
Individual categorisation results using RDA for Myotis escalerai and Myotis crypticus generated using the LotE toolbox. (a) M. escalerai individual categorisation in RDA ordination space based on putatively adaptive SNPs and (b) mapped categorised individuals in geographic space. (c) M. crypticus individual categorisation in RDA ordination space based on putatively adaptive SNPs and (d) mapped categorised individuals in geographic space. For (b) and (d), circle sizes represent number of individuals per sampling locality.
Adaptive SDMs generated using the LotE toolbox capturing intraspecific adaptations for Myotis escalerai and Myotis crypticus based on the categorised individuals for hot‐dry and cold‐wet conditions shown in Figure 4. Separate SDMs were built for each category based on the ordination of each genotype in the RDA, and maps are categorised into binary presence/absences for hot‐dry adapted (red), cold‐wet adapted (blue), with overlapping areas for both categories in yellow. (a) M. escalerai adaptive SDMs (left panel: Current conditions, right panel: Future (2070) conditions). (b) M. crypticus adaptive SDMs (left panel: Current conditions, right panel: Future (2070) conditions).
Life on the edge: A new toolbox for population‐level climate change vulnerability assessments

October 2024

·

403 Reads

Global change is impacting biodiversity across all habitats on earth. New selection pressures from changing climatic conditions and other anthropogenic activities are creating heterogeneous ecological and evolutionary responses across many species' geographic ranges. Yet we currently lack standardised and reproducible tools to effectively predict the resulting patterns in species vulnerability to declines or range changes. We developed an informatic toolbox that integrates ecological, environmental and genomic data and analyses (environmental dissimilarity, species distribution models, landscape connectivity, neutral and adaptive genetic diversity, genotype‐environment associations and genomic offset) to estimate population vulnerability. In our toolbox, functions and data structures are coded in a standardised way so that it is applicable to any species or geographic region where appropriate data are available, for example individual or population sampling and genomic datasets (e.g. RAD‐seq, ddRAD‐seq, whole genome sequencing data) representing environmental variation across the species' geographic range. To demonstrate multi‐species applicability, we apply our toolbox to three georeferenced genomic datasets for co‐occurring East African spiny reed frogs (Afrixalus fornasini, A. delicatus and A. sylvaticus) to predict their population vulnerability, as well as demonstrating that range loss projections based on adaptive variation can be accurately reproduced from a previous study using data for two European bat species (Myotis escalerai and M. crypticus). Our framework sets the stage for large scale, multi‐species genomic datasets to be leveraged in a novel climate change vulnerability framework to quantify intraspecific differences in genetic diversity, local adaptation, range shifts and population vulnerability based on exposure, sensitivity and landscape barriers.
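The figure captions above describe population vulnerability as a combination of component metrics that each range from 1 (low) to 10 (high). The sketch below shows one way such a score could be assembled: min‐max rescaling of each raw metric across populations, followed by a weighted mean per population. The rescaling rule, the handling of constant inputs, the metric names and both function names are assumptions for illustration and do not reproduce the LotE code.

```python
def rescale_1_to_10(values):
    """Min-max rescale a list of raw metric values onto the range 1 to 10."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [5.5 for _ in values]  # assumption: constant inputs map to the midpoint
    return [1.0 + 9.0 * (v - lo) / (hi - lo) for v in values]


def population_vulnerability(metrics, weights=None):
    """Weighted mean of already-rescaled metrics for one population.

    metrics: dict of metric name to score on the 1 to 10 scale
    weights: optional dict of metric name to weight (defaults to equal weights)
    """
    if weights is None:
        weights = {name: 1.0 for name in metrics}
    total_weight = sum(weights[name] for name in metrics)
    return sum(metrics[name] * weights[name] for name in metrics) / total_weight
```

With equal weights this reduces to the plain mean of the component scores. Unequal weights let one component, such as exposure, count for more in the final score.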


Overview of the EarthRanger system structure: (a) The core server application handles all data and application processes. (b) A REpresentational State Transfer (REST) application programming interface (API) allows external applications (e.g. Web Application, Mobile‐App, Gundi, Ecoscope) to interface with the server and data storage components. (c) Backend cloud storage provides secure long‐term data archiving. (d) The EarthRanger Web application (Figure 2) is the primary client for interfacing with the system. (e) The EarthRanger mobile application (Figure 4) allows users to collect tracking data using the phone's GPS, record user‐defined field sightings and visualise data. (f) Gundi (https://projectgundi.org) acts as a data provider to EarthRanger by connecting it with a multitude of (g) external observations and event data services (e.g. wildlife collars, vehicle trackers, deforestation alerts, camera trap photos). (h) Ecoscope (https://ecoscope.io) provides data analytics and reporting functionality to EarthRanger.
(a) The EarthRanger web interface showing an example from the Mara Elephant Project located in Kenya. The interface provides a simple mechanism to visualise information from multiple data streams and within various geographic contexts. Attribute and temporal filters help to isolate data of interest to the user. The interface is self‐updating and automatically polls the EarthRanger database for new information. New events (e.g. an elephant sighting) or patrols can be keyed in manually through the interface, or existing ones updated. Geofences (red‐dashed lines) can be visualised along with other geospatial vector layers or tiled basemaps (e.g. the custom tiled layer developed and used by the Mara Elephant Project). (b) A time slider tool lets the user replay events and observations through time and apply temporal filters to visible data. (c) The events feed updates continuously as new information is collected and lets the user quickly navigate to incoming event locations. (d) The subject interface gives current information about location and other configurable attributes such as track length. Observations (tracking) data may be visualised as a stylised line or heat map. (e) The Alerts dialogue lets individual users subscribe to events and tailor the notifications they receive from different events including calendar days and blackout times.
(a) The EarthRanger Mobile login interface. The user is associated with a subject of the same name. (b) The events interface of the mobile app. Users can create and access drafts easily and the event types are configurable to each EarthRanger deployment site. The pending sync number refers to the number of events that are saved in the user's phone cache, but that cannot be transmitted to the server due to a lack of network connectivity. As soon as the phone is connected to the internet, recorded events will be transmitted to the server and the pending sync number will reset to 0. The blue icon on the lower right allows you to quickly create new events. (c) The event interface of EarthRanger Mobile showing an example from the Bylot Island Research Station (https://inq.ulaval.ca/en/tools/lab‐o‐nord/facilities/bylot‐island‐field‐station). This very simple and visual interface allows fieldworkers to quickly record their sightings. (d) With a built‐in patrol function, app users can start and stop their recorded patrols. Recorded events are then automatically associated with their patrols. Users can also see their recent track and their current location in real‐time even when the phone is offline.
Clustering events generated by an algorithm to detect den sites from GPS points from a tracked female cougar ‘Kakuna’ by the Olympic Cougar Project (www.panthera.org). The clustering algorithm runs externally to EarthRanger and accesses observations from various subjects. Detected clusters are then reported as events using the EarthRanger API. The heat map visualisation of tracking data also helps researchers to identify high‐use areas.
EarthRanger: An open‐source platform for ecosystem monitoring, research and management

October 2024

·

247 Reads

Effective approaches are needed to conserve the planet's remaining wildlife and wilderness landscapes, especially concerning global biodiversity conservation targets. Here, we present a new software system called EarthRanger: an open‐source platform built to help monitor, research and manage ecosystems. EarthRanger consists of seven main components (Core Server, API, Storage, Gundi, Web App, Mobile App, Ecoscope) that provide functionality for data (i) aggregation & collection, (ii) storage & management, (iii) real‐time and post hoc analysis, (iv) visualisation and (v) dissemination. The mobile application provides field‐based data recording and visualisation tools. EarthRanger may be deployed for single project use or can aggregate across multiple geographies as a centralised hub. EarthRanger can be used to collect standardised tracking data (e.g. from wildlife collars, vehicles and ranger patrols) and configurable event information (e.g. a singular recording with associated user‐defined attribute information such as a wildlife sighting or encounter with a poacher). Since development began in 2015, the platform has (at the time of writing) been deployed at over 500 sites across 70 countries and with myriad configurations and objectives. EarthRanger has improved the ability to monitor data feeds and manage conservation‐related operations in real time. For instance, the deployment of EarthRanger by African Parks has led to the removal of over 50,000 snares, steady population growth of key species of concern and near cessation of poaching. In Liwonde's protected area, enhanced mitigation efforts supported by EarthRanger reduced the number of deaths from wildlife conflict by more than 91%. EarthRanger is also providing a platform to enhance standardisation, aggregation, transfer and long‐term storage of ecological information and promote collaboration between groups conducting protected area management and ecology and biodiversity research.


Should we still teach or learn coding? A postgraduate student perspective on the use of large language models for coding in ecology and evolution

October 2024

·

39 Reads

The extent to which coding skills are taught within ecology and evolution curricula remains largely unquantified. While coding, and especially R, proficiency is increasingly demanded in academic and professional contexts, many students encounter coding for the first time as postgraduates, presenting a steep learning curve alongside learning advanced statistics. With the emergence of large language models (LLMs), questions arise regarding the relevance of teaching coding when many of these tasks can now be automated. Here, we explore students' experiences with using LLMs for coding, highlighting both benefits and limitations. Through qualitative analysis of student perspectives, we identify several advantages of using LLMs for coding tasks, including enhanced search capabilities, provision of starting points and clear instructions, and troubleshooting support. However, limitations such as a lack of responsiveness to feedback and the prerequisite of extensive prior knowledge pose challenges to the effectiveness of student use of LLMs for coding at a beginner level. Concerns also arise regarding future access to LLMs, potentially exacerbating inequities in education. Despite the potential of LLMs, we argue for the continued importance of teaching coding skills alongside their integration with LLM support. Tutor‐supported learning is essential for building foundational knowledge, facilitating comprehension of LLM outputs and fostering students' confidence in their abilities. Moreover, reliance solely on LLMs risks hindering deep learning and comprehension, thereby undermining the educational process. Our experiences underscore the significance of maintaining a balanced approach, leveraging LLMs as supplementary tools rather than substitutes for coding education in ecology and evolution courses.


The role of large language models in interdisciplinary research: Opportunities, challenges and ways forward

October 2024

·

47 Reads

Large language models (LLMs) are gaining importance in research as they offer many benefits. One often overlooked benefit is their potential to facilitate and support interdisciplinary research, which is key to addressing current global challenges, such as the twin crises of biodiversity loss and climate change. LLMs can help reduce the costs associated with knowledge transfer and bridge gaps between different fields of study. They can also be especially useful in helping ecologists understand and adopt powerful techniques common in other fields. However, using LLMs in research, especially for complex tasks, carries important risks, including the possibility of generating inaccurate information, which can lead to false conclusions. We recommend that researchers adhere to best practices when using LLMs for research by providing appropriate prompts and dividing complex tasks into smaller, more manageable tasks that facilitate learning and testing. Moreover, journals should implement policies to ensure that information and code generated using LLMs are properly validated. Academic programs should incorporate formal training in LLMs, equipping students and researchers with the necessary skills to use these tools more effectively and responsibly, including for interdisciplinary research.


ChatGPT is likely reducing opportunity for support, friendship and learned kindness in research

October 2024 · 136 Reads

Large language models (LLMs) have proved to be highly popular since the release of ChatGPT, leading many researchers to explore their potential across multiple fields of scientific research. In a recent Perspective, Cooper et al. (2024) highlight a set of benefits and challenges for the use of LLMs in ecology, emphasising their value to coding in research and education. While we agree that the ability of LLMs to assist in the coding process is remarkable, researchers should be conscious that this capability is likely changing the lived experience of primarily computational researchers, especially early career ecologists between Masters and Postdoctoral career stages. In particular, since the release of ChatGPT, the authors of this paper have noticed a marked reduction in the frequency of social interactions that emerge from coding and statistics queries. These questions are highly likely still being asked, but now often exclusively to an LLM. Further research is needed to fully understand the effect of LLMs on the lived experience of researchers and students. For primarily computational researchers, ChatGPT is likely reducing emergent opportunity for support, friendship and learned kindness. Group leaders should recognise this and foster deliberate within‐group communication and collaboration.


Pressure to publish introduces large‐language model risks

October 2024 · 74 Reads

Large‐language models (LLMs) have the potential to accelerate research in ecology and evolution, cultivating new insights and innovation. However, whilst revelling in the plethora of opportunities, researchers need to consider that LLM use could also introduce risks. An important piece of context underpinning this perspective is the pressure to publish, where research careers are defined, at least partly, by publication metrics such as number of papers, impact factor and citations. Coupled with academic employment insecurity, especially during the early career stage, researchers may reason that LLMs are a low‐risk and high‐reward tool for publication. However, this pressure to publish can introduce risks if LLMs are used as a shortcut to game publication metrics instead of a tool to support true innovation. These risks may ultimately reduce research quality, stifle researcher development and cause reputational damage to researchers and to the scientific record as a whole. We conclude with a series of recommendations to mitigate these risks and encourage researchers to apply caution whilst maximising LLM potential.


Journal metrics


Journal Impact Factor™: 6.3 (2023)

Acceptance rate: 27%

CiteScore™: 11.6 (2023)

Submission to first decision: 17 days

Article processing charges: $3,500 / £2,650 / €2,950

Editors