Cajo ter Braak
Wageningen University & Research | WUR · Biometris, Department of Mathematical and Statistical Methods
Professor Emeritus
Publications: 343
Reads: 145,309
Citations: 69,878
Introduction
Cajo ter Braak is professor emeritus at Biometris, Department of Mathematical and Statistical Methods, Wageningen University & Research. Cajo does research in statistical ecology and multivariate analysis and, formerly, in Bayesian computing.
Publications (343)
Anthropogenic landscape modification may lead to the proliferation of a few species and the loss of many. Here we investigate mechanisms and functional consequences of this winner–loser replacement in six human-modified Amazonian and Atlantic Forest regions in Brazil using a causal inference framework. Combining floristic and functional trait data...
This is the vignette of the R package `douconca`. The aim of the `douconca` package is to help ecologists unravel trait-environment relationships from an abundance data table with associated multi-trait and multi-environment data tables.
A popular method for this aim is RLQ, which is also a three-table method. RLQ is available in the R package `ade4`....
Controversies exist regarding the extent of past human influence on terrestrial ecosystems and the relative importance of human versus climatic factors in shaping Holocene vegetation. However, there has been no systematic examination of these issues at a global scale.
Here we integrate palaeoecological, archaeological, and palaeoclimate data to ass...
This note summarizes what the literature says about double constrained correspondence analysis (dc-CA) and its cousins RLQ and CWM-RDA. These are all methods for exploring and establishing trait-environment relationships in ecology. All ignore within-species trait variation, although this is more a software issue in dc-CA than a real limitation.
The douconca package analyzes multi-trait multi-environment ecological data by double constrained correspondence analysis (ter Braak et al. 2018) using vegan and native R code. It has a formula interface that makes it possible to assess, for example, the importance of trait interactions in shaping ecological communities. Throughout the two-step algorithm of...
Plant species of ancient forests tend to be poor dispersers, but recent field studies suggest that dispersal may be strongly accelerated in streams. To further test this idea, we addressed the following two questions: (1) which traits facilitate transport and deposition of seeds by streams? (2) do ancient forest species differ from other forest spe...
Plant species of ancient forests tend to be poor dispersers, although field studies suggest that dispersal may be strongly accelerated in streams. To further test this idea we addressed the following two questions: 1) which traits facilitate transport and deposition of seeds by streams? 2) do ancient forest species differ from other forest species...
The Iberian Peninsula is characterized by a steep west–east moisture gradient at present, reflecting the dominance of maritime influences along the Atlantic coast and more Mediterranean-type climate further east. Holocene pollen records from the Peninsula suggest that this gradient was less steep during the mid-Holocene, possibly reflecting the imp...
The published version is at https://doi.org/10.1016/j.chemolab.2023.104898 (open access)
Highlights:
1. A sequence of redundancy analyses (RDA) is more general in theory and practice than ASCA.
2. ASCA+ and WE-ASCA are unstable in designs with a missing factor combination.
3. RDA outperforms ASCA+ and WE-ASCA.
Abstract
Chemometrics and statistical e...
An author-year version of the citations is available at ResearchGate.
Chemometrics and statistical ecology share interest in the analysis of multivariate data. In ecology, unconstrained and constrained ordination are popular methods to analyze and visualize multivariate data, with principal components analysis (PCA) and redundancy analysis (RDA) as prototyp...
After applying canonical correspondence analysis to metagenomics data with hugely different library sizes (site totals), it became evident that Canoco and the R packages ade4 and vegan can yield (at least up to 2022) very different P-values in statistical tests of the relationship between taxonomic composition (species composition) and predictors (...
Mountain regions are hotspots of biodiversity, and are particularly sensitive to human activities and global changes. Characterizing biodiversity using trait-based approaches may improve the understanding of the evolutionary and mechanistic basis of ecological patterns in species distribution. The investigation of trait-environment relationships, h...
Bacteria are part of the insect gut system and influence many physiological traits of their host. Gut bacteria may even reduce or block the transmission of arboviruses in several species of arthropod vectors. Culicoides biting midges are important arboviral vectors of several livestock and wildlife diseases, yet limited information is available on...
The vegan package provides tools for descriptive community ecology. It has most basic functions of diversity analysis, community ordination and dissimilarity analysis. Most of its multivariate tools can be used for other data types as well.
The functions in the vegan package contain tools for diversity analysis, ordination methods and tools for th...
In testing an overall null hypothesis, it does not matter whether to permute the response variables (Y) while keeping the predictors fixed or to permute the predictor variables (X) while keeping the response variables fixed. However, in weighted (univariate and multivariate) regression and in partial tests these options yield different results. Thi...
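The first claim above (for an overall test without weights, permuting Y or permuting X is equivalent) can be checked on a toy example. The following is a minimal sketch, not code from the paper: the data values and the helper name `pearson` are made up for illustration, and the test statistic is assumed to be a plain Pearson correlation between one predictor and one response. It enumerates all permutations and compares the two null distributions.

```python
from itertools import permutations
from statistics import mean

def pearson(x, y):
    """Unweighted Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical toy data (assumption, not from the source).
x = [0.5, 1.2, 2.9, 3.1, 4.8]   # predictor
y = [1.0, 0.7, 2.2, 3.9, 3.5]   # response

# Exhaustive null distributions: permute y with x fixed, then x with y fixed.
null_perm_y = sorted(pearson(x, list(p)) for p in permutations(y))
null_perm_x = sorted(pearson(list(p), y) for p in permutations(x))

# Largest difference between the two sorted null distributions.
max_gap = max(abs(a - b) for a, b in zip(null_perm_y, null_perm_x))

# Two-sided exact permutation P-value of the observed correlation.
observed = pearson(x, y)
p_value = sum(abs(r) >= abs(observed) - 1e-12 for r in null_perm_y) / len(null_perm_y)
```

Here `max_gap` is zero up to floating-point error, because each permutation of y pairs the values exactly as the inverse permutation of x does. The sketch does not cover the weighted and partial cases in which, according to the abstract, the two options differ.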
The Iberian Peninsula is characterised by a steep west-east moisture gradient today, reflecting the dominance of maritime influences along the Atlantic coast and more Mediterranean-type climate further east. Holocene pollen records from the Peninsula suggest that this gradient was less steep during the early to mid-Holocene, possibly reflecting the...
Aim
Here we examine the functional profile of regional tree species pools across the latitudinal distribution of Neotropical moist forests, and test trait–climate relationships among local communities. We expected opportunistic strategies (acquisitive traits, small seeds) to be overrepresented in species pools further from the equator, but also in...
Aim
High levels of nitrogen deposition have been responsible for important losses of plant species diversity. It is often assumed that reduction of ammonia and nitrogen oxide emissions will result in the recovery of the former biodiversity. In Western Europe, N deposition peaked between 1980 and 1988 and declined thereafter. In a 60‐year experiment...
Microbiome composition data collected through amplicon sequencing are count data on taxa in which the total count per sample (the library size) is an artefact of the sequencing platform and as a result such data are compositional. To avoid library size dependency, one common way of analyzing multivariate compositional data is to perform a principal...
To combat antimicrobial resistance (AMR), policymakers need an overview of evolution and trends of AMR in relevant animal reservoirs, and livestock is monitored by susceptibility testing of sentinel organisms such as commensal E. coli. Such monitoring data are often vast and complex and generate a need for outcome indicators that summarize AMR for...
Quantitative reconstructions of past climates are an important resource for evaluating how well climate models reproduce climate changes. One widely used statistical approach for making such reconstructions from fossil biotic assemblages is weighted averaging partial least-squares regression (WA-PLS). There is however a known tendency for WA-PLS to...
Tripartite interactions among insect vectors, midgut bacteria, and viruses may determine the ability of insects to transmit pathogenic arboviruses. Here, we investigated the impact of gut bacteria on the susceptibility of Culicoides nubeculosus and Culicoides sonorensis biting midges for Schmallenberg virus, and of Aedes aegypti mosquitoes for Zika...
This is the supplementary material of the paper "Double constrained ordination for assessing biological trait responses to multiple stressors: A case study with benthic macroinvertebrate communities" organized in a better way than at the publisher (https://doi.org/10.1016/j.scitotenv.2020.142171). Notably, the zip file with R code for dc-CA contai...
Benthic macroinvertebrate communities are used as indicators for anthropogenic stress in freshwater ecosystems. To better understand the relationship between anthropogenic stress and changes in macroinvertebrate community composition, it is important to understand how different stressors and species traits are associated, and how these associations...
Microbial communities, which drive the major ecosystem functions, are composed of a wide range of interacting species. Understanding how microbial communities are structured and the underlying processes is a crucial task for interpreting ecosystem response to global change but it is challenging as microbial interactions cannot usually be directly o...
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
The use of functional information in the form of species traits plays an important role in explaining biodiversity patterns and responses to environmental changes. Although relationships between species composition, their traits, and the environment have been extensively studied on a case-by-case basis, results are variable, and it remains unclear...
Alfalfa (Medicago sativa L.) silage (AS) is an important feedstuff in ruminant nutrition. However, its high non-protein nitrogen content often leads to poor ruminal nitrogen retention. Various pre-ensiling treatments differing with respect to dry matter concentrations, wilting intensities and sucrose addition have been previously shown to improve t...
Urban rivers often function as sinks for various contaminants potentially placing the benthic communities at risk of exposure. We performed a comprehensive biological survey of the benthic macroinvertebrate and bacterial community compositions in six rivers from the suburb to the central urban area of Guangzhou city (South China), and evaluated the...
Introduction
Attention deficit hyperactivity disorder (ADHD) is the most common childhood behavioural disorder, causing significant impediment to a child’s development. It is a complex disorder with numerous contributing (epi)genetic and environmental factors. Currently, treatment consists of behavioural and pharmacological therapy. However, ADHD m...
Statistical analysis of trait–environment association is challenging owing to the lack of a common observation unit: Community‐weighted mean regression (CWMr) uses site points, multilevel models focus on species points, and the fourth‐corner correlation uses all species‐site combinations. This situation invites the development of new methods capabl...
The Eemian interglacial represents a natural experiment on how past vegetation with negligible human impact responded to amplified temperature changes compared to the Holocene. Here, we assemble 47 carefully selected Eemian pollen sequences from Europe to explore geographical patterns of (1) total compositional turnover and total variation for each...
Statistical analysis of trait-environment association is challenging owing to the lack of a common observation unit: Community weighted mean regression (CWM) uses site points, multilevel models use species points, and the fourth corner correlation uses all species-site combinations. This situation invites the development of new methods capable of u...
The fourth‐corner analysis aims to quantify and test for relationships between species traits and site‐specific environmental variables, mediated by site‐specific species abundances. Since there is no common unit of observation, the significance of the relationships is tested using a double permutation procedure (site‐based and species‐based). This...
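As a rough illustration of the double permutation idea described above, the sketch below computes an abundance-weighted (fourth-corner) correlation between one trait and one environmental variable, then tests it once by permuting sites and once by permuting species. The data, the function names `fourth_corner` and `perm_p`, and the choice to combine the two tests by taking the larger P-value are assumptions made for this example; the truncated abstract does not show how the two tests are combined.

```python
import random

def fourth_corner(abund, env, trait):
    """Abundance-weighted correlation between env (per site) and trait (per species)."""
    n_sites, n_species = len(env), len(trait)
    total = sum(sum(row) for row in abund)
    site_w = [sum(abund[i]) / total for i in range(n_sites)]
    species_w = [sum(abund[i][j] for i in range(n_sites)) / total
                 for j in range(n_species)]
    e_mean = sum(w * e for w, e in zip(site_w, env))
    t_mean = sum(w * t for w, t in zip(species_w, trait))
    e_var = sum(w * (e - e_mean) ** 2 for w, e in zip(site_w, env))
    t_var = sum(w * (t - t_mean) ** 2 for w, t in zip(species_w, trait))
    cov = sum(abund[i][j] / total * (env[i] - e_mean) * (trait[j] - t_mean)
              for i in range(n_sites) for j in range(n_species))
    return cov / (e_var * t_var) ** 0.5

def perm_p(abund, env, trait, permute, n_perm, rng):
    """Permutation P-value; permute is 'sites' (shuffle env) or 'species' (shuffle trait)."""
    observed = abs(fourth_corner(abund, env, trait))
    hits = 0
    for _ in range(n_perm):
        e, t = list(env), list(trait)
        rng.shuffle(e if permute == "sites" else t)
        if abs(fourth_corner(abund, e, t)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: 5 sites (rows) by 4 species (columns).
abund = [[10, 4, 0, 0],
         [6, 8, 1, 0],
         [2, 5, 6, 1],
         [0, 2, 7, 5],
         [0, 0, 3, 9]]
env = [1.0, 2.0, 3.5, 4.0, 5.5]    # one environmental value per site
trait = [0.2, 0.9, 1.6, 2.4]       # one trait value per species

rng = random.Random(1)
r = fourth_corner(abund, env, trait)
p_sites = perm_p(abund, env, trait, "sites", 199, rng)
p_species = perm_p(abund, env, trait, "species", 199, rng)
p_max = max(p_sites, p_species)    # assumed combination rule for this sketch
```

With so few sites and species the number of distinct permutations is small, so the P-values here are coarse and only illustrate the mechanics.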
Question
The community weighted means (CWM) approach is an easy way of analyzing trait‐environment association by regressing (or correlating) the mean trait per plot against an environmental variable and assessing the statistical significance of the slope or the associated correlation coefficient. However, the CWM approach does not yield valid test...
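The CWM computation itself is simple enough to sketch. The example below is an illustration under assumed toy data, with hypothetical helper names (`community_weighted_means`, `slope_and_correlation`); it computes the abundance-weighted mean trait per plot and then the ordinary least-squares slope and correlation against an environmental variable. It shows only how the statistic is formed and makes no claim about the validity of a test based on it.

```python
from statistics import mean

def community_weighted_means(abund, trait):
    """Abundance-weighted mean trait value for each plot (row of abund)."""
    return [sum(a * t for a, t in zip(row, trait)) / sum(row) for row in abund]

def slope_and_correlation(x, y):
    """Ordinary least-squares slope of y on x and their Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data: 5 plots (rows) by 4 species (columns).
abund = [[10, 4, 0, 0],
         [6, 8, 1, 0],
         [2, 5, 6, 1],
         [0, 2, 7, 5],
         [0, 0, 3, 9]]
trait = [0.2, 0.9, 1.6, 2.4]       # one trait value per species
env = [1.0, 2.0, 3.5, 4.0, 5.5]    # one environmental value per plot

cwm = community_weighted_means(abund, trait)
slope, corr = slope_and_correlation(env, cwm)
```

For the first plot, the CWM is (10 × 0.2 + 4 × 0.9) / 14 = 0.4.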
Correspondence analysis with linear external constraints on both the rows and the columns has been mentioned in the ecological literature, but lacks full mathematical treatment and easily available algorithms and software. This paper fills this gap by defining the method as maximizing the fourth-corner correlation between linear combinations, by pr...
Ultrasonic vocalizations (USVs) are crucial in the social behavior of rats. We aim to relate USV rates of pairs of rats to individual activity in an automated home cage (PhenoTyper®) where USVs are recorded per pair and not per individual. We propose a composite link model approach to parametrize a mechanistic “sum‐of‐rates” model in which the pair...
Presentation summarizing 5 papers on trait-environment relations in ecology, L-shaped data or bipartite graphs. It starts with a description of the data, a simple loglinear (GLM) model and the Rao score test on interaction. This leads to Legendre’s fourth-corner correlation and, for multi-trait, multi-environment problems, double constrained co...
An environmental risk assessment for the introduction of genetically modified crops includes assessing the consequences for biodiversity. In this study arthropod biodiversity was measured using pitfall traps in potato agro-ecosystems in Ireland and The Netherlands over two years. We tested the impact of site, year, potato genotype, and fungicide ma...
Leaves are the major component of terrestrial litter input into aquatic systems. Leaves are distributed by the flow, accumulate in low flow areas and form patches. In natural streams, stable leaf patches form around complex structures, such as large woody debris. Until now, little is known about flow conditions under which leaf patches persist. Thi...
Ecologists wish to understand the role of traits of species in determining where each species occurs in the environment. For this, they wish to detect associations between species traits and environmental variables from three data tables: species count data from sites, the associated environmental data, and species trait data from databases. These...
Principal response curves analysis (PRC) is widely applied to experimental multivariate longitudinal data for the study of time-dependent treatment effects on the multiple outcomes or response variables (RVs). Often, not all of the RVs included in such a study are affected by the treatment and RV-selection can be used to identify those RVs and so g...
Statistical testing of trait-environment association from data is a challenge as there is no common unit of observation: the trait is observed on species, the environment on sites and the mediating abundance on species-site combinations. A number of correlation-based methods, such as the community weighted trait means method (CWM), the fourth-corne...
The simulation models
Mathematical description of the Gaussian response model and log-linear model used in the simulation.
Annotated R-functions and scripts
Description of the R functions and script with example output.
Diagnostic plot
Dunn-Smith residuals of the model ‘site+species+species:Snow’ against the fitted values in the aravo data set.
Why does site-based statistical testing in a GLM fail?
R-files
The zip contains three R files. The file trait_env_Type_I_error_examples.R is the driver script; the other two contain functions, as explained in Article S2.
The process of macroinvertebrate drift in streams is characterized by dislodgement, drift distance and subsequent return to the bottom. While dislodgement is well studied, the fate of drifting organisms is poorly understood, especially concerning Trichoptera. Therefore, the aim of the present study was to determine the ability of six case-building...
Establishing trait-environment relationships has become routine in community ecology. Here, we demonstrate that the Community Weighted Means correlation (CWM) and its parallel approach in linking trait variation to the environment, the Species Niche Centroid correlation (SNC), have important shortcomings, arguing against their continuing applicatio...
Quantitative insight into species differences in risk assessment is expected to reduce uncertainty and variability related to extrapolation from animals to humans. This paper explores quantification and comparison of gene expression data between tissues and species from intervention studies with isoflavones. Gene expression data from peripheral blo...
There is a growing need for good environmental risk assessment of engineered nanoparticles (ENPs). Environmental risk assessment of ENPs has been hampered by lack of data and knowledge about ENPs, their environmental fate and their toxicity. This leads to uncertainty in the risk assessment. To effectively deal with uncertainty in the risk assessmen...
The ecosystem services (EcoS) concept is being used increasingly to attach values to natural systems and the multiple benefits they provide to human societies. Ecosystem processes or functions only become EcoS if they are shown to have social and/or economic value. This should assure an explicit connection between the natural and social sciences, b...
Estimating the risk, P(X > Y), in probabilistic environmental risk assessment of nanoparticles is a problem when confronted by potentially small risks and small sample sizes of the exposure concentration X and/or the effect concentration Y. This is illustrated in the motivating case study of aquatic risk assessment of nano-Ag. A non-parametric esti...
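To make the small-sample difficulty concrete, the sketch below uses the simplest empirical estimate of P(X > Y): the fraction of (exposure, effect) pairs in which the exposure concentration exceeds the effect concentration. This is an assumption chosen for illustration and is not claimed to be the estimator studied in the paper; the function name `exceedance_fraction` and all concentrations are hypothetical.

```python
def exceedance_fraction(exposure, effect):
    """Fraction of (x, y) pairs with x > y, a simple empirical estimate of P(X > Y)."""
    pairs = [(x, y) for x in exposure for y in effect]
    return sum(x > y for x, y in pairs) / len(pairs)

# Hypothetical concentrations (arbitrary units).
exposure = [1.0, 2.0, 3.0]        # sample of exposure concentrations X
effect = [2.5, 4.0, 5.0, 6.0]     # sample of effect concentrations Y

risk = exceedance_fraction(exposure, effect)     # 1 of the 12 pairs has x > y
resolution = 1 / (len(exposure) * len(effect))   # smallest nonzero estimate possible

# With non-overlapping small samples the estimate is exactly zero,
# even if the true risk is small but positive.
no_overlap = exceedance_fraction([1.0, 2.0], [3.0, 4.0])
```

The estimate can only take values in steps of `resolution`, which is why small risks cannot be resolved from small samples with this naive approach.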
Both environmental filtering and dispersal filtering are known to influence plant species distribution patterns and biodiversity. Particularly in dynamic habitats, however, it remains unclear whether environmental filtering (stimulated by stressful conditions) or dispersal filtering (during recolonization events) dominates in community assembly, or...
Insight into risks of nanotechnology and the use of nanoparticles is an essential condition for the social acceptance and safe use of nanotechnology. One of the problems with which the risk assessment of nanoparticles is faced is the lack of data, resulting in uncertainty in the risk assessment. We attempt to quantify some of this uncertainty by ex...
In this paper, we reflect on a number of aspects of ordination methods: how should absences be treated in ordination and how do model-based methods, including Gaussian ordination and methods using generalized linear models, relate to the usual least-squares (eigenvector) methods based on (log-)transformed data. We defend detrended correspondence a...
Recruitment processes are critical components of a plant's life cycle. However, in comparison with later stages in the plant life cycle (e.g. competition among adults), relatively little is known about their contribution to the regulation of plant species distribution. Particularly, little is known about the individual contributions of the three ma...
Mesocosm experiments that study the ecological impact of chemicals are often analysed using the multivariate method 'Principal Response Curves' (PRCs). Recently, the extension of generalised linear models (GLMs) to multivariate data was introduced as a tool to analyse community data in ecology. Moreover, data aggregation techniques that can be anal...
The number of perennial low‐order lowland streams likely to experience intermittent flow is predicted to increase in north‐western Europe. To understand the effects of such a change on macroinvertebrates, a field experiment was carried out in a currently perennial sandy lowland stream.
Using a before–after control–impact design, the flow regime was...
Agriculture has intensified along with a greater dependence on fossil fuels (agrochemicals, mechanization, irrigation), compromising its sustainability. Crop rotations are key to improving the sustainability of production systems. Designing rotations is a complex process that combines diverse objectives,...
Multi-parameter models in systems biology are typically 'sloppy': some parameters or combinations of parameters may be hard to estimate from data, whereas others are not. One might expect that parameter uncertainty automatically leads to uncertain predictions, but this is not the case. We illustrate this by showing that the prediction uncertainty o...