
Type-II statistical errors in environmental science and the precautionary principle

... exceeds the R²adj. of the scope (full model utilized in the first step); and (3) the selected permutation P-value is exceeded (Blanchet et al., 2008). The level of significance was defined as α = 0.10 to reduce the incidence of type II errors and avoid the exclusion of relationships with ecological relevance (Buhl-Mortensen, 1996; Underwood, 1997). ...
Article
Portulaca hatschbachii is endemic to the basaltic rocky outcrops that are distributed, in a discontinuous way, along the Third Plateau of Paraná State, Brazil, composing environments that form the Subtropical Highland Grasslands of the Atlantic Forest Biome. Considering the risk of extinction of the species and the massive anthropization of these outcrops, we applied AFLP, ITS and rps16 molecular markers in ten populations throughout the area of occurrence of the species to generate information about the genetic status of P. hatschbachii and contribute to the development of conservation strategies. Low rates of genetic diversity, high population structure, restricted gene flow and the presence of diversifying selection were observed for the populations. The analysis of variation partitioning (R²adj. = 63.60%) showed that environmental variables have a greater influence on the distribution of variation of loci under selection (R²adj. = 26.70%) than geographical isolation (R²adj. = 1.20%). The strong population structure, for both neutral and selected loci, suggests an isolation by adaptation mechanism (IBA) occurring in populations and highlights the need and urgency for in situ conservation plans for the species and its occurrence on rocky outcrops.
... Indeed, Miller and Ulrich (2016) showed how these and other factors have a direct bearing on the final research payoff. There is an impressive literature attesting to the difficulties in setting a blanket level recommendation (e.g., Buhl-Mortensen, 1996; Lemons et al., 1997; Lemons and Victor, 2008; Lieberman and Cunningham, 2009; Myhr, 2010; Rice and Trafimow, 2010; Mudge et al., 2012; Lakens et al., 2018). ...
Preprint
We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = .05 to .005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of .05, .01, .005, or anything else, is not acceptable.
... However, only dramatic declines are readily detected (Taylor et al., 2007) and irremediable damage or loss may occur because measures are delayed in the light of statistically insignificant declines. This shortcoming of nil null hypotheses is well known (Noss, 1994; Buhl-Mortensen, 1996), but current conservation instruments in Europe such as the Habitats Directive or the Marine Strategy Framework Directive have not taken stock of it. Here, we have carried out extensive simulations to show that type-I errors are not even minimized with standard (unregularized) regression techniques applied on realistic data for cetaceans. ...
Article
Many conservation instruments rely on detecting and estimating a population decline in a target species to take action. Trend estimation is difficult because of small sample size and relatively large uncertainty in abundance/density estimates of many wild populations of animals. Focusing on cetaceans, we performed a prospective analysis to estimate power, type-I, sign (type-S) and magnitude (type-M) error rates of detecting a decline in short time-series of abundance estimates with different signal-to-noise ratios. We contrasted results from both unregularized (classical) and regularized approaches. The latter allows prior information to be incorporated when estimating a trend. Power to detect a statistically significant estimate was in general lower than 80%, except for large declines. The unregularized approach (status quo) had inflated type-I error rates and gave biased (either over- or under-) estimates of a trend. The regularized approach with a weakly-informative prior offered the best trade-off in terms of bias, statistical power, type-I, type-S and type-M error rates and confidence interval coverage. To facilitate timely conservation decisions, we recommend using the regularized approach with a weakly-informative prior in the detection and estimation of trends with short and noisy time-series of abundance estimates.
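The kind of prospective power analysis described in this abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' method: it assumes log-normal survey error with a known coefficient of variation, fits an ordinary least-squares slope to log abundance, and uses a one-sided z-test with the error variance treated as known. The function name `simulate_decline_power` and all parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_decline_power(annual_change, cv, n_years, alpha=0.05,
                           n_sims=2000, seed=1):
    """Fraction of simulated time series in which a decline is declared significant.

    Assumptions (illustrative only): one abundance estimate per year,
    independent log-normal errors with known CV, one-sided z-test on the slope.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(math.log(1 + cv ** 2))   # log-scale sd implied by the CV
    years = list(range(n_years))
    mean_year = sum(years) / n_years
    sxx = sum((t - mean_year) ** 2 for t in years)
    se_slope = sigma / math.sqrt(sxx)          # slope standard error, sigma known
    critical = NormalDist().inv_cdf(alpha)     # negative threshold: test for decline
    true_slope = math.log(1 + annual_change)   # e.g. -0.05 means 5% decline per year

    detections = 0
    for _ in range(n_sims):
        log_n = [true_slope * t + rng.gauss(0, sigma) for t in years]
        mean_y = sum(log_n) / n_years
        slope = sum((t - mean_year) * (y - mean_y)
                    for t, y in zip(years, log_n)) / sxx
        if slope / se_slope < critical:
            detections += 1
    return detections / n_sims


if __name__ == "__main__":
    for change in (-0.02, -0.05, -0.20):
        power = simulate_decline_power(change, cv=0.3, n_years=6)
        print(f"annual change {change:+.0%}: estimated power {power:.2f}")
```

With `annual_change=0` the returned fraction estimates the type-I error rate of this simplified test rather than its power, which gives a quick sanity check on the simulation.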
... Because of few degrees of freedom in most tests, an a priori significance level α < 0.1 was adopted following a precautionary principle (Gray 1990). Increasing the significance level increases the risk of a type I error (i.e., false significant effect if there is no impact), though priority should be given to decreasing the risk of a type II error, i.e., failure to detect if there is a significant effect (Buhl-Mortensen 1996), which is here achieved by setting the statistical power to β > 0.8. In case of significant interaction terms, post-hoc Tukey's Honestly Significant Difference (HSD) tests were performed to compare pairwise differences. ...
Article
Accidental oil discharges pose acute and chronic risks on coral communities, but knowledge on the ecological long-term implications is fragmentary. Here, we examine the potential short-, mid- and long-term effects of a major oil spill on subtidal reef communities over a 30-year period using a multicontrol before-after-control-impact (BACI) approach. In April 1986, 8000 tons (~9.3 million liters) of crude oil were released from a refinery in Bahia Las Minas (Caribbean Panama) contaminating an about 40 km² stretch of intertidal and subtidal mangrove, seagrass, sandy and coral reef habitats. Surveys of oiled and unpolluted control sites have been conducted at different times between 1985 and 2017 and changes in community metrics (i.e., percent live cover, diversity, community composition and recruitment) were compared with pre-spill data. The main focus was on scleractinian corals, but impacts on other major benthic taxa were also considered. Short-term oil effects on scleractinian corals included substantial declines in live cover, and diversity as well as changes in community structure being detectable up to four years after the spill, while other benthic taxa were hardly affected. Branching corals, such as Acropora palmata, seemed to suffer more, but strong incident-related declines could also be seen in two massive species (i.e., Pseudodiploria clivosa and Porites astreoides). Recruitment rates were not significantly different relative to oil exposure, but number of recruits showed strong temporal variation both at the oiled and control sites. While short-term effects (1 yr. post-spill) could be unequivocally linked to the spill, assessment of mid-term impacts was complicated by cumulative, albeit different stressors (diseases, bleaching, warming, additional accidental oil discharges) that have been driving changes at oiled and control sites respectively and thus ultimately concealing any effects of the spill. 
Our data did not provide evidence of a long-term (>10 yrs.) chronic impact of the oil spill, but instead a variety of factors have contributed to reef degradation both at oiled and control sites over the survey period.
... The nearly significant variables were also considered to reduce the incidence of Type II errors and avoid rejecting ecologically relevant effects at an early stage (e.g. Buhl 1996, Underwood 1997). Final multivariate model selection was produced with an automated forward and backward stepwise regression, using the Akaike Information Criteria (AIC) to select the best models (Zuur et al. 2007). ...
Thesis
We assessed the importance of road verges as refuge areas for small mammals in intensively grazed pastures of a Mediterranean landscape, and compared the function of roads as refuges with the fundamental role of riparian galleries as reservoirs of biological diversity. For this purpose, a small mammal trapping study was undertaken on road verges and on small stream sides. We sampled two road segments and two streams in the vicinity of Évora, Portugal. We captured a total of 457 individuals of five different species. Mus spretus was the most common species captured, followed by Crocidura russula and Apodemus sylvaticus. M. spretus was more abundant on road verges than on riparian strips, whilst the abundance of C. russula and A. sylvaticus was similar in the two habitats. Captures of the three species were much higher inside both linear habitats than in the surrounding matrix. M. spretus individuals were bigger on stream sites but significantly smaller outside the linear habitats, and C. russula had better body condition on road verges. Both roads and streams exerted a strong barrier effect on small mammals' movements. We conclude that roadside verges act as refuge habitat in sub-optimal Mediterranean landscapes.
... It has been pointed out that, in accordance with precautionary principles, β should be minimized in environmental risk assessment and decision-making based on negative indicators. This results in more powerful statistical testing (Power = 1 − β) (Buhl-Mortensen, 1996; Peterman and M'Gonigle, 1992; Sanderson and Petersen, 2002; Santillo et al., 1998; Underwood and Chapman, 2003). In terms of risk protection, for consumers as well as parts of the ecosystem, false negative assignments are much more severe and relevant than false positives. ...
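The trade-off mentioned in this excerpt, that β falls (and power = 1 − β rises) when α is relaxed, can be made concrete with a short sketch. It assumes a one-sided z-test for a shift in a mean with known standard deviation; the function name `type_ii_error` and the numeric inputs are illustrative assumptions, not values from any of the cited studies.

```python
from statistics import NormalDist


def type_ii_error(effect, sd, n, alpha):
    """Beta for a one-sided z-test of a mean shift of size `effect`.

    Assumes a known standard deviation `sd` and `n` independent samples.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)      # rejection threshold on the z scale
    shift = effect / (sd / n ** 0.5)     # true effect in standard-error units
    return z.cdf(critical - shift)       # P(fail to reject | effect is real)


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10):
        beta = type_ii_error(effect=1.0, sd=2.0, n=10, alpha=alpha)
        print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

For a fixed design, the only ways to lower β are to accept a larger α, collect more samples, or reduce sampling variability, which is why a precautionary reading favours reporting β alongside α.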
Chapter
In this chapter I shall discuss what seems to me to be a systematic ambiguity running through the large and complex risk-assessment literature. The ambiguity concerns the question of separability: can (and ought) risk assessment be separated from the policy values of risk management? Roughly, risk assessment is the process of estimating the risks associated with a practice or substance, and risk management is the process of deciding what to do about such risks. The separability question asks whether the empirical, scientific, and technical questions in estimating the risks either can or should be separated (conceptually or institutionally) from the social, political, and ethical questions of how the risks should be managed. For example, is it possible (advisable) for risk-estimation methods to be separated from social or policy values? Can (should) risk analysts work independently of policymakers (or at least of policy pressures)? The preponderant answer to the variants of the separability question in recent risk-research literature is no. Such denials of either the possibility or desirability of separation may be termed nonseparatist positions. What needs to be recognized, however, is that advocating a nonseparatist position masks radically different views about the nature of risk-assessment controversies and of how best to improve risk assessment. These nonseparatist views, I suggest, may be divided into two broad camps (although individuals in each camp differ in degree), which I label the sociological view and the metascientific view. The difference between the two may be found in what each finds to be problematic about any attempt to separate assessment and management. 
Whereas the former (sociological) view argues against separatist attempts on the grounds that they give too small a role to societal (and other nonscientific) values, the latter (metascientific) view does so on the grounds that they give too small a role to scientific and methodological understanding. Examples of those I place under the sociological view are the cultural reductionists discussed in the preceding chapter by Shrader-Frechette. Examples of those I place under the metascientific view are the contributors to this volume themselves. A major theme running through this volume is that risk assessment cannot and should not be separated from societal and policy values (e.g., Silbergeld's uneasy divorce).
Article
Past statistical power analyses show that abundance estimation techniques usually have high β, the probability of not rejecting a null hypothesis when it should have been rejected, and that only large effects are detectable. I review relationships among β, power, detectable effect size, sample size, and sampling variability. I show how statistical power analysis can help interpret past results and improve designs of future experiments, impact assessments, and management regulations. I make recommendations for researchers and decision makers, including routine application of power analysis, more cautious management, and reversal of the burden of proof to put it on industry, not management agencies. (Author's abstract)
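The relationships among power, detectable effect size, sample size, and sampling variability listed in this abstract can be sketched with the standard normal-approximation formulas for comparing two group means. This is a textbook approximation under assumed conditions (two-sided z-test, equal group sizes, known common standard deviation), not the procedure used in the article; the function names `detectable_effect` and `required_n` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()


def detectable_effect(sd, n, alpha=0.05, power=0.80):
    """Approximate smallest difference in means detectable with the given power.

    Two-sided z-test, `n` samples per group, common known standard deviation `sd`.
    """
    z_sum = _Z.inv_cdf(1 - alpha / 2) + _Z.inv_cdf(power)
    return z_sum * sd * sqrt(2 / n)


def required_n(effect, sd, alpha=0.05, power=0.80):
    """Approximate samples per group needed to detect `effect` with the given power."""
    z_sum = _Z.inv_cdf(1 - alpha / 2) + _Z.inv_cdf(power)
    return ceil(2 * (z_sum * sd / effect) ** 2)


if __name__ == "__main__":
    # Noisier data or fewer samples make only larger effects detectable.
    for n in (5, 10, 20, 40):
        print(f"n={n:2d} per group: detectable effect "
              f"{detectable_effect(sd=1.0, n=n):.2f} sd units")
    print("n per group for a 0.5 sd effect:", required_n(effect=0.5, sd=1.0))
```

Because the detectable effect scales with `sd * sqrt(2 / n)`, halving it requires roughly four times as many samples, which illustrates why only large effects are detectable in small, variable surveys.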
Article
The so-called 'null hypothesis' debate in ecology opened a statistical Pandora's Box. Ecologists were forced to question whether or not decades of pattern analysis had been productive. Over the past few years, the debate has expanded beyond the role of different kinds of statistical hypothesis to include the importance of different types of statistical error. Our objective in this article is to show how trends governing ecological inferences under uncertainty appear to be changing as ecologists become increasingly aware of the potential importance of statistical errors.
Anon. (1987). Ministerial declaration. Second International Conference on the Protection of the North Sea, London.
Gray, J. S. (1990a). Statistics and the precautionary principle. Mar. Pollut. Bull. 21, 174-176.