The Nature of Statistical Evidence

As discussed in Chapter 8, Birnbaum introduces Ev(E, y), the evidential meaning of obtaining data y as an instance of experiment E. Following Birnbaum, various authors have wrestled with the problem of developing a single set of postulates under which statistical inference can be made coherent. But as we claim in Section 8.3, Ev(E, y) does not exist. Evidence is grounds for belief—an imprecise concept. There must be many valid reasons for believing, and hence many ways of making the evidence concept precise. Most of our beliefs are held because mother—or someone else we trust—told us so. The law trusts sworn testimony. Scientific and statistical evidence are other, supposedly particularly reliable, grounds for belief. Instead of Ev(E, y), we are concerned with Ev(E, T, y), the evidential meaning of observing y as an instance of E in the context of theory T.

... That presents us with a dilemma and, unfortunately, statistics does not provide a way around it. Global error rates and local evidence operate in different logical spaces (Thompson 2007), and so there can be no strictly statistical way to weigh them together. All is not lost, though, because statistical limitations do not preclude thoughtful integration of local and global issues when making inferences. ...
This chapter demystifies P-values, hypothesis tests and significance tests and introduces the concepts of local evidence and global error rates. The local evidence is embodied in the data at hand and concerns the hypotheses of interest for this experiment, whereas the global error rate is a property of the statistical analysis and sampling procedure. It is shown using simple examples that local evidence and global error rates can be, and should be, considered together when making inferences. Power analysis for the design of hypothesis-testing experiments is explained, along with the more locally focussed expected P-values. Issues relating to multiple testing, HARKing and P-hacking are explained, and it is shown that, in many situations, their effects on local evidence and global error rates are in conflict, a conflict that can always be overcome by a fresh dataset from replication of key experiments. Statistics is complicated, and so is science. There is no singular right way to do either, and universally acceptable compromises may not exist. Statistics offers a wide array of tools for assisting with scientific inference by calibrating uncertainty, but statistical inference is not a substitute for scientific inference. P-values are useful indices of evidence and deserve their place in the statistical toolbox of basic pharmacologists.
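The tension the abstract describes between local evidence and global error rates is easiest to see with multiple testing. A minimal simulation sketch (not from the chapter; the function name and parameters are illustrative) uses the fact that, under a true null hypothesis, a P-value is uniformly distributed on [0, 1], so screening many hypotheses inflates the global chance of at least one false positive even though each individual P-value still carries the same local evidence:

```python
import random

random.seed(0)

def familywise_error_rate(m, alpha=0.05, n_sim=20_000):
    """Estimate the probability that at least one of m independent
    null P-values falls below alpha (a global error rate).
    Under a true null, each P-value is uniform on [0, 1]."""
    hits = 0
    for _ in range(n_sim):
        # Draw m null P-values; count the trial if any is "significant".
        if any(random.random() < alpha for _ in range(m)):
            hits += 1
    return hits / n_sim

# A single test keeps the global error rate near alpha (0.05) ...
single = familywise_error_rate(m=1)

# ... but screening 20 hypotheses inflates it toward
# 1 - 0.95**20, roughly 0.64, even though each individual
# P-value still measures the same local evidence.
multiple = familywise_error_rate(m=20)
```

This is the conflict the chapter points to: the local evidence in any one P-value is unchanged by how many other tests were run, while the global error rate of the procedure depends heavily on it, which is why replication with a fresh dataset is offered as the resolution.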