Philosophy and the practice of Bayesian statistics

Department of Statistics and Department of Political Science, Columbia University, New York, USA; Statistics Department, Carnegie Mellon University, Santa Fe Institute, Pittsburgh, USA.
British Journal of Mathematical and Statistical Psychology (Impact Factor: 2.17). 02/2012; 66(1). DOI: 10.1111/j.2044-8317.2011.02037.x
Source: PubMed

ABSTRACT A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science. Clarity about these matters should benefit not just philosophy of science, but also statistical practice. At best, the inductivist view has encouraged researchers to fit and compare models without checking them; at worst, theorists have actively discouraged practitioners from performing model checking because it does not fit into their framework.
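
As a rough illustration of the kind of model checking the abstract refers to, here is a minimal sketch of a posterior predictive check in Python. The model (normal data, a conjugate normal prior on the mean, and the variance fixed at its sample estimate), the function name `posterior_predictive_pvalue`, the example data, and the choice of the sample maximum as the test statistic are all assumptions made for illustration; none of them come from the paper.

```python
import random
import statistics


def posterior_predictive_pvalue(data, stat, n_rep=2000,
                                prior_mean=0.0, prior_var=100.0, seed=0):
    """Estimate Pr(stat(y_rep) >= stat(y) | y) under an assumed normal model.

    Assumed model (illustrative only): y_i ~ Normal(mu, s^2), with s^2 fixed
    at the sample variance, and prior mu ~ Normal(prior_mean, prior_var).
    """
    rng = random.Random(seed)
    n = len(data)
    sigma2 = statistics.variance(data)  # plug-in simplification, not a full Bayes treatment

    # Conjugate normal-normal update for the mean.
    post_var = 1.0 / (1.0 / prior_var + n / sigma2)
    post_mean = post_var * (prior_mean / prior_var + sum(data) / sigma2)

    observed = stat(data)
    exceed = 0
    for _ in range(n_rep):
        # Draw a parameter from the posterior, then a replicated data set.
        mu = rng.gauss(post_mean, post_var ** 0.5)
        replicate = [rng.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
        if stat(replicate) >= observed:
            exceed += 1
    return exceed / n_rep


# Hypothetical, right-skewed data that a normal model may describe poorly.
data = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9, 1.4, 6.0]
p_value = posterior_predictive_pvalue(data, max)
print(f"posterior predictive p-value for max(y): {p_value:.3f}")
```

A p-value close to 0 or 1 would suggest that the fitted model rarely reproduces the chosen feature of the data, which is a prompt to revise the model rather than a formal rejection rule.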

    • "g., Gelman and Shalizi 2013). We distinguish the confirmatory and iterative-discovery methods because both seem to be in popular use, but they cause different difficulties for multimodel inference."
    ABSTRACT: Multimodel inference accommodates uncertainty when selecting or averaging models, which seems logical and natural. However, there are costs associated with multimodel inferences, so they are not always appropriate or desirable. First, we present statistical inference in the big picture of data analysis and the deductive–inductive process of scientific discovery. Inferences on fixed states of nature, such as survey sampling methods, generally use a single model. Multimodel inferences are used primarily when modeling processes of nature, when there is no hope of knowing the true model. However, even in these cases, iterating on a single model may meet objectives without introducing additional complexity. Additionally, discovering new features in the data through model diagnostics is easier when considering a single model. There are costs for multimodel inferences, including the coding, computing, and summarization time on each model. When cost is included, a reasonable strategy may often be iterating on a single model. We recommend that researchers and managers carefully examine objectives and cost when considering multimodel inference methods. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
    Journal of Wildlife Management 05/2015; DOI:10.1002/jwmg.891 · 1.73 Impact Factor
    • "comparing statistics inferred from empirical data to statistics simulated under the very model used to estimate the empirical parameters, PPS extends data analysis beyond the estimation of parameters (Gelman 2003). It allows researchers to assess model adequacy and to learn how and why a model does not fit the data (Gelman & Shalizi 2012). In the case of the multispecies coalescent, researchers can discover whether patterns inherent to the data (the observed genealogies) are inconsistent with model assumptions (e.g. that shared polymorphism results from incomplete lineage sorting)."
    ABSTRACT: Bayesian inference operates under the assumption that the empirical data are a good statistical fit to the analytical model, but this assumption can be challenging to evaluate. Here, we introduce a novel R package that utilizes posterior predictive simulation to evaluate the fit of the multispecies coalescent model used to estimate species trees. We conduct a simulation study to evaluate the consistency of different summary statistics in comparing posterior and posterior predictive distributions, the use of simulation replication in reducing error rates, and the utility of parallel process invocation towards improving computation times. We also test P2C2M on two empirical data sets in which hybridization and gene flow are suspected of contributing to shared polymorphism, which is in violation with the coalescent model: Tamias chipmunks and Myotis bats. Our results indicate that (a) probability-based summary statistics display the lowest error rates, (b) the implementation of simulation replication decreases the rate of type II errors, and (c) our R package displays improved statistical power compared to previous implementations of this approach. When probabilistic summary statistics are used, P2C2M corroborates the assumption that genealogies collected from Tamias and Myotis are not a good fit to the multispecies coalescent model. Taken as a whole, our findings argue that an assessment of the fit of the multispecies coalescent model should accompany any phylogenetic analysis that estimates a species tree. This article is protected by copyright. All rights reserved.
    Molecular Ecology Resources 05/2015; DOI:10.1111/1755-0998.12435 · 3.71 Impact Factor
    • "Furthermore, Bayesian statistics is not as radical a departure from frequentism as it can sometimes appear to be. Bayes' theorem, the engine of Bayesian inference, can be unproblematically derived from the probability axioms accepted by statisticians of all hues and in many other respects too converges with frequentist statistical approaches (Gelman and Shalizi, 2013: 26). As a result, we may see contemporary statistics as a place of happy eclecticism: the wealth of computational ability allows for the application of countless methods with little hand-wringing about foundations."
    ABSTRACT: As a contribution to current debates on the ‘social life of methods’, in this article we present an ethnomethodological study of the role of understanding within statistical practice. After reviewing the empirical turn in the methods literature and the challenges to the qualitative-quantitative divide it has given rise to, we argue such case studies are relevant because they enable us to see different ways in which ‘methods’, here quantitative methods, come to have a social life – by embodying and exhibiting understanding they ‘make the social structures of everyday activities observable’ (Garfinkel, 1967: 75), thereby putting society on display. Exhibited understandings rest on distinctive lines of practical social and cultural inquiry – ethnographic ‘forays’ into the worlds of the producers and users of statistics – which are central to good statistical work but are not themselves quantitative. In highlighting these non-statistical forms of social and cultural inquiry at work in statistical practice, our case study is an addition to understandings of statistics and usefully points to ways in which studies of the social life of methods might be further developed from here.
    Theory, Culture & Society 01/2015; DOI:10.1177/0263276414559058 · 1.77 Impact Factor