# Alexander Philip Dawid

University of Cambridge · Statistical Laboratory

BA, MA, ScD

## About

Publications: 297 · Reads: 47,135 · Citations: 17,590

Introduction

Emeritus Professor of Statistics, University of Cambridge

Additional affiliations

October 2013 - present

October 2007 - September 2013

October 1978 - September 1981

## Publications

Publications (297)

Directed acyclic graph (DAG) models are popular tools for describing causal relationships and for guiding attempts to learn them from data. In particular, they appear to supply a means of extracting causal conclusions from probabilistic conditional independence properties inferred from purely observational data. I take a critical look at this enter...

Health economic evaluations have recently become an important part of the clinical and medical research process and have built upon more advanced statistical decision-theoretic foundations. In some contexts, it is officially required that uncertainty about both parameters and observable variables be properly taken into account, increasingly often b...

We consider the problem of learning about and comparing the consequences of dynamic treatment strategies on the basis of observational data. We formulate this within a probabilistic decision-theoretic framework. Our approach is compared with related work by Robins and others: in particular, we show how Robins's 'G-computation' algorithm arises natu...

I thank Thomas Richardson and James Robins for their discussion of my paper, and discuss the similarities and differences between their approach and mine.

This article is a response to recent proposals by Pearl and others for a new approach to personalised treatment decisions, in contrast to the traditional one based on statistical decision theory. We argue that this approach is dangerously misguided and should not be used in practice.

We describe and contrast two distinct problem areas for statistical causality: studying the likely effects of an intervention (effects of causes) and studying whether there is a causal link between the observed exposure and outcome in an individual case (causes of effects). For each of these, we introduce and compare various formal frameworks that...

Suppose X and Y are binary exposure and outcome variables, and we have full knowledge of the distribution of Y, given application of X. We are interested in assessing whether an outcome in some case is due to the exposure. This “probability of causation” is of interest in comparative historical analysis where scholars use process tracing approaches...

We give an overview of various topics tied to the expression of uncertainty about a variable or event by means of a probability distribution. We first consider methods used to evaluate a single probability forecaster, including scoring rules, calibration, resolution and refinement. We next revisit methods for combining several experts’ distribution...

This chapter is dedicated to the memories of Stephen and Joyce Fienberg. In this chapter, we address the problem of inference about individual causation on the basis of statistical data collected on groups. While such information typically cannot identify precisely the probability of causation, it can supply bounds on it. We show how these bounds ca...

We develop a mathematical and interpretative foundation for the enterprise of decision-theoretic (DT) statistical causality, which is a straightforward way of representing and addressing causal questions. DT reframes causal inference as “assisted decision-making” and aims to understand when, and how, I can make use of external data, typically obser...

In this work we study stationary linear time-series models, and construct and analyse “score-matching” estimators based on the Hyvärinen scoring rule. We consider two scenarios: a single series of increasing length, and an increasing number of independent series of fixed length. In the latter case there are two variants, one based on the full data,...
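The Hyvärinen scoring rule used here depends on the quoted density only through derivatives of its logarithm, so the intractable normalising constant drops out. As a minimal sketch (not the paper's time-series setting), assume i.i.d. data from a unit-variance Gaussian and recover the mean by minimising the empirical average score; the grid search is purely illustrative:

```python
def hyvarinen_score(x, mu, sigma2=1.0):
    # H(x, q) = 2 * (d^2/dx^2) log q(x) + ((d/dx) log q(x))^2
    # For a N(mu, sigma2) density: d/dx log q = -(x - mu)/sigma2 and
    # d^2/dx^2 log q = -1/sigma2; the normalising constant never appears.
    return -2.0 / sigma2 + ((x - mu) / sigma2) ** 2

data = [1.2, 0.7, 1.9, 1.4, 0.8]  # illustrative sample

# Score-matching estimate: minimise the empirical average score over mu
# (crude grid search, for illustration only).
grid = [m / 100 for m in range(0, 301)]
mu_hat = min(grid, key=lambda m: sum(hyvarinen_score(x, m) for x in data))
# For this model the minimiser coincides with the sample mean.
```

Because the score involves only log-density derivatives, the same estimating strategy applies when the normalising constant is unavailable, which is the motivation in the abstract above.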

We describe and contrast two distinct problem areas for statistical causality: studying the likely effects of an intervention ("effects of causes"), and studying whether there is a causal link between the observed exposure and outcome in an individual case ("causes of effects"). For each of these, we introduce and compare various formal frameworks...

Combinations of intense non-pharmaceutical interventions (lockdowns) were introduced worldwide to reduce SARS-CoV-2 transmission. Many governments have begun to implement exit strategies that relax restrictions while attempting to control the risk of a surge in cases. Mathematical modelling has played a central role in guiding interventions, but th...

Combinations of intense non-pharmaceutical interventions (lockdowns) were introduced in countries worldwide to reduce SARS-CoV-2 transmission. Many governments have begun to implement lockdown exit strategies that allow restrictions to be relaxed while attempting to control the risk of a surge in cases. Mathematical modelling has played a central r...

We develop a mathematical and interpretative foundation for the enterprise of decision-theoretic statistical causality (DT), which is a straightforward way of representing and addressing causal questions. DT reframes causal inference as "assisted decision-making", and aims to understand when, and how, I can make use of external data, typically obse...

Bradley (Theory Decis 85:5–20, 2018) develops some theory of the linear opinion pool, in apparent contradiction to results of Dawid et al. (Test 4:263–314, 1995). We investigate the sources of these contradictions, and in particular identify a mathematical error in Bradley (2018) that invalidates his main result.
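The linear opinion pool at issue in this exchange combines several experts' probabilities as a convex (weighted-average) combination. A minimal sketch of the pooling operation itself; the weights and quoted probabilities are illustrative:

```python
def linear_opinion_pool(probs, weights):
    """Pool experts' probabilities for an event as a convex combination."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * p for w, p in zip(weights, probs))

# Three experts quote 0.6, 0.8 and 0.5 for the same event.
pooled = linear_opinion_pool([0.6, 0.8, 0.5], [0.5, 0.3, 0.2])
# 0.5*0.6 + 0.3*0.8 + 0.2*0.5 = 0.64
```

The theoretical debate referenced above concerns which formal properties (e.g. preservation of independence, external Bayesianity) such pools can or cannot satisfy, not the pooling arithmetic itself.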

Suppose X and Y are binary exposure and outcome variables, and we have full knowledge of the distribution of Y, given application of X. From this we know the average causal effect of X on Y. We are now interested in assessing, for a case that was exposed and exhibited a positive outcome, whether it was the exposure that caused the outcome. The rele...

Likelihood-based estimation methods involve the normalising constant of the model distributions, expressed as a function of the parameter. However in many problems this function is not easily available, and then less efficient but more easily computed estimators may be attractive. In this work we study stationary time-series models, and construct a...

Invited Discussion : Bertrand Clarke - Meng Li - Peter Grunwald and Rianne de Heide Contributed Discussion : A. Philip Dawid - William Weimin Yoo - Robert L. Winkler, Victor Richmond R. Jose, Kenneth C. Lichtendahl Jr., and Yael Grushka-Cockayne - Kenichiro McAlinn, Knut Are Aastveit, and Mike West - Minsuk Shin - Tianjian Zhou - Lennart Hoogerheid...

We survey a variety of possible explications of the term "Individual Risk."
These in turn are based on a variety of interpretations of "Probability,"
including Classical, Enumerative, Frequency, Formal, Metaphysical, Personal,
Propensity, Chance and Logical conceptions of Probability, which we review and
compare. We distinguish between "groupist" a...

Many legal cases require decisions about causality, responsibility or blame, and these may be based on statistical data. However, causal inferences from such data are beset by subtle conceptual and practical difficulties, and in general it is, at best, possible to identify the "probability of causation" as lying between certain empirically informed...

An individual has been subjected to some exposure and has developed some outcome. Using data on similar individuals, we wish to evaluate, for this case, the probability that the outcome was in fact caused by the exposure. Even with the best possible experimental data on exposure and outcome, we typically cannot identify this "probability of causat...

We consider the problem of choosing between parametric models for a discrete observable, taking a Bayesian approach in which the within-model prior distributions are allowed to be improper. In order to avoid the ambiguity in the marginal likelihood function in such a case, we apply a homogeneous scoring rule. For the particular case of distinguishi...

In a prediction market, individuals can sequentially place bets on the outcome of a future event. This leaves a trail of personal probabilities for the event, each being conditional on the current individual's private background knowledge and on the previously announced probabilities of other individuals, which give partial information about their...

This article is a response to the position papers published in the Science & Justice virtual special issue on measuring and reporting the precision of forensic likelihood ratios. I point out a number of serious statistical errors in some of these papers. These issues need to be properly addressed before the philosophical debate can be conducted in...

Statistical causal inference from observational studies often requires adjustment for a possibly multi-dimensional variable, where dimension reduction is crucial. The propensity score, first introduced by Rosenbaum and Rubin, is a popular approach to such reduction. We address causal inference within Dawid’s decision-theoretic framework, where it i...

This letter comments on the report “Forensic science in criminal courts: Ensuring scientific validity of feature-comparison methods” recently released by the President’s Council of Advisors on Science and Technology (PCAST). The report advocates a procedure for evaluation of forensic evidence that is a two-stage procedure in which the first stage i...

Proper scoring rules are devices for encouraging honest assessment of probability distributions. Just like log-likelihood, which is a special case, a proper scoring rule can be applied to supply an unbiased estimating equation for any statistical model, and the theory of such equations can be applied to understand the properties of the associated e...

Given empirical evidence for the dependence of an outcome variable on an exposure variable, we can typically only provide bounds for the “probability of causation” in the case of an individual who has developed the outcome after being exposed. We show how these bounds can be adapted or improved if further information becomes available. In addition...
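In the simplest binary set-up, the bounds referred to here are simple functions of the experimental response probabilities. A minimal sketch, writing p1 = P(Y=1 | X set to 1) and p0 = P(Y=1 | X set to 0); these are the standard lower and upper bounds, and the paper's refinements using further information are not reproduced:

```python
def prob_causation_bounds(p1, p0):
    """Bounds on the probability of causation for an exposed case with a
    positive outcome, given experimental probabilities p1 and p0."""
    lower = max(0.0, (p1 - p0) / p1)
    upper = min(1.0, (1.0 - p0) / p1)
    return lower, upper

lo, hi = prob_causation_bounds(p1=0.6, p0=0.3)
# lower = (0.6 - 0.3)/0.6 = 0.5; upper = min(1, 0.7/0.6) = 1.0
```

Even perfect experimental data thus leaves an interval, here [0.5, 1.0], rather than a point value, which is the identification problem the abstract describes.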

The goal of this paper is to integrate the notions of stochastic conditional
independence and variation conditional independence under a more general notion
of extended conditional independence. We show that under appropriate
assumptions the calculus that applies for the two cases separately (axioms of a
separoid) still applies for the extended cas...

Increasing integration and availability of data on large groups of persons has been accompanied by proliferation of statistical and other algorithmic prediction tools in banking, insurance, marketing, medicine, and other fields (see e.g., Steyerberg (2009a;b)). Controversy may ensue when such tools are introduced to fields traditionally reliant on...

Deborah Mayo claims to have refuted Birnbaum's argument that the Likelihood
Principle is a logical consequence of the Sufficiency and Conditionality
Principles. However, this claim fails because her interpretation of the
Conditionality Principle is different from Birnbaum's. Birnbaum's proof cannot
be so readily dismissed. [arXiv:1302.7021]

Bayesian model selection with improper priors is not well-defined because of
the dependence of the marginal likelihood on the arbitrary scaling constants of
the within-model prior densities. We show how this problem can be evaded by
replacing marginal log likelihood by a homogeneous proper scoring rule, which
is insensitive to the scaling constants...

The aim of this paper is to compare numerically the performance of two estimators based on Hyvärinen's local homogeneous scoring rule with that of the full and the pairwise maximum likelihood estimators. In particular, two different model settings, for which both full and pairwise maximum likelihood estimators can be obtained, have been considere...

We present an overview of the decision-theoretic framework of statistical
causality, which is well-suited for formulating and solving problems of
determining the effects of applied causes. The approach is described in detail,
and is related to and contrasted with other current formulations, such as
structural equation models and potential responses...

This paper considers the problem of defining distributions over graphical structures. We propose an extension of the hyper Markov properties of Dawid and Lauritzen (1993), which we term structural Markov properties, for both undirected decomposable and directed acyclic graphs, which requires that the structure of distinct components of the graph be...

We welcome Professor Pearl’s comment on our original article, Dawid et al. Our focus there on the distinction between the “Effects of Causes” (EoC) and the “Causes of Effects” (CoE) concerned two fundamental problems, one a theoretical challenge in statistics and the other a practical challenge for trial courts. In this response, we seek to accompl...

A scoring rule S(x; q) provides a way of judging the quality of a quoted probability density q for a random variable X in the light of its outcome x. It is called proper if honesty is your best policy, i.e., when you believe X has density p, your expected score is optimised by the choice q = p. The most celebrated proper scoring rule is the logarithm...
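Propriety is easy to exhibit numerically for the quadratic (Brier) score on a binary event. This small check (the true probability and grid are illustrative) confirms that the expected score is minimised by quoting the true probability:

```python
def brier_score(q, outcome):
    """Quadratic (Brier) score for quoted probability q of a binary event;
    smaller is better."""
    return (outcome - q) ** 2

def expected_score(p, q):
    # Expected Brier score when the event occurs with true probability p.
    return p * brier_score(q, 1) + (1 - p) * brier_score(q, 0)

p_true = 0.7
grid = [q / 100 for q in range(101)]
best_q = min(grid, key=lambda q: expected_score(p_true, q))
# propriety: the optimal quote equals the true probability, 0.7
```

The same check run with an improper rule (e.g. the linear score |outcome - q|) would instead reward quoting 0 or 1, which is exactly the dishonesty that propriety rules out.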

Science is largely concerned with understanding the "effects of causes"
(EoC), while Law is more concerned with understanding the "causes of effects"
(CoE). While EoC can be addressed using experimental design and statistical
analysis, it is less clear how to incorporate statistical or epidemiological
evidence into CoE reasoning, as might be requir...

We define mechanistic interaction between the effects of two variables on an outcome in terms of departure of these effects
from a generalized noisy-OR model in a stratum of the population. We develop a fully probabilistic framework for the observational
identification of this type of interaction via excess risk or superadditivity, one novel featur...

In many applications of highly structured statistical models the likelihood function is intractable; in particular, finding the normalisation constant of the distribution can be demanding. One way to sidestep this problem is to adopt composite likelihood methods, such as the pseudo-likelihood approach. In this paper we display composite like...

Law and science share many perspectives, but they also differ in important ways. While much of science is concerned with the effects of causes (EoC), relying upon evidence accumulated from randomized controlled experiments and observational studies, the problem of inferring the causes of effects (CoE) requires its own framing and possibly different...

Taking a rigorous formal approach, we consider sequential decision problems
involving observable variables, unobservable variables, and action variables.
We can typically assume the property of extended stability, which allows
identification (by means of G-computation) of the consequence of a specified
treatment strategy if the unobserved variables...

Methods for performing complex probabilistic reasoning tasks, often based on masses of different forms of evidence obtained from a variety of different sources, are being sought by, and developed for, persons in many important contexts including law, medical diagnosis, and intelligence analysis. The complexity of these tasks can often be captured a...

Evidence - its nature and interpretation - is the key to many topical debates and concerns such as global warming, evolution, the search for weapons of mass destruction, DNA profiling, and evidence-based medicine. In 2004, University College London launched a cross-disciplinary research programme 'Evidence, Inference and Enquiry' to explore the que...

We display pseudo-likelihood as a special case of a general estimation technique based on proper scoring rules. Such a rule supplies an unbiased estimating equation for any statistical model, and this can be extended to allow for missing data. When the scoring rule has a simple local structure, as in many spatial models, the need to compute problem...

Beauty challenges conventional approaches to the subject through an interdisciplinary approach that forges connections between the arts, sciences and mathematics. Classical, conventional aspects of beauty are addressed in subtle, unexpected ways: symmetry in mathematics, attraction in the animal world and beauty in the cosmos. This collection arise...

Given two variables that causally influence a binary response, we formalize the idea that their effects operate through a common mechanism, in which case we say that the two variables interact mechanistically. We introduce a mechanistic interaction relationship of "interference" that is asymmetric in the two causal factors. Conditions and assumptio...

In this paper we review the notion of direct causal effect as introduced by Pearl (2001). We show how it can be formulated without counterfactuals, using intervention indicators instead. This allows us to consider the natural direct effect as a special case of sequential treatments discussed by Dawid and Didelez (2005), which immediately yields conditi...

Introduction · Decision theory and causality · No confounding · Confounding · Propensity analysis · Instrumental variable · Effect of treatment of the treated · Connections and contrasts · Postscript · Acknowledgements · References

Introduction · What is a mechanism? · Statistical versus mechanistic interaction · Illustrative example · Mechanistic interaction defined · Epistasis · Excess risk and superadditivity · Conditions under which excess risk and superadditivity indicate the presence of mechanistic interaction · Collapsibility · Back to the illustrative study · Alternative approaches · Discussion · Ethics...

We consider conditions that allow us to find an optimal strategy for
sequential decisions from a given data situation. For the case where all
interventions are unconditional (atomic), identifiability has been discussed by
Pearl & Robins (1995). We argue here that an optimal strategy must be
conditional, i.e. take the information available at each d...

Prentice & Pyke (1979) established that the maximum likelihood estimate of an
odds-ratio in a case-control study is the same as would be found by running a
logistic regression: in other words, for this specific target the incorrect
prospective model is inferentially equivalent to the correct retrospective
model. Similar results have been obtained f...

At first sight, there may appear to be little connection between Statistics and Law. On closer inspection it can be seen that the problems they tackle are in many ways identical, although they go about them in very different ways. In a broad sense, each subject can be regarded as concerned with the Interpretation of Evidence. I owe my own introducti...

We extend Pearl's criticisms of principal stratification analysis as a method for interpreting and adjusting for intermediate variables in a causal analysis. We argue that this can be meaningful only in those rare cases that involve strong functional dependence, and even then may not be appropriate.

Causality: Statistical Perspectives and Applications presents a wide-ranging collection of seminal contributions by renowned experts in the field, providing a thorough treatment of all aspects of statistical causality. It covers the various formalisms in current use, methods for applying them to specific problems, and the special requirements of a...

This chapter considers the problem of Bayesian inference about the statistical model from which the data arose. It examines the asymptotic dependence of posterior model probabilities on the prior specifications and the data and proves that such problems of model choice exhibit more sensitivity to the prior than is the case for standard parametric i...

This chapter gives a broad-brush account of some of the principal approaches to statistical model selection. Under the assumption that random variations are independent with constant variance from period to period and across patients, the variance of each estimator is proportional to twice the sum of the squares of the associated weights. It is sho...

This concluding chapter summarizes the main themes of the book and attempts to bring together some of the lessons that have been learnt. It considers what models are, what they are useful for, appropriate levels of model complexity, and the treatment of uncertainty. It is clear that modelling plays an important role in science and its application t...

This introductory chapter provides some brief remarks about the book, Simplicity, Complexity and Modelling, what its purpose is, how it relates to the Simplicity Complexity and Modelling (SCAM) project and also more widely about what the purpose of modelling is and what various traditions in modelling there are. Different sciences have developed th...

Consider an American option that pays G(X^*_t) when exercised at time t,
where G is a positive increasing function, X^*_t := \sup_{s\le t}X_s, and X_s
is the price of the underlying security at time s. Assuming zero interest
rates, we show that the seller of this option can hedge his position by trading
in the underlying security if he begins with...

A scoring rule is a loss function measuring the quality of a quoted
probability distribution $Q$ for a random variable $X$, in the light of the
realized outcome $x$ of $X$; it is proper if the expected score, under any
distribution $P$ for $X$, is minimized by quoting $Q=P$. Using the fact that
any differentiable proper scoring rule on a finite sam...

The effect of treatment on the treated (ETT) is a causal effect commonly used in the econometric literature. The ETT is typically of interest when evaluating the effect of schemes that require voluntary participation from eligible members of the population—those who participate are regarded as the treated. We show, by means of examples, that it ca...

Mutation models are important in many areas of genetics including forensics. This letter criticizes the model of the paper 'DNA identification by pedigree likelihood ratio accommodating population substructure and mutations' by Ge et al. (2010). Furthermore, we argue that the paper in some cases misrepresents previously published papers.
Please see...

The term “ancillary statistic” was introduced by R. A. Fisher (Fisher 1925) in the context of maximum likelihood estimation.
Fisher regarded the likelihood function as embodying all the information that the data had to supply about the unknown parameter.
At a purely abstract level, this might be regarded as simply an application of the sufficiency...

We investigate proper scoring rules for continuous distributions on the real
line. It is known that the log score is the only such rule that depends on the
quoted density only through its value at the outcome that materializes. Here we
allow further dependence on a finite number $m$ of derivatives of the density
at the outcome, and describe a large...

The judgment of the Court of Appeal in R v T [1] raises several issues relating to the evaluation of scientific evidence that, we believe, require a response. We, the undersigned, oppose any response to the judgment that would result in a movement away from the use of logical methods for evidence evaluation. A paper in this issue of the Journal [2]...

Our aim is to detect mechanistic interaction between the effects of two causal factors on a binary response, as an aid to identifying situations where the effects are mediated by a common mechanism. We propose a formalization of mechanistic interaction which acknowledges asymmetries of the kind "factor A interferes with factor B, but not vice versa"...

We consider the game-theoretic scenario of testing the performance of Forecaster by Sceptic who gambles against the forecasts. Sceptic's current capital is interpreted as the amount of evidence he has found against Forecaster. Reporting the maximum of Sceptic's capital so far exaggerates the evidence. We characterize the set of all increasing funct...

Working within the decision-theoretic framework for causal inference, we study the properties of "sufficient covariates", which support causal inference from observational data, and possibilities for their reduction. In particular we illustrate the rôle of a propensity variable by means of a simple model, and explain why such a reduction typically...

We are interested in the following version of Jeffreys's law: if two predictors are predicting the same sequence of events and either is doing a satisfactory job, they will make similar predictions in the long run. We give a classification of instances of Jeffreys's law, illustrated with examples.

Meester & Sjerps (2003) (henceforth M & S) draw attention to an ambiguity in the definition of 'the likelihood ratio' that can arise with forensic multiple transfer evidence. Here I argue that this is a mathematical artifact and a logical red herring: in particular, for their data and assumptions only the likelihood ratio associated with their 'fir...

We present a statistical methodology for making inferences about mutation rates from paternity casework. This takes account of a number of sources of potential bias, including hidden mutation, incomplete family triplets, uncertain paternity status and differing maternal and paternal mutation rates, while allowing a wide variety of mutation models....

Legal applications of probabilistic and statistical reasoning have a long history, having exercised pioneers such as Nicolas Bernoulli, Condorcet, Laplace, Poisson, and Cournot (Zabell, 1988). After a period of neglect, interest has resurfaced in recent years, and the topic has given rise to many challenging problems. Evidence presented in a case a...

We reconsider two graphical aids to handling complex mixed masses of evidence in a legal case: Wigmore charts and Bayesian networks. Our aim is to forge a synthesis of their best features and to develop this further to overcome remaining limitations. One important consideration is the multilayered nature of a complex case, which can involve direct...