
Abstract

The ability to calculate precise likelihood ratios is fundamental to many STEM areas, such as decision-making theory, biomedical science, and engineering. However, there is no assumption-free statistical methodology to achieve this. For instance, in the absence of data relating to covariate overlap, the widely used Bayes' theorem either defaults to the marginal probability driven "naive Bayes' classifier", or requires the use of compensatory expectation-maximization techniques. Equally, the use of alternative statistical approaches, such as multivariate logistic regression, may be confounded by other axiomatic conditions, e.g., low levels of co-linearity. This article takes an information-theoretic approach in developing a new statistical formula for the calculation of likelihood ratios based on the principles of quantum entanglement. In doing so, it is argued that this quantum approach demonstrates: that the likelihood ratio is a real quality of statistical systems; that the naive Bayes' classifier is a special case of a more general quantum mechanical expression; and that only a quantum mechanical approach can overcome the axiomatic limitations of classical statistics.
A quantum framework for likelihood ratios
Rachael L. Bond
School of Psychology, University of Sussex,
Falmer, East Sussex, BN1 9QH, UK
hello@rachaelbond.com
Yang-Hui He
Department of Mathematics, City,
University of London, EC1V 0HB, UK
Merton College, University of Oxford, OX1 4JD, UK
School of Physics, NanKai University, Tianjin 300071, China
hey@maths.ox.ac.uk
Thomas C. Ormerod
School of Psychology, University of Sussex,
Falmer, East Sussex, BN1 9QH, UK
t.ormerod@sussex.ac.uk
Received 6 March 2017
Accepted 15 November 2017
Published 12 December 2017
The ability to calculate precise likelihood ratios is fundamental to science, from Quantum Information Theory through to Quantum State Estimation. However, there is no assumption-free statistical methodology to achieve this. For instance, in the absence of data relating to covariate overlap, the widely used Bayes' theorem either defaults to the marginal probability driven "naive Bayes' classifier", or requires the use of compensatory expectation-maximization techniques. This paper takes an information-theoretic approach in developing a new statistical formula for the calculation of likelihood ratios based on the principles of quantum entanglement, and demonstrates that Bayes' theorem is a special case of a more general quantum mechanical expression.
Keywords: Bayes' theorem; probability; statistics; inference; decision-making.
International Journal of Quantum Information, Vol. 16, No. 1 (2018) 1850002 (14 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0219749918500028
1. Introduction
In recent years, Bayesian statistical research has often been epistemologically driven, guided by de Finetti's famous quote that "probability does not exist" [1]. For example, the "quantum Bayesian" methodology of Caves, Fuchs and Schack has applied de Finetti's ideas to Bayes' theorem for use in quantum mechanics [2]. In doing so, Caves et al. have argued that statistical systems are best interpreted by methods in which the Bayesian likelihood ratio is seen to be both external to the system and subjectively imposed on it by the observer [3]. However, the Caves et al. approach is problematic. At a human scale, for instance, an observer's belief as to the chances of a fair coin landing either "heads" or "tails" has no known effect. Indeed, for all practical purposes, the "heads:tails" likelihood ratio of 0.5:0.5 is only meaningful when considered as a property of the coin's own internal statistical system rather than as some ephemeral and arbitrary qualia.
Yet, to date, the axiomatic difficulties associated with Bayes' theorem, notably its reliance upon the use of marginal probabilities in the absence of structural statistical information (e.g. estimates of covariate overlap), as well as the assumed conditional independence of data, have largely been approached from a "mend and make do" standpoint. For instance, the "maximum likelihood" approach of Dempster, Laird and Rubin calculates iteratively derived measures of covariate overlap which not only lack a sense of natural authenticity, but also introduce fundamental assumptions into the statistical analysis [4].
Instead, this paper adopts a different approach to the analysis of statistical systems. By using quantum mechanical mathematical spaces, it is demonstrated that the creation of isomorphic representations of classical data-sets as entangled systems allows for a natural, albeit nontrivial, calculation of likelihood ratios. It is expected that this technique will find applications within the fields of quantum state estimation, and quantum information theory.
2. The Limits of Bayes' Theorem
Bayes' theorem is used to calculate the conditional probability of a statement, or
hypothesis, being true given that other information is also true. It is usually written as
$$P(H_i \mid D) = \frac{P(H_i)\,P(D \mid H_i)}{\sum_j P(H_j)\,P(D \mid H_j)}. \tag{1}$$
Here, $P(H_i \mid D)$ is the conditional probability of hypothesis $H_i$ being true given that the information $D$ is true; $P(D \mid H_i)$ is the conditional probability of $D$ being true if $H_i$ is true; and $\sum_j P(H_j)\,P(D \mid H_j)$ is the sum of the probabilities of all hypotheses multiplied by the conditional probability of $D$ being true for each hypothesis [5].
                                    Particle α ($H_1$)    Particle β ($H_2$)
Number of particles ($n$)           10                    10
Proportion spin ↑ ($D$)             0.8                   0.7
Proportion spin ↓ ($\bar{D}$)       0.2                   0.3
                                                                          (2)
To exemplify using the contingency information in (2), if one wishes to calculate the nature of a randomly selected particle from a set of 20, given that it has spin ↑, then using Bayes' theorem it is trivial to calculate that particle α is the most likely type with a likelihood ratio of approximately 0.53:0.47,
$$\begin{aligned} P(H_1 \mid D) &= \frac{0.5 \times 0.8}{(0.5 \times 0.8) + (0.5 \times 0.7)} = \frac{8}{15} \approx 0.533, \\ P(H_2 \mid D) &= 1 - P(H_1 \mid D) = \frac{7}{15} \approx 0.467, \end{aligned} \tag{3}$$
where $P(H_i) = 10/(10 + 10) = 0.5$ for both $i = 1, 2$.
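The arithmetic in (3) can be reproduced with a few lines of Python. This is only a minimal sketch of Eq. (1) applied to table (2); the function name `bayes_posterior` is an illustrative choice and does not come from the paper.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(H_i | D) for every hypothesis, following Eq. (1)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # denominator of Eq. (1)
    return [j / evidence for j in joint]

# Table (2): two populations of 10 particles, spin-up proportions 0.8 and 0.7.
counts = [10, 10]
priors = [n / sum(counts) for n in counts]       # P(H_1) = P(H_2) = 0.5
posterior = bayes_posterior(priors, [0.8, 0.7])
print([round(p, 3) for p in posterior])          # [0.533, 0.467]
```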
However, difficulties arise in the use of Bayes' theorem for the calculation of likelihood ratios where there are multiple nonexclusive data sets. For instance, if the information in (2) is expanded to include data about particle charge (4) then the precise covariate overlap (i.e. $D_1 \cap D_2$) for each particle becomes an unknown.
                                    Particle α ($H_1$)    Particle β ($H_2$)
Number of particles ($n$)           10                    10
Proportion spin ↑ ($D_1$)           0.8                   0.7
Proportion charge + ($D_2$)         0.6                   0.5
                                                                          (4)
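All that may be shown is that, for each particle, the occurrence of both features forms a range described by (5), where $n(H_i)$ is the total number of exemplars $i$, $n(D_1 \mid H_i)$ is the total number of $i$ with spin ↑, and $n(D_2 \mid H_i)$ is the total number of $i$ with a positive charge,
$$n(D_1 \cap D_2 \mid H_i) \in \begin{cases} [\,n(D_1 \mid H_i) + n(D_2 \mid H_i) - n(H_i), \ldots, \min(n(D_1 \mid H_i), n(D_2 \mid H_i))\,] & \text{if } n(D_1 \mid H_i) + n(D_2 \mid H_i) > n(H_i), \text{ or} \\ [\,0, \ldots, \min(n(D_1 \mid H_i), n(D_2 \mid H_i))\,] & \text{if } n(D_1 \mid H_i) + n(D_2 \mid H_i) \le n(H_i). \end{cases} \tag{5}$$
Specifically for (4) these ranges equate to
$$\begin{aligned} n(D_1 \cap D_2 \mid H_1) &\in \{4, 5, 6\}, \\ n(D_1 \cap D_2 \mid H_2) &\in \{2, 3, 4, 5\}. \end{aligned} \tag{6}$$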
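The two cases of (5) collapse into a single lower bound of max(0, n(D1|Hi) + n(D2|Hi) - n(Hi)), which makes the ranges in (6) easy to check. The sketch below assumes integer counts; `overlap_range` is a hypothetical helper name.

```python
def overlap_range(n_total, n_d1, n_d2):
    """Possible values of n(D1 ∩ D2 | H_i) given only the marginal counts, Eq. (5)."""
    lower = max(0, n_d1 + n_d2 - n_total)  # forced overlap once the counts exceed n(H_i)
    upper = min(n_d1, n_d2)                # the overlap cannot exceed the smaller count
    return list(range(lower, upper + 1))

# Table (4): 10 particles of each type; counts are proportion x 10.
alpha_range = overlap_range(10, 8, 6)
beta_range = overlap_range(10, 7, 5)
print(alpha_range, beta_range)             # [4, 5, 6] [2, 3, 4, 5]
```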
The simplest approach to resolving this problem is to naively ignore any intersection, or co-dependence, of the data and to directly multiply the marginal probabilities. Hence, given (4), the likelihood of particle α having the greatest occurrence of both spin ↑ and a positive charge would be calculated as
$$P(H_1 \mid D_1 \cap D_2) = \frac{0.5 \times 0.8 \times 0.6}{(0.5 \times 0.8 \times 0.6) + (0.5 \times 0.7 \times 0.5)} \approx 0.578. \tag{7}$$
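Yet, because the data intersect, this probability value is only one of a number which may be reasonably calculated. Alternatives include calculating a likelihood ratio using the mean value of the frequency ranges for each hypothesis
$$\begin{aligned} P(\mu[n(D_1 \cap D_2 \mid H_1)]) &= \frac{1}{10} \times \frac{1}{3}(4 + 5 + 6) = 0.5, \\ P(\mu[n(D_1 \cap D_2 \mid H_2)]) &= \frac{1}{10} \times \frac{1}{4}(2 + 3 + 4 + 5) = 0.35 \\ \Rightarrow P(H_1 \mid D_1 \cap D_2) &\approx 0.588, \end{aligned} \tag{8}$$
and taking the mean value of the probability range derived from the frequency range
$$\begin{aligned} \min P(H_1 \mid D_1 \cap D_2) &= \frac{4}{4 + 5}, \\ \max P(H_1 \mid D_1 \cap D_2) &= \frac{6}{6 + 2} \\ \Rightarrow \mu[P(H_1 \mid D_1 \cap D_2)] &\approx 0.597. \end{aligned} \tag{9}$$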
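The three competing values in (7), (8) and (9) can be placed side by side with a short script. This is a sketch that hard-codes the ranges from (6) and the equal priors of table (4); the variable names are illustrative only.

```python
from statistics import mean

n_total = 10
range_h1 = [4, 5, 6]       # n(D1 ∩ D2 | H1), from Eq. (6)
range_h2 = [2, 3, 4, 5]    # n(D1 ∩ D2 | H2), from Eq. (6)

# Eq. (7): multiply the marginals as if the data were independent.
naive = (0.5 * 0.8 * 0.6) / (0.5 * 0.8 * 0.6 + 0.5 * 0.7 * 0.5)

# Eq. (8): use the mean of each frequency range.
p1 = mean(range_h1) / n_total
p2 = mean(range_h2) / n_total
mean_frequency = p1 / (p1 + p2)

# Eq. (9): average the extreme probabilities allowed by the ranges.
lowest = min(range_h1) / (min(range_h1) + max(range_h2))
highest = max(range_h1) / (max(range_h1) + min(range_h2))
mean_probability = mean([lowest, highest])

print(round(naive, 3), round(mean_frequency, 3), round(mean_probability, 3))
# 0.578 0.588 0.597
```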
Given this multiplicity of probability values, it would seem that none of these methods may lay claim to normativity. This problem of covariate overlap has, of course, been previously addressed within statistical literature. For instance, the "maximum likelihood" approach of Dempster, Laird and Rubin has demonstrated how an "expectation-maximization" algorithm may be used to derive appropriate covariate overlap measures [4]. Indeed, the mathematical efficacy of this technique has been confirmed by Wu [6]. However, it is difficult to see how such an iterative methodology can be employed without introducing axiomatic assumptions. Further, since any assumptions, irrespective of how benign they may appear, have the potential to skew results, what is required is an approach in which covariate overlaps can be automatically, and directly, calculated from contingency data.
3. A Quantum Mechanical Proof of Bayes' Theorem for Independent Data
Previously unconsidered, the quantum mechanical von Neumann axioms would seem to offer the most promise in this regard, since the re-conceptualization of covariate data as a quantum entangled system allows for statistical analysis with few, nonarbitrary assumptions. Unfortunately, there are many conceptual difficulties that can arise here. For instance, a Dirac notation representation of (4) as a standard quantum superposition is
$$|\Psi\rangle = \frac{1}{\sqrt{N}}\left[\alpha\left(\sqrt{\tfrac{1}{3}}\,|4\rangle_{H_1} + \sqrt{\tfrac{1}{3}}\,|5\rangle_{H_1} + \sqrt{\tfrac{1}{3}}\,|6\rangle_{H_1}\right) + \beta\left(\sqrt{\tfrac{1}{4}}\,|2\rangle_{H_2} + \sqrt{\tfrac{1}{4}}\,|3\rangle_{H_2} + \sqrt{\tfrac{1}{4}}\,|4\rangle_{H_2} + \sqrt{\tfrac{1}{4}}\,|5\rangle_{H_2}\right)\right]. \tag{10}$$
In this example, (10) cannot be solved since the possible values of $D_1 \cap D_2$ for each hypothesis (6) have been described as equal chance outcomes within a general superposition of $H_1$ and $H_2$, with the unknown coefficients $\alpha$ and $\beta$ assuming the role of the classical Bayesian likelihood ratio.
The development of an alternative quantum mechanical description necessitates a return to the simplest form of Bayes' theorem using the case of exclusive populations $H_i$ and data sets $D$, $\bar{D}$, such as given in (2). Here, the overall probability of $H_1$ may be simply calculated as
$$P(H_1) = \frac{n(H_1)}{n(H_1) + n(H_2)}. \tag{11}$$
The a priori uncertainty in (2) may be expressed by constructing a wave function in which the four data points are encoded as a linear superposition
$$|\Psi\rangle = \alpha_{1,1}\,|H_1 D\rangle + \alpha_{1,2}\,|H_1 \bar{D}\rangle + \alpha_{2,1}\,|H_2 D\rangle + \alpha_{2,2}\,|H_2 \bar{D}\rangle. \tag{12}$$
Since there is no overlap between either $D$ and $\bar{D}$ or the populations $H_1$ and $H_2$, each datum automatically forms an eigenstate basis with the orthonormal conditions
$$\begin{aligned} \langle H_1 D \mid H_1 D\rangle &= \langle H_1 \bar{D} \mid H_1 \bar{D}\rangle = 1, \\ \langle H_2 D \mid H_2 D\rangle &= \langle H_2 \bar{D} \mid H_2 \bar{D}\rangle = 1, \\ \text{all other brakets} &= 0, \end{aligned} \tag{13}$$
where the normalization of the wave function demands that
$$\langle \Psi \mid \Psi \rangle = 1, \tag{14}$$
so that the sum of the modulus squares of the coefficients $\alpha_{i,j}$ gives a total probability of 1
$$|\alpha_{1,1}|^2 + |\alpha_{1,2}|^2 + |\alpha_{2,1}|^2 + |\alpha_{2,2}|^2 = 1. \tag{15}$$
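For simplicity, let
$$\begin{aligned} x_1 &= P(D \mid H_1), & y_1 &= P(\bar{D} \mid H_1), \\ x_2 &= P(D \mid H_2), & y_2 &= P(\bar{D} \mid H_2), \\ X_1 &= P(H_1), & X_2 &= P(H_2). \end{aligned} \tag{16}$$
If the coefficients $\alpha_{i,j}$ from (12) are set as required by (2), it follows that
$$|\alpha_{1,1}|^2 = x_1, \quad |\alpha_{1,2}|^2 = y_1, \quad |\alpha_{2,1}|^2 = x_2, \quad |\alpha_{2,2}|^2 = y_2, \tag{17}$$
so that the normalized wave function $|\Psi\rangle$ is described by
$$|\Psi\rangle = \frac{1}{\sqrt{N}}\left(\sqrt{x_1}\,|H_1 D\rangle + \sqrt{y_1}\,|H_1 \bar{D}\rangle + \sqrt{x_2}\,|H_2 D\rangle + \sqrt{y_2}\,|H_2 \bar{D}\rangle\right), \tag{18}$$
for some normalization constant $N$.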
The orthonormality condition (14) implies that
$$N = x_1 + y_1 + x_2 + y_2 = X_1 + X_2, \tag{19}$$
thereby giving the full wave function description
$$|\Psi\rangle = \frac{\sqrt{x_1}\,|H_1 D\rangle + \sqrt{y_1}\,|H_1 \bar{D}\rangle + \sqrt{x_2}\,|H_2 D\rangle + \sqrt{y_2}\,|H_2 \bar{D}\rangle}{\sqrt{X_1 + X_2}}. \tag{20}$$
If the value of $P(H_1 \mid D)$ is to be calculated, i.e. the property $D$ is observed, then the normalized wave function (12) necessarily collapses to
$$|\Psi'\rangle = \alpha_1\,|H_1 D\rangle + \alpha_2\,|H_2 D\rangle, \tag{21}$$
where the coefficients $\alpha_1, \alpha_2$ may be determined by projecting $|\Psi\rangle$ on to the two terms in $|\Psi'\rangle$ using (13), giving
$$\begin{aligned} \alpha_1 &= \langle \Psi' \mid H_1 D \rangle = \sqrt{\frac{x_1}{X_1 + X_2}}, \\ \alpha_2 &= \langle \Psi' \mid H_2 D \rangle = \sqrt{\frac{x_2}{X_1 + X_2}}. \end{aligned} \tag{22}$$
Normalizing (21) with the coefficient $N'$
$$|\Psi'\rangle = \frac{1}{\sqrt{N'}}\left(\sqrt{\frac{x_1}{X_1 + X_2}}\,|H_1 D\rangle + \sqrt{\frac{x_2}{X_1 + X_2}}\,|H_2 D\rangle\right), \tag{23}$$
and using the normalization condition (14), implies that
$$1 = \langle \Psi' \mid \Psi' \rangle = \frac{1}{N'}\left(\frac{x_1}{X_1 + X_2} + \frac{x_2}{X_1 + X_2}\right) \;\rightarrow\; N' = \frac{x_1 + x_2}{X_1 + X_2}. \tag{24}$$
Thus, after collapse, the properly normalized wave function (23) becomes
$$|\Psi'\rangle = \sqrt{\frac{x_1}{x_1 + x_2}}\,|H_1 D\rangle + \sqrt{\frac{x_2}{x_1 + x_2}}\,|H_2 D\rangle, \tag{25}$$
which means that the probability of observing $|H_1 D\rangle$ is
$$P(|H_1 D\rangle) = \left(\sqrt{\frac{x_1}{x_1 + x_2}}\right)^2 = \frac{\alpha_1^2}{\alpha_1^2 + \alpha_2^2} = \frac{x_1}{x_1 + x_2}. \tag{26}$$
This is entirely consistent with Bayes' theorem and demonstrates its derivation using
quantum mechanical axioms.
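The sequence (18) to (26) amounts to building four amplitudes, discarding the components that are inconsistent with the observation, and renormalizing. The following sketch traces those steps numerically for table (2); the function name and the dictionary keys are assumptions made for illustration, and equal population sizes are assumed as in (2).

```python
import math

def collapse_on_observed_D(x1, y1, x2, y2):
    """Collapse the state of Eq. (18) onto the D components and renormalize."""
    norm = math.sqrt(x1 + y1 + x2 + y2)              # sqrt(N), Eq. (19)
    amplitudes = {
        ("H1", "D"): math.sqrt(x1) / norm,
        ("H1", "not D"): math.sqrt(y1) / norm,
        ("H2", "D"): math.sqrt(x2) / norm,
        ("H2", "not D"): math.sqrt(y2) / norm,
    }
    # Observing D keeps only the |H1 D> and |H2 D> terms, Eqs. (21)-(22).
    kept = {key: a for key, a in amplitudes.items() if key[1] == "D"}
    n_prime = sum(a * a for a in kept.values())      # N', Eq. (24)
    # Squared, renormalized amplitudes give the probabilities of Eq. (26).
    return {key[0]: (a * a) / n_prime for key, a in kept.items()}

probs = collapse_on_observed_D(0.8, 0.2, 0.7, 0.3)
print(round(probs["H1"], 3), round(probs["H2"], 3))  # 0.533 0.467
```

The output agrees with the classical result in (3), as (26) requires.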
4. Quantum Likelihood Ratios for Co-Dependent Data
Having established the principle of using a quantum mechanical approach for the
calculation of simple likelihood ratios with mutually exclusive data (2), it is now
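possible to consider the general case of $n$ hypotheses and $m$ data (27), where the data are co-dependent, or intersect.
$$\begin{array}{c|cccc} & H_1 & H_2 & \cdots & H_n \\ \hline D_1 & x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\ D_2 & x_{2,1} & x_{2,2} & \cdots & x_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ D_m & x_{m,1} & x_{m,2} & \cdots & x_{m,n} \end{array} \tag{27}$$
Here, the contingency table in (27) has been indexed using
$$x_{i,\alpha}; \quad \alpha = 1, 2, \ldots, n; \quad i = 1, 2, \ldots, m. \tag{28}$$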
While the general wave function remains the same as before, the overlapping data create nonorthonormal inner products which can be naturally defined as
$$\langle H_\alpha D_i \mid H_\beta D_j \rangle = c^{\alpha}_{ij}\,\delta^{\alpha\beta}; \quad c^{\alpha}_{ij} = c^{\alpha}_{ji} \in \mathbb{R}, \quad c^{\alpha}_{ii} = 1. \tag{29}$$
Assuming, for simplicity, that the overlaps $c^{\alpha}_{ij}$ are real, then there is a symmetry in that $c^{\alpha}_{ij} = c^{\alpha}_{ji}$ for each $\alpha$. Further, for each $\alpha$ and $i$, the state is normalized, i.e. $c^{\alpha}_{ii} = 1$. The given independence of the hypotheses $H_\alpha$ also enforces the Kronecker delta function, $\delta^{\alpha\beta}$.
The Hilbert space $V$ spanned by the kets $|H_\alpha D_i\rangle$ is $mn$-dimensional and, because of the independence of $H_\alpha$, naturally decomposes into the direct sum (30) with respect to the inner product, thereby demonstrating that the nonorthonormal conditions are the direct sum of $n$ vector spaces $V^{\alpha}$, each of dimension $m$:
$$V = \mathrm{Span}(\{|H_\alpha D_i\rangle\}) = \bigoplus_{\alpha=1}^{n} V^{\alpha}; \quad \dim V^{\alpha} = m. \tag{30}$$
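Since the inner products are nonorthonormal, each $V^{\alpha}$ must be individually orthonormalised. Given that $V$ splits into a direct sum, this may be achieved for each subspace $V^{\alpha}$ by applying the Gram-Schmidt algorithm to $\{|H_\alpha D_i\rangle\}$ of $V$. Consequently, the orthonormal basis may be defined as
$$|K^{\alpha}_i\rangle = \sum_{k=1}^{m} A^{\alpha}_{i,k}\,|H_\alpha D_k\rangle; \quad \langle K^{\alpha}_i \mid K^{\alpha}_j \rangle = \delta_{ij}, \tag{31}$$
for each $\alpha = 1, 2, \ldots, n$ with $m \times m$ matrices $A^{\alpha}_{i,k}$, for each $\alpha$.
Substituting the inner products (29) gives
$$\sum_{k,k'=1}^{m} A^{\alpha}_{ik} A^{\alpha}_{jk'} c^{\alpha}_{kk'} = \delta_{ij} \quad \forall\, \alpha = 1, 2, \ldots, n. \tag{32}$$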
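Condition (32) says that the coefficient matrix A must turn the overlap matrix c into the identity. A minimal pure-Python sketch of the Gram-Schmidt step for a single hypothesis is given below, working directly on the overlap matrix rather than on explicit vectors. The function names and the example overlap value 0.3 are hypothetical and chosen only for illustration.

```python
import math

def gram_schmidt_coefficients(C):
    """Return A with A C A^T = I for a symmetric, positive-definite overlap matrix C.

    Row i of A holds the coefficients of the orthonormal ket |K_i> in terms of
    the original, non-orthogonal kets |H D_k>, as in Eq. (31).
    """
    m = len(C)
    A = []
    for i in range(m):
        # Start from the i-th original ket, written in the original basis.
        row = [1.0 if k == i else 0.0 for k in range(m)]
        for prev in A:
            # Overlap <K_prev | H D_i>, evaluated through the overlap matrix.
            overlap = sum(prev[k] * C[k][i] for k in range(m))
            row = [r - overlap * p for r, p in zip(row, prev)]
        norm_sq = sum(row[k] * C[k][l] * row[l] for k in range(m) for l in range(m))
        A.append([r / math.sqrt(norm_sq) for r in row])
    return A

def gram_of_new_basis(A, C):
    """Compute A C A^T, which Eq. (32) requires to be the identity."""
    m = len(C)
    return [[sum(A[i][k] * C[k][l] * A[j][l] for k in range(m) for l in range(m))
             for j in range(m)] for i in range(m)]

C = [[1.0, 0.3], [0.3, 1.0]]   # hypothetical overlap c_12 = 0.3
A = gram_schmidt_coefficients(C)
G = gram_of_new_basis(A, C)
print([[round(v, 12) for v in row] for row in G])
```

For a 2 x 2 overlap matrix the procedure gives the rows (1, 0) and (-c, 1)/sqrt(1 - c^2), so the printed matrix should be the identity up to rounding.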
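The wave function may now be written as a linear combination of the orthonormalized kets $|K^{\alpha}_i\rangle$ with the coefficients $b^{\alpha}_i$, and may be expanded into the $|H_\alpha D_i\rangle$ basis using (31), i.e.
$$|\Psi\rangle = \sum_{\alpha,i} b^{\alpha}_i\,|K^{\alpha}_i\rangle = \sum_{\alpha,i,k} b^{\alpha}_i A^{\alpha}_{ik}\,|H_\alpha D_k\rangle. \tag{33}$$
As with (17) from earlier, the coefficients in (33) should be set as required by the contingency table
$$\sum_{i} b^{\alpha}_i A^{\alpha}_{i,k} = \sqrt{x_{k,\alpha}}, \tag{34}$$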
where, to solve for the $b$-coefficients, (32) may be used to invert
$$\sum_{k,k'} \sum_{i} b^{\alpha}_i A^{\alpha}_{ik} A^{\alpha}_{jk'} c^{\alpha}_{kk'} = \sum_{k,k'} \sqrt{x_{k,\alpha}}\,A^{\alpha}_{jk'} c^{\alpha}_{k'k}, \tag{35}$$
giving
$$b^{\alpha}_j = \sum_{k,k'} \sqrt{x_{k,\alpha}}\,A^{\alpha}_{jk'} c^{\alpha}_{kk'}. \tag{36}$$
Having relabeled the indices as necessary, a back-substitution of (34) into the expansion (33) gives
$$|\Psi\rangle = \sum_{\alpha,i,k} b^{\alpha}_i A^{\alpha}_{i,k}\,|H_\alpha D_k\rangle = \sum_{\alpha,k} \sqrt{x_{k,\alpha}}\,|H_\alpha D_k\rangle, \tag{37}$$
which is the same as having simply assigned each ket's coefficient to the square root of its associated entry in the contingency table.
The normalization factor for $|\Psi\rangle$ is simply $1/\sqrt{N}$, where $N$ is the sum of the squares of the coefficients $b$ of the orthonormalized bases $|K^{\alpha}_i\rangle$,
$$N = \sum_{i,\alpha} (b^{\alpha}_i)^2 = \sum_{i,\alpha} b^{\alpha}_i \left(\sum_{k,k'} \sqrt{x_{k,\alpha}}\,A^{\alpha}_{ik'} c^{\alpha}_{kk'}\right) = \sum_{k,k',\alpha} \sqrt{x_{k,\alpha}\,x_{k',\alpha}}\,c^{\alpha}_{kk'}. \tag{38}$$
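Relations (34), (36) and (38) can be checked numerically for a single hypothesis with two data rows. The sketch below uses the closed-form 2 x 2 Gram-Schmidt matrix, rows (1, 0) and (-c, 1)/sqrt(1 - c^2); the overlap value 0.5 is an arbitrary illustrative input, not a value taken from the paper.

```python
import math

def back_substitution_check(x, c):
    """Check Eqs. (34), (36) and (38) for one hypothesis with two data rows.

    x is the pair (x_1, x_2) of contingency entries and c the overlap c_12.
    """
    s = [math.sqrt(v) for v in x]
    C = [[1.0, c], [c, 1.0]]
    r = math.sqrt(1.0 - c * c)
    A = [[1.0, 0.0], [-c / r, 1.0 / r]]     # 2 x 2 Gram-Schmidt coefficients
    # Eq. (36): b_j = sum_{k,k'} sqrt(x_k) A[j][k'] c[k][k']
    b = [sum(s[k] * A[j][kp] * C[k][kp] for k in range(2) for kp in range(2))
         for j in range(2)]
    # Eq. (34): sum_i b_i A[i][k] should give back sqrt(x_k)
    recovered = [sum(b[i] * A[i][k] for i in range(2)) for k in range(2)]
    # Eq. (38): the two expressions for N should coincide
    n_from_b = sum(v * v for v in b)
    n_from_x = sum(s[k] * s[kp] * C[k][kp] for k in range(2) for kp in range(2))
    return s, recovered, n_from_b, n_from_x

s, recovered, n_from_b, n_from_x = back_substitution_check((0.8, 0.6), 0.5)
print(s, recovered)
print(n_from_b, n_from_x)
```

Both printed pairs should agree to within floating-point rounding, which is the content of (37) and (38).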
Thus, the final normalized wave function is
$$|\Psi\rangle = \frac{\sum_{\alpha,k} \sqrt{x_{k,\alpha}}\,|H_\alpha D_k\rangle}{\sqrt{\sum_{i,j,\alpha} \sqrt{x_{i,\alpha}\,x_{j,\alpha}}\,c^{\alpha}_{ij}}}, \tag{39}$$
where $\alpha$ is summed from 1 to $n$, and $i, j$ are summed from 1 to $m$. Note that, in the denominator, the diagonal term $\sqrt{x_{i,\alpha}\,x_{j,\alpha}}\,c^{\alpha}_{ij}$, which occurs whenever $i = j$, simplifies to $x_{i,\alpha}$ since $c^{\alpha}_{ii} = 1$ for all $\alpha$.
From (39) it follows that, exactly in parallel to the nonintersecting case, if all properties $D_i$ are observed simultaneously, the probability of any hypothesis $H_\alpha$, for a fixed $\alpha$, is
$$P(H_\alpha \mid D_1 \cap D_2 \cap \cdots \cap D_m) = \frac{\sum_{i} (b^{\alpha}_i)^2}{\sum_{i,\beta} (b^{\beta}_i)^2} = \frac{\sum_{i,j} \sqrt{x_{i,\alpha}\,x_{j,\alpha}}\,c^{\alpha}_{ij}}{\sum_{i,j,\beta} \sqrt{x_{i,\beta}\,x_{j,\beta}}\,c^{\beta}_{ij}}, \tag{40}$$
where the summed hypothesis index in the denominators is written $\beta$ to distinguish it from the fixed $\alpha$.
In the case of noneven populations for each hypothesis (i.e. noneven priors), the
calculated probabilities should be appropriately weighted.
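Expression (40) is straightforward to evaluate once the overlaps are known. The sketch below treats the overlap matrices as given inputs, since their derivation is a separate step, and assumes equal priors. The function names are hypothetical, and the identity overlaps used in the example are purely illustrative.

```python
import math

def hypothesis_weight(x_column, c_matrix):
    """Numerator of Eq. (40): sum over i, j of sqrt(x_i x_j) c_ij for one hypothesis."""
    m = len(x_column)
    return sum(math.sqrt(x_column[i] * x_column[j]) * c_matrix[i][j]
               for i in range(m) for j in range(m))

def posterior_from_overlaps(x, c):
    """Evaluate Eq. (40) for every hypothesis.

    x[i][a] is the contingency entry for datum D_(i+1) and hypothesis H_(a+1);
    c[a] is the m x m overlap matrix for hypothesis H_(a+1), with unit diagonal.
    """
    n = len(x[0])
    weights = [hypothesis_weight([row[a] for row in x], c[a]) for a in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

x = [[0.8, 0.7],      # D1 row of table (4)
     [0.6, 0.5]]      # D2 row of table (4)
identity = [[1.0, 0.0], [0.0, 1.0]]   # illustrative input: no off-diagonal overlap
result = posterior_from_overlaps(x, [identity, identity])
print(result)
```

With all off-diagonal overlaps set to zero, each weight reduces to the column sum of the table, so the example returns 1.4/2.6 and 1.2/2.6.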
5. Example Solution
Returning to the problem presented in the contingency table (4), it is now possible to calculate the precise probability for a randomly selected particle with the properties of "spin ↑" and "charge +" being particle α ($H_1$). For this $2 \times 2$ matrix, recalling from (29) that $c^{\alpha}_{ii} = 1$ and $c^{\alpha}_{ij} = c^{\alpha}_{ji}$, the general expression (40) may be written as
$$\begin{aligned} P(H_1 \mid D_1 \cap D_2) &= \frac{\sum_{i,j=1}^{2} \sqrt{x_{i,1}\,x_{j,1}}\,c^{1}_{ij}}{\sum_{i,j=1}^{2} \sum_{\alpha=1}^{2} \sqrt{x_{i,\alpha}\,x_{j,\alpha}}\,c^{\alpha}_{ij}} \\ &= \frac{\sqrt{x_{1,1}^2}\,c^{1}_{1,1} + \sqrt{x_{2,1}^2}\,c^{1}_{2,2} + \sqrt{x_{1,1}\,x_{2,1}}\,c^{1}_{1,2} + \sqrt{x_{2,1}\,x_{1,1}}\,c^{1}_{2,1}}{\sum_{\alpha=1}^{2} \left(\sqrt{x_{1,\alpha}^2}\,c^{\alpha}_{1,1} + \sqrt{x_{2,\alpha}^2}\,c^{\alpha}_{2,2} + \sqrt{x_{1,\alpha}\,x_{2,\alpha}}\,c^{\alpha}_{1,2} + \sqrt{x_{2,\alpha}\,x_{1,\alpha}}\,c^{\alpha}_{2,1}\right)} \\ &= \frac{x_1 + y_1 + 2c_1\sqrt{x_1 y_1}}{x_1 + x_2 + y_1 + y_2 + 2c_1\sqrt{x_1 y_1} + 2c_2\sqrt{x_2 y_2}}, \end{aligned} \tag{41}$$
where, adhering to the earlier notation (16),
$$\begin{aligned} x_1 &= x_{1,1} = P(D_1 \mid H_1), & y_1 &= x_{2,1} = P(D_2 \mid H_1), \\ x_2 &= x_{1,2} = P(D_1 \mid H_2), & y_2 &= x_{2,2} = P(D_2 \mid H_2), \\ X_1 &= P(H_1), & X_2 &= P(H_2), \end{aligned} \tag{42}$$
and, for brevity, $c_1 := c^{1}_{1,2}$, $c_2 := c^{2}_{1,2}$. For simplicity, $P(H_i \mid D_1 \cap D_2)$ will henceforth be denoted as $P_i$. Implementing (41) is dependent upon deriving solutions for the yet unknown expressions $c_i$, $i = 1, 2$, which govern the extent of the intersection in (29). This can only be achieved by imposing reasonable constraints upon $c_i$ which have been inferred from expected behavior and known outcomes, i.e. through the use of boundary values and symmetries. Specifically, these constraints are:
Data dependence. The expressions $c_i$ must, in some way, be dependent upon the data given in the contingency table, i.e.
$$\begin{aligned} c_1 &= c_1(x_1, y_1, x_2, y_2, X_1, X_2), \\ c_2 &= c_2(x_1, y_1, x_2, y_2, X_1, X_2). \end{aligned} \tag{43}$$
Probability. The calculated values for $P_i$ must fall between 0 and 1. Since $x_i$ and $y_i$ are positive, it suffices to take
$$-1 < c_i(x_1, y_1, x_2, y_2) < 1. \tag{44}$$
Complementarity. The law of total probability dictates that
$$P_1 + P_2 = 1, \tag{45}$$
which can easily be seen to hold.
Symmetry. The exchanging of rows within the contingency tables should not affect the calculation of $P_i$. In other words, for each $i = 1, 2$, $P_i$ is invariant under $x_i \leftrightarrow y_i$. This constraint implies that
$$c_i(x_1, y_1, x_2, y_2) = c_i(y_1, x_1, y_2, x_2). \tag{46}$$
Equally, if the columns are exchanged then the $P_i$ must map to each other, i.e. $P_1 \leftrightarrow P_2$ under $x_1 \leftrightarrow x_2$, $y_1 \leftrightarrow y_2$, which gives the further constraint that
$$c_1(x_1, y_1, x_2, y_2) = c_2(x_2, y_2, x_1, y_1). \tag{47}$$
Known values. There are a number of contingency table structures which give rise to a known probability, i.e.
         H1   H2
    D1   1    1
    D2   m    n         P1 = m/(m+n)

         H1   H2
    D1   m    n
    D2   1    1         P1 = m/(m+n)

         H1   H2
    D1   n    m
    D2   m    n         P1 = 1/2

         H1   H2
    D1   n    n
    D2   m    m         P1 = 1/2

         H1   H2
    D1   m    m
    D2   m    m         P1 = 1/2
                                        (48)
where $m, n$ are positively valued probabilities. For such contingency tables the correct probabilities should always be returned by $c_i$. Applying this principle to (41) gives the constraints
$$\frac{m}{m + n} = \frac{2c_1(m,1,n,1)\sqrt{m} + m + 1}{2c_1(m,1,n,1)\sqrt{m} + 2c_2(m,1,n,1)\sqrt{n} + m + n + 2}, \tag{49}$$
$$\frac{1}{2} = \frac{2c_1(n,m,m,n)\sqrt{m}\sqrt{n} + m + n}{2c_1(n,m,m,n)\sqrt{m}\sqrt{n} + 2c_2(n,m,m,n)\sqrt{m}\sqrt{n} + 2m + 2n}, \tag{50}$$
$$\frac{1}{2} = \frac{2c_1(n,m,n,m)\sqrt{m}\sqrt{n} + m + n}{2c_1(n,m,n,m)\sqrt{m}\sqrt{n} + 2c_2(n,m,n,m)\sqrt{m}\sqrt{n} + 2m + 2n}. \tag{51}$$
Nonhomogeneity. Bayes' theorem returns the same probability for any linearly scaled contingency tables, e.g.
$$x_1 \to 1.0,\; y_1 \to 1.0,\; x_2 \to 1.0,\; y_2 \to 0.50 \;\Rightarrow\; P_1 \approx 0.667, \tag{52}$$
$$x_1 \to 0.5,\; y_1 \to 0.5,\; x_2 \to 0.5,\; y_2 \to 0.25 \;\Rightarrow\; P_1 \approx 0.667. \tag{53}$$
While homogeneity may be justified for conditionally independent data, this is not the case for intersecting, co-dependent data since the act of scaling changes the nature of the intersections and the relationship between them. This may be easily demonstrated by taking the possible value ranges for (52) and (53), calculated using (5), which are
$$\begin{aligned} \text{Eq. (52)} &\Rightarrow (D_1 \cap D_2) \mid H_1 = \{1\}, \\ &\phantom{\Rightarrow{}} (D_1 \cap D_2) \mid H_2 = \{0.5\}, \\ \text{Eq. (53)} &\Rightarrow (D_1 \cap D_2) \mid H_1 = \{0.0 \ldots 0.5\}, \\ &\phantom{\Rightarrow{}} (D_1 \cap D_2) \mid H_2 = \{0.0 \ldots 0.25\}. \end{aligned} \tag{54}$$
The effect of scaling has not only introduced uncertainty where previously there had been none, but has also introduced the possibility of 0 as a valid answer for both hypotheses. Further, the spatial distance between the hypotheses has also decreased. For these reasons, it would seem unreasonable to assert that (52) and (53) share the same likelihood ratio.
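The contrast drawn in (52) to (54) can be reproduced directly. The sketch below applies the naive product rule with equal priors, together with the range rule (5) rewritten for proportions so that the population size becomes 1. Both function names are illustrative assumptions.

```python
def naive_bayes_p1(x1, y1, x2, y2):
    """Naive Bayes value of P_1 with equal priors (marginals simply multiplied)."""
    return (x1 * y1) / (x1 * y1 + x2 * y2)

def overlap_bounds(p_d1, p_d2):
    """Lower and upper bound of the proportion in D1 ∩ D2, following Eq. (5)."""
    return (max(0.0, p_d1 + p_d2 - 1.0), min(p_d1, p_d2))

table_52 = (1.0, 1.0, 1.0, 0.50)   # x1, y1, x2, y2 from Eq. (52)
table_53 = (0.5, 0.5, 0.5, 0.25)   # the same table scaled by one half, Eq. (53)

for x1, y1, x2, y2 in (table_52, table_53):
    print(round(naive_bayes_p1(x1, y1, x2, y2), 3),
          overlap_bounds(x1, y1), overlap_bounds(x2, y2))
# 0.667 (1.0, 1.0) (0.5, 0.5)
# 0.667 (0.0, 0.5) (0.0, 0.25)
```

The naive value is unchanged by the scaling, while the admissible overlap widens from a single point to an interval that includes zero, as stated in (54).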
Using these principles and constraints, it becomes possible to solve $c_i$. From the principle of symmetry, it follows that
$$\begin{aligned} c_1(n,m,m,n) &= c_2(m,n,n,m) = c_2(n,m,m,n), \\ c_1(n,m,n,m) &= c_2(n,m,n,m) = c_2(n,m,n,m), \end{aligned} \tag{55}$$
and that the equalities (50), (51) for $P_i = 0.5$ automatically hold. Further, (49) solves to give
$$c_2(m,1,n,1) = \frac{2\sqrt{m}\,n\,c_1(m,1,n,1) - m + n}{2m\sqrt{n}}, \tag{56}$$
which, because $c_1(n,1,m,1) = c_2(m,1,n,1)$, finally gives
$$c_1(n,1,m,1) = \frac{2\sqrt{m}\,n\,c_1(m,1,n,1) - m + n}{2m\sqrt{n}}. \tag{57}$$
Substituting $g(m,n) := \sqrt{n}\,c_1(m,1,n,1)$ transforms (57) into an anti-symmetric bivariate functional equation in $m, n$,
$$g(m,n) - g(n,m) = \frac{m}{2\sqrt{mn}} - \frac{n}{2\sqrt{mn}}, \tag{58}$$
whose solution is $g(m,n) = \frac{m}{2\sqrt{mn}}$.
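As a short worked check, written here as a sketch rather than as part of the original derivation, the stated solution can be substituted back into (58), and the corresponding value of $c_1(m,1,n,1)$ read off from the definition of $g$:

```latex
% Substituting g(m,n) = m / (2 sqrt(mn)) into the left-hand side of (58):
\begin{align*}
g(m,n) - g(n,m)
  &= \frac{m}{2\sqrt{mn}} - \frac{n}{2\sqrt{nm}}
   = \frac{m - n}{2\sqrt{mn}},
\end{align*}
% which is the right-hand side of (58).
% Inverting the definition g(m,n) = sqrt(n) c_1(m,1,n,1):
\begin{align*}
c_1(m,1,n,1) &= \frac{g(m,n)}{\sqrt{n}}
              = \frac{m}{2\sqrt{mn}\,\sqrt{n}}
              = \frac{\sqrt{m}}{2n}.
\end{align*}
```

Since (58) only fixes the anti-symmetric part of $g$, any symmetric function of $m$ and $n$ could be added without violating it; the solution quoted above is the one adopted in the text.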
This gives a final solution for the coefficients $c_{1,2}$ of
$$\begin{aligned} c_1(x_1, y_1, x_2, y_2) &= \frac{\sqrt{x_1 y_1}}{2 x_2 y_2}, \\ c_2(x_1, y_1, x_2, y_2) &= \frac{\sqrt{x_2 y_2}}{2 x_1 y_1}. \end{aligned} \tag{59}$$
Thus, substituting (59) into (41) gives the likelihood ratio expression of
$$P(H_1 \mid D_1 \cap D_2) = \frac{\dfrac{x_1 y_1}{x_2 y_2} + x_1 + y_1}{\dfrac{x_1 y_1}{x_2 y_2} + x_1 + y_1 + \dfrac{x_2 y_2}{x_1 y_1} + x_2 + y_2}. \tag{60}$$
Given that the population sizes of $H_1$ and $H_2$ are the same, no weighting needs to take place. Hence, the value of $P(H_1 \mid D_1 \cap D_2)$ for (4) may now be calculated to be
$$P(H_1 \mid D_1 \cap D_2) \approx 0.5896. \tag{61}$$
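The value in (61) follows from (41) and (59) by direct substitution. A minimal sketch is given below, assuming equal population sizes as in table (4); the function names are illustrative and are not taken from the paper.

```python
import math

def overlap_coefficients(x1, y1, x2, y2):
    """Overlap coefficients c_1 and c_2 of Eq. (59)."""
    c1 = math.sqrt(x1 * y1) / (2 * x2 * y2)
    c2 = math.sqrt(x2 * y2) / (2 * x1 * y1)
    return c1, c2

def p1_from_overlaps(x1, y1, x2, y2, c1, c2):
    """P(H1 | D1 ∩ D2) from Eq. (41) for given overlaps c_1 and c_2."""
    numerator = x1 + y1 + 2 * c1 * math.sqrt(x1 * y1)
    denominator = numerator + x2 + y2 + 2 * c2 * math.sqrt(x2 * y2)
    return numerator / denominator

def quantum_p1(x1, y1, x2, y2):
    """Eq. (60): Eq. (41) evaluated with the overlaps of Eq. (59)."""
    c1, c2 = overlap_coefficients(x1, y1, x2, y2)
    return p1_from_overlaps(x1, y1, x2, y2, c1, c2)

p1 = quantum_p1(0.8, 0.6, 0.7, 0.5)   # table (4)
print(round(p1, 4))                   # 0.5896
```

The same function can be used to confirm the known-value tables in (48): for instance, `quantum_p1(1, m, 1, n)` evaluates to m/(m+n) and `quantum_p1(n, m, m, n)` to 1/2.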
6. Discussion
One of the greatest obstacles in developing any statistical approach is demonstrating correctness. This formula is no different in that respect. If correctness could be demonstrated then, a priori, there would be an appropriate existing method which would negate the need for a new one. All that may be hoped for in any approach is that it generates appropriate answers when they are known, reasonable answers for all other cases, and that these answers follow logically from the underlying mathematics.
However, what is clear is that the limitations of the naive Bayes' classifier render any calculations derived from it open to an unknown margin of error. Given the importance of accurately deriving likelihood ratios this is troubling. This is especially true when the statistical tolerance of calculations is marginal.
As a quantum mechanical methodology this result is able to calculate accurate, iteration free, likelihood ratios which fall beyond the scope of existing statistical techniques, and offers a new theoretical approach within both statistics and physics. Further, through the addition of a Hamiltonian operator to introduce time-evolution, it can offer likelihood ratios for future system states with appropriate updating of the contingency table. In contrast, Bayes' theorem is unable to distinguish directly between time-dependent and time-independent systems. This may lead to situations where the process of contingency table updating results in the same decisions being made repeatedly with the appearance of an ever increasing degree of certainty.
Indeed, from (26), it would seem that the naive Bayes' classifier is only a special case of a more complex quantum mechanical framework, and may only be used where the exclusivity of data is guaranteed.
The introduction of a Hamiltonian operator, and a full quantum dynamical formalism, is in progress, and should have profound implications for the physical sciences. Inevitably, such a formalism will require a sensible continuous classical limit. In other words, the final expressions for the likelihood ratios should contain a parameter, in some form of $\hbar$, which, when going to 0, reproduces a classically known result. For example, the solutions to (59) could be moderated as
$$\begin{aligned} c_1(x_1, y_1, x_2, y_2) &= \frac{\sqrt{x_1 y_1}}{2 x_2 y_2}\,(1 - \exp(-\hbar)), \\ c_2(x_1, y_1, x_2, y_2) &= \frac{\sqrt{x_2 y_2}}{2 x_1 y_1}\,(1 - \exp(-\hbar)), \end{aligned} \tag{62}$$
so that in the limit of $\hbar \to 0$, the intersection parameters, $c_1$ and $c_2$, vanish to return the formalism to the classical situation of independent data.
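The behaviour of the moderated coefficients in (62) can be explored numerically. The sketch below is exploratory only: it inserts (62) into (41) for table (4), and the function name and the sampled values of the parameter are arbitrary choices. With both overlaps set to zero, (41) reduces to (x1 + y1)/(x1 + y1 + x2 + y2).

```python
import math

def moderated_p1(x1, y1, x2, y2, hbar):
    """Overlaps from Eq. (62) inserted into Eq. (41); returns (c1, c2, P1)."""
    damping = 1.0 - math.exp(-hbar)       # vanishes as hbar -> 0
    c1 = math.sqrt(x1 * y1) / (2 * x2 * y2) * damping
    c2 = math.sqrt(x2 * y2) / (2 * x1 * y1) * damping
    numerator = x1 + y1 + 2 * c1 * math.sqrt(x1 * y1)
    denominator = numerator + x2 + y2 + 2 * c2 * math.sqrt(x2 * y2)
    return c1, c2, numerator / denominator

for hbar in (0.0, 0.1, 1.0, 50.0):
    c1, c2, p1 = moderated_p1(0.8, 0.6, 0.7, 0.5, hbar)
    print(hbar, round(c1, 4), round(c2, 4), round(p1, 4))
```

At zero the overlaps vanish and the value is 1.4/2.6, while for large values of the parameter the result approaches the 0.5896 of (61).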
7. Conclusion
This paper has demonstrated both theoretically, and practically, that a quantum mechanical methodology can overcome the axiomatic limitations of classical statistics. In doing so, it challenges the orthodoxy of de Finetti's epistemological approach to statistics by demonstrating that it is possible to derive "real" likelihood ratios from information systems without recourse to arbitrary and subjective evaluations.
While further theoretical development work needs to be undertaken, particularly with regard to the application of these mathematics in other domains, it is hoped that this article will help advance the debate over the nature and meaning of statistics within the physical sciences.
Acknowledgments
YHH would like to thank the Science and Technology Facilities Council, UK, for
grant ST/J00037X/1, the Chinese Ministry of Education, for a Chang-Jiang Chair
Professorship at NanKai University as well as the City of Tian-Jin for a Qian-Ren
Scholarship, and Merton College, Oxford, for her enduring support.
References
1. B. de Finetti, Theory of Probability: A Critical Introductory Treatment, Vols. 1 and 2, Translated by A. Machì and A. Smith (Wiley, New York, 1974).
2. C. Caves, C. Fuchs and R. Schack, Conditions for compatibility of quantum-state assignments, Phys. Rev. A 66(6) (2002) 062111(1), doi: 10.1103/PhysRevA.66.062111.
3. C. G. Timpson, Quantum Bayesianism: A study, Stud. Hist. Philos. Sci. B: Stud. Hist. Philos. Mod. Phys. 39(3) (2008) 579, doi: 10.1016/j.shpsb.2008.03.006.
4. A. P. Dempster, N. M. Laird and D. B. Rubin, J. R. Stat. Soc. S. B (Stat. Methodol.) 39(1)
(1977) 1.
5. M. Oaksford and N. Chater, Bayesian Rationality. The Probabilistic Approach to Human
Reasoning (Oxford University Press, Oxford, 2007).
6. C. F. J. Wu, On the convergence properties of the EM algorithm, Ann. Stat. 11(1) (1983)
95, doi: 10.1214/aos/1176346060.