A quantum framework for likelihood ratios

Rachael L. Bond

School of Psychology, University of Sussex,

Falmer, East Sussex, BN1 9QH, UK

hello@rachaelbond.com

Yang-Hui He

Department of Mathematics, City,

University of London, EC1V 0HB, UK

Merton College, University of Oxford, OX1 4JD, UK

School of Physics, NanKai University, Tianjin 300071, China

hey@maths.ox.ac.uk

Thomas C. Ormerod

School of Psychology, University of Sussex,

Falmer, East Sussex, BN1 9QH, UK

t.ormerod@sussex.ac.uk

Received 6 March 2017

Accepted 15 November 2017

Published 12 December 2017

The ability to calculate precise likelihood ratios is fundamental to science, from Quantum Information Theory through to Quantum State Estimation. However, there is no assumption-free statistical methodology to achieve this. For instance, in the absence of data relating to covariate overlap, the widely used Bayes' theorem either defaults to the marginal-probability-driven "naive Bayes' classifier", or requires the use of compensatory expectation-maximization techniques. This paper takes an information-theoretic approach in developing a new statistical formula for the calculation of likelihood ratios based on the principles of quantum entanglement, and demonstrates that Bayes' theorem is a special case of a more general quantum mechanical expression.

Keywords: Bayes' theorem; probability; statistics; inference; decision-making.

1. Introduction

In recent years, Bayesian statistical research has often been epistemologically driven, guided by de Finetti's famous quote that "probability does not exist" [1]. For example, the "quantum Bayesian" methodology of Caves, Fuchs and Schack has applied de Finetti's ideas to Bayes' theorem for use in quantum mechanics [2]. In doing so, Caves et al. have argued that statistical systems are best interpreted by methods in which the Bayesian likelihood ratio is seen to be both external to the system and subjectively imposed on it by the observer [3]. However, the Caves et al. approach is problematic. At a human scale, for instance, an observer's belief as to the chances of a fair coin landing either "heads" or "tails" has no known effect. Indeed, for all practical purposes, the "heads:tails" likelihood ratio of 0.5:0.5 is only meaningful when considered as a property of the coin's own internal statistical system rather than as some ephemeral and arbitrary qualia.

International Journal of Quantum Information, Vol. 16, No. 1 (2018) 1850002 (14 pages). © World Scientific Publishing Company. DOI: 10.1142/S0219749918500028

Yet, to date, the axiomatic difficulties associated with Bayes' theorem, notably its reliance upon the use of marginal probabilities in the absence of structural statistical information (e.g. estimates of covariate overlap), as well as the assumed conditional independence of data, have largely been approached from a "mend and make do" standpoint. For instance, the "maximum likelihood" approach of Dempster, Laird and Rubin calculates iteratively derived measures of covariate overlap which not only lack a sense of natural authenticity, but also introduce fundamental assumptions into the statistical analysis [4].

Instead, this paper adopts a different approach to the analysis of statistical systems. By using quantum mechanical mathematical spaces, it is demonstrated that the creation of isomorphic representations of classical data-sets as entangled systems allows for a natural, albeit nontrivial, calculation of likelihood ratios. It is expected that this technique will find applications within the fields of quantum state estimation and quantum information theory.

2. The Limits of Bayes' Theorem

Bayes' theorem is used to calculate the conditional probability of a statement, or hypothesis, being true given that other information is also true. It is usually written as

$$P(H_i \mid D) = \frac{P(H_i)\,P(D \mid H_i)}{\sum_j P(H_j)\,P(D \mid H_j)}. \tag{1}$$

Here, $P(H_i \mid D)$ is the conditional probability of hypothesis $H_i$ being true given that the information $D$ is true; $P(D \mid H_i)$ is the conditional probability of $D$ being true if $H_i$ is true; and $\sum_j P(H_j)\,P(D \mid H_j)$ is the sum of the probabilities of all hypotheses multiplied by the conditional probability of $D$ being true for each hypothesis [5].

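$$\begin{array}{lcc}
 & \text{Particle } \alpha\ (H_1) & \text{Particle } \beta\ (H_2) \\
\hline
\text{Number of particles } (n) & 10 & 10 \\
\text{Proportion spin} \uparrow (D) & 0.8 & 0.7 \\
\text{Proportion spin} \downarrow (\bar{D}) & 0.2 & 0.3
\end{array} \tag{2}$$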


To exemplify using the contingency information in (2), if one wishes to determine the nature of a randomly selected particle from a set of 20, given that it has spin ↑, then using Bayes' theorem it is trivial to calculate that particle α is the most likely type, with a likelihood ratio of approximately 0.53:0.47,

$$\begin{aligned}
P(H_1 \mid D) &= \frac{0.5 \times 0.8}{(0.5 \times 0.8) + (0.5 \times 0.7)} = \frac{8}{15} \approx 0.533, \\
P(H_2 \mid D) &= 1 - P(H_1 \mid D) = \frac{7}{15} \approx 0.467,
\end{aligned} \tag{3}$$

where $P(H_i) = 10/(10 + 10) = 0.5$ for both $i = 1, 2$.
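As a quick check of (3), the calculation can be sketched in a few lines of Python. The function name `bayes_posterior` is an illustrative choice, not notation from the paper, and exact fractions are used only so that the result 8/15 appears without rounding.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return P(H_i | D) for each hypothesis, following Eq. (1)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # denominator of Eq. (1)
    return [j / total for j in joint]


# Table (2): equal populations, proportions of spin-up particles.
priors = [Fraction(1, 2), Fraction(1, 2)]
spin_up = [Fraction(8, 10), Fraction(7, 10)]

posterior = bayes_posterior(priors, spin_up)
print(posterior)  # [Fraction(8, 15), Fraction(7, 15)]
```

The two values reproduce the approximate 0.53:0.47 ratio quoted above.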

However, difficulties arise in the use of Bayes' theorem for the calculation of likelihood ratios where there are multiple nonexclusive data sets. For instance, if the information in (2) is expanded to include data about particle charge (4), then the precise covariate overlap (i.e. $D_1 \cap D_2$) for each particle becomes an unknown.

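$$\begin{array}{lcc}
 & \text{Particle } \alpha\ (H_1) & \text{Particle } \beta\ (H_2) \\
\hline
\text{Number of particles } (n) & 10 & 10 \\
\text{Proportion spin} \uparrow (D_1) & 0.8 & 0.7 \\
\text{Proportion charge } + (D_2) & 0.6 & 0.5
\end{array} \tag{4}$$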

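All that may be shown is that, for each particle, the occurrence of both features forms a range described by (5), where $n(H_i)$ is the total number of exemplars $i$, $n(D_1 \mid H_i)$ is the total number of $i$ with spin ↑, and $n(D_2 \mid H_i)$ is the total number of $i$ with a positive charge,

$$n(D_1 \cap D_2 \mid H_i) \in
\begin{cases}
\left[\,n(D_1 \mid H_i) + n(D_2 \mid H_i) - n(H_i), \ldots, \min\big(n(D_1 \mid H_i), n(D_2 \mid H_i)\big)\right] & \text{if } n(D_1 \mid H_i) + n(D_2 \mid H_i) > n(H_i), \\[4pt]
\left[\,0, \ldots, \min\big(n(D_1 \mid H_i), n(D_2 \mid H_i)\big)\right] & \text{if } n(D_1 \mid H_i) + n(D_2 \mid H_i) \le n(H_i).
\end{cases} \tag{5}$$

Specifically for (4) these ranges equate to

$$n(D_1 \cap D_2 \mid H_1) \in \{4, 5, 6\}, \qquad n(D_1 \cap D_2 \mid H_2) \in \{2, 3, 4, 5\}. \tag{6}$$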
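The range rule in (5) is simple enough to state as code. The sketch below assumes integer counts; the helper name `overlap_range` is hypothetical and is used here only to reproduce the two sets in (6).

```python
def overlap_range(n_total, n_d1, n_d2):
    """Possible counts of exemplars having both features, per Eq. (5).

    n_total: n(H_i), n_d1: n(D1 | H_i), n_d2: n(D2 | H_i).
    """
    # Both branches of Eq. (5) collapse into a single max(0, ...) lower bound.
    lower = max(0, n_d1 + n_d2 - n_total)
    upper = min(n_d1, n_d2)
    return list(range(lower, upper + 1))


# Table (4): 10 particles of each type.
print(overlap_range(10, 8, 6))  # [4, 5, 6]
print(overlap_range(10, 7, 5))  # [2, 3, 4, 5]
```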

The simplest approach to resolving this problem is to naively ignore any intersection, or co-dependence, of the data and to directly multiply the marginal probabilities. Hence, given (4), the likelihood of particle α having the greatest occurrence of both spin ↑ and a positive charge would be calculated as

$$P(H_1 \mid D_1 \cap D_2) = \frac{0.5 \times 0.8 \times 0.6}{(0.5 \times 0.8 \times 0.6) + (0.5 \times 0.7 \times 0.5)} \approx 0.578. \tag{7}$$


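Yet, because the data intersect, this probability value is only one of a number which may be reasonably calculated. Alternatives include calculating a likelihood ratio using the mean value of the frequency ranges for each hypothesis

$$\begin{aligned}
P\big(\mu[n(D_1 \cap D_2 \mid H_1)]\big) &= \frac{1}{10} \times \frac{1}{3}(4 + 5 + 6) = 0.5, \\
P\big(\mu[n(D_1 \cap D_2 \mid H_2)]\big) &= \frac{1}{10} \times \frac{1}{4}(2 + 3 + 4 + 5) = 0.35 \\
\Rightarrow\ P(H_1 \mid D_1 \cap D_2) &\approx 0.588,
\end{aligned} \tag{8}$$

and taking the mean value of the probability range derived from the frequency range

$$\begin{aligned}
\min P(H_1 \mid D_1 \cap D_2) &= \frac{4}{4 + 5}, \\
\max P(H_1 \mid D_1 \cap D_2) &= \frac{6}{6 + 2} \\
\Rightarrow\ \mu[P(H_1 \mid D_1 \cap D_2)] &\approx 0.597.
\end{aligned} \tag{9}$$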
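The three competing estimates in (7), (8) and (9) can be reproduced side by side with the short sketch below. The function names are illustrative assumptions, and the overlap ranges are those listed in (6).

```python
def naive_estimate(prior1, x1, y1, prior2, x2, y2):
    """Eq. (7): multiply the marginals as if D1 and D2 were independent."""
    a = prior1 * x1 * y1
    b = prior2 * x2 * y2
    return a / (a + b)


def mean_frequency_estimate(range1, n1, range2, n2):
    """Eq. (8): use the mean of each overlap range as the joint frequency."""
    p1 = sum(range1) / len(range1) / n1
    p2 = sum(range2) / len(range2) / n2
    return p1 / (p1 + p2)


def mean_of_range_estimate(range1, range2):
    """Eq. (9): average the smallest and largest attainable posteriors."""
    lowest = min(range1) / (min(range1) + max(range2))
    highest = max(range1) / (max(range1) + min(range2))
    return (lowest + highest) / 2


h1_range, h2_range = [4, 5, 6], [2, 3, 4, 5]  # from Eq. (6)

print(f"{naive_estimate(0.5, 0.8, 0.6, 0.5, 0.7, 0.5):.3f}")        # 0.578
print(f"{mean_frequency_estimate(h1_range, 10, h2_range, 10):.3f}")  # 0.588
print(f"{mean_of_range_estimate(h1_range, h2_range):.3f}")           # 0.597
```

Three defensible procedures give three different answers from the same table, which is the multiplicity the text goes on to discuss.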

Given this multiplicity of probability values, it would seem that none of these methods may lay claim to normativity. This problem of covariate overlap has, of course, been previously addressed within statistical literature. For instance, the "maximum likelihood" approach of Dempster, Laird and Rubin has demonstrated how an "expectation-maximization" algorithm may be used to derive appropriate covariate overlap measures [4]. Indeed, the mathematical efficacy of this technique has been confirmed by Wu [6]. However, it is difficult to see how such an iterative methodology can be employed without introducing axiomatic assumptions. Further, since any assumptions, irrespective of how benign they may appear, have the potential to skew results, what is required is an approach in which covariate overlaps can be automatically, and directly, calculated from contingency data.

3. A Quantum Mechanical Proof of Bayes' Theorem for Independent Data

Previously unconsidered, the quantum mechanical von Neumann axioms would seem to offer the most promise in this regard, since the re-conceptualization of covariate data as a quantum entangled system allows for statistical analysis with few, nonarbitrary assumptions. Unfortunately, there are many conceptual difficulties that can arise here. For instance, a Dirac notation representation of (4) as a standard quantum superposition is

$$|\Psi\rangle = \frac{1}{\sqrt{N}}\left[\alpha\left(\sqrt{\tfrac{1}{3}}\,|4\rangle_{H_1} + \sqrt{\tfrac{1}{3}}\,|5\rangle_{H_1} + \sqrt{\tfrac{1}{3}}\,|6\rangle_{H_1}\right) + \beta\left(\sqrt{\tfrac{1}{4}}\,|2\rangle_{H_2} + \sqrt{\tfrac{1}{4}}\,|3\rangle_{H_2} + \sqrt{\tfrac{1}{4}}\,|4\rangle_{H_2} + \sqrt{\tfrac{1}{4}}\,|5\rangle_{H_2}\right)\right]. \tag{10}$$


In this example, (10) cannot be solved since the possible values of $D_1 \cap D_2$ for each hypothesis (6) have been described as equal chance outcomes within a general superposition of $H_1$ and $H_2$, with the unknown coefficients $\alpha$ and $\beta$ assuming the role of the classical Bayesian likelihood ratio.

The development of an alternative quantum mechanical description necessitates a return to the simplest form of Bayes' theorem using the case of exclusive populations $H_i$ and data sets $D$, $\bar{D}$, such as given in (2). Here, the overall probability of $H_1$ may be simply calculated as

$$P(H_1) = \frac{n(H_1)}{n(H_1) + n(H_2)}. \tag{11}$$

The a priori uncertainty in (2) may be expressed by constructing a wave function in which the four data points are encoded as a linear superposition

$$|\Psi\rangle = \alpha_{1,1}\,|H_1 D\rangle + \alpha_{1,2}\,|H_1 \bar{D}\rangle + \alpha_{2,1}\,|H_2 D\rangle + \alpha_{2,2}\,|H_2 \bar{D}\rangle. \tag{12}$$

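Since there is no overlap between either $D$ and $\bar{D}$ or the populations $H_1$ and $H_2$, each datum automatically forms an eigenstate basis with the orthonormal conditions

$$\begin{aligned}
\langle H_1 D \mid H_1 D\rangle &= \langle H_1 \bar{D} \mid H_1 \bar{D}\rangle = 1, \\
\langle H_2 D \mid H_2 D\rangle &= \langle H_2 \bar{D} \mid H_2 \bar{D}\rangle = 1, \\
\text{all other brakets} &= 0,
\end{aligned} \tag{13}$$

where the normalization of the wave function demands that

$$\langle \Psi \mid \Psi \rangle = 1, \tag{14}$$

so that the sum of the modulus squares of the coefficients $\alpha_{i,j}$ gives a total probability of 1:

$$|\alpha_{1,1}|^2 + |\alpha_{1,2}|^2 + |\alpha_{2,1}|^2 + |\alpha_{2,2}|^2 = 1. \tag{15}$$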

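For simplicity, let

$$\begin{aligned}
x_1 &= P(D \mid H_1), & y_1 &= P(\bar{D} \mid H_1), \\
x_2 &= P(D \mid H_2), & y_2 &= P(\bar{D} \mid H_2), \\
X_1 &= P(H_1), & X_2 &= P(H_2).
\end{aligned} \tag{16}$$

If the coefficients $\alpha_{i,j}$ from (12) are set as required by (2), it follows that

$$|\alpha_{1,1}|^2 = x_1, \quad |\alpha_{1,2}|^2 = y_1, \quad |\alpha_{2,1}|^2 = x_2, \quad |\alpha_{2,2}|^2 = y_2, \tag{17}$$

so that the normalized wave function $|\Psi\rangle$ is described by

$$|\Psi\rangle = \frac{1}{\sqrt{N}}\left(\sqrt{x_1}\,|H_1 D\rangle + \sqrt{y_1}\,|H_1 \bar{D}\rangle + \sqrt{x_2}\,|H_2 D\rangle + \sqrt{y_2}\,|H_2 \bar{D}\rangle\right), \tag{18}$$

for some normalization constant $N$.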


The normalization condition (14) implies that

$$N = x_1 + y_1 + x_2 + y_2 = X_1 + X_2, \tag{19}$$

thereby giving the full wave function description

$$|\Psi\rangle = \frac{\sqrt{x_1}\,|H_1 D\rangle + \sqrt{y_1}\,|H_1 \bar{D}\rangle + \sqrt{x_2}\,|H_2 D\rangle + \sqrt{y_2}\,|H_2 \bar{D}\rangle}{\sqrt{X_1 + X_2}}. \tag{20}$$

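If the value of $P(H_1 \mid D)$ is to be calculated, i.e. the property $D$ is observed, then the normalized wave function (12) necessarily collapses to

$$|\Psi'\rangle = \alpha_1\,|H_1 D\rangle + \alpha_2\,|H_2 D\rangle, \tag{21}$$

where the coefficients $\alpha_1, \alpha_2$ may be determined by projecting $|\Psi\rangle$ onto the two terms in $|\Psi'\rangle$ using (13), giving

$$\alpha_1 = \langle \Psi' \mid H_1 D\rangle = \sqrt{\frac{x_1}{X_1 + X_2}}, \qquad
\alpha_2 = \langle \Psi' \mid H_2 D\rangle = \sqrt{\frac{x_2}{X_1 + X_2}}. \tag{22}$$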

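Normalizing (21) with the coefficient $N'$,

$$|\Psi'\rangle = \frac{1}{\sqrt{N'}}\left(\sqrt{\frac{x_1}{X_1 + X_2}}\,|H_1 D\rangle + \sqrt{\frac{x_2}{X_1 + X_2}}\,|H_2 D\rangle\right), \tag{23}$$

and using the normalization condition (14), implies that

$$1 = \langle \Psi' \mid \Psi' \rangle = \frac{1}{N'}\left(\frac{x_1}{X_1 + X_2} + \frac{x_2}{X_1 + X_2}\right) \;\Rightarrow\; N' = \frac{x_1 + x_2}{X_1 + X_2}. \tag{24}$$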

Thus, after collapse, the properly normalized wave function (23) becomes

$$|\Psi'\rangle = \sqrt{\frac{x_1}{x_1 + x_2}}\,|H_1 D\rangle + \sqrt{\frac{x_2}{x_1 + x_2}}\,|H_2 D\rangle, \tag{25}$$

which means that the probability of observing $|H_1 D\rangle$ is

$$P(|H_1 D\rangle) = \left(\sqrt{\frac{x_1}{x_1 + x_2}}\right)^{2} = \frac{\alpha_1^2}{\alpha_1^2 + \alpha_2^2} = \frac{x_1}{x_1 + x_2}. \tag{26}$$

This is entirely consistent with Bayes' theorem and demonstrates its derivation using quantum mechanical axioms.
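The collapse argument of (18) to (26) can be traced numerically. The following is a minimal sketch, assuming the table entries are stored row by row (data) and column by column (hypotheses); `collapse_posterior` is a hypothetical helper name, and the orthonormal basis of (13) is represented implicitly by keeping one amplitude per ket.

```python
import math


def collapse_posterior(table, observed_row):
    """Follow Eqs. (18)-(26): build amplitudes, project, renormalize.

    table[i][a] holds the entry for datum i under hypothesis a.
    """
    norm = sum(value for row in table for value in row)  # N in Eq. (19)
    # Amplitudes of the normalized wave function, Eq. (20).
    amplitudes = [[math.sqrt(value / norm) for value in row] for row in table]
    # Observing the datum keeps only the kets in that row, Eq. (21).
    kept = amplitudes[observed_row]
    n_prime = sum(a * a for a in kept)  # N' in Eq. (24)
    # Squared, renormalized amplitudes give the probabilities, Eq. (26).
    return [a * a / n_prime for a in kept]


# Table (2): rows are (D, D-bar), columns are (H1, H2).
table_2 = [[0.8, 0.7], [0.2, 0.3]]
p_h1, p_h2 = collapse_posterior(table_2, observed_row=0)
print(f"{p_h1:.3f} {p_h2:.3f}")  # 0.533 0.467
```

The overall normalization $N$ cancels in the final ratio, which is why the result matches the Bayesian value 8/15 from (3).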

4. Quantum Likelihood Ratios for Co-Dependent Data

Having established the principle of using a quantum mechanical approach for the calculation of simple likelihood ratios with mutually exclusive data (2), it is now possible to consider the general case of $n$ hypotheses and $m$ data (27), where the data are co-dependent, or intersect.

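$$\begin{array}{c|cccc}
 & H_1 & H_2 & \cdots & H_n \\
\hline
D_1 & x_{1,1} & x_{1,2} & \cdots & x_{1,n} \\
D_2 & x_{2,1} & x_{2,2} & \cdots & x_{2,n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
D_m & x_{m,1} & x_{m,2} & \cdots & x_{m,n}
\end{array} \tag{27}$$

Here, the contingency table in (27) has been indexed using

$$x_{i,\alpha}; \qquad \alpha = 1, 2, \ldots, n; \quad i = 1, 2, \ldots, m. \tag{28}$$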

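While the general wave function remains the same as before, the overlapping data create nonorthonormal inner products which can be naturally defined as

$$\langle H_\alpha D_i \mid H_\beta D_j \rangle = c^{\alpha}_{ij}\,\delta_{\alpha\beta}; \qquad c^{\alpha}_{ij} = c^{\alpha}_{ji} \in \mathbb{R}, \qquad c^{\alpha}_{ii} = 1. \tag{29}$$

Assuming, for simplicity, that the overlaps $c^{\alpha}_{ij}$ are real, then there is a symmetry in that $c^{\alpha}_{ij} = c^{\alpha}_{ji}$ for each $\alpha$. Further, for each $\alpha$ and $i$, the state is normalized, i.e. $c^{\alpha}_{ii} = 1$. The given independence of the hypotheses $H_\alpha$ also enforces the Kronecker delta function, $\delta_{\alpha\beta}$.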

The Hilbert space $V$ spanned by the kets $|H_\alpha D_i\rangle$ is $mn$-dimensional and, because of the independence of $H_\alpha$, naturally decomposes into the direct sum (30) with respect to the inner product, thereby demonstrating that the nonorthonormal conditions are the direct sum of $n$ vector spaces $V^{\alpha}$, each of dimension $m$:

$$V = \mathrm{Span}\big(\{|H_\alpha D_i\rangle\}\big) = \bigoplus_{\alpha = 1}^{n} V^{\alpha}; \qquad \dim V^{\alpha} = m. \tag{30}$$

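Since the inner products are nonorthonormal, each $V^{\alpha}$ must be individually orthonormalized. Given that $V$ splits into a direct sum, this may be achieved for each subspace $V^{\alpha}$ by applying the Gram–Schmidt algorithm to $\{|H_\alpha D_i\rangle\}$ of $V^{\alpha}$. Consequently, the orthonormal basis may be defined as

$$|K^{\alpha}_{i}\rangle = \sum_{k=1}^{m} A^{\alpha}_{i,k}\,|H_\alpha D_k\rangle; \qquad \langle K^{\alpha}_{i} \mid K^{\alpha}_{j} \rangle = \delta_{ij}, \tag{31}$$

for each $\alpha = 1, 2, \ldots, n$, with $m \times m$ matrices $A^{\alpha}_{i,k}$ for each $\alpha$.

Substituting the inner products (29) gives

$$\sum_{k,k'=1}^{m} A^{\alpha}_{ik}\,A^{\alpha}_{jk'}\,c^{\alpha}_{kk'} = \delta_{ij} \qquad \forall\, \alpha = 1, 2, \ldots, n. \tag{32}$$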
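To make the orthonormalization step concrete, the sketch below runs Gram–Schmidt on one subspace $V^{\alpha}$ using only its overlap (Gram) matrix $c^{\alpha}$, and then checks condition (32). The overlap values are invented for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
def gram_schmidt_coefficients(c):
    """Return the matrix A of Eq. (31) for a real symmetric overlap matrix c.

    Row i of A holds the coefficients of |K_i> in the nonorthonormal basis.
    """
    m = len(c)

    def inner(u, v):
        # Inner product of two coefficient vectors under the overlaps c.
        return sum(u[k] * c[k][l] * v[l] for k in range(m) for l in range(m))

    rows = []
    for i in range(m):
        v = [1.0 if k == i else 0.0 for k in range(m)]  # start from |H D_i>
        for u in rows:
            projection = inner(u, v)
            v = [vk - projection * uk for vk, uk in zip(v, u)]
        norm = inner(v, v) ** 0.5
        rows.append([vk / norm for vk in v])
    return rows


def transformed_overlaps(a, c):
    """Left-hand side of Eq. (32): sum over k, k' of A_ik A_jk' c_kk'."""
    m = len(c)
    return [[sum(a[i][k] * a[j][l] * c[k][l] for k in range(m) for l in range(m))
             for j in range(m)] for i in range(m)]


# Illustrative overlaps for a single hypothesis with m = 3 data (assumed values).
c_alpha = [[1.0, 0.2, 0.1],
           [0.2, 1.0, 0.3],
           [0.1, 0.3, 1.0]]
a_alpha = gram_schmidt_coefficients(c_alpha)
identity_check = transformed_overlaps(a_alpha, c_alpha)
```

With these inputs `identity_check` agrees with the identity matrix to floating-point precision, which is exactly the statement of (32) for that $\alpha$.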


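The wavefunction may now be written as a linear combination of the orthonormalized kets $|K^{\alpha}_{i}\rangle$ with the coefficients $b^{\alpha}_{i}$, and may be expanded into the $|H_\alpha D_i\rangle$ basis using (31), i.e.

$$|\Psi\rangle = \sum_{\alpha,i} b^{\alpha}_{i}\,|K^{\alpha}_{i}\rangle = \sum_{\alpha,i,k} b^{\alpha}_{i}\,A^{\alpha}_{ik}\,|H_\alpha D_k\rangle. \tag{33}$$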

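As with (17) from earlier, the coefficients in (33) should be set as required by the contingency table

$$\sum_{i} b^{\alpha}_{i}\,A^{\alpha}_{i,k} = \sqrt{x_{k\alpha}}, \tag{34}$$

where, to solve for the $b$-coefficients, (32) may be used to invert

$$\sum_{k,k'} \sum_{i} b^{\alpha}_{i}\,A^{\alpha}_{ik}\,A^{\alpha}_{jk'}\,c^{\alpha}_{kk'} = \sum_{k,k'} \sqrt{x_{k\alpha}}\,A^{\alpha}_{jk'}\,c^{\alpha}_{k'k}, \tag{35}$$

giving

$$b^{\alpha}_{j} = \sum_{k,k'} \sqrt{x_{k\alpha}}\,A^{\alpha}_{jk'}\,c^{\alpha}_{kk'}. \tag{36}$$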

Having relabeled the indices as necessary, a back-substitution of (34) into the expansion (33) gives

$$|\Psi\rangle = \sum_{\alpha,i,k} b^{\alpha}_{i}\,A^{\alpha}_{i,k}\,|H_\alpha D_k\rangle = \sum_{\alpha,k} \sqrt{x_{k\alpha}}\,|H_\alpha D_k\rangle, \tag{37}$$

which is the same as having simply assigned each ket's coefficient to the square root of its associated entry in the contingency table.

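The normalization factor for $|\Psi\rangle$ is simply $1/\sqrt{N}$, where $N$ is the sum of the squares of the coefficients $b$ of the orthonormalized bases $|K^{\alpha}_{i}\rangle$,

$$N = \sum_{i,\alpha} (b^{\alpha}_{i})^{2} = \sum_{i,\alpha} b^{\alpha}_{i}\left(\sum_{k,k'} \sqrt{x_{k\alpha}}\,A^{\alpha}_{i,k'}\,c^{\alpha}_{kk'}\right) = \sum_{k,k',\alpha} \sqrt{x_{k\alpha}\,x_{k'\alpha}}\;c^{\alpha}_{kk'}. \tag{38}$$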

Thus, the final normalized wave function is

$$|\Psi\rangle = \frac{\sum_{\alpha,k} \sqrt{x_{k\alpha}}\,|H_\alpha D_k\rangle}{\sqrt{\sum_{i,j,\alpha} \sqrt{x_{i\alpha}\,x_{j\alpha}}\;c^{\alpha}_{ij}}}, \tag{39}$$

where $\alpha$ is summed from 1 to $n$, and $i, j$ are summed from 1 to $m$. Note that, in the denominator, the diagonal term $\sqrt{x_{i\alpha}\,x_{j\alpha}}\;c^{\alpha}_{ij}$, which occurs whenever $i = j$, simplifies to $x_{i\alpha}$ since $c^{\alpha}_{ii} = 1$ for all $\alpha$.

From (39) it follows that, exactly in parallel to the nonintersecting case, if all properties $D_i$ are observed simultaneously, the probability of any hypothesis $H_\alpha$, for a fixed $\alpha$, is

$$P(H_\alpha \mid D_1 \cap D_2 \cap \cdots \cap D_m) = \frac{\sum_{i} (b^{\alpha}_{i})^{2}}{\sum_{i,\beta} (b^{\beta}_{i})^{2}} = \frac{\sum_{i,j} \sqrt{x_{i\alpha}\,x_{j\alpha}}\;c^{\alpha}_{ij}}{\sum_{i,j,\beta} \sqrt{x_{i\beta}\,x_{j\beta}}\;c^{\beta}_{ij}}. \tag{40}$$

In the case of noneven populations for each hypothesis (i.e. noneven priors), the calculated probabilities should be appropriately weighted.
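Once the overlaps are known, the general expression (40) is a short double sum per hypothesis. The sketch below assumes equal priors (the weighting for noneven populations is not specified in the text and is not attempted here) and takes the overlap matrices as given inputs; `quantum_posteriors` is an illustrative name.

```python
import math


def quantum_posteriors(x, c):
    """Evaluate Eq. (40) for every hypothesis.

    x[i][a] is the table entry x_{i,alpha} (row = datum, column = hypothesis).
    c[a][i][j] is the overlap c^alpha_{ij}, with c[a][i][i] == 1.
    """
    m, n = len(x), len(x[0])
    weights = []
    for a in range(n):
        weight = sum(math.sqrt(x[i][a] * x[j][a]) * c[a][i][j]
                     for i in range(m) for j in range(m))
        weights.append(weight)
    total = sum(weights)  # the normalization N of Eq. (38)
    return [w / total for w in weights]


# Table (4) with zero off-diagonal overlaps, i.e. the orthonormal case of (13).
table_4 = [[0.8, 0.7], [0.6, 0.5]]
identity = [[1.0, 0.0], [0.0, 1.0]]
no_overlap = quantum_posteriors(table_4, [identity, identity])
```

With vanishing off-diagonal overlaps each weight reduces to the column sum of the table, so `no_overlap` equals `[1.4 / 2.6, 1.2 / 2.6]` up to rounding; non-zero overlaps add the cross terms $2 c^{\alpha}_{12} \sqrt{x_{1\alpha} x_{2\alpha}}$.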

5. Example Solution

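Returning to the problem presented in the contingency table (4), it is now possible to calculate the precise probability for a randomly selected particle with the properties of "spin ↑" and "charge +" being particle α ($H_1$). For this $2 \times 2$ matrix, recalling from (29) that $c^{\alpha}_{ii} = 1$ and $c^{\alpha}_{ij} = c^{\alpha}_{ji}$, the general expression (40) may be written as

$$\begin{aligned}
P(H_1 \mid D_1 \cap D_2)
&= \frac{\sum_{i,j=1}^{2} \sqrt{x_{i,1}\,x_{j,1}}\;c^{1}_{ij}}{\sum_{i,j=1}^{2} \sum_{\alpha=1}^{2} \sqrt{x_{i\alpha}\,x_{j\alpha}}\;c^{\alpha}_{ij}} \\[4pt]
&= \frac{\sqrt{x_{1,1}^{2}}\,c^{1}_{1,1} + \sqrt{x_{2,1}^{2}}\,c^{1}_{2,2} + \sqrt{x_{1,1}\,x_{2,1}}\,c^{1}_{1,2} + \sqrt{x_{2,1}\,x_{1,1}}\,c^{1}_{2,1}}{\sum_{\alpha=1}^{2}\left(\sqrt{x_{1,\alpha}^{2}}\,c^{\alpha}_{1,1} + \sqrt{x_{2,\alpha}^{2}}\,c^{\alpha}_{2,2} + \sqrt{x_{1,\alpha}\,x_{2,\alpha}}\,c^{\alpha}_{1,2} + \sqrt{x_{2,\alpha}\,x_{1,\alpha}}\,c^{\alpha}_{2,1}\right)} \\[4pt]
&= \frac{x_1 + y_1 + 2c_1\sqrt{x_1 y_1}}{x_1 + x_2 + y_1 + y_2 + 2c_1\sqrt{x_1 y_1} + 2c_2\sqrt{x_2 y_2}},
\end{aligned} \tag{41}$$

where, adhering to the earlier notation (16),

$$\begin{aligned}
x_1 &= x_{1,1} = P(D_1 \mid H_1), & y_1 &= x_{2,1} = P(D_2 \mid H_1), \\
x_2 &= x_{1,2} = P(D_1 \mid H_2), & y_2 &= x_{2,2} = P(D_2 \mid H_2), \\
X_1 &= P(H_1), & X_2 &= P(H_2),
\end{aligned} \tag{42}$$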

and, for brevity, $c_1 := c^{1}_{1,2}$, $c_2 := c^{2}_{1,2}$. For simplicity, $P(H_i \mid D_1 \cap D_2)$ will henceforth be denoted as $P_i$. Implementing (41) is dependent upon deriving solutions for the yet unknown expressions $c_i$, $i = 1, 2$, which govern the extent of the intersection in (29). This can only be achieved by imposing reasonable constraints upon $c_i$ which have been inferred from expected behavior and known outcomes, i.e. through the use of boundary values and symmetries. Specifically, these constraints are:

Data dependence. The expressions $c_i$ must, in some way, be dependent upon the data given in the contingency table, i.e.

$$c_1 = c_1(x_1, y_1, x_2, y_2; X_1, X_2), \qquad c_2 = c_2(x_1, y_1, x_2, y_2; X_1, X_2). \tag{43}$$

Probability. The calculated values for $P_i$ must fall between 0 and 1. Since $x_i$ and $y_i$ are positive, it suffices to take

$$-1 < c_i(x_1, y_1, x_2, y_2) < 1. \tag{44}$$

Complementarity. The law of total probability dictates that

$$P_1 + P_2 = 1, \tag{45}$$

which can easily be seen to hold.


Symmetry. The exchanging of rows within the contingency tables should not affect the calculation of $P_i$. In other words, for each $i = 1, 2$, $P_i$ is invariant under $x_i \leftrightarrow y_i$. This constraint implies that

$$c_i(x_1, y_1, x_2, y_2) = c_i(y_1, x_1, y_2, x_2). \tag{46}$$

Equally, if the columns are exchanged then the $P_i$ must map to each other, i.e. $P_1 \leftrightarrow P_2$ under $x_1 \leftrightarrow x_2$, $y_1 \leftrightarrow y_2$, which gives the further constraint that

$$c_1(x_1, y_1, x_2, y_2) = c_2(x_2, y_2, x_1, y_1). \tag{47}$$

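Known values. There are a number of contingency table structures which give rise to a known probability, i.e.

$$\begin{aligned}
&\begin{array}{c|cc} & H_1 & H_2 \\ \hline D_1 & 1 & 1 \\ D_2 & m & n \end{array} \;\rightarrow\; P_1 = \frac{m}{m+n},
&\qquad
&\begin{array}{c|cc} & H_1 & H_2 \\ \hline D_1 & m & n \\ D_2 & 1 & 1 \end{array} \;\rightarrow\; P_1 = \frac{m}{m+n}, \\[6pt]
&\begin{array}{c|cc} & H_1 & H_2 \\ \hline D_1 & n & m \\ D_2 & m & n \end{array} \;\rightarrow\; P_1 = \frac{1}{2},
&\qquad
&\begin{array}{c|cc} & H_1 & H_2 \\ \hline D_1 & n & n \\ D_2 & m & m \end{array} \;\rightarrow\; P_1 = \frac{1}{2}, \\[6pt]
&\begin{array}{c|cc} & H_1 & H_2 \\ \hline D_1 & m & m \\ D_2 & m & m \end{array} \;\rightarrow\; P_1 = \frac{1}{2},
\end{aligned} \tag{48}$$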

where $m, n$ are positively valued probabilities. For such contingency tables the correct probabilities should always be returned by $c_i$. Applying this principle to (41) gives the constraints

$$\frac{m}{m+n} = \frac{2c_1(m,1,n,1)\sqrt{m} + m + 1}{2c_1(m,1,n,1)\sqrt{m} + 2c_2(m,1,n,1)\sqrt{n} + m + n + 2}, \tag{49}$$

$$\frac{1}{2} = \frac{2c_1(n,m,m,n)\sqrt{m}\sqrt{n} + m + n}{2c_1(n,m,m,n)\sqrt{m}\sqrt{n} + 2c_2(n,m,m,n)\sqrt{m}\sqrt{n} + 2m + 2n}, \tag{50}$$

$$\frac{1}{2} = \frac{2c_1(n,m,n,m)\sqrt{m}\sqrt{n} + m + n}{2c_1(n,m,n,m)\sqrt{m}\sqrt{n} + 2c_2(n,m,n,m)\sqrt{m}\sqrt{n} + 2m + 2n}. \tag{51}$$

Nonhomogeneity. Bayes' theorem returns the same probability for any linearly scaled contingency tables, e.g.

$$x_1 \to 1.0,\; y_1 \to 1.0,\; x_2 \to 1.0,\; y_2 \to 0.50 \;\Rightarrow\; P_1 \approx 0.667, \tag{52}$$

$$x_1 \to 0.5,\; y_1 \to 0.5,\; x_2 \to 0.5,\; y_2 \to 0.25 \;\Rightarrow\; P_1 \approx 0.667. \tag{53}$$

While homogeneity may be justified for conditionally independent data, this is not the case for intersecting, co-dependent data since the act of scaling changes the nature of the intersections and the relationship between them. This may be easily demonstrated by taking the possible value ranges for (52) and (53), calculated using (5), which are

$$\begin{aligned}
\text{Eq. (52)} \;&\Rightarrow\; (D_1 \cap D_2) \mid H_1 = \{1\}, & (D_1 \cap D_2) \mid H_2 &= \{0.5\}, \\
\text{Eq. (53)} \;&\Rightarrow\; (D_1 \cap D_2) \mid H_1 = \{0.0 \ldots 0.5\}, & (D_1 \cap D_2) \mid H_2 &= \{0.0 \ldots 0.25\}.
\end{aligned} \tag{54}$$

The effect of scaling has not only introduced uncertainty where previously there had been none, but has also introduced the possibility of 0 as a valid answer for both hypotheses. Further, the spatial distance between the hypotheses has also decreased. For these reasons, it would seem unreasonable to assert that (52) and (53) share the same likelihood ratio.
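The contrast drawn here can be checked directly: the naive Bayes value is unchanged by halving every entry, while the admissible overlap intervals from (5) are not. The sketch below restates (5) for proportions rather than counts; both function names are illustrative assumptions.

```python
def naive_bayes(x1, y1, x2, y2):
    """Naive posterior for H1 with equal priors, as used in (52) and (53)."""
    return x1 * y1 / (x1 * y1 + x2 * y2)


def overlap_interval(p_d1, p_d2):
    """Eq. (5) written for proportions: (lowest, highest) joint proportion."""
    return (max(0.0, p_d1 + p_d2 - 1.0), min(p_d1, p_d2))


full = (1.0, 1.0, 1.0, 0.5)          # Eq. (52): x1, y1, x2, y2
half = tuple(v / 2 for v in full)    # Eq. (53): the same table scaled by 1/2

print(round(naive_bayes(*full), 3), round(naive_bayes(*half), 3))  # 0.667 0.667

print(overlap_interval(full[0], full[1]))  # (1.0, 1.0)
print(overlap_interval(full[2], full[3]))  # (0.5, 0.5)
print(overlap_interval(half[0], half[1]))  # (0.0, 0.5)
print(overlap_interval(half[2], half[3]))  # (0.0, 0.25)
```

The four intervals match the ranges listed in (54): scaling leaves the naive ratio untouched but turns two point values into two genuine ranges.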

Using these principles and constraints, it becomes possible to solve $c_i$. From the principle of symmetry, it follows that

$$\begin{aligned}
c_1(n,m,m,n) &= c_2(m,n,n,m) = c_2(n,m,m,n), \\
c_1(n,m,n,m) &= c_2(n,m,n,m) = c_2(n,m,n,m),
\end{aligned} \tag{55}$$

and that the equalities (50), (51) for $P_i = 0.5$ automatically hold. Further, (49) solves to give

$$c_2(m,1,n,1) = \frac{2\sqrt{m}\,n\,c_1(m,1,n,1) - m + n}{2m\sqrt{n}}, \tag{56}$$

which, because $c_1(n,1,m,1) = c_2(m,1,n,1)$, finally gives

$$c_1(n,1,m,1) = \frac{2\sqrt{m}\,n\,c_1(m,1,n,1) - m + n}{2m\sqrt{n}}. \tag{57}$$
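The step from (49) to (56) is stated without intermediate algebra. A short worked version, writing $c_1$ and $c_2$ as shorthand for $c_1(m,1,n,1)$ and $c_2(m,1,n,1)$ (a notational convenience introduced only for this sketch), runs as follows:

```latex
\begin{aligned}
% Cross-multiply (49):
m\left(2c_1\sqrt{m} + 2c_2\sqrt{n} + m + n + 2\right)
  &= (m+n)\left(2c_1\sqrt{m} + m + 1\right) \\
% Expand both sides:
2mc_1\sqrt{m} + 2mc_2\sqrt{n} + m^2 + mn + 2m
  &= 2mc_1\sqrt{m} + 2nc_1\sqrt{m} + m^2 + m + mn + n \\
% Cancel the common terms 2mc_1\sqrt{m},\; m^2,\; mn:
2mc_2\sqrt{n} &= 2nc_1\sqrt{m} + n - m \\
% Divide by 2m\sqrt{n}:
c_2 &= \frac{2\sqrt{m}\,n\,c_1 - m + n}{2m\sqrt{n}}
\end{aligned}
```

The last line is (56); replacing $c_2(m,1,n,1)$ by $c_1(n,1,m,1)$ through the column symmetry (47) then yields (57).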


Substituting $g(m,n) := \sqrt{n}\,c_1(m,1,n,1)$ transforms (57) into an anti-symmetric bivariate functional equation in $m, n$,

$$g(m,n) - g(n,m) = \frac{m}{2\sqrt{mn}} - \frac{n}{2\sqrt{mn}}, \tag{58}$$

whose solution is $g(m,n) = \dfrac{m}{2\sqrt{mn}}$.

This gives a final solution for the coefficients $c_{1,2}$ of

$$c_1(x_1, y_1, x_2, y_2) = \frac{\sqrt{x_1 y_1}}{2\,x_2 y_2}, \qquad
c_2(x_1, y_1, x_2, y_2) = \frac{\sqrt{x_2 y_2}}{2\,x_1 y_1}. \tag{59}$$

Thus, substituting (59) into (41) gives the likelihood ratio expression of

$$P(H_1 \mid D_1 \cap D_2) = \frac{\dfrac{x_1 y_1}{x_2 y_2} + x_1 + y_1}{\dfrac{x_1 y_1}{x_2 y_2} + x_1 + y_1 + \dfrac{x_2 y_2}{x_1 y_1} + x_2 + y_2}. \tag{60}$$

Given that the population sizes of $H_1$ and $H_2$ are the same, no weighting needs to take place. Hence, the value of $P(H_1 \mid D_1 \cap D_2)$ for (4) may now be calculated to be

$$P(H_1 \mid D_1 \cap D_2) \approx 0.5896. \tag{61}$$
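The closed form (60) is easy to evaluate and to test against the known-value tables in (48). The sketch below assumes equal priors, as in the text; `quantum_likelihood` is an illustrative name, and the test values of m and n are arbitrary.

```python
def quantum_likelihood(x1, y1, x2, y2):
    """P(H1 | D1 and D2) from Eq. (60), for two hypotheses with equal priors."""
    ratio = (x1 * y1) / (x2 * y2)          # equals 2 * c1 * sqrt(x1 * y1)
    numerator = ratio + x1 + y1
    denominator = numerator + 1.0 / ratio + x2 + y2
    return numerator / denominator


# Table (4): spin-up and positive-charge proportions for particles alpha, beta.
p1 = quantum_likelihood(0.8, 0.6, 0.7, 0.5)
print(f"{p1:.4f}")  # 0.5896

# Known-value structures from (48), with arbitrary illustrative m and n.
m, n = 0.3, 0.6
print(round(quantum_likelihood(m, 1.0, n, 1.0), 6), round(m / (m + n), 6))
print(round(quantum_likelihood(n, m, m, n), 6))
```

For these inputs the second line prints the same number twice and the third prints 0.5, in line with (49) and (50). The value lies between the mean-frequency estimate (8) and the mean-of-range estimate (9).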

6. Discussion

One of the greatest obstacles in developing any statistical approach is demonstrating correctness. This formula is no different in that respect. If correctness could be demonstrated then, a priori, there would be an appropriate existing method which would negate the need for a new one. All that may be hoped for in any approach is that it generates appropriate answers when they are known, reasonable answers for all other cases, and that these answers follow logically from the underlying mathematics.

However, what is clear is that the limitations of the naive Bayes' classifier render any calculations derived from it open to an unknown margin of error. Given the importance of accurately deriving likelihood ratios this is troubling. This is especially true when the statistical tolerance of calculations is marginal.

As a quantum mechanical methodology this result is able to calculate accurate, iteration-free likelihood ratios which fall beyond the scope of existing statistical techniques, and offers a new theoretical approach within both statistics and physics. Further, through the addition of a Hamiltonian operator to introduce time-evolution, it can offer likelihood ratios for future system states with appropriate updating of the contingency table. In contrast, Bayes' theorem is unable to distinguish directly between time-dependent and time-independent systems. This may lead to situations where the process of contingency table updating results in the same decisions being made repeatedly with the appearance of an ever increasing degree of certainty. Indeed, from (26), it would seem that the naive Bayes' classifier is only a special case of a more complex quantum mechanical framework, and may only be used where the exclusivity of data is guaranteed.

The introduction of a Hamiltonian operator, and a full quantum dynamical formalism, is in progress, and should have profound implications for the physical sciences. Inevitably, such a formalism will require a sensible continuous classical limit. In other words, the final expressions for the likelihood ratios should contain a parameter, in some form of $\hbar$, which, when going to 0, reproduces a classically known result. For example, the solutions to (59) could be moderated as

$$c_1(x_1, y_1, x_2, y_2) = \frac{\sqrt{x_1 y_1}}{2\,x_2 y_2}\left(1 - \exp(-\hbar)\right), \qquad
c_2(x_1, y_1, x_2, y_2) = \frac{\sqrt{x_2 y_2}}{2\,x_1 y_1}\left(1 - \exp(-\hbar)\right), \tag{62}$$

so that in the limit of $\hbar \to 0$, the intersection parameters, $c_1$ and $c_2$, vanish to return the formalism to the classical situation of independent data.
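The limiting behavior described here can be illustrated numerically. The sketch below assumes the moderating factor in (62) reads $1 - \exp(-\hbar)$ (the signs are an assumption of this example) and treats `hbar` as a dimensionless tuning parameter; `moderated_likelihood` is a hypothetical name.

```python
import math


def moderated_likelihood(x1, y1, x2, y2, hbar):
    """Eq. (41) evaluated with overlaps damped by an assumed 1 - exp(-hbar)."""
    damping = 1.0 - math.exp(-hbar)
    c1 = math.sqrt(x1 * y1) / (2.0 * x2 * y2) * damping
    c2 = math.sqrt(x2 * y2) / (2.0 * x1 * y1) * damping
    cross1 = 2.0 * c1 * math.sqrt(x1 * y1)
    cross2 = 2.0 * c2 * math.sqrt(x2 * y2)
    return (x1 + y1 + cross1) / (x1 + y1 + x2 + y2 + cross1 + cross2)


# Table (4), from vanishing overlap to (numerically) undamped overlap.
for hbar in (0.0, 0.5, 2.0, 50.0):
    print(hbar, round(moderated_likelihood(0.8, 0.6, 0.7, 0.5, hbar), 4))
```

At `hbar = 0.0` the cross terms vanish and (41) reduces to $(x_1 + y_1)/(x_1 + y_1 + x_2 + y_2)$; for large `hbar` the value approaches the 0.5896 of (61).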

7. Conclusion

This paper has demonstrated both theoretically, and practically, that a quantum mechanical methodology can overcome the axiomatic limitations of classical statistics. In doing so, it challenges the orthodoxy of de Finetti's epistemological approach to statistics by demonstrating that it is possible to derive "real" likelihood ratios from information systems without recourse to arbitrary and subjective evaluations.

While further theoretical development work needs to be undertaken, particularly with regards to the application of these mathematics in other domains, it is hoped that this article will help advance the debate over the nature and meaning of statistics within the physical sciences.

Acknowledgments

YHH would like to thank the Science and Technology Facilities Council, UK, for

grant ST/J00037X/1, the Chinese Ministry of Education, for a Chang-Jiang Chair

Professorship at NanKai University as well as the City of Tian-Jin for a Qian-Ren

Scholarship, and Merton College, Oxford, for her enduring support.

References

1. B. de Finetti, Theory of Probability: A Critical Introductory Treatment, Vols. 1 and 2, translated by A. Machí and A. Smith (Wiley, New York, 1974).

2. C. Caves, C. Fuchs and R. Schack, Conditions for compatibility of quantum-state assignments, Phys. Rev. A 66(6) (2002) 062111(1), doi: 10.1103/PhysRevA.66.062111.

3. C. G. Timpson, Quantum Bayesianism: A study, Stud. Hist. Philos. Sci. B: Stud. Hist. Philos. Mod. Phys. 39(3) (2008) 579, doi: 10.1016/j.shpsb.2008.03.006.

4. A. P. Dempster, N. M. Laird and D. B. Rubin, J. R. Stat. Soc. Ser. B (Stat. Methodol.) 39(1) (1977) 1.

5. M. Oaksford and N. Chater, Bayesian Rationality: The Probabilistic Approach to Human Reasoning (Oxford University Press, Oxford, 2007).

6. C. F. J. Wu, On the convergence properties of the EM algorithm, Ann. Stat. 11(1) (1983) 95, doi: 10.1214/aos/1176346060.