Extended statistical modeling under symmetry; the link toward quantum mechanics
ABSTRACT We derive essential elements of quantum mechanics from a parametric structure
extending that of traditional mathematical statistics. The basic setting is a
set $\mathcal{A}$ of incompatible experiments, and a transformation group $G$
on the cartesian product $\Pi$ of the parameter spaces of these experiments.
The set of possible parameters is constrained to lie in a subspace of $\Pi$, an
orbit or a set of orbits of $G$. Each possible model is then connected to a
parametric Hilbert space. The spaces of different experiments are linked
unitarily, thus defining a common Hilbert space $\mathbf{H}$. A state is
equivalent to a question together with an answer: the choice of an experiment
$a\in\mathcal{A}$ plus a value for the corresponding parameter. Finally,
probabilities are introduced through Born's formula, which is derived from a
recent version of Gleason's theorem. This then leads to the usual formalism of
elementary quantum mechanics in important special cases. The theory is
illustrated by the example of a quantum particle with spin.
arXiv:quant-ph/0503214v4 18 May 2006
The Annals of Statistics
2006, Vol. 34, No. 1, 42–77
DOI: 10.1214/009053605000000868
© Institute of Mathematical Statistics, 2006
By Inge S. Helland
University of Oslo
Received March 2003; revised March 2005.
AMS 2000 subject classifications. Primary 62A01; secondary 81P10, 62B15.
Key words and phrases. Born’s formula, complementarity, complete sufficient statistics,
Gleason’s theorem, group representation, Hilbert space, model reduction, quantum
mechanics, quantum theory, symmetry, transition probability.
This is an electronic reprint of the original article published by the
Institute of Mathematical Statistics in The Annals of Statistics,
2006, Vol. 34, No. 1, 42–77. This reprint differs from the original in pagination
and typographic detail.
1. Introduction. Both statistics and quantum theory deal with prediction
using the concept of probability. Historically, the difference between the
two disciplines has been large, but in the last few years it has diminished,
not least due to the recent work by Barndorff-Nielsen, Gill and Jupp
[7].
The lack of contact between the two disciplines is of course related to the
difference in foundation, but one of the aims of the present paper is to argue
that to a certain extent, this difference in foundation can be overcome. This
may perhaps at first be difficult to believe: In statistics, the state of a given
system is given simply by a probability measure on some measurable space.
In quantum theory in its most common formulation the state of a system is
given by a vector v in some abstract Hilbert space. As a continuation of this
formal theory, each observable is linked to a self-adjoint operator T on the
same Hilbert space in such a way that the expectation of this observable in
the state v is given by (v,Tv). Associated with this is Born’s formula: The
transition probability from state u to state v is of the form |(v,u)|². Also, in
the absence of what physicists call superselection rules, linear combinations
of state vectors form new state vectors, which leads to interference phenomena
unknown to classical statistics.
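The two formal rules just quoted, the expectation (v,Tv) and the transition probability |(v,u)|², can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the two-dimensional vectors u, v and the operator T are hypothetical choices, not taken from the paper, and the function names are introduced here for convenience.

```python
import numpy as np

# Hypothetical two-dimensional example; u, v and T are illustrative choices.
u = np.array([1.0, 0.0], dtype=complex)                 # unit state vector u
v = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)  # unit state vector v
T = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)  # self-adjoint operator

def expectation(state, operator):
    """Expectation (v, Tv) of the observable linked to `operator`."""
    return np.vdot(state, operator @ state).real

def transition_probability(u, v):
    """Born's formula |(v, u)|^2 for unit vectors u and v."""
    return abs(np.vdot(v, u)) ** 2

exp_u = expectation(u, T)             # u is an eigenvector of T
exp_v = expectation(v, T)             # v is an equal-weight superposition
p_uv = transition_probability(u, v)
print(exp_u, exp_v, p_uv)
```

Here `np.vdot` conjugates its first argument, which matches an inner product that is conjugate-linear in the first slot; this convention is an assumption of the sketch.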
The Born formula allows physicists to compute probabilities for sets of
outcomes, perhaps as a function of certain parameters. Statistical methods
can then be used for inference about these parameters, as discussed in [7].
By contrast, the present paper aims at giving a statistical interpretation of
the vectors v themselves. If parameters are introduced as in op. cit., the total
model will be similar to the hierarchical models used in Bayesian statistics.
We will not use these latter kinds of parameters in the present paper. Our
parametric models will be of the simplest kind, but we will emphasize that
the choice between different experimental questions to focus upon also may
imply a choice between different parametric models.
The quantum formalism as such is the result of a long development within
physics, starting with discoveries by Max Planck and continuing with contributions
by Bohr, Pauli, Schrödinger, Heisenberg and many others.
There are many good books on quantum theory, for instance, [39], where
also some of the philosophical background is discussed.
Many authors have tried to find deeper foundations leading to the formal-
ism of quantum theory. Several mathematical approaches are discussed in
[60]. One such approach is quantum logic, treated in detail by Beltrametti
and Cassinelli [12].
The earliest book on the mathematical foundation of quantum mechanics
is [58]; in English translation, [59]. This book has had great influence; in its
time it constituted a very important mathematical synthesis of the theory of
quantum phenomena. The book can also be considered to be a forerunner of
quantum probability. For physicists, von Neumann’s book was supplemented
by the book of Dirac [24], which started the development leading to modern
quantum field theory.
The development of quantum probability as a mathematical discipline,
continuing the more formal development of quantum theory, was started
in the 1970s. A first important topic was to develop a noncommutative
analogue of the notion of stochastic processes; see [1] and references therein.
Other topics were noncommutative conditional expectations and quantum
filtering and prediction theory ([10] and references therein).
Quantum probability was made popular among ordinary probabilists by
Meyer [45]. A related book is [49], which discusses the quantum stochastic
calculus founded by Hudson and Parthasarathy, but also many other themes
related to the mathematics of current quantum theory. An example of a symposium
proceedings volume aiming at covering both conventional probability theory
and quantum probability is [2].
There are also links between quantum theory and statistical inference
theory. A systematic treatment of quantum hypothesis testing and quantum
estimation theory was first given by Helstrom [37]. In [38] several aspects of
quantum inference are discussed in depth; among other things the book con-
tains a chapter on symmetry groups. A survey paper on quantum inference
is Malley and Hornstein [43].
As an example of a particular statistical topic of interest, consider that
of Fisher information. Since a quantum state ordinarily allows several ex-
periments, this concept can be generalized in a natural way. A quantum
information measure due to Helstrom can be shown to give the maximal
Fisher information over all possible experiments; for a recent discussion see
[6].
One can thus point to several links between ordinary probability and
statistics on the one hand and their quantum counterparts on the other hand.
However, a general theory encompassing both sides, based on a reasonably
intuitive foundation, has until now been lacking.
The main purpose of the present paper is indeed to suggest a new ap-
proach to the statistical foundation of quantum mechanics based on elemen-
tary concepts such as choice of experiment, probability model, complemen-
tarity, symmetry and model reduction. I claim that this approach leads to a
conceptual basis which is more intuitive than the usual one. This is of course
a very bold statement, knowing how well established the ordinary quantum
formalism is, especially since the program started here also needs further
development. Nevertheless, I will claim that for readers knowing statistical
theory and some group theory, the present approach will probably be more
enlightening than the usual formalism.
In addition to the implications for quantum theory, the concepts needed
to complete this program, and also concepts learned directly from quantum
theory, may at the same time turn out to lead to an enrichment of current
statistical theory.
An example is the concept of complementarity; in our approach this de-
notes the situation where two parameters cannot both be estimated accu-
rately in a given context, but it can also be given a wider content. In our
opinion this concept should not be confined to the microworld. This view is
also in line with Bohr [16], who gave talks explaining the concept of com-
plementarity to, among others, biologists and sociologists.
A related generalization of the ordinary statistical paradigm will in fact
be basic to our main setting: Before we look at the parameter of a concrete
experiment, we consider all questions that can be addressed in any experi-
ment in a given context. Thus there is a total parameter φ, which is a vector
containing all theoretical quantities that can be imagined for a given system.
Any experiment which is chosen has a parameter that is a function of φ, but
φ itself has too rich a content to be estimated. Some ordinary statistical
situations that can be fit into this pattern are:
Example 1. Consider all quantities of relevance that are contemplated
at the experimental design phase. This can be made concrete in many
different directions.
Example 2. A questionnaire is designed for a statistical investigation
with a fixed number of alternatives for each question. Some respondents
insist on giving unexpected but informative answers, say, comments in
addition to the fixed questions. The total parameter φ may contain some such
possibilities.
Example 3. More generally: A statistical investigation on some group
of humans is performed, say, through a questionnaire. Let φ contain all
possible information about these humans which may have some relevance to
the concrete questions posed.
Example 4. There is a fragile apparatus for some specific length
measurement which is destroyed after one measurement. Let µ be the length
which is to be measured. Assume furthermore that the standard deviation
of measurement σ can only be estimated by destroying the apparatus. Let
then φ = (µ,σ).
Example 5. Assume that a particular patient has an expected survival
time λ1 if he gets treatment 1 at a specific time t, and expected survival
time λ2 if he gets treatment 2 at that time. Here “expected” is not primarily
meant in relation to a probability model, but may at this point be related to
what is expected by the medical experts taking into account all knowledge
they have about the patient and about the treatments. Then φ = (λ1,λ2)
can never be estimated.
Example 6. Let there be two questions which are to be asked of an
individual, where we know that the answer will depend on the order in
which the questions are posed. Let (λ1,λ2) be the expected answer when
the questions are posed in one order, and (λ3,λ4) when the questions are
posed in the other order. Then φ = (λ1,...,λ4) cannot be estimated from
one individual.
Many more realistic, moderately complicated, examples exist, like the
behavioral parameters of a rat taken together with parameters of the brain
structure which can only be measured if the rat is killed.
We will concentrate much on the statistical parameter space. An essential
point of the statistical paradigm is that, before the experiment, the param-
eter λ is unknown; afterward it is as a rule fairly accurately determined. In
this way the focus is shifted from what the value of the parameter “is” to the
knowledge we have about the parameter. In a physical context this can eas-
ily be made consistent with the point of view expressed by Niels Bohr, cited
from [51]: “It is wrong to think that the task of physics is to find out how
nature is. Physics concerns what we can say about nature.” This statement
is also in agreement with current views of quantum theory, as expressed, for
instance, by Fuchs [27].
It is well known that there exists in the literature a large number of sug-
gestions for interpretations of quantum theory; a very incomplete list is given
by the references [13, 15, 20, 25]. Most of these interpretations include the
ordinary minimalistic interpretation of Niels Bohr (the Copenhagen school
or pragmatic interpretation concentrating on interpreting the outcomes of
concrete experiments; for more details see [39]). The present article also
implies a particular statistical interpretation related to the Niels Bohr in-
terpretation, but it is beyond the scope of this paper to discuss in detail
relations to other interpretations given in the literature.
There are also a few related papers in the recent literature. Bohr and
Ulfbeck [14] discuss a foundation of quantum mechanics which is based upon
irreducible representations of groups, and thus uses symmetry in a way which
is similar to ours. Caves, Fuchs and Schack [19] propose a Bayesian approach
to quantum theory based upon Gleason’s powerful Hilbert space theorem.
Here we will avoid taking an abstract Hilbert space as a point of departure,
but we will arrive at it from a rather concrete setting. Finally, Hardy [32] de-
rives quantum theory and probability theory from a few reasonable axioms,
without going into any details concerning the state concept.
Sections 2–7 below are preparatory: In Section 2 group actions on the sam-
ple space and on the parameter space of an experiment are discussed, and
the concept of permissibility is introduced. In Section 3 it is shown that per-
missibility always can be achieved by going to a subgroup; such a subgroup
connected to an experimental parameter will be important later. In Section 4
the relation to causal inference, in particular to the concept of counterfactu-
als, is discussed, while in Section 5 the main quantum-mechanical example,
electron spin, is treated. Section 6 gives the starting point sketched in the
abstract above: reduction of the cartesian product of the parameter spaces
of complementary experiments, while Section 7 treats model reduction in
general and introduces the concept of group representation.
Then in Sections 8–10 the basic Hilbert space is introduced, first for a
single experiment and then tied together for several complementary experi-
ments. The treatment in these sections could have been simplified consider-
ably by concentrating on the parameter space. The full discussion involving
the sample space is included mainly for three reasons, however: First, this
paves the way for further generalizations. Second, the context of an experi-
ment is related to the limitation of the data that can be obtained, and this
context is felt to play a role in the quantization. Third, a discussion of the
full experiment is needed later in Section 12.
Before that, in Section 11, operators and states are introduced.
An important result is proved in Section 12: Born’s formula for the tran-
sition probability between experiments. From this, the basic formalism of
elementary quantum mechanics is derived in Section 13.
In what follows, we will make several explicit assumptions; most of them
are relatively weak and fairly natural in a statistical setting. The excep-
tions to this are Assumption 5, which is a simple assumption about the
connection between the parameter spaces associated with different choices
of experiments; Assumption 7, which through a limitation of the parameter
space serves to restrict us to a discussion of elementary quantum theory;
and finally, Assumption 8, which gives the symmetry assumption needed to
derive Born’s formula and from this the formalism of elementary quantum
mechanics.
2. Statistical models and groups. In general the total parameter space
Φ (the range of the total parameter φ) can have almost any structure; in
this paper we will assume:
Assumption 1. Φ is a locally compact topological space. There is a
transformation group G acting on Φ which satisfies certain weak technical
requirements (see Appendix A.1) so that Φ can be given a right invariant
measure ν, that is, a measure which satisfies ν((dφ)g) = ν(dφ).
Note that in this paper, group actions will always be written to the right:
φ ↦ φg. The reason for this is simply that it facilitates the introduction of
the right invariant measure, which from several points of view [34] in the case
of a single parameter can be argued to be the best choice of a noninformative
prior under symmetry in ordinary Bayesian statistical inference.
The right invariant measure is unique (up to a fixed constant) for transi-
tive transformation groups, that is, group actions where the space consists of
one single orbit. An orbit is defined as a set of the form {φ : φ = φ0g for
some g ∈ G}, where φ0 ∈ Φ is fixed.
In general the space Φ can be divided into several orbits, and the invariant
measure is unique on each orbit; it must be supplemented by some measure
on the orbit indices in order to give a measure on the whole space Φ.
When a group G is defined on the (total) parameter space Φ, an impor-
tant property that an experimental parameter may or may not have is the
following (cf. McCullagh [44], who chose to call this concept natural):
Definition 1. The parameter λ is called permissible as a function λ(φ)
if it satisfies:
If λ(φ1) = λ(φ2) then λ(φ1g) = λ(φ2g) for all g ∈ G.
The most important argument for this restriction is that it leads to a
uniquely defined action of the group G on the image space Λ of λ(φ):
(λg)(φ) = λ(φg). (1)
Several general arguments for permissibility are given in [33, 34]: When
this property holds, the best equivariant estimator, which essentially is the
Bayes estimator under prior ν, is conserved under model reduction using
functions of λ. Also, in the transitive case credibility intervals under the
invariant prior turn out to be identical to confidence intervals, and certain
paradoxes related to Bayes estimation are avoided.
Trivially, the total parameter λ = φ itself is permissible. Also, the vector
parameter (λ1,...,λk) is permissible if each λi is permissible.
As will be shown in the next section, if λ is not permissible with respect
to G, one can always define a maximal subgroup with respect to which λ is
permissible. This will be the usual case in our setting.
Let now a general group D of transformations be defined on the parameter
space Λ—the range of λ. This transformation group D will be kept fixed,
being thought of as a part of the specification of the problem in addition to
the statistical model.
Sometimes a group D of transformations on the sample space is defined
first, and then the actions on the parameter space are introduced via the
statistical model by defining probability measures Pλg for g ∈ D on the
sample space X by
Pλg(B) = Pλ(Bg⁻¹) for sets B. (2)
Then the connection between these two transformation groups is a
homomorphism: If g1 and g2 are taken to act on the two spaces X and Λ, then
gi⁻¹ (i = 1, 2) and g1g2 act on both spaces in the same way. The concept of
homomorphism will be fundamental to this paper. It means that we have very similar
group actions: The identity element, inverses and subgroups are mapped as
they should be between the two transformation groups; that is, the essential
structure is inherited. This is the reason why the same symbol D can and
will be used for both transformation groups. If g is mapped by (2) into the
identity e only when g = e, then the homomorphism will be an isomorphism:
The structures of the two groups are then essentially identical. If in addition
a one-to-one correspondence can be established between the spaces upon
which the groups act, everything will be equivalent.
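A simple concrete instance of (2) may help fix ideas. The Python sketch below assumes a hypothetical normal location model X ~ N(λ, 1) with the translation group x ↦ x + c, so that λg = λ + c and Bg⁻¹ is the set B shifted by −c; the model, the numbers and the function names are illustrative assumptions, not taken from the paper.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_interval(lam, lower, upper):
    """Probability of B = [lower, upper] when X ~ N(lam, 1)."""
    return normal_cdf(upper - lam) - normal_cdf(lower - lam)

lam, c = 0.3, 1.7           # parameter value and translation g: x -> x + c
lower, upper = -0.5, 2.0    # the set B

# Left side of (2): the model at the transformed parameter lam g = lam + c.
lhs = prob_interval(lam + c, lower, upper)
# Right side of (2): the original model on B g^{-1} = [lower - c, upper - c].
rhs = prob_interval(lam, lower - c, upper - c)
print(lhs, rhs)
```

In this example the same number c describes the group element on both the sample space and the parameter space, which is the sense in which one symbol can serve for both transformation groups.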
A further discussion of symmetry groups in statistics is given in [34] and in
Appendix A.1. Note that the existence of a group D acting on the parameter
space Λ in fact requires very few explicit invariance properties. What is
needed is basically: (i) The sample space and the parameter space should
both be closed under the transformations in the group. (ii) If the problem
is formulated in terms of a loss function, this should be unchanged when
observations and parameters are transformed conformably by the group. (iii)
If a noninformative prior on Λ is needed, the right invariant distribution ν
on this space should be used.
3. Experimental parameters and permissibility. Assuming that a
parameter or total parameter φ is used to model some given part of reality,
there are usually many questions that can be investigated in such a setting.
Very often different such questions are addressed by performing different
experiments on the specific part of reality in question. (A related case is when
different questions are addressed within the same experiment, e.g., when
statisticians consider different sets of orthogonal contrasts in an analysis of
variance experiment.)
Let A be the set of such questions, from now on in this paper assumed to
be connected to different experiments.
Assumption 2. For each a ∈ A there is a parameter λa = λa(φ), for
which we assume that a probability model Pλa(·) exists corresponding to
experiment a. It is assumed that each experiment is maximal, that is, that
there exists no possible experiment with parameter µa such that λa is a
proper function of µa.
In a physical context, Pλa(·) should be the probability measure for the
measurement apparatus, at the present moment left unspecified.
When we in the sequel talk about choice of experiment/question a, we re-
ally mean a choice of (a,λa). But the probability measure Pλa(·) is thought
to be connected to the measurement apparatus, and is not at the outset in-
cluded in this choice. Quantum probabilities are first introduced in Theorem
5.
When a transformation group G is defined on the (total) parameter space
Φ, an important property of the experimental parameter λa is whether it is
a permissible function λa(φ). As already said, the most important argument
for this restriction is that it leads to a uniquely defined transformation group
Ga on the image space Λa of λa(φ), so that (λa ga)(φ) = λa(φ ga) for ga ∈ Ga.
As a simple illustration of a group connected to a parameter space or
the total parameter space, look at the (total) parameter φ = (µ,σ) with the
translation/scale group (µ,σ) ↦ (a + bµ, bσ) where b > 0. The following
one-dimensional parameters are permissible: µ, σ, µ³, µ + σ, µ + 3σ. If such a
parameter is asked for, say as a focus parameter, all of these are valid
candidates.
On the other hand, the following parameters are not permissible, and
would according to McCullagh [44] lead to absurd focus parameters under
this group: µ + σ², σe^µ, tan(µ)/sin(σ).
A further example is given by the coefficient of variation σ/µ. This is not
permissible. (The location part of the transformation does not make sense
here.) But it will be permissible if the group is reduced to the pure scale
group (µ,σ) ↦ (bµ,bσ), b > 0. This points at an important general
Principle. If a focus parameter λa(φ) is not permissible with respect to
the basic group G, then take a subgroup Ga so that it becomes permissible
with respect to this subgroup.
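The permissibility statements for the translation/scale group can be probed numerically. The Python sketch below takes two points φ1, φ2 with equal parameter value and checks whether the values remain equal after a group element is applied. All function names and test points are hypothetical choices made for illustration. A failed check refutes permissibility, while a passed check for one pair is only consistent with it and proves nothing.

```python
import math

def act(phi, a, b):
    """Translation/scale group element acting on phi = (mu, sigma), b > 0."""
    mu, sigma = phi
    return (a + b * mu, b * sigma)

def stays_equal(lam, phi1, phi2, a, b):
    """Given lam(phi1) == lam(phi2), check that equality survives the action."""
    assert math.isclose(lam(phi1), lam(phi2))
    return math.isclose(lam(act(phi1, a, b)), lam(act(phi2, a, b)))

def mu_plus_sigma(phi):
    return phi[0] + phi[1]

def mu_plus_sigma_sq(phi):
    return phi[0] + phi[1] ** 2

def coeff_of_variation(phi):
    return phi[1] / phi[0]

# mu + sigma: the two points agree before and after the transformation.
ok_sum = stays_equal(mu_plus_sigma, (1.0, 2.0), (2.0, 1.0), a=0.5, b=2.0)
# mu + sigma^2: agreement is destroyed by a pure scale change.
ok_sq = stays_equal(mu_plus_sigma_sq, (0.0, 2.0), (3.0, 1.0), a=0.0, b=2.0)
# sigma/mu: destroyed by a translation, preserved by a pure scale change.
ok_cv_full = stays_equal(coeff_of_variation, (1.0, 2.0), (2.0, 4.0), a=1.0, b=1.0)
ok_cv_scale = stays_equal(coeff_of_variation, (1.0, 2.0), (2.0, 4.0), a=0.0, b=3.0)
print(ok_sum, ok_sq, ok_cv_full, ok_cv_scale)
```

The last two checks mirror the Principle: the coefficient of variation fails under the full group but passes once the group is reduced to the pure scale subgroup.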
Lemma 1. Given a parameter λa, there is always a maximal subgroup
Ga of G such that λa is permissible with respect to Ga.
Proof. Let Ga be the set of all g ∈ G such that for all φ1,φ2 ∈ Φ we have
that λa(φ1) = λa(φ2) if and only if λa(φ1g) = λa(φ2g). Then Ga contains the
identity. Furthermore, using the definition with φ1,φ2 replaced by φ1g1,φ2g1,
it follows that g1g2 ∈ Ga when g1 ∈ Ga and g2 ∈ Ga. Using the definition with
φ1,φ2 replaced by φ1g⁻¹,φ2g⁻¹, it is clear that it contains inverses. Hence
Ga is a group. It follows from the construction that it is maximal. □
From this it follows that the group Ga also acts on Λa = λa(Φ), by a
simple homomorphism determined as in (1).
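For a finite group the construction in the proof of Lemma 1 can be carried out by brute force. The following Python sketch assumes a hypothetical three-point space Φ = {0, 1, 2}, the full permutation group as G, and an indicator function as the experimental parameter; none of these choices comes from the paper, and the names are introduced only for this illustration.

```python
from itertools import permutations

PHI = (0, 1, 2)                  # hypothetical finite total parameter space
G = list(permutations(PHI))      # all permutations; g sends phi to g[phi]

def lam(phi):
    """Hypothetical experimental parameter: indicator of phi == 0."""
    return int(phi == 0)

def in_Ga(g):
    """Condition from the proof: lam(p1) == lam(p2) iff lam(p1 g) == lam(p2 g)."""
    return all(
        (lam(p1) == lam(p2)) == (lam(g[p1]) == lam(g[p2]))
        for p1 in PHI for p2 in PHI
    )

Ga = [g for g in G if in_Ga(g)]

def compose(g1, g2):
    """Right action: phi -> (phi g1) g2."""
    return tuple(g2[g1[p]] for p in PHI)

# Ga should be closed under composition, as the proof asserts.
closed = all(compose(g1, g2) in Ga for g1 in Ga for g2 in Ga)
print(Ga, closed)
```

In this toy case the level sets of the parameter are {0} and {1, 2}, so the subgroup found consists of the identity and the exchange of 1 and 2.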
4. Experimental parameters and counterfactuals. In our view this choice
of experiment can also be related to the literature on causal inference, in
particular to the concept of counterfactuals, which has a central place there.
A counterfactual question is a question of the form: “What would the result
have been if ...?”. A counterfactual variable, in the way this concept is used
in the literature, is a hypothetical variable giving the result of performing an
experiment under some specific condition a, when this condition a is known
not to hold. A typical example is when several treatments can be allocated
to some given experimental unit at some fixed time, and then in reality only
one of these treatments can be chosen.
The use of such a concept goes back to Neyman [48], and has in recent
decades been discussed by, among others, Rubin [54], Robins [52, 53], Pearl
[50] and Gill and Robins [29]. On the other hand, Dawid [21] is skeptical of
an extensive use of counterfactuals. The discussion of the last paper shows
some of the positions taken by several prominent scientists on this issue.
In our setting, we choose and perform one experiment a, and then any
other experiment b imagined at the same time must be regarded as a coun-
terfactual experiment. However, instead of introducing counterfactual vari-
ables, I use counterfactual parameters λa, which in my view is a more useful
concept. Parameters are hypothetical entities that usually cannot be ob-
served directly. Nevertheless they may be useful in our mental modeling of
phenomena and in our discussion of them. In the last decades, such men-
tal models in causal inference have been developed to great sophistication,
among other ways by using various graphical tools [41, 50]. In the present
paper we will limit mental models to scalar and vector parameters, some
counterfactual, leading to what we have called a total parameter, but this
model concept can in principle be generalized.
When it is decided to perform one particular experiment a ∈ A, the λa
becomes the parameter of this specific experiment, an experiment which
then also may include a technical or experimental error. In any case, the
experiment will give an estimate λ̂a. If the technical error can be neglected,
we have a perfect experiment, implying λ̂a = λa.
We are here at a crucial point for understanding the whole theory of
this paper, namely the transition from the unobserved parameter to the
observed variable. Let us again look at a single patient at some given time
who can be given two different treatments. Define λa as the expected survival
time of this patient under treatment a. Then make a choice of treatment,
say a = 1. Ultimately, we then observe a survival time t1 for this patient.
There is no technical error involved here, so we might say that we then have
λ1 = λ̂1 = t1. And this is in fact true. By definition, λ1 is connected to the
single patient, the definite treatment time and a definite choice of treatment.
So even though λ1 is defined at the outset as an unknown parameter, its
definition is such that, once the experiment is carried out, the parameter
must by definition take the value t1.
This simple, but crucial phenomenon, which is related to how a concept
can be defined in a given situation, is in my view of quantum mechanics
closely connected to what physicists call “the collapse of the wave packet”
when an observation is undertaken.
5. A quantum particle with spin. Perhaps the simplest quantum-mechanical
system is an electron with its spin. The spin component λ can be
measured in any space direction a, and λ always takes one of the values −1 or
+1. Given such a (perfect) measurement, this defines in the usual quantum
formalism a certain state vector v in a complex two-dimensional vector space
H, formally as the eigenvector of an operator corresponding to the given
measurement with the given measurement value as eigenvalue. And given
this state vector v, quantum mechanics offers formulae, versions of which will
be discussed later, for predicting the results of further measurements. This
quantum-mechanical model for the electron also has several applications to
other systems. The setup itself is generally called a qubit in the literature.
As a contrast to this formalism, and to illustrate the general theory of
this paper, we give a nonstandard description of a particle with spin, a
description which will turn out in the end to be essentially equivalent to the
one given by ordinary quantum theory.
The total parameter φ corresponding to electron spin may be defined as a
vector in three-dimensional space; the direction of the vector gives the spin
axis, the norm gives the spinning speed. The associated group G is then the
group of all rotations of this vector in R³ around the origin. At the outset, φ
is a model quantity and hence unknown. As indicated before, we will assume
throughout that such a total parameter can never assume a definite value
in the sense that it never can be estimated. Nevertheless, such an abstract
quantity turns out to be useful in model discussions.
Now let the electron have such a total parameter φ attached to it. Assume
first that the system defines a context such that it is only possible to estimate
some given component of φ. From this point of view, the most that we can
hope to be able to measure is the angular momentum component θa(φ) =
|φ|cos(α) in some direction given by a unit vector a, where α is the angle
between φ and a.
The function θa(·) is easily seen to be nonpermissible for fixed a. This
is simply because two vectors with the same component along a in general
will have different such components after a rotation. The maximal possible
choice of the group Ga with respect to which θa(·) is permissible is the group
of rotations of the unit vector around the axis a, possibly together with a
180° rotation around any axis perpendicular to a.
The group Ga also acts on the image space for θa. This group action has
several orbits: For each κ ∈ (0,1], one orbit is given by the two-point set
{−κ,κ} in Θa. In addition there is an orbit for κ = 0.
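The behaviour of θa under rotations can be checked with a few lines of Python. The sketch below is an illustration under stated assumptions: the direction a, the two vectors and the rotation angles are hypothetical choices, the right action φ ↦ φg is implemented as a row vector multiplied by a rotation matrix, and the rotation matrices are built with Rodrigues' formula.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

a = np.array([0.0, 0.0, 1.0])        # measurement direction (illustrative)

def theta(phi):
    """Component of phi along a, that is |phi| cos(alpha)."""
    return float(phi @ a)

phi1 = np.array([1.0, 0.0, 0.5])
phi2 = np.array([0.0, 1.0, 0.5])     # same component along a as phi1

R_x = rotation([1.0, 0.0, 0.0], np.pi / 3)   # generic rotation, not about a
R_a = rotation(a, 1.234)                     # rotation about the axis a
R_flip = rotation([1.0, 0.0, 0.0], np.pi)    # 180 degrees, axis perpendicular to a

same_before = np.isclose(theta(phi1), theta(phi2))
same_after_generic = np.isclose(theta(phi1 @ R_x), theta(phi2 @ R_x))
same_after_about_a = np.isclose(theta(phi1 @ R_a), theta(phi2 @ R_a))
same_after_flip = np.isclose(theta(phi1 @ R_flip), theta(phi2 @ R_flip))
sign_flipped = np.isclose(theta(phi1 @ R_flip), -theta(phi1))
print(same_before, same_after_generic, same_after_about_a, same_after_flip, sign_flipped)
```

The generic rotation separates the two vectors, while rotations about a and the 180° flip keep them together, with the flip exchanging κ and −κ.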
We want in general that any reduction of the parameter space should be
to an orbit or to a set of orbits. Since the value of κ may be considered to
be arbitrary, we concentrate on λa = sign(θa), taking the two values −1 and
+1. This also implies that the function λa(φ) is permissible with respect
to the group Ga, and that this group acts upon λa by exchanging its two
values. Assume now that the electron in itself defines such a context that
only λa can be measured, an assumption which is consistent with experience.
The apparatus usually used to measure such a discretized spin component
is called a Stern–Gerlach device.
The unconditional prior probability for λa is 1/2 for each of the values
±1 by symmetry. Assume now that we know that λa = +1, and that we
afterward will measure the spin component in another direction b. We assume
for simplicity that we have an ideal measurement apparatus in the direction
b, so that what we seek is the transition probability in parameter space,
P(λb = +1 | λa = +1).
The formal quantum-mechanical solution of this is well known in the
physics literature. Let the components of the (unit) a-vector be (ax,ay,az),
and let σx, σyand σzbe the three Pauli spin operators
σx=
?01
01
?
,σy=
?0−i
0i
?
,σz=
?10
0−1
?
. (3)
Calculate the eigenvector vafor the operator axσx+ayσy+azσzcorrespond-
ing to the eigenvalue +1, and do a similar thing in the b-direction. Then the
formalism of quantum mechanics (see Section 14 below) says that
P(λb= +1|λa= +1) = |va†vb|2. (4)
A straightforward calculation then gives
P(λb= +1|λa= +1) = (1+ cos(u))/2, (5)
where u is the angle between the a-vector and the b-vector.
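The step from (4) to (5) can be verified numerically. The sketch below is an illustration and not the author's code; the function names and the two test directions are assumptions. It uses the closed-form vector (1 + az, ax + i ay), normalized, as the +1 eigenvector of ax σx + ay σy + az σz, checks the eigenvalue equation directly, and compares |va† vb|² with (1 + cos(u))/2.

```python
import math

def n_dot_sigma(n):
    """2x2 matrix n_x sigma_x + n_y sigma_y + n_z sigma_z for a unit vector n."""
    nx, ny, nz = n
    return ((complex(nz), complex(nx, -ny)),
            (complex(nx, ny), complex(-nz)))

def plus_eigenvector(n):
    """Normalized eigenvector of n.sigma with eigenvalue +1 (closed form)."""
    nx, ny, nz = n
    if nz > -1 + 1e-12:
        v = (complex(1 + nz), complex(nx, ny))
    else:                      # n points along -z: the eigenvector is (0, 1)
        v = (0j, 1 + 0j)
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return (v[0] / norm, v[1] / norm)

def transition_probability(a, b):
    """Born probability |v_a^dagger v_b|^2 as in formula (4)."""
    va, vb = plus_eigenvector(a), plus_eigenvector(b)
    inner = va[0].conjugate() * vb[0] + va[1].conjugate() * vb[1]
    return abs(inner) ** 2

def unit(polar, azimuth):
    """Unit vector from spherical angles."""
    return (math.sin(polar) * math.cos(azimuth),
            math.sin(polar) * math.sin(azimuth),
            math.cos(polar))

# Hypothetical directions a and b chosen only for the check.
a = unit(0.4, 1.1)
b = unit(2.0, -0.3)

# Check the eigenvalue equation (a.sigma) v_a = v_a.
m, va = n_dot_sigma(a), plus_eigenvector(a)
mv = (m[0][0] * va[0] + m[0][1] * va[1], m[1][0] * va[0] + m[1][1] * va[1])
eigen_residual = max(abs(mv[0] - va[0]), abs(mv[1] - va[1]))

cos_u = sum(x * y for x, y in zip(a, b))   # cosine of the angle between a and b
p_born = transition_probability(a, b)      # formula (4)
p_formula = (1 + cos_u) / 2                # formula (5)
print(p_born, p_formula)
```

The two printed numbers should agree up to floating-point rounding; the closed-form eigenvector avoids any dependence on a linear-algebra library.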
A general statistical approach to transition probabilities is given in
Theorem 5 below.
6. Parameters of several statistical experiments. Up to now, we have
assumed the existence of a total parameter. This section gives a very general
alternative way to arrive at this concept.
Consider a set A of mutually exclusive experiments, each of the ordinary
statistical kind, but we will concentrate on the parameter spaces Λa; a ∈ A.
The whole set of parameters of the experiments is given by points in the big
space
$$\Pi = \times_{a} \Lambda_a,$$
a Cartesian product. If all parameter spaces have the same structure Λ, this
can be considered to be the set of functions from A to Λ.
Let there be defined a transformation group G on Π.
Example 7 (Compare Example 5). Let π = (λ1,λ2), where λ1 and λ2
are the expected lifelengths of a single patient under two mutually exclusive
treatments. Let G be the joint set of time scale transformations together
with the exchange λ1 ↔ λ2.
Example 8. Consider again the electron spin. Let π = (λa; a ∈ A), where
λa is the spin component ±1 of a perfect measurement in the direction a of
an electron. Let G be the group generated by the transformations:
(i) Inversions: λa ↦ −λa.
(ii) Rotations of experiments: If a ↦ ao under a rotation o, replace each
λa with λao. This gives a permutation within the Cartesian product.
Note in general that the points of Π make sense mathematically, but not
directly physically, hence it does not make sense in a physical context to
give values to the individual points of this space. The space Π will hence
not be called a state space.
So what operations are meaningful with the spaces Π? I have mentioned
group operations. One can also adjoin such spaces corresponding to different
systems, and adjoin π with some other parameter. Finally, one can look at
subspaces.
Assume that the experiments are related in some way. Then it may be
reasonable to try to reduce the space Π. The purpose of this reduction may
be to achieve parsimony. This should not be thought of as an approximation,
however, but may be a result of some physical theory. Note that theories
are formulated not in terms of observations, but in terms of parameters, the
theoretical language behind observations.
Let Π be reduced to a subspace Ψ with the property:
Property 1. Ψ is an orbit, that is, a set of the form {π : π = π0g : g ∈ G},
or a set of orbits for the group G. Use the notation G also for this group
acting on Ψ.
This is a necessary condition in order that G should be a transformation
group on the reduced space. It is also consistent with the discussion elsewhere
in this paper. In [34] there are given several examples of model reductions
connected to single experiments where the reduced space is an orbit or a set
of orbits of an associated transformation group.
It is natural in certain situations to demand also:
Property 2. Each section {π ∈ Π : λa(π) = λ0} has a nonzero intersection
with Ψ for a set of specified values λ0.
In fact, this will always be true for some values λ0. In a future publication
we hope to use this fact together with some group representation theory to
discuss quantization itself.
Let now the model reduction be associated with some function φ on Π
which is one-to-one on the subset Ψ and undefined elsewhere. It follows then
from Property 1 that the group G is well defined on the range of φ.
Definition 2. If such a function exists, call Φ = φ(Ψ) the total parameter
space. Any function with the above properties is called a total parameter.
A total parameter φ can in principle be replaced by any other total parameter
in one-to-one correspondence with φ. But it is important to have a
simple representation.
If Property 2 holds, then each λa can be regarded as a function on Φ.
Example 8 (continued). Restrict Π to the subset Ψ, the set of all π
such that there exists a vector φ that gives each λa equal to sign(a·φ). Let
φ(π) be this direction normed as a unit vector.
– Taken as a unit vector, φ(π) is a unique function of π.
Proof. Suppose that there is a π which corresponds to two different
unit vectors φ1 and φ2. Then a = φ1 − φ2, normalized, gives λa = +1
corresponding to φ1 and λa = −1 corresponding to φ2, a contradiction. □
– The set Ψ is an orbit of G.
Proof. It is easy to see that Ψ is closed under inversions and rotations. □
– All sections {π : λa(π) = ±1} have nonzero intersections with Ψ.
Proof. Obvious. □
From this, we are back to the situation discussed in Section 5.
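The uniqueness proof in Example 8 (continued) can be illustrated with a short numerical sketch. The names below (lam, separating_direction) and the two test vectors are assumptions introduced for the illustration. For two distinct unit vectors φ1 and φ2, the direction a = φ1 − φ2, normalized, yields different values of λa = sign(a·φ), so the two vectors cannot correspond to the same π.

```python
import math

def sign(x):
    """Sign of a real number: -1, 0 or +1."""
    return (x > 0) - (x < 0)

def lam(a, phi):
    """lambda_a(phi) = sign(a . phi), the discretized spin component along a."""
    return sign(sum(x * y for x, y in zip(a, phi)))

def separating_direction(phi1, phi2):
    """Unit vector along phi1 - phi2, as used in the uniqueness proof."""
    d = tuple(x - y for x, y in zip(phi1, phi2))
    norm = math.sqrt(sum(x * x for x in d))
    return tuple(x / norm for x in d)

# Two nearby but different unit vectors (hypothetical test data).
phi1 = (0.0, 0.0, 1.0)
phi2 = (math.sin(0.01), 0.0, math.cos(0.01))

a = separating_direction(phi1, phi2)
lam1 = lam(a, phi1)   # a . phi1 is proportional to 1 - phi1 . phi2 > 0
lam2 = lam(a, phi2)   # a . phi2 is proportional to phi1 . phi2 - 1 < 0
print(lam1, lam2)
```

Since a·φ1 and a·φ2 have opposite signs whenever φ1 ≠ φ2, the family (λa; a ∈ A) determines the unit vector φ.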
7. Experiment, model reduction and group representation. Now let the
experimentalist have the choice between different experiments a ∈ A on the
same unit(s), where the experiment a consists of measuring some ya, with
ya = ya(ω) being a function on some sample space S, and where the measurement
process is modeled with a parameter λa. This parameter is a part
of the model description of the units, and all the model parameters may be
seen as functions λa(φ) of a total parameter φ.
We use a common sample space S for all experiments a, since this space
can be imagined in terms of a common measurement apparatus or some set
of apparatus. Specifically we assume:
Assumption 3. There is a common sample space S. The reduced model
probability measures Pλa are jointly dominated, that is, absolutely continuous
with respect to a fixed probability measure P on the sample space S.
In the electron case this simply means that one in principle can assume
that the same or the same kind of Stern–Gerlach apparatus can be used for
every measurement. The measure P can be assumed to be Bernoulli(1/2).
In the previous section, a global model reduction was introduced by reducing
the large space Π to one or a few orbits of the basic group G. As in
the electron spin example, it may also be natural or necessary to reduce the
original parameter θa to a new parameter λa. All such model reduction is
done by selecting one or a few orbits of the relevant group Ga.
The most important theoretical argument for model reduction associated
with orbits of the group is the following: All models should have a parameter
space which is invariant under the group. For the reduced model this is only
possible when the parameter space in question is composed of orbits of the
relevant group.
Here is another argument: The Pitman estimator is equal to the Bayes
estimator under right invariant prior, and this estimator is important in
many applications. In order that this shall make sense for the reduced model,
the parameter space of this reduced model must be constructed from orbits
of the parameter group actions.
A further discussion of model reduction under symmetry in statistics and
in quantum mechanics will be given elsewhere, and we then also hope to
relate the discussion to the concept of group representation, which is very
useful in quantum theory.
Generally (see also Appendix A.2), a group representation is a class of
operators {U(g); g ∈ G} on a vector space V, where G is a group,
such that the operators satisfy the property U(gh) = U(g)U(h). This gives
a group of operators homomorphic to the group G, and, as the name says, it
is used to represent the group in a specific way. There is a large mathematical
literature on group representations.
Specifically, the regular representation U(G) on L2(Φ,ν), where ν is a
right invariant measure for the basic group G, is given by
U(g)f(φ) = f(φg). (6)
Explicitly, this implies that U(G) is a group of linear operators acting on
L2(Φ,ν). The group property of U(G) is well known and easily verified. The
same formula (6) is valid for any subspace V of L2(Φ,ν) which is invariant
under the group of operators U(G), that is, such that U(g)f ∈ V when f ∈ V
and g ∈ G.
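The group property of formula (6) can be checked on a small finite example. The sketch below is an assumption-laden illustration, not taken from the paper: Φ is taken to be the permutation group S3 itself, G = S3 acts on Φ from the right by multiplication, and ν is counting measure (which is right invariant for a finite group). The names mul and U are hypothetical.

```python
from itertools import permutations

# The group S3, each element a tuple p with p[i] the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Group product 'first p, then q'; this makes phi -> mul(phi, g) a right action."""
    return tuple(q[p[i]] for i in range(len(p)))

def U(g, f):
    """Regular representation (6): (U(g) f)(phi) = f(phi g), f stored as a dict on G."""
    return {phi: f[mul(phi, g)] for phi in G}

# An arbitrary integer-valued test function on Phi = G.
f = {phi: k * k + 1 for k, phi in enumerate(G)}

# Homomorphism property U(gh) = U(g) U(h), checked for all pairs (g, h).
homomorphism_holds = all(
    U(mul(g, h), f) == U(g, U(h, f)) for g in G for h in G
)

# With counting measure, each U(g) preserves the squared L2 norm.
norm_sq = sum(v * v for v in f.values())
norm_preserved = all(
    sum(v * v for v in U(g, f).values()) == norm_sq for g in G
)
print(homomorphism_holds, norm_preserved)
```

S3 is non-abelian, so the check also confirms that the order of the factors in U(gh) = U(g)U(h) matches the right action φ ↦ φg.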
We will also consider group representation spaces of the group Ga acting
on φ. Let λa be a permissible function of φ. Then
$$V^a_\lambda = \{f \in L^2(\Phi,\nu) : f(\phi) = \tilde f(\lambda_a(\phi))\}$$
is an invariant subspace of L2(Φ,ν) under the regular representation U(Ga).
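The role of permissibility in this invariance statement can be seen in a toy setting. The following sketch rests on assumptions chosen purely for illustration: Φ = {0,...,5}, the group is the cyclic group of order 6 acting by addition modulo 6, and the two functions lam_permissible and lam_not_permissible are hypothetical. Functions of a permissible λ stay functions of λ under every U(g); for a nonpermissible λ this fails.

```python
N = 6
PHI = range(N)   # toy total parameter space; the group Z_6 acts by phi -> phi + g mod 6

def U(g, f):
    """Regular representation on functions stored as lists: (U(g) f)(phi) = f(phi + g)."""
    return [f[(phi + g) % N] for phi in PHI]

def depends_only_on(f, lam):
    """True if f is constant on the level sets of lam, i.e. f = f~(lam(.)) for some f~."""
    seen = {}
    for phi in PHI:
        if seen.setdefault(lam(phi), f[phi]) != f[phi]:
            return False
    return True

def lam_permissible(phi):
    """phi mod 3: the value after a shift depends only on the value before it."""
    return phi % 3

def lam_not_permissible(phi):
    """Indicator of phi < 3: shifts do not respect its level sets."""
    return int(phi < 3)

f_good = [10 * lam_permissible(phi) for phi in PHI]
f_bad = [lam_not_permissible(phi) for phi in PHI]

# Is the space of functions of lam mapped into itself by every U(g)?
invariant_for_permissible = all(
    depends_only_on(U(g, f_good), lam_permissible) for g in range(N)
)
invariant_for_nonpermissible = all(
    depends_only_on(U(g, f_bad), lam_not_permissible) for g in range(N)
)
print(invariant_for_permissible, invariant_for_nonpermissible)
```

The check uses a single test function per case, so it demonstrates the failure for the nonpermissible function and is only consistent with, not a proof of, invariance for the permissible one.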