Differential geometric aspects of parametric estimation theory for states on finite-dimensional $C^*$-algebras
F. M. Ciaglia^{1,3}, J. Jost^{1,4}, L. Schwachhöfer^{2,5}
^1 Max Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany
^2 Faculty for Mathematics, TU Dortmund University, 44221 Dortmund, Germany
^3 email: ciaglia[at]mis.mpg.de, florio.m.ciaglia[at]gmail.com
^4 email: jjost[at]mis.mpg.de
^5 email: lschwach[at]math.tu-dortmund.de
October 28, 2020
Abstract
A geometrical formulation of estimation theory for finite-dimensional $C^*$-algebras is presented. This formulation allows one to deal with the classical and quantum cases in a single, unifying mathematical framework. The derivation of the Cramér-Rao and Helstrom bounds for parametric statistical models with discrete and finite outcome spaces is presented.
Contents
1 Introduction
2 Differential geometric aspects of the space of states
3 Parametric models of states on $C^*$-algebras
4 Parametric statistical models of states on $C^*$-algebras
5 The problem of estimation theory
6 The Cramér-Rao bound
7 The Helstrom bound
8 Conclusion
References
1If available, please cite the published version
arXiv:2010.14394v1 [math-ph] 27 Oct 2020
1 Introduction
The purpose of this work is to present the formulation of estimation theory in the framework of $C^*$-algebras, with particular attention to differential geometric aspects. Although estimation theory is a well-developed subject both in the classical case of probability distributions [5, 4, 6, 10] and in the quantum case of density operators [13, 63, 64, 65, 66, 68, 77, 99], and even if quantum estimation theory builds upon classical estimation theory, there is no unifying picture for these subjects. By unifying picture, we mean a mathematical framework in which estimation theory is placed in such a way that the classical and quantum cases appear as particular instances of the general theory. We believe that such a formulation may be helpful in obtaining a better understanding of the similarities and differences of classical and quantum estimation theory. This idea may be considered as the driving force of this work.
Roughly speaking, the main problem estimation theory tries to address is to infer the value of some parameters characterizing the state of the “physical system” under investigation by the theoretical manipulation of the outcomes of experiments performed on such a system.
In the classical case, the state of the system is described by a probability distribution on the space of outcomes of the experiment, and the goal of estimation theory is to infer the value of some parameters characterizing the true probability distribution describing the system (e.g., the mean and/or variance for the case of Gaussian distributions) from the outcomes of the experiment. As such, estimation theory is well-developed both in its asymptotic and non-asymptotic regimes. Arguably, one little black spot of the theory is that the parameter spaces characterizing the probability distributions under study are usually taken to be homeomorphic to open subsets of some finite-dimensional Euclidean space. Even if this assumption is justified in most of the models, it necessarily introduces some simplifications related to the “nice” structure of the parameter spaces as smooth manifolds. As an example, the existence of global coordinates often leads to the definition of objects that are coordinate-dependent (see for instance [6, ch. 4], where it is clearly stated that the notion of unbiased estimator developed there is coordinate-dependent). We believe it is healthy to formulate the theory so as to avoid these issues and better comprehend the coordinate-independent aspects of the theory. This geometric attitude already proved itself useful in classical Newtonian, Lagrangian and Hamiltonian mechanics [1, 20, 22, 81, 79, 80], in thermodynamics and statistical physics [12, 21, 78, 94], and in quantum mechanics [40, 41, 43, 69]. Clearly, there have already been efforts to formulate classical estimation theory in this direction [23, 46, 67, 84], and here we try to encapsulate the spirit of these works in our formulation of estimation theory on $C^*$-algebras.
On the other hand, in the quantum case, the state of the system is no longer a probability distribution; it is a density operator on the Hilbert space associated with the system. This adds, at the same time, complexity and richness to the problem of estimation. A first layer of added complexity refers to the need for a statistical interpretation of a given quantum state. Since the dawn of quantum mechanics, the issue of the physical interpretation of Schrödinger's wave function was recognized to be a fundamental question. The idea of interpreting the square modulus of the wave function as a probability distribution paved the way to the statistical interpretation of quantum states through what is now called the Born rule [75]. Essentially, the Born rule describes a “procedure” to associate a probability distribution on a suitable outcome space with a given quantum state. Clearly, this depends on both the quantum state and the choice of the outcome space, and this means that there is more than one way to associate probability distributions with quantum states. From the mathematical point of view, the choice of the statistical interpretation is described by a positive operator-valued measure (POVM) on the Hilbert space of the system [68, 73]. Accordingly, in order to set up the estimation problem for a given parametric model of quantum states, we need to make a preliminary choice concerning the POVM “inducing” the statistical interpretation. Of course, this choice influences the estimation problem we set up, and different choices in general lead to different solutions of the associated estimation problem. All this obviously adds a layer of complexity to the estimation problem, but, simultaneously, it opens new possibilities to outperform classical limits of estimation because of the peculiar features of quantum states (e.g., entanglement). Indeed, in the quantum case it is possible to give a precise mathematical meaning to the assertion “measuring one copy $N$ times is less informative than measuring $N$ copies one single time” [42, 51, 93, 99, 104]. This assertion relies on the phenomenon of entanglement, which is absent in the classical realm, and thus highlights an important difference between classical and quantum estimation theory.
As mentioned before, the goal of this article is to introduce a theoretical framework that allows us to treat the classical and quantum case simultaneously. Specifically, our choice is to consider the theory of $C^*$-algebras as the backbone of our construction because both probability distributions and quantum states may be realized as linear functionals on suitable $C^*$-algebras. In the case of probability distributions, this is basically the duality between probability measures and functions given by the Riesz theorem. For quantum states, this comes directly from the axiomatic structure of the theory. The main difference between the two cases is that the algebras involved are commutative in the former case and noncommutative in the latter. In this general framework, probability distributions and quantum states represent different realizations of the notion of state on a $C^*$-algebra $\mathscr{A}$. The space of states $\mathscr{S}$ is a convex subset of the dual space of $\mathscr{A}$, and the study of its differential geometry is a fascinating subject. The rich algebraic structure of $C^*$-algebras translates into a rich geometrical structure for their spaces of states that is perfectly suited for the formulation of parametric estimation theory.
The use of $C^*$-algebras as a theoretical framework to study the geometry of quantum states is not new [15, 25, 28, 29, 38, 47, 48, 49, 53, 60, 61, 62, 70, 71, 87, 88, 89, 90, 101, 102]. However, the focus was essentially always on the algebra of bounded linear operators on the Hilbert space of the quantum system, and not on a generic $C^*$-algebra. While this restriction may not seem particularly relevant for most practical purposes, it certainly is relevant from the theoretical point of view. Indeed, some recent developments [30, 31, 34, 35, 36, 37] point out the possibility of describing quantum systems whose associated $C^*$-algebras are groupoid algebras, and thus are in principle more general than the algebra of bounded linear operators. Consequently, a reformulation of the well-known results for an arbitrary $C^*$-algebra appears to be useful.
On the other hand, in the classical case, the explicit use of $C^*$-algebras to investigate the geometry of probability distributions is essentially absent. To the best of our knowledge, the only (very nice) exceptions are the works [57, 58, 59]. However, the point of view of these works is different from ours because they consider probability distributions as particular elements of a $C^*$-algebra, while we consider them as particular linear functionals.
Another reason why we believe it would be useful to consider the framework of $C^*$-algebras is that the space of states of a $C^*$-algebra is an example of a space of states of general probabilistic theories [14, 24]. Therefore, the study of the differential geometry of the space of states of $C^*$-algebras, and in particular the study of parametric estimation theory in this context, represents a first step toward the study of these subjects in the more comprehensive framework of general probabilistic theories. This intermediate step may be useful because states on $C^*$-algebras benefit from the rich algebraic structure of the algebras they act upon, while states in general probabilistic theories do not necessarily have such a rich algebraic background to rely on. Consequently, a first study of the richer case may lead to results that can be later generalized to the less rich case once an appropriate and judicious process of extrapolation is pursued.
We confine ourselves to the case of finite-dimensional $C^*$-algebras because, at this preliminary stage, we want to avoid the technical difficulties with which the infinite-dimensional case is filled. Indeed, we are now interested in exposing the basic aspects of the theory in order to have a solid background on which future works can rely. In the infinite-dimensional case, the technical difficulties would often obscure the conceptual aspects, and this unavoidably makes the exposition less communicative. Moreover, it is not even clear yet what the geometrical players on the field are when infinite dimensions are considered, because there is no general consensus on which are the most appropriate manifolds of states to consider in this case (see [10, 11, 18, 33, 39, 40, 55, 50, 56, 72, 82, 91, 95] for some examples).
Incidentally, the restriction to the finite-dimensional case seems to affect the classical case more than the quantum case. Indeed, classical estimation theory essentially deals with parametric models of probability distributions on spaces which are neither discrete nor finite (think for instance of normal distributions), and these cases are naturally associated with infinite-dimensional $C^*$-algebras. The case of parametric models of probability distributions on discrete and finite spaces is usually less studied because it seldom presents itself in applications. In the context of quantum information theory, the situation is quite the opposite, and the vast majority of the models considered refer to quantum systems with a finite-dimensional Hilbert space, and thus, with an associated finite-dimensional $C^*$-algebra. The infinite-dimensional case usually deals only with pure-state models for which the underlying manifold of states is rather friendly, being the Hilbert manifold of a complex projective space associated with a separable, complex Hilbert space.
The content of this work builds on well-known and established results in the context of both classical and quantum estimation theory. However, the presentation of these results in the unifying framework of $C^*$-algebras is essentially new, as are the proofs of some results. We believe that this attitude may be particularly useful in future research dealing with the infinite-dimensional case, and dealing with the comparison of classical and quantum methods. Accordingly, this work should be considered more as a first, preliminary step in a research program aimed at the understanding of the unification of classical and quantum estimation theory than as an exposition of a finished theory, and the focus of the work is more on the discussion of general structures than on the presentation of specific examples.
The article is structured as follows. In section 2, some differential geometric aspects of finite-dimensional $C^*$-algebras and of their spaces of states are recalled. In section 3, the notion of parametric model of states on a $C^*$-algebra $\mathscr{A}$ is introduced, and the notion of Symmetric Logarithmic Derivative used in quantum information theory is generalized to the $C^*$-algebraic setting. In section 4, the notion of parametric statistical model associated with a given parametric model of states is introduced. This notion represents the bridge between the models of states on a possibly noncommutative $C^*$-algebra and the models of probability distributions used in classical estimation theory. Also, the notion of multiple round model and its geometrical properties are briefly discussed. In section 5, the problem of estimation theory is formulated in the $C^*$-algebraic framework, and the notion of manifold-valued estimator is recalled. In section 6, a proof of the Cramér-Rao bound for manifold-valued estimators on finite outcome spaces is given following the work of Hendriks [67]. Finally, in section 7, the generalization of the Helstrom bound used in quantum information theory to the $C^*$-algebraic framework is given.
2 Differential geometric aspects of the space of states
We start with a brief summary of $C^*$-algebras [17, 19, 92, 98]. Let $\mathscr{A}$ be a complex algebra with identity $\mathbb{I}$. If there is an antilinear map $\dagger\colon\mathscr{A}\to\mathscr{A}$ such that $(a^\dagger)^\dagger=a$ for all $a\in\mathscr{A}$, and such that $(ab)^\dagger=b^\dagger a^\dagger$ for all $a,b\in\mathscr{A}$, then $\dagger$ is called an involution, and $(\mathscr{A},\dagger)$ an involutive algebra. If there is a norm $\|\cdot\|$ on $\mathscr{A}$ turning it into a Banach space and satisfying the additional relations $\|ab\|\le\|a\|\,\|b\|$ and $\|aa^\dagger\|=\|a\|^2$ for all $a,b\in\mathscr{A}$, then $(\mathscr{A},\dagger,\|\cdot\|)$ is called a $C^*$-algebra, and, for the sake of notational simplicity, it will be denoted simply by $\mathscr{A}$.
An element $a\in\mathscr{A}$ is called self-adjoint if $a=a^\dagger$. The space of self-adjoint elements in $\mathscr{A}$ is denoted as $\mathscr{A}_{sa}$. It is a real Banach space whose dual space is denoted as $\mathscr{V}$, and there is a direct sum decomposition
$$\mathscr{A}=\mathscr{A}_{sa}\oplus\imath\,\mathscr{A}_{sa}, \tag{1}$$
where $\imath$ is the imaginary unit.
An element $b\in\mathscr{A}$ is called positive if there exists $a\in\mathscr{A}$ such that $b=a^\dagger a$. Clearly, a positive element $b$ is self-adjoint, and it can be proved that there is a unique positive (hence self-adjoint) element $s$ such that $b=s^2$.
An element $g\in\mathscr{A}$ is called invertible if there is another element, written as $g^{-1}$, such that $g\,g^{-1}=g^{-1}g=\mathbb{I}$. The set of invertible elements in $\mathscr{A}$ is denoted as $\mathscr{G}$, and it is a real Banach-Lie group, the Banach-Lie algebra of which is $\mathscr{A}$ endowed with the commutator [27, 103].
An element $u\in\mathscr{G}$ is called unitary if $u^{-1}=u^\dagger$. The set of unitary elements in $\mathscr{A}$ is denoted as $\mathscr{U}$, and it is a real Banach-Lie subgroup of $\mathscr{G}$, called the unitary group of $\mathscr{A}$, whose Banach-Lie algebra is the subspace $\imath\,\mathscr{A}_{sa}$ in the decomposition (1) endowed with the commutator inherited from $\mathscr{A}$ [27, 103].
Let $\mathscr{A}^*$ be the complex Banach dual of $\mathscr{A}$. An element $\xi\in\mathscr{A}^*$ is called a self-adjoint linear functional if $\xi(a^\dagger)=\overline{\xi(a)}$ for all $a\in\mathscr{A}$. The set of self-adjoint linear functionals is precisely the real Banach dual $\mathscr{V}$ of $\mathscr{A}_{sa}$. A self-adjoint linear functional $\omega$ is called positive if $\omega(a)\ge 0$ for every positive element $a\in\mathscr{A}$. A positive linear functional $\omega$ is called faithful if, for every positive element $a\in\mathscr{A}$, $\omega(a)=0$ implies $a=0$. The set of positive linear functionals is denoted as $\mathscr{P}$. A positive linear functional $\rho$ is called a state if $\rho(\mathbb{I})=1$. The set of states is denoted as $\mathscr{S}$.
In the following, we will focus only on finite-dimensional $C^*$-algebras. Given a self-adjoint element $a\in\mathscr{A}_{sa}$, we write $f_a$ for the linear function on $\mathscr{V}$ given by
$$f_a(\xi)=\xi(a), \tag{2}$$
as well as for its restrictions to the various submanifolds of $\mathscr{V}$ we will introduce below (with an evident abuse of notation).
There is a group action of $\mathscr{G}$ on $\mathscr{S}$ given by [33, 38]
$$\rho\mapsto\rho_g\equiv\Phi(g,\rho),\qquad \rho_g(c)=\frac{\rho(g^\dagger c\,g)}{\rho(g^\dagger g)}\quad\forall c\in\mathscr{A}, \tag{3}$$
and the space of states $\mathscr{S}$ decomposes into the disjoint union of orbits of the $\mathscr{G}$-action; evidently, each such orbit is a homogeneous space.
Recalling that $\mathscr{A}$ endowed with the commutator is the Lie algebra of $\mathscr{G}$, the fundamental vector fields of $\Phi$ are labelled by elements of $\mathscr{A}$. Recalling (1), we write an element in $\mathscr{A}$ as $a+\imath b$, where $a,b\in\mathscr{A}_{sa}$ and $\imath$ is the imaginary unit. Accordingly, we write $\Gamma_{ab}$ for the fundamental vector field associated with $\frac{1}{2}(a+\imath b)$. A direct computation shows that the tangent vector $\Gamma_{ab}(\rho)$, identified with a self-adjoint linear functional in $\mathscr{V}$ because the orbit $\mathcal{O}$ through $\rho$ is an immersed submanifold of $\mathscr{V}$, is given by
$$(\Gamma_{ab}f_c)(\rho)=(\Gamma_{ab}(\rho))(c)=\rho(\{a,c\})-\rho(a)\rho(c)+\rho([[b,c]])\quad\forall c\in\mathscr{A}, \tag{4}$$
where $\{\cdot,\cdot\}$ and $[[\cdot,\cdot]]$ denote, respectively, the Jordan product and the Lie product in $\mathscr{A}$ given by
$$\{c,d\}:=\frac{1}{2}(cd+dc),\qquad [[c,d]]:=\frac{1}{2\imath}(cd-dc). \tag{5}$$
Note that $\{\cdot,\cdot\}$ and $[[\cdot,\cdot]]$ preserve $\mathscr{A}_{sa}$, and actually turn it into a Jordan-Lie algebra [45, 74].
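As a minimal numerical sketch of the products in equation (5), the following Python snippet evaluates them on the Pauli matrices of the qubit algebra (introduced later in Example 2). The helper names (`matmul`, `jordan`, `lie`, `close`) are illustrative and not part of the paper; matrices are stored as nested lists so that only the standard library is needed.

```python
# Jordan and Lie products of equation (5) on 2x2 complex matrices (nested lists).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(ca, a, cb, b):
    """Return the linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def jordan(c, d):
    """{c, d} = (cd + dc) / 2."""
    return lin(0.5, matmul(c, d), 0.5, matmul(d, c))

def lie(c, d):
    """[[c, d]] = (cd - dc) / (2i)."""
    return lin(1 / 2j, matmul(c, d), -1 / 2j, matmul(d, c))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

# With the 1/(2i) normalization, [[S1, S2]] equals S3 and is again self-adjoint.
print(close(lie(S1, S2), S3), close(dagger(lie(S1, S2)), lie(S1, S2)))
```

The factor $1/(2\imath)$ is what makes the Lie product of two self-adjoint elements self-adjoint, whereas the bare commutator would be anti-self-adjoint.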
We set
$$Y_a:=\Gamma_{a0},\qquad X_b:=\Gamma_{0b}, \tag{6}$$
and we call $Y_a$ a gradient vector field (the origin of the name will be explained below), and $X_b$ a Hamiltonian vector field. It is not hard to show that the Hamiltonian vector fields give an anti-representation of the Lie algebra of the group $\mathscr{U}\subset\mathscr{G}$ of unitary elements of $\mathscr{G}$ [33, 38]. This Lie algebra anti-representation integrates to a left action of $\mathscr{U}$ on $\mathcal{O}$ given by the restriction of $\Phi$ to $\mathscr{U}$.
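If we fix a basis $\{e_j\}_{j=1,\dots,N}$ of self-adjoint elements in $\mathscr{A}$ (where $\dim(\mathscr{A})=N$), we may introduce the structure constants $d_{jk}^{\;l}$ and $c_{jk}^{\;l}$ of the Jordan and Lie products in equation (5) by setting
$$\{e_j,e_k\}=d_{jk}^{\;l}\,e_l,\qquad [[e_j,e_k]]=c_{jk}^{\;l}\,e_l. \tag{7}$$
Then, the gradient and Hamiltonian vector fields are easily seen to be given by
$$Y_a=\left(d_{jk}^{\;l}\,a^k x_l-f_a\,x_j\right)\frac{\partial}{\partial x_j},\qquad X_b=c_{jk}^{\;l}\,b^k x_l\,\frac{\partial}{\partial x_j}, \tag{8}$$
where $a=a^k e_k$, $b=b^k e_k$, and $\{x_j\}_{j=1,\dots,N}$ is the Cartesian coordinate system on $\mathscr{V}$ associated with the dual basis $\{e^j\}_{j=1,\dots,N}$ of $\{e_j\}_{j=1,\dots,N}$.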
Example 1 (The probability simplex). Let $\mathcal{X}_n=\{x_1,\dots,x_n\}$ be a finite set, and let $C(\mathcal{X}_n)$ be the algebra of complex-valued functions on it. If we endow $C(\mathcal{X}_n)$ with the involution given by complex conjugation, and with the supremum norm, it is not hard to prove that it is a $C^*$-algebra. We denote this $C^*$-algebra as $\mathscr{C}_n$. Let $e_j\in\mathscr{C}_n$ be the ‘delta function’ at $x_j\in\mathcal{X}_n$ (i.e., $e_j(x_k)=\delta_{jk}$); then $\{e_j\}_{j=1,\dots,n}$ is clearly a basis for $\mathscr{C}_n$ (seen as a vector space) made up of positive, self-adjoint elements, and we have
$$\sum_{j=1}^{n}e_j=\mathbb{1}_n, \tag{9}$$
where $\mathbb{1}_n$ is the identity element in $\mathscr{C}_n$ (i.e., the constant function equal to 1 on $\mathcal{X}_n$). Consequently, we can build the dual basis $\{e^j\}_{j=1,\dots,n}$, and a state $\rho$ on $\mathscr{C}_n$ is easily seen to be of the form
$$\rho=p_j\,e^j, \tag{10}$$
where the real numbers $p_j=\rho(e_j)$ are nonnegative and are subject to the constraint
$$\sum_{j=1}^{n}p_j=1. \tag{11}$$
From this, we conclude that the space of states $\mathscr{S}$ of $\mathscr{C}_n$ may be identified with the $n$-simplex $\Delta_n$. In the following, whenever we deal with $\mathscr{C}_n$ we will identify a state $\rho$ on $\mathscr{C}_n$ with a probability distribution in $\Delta_n$ and write $p$ instead of $\rho$.
Let $I_k\subseteq\mathcal{X}_n$ be a subset with $k\le n$ elements, and let $\rho$ be a state on $\mathscr{C}_n$ such that $p_j\neq 0$ if and only if $x_j\in I_k$. Then, it is not hard to check that the orbit $\mathcal{O}$ of the group $\mathscr{G}$ of invertible elements in $\mathscr{C}_n$ (see equation (3)) through $\rho$ coincides with the set of all those states $\varrho=q_j\,e^j$ such that $q_j\neq 0$ if and only if $x_j\in I_k$. In particular, the open interior $\Delta_n^+$ of the $n$-simplex may be identified with the orbit of $\mathscr{G}$ through the state $p$ with $p_j=\frac{1}{n}$ for all $j=1,\dots,n$.
Since $\mathscr{C}_n$ is Abelian, it is not hard to see that the action of the unitary group $\mathscr{U}\subset\mathscr{G}$ is trivial, and thus the Hamiltonian vector fields vanish identically. On the other hand, a direct computation shows that the structure constants $d_{jk}^{\;l}$ of the Jordan product with respect to the basis $\{e_j\}_{j=1,\dots,n}$ vanish unless $j=k=l$, in which case they are 1.
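The orbit structure described in this example can be checked numerically. Evaluating equation (3) on the basis elements $e_j$ gives $p_j\mapsto |g_j|^2p_j/\sum_k|g_k|^2p_k$ for an invertible element $g=(g_1,\dots,g_n)$ of the function algebra. The Python sketch below implements this formula; the function name `act` and the sample values are illustrative assumptions, not taken from the paper.

```python
# Action (3) specialized to the commutative algebra of functions on n points.

def act(g, p):
    """Apply an invertible element g (list of nonzero complex numbers)
    to a probability vector p: p_j -> |g_j|^2 p_j / sum_k |g_k|^2 p_k."""
    weights = [abs(gj) ** 2 * pj for gj, pj in zip(g, p)]
    total = sum(weights)
    return [w / total for w in weights]

p = [0.5, 0.5, 0.0]      # state supported on the first two points
g = [2.0, 1j, 3.0]       # invertible: no entry vanishes
q = act(g, p)

# The support (set of indices with nonzero probability) is unchanged,
# so q lies in the same orbit as p.
print(q)
```

Only $|g_j|^2$ enters the formula, which is consistent with the unitary elements (those with $|g_j|=1$ for all $j$) acting trivially.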
Example 2 (The space of density matrices). Consider the complex algebra $\mathscr{M}_n:=M_n(\mathbb{C})$ of complex-valued $(n\times n)$ matrices. There is an involution $\dagger$ on $\mathscr{M}_n$ given by the composition of transposition with componentwise complex conjugation. By exploiting the trace operation, it is possible to define a norm on $\mathscr{M}_n$ given by $\|a\|^2=\mathrm{Tr}(a^\dagger a)$, and we obtain a $C^*$-algebra which will again be denoted by $\mathscr{M}_n$. Moreover, it is easily seen that $\mathscr{M}_n$ is isomorphic to the algebra $\mathcal{B}(\mathcal{H})$ of bounded linear operators on an $n$-dimensional complex Hilbert space $\mathcal{H}$. This isomorphism depends on the choice of an orthonormal basis in $\mathcal{H}$, but, in the context of quantum information theory, this is in general not very limiting because a preferred choice of basis, called the computational basis [83], is often tied to the physics of the problem under investigation.
Since $\mathscr{M}_n$ is finite-dimensional, it is isomorphic to its dual space, and an isomorphism is provided by the trace operation. Specifically, a linear functional $\xi$ on $\mathscr{M}_n$ is identified with an element $\hat{\xi}\in\mathscr{M}_n$ by means of
$$\xi(a):=\mathrm{Tr}(\hat{\xi}\,a). \tag{12}$$
Then, it follows that a state $\rho$ on $\mathscr{M}_n$ may be identified with a positive semidefinite matrix $\hat{\rho}\in\mathscr{M}_n$ with unit trace. Any such matrix is usually referred to as a density matrix.
It is not hard to prove that the orbits of $\mathscr{G}$ are classified by the rank of the associated density matrices [25, 29, 33, 53, 54]. Specifically, every orbit $\mathcal{O}$ is made up of states whose associated density matrices have a fixed rank. In particular, the states whose density matrices have unit rank form the orbit of pure states (the extremal points of the convex space of states), which is diffeomorphic to the complex projective space $\mathbb{CP}^n$, and the states whose density matrices have full rank (i.e., are invertible) form the orbit of faithful states. Note that the latter is an open subset of the affine space of self-adjoint linear functionals giving 1 when evaluated on the identity $\mathbb{I}_n$ of $\mathscr{M}_n$.
If we introduce a basis $\{\sigma_j\}_{j=0,\dots,n^2-1}$ on $\mathscr{M}_n$ in such a way that $\sigma_0$ coincides with the identity element $\mathbb{I}_n\in\mathscr{M}_n$, and that $\sigma_j$ is self-adjoint and satisfies $\mathrm{Tr}(\sigma_j)=0$ for all $j\neq 0$, we can build its dual basis $\{\sigma^j\}_{j=0,\dots,n^2-1}$, and it follows that a state $\rho$ may be written as
$$\rho=\frac{1}{n}\sigma^0+x_j\,\sigma^j, \tag{13}$$
where $j=1,\dots,n^2-1$, and $x_j\in\mathbb{R}$. Clearly, the fact that $\rho$ must be a state imposes some constraints on the values of $x_j$, because $\rho(a^\dagger a)$ must be nonnegative. There is no general closed formula to express these constraints for arbitrary $n>2$.
For the case $n=2$ (also known as the qubit), it is customary to select $\sigma_1,\sigma_2,\sigma_3$ to be the so-called Pauli matrices
$$\sigma_1=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad \sigma_2=\begin{pmatrix}0&-\imath\\ \imath&0\end{pmatrix},\qquad \sigma_3=\begin{pmatrix}1&0\\0&-1\end{pmatrix}, \tag{14}$$
where $\imath$ is the imaginary unit. Then, $\rho$ is a state if and only if
$$\delta^{jk}x_jx_k\le 1. \tag{15}$$
This identifies a ball in the three-dimensional space spanned by the Pauli matrices, which is known as the Bloch ball. In this case, there are only two orbits of the group $\mathscr{G}$ of invertible elements in $\mathscr{M}_2$, namely, the density matrices lying on the boundary sphere (the pure states), and the density matrices in the interior of the ball (the faithful states).
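Condition (15) can be tested numerically. The sketch below assumes the usual identification of the qubit state with the density matrix $\hat{\rho}=\frac{1}{2}(\sigma_0+x_1\sigma_1+x_2\sigma_2+x_3\sigma_3)$, whose eigenvalues are $\frac{1}{2}(1\pm|x|)$; the function names are illustrative and not part of the paper.

```python
import math

def density_matrix(x):
    """Assumed form: (sigma_0 + x1*sigma_1 + x2*sigma_2 + x3*sigma_3) / 2."""
    x1, x2, x3 = x
    return [[0.5 * (1 + x3), 0.5 * (x1 - 1j * x2)],
            [0.5 * (x1 + 1j * x2), 0.5 * (1 - x3)]]

def eigenvalues(m):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    tr = (m[0][0] + m[1][1]).real
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (0.5 * (tr - disc), 0.5 * (tr + disc))

def is_state(x, tol=1e-12):
    """True when the matrix is positive semidefinite (unit trace is built in)."""
    return min(eigenvalues(density_matrix(x))) >= -tol

print(is_state((0.3, 0.4, 0.5)))   # inside the ball: faithful state
print(is_state((0.6, 0.0, 0.8)))   # on the sphere: pure state
print(is_state((1.0, 1.0, 0.0)))   # outside the ball: not a state
```

Points strictly inside the ball give two positive eigenvalues (rank 2), while points on the sphere give one vanishing eigenvalue (rank 1), matching the two orbits described above.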
According to [38], the gradient vector fields provide an overcomplete basis of the tangent space at each point in every orbit $\mathcal{O}$. Furthermore, on every $\mathcal{O}$ we may define a Riemannian metric tensor $\mathrm{G}$ given by
$$\mathrm{G}_\rho(Y_a(\rho),Y_b(\rho))=\rho(\{a,b\})-\rho(a)\rho(b), \tag{16}$$
and $Y_a$ is the gradient vector field associated with the smooth function $f_a$ (see equation (2)) by means of $\mathrm{G}$. This metric tensor is invariant with respect to the action of the unitary group in the sense that
$$\Phi_U^*\mathrm{G}=\mathrm{G}\quad\forall\,U\in\mathscr{U}, \tag{17}$$
where $\Phi_U$ is the diffeomorphism given by
$$\Phi_U(\rho):=\Phi(U,\rho). \tag{18}$$
However, $\mathrm{G}$ is not invariant under the action of all of $\mathscr{G}$.
The metric tensor $\mathrm{G}$ turns out to be the $C^*$-algebraic version of some well-known and relevant metric tensors when explicit cases are considered [38]. For instance, if $\mathscr{A}=\mathscr{C}_n$ and $\mathcal{O}=\Delta_n^+$, then $\mathrm{G}$ coincides with the Fisher-Rao metric tensor. If $\mathscr{A}=\mathcal{B}(\mathcal{H})$ and $\mathcal{O}\cong\mathbb{CP}(\mathcal{H})$ is the orbit of pure states, then $\mathrm{G}$ coincides with the Fubini-Study metric tensor. If $\mathscr{A}=\mathcal{B}(\mathcal{H})$ and $\mathcal{O}$ is the orbit of faithful states, then $\mathrm{G}$ coincides with the Bures-Helstrom metric tensor.
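The classical case can be illustrated with a short computation. For the commutative algebra of functions on $n$ points, the Jordan product is the pointwise product, so the right-hand side of (16) is the covariance of $a$ and $b$ under $p$, and the gradient vector $Y_a(p)$ has components $p_j(a_j-p(a))$. The Python sketch below compares this with the Fisher-Rao inner product written in its standard form $\sum_j v_jw_j/p_j$; the function names and sample values are illustrative assumptions.

```python
def expect(p, a):
    """Expectation value p(a) = sum_j p_j a_j."""
    return sum(pj * aj for pj, aj in zip(p, a))

def gradient_vector(p, a):
    """Components of Y_a(p): p_j * (a_j - p(a))."""
    mean = expect(p, a)
    return [pj * (aj - mean) for pj, aj in zip(p, a)]

def metric_G(p, a, b):
    """Right-hand side of (16) with the pointwise product as Jordan product."""
    ab = [x * y for x, y in zip(a, b)]
    return expect(p, ab) - expect(p, a) * expect(p, b)

def fisher_rao(p, v, w):
    """Standard Fisher-Rao inner product of tangent vectors v, w at p."""
    return sum(vj * wj / pj for vj, wj, pj in zip(v, w, p))

p = [0.2, 0.3, 0.5]
a = [1.0, -2.0, 0.5]
b = [0.0, 1.0, 3.0]

lhs = metric_G(p, a, b)
rhs = fisher_rao(p, gradient_vector(p, a), gradient_vector(p, b))
print(abs(lhs - rhs) < 1e-12)
```

The components of each gradient vector sum to zero, as they must for a vector tangent to the simplex.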
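According to [38], the geodesic of $\mathrm{G}$ starting at $\rho\in\mathcal{O}$ with initial tangent vector $\mathbf{v}\in T_\rho\mathcal{O}$ reads
$$\nu^{\mathbf{v}}_\rho(t)=\cos^2(vt)\,\rho+\frac{\sin^2(vt)}{v^2}\,\rho_{\mathbf{v}}+\frac{\sin(2vt)}{2v}\,\rho_{\{\mathbf{v}\}}, \tag{19}$$
where
$$\begin{aligned}
\mathbf{v}&=Y_a(\rho)\ \text{ for some } a\in\mathscr{A}_{sa} \text{ with } \rho(a)=0,\\
v^2&=\mathrm{G}_\rho(\mathbf{v},\mathbf{v})=\rho(a^2),\\
\rho_{\mathbf{v}}(b)&:=\rho(a\,b\,a)\quad\forall\,b\in\mathscr{A}_{sa},\\
\rho_{\{\mathbf{v}\}}(b)&:=\rho(\{a,b\})\quad\forall\,b\in\mathscr{A}_{sa}.
\end{aligned} \tag{20}$$
The geodesic $\nu^{\mathbf{v}}_\rho(t)$ remains inside the space of states $\mathscr{S}$ for all $t\in\mathbb{R}$, but it also exits and re-enters the orbit $\mathcal{O}$ containing the initial state $\rho$ multiple times [38].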
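As a worked illustration of (19) and (20), consider the commutative case of Example 1, where a state is a probability vector $p$ and a self-adjoint element is a real function $a=(a_1,\dots,a_n)$. This specialization is a sketch derived from the formulas above rather than a statement taken from [38]. With $\sum_j p_ja_j=0$ and $v^2=\sum_j p_ja_j^2$, the components of the geodesic are

```latex
\begin{aligned}
\nu(t)_j
 &= \cos^2(vt)\,p_j + \frac{\sin^2(vt)}{v^2}\,p_j a_j^2
    + \frac{\sin(2vt)}{2v}\,p_j a_j \\
 &= p_j\left(\cos(vt) + \frac{a_j}{v}\,\sin(vt)\right)^{2} \;\ge\; 0,
\qquad
\sum_{j=1}^{n}\nu(t)_j
 = \cos^2(vt) + \sin^2(vt) + \frac{\sin(2vt)}{2v}\sum_{j=1}^{n}p_j a_j = 1 .
\end{aligned}
```

Each component is a perfect square, so the curve never leaves the simplex, and the component $j$ vanishes exactly when $\tan(vt)=-v/a_j$ (for $a_j\neq 0$). At those times the curve touches the boundary of the simplex, which is how it leaves the open orbit $\Delta_n^+$ and then returns to it.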
3 Parametric models of states on $C^*$-algebras
Motivated by the classical theory of parametric estimation, we will now introduce the notion of a parametric model of states on a finite-dimensional $C^*$-algebra, and then reformulate the theory of parametric estimation in this theoretical framework. This will allow for the simultaneous handling of the classical and the quantum case.
Definition 1. A parametric model of states on a (finite-dimensional) $C^*$-algebra $\mathscr{A}$ is a triplet $(M,j,\mathcal{O})$ where $M$ is a smooth manifold, $\mathcal{O}\subset\mathscr{S}$ is a $\mathscr{G}$-orbit in $\mathscr{S}$ (see section 2), and $j\colon M\to\mathcal{O}$ is a smooth map. If $j$ is injective, we say that the model is identifiable.
Some comments are in order. First of all, we fix the codomain of $j$ to be an orbit of states $\mathcal{O}$ because, as will be clear below, we want to exploit the differential geometric aspects of $\mathcal{O}$ itself. In practice, a vast part of the models considered in the literature falls in this category. For instance, in quantum information geometry, it is customary to deal with parametric models consisting only of pure states, or only of invertible density operators. In principle, it would also be possible to consider a more general case in which $j$ is a smooth map of $M$ into the Banach space $\mathscr{V}$ of self-adjoint linear functionals in such a way that $j(M)\subset\mathscr{S}$, and $j(M)$ intersects different orbits of states. This line of thought would require a different way to handle geometrical properties of the space of states in relation to the parameter manifold, based, for example, on the methodology introduced in [10, 11] for the classical case. This line of reasoning may be useful in the transition to the infinite-dimensional case, where the smooth structure of the orbits $\mathcal{O}$ is in general not guaranteed, and we plan to address this and related questions in the future. Concerning the identifiability of a model, it may seem at first glance reasonable to consider only identifiable models, but we will show that there are well-known and “simple” parametric models of quantum states (e.g., qubit models) for which either this assumption is not satisfied, or it leads to difficulties with the statistical interpretation of the model.
Now, we turn our attention to the geometrical objects that $M$ inherits by means of the smooth map $j$. Indeed, once we have the smooth map $j$, a symmetric, covariant $(0,2)$ tensor is naturally obtained on $M$ by considering the Riemannian metric $\mathrm{G}$ on $\mathcal{O}$ introduced before and taking its pullback
$$\mathrm{G}_M:=j^*\mathrm{G} \tag{21}$$
to $M$ with respect to $j$. This gives a tensor on $M$ which “feels” the possible noncommutativity of $\mathscr{A}$ and gives the “correct” tensor in the classical case.
Indeed, if $\mathscr{A}$ is Abelian, then $\mathcal{O}$ is diffeomorphic to the open interior of a suitable simplex, $\mathrm{G}$ is the Fisher-Rao metric tensor [38], and $\mathrm{G}_M$ is the pullback of the Fisher-Rao metric tensor to the manifold $M$ seen as a model of probability distributions [6].
On the other hand, if $\mathscr{A}$ is the algebra $\mathcal{B}(\mathcal{H})$ of bounded linear operators on a finite-dimensional, complex Hilbert space $\mathcal{H}$ and $\mathcal{O}$ is the manifold of pure states, then $\mathcal{O}$ is diffeomorphic to the complex projective space $\mathbb{CP}(\mathcal{H})$ associated with $\mathcal{H}$, $\mathrm{G}$ is the Fubini-Study metric [38] on $\mathcal{O}=\mathbb{CP}(\mathcal{H})$, and $\mathrm{G}_M$ is the quantum counterpart of the Fisher-Rao metric tensor on the manifold $M$ seen as a model of pure quantum states [44]. Also, if $\mathcal{O}$ is the manifold of faithful states, then $\mathrm{G}$ is the Bures-Helstrom metric tensor [38], and $\mathrm{G}_M$ may be read as a quantum counterpart of the Fisher-Rao metric tensor on the manifold $M$ seen as a model of faithful quantum states [85].
We will now introduce the $C^*$-algebraic version of the Symmetric Logarithmic Derivative (SLD) introduced in quantum estimation theory by Helstrom in [63]. For this purpose, note that every tangent vector at $\rho\in\mathcal{O}$ may be expressed in terms of gradient vector fields, that is, given $\rho\in\mathcal{O}$, for every tangent vector $V_\rho\in T_\rho\mathcal{O}$ there exists a self-adjoint element $a\in\mathscr{A}_{sa}$ depending on $V_\rho$ such that
$$V_\rho=Y_a(\rho). \tag{22}$$
Consequently, if we consider a tangent vector $v_m\in T_mM$, it makes sense to ask for the gradient vector field $Y_a$ on $\mathcal{O}$ such that
$$T_mj(v_m)=Y_a(\rho_m), \tag{23}$$
where $\rho_m:=j(m)$. The gradient vector field $Y_a$ in general depends on both the point $m\in M$ and the tangent vector $v_m$. The tangent vector $Y_a(\rho_m)$ satisfying equation (23) is called the SLD of $v_m$ at $\rho_m$.
To appreciate the link with the standard definition of the SLD, let us consider a parametric model $(M,j,\mathcal{O})$ where $\mathscr{A}=\mathcal{B}(\mathcal{H})$, $\mathcal{O}$ is the manifold of faithful states (invertible density operators), $M$ is an open submanifold of $\mathbb{R}$, and $j$ is a suitable smooth map. Setting $v_m=\partial_t(m)$, where $\partial_t$ is the restriction to $M$ of the vector field generating the group structure of $\mathbb{R}$, a direct computation shows that the solution of equation (23) coincides with the Symmetric Logarithmic Derivative (SLD) of [63]. Indeed, $\partial_t$ is the infinitesimal generator of $m_t=m+t$, and considering an arbitrary function $f_b$ on $\mathcal{O}$, we have
$$\langle \mathrm{d}f_b(\rho_m),T_mj(v_m)\rangle=\frac{\mathrm{d}}{\mathrm{d}t}\Big(\mathrm{Tr}(\hat{\rho}_{m_t}b)\Big)_{t=0}=\mathrm{Tr}\left(\frac{\mathrm{d}}{\mathrm{d}t}\big(\hat{\rho}_{m_t}\big)_{t=0}\,b\right)\quad\forall\,b\in\mathscr{A}_{sa}, \tag{24}$$
so that equation (23) may be alternatively written as
$$\frac{\mathrm{d}}{\mathrm{d}t}\big(\hat{\rho}_{m_t}\big)_{t=0}=\{\hat{\rho}_m,a_m\}=\frac{1}{2}\left(\hat{\rho}_m a_m+a_m\hat{\rho}_m\right), \tag{25}$$
where
$$a_m=a-\rho_m(a)\,\mathbb{I}, \tag{26}$$
which is precisely the definition of the SLD (see also equation 3 in [85] and equations 3.4 and 3.14 in [100] for the multiparametric case). This justifies the interpretation of equation (23) as the $C^*$-algebraic generalization of the SLD embracing also the multiparametric quantum and classical cases.
Example 3 (A pure state qubit model). Consider the algebra $\mathscr{M}_2$ of the qubit (see example 2). Take the one-parameter group of unitary elements generated by the element $\imath\sigma_3$ according to
$$u_\gamma=e^{\frac{\imath}{2}\gamma\sigma_3}, \tag{27}$$
where $\gamma\in\mathbb{R}$. Then, consider the orbit $\mathcal{O}\cong\mathbb{CP}^2$ of pure states on $\mathscr{M}_2$, set $M=\mathbb{R}$, and consider the map $j_R\colon M\to\mathcal{O}$ given by
$$\rho_\gamma\equiv j_R(\gamma):=\Phi(u_\gamma,\rho), \tag{28}$$
where $\Phi$ is the action of $\mathscr{G}\supset\mathscr{U}$ given in equation (3), and
$$\rho=\frac{1}{2}\left(\sigma^0+\sigma^1\right). \tag{29}$$
A direct computation shows that
$$\rho_\gamma=\frac{1}{2}\left(\sigma^0+\cos(\gamma)\,\sigma^1-\sin(\gamma)\,\sigma^2\right) \tag{30}$$
and that $j_R$ is smooth. Clearly, $j_R$ is not injective, and thus the parametric model $(\mathbb{R},j_R,\mathbb{CP}^2)$ is not identifiable. However, the parametrization given in equation (30) is useful in quantum estimation theory when an experimental realization of the parametric model is constructed in terms of a spin interacting with a magnetic field. In this case, $\gamma=tB$ where $t$ is the time parameter of the dynamical evolution and $B$ is the strength of the magnetic field. Then, the fact that the model is not identifiable depends on the dynamical evolution being periodic.
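Equation (30) and the periodicity of the model can be checked numerically. The sketch below assumes the trace identification (12), under which the action (3) of a unitary $u$ on a density matrix is $\hat{\rho}\mapsto u\hat{\rho}u^\dagger$, and takes $\hat{\rho}=\frac{1}{2}(\sigma_0+\sigma_1)$ as the density matrix of the initial state. Function names are illustrative and not part of the paper.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def u(gamma):
    """exp(i*gamma*sigma_3/2), which is diagonal because sigma_3 is."""
    return [[cmath.exp(0.5j * gamma), 0], [0, cmath.exp(-0.5j * gamma)]]

def rho_gamma(gamma):
    """Density matrix obtained by conjugating (sigma_0 + sigma_1)/2 with u."""
    rho = [[0.5, 0.5], [0.5, 0.5]]
    ug = u(gamma)
    return matmul(matmul(ug, rho), dagger(ug))

def rho_formula(gamma):
    """Density matrix (sigma_0 + cos(g) sigma_1 - sin(g) sigma_2) / 2."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [[0.5, 0.5 * (c + 1j * s)], [0.5 * (c - 1j * s), 0.5]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

gamma = 0.7
print(close(rho_gamma(gamma), rho_formula(gamma)))
print(close(rho_gamma(gamma), rho_gamma(gamma + 2 * math.pi)))
```

The second check makes the failure of injectivity explicit: parameters differing by $2\pi$ give the same state.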
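Now, let us consider the vector field $V$ on $M=\mathbb{R}$ generating translations. This vector field is complete, and provides a basis of the tangent space $T_\gamma M$ at each $\gamma\in M$. Moreover, $V$ is the infinitesimal generator of the action of the Abelian Lie group $G=\mathbb{R}$ on $M=\mathbb{R}$ given by
$$\psi(\zeta,\gamma) := \gamma+\zeta \quad \forall\,\zeta\in G,\ \gamma\in M. \tag{31}$$
The group $G$ also acts on $\mathbb{CP}^{2}$ by means of
$$\Psi(\zeta,\rho) := \Phi(U_\zeta,\rho), \tag{32}$$
where $\Phi$ is the action given in equation (3) and $U_\zeta$ denotes the unitary element of equation (27) evaluated at $\zeta$. The fact that $\Psi$ is a group action follows from the fact that the map $\gamma\mapsto U_\gamma$ is a group homomorphism, that is, it satisfies
$$U_{\zeta_1}U_{\zeta_2} = U_{\zeta_1+\zeta_2} \quad \forall\,\zeta_1,\zeta_2\in G. \tag{33}$$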
The actions $\psi$ and $\Psi$ are closely related to one another: a direct computation shows that they are equivariant with respect to $\mathrm{j}_{\mathbb{R}}$, which means that
$$\mathrm{j}_{\mathbb{R}}(\psi(\zeta,\gamma)) = \Psi\left(\zeta, \mathrm{j}_{\mathbb{R}}(\gamma)\right). \tag{34}$$
This property is quite strong because it implies that the fundamental vector fields of the action of $G$ on $M=\mathbb{R}$ are $\mathrm{j}_{\mathbb{R}}$-related with the fundamental vector fields of the action of $G$ on $\mathbb{CP}^{2}$, which means that [2]
TγjR(Vγ) = Wζ
ργ,(35)
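where $V$ is the fundamental vector field of $\psi(\zeta,\gamma)$ (i.e., the vector field generating the translations considered above), while $W$ is the fundamental vector field of $\Psi(\zeta,\rho)$ (recall that, in this case, the exponential map from the Lie algebra of $G$ to $G$ itself is the identity). Since
$$\Psi(\zeta,\rho) = \Phi(U_\zeta,\rho), \tag{36}$$
the fundamental vector field $W$ is easily seen to be the Hamiltonian vector field associated with $\sigma_3$ (see equation (6)). This means that
$$\langle \mathrm{d}f_{\mathbf{b}}, T_\gamma\mathrm{j}_{\mathbb{R}}(V_\gamma)\rangle = \langle \mathrm{d}f_{\mathbf{b}}, W_{\rho_\gamma}\rangle = \rho_\gamma\left([[\sigma_3,\mathbf{b}]]\right). \tag{37}$$
Consequently, regarding the SLD, equation (23) leads us to look for the self-adjoint element $\mathbf{a}$ satisfying
$$\rho_\gamma(\{\mathbf{a},\mathbf{b}\}) - \rho_\gamma(\mathbf{a})\,\rho_\gamma(\mathbf{b}) = \rho_\gamma\left([[\sigma_3,\mathbf{b}]]\right) \tag{38}$$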
for all self-adjoint elements $\mathbf{b}\in M_2$. Passing from $\rho_\gamma$ to its density matrix $\hat{\rho}_\gamma$, we see that equation (38) is equivalent to
$$\{\hat{\rho}_\gamma,\mathbf{a}\} - \mathrm{Tr}(\hat{\rho}_\gamma\mathbf{a})\,\hat{\rho}_\gamma = [[\hat{\rho}_\gamma,\sigma_3]]. \tag{39}$$
We write
$$\mathbf{a} = a_0\sigma_0 + a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3, \tag{40}$$
where $a_j\in\mathbb{R}$ for all $j=0,1,2,3$. A direct computation exploiting the properties of the Pauli matrices shows that $a_0$ is arbitrary (as it should be because of the very definition of gradient vector field), $a_3=0$, while $a_1$ and $a_2$ must satisfy
$$a_1\sin(\gamma) + a_2\cos(\gamma) = -1. \tag{41}$$
Clearly, this means that $\mathbf{a}$, and thus the SLD, is not uniquely defined.
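The non-uniqueness expressed by equation (41) can be checked numerically. The sketch below uses the form (25) of the SLD equation, with the derivative of equation (30) on the right-hand side, so that no assumption on the normalization of $[[\cdot,\cdot]]$ is needed. The two-parameter family `candidate(gamma, a0, c)` is our own parametrization of the solutions of $a_3=0$ and equation (41); all function names are hypothetical.

```python
import math

# Pauli matrices sigma_0, ..., sigma_3 as nested lists.
S = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def pauli_sum(c):
    """Return c[0]*sigma_0 + c[1]*sigma_1 + c[2]*sigma_2 + c[3]*sigma_3."""
    return [[sum(c[k] * S[k][i][j] for k in range(4)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rho(gamma):
    """Density matrix of equation (30)."""
    return pauli_sum([0.5, 0.5 * math.cos(gamma), -0.5 * math.sin(gamma), 0.0])


def drho(gamma):
    """Derivative of equation (30) with respect to gamma."""
    return pauli_sum([0.0, -0.5 * math.sin(gamma), -0.5 * math.cos(gamma), 0.0])


def sld_residual(gamma, a_coeffs):
    """Largest entry of {rho, a} - Tr(rho a) rho - d(rho)/d(gamma)."""
    r, a, d = rho(gamma), pauli_sum(a_coeffs), drho(gamma)
    ra, ar = matmul(r, a), matmul(a, r)
    tr = ra[0][0] + ra[1][1]
    return max(abs((ra[i][j] + ar[i][j]) / 2 - tr * r[i][j] - d[i][j])
               for i in range(2) for j in range(2))


def candidate(gamma, a0, c):
    """Coefficients with a3 = 0 and a1*sin(gamma) + a2*cos(gamma) = -1."""
    return [a0,
            -math.sin(gamma) + c * math.cos(gamma),
            -math.cos(gamma) - c * math.sin(gamma),
            0.0]
```

Every choice of `a0` and `c` gives a vanishing residual, whereas an element violating equation (41), such as $\sigma_3$ itself, does not.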
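Concerning the covariant tensor $\mathrm{G}^{\mathbb{R}}$, we have
$$\mathrm{G}^{\mathbb{R}} = \mathrm{j}_{\mathbb{R}}^{*}\mathrm{G} \tag{42}$$
by definition. Since $\mathrm{j}_{\mathbb{R}}$ is an immersion and $\mathrm{G}$ is a Riemannian metric, $\mathrm{G}^{\mathbb{R}}$ is a Riemannian metric (i.e., it is positive and invertible). Moreover, setting
$$\psi_\zeta(\gamma) := \psi(\zeta,\gamma), \qquad \Psi_\zeta(\rho) := \Psi(\zeta,\rho) = \Phi(U_\zeta,\rho), \tag{43}$$
we immediately obtain
$$\psi_\zeta^{*}\mathrm{G}^{\mathbb{R}} = \psi_\zeta^{*}\mathrm{j}_{\mathbb{R}}^{*}\mathrm{G} = (\mathrm{j}_{\mathbb{R}}\circ\psi_\zeta)^{*}\mathrm{G} = (\Psi_\zeta\circ\mathrm{j}_{\mathbb{R}})^{*}\mathrm{G} = \mathrm{j}_{\mathbb{R}}^{*}\Psi_\zeta^{*}\mathrm{G} = \mathrm{j}_{\mathbb{R}}^{*}\Phi_{U_\zeta}^{*}\mathrm{G} = \mathrm{j}_{\mathbb{R}}^{*}\mathrm{G} = \mathrm{G}^{\mathbb{R}}, \tag{44}$$
where we used equation (34) in the third equality, and equation (17) in the sixth equality. Therefore, we conclude that $\mathrm{G}^{\mathbb{R}}$ is invariant with respect to the action of the Lie group $G=\mathbb{R}$ on $M=\mathbb{R}$ given by translation, and thus must be a constant multiple of the Euclidean metric tensor.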
Example 4 (A mixed state qubit model). Consider the algebra $M_2$ of the qubit (see example 2). Consider the orbit $\mathcal{O}$ of faithful states, set $M=\mathbb{R}_+\times\mathbb{R}_+$, and define the map $\mathrm{j}_M$ as
$$\rho_{\gamma,\zeta} \equiv \mathrm{j}_M(\gamma,\zeta) := \frac{1}{2}\left(\sigma_0 + \mathrm{e}^{-\zeta\gamma}\left(\cos(\gamma)\,\sigma_1 - \sin(\gamma)\,\sigma_2\right)\right). \tag{45}$$
A direct computation shows that this map is smooth. Quite interestingly, the parametric model $(M,\mathrm{j}_M,\mathcal{O})$ has a physical origin which is connected with the dynamics of open quantum systems. The dynamics of such systems is governed by the so-called Gorini-Kossakowski-Lindblad-Sudarshan (GKLS) equation [7, 8, 9, 26, 52, 76]. In particular, choosing the infinitesimal generator $L$ of this linear equation to be that of the dephasing channel, the dynamical evolution generated by $L$ is such that the initial (pure) state $\rho$ given in equation (29) evolves according to the right-hand side of equation (45), where $\gamma/2$ plays the role of the time parameter while $2\zeta$ is the dephasing parameter [29, ex. 2]. Note that the initial pure state is evolved into a mixed (faithful) state as soon as the time parameter is greater than 0.
This model has been recently considered in the context of quantum parameter estimation in the presence of nuisance parameters [97].
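Let us now consider the vector fields $V$ and $W$ on $M$ generating the local one-parameter groups of local diffeomorphisms
$$\phi_t(\gamma,\zeta) = (\gamma+t,\zeta), \qquad \psi_t(\gamma,\zeta) = (\gamma,\zeta+t). \tag{46}$$
Clearly, these vector fields are not complete on $M$; however, they provide a basis of tangent vectors at each point of $M$. A direct computation shows that
$$\begin{aligned}
\langle \mathrm{d}f_{\mathbf{b}}, T_{\gamma,\zeta}\mathrm{j}_M(V_{\gamma,\zeta})\rangle &= -\mathrm{e}^{-\zeta\gamma}\left((\sin(\gamma)+\zeta\cos(\gamma))\,b_1 + (\cos(\gamma)-\zeta\sin(\gamma))\,b_2\right),\\
\langle \mathrm{d}f_{\mathbf{b}}, T_{\gamma,\zeta}\mathrm{j}_M(W_{\gamma,\zeta})\rangle &= -\gamma\,\mathrm{e}^{-\zeta\gamma}\left(\cos(\gamma)\,b_1 - \sin(\gamma)\,b_2\right),
\end{aligned} \tag{47}$$
where $b_1$ and $b_2$ are the components of $\mathbf{b}$ along $\sigma_1$ and $\sigma_2$ (compare equation (40)), from which we conclude that $\mathrm{j}_M$ is an immersion. Then, equation (23) implies that the SLDs $\mathbb{Y}_{\mathbf{a}^V}(\rho_{\gamma,\zeta})$ and $\mathbb{Y}_{\mathbf{a}^W}(\rho_{\gamma,\zeta})$ of $V$ and $W$ at $(\gamma,\zeta)$, respectively, are found as the solutions of
$$\begin{aligned}
\langle \mathrm{d}f_{\mathbf{b}}, T_{\gamma,\zeta}\mathrm{j}_M(V_{\gamma,\zeta})\rangle &= \langle \mathrm{d}f_{\mathbf{b}}, \mathbb{Y}_{\mathbf{a}^V}(\rho_{\gamma,\zeta})\rangle = \rho_{\gamma,\zeta}(\{\mathbf{a}^V,\mathbf{b}\}) - \rho_{\gamma,\zeta}(\mathbf{a}^V)\,\rho_{\gamma,\zeta}(\mathbf{b}),\\
\langle \mathrm{d}f_{\mathbf{b}}, T_{\gamma,\zeta}\mathrm{j}_M(W_{\gamma,\zeta})\rangle &= \langle \mathrm{d}f_{\mathbf{b}}, \mathbb{Y}_{\mathbf{a}^W}(\rho_{\gamma,\zeta})\rangle = \rho_{\gamma,\zeta}(\{\mathbf{a}^W,\mathbf{b}\}) - \rho_{\gamma,\zeta}(\mathbf{a}^W)\,\rho_{\gamma,\zeta}(\mathbf{b})
\end{aligned} \tag{48}$$
for all self-adjoint elements $\mathbf{b}\in M_2$. A direct computation leads to
$$\begin{aligned}
\mathbf{a}^V &= a^V_0\,\sigma_0 - \left(\mathrm{e}^{-\zeta\gamma}\sin(\gamma) + \frac{\zeta\cos(\gamma)}{2\sinh(\zeta\gamma)}\right)\sigma_1 + \left(\frac{\zeta\sin(\gamma)}{2\sinh(\zeta\gamma)} - \mathrm{e}^{-\zeta\gamma}\cos(\gamma)\right)\sigma_2,\\
\mathbf{a}^W &= a^W_0\,\sigma_0 - \frac{\gamma}{2\sinh(\zeta\gamma)}\left(\cos(\gamma)\,\sigma_1 - \sin(\gamma)\,\sigma_2\right).
\end{aligned} \tag{49}$$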
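Note that, apart from the coefficients $a^V_0$ and $a^W_0$, which are arbitrary because they do not affect the expression of the associated gradient vector field, the SLDs associated with $V$ and $W$ are uniquely defined at each point of $M$. This is due to the fact that the model is a model of faithful states. Also, note that $[\mathbf{a}^V,\mathbf{a}^W]\neq\mathbf{0}$, and thus there is no unital, Abelian $C^{\star}$-subalgebra of $M_2$ that contains both $\mathbf{a}^V$ and $\mathbf{a}^W$. This will have an impact on the attainability of the Helstrom bound.
Since $\mathrm{G}^M=\mathrm{j}_M^{*}\mathrm{G}$, we immediately obtain (see equation (16))
$$\mathrm{G}^M_{\gamma,\zeta}(V_{\gamma,\zeta},V_{\gamma,\zeta}) = \mathrm{G}_{\rho_{\gamma,\zeta}}\left(\mathbb{Y}_{\mathbf{a}^V}(\rho_{\gamma,\zeta}),\mathbb{Y}_{\mathbf{a}^V}(\rho_{\gamma,\zeta})\right) = \rho_{\gamma,\zeta}(\{\mathbf{a}^V,\mathbf{a}^V\}) - \left(\rho_{\gamma,\zeta}(\mathbf{a}^V)\right)^2, \tag{50}$$
and similarly for $\mathrm{G}^M_{\gamma,\zeta}(V_{\gamma,\zeta},W_{\gamma,\zeta})$ and $\mathrm{G}^M_{\gamma,\zeta}(W_{\gamma,\zeta},W_{\gamma,\zeta})$. Then, since $V$ and $W$ provide a basis of tangent vectors at each point in $M$, the tensor $\mathrm{G}^M$ can be computed to be
$$\mathrm{G}^M = \left(\mathrm{e}^{-2\zeta\gamma} + \frac{\zeta^2}{\mathrm{e}^{2\zeta\gamma}-1}\right)\mathrm{d}\gamma\otimes\mathrm{d}\gamma + \left(\frac{\zeta\gamma}{\mathrm{e}^{2\zeta\gamma}-1}\right)\mathrm{d}\gamma\otimes_S\mathrm{d}\zeta + \left(\frac{\gamma^2}{\mathrm{e}^{2\zeta\gamma}-1}\right)\mathrm{d}\zeta\otimes\mathrm{d}\zeta. \tag{51}$$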
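The passage from equation (49) to equation (51) can be reproduced numerically. The following Python sketch evaluates $\rho_{\gamma,\zeta}(\{\mathbf{a}^X,\mathbf{a}^Y\}) - \rho_{\gamma,\zeta}(\mathbf{a}^X)\rho_{\gamma,\zeta}(\mathbf{a}^Y)$ with the elements of equation (49), taking $a^V_0=a^W_0=0$, and compares the result with the coefficients of equation (51). We assume here that the coefficient of $\mathrm{d}\gamma\otimes_S\mathrm{d}\zeta$ coincides with the mixed component $\mathrm{G}^M(V,W)$, which depends on the normalization chosen for $\otimes_S$; all function names are illustrative.

```python
import math

# Pauli matrices sigma_0, ..., sigma_3 as nested lists.
S = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def pauli_sum(c):
    """Return c[0]*sigma_0 + ... + c[3]*sigma_3."""
    return [[sum(c[k] * S[k][i][j] for k in range(4)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expectation(r, a):
    """Tr(r a) for 2x2 matrices."""
    ra = matmul(r, a)
    return ra[0][0] + ra[1][1]


def rho(g, z):
    """Density matrix of equation (45)."""
    e = math.exp(-z * g)
    return pauli_sum([0.5, 0.5 * e * math.cos(g), -0.5 * e * math.sin(g), 0.0])


def a_V(g, z):
    """Coefficients of a^V in equation (49), with a^V_0 = 0."""
    e, sh = math.exp(-z * g), 2.0 * math.sinh(z * g)
    return [0.0, -(e * math.sin(g) + z * math.cos(g) / sh),
            z * math.sin(g) / sh - e * math.cos(g), 0.0]


def a_W(g, z):
    """Coefficients of a^W in equation (49), with a^W_0 = 0."""
    k = g / (2.0 * math.sinh(z * g))
    return [0.0, -k * math.cos(g), k * math.sin(g), 0.0]


def metric(g, z, ax, ay):
    """rho({ax, ay}) - rho(ax) rho(ay), with {.,.} the Jordan product."""
    r, x, y = rho(g, z), pauli_sum(ax), pauli_sum(ay)
    xy, yx = matmul(x, y), matmul(y, x)
    jordan = [[(xy[i][j] + yx[i][j]) / 2 for j in range(2)] for i in range(2)]
    value = expectation(r, jordan) - expectation(r, x) * expectation(r, y)
    return value.real


def metric_51(g, z):
    """Coefficients (VV, VW, WW) read off from equation (51)."""
    d = math.exp(2.0 * z * g) - 1.0
    return (math.exp(-2.0 * z * g) + z * z / d, z * g / d, g * g / d)


def commutator_size(g, z):
    """Largest entry of [a^V, a^W]."""
    x, y = pauli_sum(a_V(g, z)), pauli_sum(a_W(g, z))
    xy, yx = matmul(x, y), matmul(y, x)
    return max(abs(xy[i][j] - yx[i][j]) for i in range(2) for j in range(2))
```

For any point with $\gamma,\zeta>0$, the three values `metric(g, z, a_V(g, z), a_V(g, z))`, `metric(g, z, a_V(g, z), a_W(g, z))` and `metric(g, z, a_W(g, z), a_W(g, z))` agree with `metric_51(g, z)`, and `commutator_size(g, z)` is strictly positive.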
Example 5 (Lie group and Lie algebra parametric models). Motivated by the model in example 3, and by some of the models commonly used in the quantum context [13, 96], we introduce the notion of a Lie group parametric model and of a Lie algebra parametric model.
Let $G$ be a Lie group which is realized as a Lie subgroup of the Lie group $\mathcal{G}$ of invertible elements in $\mathcal{A}$, and let $\rho_0$ be a state in $\mathcal{S}$. Set $M=G$ and define the map $\mathrm{j}_G\colon M\to\mathcal{O}$, where $\mathcal{O}$ is the orbit containing $\rho_0$, by means of
$$\mathrm{j}_G(g) := \Phi(\mathrm{g}, \rho_0). \tag{52}$$
This map is clearly smooth, and we call $(G,\mathrm{j}_G,\mathcal{O})$ a Lie group parametric model. If the fiducial state $\rho_0$ is such that
$$\Phi(\mathrm{g},\rho_0) = \rho_0 \iff \mathrm{g} = \mathbb{I} \quad \forall\, g\in G, \tag{53}$$
then the model is identifiable.
Since $G$ is a subgroup of $\mathcal{G}$, the left action of $G$ on itself is related with the action of $G$ on $\mathcal{O}$ determined by the restriction of $\Phi$ to $G$ in the way expressed in equation (34). Specifically, let $\psi$ be the left action of $G$ on itself. Define an action $\Psi$ of $G$ on $\mathcal{O}$ given by
$$\Psi(g,\rho) := \Phi(\mathrm{g}(g),\rho), \tag{54}$$
where $g\in G$ and $\mathrm{g}(g)\in\mathcal{G}$ is the realization of $g$ as an element of $\mathcal{G}$. Then, $g\mapsto\mathrm{g}(g)$ is a group homomorphism, that is, it satisfies
$$\mathrm{g}(g_1g_2) = \mathrm{g}(g_1)\,\mathrm{g}(g_2), \tag{55}$$
and thus it follows from equations (52), (54), and (55) that
$$\mathrm{j}_G(\psi(g,h)) = \Psi\left(\mathrm{g}(g),\mathrm{j}_G(h)\right), \tag{56}$$
which means that the actions $\psi$ and $\Psi$ are equivariant with respect to $\mathrm{j}_G$. Consequently, the fundamental vector fields of $\psi$ are $\mathrm{j}_G$-related with the fundamental vector fields of $\Psi$ [2]. This fact may be helpful in computing the SLD by adapting the steps outlined in example 3.
If $(G,\mathrm{j}_G,\mathcal{O})$ is a Lie group parametric model and we consider another parameter manifold which is a smooth homogeneous space $M=G/H$ of $G$ admitting a global, smooth section $\eta\colon M\to G$, then we can immediately build another parametric model $(M,\mathrm{j}_M,\mathcal{O})$ by setting $\mathrm{j}_M := \mathrm{j}_G\circ\eta$. This may be helpful to obtain identifiable models. Indeed, if $\rho_0$ has a nontrivial isotropy group $G_0\subset G$, which is the set of all elements $g\in G$ such that $\Phi(\mathrm{g},\rho_0)=\rho_0$, we have that $M=G/G_0$ is a smooth manifold. Then, if there is a smooth section $\eta$ for $M$, the resulting parametric model will be identifiable. This is very similar to the notion of coherent state used in quantum theory [3, 86].
Another relevant parametric model is obtained when we consider the Lie algebra $\mathfrak{g}$ of $G$. In this case, we have the exponential map $\exp\colon\mathfrak{g}\to G$ that can be exploited to define a parametric model. Specifically, let $(G,\mathrm{j}_G,\mathcal{O})$ be a Lie group parametric model. Then, defining $\mathrm{j}_{\mathfrak{g}} := \mathrm{j}_G\circ\exp$, we immediately obtain the parametric model $(\mathfrak{g},\mathrm{j}_{\mathfrak{g}},\mathcal{O})$, which is referred to as a Lie algebra parametric model. If the Lie algebra $\mathfrak{g}$ is commutative, then the exponential map is a group homomorphism when the Lie algebra is thought of as a group with respect to the vector sum, and we obtain an equivariance relation with respect to $\mathrm{j}_{\mathfrak{g}}$ between the left action $\psi$ of $\mathfrak{g}$ on itself and its realization $\Psi(v,\rho)=\Phi(\exp(v),\rho)$ as a group acting on $\mathcal{O}$.
4 Parametric statistical models of states on C⋆-algebras
When an experiment is performed on a system in a given state $\rho$, we obtain an outcome lying in a given outcome space $\mathcal{X}$ which is associated with the measurement procedure. The state $\rho$ is then “transformed” into a probability distribution on $\mathcal{X}$, in the sense that different repetitions of the same experimental procedure (i.e., preparation of the system in the state $\rho$ followed by the measurement procedure with outcome space $\mathcal{X}$) will in general produce different outcomes, characterized by a probability distribution which is associated with the state $\rho$ and with the measurement procedure adopted. In this work, we will always consider outcome spaces which are discrete and finite.
Given a discrete and finite outcome space $\mathcal{X}_n$ with $n$ elements, the statistical interpretation of the state $\rho$ is encoded in a map $\mathrm{m}^{\star}\colon\mathcal{S}\to\mathcal{P}(\mathcal{X}_n)\equiv\Delta_n$, which we will assume to be convex in order to preserve one of the basic features of probabilities and states. From this, it follows that $\mathrm{m}^{\star}$ can be extended to a linear map $\mathrm{m}^{\star}\colon\mathcal{A}^{\star}\to\mathsf{S}(\mathcal{X}_n)$, where $\mathsf{S}(\mathcal{X}_n)$ is the vector space of signed measures on $\mathcal{X}_n$. From the $C^{\star}$-algebraic perspective, $\mathsf{S}(\mathcal{X}_n)$ is the space of self-adjoint linear functionals on the Abelian $C^{\star}$-algebra $\mathcal{C}_n := C(\mathcal{X}_n)$ of complex-valued, continuous functions on $\mathcal{X}_n$, and thus, since $\mathrm{m}^{\star}$ is continuous because $\mathcal{A}^{\star}$ and $\mathsf{S}(\mathcal{X}_n)$ are finite-dimensional, we immediately obtain that there is a continuous linear map $\mathrm{m}\colon\mathcal{C}_n\to\mathcal{A}$ of which $\mathrm{m}^{\star}$ is the dual map. By construction, the map $\mathrm{m}$ must be such that its dual map $\mathrm{m}^{\star}$ sends the space of states of $\mathcal{A}$ into the space of states of $\mathcal{C}_n$. One way to implement this condition is to require $\mathrm{m}\colon\mathcal{C}_n\to\mathcal{A}$ to be a unital, positive map between $C^{\star}$-algebras, that is, a linear map preserving the identity and sending positive elements into positive elements (clearly, any such map sends self-adjoint elements into self-adjoint elements).
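Definition 2. A positive unital map $\mathrm{m}\colon\mathcal{C}_n\to\mathcal{A}$ is defined to be a measurement procedure.
Specifically, given a finite and discrete outcome space $\mathcal{X}_n$, we can always consider the basis of $\mathcal{C}_n$ given by the elements $\{\mathbf{e}_j\}_{j=1,\dots,n}$, where $\mathbf{e}_j$ is the “delta function” at the $j$-th element of $\mathcal{X}_n$. The measurement procedure $\mathrm{m}$ amounts to defining the elements
$$\mathbf{m}_j := \mathrm{m}(\mathbf{e}_j) \quad \forall\, j=1,\dots,n, \tag{57}$$
in such a way that they satisfy
$$\sum_{j=1}^{n}\mathbf{m}_j = \mathbb{I} \tag{58}$$
and
$$\mathbf{m}_j\geq 0 \quad \forall\, j=1,\dots,n. \tag{59}$$
Essentially, we are considering a (discrete) POVM in the $C^{\star}$-algebraic framework. The probability distribution $\mathrm{m}^{\star}(\rho)$ associated with the state $\rho$ is characterized by the numbers
$$p_j := \left(\mathrm{m}^{\star}(\rho)\right)(\mathbf{e}_j) = \rho\left(\mathrm{m}(\mathbf{e}_j)\right) = \rho(\mathbf{m}_j). \tag{60}$$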
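To make conditions (58)-(60) concrete, the following Python sketch checks them for a two-outcome measurement procedure on the qubit algebra. The family $\mathbf{m}_\pm = \frac{1}{2}(\sigma_0\pm\eta\,\sigma_1)$ with $0\le\eta\le 1$ is an illustrative choice of ours and does not appear in the text; states are represented by density matrices, so that $\rho(\mathbf{m}_j)=\mathrm{Tr}(\hat{\rho}\,\mathbf{m}_j)$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_positive(h, tol=1e-12):
    """A Hermitian 2x2 matrix is positive iff its trace and determinant are >= 0."""
    h = [[complex(x) for x in row] for row in h]
    hermitian = (abs(h[0][1] - h[1][0].conjugate()) <= tol
                 and abs(h[0][0].imag) <= tol and abs(h[1][1].imag) <= tol)
    trace = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    return hermitian and trace >= -tol and det >= -tol


def is_measurement(elements, tol=1e-12):
    """Check equations (58) and (59): the elements are positive and sum to I."""
    total = [[sum(m[i][j] for m in elements) for j in range(2)] for i in range(2)]
    identity = [[1, 0], [0, 1]]
    unital = all(abs(total[i][j] - identity[i][j]) <= tol
                 for i in range(2) for j in range(2))
    return unital and all(is_positive(m, tol) for m in elements)


def probabilities(rho, elements):
    """Equation (60): p_j = Tr(rho m_j)."""
    out = []
    for m in elements:
        rm = matmul(rho, m)
        out.append(complex(rm[0][0] + rm[1][1]).real)
    return out


def unsharp_sigma1(eta):
    """Hypothetical two-outcome POVM (sigma_0 +/- eta*sigma_1)/2."""
    plus = [[0.5, 0.5 * eta], [0.5 * eta, 0.5]]
    minus = [[0.5, -0.5 * eta], [-0.5 * eta, 0.5]]
    return [plus, minus]
```

For the pure state of equation (29), whose density matrix has all entries equal to $1/2$, the call `probabilities(rho, unsharp_sigma1(eta))` returns the pair $\left(\frac{1+\eta}{2},\frac{1-\eta}{2}\right)$.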
Once a parametric model $(M,\mathrm{j},\mathcal{O})$ is chosen, we immediately have the map
$$\mathrm{j}^{c} := \mathrm{m}^{\star}\circ\mathrm{j}\colon M\to\Delta_n. \tag{61}$$
We require the image of this map to lie entirely in a given fixed orbit of states inside $\Delta_n$. Since every orbit in $\Delta_n$ is diffeomorphic to $\Delta^{+}_k$ for some $k\leq n$ (see example 1), there is no loss of generality in requiring the image of $\mathrm{j}^{c}$ to lie entirely inside the manifold $\Delta^{+}_n$ of faithful states on $\mathcal{C}_n$. Indeed, if this is not the case, it suffices to redefine $\mathcal{X}_n$ to be the subset $I_k$, exchange $\mathcal{C}_n$ with $C(I_k)$, and relabel $k$ as $n$.
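Definition 3. Let $(M,\mathrm{j},\mathcal{O})$ be a parametric model of states on a $C^{\star}$-algebra $\mathcal{A}$. A measurement procedure $\mathrm{m}$ such that $\mathrm{j}^{c}(M) := \mathrm{m}^{\star}\circ\mathrm{j}(M)\subseteq\Delta^{+}_n$ is called regular for $(M,\mathrm{j},\mathcal{O})$.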
Once a regular measurement procedure $\mathrm{m}$ for $(M,\mathrm{j},\mathcal{O})$ is chosen, we are ready to build a parametric statistical model (in the sense of information geometry [5, 6, 10]) associated with the parametric model $(M,\mathrm{j},\mathcal{O})$.
Definition 4. Let $(M,\mathrm{j},\mathcal{O})$ be a parametric model of states on a $C^{\star}$-algebra $\mathcal{A}$, and let $\mathrm{m}$ be a regular measurement procedure for $(M,\mathrm{j},\mathcal{O})$. Then, the triple $(M,\mathrm{j}^{c},\Delta^{+}_n)$, with $\mathrm{j}^{c}$ as in equation (61), is defined to be the parametric statistical model associated with the parametric model $(M,\mathrm{j},\mathcal{O})$ by means of the measurement procedure $\mathrm{m}$.
Remark 1 (Classical statistical models). In the specific case when the algebra $\mathcal{A}$ is commutative, i.e., $\mathcal{A}=\mathcal{C}_n$ for some $n\in\mathbb{N}$, a parametric model $(M,\mathrm{j},\mathcal{O})$ of states on $\mathcal{C}_n$ is already a parametric statistical model by itself. Indeed, according to example 1, the orbit $\mathcal{O}$ is diffeomorphic to the open interior $\Delta^{+}_k$ of a $k$-simplex with $k\leq n$. Specifically, we have a subset $I_k\subseteq\mathcal{X}_n$ of $k$ elements, the $C^{\star}$-algebra $\mathcal{C}_k$ generated by the elements $\mathbf{e}_j\in\mathcal{C}_n$ with $j$ such that $x_j\in I_k$, and $\mathcal{O}$ is diffeomorphic to the orbit of faithful states of $\mathcal{C}_k$. Then, we have a “natural” measurement procedure $\mathrm{m}\colon\mathcal{C}_k\to\mathcal{C}_n$ at our disposal, given by the natural identification map $\mathrm{i}_k$ of $\mathcal{C}_k$ in $\mathcal{C}_n$, and the map $\mathrm{j}^{c}=\mathrm{m}^{\star}\circ\mathrm{j}=\mathrm{i}_k^{*}\circ\mathrm{j}$ gives rise to the statistical model $(M,\mathrm{j}^{c},\Delta^{+}_k)$ associated with $(M,\mathrm{j},\mathcal{O})$. From this, it is clear that once we have the parametric model $(M,\mathrm{j},\mathcal{O})$, we immediately have a “natural” parametric statistical model $(M,\mathrm{j}^{c},\Delta^{+}_k)$ associated with it. No additional choices must be made.
Exploiting the Riemannian geometry of $\Delta^{+}_n$, the parameter manifold $M$ may be endowed with another symmetric, covariant $(0,2)$ tensor which is in general different from the metric $\mathrm{G}^M$ introduced before. Indeed, we may consider the Fisher-Rao Riemannian metric $\mathrm{G}_{FR}$ on $\Delta^{+}_n$, which is the Riemannian metric tensor $\mathrm{G}$ associated with the Jordan product of the self-adjoint part of $\mathcal{C}_n$ as described in section 2, and then take its pullback
$$\mathrm{G}^{Mc} = (\mathrm{j}^{c})^{*}\mathrm{G}_{FR} \tag{62}$$
to $M$ (the ‘c’ stands for classical, or commutative). In this case, we obtain a symmetric covariant tensor on $M$ which, unlike $\mathrm{G}^M$ given by equation (21), cannot feel the possible noncommutativity of $\mathcal{A}$, and which is the pullback of the Fisher-Rao metric tensor to $M$, thought of as a parametric statistical model in $\Delta^{+}_n$ along the lines of classical information geometry.
To accommodate multiple runs, say $N$, of the same experimental procedure on $N$ identical and independent copies of the initial state, we introduce the parametric model $(M,\mathrm{j}^{N},\mathcal{O}^{N})$, where $\mathcal{O}^{N}$ is the manifold of states on the tensor product algebra
$$\mathcal{A}^{\otimes N} := \mathcal{A}\otimes\cdots\otimes\mathcal{A} \tag{63}$$
containing the product states of the form $\rho_1\otimes\cdots\otimes\rho_N$ with $\rho_j\in\mathcal{O}$ for every $j=1,\dots,N$, and $\mathrm{j}^{N}\colon M\to\mathcal{O}^{N}$ is given by
$$\mathrm{j}^{N}(m) := \mathrm{j}(m)\otimes\cdots\otimes\mathrm{j}(m) \equiv \rho_m\otimes\cdots\otimes\rho_m \equiv \rho_m^{\otimes N}. \tag{64}$$
Clearly, we may endow $M$ with the Riemannian metric $\mathrm{G}^{MN}$ defined by
$$\mathrm{G}^{MN} := (\mathrm{j}^{N})^{*}\mathrm{G}^{N}, \tag{65}$$
where $\mathrm{G}^{N}$ denotes the canonical Riemannian metric on $\mathcal{O}^{N}$ associated with the Jordan product on $\mathcal{A}^{\otimes N}$. Since the smooth embedding $\mathrm{j}^{N}$ has been defined in terms of a “multiplicative object”, namely, the tensor product, it is reasonable to expect that this multiplicative feature is also reflected in the pullback metric. Indeed, below we will prove that
$$\mathrm{G}^{MN} = N\,\mathrm{G}^{M}. \tag{66}$$
Performing $N$ runs of an experiment provides us with a list of $N$ outcomes, and we consider the outcome space
$$\mathcal{X}^{N} = \mathcal{X}\times\cdots\times\mathcal{X}. \tag{67}$$
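At this point, we must choose a measurement procedure $\mathrm{m}^{N}\colon\mathcal{C}_n^{\otimes N}=C(\mathcal{X}^{N})\to\mathcal{A}^{\otimes N}$ so that, setting $\mathrm{j}^{cN}=(\mathrm{m}^{N})^{\star}\circ\mathrm{j}^{N}$, we can build a statistical model $(M,\mathrm{j}^{cN},\Delta^{+}_{n^N})$ in the obvious way. We may endow $M$ with the Riemannian metric $\mathrm{G}^{McN}$ defined by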
GMcN := N(jcN )∗GF R,(68)
where
N
G
F R
is the FisherRao metric tensor on ∆
+
nN
(this either follows from standard arguments
in classical information geometry, or by proposition 1below applied to the case where
A
=
Cn
).
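Proposition 1. With the notations introduced above, we have
$$\mathrm{G}^{MN} = N\,\mathrm{G}^{M}. \tag{69}$$
Proof. We start by proving that, if $v_m\in T_mM$ is such that
$$T_m\mathrm{j}(v_m) = \mathbb{Y}_{\mathbf{a}}(\rho_m), \tag{70}$$
then
$$T_m\mathrm{j}^{N}(v_m) = \mathbb{Y}^{N}_{\mathbf{a}^N}(\rho_m^{\otimes N}), \tag{71}$$
where $\mathbb{Y}^{N}_{\mathbf{a}^N}$ is the gradient vector field on $\mathcal{O}^{N}$ associated with
$$\mathbf{a}^N = \mathbf{a}\otimes\mathbb{I}\otimes\cdots\otimes\mathbb{I} + \mathbb{I}\otimes\mathbf{a}\otimes\mathbb{I}\otimes\cdots\otimes\mathbb{I} + \cdots + \mathbb{I}\otimes\cdots\otimes\mathbb{I}\otimes\mathbf{a}. \tag{72}$$
Recall that simple elements of the form $\mathbf{b}_1\otimes\cdots\otimes\mathbf{b}_N$ generate $\mathcal{A}^{\otimes N}$, and thus, to prove equation (71), it is sufficient to compute
$$\langle \mathrm{d}f_{\mathbf{b}_1\otimes\cdots\otimes\mathbf{b}_N}(\rho_m^{\otimes N}), T_m\mathrm{j}^{N}(v_m)\rangle = \langle \mathrm{d}(\mathrm{j}^{N})^{*}f_{\mathbf{b}_1\otimes\cdots\otimes\mathbf{b}_N}(m), v_m\rangle. \tag{73}$$
Denoting by $m_t$ a smooth curve in $M$ starting at $m$ with initial tangent vector $v_m$, we have
$$\langle \mathrm{d}(\mathrm{j}^{N})^{*}f_{\mathbf{b}_1\otimes\cdots\otimes\mathbf{b}_N}(m), v_m\rangle = \frac{\mathrm{d}}{\mathrm{d}t}\left(\rho_{m_t}^{\otimes N}(\mathbf{b}_1\otimes\cdots\otimes\mathbf{b}_N)\right)_{t=0} = \frac{\mathrm{d}}{\mathrm{d}t}\left(\rho_{m_t}(\mathbf{b}_1)\cdots\rho_{m_t}(\mathbf{b}_N)\right)_{t=0}, \tag{74}$$
from which equation (71) follows by applying the Leibniz rule and recalling that $T_m\mathrm{j}(v_m)=\mathbb{Y}_{\mathbf{a}}(\rho_m)$.
We now take $v_m, w_m\in T_mM$ such that
$$T_m\mathrm{j}^{N}(v_m) = \mathbb{Y}^{N}_{\mathbf{a}^N}(\rho_m^{\otimes N}), \qquad T_m\mathrm{j}^{N}(w_m) = \mathbb{Y}^{N}_{\mathbf{b}^N}(\rho_m^{\otimes N}), \tag{75}$$
with $\mathbf{a}^N$ and $\mathbf{b}^N$ as in equation (72). Recalling that $\mathrm{G}^{MN}=(\mathrm{j}^{N})^{*}\mathrm{G}^{N}$, and noting that
$$\mathrm{G}^{N}_{\rho_m^{\otimes N}}\left(\mathbb{Y}^{N}_{\mathbf{a}^N}(\rho_m^{\otimes N}),\mathbb{Y}^{N}_{\mathbf{b}^N}(\rho_m^{\otimes N})\right) = \rho_m^{\otimes N}(\{\mathbf{a}^N,\mathbf{b}^N\}) - \rho_m^{\otimes N}(\mathbf{a}^N)\,\rho_m^{\otimes N}(\mathbf{b}^N) \tag{76}$$
because of equation (16), we have
$$\begin{aligned}
\mathrm{G}^{MN}_m(v_m,w_m) &= \mathrm{G}^{N}_{\rho_m^{\otimes N}}\left(\mathbb{Y}^{N}_{\mathbf{a}^N}(\rho_m^{\otimes N}),\mathbb{Y}^{N}_{\mathbf{b}^N}(\rho_m^{\otimes N})\right)\\
&= \rho_m^{\otimes N}(\{\mathbf{a}^N,\mathbf{b}^N\}) - \rho_m^{\otimes N}(\mathbf{a}^N)\,\rho_m^{\otimes N}(\mathbf{b}^N)\\
&= \left(N\rho_m(\{\mathbf{a},\mathbf{b}\}) + N(N-1)\,\rho_m(\mathbf{a})\rho_m(\mathbf{b})\right) - N^2\rho_m(\mathbf{a})\rho_m(\mathbf{b})\\
&= N\left(\rho_m(\{\mathbf{a},\mathbf{b}\}) - \rho_m(\mathbf{a})\rho_m(\mathbf{b})\right)\\
&= N\,\mathrm{G}^{M}_m(v_m,w_m)
\end{aligned} \tag{77}$$
as desired.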
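In the commutative case $\mathcal{A}=\mathcal{C}_2$, proposition 1 reduces to the familiar additivity of the Fisher information over independent copies, which can be verified by brute force. The sketch below is a check in this special case only, not of the general statement; the one-parameter model $p(\theta)=(\theta,1-\theta)$ on a two-point outcome space and the function name `fisher_product` are illustrative assumptions.

```python
from itertools import product


def fisher_product(theta, n_copies):
    """Fisher information of n_copies independent draws from (theta, 1 - theta).

    The sum runs over all bit strings of length n_copies; a bit equal to 1
    has probability theta. The score is the derivative of the log-probability
    of the string with respect to theta.
    """
    total = 0.0
    for bits in product((0, 1), repeat=n_copies):
        ones = sum(bits)
        zeros = n_copies - ones
        prob = theta ** ones * (1.0 - theta) ** zeros
        score = ones / theta - zeros / (1.0 - theta)
        total += prob * score ** 2
    return total
```

For instance, `fisher_product(0.3, 4)` agrees with `4 * fisher_product(0.3, 1)` up to rounding, in accordance with equation (69).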
5 The problem of estimation theory
The purpose of estimation theory is to manipulate the outcomes of experiments in such a way as to obtain an estimate of the “true state” on which the experiment has been performed. This is done by means of a map $\mathcal{E}\colon\mathcal{X}_n\to M$ called an estimator. In the following, we will always consider nonconstant estimators.
Clearly, we need to come up with a way of establishing optimality for estimators. For this purpose, we introduce a smooth cost function $C\colon M\times M\to\mathbb{R}$ which is nonnegative and vanishes only on the diagonal. The choice of the cost function is essentially left to the ingenuity of the theoretician, and it is difficult to outline a general selection methodology. However, in some cases, the choice of the cost function is suggested by the context.
Starting with a cost function $C$, and writing $\mathcal{E}_j\equiv\mathcal{E}(x_j)$ for the value of the estimator at the $j$-th element of the outcome space $\mathcal{X}_n$, we introduce the function $L\colon M\times M\to\mathbb{R}$ given by
$$L(m_1,m_2) := \sum_{j=1}^{n}C(m_1,\mathcal{E}_j)\,p_j(m_2) = \sum_{j=1}^{n}C(m_1,\mathcal{E}_j)\,\rho_{m_2}(\mathbf{m}_j), \tag{78}$$
where $(p_1(m_2),\dots,p_n(m_2)) = \mathrm{j}^{c}(m_2) = \mathrm{m}^{\star}(\rho_{m_2})$, and $\mathrm{m}$ is the measurement procedure “generating” the statistical model $(M,\mathrm{j}^{c},\Delta^{+}_n)$ associated with the parametric model $(M,\mathrm{j},\mathcal{O})$ of states on $\mathcal{A}$ under investigation. It is clear from equation (78) that if the cost function $C$ is constant, then $L$ does not actually depend on $m_2$, and the problem of estimation theory as developed below loses its meaning.
The function $L$ may be seen as the expectation value of the real-valued, $M$-parametric random variable $C(m_1,\mathcal{E}(\cdot))$ on $\mathcal{X}_n$ with respect to the $M$-parametric probability distribution $\mathrm{m}^{\star}(\rho_{m_2})$ on $\mathcal{X}_n$. Therefore, $L$ measures how well centered the probability distribution generated by $C(m_1,\mathcal{E}(\cdot))$ is.
Let $m_\star\in M$ and denote by $L_\star$ the function
$$L_\star(m) := L(m,m_\star). \tag{79}$$
The estimator $\mathcal{E}$ is called stationary for the cost function $C$ at $m_\star$ if $L_\star$ has an extremum at $m=m_\star$, that is, if
$$(VL_\star)(m_\star) = 0 \tag{80}$$
for all vector fields $V$ on $M$. The estimator $\mathcal{E}$ is called unbiased for the cost function $C$ at $m_\star\in M$ if the function $L_\star$ has a minimum at $m=m_\star$, and it is called locally unbiased for the cost function $C$ at $m_\star$ if $L_\star$ has a local minimum at $m=m_\star$. In general, for a given cost function $C$, unbiased estimators need not exist.
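A toy computation may help to see that stationarity is a pointwise property of the estimator. The following Python sketch evaluates the function $L$ of equation (78) for a hypothetical classical model on a two-point outcome space, with $M=(0,1)$, $p(m)=(m,1-m)$, estimator values $\mathcal{E}_1=0.9$, $\mathcal{E}_2=0.1$ and cost $C(m_1,m_2)=\frac{1}{2}(m_1-m_2)^2$; none of these choices comes from the text. The derivative in equation (80) is approximated by a central finite difference.

```python
ESTIMATES = (0.9, 0.1)  # hypothetical values E_1, E_2 of the estimator


def probs(m):
    """Hypothetical statistical model p(m) = (m, 1 - m) on two outcomes."""
    return (m, 1.0 - m)


def cost(m1, m2):
    """Quadratic cost function, vanishing only on the diagonal."""
    return 0.5 * (m1 - m2) ** 2


def risk(m1, m2):
    """The function L(m1, m2) of equation (78)."""
    return sum(cost(m1, e) * p for e, p in zip(ESTIMATES, probs(m2)))


def stationarity_defect(m_star, h=1e-5):
    """Central-difference estimate of d/dm L(m, m_star) at m = m_star."""
    return (risk(m_star + h, m_star) - risk(m_star - h, m_star)) / (2.0 * h)
```

In this example the defect equals $m_\star$ minus the expectation of the estimator at $m_\star$, that is, $0.2\,m_\star-0.1$, so the estimator is stationary at $m_\star=0.5$ only.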
Now, we may define an $M$-parametric self-adjoint element $\mathbf{M}$ in $\mathcal{A}$ by setting
$$\mathbf{M}_{m_1} := \sum_{j=1}^{n}C(m_1,\mathcal{E}_j)\,\mathbf{m}_j. \tag{81}$$
This element clearly depends also on the estimator $\mathcal{E}$ and on the measurement procedure $\mathrm{m}$. Moreover, it allows us to write the function $L$ as the expectation value of $\mathbf{M}_{m_1}$ with respect to the state $\rho_{m_2}$ according to
$$L(m_1,m_2) = \rho_{m_2}(\mathbf{M}_{m_1}). \tag{82}$$
The estimation problem may be approached from two different perspectives of increasing difficulty:
• the regular measurement procedure $\mathrm{m}$ is fixed, and the unknown of the problem is the estimator $\mathcal{E}$;
• both the regular measurement procedure $\mathrm{m}$ and the estimator $\mathcal{E}$ are considered unknown.
Clearly, the first case reduces to the classical problem of estimation, and may be faced relying on well-known methods like the maximum likelihood estimator. The limit on the precision is then governed by the Cramer-Rao bound (see section 6). The second case is definitely more difficult to address because the freedom in the choice of the regular measurement procedure adds another layer of complexity. However, in this case, the precision is governed by the Helstrom bound (see section 7), which allows for a sharpening of the Cramer-Rao bound. Indeed, the freedom in choosing the measurement procedure is reflected in the possibility of considering different “classical scenarios”, and choosing the one with the lowest Cramer-Rao bound.
Unfortunately, for both forms of the problem, there is no algorithm to solve the problem in full generality, and a case-by-case analysis is mandatory.
Remark 2 (Stationary estimators for Euclidean cost function). Suppose that $M$ is explicitly realized as an $n$-dimensional submanifold of $\mathbb{R}^N$ for some positive $N\in\mathbb{N}$ with $n\leq N$. In this context, a common choice in parameter estimation theory is to consider the cost function $C$ given by (one half of) the squared Euclidean distance on $\mathbb{R}^N\times\mathbb{R}^N$ restricted to $M\times M$. Specifically, we have
$$C(m_1,m_2) := \frac{1}{2}\,\|m_1-m_2\|^2, \tag{83}$$
so that the function $L$ reads
$$L(m_1,m_2) := \frac{1}{2}\sum_{j=1}^{n}\|m_1-\mathcal{E}_j\|^2\,p_j(m_2). \tag{84}$$
This type of cost function is called a Euclidean cost function for obvious reasons. Clearly, the Euclidean cost function $C$ depends on the actual realization of the (a priori abstract) manifold $M$ into a suitable $\mathbb{R}^N$. In particular, because of Whitney’s embedding theorem, given a parameter manifold $M$ we can always build a Euclidean cost function. Of course, the actual usefulness of such a cost function is in principle not clear and should be investigated case by case. However, it often happens in concrete models that the parameter manifold $M$ is “naturally” immersed in some given $\mathbb{R}^N$ by construction, and thus the Euclidean cost function unavoidably presents itself from the start.
If
{θ1, ..., θn}
is a local system of coordinates on
M
, it is easy to see that being stationary at
m?is equivalent to (see equation (80))
mk
?(θ) = Em?(θ)[Ek]∀k= 1, ..., N and ∀r= 1, ..., n, (85)
where
mk
1
is the smooth function on
M
obtained by composing the canonical immersion of
M
in
RN
with the canonical projection on the
k
th factor,
Ek
is the realvalued random variable on
X
obtained by composing
E
with the canonical immersion of
M
in
RN
and with the canonical
projection on the
k
th factor, and where E
m?
[
·
]denote the expectation value with respect to the
Mparametric probability distribution pm?.
Since
C >
0for all (
m1, m2
)
∈M×M
unless
m1
=
m2
, in which case it vanishes, we see
that a stationary estimator at m∈Mis also locally unbiased at m∈M.
When
M
is an open subset of
RN
and
{θ1, ..., θN}
is a system of Cartesian coordinates, and
when equation
(85)
holds for all
m∈M
, we recover the standard deﬁnition of an unbiased
estimator used in classical and quantum estimation theory [6, ch. 4].
6 The Cramer-Rao bound
Here, we recall Hendriks’ derivation of the Cramer-Rao bound for estimators with values in a manifold [67] when the underlying outcome space is discrete and finite. This gives a clear geometric picture of the Cramer-Rao bound which does not depend on the existence of a privileged coordinatization of the parameter space $M$, as is the case in most of the existing literature (see for instance [6, ch. 4], where it is clearly stated that the notion of unbiased estimator developed there is coordinate-dependent, as well as [46, 67, 84]).
Let $(M,\mathrm{j},\Delta^{+}_n)$ be a parametric statistical model (see remark 1). Recall that the metric $\mathrm{G}^M$ determined by equation (21) coincides with the Fisher-Rao tensor on $M$ as determined by standard methods of information geometry [5, 4, 6]. We assume that $\mathrm{G}^M$ is invertible.
In order to obtain the generalized Cramer-Rao bound for a stationary estimator, we need to exploit the geometrical properties of the product structure of the manifold $M\times M$. We will now recall these geometrical properties following [32, sec. 2], to which we refer for the explicit proofs.
First of all, we note that there are two projections $\pi_l$ and $\pi_r$ from $M\times M$ to $M$ given by
$$\pi_l(m_1,m_2) := m_1, \qquad \pi_r(m_1,m_2) := m_2, \tag{86}$$
and there is also the diagonal immersion $i_d$ of $M$ into $M\times M$ given by
$$i_d(m) := (m,m). \tag{87}$$
Given a vector field $X$ on $M$, we may define its left and right lift to be the vector fields $X_l$ and $X_r$ on $M\times M$ characterized by
$$X_l(\pi_l^{*}f) = \pi_l^{*}(X(f)), \qquad X_r(\pi_r^{*}f) = \pi_r^{*}(X(f)) \tag{88}$$
for every smooth function $f$ on $M$. It is possible to prove that every vector field $X$ on $M$ is $i_d$-related with the vector field $X_l+X_r$ on $M\times M$ [32, sec. 2].
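If $\mathcal{E}$ is a stationary estimator at $m_\star\in M$, then $L_\star$ has an extremum at $m_\star$, and this is equivalent to
$$\left(i_d^{*}(X_lL)\right)(m_\star) = 0 \tag{89}$$
for the left lift $X_l$ of every vector field $X$ on $M$. We assume that $\mathcal{E}$ is a stationary estimator for all $m_\star\in M$.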
This means that the function L = i
d(XlL)
identically vanishes. Consequently, given an arbitrary
vector ﬁeld Yon M, we also have
0 = Y(i∗
dL) = Y(i∗
d(XlL)) = i∗
d((YlXl+YrXl)L),(90)
which means
i∗
d(YlXlL) = −i∗
d(YrXlL).(91)
Since $\mathcal{E}$ is stationary at every $m_\star$, it follows that the Hessian form $\mathrm{H}_\star$ of $L_\star$ at $m_\star$ is well defined and we have
$$\mathrm{H}_\star(X(m_\star),Y(m_\star)) := (YXL_\star)(m_\star). \tag{92}$$
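A moment of reflection shows that
$$(YXL_\star)(m_\star) = \left(i_d^{*}(Y_lX_lL)\right)(m_\star), \tag{93}$$
so that
$$\mathrm{H}_\star(X(m_\star),Y(m_\star)) = -\left(i_d^{*}(Y_rX_lL)\right)(m_\star) \tag{94}$$
because of equation (91). Set
$$C_{\mathcal{E}_j}(m) := C(m,\mathcal{E}_j) \tag{95}$$
so that we have
$$L(m_1,m_2) := \sum_{j=1}^{n}C_{\mathcal{E}_j}(m_1)\,p_j(m_2) \tag{96}$$
and we obtain
$$\mathrm{H}_\star(X(m_\star),Y(m_\star)) = -\sum_{j=1}^{n}\left(X_lC_{\mathcal{E}_j}\right)(m_\star)\,\left(Y_rp_j\right)(m_\star). \tag{97}$$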
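Introducing the real-valued random variables on the probability space $(\mathcal{X}_n,p(m_\star))$ given by
$$F^{\star}_X(x_j) := \left(X_lC_{\mathcal{E}_j}\right)(m_\star), \qquad G^{\star}_Y(x_j) := \left(Y_r\ln(p_j)\right)(m_\star), \tag{98}$$
we can rewrite the right-hand side of equation (97) as
$$\mathrm{H}_\star(X(m_\star),Y(m_\star)) = -\mathbb{E}_\star[F^{\star}_XG^{\star}_Y], \tag{99}$$
where $\mathbb{E}_\star[\cdot]$ denotes the expectation value with respect to the probability measure $p(m_\star)$. The expression
$$\langle F,G\rangle_\star := \mathbb{E}_\star[FG] \tag{100}$$
is an inner product on the space of random variables on the probability space $(\mathcal{X}_n,p(m_\star))$, and the Cauchy-Schwarz inequality may be applied to obtain
$$\left(\mathrm{H}_\star(X(m_\star),Y(m_\star))\right)^2 \leq \mathbb{E}_\star[F^{\star}_XF^{\star}_X]\;\mathbb{E}_\star[G^{\star}_YG^{\star}_Y]. \tag{101}$$
Then, a direct computation shows that
$$\mathbb{E}_\star[G^{\star}_YG^{\star}_Y] = \sum_{j=1}^{n}\left(Y_r\ln(p_j)\right)(m_\star)\,\left(Y_r\ln(p_j)\right)(m_\star)\,p_j(m_\star) = \mathrm{G}^M(Y(m_\star),Y(m_\star)). \tag{102}$$
Next, we introduce the expression
$$\mathscr{C}(X(m_\star),Y(m_\star)) := \mathbb{E}_\star[F^{\star}_XF^{\star}_Y], \tag{103}$$
which according to (98) implicitly contains the cost function $C$, so that we can write equation (101) as
$$\left(\mathrm{H}_\star(X(m_\star),Y(m_\star))\right)^2 \leq \mathscr{C}(X(m_\star),X(m_\star))\;\mathrm{G}^M(Y(m_\star),Y(m_\star)). \tag{104}$$
Clearly, $\mathscr{C}$ depends on the cost function $C$ and the estimator $\mathcal{E}$.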
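Now, fix $X_{m_\star}\in T_{m_\star}M$, and define the function $\mathrm{H}\colon T_{m_\star}M\to\mathbb{R}$ given by
$$Y(m_\star)\equiv Y_{m_\star}\mapsto\mathrm{H}(Y_{m_\star}) := \mathrm{H}_\star(X_{m_\star},Y_{m_\star}). \tag{105}$$
This function admits a maximum on the unit sphere determined by the Fisher-Rao metric. Indeed, the Fisher-Rao unit sphere in $T_{m_\star}M$ is compact because the Fisher-Rao metric is a Riemannian metric (positive). Let $Y^0_{m_\star}$ be a point at which $\mathrm{H}$ is maximum. Then, we may always find a real number $\lambda$ such that
$$\mathrm{H}(Y_{m_\star}) = \lambda\,\mathrm{G}^M(Y^0_{m_\star},Y_{m_\star}), \tag{106}$$
so that
$$\mathrm{H}(Y^0_{m_\star}) = \lambda\,\mathrm{G}^M(Y^0_{m_\star},Y^0_{m_\star}) = \lambda \tag{107}$$
because $Y^0_{m_\star}$ lies on the Fisher-Rao unit sphere.
With an evident abuse of notation, we denote by $\mathrm{H}_\star(X_{m_\star})$ the covector in $T^{\star}_{m_\star}M$ acting as
$$\langle\mathrm{H}_\star(X_{m_\star}),Z_{m_\star}\rangle := \mathrm{H}_\star(Z_{m_\star},X_{m_\star}) \quad \forall\, Z_{m_\star}\in T_{m_\star}M, \tag{108}$$
and by $\mathrm{G}^M(Y^0_{m_\star})$ the covector in $T^{\star}_{m_\star}M$ given by
$$\langle\mathrm{G}^M(Y^0_{m_\star}),Z_{m_\star}\rangle := \mathrm{G}^M(Y^0_{m_\star},Z_{m_\star}) \quad \forall\, Z_{m_\star}\in T_{m_\star}M. \tag{109}$$
Then, comparing equation (105) with equation (106), equations (108) and (109) allow us to conclude that
$$\mathrm{H}_\star(X_{m_\star}) = \mathrm{G}^M(\lambda\,Y^0_{m_\star}), \tag{110}$$
which, assuming $\mathrm{G}^M$ to be invertible, is equivalent to
$$(\mathrm{G}^M)^{-1}\left(\mathrm{H}_\star(X_{m_\star}),\alpha_{m_\star}\right) = \langle\alpha_{m_\star},\lambda\,Y^0_{m_\star}\rangle \tag{111}$$
for all covectors $\alpha_{m_\star}\in T^{\star}_{m_\star}M$. In particular, setting $\alpha_{m_\star}=\mathrm{H}_\star(X_{m_\star})$ we get
$$(\mathrm{G}^M)^{-1}\left(\mathrm{H}_\star(X_{m_\star}),\mathrm{H}_\star(X_{m_\star})\right) = \langle\mathrm{H}_\star(X_{m_\star}),\lambda\,Y^0_{m_\star}\rangle = \lambda\,\mathrm{H}(Y^0_{m_\star}) \tag{112}$$
because of equations (108) and (105). Now, equation (105) together with equation (107) and equation (112) imply that
$$\left(\mathrm{H}_\star(Y^0_{m_\star},X_{m_\star})\right)^2 = \left(\mathrm{H}(Y^0_{m_\star})\right)^2 = \lambda\,\mathrm{H}(Y^0_{m_\star}) = (\mathrm{G}^M)^{-1}\left(\mathrm{H}_\star(X_{m_\star}),\mathrm{H}_\star(X_{m_\star})\right). \tag{113}$$
Eventually, recalling that $Y^0_{m_\star}$ lies on the Fisher-Rao unit sphere, equations (104) and (113) lead us to the generalized Cramer-Rao bound
$$\mathscr{C}(X_{m_\star},X_{m_\star}) \geq (\mathrm{G}^M)^{-1}\left(\mathrm{H}_\star(X_{m_\star}),\mathrm{H}_\star(X_{m_\star})\right). \tag{114}$$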
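If the Hessian form of $L_\star$ at $m_\star$ is invertible, we define the covariance bivector $\mathrm{Cov}$ as
$$\mathrm{Cov}(\xi_{m_\star},\eta_{m_\star}) := \mathscr{C}\left(\mathrm{H}_\star^{-1}(\xi_{m_\star}),\mathrm{H}_\star^{-1}(\eta_{m_\star})\right), \tag{115}$$
where $\xi_{m_\star},\eta_{m_\star}\in T^{\star}_{m_\star}M$. We may then rewrite the generalized Cramer-Rao bound in terms of covectors. We have thus proved the following:
Proposition 2. Let $(M,\mathrm{j},\Delta^{+}_n)$ be a parametric statistical model for which $\mathrm{G}^M$ is invertible. Let $C$ be a cost function and let $\mathcal{E}$ be a stationary estimator for $C$ at $m_\star$. If the Hessian form of $L_\star$ at $m_\star$ is invertible, then we have the generalized Cramer-Rao bound
$$\mathrm{Cov}(\xi_{m_\star},\xi_{m_\star}) \geq (\mathrm{G}^M)^{-1}(\xi_{m_\star},\xi_{m_\star}) \tag{116}$$
for all $\xi_{m_\star}\in T^{\star}_{m_\star}M$.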
A stationary estimator $\mathcal{E}$ which saturates the Cramer-Rao bound for every $v_m$ is called efficient. The Cramer-Rao bound is related to the cost function $C$ and to the estimator $\mathcal{E}$; however, it is expressed in terms of the (inverse of the) Fisher-Rao metric tensor on $M$, which is a geometrical object on $M$ that is completely independent of the cost function and the estimator. Note, however, that the expression (115) is invariant under rescaling of the cost function $C$, because the expression $\mathscr{C}$ given by (103) contains such a scaling factor quadratically, and this is cancelled because the inverse of the Hessian enters quadratically into (115).
Remark 3 (The Cramer-Rao bound for Euclidean cost functions). The “standard form” of the Cramer-Rao inequality used in classical information geometry is obtained when $M$ and the cost function $C$ are as in remark 2. In this case, a direct computation shows that, in local coordinates around $m_\star$, the components of the Hessian form of $L_\star$ at every stationary point are given by
$$(\mathrm{H}_\star)_{jk} = \delta_{rs}\left(\frac{\partial m^r}{\partial\theta^j}\frac{\partial m^s}{\partial\theta^k}\right)(m_\star). \tag{117}$$
Assuming that $M$ is open in the ambient manifold $\mathbb{R}^N$, and taking $\{\theta^1,\dots,\theta^N\}$ to be the Cartesian coordinates associated with the canonical projections of $\mathbb{R}^N$ on $\mathbb{R}$, we immediately see that
$$(\mathrm{H}_\star)_{jk} = \delta_{jk}. \tag{118}$$
Therefore, writing
$$(\mathrm{Cov}(m_\star))^{jk} \equiv \mathrm{Cov}\left(\mathrm{d}\theta^j(m_\star),\mathrm{d}\theta^k(m_\star)\right), \tag{119}$$
a direct computation shows that the covariance matrix $(\mathrm{Cov}(m_\star))^{jk}$ at the point $m_\star$ for which $\mathcal{E}$ is a stationary estimator reads
$$(\mathrm{Cov}(m_\star))^{jk} = \mathbb{E}_{p_\star}\left[\left(\mathcal{E}^j-\mathbb{E}_{p_\star}[\mathcal{E}^j]\right)\left(\mathcal{E}^k-\mathbb{E}_{p_\star}[\mathcal{E}^k]\right)\right], \tag{120}$$
which is essentially the form usually found in standard textbooks on estimation theory in statistics. The “standard form” of the Cramer-Rao bound follows immediately.
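The standard form of the bound can be illustrated on the textbook binomial model. In the Python sketch below, the outcome space consists of the possible numbers $k=0,\dots,n$ of successes in $n$ Bernoulli trials with success probability $\theta$, and the estimator is the sample mean $k/n$; this model is our illustrative assumption and is not taken from the text (we also ignore the fact that the extreme values $0$ and $1$ of the estimator lie on the boundary of the open interval $(0,1)$). With the Cartesian coordinate $\theta$ the Hessian is the identity by equation (118), so the bound compares the variance of the estimator with the inverse of the Fisher information.

```python
import math


def binomial_probs(theta, n_trials):
    """Probabilities of k = 0, ..., n_trials successes."""
    return [math.comb(n_trials, k) * theta ** k * (1.0 - theta) ** (n_trials - k)
            for k in range(n_trials + 1)]


def fisher_information(theta, n_trials):
    """Sum over outcomes of p_k times the squared score d/dtheta log p_k."""
    total = 0.0
    for k, p in enumerate(binomial_probs(theta, n_trials)):
        score = k / theta - (n_trials - k) / (1.0 - theta)
        total += p * score ** 2
    return total


def estimator_moments(theta, n_trials):
    """Mean and variance of the sample-mean estimator k / n_trials."""
    probs = binomial_probs(theta, n_trials)
    values = [k / n_trials for k in range(n_trials + 1)]
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, var
```

The mean returned by `estimator_moments` equals $\theta$ for every $\theta$, so the estimator is unbiased in the standard sense, and its variance multiplied by `fisher_information(theta, n_trials)` equals $1$ up to rounding: the sample mean saturates the bound in this model.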
7 The Helstrom bound
The Cramer-Rao bound found in section 6 applies to parametric statistical models. As such, it depends only on the Fisher-Rao metric on $M$ which, in turn, depends on the properties of the Abelian algebra underlying the parametric statistical model. Accordingly, if $(M,\mathrm{j}^{c},\Delta^{+}_n)$ is the parametric statistical model associated with a parametric model of states $(M,\mathrm{j},\mathcal{O})$ on the possibly noncommutative $C^{\star}$-algebra $\mathcal{A}$, the Cramer-Rao bound for $(M,\mathrm{j}^{c},\Delta^{+}_n)$ “does not feel” the possible noncommutativity of the algebra $\mathcal{A}$. However, it is possible to formulate a bound which “feels” the possible noncommutativity of $\mathcal{A}$, and this bound is related with the metric tensor $\mathrm{G}^M$ and its relation with $\mathrm{G}^{Mc}$. This bound is essentially the $C^{\star}$-algebraic formulation of the Helstrom bound used in quantum information theory, and the content of the following proposition will be the key point to formulate the Helstrom bound in the $C^{\star}$-algebraic framework.
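Proposition 3. Let $(M,\mathrm{j},\mathcal{O})$ be a parametric model of states on the finite-dimensional $C^{\star}$-algebra $\mathcal{A}$, and let $\mathrm{G}^M$ be the symmetric covariant tensor on $M$ defined by equation (21). Let $(M,\mathrm{j}^{c},\Delta^{+}_n)$ be a parametric statistical model associated with $(M,\mathrm{j},\mathcal{O})$, and let $\mathrm{G}^{Mc}$ be the symmetric covariant tensor on $M$ defined by equation (62). Then, we have
$$\mathrm{G}^M_m(v_m,v_m) \geq \mathrm{G}^{Mc}_m(v_m,v_m) \tag{121}$$
for every $m\in M$ and every $v_m\in T_mM$.
Proof. According to the definition of the SLD given in equation (23), given an arbitrary tangent vector $v_m\in T_mM$, there is a gradient vector field $\mathbb{Y}_{\mathbf{a}}$ on $\mathcal{O}$ such that
$$T_m\mathrm{j}(v_m) = \mathbb{Y}_{\mathbf{a}}(\rho_m). \tag{122}$$
Consequently, we have (recalling (16))
$$\mathrm{G}^M_m(v_m,v_m) = \mathrm{G}_{\rho_m}\left(\mathbb{Y}_{\mathbf{a}}(\rho_m),\mathbb{Y}_{\mathbf{a}}(\rho_m)\right) = \rho_m(\mathbf{a}^2) - \left(\rho_m(\mathbf{a})\right)^2. \tag{123}$$
On the other hand, by definition, we have
$$\mathrm{G}^{Mc} = (\mathrm{j}^{c})^{*}\mathrm{G}_{FR} = (\mathrm{m}^{\star}\circ\mathrm{j})^{*}\mathrm{G}_{FR} = \mathrm{j}^{*}\left((\mathrm{m}^{\star})^{*}\mathrm{G}_{FR}\right), \tag{124}$$
which means
$$\mathrm{G}^{Mc}_m(v_m,v_m) = \left((\mathrm{m}^{\star})^{*}\mathrm{G}_{FR}\right)_{\rho_m}\left(\mathbb{Y}_{\mathbf{a}}(\rho_m),\mathbb{Y}_{\mathbf{a}}(\rho_m)\right), \tag{125}$$
and thus we have to prove that
$$\left((\mathrm{m}^{\star})^{*}\mathrm{G}_{FR}\right)_{\rho_m}\left(\mathbb{Y}_{\mathbf{a}}(\rho_m),\mathbb{Y}_{\mathbf{a}}(\rho_m)\right) \leq \rho_m(\mathbf{a}^2) - \left(\rho_m(\mathbf{a})\right)^2 \tag{126}$$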
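to prove the proposition.
We note that, for any fixed $\rho \in \mathcal{O}$ and any nonzero gradient tangent vector $\mathbb{Y}_{\mathbf{a}}(\rho)$, there is an element $\mathbf{a}_c \in \mathscr{C}_n \equiv \mathscr{C}(\mathcal{X}_n)$ and a gradient tangent vector $\mathbb{Y}_{\mathbf{a}_c}(\mathrm{m}^{\star}(\rho))$ at $\mathrm{m}^{\star}(\rho) \in \mathcal{O}_c$ such that
$$T_{\rho}\mathrm{m}^{\star}(\mathbb{Y}_{\mathbf{a}}(\rho)) = \mathbb{Y}_{\mathbf{a}_c}(\mathrm{m}^{\star}(\rho)), \tag{127}$$
and a direct computation shows that $\mathbf{a}_c$ is characterized by the property
$$\rho(\{\mathbf{a}, \mathrm{m}(\mathbf{b}_c)\}) - \rho(\mathbf{a})\,\rho(\mathrm{m}(\mathbf{b}_c)) = \rho(\mathrm{m}(\mathbf{a}_c \mathbf{b}_c)) - \rho(\mathrm{m}(\mathbf{a}_c))\,\rho(\mathrm{m}(\mathbf{b}_c)) \tag{128}$$
for all $\mathbf{b}_c \in \mathscr{C}_n$. Therefore, we have
$$\begin{aligned} ((\mathrm{m}^{\star})^{*}\mathrm{G}_{FR})_{\rho}(\mathbb{Y}_{\mathbf{a}}(\rho), \mathbb{Y}_{\mathbf{a}}(\rho)) &= (\mathrm{G}_{FR})_{\mathrm{m}^{\star}(\rho)}(T_{\rho}\mathrm{m}^{\star}(\mathbb{Y}_{\mathbf{a}}(\rho)), T_{\rho}\mathrm{m}^{\star}(\mathbb{Y}_{\mathbf{a}}(\rho))) \\ &= (\mathrm{G}_{FR})_{\mathrm{m}^{\star}(\rho)}(\mathbb{Y}_{\mathbf{a}_c}(\mathrm{m}^{\star}(\rho)), \mathbb{Y}_{\mathbf{a}_c}(\mathrm{m}^{\star}(\rho))) \\ &= \rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2. \end{aligned} \tag{129}$$
Recalling equation (126), we see that if the inequality
$$\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2 \leq \rho(\mathbf{a}^2) - (\rho(\mathbf{a}))^2 \tag{130}$$
holds for all $\rho$, $\mathbf{a}$, $\mathbf{a}_c$ and $\mathrm{m}$ satisfying equation (127), then the proposition is proved.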
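Next, by means of equation (128) with $\mathbf{b}_c = \mathbf{a}_c$, we write
$$\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2 = \rho(\{\mathbf{a}, \mathrm{m}(\mathbf{a}_c)\}) - \rho(\mathbf{a})\,\rho(\mathrm{m}(\mathbf{a}_c)), \tag{131}$$
and since $\rho(\{\cdot,\cdot\}) - \rho(\cdot)\rho(\cdot)$ is a positive semidefinite symmetric bilinear form on the space of self-adjoint elements of $\mathscr{A}$ (it vanishes on multiples of the identity, so it is an inner product only modulo these elements), we may apply the Cauchy-Schwarz inequality to obtain
$$\left(\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2\right)^2 \leq \left(\rho(\mathbf{a}^2) - (\rho(\mathbf{a}))^2\right)\left(\rho(\mathrm{m}(\mathbf{a}_c)\,\mathrm{m}(\mathbf{a}_c)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2\right). \tag{132}$$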
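Now, $\mathrm{m}$ is a positive unital map, and thus it satisfies Kadison's inequality
$$\mathrm{m}(\mathbf{a}_c^2) \geq \mathrm{m}(\mathbf{a}_c)\,\mathrm{m}(\mathbf{a}_c), \tag{133}$$
from which it follows that
$$\rho(\mathrm{m}(\mathbf{a}_c^2)) \geq \rho(\mathrm{m}(\mathbf{a}_c)\,\mathrm{m}(\mathbf{a}_c)). \tag{134}$$
Consequently, assuming that $\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2 \neq 0$ (otherwise inequality (130) holds trivially, because its right-hand side is nonnegative), we have
$$\frac{\rho(\mathrm{m}(\mathbf{a}_c)\,\mathrm{m}(\mathbf{a}_c)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2}{\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2} \leq 1 \tag{135}$$
and thus, dividing both sides of (132) by $\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2$ and using (135),
$$\rho(\mathrm{m}(\mathbf{a}_c^2)) - (\rho(\mathrm{m}(\mathbf{a}_c)))^2 \leq \rho(\mathbf{a}^2) - (\rho(\mathbf{a}))^2, \tag{136}$$
and the proposition is proved.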
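Kadison's inequality (133), used in the proof above, can be checked numerically for the positive unital map induced by a POVM, namely $\mathrm{m}(\mathbf{a}_c) = \sum_j a_j E_j$. The sketch below does this for a qubit. The choice of the three-outcome "trine" POVM, the restriction to real symmetric matrices, and all function names are assumptions made only for illustration.

```python
import math


def trine_povm():
    """Three qubit effects E_j = (1/3)(I + u_j . sigma) with u_j in the x-z plane."""
    effects = []
    for j in range(3):
        angle = 2.0 * math.pi * j / 3.0
        ux, uz = math.sin(angle), math.cos(angle)
        effects.append([[(1.0 + uz) / 3.0, ux / 3.0],
                        [ux / 3.0, (1.0 - uz) / 3.0]])
    return effects


def measurement_map(values, effects):
    """m(a_c) = sum_j a_j E_j for a real function a_c = (a_1, ..., a_n)."""
    return [[sum(a * e[r][c] for a, e in zip(values, effects)) for c in range(2)]
            for r in range(2)]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def kadison_gap(values, effects):
    """The matrix m(a_c^2) - m(a_c) m(a_c), which Kadison's inequality says is >= 0."""
    squared = measurement_map([a * a for a in values], effects)
    image = measurement_map(values, effects)
    product = matmul(image, image)
    return [[squared[r][c] - product[r][c] for c in range(2)] for r in range(2)]


def is_positive_semidefinite(h, tol=1e-12):
    """A real symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    trace = h[0][0] + h[1][1]
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return trace >= -tol and det >= -tol
```

When the effects are mutually orthogonal projectors the gap vanishes identically, which is the situation exploited in the corollary that follows the proof.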
From the proof of proposition 3, we easily obtain the following corollary.
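Corollary 1. Let $(M, j, \mathcal{O})$ be a parametric model of states on the $C^{\star}$-algebra $\mathscr{A}$. Suppose there is a unital, Abelian $C^{\star}$-subalgebra $\mathscr{C} \subseteq \mathscr{A}$ such that, for all $v_m \in T_m M$, the SLD $\mathbb{Y}_{\mathbf{a}}(\rho_m)$ of $v_m$ at $\rho_m = j(m)$ given by
$$T_m j(v_m) = \mathbb{Y}_{\mathbf{a}}(\rho_m) \tag{137}$$
is such that $\mathbf{a} \in \mathscr{C}$. Suppose also that the measurement procedure $\mathrm{m} := \mathrm{i}_{\mathscr{C}}$ given by the natural inclusion of $\mathscr{C}$ in $\mathscr{A}$ gives rise to a parametric statistical model $(M, j_c, \Delta_n^{+})$ associated with $(M, j, \mathcal{O})$. Then we have
$$\mathrm{G}^{M}_{m}(v_m, v_m) = \mathrm{G}^{M_c}_{m}(v_m, v_m). \tag{138}$$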
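The equality (138) can be illustrated on a toy qubit model in which the measurement is the projective measurement in the eigenbasis of the SLD. The sketch below rests on assumptions that are not part of the text: the model with Bloch vector $n(\theta) = R(\sin\theta, 0, \cos\theta)$, the value $R = 0.8$, the standard closed formula for the qubit SLD Fisher information, and the usual physics normalization of the Fisher information. For this model $n \cdot \partial_\theta n = 0$, so the SLD is $\partial_\theta n \cdot \sigma$ and its eigenbasis is the spin measurement along $\partial_\theta n$.

```python
import math

R = 0.8  # hypothetical Bloch-vector length; R < 1 keeps the state faithful


def model(theta):
    """Bloch vector n(theta) of the toy qubit model and its derivative dn/dtheta."""
    n = (R * math.sin(theta), 0.0, R * math.cos(theta))
    dn = (R * math.cos(theta), 0.0, -R * math.sin(theta))
    return n, dn


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sld_axis(theta):
    """Unit vector u such that the SLD of this model is proportional to u . sigma."""
    _, dn = model(theta)
    norm = math.sqrt(dot(dn, dn))
    return tuple(c / norm for c in dn)


def classical_fisher(theta, axis):
    """Fisher information of the two-outcome projective measurement along `axis`."""
    n, dn = model(theta)
    total = 0.0
    for sign in (1.0, -1.0):
        p = 0.5 * (1.0 + sign * dot(n, axis))
        dp = 0.5 * sign * dot(dn, axis)
        total += dp * dp / p
    return total


def quantum_fisher(theta):
    """SLD quantum Fisher information of a mixed qubit state (|n| < 1)."""
    n, dn = model(theta)
    return dot(dn, dn) + dot(n, dn) ** 2 / (1.0 - dot(n, n))
```

Note that the optimal axis returned by `sld_axis` depends on $\theta$, so the saturating measurement generally depends on the point of the model at which the bound is evaluated.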
Now, let $(M, j, \mathcal{O})$ be a parametric model of states on the $C^{\star}$-algebra $\mathscr{A}$, and let $(M, j_c, \Delta_n^{+})$ be a parametric statistical model associated with $(M, j, \mathcal{O})$. Assume $(M, j, \mathcal{O})$ and $(M, j_c, \Delta_n^{+})$ to be such that $\mathrm{G}^{M}$ and $\mathrm{G}^{M_c}$ are invertible. Let $C$ be a cost function and $\mathcal{E}$ an estimator as in section 5. Assume $\mathcal{E}$ is a stationary estimator at $m_{\star}$, and let $C_{E_j} : M \to \mathbb{R}$ be the smooth function given by $C_{E_j}(m) := C(m, E_j)$, where $E_j \equiv \mathcal{E}(x_j)$ with $x_j \in \mathcal{X}_n$.
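According to the results of section 6 (see equations (98), (103), and (114)), given $v_{m_{\star}}, w_{m_{\star}} \in T_{m_{\star}}M$, the bilinear form
$$\mathcal{C}(v_{m_{\star}}, w_{m_{\star}}) := \sum_{j=1}^{n} v_{m_{\star}}(C_{E_j})\, w_{m_{\star}}(C_{E_j})\, p_j(m_{\star}), \tag{139}$$
where $v_{m_{\star}}(C_{E_j})$ is the derivative of $C_{E_j}$ in the direction of $v_{m_{\star}}$ evaluated at $m_{\star} \in M$ (and similarly for $w_{m_{\star}}(C_{E_j})$), satisfies the Cramer-Rao bound given by
$$\mathcal{C}(v_{m_{\star}}, v_{m_{\star}}) \geq \left(\mathrm{G}^{M_c}_{m_{\star}}\right)^{-1}(\mathrm{H}_{m_{\star}}(v_{m_{\star}}), \mathrm{H}_{m_{\star}}(v_{m_{\star}})), \tag{140}$$
where $\mathrm{G}^{M_c}$ is the Fisher-Rao metric on $M$ seen as a parametric statistical model in $\Delta_n^{+}$, and $\mathrm{H}_{m_{\star}}$ is the Hessian form of the function $L_{m_{\star}} : M \to \mathbb{R}$ given by $L_{m_{\star}}(m_1) := L(m_1, m_{\star})$ at the point $m_1 = m_{\star}$ (see equation (78)).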
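Then, proposition 3 states that
$$\mathrm{G}^{M}_{m}(w_m, w_m) \geq \mathrm{G}^{M_c}_{m}(w_m, w_m) \tag{141}$$
for every $w_m \in T_m M$. Since both tensors are assumed to be invertible, taking inverses reverses the inequality, and we also obtain that
$$\left(\mathrm{G}^{M}_{m}\right)^{-1}(\alpha_m, \alpha_m) \leq \left(\mathrm{G}^{M_c}_{m}\right)^{-1}(\alpha_m, \alpha_m) \tag{142}$$
for every $\alpha_m \in T^{\star}_{m}M$ (see [16, Ex. 1.2.12]), and the Cramer-Rao bound in equation (116) allows us to state that
$$\mathcal{C}(v_{m_{\star}}, v_{m_{\star}}) \geq \left(\mathrm{G}^{M_c}_{m_{\star}}\right)^{-1}(\mathrm{H}_{m_{\star}}(v_{m_{\star}}), \mathrm{H}_{m_{\star}}(v_{m_{\star}})) \geq \left(\mathrm{G}^{M}_{m_{\star}}\right)^{-1}(\mathrm{H}_{m_{\star}}(v_{m_{\star}}), \mathrm{H}_{m_{\star}}(v_{m_{\star}})). \tag{143}$$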
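The passage from (141) to (142) uses the fact that inversion reverses the order of positive definite matrices. A minimal numerical sketch of this fact in a two-dimensional chart follows. The coordinate matrices `g_m` and `g_mc` are hypothetical stand-ins for $\mathrm{G}^{M}_{m}$ and $\mathrm{G}^{M_c}_{m}$, chosen only so that their difference is positive semidefinite.

```python
def inverse(g):
    """Inverse of an invertible 2x2 matrix."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def difference(x, y):
    """Entrywise difference x - y of two 2x2 matrices."""
    return [[x[r][c] - y[r][c] for c in range(2)] for r in range(2)]


def is_positive_semidefinite(h, tol=1e-12):
    """A real symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    trace = h[0][0] + h[1][1]
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return trace >= -tol and det >= -tol


# Hypothetical coordinate expressions with g_m >= g_mc, both positive definite.
g_mc = [[2.0, 0.5], [0.5, 1.0]]
g_m = [[3.0, 0.8], [0.8, 1.5]]

forward = is_positive_semidefinite(difference(g_m, g_mc))
reversed_order = is_positive_semidefinite(difference(inverse(g_mc), inverse(g_m)))
```

Both `forward` and `reversed_order` evaluate to `True` for these matrices, matching the direction of the inequalities in (141) and (142).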
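We have thus proved the following:
Proposition 4. Let $(M, j_c, \Delta_n^{+})$ be the parametric statistical model associated with a parametric model of states $(M, j, \mathcal{O})$. Assume that both $\mathrm{G}^{M}$ and $\mathrm{G}^{M_c}$ are invertible. Let $C$ be a cost function and let $\mathcal{E}$ be a stationary estimator for $C$ at $m_{\star}$. If the Hessian form of $L_{m_{\star}}$ at $m_{\star}$ is invertible, then we have the generalized Helstrom bound
$$\mathrm{Cov}(\xi_{m_{\star}}, \xi_{m_{\star}}) \geq \left(\mathrm{G}^{M_c}_{m_{\star}}\right)^{-1}(\xi_{m_{\star}}, \xi_{m_{\star}}) \geq \left(\mathrm{G}^{M}_{m_{\star}}\right)^{-1}(\xi_{m_{\star}}, \xi_{m_{\star}}) \tag{144}$$
for all $\xi_{m_{\star}} \in T^{\star}_{m_{\star}}M$.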
This is the Helstrom bound for parametric models of states on a $C^{\star}$-algebra. Indeed, suppose that $\mathscr{A}$ is the algebra $\mathcal{B}(\mathcal{H})$ of bounded operators on the Hilbert space $\mathcal{H}$ of a finite-level quantum system, that $\mathcal{O}$ is the orbit of faithful density operators on $\mathcal{H}$, and that $M$ is an open subset of some $\mathbb{R}^{k}$ with $k \in \mathbb{N}$. Then, in accordance with remark 2, the cost function $C$ may be taken to be the Euclidean distance on $\mathbb{R}^{k} \times \mathbb{R}^{k}$ pulled back to $M \times M$, and a direct computation shows that equation (143) reduces to the so-called Helstrom bound used in quantum estimation theory or quantum metrology [63, 64, 65, 85].
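Remark 4 (Helstrom bound for multiple-round models). If we consider multiple rounds as in the end of section 4, that is, we set $\mathcal{X} = \mathcal{Y}^{N}$, then proposition 1 implies that the Helstrom bound can be written as
$$\begin{aligned} \mathcal{C}(v_m, v_m) &\geq \left(\mathrm{G}^{M_c^{N}}_{m}\right)^{-1}(\mathrm{H}_{m}(v_m), \mathrm{H}_{m}(v_m)) \\ &\geq \left(\mathrm{G}^{M^{N}}_{m}\right)^{-1}(\mathrm{H}_{m}(v_m), \mathrm{H}_{m}(v_m)) \\ &\geq \frac{1}{N}\left(\mathrm{G}^{M}_{m}\right)^{-1}(\mathrm{H}_{m}(v_m), \mathrm{H}_{m}(v_m)), \end{aligned} \tag{145}$$
and this equation allows for the asymptotic analysis of the bound.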
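The factor $1/N$ in (145) reflects the way the Fisher-Rao metric scales over independent rounds. The following sketch checks the standard classical fact that the Fisher information of $N$ independent, identically distributed rounds is $N$ times that of a single round, by brute-force enumeration of the outcome space $\mathcal{Y}^{N}$. The two-outcome model with probabilities $(\theta, 1-\theta)$ and the function names are assumptions made only for this example, and the link to proposition 1 is presumed rather than taken from its statement.

```python
import itertools


def single_round(theta):
    """Hypothetical two-outcome model on Y: probabilities and their theta-derivatives."""
    return [theta, 1.0 - theta], [1.0, -1.0]


def fisher_information(theta, rounds):
    """Fisher information of `rounds` independent copies of the single-round model."""
    probs, dprobs = single_round(theta)
    total = 0.0
    # Enumerate every outcome string (y_1, ..., y_N) in Y^N.
    for outcome in itertools.product(range(len(probs)), repeat=rounds):
        p = 1.0      # product probability of the string
        score = 0.0  # derivative of log p, additive over rounds
        for y in outcome:
            p *= probs[y]
            score += dprobs[y] / probs[y]
        total += p * score * score
    return total
```

Comparing `fisher_information(theta, N)` with `N * fisher_information(theta, 1)` for small `N` shows the linear growth, so the inverse metric entering the bound shrinks like $1/N$.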
The Helstrom bound is a universal bound for all the possible parametric statistical models associated with a given parametric model of states on a given $C^{\star}$-algebra. This makes it quite a remarkable bound.
It is clear that, independently of the cost function and of the estimator we may choose, the Helstrom bound may be saturated if and only if
$$\mathrm{G}^{M}_{m}(v_m, v_m) = \mathrm{G}^{M_c}_{m}(v_m, v_m). \tag{146}$$
Then, corollary 1 shows that this is in principle always possible for one-dimensional models, because we can always take the unital, Abelian $C^{\star}$-subalgebra generated by the self-adjoint element $\mathbf{a}$ associated with the SLD of a given $v_m$ at $\rho_m$, so that the hypotheses of the corollary are satisfied. However, it is also clear that for higher-dimensional models like the one in example 4, this strategy may not be available.
8 Conclusion
We presented a preliminary account of the formulation of estimation theory in the context of parametric models of states on finite-dimensional $C^{\star}$-algebras. The aim is to set the stage for the development of a mathematical formulation of estimation theory that is able to deal with the classical and quantum case “at the same time” by simply switching between commutative and noncommutative algebras.
After reviewing the differential geometric properties of the space of states $\mathscr{S}$ of an arbitrary finite-dimensional $C^{\star}$-algebra $\mathscr{A}$, we introduced the notion of parametric model of states on $\mathscr{A}$. Then, following what is done in quantum information theory using POVMs, we considered how the explicit choice of a positive linear map from $\mathscr{A}$ to a suitable commutative $C^{\star}$-algebra $\mathscr{C}$ gives rise to the notion of parametric statistical model of states associated with the starting parametric model of states on $\mathscr{A}$. This parametric statistical model may be viewed as a classical-like snapshot of the given parametric model of states on the possibly noncommutative algebra $\mathscr{A}$, and the Cramer-Rao bound for manifold-valued estimators is available for this model.
The fact that when $\mathscr{A}$ is noncommutative there is more than one such classical-like snapshot means that there is a Cramer-Rao bound for every classical-like snapshot of a given parametric model of states on $\mathscr{A}$. This circumstance leads us to extend the so-called Helstrom bound to the case of a parametric model of states on a generic $C^{\star}$-algebra, and not just the algebra of bounded linear operators on a Hilbert space as is customarily done in quantum information theory. The Helstrom bound gives a lower bound for all the possible Cramer-Rao bounds associated with the classical-like snapshots of a given parametric model of states on $\mathscr{A}$. The possibility of considering also multiple-round models is briefly discussed, and the Helstrom bound derived in this context will be the starting point for the asymptotic estimation theory in the $C^{\star}$-algebraic framework that we will deal with in future works.
As already remarked in the introduction, this work should be interpreted as a preliminary step toward a more general understanding of classical and quantum estimation theory. Accordingly, several questions are left open for further developments. For instance, it is necessary to understand the general conditions for the attainability of the Helstrom bound for parametric models of states of dimension greater than or equal to 2. It is also necessary to understand how to formulate, in the $C^{\star}$-algebraic framework, other relevant bounds used in quantum information theory, like the RLD bound and the Holevo bound. Finally, it would be particularly interesting to understand how to perform the transition to the infinite-dimensional case. We plan to address these issues in future works.
References
[1] R. Abraham and J. E. Marsden. Foundations of Mechanics. Addison-Wesley, Menlo Park, CA, second edition, 1978.
[2] R. Abraham, J. E. Marsden, and T. Ratiu. Manifolds, tensor analysis, and applications. Springer-Verlag, New York, second edition, 1988.
[3] S. T. Ali, J.-P. Antoine, and J.-P. Gazeau. Coherent states, wavelets, and their generalizations. Springer-Verlag, New York, 1999.
[4] S. I. Amari. Information Geometry and its Applications. Springer, Japan, 2016.
[5] S. I. Amari, O. E. Barndorff-Nielsen, R. E. Kass, S. L. Lauritzen, and C. R. Rao. Differential geometry in statistical inference, volume 10 of Lecture Notes - Monograph Series. Institute of Mathematical Statistics, 1987.
[6] S. I. Amari and H. Nagaoka. Methods of Information Geometry. American Mathematical Society, Providence, RI, 2000.
[7] S. Attal, A. Joye, and C.-A. Pillet, editors. Open Quantum Systems I, volume 1880 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2006.
[8] S. Attal, A. Joye, and C.-A. Pillet, editors. Open Quantum Systems II, volume 1881 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2006.
[9] S. Attal, A. Joye, and C.-A. Pillet, editors. Open Quantum Systems III, volume 1882 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2006.
[10] N. Ay, J. Jost, H. V. Le, and L. Schwachhöfer. Information Geometry. Springer International Publishing, 2017.
[11] N. Ay, J. Jost, H. V. Le, and L. Schwachhöfer. Parametrized measure models. Bernoulli, 24(3):1692–1725, 2018.
[12] F. Barbaresco. Geometric theory of heat from Souriau Lie groups thermodynamics and Koszul Hessian geometry: Applications in information geometry for exponential families. Entropy, 18(11):386–426, 2016.
[13] O. E. Barndorff-Nielsen, R. D. Gill, and P. E. Jupp. On quantum statistical inference. Journal of the Royal Statistical Society: Series B, 65(04):775–816, 2003.
[14] J. Barrett. Information processing in generalized probabilistic theories. Physical Review A, 75:–, 2007.
[15] I. Bengtsson and K. Życzkowski. Geometry of Quantum States: An Introduction to Quantum Entanglement. Cambridge University Press, New York, 2006.
[16] R. Bhatia. Positive Definite Matrices. Princeton University Press, 2007.
[17] B. Blackadar. Operator Algebras: Theory of C*-algebras and von Neumann Algebras. Springer-Verlag, Berlin, 2006.
[18] P. Bona. Some considerations on topologies of infinite dimensional unitary coadjoint orbits. Journal of Geometry and Physics, 51(2):256–268, 2004.
[19] O. Bratteli and D. W. Robinson. Operator Algebras and Quantum Statistical Mechanics I. Springer-Verlag, Berlin, second edition, 1987.
[20] A. Bravetti, H. Cruz, and D. Tapias. Contact Hamiltonian mechanics. Annals of Physics, 376:17–39, 2017.
[21] A. Bravetti, C. S. Lopez-Monsalvo, and F. Nettel. Contact symmetries and Hamiltonian thermodynamics. Annals of Physics, 361:377–400, 2015.
[22] J. F. Cariñena, X. Gràcia, G. Marmo, E. Martínez, M. C. Muñoz Lecanda, and N. Román-Roy. Geometric Hamilton-Jacobi Theory. International Journal of Geometric Methods in Modern Physics, 03(07):1417–1458, 2006.
[23] N. N. Cencov. Statistical Decision Rules and Optimal Inference. American Mathematical Society, Providence, RI, 1982.
[24] G. Chiribella, G. M. D’Ariano, and P. Perinotti. Probabilistic theories with purification. Physical Review A, 81, 2010.
[25] D. Chruściński, F. M. Ciaglia, A. Ibort, G. Marmo, and F. Ventriglia. Stratified manifold of quantum states, actions of the complex special linear group. Annals of Physics, 400:221–245, 2019.
[26] D. Chruściński and S. Pascazio. A Brief History of the GKLS Equation. Open Systems & Information Dynamics, 24(3):1740001–20, 2017.
[27] C. Chu. Jordan Structures in Geometry and Analysis. Cambridge University Press, Cambridge, UK, 2012.
[28] F. M. Ciaglia. Quantum states, groups and monotone metric tensors. European Physical Journal Plus, 135:530 (16pp.), 2020.
[29] F. M. Ciaglia, F. Di Cosmo, A. Ibort, M. Laudato, and G. Marmo. Dynamical vector fields on the manifold of quantum states. Open Systems & Information Dynamics, 24(3):1740003–38, 2017.
[30] F. M. Ciaglia, F. Di Cosmo, A. Ibort, and G. Marmo. Schwinger’s Picture of Quantum Mechanics. International Journal of Geometric Methods in Modern Physics, 17(04):2050054 (14), 2020.
[31] F. M. Ciaglia, F. Di Cosmo, A. Ibort, and G. Marmo. Schwinger’s Picture of Quantum Mechanics IV: Composition and independence. International Journal of Geometric Methods in Modern Physics, 17(04):2050058 (34), 2020.
[32] F. M. Ciaglia, F. Di Cosmo, M. Laudato, G. Marmo, G. Mele, F. Ventriglia, and P. Vitale. A Pedagogical Intrinsic Approach to Relative Entropies as Potential Functions of Quantum Metrics: the q-z family. Annals of Physics, 395:238–274, 2018.
[33] F. M. Ciaglia, A. Ibort, J. Jost, and G. Marmo. Manifolds of classical probability distributions and quantum density operators in infinite dimensions. Information Geometry, 2(2):231–271, 2019.
[34] F. M. Ciaglia, A. Ibort, and G. Marmo. A gentle introduction to Schwinger’s formulation of quantum mechanics: the groupoid picture. Modern Physics Letters A, 33(20):1850122–8, 2018.