# A pathway to multivariate Gaussian density

**ABSTRACT** A general principle called "conservation of the ellipsoid of concentration" is introduced and a generalized entropic form of order 'alpha' is optimized under this principle. It is shown that this can produce a density which can act as a pathway to multivariate Gaussian density. The resulting entropic pathway contains as special cases the Boltzmann-Gibbs (Shannon) and Tsallis (Havrda-Charvat) entropic forms.


arXiv:0709.3820v1 [cond-mat.stat-mech] 24 Sep 2007

AN ENTROPIC PATHWAY TO MULTIVARIATE GAUSSIAN DENSITY

H.J. Haubold¹, A.M. Mathai², S. Thomas³

¹ Office for Outer Space Affairs, United Nations, Vienna International Centre, P.O. Box 500, A-1400, Vienna, Austria.

² Centre for Mathematical Sciences Pala Campus, Arunapuram P.O., Pala-686 574, Kerala, India and Department of Mathematics and Statistics, McGill University, Montreal, Canada H3A 2K6.

³ Department of Statistics, St. Thomas College, Palai, Arunapuram P.O., Pala-686 574, Kerala, India.


Key words: Multivariate Gaussian density; pathway model; generalized entropic form of

order α; ellipsoid of concentration; conservation principle.

1 Introduction

The normal (Gaussian) distribution is a family of continuous probability distributions

and is ubiquitous in the field of statistics and probability (Feller [4]). The importance of

the normal distribution as a model of quantitative phenomena is due to the central limit

theorem. The normal distribution maximizes Shannon entropy among all distributions with known mean and variance; in information theory, Shannon entropy is the measure of uncertainty associated with a random variable.

In statistical mechanics, Gaussian (Maxwell-Boltzmann) distribution maximizes the

Boltzmann-Gibbs entropy under appropriate constraints (Gell-Mann and Tsallis [7]).

Given a probability distribution $P = \{p_i\}$ $(i = 1,\ldots,N)$, with $p_i$ representing the probability of the system to be in the $i$th microstate, the Boltzmann-Gibbs entropy is $S(P) = -k\sum_{i=1}^{N} p_i \ln p_i$, where $k$ is the Boltzmann constant and $N$ the total number of microstates. If all states are equally probable it leads to the Boltzmann principle $S = k \ln W$ $(N = W)$. Boltzmann-Gibbs entropy is equivalent to Shannon's entropy if $k = 1$.

A generalization of Boltzmann-Gibbs extensive statistical mechanics is known as Tsallis non-extensive statistical mechanics (Swinney and Tsallis [5], Abe and Okamoto [6]). Tsallis discovered the generalization of Shannon's entropy to non-extensivity as $S(P,q) = \left(\sum_{i=1}^{N} p_i^{q} - 1\right)/(1 - q)$. For $q \to 1$, Shannon's entropy is recovered. Tsallis introduced q-probabilities accommodating the fact that non-extensive systems are better described by power law distributions, $p_i^{q}$, now called q-probabilities. The $p_i^{q}$ are scaled probabilities where $q$ is a real parameter.

This paper, in Section 2, introduces a general principle called conservation of the ellipsoid of concentration and maximizes a generalized entropic form of order $\alpha$, containing Shannon (Boltzmann-Gibbs), Rényi, Havrda-Charvát (Tsallis) entropies as special cases, under this principle, in Section 3. Normalizing constants are derived in Section 3.1 and mean value and covariance matrix in Section 3.2 for the cases $\alpha < 1$, $\alpha > 1$, and $\alpha = 1$. The pathway, characterized by $\alpha$, is shown to produce multivariate type-1 beta, Gaussian, and type-2 beta densities, respectively. In Section 3.3 a graphical representation of the pathway surface is shown. Section 4 draws conclusions.

Corresponding author email address: HANS.HAUBOLD@UNVIENNA.ORG
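The $q \to 1$ limit of the Tsallis form $S(P,q)$ quoted in the Introduction can be checked numerically. The following is a minimal sketch with $k = 1$; the function names and the example distribution are illustrative assumptions, not taken from the paper.

```python
import math

def shannon_entropy(p):
    """Shannon entropy -sum p_i ln p_i (Boltzmann-Gibbs with k = 1)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def tsallis_entropy(p, q):
    """Tsallis form (sum_i p_i^q - 1) / (1 - q); falls back to Shannon at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(p)
    return (sum(pi ** q for pi in p if pi > 0.0) - 1.0) / (1.0 - q)

# Hypothetical three-state distribution, used only for illustration.
p = [0.5, 0.3, 0.2]
for q in (2.0, 1.5, 1.1, 1.01, 1.001):
    print(f"q={q:<6} S(P,q)={tsallis_entropy(p, q):.6f}")
print(f"Shannon  S(P)  ={shannon_entropy(p):.6f}")
```

As $q$ moves toward 1 the printed values should approach the Shannon value, which is the behaviour the text describes.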

2 Conservation of the ellipsoid of concentration

Consider a $q \times 1$ vector $X$, $X' = (x_1,\ldots,x_q)$, where a prime denotes the transpose. The components $x_1,\ldots,x_q$ may be real scalar mathematical variables or random variables describing various components in a physical system. Each component in $X$ can be assumed to have a finite mean value and variance. If $E$ denotes the expected value, the value on the average in the long-run, then we can assume $E(x_i) = \mu_i < \infty$ for $i = 1,\ldots,q$. Let $\mu' = (\mu_1,\ldots,\mu_q)$. Similarly one can assume the expected dispersion in each component to be finite. The square of a measure of dispersion is given by the variance or $\mathrm{Var}(x_i)$. That is, $\mathrm{Var}(x_i) < \infty$. The components may be correlated or may have pair-wise joint variations. A measure of pair-wise joint variation is covariance between $x_i$ and $x_j$ or $\mathrm{Cov}(x_i,x_j) = E\{[x_i - E(x_i)][x_j - E(x_j)]\} = v_{ij}$ so that when $i = j$ we have $\mathrm{Var}(x_i) = v_{ii}$. The matrix of such variances and covariances is the covariance matrix in $X$, denoted by $\mathrm{Cov}(X) = E[(X - E(X))(X - E(X))'] = V = (v_{ij})$. Note that $V$ is real symmetric when $x_i$, $i = 1,\ldots,q$ are real, and $V$ is at least non-negative definite. Let us assume that no component in the $q \times 1$ vector $X$ is a linear function of other components so that we can take $V$ to be nonsingular. This will then imply that $V$ is positive definite. That is,

$V = V' > 0$. Let $V^{1/2}$ be the positive definite square root of the positive definite matrix $V$. Standardization of a component $x_i$ is achieved by relocating it at $\mu_i$ and by rescaling it by taking $y_i = \dfrac{x_i - \mu_i}{\sqrt{\mathrm{Var}(x_i)}}$ so that $E(y_i) = 0$ and $\mathrm{Var}(y_i) = 1$. Similarly, standardization of the $q \times 1$ vector $X$ is achieved by a linear transformation on $X - \mu$, namely, $Y = V^{-1/2}(X - \mu)$ so that $E(Y) = O$ and $\mathrm{Cov}(Y) = I$ where $I$ is the identity matrix. The Euclidean norm in $Y$ is then $[Y'Y]^{1/2} = [(X - \mu)'V^{-1}(X - \mu)]^{1/2}$. This scalar quantity $(X - \mu)'V^{-1}(X - \mu)$ has many interpretations in different disciplines. A measure of distance between $X$ and $\mu$ is any norm $\|X - \mu\|$. But if we want to accommodate the joint variations in the components $x_1,\ldots,x_q$ as well as the fact that the variances of the components may be different then we consider a generalized distance between $X$ and $\mu$. One such square of the generalized distance is the square of the Euclidean norm in $Y$ or $Y'Y = (X - \mu)'V^{-1}(X - \mu)$. For

a given constant $c > 0$, $(X - \mu)'V^{-1}(X - \mu) = c$ defines the surface of an ellipsoid since $V$ is positive definite. This ellipsoid is known as the ellipsoid of concentration of $X$ around its expected value $\mu$. If we assume that $c$ is fixed, for example $c = 1$ which implies $(X - \mu)'V^{-1}(X - \mu) = 1$, then this assumption is equivalent to saying that the standardized $X$, namely $Y$, is a point on the surface of a hypersphere of radius 1. When it is assumed that the ellipsoid of concentration is a fixed finite quantity what we are saying is that the generalized distance of $X$ from $\mu$ is fixed and finite. This is the principle of conservation of the ellipsoid of concentration.
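To make the generalized distance concrete, here is a small sketch that evaluates the squared generalized distance $(X - \mu)'V^{-1}(X - \mu)$ for $q = 2$ using the explicit inverse of a 2x2 matrix. The function name and the sample matrices are hypothetical and chosen only for illustration.

```python
def generalized_distance_sq(x, mu, v):
    """Return (x - mu)' V^{-1} (x - mu) for q = 2, with V given as a 2x2 nested list."""
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    (v11, v12), (v21, v22) = v
    det = v11 * v22 - v12 * v21
    if det <= 0.0 or v11 <= 0.0:
        raise ValueError("V must be positive definite")
    # Explicit 2x2 inverse: V^{-1} = (1/det) * [[v22, -v12], [-v21, v11]]
    return (v22 * d1 * d1 - (v12 + v21) * d1 * d2 + v11 * d2 * d2) / det

# Uncorrelated components with variances 4 and 9: each standardized
# coordinate equals 1, so the squared distance is 1 + 1.
print(generalized_distance_sq((2.0, 3.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 9.0]]))

# Correlated components (illustrative values).
print(generalized_distance_sq((1.0, 1.0), (0.0, 0.0), [[2.0, 1.0], [1.0, 2.0]]))
```

A point lies on the ellipsoid of concentration with constant $c$ exactly when this function returns $c$.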

3 Generalized entropic form of order α

Let $f(X)$ be a real-valued scalar function of $X$ where $X$ could be a scalar quantity or a $q \times 1$ vector, $q > 1$, or $p \times q$ matrix, $p > 1$, $q > 1$. Let us assume that the elements in $X$ are real scalar random variables. Then $f(X)$ can define a density provided $\int_X f(X)\,dX = 1$ and $f(X) \geq 0$ for all $X$. If $\int_X f(X)\,dX = h < \infty$ then $g(X) = \frac{1}{h} f(X)$ is a density provided $f(X) \geq 0$ for all $X$. Here $dX$ denotes the wedge product of the differentials in $X$. For example, $dX = dx_{11} \wedge dx_{12} \wedge \ldots \wedge dx_{1q} \wedge dx_{21} \wedge \ldots \wedge dx_{pq}$ if $X$ is $p \times q$ and all elements in $X$ are functionally independent. A measure of uncertainty or information in $X$ or in $f(X)$ is measured by Shannon entropy defined by

$$S(f) = -\int_X f(X) \ln f(X)\,dX \tag{3.1}$$

when f is continuous, where X may be scalar or vector or a general matrix and f is the

density of X. There are generalizations of S(f), some of them are listed in Mathai and

Rathie [1]. Some of these are the following (Mathai and Haubold [2]):

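Rényi's entropy: $R_\alpha(f) = \dfrac{\ln\left[\int_X \{f(X)\}^{\alpha}\,dX\right]}{1 - \alpha}$, $\alpha \neq 1$, $\alpha > 0$

Havrda-Charvát entropy: $H_\alpha(f) = \dfrac{\int_X [f(X)]^{\alpha}\,dX - 1}{2^{1-\alpha} - 1}$, $\alpha \neq 1$, $\alpha > 0$

Tsallis' non-extensive entropy: $T_\alpha(f) = \dfrac{\int_X [f(X)]^{\alpha}\,dX - 1}{1 - \alpha}$, $\alpha \neq 1$, $\alpha > 0$

Non-extensive generalized entropic form: $M_\alpha(f) = \dfrac{\int_X [f(X)]^{2-\alpha}\,dX - 1}{\alpha - 1}$, $\alpha \neq 1$, $\alpha < 2$

Extensive generalized entropic form: $M^{*}_\alpha(f) = \dfrac{\ln\left[\int_X \{f(X)\}^{2-\alpha}\,dX\right]}{\alpha - 1}$, $\alpha \neq 1$, $\alpha < 2$.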
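As a quick consistency check on the list above, the non-extensive generalized entropic form reduces to the Shannon entropy (3.1) as $\alpha \to 1$. The following is a sketch of that limit, assuming the density is regular enough for the expansion and the integral to be interchanged:

```latex
\begin{aligned}
\int_X [f(X)]^{2-\alpha}\,dX
  &= \int_X f(X)\, e^{(1-\alpha)\ln f(X)}\,dX \\
  &= \int_X f(X)\,dX + (1-\alpha)\int_X f(X)\ln f(X)\,dX + O\big((1-\alpha)^2\big) \\
  &= 1 + (1-\alpha)\int_X f(X)\ln f(X)\,dX + O\big((1-\alpha)^2\big), \\[4pt]
M_\alpha(f) &= \frac{\int_X [f(X)]^{2-\alpha}\,dX - 1}{\alpha-1}
  \;\longrightarrow\; -\int_X f(X)\ln f(X)\,dX = S(f) \qquad (\alpha \to 1).
\end{aligned}
```

The sign works out because $(1-\alpha)/(\alpha-1) = -1$.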

Let us look into the problem of optimizing the non-extensive generalized entropic form $M_\alpha(f)$ under the principle of the conservation of the ellipsoid of concentration. That is, to optimize $M_\alpha(f)$ over all functionals $f$, subject to the conditions

$$\text{(i)}\ \int_X f(X)\,dX = 1; \qquad \text{(ii)}\ \int_X (X - \mu)'V^{-1}(X - \mu)\, f(X)\,dX = \text{constant}$$

with $f(X) \geq 0$ for all $X$. If we apply the calculus of variations technique then the Euler equation becomes

$$\frac{\partial}{\partial f}\left[f^{2-\alpha} - \lambda_1 f + \lambda_2 (X - \mu)'V^{-1}(X - \mu) f\right] = 0, \quad \alpha < 2$$

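where $\lambda_1$ and $\lambda_2$ are Lagrangian multipliers, observing the fact that since $\alpha$ is fixed, optimization of $\dfrac{f^{2-\alpha}}{\alpha - 1}$ is equivalent to optimizing $f^{2-\alpha}$ over all functionals $f$. That is,

$$f^{1-\alpha} = \frac{\lambda_1}{2 - \alpha}\left[1 - \frac{\lambda_2}{\lambda_1}(X - \mu)'V^{-1}(X - \mu)\right].$$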
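The step from the Euler equation to this expression for $f^{1-\alpha}$ can be spelled out as follows. The shorthand $Q$ for the quadratic form is introduced here only for brevity and is not notation from the paper; the sketch also assumes $\lambda_1 \neq 0$.

```latex
\begin{aligned}
Q &:= (X-\mu)'V^{-1}(X-\mu) \\
0 &= \frac{\partial}{\partial f}\left[f^{2-\alpha} - \lambda_1 f + \lambda_2 Q f\right]
   = (2-\alpha)\, f^{1-\alpha} - \lambda_1 + \lambda_2 Q \\
\Rightarrow\quad f^{1-\alpha} &= \frac{\lambda_1 - \lambda_2 Q}{2-\alpha}
   = \frac{\lambda_1}{2-\alpha}\left[1 - \frac{\lambda_2}{\lambda_1}\, Q\right],
   \qquad \alpha < 2 .
\end{aligned}
```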

Either by taking $\dfrac{\lambda_2}{\lambda_1} = a(1 - \alpha)$, $a > 0$, or by taking the second condition to be that the expected value of $(1 - \alpha)(X - \mu)'V^{-1}(X - \mu)$ is 1, where $1 - \alpha$ denotes the strength of information in $f(X)$, see Mathai and Haubold [2], we have

$$f = \lambda\left[1 - a(1 - \alpha)(X - \mu)'V^{-1}(X - \mu)\right]^{\frac{1}{1-\alpha}} \tag{3.2}$$

where $\lambda$ is the normalizing constant, $1 - a(1 - \alpha)(X - \mu)'V^{-1}(X - \mu) > 0$. Observe that when $\alpha < 1$ the form in (3.2) is that of a multivariate type-1 beta type density. When $\alpha > 1$, writing $1 - \alpha = -(\alpha - 1)$ we have

$$f = \lambda\left[1 + a(\alpha - 1)(X - \mu)'V^{-1}(X - \mu)\right]^{-\frac{1}{\alpha-1}}, \quad \alpha > 1,\ a > 0. \tag{3.3}$$

Note that (3.3) is a multivariate type-2 beta type density. But when $\alpha \to 1$ in (3.2) and (3.3) we have the form

$$f = \lambda e^{-a(X - \mu)'V^{-1}(X - \mu)}. \tag{3.4}$$

Note that $\lambda$ in (3.2), (3.3) and (3.4) are different, which are to be evaluated separately for the three cases of $\alpha < 1$, $\alpha > 1$ and $\alpha \to 1$. Thus (3.2) and (3.3) provide a pathway to the multivariate Gaussian density in (3.4). When $a = \frac{1}{2}$ the normalizing constant in (3.4) is

$$\lambda = \frac{a^{q/2}}{\pi^{q/2}|V|^{1/2}} = \frac{1}{(2\pi)^{q/2}|V|^{1/2}} \quad \text{for } a = \tfrac{1}{2} \tag{3.5}$$

or when $\alpha \to 1$ in (3.2) and (3.3).
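The convergence of the kernels in (3.2) and (3.3) to the Gaussian kernel in (3.4) can be observed numerically. The sketch below evaluates only the unnormalized bracket as a function of the quadratic form; the function name and the test value of the quadratic form are assumptions made for illustration.

```python
import math

def pathway_kernel(quad_form, alpha, a=0.5):
    """Unnormalized pathway kernel as a function of the quadratic form
    Q = (X - mu)' V^{-1} (X - mu), following (3.2), (3.3) and (3.4)."""
    if abs(alpha - 1.0) < 1e-12:
        return math.exp(-a * quad_form)           # Gaussian limit (3.4)
    base = 1.0 - a * (1.0 - alpha) * quad_form    # same bracket in (3.2) and (3.3)
    if base <= 0.0:
        return 0.0                                # outside the support (only possible for alpha < 1)
    return base ** (1.0 / (1.0 - alpha))

Q = 2.0  # illustrative value of the quadratic form
gauss = pathway_kernel(Q, 1.0)
for alpha in (0.5, 0.9, 0.99, 0.999, 1.001, 1.01, 1.1, 1.5):
    print(f"alpha={alpha:<6} kernel={pathway_kernel(Q, alpha):.6f}  gaussian={gauss:.6f}")
```

Writing (3.3) as the same bracket with a negative exponent keeps a single code path for both sides of $\alpha = 1$, which mirrors the substitution $1 - \alpha = -(\alpha - 1)$ used in the text.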

4

Page 5

3.1 The normalizing constant λ

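Let us consider the case $\alpha < 1$ first. Since the total integral is 1 we have

$$
\begin{aligned}
1 &= \int_X f(X)\,dX \\
&= \lambda |V|^{1/2} \int_Y \left[1 - a(1 - \alpha)(y_1^2 + \cdots + y_q^2)\right]^{\frac{1}{1-\alpha}} dY, \quad Y = V^{-1/2}(X - \mu) \Rightarrow dX = |V|^{1/2} dY \\
&= \lambda\, 2^q |V|^{1/2} \int \cdots \int_{y_j > 0,\ j = 1,\ldots,q,\ 1 - a(1-\alpha)(y_1^2 + \cdots + y_q^2) > 0} \left[1 - a(1 - \alpha)(y_1^2 + \cdots + y_q^2)\right]^{\frac{1}{1-\alpha}} dY.
\end{aligned}
$$

Put $u_j = a(1 - \alpha) y_j^2 \Rightarrow dy_j = \dfrac{1}{2}\, \dfrac{u_j^{\frac{1}{2} - 1}}{[a(1 - \alpha)]^{1/2}}\, du_j$, $\alpha < 1$. Then

$$
\begin{aligned}
1 &= \frac{\lambda |V|^{1/2}}{[a(1 - \alpha)]^{q/2}} \int \cdots \int_{1 - u_1 - \cdots - u_q > 0,\ 0 < u_j < 1,\ j = 1,\ldots,q} u_1^{\frac{1}{2} - 1} \cdots u_q^{\frac{1}{2} - 1} (1 - u_1 - \cdots - u_q)^{\frac{1}{1-\alpha}}\, du_1 \wedge \ldots \wedge du_q \\
&= \frac{\lambda |V|^{1/2}}{[a(1 - \alpha)]^{q/2}}\, \frac{\left[\Gamma\left(\frac{1}{2}\right)\right]^q\, \Gamma\left(\frac{1}{1-\alpha} + 1\right)}{\Gamma\left(\frac{1}{1-\alpha} + 1 + \frac{q}{2}\right)}
\end{aligned}
$$

by evaluating the integral with the help of a type-1 Dirichlet integral (Mathai [3]). Thus

$$\lambda = \frac{[a(1 - \alpha)]^{q/2}\, \Gamma\left(\frac{1}{1-\alpha} + 1 + \frac{q}{2}\right)}{|V|^{1/2}\, \pi^{q/2}\, \Gamma\left(\frac{1}{1-\alpha} + 1\right)} \quad \text{for } \alpha < 1. \tag{3.6}$$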
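The constant in (3.6) can be sanity-checked in the simplest setting, $q = 1$ with $V = 1$ and $\mu = 0$, by integrating the density numerically over its support. This is only a sketch: the midpoint rule, the grid size and the function names are choices made here, not part of the paper.

```python
import math

def lambda_type1(alpha, a, q, det_v=1.0):
    """Normalizing constant (3.6) for alpha < 1, computed via log-gamma."""
    b = 1.0 / (1.0 - alpha) + 1.0
    log_ratio = math.lgamma(b + q / 2.0) - math.lgamma(b)
    return (a * (1.0 - alpha)) ** (q / 2.0) * math.exp(log_ratio) / (
        math.sqrt(det_v) * math.pi ** (q / 2.0))

def total_mass_1d(alpha, a, n=20000):
    """Midpoint-rule integral of the q = 1 density (3.2) with V = 1, mu = 0."""
    lam = lambda_type1(alpha, a, q=1)
    half_width = 1.0 / math.sqrt(a * (1.0 - alpha))  # support is |y| < half_width
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * h
        total += lam * (1.0 - a * (1.0 - alpha) * y * y) ** (1.0 / (1.0 - alpha)) * h
    return total

for alpha in (-0.5, 0.0, 0.5):
    print(f"alpha={alpha:<5} total mass ~ {total_mass_1d(alpha, a=1.0):.6f}")
```

Each printed value should be close to 1 if the constant is right; `math.lgamma` is used instead of `math.gamma` so that the same helper stays usable when $1/(1-\alpha)$ becomes large.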

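For $\alpha > 1$, writing $1 - \alpha = -(\alpha - 1)$ and proceeding as above and then finally evaluating the integral with the help of a type-2 Dirichlet integral (Mathai [3]) we have

$$\lambda = \frac{[a(\alpha - 1)]^{q/2}\, \Gamma\left(\frac{1}{\alpha-1}\right)}{|V|^{1/2}\, \pi^{q/2}\, \Gamma\left(\frac{1}{\alpha-1} - \frac{q}{2}\right)}, \quad \frac{1}{\alpha - 1} - \frac{q}{2} > 0,\ \alpha > 1. \tag{3.7}$$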

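When $\alpha \to 1$ do (3.6) and (3.7) go to (3.5)? This can be checked with the help of Stirling's formula which states that for $|z| \to \infty$ and $\varepsilon$ a bounded quantity,

$$\Gamma(z + \varepsilon) \approx \sqrt{2\pi}\, z^{z + \varepsilon - \frac{1}{2}} e^{-z}. \tag{3.8}$$

Note that for $\alpha < 1$ and when $\alpha \to 1$, $\frac{1}{1-\alpha} \to \infty$. Then applying Stirling's formula to $\Gamma\left(\frac{1}{1-\alpha} + 1 + \frac{q}{2}\right)$ and $\Gamma\left(\frac{1}{1-\alpha} + 1\right)$ in (3.6) we have

$$\lambda \to \frac{\sqrt{2\pi} \left(\frac{1}{1-\alpha}\right)^{\frac{1}{1-\alpha} + 1 + \frac{q}{2} - \frac{1}{2}} e^{-\frac{1}{1-\alpha}}\, [a(1 - \alpha)]^{q/2}}{\sqrt{2\pi} \left(\frac{1}{1-\alpha}\right)^{\frac{1}{1-\alpha} + 1 - \frac{1}{2}} e^{-\frac{1}{1-\alpha}}\, |V|^{1/2}\, \pi^{q/2}} = \frac{a^{q/2}}{\pi^{q/2} |V|^{1/2}}$$

which is the value of $\lambda$ in (3.5). Then when $\alpha$ approaches 1 from the left, (3.6) goes to (3.5). Similarly we can see that (3.7) also goes to (3.5) when $\alpha \to 1$ from the right. This constitutes the pathway to multivariate Gaussian density.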
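The Stirling argument can be complemented by a direct numerical comparison of (3.6) and (3.7) with the Gaussian constant in (3.5) for $\alpha$ close to 1. The sketch below is an illustration under assumed parameter values ($q = 3$, $a = 1/2$, $|V| = 2$); the function name is hypothetical.

```python
import math

def lambda_pathway(alpha, a, q, det_v=1.0):
    """Normalizing constant: (3.6) for alpha < 1, (3.7) for alpha > 1, (3.5) at alpha = 1."""
    common = math.sqrt(det_v) * math.pi ** (q / 2.0)
    if alpha < 1.0:
        b = 1.0 / (1.0 - alpha) + 1.0
        log_ratio = math.lgamma(b + q / 2.0) - math.lgamma(b)
        return (a * (1.0 - alpha)) ** (q / 2.0) * math.exp(log_ratio) / common
    if alpha > 1.0:
        c = 1.0 / (alpha - 1.0)
        if c - q / 2.0 <= 0.0:
            raise ValueError("(3.7) requires 1/(alpha - 1) - q/2 > 0")
        log_ratio = math.lgamma(c) - math.lgamma(c - q / 2.0)
        return (a * (alpha - 1.0)) ** (q / 2.0) * math.exp(log_ratio) / common
    return a ** (q / 2.0) / common

a, q, det_v = 0.5, 3, 2.0  # illustrative parameter values
gauss = lambda_pathway(1.0, a, q, det_v)
for alpha in (0.9, 0.99, 0.999, 0.9999, 1.0001, 1.001, 1.01, 1.1):
    ratio = lambda_pathway(alpha, a, q, det_v) / gauss
    print(f"alpha={alpha:<7} lambda / lambda_Gauss = {ratio:.6f}")
```

The printed ratios should tend to 1 from both sides. Working with differences of `math.lgamma` values avoids the overflow that `math.gamma` would hit for arguments as large as $1/(1-\alpha)$ near the limit.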

3.2 The mean value and covariance matrix of X in (3.2)

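$$
\begin{aligned}
E(X) &= \int_X X f(X)\,dX = \mu \int_X f(X)\,dX + \int_X (X - \mu) f(X)\,dX \\
&= \mu + \lambda |V|^{1/2} \left\{ V^{1/2} \int_Y Y \left[1 - a(1 - \alpha) Y'Y\right]^{\frac{1}{1-\alpha}} dY \right\},
\end{aligned}
$$

since $\int_X f(X)\,dX = 1$ and since $X - \mu = V^{1/2} Y$ when $Y = V^{-1/2}(X - \mu) \Rightarrow dX = |V|^{1/2} dY$. But $Y\left[1 - a(1 - \alpha) Y'Y\right]^{\frac{1}{1-\alpha}}$ is an odd function and hence the integral over $Y$ is null. Hence $E(X) = \mu$.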

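$$
\begin{aligned}
\mathrm{Cov}(X) &= E[(X - E(X))(X - E(X))'] \\
&= E[(X - \mu)(X - \mu)'] \\
&= V^{1/2} \{E(YY')\} V^{1/2}, \quad Y = V^{-1/2}(X - \mu) \\
&= \lambda |V|^{1/2}\, V^{1/2} \left\{ \int_Y YY' \left[1 - a(1 - \alpha) Y'Y\right]^{\frac{1}{1-\alpha}} dY \right\} V^{1/2}.
\end{aligned}
$$

Note that $YY'$ is a $q \times q$ matrix where the $(i,j)$th element is $y_i y_j$. For $i \neq j$ the integral over $Y$ is zero since $y_i y_j \left[1 - a(1 - \alpha) Y'Y\right]^{\frac{1}{1-\alpha}}$ is an odd function in $y_i$ as well as in $y_j$. The diagonal elements of $YY'$ are $y_1^2,\ldots,y_q^2$. The integral over one of them will be of the form

$$
\begin{aligned}
&\int_Y y_1^2 \left[1 - a(1 - \alpha) Y'Y\right]^{\frac{1}{1-\alpha}} dY \quad \text{for } 1 - a(1 - \alpha) Y'Y > 0 \text{ when } \alpha < 1 \\
&\quad = 2^q \int \cdots \int y_1^2 \left[1 - a(1 - \alpha) Y'Y\right]^{\frac{1}{1-\alpha}} dY \quad \text{for } y_j > 0,\ j = 1,\ldots,q,\ \alpha < 1 \text{ and } 1 - a(1 - \alpha)(y_1^2 + \cdots + y_q^2) > 0 \\
&\quad = \frac{1}{[a(1 - \alpha)]^{\frac{q}{2} + 1}} \int \cdots \int u_1^{\frac{3}{2} - 1} u_2^{\frac{1}{2} - 1} \cdots u_q^{\frac{1}{2} - 1} (1 - u_1 - \cdots - u_q)^{\frac{1}{1-\alpha}}\, du_1 \wedge \ldots \wedge du_q \\
&\quad = \frac{1}{2}\, \frac{\left[\Gamma\left(\frac{1}{2}\right)\right]^q\, \Gamma\left(\frac{1}{1-\alpha} + 1\right)}{[a(1 - \alpha)]^{\frac{q}{2} + 1}\, \Gamma\left(\frac{1}{1-\alpha} + 1 + \frac{q}{2} + 1\right)},
\end{aligned}
$$

by using a type-1 Dirichlet integral. Now, substituting from (3.2) and (3.6) we have

$$\mathrm{Cov}(X) = \frac{1}{2a(1 - \alpha)\left[\frac{1}{1-\alpha} + 1 + \frac{q}{2}\right]}\, V = \frac{1}{2a\left[1 + (1 - \alpha)\left(1 + \frac{q}{2}\right)\right]}\, V, \quad \alpha < 1. \tag{3.9}$$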

Observe that this is an interesting result because the covariance matrix in $X$ is not the parameter matrix $V$ in the model (3.2) and (3.6). For $\alpha > 1$, proceeding as before, one has

$$\mathrm{Cov}(X) = \frac{1}{2a\left[1 - (\alpha - 1)\left(\frac{q}{2} + 1\right)\right]}\, V, \tag{3.10}$$

for $\alpha > 1$, $1 - (\alpha - 1)\left(\frac{q}{2} + 1\right) > 0$ which implies $1 < \alpha < 1 + \dfrac{1}{\frac{q}{2} + 1}$. Observe that when $a = \frac{1}{2}$ and $\alpha \to 1$ then (3.9) and (3.10) give the covariance matrix as $V$ which agrees with the multivariate Gaussian density. Hence the pathway for the covariance matrix is given in (3.9) and (3.10).
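The scalar multipliers of $V$ in the two covariance formulas can be tabulated and, for $q = 1$ and $\alpha < 1$, compared with a numerically integrated variance. This is a rough sketch; the function names, the midpoint rule and the sample parameters are assumptions made for illustration.

```python
import math

def cov_factor(alpha, a, q):
    """Scalar c with Cov(X) = c * V: formula (3.9) for alpha < 1, (3.10) for alpha > 1."""
    if alpha < 1.0:
        return 1.0 / (2.0 * a * (1.0 + (1.0 - alpha) * (1.0 + q / 2.0)))
    if alpha > 1.0:
        denom = 1.0 - (alpha - 1.0) * (q / 2.0 + 1.0)
        if denom <= 0.0:
            raise ValueError("formula requires alpha < 1 + 1/(q/2 + 1)")
        return 1.0 / (2.0 * a * denom)
    return 1.0 / (2.0 * a)  # Gaussian limit

def variance_1d_numeric(alpha, a, n=20000):
    """Midpoint-rule variance of the q = 1, V = 1, mu = 0 density for alpha < 1."""
    expo = 1.0 / (1.0 - alpha)
    half_width = 1.0 / math.sqrt(a * (1.0 - alpha))
    h = 2.0 * half_width / n
    mass = 0.0
    second = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * h
        w = (1.0 - a * (1.0 - alpha) * y * y) ** expo * h
        mass += w
        second += y * y * w
    return second / mass  # normalized numerically, so lambda is not needed

print(cov_factor(0.5, 1.0, 1), variance_1d_numeric(0.5, 1.0))
for alpha in (0.9, 0.99, 0.999, 1.001, 1.01, 1.1):
    print(f"alpha={alpha:<6} factor at a=1/2, q=2: {cov_factor(alpha, 0.5, 2):.6f}")
```

With $a = 1/2$ the factor should approach 1 from both sides of $\alpha = 1$, matching the statement that the covariance matrix tends to $V$.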

3.3 The pathway surface

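Let us look into the pathway model for the standard case. That is, for $\alpha < 1$,

$$g_1(Y) = \frac{[a(1 - \alpha)]^{q/2}\, \Gamma\left(\frac{1}{1-\alpha} + 1 + \frac{q}{2}\right)}{\pi^{q/2}\, \Gamma\left(\frac{1}{1-\alpha} + 1\right)} \left[1 - a(1 - \alpha)(y_1^2 + \cdots + y_q^2)\right]^{\frac{1}{1-\alpha}}, \quad \alpha < 1,$$

$1 - a(1 - \alpha)(y_1^2 + \cdots + y_q^2) > 0$. This is plotted for $q = 2$, $a = 1$ and for $\alpha = -0.5, 0, 0.5$.

For $\alpha > 1$,

$$g_2(Y) = \frac{[a(\alpha - 1)]^{q/2}\, \Gamma\left(\frac{1}{\alpha-1}\right)}{\pi^{q/2}\, \Gamma\left(\frac{1}{\alpha-1} - \frac{q}{2}\right)} \left[1 + a(\alpha - 1)(y_1^2 + \cdots + y_q^2)\right]^{-\frac{1}{\alpha-1}}, \quad \frac{1}{\alpha - 1} - \frac{q}{2} > 0,\ \alpha > 1.$$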

This is plotted for $q = 2$, $a = 1$, and for $\alpha = 1.1, 1.5, 1.7$.

For $\alpha \to 1$

$$g_3(Y) = \frac{a^{q/2}}{\pi^{q/2}}\, e^{-a(y_1^2 + \cdots + y_q^2)}.$$

This is plotted for $a = 1$.

The nature of the pathway surface when α moves from -0.5 to 1 can be seen from Figures

1a-1c and Figure 3. The nature of the movement when α moves from 1 to 1.7 can be seen

from Figure 3 and Figures 2a-2c.
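The figures themselves are not reproduced here, but the surfaces can be evaluated pointwise. The sketch below implements $g_1$, $g_2$ and $g_3$ in one hypothetical helper and prints the height at the origin for the $\alpha$ values quoted in the text ($q = 2$, $a = 1$); the helper name and its interface are assumptions.

```python
import math

def pathway_density(y, alpha, a=1.0):
    """Standard pathway density g1 (alpha < 1), g2 (alpha > 1) or g3 (alpha = 1)
    evaluated at the point y, a sequence of length q."""
    q = len(y)
    r2 = sum(v * v for v in y)
    if alpha < 1.0:
        b = 1.0 / (1.0 - alpha) + 1.0
        base = 1.0 - a * (1.0 - alpha) * r2
        if base <= 0.0:
            return 0.0  # outside the support of g1
        log_c = ((q / 2.0) * math.log(a * (1.0 - alpha) / math.pi)
                 + math.lgamma(b + q / 2.0) - math.lgamma(b))
        return math.exp(log_c) * base ** (1.0 / (1.0 - alpha))
    if alpha > 1.0:
        c = 1.0 / (alpha - 1.0)
        if c - q / 2.0 <= 0.0:
            raise ValueError("g2 requires 1/(alpha - 1) - q/2 > 0")
        log_c = ((q / 2.0) * math.log(a * (alpha - 1.0) / math.pi)
                 + math.lgamma(c) - math.lgamma(c - q / 2.0))
        return math.exp(log_c) * (1.0 + a * (alpha - 1.0) * r2) ** (-c)
    return (a / math.pi) ** (q / 2.0) * math.exp(-a * r2)

for alpha in (-0.5, 0.0, 0.5, 1.0, 1.1, 1.5, 1.7):
    print(f"alpha={alpha:<5} density at origin = {pathway_density((0.0, 0.0), alpha):.6f}")
```

For $q = 2$ the gamma ratios collapse, and both $g_1$ and $g_2$ give $a(2 - \alpha)/\pi$ at the origin, so the peak lowers steadily as $\alpha$ increases through 1, where it equals the Gaussian value $a/\pi$.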

4 Conclusions

The multivariate Gaussian density and its central place in the procedure of maximizing a generalized entropic form of order α is the core result of this paper. It contributes to an understanding of different entropic forms and how they relate to each other through the parameter α (Mathai and Rathie [1], Masi [8]). This makes visible the pathway from type-1 beta, through Gaussian, to type-2 beta densities as they emerge depending on α and shows the relation to entropies of Boltzmann-Gibbs and Tsallis statistical mechanics (Hilhorst and Schehr [9], Vignat and Plastino [10]). While the generalized entropic form of order α may not have direct applications in statistical mechanics, it might be of interest to information theory and to a better understanding of attempts to unify entropic forms under either mathematical or physical principles. A graphical representation of the pathway is given in Figures 1, 2, and 3.

Acknowledgement The authors would like to thank the Department of Science and Technology, Government of India, New Delhi, for the financial assistance for this work under project No. SR/S4/MS:287/05, which made this collaboration possible.

References

[1] A.M. Mathai and P.N. Rathie, Basic Concepts in Information Theory and Statistics: Axiomatic Foundations and Applications, Wiley Halsted, New York 1975.

[2] A.M. Mathai and H.J. Haubold, Pathway model, superstatistics, Tsallis statistics, and a generalized measure of entropy, Physica A 375 (2007) 110-122.

[3] A.M. Mathai, A review of the recent developments on generalized complex matrix-variate Dirichlet integrals, in Proceedings of the 7th International Conference of the Society for Special Functions and their Applications (SSFA), Pune, India, 21-23 February 2006, Ed. A.K. Agarwal, Published by SSFA, pp. 131-142.

[4] W. Feller, An Introduction to Probability Theory and Its Applications, Volume I, Third Edition, John Wiley and Sons, New York 1968.

[5] H.L. Swinney and C. Tsallis (Eds.), Anomalous Distributions, Nonlinear Dynamics, and Nonextensivity, Physica D, 193 (2004) 1-356.

[6] S. Abe and Y. Okamoto (Eds.), Nonextensive Statistical Mechanics and Its Applications, Springer, Heidelberg 2001.

[7] M. Gell-Mann and C. Tsallis (Eds.), Nonextensive Entropy: Interdisciplinary Applications, Oxford University Press, New York 2004.

[8] M. Masi, A step beyond Tsallis and Rényi entropies, Physics Letters A, 338 (2005) 217-224.

[9] H.J. Hilhorst and G. Schehr, A note on q-Gaussians and non-Gaussians in statistical mechanics, Journal of Statistical Mechanics: Theory and Experiment, 2007, P06003.

[10] C. Vignat and A. Plastino, Scale invariance and related properties of q-Gaussian systems, Physics Letters A, 365 (2007) 370-375.
