All content in this area was uploaded by Andrea Aveni on Oct 18, 2020

NEW PROOF OF A CHARACTERIZATION OF THE GIBBS TYPE PROCESSES

ANDREA AVENI

Abstract. We provide a new proof of the fact that the only Species Sampling Models in which the probability of observing a new value depends only on the sample size $n$ and the number of clusters $h$ are the Gibbs-type Processes.

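Definition. For any $n \in \mathbb{N}$ and $h \in [n]$, we denote by
\[ \Delta^n_h := \Big\{ c \in \mathbb{N}^h : \sum_{j=1}^h c_j = n \Big\} \]
the set of all compositions of $n$ of rank (or length) $h$.

We also write $\Delta^n = \cup_{h=1}^n \Delta^n_h$ and $\Delta = \cup_{n \ge 1} \Delta^n$.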

Definition. Given two elements $c, d \in \Delta^n_h$ we say that
\[ c \sim d \iff \exists \sigma \in S_h : (c_j)_{j=1}^h = (d_{\sigma(j)})_{j=1}^h . \]
Since $S_h$ is a group, $\sim$ is an equivalence relation, and we define
\[ \tilde{\Delta}^n_h = \Delta^n_h / \sim \]
as the set of all number-theoretic partitions of the number $n$ of rank $h$.

Theorem. $|\Delta^n_h| = \binom{n-1}{h-1}$, so that $|\Delta^n| = 2^{n-1}$.

Proof. We classify the compositions $c$ in $\Delta^n_h$ according to their first element. If the first element $c_1$ of $c$ is $u \in [n]$, then $(c_j)_{j=2}^h$ must belong to $\Delta^{n-u}_{h-1}$; moreover, for any $c \in \Delta^{n-u}_{h-1}$ the composition $(u, c)$ (if it exists) is a unique element of $\Delta^n_h$. In other words, there exists a bijection between $\Delta^{n-u}_{h-1}$ and the elements of $\Delta^n_h$ starting with $u$, therefore we must have
\[ |\Delta^n_h| = \sum_{u=1}^{n} |\Delta^{n-u}_{h-1}| = \sum_{u=1}^{n-(h-1)} |\Delta^{n-u}_{h-1}| , \qquad (1) \]
where the second equality holds because $\Delta^{n-u}_{h-1}$ is empty whenever $n - u < h - 1$.

Now we prove the result by induction on $n$. The case $n = 1$ holds since $|\Delta^1_1| = |\{(1)\}| = 1 = \binom{0}{0}$. Now we assume the formula $|\Delta^n_h| = \binom{n-1}{h-1}$ to hold for any $(h, n)$ such that $n < N$, and we prove that it also holds for any $(h, N)$ with $h = 1, \dots, N$.


For any such $h = 2, \dots, N$ we have, by (1) and by the inductive hypothesis,
\[ |\Delta^N_h| = \sum_{u=1}^{N-(h-1)} |\Delta^{N-u}_{h-1}| = \sum_{j=h-2}^{N-2} |\Delta^{j+1}_{h-1}| = \sum_{j=h-2}^{N-2} \binom{j}{h-2} = \binom{N-1}{h-1} , \]
where the last step is due to the hockey-stick identity (see footnote 1). Finally, if $h = 1$ we always have $|\Delta^n_1| = |\{(n)\}| = 1$, so, in particular, $|\Delta^N_1| = 1$. This completes the proof.
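The counting result above can be checked by brute force for small $n$. The following is a minimal Python sketch, not part of the original argument; the helper names `compositions` and `count_compositions` are hypothetical and introduced only for illustration. It enumerates $\Delta^n_h$ by choosing $h-1$ cut points among the $n-1$ gaps between $n$ unit cells.

```python
from itertools import combinations
from math import comb


def compositions(n, h):
    """Yield every composition of n into h positive parts (the set Delta^n_h)."""
    # A composition corresponds to h-1 cut points among the n-1 interior gaps.
    for cuts in combinations(range(1, n), h - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(bounds[k + 1] - bounds[k] for k in range(h))


def count_compositions(n, h):
    """Count the elements of Delta^n_h by explicit enumeration."""
    return sum(1 for _ in compositions(n, h))


if __name__ == "__main__":
    for n in range(1, 8):
        row = [count_compositions(n, h) for h in range(1, n + 1)]
        # Compare with the binomial formula and with the total 2^(n-1).
        assert row == [comb(n - 1, h - 1) for h in range(1, n + 1)]
        assert sum(row) == 2 ** (n - 1)
        print(n, row)
```

The enumeration is exponential in $n$, so it is only meant as a sanity check for small cases.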

By contrast, the cardinality of $\tilde{\Delta}^n_h$ is much harder to analyze, and we will simply denote it by $p_h(n) := |\tilde{\Delta}^n_h|$, with $p(n) := \sum_{h=1}^n p_h(n) = |\tilde{\Delta}^n|$ and $p(0) := 1$.

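Among the few known facts about these functions is the generating function identity
\[ \sum_{n=0}^{\infty} p(n) z^n = \prod_{n=1}^{\infty} \frac{1}{1 - z^n} . \]
This function, defined on $|z| < 1$, has many interesting properties; for instance,
\[ \gcd(p, q) = 1 \;\Rightarrow\; \lim_{\rho \to 1} \left[ (1 - \rho) \ln \prod_{n=1}^{\infty} \frac{1}{1 - (\rho e^{2\pi i p/q})^n} \right] = \frac{\pi^2}{6 q^2} , \]
as I prove in the appendix.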

Moreover, an asymptotic formula is available:
\[ p(n) = \Theta\!\left( (4 n \sqrt{3})^{-1} \exp\!\left( \pi \sqrt{2n/3} \right) \right) . \]
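To make $p_h(n)$ and $p(n)$ concrete, here is a short Python sketch (not from the original text) that tabulates them. It relies on the standard recurrence $p_h(n) = p_{h-1}(n-1) + p_h(n-h)$, which is an assumption of this sketch rather than something stated in the paper: either the smallest part equals 1 and is removed, or every part is at least 2 and each is reduced by one. The function names are hypothetical.

```python
from math import exp, pi, sqrt


def partition_table(n_max):
    """Return table[n][h] = p_h(n), the number of partitions of n into exactly h parts."""
    table = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    table[0][0] = 1  # the empty partition, matching the convention p(0) = 1
    for n in range(1, n_max + 1):
        for h in range(1, n + 1):
            # Smallest part is 1 (drop it) or all parts are >= 2 (subtract 1 from each).
            table[n][h] = table[n - 1][h - 1] + table[n - h][h]
    return table


def asymptotic_estimate(n):
    """Leading-order estimate (4 n sqrt(3))^(-1) exp(pi sqrt(2n/3))."""
    return exp(pi * sqrt(2 * n / 3)) / (4 * n * sqrt(3))


if __name__ == "__main__":
    table = partition_table(100)
    for n in (10, 50, 100):
        p_n = sum(table[n])
        print(n, p_n, p_n / asymptotic_estimate(n))
```

The printed ratio between $p(n)$ and the estimate stays bounded and drifts slowly toward 1 as $n$ grows, which is what the asymptotic statement describes.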

Now we define the EPPF.

Definition. An EPPF is a function from $\tilde{\Delta}$ to the real numbers (whose restrictions to $\tilde{\Delta}^n_h$ are denoted by $p^n_h$) such that

• (PR) For every $c \in \Delta^n_h$, we have
\[ p^n_h(c) = p^{n+1}_{h+1}(c, 1) + \sum_{i=1}^{h} p^{n+1}_h\big( (c_j + \delta^i_j)_{j=1}^h \big) . \]

• (TR) For every $c \in \Delta^n_h$, the ratio $p^{n+1}_{h+1}(c, 1) / p^n_h(c)$ depends only on $h$ and $n$.

1 https://en.wikipedia.org/wiki/Hockey-stick_identity


Before trying to characterize EPPFs, we would like to consider how many of them we can reasonably expect. If we denote $\tilde{\Delta}|_N := \cup_{n=1}^N \tilde{\Delta}^n$, then the restriction of an EPPF to $\tilde{\Delta}|_N$ is made up of
\[ \sum_{n=1}^{N} |\tilde{\Delta}^n| = \sum_{n=1}^{N} p(n) =: P(N) \]
real numbers.

The conditions imposed by (PR) on the EPPF restricted to $\tilde{\Delta}|_N$ are as many as the elements of $\tilde{\Delta}|_{N-1}$, so there are $P(N-1)$ of them.

On the other hand, the condition (TR) implies the equality of the quantity $p^{n+1}_{h+1}(c, 1) / p^n_h(c)$ among all the $c \in \tilde{\Delta}^n_h$, so for fixed $n$ and $h$ the constraints are $|\tilde{\Delta}^n_h| - 1$. Therefore, in this case, the number of constraints is
\[ \sum_{n=1}^{N-1} \sum_{h=1}^{n} (p_h(n) - 1) = \sum_{n=1}^{N-1} (p(n) - n) = P(N-1) - \frac{N(N-1)}{2} . \]

Finally, the degrees of freedom should be
\[ P(N) - 2P(N-1) + \frac{N(N-1)}{2} = p(N) + \frac{N(N-1)}{2} - P(N-1) . \]

Unfortunately this quantity vanishes at $N = 9$ and is negative for all $N \ge 10$. This seems counterintuitive, since there exist some (indeed uncountably many) EPPFs. A possible reason is that the constraints we have listed may be dependent (even though they seem not to be). Another possible explanation lies in the non-linearity of the (TR) constraints.

(Notice that, on the other hand, the degrees of freedom of the Gibbs-type solution on $\tilde{\Delta}^N$ are $1 + P(N) - P(N-1) = p(N) + 1 > 0$.)
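The count $P(N) - 2P(N-1) + N(N-1)/2$ is easy to tabulate, so its sign can be read off directly for small $N$. The sketch below is an illustration only; the function names are hypothetical, and $p(n)$ is computed with the usual coin-change style dynamic program over part sizes.

```python
def partition_numbers(n_max):
    """Return [p(0), p(1), ..., p(n_max)] using a dynamic program over part sizes."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for n in range(part, n_max + 1):
            p[n] += p[n - part]
    return p


def degrees_of_freedom(N):
    """Evaluate P(N) - 2 P(N-1) + N(N-1)/2, with P(m) = p(1) + ... + p(m)."""
    p = partition_numbers(N)

    def P(m):
        return sum(p[1:m + 1])

    return P(N) - 2 * P(N - 1) + N * (N - 1) // 2


if __name__ == "__main__":
    for N in range(1, 16):
        print(N, degrees_of_freedom(N))
```

Because the (TR) constraints are non-linear and possibly dependent, a negative value here does not by itself rule out solutions; it only reports the naive count.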

Despite this pointless discussion on the degrees of freedom, we have the following result.

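Theorem. The only species sampling models where the probability of observing a value that has never occurred before depends only on $n$ and $h$ are the Gibbs-type processes. In other words, the only non-negative weights $\big( \big( (p^n_h(c))_{c \in \Delta^n_h} \big)_{h=1}^n \big)_{n \ge 1}$ such that:

• (TR) For every $c \in \Delta^n_h$, the ratio $p^{n+1}_{h+1}(c, 1) / p^n_h(c)$ depends only on $h$ and $n$.

• (PR) For every $c \in \Delta^n_h$, we have
\[ p^n_h(c) = p^{n+1}_{h+1}(c, 1) + \sum_{i=1}^{h} p^{n+1}_h\big( (c_j + \delta^i_j)_{j=1}^h \big) . \]

• (EX) For all $\varsigma \in S_h$, $p^n_h\big( (c_j)_{j=1}^h \big) = p^n_h\big( (c_{\varsigma(j)})_{j=1}^h \big)$

are those that can be written as
\[ p^n_h\big( (c_j)_{j=1}^h \big) = V^n_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} \]
for some $\sigma \in \mathbb{R}$ and some coefficients $\big( (V^n_h)_{h=1}^n \big)_{n \ge 1}$ such that

• (CL) For every $n \ge 1$, $h = 1, \dots, n$, we have $V^n_h = V^{n+1}_{h+1} + (n - h\sigma) V^{n+1}_h$.

Here $(x)_k = x (x+1) \cdots (x+k-1)$ denotes the rising factorial, with $(x)_0 = 1$.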
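As a sanity check of the product form, the following Python sketch verifies (CL) and (PR) in exact rational arithmetic for one concrete family of weights. The choice $V^n_h = \prod_{i=1}^{h-1} (\theta + i\sigma) / (\theta + 1)_{n-1}$ (the Pitman-Yor form) is an assumption made here for illustration; it does not appear in the text above, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rising(x, k):
    """Rising factorial (x)_k = x (x+1) ... (x+k-1), with (x)_0 = 1."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out


def V(n, h, theta, sigma):
    """Example weights of Pitman-Yor form (an assumption of this sketch)."""
    num = Fraction(1)
    for i in range(1, h):
        num *= theta + i * sigma
    return num / rising(theta + 1, n - 1)


def eppf(c, theta, sigma):
    """Gibbs-type product form V^n_h * prod_j (1 - sigma)_{c_j - 1}."""
    prod = Fraction(1)
    for cj in c:
        prod *= rising(1 - sigma, cj - 1)
    return V(sum(c), len(c), theta, sigma) * prod


def compositions(n, h):
    """Yield every composition of n into h positive parts."""
    for cuts in combinations(range(1, n), h - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(bounds[k + 1] - bounds[k] for k in range(h))


def check(n_max, theta, sigma):
    """Return True if (CL) and (PR) hold exactly for all n <= n_max."""
    for n in range(1, n_max + 1):
        for h in range(1, n + 1):
            cl_rhs = V(n + 1, h + 1, theta, sigma) + (n - h * sigma) * V(n + 1, h, theta, sigma)
            if V(n, h, theta, sigma) != cl_rhs:
                return False
            for c in compositions(n, h):
                # (PR): open a new cluster, or grow cluster i by one observation.
                pr_rhs = eppf(c + (1,), theta, sigma)
                for i in range(h):
                    pr_rhs += eppf(c[:i] + (c[i] + 1,) + c[i + 1:], theta, sigma)
                if eppf(c, theta, sigma) != pr_rhs:
                    return False
    return True


if __name__ == "__main__":
    print(check(6, Fraction(1), Fraction(1, 2)))
```

Using `Fraction` avoids floating-point tolerance questions: the identities either hold exactly or the check fails.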


Proof. First notice that the result trivially holds for $p^1_1(1)$. We then proceed by induction on $n$; that is, assuming the result holds for all the partitions, of any length, of any number up to $n = N$, we prove that there must be coefficients $\{V^{N+1}_h\}_{h=1}^{N+1}$ satisfying condition (CL) and such that, for any $h = 1, \dots, N+1$ and $c \in \Delta^{N+1}_h$,
\[ \text{(TH)} \qquad p^{N+1}_h(c) = V^{N+1}_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} . \]

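Because of assumption (TR) and the inductive hypothesis, for any $h \in \{2, \dots, N+1\}$ and for all $c = (c_1, \dots, c_{h-1}, 1) \in \Delta^{N+1}_h$ we must have
\[ \frac{p^{N+1}_h(c_1, \dots, c_{h-1}, 1)}{p^N_{h-1}(c_1, \dots, c_{h-1})} = C_h \]
for some positive constants $\{C_h\}_{h=2}^{N+1}$. Therefore,
\[ p^{N+1}_h(c_1, \dots, c_{h-1}, 1) = V^N_{h-1} C_h \prod_{j=1}^{h-1} (1 - \sigma)_{c_j - 1} = V^N_{h-1} C_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
where the last equality holds because $c_h = 1$ and $(1 - \sigma)_0 = 1$. Finally we define $V^{N+1}_h := V^N_{h-1} C_h$ for any $h = 2, \dots, N+1$.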

So, by (EX), for any $h = 2, \dots, N+1$ and for any partition $(c_j)_{j=1}^h$ of $N+1$ with a 1 in it, we must have
\[ \forall c \in \Delta^{N+1}_h : \min(c) = 1, \qquad p^{N+1}_h\big( (c_j)_{j=1}^h \big) = V^{N+1}_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} . \qquad (2) \]
This is precisely (TH) for all partitions $c$ of $N+1$ such that $\min(c) = 1$. The problem now is to identify $p$ for all those partitions that do not have any 1 in them, and to verify equation (CL).

In this respect, we can also define $V^{N+1}_1 := (V^N_1 - V^{N+1}_2) / (N - \sigma)$, so that we recover (CL) for $h = 1$.

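Using (PR), the inductive hypothesis and (2), for any partition $(c_1, \dots, c_h)$ of $N$ we must have that
\[ p^N_h(c_1, \dots, c_h) = p^{N+1}_{h+1}(c_1, \dots, c_h, 1) + \sum_{i=1}^{h} p^{N+1}_h(c_1, \dots, c_i + 1, \dots, c_h) , \]
\[ \sum_{i=1}^{h} p^{N+1}_h\big( (c_j + \delta^i_j)_{j=1}^h \big) = \left[ V^N_h - V^{N+1}_{h+1} \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} . \]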

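Now we focus our attention on the partitions $c \in \Delta^N_h$ that terminate with 1 (for some $h \in \{2, \dots, N\}$). In this case, for each $i = 1, \dots, h-1$, $(c_j + \delta^i_j)_{j=1}^h$ is a partition of $N+1$ terminating with 1, while $(c_j + \delta^h_j)_{j=1}^h$ is a generic partition of $N+1$ terminating with 2. In this case, thanks to (2), this last equation reads
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, 2 \big) + \sum_{i=1}^{h-1} p^{N+1}_h\big( (c_j + \delta^i_j)_{j=1}^{h-1}, 1 \big) = \left[ V^N_h - V^{N+1}_{h+1} \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, 2 \big) + \sum_{i=1}^{h-1} V^{N+1}_h \prod_{j=1}^{h-1} (1 - \sigma)_{c_j + \delta^i_j - 1} = \left[ V^N_h - V^{N+1}_{h+1} \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, 2 \big) + V^{N+1}_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} \sum_{i=1}^{h-1} (c_i - \sigma) = \left[ V^N_h - V^{N+1}_{h+1} \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, 2 \big) = \left[ V^N_h - V^{N+1}_{h+1} - V^{N+1}_h \big( N - 1 - (h-1)\sigma \big) \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} . \]
Here the third line uses $(1 - \sigma)_{c_i} = (1 - \sigma)_{c_i - 1} (c_i - \sigma)$, and the last line uses $\sum_{i=1}^{h-1} c_i = N - 1$.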

Now, by exchangeability, if some component of $(c_j)_{j=1}^{h-1}$ is equal to one (notice that it is always possible to find a partition of $N$ of rank $h$ with two elements equal to one as long as $N > 2$ and $h = 3, \dots, N$, or $N = 2$ and $h = 2$), then we must have
\[ V^{N+1}_h (1 - \sigma) \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} = \left[ V^N_h - V^{N+1}_{h+1} - V^{N+1}_h \big( N - 1 - (h-1)\sigma \big) \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
so that we recover that the relation $V^N_h = V^{N+1}_{h+1} + (N - h\sigma) V^{N+1}_h$ must hold for every $h \ge 3$.

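If, on the other hand, all the $(c_j)_{j=1}^{h-1}$ are greater than 1, then, calling $(c^*_j)_{j=1}^h = \big( (c_j)_{j=1}^{h-1}, 2 \big)$, we have
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, 2 \big) = V^{N+1}_h (1 - \sigma) \prod_{j=1}^{h-1} (1 - \sigma)_{c_j - 1} = V^{N+1}_h \prod_{j=1}^{h} (1 - \sigma)_{c^*_j - 1} . \]
And this proves (TH) for all partitions $c$ of $N+1$ such that $\min(c) = 2$.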

Lemma. The EPPF can always be written as
\[ p^n_h\big( (c_j)_{j=1}^h \big) = V^n_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} \]
for any $c \in \Delta^{N+1}_h$ with $u = \min(c) < N + 1$.

Proof. We proceed by induction on the value $u$ of the last entry of a partition $c = \big( (c_j)_{j=1}^{h-1}, u \big)$ of $N$. (Notice that $(c_j)_{j=1}^{h-1}$ is a partition of $N - u$, and therefore its length must be $h - 1 \in \{1, \dots, N - u\}$, so that $h \in \{2, \dots, N + 1 - u\}$. In particular, this argument cannot be carried out when $N + 1 - u = 2$, that is, when $u = N - 1$.)


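By inductive hypothesis and (?), we have the following:
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, u + 1 \big) + \sum_{i=1}^{h-1} p^{N+1}_h\big( (c_j + \delta^i_j)_{j=1}^{h-1}, u \big) = \left[ V^N_h - V^{N+1}_{h+1} \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, u + 1 \big) + V^{N+1}_h \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} \sum_{i=1}^{h-1} (c_i - \sigma) = \left[ V^N_h - V^{N+1}_{h+1} \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, u + 1 \big) = \left[ V^N_h - V^{N+1}_{h+1} - V^{N+1}_h \big( N - u - (h-1)\sigma \big) \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
where $c_h = u$ in the products.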

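Now, if there is some $j \in \{1, \dots, h-1\}$ such that $c_j \le u$, then, by inductive hypothesis and (EX), we find that
\[ V^{N+1}_h (u - \sigma) \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} = \left[ V^N_h - V^{N+1}_{h+1} - V^{N+1}_h \big( N - u - (h-1)\sigma \big) \right] \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} , \]
from which we recover once again the relation $V^N_h = V^{N+1}_{h+1} + V^{N+1}_h (N - h\sigma)$ for all $h \in \{2, \dots, N + 1 - u\}$. If, on the contrary, $c_j > u$ for each $j = 1, \dots, h-1$, then we get
\[ p^{N+1}_h\big( (c_j)_{j=1}^{h-1}, u + 1 \big) = V^{N+1}_h (u - \sigma) \prod_{j=1}^{h} (1 - \sigma)_{c_j - 1} . \]
And this proves the lemma.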

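The only partition that has been left behind is $p^{N+1}_1(N + 1)$. In this case, because of (PR) and our definition of $V^{N+1}_1$, we must have
\[ p^N_1(N) = p^{N+1}_1(N + 1) + p^{N+1}_2(N, 1) , \]
\[ p^{N+1}_1(N + 1) = \left[ V^N_1 - V^{N+1}_2 \right] (1 - \sigma)_{N - 1} , \]
\[ p^{N+1}_1(N + 1) = V^{N+1}_1 (1 - \sigma)_{N + 1 - 1} . \]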

Now that we have found that (TH) holds for every partition of $N + 1$, it is immediate to see that (CL) is equivalent to (PR), and therefore (CL) must hold also for $h = 2$.

Finally, it is easy to see that the only Gibbs-type process where the probability of observing a value that has never occurred before depends only on $n$ is the Dirichlet Process.

Theorem. The only non-negative weights $\{\{V^n_h\}_{h=1}^n\}_{n \ge 1}$ such that

• $V^1_1 = 1$,

• for any $n \ge 1$ and $h \in \{1, \dots, n\}$, $V^n_h = V^{n+1}_{h+1} + n V^{n+1}_h$, and

• for any $n \ge 1$ and $h \in \{1, \dots, n\}$, $V^{n+1}_{h+1} / V^n_h$ is constant in $h$,

can be expressed as
\[ V^n_h = \frac{\alpha^{h-1}}{(1 + \alpha)_{n-1}} \]
for some $\alpha \in [0, \infty]$.

Proof. First of all, we notice that the formula holds for all the pairs $(n, h)$ with $n \le 1$ (there is only one such pair of indices, namely $(1, 1)$). We then proceed by induction: let us assume that $V^n_h$ has the specified form for some $\alpha \in [0, \infty)$ up to $n = N$. If we call (assuming all weights to be positive) $\Omega = V^{N+1}_{h+1} / V^N_h$, we have by hypothesis that
\[ V^N_N = V^{N+1}_{N+1} + N V^{N+1}_N , \]
\[ V^N_N = \Omega V^N_N + N V^{N+1}_N . \]
So we have these two relations:
\[ V^{N+1}_N = \Omega V^N_{N-1} = \frac{1 - \Omega}{N} V^N_N . \]

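But this forces $\Omega = \alpha / (\alpha + N)$. This, in turn, implies that, for $h = 2, \dots, N + 1$,
\[ V^{N+1}_h = \Omega V^N_{h-1} = \frac{\alpha}{\alpha + N} \cdot \frac{\alpha^{h-2}}{(1 + \alpha)_{N-1}} = \frac{\alpha^{h-1}}{(1 + \alpha)_{N+1-1}} , \]
and finally, for $h = 1, \dots, N$,
\[ V^{N+1}_h = \frac{1 - \Omega}{N} V^N_h = \frac{1}{\alpha + N} \cdot \frac{\alpha^{h-1}}{(1 + \alpha)_{N-1}} = \frac{\alpha^{h-1}}{(1 + \alpha)_{N+1-1}} . \]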

Notice that this shows not only the uniqueness of the solution, but also that our proposed solution is indeed a solution.

Finally, by selecting $\alpha = \infty$, we get the degenerate case where some weights are zero and the previous argument does not apply.
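The three conditions of the theorem can be verified mechanically for the closed form $V^n_h = \alpha^{h-1} / (1 + \alpha)_{n-1}$. The sketch below is only an illustration with hypothetical function names; it uses exact rational arithmetic and a finite, positive $\alpha$.

```python
from fractions import Fraction


def rising(x, k):
    """Rising factorial (x)_k = x (x+1) ... (x+k-1), with (x)_0 = 1."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out


def V(n, h, alpha):
    """Closed form alpha^(h-1) / (1 + alpha)_(n-1)."""
    return alpha ** (h - 1) / rising(1 + alpha, n - 1)


def check(alpha, n_max):
    """Check the normalisation, the recursion and the constant-ratio condition."""
    if V(1, 1, alpha) != 1:
        return False
    for n in range(1, n_max + 1):
        ratios = set()
        for h in range(1, n + 1):
            if V(n, h, alpha) != V(n + 1, h + 1, alpha) + n * V(n + 1, h, alpha):
                return False
            ratios.add(V(n + 1, h + 1, alpha) / V(n, h, alpha))
        # The ratio must not depend on h; it should equal alpha / (alpha + n).
        if ratios != {alpha / (alpha + n)}:
            return False
    return True


if __name__ == "__main__":
    print(check(Fraction(3, 2), 10))
```

The common ratio found by the check is the quantity called $\Omega$ in the proof.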

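Definition. The set of all partitions on $n$ letters, denoted by $\Pi_n$, is defined as
\[ \Pi_n := \left\{ \pi \subseteq 2^{[n]} : \cup \pi = [n] \wedge \forall \pi_i, \pi_j \in \pi \; (\pi_i \ne \pi_j \Rightarrow \pi_i \cap \pi_j = \emptyset) \right\} . \]
Every partition is thus a set containing mutually disjoint and collectively exhaustive subsets of $[n]$.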

We shall order the elements of each $\pi \in \Pi_n$ according to their least element. If $|\pi| = h$, then for $j = 1$ to $h$, the cardinality of $\pi_j$ will be denoted by $c_j$ and its elements will be denoted as
\[ \pi_j = \{ \ell^i_j \}_{i=1}^{c_j} . \]
Given this notation it is evident that, for each partition $\pi \in \Pi_n$, $(c_j)_{j=1}^h$ is a composition in $\Delta^n_h$. Conversely, to each composition $c \in \Delta^n_h$ there will correspond (in general) multiple partitions; for instance, both the (distinct) partitions $\{\{1, 2\}, \{3\}, \{4\}\}$ and $\{\{1, 3\}, \{2\}, \{4\}\}$ correspond to the same composition $(2, 1, 1)$. It is interesting to find the cardinality of the set of partitions that correspond to a specific composition $c \in \Delta^n_h$.


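Theorem. Given a composition $c \in \Delta^n_h$, the number of partitions $\pi \in \Pi_n$ corresponding to $c$ is
\[ \prod_{j=1}^{h} \binom{n - 1 - \sum_{i<j} c_i}{c_j - 1} . \]

Proof. The block $\pi_j$ must contain the least element not covered by $\pi_1, \dots, \pi_{j-1}$, and its remaining $c_j - 1$ elements can be chosen freely among the other $n - 1 - \sum_{i<j} c_i$ elements not yet covered.

Given this we have that
\[ B_n := |\Pi_n| = \sum_{h=1}^{n} \sum_{c \in \Delta^n_h} \prod_{j=1}^{h} \binom{n - 1 - \sum_{i<j} c_i}{c_j - 1} . \]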

On the other hand, from any partition $\pi \in \Pi_n$ we can get exactly $|\pi| + 1$ distinct partitions of $[n + 1]$, by inserting $n + 1$ into one of the elements of $\pi$ or by adding $\{n + 1\}$ at the end of $\pi$. Therefore, we have the recurrence relation
\[ |\Pi_{n+1}| = \sum_{\pi \in \Pi_n} (|\pi| + 1) = |\Pi_n| + \sum_{\pi \in \Pi_n} |\pi| . \]
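The insertion argument behind this recurrence translates directly into a generator of set partitions. The Python sketch below is illustrative only: the function names are hypothetical, and the product-of-binomials count in `count_by_formula` is an assumption of the sketch that is compared against brute-force enumeration rather than taken for granted.

```python
from collections import Counter
from math import comb


def set_partitions(n):
    """Yield every partition of {1, ..., n} as a list of blocks ordered by least element."""
    if n == 0:
        yield []
        return
    for smaller in set_partitions(n - 1):
        # Insert n into one of the existing blocks ...
        for k in range(len(smaller)):
            yield smaller[:k] + [smaller[k] + [n]] + smaller[k + 1:]
        # ... or append the singleton {n}, whose least element is the largest so far.
        yield smaller + [[n]]


def count_by_formula(c):
    """Assumed closed form: each block takes the least free element plus c_j - 1 others."""
    remaining, total = sum(c), 1
    for cj in c:
        total *= comb(remaining - 1, cj - 1)
        remaining -= cj
    return total


def composition_counts(n):
    """Count set partitions of {1, ..., n} by their composition of block sizes."""
    return Counter(tuple(len(block) for block in pi) for pi in set_partitions(n))


if __name__ == "__main__":
    for n in range(1, 7):
        bell = sum(1 for _ in set_partitions(n))
        grown = sum(len(pi) + 1 for pi in set_partitions(n))
        assert grown == sum(1 for _ in set_partitions(n + 1))
        assert all(count_by_formula(c) == k for c, k in composition_counts(n).items())
        print(n, bell)
```

Each partition of $\{1, \dots, n\}$ yields exactly $|\pi| + 1$ children, which is the recurrence stated above.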

1. Useless Appendix

If we denote by $p(n)$ the number of distinct partitions of the number $n$, we have that
\[ \alpha(z) = \sum_{n=0}^{\infty} p(n) z^n = \prod_{n=1}^{\infty} \frac{1}{1 - z^n} , \qquad |z| < 1 . \]

The aim of this appendix is to study the behavior of $\alpha(z)$ on the boundary of its domain. First of all, we can immediately see that if $z = \exp(2\pi i r)$ for some rational number $r \in \mathbb{Q}$, say $r = p/q$ with $p, q$ coprime, then for every natural number $n$ we have $z^{nq} = 1$, and $\alpha(z)$ will not be defined.

If instead $z = \exp(2\pi i x)$ for some $x \in \mathbb{R} \setminus \mathbb{Q}$, then $z^n$ will never be equal to one and each factor in $\alpha(z)$ will be well defined.

Theorem. If $\gcd(p, q) = 1$, then
\[ \lim_{\rho \to 1} \left[ (1 - \rho) \ln \prod_{n=1}^{\infty} \frac{1}{1 - (\rho e^{2\pi i p/q})^n} \right] = \frac{\pi^2}{6 q^2} . \]


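Proof.
\[
\begin{aligned}
\lim_{\rho \to 1} \left[ (1 - \rho) \ln \prod_{n=1}^{\infty} \frac{1}{1 - (\rho e^{2\pi i p/q})^n} \right]
&= - \lim_{\rho \to 1} \left[ (1 - \rho) \sum_{n=1}^{\infty} \ln\!\left( 1 - (\rho e^{2\pi i p/q})^n \right) \right] \\
&= \lim_{\rho \to 1} \left[ (1 - \rho) \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \frac{(\rho e^{2\pi i p/q})^{nk}}{k} \right] \\
&= \lim_{\rho \to 1} \left[ (1 - \rho) \sum_{k=1}^{\infty} \frac{1}{k} \sum_{n=1}^{\infty} (\rho e^{2\pi i p/q})^{nk} \right] \\
&= \lim_{\rho \to 1} \left[ (1 - \rho) \sum_{k=1}^{\infty} \frac{1}{k} \cdot \frac{1}{1 - (\rho e^{2\pi i p/q})^k} \right] \\
&= \sum_{k=1}^{\infty} \frac{1}{k} \lim_{\rho \to 1} \left( \frac{1 - e^{2\pi i k p/q}}{1 - \rho} + e^{2\pi i k p/q} \sum_{n=0}^{k-1} \rho^n \right)^{-1}
\end{aligned}
\]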

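But, if $p$ and $q$ are coprime, we have that
\[ \lim_{\rho \to 1} \left[ \frac{1 - e^{2\pi i k p/q}}{1 - \rho} + e^{2\pi i k p/q} \sum_{n=0}^{k-1} \rho^n \right] = \begin{cases} k e^{2\pi i k p/q} & \text{if } k/q \in \mathbb{N} \\ \infty & \text{otherwise} \end{cases} = \begin{cases} k & \text{if } k/q \in \mathbb{N} \\ \infty & \text{otherwise} \end{cases} \]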

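And then, finally,
\[ \lim_{\rho \to 1} \left[ (1 - \rho) \ln \prod_{n=1}^{\infty} \frac{1}{1 - (\rho e^{2\pi i p/q})^n} \right] = \sum_{k=1}^{\infty} \frac{1}{k^2} \mathbb{1}_{k/q \in \mathbb{N}} = \sum_{n=1}^{\infty} \frac{1}{(qn)^2} = \frac{\pi^2}{6 q^2} . \]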

This also implies that $\lim_{\rho \to 1} (1 - \rho) \ln\big( \alpha(\rho \exp(2\pi i x)) \big) = 0$ if $x \in \mathbb{R} \setminus \mathbb{Q}$.
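The rational-angle limit can be probed numerically by truncating the infinite product at a value of $\rho$ close to 1. The sketch below is a rough illustration, not a proof: the function name, the choice $\rho = 0.999$ and the truncation at 40000 factors are all assumptions, and convergence in $\rho$ is slow, so only approximate agreement with $\pi^2 / (6 q^2)$ should be expected.

```python
import cmath
from math import pi


def scaled_log_product(rho, p, q, n_terms=40000):
    """Approximate (1 - rho) * ln prod_{n>=1} 1 / (1 - (rho e^{2 pi i p/q})^n)."""
    z = rho * cmath.exp(2j * pi * p / q)
    total = 0j
    w = 1 + 0j
    for _ in range(n_terms):
        w *= z  # w runs through z, z^2, z^3, ...
        # |w| < 1, so 1 - w has positive real part and the principal log is the right branch.
        total -= cmath.log(1 - w)
    return (1 - rho) * total


if __name__ == "__main__":
    for p, q in [(1, 1), (1, 2), (1, 3)]:
        value = scaled_log_product(0.999, p, q)
        print(q, value.real, pi ** 2 / (6 * q ** 2))
```

Increasing $\rho$ toward 1 requires a correspondingly larger `n_terms`, since the factors only become negligible once $\rho^n$ is small.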

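It would be interesting to see whether $\alpha(\exp(2\pi i x))$ is defined or not for $x$ irrational. For a normal number $x$, we could heuristically approximate $\alpha(z)$ in the following way:
\[ (z^n)_{n \ge 1} \overset{iid}{\sim} \mathrm{Unif}_{|z| = 1} , \]
\[ \Im \frac{1}{1 - z^n} \overset{iid}{\sim} \mathrm{Cauchy}(1/4) , \]
\[ \Re \frac{1}{1 - z^n} = 1/2 . \]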


Figure 1. A plot of $(1 - \rho) \ln \prod_{n=1}^{\infty} \frac{1}{1 - (\rho e^{2\pi i p/q})^n}$ when $\rho = 1/100$.

Here, by $\mathrm{Cauchy}(1/4)$ we mean the distribution with pdf
\[ \frac{1}{2\pi} \cdot \frac{1}{1/4 + t^2} . \]
But then we have that $E(1 / (1 - z^n)) = 1/2$ (in the sense of the Cauchy principal value). This argument suggests that $\alpha(\exp(2\pi i x))$ will be zero for any normal $x$.
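The two facts used in this heuristic, that $1 / (1 - z)$ has real part $1/2$ on the unit circle and imaginary part $\tfrac{1}{2} \cot(\theta / 2)$ for $z = e^{i\theta}$, can be checked pointwise. The short Python sketch below is only an illustration with a hypothetical function name; the cotangent expression is a standard identity and is not taken from the text.

```python
import cmath
from math import tan


def inverse_one_minus(theta):
    """Return 1 / (1 - e^{i theta}) for an angle theta that is not a multiple of 2 pi."""
    return 1 / (1 - cmath.exp(1j * theta))


if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.5, 4.0, 6.0):
        value = inverse_one_minus(theta)
        # Real part should be 1/2; imaginary part should match cot(theta/2) / 2.
        print(theta, value.real, value.imag, 0.5 / tan(theta / 2))
```

If $\theta$ is uniform on $(0, 2\pi)$, then $\tfrac{1}{2} \cot(\theta / 2)$ follows a Cauchy law with scale $1/2$, which is the density written above.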