All content in this area was uploaded by Andrea Aveni on Oct 18, 2020
NEW PROOF OF A CHARACTERIZATION OF THE GIBBS TYPE
PROCESSES
ANDREA AVENI
Abstract. We provide a new proof of the fact that the only Species Sampling Models in which the probability of observing a new value depends only on the sample size $n$ and the number of clusters $h$ are the Gibbs-type processes.
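Definition. For any $n\in\mathbb{N}$ and $h\in[n]$, we denote by
\[ \Delta^n_h := \Big\{c\in\mathbb{N}^h : \sum_{j=1}^h c_j = n\Big\} \]
the set of all compositions of $n$ of rank (or length) $h$.
We also write $\Delta^n = \bigcup_{h=1}^n \Delta^n_h$ and $\Delta = \bigcup_{n\ge 1}\Delta^n$.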
Definition. Given two elements $c, d\in\Delta^n_h$, we say that
\[ c\sim d \iff \exists\,\sigma\in S_h : (c_j)_{j=1}^h = (d_{\sigma(j)})_{j=1}^h. \]
Since $S_h$ is a group, $\sim$ is an equivalence relation, and we define
\[ \tilde\Delta^n_h = \Delta^n_h/\sim, \]
the set of all number-theoretic partitions of the number $n$ of rank $h$.
Theorem. $|\Delta^n_h| = \binom{n-1}{h-1}$, so that $|\Delta^n| = 2^{n-1}$.
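Proof. We classify the compositions $c$ in $\Delta^n_h$ according to their first element. If the first element $c_1$ of $c$ is $u\in[n]$, then $(c_j)_{j=2}^h$ must belong to $\Delta^{n-u}_{h-1}$; moreover, for any $c\in\Delta^{n-u}_{h-1}$ the composition $(u,c)$ (if it exists) is a unique element of $\Delta^n_h$. In other words, there is a bijection between $\Delta^{n-u}_{h-1}$ and the elements of $\Delta^n_h$ starting with $u$; therefore, for $h\ge 2$, we must have
\[ |\Delta^n_h| = \sum_{u=1}^{n} |\Delta^{n-u}_{h-1}| = \sum_{u=1}^{n-(h-1)} |\Delta^{n-u}_{h-1}|, \tag{1} \]
where the second equality holds because $\Delta^{n-u}_{h-1}$ is empty whenever $n-u<h-1$.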
Now we prove the result by induction on $n$. The case $n=1$ holds since $|\Delta^1_1| = |\{(1)\}| = 1 = \binom{0}{0}$. Now we assume that the formula $|\Delta^n_h| = \binom{n-1}{h-1}$ holds for any $(h,n)$ such that $n<N$, and we prove that it also holds for any $(h,N)$ with $h=1,\dots,N$.
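For any such $h=2,\dots,N$ we have, by (1) and by the inductive hypothesis,
\[ |\Delta^N_h| = \sum_{u=1}^{N-(h-1)} |\Delta^{N-u}_{h-1}| = \sum_{j=h-2}^{N-2} |\Delta^{j+1}_{h-1}| = \sum_{j=h-2}^{N-2} \binom{j}{h-2} = \binom{N-1}{h-1}, \]
where the second equality is the substitution $j=N-u-1$ and the last step is due to the hockey-stick identity (see footnote 1). Finally, if $h=1$ we always have $|\Delta^n_1| = |\{(n)\}| = 1$, so, in particular, $|\Delta^N_1| = 1$. This completes the proof.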
By contrast, the cardinality of $\tilde\Delta^n_h$ is much harder to analyze, and we will simply denote it by $p_h(n) := |\tilde\Delta^n_h|$, with $p(n) := \sum_{h=1}^n p_h(n) = |\tilde\Delta^n|$ and $p(0) := 1$.
The few known facts about these functions are the following:
\[ \sum_{n=0}^{\infty} p(n) z^n = \prod_{n=1}^{\infty} \frac{1}{1-z^n}. \]
This function, defined for $|z|<1$, has many interesting properties; for instance,
\[ \gcd(p,q)=1 \implies \lim_{\rho\to 1}\left[(1-\rho)\ln\prod_{n=1}^{\infty}\frac{1}{1-(\rho e^{2\pi i p/q})^n}\right] = \frac{\pi^2}{6q^2}, \]
as I prove in the appendix.
Moreover, an asymptotic formula is available:
\[ p(n) = \Theta\!\left((4n\sqrt{3})^{-1}\exp\!\big(\pi\sqrt{2n/3}\big)\right). \]
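As an illustration, the quantities $p_h(n)$ and $p(n)$ can be tabulated with a standard recurrence and compared with the asymptotic expression. This is a sketch under stated assumptions: the recurrence $p_h(n) = p_{h-1}(n-1) + p_h(n-h)$ is a classical fact that the text does not use, and the function names are introduced here for illustration only.

```python
from math import exp, pi, sqrt

def partition_table(n_max):
    """Return a table with table[h][n] = number of partitions of n into exactly h parts."""
    table = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    table[0][0] = 1  # the empty partition, matching the convention p(0) = 1
    for h in range(1, n_max + 1):
        for n in range(h, n_max + 1):
            # Either some part equals 1 (remove it), or every part is >= 2 (subtract 1 from each).
            table[h][n] = table[h - 1][n - 1] + table[h][n - h]
    return table

def p_total(n, table):
    """p(n): sum of p_h(n) over all ranks h."""
    return sum(table[h][n] for h in range(0, n + 1))

def asymptotic(n):
    """The expression (4 n sqrt(3))^(-1) exp(pi sqrt(2n/3))."""
    return exp(pi * sqrt(2 * n / 3)) / (4 * n * sqrt(3))

table = partition_table(100)
for n in (10, 50, 100):
    print(n, p_total(n, table), p_total(n, table) / asymptotic(n))
```

The printed ratio stays bounded, which is all that the $\Theta$ statement asserts.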
Now we define the EPPF.
Definition. An EPPF is a function from $\tilde\Delta$ to the real numbers (whose restrictions to $\tilde\Delta^n_h$ are denoted by $p^n_h$) such that
• (PR) For every $c\in\Delta^n_h$, we have
\[ p^n_h(c) = p^{n+1}_{h+1}(c,1) + \sum_{i=1}^{h} p^{n+1}_h\big((c_j+\delta^i_j)_{j=1}^h\big), \]
where $\delta^i_j$ is the Kronecker delta.
• (TR) For every $c\in\Delta^n_h$, the ratio $p^{n+1}_{h+1}(c,1)/p^n_h(c)$ depends only on $h$ and $n$.
1. https://en.wikipedia.org/wiki/Hockey-stick_identity
Before trying to characterize EPPFs, we would like to consider how many of them we can reasonably expect. If we denote $\tilde\Delta|_N := \bigcup_{n=1}^{N}\tilde\Delta^n$, then the restriction of an EPPF to $\tilde\Delta|_N$ is made up of
\[ \sum_{n=1}^{N}|\tilde\Delta^n| = \sum_{n=1}^{N} p(n) =: P(N) \]
real numbers.
The number of conditions imposed by (PR) on the EPPF restricted to $\tilde\Delta|_N$ equals the number of elements of $\tilde\Delta|_{N-1}$, so there are $P(N-1)$ of them.
On the other hand, condition (TR) implies the equality of the quantity $p^{n+1}_{h+1}(c,1)/p^n_h(c)$ among all $c\in\tilde\Delta^n_h$, so for fixed $n$ and $h$ there are $|\tilde\Delta^n_h|-1$ constraints. Therefore, in this case, the number of constraints is
\[ \sum_{n=1}^{N-1}\sum_{h=1}^{n}\big(p_h(n)-1\big) = \sum_{n=1}^{N-1}\big(p(n)-n\big) = P(N-1) - \frac{N(N-1)}{2}. \]
Finally, the degrees of freedom should be
\[ P(N) - 2P(N-1) + \frac{N(N-1)}{2} = p(N) + \frac{N(N-1)}{2} - P(N-1). \]
Unfortunately, this quantity is negative for all $N>9$ (it vanishes at $N=9$). This seems counterintuitive, since there exist some (indeed uncountably many) EPPFs. A possible reason is that the constraints we have listed may be dependent (even though they do not appear to be). Another possible explanation lies in the non-linearity of the (TR) constraints.
(Notice that, on the other hand, the degrees of freedom of the Gibbs-type solution on $\tilde\Delta^N$ are $1 + P(N) - P(N-1) = p(N)+1 > 0$.)
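The degrees-of-freedom count is easy to tabulate. The sketch below is an illustration only; the function names are assumptions introduced here. It evaluates $p(N) + N(N-1)/2 - P(N-1)$ for small $N$, next to the Gibbs-type count $p(N)+1$.

```python
def partition_numbers(n_max):
    """Return [p(0), ..., p(n_max)] using the standard coin-change style recurrence."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        for n in range(part, n_max + 1):
            p[n] += p[n - part]
    return p

def degrees_of_freedom(N, p):
    """Evaluate p(N) + N(N-1)/2 - P(N-1), where P(m) = p(1) + ... + p(m)."""
    P_prev = sum(p[1:N])
    return p[N] + N * (N - 1) // 2 - P_prev

p = partition_numbers(15)
for N in range(1, 16):
    # Columns: N, naive count of degrees of freedom, Gibbs-type count p(N) + 1.
    print(N, degrees_of_freedom(N, p), p[N] + 1)
```

Printing the table shows where the naive count changes sign, while the Gibbs-type count keeps growing.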
Despite this pointless discussion of the degrees of freedom, we have the following result.
Theorem. The only species sampling models where the probability of observing a value that has never occurred before depends only on $n$ and $h$ are the Gibbs-type processes. In other words, any non-negative weights $(((p^n_h(c))_{c\in\Delta^n_h})_{h=1}^n)_{n\ge 1}$ such that:
• (TR) For every $c\in\Delta^n_h$, the ratio $p^{n+1}_{h+1}(c,1)/p^n_h(c)$ depends only on $h$ and $n$.
• (PR) For every $c\in\Delta^n_h$, we have
\[ p^n_h(c) = p^{n+1}_{h+1}(c,1) + \sum_{i=1}^{h} p^{n+1}_h\big((c_j+\delta^i_j)_{j=1}^h\big). \]
• (EX) For all $\varsigma\in S_h$, $p^n_h((c_j)_{j=1}^h) = p^n_h((c_{\varsigma(j)})_{j=1}^h)$.
can be written as
\[ p^n_h((c_j)_{j=1}^h) = V^n_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1} \]
for some $\sigma\in\mathbb{R}$ and for some coefficients $((V^n_h)_{h=1}^n)_{n\ge 1}$ such that
• (CL) For every $n\ge 1$, $h=1,\dots,n$, we have $V^n_h = V^{n+1}_{h+1} + (n-h\sigma)V^{n+1}_h$.
Here $(x)_k := x(x+1)\cdots(x+k-1)$ denotes the rising factorial, with $(x)_0 = 1$.
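To make the statement concrete, the sketch below checks (PR), (TR) and (CL) in exact arithmetic for one example of Gibbs-type weights. The weights used, $V^n_h = \prod_{i=1}^{h-1}(\theta+i\sigma)/(\theta+1)_{n-1}$ (the two-parameter family), are an assumption brought in for illustration and do not appear in the text; the function names are likewise hypothetical.

```python
from fractions import Fraction

def rising(x, k):
    """Rising factorial (x)_k = x (x+1) ... (x+k-1), with (x)_0 = 1."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def V(n, h, sigma, theta):
    """Assumed example weights: prod_{i=1}^{h-1} (theta + i sigma) / (theta + 1)_{n-1}."""
    num = Fraction(1)
    for i in range(1, h):
        num *= theta + i * sigma
    return num / rising(theta + 1, n - 1)

def eppf(c, sigma, theta):
    """Gibbs-type form p^n_h(c) = V^n_h * prod_j (1 - sigma)_{c_j - 1}."""
    n, h = sum(c), len(c)
    prod = Fraction(1)
    for cj in c:
        prod *= rising(1 - sigma, cj - 1)
    return V(n, h, sigma, theta) * prod

def check_pr(c, sigma, theta):
    """(PR): p(c) = p(c, 1) + sum over i of p(c with the i-th entry increased by one)."""
    rhs = eppf(c + (1,), sigma, theta)
    for i in range(len(c)):
        bumped = c[:i] + (c[i] + 1,) + c[i + 1:]
        rhs += eppf(bumped, sigma, theta)
    return eppf(c, sigma, theta) == rhs

sigma, theta = Fraction(1, 3), Fraction(2)
assert check_pr((3, 1, 2), sigma, theta)
# (TR): the ratio p(c, 1) / p(c) agrees for two compositions with the same n and h.
r1 = eppf((3, 1, 2, 1), sigma, theta) / eppf((3, 1, 2), sigma, theta)
r2 = eppf((4, 1, 1, 1), sigma, theta) / eppf((4, 1, 1), sigma, theta)
assert r1 == r2
# (CL): V^n_h = V^{n+1}_{h+1} + (n - h sigma) V^{n+1}_h.
assert V(6, 3, sigma, theta) == V(7, 4, sigma, theta) + (6 - 3 * sigma) * V(7, 3, sigma, theta)
```

Using `Fraction` keeps every identity exact, so the checks are equalities rather than floating-point comparisons.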
Proof. First notice that the result trivially holds for $p^1_1(1)$; we then proceed by induction on $n$. That is, assuming the result holds for all the partitions of any length of any number up to $n=N$, we prove that there must be coefficients $\{V^{N+1}_h\}_{h=1}^{N+1}$ satisfying condition (CL) and such that, for any $h=1,\dots,N+1$ and $c\in\Delta^{N+1}_h$,
\[ p^{N+1}_h(c) = V^{N+1}_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1}. \tag{TH} \]
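Because of assumption (TR) and the inductive hypothesis, for any $h\in\{2,\dots,N+1\}$ and for all $c=(c_1,\dots,c_{h-1},1)\in\Delta^{N+1}_h$ we must have
\[ \frac{p^{N+1}_h(c_1,\dots,c_{h-1},1)}{p^N_{h-1}(c_1,\dots,c_{h-1})} = C_h \]
for some positive constants $\{C_h\}_{h=2}^{N+1}$. Therefore,
\[ p^{N+1}_h(c_1,\dots,c_{h-1},1) = V^N_{h-1}C_h\prod_{j=1}^{h-1}(1-\sigma)_{c_j-1} = V^N_{h-1}C_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1}, \]
where the last equality holds because $c_h=1$ and $(1-\sigma)_0=1$. Finally, we define $V^{N+1}_h := V^N_{h-1}C_h$ for any $h=2,\dots,N+1$.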
So, by (EX), for any $h=2,\dots,N+1$ and for any partition $(c_j)_{j=1}^h$ of $N+1$ with a 1 in it, we must have
\[ \forall c\in\Delta^{N+1}_h : \min(c)=1, \qquad p^{N+1}_h((c_j)_{j=1}^h) = V^{N+1}_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1}. \tag{2} \]
This is precisely (TH) for all partitions $c$ of $N+1$ such that $\min(c)=1$. The problem now is to identify $p$ for all those partitions that do not contain any 1 and to verify equation (CL).
In this respect, we can also define $V^{N+1}_1 := (V^N_1 - V^{N+1}_2)/(N-\sigma)$, so that we recover (CL) for $h=1$.
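Using (PR) and (2), for any partition $(c_1,\dots,c_h)$ of $N$ we must have
\[ p^N_h(c_1,\dots,c_h) = p^{N+1}_{h+1}(c_1,\dots,c_h,1) + \sum_{i=1}^{h} p^{N+1}_h(c_1,\dots,c_i+1,\dots,c_h), \]
\[ \sum_{i=1}^{h} p^{N+1}_h\big((c_j+\delta^i_j)_{j=1}^h\big) = \big[V^N_h - V^{N+1}_{h+1}\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1}. \]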
Now we focus our attention on the partitions $c\in\Delta^N_h$ that terminate with 1 (for some $h\in\{2,\dots,N\}$). In this case, for each $i=1,\dots,h-1$, $(c_j+\delta^i_j)_{j=1}^h$ is a partition of $N+1$ terminating with 1, while $(c_j+\delta^h_j)_{j=1}^h$ is a generic partition of $N+1$ terminating with 2. In this case, thanks to (2), this last equation reads
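\[
\begin{aligned}
p^{N+1}_h\big((c_j)_{j=1}^{h-1},2\big) + \sum_{i=1}^{h-1} p^{N+1}_h\big((c_j+\delta^i_j)_{j=1}^{h-1},1\big) &= \big[V^N_h - V^{N+1}_{h+1}\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1},\\
p^{N+1}_h\big((c_j)_{j=1}^{h-1},2\big) + \sum_{i=1}^{h-1} V^{N+1}_h\prod_{j=1}^{h-1}(1-\sigma)_{c_j+\delta^i_j-1} &= \big[V^N_h - V^{N+1}_{h+1}\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1},\\
p^{N+1}_h\big((c_j)_{j=1}^{h-1},2\big) + V^{N+1}_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1}\sum_{i=1}^{h-1}(c_i-\sigma) &= \big[V^N_h - V^{N+1}_{h+1}\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1},\\
p^{N+1}_h\big((c_j)_{j=1}^{h-1},2\big) &= \big[V^N_h - V^{N+1}_{h+1} - V^{N+1}_h(N-1-(h-1)\sigma)\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1}.
\end{aligned}
\]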
Now, by exchangeability, if some component of $(c_j)_{j=1}^{h-1}$ is equal to one (notice that it is always possible to find a partition of $N$ of rank $h$ with two elements equal to one as long as $N>2$ and $h=3,\dots,N$, or $N=2$ and $h=2$), then we must have
\[ V^{N+1}_h(1-\sigma)\prod_{j=1}^{h}(1-\sigma)_{c_j-1} = \big[V^N_h - V^{N+1}_{h+1} - V^{N+1}_h(N-1-(h-1)\sigma)\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1}, \]
so that we recover that the relation $V^N_h = V^{N+1}_{h+1} + (N-h\sigma)V^{N+1}_h$ must hold for every $h\ge 3$.
If, on the other hand, all the $(c_j)_{j=1}^{h-1}$ are greater than 1, and we call $(c^*_j)_{j=1}^h = ((c_j)_{j=1}^{h-1},2)$, then
\[ p^{N+1}_h\big((c_j)_{j=1}^{h-1},2\big) = V^{N+1}_h(1-\sigma)\prod_{j=1}^{h-1}(1-\sigma)_{c_j-1} = V^{N+1}_h\prod_{j=1}^{h}(1-\sigma)_{c^*_j-1}. \]
This proves (TH) for all partitions $c$ of $N+1$ such that $\min(c)=2$.
Lemma. The EPPF can always be written as
\[ p^{N+1}_h((c_j)_{j=1}^h) = V^{N+1}_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1} \]
for any $c\in\Delta^{N+1}_h$ with $u=\min(c)<N+1$.
Proof. We proceed by induction on the value $u$ of the last entry of a partition $c=((c_j)_{j=1}^{h-1},u)$ of $N$. (Notice that $(c_j)_{j=1}^{h-1}$ is a partition of $N-u$, and therefore its length must be $h-1\in\{1,\dots,N-u\}$, so that $h\in\{2,\dots,N+1-u\}$. In particular, this argument cannot be carried out when $N+1-u=2$, that is, when $u=N-1$.)
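By the inductive hypothesis and (?), we have the following (with $c_h=u$):
\[
\begin{aligned}
p^{N+1}_h\big((c_j)_{j=1}^{h-1},u+1\big) + \sum_{i=1}^{h-1} p^{N+1}_h\big((c_j+\delta^i_j)_{j=1}^{h-1},u\big) &= \big[V^N_h - V^{N+1}_{h+1}\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1},\\
p^{N+1}_h\big((c_j)_{j=1}^{h-1},u+1\big) + V^{N+1}_h\prod_{j=1}^{h}(1-\sigma)_{c_j-1}\sum_{i=1}^{h-1}(c_i-\sigma) &= \big[V^N_h - V^{N+1}_{h+1}\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1},\\
p^{N+1}_h\big((c_j)_{j=1}^{h-1},u+1\big) &= \big[V^N_h - V^{N+1}_{h+1} - V^{N+1}_h(N-u-(h-1)\sigma)\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1}.
\end{aligned}
\]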
Now, if there is some $j\in\{1,\dots,h-1\}$ such that $c_j\le u$, then, by the inductive hypothesis and (EX), we find that
\[ V^{N+1}_h(u-\sigma)\prod_{j=1}^{h}(1-\sigma)_{c_j-1} = \big[V^N_h - V^{N+1}_{h+1} - V^{N+1}_h(N-u-(h-1)\sigma)\big]\prod_{j=1}^{h}(1-\sigma)_{c_j-1}, \]
from which we recover once again the relation $V^N_h = V^{N+1}_{h+1} + V^{N+1}_h(N-h\sigma)$ for all $h\in\{2,\dots,N+1-u\}$. If, on the contrary, $c_j>u$ for each $j=1,\dots,h-1$, then we get
\[ p^{N+1}_h\big((c_j)_{j=1}^{h-1},u+1\big) = V^{N+1}_h(u-\sigma)\prod_{j=1}^{h}(1-\sigma)_{c_j-1}. \]
This proves the lemma.
The only value that has been left out is $p^{N+1}_1(N+1)$. In this case, because of (PR) and our definition of $V^{N+1}_1$, we must have
\[
\begin{aligned}
p^N_1(N) &= p^{N+1}_1(N+1) + p^{N+1}_2(N,1),\\
p^{N+1}_1(N+1) &= \big[V^N_1 - V^{N+1}_2\big](1-\sigma)_{N-1},\\
p^{N+1}_1(N+1) &= V^{N+1}_1(1-\sigma)_{N+1-1}.
\end{aligned}
\]
Now that we have found that (TH) holds for every partition of $N+1$, it is immediate to see that (CL) is equivalent to (PR), and therefore (CL) must also hold for $h=2$.
Finally, it is easy to see that the only Gibbs-type process where the probability of observing a value that has never occurred before depends only on $n$ is the Dirichlet process.
Theorem. The only non-negative weights $\{\{V^n_h\}_{h=1}^n\}_{n\ge 1}$ such that
• $V^1_1 = 1$,
• for any $n\ge 1$ and $h\in\{1,\dots,n\}$, $V^n_h = V^{n+1}_{h+1} + nV^{n+1}_h$, and
• for any $n\ge 1$ and $h\in\{1,\dots,n\}$, $V^{n+1}_{h+1}/V^n_h$ is constant in $h$,
can be expressed as
\[ V^n_h = \frac{\alpha^{h-1}}{(1+\alpha)_{n-1}} \]
for some $\alpha\in[0,\infty]$.
Proof. First of all, we notice that the formula holds for all the pairs $(n,h)$ with $n\le 1$ (there is only one such pair of indices, namely $(1,1)$). We then proceed by induction: let us assume that $V^n_h$ has the specified form for some $\alpha\in[0,\infty)$ up to $n=N$. Then, if we call (assuming all weights to be positive) $\Omega = V^{N+1}_{h+1}/V^N_h$, we have by hypothesis that
\[ V^N_N = V^{N+1}_{N+1} + NV^{N+1}_N, \qquad V^N_N = \Omega V^N_N + NV^{N+1}_N. \]
So we have these two relations:
\[ V^{N+1}_N = \Omega V^N_{N-1} = \frac{1-\Omega}{N}V^N_N. \]
Since the inductive hypothesis gives $V^N_N = \alpha V^N_{N-1}$, this forces $\Omega = \alpha/(\alpha+N)$. This, in turn, implies that, for $h=2,\dots,N+1$,
\[ V^{N+1}_h = \Omega V^N_{h-1} = \frac{\alpha}{\alpha+N}\,\frac{\alpha^{h-2}}{(1+\alpha)_{N-1}} = \frac{\alpha^{h-1}}{(1+\alpha)_{N+1-1}}, \]
and finally, for $h=1,\dots,N$,
\[ V^{N+1}_h = \frac{1-\Omega}{N}V^N_h = \frac{1}{\alpha+N}\,\frac{\alpha^{h-1}}{(1+\alpha)_{N-1}} = \frac{\alpha^{h-1}}{(1+\alpha)_{N+1-1}}. \]
Notice that this not only shows the uniqueness of the solution, but also shows that our proposed solution is indeed a solution.
Finally, by selecting $\alpha=\infty$, we get a degenerate case where some weights are zero and the previous argument does not apply.
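The closed form can be verified directly for a sample value of $\alpha$. The following is a minimal sketch in exact arithmetic, with hypothetical helper names introduced here; it checks the recursion and the fact that the ratio $V^{n+1}_{h+1}/V^n_h$ does not depend on $h$.

```python
from fractions import Fraction

def rising(x, k):
    """Rising factorial (x)_k = x (x+1) ... (x+k-1), with (x)_0 = 1."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out

def V_dirichlet(n, h, alpha):
    """Closed form alpha^(h-1) / (1 + alpha)_{n-1} from the theorem."""
    return alpha ** (h - 1) / rising(1 + alpha, n - 1)

alpha = Fraction(5, 2)  # arbitrary positive value chosen for the check
assert V_dirichlet(1, 1, alpha) == 1
for n in range(1, 8):
    ratios = set()
    for h in range(1, n + 1):
        # Recursion: V^n_h = V^{n+1}_{h+1} + n V^{n+1}_h.
        lhs = V_dirichlet(n, h, alpha)
        rhs = V_dirichlet(n + 1, h + 1, alpha) + n * V_dirichlet(n + 1, h, alpha)
        assert lhs == rhs
        ratios.add(V_dirichlet(n + 1, h + 1, alpha) / lhs)
    # The ratio is constant in h and equals alpha / (alpha + n), the value of Omega in the proof.
    assert ratios == {alpha / (alpha + n)}
```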
Definition. The set of all partitions on $n$ letters, denoted by $\Pi_n$, is defined as
\[ \Pi_n := \Big\{\pi\subseteq 2^{[n]}\setminus\{\emptyset\} : \cup\pi = [n] \wedge \forall\pi_i,\pi_j\in\pi\ (\pi_i\ne\pi_j \Rightarrow \pi_i\cap\pi_j=\emptyset)\Big\}. \]
Every partition is thus a set of mutually disjoint and collectively exhaustive subsets of $[n]$.
We shall order the elements of each $\pi\in\Pi_n$ according to their least element. If $|\pi|=h$, then for $j=1$ to $h$ the cardinality of $\pi_j$ will be denoted by $c_j$ and its elements by
\[ \pi_j = \{\ell^i_j\}_{i=1}^{c_j}. \]
Given this notation, it is evident that, for each partition $\pi\in\Pi_n$, $(c_j)_{j=1}^h$ is a composition in $\Delta^n_h$. Conversely, to each composition $c\in\Delta^n_h$ there will correspond (in general) multiple partitions; for instance, both the (distinct) partitions $\{\{1,2\},\{3\},\{4\}\}$ and $\{\{1,3\},\{2\},\{4\}\}$ correspond to the same composition $(2,1,1)$. It is interesting to find the cardinality of the set of partitions that correspond to a specific composition $c\in\Delta^n_h$.
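Theorem. Given a composition $c\in\Delta^n_h$, the number of partitions $\pi\in\Pi_n$ whose block sizes are, up to reordering, the entries of $c$ is
\[ n!\left[\prod_{j=1}^{h} c_j!\right]^{-1}\left[\prod_{k:\exists j:\,c_j=k} |\{j : c_j=k\}|!\right]^{-1}. \]
Proof. The result is evident.
Given this, we have that
\[ B_n := |\Pi_n| = \sum_{h=1}^{n}\ \sum_{c\in\tilde\Delta^n_h} n!\left[\prod_{j=1}^{h} c_j!\right]^{-1}\left[\prod_{k:\exists j:\,c_j=k} |\{j : c_j=k\}|!\right]^{-1}. \]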
On the other hand, from any partition $\pi\in\Pi_n$ we can get exactly $|\pi|+1$ distinct partitions of $[n+1]$, by inserting $n+1$ into one of the elements of $\pi$ or by adding $\{n+1\}$ at the end of $\pi$. Therefore, we have the recurrence relation
\[ |\Pi_{n+1}| = \sum_{\pi\in\Pi_n}\big(|\pi|+1\big) = |\Pi_n| + \sum_{\pi\in\Pi_n}|\pi|. \]
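The insertion step behind this recurrence translates directly into a recursive enumeration. The sketch below is an illustration, with function names that are assumptions introduced here. It builds every partition of $\{1,\dots,n\}$ with blocks kept in least-element order, reads off the associated composition, and checks the recurrence for small $n$.

```python
def set_partitions(n):
    """Return all partitions of {1, ..., n}; each is a list of blocks ordered by least element."""
    if n == 0:
        return [[]]
    result = []
    for pi in set_partitions(n - 1):
        # Insert n into one of the existing blocks (the least elements are unchanged) ...
        for i in range(len(pi)):
            result.append(pi[:i] + [pi[i] + [n]] + pi[i + 1:])
        # ... or append the singleton {n} as a new last block.
        result.append(pi + [[n]])
    return result

def composition_of(pi):
    """Block sizes (c_1, ..., c_h) in the least-element order."""
    return tuple(len(block) for block in pi)

# The two partitions mentioned in the text share the composition (2, 1, 1).
assert composition_of([[1, 2], [3], [4]]) == composition_of([[1, 3], [2], [4]]) == (2, 1, 1)

# Recurrence |Pi_{n+1}| = sum over pi in Pi_n of (|pi| + 1).
for n in range(1, 7):
    assert len(set_partitions(n + 1)) == sum(len(pi) + 1 for pi in set_partitions(n))
```

Because $n$ is always the largest label, inserting it never changes the least element of a block, so the ordering convention is preserved at each step.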
1. Useless Appendix
If we denote by $p(n)$ the number of distinct partitions of the number $n$, we have that
\[ \alpha(z) = \sum_{n=0}^{\infty} p(n) z^n = \prod_{n=1}^{\infty}\frac{1}{1-z^n}, \qquad |z|<1. \]
The aim of this appendix is to study the behavior of $\alpha(z)$ on the boundary of its domain.
First of all, we can immediately see that if $z=\exp(2\pi i r)$ for some rational number $r\in\mathbb{Q}$, say $r=p/q$ with $p, q$ coprime, then $z^{nq}=1$ for every natural number $n$, and $\alpha(z)$ will not be defined.
If instead $z=\exp(2\pi i x)$ for some $x\in\mathbb{R}\setminus\mathbb{Q}$, then $z^n$ will never be equal to one and each factor in $\alpha(z)$ will be well defined.
Theorem. If $\gcd(p,q)=1$, then
\[ \lim_{\rho\to 1}\left[(1-\rho)\ln\prod_{n=1}^{\infty}\frac{1}{1-(\rho e^{2\pi i p/q})^n}\right] = \frac{\pi^2}{6q^2}. \]
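Proof.
\[
\begin{aligned}
\lim_{\rho\to 1}\left[(1-\rho)\ln\prod_{n=1}^{\infty}\frac{1}{1-(\rho e^{2\pi i p/q})^n}\right]
&= -\lim_{\rho\to 1}\left[(1-\rho)\sum_{n=1}^{\infty}\ln\big(1-(\rho e^{2\pi i p/q})^n\big)\right]\\
&= \lim_{\rho\to 1}\left[(1-\rho)\sum_{n=1}^{\infty}\sum_{k=1}^{\infty}\frac{(\rho e^{2\pi i p/q})^{nk}}{k}\right]\\
&= \lim_{\rho\to 1}\left[(1-\rho)\sum_{k=1}^{\infty}\frac{1}{k}\sum_{n=1}^{\infty}(\rho e^{2\pi i p/q})^{nk}\right]\\
&= \lim_{\rho\to 1}\left[(1-\rho)\sum_{k=1}^{\infty}\frac{1}{k}\,\frac{(\rho e^{2\pi i p/q})^k}{1-(\rho e^{2\pi i p/q})^k}\right]\\
&= \sum_{k=1}^{\infty}\frac{1}{k}\lim_{\rho\to 1}(\rho e^{2\pi i p/q})^k\left(\frac{1-e^{2\pi i k p/q}}{1-\rho}+e^{2\pi i k p/q}\sum_{n=0}^{k-1}\rho^n\right)^{-1}.
\end{aligned}
\]
Here we used the geometric series $\sum_{n\ge 1}x^{nk} = x^k/(1-x^k)$ and the decomposition $1-\rho^k e^{2\pi i k p/q} = (1-e^{2\pi i k p/q}) + e^{2\pi i k p/q}(1-\rho^k)$. The prefactor $(\rho e^{2\pi i p/q})^k$ tends to $e^{2\pi i k p/q}$, which equals 1 whenever $q$ divides $k$.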
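But, if $p$ and $q$ are coprime, we have that
\[ \lim_{\rho\to 1}\left[\frac{1-e^{2\pi i k p/q}}{1-\rho}+e^{2\pi i k p/q}\sum_{n=0}^{k-1}\rho^n\right] = \begin{cases} k\,e^{2\pi i k p/q} & \text{if } k/q\in\mathbb{N}\\ \infty & \text{otherwise}\end{cases} = \begin{cases} k & \text{if } k/q\in\mathbb{N}\\ \infty & \text{otherwise.}\end{cases} \]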
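And then, finally,
\[ \lim_{\rho\to 1}\left[(1-\rho)\ln\prod_{n=1}^{\infty}\frac{1}{1-(\rho e^{2\pi i p/q})^n}\right] = \sum_{k=1}^{\infty}\frac{1}{k^2}\mathbf{1}_{k/q\in\mathbb{N}} = \sum_{n=1}^{\infty}\frac{1}{(qn)^2} = \frac{\pi^2}{6q^2}. \]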
This also implies that $\lim_{\rho\to 1}(1-\rho)\ln\big(\alpha(\rho\exp(2\pi i x))\big) = 0$ if $x\in\mathbb{R}\setminus\mathbb{Q}$.
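The rational-angle limit can be probed numerically. The following is a rough sketch rather than a proof: it truncates the infinite product after `n_terms` factors and tracks only the real part of the logarithm, that is, $(1-\rho)\ln|\alpha(\rho e^{2\pi i p/q})|$. The function name, the truncation level and the choice $\rho = 0.999$ are assumptions made for illustration.

```python
import cmath
from math import log, pi

def scaled_log_product(rho, p, q, n_terms=20000):
    """Real part of (1 - rho) * ln prod_{n>=1} 1 / (1 - (rho e^{2 pi i p/q})^n), truncated."""
    z = rho * cmath.exp(2j * pi * p / q)
    total = 0.0
    power = 1 + 0j
    for _ in range(n_terms):
        power *= z                      # power is now z^n
        total -= log(abs(1 - power))    # ln |1 / (1 - z^n)|
    return (1 - rho) * total

for p, q in ((0, 1), (1, 2), (1, 3), (2, 3)):
    # Compare the numerical value with the claimed limit pi^2 / (6 q^2).
    print(p, q, scaled_log_product(0.999, p, q), pi ** 2 / (6 * q ** 2))
```

The agreement improves only slowly as $\rho$ approaches 1, so the truncation level has to grow accordingly (the neglected factors are of size about $\rho^{n}$).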
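It would be interesting to see whether $\alpha(\exp(2\pi i x))$ is defined or not for $x$ irrational.
For a normal number $x$, we could heuristically approximate $\alpha(z)$ in the following way:
\[ (z^n)_{n\ge 1} \overset{\text{iid}}{\sim} \mathrm{Unif}_{|z|=1}, \qquad \Im\frac{1}{1-z^n} \overset{\text{iid}}{\sim} \mathrm{Cauchy}(1/4), \qquad \Re\frac{1}{1-z^n} = 1/2, \]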
Figure 1. A plot of $(1-\rho)\ln\prod_{n=1}^{\infty}\frac{1}{1-(\rho e^{2\pi i p/q})^n}$ when $\rho = 1/100$.
where by Cauchy(1/4) we mean the distribution with pdf
\[ \frac{1}{2\pi}\,\frac{1}{1/4+t^2}. \]
But then we have that $E(1/(1-z^n)) = 1/2$ (in the sense of the Cauchy principal value). This argument suggests that $\alpha(\exp(2\pi i x))$ will be zero for any normal $x$.
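The two facts used in this heuristic follow from the identity $1/(1-e^{i\theta}) = \tfrac12 + \tfrac{i}{2}\cot(\theta/2)$, which can be checked numerically. The sketch below is an illustration; the function name is an assumption introduced here. When $\theta$ is uniform on $(0, 2\pi)$, $\cot(\theta/2)$ is standard Cauchy, so the imaginary part has the density stated above.

```python
import cmath
from math import isclose, tan

def inv_one_minus(theta):
    """Return 1 / (1 - e^{i theta}) for theta that is not a multiple of 2 pi."""
    return 1 / (1 - cmath.exp(1j * theta))

for theta in (0.3, 1.0, 2.5, 4.0, 6.0):
    w = inv_one_minus(theta)
    # The real part is exactly 1/2 on the whole unit circle (away from z = 1).
    assert isclose(w.real, 0.5, abs_tol=1e-12)
    # The imaginary part is cot(theta / 2) / 2.
    assert isclose(w.imag, 0.5 / tan(theta / 2), rel_tol=1e-9)
```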