A generalized sequential construction of exchangeable Gibbs

advertisement
September 14-16, S.Co. 2009, Politecnico di Milano, Italy
A generalized sequential construction of
exchangeable Gibbs partitions with application
Annalisa Cerquetti
Collegio Nuovo, Pavia, Italy
September 15, 2009
Outline
• Motivation
• BNP analysis of species sampling problems under Gibbs priors
• Can be embedded in Pitman’s framework?
• Background
• Exchangeable Gibbs Partitions (EGPs)
• Central and non-central Generalized Stirling numbers
• Chinese Restaurant Process constructions
• Contribution
• Groups sequential CRP for Gibbs partitions
• Embedding BNP results in the construction
Motivation
Background
Contribution
BNP analysis under Gibbs priors: can be embedded?
Bayesian NP analysis of species sampling problems
• Lijoi et al. (2007) Biometrika, 94, 769-786
• Lijoi et al. (2008) Ann. Appl. Probab. 18, 1519-1547
propose BNP analysis under Gibbs priors
• given a sample from P a.s. discrete with atoms (Pi ) ∼ PK (ρα , γ)
• k distinct species observed with frequencies (n1 , . . . , nk )
they obtain conditional posterior/predictive results for an additional
sample w.r.t.
• the number of new distinct species
• the number of new observations belonging to new species
• the random partition induced by the additional sample
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
BNP analysis under Gibbs priors: can be embedded?
My question was:
Is it possible to embed this analysis in the typical Pitman’s combinatorial
framework for exchangeable partitions theory?
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
Exchangeable random partitions.
[Kingman, 1975]
• a sample (X1 , . . . , Xn , . . . ) from a.s. discrete p.m. P induces a
random partition Π = {A1 , . . . , Ak , . . . } of N by
i ≈ l ⇔ Xi = Xl
• for each restriction to [n], for nj = |Aj |,
Pr (Πn = {A1 , . . . , Ak }) = p(n1 , . . . , nk )
for some symmetric p called the exchangeable partition probability
function (EPPF) of Π.
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
recall Poisson-Kingman models on P1↓
[Pitman, 2003]
A large class of models for ERP (priors for BNP) is obtained by
generalizing infinite sum construction of the Dirichlet process as follows:
• for J1 ≥ J2 ≥ · · · ≥ 0 ranked points of a Poisson prox on (0, ∞)
with general mean intensity ρ(x), (with some constraints)
• (Pi ) = (Ji /T ) ∼ Poisson-Kingman (ρ) on P1↓
and even a larger class of mixed PK (ρ, γ) models is obtained as
Z ∞
PK (ρ|t)γ(dt)
PK (ρ, γ) :=
0
for general mixing density γ.
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
EPPFs in Gibbs product form
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
[Gnedin and Pitman, 2005]
→ an interesting sub-class of PK (ρ, γ) for BNP arise for ρ = ρα the Levy
density of the α-stable law for α ∈ (0, 1)
→ PK (ρα , γ) models are characterized by inducing EPPF in Gibbs form
of type α i.e.
k
Y
p(n1 , . . . , nk ) = Vn,k
(1 − α)nj −1↑ .
j=1
→ consistency conditions for Π to be infinite imply the weights (Vn,k )
must satisfy the backward recursion
Vn,k = (n − αk)Vn+1,k + Vn+1,k+1
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
Some known combinatorial formulas
For n = 0, 1, 2, . . . , and arbitrary real x and h, (x)n↑h denotes generalized
rising factorial
(x)n↑h :=
n−1
Y
(x + ih) = hn (x/h)n↑
i=0
for which the following multiplicative law holds
(x)n+r ↑h = (x)n↑h (x + nh)r ↑h
(1)
A binomial formula also holds
(x + y )n↑h =
n X
n
(x)k↑h (y )n−k↑h
k
k=0
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
as well as a generalized version of the multinomial theorem,
p
X
(
zj )n↑h =
p
X
P
nj ≥0, nj =n
j=1
Y
n!
(zj )nj ↑h .
n1 ! · · · np !
(2)
j=1
An application of (1) yields
(zj )nj +mj −1↑ = (zj )mj −1↑ (zj + mj − 1)nj ↑
(3)
and by (2)
X
P
nj ≥0, nj =n
p
p
p
Y
Y
X
n!
(zj )nj +mj −1↑ =
(zj )mj −1↑ ( (zj + mj − 1))n↑ =
n1 ! · · · np ! j=1
j=1
j=1
=
p
p
Y
X
(zj )mj −1↑ (m +
zj − p)n↑ .
j=1
j=1
which agrees with a Lemma in Lijoi et al. (2008)
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
Partial Bell polynomials and generalized Stirling numbers
The number of ways to partition [n] into k blocks and assign each block
a combinatorial structure,
Bn,k (w• ) =
n!
k!
X
k
Y
wn
(n1 ,...,nk ) i=1
i
ni !
.
(4)
Generalized Stirling numbers arise as Bn,k (w• ) for particular w• ,
η,β
Sn,k
= Bn,k ((β − η)•−1↓η ),
where (x)n↓−h = (x)n↑h . In general for arbitrary distinct reals η and β,
are the connection coefficients defined by
(x)n↓η =
n
X
η,β
Sn,k
(x)k↓β
k=0
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
−1,−α
Hence for η = −1, β = −α, and α ∈ (−∞, 1), Sn,k
is defined by
(x)n↑1 =
n
X
−1,−α
Sn,k
(x)k↑α ,
(5)
k=0
and for wni = (1 − α)ni −1↑ and α ∈ [0, 1), equation (4) yields
Bn,k ((1 − α)•−1↑ ) =
n!
k!
X
k
Y
(1 − α)n −1↑
i
(n1 ,...,nk ) i=1
ni !
−1,−α
= Sn,k
.
(6)
i.e. the partial Bell we encounter with Gibbs EPPF of type α.
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
Generalized Stirling vs Generalized Factorial
In Lijoi et al. (2007, 2008) the treatment is in terms of generalized
α
factorial coefficients, Cn,k
defined by
(αy )n↑1 =
n
X
α
Cn,k
(y )k↑1 .
k=0
From (5), if x = y α then
(y α)n↑1 =
n
X
−1,−α
Sn,k
(y α)k↑α =
k=0
n
X
−1,−α k
Sn,k
α (y )k↑1 ,
k=0
hence
−1,−α
α
Sn,k
= Cn,k
/αk .
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
non-central Generalized Stirling numbers [Hsu and Shiue (1998)]
Moreover the following convolution relation holds,
−1,−α,γ
Sn,k
=
n X
n −1,−α
S
(−γ)n−s↑1 ,
s s,k
(7)
s=k
hence
α,γ
−1,−α,γ
Cn,k
= αk Sn,k
=
n X
n α
C (−γ)n−s↑1 ,
s s,k
s=k
and non-central generalized Stirling numbers are the connection
−1,−α,γ
coefficients Sn,k
such that
(y α − γ)n↑1 =
n
X
−1,−α,γ k
Sn,k
α (y )k↑1 =
k=0
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
n
X
−1,−α,γ
Sn,k
(y α)k↑α .
(8)
k=0
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Exchangeable Gibbs partitions
Some combinatorial formulas and generalized Stirling numbers
Basic CRP construction
Sequential CRP construction of exchangeable partitions
Given an infinite EPPF, p(n) = p(n1 , . . . , nk ), assume that
• an unlimited number of customers arrives sequentially in a
restaurant
• for n ≥ 1, given (n1 , . . . , nk ), the placement of the first n customers
at k tables, the (n + 1)th customer is:
→ seated at table j, for 1 ≤ j ≤ kn , with probability
pj,n = pj (n) =
p(nj+ )
p(n)
→ seated at a new table with probability
p0,n = p0,n (n) =
for l = kn + 1,
Pk+1
j=1
p(nl+ )
p(n)
pj,n + p0,n = 1, p(nj+ ) = p(n1 , . . . , nj + 1, . . . , nk )
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
A groups sequential variation of the CRP
[Cerquetti, 2008]
Given an infinite EPPF, p(n) = p(n1 , . . . , nk ), assume that
• an unlimited number of groups of customers arrives sequentially in a
restaurant
• for n ≥ 1, given the placement of the first group of n customers in a
(n1 , . . . , nk ) configuration in k tables, the new group of m
customers is:
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
→ all seated at the old k tables in configuration (m1 , . . . , mk ), with
probability
p(n1 + m1 , . . . , nk + mk )
,
p(m|n) =
p(n1 , . . . , nk )
→ all seated at k ∗ new tables in configuration (s1 , . . . , sk ∗ ) with
probability
p(n1 , . . . , nk , s1 , . . . , sk ∗ )
,
p(s|n) =
p(n1 , . . . , nk )
(9)
(10)
→ s < m are seated at k ∗ new tables in configuration (s1 , . . . , sk ∗ ) and
the remaining m − s customers at the k old tables in configuration
(m1 , . . . , mk ) with probability
p(m, s|n) =
p(n1 + m1 , . . . , nk + mk , s1 , . . . , sk ∗ )
.
p(n1 , . . . , nk )
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
(11)
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
Groups sequential rules for EPPFs in Gibbs form
For
p(n1 , . . . , nk ) = Vn,k
k
Y
(1 − α)nj −1↑
j=1
formulas (9), (10) and (11) specialize as follows,
k
Vn+m,k Y
p(m|n) =
(nj − α)mj ↑ ,
Vn,k
(12)
j=1
∗
k
Vn+m,k+k ∗ Y
p(s|n) =
(1 − α)sj −1↑
Vn,k
(13)
j=1
∗
k
k
Y
Vn+m,k+k ∗ Y
p(m, s|n) =
(nj − α)mj ↑ (1 − α)sj −1↑ .
Vn,k
j=1
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
(14)
j=1
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
Embedding Lijoi et al.’s BNP analysis - some examples
The distribution of the random partition (s1 , . . . , sk∗ ) induced by the
additional sample [Lijoi et al. 2008, Prop. 1] arises marginalizing (14)
with respect to (m1 , . . . , mk ):
p(s1 , . . . , sk ∗ |n1 , . . . , nk ) =
m
Vn+m,k+k ∗
Vn,k
m−s
X
P
j
(m1 ,...,mk )
mj =m−s,mj ≥0
m−s
m1 , . . . , mk
Y
k
k∗
Y
(nj −α)mj ↑ (1−α)sj −1↑ =
j=1
j=1
and then exploiting the generalized version of the multinomial theorem
k∗
Y
m
Vn+m,k+k ∗
(n − kα)m−s↑ (1 − α)sj −1↑ .
=
Vn,k
m−s
(15)
j=1
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
Embedding BNP analysis (continued)
P
The number of new observations s = sj belonging to new species [Lijoi
et al. 2008, Eq. 11] arises by (15) by summing over the space of all
partitions of s elements in k ∗ blocks, and then marginalizing with respect
to k ∗ , and resorting to the definition of Generalized Stirling numbers,
s
X
1 m
−1,−α
(n − kα)m−s↑
Vn+m,k+k ∗ Ss,k
Pr (S = s|n1 , . . . , nk ) =
.
∗
Vn,k s
∗
k =0
(16)
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
Embedding BNP analysis (continued)
The distribution of the number k ∗ of new species in the additional
sample [Lijoi et al. 2007, Eq. 4] arises marginalizing the joint conditional
distribution of S and K ∗ with respect to S, and by the convolution
relation defining non-central generalized Stirling numbers,
Pr (K ∗ = k ∗ |n1 , . . . , nk ) =
Vn+m,k+k ∗ −1,−α,−(n−kα)
Sm,k ∗
.
Vn,k
(17)
N.B. Some additional examples and a connection established between the
notion of conditional Gibbs structures of Lijoi et al. (2008) and the
deletion of classes property of Pitman (2003) may be found on the full
paper on the ArXiv repository.
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
Final comments
Is it possible to embed Lijoi et al.’s analysis in the typical Pitman’s
combinatorial framework of exchangeable partitions theory?
Yes, it is
Is that interesting and worth a submission/a paper?
I was not sure, so I checked Prof. Pitman’s opinion
According to Prof. Jim Pitman (May 2008) it
• provides some simplifications of Lijoi et al.’s work,
• connects it better to the Gnedin-Pitman development,
• provides a better platform for future work,
hence it deserves a series of submission attempts. (still ongoing...)
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Motivation
Background
Contribution
Groups variation of CRP
Embedding BNP analysis
Selected references
Cerquetti, A. (2008) Generalized Chinese restaurant construction of exchangeable
Gibbs partitions and related results. ArXiv:0805.3853 [math.PR]
Gnedin, S. and Pitman, J. (2005) Exchangeable Gibbs partitions and Stirling
triangles arXiv:math.PR/0412494.
Kingman, J.F.C. (1975) Random discrete distributions. JRSS B, 37, 1–22.
Kingman, J.F.C (1978) The representation of partition structure. J. London Math.
Soc. 2, 374–380.
Lijoi, A., Mena, R. and Prünster, I. (2007) Bayesian nonparametric estimation of
the probability of discovering new species. Biometrika, 94, 769-786
Lijoi, A. Prünster, I. and Walker, S.G. (2008) Bayesian nonparametric
estimators derived from conditional Gibbs structures. Ann. Appl. Probab. 18,
1519-1547
Pitman, J. (2003) Poisson-Kingman partitions. IMS, LN 40, 1–34. Institute of
Mathematical Statistics, Hayward, California. (arXiv:math.PR/0210396).
Pitman, J. (2006) Combinatorial Stochastic Processes. Ecole d’Eté de Probabilité de
Saint-Flour XXXII - 2002. Lecture Notes N. 1875, Springer.
Annalisa Cerquetti S.Co. 2009, Sept 15 - Politecnico di Milano
Generalized CRP for Gibbs partitions with application
Download