Kernel Stick-Breaking Process

D. B. Dunson and J. Park
Discussion led by Qi An
Jan 19th, 2007
Outline
• Motivation
• Model formulation and properties
• Prediction rules
• Posterior Computation
• Examples
• Conclusions
Motivation
• Consider the problem of estimating the conditional
density of a response variable using a mixture
model, $f(y \mid x) = \int f(y \mid x, \phi)\, dG_x(\phi)$, where Gx is an unknown
probability measure indexed by x.
• The problem of defining priors for the collection of random
probability measures Gx has received
increasing attention in recent years. Examples include the
Dirichlet process (DP) and the dependent Dirichlet process (DDP).
One model
• In the DDP, the atoms can vary with x according to a
stochastic process while the weights are fixed.
• Dunson et al. proposed a model that allows the
weights to vary with predictors, but this model lacks
reasonable marginalization and updating properties.
Model formulation
• Introduce a countable sequence of mutually
independent random components
• The kernel stick-breaking process (KSBP) can
be defined as follows:
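For reference, the definition in Dunson and Park's paper can be written roughly as below. The symbols H (distribution of the locations), Q (distribution of the basis measures) and W (kernel-weighted stick-breaking proportion) are recalled from the paper rather than taken from the slide, so treat them as assumptions; K is a bounded kernel taking values in [0, 1].

```latex
\[
\{\Gamma_h, V_h, G_h^*\}_{h=1}^{\infty}, \qquad
\Gamma_h \sim H, \quad V_h \sim \mathrm{Beta}(a_h, b_h), \quad G_h^* \sim Q,
\]
\[
G_x = \sum_{h=1}^{\infty} W(x; V_h, \Gamma_h)
      \prod_{l<h} \bigl\{ 1 - W(x; V_l, \Gamma_l) \bigr\} \, G_h^*,
\qquad
W(x; V_h, \Gamma_h) = V_h \, K(x, \Gamma_h).
\]
```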
About the model
• The model for Gx is a predictor-dependent
mixture over an infinite sequence of basis
probability measures, Gh* located at Γh.
• Bases located close to x and having a smaller
index, h, tend to receive higher probability
weight.
• The KSBP accommodates dependence between Gx
and Gx'.
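The effect of location and index on the weights can be illustrated with a small sketch. It assumes the weight of basis h at x is V_h K(x, Γ_h) times the product of (1 − V_l K(x, Γ_l)) over l < h, and it uses a Gaussian kernel with a hypothetical precision parameter psi; the function names and numerical values are illustrative only.

```python
import numpy as np

def gaussian_kernel(x, gamma, psi=10.0):
    """Assumed kernel K(x, Gamma) = exp(-psi * (x - Gamma)^2), with values in (0, 1]."""
    return np.exp(-psi * (x - gamma) ** 2)

def ksbp_weights(x, V, Gamma, psi=10.0):
    """Weights pi_h(x) = V_h K(x, Gamma_h) * prod_{l<h} (1 - V_l K(x, Gamma_l))."""
    W = V * gaussian_kernel(x, Gamma, psi)
    # Mass still unallocated before stick h is broken (1 for the first stick).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - W)[:-1]))
    return W * remaining

# Three bases with equal V_h: two located at x = 0 (indices 1 and 3), one far away.
V = np.array([0.5, 0.5, 0.5])
Gamma = np.array([0.0, 1.0, 0.0])
pi = ksbp_weights(0.0, V, Gamma)
print(pi)
```

At x = 0 the first basis (close, small index) receives the largest weight, the third (close, larger index) receives less, and the second (far from x) receives almost none.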
Special cases
• If K(x,Γ)=1 for all x and Γ, and Gh* ~ DP(αG0), the KSBP is a stick-breaking mixture of DPs.
• If K(x,Γ)=1 and $G_h^* = \delta_{\theta_h}$ with $\theta_h \sim G_0$, we obtain Gx≡G,
with G having a stick-breaking prior.
• If, in addition, $a_h = 1 - a$ and $b_h = b + ha$, we obtain a Pitman-Yor
process.
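As a rough illustration of the Pitman-Yor special case, the sketch below draws truncated stick-breaking weights with V_h ~ Beta(1 − a, b + ha). The function name, the truncation level and the parameter values are assumptions made for the example.

```python
import numpy as np

def pitman_yor_weights(a, b, n_sticks, rng):
    """Stick-breaking weights with V_h ~ Beta(1 - a, b + h * a), h = 1..n_sticks."""
    h = np.arange(1, n_sticks + 1)
    V = rng.beta(1.0 - a, b + h * a)
    # Mass still unallocated before stick h is broken (1 for the first stick).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))
    return V * remaining

rng = np.random.default_rng(0)
w = pitman_yor_weights(a=0.25, b=1.0, n_sticks=50, rng=rng)
print(w[:5], w.sum())
```

Setting a = 0 gives V_h ~ Beta(1, b), the usual Dirichlet process stick-breaking weights.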
Properties
• Let
, we can obtain
First moment
No dependency on V and Γ
Second moment
• The correlation between measures
It can be proven that the correlation coefficient ρ(x, x'; V, Γ) ≤ 1, attaining the value 1 in the limit as x → x'.
Alternative representation
The KSBP has an alternative representation
The moments and correlation coefficient have the form
Truncation
• For the stick-breaking Gibbs sampler, we need to
make a truncation approximation.
The approximated model can be expressed as
P0 ( x)
• The authors prove that the residual weights
decrease exponentially fast in N, so an
accurate approximation may be obtained for
moderate N.
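The decay of the residual weight can be checked numerically. This sketch assumes that the mass left at x after the first N sticks is the product of (1 − V_h K(x, Γ_h)) over h ≤ N, and it uses an assumed Gaussian kernel, V_h ~ Beta(1, 1) and Γ_h ~ Uniform(0, 1); none of these choices come from the slides.

```python
import numpy as np

def residual_mass(x, V, Gamma, psi=10.0):
    """Mass left at x after each of the first N sticks: cumprod of (1 - V_h K(x, Gamma_h))."""
    K = np.exp(-psi * (x - Gamma) ** 2)  # assumed Gaussian kernel
    return np.cumprod(1.0 - V * K)

rng = np.random.default_rng(1)
N = 100
V = rng.beta(1.0, 1.0, size=N)         # assumed prior for V_h
Gamma = rng.uniform(0.0, 1.0, size=N)  # assumed prior for the locations
res = residual_mass(0.5, V, Gamma)
for n in (10, 25, 50, 100):
    print(n, res[n - 1])
```

The printed residuals shrink quickly as N grows, which is the behaviour the truncation argument relies on.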
Prediction rules
• Consider a special case in which
The model can be equivalently expressed as:
Prediction rules
• Define
and
is a subset of
the integers between 1 and n
• It can be proven that the probability that subjects
i and j belong to the same cluster is
The predictive distribution is obtained by marginalization
where
and
denote the set of possible r-dimensional subsets of {1,…,s} that include i
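Predictor-dependent clustering can be explored by simulation. The sketch below is not the closed-form rule from the slides; it is a Monte Carlo estimate under the assumption of point-mass bases, which may differ from the special case considered here. Given V and Γ, two subjects are allocated independently, so they share an atom with probability equal to the sum over h of π_h(x_i) π_h(x_j). The kernel, priors and truncation level N are hypothetical choices.

```python
import numpy as np

def ksbp_weights(x, V, Gamma, psi):
    """Truncated KSBP weights at x with an assumed Gaussian kernel."""
    W = V * np.exp(-psi * (x - Gamma) ** 2)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - W)[:-1]))
    return W * remaining

def coclustering_prob(x_i, x_j, n_draws=2000, N=200, psi=10.0, seed=0):
    """Monte Carlo estimate of P(subjects i and j share an atom)."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_draws):
        V = rng.beta(1.0, 1.0, size=N)         # assumed prior for V_h
        Gamma = rng.uniform(0.0, 1.0, size=N)  # assumed prior for the locations
        # Conditional on (V, Gamma), the match probability is sum_h pi_h(x_i) pi_h(x_j).
        total += np.sum(ksbp_weights(x_i, V, Gamma, psi) * ksbp_weights(x_j, V, Gamma, psi))
    return total / n_draws

near = coclustering_prob(0.5, 0.5)
far = coclustering_prob(0.1, 0.9)
print(near, far)
```

Subjects with similar predictor values are estimated to cluster together more often than subjects with distant predictor values.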
Posterior Computation
From the prior, we can obtain
1. Sample Si.
2. Sample CSi when Si=0 (assign subject i to a new atom at an occupied location).
3. Sample θh.
4. Sample Vh.
First sample
and
then alternate between
(i) sampling (Aih,Bih) from their conditional distribution;
(ii) updating Vh by sampling from its conditional posterior.
5. Sample Γh using a Metropolis-Hastings step, or a Gibbs step if H is a set of
discrete potential locations.
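Step 4 can be sketched as follows for a truncated model. The sketch assumes the augmentation A_ih ~ Bernoulli(V_h), B_ih ~ Bernoulli(K(x_i, Γ_h)), with subject i allocated to the first h for which A_ih = B_ih = 1; under that assumption V_h has a Beta conditional posterior. The allocation vector Z, the function name and the simulated inputs are hypothetical, and the B_ih draws (needed only for the Γ_h update) are omitted.

```python
import numpy as np

def update_V(Z, K, V, a, b, rng):
    """One sweep over h: draw A_ih given the allocations, then V_h from its Beta posterior.

    Z : (n,) 0-based index of the stick each subject is allocated to
    K : (n, N) kernel values K(x_i, Gamma_h)
    V : (N,) current stick-breaking variables
    a, b : (N,) Beta prior parameters of V_h
    """
    n, N = K.shape
    V_new = V.copy()
    for h in range(N):
        idx = np.where(Z >= h)[0]  # subjects that reached stick h
        A = np.zeros(len(idx), dtype=int)
        for m, i in enumerate(idx):
            if Z[i] == h:
                A[m] = 1  # stick h accepted, so A_ih = B_ih = 1
            else:
                # Stick h rejected, so (A_ih, B_ih) != (1, 1):
                # P(A_ih = 1 | rejected) = V_h (1 - K_ih) / (1 - V_h K_ih)
                p = V_new[h] * (1.0 - K[i, h]) / (1.0 - V_new[h] * K[i, h])
                A[m] = rng.random() < p
        V_new[h] = rng.beta(a[h] + A.sum(), b[h] + len(idx) - A.sum())
    return V_new

rng = np.random.default_rng(0)
K = rng.uniform(0.1, 0.9, size=(20, 5))  # simulated kernel values
Z = rng.integers(0, 5, size=20)          # simulated allocations
V = np.full(5, 0.5)
V_new = update_V(Z, K, V, np.ones(5), np.ones(5), rng)
print(V_new)
```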
Simulated examples
Conclusions
• This stick-breaking process is useful in settings in which
there is uncertainty in an uncountable collection of
probability measures.
• The process can be applied to predictor-dependent
clustering, dynamic modeling and spatial data analysis,
in addition to density regression.
• Many tools developed for exchangeable stick-breaking
processes can be applied to the KSBP formulation
with minimal modification.
• A predictor-dependent urn scheme is obtained, which
generalizes the Pólya urn scheme.