Accelerated Sampling for the Indian Buffet Process

Accelerated Sampling for the Indian Buffet Process
Finale Doshi-Velez and Zoubin Ghahramani
ICML 2009
Presented by: John Paisley, Duke University
Introduction
• The IBP is a nonparametric prior for inferring the number of underlying features in a dataset, as well as which subset of these features is used by a given observation, e.g., the number of underlying notes in a piece of music and which notes are used at what times.
• Gibbs sampling currently has some issues: the uncollapsed Gibbs sampler is slow to mix, while the collapsed Gibbs sampler is slow to complete an iteration when the number of observations is large.
• This paper presents an accelerated Gibbs sampling method
for the linear-Gaussian model that is fast in both respects.
The IBP and Linear-Gaussian Model
• The IBP story (formalized below):
– The first customer walks into the buffet and samples a Poisson(α) number of dishes.
– The nth customer samples each previously tasted dish with probability proportional to the number of previous customers who have sampled it, then samples a Poisson(α/n) number of new dishes.
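In symbols (a standard statement of the process, which the slide may have shown graphically: α is the IBP concentration parameter and m_k the number of previous customers who chose dish k):

\[
K_1 \sim \text{Poisson}(\alpha), \qquad P(z_{nk} = 1 \mid z_{1:n-1,k}) = \frac{m_k}{n}, \qquad K_n^{\text{new}} \sim \text{Poisson}\!\left(\tfrac{\alpha}{n}\right).
\]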
This prior can be used for the linear-Gaussian model, to infer the number of underlying vectors that are linearly combined to construct a data matrix (see below).
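Concretely, the linear-Gaussian model in question is the standard one (notation as in Griffiths and Ghahramani): an N × D data matrix X is generated from the N × K binary IBP matrix Z and a K × D loadings matrix A,

\[
X = Z A + \epsilon, \qquad A_{kd} \sim \mathcal{N}(0, \sigma_A^2), \qquad \epsilon_{nd} \sim \mathcal{N}(0, \sigma_X^2).
\]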
IBP Calculations for Z
• Integrating out A, the posteriors for Z are calculated for existing features and for new features (the standard forms are reconstructed below).
• Collapsing out the loadings matrix A makes the likelihood term depend on a K × K matrix inverse and determinant, which significantly increases the amount of computation that needs to be done to calculate the likelihood.
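The slide's equations did not survive the transcript; the standard collapsed forms for this model are, for an existing feature k,

\[
P(z_{nk} = 1 \mid Z_{-nk}, X) \propto \frac{m_{-n,k}}{N} \, P(X \mid Z),
\]

a Poisson(α/N) prior on the number of new features for row n, weighted by the likelihood, and the collapsed likelihood

\[
P(X \mid Z) = \frac{\exp\!\left(-\frac{1}{2\sigma_X^2} \operatorname{tr}\!\left(X^\top (I - Z M^{-1} Z^\top) X\right)\right)}{(2\pi)^{ND/2}\, \sigma_X^{(N-K)D}\, \sigma_A^{KD}\, |M|^{D/2}}, \qquad M = Z^\top Z + \frac{\sigma_X^2}{\sigma_A^2} I,
\]

whose inverse and determinant must be recomputed as Z changes.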
IBP Calculations for Z
• When A is not integrated out, inference is faster because matrices do not need to be inverted. Also, when finding the posterior for a value in the nth row of Z, we don't need to worry about the other rows of Z, since the likelihood factorizes across rows (see below).
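With A instantiated, the likelihood factorizes over the rows x_n of X (z_n is the nth row of Z):

\[
P(X \mid Z, A) = \prod_{n=1}^{N} \mathcal{N}\!\left(x_n \mid z_n A, \; \sigma_X^2 I\right),
\]

so the conditional for z_{nk} involves only x_n.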
• For the linear-Gaussian model, the posterior of A is Gaussian, with mean and covariance given below.
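In the notation above (each column of A is independent a posteriori, with a shared covariance):

\[
\mu = M^{-1} Z^\top X, \qquad \Sigma = \sigma_X^2 M^{-1}, \qquad M = Z^\top Z + \frac{\sigma_X^2}{\sigma_A^2} I.
\]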
Accelerated Gibbs Sampling
• Goal: Develop a sampler that mixes like the collapsed
sampler, but has the speed of the uncollapsed sampler.
• To do this, select a window of rows and split the data into two groups: the window X_w and the remainder X_{-w}.
Accelerated Gibbs Sampling
• By splitting the data this way, we can write the probabilities for Z as shown below.
• We therefore don't need to worry about X_{-w} when calculating likelihoods inside the window; it enters only through the posterior of A.
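In symbols, with X_w, Z_w denoting the rows in the window and X_{-w}, Z_{-w} the rest, the window's assignments are scored by collapsing A against its posterior from the data outside the window:

\[
p(Z_w \mid X, Z_{-w}) \propto p(Z_w \mid Z_{-w}) \int p(X_w \mid Z_w, A)\, p(A \mid X_{-w}, Z_{-w})\, dA.
\]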
Accelerated Gibbs Sampling
• We can efficiently compute the posterior mean and covariance of A with the window's data removed, and then efficiently restore the all-data mean and covariance using the sufficient statistics for the window.
• Rank-one updates can be used when updating the posterior after a single entry of Z changes (a sketch follows).
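A minimal sketch of this bookkeeping in Python (not the authors' code; the function names and the information-form representation M = ZᵀZ + (σ_X²/σ_A²)I, h = ZᵀX are illustrative assumptions consistent with the posterior above):

import numpy as np

def full_posterior(Z, X, sigma_x, sigma_a):
    """Information form of the posterior of A given all data:
    mean = inv(M) @ h; each column of A has covariance sigma_x**2 * inv(M)."""
    K = Z.shape[1]
    M = Z.T @ Z + (sigma_x**2 / sigma_a**2) * np.eye(K)
    h = Z.T @ X
    return M, h

def remove_window(M, h, Zw, Xw):
    """Downdate: posterior of A given only the data outside the window."""
    return M - Zw.T @ Zw, h - Zw.T @ Xw

def restore_window(M, h, Zw, Xw):
    """Update: restore the all-data posterior from the window's statistics."""
    return M + Zw.T @ Zw, h + Zw.T @ Xw

def rank_one_add(Sigma, z):
    """Sherman-Morrison: inv(M + z z^T) from Sigma = inv(M) in O(K^2)."""
    Sz = Sigma @ z
    return Sigma - np.outer(Sz, Sz) / (1.0 + z @ Sz)

def rank_one_remove(Sigma, z):
    """Sherman-Morrison: inv(M - z z^T) from Sigma = inv(M) in O(K^2)."""
    Sz = Sigma @ z
    return Sigma + np.outer(Sz, Sz) / (1.0 - z @ Sz)

Flipping a single entry z_nk amounts to removing the old row z_n and adding the modified row, i.e., two rank-one updates of the covariance. As the conclusion notes, round-off accumulates under repeated updates, so the full inverse should occasionally be recomputed from scratch.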
Experiments (Synthetic Data)
Generate a linear-Gaussian model from the IBP prior with D = 10.
Experiments (real data)
• Experiments were run on several real datasets for 500 iterations or 150 hours (max). The slide lists the dataset information together with per-iteration run times and performance measures.
Discussion and Conclusion
• The accelerated Gibbs sampler achieved performance similar to the other two sampling methods, but at a much faster rate.
• Rank-one updates are less precise (due to accumulated round-off error), so it is important to occasionally recompute the full matrix inverse.
• Performance is also faster than slice sampling, and the method doesn't rely on proposals (Metropolis-Hastings) or particle counts (particle filters).
• Efficient computation was obtained by collapsing locally on a window, using posteriors calculated from the data outside of this window.