Transcript Document

Tractable Nonparametric Bayesian Inference in Poisson
Processes with Gaussian Process Intensity
by
Ryan P. Adams, Iain Murray, and David J.C. MacKay
(ICML 2009)
Presented by Lihan He
ECE, Duke University
July 31, 2009
Outline
Introduction
The model
Poisson distribution
Poisson process
Gaussian process
Gaussian Cox process
Generating data from Gaussian Cox process
Inference by MCMC
Experimental results
Conclusion
2/19
Introduction
Inhomogeneous Poisson process
 A counting process
 Rate of arrivals varies in time or space
 Intensity function (s)
 Astronomy, forestry, birth model, etc.
3/19
Introduction
How to model the intensity function (s)
 Using Gaussian process
 Nonparametrical approach
 Called Gaussian Cox process
Difficulty: intractable in inference
 Double-stochastic process
 Some approximation methods in previous research
 This paper: tractable inference
 Introducing latent variables
 MCMC inference – Metropolis-Hastings method
 No approximation
4/19
Model: Poisson distribution
Discrete random variable X has p.m.f.
e   k
P( X  k ) 
k!
for k = 0, 1, 2, …
 Number of event arrivals
 Parameter 
 E[X] = 
 Conjugate prior: Gamma distribution
5/19
Model: Poisson process
The Poisson process is parameterized by an intensity function
such that the random number of event
within a subregion
is Poisson distributed with parameter
e  T (T ) k
P[ N (T )  k ] 
k!
for k = 0, 1, 2, …
 N(0)=0
 The number of events in disjoint subregions are independent
 No events happen simultaneously
 Likelihood function
6/19
Model: Poisson process
One-dimensional temporal Poisson process
Two-dimensional spatial Poisson process
7/19
Model: Gaussian Cox process
Using Gaussian process prior for intensity function (s)
*: upper bound on (s)
σ : logistic function
g(s): random scalar function, drawn from a Gaussian process prior
g ~ GP(m( ), C ( ))
~ N N (m(s1:N ; ),C(s1:N ; ))
8/19
Model: Gaussian process
Definition: Let g=(g(x1), g(x2), …, g(xN)) be an N-dimensional vector of
function values evaluated at N points x1:N. P(g) is a Gaussian process if for any
finite subset {x1, …, xN} the marginal distribution over that finite subset g has a
multivariate Gaussian distribution.
g ~ GP(m, C )
[ g ( x1 ),...,g ( xN )]T ~ N N (m(s1:N ),C(s1:N ))
 Nonparametric prior (without parameterizing g, as g=wTx)
 Infinite dimension prior (dimension N is flexible), but only
need to work with finite dimensional problem
 Fully specified by the mean function m( ) and the
covariance function C ( )
Mean function is usually defined to be zero
Example covariance function
1 d
Ci , j  C ( xi , x j ; )  v0 exp{  lm ( xim  x mj ) 2 }  v1  v2 (i, j )
2 m1
  {v0 , v1, v2 , lm}
9/19
Model: Generating data from Gaussian Cox process
Objective: generate a set of event {sk}k=1:K on some subregion T which are
drawn from Poisson process with intensity function
10/19
Inference
Given a set of K event {sk}k=1:K on some subregion T as observed data, what is
the posterior distribution over (s)?
Poisson process likelihood function
Posterior
11/19
Inference
Augment the posterior distribution by introducing latent variables to make the
MCMC-based inference tractable.
Observed data:
Introduced latent variables:
1.
Total number of thinned events M
2.
Locations of thinned events
3.
Values of the function g(s) at the thinned events
4.
Values of the function g(s) at the observed events
Complete likelihood
12/19
Inference
MCMC inference: sample
Sample M and
: Metropolis-Hasting method
Metropolis-Hasting method: draw a new sample xt+1based on the last
sample xt and a proposal distribution q(x’;xt)
1.
Sample x’ from proposal q(x’; xt)
p( x' ) q ( x t ; x' )
2. Compute acceptance ratio a 
p ( x t ) q( x' ; x t )
3.
Sample r~U(0,1)
4.
If r<a, accept x’ as new sample, i.e., xt+1=x’; otherwise, reject x’,
let xt+1=xt.
13/19
Inference
Sample M: Metropolis-Hasting method
Proposal distribution for inserting one thinned event
Proposal distribution for deleting one thinned event
Acceptance ratio for inserting one thinned event
Acceptance ratio for deleting one thinned event
14/19
Inference
Sample
: Metropolis-Hasting method
Acceptance ratio for sampling a thinned event
Sample gM+K: Hamiltonian Monte Carlo method (Duane et al, 1987)
m
Sample *: place Gamma prior on *
Conjugate prior, the posterior can be derived analytically.
15/19
Experimental results
Synthetic data
53 events
29 events
235 events
16/19
Experimental results
Coal mining disaster data
191 coal mine explosions in British from year
1875 to 1962
17/19
Experimental results
Redwoods data
195 redwood locations
18/19
Conclusion
 Proposed a novel method of inference for the Gaussian Cox process
that avoids the intractability of such model;
 Using a generative prior that allows exact Poisson data to be generated
from a random intensity function drawn from a transformed Gaussian
process;
 Using MCMC method to infer the posterior distribution of the intensity
function;
 Compared to other method, having better result;
 Having significant computational demands: infeasible for data sets that
have more than several thousand event.
19/19