Transcript handout
Computing the Posterior Probability
The posterior probability distribution contains
the complete information concerning the
parameters, but need often need to integrate it
E.g., normalizing posterior requires an ndimensional integral for an n-parameter
posterior
Rarely can analytically integrate posterior
Numerical integration is also difficult when n >
several
Markov-Chain Monte Carlo
Instead of integrating, sample from posterior in a way
that results in a set of parameters values that is
identically-distributed as the posterior
The histogram of chain values for a parameter is a visual
representation of the (marginalized) probability
distribution for that parameter
Can then easily compute confidence intervals:
1. Sum histogram from best-fit value (often peak of histogram)
in both directions
2. Stop when x% of values summed for an x% confidence
interval
MCMC Algorithms
MCMC is similar to a random-walk approach
Most common techniques are MetropolisHastings and Gibbs Sampling (see Wikipedia
entries on these)
Metropolis-Hastings is simplest algorithm since it
only requires evaluations of the posterior for a
given set of parameter values
Gibbs Sampling
1. Derive conditional probability of each parameter given
values of the other parameters
2. Pick parameter at random
3. Draw from conditional probability of that parameter given
values of all other parameters from previous iteration
4. Repeat until chain converges
Example: Case of a two-variable posterior p(x,y), for each
MCMC draw i:
xi ~ p(x|y=yi-1)
yi ~ p(y|x=xi-1)
“~” means “distributed as”
A good approach is to form “hierarchical” models that result in
simple conditional probabilities.
Metropolis-Hastings
Can be visualized as similar to the rejection
method of random number generation
Use a “proposal” distribution that is similar in
shape (especially width) to the expected
posterior distribution to generate new parameter
values
Accept new step when probability of new values
is higher, occasionally accept new step otherwise
(to go “up hill”, avoiding relative minima)
M-H with a Gaussian Proposal Distr.
1.
2.
3.
4.
For each parameter θ estimate/determine
“step” size σ
For each chain iteration, new proposal value
θ’= θi-1 + N(σ), N = normal deviate with st.
dev. = σ
If p(θ’)/p(θ i-1) > 1, θi = θ’
If p(θ’)/p(θ i-1) < 1, accept θi = θ’ with
probability p(θ’)/p(θ i-1), θi = θi-1 otherwise
M-H Issues
Can be very slow to converge, especially when
there are correlated variables
Use multivariate proposal distributions (done in
XSPEC approach)
Transform correlated variables:
If x and y are correlated, instead set y = ax + b, fit for x,
a, b, compute y from chain values afterwards (may also
need to also fix a or b with a tight prior)
Convergence
Run multiple chains, compute convergence statistics
MCMC Example
In Ptak et al. (2007) we used MCMC to fit the
X-ray luminosity functions of normal galaxies in
the GOODS area (see poster)
Tested code first by fitting published FIR
luminosity function
Key advantages:
visualizing full probability space of parameters
ability to derive quantities from MCMC chain value
(e.g., luminosity density)
Sanity Check: Fitting local 60 mm LF
Φ
Fit Saunders et al (1990) LF
assuming Gaussian errors and
ignoring upper limits
Param. S1990
α
1.09 ± 0.12
σ
0.72 ± 0.03
Φ*
0.026 ± 0.008
log L* 8.47 ± 0.23
MCMC
1.04 ± 0.08
0.75 ± 0.02
0.026 ± 0.003
8.39 ± 0.15
log L/L○
(Ugly) Posterior Probabilities
z< 0.5 X-ray luminosity functions
Early-type Galaxies
Late-type Galaxies
Red crosses show 68% confidence interval
Marginalized Posterior Probabilities
Dashed curves show Gaussian with same mean & st. dev. as posterior
Dotted curves show prior
log φ*
a
log L*
s
a
s
Note: α and σ tightly constrained by (Gaussian) prior, rather than being “fixed”
MCMC in XSPEC
XSPEC MCMC is based on the Metropolis-Hastings algorithm. The chain
proposal command is used to set the proposal distribution. The basic
options are multivariate gaussian or cauchy although user-defined
distributions can be entered. The covariance matrix for these distributions
can be calculated from the best-fit, entered from the command line, or
calculated from previous MCMC chain(s).
MCMC is integrated into other XSPEC commands. If chains are loaded
then these are used to generate confidence regions on parameters,
fluxes and luminosities. This is more accurate than the current method for
estimating errors on fluxes and luminosities.
The tclout simpars command returns a set of parameter values drawn
from the probability distribution defined by the currently loaded chain(s).
Chains can be saved either as ascii or FITS files.
XSPEC MCMC Output
Histogram and probability density plot (2-d histogram) of spectral fit parameters from an
XSPEC MCMC run produced by fv (see https://astrophysics.gsfc.nasa.gov/XSPECwiki)
Future
Use “physical” priors… have posterior from previous
work be prior for current work
Use observed distribution of photon indices of nearby AGN
when fitting for NH in deep surveys
Incorporate calibration uncertainty into fitting (Kashyap
AISR project)
XSPEC has a plug-in mechanism for user-defined
proposal distributions… would be good to also allow
user-defined priors
Code repository/WIKI for MCMC analysis in
astronomy