The Markov Chain Monte Carlo Method


The Markov Chain Monte Carlo Method
Isabelle Stanton
May 8, 2008 Theory Lunch
Monte Carlo vs Las Vegas

Las Vegas algorithms are randomized and always give the correct result, but gamble with computation time. Example: Quicksort.

Monte Carlo algorithms have a fixed running time, but may give the wrong answer. Examples: simulated annealing, estimating volume.
Markov Chains

A memoryless stochastic process, e.g., flipping a coin.
[Diagram: a six-state Markov chain on the faces of a die, with every transition occurring with probability 1/6.]
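To make the memoryless property concrete, here is a minimal sketch (my addition, not from the slides) of simulating the die chain in the diagram; the next state never depends on the history:

```python
import random

STATES = [1, 2, 3, 4, 5, 6]

def step(state):
    # Every transition has probability 1/6 (as in the diagram),
    # so the next state is independent of the current state and
    # of the entire history: the chain is memoryless.
    return random.choice(STATES)

state = 1
trajectory = [state]
for _ in range(10):
    state = step(state)
    trajectory.append(state)
print(trajectory)
```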
Other Examples of Markov Chains

Shuffling cards

Flipping a coin

PageRank Model

Particle systems – focus of MCMC work
General Idea

Model the system using a Markov Chain.

Use a Monte Carlo algorithm to perform some computational task.
Applications

Approximate counting – the number of solutions to 3-SAT or Knapsack

Statistical physics – when do phase transitions occur?

Combinatorial optimization – simulated-annealing-style algorithms

We'll focus on counting.
Monte Carlo Counting

How do you estimate the volume of a complex solid?

Render with environment maps efficiently?

Estimate an integral numerically?
(Picnic) Knapsack
[Diagram: five picnic items weighing 5, 4, 10, 2, and 4; the knapsack holds 20. What is a solution? How many solutions are there?]
Counting Knapsack Solutions

Item weights: a = (a_0, ..., a_{n−1})

Knapsack size: a real number b

Estimate the number of {0,1} vectors x that satisfy a·x ≤ b.

Let N denote the number of solutions.
Naïve Solution

Randomly generate x.

Calculate a·x.

If a·x ≤ b, return 2^n; else return 0.

This returns N in expectation:

(0 · (2^n − N) + 2^n · N) / 2^n = N
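A sketch of this naïve estimator in Python (the function name and trial count are mine; the slide gives only the idea):

```python
import random

def naive_count(a, b, trials):
    """Estimate N = #{x in {0,1}^n : a.x <= b} by the naive method.

    Each trial returns 2^n if a uniformly random x is a solution
    and 0 otherwise, so a single trial has expectation exactly N.
    """
    n = len(a)
    total = 0
    for _ in range(trials):
        x = [random.randint(0, 1) for _ in range(n)]
        if sum(ai * xi for ai, xi in zip(a, x)) <= b:
            total += 2 ** n
    return total / trials

# Example: the picnic instance (items 5, 4, 10, 2, 4; knapsack holds 20).
print(naive_count([5, 4, 10, 2, 4], 20, trials=100000))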
Is this fast?

No. Counterexample:

a = (1, ..., 1) and b = n/3

Any solution has at most n/3 ones, so N = Σ_{k ≤ n/3} C(n, k) ≤ 2^{0.92n} (binary entropy bound).

Pr(a random x is a solution) = N / 2^n ≤ 2^{−0.08n}

In expectation, we need 2^{Ω(n)} samples before we get a single solution!

Any polynomial number of trials will grossly underestimate N.
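A quick numeric check of the counterexample (my addition, not in the slides): even at n = 60 the chance that a random x is a solution is already under 1%.

```python
from math import comb

n = 60
# Solutions for a = (1, ..., 1), b = n/3 are the 0/1 vectors
# with at most n/3 ones.
N = sum(comb(n, k) for k in range(n // 3 + 1))
print(N / 2 ** n)  # roughly 0.007, and it decays exponentially in n
```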
Knapsack with MCMC

Let M_knap be a Markov chain with state space Ω(b) = {x | a·x ≤ b}.

This will allow us to sample a solution.
Various M_knap

[Diagram: two example state spaces on the cube {0,1}^3. For a = (0, .5, .5), b = 1.5, all eight vectors are solutions. For a = (0, 1, 1), b = 1.5, the vertices 011 and 111 are excluded.]
M_knap Transitions

With probability 1/2, x transitions to itself.

Otherwise, select an i u.a.r. from 0 to n−1 and flip the i-th bit of x to get x'. If x' is a solution, transition there; otherwise stay at x.

[Diagram: the chain for a = (0, 1, 1), b = 1.5. Each state has a self-loop of probability 0.5 and moves to each neighboring solution with probability 1/6.]
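A sketch of this transition rule in Python (my rendering of the slide's description):

```python
import random

def mknap_step(x, a, b):
    """One lazy Metropolis-style step of M_knap on Omega(b)."""
    # With probability 1/2, stay at x (the self-loop).
    if random.random() < 0.5:
        return x
    # Otherwise flip a bit chosen uniformly at random from 0..n-1.
    i = random.randrange(len(x))
    y = list(x)
    y[i] = 1 - y[i]
    # Move only if the result is still a solution; else stay at x.
    if sum(ai * yi for ai, yi in zip(a, y)) <= b:
        return y
    return x
```

Iterating this from the all-zeros vector is exactly the sampling subroutine the algorithm below relies on.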
Connected?

Is M_knap connected?

Yes. To get from x to x', go through 0: drop the 1-bits of x one at a time (every intermediate vector is still a solution, since weights are nonnegative), then add the 1-bits of x'.
Ergodicity

What is the stationary distribution of the Knapsack chain?

Each solution is sampled with probability 1/N, i.e., uniformly over Ω(b). (The transition probabilities are symmetric, so the uniform distribution is stationary; the self-loops make the chain aperiodic.)

An MC is ergodic if the probability distribution over the states converges to the stationary distribution of the system, regardless of the starting configuration.

Is M_knap ergodic? Yes.
Algorithm Idea

Start at 0 and simulate M_knap for enough steps that the distribution over the states is close to uniform.

Why does uniformity matter?

Does this fix the problem yet?
The trick

Assume a_0 ≤ a_1 ≤ ... ≤ a_{n−1} (sort the weights in increasing order).

Let b_0 = 0 and b_i = min{b, a_0 + ... + a_{i−1}}.

|Ω(b_{i−1})| ≤ |Ω(b_i)| – why? Because b_{i−1} ≤ b_i, so Ω(b_{i−1}) ⊆ Ω(b_i).

|Ω(b_i)| ≤ (n+1) |Ω(b_{i−1})| – why? Change any element of Ω(b_i) into one of Ω(b_{i−1}) by switching its rightmost 1 to a 0; this map is at most (n+1)-to-1.
How does that help?

|Ω(b)| = |Ω(b_n)| = |Ω(b_n)|/|Ω(b_{n−1})| × |Ω(b_{n−1})|/|Ω(b_{n−2})| × ... × |Ω(b_1)|/|Ω(b_0)| × |Ω(b_0)|

We can estimate each of these ratios by doing a walk on Ω(b_i) and computing the fraction of samples that land in Ω(b_{i−1}).

This gives a good estimate, since |Ω(b_{i−1})| ≤ |Ω(b_i)| ≤ (n+1)|Ω(b_{i−1})|: no ratio is ever smaller than 1/(n+1).
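Putting the pieces together, a hedged sketch of the product estimator (reusing mknap_step from the sketch above; the walk lengths and sample counts are illustrative stand-ins for the values in the analysis):

```python
def estimate_num_solutions(a, b, walk_steps=1000, samples=1000):
    """Estimate |Omega(b)| via the telescoping product of ratios."""
    a = sorted(a)                              # a_0 <= ... <= a_{n-1}
    n = len(a)
    prefix = [0]
    for w in a:
        prefix.append(prefix[-1] + w)
    bs = [min(b, p) for p in prefix]           # b_0 = 0, ..., b_n

    estimate = 1.0  # |Omega(b_0)| = 1 when all weights are positive
    for i in range(1, n + 1):
        hits = 0
        x = [0] * n                            # start each walk at 0
        for _ in range(samples):
            for _ in range(walk_steps):        # walk on Omega(b_i)
                x = mknap_step(x, a, bs[i])
            # Count samples that also fall in the smaller set.
            if sum(ai * xi for ai, xi in zip(a, x)) <= bs[i - 1]:
                hits += 1
        # hits/samples estimates |Omega(b_{i-1})| / |Omega(b_i)|.
        estimate *= samples / max(hits, 1)
    return estimate
```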
Analysis

Ignoring bias, the expectation of each trial is |Ω(b_{i−1})|/|Ω(b_i)|.

We take t = 17 ε^{−2} n^2 samples per ratio.

In analyzing the efficiency of MCMC methods, focus on the relative variance Var(X)/E[X]^2.
Analysis

If Z is the product of the per-ratio sample means,

E[Z] = Π_i |Ω(b_{i−1})|/|Ω(b_i)| = 1/|Ω(b)|

*Magic Statistics Steps*

Var(Z)/(E[Z])^2 ≤ ε^2/16

By Chebyshev's inequality:

Pr[ (1−ε/2)|Ω(b)| ≤ 1/Z ≤ (1+ε/2)|Ω(b)| ] ≥ 3/4
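The *Magic Statistics Steps* hide only a standard computation (my notation, not on the slide): by Chebyshev's inequality,

Pr[ |Z − E[Z]| > (ε/2) E[Z] ] ≤ Var(Z) / ((ε/2) E[Z])^2 ≤ (ε^2/16) / (ε^2/4) = 1/4,

so Z is within a factor (1 ± ε/2) of E[Z] = 1/|Ω(b)| with probability at least 3/4, and inverting Z gives the stated bound on the estimate of |Ω(b)|.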
Analysis

We used nt = 17 ε^{−2} n^3 samples in total.

This is an FPRAS (Fully Polynomial Randomized Approximation Scheme).

Except... what assumption did I make?
Mixing Time

Assumption: the walk is close to the uniform distribution within 17 ε^{−2} n^2 steps.

The number of steps needed to get close to the stationary distribution is known as the mixing time.

It is unknown whether this chain mixes in polynomial time.
Mixing Time

What does mix in polynomial time?

Dice – 1 transition

Shuffling cards – about 7 riffle shuffles (for 52 cards)

The ferromagnetic Ising model at high temperature – O(n log n)

What doesn't?

The ferromagnetic Ising model at low temperature – it starts to form magnets
Wes Weimer Memorial Conclusion Slide

The Markov Chain Monte Carlo method models the problem as a Markov Chain and then uses random walks.

Mixing time is important.

#P problems are hard.

Wes likes trespassing.