#### Transcript The Markov Chain Monte Carlo Method

```The Markov Chain Monte Carlo
Method
Isabelle Stanton
May 8, 2008 Theory Lunch
Monte Carlo vs Las Vegas


Las Vegas Algorithms
are randomized and
always give the
correct results but
gamble with
computation time
Quicksort

Monte Carlo
algorithms have fixed
running time but may
be wrong

Simulated Annealing

Estimating volume
Markov Chains

a memoryless stochastic process, eg, flipping a
coin
1/6
1/6
2
1/6
3
4
1
1/6
6
1/6
5
1/6
1/6
Other Examples of Markov Chains

Shuffling cards

Flipping a coin

PageRank Model

Particle systems – focus of MCMC work
General Idea


Model the system using a Markov Chain
Use a Monte Carlo Algorithm to perform some
Applications




Approximate Counting - # of solutions to 3-SAT
or Knapsack
Statistical Physics – when do phase transitions
occur?
Combinatorial optimization – simulated
annealing type of algorithms
We'll focus on counting
Monte Carlo Counting

How do you estimate the volume
of a complex solid?

Render with environment maps
efficiently?

Estimate an integral numerically?
(Picnic) Knapsack
weighs 5
weighs 4
weighs 10
How
What
many
is a solution?
solutions
are there?
weighs 2
weighs 4
Holds 20
Counting Knapsack Solutions

Item weights: a = (a0,...an)

Knapsack size: a real number b


Estimate the number of {0,1} vectors, x, that
satisfy a*x ≤ b
Let N denote the number of solutions
Naїve Solution

Randomly generate x

Calculate a*x

If a*x ≤ b return 2n

else return 0

This will return N in expectation:

0*(2n-N) + N*2n / 2n
Is this fast?

Counterexample:

a = (1, ... 1) and b = n/3

Any solution has less than n/3 1's

There are (n choose n/3)*2n/3 solutions
no



Pr(sample x, ||x|| ≤ n/3) < (n choose n/3)*2-2n/3
In expectation, need to generate 2n/3 x's before
we get a single solution!
Any polynomial number of trials will grossly
underestimate N
Knapsack with MCMC


Let Mknap be a markov chain withstate space
Ω(b) = {x | a*x ≤ b}
This will allow us to sample a solution
Various Mknap
a=(0,.5,.5) b = 1.5
a=(0,1,1) b = 1.5
111
011
001
101
010
000
110
100
Mknap Transitions

Transitions

With probability 1/2, x transitions to x

Otherwise, select an i u.a.r.
111
from 0 to n-1 and flip
the ith bit of x.
0.5
101
011
If x' is a
110
1/6
1/6
1/6
solution,
transition there.
0.5
1/6
001
0.5
010
100
0.5
1/6
0.5
1/6
1/6
000
0.5
a=(0,1,1) b = 1.5
Connected?

Is Mknap connected?

Yes. To get from x to x' go through 0.
Ergodicity

What is the stationary distribution of Knapsack?



Sample each solution with prob 1/N
A MC is ergodic if the probability distribution
over the states converges to the stationary
distribution of the system, regardless of the
starting configuration
Is Mknap ergodic? Yes.
Algorithm Idea

Start at 0 and simulate Mknap for enough steps
that the distribution over the states is close to
uniform

Why does uniformity matter?

Does this fix the problem yet?
The trick

Assume that a0 ≤ a1 ... ≤ an (0,1,2,…,n-1,n)

Let b0 = 0 and bi = min{b, Σiaj}

|Ω(bi-1)| ≤ |Ω(bi)| - why?

|Ω(bi)| ≤ (n+1)|Ω(bi-1)| - why?
Change any element of Ω(bi)
to one of Ω(bi-1) by switching
the rightmost 1 to a 0
How does that help?



|Ω(b)| = |Ω(bn)| = |Ω(bn)|/|Ω(bn-1)| x
|Ω(bn-1)|/|Ω(bn-2)| x ... x |Ω(b1)|/Ω|(b0)| x |Ω(b0)|
We can estimate each of these ratios by doing
a walk on Ω(bi) and computing the fraction of
samples in Ω(bi-1)
Good estimate since
|Ω(bi-1)| ≤ |Ω(bi)| ≤ (n+1)|Ω(bi-1)|
Analysis



Ignoring bias, the expectation of each trial is
|Ω(bi-1)|/|Ω(bi)|
We perform t = 17ε-2n2 steps
Focus on Var(X)/E(X)^2 in analyzing efficiency
for MCMC methods
Analysis

If Z is the product of the trials,
E[Z] = П |Ω(bi-1)|/|Ω(bi)|

*Magic Statistics Steps*

Var(Z)/(E[Z])2 ≤ ε2/16

By Chebyshev's:
Pr[(1-ε/2)|Ω(b)| ≤ Z ≤ (1+ε/2)|Ω(b)| ] ≥ 3/4
Analysis



We used nt = 17ε-2n3 steps
This is a FPRAS (Fully Polynomial Randomized
Approximation Scheme)
Except... what assumption did I make?
Mixing Time



Assumption: We are close to the uniform
distribution in 17ε-2n2 steps
This is known as the mixing time
It is unknown if this distribution mixes in
polynomial time
Mixing Time


What does mix in polynomial time?

Dice – 1 transition

Shuffling cards – 7 shuffles

ferromagnetic Ising model at high temperature –
O(nlog n)
What doesn't?

ferromagnetic Ising model at low temperature –
starts to form magnets
Wes Weimer Memorial Conclusion Slide

The markov chain monte carlo
method models the problem
as a Markov Chain and then
uses random walks

Mixing time is important

P# problems are hard

Wes likes trespassing
```