Transcript: Lecture 4
Monte Carlo Methods and Statistical Physics
Mathematical Biology Lecture 4
James A. Glazier
(Partially Based on Koonin and Meredith, Computational Physics, Chapter 8)
• Monte Carlo Methods Use Statistical Physics Techniques to Solve Problems that are Difficult or Inconvenient to Solve Deterministically.
• Two Basic Applications:
– Evaluation of Complex Multidimensional Integrals (e.g. in Statistical Mechanics) [1950s].
– Optimization of Problems whose Deterministic Solution is Algorithmically Hard (NP-Complete, e.g. the Traveling Salesman Problem) [1970s].
• Both Applications are Important in Biology.
Example: Thermodynamic Partition Function
• For a Gas of N Atoms at Temperature $1/\beta$, Interacting Pairwise through a Potential V(r), the Partition Function is:
$Z = \int d^3 r_1 \cdots d^3 r_N \, \exp\Bigl(-\beta \sum_{i<j} V(|\vec{r}_i - \vec{r}_j|)\Bigr).$
• Suppose We Need to Evaluate Z Numerically with 10 Steps per Integration Variable. Then We Have $10^{3N}$ Exponentials to Evaluate.
• The Current Fastest Computer runs at About $10^{12}$ Operations/Second. One Year is ~$3\times10^{7}$ Seconds, so One Year is ~$3\times10^{19}$ Operations.
• In One Year We Could Evaluate Z for Only About 7 Atoms!
• This Result is Pretty Hopeless. There Must Be a Better Way.
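• Spelling out the arithmetic behind that claim (a worked version of the estimate above):
\[
10^{3N} \le 3\times10^{19}
\;\Longrightarrow\;
3N \le \log_{10}\bigl(3\times10^{19}\bigr) \approx 19.5
\;\Longrightarrow\;
N \lesssim 6.5 \approx 7.
\]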
Normal Deterministic Integration
• Consider the Integral: $I = \int_0^1 f(x)\,dx.$
• Subdivide [0,1] into N Evenly Spaced Intervals of Width $\Delta x = 1/N$. Then:
$I = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} f(x_i), \quad \text{where } x_i = (i-1)\,\Delta x.$
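• A minimal Python sketch of this rule (the function name and the test integrand $f(x)=x^2$ are our own illustrative choices, not from the lecture):

# Left-endpoint rule from the slide: average f over x_i = (i - 1) * dx, i = 1..N.
def riemann_integrate(f, N):
    dx = 1.0 / N
    return sum(f((i - 1) * dx) for i in range(1, N + 1)) / N

# Illustrative check: the integral of x^2 over [0, 1] is 1/3.
print(riemann_integrate(lambda x: x * x, 10**5))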
Monte Carlo Integration
• Use the Idea of the Integral as an Average: $I = \int_0^1 f(x)\,dx.$
• Before, We Solved by Subdividing [0,1] into Evenly Spaced Intervals, but We Could Equally Well Pick the Positions Where We Evaluate f(x) at Random:
$I = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} f(x_i), \quad \text{where the } x_i \in [0,1] \text{ are Chosen Uniform Random.}$
• So $I_N = \frac{1}{N}\sum_{i=1}^{N} f(x_i)$ Approximates I.
• Note: You Need a Good Random Number Generator for this Method to Work. See Press, Teukolsky, Vetterling, and Flannery, Numerical Recipes.
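• The same sketch with random sample points (again our own minimal Python, with the same illustrative integrand):

import random

# Monte Carlo estimate I_N = (1/N) * sum of f(x_i) at N uniform random points.
def mc_integrate(f, N):
    return sum(f(random.random()) for _ in range(N)) / N

# Same check: the integral of x^2 over [0, 1] is 1/3, now with O(1/sqrt(N)) error.
print(mc_integrate(lambda x: x * x, 10**5))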
Error Estimate
• How Good is the Estimate? The Variance of I is:
$\sigma_I^2 = \frac{1}{N}\Bigl(\langle f^2\rangle - \langle f\rangle^2\Bigr) = \frac{1}{N}\times\text{Variance of } f, \quad \text{where } \langle f\rangle = \frac{1}{N}\sum_{i=1}^{N} f(x_i), \; \langle f^2\rangle = \frac{1}{N}\sum_{i=1}^{N} f(x_i)^2.$
• For a Constant Function, the Error is 0 for Both Deterministic and Monte Carlo Integration.
• Two Rather Strange Consequences:
1) In Normal Integration, the Error is 0 for Straight Lines. In Monte Carlo Integration, the Error for Straight Lines Differs Depending on Slope (Worse for Steeper Lines): if $f(x) = mx$, then $\sigma_I \propto m$.
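• A sketch of this estimator in Python (our own code; the exact value $\sigma_I = m/\sqrt{12N}$ uses the fact that the variance of $mx$ on [0,1] is $m^2/12$):

import random

# Estimate sigma_I from the sample variance: sigma_I^2 = (<f^2> - <f>^2) / N.
def mc_error(f, N):
    xs = [random.random() for _ in range(N)]
    mean_f = sum(f(x) for x in xs) / N
    mean_f2 = sum(f(x) * f(x) for x in xs) / N
    return ((mean_f2 - mean_f ** 2) / N) ** 0.5

# For f(x) = m*x the error grows linearly with the slope m.
m, N = 2.0, 10**5
print(mc_error(lambda x: m * x, N))   # measured
print(m / (12 * N) ** 0.5)            # exact, ~0.0018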
Error Estimate—Continued
2) Convergence is Slow: $\sigma_I \sim N^{-1/2}$, while for Normal Deterministic Integration $\sigma_I \sim N^{-2}$.
• However, the Comparison Isn't Fair. Suppose You Fix the Total Number of Subdomains in the Integral to be N. In d Dimensions, Each Deterministic Sub-Integral then has only $N^{1/d}$ Intervals, so the Net Deterministic Error is $\sigma_I \sim N^{-2/d}$.
• So, if d > 4 the Monte Carlo Method is Better!
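• Spelling out the crossover:
\[
\sigma_I^{\mathrm{MC}} \sim N^{-1/2} < N^{-2/d} \sim \sigma_I^{\mathrm{det}}
\;\Longleftrightarrow\; \frac{1}{2} > \frac{2}{d}
\;\Longleftrightarrow\; d > 4.
\]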
Pathology
• Like Normal Integration, Monte Carlo Integration Can Have Problems.
• Suppose You Have N Delta Functions Scattered over the Unit Interval. Then:
$I = \int_0^1 f(x)\,dx = N.$
• However, the Probability of Hitting a Delta Function is 0, so $I_N = 0$.
• For Sharply-Peaked Functions, a Random Sample is a Bad Estimate (Standard Numerical Integration doesn't Work Well Either).
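• A small numerical illustration (our own construction, with a narrow unit-area Gaussian standing in for a delta function):

import math, random

eps = 1e-6  # width of the spike centered at x = 0.5
def f(x):
    return math.exp(-0.5 * ((x - 0.5) / eps) ** 2) / (eps * math.sqrt(2 * math.pi))

# The true integral over [0, 1] is ~1, but uniform samples almost never land
# on the spike, so the Monte Carlo estimate is usually ~0 (and wildly noisy).
N = 10**4
print(sum(f(random.random()) for _ in range(N)) / N)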
Weight Functions
• Can Improve Estimates by Picking the ‘Random’ Points Intelligently, to Have More Points Where f(x) is Large and Few Where f(x) is Small.
• Let w(x) be a Weight Function Such That: $\int_0^1 w(x)\,dx = 1.$
• For Deterministic Integration, the Weight Function has No Effect:
$I = \int_0^1 w(x)\,\frac{f(x)}{w(x)}\,dx.$
• Let: $y(x) = \int_0^x w(x')\,dx'$, so that $\frac{dy}{dx} = w(x)$, $y(x{=}0) = 0$, $y(x{=}1) = 1.$
• Then: $I = \int_0^1 \frac{f(x(y))}{w(x(y))}\,dy \approx I_N = \frac{1}{N}\sum_{i=1}^{N}\frac{f(x(y_i))}{w(x(y_i))}$, Where We Pick the $y_i \in [0,1]$ Uniform Random.
• Alternatively, Pick the $x_i \in [0,1]$ Directly with Probability $p\bigl(x_i \in [x, x+\delta x]\bigr) = w(x)\,\delta x.$
• So All We Have to do is Pick x According to the Distribution w(x) and Divide f(x) by that Distribution:
$I_N = \frac{1}{N}\sum_{i=1}^{N}\frac{f(x_i)}{w(x_i)}.$
Weight Functions—Continued
• If $w(x) = f(x)/I$, then $f/w = I$ is Constant, and the Variance of the Estimate Vanishes.
• Why Not Just Let w(x) ∝ f(x)? Because Then You Need to Solve the Original Integral to Invert y(x) to Obtain x(y), or to Pick x According to w(x). But Stripping Linear Drift is Easy and Always Helps.
• In d Dimensions We Have: $\int w(\vec{x})\,d^d x = 1.$
• So: $y(\vec{x}) = \int_0^{\vec{x}} w(\vec{x}')\,d^d x',$
• Which is Hard to Invert, so We Need to Pick the $x_i$ Directly (Though, Again, We Can Strip Drift).
Example
• Let $I = \int_0^\infty e^{-x} g(x)\,dx,$
• And $w(x) = e^{-x}.$
• Then $y(x) = 1 - e^{-x}$, so $x(y) = -\log(1 - y).$
• When You Can't Invert y(x), Refer to the Large Literature on How to Generate the $x_i$ With the Needed Distribution.
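• A minimal Python sketch of this example (our own code; the test function $g(x)=x^2$, whose weighted integral is exactly 2, is an illustrative choice):

import math, random

# Importance sampling for I = integral of e^(-x) * g(x) over [0, inf):
# pick y uniform in [0, 1), invert y(x) = 1 - e^(-x) to get x = -log(1 - y),
# and average f(x)/w(x) = g(x) over the exponentially distributed x.
def importance_sample(g, N):
    return sum(g(-math.log(1.0 - random.random())) for _ in range(N)) / N

print(importance_sample(lambda x: x * x, 10**6))  # integral of e^(-x) x^2 dx = 2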
Metropolis Algorithm
• Originally a Way to Derive Statistics for the Canonical Ensemble in Statistical Mechanics.
• A Way to Pick the $x_i$ According to the Weight Function w(x) in a Very High Dimensional Space.
• Idea: Pick any $x_0$ and do a Random Walk:
$x_n = x_0 + \sum_{i=1}^{n} \delta_i$, where the $\delta_i$ are Random Steps We will Discuss Later,
Subject to Constraints Such that the Probability $p(x_i)$ of Finding a Walker at $x_i$ Satisfies: $\lim_{n\to\infty} p(x_i) = w(x_i).$
• Problems:
1) Convergence can be Very Slow.
2) The Result can be Wrong.
3) The Variance is Not Known.
Algorithm
• For any State $x_i$, Generate a New Trial State $x_t$. Usually (though Not Necessarily) We Assume that $x_t$ is Not Too Far From $x_i$, I.e. that it Lies within a Ball of Radius $\delta > 0$ around $x_i$: $|x_i - x_t| \le \delta.$
• Let: $r = \frac{w(x_t)}{w(x_i)}.$
• If $r \ge 1$, Accept the Trial: $x_{i+1} = x_t.$
• If $r < 1$, Accept the Trial with Probability r. I.e. Pick a Uniform Random Number $\eta \in [0,1]$:
– If $\eta < r$, Accept: $x_{i+1} = x_t.$
– Otherwise, Reject: $x_{i+1} = x_i.$
• Repeat.
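• A minimal Python sketch of the algorithm (our own code; the 1-D Gaussian weight is an illustrative choice, not from the lecture):

import math, random

# Metropolis walk: trial step of size <= delta, accepted with probability min(1, r).
def metropolis(w, x0, delta, n_steps):
    x, samples = x0, []
    for _ in range(n_steps):
        xt = x + delta * (2.0 * random.random() - 1.0)
        r = w(xt) / w(x)
        if r >= 1.0 or random.random() < r:
            x = xt          # accept the trial
        samples.append(x)   # on rejection, x_{i+1} = x_i (old state recounted)
    return samples

w = lambda x: math.exp(-0.5 * x * x)  # unnormalized Gaussian weight
samples = metropolis(w, x0=0.0, delta=1.0, n_steps=10**5)
print(sum(s * s for s in samples) / len(samples))  # <x^2> should approach 1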
Problems
• May Not Sample the Entire Space:
– If δ is Too Small, You Explore only a Small Region Around $x_0$.
– If δ is Too Big, the Probability of Acceptance is Near 0. Inefficient.
– If Regions of High w are Linked by Regions of Very Low w, You Never See the Other Regions.
– If w is Sharply Peaked, You Tend to Get Stuck Near the Maximum.
• The Sequence of $x_i$ is Not Statistically Independent, so You Cannot Estimate the Error.
• Fixes (see the sketch after this list):
– Use Multiple Replicas: Many Different $x_0$, Which Together Sample the Whole Space.
– Pick δ So that the Acceptance Probability is ~½ (Optimal).
– Run Many Steps Before Starting to Sample.
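• One of the fixes, sketched in Python (the batch sizes and the 10% adjustment factor are our own heuristic choices):

import math, random

# Tune delta toward ~1/2 acceptance: run short batches, then grow delta when
# too many trials are accepted and shrink it when too few are.
def tune_delta(w, x0, delta, batches=50, batch_size=200):
    x = x0
    for _ in range(batches):
        accepted = 0
        for _ in range(batch_size):
            xt = x + delta * (2.0 * random.random() - 1.0)
            if w(xt) / w(x) > random.random():
                x, accepted = xt, accepted + 1
        delta *= 1.1 if accepted / batch_size > 0.5 else 0.9
    return delta

print(tune_delta(lambda x: math.exp(-0.5 * x * x), 0.0, 1.0))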
Convergence Theorem
• Theorem: $\lim_{n\to\infty} p(x_i) = w(x_i).$
• Proof: Consider Many Independent Walkers Starting from Every Possible Point, and Let Them Run For a Long Time.
• Let $N_n(\vec{x})$ be the Density of Walkers at Point $\vec{x}$.
• Consider Two Points $\vec{x}$ and $\vec{y}$. Let $P(\vec{x}\to\vec{y})$ be the Probability for a Single Walker to Jump from $\vec{x}$ to $\vec{y}$.
• The Rate of Jumps from $\vec{x}$ to $\vec{y}$ is: $N_n(\vec{x})\,P(\vec{x}\to\vec{y}).$
• The Rate of Jumps from $\vec{y}$ to $\vec{x}$ is: $N_n(\vec{y})\,P(\vec{y}\to\vec{x}).$
• So the Net Change in $N_n(\vec{x})$ is:
$\frac{dN_n(\vec{x})}{dt} = N_n(\vec{y})\,P(\vec{y}\to\vec{x}) - N_n(\vec{x})\,P(\vec{x}\to\vec{y}).$
Convergence Theorem—Continued
• At Equilibrium:
$\frac{dN_e(\vec{x})}{dt} = 0 \;\Longrightarrow\; N_e(\vec{x})\,P(\vec{x}\to\vec{y}) = N_e(\vec{y})\,P(\vec{y}\to\vec{x}).$
• So if $N_n(\vec{x}) > N_e(\vec{x})$, then $\frac{dN_n(\vec{x})}{dt} < 0.$
• And if $N_n(\vec{x}) < N_e(\vec{x})$, then $\frac{dN_n(\vec{x})}{dt} > 0.$
• So $N_n(\vec{x})$ Always Tends Monotonically to its Equilibrium Value, and the Rate of Convergence is Linear in the Deviation:
$v \propto N_e(\vec{x}) - N_n(\vec{x}) = \text{Deviation from Equilibrium}$, as in $F = mv.$
• This Implies that the System is Perfectly Damped.
• This Result is the Fundamental Justification for Using the Metropolis Algorithm to Calculate Nonequilibrium and Kinetic Phenomena.
Convergence Theorem—Conclusion
• Need to Evaluate $P(\vec{x}\to\vec{y})$.
• If $\vec{x}\to\vec{y}$ is Allowed, So is $\vec{y}\to\vec{x}$ (Detailed Balance). I.e. $P(\vec{y}\to\vec{x}) \ne 0 \iff P(\vec{x}\to\vec{y}) \ne 0.$
• So if $w(\vec{x}) \ge w(\vec{y})$, then $P(\vec{y}\to\vec{x}) = 1$ and $P(\vec{x}\to\vec{y}) = \frac{w(\vec{y})}{w(\vec{x})}$, so:
$\frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{P(\vec{y}\to\vec{x})}{P(\vec{x}\to\vec{y})} = \frac{1}{w(\vec{y})/w(\vec{x})} = \frac{w(\vec{x})}{w(\vec{y})}.$
• While if $w(\vec{y}) > w(\vec{x})$, then $P(\vec{x}\to\vec{y}) = 1$ and $P(\vec{y}\to\vec{x}) = \frac{w(\vec{x})}{w(\vec{y})}$, so:
$\frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{P(\vec{y}\to\vec{x})}{P(\vec{x}\to\vec{y})} = \frac{w(\vec{x})/w(\vec{y})}{1} = \frac{w(\vec{x})}{w(\vec{y})}.$
• So $\frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{w(\vec{x})}{w(\vec{y})}$ Always.
• Normalizing by the Total Number of Walkers: $p(\vec{x}) = \frac{w(\vec{x})}{\sum_{\vec{y}} w(\vec{y})}.$ □
• Note that This Result is Independent of How You Choose $x_t$ Given $x_i$, As Long as Your Algorithm Has Nonzero Transition Probabilities for All Initial Conditions and Obeys Detailed Balance (Second Line Above).
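• A numerical check of the theorem (our own construction): evolve many independent walkers and verify that their density approaches w.

import math, random

# Many independent Metropolis walkers; their empirical distribution should
# converge to w (here an unnormalized Gaussian, our illustrative choice).
w = lambda x: math.exp(-0.5 * x * x)
walkers = [random.uniform(-5.0, 5.0) for _ in range(10**4)]

for _ in range(500):  # many steps, so the walker density can equilibrate
    for i, x in enumerate(walkers):
        xt = x + random.uniform(-1.0, 1.0)
        if w(xt) / w(x) > random.random():
            walkers[i] = xt

# For the Gaussian, the fraction with |x| < 1 should approach erf(1/sqrt(2)) ~ 0.68.
print(sum(1 for x in walkers if abs(x) < 1.0) / len(walkers))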