Transcript Lecture 4

Monte Carlo Methods and
Statistical Physics
Mathematical Biology Lecture 4
James A. Glazier
(Partially Based on Koonin and Meredith,
Computational Physics, Chapter 8)
• Monte Carlo Methods Use Statistical
Physics Techniques to Solve Problems
that are Difficult or Inconvenient to Solve
Deterministically.
• Two Basic Applications:
– Evaluation of Complex Multidimensional
Integrals (e.g. in Statistical Mechanics)
[1950s]
– Optimization of Problems where the
Deterministic Problem is Algorithmically Hard
(NP-Complete, e.g. the Traveling Salesman
Problem) [1970s].
• Both Applications Important in Biology.
Example: Thermodynamic Partition
Function
• For a Gas of N Atoms at Temperature 1/β,
Interacting Pairwise through a Potential v(r), the
Partition Function is:

$$Z = \int d^3 r_1 \cdots \int d^3 r_N \, \exp\Big(-\beta \sum_{i<j} v\big(|\vec{r}_i - \vec{r}_j|\big)\Big)$$
Suppose We Need to Evaluate Z Numerically with 10 Steps/Integration Variable.
Then We Have $10^{3N}$ Exponentials to Evaluate.
The Current Fastest Computer is About $10^{12}$ Operations/Second. One Year ~
$3 \times 10^7$ Seconds. So One Year ~ $3 \times 10^{19}$ Operations.
⇒ In One Year We Could Evaluate Z for About 7 Atoms!
This Result is Pretty Hopeless. There Must Be a Better Way.
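The arithmetic above can be checked with a short script. This sketch assumes the text's numbers (10 grid points per integration variable, $10^{12}$ operations/second, $3 \times 10^7$ seconds/year) and finds the largest N whose full $10^{3N}$-point grid fits in one year's budget:

```python
steps_per_axis = 10                 # grid points per integration variable
budget = 1e12 * 3e7                 # ops/second * seconds/year ≈ 3e19 ops

def cost(n_atoms):
    """One exponential per point of the 3N-dimensional grid: 10**(3N)."""
    return steps_per_axis ** (3 * n_atoms)

# Largest N whose grid fits in one year's operation budget.
n = 1
while cost(n + 1) <= budget:
    n += 1
print(n)   # -> 6; 7 atoms would already need 10**21 ops, ~30 years
```

The whole-grid count runs out between 6 and 7 atoms, which is the "about 7 atoms" order of magnitude in the text.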
Normal Deterministic Integration
• Consider the Integral: $I = \int_0^1 f(x)\,dx$
• Subdivide [0,1] into N Evenly Spaced Intervals of Width $\Delta x = 1/N$. Then:

$$I = \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i), \quad \text{where } x_i = (i-1)\,\Delta x.$$
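The sum above can be sketched directly; the integrand and N here are illustrative placeholders:

```python
def riemann_integrate(f, n):
    """Left-endpoint Riemann sum over [0, 1]: (1/N) * sum of f(x_i)
    with x_i = (i - 1) * dx for i = 1..N."""
    dx = 1.0 / n
    return sum(f(i * dx) for i in range(n)) * dx

# Example: integrate f(x) = x^2; the exact answer is 1/3.
approx = riemann_integrate(lambda x: x * x, 1000)
```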
Monte Carlo Integration
• Use the Idea of the Integral as
an Average: $I = \int_0^1 f(x)\,dx$
• Before We Solved by Subdividing [0,1] into Evenly
Spaced Intervals, but We Could Equally Well Pick the Positions
Where We Evaluate f(x) Randomly:

$$I = \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} f(x_i), \quad \text{where } x_i \in [0,1],$$

Chosen to be Uniform Random.
So: $I_N \equiv \frac{1}{N}\sum_{i=1}^{N} f(x_i)$ Approximates $I$.
Note: Need a Good Random Number Generator for this Method to Work.
See Press, Teukolsky, Vetterling, and Flannery, Numerical Recipes.
Error Estimate
How Good is the Estimate?

$$\sigma_I^2 = \text{Variance of } I = \frac{1}{N}\,\sigma_f^2 = \frac{1}{N}\left[\frac{1}{N}\sum_{i=1}^{N} f(x_i)^2 - \left(\frac{1}{N}\sum_{i=1}^{N} f(x_i)\right)^2\right]$$

For a Constant Function, the Error is 0
for Both Deterministic and Monte
Carlo Integration.
Two Rather Strange Consequences:
1) In Normal Integration, the Error is 0 for
Straight Lines.
In Monte Carlo Integration, the Error Differs
for Straight Lines Depending on Slope
(Worse for Steeper Lines): If $f(x) = mx$, Then $\sigma_I = \frac{m}{\sqrt{12 N}}$.
Error Estimate—Continued
2) Convergence is Slow: $\sigma_I \sim N^{-1/2}$,
While for Normal Deterministic Integration: $\sigma_I \sim N^{-2}$.
However, the Comparison Isn't Fair.
Suppose You Fix the Number of Subdomains in the Integral
to be N.
In d Dimensions Each Deterministic Sub-Integral Has $N^{1/d}$
Intervals.
So the Net Error is $\sigma_I \sim N^{-2/d}$. So, if d > 4 the Monte
Carlo Method is Better!
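A minimal sketch of the estimator $I_N$, together with the standard error $\sigma_I = \sigma_f/\sqrt{N}$ from the error-estimate discussion; the integrand and sample size here are illustrative:

```python
import random

def mc_integrate(f, n, rng=random.Random(0)):
    """Monte Carlo estimate of the integral of f over [0, 1],
    returned with its standard error sigma_I = sigma_f / sqrt(N)."""
    samples = [f(rng.random()) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n   # sigma_f squared
    return mean, (var / n) ** 0.5

# Example: f(x) = x^2, whose exact integral is 1/3.
est, err = mc_integrate(lambda x: x * x, 100_000)
```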
Pathology
• Like Normal Integration, Monte Carlo Integration Can
Have Problems.
• Suppose You Have N Delta
Functions Scattered over
the Unit Interval. Then:

$$I = \int_0^1 f(x)\,dx = N$$

• However, the Probability of Hitting a Delta Function is 0,
so $I_N = 0$.
• For Sharply-Peaked Functions, the Random Sample is a
Bad Estimate (Standard Numerical Integration Doesn't
Work Well Either).
Weight Functions
• Can Improve Estimates by Picking the 'Random' Points Intelligently, to Have
More Points Where f(x) is Large and Few Where f(x) is Small.
• Let w(x) be a Weight Function Such That: $\int_0^1 w(x)\,dx = 1$.
• For Deterministic Integration,
the Weight Function Has No Effect:

$$I = \int_0^1 w(x)\,\frac{f(x)}{w(x)}\,dx.$$

• Let: $y(x) = \int_0^x w(x')\,dx' \ \Rightarrow\ \frac{dy}{dx} = w(x),\ \ y(x=0) = 0,\ \ y(x=1) = 1.$
• Then: $I = \int_0^1 \frac{f(x(y))}{w(x(y))}\,dy \ \Rightarrow\ I_N = \frac{1}{N}\sum_{i=1}^{N}\frac{f(x(y_i))}{w(x(y_i))}$, Where We Pick $y_i \in [0,1]$, Uniform Random.
• Alternatively, Pick: $x_i \in [0,1]$ with $p(x_i \in [x, x+\delta]) = w(x)\,\delta$.
• So All We Have to Do is Pick x According to the Distribution w(x) and Divide
f(x) by That Distribution:

$$I_N = \frac{1}{N}\sum_{i=1}^{N}\frac{f(x_i)}{w(x_i)}.$$
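The recipe above can be sketched as: draw y uniform, map it through the inverse CDF x(y), and average f(x)/w(x). The particular choice w(x) = 2x (so y = x², x = √y) and the integrand are illustrations, not from the lecture:

```python
import random

def importance_sample(f, w, x_of_y, n, rng=random.Random(1)):
    """Estimate the integral of f over [0, 1]: draw y uniform in [0, 1),
    set x = x_of_y(y) (so x is distributed as w), average f(x)/w(x)."""
    total = 0.0
    for _ in range(n):
        x = x_of_y(rng.random())
        total += f(x) / w(x)
    return total / n

# Illustrative weight: w(x) = 2x, y(x) = x^2, x(y) = sqrt(y).
est = importance_sample(lambda x: x ** 1.5,   # f; exact integral is 2/5
                        lambda x: 2 * x,      # w
                        lambda y: y ** 0.5,   # x(y)
                        100_000)
```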
Weight Functions—Continued
• If $w(x) = f(x) \Rightarrow f/w$ is Constant $\Rightarrow \sigma_I = \sigma_{f/w} = 0$.
• Why Not Just Let w(x) = f(x)? Because Then We Need to Solve the
Integral to Invert y(x) to Obtain x(y), or to Pick x
According to w(x). But Stripping Linear Drift is Easy and
Always Helps.
• In d Dimensions We Have: $\int d^d x\, w(\vec{x}) = 1$.
• So: $y(\vec{x}) = \int_0^{\vec{x}} d^d x'\, w(\vec{x}')$,
Which is Hard to Invert, So We Need to Pick the $\{\vec{x}_i\}$ Directly
(Though, Again, We Can Strip Drift).
Example
• Let $I = \int_0^\infty e^{-x} g(x)\,dx$,
• And $w(x) = e^{-x}$.
• Then $y(x) = 1 - e^{-x} \ \Rightarrow\ x(y) = -\log(1-y)$.
• When You Can't Invert y(x), Refer to the Large
Literature on How to Generate the $\{x_i\}$ with the
Needed Distribution.
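For this example the inversion is explicit, so sampling is one line; the sample size is arbitrary:

```python
import math
import random

def sample_w(rng):
    """Draw x with density w(x) = exp(-x) on [0, inf) by inverting
    y(x) = 1 - exp(-x): x(y) = -log(1 - y), with y uniform in [0, 1)."""
    return -math.log(1.0 - rng.random())

rng = random.Random(2)
xs = [sample_w(rng) for _ in range(100_000)]
mean = sum(xs) / len(xs)    # should be close to 1, the mean of exp(-x)
```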
Metropolis Algorithm
• Originally a Way to Derive Statistics for the Canonical
Ensemble in Statistical Mechanics.
• A Way to Pick the $\vec{x}_i$ According to the Weight
Function $w(\vec{x})$ in a Very High Dimensional Space.
• Idea: Pick Any $\vec{x}_0$ and Do a Random Walk:

$$\vec{x}_n = \vec{x}_0 + \sum_{i=1}^{n} \vec{\eta}_i,$$

Where the $\vec{\eta}_i$ are Random Steps We Will Discuss Later,
Subject to Constraints Such That the Probability $p(\vec{x}_i)$ of a
Walker at $\vec{x}_i$ Has: $\lim_{n\to\infty} p(\vec{x}_i) = w(\vec{x}_i)$.
Problems:
1) Convergence Can Be Very Slow.
2) The Result Can Be Wrong.
3) The Variance is Not Known.
Algorithm
• For Any State $\vec{x}_i$, Generate a New Trial State $\vec{x}_t$.
Usually (Not Necessary) We Assume That $\vec{x}_t$ is Not
Too Far From $\vec{x}_i$, I.e. That It Lies Within a Ball of
Radius δ > 0 of $\vec{x}_i$: $|\vec{x}_i - \vec{x}_t| < \delta$.
• Let:

$$r = \frac{w(\vec{x}_t)}{w(\vec{x}_i)}.$$

• If r ≥ 1, Accept the Trial: $\vec{x}_{i+1} = \vec{x}_t$.
• If r < 1, Accept the Trial with Probability r. I.e.
Pick a Random Number $\eta \in [0,1]$.
– If $\eta < r$, Accept: $\vec{x}_{i+1} = \vec{x}_t$.
– Otherwise Reject: $\vec{x}_{i+1} = \vec{x}_i$.
• Repeat.
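The steps above can be sketched as a one-dimensional walker; the Gaussian weight function is an illustrative choice, not from the lecture:

```python
import math
import random

def metropolis(w, x0, n_steps, delta, rng=random.Random(3)):
    """Metropolis walk: propose a trial within +/-delta of the current
    point; accept if r = w(trial)/w(current) >= 1, else with probability r."""
    x = x0
    chain = []
    for _ in range(n_steps):
        x_t = x + rng.uniform(-delta, delta)
        r = w(x_t) / w(x)
        if r >= 1 or rng.random() < r:
            x = x_t                  # accept the trial
        chain.append(x)              # on rejection, x_i repeats in the chain
    return chain

# Illustrative weight: a standard Gaussian (an unnormalized w is fine,
# since only ratios of w enter the algorithm).
chain = metropolis(lambda x: math.exp(-0.5 * x * x), 0.0, 200_000, 1.0)
```

The rejected-move repetition is essential: dropping rejected steps instead of repeating the current point would bias the sampled distribution.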
Problems
• May Not Sample the Entire Space.
– If δ is Too Small, We Explore Only a Small Region
Around $\vec{x}_0$.
– If δ is Too Big, the Probability of Acceptance is Near 0.
Inefficient.
– If Regions of High w are Linked by Regions of Very
Low w, We Never See the Other Regions.
– If w is Sharply Peaked, We Tend to Get Stuck Near the
Maximum.
• The Sequence of $\{\vec{x}_i\}$ is Not Statistically
Independent, So We Cannot Estimate the Error.
• Fixes:
– Use Multiple Replicas: Many Different $\vec{x}_0$,
Which Together Sample the Whole Space.
– Pick δ So That the Acceptance Probability is ~
½ (Optimal).
– Run Many Steps Before Starting to Sample.
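The second fix, tuning δ toward ~½ acceptance, can be sketched as a crude scan over candidate step sizes; the Gaussian weight and the candidate grid are illustrative choices, not from the lecture:

```python
import math
import random

def acceptance_rate(w, delta, n=20_000, rng=random.Random(4)):
    """Fraction of Metropolis trials accepted at step size delta."""
    x, accepted = 0.0, 0
    for _ in range(n):
        x_t = x + rng.uniform(-delta, delta)
        if w(x_t) / w(x) >= rng.random():
            x = x_t
            accepted += 1
        # rejected trials leave x unchanged
    return accepted / n

# Scan candidate step sizes and keep the one closest to 1/2 acceptance.
w = lambda x: math.exp(-0.5 * x * x)
deltas = [0.25 * k for k in range(1, 41)]          # 0.25 .. 10.0
best = min(deltas, key=lambda d: abs(acceptance_rate(w, d) - 0.5))
```

A bisection on δ would converge faster, but a scan makes the monotone trade-off visible: tiny δ accepts nearly everything, huge δ accepts almost nothing.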
Convergence Theorem
• Theorem: $\lim_{n\to\infty} p(\vec{x}_i) = w(\vec{x}_i)$.
• Proof: Consider Many Independent Walkers Starting from
Every Possible Point and Let Them Run For a Long Time.
• Let $N_n(\vec{x})$ be the Density of Walkers at Point $\vec{x}$.
• Consider Two Points $\vec{x}$ and $\vec{y}$. Let $P(\vec{x}\to\vec{y})$ be the
Probability for a Single Walker to Jump from $\vec{x}$ to $\vec{y}$.
• The Rate of Jumps from $\vec{x}$ to $\vec{y}$ is: $N_n(\vec{x})\,P(\vec{x}\to\vec{y})$.
• The Rate of Jumps from $\vec{y}$ to $\vec{x}$ is: $N_n(\vec{y})\,P(\vec{y}\to\vec{x})$.
• So the Net Change in $N_n(\vec{x})$ is:

$$\frac{dN_n(\vec{x})}{dt} = N_n(\vec{y})\,P(\vec{y}\to\vec{x}) - N_n(\vec{x})\,P(\vec{x}\to\vec{y}).$$
Convergence Theorem—Continued
•At Equilibrium:

$$0 = \frac{dN_e(\vec{x})}{dt} \ \Rightarrow\ \frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{P(\vec{y}\to\vec{x})}{P(\vec{x}\to\vec{y})}.$$

•So if $N_n(\vec{x}) > N_e(\vec{x}) \Rightarrow \frac{dN_n(\vec{x})}{dt} < 0$.
•And if $N_n(\vec{x}) < N_e(\vec{x}) \Rightarrow \frac{dN_n(\vec{x})}{dt} > 0$.
•So $N_n(\vec{x})$ Always Tends to Its Equilibrium Value Monotonically and the Rate
of Convergence is Linear in the Deviation:

$$v \equiv N_e(\vec{x}) - N_n(\vec{x})\ (\text{Deviation from Equilibrium}) \ \Rightarrow\ F = -mv.$$

•Implies That the System is Perfectly Damped.
•This Result is the Fundamental Justification for Using the Metropolis Algorithm
to Calculate Nonequilibrium and Kinetic Phenomena.
Convergence Theorem—Conclusion
•Need to Evaluate $P(\vec{x}\to\vec{y})$.
•If $\vec{x}\to\vec{y}$ is Allowed, So is $\vec{y}\to\vec{x}$ (Detailed Balance). I.e.
$P(\vec{y}\to\vec{x}) \neq 0 \Leftrightarrow P(\vec{x}\to\vec{y}) \neq 0$.
•So if $w(\vec{x}) > w(\vec{y})$, Then $P(\vec{y}\to\vec{x}) = 1$ and $P(\vec{x}\to\vec{y}) = \frac{w(\vec{y})}{w(\vec{x})}$. Then:

$$\frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{P(\vec{y}\to\vec{x})}{P(\vec{x}\to\vec{y})} = \frac{1}{w(\vec{y})/w(\vec{x})} = \frac{w(\vec{x})}{w(\vec{y})}.$$

•While if $w(\vec{y}) > w(\vec{x})$, Then $P(\vec{x}\to\vec{y}) = 1$ and $P(\vec{y}\to\vec{x}) = \frac{w(\vec{x})}{w(\vec{y})}$. Then:

$$\frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{w(\vec{x})/w(\vec{y})}{1} = \frac{w(\vec{x})}{w(\vec{y})}.$$

•So $\frac{N_e(\vec{x})}{N_e(\vec{y})} = \frac{w(\vec{x})}{w(\vec{y})}$ Always.
•Normalizing by the Total Number of Walkers $\Rightarrow p(\vec{x}) = w(\vec{x})$. □
•Note That This Result is Independent of How You Choose $\vec{y}$ Given $\vec{x}$,
As Long As Your Algorithm Has Nonzero Transition Probabilities for All
Initial Conditions and Obeys Detailed Balance (Second Line Above).
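The conclusion $N_e(\vec{x})/N_e(\vec{y}) = w(\vec{x})/w(\vec{y})$ can be checked numerically: run a long walk and compare the walker-count ratio of two bins against the weight ratio. All numerical choices here (Gaussian w, bin edges, run length, burn-in) are illustrative:

```python
import math
import random

rng = random.Random(5)
w = lambda x: math.exp(-0.5 * x * x)
x = 0.0
counts = {"low": 0, "high": 0}       # bins [0, 0.5) and [1.0, 1.5)
for step in range(500_000):
    x_t = x + rng.uniform(-1.0, 1.0)
    if w(x_t) / w(x) >= rng.random():   # Metropolis accept/reject
        x = x_t
    if step > 10_000:                   # discard burn-in before counting
        if 0.0 <= x < 0.5:
            counts["low"] += 1
        elif 1.0 <= x < 1.5:
            counts["high"] += 1

empirical = counts["low"] / counts["high"]
predicted = w(0.25) / w(1.25)        # weight ratio at the bin midpoints
```

The two ratios agree up to statistical noise and the midpoint approximation of each bin, which is the detailed-balance result in miniature.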