Markov Chain Monte Carlo Method

Download Report

Transcript Markov Chain Monte Carlo Method

Physical Fluctuomatics
Applied Stochastic Process
8th Markov chain Monte Carlo methods
Kazuyuki Tanaka
Graduate School of Information Sciences, Tohoku University
[email protected]
http://www.smapip.is.tohoku.ac.jp/~kazu/
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
1
Horizon of Computation
in Probabilistic Information Processing
Compensation of expressing uncertainty
using probability and statistics
It must be calculated by taking account of both events
with high probability and events with low probability.
Computational Complexity
It is expected to break throw the computational
complexity by introducing approximation algorithms.
Physical Fuctuomatics (Tohoku
University)
2
What is an important point in
computational complexity?
How should we treat the
calculation of the
summation over 2N
configuration?
a  0;
for( x1  T or F){
for( x2  T or F){

for( x N  T or F){
    f x , x ,, x 
x1 T, F x2 T, F
x N T, F
1
2
a  a  f  x1 , x2 ,  , xL ;
N
If it takes 1 second in the case of
N=10, it takes 17 minutes in
N=20, 12 days in N=30 and 34
years in N=40.
}

}
N fold loops
}
Markov Chain Monte Carlo Metod
Belief Propagation Method
Physical Fuctuomatics (Tohoku
University)
This Talk
Next Talk
3
Calculation of the ratio of the circumference of
a circle to its diameter by using random
numbers (Monte Carlo Method)
1
n0 m0
-1
nn+1
Generate uniform random numbers
a and b in the interval [-1,1]
a2+b2≤1
Yes
mm+1
No
0
1
-1
Count the number of points inside of the
unit circle after plotting points randomly
the ratio of the circumference of a
circle to its diameter
S  R 2
R 1
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
 S
4m
n
Accuracy is improved as the
number of trials increases 4
Law of Large Numbers
Let us suppose that random variables X1,X2,...,Xn are
identical and mutual independent random variables with
average . Then we have
1
Yn  ( X 1  X 2    X n )   (n  )
n
Central Limit Theorem
We consider a sequence of independent, identical distributed
random variables, {X1,X2,...,Xn}, with average  and variance s2.
Then the distribution of the random variable
Yn 
1
( X1  X 2    X n )
n
tends to the Gauss distribution with average  and variance
s2/n as n+.
Physics Fluctuomatics/Applied
Stochastic Process (Tohoku University)
5
Calculation of the ratio of the circumference
of a circle to its diameter by using random
numbers (Monte Carlo Method)
1
x- and y- coordinates of each random
points is the average 0 and the
variance ½.
0
-1
1
From the central limit theorem, the
sample average and the sample
-1
variance are 0 and 1/2n for n
random points.
Count the number of points m
inside of the unit circle after
The width of probability density
plotting points n randomly
function decreases by according to
1/n1/2 as the number of points, n,
the ratio of the circumference of a
increases.
circle to its diameter
S  R
Order of the error of the ratio of
the circumference of a circle to its R  1
Physical Fluctuomatics / Applied
diameter is O(1/n1/2)
2
Stochastic Process (Tohoku University)
 S
4m
n
Accuracy is improved as the
number of trials increases
6
Integral Computation by
Monte Carlo Method
1
n0 m0
I 
1 1

-1 -1
f ( x, y )dxdy
nn+1
-1
Generate uniform random
numbers a and b in the
interval [-1,1]
mm + f(a,b)
4m
I
n
0
1
-1
Compute the value of f(x,y) at each
point (x,y) after plotting points n
inside of the green region randomly
Accuracy is O(1/n1/2)
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
7
Marginal Probability
Marginal Probability
P1 x1  
    Px , x ,, x 
x2 0,1 x3 0,1
P12 x1 , x2  
P13 x1 , x3  
xL 0,1
1
2
L
    Px , x , x ,, x 
x3 0,1 x4 0,1
xL 0,1
1
2
3
L
    Px , x , x ,, x 
x2 0,1 x4 0,1
xL 0,1
1
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
2
3
L
8
Important Point of Computations
Can we make an algorithm to generate |V| random vectors
(x1,x2,…,x|V|) which are independent of each other?
The random numbers should be according to
P x   Px1 , x2 ,, x|V | 
Computational time generating one random
numbers should be order of |V|.
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
9
Fundamental Stochastic Process:
Markov Process
Transition Probability w(x|y)≥0 (x,y=0,1)
 wz x   1
For any initial distribution P0(x),
Pt ( x) 
1
 w( x | z ) Pt -1 ( z )
z 0
( x  0,1; t  1,2, )
z 0
 Pt (0)   w(0 | 0) w(0 | 1)  Pt -1 (0) 

  


 Pt (1)   w(1 | 0) w(1 | 1)  Pt -1 (1) 
Transition Matrix
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
10
Fundamental Stichastic Process: Markov Chain
Pt ( x) 

1
 w( x | z[t - 1]) Pt -1 ( z[t - 1])
z[t -1]  0
1

1

z[t -1]  0 z[t - 2]  0
1

 w( x | z[t - 1]) w( z[t - 1] | z[t - 2])  w( z[1] | z[0]) P0 ( z[0])
z[t -1]  0
 Pt (0) 
 P0 (0) 
t

  W 

 Pt (1) 
 P0 (1) 
 Pt (0) 
 1 0  -1  P0 (0) 

  U 

t U 
0  
 Pt (1) 
 P0 (1) 
Transition matrix can be diagonalized as
 1 0  -1
U
W  U 
(   1)
0  
 Pt (0) 
 P(0) 
 1 0  -1  P0 (0) 
  U 


  lim 
U 
 P(1)  t   Pt (1) 
 0 0
 P0 (1) 
Limit Distribution
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
11
Fundamental Stochastic Process:
Markov Process
Pt ( x) 
1
 w( x | z[t - 1]) Pt -1 ( z[t - 1])
z[t -1]  0
 P(0) 
 P(0) 
  W 

P( x)   w( x | z[t - 1]) P( z[t - 1]) 
 P(1) 
 P(1) 
z[t -1]  0
1
Stationary Distribution or Equilibrium Distribution
In the Markov process, if there exists one unique limiting distribution,
it is an equilibrium distribution.
 P (0) 
 P(0) 

  lim  t 
 P(1)  t   Pt (1) 
Even if there exists one equilibrium distribution, it is not always a limiting distribution.
Example  0 1  1 1 1  1 0 1 1  The stationary distribution is
  



W  
 P(0)  1 / 2 
 1 0  2 1 - 1 0 - 11 - 1

  

P
(
1
)
1
/
2

 

Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
12
Stationary Process and Detailed Balance
in Markov Process
Pt  x  
1
 wx y Pt -1 ( y)
1
where
y 0
 wy x   1
y 0
P1(x), P2(x), P3(x),…: Markov Chain
Detailed Balance
w y x P( x)  wx y P( y)
When the transition probability w(x|y) is chosen so as to
satisfy the detailed balance, the Markov process provide us a
stationary distribution P(x).
Px  
1
Stationary Distribution of




w
x
y
P
y

Markov Process
y 0
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
13
Markov Chain Monte Carlo Method
Let us consider a joint probability distribution P(x1,x2,…,xL)

P( x)  P( x1, x2 ,, xL )

x  ( x1 , x2 ,, xL )T
How to find the transition probability w(x|y) so as to satisfy


lim Pt ( x )  P ( x )
t  
where



Pt ( x )   wx y Pt -1 ( y) (t  1,2,3,)

y
P1(x), P2(x), P3(x),…: Markov Process
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
14
Markov Chain Monte Carlo Method

xt - 1
 
wxt  xt - 1

x t 
Reject


x[1]
x[2]


x[t  1]
x[t  2]




x[( N - 1)t  1] x[( N - 2)t  1]
Accuracy
1/2
O(1/t )





x[t ]

x[2t ]


x[ Nt  1]
They can be regarded as
samples from the given
probability distribution P(x).
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
Randomly generated
For sufficient
large t, x[t],
x[2t], x[3t], …,
x[Nt] are
independent of
each other
How large
number t ?
t: relaxation time
15
Markov Chain Monte Carlo Method


x[1]
x[2]


x[t  1]
x[t  2]




x[( N - 1)t  1] x[( N - 1)t  2]
P1  X 1  





x[t ]

x[2t ]


x[ Nt ]
    P X 1 , X 2 , X 3 , , X L 
X2 X3
XL
Histgram
N
1

 x1 nt , X 1

N n 1
Marginal Probability Distribution
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
Xi
16
Markov Chain Monte Carlo Method

P( x )  P( x1 , x2 ,, xL ) 

ij
( xi , x j )
{i , j }E
P x1 , x2 , , xi -1 , xi , xi 1 , , xL 
Pxi x1 , x2 , , xi -1 , xi 1 ,, xL  
P x1 , x2 ,, xi -1 , xi 1 , , xL 
Px1 , x2 ,, xi -1 , xi 1 ,, xL    Px1 , x2 ,, xi -1 , zi , xi 1 ,, xL 
zi
V:Set of all the nodes
E:Set of all the neighbouring pairs of nodes
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
17
Markov Chain Monte Carlo Method

P( x )  P( x1 , x2 ,, xL ) 

{i , j }
( xi , x j )
{i , j }E
Pxi x1 , x2 ,, xi -1 , xi 1 ,, xL
  x , x 

 Px x
  z , x 
ji
{i , j }
i
j
i
{i , j }
zi
i

j  i
j
ji
(V , E )
L | V |
j
(V , E )
Markov
Random Field
E:Set of all the
neighbouring pairs of nodes
∂i:Set of all the neighbouring
nodes of the node i
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
18
Markov Chain Monte Carlo Method

P( x )  P( x1 , x2 ,, xL ) 

{i , j }
( xi , x j )
{i , j }E
wx1, x2 ,, xL x1 , x2 ,, xL
xk  xk k  i 
(V , E )
 x, x 



      ( x , x ) 

   z , x 
ji
iV
k
{i , j }
i
j
k
kV / i
{i , j }
zi
i
j
ji
  


wx ' x Px   wx x' Px' 

xt - 1
 
wxt  xt - 1
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)

xt 
19
Markov Chain Monte Carlo Method
 
wxt  xt - 1


x[t - 1]
xt 
True
xi = ○ or
V
x
False
●
V
x’
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
20
Sampling by
Markov Chain Monte Carlo Method
p

p


Small p
Large p
Disordered State
Ordered State
Sampling by
Markov Chain
Monte Carlo
Method
More is different.
Near Critical Point of p
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
21
Summary
Calculation of the ratio of the circumference of a
circle to its diameter by using random numbers
Law of Large Numbers and Central Limit
Theorem
Markov Chain Monte Carlo Method
Future Talks
9th Belief propagation
10th Probabilistic image processing by means of physical models
11th Bayesian network and belief propagation in statistical inference
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
22
Practice 8-1
When the probability distribution P(x) and
the transition probability w(x’|x) satisfy the
detailed balance
wx' x P x   wx x' P x' 
where
 wx x   1, prove that
P x    wx x' P x' 
x
x'
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
23
Practice 8-2
Let us consider that the transition matrix of
the present stochastic process is given as
1 2 1
 Pt (0) 
 Pt -1 (0) 

  W 
 , where W  

3  1 2
 Pt (1) 
 Pt -1 (1) 
Find the limit distribution defined by
 Pt (0) 
 P(0) 


  lim 
 P(1)  t   Pt (1) 
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
24
Practice 8-3
Let us consider an undirected square grid graph with L=Lx×Ly nodes. The
set of all the nodes is denoted by V={1,2,…,L} and the set of all the
neighbouring pairs of nodes is denoted by E. A random variable Fi is assigned
at each node i and takes every integer in the set {0,1,2,…,Q-1} . The joint
probability distribution of the provability vector F=(F1,F2,…,FL)T is given as
 1
2
PrF1  f1 , F2  f 2 ,, FL  f L  
exp  - a   fi - f j  
Z Prior
 2 {i , j}E

1
Make a program which generate N mutual independent random vectors
(f1,f2,…,fL)T randomly from the above joint probability distribution
Pr{F1=f1,F2=f2,…,FL=fL} . For various values of positive numbers a, give
numerical experiments.
Example of generated
random vector in the case
of Q=256, a=0.0005
Ly
L=Lx×Ly
Example of generated
random vector in the
case of Q=2, a=2
Lx
Physical Fluctuomatics / Applied
Stochastic Process (Tohoku University)
25