Path Integral Method and Speeding up MC with Nonlocal Strategies
Juan M. Restrepo
Department of Mathematics
Physics Department
University of Arizona
Collaborators
Gregory Eyink (Johns Hopkins)
Frank Alexander (LANL)
Parsing the Problem
There is no single data assimilation
problem; the challenges vary with:
Spatio-temporal scales
High or low frequency of data insertion
Small or large number of state variables
Three Estimation Problems:
Given a random time series {X(t): t < t0}, X(t) ∈ R^N
Prediction:
Estimate {x(t): t > t0}
Filtering (Nudiction):
Estimate {x(t0)}
Smoothing (Retrodiction):
Estimate {x(t): t ≤ t0}
Formulating the Equations:
Discrete Model:
L(x(0), …, x(t−dt), x(t), x(t+dt), …, x(t_f), …, B(t)q(t), B(t)q(t+dt), …, t) = 0
x(t) ∈ R^N
B q ∈ R^N
Example: take dt = 1
x(t+1) = L(x, Bq, t)
if linear:
x(t+1) = A(t) x + B(t) q(t)
A ∈ R^(N×N)
Example:
∂_t u(z,t) = ν ∂_zz u(z,t) + f(t)
u(z,0) = u_0(z)
u(0,t) = g(t), u(1,t) = h(t)
Discretizing:
x(t) ≡ [u_1(t), u_2(t), …, u_N(t)]^T
x(t+dt) = A x(t) + B q(t)
x(t) = A x(t−dt) + B q(t−dt)
…
Leads to:
L(x(0), …, x(t−dt), x(t), x(t+dt), …, x(t_f), …, Bq(t), Bq(t+dt), …, t) = 0
x(t) ∈ R^N, Bq ∈ R^N
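To make the discretization concrete, here is a minimal sketch (not from the talk) of how the heat-equation example above yields x(t+dt) = A x(t) + B q(t); the grid size N, diffusivity ν, and step sizes are illustrative choices.

    import numpy as np

    # Illustrative parameters (assumptions, not from the talk).
    N, nu = 50, 1.0
    dz = 1.0 / (N + 1)
    dt = 0.4 * dz**2 / nu      # explicit stability needs dt <= dz^2/(2 nu)

    # Centered second-difference Laplacian on the N interior points.
    Lap = (np.diag(-2.0 * np.ones(N)) +
           np.diag(np.ones(N - 1), 1) +
           np.diag(np.ones(N - 1), -1)) / dz**2

    # Explicit Euler gives x(t+dt) = A x(t) + B q(t), with q(t)
    # collecting the forcing f(t) and the boundary data g(t), h(t).
    A = np.eye(N) + dt * nu * Lap

    def step(x, f_t, g_t, h_t):
        """One explicit Euler step of the discretized heat equation."""
        q = f_t * np.ones(N)
        q[0]  += nu * g_t / dz**2    # left boundary u(0,t) = g(t)
        q[-1] += nu * h_t / dz**2    # right boundary u(1,t) = h(t)
        return A @ x + dt * q        # here B = dt * I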
Statement of the Problem
Stochastic Dynamics (Langevin Problem):
Measurements:
GOAL:
Find the mean of x conditioned on the measurements:
x_S(t) = E[x(t) | y_1, …, y_M]
and the covariance matrix (uncertainty):
C_S(t) = E[(x(t) − x_S(t))(x(t) − x_S(t))^T | y_1, …, y_M]
The conditional mean x_S(t) minimizes
tr C_S(t) = E[|x(t) − x_S(t)|² | y_1, …, y_M].
It is termed the smoother estimate.
A Nonlinear Example
Stochastic Dynamics (Langevin Problem):
dX(t) = f(X(t)) dt + κ dW(t)
with
V(x) = −2x² + x⁴
f(x) = −V′(x) = 4x(1 − x²)
κ = 0.5
Measurements:
at times m Δt, m = 1, …, M one observes
y_m := X(t_m) + r N_m
to have measured values Y_m, m = 1, 2, …, M
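As an illustration (assumptions: step dt, observation spacing Δt, M, r, the seed, and the initial condition are all arbitrary choices), a sketch of simulating this double-well SDE with Euler-Maruyama and drawing the synthetic observations:

    import numpy as np

    rng = np.random.default_rng(0)        # arbitrary seed
    kappa, r = 0.5, 0.2                   # r (obs. noise std) is illustrative
    dt, Dt, M = 1e-3, 0.5, 10             # step, obs. spacing, number of obs.

    f = lambda x: 4.0 * x * (1.0 - x**2)  # f = -V', V(x) = -2x^2 + x^4

    nsteps = int(M * Dt / dt)
    X = np.empty(nsteps + 1)
    X[0] = 1.0                            # illustrative initial condition
    for k in range(nsteps):               # dX = f dt + kappa dW
        X[k + 1] = X[k] + f(X[k]) * dt \
                   + kappa * np.sqrt(dt) * rng.standard_normal()

    obs_idx = [int(m * Dt / dt) for m in range(1, M + 1)]
    Y = X[obs_idx] + r * rng.standard_normal(M)   # Y_m = X(t_m) + r N_m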
Observations
[Figure: observed values Y_m at the measurement times t_m]
Extended Kalman Filter
Alternative Approaches
KSP: optimal, but impractical
ADJOINT/4D-VAR
(Restrepo, Leaf, Griewank, SIAM J. Sci. Comp. 1995)
Mean Field Variational Method
(Eyink, Restrepo, Alexander, Physica D, 2003)
enKF (ensemble Kalman Filter)
Particle Method
(Kim, Eyink, Restrepo, Alexander, Johnson, Mon. Wea. Rev. 2002)
Path Integral Method
(Alexander, Eyink, Restrepo, J. Stat. Phys. 2005)
KSP
Kushner, Stratonovich, Pardoux
Prediction/Filter:
Solve ∂_t P = −∂_x[f(x) P] + κ² ∂_xx P/2
P(x,0) = P_s(x)
between t_m < t < t_(m+1), m = 0, 1, …, M.
At measurement times t_m,
P(x, t_m⁺) = (1/N) exp(Y_m x/r² − x²/(2r²)) P(x, t_m⁻)
Smoothing: solve the final value problem
∂_t A = −f(x) ∂_x A − κ² ∂_xx A/2
A(x, t ≥ t_M) = 1
between t_(m+1) ≥ t ≥ t_m, m = M, …, 1.
At measurement times t_m,
A(x, t_m⁻) = (1/N) exp(Y_m x/r² − x²/(2r²)) A(x, t_m⁺)
Then, for each m = 1, 2, …, M and t_m < t < t_(m+1),
P(x, t | y_j = Y_j, j = 1, 2, …, M) = A(x, t) P(x, t)
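To fix ideas, a rough sketch (not the talk's code) of the forward half of KSP for the scalar double-well example: an explicit finite-difference step of the Fokker-Planck equation plus the Bayes update at measurement times. The grid, boundary handling, and step sizes are crude illustrative choices.

    import numpy as np

    x = np.linspace(-2.5, 2.5, 401)
    dx = x[1] - x[0]
    kappa, r, dt = 0.5, 0.2, 1e-4
    f = lambda x: 4.0 * x * (1.0 - x**2)

    def fokker_planck_step(P):
        """Explicit step of dP/dt = -d/dx[f P] + (kappa^2/2) d2P/dx2."""
        dflux = np.gradient(f(x) * P, dx)
        d2P = np.zeros_like(P)
        d2P[1:-1] = (P[2:] - 2 * P[1:-1] + P[:-2]) / dx**2
        return P + dt * (-dflux + 0.5 * kappa**2 * d2P)

    def measurement_update(P, Ym):
        """P(x,tm+) proportional to exp(Ym x/r^2 - x^2/(2 r^2)) P(x,tm-)."""
        P = P * np.exp(Ym * x / r**2 - x**2 / (2 * r**2))
        return P / (P.sum() * dx)    # the 1/N normalization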
Kolmogorov Equation
∂_t P = −∂_x[f(x) P] + κ² ∂_xx P/2
P(x, t) → P_s(x) as t → ∞
Observations
[Figure: observed values Y_m at the measurement times t_m]
KSP Filter Results
Why not KSP?
Impractical!
A very tiny example: a scalar PDE in 1-d space
and time, solved on a lattice of N points in space,
gives Dim(X(t)) = N. Each component of X(t) requires
the solution of 2 stiff PDEs.
enKF (ensemble)
Combine the forecast with the measurements:
use linear interpolation via the Kalman gain
(a minimal sketch of the analysis step follows the list below)
• Between observations:
run the full model P times and use MC to get the
conditional mean and variance.
Pros/Cons
Conditional statistics
Linear interpolation
Error in PDF ~ 1/P^(1/2)
Need to run the model fully resolved
Conveniently incorporates legacy code
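For concreteness, a minimal sketch of the enKF analysis step, in the perturbed-observation variant for a scalar state observed directly; the variant and all parameters are assumptions, not necessarily what the cited work uses.

    import numpy as np

    def enkf_analysis(ensemble, Ym, r, rng):
        """Linear (Kalman-gain) update of a scalar-state ensemble."""
        Pf = ensemble.var(ddof=1)            # forecast variance from ensemble
        K = Pf / (Pf + r**2)                 # Kalman gain for y = x + r N
        perturbed = Ym + r * rng.standard_normal(ensemble.size)
        return ensemble + K * (perturbed - ensemble)

    # Usage: propagate each member with the full model between observations,
    # then call enkf_analysis at each measurement time.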
Path Integral Method
Related to simulated annealing
It can be developed as a black box
Simple to implement
Can handle nonlinear/non-Gaussian
problems
Calculates mean and uncertainty
PROBLEM: Relies on MC!!!
Discretized using the explicit Euler-Maruyama scheme.
The probability of the dynamics generating a given history is
related to the probability that it experiences a certain
noise history η(t_k) = W(t_k + dt) − W(t_k)
at times t_k, k = 0, 1, 2, …
For Gaussian, uncorrelated noise this probability is
Prob[η] ~ exp(−½ ∑_k |η(t_k)|²).
In general: Prob[η] ~ exp(−H_dyn),
over times t = t_0, t_1, …, t_T,
where
H_dyn ≡ ¼ ∑_(k=0)^(T−1) [(x_(k+1) − x_k)/dt − f(x_k, t_k)]^T D^(−1)(x_k, t_k) [(x_(k+1) − x_k)/dt − f(x_k, t_k)]
To include the influence of the observations,
use Bayes' rule. This modifies the Hamiltonian:
H_obs = ∑_(m=1)^M [h(x(t_m)) − y(t_m)]^T R^(−1) [h(x(t_m)) − y(t_m)]
The total Hamiltonian:
H = H_dyn + H_obs
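A sketch of evaluating this total Hamiltonian for a discretized scalar path, following the formulas above; taking D = κ², h(x) = x, and R = r² are assumptions matched to the scalar example.

    import numpy as np

    def hamiltonian(path, dt, f, kappa, obs_idx, Y, r):
        """H = H_dyn + H_obs for a scalar path x_0, ..., x_T."""
        x = np.asarray(path)
        resid = (x[1:] - x[:-1]) / dt - f(x[:-1])
        H_dyn = 0.25 * np.sum(resid**2) / kappa**2   # 1/4 sum resid D^-1 resid
        H_obs = np.sum((x[obs_idx] - Y)**2) / r**2   # h(x) = x, R = r^2
        return H_dyn + H_obs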
The Hamiltonian is the negative log-likelihood.
There is a very close relationship between the
approach we take here and 4D-var,
used extensively in the ocean/climate community.
Maximum likelihood methods minimize the
Hamiltonian.
Instead, we are going to sample the Gibbs
distribution.
SAMPLING
Hybrid Monte Carlo (HMC)
Unigrid Monte Carlo (UMC)
Generalized Hybrid Monte Carlo (GHMC)
Hybrid Monte Carlo
molecular dynamics: used to
propose a new system configuration
Metropolis MC: accept/reject based
on the energy
The configuration is specified by
degrees of freedom q_0, q_1, …, q_T,
q_i ∈ R^N.
The HMC algorithm works as follows:
To each q_i, a conjugate generalized
momentum p_i is assigned.
The momenta p_i give rise to a kinetic
energy H_K = ∑_i p_i²/2.
The total Hamiltonian of the system:
H_tot = H + H_K.
The dynamics are:
dq_i/dt = p_i
dp_i/dt = F_i, where
F_i = −∂H/∂q_i
is the force on the i-th degree of freedom.
What’s going on?
Write the probability P(x) = exp(−E(x))/Z.
E(x) and grad E(x) are easily evaluated:
the gradient indicates which direction one should go
to find states with higher probability!
Note: H_tot(q,p) = H(q) + H_K(p), so
P_H(q,p) = exp(−H_tot(q,p))/Z = exp(−H(q)) exp(−H_K(p))/Z
is separable;
the marginal distribution is then exp(−H(q))/Z_q.
1) A chain of states (q_i′, p_i′),
i = 0, 1, 2, …, T,
is generated by evolving J steps.
2) Detailed balance is achieved if the configuration
obtained after evolving J steps is accepted
with probability min[1, exp(−ΔH)],
where ΔH = H_tot(q′, p′) − H_tot(q, p).
The Metropolis step corrects for time
discretization errors.
3) p′ is refreshed after every acceptance/rejection
according to a Gaussian distribution of independent
variables ~ exp(−H_K).
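Putting steps 1)-3) together, a compact sketch of one HMC sweep for a generic target exp(−H(q)); the leapfrog step size eps, the number of steps J, and the gradient function gradH are assumptions.

    import numpy as np

    def hmc_sweep(q, H, gradH, eps, J, rng):
        """One HMC sweep: refresh p, leapfrog J steps, Metropolis test."""
        p = rng.standard_normal(q.shape)       # 3) momenta ~ exp(-H_K)
        q_new, p_new = q.copy(), p.copy()
        p_new -= 0.5 * eps * gradH(q_new)      # leapfrog half kick
        for _ in range(J):
            q_new += eps * p_new               # dq/dt = p
            p_new -= eps * gradH(q_new)        # dp/dt = -dH/dq
        p_new += 0.5 * eps * gradH(q_new)      # undo half of the last kick

        dH = (H(q_new) + 0.5 * np.sum(p_new**2)) \
           - (H(q)     + 0.5 * np.sum(p**2))
        if rng.random() < min(1.0, np.exp(-dH)):   # 2) min[1, exp(-dH)]
            return q_new
        return q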
Unigrid Monte Carlo
Updates the system by taking coherent moves on a
number of length scales.
Decompose the system into blocks of contiguous
lattice points.
B = block sizes 1, 2, 4, …, 2^s.
B = 1 is the standard local Metropolis.
Update: the same random (Gaussian) shift is added
coherently to the local value at each site in B.
Metropolis accept/reject as before.
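A sketch of one unigrid sweep at block size B for a 1-d lattice with a generic energy H(q); the proposal width sigma is an assumption.

    import numpy as np

    def unigrid_sweep(q, H, B, sigma, rng):
        """Coherent Metropolis moves: one Gaussian shift per block of size B."""
        q = q.copy()
        for start in range(0, q.size, B):
            trial = q.copy()
            trial[start:start + B] += sigma * rng.standard_normal()  # same shift
            if rng.random() < min(1.0, np.exp(-(H(trial) - H(q)))):
                q = trial
        return q

    # B = 1 recovers the standard local Metropolis scheme.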
Generalized HMC
Dynamics replaced by:
dq_i/dt = A p_i, dp_i/dt = A^T F_i
A is an N×N matrix;
when A = I we recover HMC.
Discretize:
q′ = q + dt A p + ½ dt² A A^T F(q)
p′ = p + ½ dt A^T (F(q) + F(q′))
Challenge: find an A that leads to a significant reduction
of the correlation time.
We used the circulant matrix
A = circ(1, e^(−α), e^(−2α), …, e^(−Tα))
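A sketch of building this circulant A and taking one GHMC step with the discretization above; alpha and the step size eps are illustrative.

    import numpy as np
    from scipy.linalg import circulant

    def make_A(T, alpha):
        """A = circ(1, e^-alpha, e^-2 alpha, ..., e^-T alpha)."""
        return circulant(np.exp(-alpha * np.arange(T + 1)))

    def ghmc_step(q, p, F, A, eps):
        """q' = q + eps A p + (eps^2/2) A A^T F(q),
           p' = p + (eps/2) A^T (F(q) + F(q'))."""
        Fq = F(q)
        q_new = q + eps * (A @ p) + 0.5 * eps**2 * (A @ (A.T @ Fq))
        p_new = p + 0.5 * eps * (A.T @ (Fq + F(q_new)))
        return q_new, p_new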
A Nonlinear Example
Stochastic Dynamics (Langevin Problem):
dX(t) = f(X(t)) dt + κ dW(t)
with
V(x) = −2x² + x⁴
f(x) = −V′(x) = 4x(1 − x²)
κ = 0.5
Measurements:
at times m Δt, m = 1, …, M one observes
y_m := X(t_m) + r N_m
to have measured values Y_m, m = 1, 2, …, M
Observations
[Figure: observed values Y_m at the measurement times t_m]
RESULTS:
[Table: decorrelation time (with standard deviation) vs. T+1 (number of lattice points), J (number of MD steps per sweep), and α in A]
Conclusions
GHMC achieves a significant speedup
GHMC results can be applied to
enKF
The Path Integral Method can be used as a
black-box data assimilator
Questions
Euler-Maruyama error is O(dt):
other time integrators?
Optimal A matrix in GHMC?
Combining the Path Integral Method
with closure techniques when N and
T are large
Further Information:
http://www.physics.arizona.edu/~restrepo
Education:
Not enough people are trained to do it
The training sequence is not yet
mature
The problem is interdisciplinary
Data Assimilation (Arizona)
Prerequisites:
Graduate numerical analysis
sequence (linear algebra).
Probability theory.
Program
Semester 1:
Inverse Problem: the deterministic
problem
optimization
sensitivity analysis