Transcript Slide 1
New Algorithms to make Quantum
Monte Carlo more Efficient
Daniel R. Fisher
William A. Goddard III
Materials and Process Simulation Center (MSC)
California Institute of Technology
2003, D.R. Fisher
Dirac’s Thoughts on Chemistry
"The underlying physical laws necessary for the
mathematical theory of a large part of physics and
the whole of chemistry are thus completely
known, and the difficulty is only that the
application of these laws leads to equations much
too complicated to be soluble."
- P. Dirac, Proc. Roy. Soc (London) 123:714 (1929)
2003, D.R. Fisher
Computational Quantum Mechanics Algorithms
Approximation
Reliability
Scaling
With Size
Time(2N)/(N)
No. of Non-H
Atoms
Ab initio MO methods
CCSD(T)
MP2 Perturbation Theory
DFT(B3LYP)
Hartree-Fock
Quantitative (~2 kcal/mol)
Semiquantitative
Semiquantitative
Qualitative
O(N6)
O(N4)
O(N3)
O(N3)
64
16
8
8
5-6
10-50
50-100
100-200
Semiempirical MO methods
AM1, PM3, MNDO
Semiqualitative
O(N2-3)
4-8
1000
Classical force-field methods
Amber, Charm, MM3, …
Semiqualitative
O(N1-2)
2-4
103-4
(Semi?)Quantitative
Quantitative
Exact
O(N1-3)
O(N1-3)
O(N1-3)
2-8
2-8
2-8
15-30*
10-20*
0-1*
QMC methods
Variational
Fixed Node Diffusion
Released Node Diffusion
* Based on O(N3)
Adapted from: Morokuma, et al. IBM J. Res. & Dev. Vol. 45 No. 3/4 May July 2001. p 367-395
2003, D.R. Fisher
Hartree-Fock QM Calculations
Replace explicit electron-electron interactions with
average electron-electron interactions to get
N,1-particle equations.
Average Position
of the Other Electrons
This works BUT it does not deal correctly
with electron-electron correlation
2003, D.R. Fisher
When Is Electron Correlation Important?
• Transition Metal Oxides
– High Temprature Copper Oxide Superconductors
• Heavy Fermion Metals
– Usually Rare Earths or Actinides
• Organic Charge Transfer Compounds
• One and Two Dimensional Electron Gas Systems
– Nano-scale Wires
• Positron Chemistry
– Defect Detection and Analysis
• Muon/Exotic Particle Physics
– Catalyzed Fusion
See 21 April 2000 Science
for More Examples
2003, D.R. Fisher
QMC Basic Algorithm
Traditional
QC package
Get YT
(HF,DFT,MCSCF)
Get Starting
Jastrow
Construct Initial
Walker(s)
Molecule
VMC Equilibrate
Task Conceived!!!
-Accurate result
-Small expense
VMC Sampling
Optimized?
Enough
DMC Equilibrate
Very Accurate
Result
DMC Sampling
Change Jastrow
to Optimize it
2003, D.R. Fisher
Metropolis Algorithm and QMC
• Reformulate Energy Expectation Integral.
E Y * Hˆ Ydx3 N
1
E
M
ˆ
H
Y 3N
2
dx Elocal dx3 N
Y
Y
iM
E x
xi
local
i
i 1
• Monte Carlo Integration for this 3N-dimensional integral.
– Metropolis algorithm produces electron configurations with respect to Y2.
– System energy expectation value is average of the local energies.
2003, D.R. Fisher
VMC Wavefunction for a Molecule
Y
YHF
HF-Type
Wavefunction
Function of how far
apart the electrons are
from each other
Function of how
far electrons are
from nuclei
n n a1*rij a2 *rij2
n N A1 *riA A2 *riA2
exp 1b *r b *r 2 exp 1 *r *r 2
1 ij
2 ij
i 1 A1 A1 iA A2 iA
i 1 j i
What a typical YHe might look like:
r1N
ar12
r2 N
YHe 1s (1)1s (2) exp
exp
1 br12
1 r1N 1 r2 N
2003, D.R. Fisher
Diffusion QMC
1 2
Y V Y
2
Change variables ...
t i
Y 0 e
Solution at
Key Point:
( E1 E0 )
Ground State
1 e
( E2 E0 )
2
Excited States
No Matter What Wavefunction We Start With, It
Decays to the True Ground State as We Take Steps in “time,” !
2003, D.R. Fisher
New Algorithms and Areas Affected
Traditional
QC package
Get YT
(HF,DFT,MCSCF)
Get Starting
Jastrow
Construct Initial
Walker(s)
Molecule
Generic Jastrow
VMC Equilibrate
Task Conceived
ManagerWorker
Parallelization
VMC Sampling
Change Jastrow
to Optimize it
Optimized?
Enough
DMC Equilibrate
Very Accurate
Result
DMC Sampling
Decorrelation Algorithm
2003, D.R. Fisher
Problem With Metropolis Sampling
From intro statistics, the uncorrelated variance in energy is:
2 (E) E 2 E
Probability of a move being
accepted is related to the transition
probability. T(AB)
Rejected
Move
Accepted
Move
2
This does not work in this case because
the energies calculated at sequential
points are serially correlated.
2 (E) 2 (E)
2
cov(Ei , E j )
N i j
Energies at these points are correlated!
2003, D.R. Fisher
Flyvbjerg-Petersen Decorrelation Algorithm
Average blocks of the original data into new data elements.
Original Data
x1
x2
x3
x4
x5
x6
x7
x8
x9
xn-3
xn-2
xn-1
xn
•••••••••••••
Blocked Data
x1
x2
x3
x4
x5
x6
x7
x8
x9
If the data blocks are sufficiently large, the blocked data points
are uncorrelated and the standard O(N) variance equation can be used.
2 (E) E 2 E
2
2003, D.R. Fisher
VMC Particle-in-a-Box Standard Deviation Calculation
Uncorrelated Data
Constant
True
Plateau
Correlated Data
Underestimate
True
2003, D.R. Fisher
New Statistical Analysis Algorithm
Flyvbjerg-Petersen Algorithm
• One processor must do all of the work
N = 107-1012
log2N = 23-40
• Must communicate O(N) data when used on a parallel computer
• Must store O(N) data
• Must be used at the end of the calculation
(can’t check convergence on-the-fly).
Dynamic Distributable Decorrelation Algorithm
Feldmann M.T., D.R. Kent IV, R.P. Muller, W.A. Goddard III. “Efficient Algorithm
for “On-the-fly” Error Analysis of Local or Distributed Serially-Correlated Data”,
J. Chem. Phys. (submitted)
• Perfectly parallel even on inhomogeneous computers
• Must communicate O(log2N) data when used on a parallel computer
• Must store O(log2N) data
• Can provide on-the-fly results (convergence based termination)
2003, D.R. Fisher
New Manager Worker Parallelization Algorithm
Current Algorithm - Pure Iterative
• All processors do an equal amount of work
• If one processor finishes first it must wait on all of the others
• Intended only for homogeneous machines
(all processors the same)
• Convergence-based termination not possible
New Algorithm - Manager Worker
Feldmann M.T., D.R. Kent IV, R.P. Muller, W.A. Goddard III. “Manager-Worker
Based Model for Massively Parallel and Perfectly Load-Balanced Quantum
Monte Carlo”, J. Comp. Chem. (submitted)
• All processors do as much work as they can
• All processors complete at the same time
• Intended for either homogeneous or inhomogeneous machines
(processors can be different)
• Convergence-based termination possible with DDDA
2003, D.R. Fisher
Parallel Algorithm Performance:
Heterogeneous Computer
8 total processors. Mixture of Pentium II 200 MHz and
Pentium III 866 MHz
2003, D.R. Fisher
Parallel Algorithm Performance:
Heterogeneous Computer
Performance depends on what is being measured…
Not quite linear
LLNL Blue Pacific
LANL Nirvana
Linear to 2048 Processors
2003, D.R. Fisher
Are Transferable Parameter Sets Possible?
Y
YHF
HF-Type
Wavefunction
Function of how far
apart the electrons are
Function of how
far electrons are
from nuclei
n n a1*rij a2 *rij2
n N A1 *riA A2 *riA2
exp 1b *r b *r 2 exp 1 *r *r 2
1 ij
2 ij
i 1 A1 A1 iA A2 iA
i 1 j i
Can parameter sets be found that are transferable
between different systems?
Similar idea to contracted Gaussian basis sets (e.g. 6-31G**).
2003, D.R. Fisher
Correlation Energy Recovered by
Generic Jastrow Function
Correlation Energy Recovered (h) /
Total Nuclear Charge For The Molecule
0.01
0.008
0.006
benzene
0.004
trans-butadiene
0.002
cis-butadiene
ethylene
0
-0.002
1
3
5
7
9
ethane
allene
acetylene
-0.004
methane
-0.006
-0.008
-0.01
b parameter
All these hydrocarbons have nearly the same
optimal “b” parameter!
2003, D.R. Fisher
Generic Jastrow Performance
0.1
1
10
100
1000
10000
DMC CH4
optimized
0.01
b=1
b=100
0.001
0.0001
0.00001
QMC Steps
0.1
1
10
100
1000
optimized
b=1
0.01
b=3 (Generic)
DMC C2H2
Variance
Variance
b=3 (Generic)
b=100
0.001
0.0001
0.00001
QMC Steps
10000
2003, D.R. Fisher
Why Isn’t QMC Really Linearly Scaling?
•
Each processor needs to perform statistically independent
calculations
•
Each processor must begin calculating with independent,
statistically significant points in configuration space
•
A separate Metropolis calculation must be equilibrated for each
independent point in configuration space
LANL Nirvana
Efficiency
Number of Processors
2003, D.R. Fisher
Current Work- Initialization Solution
• Make algorithm that reduces the initialization time.
– Better guess for placement of initial walkers in the Monte Carlo random walk.
– Reduce the number of steps to equilibrate the walker.
– Make algorithm for initialization which itself is parallelizable.
• There are two criteria by which we can judge the quality of a
configuration.
– The sum of one electron probability densities from the SCF calculation:
1
SCF x
N
x
i
i
– The distance of the electrons from each other.
2
2003, D.R. Fisher
Constructing Initial Configurations
• Orbitals are linear combinations of primitive gaussians:
i d jx y z e
aj
bj
cj
j r 2
j
The square of the orbital is its probability density.
We can change to spherical coordinates and separate the
dependence in r, θ, and φ:
2 a a b b c c r 2
j
k
j
k
j
k
j
k
i d d j dk r
2
j ,k
e
0
2
dr sin
0
a a
b b
sin
cos
d 1
0
j
k
j
k
1 a j ak b j bk
cos c c d
j
k
2003, D.R. Fisher
Probability Distribution Functions
• By integrating over one variable at a time, we can get the marginal
probability distribution function in each direction.
These are the probability distribution functions for the 2px orbital of Ne:
Theta Dist
P(theta)
1.0
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0
Phi Dist
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
0
0
0.5
1
1.5
2
2.5
3
3.5
r (au)
P(phi)
P(r)
Radial Dist
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
1.5707965
theta
0
1.57
3.14
4.71
6.28
phi
Now we can generate uniform random numbers between 0 and 1 to
distribute electrons with respect to the probability density of this
orbital.
3.141593
2003, D.R. Fisher
Deciding which configurations to keep
• We can invert each orbital of the molecule and distribute
electrons in them according to their occupancy. This will
ensure that our initial configurations are in regions of high
one electron probability density.
• This algorithm, however, will not prevent electrons from
being placed near each other.
• We plan to develop a heuristic score function that will
take a 3N-dimensional walker as input and return a real
number, such that the larger the result, the fewer steps
the walker will need to take to reach an equilibrium region
of configuration space.
2003, D.R. Fisher
The Score function
•
The score function will depend on the one electron density and the
distance between pairs of electrons:
S x Gx , x
, where
N
x
j i
1
x
N
1
xi x j
x
SCF
i
i
Our goal is to determine a functional form of the score
function G that is transferable between different molecular
systems.
2003, D.R. Fisher
Using the score function
• Once a reliable score function is developed, it will be possible to
construct high-quality walkers:
- For an N-electron molecule, we could generate M>N electron positions and then find
which combination gives the best walker.
- Alternatively, a large set of N-electron configurations could be generated, and then the
most favorable could be chosen.
- The number of equilibration steps would be determined by the result of the score
function
????? Could it be possible to use the score function to dynamically determine when a
walker is equilibrated ?????
If this is possible, we could terminate the initialization of
each walker individually. This would mean we could
start gathering data from a walker as soon as it is
equilibrated, rather than waiting for the entire ensemble.
2003, D.R. Fisher
Conclusion
• For parallel calculations, the efficiency drops off drastically
as the initialization time and number of processors
increase.
• This new class of Metropolis initialization algorithms will
greatly decrease the time required to initialize a QMC
calculation.
• The efficient use of parallel processors will allow highly
accurate quantum mechanical calculations of larger and
more interesting chemical systems.
2003, D.R. Fisher
Acknowledgements
General:
• Mike Feldmann
• Chip Kent
• Rick Muller
• William A. Goddard III
• Goddard Group
• CACR staff
Funding:
• $$$ ASCI $$$ (Caltech-ASCI-MP)