The Accelerated Weighted Ensemble

Greatly Improved Protein Folding Statistics
Using WorkQueue and Condor
Jeff Kinnison & Dr. Jesus A. Izaguirre
Studying a New Protein
HP24stab
● Subdomain of the Villin headpiece
● Two-helical supersecondary structure
● 24 amino acids (406 atoms)
● Discovered in 2015, little kinetic information available
Problems with Traditional MD
● Computationally Expensive
○ Molecular force fields perform expensive operations on all atoms
○ Timescales of interest quickly become intractable with protein size
○ GPU resources to increase efficiency are not always readily available
● Events of Interest are Rare
○ Protein folding occurs on the O(ns) to O(ms) timescale
○ There is no guarantee that a folding event will occur in a given simulation
With these two issues, it is difficult to generate enough data to make statistically significant kinetic approximations.
Accelerated Weighted Ensemble (AWE)
1. Simulate a number of models for a short time
2. Resample to maintain the number of models in each state
3. Repeat until fluxes converge
Additionally, assign each state to a macrostate (folded, transition, unfolded) and track
macrostate transitions to account for non-Markovian behavior.
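The resampling step can be sketched in a few lines. This is a minimal illustration, not AWE's actual implementation: the `Walker` class, the `resample` function, and the string placeholder for coordinates are all hypothetical, and plain multinomial resampling stands in for whatever split/merge scheme AWE uses. The property it demonstrates is that each state ends up with a fixed number of models while the total weight in that state is unchanged.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Walker:
    state: int          # index of the cell the model currently occupies
    weight: float       # statistical weight carried by the model
    conformation: str   # placeholder for atomic coordinates (assumption)


def resample(walkers, target_per_state, rng=None):
    """Return walkers with exactly target_per_state models in every occupied
    state, preserving the total weight of each state."""
    rng = rng or random.Random()
    by_state = defaultdict(list)
    for w in walkers:
        by_state[w.state].append(w)

    resampled = []
    for state, members in by_state.items():
        total = sum(w.weight for w in members)
        # Pick parents in proportion to their weight, then give every
        # copy an equal share of the state's total weight.
        parents = rng.choices(
            members, weights=[w.weight for w in members], k=target_per_state
        )
        share = total / target_per_state
        resampled.extend(Walker(state, share, p.conformation) for p in parents)
    return resampled
```

Only occupied states are resampled here; how empty states are handled is left out of this sketch.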
AWE Partition
Free Energy Surface of HP24stab
Partition Following Transition Pathway
The partition in AWE is based on existing kinetic data, approximating the correct weights.
Distributing Simulations with WorkQueue
• Each simulation is independent, so parallelize simulations to increase efficiency
• WorkQueue allows scaling to the number of simulations in a particular AWE run
• AWE includes task cloning to overcome bottlenecks caused by slow workers
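The idea behind task cloning can be illustrated with the Python standard library. The sketch below does not use WorkQueue and is not AWE's cloning logic, which the slides do not show; `run_with_clones` and `simulate` are hypothetical names, and threads stand in for remote workers. Several copies of the same task are submitted and the first one to finish supplies the result, so a single slow worker no longer holds up the iteration.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_with_clones(executor, fn, arg, n_clones=2):
    """Submit n_clones copies of the same task and return the first result."""
    futures = [executor.submit(fn, arg, clone_id) for clone_id in range(n_clones)]
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    for future in pending:
        future.cancel()  # only prevents clones that have not started yet
    return next(iter(done)).result()


def simulate(walker_id, clone_id):
    # Hypothetical stand-in for an MD segment; clone 0 lands on a slow worker.
    time.sleep(0.3 if clone_id == 0 else 0.01)
    return walker_id * 2


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(run_with_clones(pool, simulate, 21))
```

The trade-off is redundant work: every clone that loses the race consumed a worker slot, so cloning pays off mainly near the end of an iteration when idle workers are available.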
Preliminary Trajectory Data
We created the AWE partition by collecting trajectory data using traditional MD on GPU. Each trajectory took 4 days to complete. Of the 36 trajectories collected, 19 were valid and only 9 contained folding events.
Folding first passage times for the nine original trajectories that folded.
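A folding first passage time is the time at which a trajectory first reaches the folded state. As a rough sketch, assuming each trajectory has already been reduced to a sequence of per-frame macrostate labels (the label strings and the frame spacing `dt_ps` below are illustrative assumptions, not values from the slides), it could be computed as:

```python
def first_passage_time(macrostates, target="folded", dt_ps=1.0):
    """Return the time (in ps) of the first frame labelled `target`,
    or None if the trajectory never reaches it."""
    for frame, label in enumerate(macrostates):
        if label == target:
            return frame * dt_ps
    return None
```

Trajectories that never fold return `None`, which mirrors the valid trajectories above that contained no folding event.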
AWE Setup
Two Systems
• 1000-cell
• 100-cell
• 10 models per state
WorkQueue
• Maintained a factory requesting between 100 and 1000 workers
• All simulations run on 4-core workers
• Used Condor workers only to prevent AWE workers from taking over the cluster
MD Parameters
• T = 325 K
• Langevin dynamics with implicit solvent (λ = 0.91 ps⁻¹)
• Amber03 force field
• 250 ps simulation time
AWE Condor Usage
100-Cell Partition Simulations Per Day
1000-Cell Partition Simulations Per Day
By leveraging WorkQueue and Condor, we were able to run O(10k) simulations per day.
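For a sense of scale, the setup numbers can be combined into a back-of-the-envelope estimate. The sketch below takes "O(10k)" as exactly 10,000 simulations per day and assumes every cell is occupied, both of which are simplifications; the function names are illustrative.

```python
SEGMENT_PS = 250          # length of one simulation, from the MD parameters
SIMS_PER_DAY = 10_000     # assumption: "O(10k)" taken as exactly 10,000
MODELS_PER_STATE = 10     # from the AWE setup


def walkers_per_iteration(n_cells, models_per_state=MODELS_PER_STATE):
    """Simulations needed per AWE iteration if every cell is occupied."""
    return n_cells * models_per_state


def aggregate_us_per_day(sims_per_day=SIMS_PER_DAY, segment_ps=SEGMENT_PS):
    """Total simulated time per day in microseconds (1 µs = 1e6 ps)."""
    return sims_per_day * segment_ps / 1e6


print(walkers_per_iteration(100))    # 1000
print(walkers_per_iteration(1000))   # 10000
print(aggregate_us_per_day())        # 2.5
```

Under these assumptions the run produces on the order of 2.5 µs of aggregate simulation per day, spread over many short, independent segments.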
AWE Results
We started with 19 microseconds of traditional MD trajectory data, containing nine folding events, computed over one month.
Conclusion
Both the coarse and fine partitions converged in one-sixth the time needed to generate the original trajectories and generated several orders of magnitude more folding events.
By leveraging WorkQueue and Condor, AWE is able to quickly generate reliable approximations of protein kinetic properties.
Acknowledgements
We would like to thank Dr. Douglas Thain and the Cooperative Computing Lab students for making WorkQueue available and helping to integrate it with AWE.
All computations were run on compute nodes provided by the Notre Dame Center for Research Computing.