Drug Affinity Ranking


Interoperable HT-BAC for Personalised Medicine
Peter V Coveney & team
Shantenu Jha & team
With special thanks to Dieter Kranzlmuller
HIV Drug Resistance
Drug Binding Affinity Ranking
Application in HIV Drug Resistance
• Enzyme of HIV responsible for protein maturation
• Target for 9 FDA approved Antiretroviral Inhibitors
• Example of Structure Assisted Drug Design
[Figure labels: Monomer A, 1 - 99; Monomer B, 101 - 199; Flaps; Glycine - 48, 148; Saquinavir]
So what’s the problem?
• Emergence of drug resistant mutations in protease
• Render drug ineffective
• Drug resistant mutants have emerged for all FDA inhibitors
• Mutations frequently interact
• We want to predict the binding affinity of inhibitors to any sequence
[Figure labels: P2 Subsite; Leucine - 90, 190; Catalytic Aspartic Acids - 25, 125; C-terminal; N-terminal]
Binding affinity calculator (BAC)
BAC can reliably predict the binding affinities of compounds to their
target proteins, and could potentially be used as a drug ranking tool in
clinical applications or as a virtual screening tool in pharmaceutical
lead discovery.
[Figure: black-box-like BAC producing a ranking of binding affinities]
S. K. Sadiq, D. Wright, S. J. Watson, S. J. Zasada, I. Stoica, and P. V. Coveney, "Automated
Molecular Simulation-Based Binding Affinity Calculator for Ligand-Bound HIV-1 Proteases", Journal of
Chemical Information and Modeling, 48, (9), 1909-1919, (2008), DOI: 10.1021/ci8000937.
Drug Affinity Ranking
Computational Application to Drug Affinity Ranking – Single MD simulation
[Figure labels: Single MD; Protein; Drugs]
Errors uncontrolled
Results unreproducible
Drug Affinity Ranking
Computational Application to Drug Affinity Ranking – Ensemble Simulations
Errors fully under control
Results reproducible
Ensemble Molecular Dynamics Protocol
• Run 50 ‘replica’ simulations
• Vary only initial velocities
• 4 ns of production trajectory per replica
• More efficient sampling than a single long simulation
• Allows us to examine reproducibility of results
• The workflow can be completed within nine hours of wallclock time,
provided the required number of cores is available.
• To compute more than one binding affinity concurrently, the core
requirement must be multiplied by the number of molecules of interest.
Sadiq, S.K, Wright, D.W., Kenway, O.A. and Coveney, P.V. “Accurate Ensemble Molecular Dynamics Binding Free Energy Ranking of
Multidrug-Resistant HIV-1 Proteases.” Journal of Chemical Information and Modeling 2010 50 (5), 890-905.
Wan, S., Knapp, B., Wright, D.W., Deane, C.M., Coveney, P.V., “Rapid and Accurate Peptide-MHC Binding Affinity Predictions from in
silico Molecular Dynamics” 2015, preprint submitted for publication.
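
The resource arithmetic behind the protocol above can be sketched as follows. The replica count and production length come from the slide; the number of cores per replica is not stated anywhere in this deck, so `cores_per_replica = 64` is a purely hypothetical placeholder, as is the class name `EnsemblePlan`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnsemblePlan:
    """Back-of-envelope sizing for one ensemble binding affinity calculation."""
    n_replicas: int = 50          # from the protocol: 50 replica simulations
    production_ns: float = 4.0    # from the protocol: 4 ns per replica
    cores_per_replica: int = 64   # ASSUMPTION: not given in the slides

    def total_sampling_ns(self) -> float:
        # Aggregate production sampling accumulated for one molecule.
        return self.n_replicas * self.production_ns

    def cores_needed(self, n_molecules: int = 1) -> int:
        # All replicas run concurrently; concurrent molecules multiply the need.
        return self.n_replicas * self.cores_per_replica * n_molecules


plan = EnsemblePlan()
print(plan.total_sampling_ns())   # 50 replicas x 4 ns
print(plan.cores_needed(1))       # one molecule
print(plan.cores_needed(9))       # e.g. nine inhibitors ranked concurrently
```

The point of the sketch is only the scaling: the core count grows linearly with the number of molecules ranked at once, which is why the nine-hour turnaround depends on the required cores being available.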
Single vs Ensemble MD Simulations
The binding free energy can vary widely (up to 12 kcal/mol) between two single
simulations.
Single simulation: not reproducible, unscientific!
Drug – EGFR: Wan & Coveney, J. R. Soc. Interface, 8, 1114-1127, (2011).
Drug – HIV-1 protease: Wright, Hall, Kenway, Jha & Coveney, JCTC, (2014), DOI: 10.1021/ct4007037.
Ensemble MD Simulations
• The MM/PBSA results follow well-defined Gaussian distributions.
• Configurational entropies, obtained from normal mode estimates, closely
resemble normal distributions.
Drug – HIV-1 protease: Wright, Hall, Kenway, Jha & Coveney, JCTC, (2014), DOI: 10.1021/ct4007037.
Length of Simulations in an Ensemble Run
G_MMPBSA is converged at 4 ns, with a bootstrap error of less than 0.3 kcal/mol.
-TS_NM is also converged at 4 ns, with a bootstrap error of less than 0.3 kcal/mol.
All of the production runs are therefore limited to 4 ns.
Figure: variation of the bootstrap statistics with replica simulation length and sampling
rate, for the averages of G_MMPBSA and -TS_NM over 50-replica ensemble simulations.
Wright, D.W., Hall, B.A., Kenway, O.A., Jha, S. and Coveney, P.V., "Computing Clinically Relevant Binding Free
Energies of HIV-1 Protease Inhibitors." J. Chem. Theory Comput., 2014, DOI: 10.1021/ct4007037
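
As an illustration of what a bootstrap error on an ensemble average means, here is a minimal sketch. It is not the authors' analysis code: the per-replica energies are synthetic Gaussian numbers, and the function name, mean, spread, and resample count are all assumptions chosen for the example.

```python
import random
import statistics


def bootstrap_error(values, n_resamples=2000, seed=0):
    """Standard deviation of the mean over bootstrap resamples of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the replicas with replacement and record each resample's mean.
    means = [statistics.fmean(rng.choices(values, k=n))
             for _ in range(n_resamples)]
    return statistics.stdev(means)


# SYNTHETIC stand-in for 50 per-replica free energy estimates (kcal/mol).
data_rng = random.Random(42)
replica_energies = [data_rng.gauss(-40.0, 2.0) for _ in range(50)]

err = bootstrap_error(replica_energies)
print(f"ensemble mean = {statistics.fmean(replica_energies):.2f} kcal/mol")
print(f"bootstrap error = {err:.2f} kcal/mol")
```

Repeating the same calculation on trajectories truncated at different lengths would give the kind of convergence curve the figure describes.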
Number of replicas in an Ensemble Simulation
• Larger ensembles give more reproducible rankings, with a lower bootstrap error.
• The bootstrap error decreases more slowly, with only minor changes, once
approximately 25 replicas are included in the ensemble.
25 or more replicas are needed in an ensemble study.
Figure: variation of the bootstrap statistics of the Spearman rank coefficient with the
number of replicas in an ensemble simulation.
Wright, D.W., Hall, B.A., Kenway, O.A., Jha, S. and Coveney, P.V., "Computing Clinically Relevant Binding Free
Energies of HIV-1 Protease Inhibitors." J. Chem. Theory Comput., 2014, DOI: 10.1021/ct4007037
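
The dependence of ranking reproducibility on ensemble size can be explored with a sketch like the one below. Everything in it is illustrative: the reference affinities and per-replica values are synthetic, the helper names are hypothetical, and the Spearman formula used assumes no tied values.

```python
import random
import statistics


def ranks(values):
    """Rank positions (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman(x, y):
    """Spearman rank coefficient via the no-ties formula."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def spearman_bootstrap_error(replicas_by_drug, reference, n_used,
                             n_resamples=500, seed=0):
    """Bootstrap spread of the rank coefficient using the first n_used replicas."""
    rng = random.Random(seed)
    coeffs = []
    for _ in range(n_resamples):
        means = [statistics.fmean(rng.choices(reps[:n_used], k=n_used))
                 for reps in replicas_by_drug]
        coeffs.append(spearman(means, reference))
    return statistics.stdev(coeffs)


# SYNTHETIC data: six hypothetical drugs, 50 noisy replicas each.
reference = [-14.0, -13.0, -12.0, -11.0, -10.0, -9.0]
data_rng = random.Random(1)
replicas = [[data_rng.gauss(ref, 3.0) for _ in range(50)] for ref in reference]

errors = {n: spearman_bootstrap_error(replicas, reference, n)
          for n in (5, 25, 50)}
for n, e in errors.items():
    print(f"{n:2d} replicas: bootstrap error of rank coefficient = {e:.3f}")
```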
Calculating Clinically Relevant Binding Affinities
FDA-approved drugs to wild-type HIV-1 protease
This work used several of the most powerful
supercomputers in the USA, UK, and EU.
Wright, D.W., Hall, B.A., Kenway, O.A., Jha, S. and Coveney, P.V., "Computing Clinically Relevant Binding Free
Energies of HIV-1 Protease Inhibitors." J. Chem. Theory Comput., 2014, DOI: 10.1021/ct4007037
A Pore Man’s View of the TeraGrid/XSEDE
• 2005-09: Tried running many simulations on many supercomputers. Did
not work (well)!
• Why? What has changed?
RADICAL Cybertools http://radical-cybertools.github.com
Abstractions-based, Standards-driven approach to HPDC
• Manage heterogeneity
– Middleware variants (syntax)
– Infrastructure utilization (semantics)
– (some) Architectural features
• Flexible execution and resource management techniques
– “Static resource” execution versus “Dynamic resource” execution
– Using Pilot Concept as “higher-level” resource management
• Serve as building blocks upon which other components can be built
– Use other RADICAL-Cybertools components
– Application/Domain specific Toolkits:
– BAC + RADICAL-Cybertools → HT-BAC
– RADICAL-Cybertools used on ARCHER and SuperMUC + XSEDE
• Interoperability for free, more flexible resource utilization modes
[Figure labels: Application Toolkit Layer; Resource Management Layer; Resource Access Layer]
RADICAL SAGA
• RADICAL-SAGA:
– Native Python implementation of Open Grid Forum GFD.90.
– Allows access to different middleware / services through a unified interface
– Provides access via different backend plug-ins (“adaptors”).
– SAGA-Python provides not only a common API but also unified semantics
across heterogeneous middleware:
• Transparent remote operations (SSH / GSISSH tunneling).
• Asynchronous operations.
• Callbacks.
• Error handling.
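
The idea of one interface in front of several backend adaptors can be sketched in a few lines. This is not the RADICAL-SAGA API: the class names, the `command_line` method, the host `login.example.org`, and the executable `md_engine` are all hypothetical, and the sketch only builds the backend-specific command instead of running it.

```python
import shlex
from urllib.parse import urlparse


class ForkAdaptor:
    """Hypothetical backend: run the command directly on the local machine."""
    scheme = "fork"

    def command_line(self, host, argv):
        return list(argv)


class SshAdaptor:
    """Hypothetical backend: wrap the command for execution over SSH."""
    scheme = "ssh"

    def command_line(self, host, argv):
        return ["ssh", host, shlex.join(argv)]


ADAPTORS = {cls.scheme: cls for cls in (ForkAdaptor, SshAdaptor)}


class JobService:
    """Single entry point; the URL scheme selects the backend adaptor."""

    def __init__(self, url):
        parsed = urlparse(url)
        if parsed.scheme not in ADAPTORS:
            raise ValueError(f"no adaptor for scheme {parsed.scheme!r}")
        self._adaptor = ADAPTORS[parsed.scheme]()
        self._host = parsed.hostname or "localhost"

    def command_line(self, argv):
        return self._adaptor.command_line(self._host, argv)


job = ["md_engine", "replica_01.conf"]   # hypothetical executable and input
print(JobService("fork://localhost").command_line(job))
print(JobService("ssh://login.example.org").command_line(job))
```

The calling code is identical for both targets; only the URL changes, which is the property the slide describes as a unified interface over heterogeneous middleware.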
RADICAL-Pilot http://radical-cybertools.github.io/radical-pilot
• Lightweight, portable, fast, scalable pilot framework
• Implements P*, with well-defined state models (for pilots and units)
• Scalability (up and out)
– Lightweight data model
– Bulk operations
– Notifications / support for async programming
• Portability
– Modular Pilot agent adaptable for different architectures
– Pure Python, SAGA-Python as plumbing layer
• Supports Research whilst supporting production scalable science!
– Pluggable schedulers; High degree of introspection, provenance;
consistent and verifiable performance
Heterogeneous Resource: Localized to Agent
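
The pilot idea, one placeholder job that acquires cores once and then schedules many tasks onto them, can be illustrated with a toy model. This is a sketch, not RADICAL-Pilot code: the function name and all numbers (task length, core counts, queue wait) are hypothetical.

```python
import heapq


def pilot_makespan(task_hours, pilot_cores, cores_per_task, queue_wait_hours):
    """Finish time when one pilot waits in the queue once, then runs all tasks.

    Tasks are placed, in order, on whichever slot of the pilot frees up first.
    """
    slots = pilot_cores // cores_per_task
    if slots < 1:
        raise ValueError("pilot is too small to hold a single task")
    # Min-heap of the times at which each slot next becomes free.
    free_at = [queue_wait_hours] * slots
    heapq.heapify(free_at)
    finish = queue_wait_hours
    for hours in task_hours:
        start = heapq.heappop(free_at)
        end = start + hours
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# HYPOTHETICAL workload: 50 replicas of 2 hours each, 64 cores per replica.
replicas = [2.0] * 50
print(pilot_makespan(replicas, pilot_cores=3200, cores_per_task=64,
                     queue_wait_hours=1.0))   # all 50 replicas at once
print(pilot_makespan(replicas, pilot_cores=1600, cores_per_task=64,
                     queue_wait_hours=1.0))   # two waves of 25
```

In this model the queue wait is paid once per pilot rather than once per task, and a smaller pilot simply runs the same ensemble in more waves.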
So why the focus on Pilot-Jobs?
• Conceptual basis for dynamic execution models and resource management
– Predicting Tq is difficult: “Can’t beat ‘em, join ‘em”
– Difficult only because of static utilization
– Case for flexible distributed resource utilization
• Unifying abstraction across HPC, DCI and others
– Task-level parallelism has very strong application drivers!
– For the foreseeable future we will use task-level parallelism for extreme
scale computing, whether on 1 machine or many smaller machines
• “I think that the community's focus on only scaling SPMD computations is misguided..” Ian Foster
• Acts as a building block: provides the resource management layer for
application-level tools, libraries and services
Project Status
• Thanks to Nancy/XSEDE
– Allocation on XSEDE resources
• Thanks to Dieter/PRACE
– 1M on SuperMUC
• Through a combination of many other allocation sources
– ARCHER, HECToR, SuperMUC, XSEDE and others (Hartree Centre)
• Collaboration ongoing: both infrastructure and science dimension
– Ability to scale & support greater number of ensembles