A short course on VEGA ZZ

Università degli Studi di Milano
Dipartimento di Scienze Farmaceutiche “Pietro Pratesi”
GriDock: An MPI-based software
for virtual screening in drug discovery
Alessandro Pedretti
What is virtual screening?
• Virtual screening (VS) is a computational approach that can be used
in drug discovery to find new hit compounds.
[Diagram: Virtual screening: Database of molecules → Database filter → Hit compounds. High-throughput screening: Set of molecules → Experimental assay → Hit compounds.]
• It can be compared to high-throughput screening (HTS), which is a truly
experimental approach.
The database of molecules
• The database must contain molecules that are available in the real world
or that are easy to synthesize.
• Pharmaceutical companies have databases built over the years
from research in different fields.
• Some databases are publicly available and provided by chemical
compound resellers (AKos, Asinex, TimTec, etc.) or by non-profit
institutions (Kyoto University, NCI, University of Padua, etc.).
• The database must contain a large number of molecules to allow an
exhaustive exploration of chemical space.
The database filter
• The database filter performs the virtual test that checks whether a
molecule could be bioactive.
• The kind of filter classifies virtual screening approaches as:
Ligand-based
The 3D structure of the biological target is unknown, and a set of
geometric rules and/or physicochemical properties (pharmacophore
model) obtained from QSAR studies is used to screen the database.
Structure-based
Each molecule to test is docked to the biological target (usually a
protein), and a scoring function is applied to evaluate the affinity.
The 3D structure of the target must be known.
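As a rough illustration of a structure-based filter, the sketch below scores every molecule and keeps those at or below a cutoff, best first. The function name `screen`, the cutoff, and the toy scores are assumptions made for this example; it also assumes that lower scores mean better predicted affinity, as with AutoDock binding energies.

```python
def screen(molecules, score, cutoff):
    """Return (molecule, score) pairs that pass the filter, best score first.

    `score` is any callable returning a docking score for a molecule;
    lower (more negative) values are assumed to mean better affinity.
    """
    hits = []
    for molecule in molecules:
        value = score(molecule)
        if value <= cutoff:
            hits.append((molecule, value))
    return sorted(hits, key=lambda pair: pair[1])


# Toy data: hypothetical molecule names and scores (kcal/mol).
toy_scores = {"mol_a": -9.1, "mol_b": -4.2, "mol_c": -7.5}
hits = screen(toy_scores, toy_scores.get, cutoff=-7.0)
```

In a real screening, `score` would wrap the docking run; here it is a dictionary lookup so that the example is self-contained.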
Molecular docking
[Diagram: Ligand + Receptor → Docking software → Ligand – receptor complex]
• The quality of the complex is evaluated by the score.
GriDock – Main features
• GriDock is a program developed to perform structure-based virtual
screening.
• It is a front-end to the well-known AutoDock software, developed by D.S.
Goodsell and A.J. Olson.
• It uses the VEGA command-line software to perform file format conversion,
database extraction and molecular property calculations.
[Diagram: AutoDock 4 + VEGA → GriDock (virtual screening)]
• Highly portable C++ code (Linux 32- and 64-bit, Windows 32- and 64-bit).
• It takes full advantage of multi-CPU/multi-core systems and GRID-based
architectures through its parallel design.
How GriDock works
[Diagram: Database of molecules → VEGA → AutoDock 4 (with receptor coordinates + maps) → Ligand – receptor complexes → Score analysis → Output files]
• Calculation of the molecular
properties.
• Input file generation (PDBQT).
• Molecular docking.
• Score calculation.
How VEGA works with GriDock
[Diagram: Database of molecules → Hydrogens add → Property calculation → Potential attribution (AMBER force field) → Calculation of charges (Gasteiger-Marsili method) → Search of flexible torsions → Conversion to PDBQT → to AutoDock 4]
GriDock multi-threaded version
[Diagram: the GriDock main thread runs a thread loop that starts Thread 1, Thread 2, ... Thread n. Each thread runs VEGA and AutoDock 4, using the database and the receptor, with mutex-controlled access.]
Symmetric multiprocessing (SMP) is provided by the pthread library or the Windows APIs.
Output files*
• Log file (gridock_DATE.log).
• CSV file containing the list of complexes ranked by docking score.
• Zip file containing the output complexes generated by AutoDock 4.
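The multi-threaded design on this slide (worker threads that share one database under mutex-controlled access, and a CSV file ranked by docking score) can be sketched as follows. This is a minimal illustration, not GriDock's code: `run_screening`, `write_ranking_csv`, the CSV column names and the `dock` callable (a stand-in for the VEGA preparation plus the AutoDock 4 run) are all assumptions.

```python
import csv
import io
import threading

_DONE = object()  # sentinel: the shared database iterator is exhausted


def run_screening(ligands, dock, n_threads=4):
    """Dock every ligand with worker threads; return rows ranked by score."""
    database = iter(ligands)
    mutex = threading.Lock()  # protects the shared iterator and result list
    results = []

    def worker():
        while True:
            with mutex:  # only one thread at a time takes the next ligand
                ligand = next(database, _DONE)
            if ligand is _DONE:
                return
            score = dock(ligand)  # hypothetical: preparation + docking
            with mutex:
                results.append((ligand, score))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Lower score is assumed to be better, so the best complex comes first.
    return sorted(results, key=lambda row: row[1])


def write_ranking_csv(rows, stream):
    """Write the ranked (ligand, score) rows as CSV to an open text stream."""
    writer = csv.writer(stream)
    writer.writerow(["ligand", "score"])
    writer.writerows(rows)


# Toy run: scores come from a dictionary instead of a real docking.
toy = {"lig1": -5.0, "lig2": -8.5, "lig3": -6.2}
ranked = run_screening(list(toy), toy.get, n_threads=2)
buffer = io.StringIO()
write_ranking_csv(ranked, buffer)
```

Python threads only give a real speed-up here if `dock` spends its time in an external process or in I/O; the point of the sketch is the locking pattern, not performance.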
GriDock MPI version
[Diagram: the GriDock MPI master node runs a node loop and communicates through MPI with Node 1, Node 2, ... Node n. Each node has the database and the receptor and runs VEGA and AutoDock 4. The results go back to the GriDock MPI master node, which writes the output files.]
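The slides do not say how the master node assigns molecules to the nodes. One simple possibility, shown here only as an illustration, is a static block partition in which each node receives a contiguous range of database indices. The function name `partition` and the scheme itself are assumptions, and no MPI calls are made.

```python
def partition(n_molecules, n_nodes):
    """Split indices 0..n_molecules-1 into n_nodes contiguous (start, stop) ranges.

    The first `n_molecules % n_nodes` nodes get one extra molecule, so the
    sizes of any two ranges differ by at most one.
    """
    base, extra = divmod(n_molecules, n_nodes)
    ranges = []
    start = 0
    for node in range(n_nodes):
        size = base + (1 if node < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


# Example: 10 molecules shared among 3 nodes.
chunks = partition(10, 3)
```

A static split is easy to reason about but can leave nodes idle when docking times vary a lot between ligands; a master that hands out one molecule at a time would balance better at the cost of more messages.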
GriDock input requirements
To perform a virtual screening with GriDock, you need:
• The 3D structure of the biological target:
- Protein Data Bank (http://www.rcsb.org).
- Homology modeling.
• The 3D maps of the active site generated by AutoGrid 4:
- AutoDockTools / MGLTools (http://mgltools.scripps.edu).
- VEGA ZZ (http://www.vegazz.net).
• One or more databases of 3D structures in SDF or Zip format:
- Ligand.Info: Small-Molecule Meta-Database (http://ligand.info).
- MMsINC (http://mms.dsfarm.unipd.it/MMsINC.html).
- ZINC (http://zinc.docking.org).
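An SDF database is a plain-text file in which every molecule record ends with a line containing `$$$$`, and the first line of a record holds the molecule name. The sketch below uses that convention to walk through a database; the function name `iter_sdf_records` and the toy records (which omit the atom and bond blocks) are assumptions for illustration.

```python
import io


def iter_sdf_records(lines):
    """Yield each SDF record as a list of lines, without the '$$$$' delimiter."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "$$$$":
            yield record
            record = []
        else:
            record.append(line)


# Toy SDF text: two heavily truncated records, enough to show the delimiter.
sample = "mol_1\n  toy\n\nM  END\n$$$$\nmol_2\n  toy\n\nM  END\n$$$$\n"
names = [record[0] for record in iter_sdf_records(io.StringIO(sample))]
```

Because the function only needs an iterable of lines, an open file object can be passed in place of the `io.StringIO` used here.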
The Citrus tristeza virus case
• The Citrus tristeza virus (CTV) is a positive-sense single-stranded RNA
virus that causes a serious disease of citrus plants.
• No treatment is known to save infected plants.
• A possible therapeutic target is the RNA-dependent RNA
polymerase (RdRp), which is involved in virus replication.
[Diagram: virus replication in the infected cell. Labels: ssRNA (+) – 5’ prot., translation, early protein, protease, RdRp, other proteins, replicative complex, (-)RNA, mRNA, structural proteins, virions.]
The RdRp model
No crystal structure is available, so a homology modeling procedure was
performed:
[Diagram: SwissProt Q2XP15 (primary structure) → Fugue (folding prediction) → Rough 3D structure → refinement workflow (VEGA ZZ + NAMD) → RdRp model]
Model refinement
[Diagram (VEGA ZZ + NAMD): Rough model → Missing residues → Side chains add → Hydrogens add → Energy minimization (30,000 steps, conjugate gradients) → Structure check (Ramachandran plot) → Model ready for the screening]
Calculation of the grid maps
AutoDock requires pre-calculated grid maps to evaluate the total interaction
energy between the ligand and the target macromolecule.
To calculate them, we used the script included in the VEGA ZZ package
(script file: AutoDock/Receptor.c):
[Diagram: RdRp structure → Potential attribution → Calculation of charges → Apolar hydrogens remove → PDBQT file → Mapping the active site → AutoGrid 4 run → Grid map files]
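To show why pre-calculated maps make the energy evaluation cheap, the sketch below looks up an energy at an arbitrary ligand atom position by trilinear interpolation between the eight surrounding grid points. It is a generic illustration, not AutoDock's code: the function name `trilinear`, the nested-list grid layout and the toy values are assumptions, and the point is assumed to lie strictly inside the grid.

```python
def trilinear(grid, spacing, origin, point):
    """Interpolate a grid map value at `point`.

    grid[i][j][k] holds the energy at origin + spacing * (i, j, k).
    """
    # Fractional grid coordinates of the point along x, y and z.
    fx, fy, fz = ((p - o) / spacing for p, o in zip(point, origin))
    i, j, k = int(fx), int(fy), int(fz)
    dx, dy, dz = fx - i, fy - j, fz - k
    energy = 0.0
    # Weighted sum over the eight corners of the enclosing grid cell.
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                weight = ((dx if a else 1.0 - dx)
                          * (dy if b else 1.0 - dy)
                          * (dz if c else 1.0 - dz))
                energy += weight * grid[i + a][j + b][k + c]
    return energy


# Toy 2x2x2 map whose values follow the linear function i + 2j + 4k.
toy_map = [[[i + 2 * j + 4 * k for k in range(2)] for j in range(2)]
           for i in range(2)]
value = trilinear(toy_map, spacing=1.0, origin=(0.0, 0.0, 0.0),
                  point=(0.25, 0.5, 0.75))
```

With one map per ligand atom type, the interaction energy of a ligand pose becomes a sum of such lookups over its atoms, with no need to loop over the receptor atoms during docking.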
Considered databases
All test databases were downloaded in SDF format from http://ligand.info:
• ChemBank
• ChemPDB
• KEGG Ligand
• Anti-HIV NCI
• Drug-likeness NCI
• Not annotated NCI
• AKos GmbH
• Asinex Ltd.
Total number of docked ligands: ~1,000,000
Test system
Tyan Transport VX50
• 8 AMD Opteron 875 dual-core CPUs @ 2.4 GHz.
• 8 GB RAM.
• 72 + 150 GB SATA hard disks.
• Linux 64-bit (CentOS 4).
Throughput: 40,000 ligands/day.
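The figures on this slide allow a quick back-of-the-envelope estimate. The calculation below assumes that 40,000 ligands/day refers to the whole machine with all 16 cores (8 dual-core CPUs) busy, which the slide does not state explicitly; the variable names are chosen for this example.

```python
LIGANDS_TOTAL = 1_000_000      # approximate number of docked ligands
LIGANDS_PER_DAY = 40_000       # reported throughput of the test system
CORES = 8 * 2                  # 8 dual-core CPUs
SECONDS_PER_DAY = 24 * 60 * 60

# Wall-clock time needed to dock the whole collection.
days_needed = LIGANDS_TOTAL / LIGANDS_PER_DAY

# Average time one core spends on one ligand, if all cores are busy.
seconds_per_ligand_per_core = SECONDS_PER_DAY * CORES / LIGANDS_PER_DAY

print(f"{days_needed:.0f} days, {seconds_per_ligand_per_core:.2f} s per ligand per core")
```

Under these assumptions the full screening takes about 25 days, with roughly 35 seconds of core time per ligand.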
Preliminary results
The top-ranked ligands contain one or more sulfur atoms in their structure.
Sulfonic acid derivatives.
These compounds are known to be potent inhibitors of the HIV reverse
transcriptase. Some of them are naphthalene polysulfonic acids
developed as anti-HIV agents (Anti-HIV NCI database).
Conclusions
• We developed new parallel structure-based virtual screening software
that runs on both multi-CPU and GRID systems.
• The complete model of the RNA-dependent RNA polymerase of Citrus
tristeza virus was built to perform a virtual screening study.
• By screening ~1,000,000 ligands, potential RdRp inhibitors were found.
• These molecules contain sulfur atoms and, more specifically, multiple
sulfonic acid moieties.
• Some of them belong to the anti-HIV class.
• To complete the study, the activity of the molecules found must be
confirmed experimentally by biological assays.
Acknowledgments
• Giulio Vistoli
• Santo Motta
• Cristina Marconi
• Francesco Pappalardo
• Alessandro Lombardo
• Emilio Mastriani
www.vegazz.net
www.ddl.unimi.it