Synthetic Sprout

Generating Synthetically Accessible Ligands by De Novo Design
A Peter Johnson
Krisztina Boda
Attilla Ting
Jon Baber
SPROUT is the de novo design system developed in Leeds
SPROUT components




• Identification of potential interaction sites complementary to the receptor, e.g. hydrogen bonding sites, hydrophobic sites and metal coordination sites.
• Automated docking of small fragments at the interaction sites.
• Generation of hypothetical structures by linking the docked fragments together.
• Tools for scoring, sorting and navigating the answer set.
Hydrogen Bond Sites
H-bond acceptor site
H-bond donor site
Example: 3D shapes of sites
Boundary Surface
Docking of small fragments at target sites
• Target sites are generated either by the SPROUT module HIPPO (or a similar system) or come from a pharmacophore hypothesis.
• Small fragments with complementary functionality are selected by the user and automatically docked into the target site(s).
• In addition to these small fragments, it is also possible to dock large fragments which are known to satisfy several of the target sites. Such a large fragment can then act as a “seed” for further growth.
• A successful dock must place the small fragment at the target site with the correct orientation to satisfy any directional constraints.
• The docking process is very fast and uses a novel hierarchical least squares optimisation procedure.
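The least squares idea behind docking can be illustrated with a much simpler problem than the one SPROUT solves. The sketch below is not the hierarchical procedure mentioned above: it is a plain closed-form rigid fit in 2D that ignores directional constraints, and the function name `fit_rigid_2d` and the coordinates are hypothetical.

```python
import math


def fit_rigid_2d(fragment, target):
    """Least-squares rotation + translation of fragment points onto target points."""
    n = len(fragment)
    fx = sum(x for x, _ in fragment) / n
    fy = sum(y for _, y in fragment) / n
    tx = sum(x for x, _ in target) / n
    ty = sum(y for _, y in target) / n

    # Accumulate dot and cross products of the centred point pairs.
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(fragment, target):
        ax, ay = px - fx, py - fy
        bx, by = qx - tx, qy - ty
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx

    # The angle that minimises the summed squared distances.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    placed = [(c * (px - fx) - s * (py - fy) + tx,
               s * (px - fx) + c * (py - fy) + ty) for px, py in fragment]
    rmsd = math.sqrt(sum((x - qx) ** 2 + (y - qy) ** 2
                         for (x, y), (qx, qy) in zip(placed, target)) / n)
    return theta, placed, rmsd


# Hypothetical fragment atoms and the site positions they should occupy
# (the same shape rotated by 90 degrees and shifted by (3, 4)).
fragment_atoms = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
site_points = [(3.0, 4.0), (3.0, 5.0), (1.0, 4.0)]
theta, placed, rmsd = fit_rigid_2d(fragment_atoms, site_points)
print(round(math.degrees(theta), 3), round(rmsd, 6))
```

A small residual means the fragment can be superimposed on the sites; a real docking step would also have to check orientation and steric constraints.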
Structure generation
The SPIDER module links the target sites together in a pairwise fashion to build complete molecular structures which satisfy the target sites. It does this by sequentially adding new fragments in an exhaustive fashion.
There is no element of random choice in this process, which means that various heuristics have to be adopted to avoid a combinatorial explosion.
The main approximations employed are:
• The possible conformations about single bonds are sampled rather than enumerated in full.
• Growth is only permitted from the atoms/bonds which are closest to the target site which is to be reached.
Main algorithm of SPIDER
Multiphase heuristic graph search on a forest (set of trees)
• Two trees are searched and removed in each phase, and a new tree is generated which contains skeletons connecting both sets of sites.
• Each phase consists of a bi-directional search:
– Breadth First Search (BFS)
– Depth First Search (DFS)
• Typical saving from bi-directional search with 10 successors and 6 levels: 2 x 10^3 << 10^6
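The saving quoted above follows from simple counting. This is a minimal sketch, assuming a uniform number of successors per node and two searches that meet half way; the function names are illustrative and not part of SPROUT.

```python
def one_directional_nodes(successors: int, levels: int) -> int:
    """Nodes on the deepest level of a single tree grown to full depth."""
    return successors ** levels


def bidirectional_nodes(successors: int, levels: int) -> int:
    """Deepest-level nodes of two trees, each grown to about half the depth."""
    half = levels // 2
    return successors ** half + successors ** (levels - half)


one_way = one_directional_nodes(10, 6)
two_way = bidirectional_nodes(10, 6)
print(one_way, two_way)
```

With 10 successors and 6 levels the single tree has 10^6 leaves, while the two half-depth trees have 2 x 10^3 between them.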
Connection of Partial Structures
• A common template is located in two structures (one from each tree).
• The structures are overlaid on the common template.
• The combined structure is docked to the united set of target sites, also considering the steric constraints of the receptor site.
• Side-effect joins are examined for validity (e.g. the fusion in the figure).
Navigating the answer sets
Estimated binding energy score
– Ranking final de novo answer set
– Ranking and pruning (with caution) intermediate trees to reduce the combinatorial problem
Estimated ease of synthesis score
– Ranking final de novo answer set
– Too slow (~1 structure per minute) to be useful for intermediate pruning
– Need faster methods for intermediate pruning
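Score-based pruning of an intermediate tree level can be sketched as keeping only the best few partial structures. The sketch assumes that a more negative estimated binding score is better; the `PartialStructure` type, the skeleton names and the scores are hypothetical.

```python
import heapq
from typing import NamedTuple


class PartialStructure(NamedTuple):
    name: str
    score: float  # assumed convention: more negative = better estimated binding


def prune_level(level, keep):
    """Return the `keep` best-scoring partial structures of one tree level."""
    return heapq.nsmallest(keep, level, key=lambda p: p.score)


level = [
    PartialStructure("skeleton-a", -5.1),
    PartialStructure("skeleton-b", -7.8),
    PartialStructure("skeleton-c", -2.4),
    PartialStructure("skeleton-d", -6.0),
]
survivors = prune_level(level, keep=2)
print([p.name for p in survivors])
```

The caution noted on the slide applies here: a partial structure that scores poorly may still grow into a good final ligand, so an aggressive `keep` value can discard useful answers.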
Recent Advances

• Parallelization of structure generation
– Farm of SGs or PCs
– SPROUT server: Beowulf cluster, currently 11 dual-processor 600 MHz Pentium III machines
• VLSPROUT screens virtual libraries
• SYNSPROUT generates synthetically accessible ligands
• Receptor SPROUT generates potential synthetic receptors for small molecules
The perennial modeller's problem
Hypothetical ligands, including those predicted to bind very strongly, have no practical value unless they can be readily synthesised.
Our attempts to provide solutions:
• CAESA: post-design estimation of synthetic accessibility
• SynSPROUT: synthetic constraints built into the de novo design process
• VLSPROUT: even greater synthetic constraints; only members of a specific virtual library are generated
Synthetic Sprout Approach
Inputs:
• Pool of readily available starting materials, e.g. subset of ACD
• Knowledge base of reliable, high-yielding reactions, e.g. esterification, amide formation, reductive amination
VIRTUAL SYNTHESIS IN RECEPTOR CAVITY
Output: readily synthesisable putative ligand structures
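The pairing of a starting material pool with a reaction knowledge base can be sketched as a lookup on functional groups. This is only an illustration under stated assumptions: it ignores the receptor cavity entirely, the functional group labels are assigned by hand, and the four starting materials and the function name `candidate_joins` are hypothetical choices.

```python
from itertools import product

# Reaction name -> the pair of functional groups it joins.
REACTIONS = {
    "amide formation": ("carboxylic acid", "primary amine"),
    "reductive amination": ("carbonyl", "primary amine"),
}

# Starting material -> hand-assigned functional groups (illustrative only).
STARTING_MATERIALS = {
    "benzoic acid": {"carboxylic acid"},
    "benzylamine": {"primary amine"},
    "benzaldehyde": {"carbonyl"},
    "glycine": {"carboxylic acid", "primary amine"},
}


def candidate_joins(materials, reactions):
    """List every (reaction, material_a, material_b) the knowledge base allows."""
    joins = []
    for reaction, (group_a, group_b) in reactions.items():
        partners_a = [m for m, groups in materials.items() if group_a in groups]
        partners_b = [m for m, groups in materials.items() if group_b in groups]
        joins.extend((reaction, a, b) for a, b in product(partners_a, partners_b))
    return joins


joins = candidate_joins(STARTING_MATERIALS, REACTIONS)
for join in joins:
    print(join)
```

In the approach described on the slide, each such join would additionally have to fit inside the receptor cavity and help satisfy the target sites.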
Creation of Starting Material Libraries
• Obvious classes, e.g. amino acids
• “Drug like” starting materials selected by hand
• “Drug like” starting materials generated automatically by retrosynthetic analysis of drug databases
Retro-Synthetic Knowledge Base
Retro-Synthetic Rule

EXPLANATION Amide Formation
IF Amide
THEN
    disconnect bond between 1 and 3
    add-atom O[Hs=1] to 1 with -
    add-hydrogen to 2
END-THEN

EXPLANATION Ether Formation
IF Ether
THEN
    disconnect bond between 2 and 3
    add-atom O[Hs=1], Cl, Br to 3 with -
    add-hydrogen to 2
END-THEN
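The effect of a retro-synthetic disconnection can be sketched on a toy molecular graph. This is not the rule interpreter used by the authors: the data layout and function names are assumptions, and atoms are addressed by role (carbonyl carbon, nitrogen) rather than by the atom numbers of the slide, whose figure is not recoverable here.

```python
def disconnect_amide(atoms, bonds, carbon, nitrogen):
    """Break the C-N bond, cap the carbon with OH and add one H to the nitrogen."""
    new_atoms = {i: dict(a) for i, a in atoms.items()}
    new_bonds = set(bonds)
    new_bonds.remove(frozenset((carbon, nitrogen)))
    oxygen_id = max(new_atoms) + 1
    new_atoms[oxygen_id] = {"element": "O", "hs": 1}  # like add-atom O[Hs=1]
    new_bonds.add(frozenset((carbon, oxygen_id)))
    new_atoms[nitrogen]["hs"] += 1                    # like add-hydrogen
    return new_atoms, new_bonds


def fragments(atoms, bonds):
    """Connected components of the graph, i.e. the separate starting materials."""
    seen, parts = set(), []
    for start in atoms:
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            atom = stack.pop()
            if atom in part:
                continue
            part.add(atom)
            stack.extend(b for bond in bonds if atom in bond for b in bond if b != atom)
        seen |= part
        parts.append(sorted(part))
    return sorted(parts)


# N-methylacetamide, CH3-C(=O)-NH-CH3, with implicit hydrogen counts.
atoms = {
    1: {"element": "C", "hs": 3},
    2: {"element": "C", "hs": 0},
    3: {"element": "O", "hs": 0},
    4: {"element": "N", "hs": 1},
    5: {"element": "C", "hs": 3},
}
bonds = {frozenset(p) for p in [(1, 2), (2, 3), (2, 4), (4, 5)]}

new_atoms, new_bonds = disconnect_amide(atoms, bonds, carbon=2, nitrogen=4)
parts = fragments(new_atoms, new_bonds)
print(parts)
```

The two components correspond to acetic acid and methylamine, the kind of fragments that would be collected into a starting material library.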
[Figure: example structures before and after retro-synthetic disconnection; atom numbers 1-4 mark the matched patterns, and the fragments are capped with HO, Cl or Br as the rules specify]
Automatic Template Library Generation
Stages (in figure order):
• 2D Drug-like Structures
• Perception
• Fragmentation (Ring Perception)
• Clustering
• Filter (Retro-synthetic patterns)
• Corina: Single 3D Conformer Generation
• Omega: Multiple Conformer Generation
• Synthetic Template Library
Knowledge bases:
• Perception Knowledge Bases: Aromatic, Normalisation, Hybridisation, H-bonding properties
• Retro-Synthetic Knowledge Base: Retro-Synthetic rules
• Synthetic Knowledge Base: Functional groups
Automatic Chemical Perception
Rule based system where rules are encoded using the PATRAN language

Information Perceived
– Aromatic atoms and bonds
– Normalised bonds
– Hybridisation including induced hybridisation
– H-Donors / Acceptors
– Number of hydrogens attached to an atom
– Number of connections to an atom
– Number of available electron pairs
– Charge at an atom

Example from Hybridisation knowledge base
CHEMICAL-LABEL <NitrogenWithLP--SP2>
X[SPCENTRE=2]-N[HS=0,1,2];[SPCENTRE=3]
EXPLANATION N with lone pair next to sp2 centre behaves as sp2.
IF NitrogenWithLP--SP2
THEN set-av-eps 2 to 0
     set-hybridisation 2 to 2
END-THEN
(similar to SMILES)
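The hybridisation rule above can be mimicked on a small atom table. This is a hedged sketch, not a PATRAN interpreter: the dictionary layout, the field names `sp`, `hs` and `av_eps`, and the function name are assumptions made for illustration.

```python
def perceive_nitrogen_sp2(atoms, bonds):
    """Apply the NitrogenWithLP--SP2 idea once; return ids of the changed atoms."""
    matches = set()
    for a, b, order in bonds:
        if order != 1:  # the pattern uses a single bond
            continue
        for x, n in ((a, b), (b, a)):
            nitrogen = atoms[n]
            if (atoms[x]["sp"] == 2 and nitrogen["element"] == "N"
                    and nitrogen["sp"] == 3 and nitrogen["hs"] in (0, 1, 2)):
                matches.add(n)
    # Collect all matches first, then apply, so one change cannot trigger another.
    for n in matches:
        atoms[n]["av_eps"] = 0  # like set-av-eps 2 to 0
        atoms[n]["sp"] = 2      # like set-hybridisation 2 to 2
    return sorted(matches)


# Glycinamide, H2N-CH2-C(=O)-NH2: one amine nitrogen and one amide nitrogen.
atoms = {
    1: {"element": "N", "sp": 3, "hs": 2, "av_eps": 1},
    2: {"element": "C", "sp": 3, "hs": 2, "av_eps": 0},
    3: {"element": "C", "sp": 2, "hs": 0, "av_eps": 0},
    4: {"element": "O", "sp": 2, "hs": 0, "av_eps": 2},
    5: {"element": "N", "sp": 3, "hs": 2, "av_eps": 1},
}
bonds = [(1, 2, 1), (2, 3, 1), (3, 4, 2), (3, 5, 1)]

changed = perceive_nitrogen_sp2(atoms, bonds)
print(changed, atoms[5])
```

Only the amide nitrogen, which sits next to the sp2 carbonyl carbon, is relabelled; the amine nitrogen attached to an sp3 carbon keeps its lone pair and sp3 label.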
Perception - Binding Properties
O: single atom based (original method) vs C: functional group based (current method)
Labels:
– D: H donor
– A: H acceptor
– J: Joinable*
– H: Hydrophobic
– N: None
* According to reaction knowledge base
Synthetic Template
[Figure: template structures with atoms labelled by binding property (D, A, AD, H)]
• Primary Amine (Donor)
• Phenol (Acceptor-Donor)
• Carboxylic Acid (Acceptor)
Synthetic Knowledge Base
Synthetic Rules
[Figure: carboxylic acid + primary amine giving an amide, with atom labels 1 to 5]

EXPLANATION Amide Formation 1
IF Carboxylic Acid INTER Primary Amine
THEN destroy-atom 3
     form-bond - between 1 and 5
     change-hybridization 5 to SP2
     Dihedral 0 0
     Dihedral 0 180
     Bond-length 1.35
END-THEN

Joining Rules
• Steps of formation
• Hybridization change
• Bond type
• Bond length
• Dihedral angles/penalties
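The actions of the Amide Formation 1 rule can be sketched as data plus a small interpreter. This is an assumption-laden illustration rather than the authors' implementation: the dictionary encoding, the atom identifiers and `apply_joining_rule` are hypothetical, hydrogens are not tracked, and the two `Dihedral` entries are stored as raw pairs because the transcript does not say how they are interpreted.

```python
import copy

# Hypothetical encoding of the rule text shown on the slide.
AMIDE_FORMATION_1 = {
    "destroy_atom": 3,
    "form_bond": (1, 5),
    "bond_order": 1,
    "change_hybridisation": {5: "SP2"},
    "dihedrals": [(0, 0), (0, 180)],  # raw values, meaning not specified
    "bond_length": 1.35,
}


def apply_joining_rule(rule, atoms, bonds, labels):
    """Apply one joining rule; `labels` maps rule atom numbers to atom ids."""
    atoms = copy.deepcopy(atoms)
    doomed = labels[rule["destroy_atom"]]
    del atoms[doomed]
    bonds = {pair: info for pair, info in bonds.items() if doomed not in pair}
    first, second = (labels[number] for number in rule["form_bond"])
    bonds[frozenset((first, second))] = {
        "order": rule["bond_order"],
        "length": rule["bond_length"],
    }
    for number, hybridisation in rule["change_hybridisation"].items():
        atoms[labels[number]]["sp"] = hybridisation
    return atoms, bonds


# Heavy atoms of the reacting groups only: C(=O)OH and an amine nitrogen.
atoms = {
    "C1": {"element": "C", "sp": "SP2"},
    "O2": {"element": "O", "sp": "SP2"},
    "O3": {"element": "O", "sp": "SP3"},
    "N5": {"element": "N", "sp": "SP3"},
}
bonds = {
    frozenset(("C1", "O2")): {"order": 2, "length": None},
    frozenset(("C1", "O3")): {"order": 1, "length": None},
}
labels = {1: "C1", 2: "O2", 3: "O3", 5: "N5"}

product_atoms, product_bonds = apply_joining_rule(AMIDE_FORMATION_1, atoms, bonds, labels)
print(sorted(product_atoms), product_bonds[frozenset(("C1", "N5"))])
```

Keeping the rule as data means further reactions could be added without changing the interpreter, which matches the knowledge base style of the slide.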
De-novo Design Using Synthetic Sprout
[Figure: worked example growing a ligand between a donor site and an acceptor site from Primary Amine, Carboxylic Acid and Carbonyl fragments]
1. Dock selected fragments to each site
2. Select two sites for connection
• DF Search towards acceptor site
• BF Search towards donor site
• Overlapping common fragment
Reactions used in the example:
1. Amide Formation (Carboxylic Acid - Primary Amine)
2. Reductive Amination (Carbonyl - Primary Amine)
New Problems - Hybridisation change
Hybridisation change (SP3 → SP2)
Secondary Amine: Nitrogen becomes SP2
Hybridisation change in Amide Formation 2 (Carboxylic Acid - Secondary Amine)
Hybridisation change (SP2 → SP3)
Carbonyl: Carbon becomes SP3
Hybridisation change in Reductive Amination 1 (Carbonyl - Primary Amine)
Selection of Synthetic Reactions
Amide Formation
Ether Formation
Ullmann reaction
Amine Alkylation
Ester Formation
Aldol
Wittig
Imine
C-S-C Formation
Reductive Amination
CDK2
Library: Act, 300 fragments / 1055 conformations
Score: -7.80
Run time: 10 h
Docked: 890, 358, 935, 780
Reactions in the example structure (numbered as in the figure):
1. Amide Alkylation 2 (Secondary Amide - Primary Alkyl Halide)
2. Wittig Reaction (Carbonyl = Primary Alkyl Halide)
3. Ether Formation 1 (Alcohol - Alcohol)
4 & 5. Amine Alkylation 1 (Primary Amine - Primary Alkyl Halide)
SynSPROUT
Current status
• Works well for small starting material libraries (low hundreds).
• Several libraries now built, including an amino acid library for peptide generation. A library from MDDR is being built.
• Potential for suggesting starting points for new combinatorial libraries.
Future work
• Extend the types of chemistry allowed.
• Develop algorithms which would permit the use of libraries of hundreds of thousands of starting materials (such as ACD).
• Parallelisation helps but on its own is not sufficient to cope with the inevitable combinatorial explosion.
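A rough count shows why parallelisation alone cannot absorb the growth in library size. The numbers below are assumptions chosen for illustration: a ligand built from 3 starting materials, a small library of 300, a large one of 300,000, and no filtering by reaction compatibility or by the receptor.

```python
def naive_sequences(library_size: int, materials_per_ligand: int) -> int:
    """Ordered choices of starting materials, with no chemical filtering at all."""
    return library_size ** materials_per_ligand


small = naive_sequences(300, 3)       # "low hundreds" library
large = naive_sequences(300_000, 3)   # "hundreds of thousands" library
growth = large // small

# The cluster described earlier has 11 dual-processor machines.
processors = 11 * 2
print(growth, processors)
```

Under these assumptions the search space grows by a factor of 10^9, while the cluster offers at most a 22-fold speed-up, so better algorithms are needed in addition to more hardware.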
Acknowledgements
Co-workers :
Krisztina Boda
Attilla Ting
Jon Baber
Special thanks to OpenEye Scientific Software for providing access to OMEGA