Topics in Protein-Protein Docking

DOCKING
Modeling Protein Complexes
Dr. Victor Lesk
24th October 2006
• Protein / small molecules
– Enzyme / substrates
– Enzyme / drug
• Protein / protein
– Enzyme / inhibitor
– Inhibitor / modulator
– Macromolecular assemblies
• Protein / nucleic acid
– RNA/DNA / polymerase
– Ribosome / peptide
• Docking two molecules means constructing
the coordinates of the bound state.
• Bound state is called the complex.
• We require coordinates for the independent
molecules as input
• Molecules move towards each other and
bind/‘dock’
• But aim is to predict their docked
configuration (not describe their motion).
Drug design
• Drugs typically affect a protein target’s ability to
bind substrate (competitively or non-competitively).
• For some time carried out by automated
screening of large library of compounds
• Library chosen according to ADME criteria
using Lipinski’s rule of 5
• Used to be experimental screening using
physical library
• Now “virtual screening” computer-based
methods are also available.
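As a rough illustration of how a library might be pre-filtered with Lipinski's rule of 5, the sketch below counts rule violations from precomputed descriptors. The `Compound` record, its field names, the `max_violations` parameter and the three example compounds are all hypothetical; only the four thresholds (molecular weight 500, logP 5, 5 hydrogen-bond donors, 10 hydrogen-bond acceptors) come from the rule itself.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    # Hypothetical record; descriptor values would come from a cheminformatics tool.
    name: str
    mol_weight: float   # daltons
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(c):
    """Count how many of the four rule-of-5 thresholds the compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def prefilter(library, max_violations=0):
    """Keep compounds with at most max_violations violations (assumed policy)."""
    return [c for c in library if lipinski_violations(c) <= max_violations]


# Made-up descriptor values, for illustration only.
library = [
    Compound("small_polar", 180.2, 1.2, 1, 4),
    Compound("too_heavy", 720.9, 3.0, 2, 8),
    Compound("greasy_and_heavy", 650.0, 6.5, 1, 3),
]
kept = prefilter(library)
print([c.name for c in kept])
```

Raising `max_violations` to 1 would also keep `too_heavy`, which is one way to express a more tolerant filter.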
How is virtual screening
performed?
• QSAR when structure of protein target is
unknown
• When protein structure is known, docking
the drugs onto the protein can be tried
(“small-molecule docking”).
• If the partner is also a protein and the binding site
cannot be identified by experiment or
bioinformatics, protein-protein docking
may be used to help find it.
SMALL-MOLECULE DOCKING
Ligand docking
Virtual screening
For:
• Drug design
– Lead optimization
• Toxicology
• Metabolism study
• Development of tags
for imaging
Software e.g.:
• Commercial
– DOCK
– GOLD
– FlexX
• Free for academics
– Autodock
Small molecule docking for drug
design
• Try to dock putative drug molecule on to protein
• Each molecule has few atoms, so docking each one is
computationally cheap, but there are many molecules.
• Search from library of 500,000 compounds
– Pre-filtered using heuristics (Lipinski)
• Score with pairwise energy function
• Protein remains rigid
• Torsion angles of drug are allowed to rotate. Bonds are
not allowed to stretch or flex.
• Example: HIV protease inhibitors
PROTEIN-PROTEIN DOCKING
(also polynucleotides)
• Other reasons for protein-protein docking
• Basic concepts
• Methods
• Assessment
• Summary
Background: why do protein-protein docking?
Aside from helping with virtual screening,
• Protein-protein interaction networks are of
widespread interest in systems biology
• There exist proteins about which genome
projects provide no information
• And known proteins with as yet unknown
interactions
• Structure prediction technology advances
Protein-Protein Docking
• Configuration space large, computationally
intensive problem
• Even larger when one of components is not
rigid ‘enough’
– Throw away as many atoms as possible
– Search remaining space efficiently
– And/or use high performance computing
Protein-protein docking:
Methods
• Set of configurations must contain good
enough one
• Good enough configuration must have
nearly the best score
• Use non-structural help where possible
Protein-protein docking:
Methods
• Fourier series methods
• Monte Carlo methods
• Surface methods
• Bioinformatics methods
• Normal modes
Fourier method for protein-protein docking
‘Reciprocal space method’
• Convolution scores only
• Non-structural help cannot be used
efficiently
• Conformational change not allowed
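Why the Fourier method is restricted to convolution-type scores can be seen in a toy example. The sketch below scores every translation of a rigid "ligand" against a rigid "receptor" on a one-dimensional cyclic grid, once by direct summation and once through the correlation theorem. The grid values (1 for surface, -10 for core), the function names and the 1D setting are assumptions for illustration; real methods work on 3D grids, repeat the calculation for each rotation, and use a fast Fourier transform rather than the naive O(n²) transform written out here.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT library call)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        out.append(s / n if inverse else s)
    return out


def correlation_direct(receptor, ligand):
    """score[t] = sum_j receptor[j] * ligand[j - t], ligand shifted by t cells."""
    n = len(receptor)
    return [sum(receptor[j] * ligand[(j - t) % n] for j in range(n)) for t in range(n)]


def correlation_fourier(receptor, ligand):
    """Same scores from one product in reciprocal space (real-valued grids)."""
    r_hat = dft(receptor)
    l_hat = dft(ligand)
    product = [a * b.conjugate() for a, b in zip(r_hat, l_hat)]
    return [v.real for v in dft(product, inverse=True)]


# Assumed toy grids: receptor surface cells score +1, core cells -10, empty 0.
receptor = [0, 0, 1, -10, -10, 1, 0, 0]
ligand = [1, 1, 0, 0, 0, 0, 0, 0]   # ligand occupies two neighbouring cells

direct = correlation_direct(receptor, ligand)
fast = correlation_fourier(receptor, ligand)
print(direct)
print([round(v, 6) for v in fast])
```

Both routes give the same score for each translation; the transform route only works because the score is a sum of products of grid values, which is what restricts the method to convolution scores.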
Monte-Carlo methods for
protein-protein docking
• Make small random change
• Prefer to accept change with better score
• Random change may include non-rigidity
• Not guaranteed to consider true structure
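The accept/reject step can be sketched with the Metropolis criterion, one common way to "prefer" better-scoring changes while still accepting some worse ones. The slide does not say which acceptance rule is used, so the criterion, the temperature value, and the two-parameter toy score (a displacement and an angle, lower is better) are all assumptions; a real docking move would also perturb orientation and possibly side chains.

```python
import math
import random


def metropolis_search(score, perturb, start, steps, temperature, rng):
    """Random walk that always accepts improvements and sometimes accepts worse moves."""
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = perturb(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Metropolis rule (assumed): worse moves accepted with probability exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


def toy_score(config):
    """Hypothetical score with its minimum at displacement 3.0 and angle 0."""
    x, angle = config
    return (x - 3.0) ** 2 + (1.0 - math.cos(angle))


def toy_perturb(config, rng):
    """Small random change to both parameters."""
    x, angle = config
    return (x + rng.uniform(-0.5, 0.5), angle + rng.uniform(-0.3, 0.3))


rng = random.Random(0)
start = (0.0, 2.0)
best, best_score = metropolis_search(toy_score, toy_perturb, start, 5000, 0.1, rng)
print(best, best_score)
```

Because the walk is random, nothing forces it to visit the true structure, which is the caveat in the last bullet above.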
Surface method for protein-protein docking
• Superpose a point on each protein’s
molecular surface
• Rotate to make normals antiparallel
• Surfaces created with marching cubes
Movie of surface method protein-protein docking
• 2 x Plasmodium vivax 25kD protein
• Homodimer complex
• Symmetry not imposed
• 1000 active triangles on each protein
• 60,000,000 configurations total
• Score rate: 30 configurations per second
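The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that every triangle on one protein is paired with every triangle on the other and that each pair is tried at a fixed number of rotation angles; that sampling scheme is an inference, not something the slide states.

```python
triangles_per_protein = 1000
total_configurations = 60_000_000
score_rate = 30  # configurations scored per second

# Assumption: all triangle pairs are tried, each at the same number of angles.
triangle_pairs = triangles_per_protein ** 2
angles_per_pair = total_configurations // triangle_pairs
angle_step_degrees = 360 / angles_per_pair

seconds = total_configurations / score_rate
processor_hours = seconds / 3600

print(angles_per_pair, angle_step_degrees, round(processor_hours, 1))
```

Under that assumption the search uses 60 angles per triangle pair (a 6 degree step) and takes roughly 556 processor hours, the same order as the 600 processor hours quoted in the Summary.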
Marching cubes
(Lorensen and Cline,1987)
• Originally for medical imaging
(Pictures: D. Lingrand
Université Nice Sophia Antipolis/CNRS)
Marching cubes: animated
demonstration
• Molecular surface of P25 being constructed
• Low resolution
Marching cubes: properties
• Surface is constructed out of triangles
(‘simplicial complex’)
• All mathematical topologies ok
• Restrictable to specified patches
• Internal pockets must be eliminated
• Major flexibility requires refinement stage
(although better than Fourier method)
Marching cubes: variables
Molecular surface
Solvent-accessible surface
Marching cubes: patches
800 sq. A patch around
GLU152.Oe2
Sample from patch
Bioinformatics methods for
protein-protein docking
• Mutagenesis effects on affinity
• Surface residue conservation
• Correlated mutation between interactors
• Homologous complexes
• Bioinformatics auxiliary, not stand-alone
Normal Modes
• About 10000 oscillations in average protein
• Largest amplitude oscillations calculable
• Describe structural change for docking
• Or just reduce backbone conformer search
(needs to be done)
Scores for selecting the best
configuration
• Free energy
– Electrostatics
– Stereochemistry
• Solvation score
• Statistical scores
• Geometric scores
• Phylogenetic scores
• Combined scores
Free energy
• Contribution from all pairs of atoms
• Same/opposite electric charges repel/attract
• Electron clouds exclude each other
• Atoms try to make glancing contact
• Hydrogen bonds are favourable (difficult to
model and calculate, direction-dependent)
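A minimal sketch of a pairwise energy of this kind combines a Coulomb term (charges repel or attract) with a Lennard-Jones 12-6 term (overlap is penalised, glancing contact is rewarded). The functional form is a standard textbook choice, not necessarily the one used in the lecture; the well depth, contact distance and dielectric constant are illustrative assumptions, and the direction-dependent hydrogen-bond term is left out.

```python
import math

# Conversion factor giving kcal/mol for charges in e and distances in angstroms
# (approximate conventional molecular-mechanics value).
COULOMB_CONSTANT = 332.06


def pair_energy(q1, q2, r, epsilon=0.1, r_min=3.5, dielectric=4.0):
    """Energy of one atom pair; epsilon, r_min and dielectric are assumed values."""
    coulomb = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    ratio6 = (r_min / r) ** 6
    # Minimum of -epsilon at r == r_min (glancing contact), steep wall below it.
    lennard_jones = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    return coulomb + lennard_jones


def interaction_energy(atoms_a, atoms_b):
    """Sum over all pairs with one atom from each molecule; atoms are (x, y, z, charge)."""
    total = 0.0
    for xa, ya, za, qa in atoms_a:
        for xb, yb, zb, qb in atoms_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            total += pair_energy(qa, qb, r)
    return total


# Two hypothetical single-atom "molecules" with opposite charges in contact.
atoms_a = [(0.0, 0.0, 0.0, 1.0)]
atoms_b = [(3.5, 0.0, 0.0, -1.0)]
print(interaction_energy(atoms_a, atoms_b))
```

The double loop makes the cost proportional to the product of the atom counts, which is one reason protein-protein scoring is so much heavier than small-molecule scoring.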
Solvation score
• Water attracts polar groups
• Non-polar groups buried by interface
• Atomic contact energy (Zhang)
Statistical scores
• Interacting residue or atom type profile
• Profile from known complex interfaces
Geometric scores
• Convex hull of surfaces
• Buried surface area
• Volume of intersection
Phylogenetic scores
• Needs homologues for all interactors
• Conservation score
• Correlation of mutations across interface
Combined scores
• best/worst rank(s1, s2, s3, …)
• reverse: -s1
• s1 with configurations filtered out if s2 > 5.7
• weighted sum: a x s1 + b x s2 + …
• weighted product: s1^a x s2^b x s3^c
• Automated combined score trials
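The combination operations listed above can be sketched as small functions over per-configuration score lists. The reading of "best/worst rank" as the minimum or maximum of a configuration's ranks under each score, the convention that lower scores rank first, and all function names are assumptions made for this illustration.

```python
import math


def rank(scores):
    """Rank of each configuration under one score (0 = lowest score; assumed convention)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def best_rank(*score_lists):
    return [min(r) for r in zip(*(rank(s) for s in score_lists))]


def worst_rank(*score_lists):
    return [max(r) for r in zip(*(rank(s) for s in score_lists))]


def reverse(s1):
    return [-v for v in s1]


def filtered(s1, s2, cutoff):
    """s1, with None marking configurations filtered out because s2 > cutoff."""
    return [a if b <= cutoff else None for a, b in zip(s1, s2)]


def weighted_sum(weights, *score_lists):
    return [sum(w * v for w, v in zip(weights, values)) for values in zip(*score_lists)]


def weighted_product(exponents, *score_lists):
    return [math.prod(v ** e for e, v in zip(exponents, values))
            for values in zip(*score_lists)]


# Made-up scores for three configurations.
s1 = [2.0, 1.0, 4.0]
s2 = [6.0, 3.0, 1.0]
print(best_rank(s1, s2), worst_rank(s1, s2))
print(filtered(s1, s2, 5.7))
print(weighted_sum((1.0, 0.5), s1, s2), weighted_product((1, 2), s1, s2))
```

An automated trial could then loop over weights or exponents and keep whichever combination ranks near-native configurations highest on a benchmark.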
Assessment of docking methods
• Benchmark
• Assessment event:
– CAPRI double-blind trial
Protein-protein docking
benchmark
• 84 protein-protein complexes
• Unique structural family combinations
• Diverse biological roles
• Maintained by Prof. Z. Weng at Boston U.
CAPRI: Critical Assessment of
PRotein Interactions
• Every 6 months or so, irregular
• Set of 1 to 6 target complexes
• Centralized double-blind assessment
• International meetings
• Proteins journal: special CAPRI edition
Interpretation of docking
• Not simulation (maybe Monte Carlo is)
• Energy functions are no more than
‘inspired’ by physics
• With greater understanding affinity
prediction could become possible
Imperial structural bioinformatics
group
Virtual screening
• Prof. Mike Sternberg
• Dr. Ata Amini
• Dr. Paul Shrimpton
Protein-protein docking
• Prof. Mike Sternberg
• Dr. Suhail Islam
• Dr. Victor Lesk
• Philip Carter
• Sara Dobbins
Glossary
• Complex
• Component
• Ligand/Receptor
• Bond torsion
• Torsion angle
• Bond angle
• Configuration
• Decoy
• Blind Trial
• PDB file
• Coordinate file
• Energy function
• Scoring function
• Fitness function
• Pairwise energy
• Electrostatics
• Solvation
• Dielectric
• Affinity
• Fourier Transform
• Mutagenesis
Summary
• Surface method: fast, versatile, flexibility ok
• 600 processor hours for full rigid search
• To be done:
– Score improvement
– Fast sidechain flexibility
– Backbone flexibility
Bibliography
• Virtual screening
– Virtual Screening in Drug Discovery, Alvarez & Shoichet, CRC
Press (2005)
– Structure-based virtual screening – an overview, Lyne, DDT 7
20 1047 (2002)
• Lipinski’s rule of 5
– Experimental and computational approaches to estimate
solubility and permeability in drug discovery and
development settings, Lipinski et al., Adv. Drug Del. Rev. 46 1-3 3 (2001)
• Protein-protein docking: Fourier method
– Molecular surface recognition: determination of geometric fit
between proteins and their ligands by correlation techniques,
Katchalski-Katzir et al., PNAS 89 2195 (1992)
• Protein-protein docking: Monte Carlo
– Protein-protein docking with simultaneous
optimization of rigid-body displacement and
side-chain conformations, Gray et al., JMB 331 1 281 (2003)
• Benchmark for protein-protein docking
– Protein-Protein Docking Benchmark 2.0, Mintseris et
al., Proteins 60 2 216 (2005)
• Solvation modeling for proteins
– Determination of atomic desolvation energies from the
structures of crystallized proteins, Zhang et al., JMB
267 3 707 (1997)
• Automated protein-protein docking
server
– CLUSPRO, Comeau et al., Bioinf. 20 1 45 (2004)
http://nrc.bu.edu/cluster/
• CAPRI docking assessment
– Welcome to CAPRI, Janin, Proteins SFG 47 3 257 (2002)
– CAPRI methods articles, Proteins SFG 52 1 (2003)
– The CAPRI experiment, its evaluation and
implications, Wodak & Mendez, Curr Opin Struct Biol 14
2 242 (2004)
• Marching Cubes
– Marching cubes: A high resolution 3d surface
construction algorithm, Lorensen & Cline, Proc. ACM
Siggraph Aug 1987 163
Critical
• Scores and hybrid scores
• How to describe the success rate of a
docking method
• Benchmark
• Changes in protein structure upon
complexation
Basic Scores
electrostatic energy
hard core repulsion penalty score (e.g.
Lennard-Jones)
solvation score
Quality of method
In x% of different protein-protein complexes, the best
n guesses of method M
contain at least one guess
closer than r Angstroms to the true structure.
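The statement above translates directly into a success-rate calculation. In the sketch below each complex has a list of distances (in Angstroms) between the method's guesses and the true structure, ordered from best-scored guess downwards; the use of a single distance per guess (for example an RMSD), the dictionary layout and the example numbers are assumptions.

```python
def success_rate(distances_by_complex, n, r):
    """Percentage x of complexes whose top n guesses include one closer than r."""
    hits = sum(
        1
        for distances in distances_by_complex.values()
        if any(d < r for d in distances[:n])
    )
    return 100.0 * hits / len(distances_by_complex)


# Made-up distances for the three best-scored guesses on four complexes.
results = {
    "complex_A": [12.4, 3.1, 8.0],
    "complex_B": [15.2, 22.7, 9.9],
    "complex_C": [1.8, 30.5, 4.4],
    "complex_D": [25.0, 18.3, 2.2],
}

for n in (1, 2, 3):
    print(n, success_rate(results, n, 5.0))
```

Reporting x for several values of n and r, rather than a single number, shows how quickly a method's ranking finds a near-native guess.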
Benchmark
• Benchmark of 39 complexes, filtered for
redundancy
• Ranked in 3 difficulty classes based on
degree of conformational change
• Enzyme-inhibitor complexes: 11
• Antibody-antigen complexes: 11
• 27 others of unclassified functional role
Conformational change
• Proteins change shape a little (example)
or a lot (example) upon complexation
• When conformational change is small,
docking methods can ignore it.
Strategy
Any docking method must work unfailingly in
cases of zero conformational change.
Methods are first tested with zero
conformational change imposed.
Surface-based docking
• Automatically excludes a large subset of known
undesirable conformations
• Can impose contact between specified surface
patches
• N-fold rotational symmetry
• Provides alternative visualization
Construction of Surface from pdb
file
• Step 1: read pdb file and identify atom
types
• Step 2: replace points with overlapping
clouds of density
• Step 3: apply marching cubes to generate
a set of interlocking triangles representing
the atomic surface
Generation of guesses for
complexed structure
A possible structure for the complex can be
generated by specifying
• A triangle from the surface representing
component 1
• A triangle from component 2
• An angle
Put triangle 1 against triangle 2 and rotate by
angle around the centre of the triangle.
Score the configuration and record the structure.
Repeat 1M times.
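The placement step described above can be sketched as a rigid-body transform: move component 2 so that the centroid of its triangle lands on the centroid of triangle 1, rotate so that the two normals are antiparallel, then spin by the chosen angle about the shared normal. The helper names, the use of the centroid as the triangle centre, and the convention that vertex order defines the outward normal are assumptions; scoring and recording the result are omitted.

```python
import math


def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])
def unit(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))


def centroid(tri):
    return tuple(sum(v[i] for v in tri) / 3.0 for i in range(3))


def normal(tri):
    """Unit normal; its direction depends on vertex order (assumed outward)."""
    return unit(cross(sub(tri[1], tri[0]), sub(tri[2], tri[0])))


def rotate(v, axis, angle):
    """Rodrigues' formula: rotate v about a unit axis by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return add(add(scale(v, c), scale(cross(axis, v), s)),
               scale(axis, dot(axis, v) * (1.0 - c)))


def alignment(n_from, n_to):
    """Axis and angle of a rotation taking unit vector n_from onto n_to."""
    axis = cross(n_from, n_to)
    length = math.sqrt(dot(axis, axis))
    if length < 1e-12:
        if dot(n_from, n_to) > 0:
            return (1.0, 0.0, 0.0), 0.0
        # Opposite vectors: half-turn about any perpendicular axis.
        helper = (1.0, 0.0, 0.0) if abs(n_from[0]) < 0.9 else (0.0, 1.0, 0.0)
        return unit(cross(n_from, helper)), math.pi
    return scale(axis, 1.0 / length), math.atan2(length, dot(n_from, n_to))


def place(coords2, tri1, tri2, angle):
    """Transform component 2 coordinates so tri2 sits against tri1, spun by angle."""
    c1, n1 = centroid(tri1), normal(tri1)
    c2, n2 = centroid(tri2), normal(tri2)
    axis, theta = alignment(n2, scale(n1, -1.0))
    placed = []
    for p in coords2:
        v = rotate(sub(p, c2), axis, theta)  # normals now antiparallel
        v = rotate(v, n1, angle)             # spin about the shared normal
        placed.append(add(c1, v))
    return placed


# Hypothetical triangles; here only the vertices of triangle 2 are transformed.
tri1 = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
tri2 = ((1.0, 2.0, 3.0), (2.0, 2.0, 4.0), (1.0, 3.0, 3.0))
placed_tri = place(tri2, tri1, tri2, math.radians(40.0))
print(centroid(placed_tri), normal(placed_tri))
```

In a full search the same transform would be applied to every atom of component 2 for each sampled (triangle 1, triangle 2, angle) triple before scoring.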
Ultimate aims of Protein-Protein
Docking, in order of difficulty
• What is the complexed structure of proteins
x, y of known structure which are known to
form a complex? (Hard)
• Could proteins a, b of known structure form
a stable complex in vivo? (Very Hard)
• What, approximately, is the chemical affinity
for given interacting proteins? (Very Hard)
Hybrid scores
Hybrid scores are scores produced by
operations on basic scores s1, s2, s3:
• reverse(s1)
• s1 with configurations filtered out if s2 > 5.7
• weighted sum a*s1 + b*s2 + c*s3 + …
• weighted product s1^a x s2^b x s3^c …