InterEvol: The R-evolutionary databank G. Faure et al, in prep.


Structural prediction of protein assemblies
Guilhem FAURE
Supervisor : Raphaël Guérois
Molecular Assemblies and Signaling
Structural Biology and Radiobiology Lab
iBiTecS – URA CNRS 2096 - CEA Saclay
iViv 2010, Journées de rentrée des doctorants
Experimental insights into the protein interaction space ?
High-throughput approaches
High-resolution approaches
Macromolecules in cellulo
Large scale vision
Synergies/Competitions
Molecular vision
Translate each node of the interaction
networks into a 3D structure ?
Experimental structures
Homology models ?
How to model the structure of
protein/domain assemblies ?
How to predict protein assemblies ?
10^4 decoys
Filters:
- Surface complementarity
- Physicochemical features
→ Evolution data
~ 10 decoys
1 most likely model
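The funnel on this slide (many decoys, successive filters, a short list, one model) can be sketched as follows. This is only an illustration: the `Decoy` class, the three score fields, the random toy scores and the "keep one in ten" rule are all assumptions, not the scoring terms actually used in the work.

```python
import random
from dataclasses import dataclass


@dataclass
class Decoy:
    """One docking decoy with three hypothetical scores (higher = better)."""
    name: str
    surface_complementarity: float
    physicochemistry: float
    evolution: float


def apply_filters(decoys, score_functions, keep_one_in=10):
    """Apply the filters in sequence; each one keeps the best 1/keep_one_in decoys."""
    survivors = list(decoys)
    for score in score_functions:
        survivors.sort(key=score, reverse=True)
        n_keep = max(1, len(survivors) // keep_one_in)
        survivors = survivors[:n_keep]
    return survivors


# Toy data: 10^4 decoys with random scores (placeholders for real scoring terms).
rng = random.Random(0)
decoys = [Decoy(f"decoy_{i}", rng.random(), rng.random(), rng.random())
          for i in range(10_000)]

filters = [
    lambda d: d.surface_complementarity,
    lambda d: d.physicochemistry,
    lambda d: d.evolution,
]
shortlist = apply_filters(decoys, filters)  # 10^4 -> 10^3 -> 10^2 -> 10
best_model = shortlist[0]                   # top decoy under the last filter
```

With three filters that each keep one decoy in ten, 10^4 decoys shrink to about 10, matching the orders of magnitude on the slide.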
Thesis Goals
→ Use evolution data to predict protein assemblies
How to characterize evolution ?
→ Conservation ? Coevolution ?
→ Type of data to analyse ?
How to use evolution to predict ?
[Pipeline: 10^4 decoys → filters → ~ 10 decoys → 1 most likely model]
Can conservation guide the prediction of protein assemblies ?
Interface conservation
Ratio of conserved residues that are part of a given interface: ~ 30 %
[Figure: complex A-B with conserved residues marked at the AB interface; plot of % of complexes versus % of all conserved residues]
Lack of specificity for prediction
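The quantity plotted on this slide, the share of a protein's conserved residues that fall inside a given interface, can be computed as below. This is a minimal sketch: the function name and the residue numbers are hypothetical, and how "conserved" and "interface" residues are defined (score cut-off, distance cut-off) is left open because the slide does not state it.

```python
def interface_share_of_conserved(conserved_residues, interface_residues):
    """Fraction of the conserved residues that belong to the interface."""
    conserved = set(conserved_residues)
    interface = set(interface_residues)
    if not conserved:
        return 0.0
    return len(conserved & interface) / len(conserved)


# Hypothetical residue numbers for one partner of a complex A-B.
conserved = {3, 7, 12, 18, 25, 31, 40, 44, 52, 60}  # 10 conserved positions
interface = {7, 12, 18, 70, 71}                     # positions in contact with B

share = interface_share_of_conserved(conserved, interface)
```

In this toy case 3 of the 10 conserved residues lie in the interface, so most conserved positions point elsewhere, which is why conservation by itself lacks specificity.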
Evolutionary rates as relevant interface signals ?
Lif1 S. cerevisiae
XRCC4 H. sapiens (low sequence identity)
X-ray structure known at 2.4 Å
X-ray structure known at 2.3 Å
Nej1 S. cerevisiae
Cernunnos H. sapiens (low sequence identity)
Evolutionary rates as relevant interface signals ?
An example from the DNA repair interaction network
Lif1 S. cerevisiae
XRCC4 H. sapiens
BRCT
DNA ligase
Nej1 S. cerevisiae
Cernunnos H. sapiens
conservation
An Example of Prediction with XRCC4-Cernunnos
Exploiting Evolution and Energy Calculations
Coll. JB Charbonnier (LBSR)
G. Faure in Malivert et al, JBC (2010)
Step 1: Filter solutions using evolutionary rates
Step 2: Local perturbations, optimisation of the interactions … search for funnels
[Plot: Rosetta score (min vs all) and interface energy, plotted against iRMS]
The model provides much valuable information
Interface mutations can be designed to study the complex
→ The model can guide the solving of the X-ray structure
But without biochemical information about BRCT → hard to predict
Need mutual information → coevolution / coadaptation
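One standard way to quantify the "mutual information" called for here is the Shannon mutual information between two alignment columns, one from each partner. The sketch below assumes that reading; whether the talk meant this exact measure is not stated, and the toy columns are invented for illustration.

```python
import math
from collections import Counter


def mutual_information(column_a, column_b):
    """Shannon mutual information (in bits) between two alignment columns.

    The two columns must list the residues of the same species in the same order.
    """
    if len(column_a) != len(column_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(column_a)
    if n == 0:
        return 0.0
    count_a = Counter(column_a)
    count_b = Counter(column_b)
    count_ab = Counter(zip(column_a, column_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


# Toy columns over four species (hypothetical data).
covarying = mutual_information("KKEE", "EEKK")    # charge swap on both sides
independent = mutual_information("KKEE", "EKEK")  # no relation between columns
conserved = mutual_information("KKEE", "DDDD")    # invariant column in partner B
```

A strictly conserved column carries no mutual information with any other column, which illustrates why evolutionary rates alone cannot link the two interacting surfaces.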
How can deleterious mutations at the interface be tolerated ?
Complementary interactions:
- charge compensation
- polar interactions
- apolar interactions
…
[Figure labels: S. cerevisiae; deleterious mutation; higher eukaryotes]
→ Neighbouring positions can buffer the loss of complementarity
Madaoui & Guerois, PNAS 2008
→ Other mechanisms of co-evolution ?
→ How to account for structural plasticity ?
How to study coevolution : the concept of interology
Same interaction involving the same partners = INTEROLOGS
→ Same interface
Same ancestor = homologs
→ Same evolutionary profile + same fold
How to build an interolog database ?
Extracting and cleaning heterocomplexes
True heteromers
→ biological interfaces
…
Redundancy filtering
2500 non-redundant interfaces
2500 groups of interologs
G. Faure et al, in prep.
350 groups of structural interologs
How to explore coevolution ?
A PyMOL plugin to visualize structures and alignments
Data and querying server at http://biodev.extra.cea.fr/lbsr/
Conclusion & Perspectives
Conservation cannot be used to predict protein assemblies
→ Building a large database
Large spectrum of sequence divergence
→ Explore structural plasticity at complex interfaces
while increasing sequence divergence
→ Test our ability to reproduce this plasticity
→ Analyze the evolution of hot-spot regions
Benchmark to address how far structural models
can be used in modelling protein complexes
Development of a statistical potential
taking evolution data into account
InterEvol: Automatic and self-updating interface database
for extracting structural and evolutionary information
XXX heteromeric complexes
Redundancy filters
Biological vs non-biological interfaces (NOXclass)
Clustering into families & superfamilies (HHsearch, Matras)
Coupled alignments of orthologous sequences for both partners
XXX non-redundant interfaces
XXX structural interologs
PyMOL plugin for interface coevolution visualisation
Querying server at http://biodev.extra.cea.fr/lbsr/
How to study coevolution ?
Querying server at http://biodev.extra.cea.fr/lbsr/
How to find coevolution ?
An interolog structural databank (350 groups of interologs)
same fold + same evolutionary profile + same interaction area
G. Faure et al, in prep.
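The three criteria above (same fold, same evolutionary profile, same interaction area) can be expressed as a simple test on a pair of complexes. Everything in this sketch is an assumption made for illustration: the `Complex` record, the fold and family labels, the representation of the interface as pairs of aligned positions, and the 0.5 Jaccard overlap threshold. It is not the procedure used to build the databank.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    """Hypothetical record for one heterodimer A-B."""
    fold_a: str
    fold_b: str
    family_a: str        # stands in for the "evolutionary profile" of partner A
    family_b: str
    contacts: frozenset  # (aligned position in A, aligned position in B) pairs


def jaccard(x, y):
    """Overlap between two sets, from 0 (disjoint) to 1 (identical)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def are_structural_interologs(c1, c2, min_overlap=0.5):
    """True if both partners share fold and family and the interfaces overlap."""
    same_fold = (c1.fold_a, c1.fold_b) == (c2.fold_a, c2.fold_b)
    same_profile = (c1.family_a, c1.family_b) == (c2.family_a, c2.family_b)
    same_area = jaccard(c1.contacts, c2.contacts) >= min_overlap
    return same_fold and same_profile and same_area


human = Complex("fold_X", "fold_Y", "fam_1", "fam_2",
                frozenset({(10, 5), (11, 5), (14, 8), (15, 9)}))
yeast = Complex("fold_X", "fold_Y", "fam_1", "fam_2",
                frozenset({(10, 5), (11, 5), (14, 8), (20, 9)}))
other_site = Complex("fold_X", "fold_Y", "fam_1", "fam_2",
                     frozenset({(40, 30), (41, 31)}))

pair_ok = are_structural_interologs(human, yeast)
pair_other = are_structural_interologs(human, other_site)
```

Here the human and yeast complexes share 3 of 5 distinct contacts and pass, while the third complex has the same folds and families but binds through a different surface and fails.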
How to explore coevolution at interfaces ?
How to predict protein assemblies with coevolution ?
Multi-body potential
Interologs database (350 groups of interologs)
Exploration set
Interface database (2500 interfaces)
Training set
InterAlign database (2500 alignments)
Which evolutionary signals at protein surfaces can be captured
to identify the interaction sites ?
Conservation analyses
[Figure labels: HSM3, RPN1, RPT1, RPT2, RPT5; colour scale: conservation score]
Evolutionary rates do not provide mutual information
between interacting surfaces …
→ How to account for co-evolution or co-adaptation ?
→ Can this help to better predict molecular assemblies ?
What fraction of conserved residues is part of the interface ?
[Figure: Protein A and Protein B with the AB interface; plot of % of complexes versus % of all conserved residues]
Co-adaptation involves not only pairs of residues but also
groups of structural neighbours
[Figure: A/B complex shown in two views rotated by 90°; alignments of Protein A and Protein B across Human, Mouse, Fish, …, Yeast, with residue positions i, j and k highlighted; residues coloured as hydrophobic, polar, acidic or basic]
Structural neighbours may compensate
for loss of complementarity
Madaoui & Guerois, PNAS 2008
Co-variation analyses at the interface
of intra-molecular domain-domain interactions
[Figure: Partner A and Partner B (Protein A, Protein B) with the AB interface; sequence alignments across Human, Mouse, Fish, …, Yeast]
An Example of Prediction Exploiting Evolution
DNA repair complex
(Non-homologous End Joining)
Coll. JB Charbonnier (LBSR)

[Figure label: conserved residues]
Docking under constraints with HADDOCK (Bonvin’s group)
G. Faure in Malivert et al, JBC (2010)
The evolutionary dimension should provide key information
to exploit interaction data under a structural perspective
2 major issues
Difficulty in identifying orthologs
How to characterize selection pressure at the interface
InterEvol: The R-evolutionary databank
A non-redundant heterodimer structure databank (2300 structures)
Study the contact statistics at the interface
[Graph: distribution of transient vs permanent complexes, interface size]
G. Faure et al, in prep.
(1) Krissinel and K. Henrick
InterEvol: The R-evolutionary databank
An interolog structural databank (350 structures)
[Figure: complex A-B and complex A’-B’: same fold + same evolutionary profile]
(Note: add the % identity values)
G. Faure et al, in prep.
InterEvol: The R-evolutionary databank
An interolog sequence databank (2300 alignments)
Sequences from PSI-BLAST
Initial structure
at least 30% identity
…
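The 30% identity cut-off on this slide can be applied to pre-aligned sequences as sketched below. Identity is computed here over columns where neither sequence has a gap, which is only one of several common conventions and is an assumption; the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_1, aligned_2, gap="-"):
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_1) != len(aligned_2):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_1, aligned_2)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def select_homologs(query, hits, min_identity=30.0):
    """Names of the hits whose identity to the query reaches the threshold."""
    return [name for name, seq in hits.items()
            if percent_identity(query, seq) >= min_identity]


# Toy aligned sequences (hypothetical, already aligned to the query).
query = "MKTAYIAKQR"
hits = {
    "close":   "MRTSYLGKHR",  # 5 identical columns out of 10
    "distant": "AAAAAAAAAA",  # 2 identical columns out of 10
    "gapped":  "MK--YIAKQR",  # 8 aligned columns, all identical
}
kept = select_homologs(query, hits)
```

With these toy sequences the distant hit falls below the threshold and is dropped, while the close and gapped hits are kept.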
G. Faure et al, in prep.
InterEvol: The R-evolutionary databank
PISA (1) (PDB complex assemblies)
Cleaned true heteromers
Non-redundant PDB structures databank
Non-redundant heterodimer databank
SCOTCHAlign databank
Interolog databank
G. Faure et al, in prep.
(1) Krissinel and K. Henrick
Through multidimensional data: InterEvolVisu
[Screenshot of the plugin on an example]
G. Faure et al, in prep.
Conclusions & Perspectives
Build a statistical multicore potential from structure and sequence data
Understand the selection pressure at the interface with interologs
Build a complete docking method that automates each step
Conservation analyses at the interface
of intra-molecular domain-domain interactions
What fraction of conserved residues is part of the interface ?
[Plot: % of complexes versus % of all conserved residues; labels: protein, interface]
Several approaches have combined conservation with other structure and sequence features
to identify potential binding patches → no mutual information
(ProMate (Neuvirth, JMB, 2004), PINUP (Liang et al, NAR, 2006), SPPIDER (Porollo, Proteins, 2007))
Relationships between sequence divergence and
conservation of the binding mode
[Figure: Human complex AB (A + B) and Yeast complex A’B’ (A’ + B’), related by ~ > 30 % identity]
Two homologous complexes (~> 30% identity)
generally interact in a similar manner
Evolution data give information about the structure of assemblies
Aloy & Russell, JMB 2003