Prediction of protein disorder: basic concepts and practical hints


PART II. Prediction of functional regions within disordered proteins
Zsuzsanna Dosztányi
MTA-ELTE Momentum Bioinformatics Group
Department of Biochemistry, Eötvös Loránd University,
Budapest, Hungary
Protein disorder is prevalent
[Figure: percentage of proteins with long disordered regions (LDR, 40<) in each kingdom (A, B, E); y axis from 0 to 60 %]
Protein disorder is functional
Xie H et al. J Proteome Res. 2007, 6, 1882-1898
Protein disorder is important

• Prion protein: prion disease
• CFTR: cystic fibrosis
• Tau: Alzheimer’s
• α-synuclein: Parkinson’s
• p53, BRCA1: cancer
Functions of intrinsically disordered proteins
• Entropic chains
• Linkers
• Molecular recognition
• Protein modifications (e.g. phosphorylation)
• Assembly of large multiprotein complexes
Interactions of IDPs
Complex between p53 and MDM2
Coupled folding and binding
Interactions of proteins
Coupled folding and binding
Functional advantages:
• Weak, transient, yet specific interactions
• Post-translational modifications
• Flexible binding regions that can overlap
• Evolutionary plasticity

Signaling
Regulation
Coupled folding and binding
Experimental difficulties:
• Highly flexible
• Weak interactions
• Short half-life

Complexes of IDPs in the PDB: ~200
Known instances: ~2,000
Estimated number of such interactions in the human proteome: ~100,000
Interactions of IDPs
F19, W23 and L26
p27
• Cyclin-dependent kinase (Cdk) inhibitor, p27Kip1 (p27)
• Cell cycle regulation
• Binds to Cdk-cyclin complexes and inhibits their activity
• Fully disordered protein
Partial unfolding enables the phosphorylation of Tyr88, starting a series of signaling events that leads to the beginning of S phase.
Prediction of functional regions within IDPs
• Disordered binding regions (ANCHOR)
• Linear motifs (ELM, SLiMPred)
• MoRFs (MoRFpred)
• Specific conservation pattern
Disordered protein complexes
• Interaction sites are usually linear (consist of only one part)
• Enrichment of interaction-prone amino acids
• No need for structure: binding sites can be predicted from sequence alone
Complex between p53 and MDM2
Binding sites
Prediction of disordered binding regions: ANCHOR

What discriminates disordered binding regions?
• They cannot form enough favorable interactions with their sequential environment
• It is favorable for them to interact with a globular protein

Based on a simplified physical model:
• Based on an energy estimation method using statistical potentials
• Captures sequential context
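
The two criteria above can be illustrated with a small sketch. This is not the ANCHOR implementation: the function names, the window size, and the way the terms are combined are assumptions made for illustration only. The `potential` argument stands for a pairwise statistical potential supplied by the user, and `globular_comp` for a reference amino acid composition of globular proteins; neither is provided here.

```python
from collections import Counter

def composition(segment):
    """Return the amino acid frequencies of a sequence segment."""
    counts = Counter(segment)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}

def estimated_energy(residue, comp, potential):
    """Estimate the energy of one residue in an environment of composition `comp`.

    `potential` maps (residue, partner) pairs to energies; missing pairs count as 0.
    """
    return sum(potential.get((residue, other), 0.0) * freq
               for other, freq in comp.items())

def interaction_gain(sequence, potential, globular_comp, half_window=10):
    """Per-residue energy gain from trading the local sequence environment
    for an average globular partner. Positive values mean the residue is
    better off interacting with a globular protein (hypothetical score)."""
    gains = []
    for i, residue in enumerate(sequence):
        start = max(0, i - half_window)
        # Sequential environment: flanking residues, excluding the residue itself.
        neighbours = sequence[start:i] + sequence[i + 1:i + 1 + half_window]
        e_local = estimated_energy(residue, composition(neighbours), potential)
        e_globular = estimated_energy(residue, globular_comp, potential)
        gains.append(e_local - e_globular)
    return gains
```

With a made-up potential in which only leucine pairs are favorable, a lone leucine in a serine-rich stretch gets a positive gain, which is the behaviour the slide describes: few favorable contacts within its own sequence, more with a globular partner.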
ANCHOR
Human p53 C-terminal region
ANCHOR and linear motifs
NCOA2 transcription co-activator
The region between residues 600 and 800 is disordered
Contains 3 receptor binding motifs: xLxxLLx (LIG_NRBOX)
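
A pattern such as xLxxLLx can be searched for with a regular expression, where each x matches any residue. The sketch below is a minimal illustration, not the ELM server's method; the function name and the optional `region` filter are assumptions, and the sequence in the usage note is invented, not NCOA2.

```python
import re

# xLxxLLx as a regular expression: "." stands for any residue.
# The lookahead wrapper reports overlapping occurrences as well.
NR_BOX = re.compile(r"(?=(.L..LL.))")

def find_motifs(sequence, pattern=NR_BOX, region=None):
    """Return (start, end, match) tuples with 1-based inclusive positions.

    `region` is an optional (start, end) residue range; only hits lying
    completely inside it are kept, e.g. a predicted disordered segment.
    """
    hits = []
    for m in pattern.finditer(sequence):
        start, end = m.start(1) + 1, m.end(1)
        if region is None or (region[0] <= start and end <= region[1]):
            hits.append((start, end, m.group(1)))
    return hits
```

For the invented sequence "GSHKLVQLLTTG", `find_motifs` reports one hit spanning residues 4 to 10 (KLVQLLT). Passing `region=(600, 800)` for a real protein would restrict the search to the segment described above.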
LMs and Disordered Binding Regions
• Linear motifs and disordered binding regions often overlap
• Complementary approaches
• Prediction of disordered binding regions can help to increase the likelihood of true instances
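
One simple way to combine the two approaches is to keep only those motif hits that fall largely inside a predicted disordered binding region. The sketch below assumes that both hits and regions are given as 1-based inclusive (start, end) pairs and that the predicted regions do not overlap each other; the function names and the 0.5 cut-off are illustrative choices, not values from any published tool.

```python
def overlap_fraction(hit, regions):
    """Fraction of a motif hit covered by predicted binding regions.

    `hit` and each entry of `regions` are (start, end) pairs, 1-based and
    inclusive. Regions are assumed not to overlap each other.
    """
    start, end = hit
    covered = 0
    for r_start, r_end in regions:
        covered += max(0, min(end, r_end) - max(start, r_start) + 1)
    return covered / (end - start + 1)

def supported_hits(hits, regions, min_fraction=0.5):
    """Keep the motif hits that overlap predicted regions sufficiently."""
    return [h for h in hits if overlap_fraction(h, regions) >= min_fraction]
```

A hit at residues 4 to 10 is kept when a region covers residues 1 to 12, while a hit at residues 50 to 56 outside any predicted region is dropped.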
Machine learning approaches
• SLiMPred: trained on the ELM database
• MoRFpred: trained on short chains in complexes
• Very small datasets
• Negative datasets
Conservation
Conservation patterns of linear motifs
• No evolutionary constraints to keep the structure
• Strong constraints on the functional site
• Island-like conservation
SLiMPrints
• Generates sequence alignments of orthologous sequences
• Relative conservation score per position
• Filters out less reliable regions
• Fails if sequences are too divergent or too similar
http://bioware.ucd.ie/slimprints.html.
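
The idea of a relative conservation score can be sketched as follows. This is not the SLiMPrints algorithm: the entropy-based column score, the flanking window, and the function names are assumptions chosen to show how an island of conservation stands out against less conserved surroundings. The input is a list of equal-length aligned sequences with "-" for gaps.

```python
import math
from collections import Counter

def column_conservation(column):
    """Return 1 minus the normalised Shannon entropy of a column (gaps ignored)."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)

def relative_conservation(alignment, half_window=5):
    """Conservation of each column minus the mean of its flanking columns."""
    scores = [column_conservation(col) for col in zip(*alignment)]
    relative = []
    for i, score in enumerate(scores):
        flank = scores[max(0, i - half_window):i] + scores[i + 1:i + 1 + half_window]
        # With no flanking columns the position is compared with itself.
        background = sum(flank) / len(flank) if flank else score
        relative.append(score - background)
    return relative
```

In a toy alignment such as ["ALK", "CLR", "DLE"], the invariant middle column scores above its variable neighbours. If all sequences are identical, every relative score is zero, which mirrors the failure case for sequences that are too similar.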