Disordered regions
Intrinsically Disordered Proteins: from lack of structure to pleiotropy of functions
Lilia Iakoucheva
University of California, San Diego
OUTLINE
Characterization and properties of IDPs
Functional repertoire of IDPs
Post-translational modifications and disorder
Importance for molecular recognition
Disorder and diseases
Historical perspective
1894 – Emil Fischer’s “lock-and-key” hypothesis
1950 – Fred Karush “Configurational adaptability”
1958 – Daniel Koshland “Induced fit” theory
Protein Structure-Function Paradigm
Amino Acid Sequence → 3D Structure → Function
Disorder examples
Some proteins/regions could function without being folded… = disordered
First examples of disorder:
• Tail of histone H5 (Aviles et al, Eur. J. Biochem. 1978) … and later tails of other histones
• 95-residue long disordered segment of calcineurin (Kissinger et al, Nature, 1995)
• Cyclin-dependent kinase inhibitor p21Waf1/Cip1/Sdi1 (Kriwacki et al, PNAS, 1996)
• Etc…
Re-assessing structure-function paradigm
Amino Acid Sequence → 3D Structure (Order) → Function
Amino Acid Sequence → Disorder → Function
What is disorder?
Protein regions (or entire proteins) lacking stable secondary and tertiary structure and existing as an ensemble of conformations with dynamically changing Ramachandran angles
Disorder is experimentally detected by
• X-ray crystallography
• NMR spectroscopy
• circular dichroism (CD)
• limited proteolysis (LP)
• hydrodynamic methods
Bracken et al, Curr Opin Struct Biol. 2004, 570; Receveur-Bréchot et al, Proteins, 2006, 24
DISORDER
“I don’t know about hair care, Rapunzel,
but I’m thinking a good cream rinse plus
PROTEIN conditioner might just solve
both our problems.”
Properties of IDRs and IDPs
Compositional bias
[Figure: (DisProt - Order)/Order composition difference per residue, for DisProt 4.9 (2009) and DisProt 3.4 (2006); residues run from order-promoting to disorder-promoting: C W Y I F V L H T N A G D M K R S Q P E]
IDRs: ↓ aromatic and hydrophobic residues, ↑ charged and hydrophilic residues
Dunker et al, 2001, JMGM; Radivojac et al, 2007, Biophys J
Charge-hydropathy plot
Disordered proteins: ↑ net charge, ↓ hydrophobicity
Ordered proteins: ↓ net charge, ↑ hydrophobicity
Uversky et al, 2000, Proteins 41:415-427
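The charge-hydropathy idea can be sketched in a few lines of Python. This is a minimal illustration, not the published implementation: it assumes the Kyte-Doolittle hydropathy scale rescaled to [0, 1], counts only K/R as +1 and D/E as -1 (histidine treated as neutral, an assumption), and uses the commonly quoted boundary line <R> = 2.785<H> - 1.151. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (standard published scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean hydropathy <H>, rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute mean net charge <R>; K/R = +1, D/E = -1 (His neutral: assumption)."""
    charge = sum(1 for aa in seq if aa in "KR") - sum(1 for aa in seq if aa in "DE")
    return abs(charge) / len(seq)


def ch_side(seq):
    """Classify a sequence by which side of the boundary line it falls on."""
    h = mean_hydropathy(seq)
    r = mean_net_charge(seq)
    # Points above the line <R> = 2.785<H> - 1.151 fall on the disordered side.
    return "disordered" if r > 2.785 * h - 1.151 else "ordered"
```

A highly charged, weakly hydrophobic sequence lands on the disordered side of the line, while a strongly hydrophobic, uncharged one lands on the ordered side. Real analyses would apply this to whole proteins rather than to short toy strings.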
Disorder prediction
AA sequence codes for protein structure…
Does AA sequence code for the lack of structure?
Keith Dunker group – first Predictor Of Natural Disordered Regions (PONDR)
• amino acid composition
• sequence complexity
• net charge
• hydrophobicity
• flexibility
• …and other features
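As an illustration of one such input feature, sequence complexity is often approximated by the Shannon entropy of the residues in a sliding window. The sketch below is only that, an approximation for illustration: the window width of 5 and the function names are assumptions, and it does not reproduce how PONDR itself computes its features.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits) of the residue distribution in a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def complexity_profile(seq, width=5):
    """Entropy of every full-length window along the sequence (width is an assumption)."""
    return [shannon_entropy(seq[i:i + width]) for i in range(len(seq) - width + 1)]
```

Low-entropy windows correspond to low-complexity, repetitive segments; a predictor would combine such a profile with composition, charge, hydrophobicity and flexibility features.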
Protein Disorder Predictors
The PONDR-FIT meta-predictor combines several methods. Use it and other predictors at http://www.disprot.org/predictors.php
• PONDR-FIT™: Xue B, Dunbrack RL, Williams RW, Dunker AK, Uversky VN. "PONDR-FIT: A meta-predictor of intrinsically disordered amino acids." Biochim. Biophys. Acta. 2010 (in press), doi:10.1016/j.bbapap.2010.01.011
• DisEMBL™: Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB. "Protein disorder prediction: implications for structural proteomics." Structure. 2003;11(11):1453-9, PMID: 14604535
• DISOPRED2: Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. "Prediction and functional analysis of native disorder in proteins from the three kingdoms of life." J Mol Biol. 2004;337(3):635-45, PMID: 15019783
• DRIPPRED: MacCallum B. "Order/Disorder Prediction With Self Organising Maps." CASP 6 meeting, Online paper
• DISpro: Cheng J, Sweredoski M, Baldi P. "Accurate Prediction of Protein Disordered Regions by Mining Protein Structure Data." Data Mining and Knowledge Discovery. 2005;11(3):213-222, Online paper
• FoldIndex©: Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, Silman I, Sussman JL. "FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded." Bioinformatics. 2005;21(16):3435-8, PMID: 15955783
• GlobPlot 2: Linding R, Russell RB, Neduva V, Gibson TJ. "GlobPlot: Exploring protein sequences for globularity and disorder." Nucleic Acids Res. 2003;31(13):3701-8, PMID: 12824398
• IUPred: Dosztanyi Z, Csizmok V, Tompa P, Simon I. "IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content." Bioinformatics. 2005;21(16):3433-4, PMID: 15955779
• PONDR®: Romero P, Obradovic Z, Li X, Garner EC, Brown CJ, Dunker AK. "Sequence complexity of disordered protein." Proteins. 2001;42(1):38-48, PMID: 11093259
• PreLink: Coeytaux K, Poupon A. "Prediction of unfolded segments in a protein sequence based on amino acid composition." Bioinformatics. 2005;21(9):1891-900, PMID: 15657106
• RONN: Yang ZR, Thomson R, McNeil P, Esnouf RM. "RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins."
PONDRing XPA
XPA-MBD
Structure of the full-length XPA ???
Ikegami et al, 1998, Nat. Struct. Biol.
PONDR in action
Iakoucheva et al, Prot Science 2001
Functional importance
[Figure: XPA domain map with labeled sites: DNA BD, P-site, RPA (two sites), NLS (two sites), ERCC1, TFIIH, DDB2]
Protein-protein interaction sites are mapped to disordered XPA termini
XPA’s phosphorylation site is located in its disordered C-terminus
Putative XPA nuclear localization signals (NLS) are located in disordered regions
Disorder and Functions
• Function: Protein modification
  Description: Phosphorylation, acetylation, glycosylation, methylation, ubiquitination, fatty acylation
  Examples: histones, 4E-BP, CFTR, Bcl-2, neuromodulin, HMG-I(Y), p53
• Function: Molecular recognition
  Description: Protein-DNA, protein-RNA, protein-protein, protein-ligand interactions
  Examples: p53, max, fos, jun, myc, α-synuclein, CDK inhibitors p21, p57, p27, TF
• Function: Macromolecular assembly
  Description: Phages, viruses, bacterial flagellum, ribosome, spliceosome, nuclear pore
  Examples: flagellin, SR proteins, ribosomal proteins, Nups
• Function: Entropic chains
  Description: Flexible linkers, entropic springs, bristles
  Examples: fd g3p, RPA, titin, neurofilament H
Dunker et al, 2002, Biochemistry
Advantages of being disordered
Low-affinity/high-specificity binding
Broad binding diversity
Ability to form large interaction surfaces
Greater capture radius (“fly-casting” mechanism)
Facilitate alternative splicing
Facilitate post-translational modifications
Phos-sites prefer IDRs
DisPhos
http://core.ist.temple.edu/pred/
More kinases target IDPs, and more kinase targets are IDPs!
Gsponer et al, 2008, Science. 322(5906):1365-8
Ubiquitination and disorder
IDPs are susceptible to proteasomal degradation
Unstructured initiation site is required for degradation (Prakash et al, 2004, Nat Struct Mol Biol.)
PEST motifs are disordered (Singh et al, 2006, Proteins)
Low coverage of known Ub sites by PDB
Examples of Ub sites in IDRs (p53, c-myc, cyclin B, securin, p21, p27, p57, α-synuclein, IκBα etc, various authors)
Ub ligases
β-catenin peptide: 15 out of 26 aa are disordered (Wu et al, 2003, Molecular Cell, Vol. 11, 1445–1456)
~60 Å gap
p27 peptide: 14 out of 24 aa are disordered (Hao et al, 2005, Molecular Cell, Vol. 20, 9-19)
Ub sites properties
Identified 145 new Ub sites with MudPIT, mass-spec SILAC and mutant (grr1Δ and cdc34tm) yeast strains to target short-lived proteins
Ub site properties highlighted: negative charge; D and E; K and hydrophobics; disorder; predicted B-factors
[Figure: mean value of each property for Ub sites vs non-Ub sites; categories: net charge, disorder, B-factor, D/E, hydrophobic]
UbPred: http://www.UbPred.org
Radivojac et al, Proteins, 2010
Dynamic disorder of Sic1 bound to Cdc4
Structural Model of the
Dynamic pSic1-Cdc4 Complex
Sic1 contains 9 phosphorylation sites, which interact with Cdc4 in a dynamic equilibrium
Directly interacting residues are transiently ordered, whereas the rest of Sic1 remains disordered even in the complex
The disorder of Sic1 helps to bridge the 64 Å gap between E2 (Cdc34) and the Sic1 bound to Cdc4 for ubiquitin transfer
Mittag et al, Structure, 2010
Molecular recognition
Disordered regions are commonly used for binding to multiple partners…
C-terminus of p53 (Oldfield et al, 2008, BMC Genomics. 9 Suppl 1:S1)
NCBD domain of CBP/p300 (Wright and Dyson, 2009, Curr Opin Struct Biol.)
Mechanisms of binding for IDPs
How do disordered proteins bind to their targets?
• Induced folding (binding, … then folding)
• Conformational selection (folding, … then binding)
• Coupled/synergistic (folding and binding, … or even binding without folding)
  (CFTR R and NBD1 domains, Baker et al, 2007, Nat Struct Mol Biol, 14:738)
Hubs and disorder
Are disordered proteins network hubs?
[Figure: cytoskeletal hubs subnetwork from the S. cerevisiae interactome]
[Figure: yeast PPI network; percentage of proteins (hubs, ends, order) with a predicted disordered region of at least a given length, from >=30 to >=100 aa]
Haynes et al, 2006, PLoS CB
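A plot of this kind (percentage of proteins whose longest predicted disordered region reaches a given length) can be derived from per-residue predictor scores. The sketch below assumes each protein is represented as a list of disorder scores in [0, 1] and that a residue counts as disordered at a score of 0.5 or more; both the cutoff and the function names are assumptions for illustration, not the procedure used in the cited study.

```python
def longest_disordered_run(scores, cutoff=0.5):
    """Length of the longest stretch of consecutive residues scored as disordered."""
    best = current = 0
    for s in scores:
        current = current + 1 if s >= cutoff else 0
        best = max(best, current)
    return best


def percent_with_long_disorder(score_lists, min_length, cutoff=0.5):
    """Percentage of proteins whose longest disordered run is >= min_length."""
    hits = sum(
        1 for scores in score_lists
        if longest_disordered_run(scores, cutoff) >= min_length
    )
    return 100.0 * hits / len(score_lists)


# Toy, made-up score profiles (not real predictions): one protein with a
# 40-residue disordered stretch and one fully ordered protein.
toy = [[0.9] * 40 + [0.1] * 10, [0.1] * 50]
```

Evaluating `percent_with_long_disorder` on separate protein sets (for example hubs versus ends) at thresholds of 30, 40, ... 100 residues would give one curve per set.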
Ordered hubs – disordered partners
14-3-3 proteins – signal transduction, apoptosis, cell cycle, cancer
>200 binding (mostly phosphorylated) targets
Three different predictors indicate that 14-3-3 TARGETS are highly disordered (Bustos and Iglesias, 2006, Proteins, 63:35–42)
Peptides bind to essentially the same region of 14-3-3
Differences in 14-3-3 side chain conformations (e.g. induced fit mechanism)
Peptides are highly hydrated in the bound state (e.g. likely disordered in the unbound state)
Oldfield et al, BMC Genomics, 2008, 9(Suppl 1):S1
Disorder and disease
Individual examples of IDPs/IDRs involved in human diseases:
p53 (cancer), BRCA1 (cancer), α-synuclein (PD, AD, dementia, Down syndrome), amyloid β (AD), tau (AD), prion (TSEs), amylin (Type II diabetes), hirudin and thrombin (CVD), HPV (cancer)…
Human Papillomavirus (HPV)
Increased amount of disorder in E6 proteins from high-risk HPVs
Uversky et al, 2006, JPR, 5 (8), 1829-1842
BRCA1
[Figure: CH plot of BRCA1 fragments and full-length BRCA1, shown against disordered and ordered proteins]
CD and NMR of fragments: all disordered
Mark et al, J Mol Biol. 2005, 345(2):275-87
Are disease proteins more disordered in general?
Disorder and disease
[Figure: percentage of proteins with a predicted disordered region of at least a given length (>=30 to >=100 aa), for cancer-associated proteins, disease-related proteins, typical eukaryotic proteins and ordered proteins]
Disease-related SW keywords are strongly associated with predicted disorder (p>0.95)
Xie et al, 2007, JPR
Disease-associated mutations
Disease mutations impact protein
Structure:
- Folding
- Oligomerization
- Stability
…
Function:
- Post-translational modifications
- Binding to partners
- Intracellular localization
- Activity
…
Many predictors of the functional impact of SNPs are available (SIFT, PolyPhen, SNPs3D etc)
The majority rely on known protein 3D structure and evolutionary conservation
Disease-associated mutations
Disordered regions:
• Do not fold into 3D structure
• Are generally less evolutionarily conserved than ordered regions
Do current predictors make errors in predicting the impact of disease mutations in IDRs?
Do disease mutations even occur in regions of disorder?
Disease-associated mutations
Dataset   Total mutations   IDR mutations     OR mutations
DM        15459             3356 (21.7%)      12103 (78.3%)
Poly      24220             9790 (40.4%)      14430 (59.6%)
NES       60339             26927 (44.6%)     33372 (55.3%)
Disease mutations are prevalent in ORDERED regions
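The percentages on this slide follow directly from the counts, and the same counts give a quick odds ratio for how much less often disease mutations (DM) fall in IDRs than polymorphisms (Poly) do. The sketch below only reworks the numbers shown on the slide; the dictionary layout and function names are hypothetical, and no significance test is attempted.

```python
# Counts copied from the slide: total, IDR and ordered-region (OR) mutations.
COUNTS = {
    "DM": {"total": 15459, "idr": 3356, "ordered": 12103},
    "Poly": {"total": 24220, "idr": 9790, "ordered": 14430},
    "NES": {"total": 60339, "idr": 26927, "ordered": 33372},
}


def percent_in_idr(name):
    """Percentage of a dataset's mutations that fall in disordered regions."""
    row = COUNTS[name]
    return 100.0 * row["idr"] / row["total"]


def idr_odds(name):
    """Odds of a mutation lying in an IDR rather than an ordered region."""
    row = COUNTS[name]
    return row["idr"] / row["ordered"]


def odds_ratio(a, b):
    """Odds ratio of IDR location for dataset a relative to dataset b."""
    return idr_odds(a) / idr_odds(b)
```

An odds ratio below 1 for `odds_ratio("DM", "Poly")` restates the slide's point: relative to polymorphisms, disease mutations are shifted toward ordered regions.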
Disorder-to-Order transition
Dataset   IDR mutations   D→D     D→O     OR mutations   O→O     O→D
DM        3356            80.0%   20.0%   12103          95.1%   4.9%
Poly      9790            88.5%   11.5%   14430          95.1%   4.6%
NES       26927           92.7%   7.3%    33372          94.4%   5.6%
p=1.06E-32
p=5.47E-105
Some disease mutations in disordered regions cause a Disorder→Order transition
(may disrupt disordered structure? induce order?)
D→O and O→D
D→O substitutions (% of D→O disease mutations):
R→W   13.1
R→C   10.3
R→H   7.6
E→K   6.7
R→Q   6.3
Total 44%
O→D substitutions (% of O→D disease mutations):
L→P   11.9
C→R   6.6
G→R   6.1
W→R   4.1
F→S   3.6
Total 32.2%
Arginine is often mutated
[Figure: percentage of all mutated residues (0 to 17%) by residue (C W Y I F V L H T N A G D M K R S Q P E) for the SW_DAMU, SW_POLY, SW_CONTROL and ALL_SW datasets]
Hypothetical mechanism?
Arginine codon   After CpG methylation (C→T)   Change
CGG              TGG                           R→W
CGT              TGT                           R→C
CGC              TGC                           R→C
CGA              TGA                           R→Stop
AGA              AGA                           N/A
AGG              AGG                           N/A
R→W and R→C are among the most frequent mutations in the disease dataset
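The codon table above can be reproduced with a short script. It models only a C→T change at a CpG dinucleotide on the coding strand (methylated cytosine deaminating to thymine), and the lookup table covers just the codons that appear on this slide; the function name is hypothetical.

```python
# Translation for the codons on the slide only ("*" marks a stop codon).
CODON_TO_AA = {
    "CGG": "R", "CGT": "R", "CGC": "R", "CGA": "R", "AGA": "R", "AGG": "R",
    "TGG": "W", "TGT": "C", "TGC": "C", "TGA": "*",
}

ARGININE_CODONS = ["CGG", "CGT", "CGC", "CGA", "AGA", "AGG"]


def cpg_transition(codon):
    """Apply C->T at a CpG inside the codon; return (mutant codon, amino acid) or None."""
    i = codon.find("CG")
    if i == -1:
        return None  # no CpG in this codon, so this mechanism does not apply
    mutant = codon[:i] + "T" + codon[i + 1:]
    return mutant, CODON_TO_AA[mutant]


results = {codon: cpg_transition(codon) for codon in ARGININE_CODONS}
```

Four of the six arginine codons start with CG, which is why this mechanism would hit arginine so often; AGA and AGG contain no CpG and are unaffected.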
Disease Models
Disorder-centric vs structure-centric view of disease mutations
IDRs summary
Proteins can carry intrinsically disordered regions
These regions can be predicted from sequence
IDRs perform important functional roles: PTMs, molecular recognition, involvement in diseases
Disease mutations could occur in IDRs, and OR and IDR mutations could lead to diseases via different mechanisms
Acknowledgements
Rockefeller University:
Jurg Ott
Chad Haynes
Fei Ji
Indiana University
Predrag Radivojac
Mark Goebl
Keith Dunker
Columbia University:
Vladimir Vacic
PNNL:
Eric Ackerman
Richard D. Smith
Funding:
Disordered Proteins Database DisProt
http://www.DisProt.org
List of Disorder Predictors
http://www.disprot.org/predictors.php
Phos Sites Predictor DisPhos
http://www.ist.temple.edu/disphos/
Ub Sites Predictor UbPred
http://www.ubpred.org/
Contact: Lilia Iakoucheva
Prevalence of IDPs in nature
Kingdom     # Genomes   % Sequences L > 40*   % Sequences L > 40**
Bacteria    22          7 - 33                16 - 45
Archaea     7           9 - 37                21 - 51
Eukaryota   5           35 - 51               52 - 67
* VL-XT Predictor, order ~ 78%, disorder ~ 65%
** VL2 Predictor, order ~ 83%, disorder ~ 75%