Binding equilibrium in Protein


Propagation of noise and
perturbations
in protein binding networks
Sergei Maslov
Brookhaven National Laboratory
Experimental interaction data are binary instead of
graded → it is natural to study topology


Very heterogeneous number of binding partners (degree)

One large cluster containing ~80% proteins

Perturbations were analyzed from a purely topological standpoint
Ultimately one wants to quantify the equilibrium and
dynamics: time to go beyond topology!

Law of Mass Action equilibrium





dD_AB/dt = r_AB^(on) F_A F_B − r_AB^(off) D_AB
In equilibrium D_AB = F_A F_B / K_AB, where the
dissociation constant K_AB = r_AB^(off) / r_AB^(on)
has units of concentration
Total concentration = free concentration +
bound concentration → C_A = F_A + F_A F_B / K_AB →
F_A = C_A / (1 + F_B / K_AB)
In a network F_i = C_i / (1 + Σ_j F_j / K_ij), with the sum over neighbors j of i
Can be numerically solved by iterations
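
The iteration mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the dictionary-based data layout, the starting point F_i = C_i, and the stopping rule are all assumptions made for the example. Only heterodimers (i ≠ j) are handled.

```python
def solve_free_concentrations(total, K, tol=1e-12, max_iter=100000):
    """Iterate F_i = C_i / (1 + sum_j F_j / K_ij) until updates fall below tol.

    total: dict protein -> total concentration C_i (must be > 0)
    K:     dict frozenset({i, j}) -> dissociation constant K_ij,
           in the same units as the concentrations (hypothetical layout)
    """
    # Build neighbor lists once from the edge dictionary.
    neighbors = {p: [] for p in total}
    for pair, k in K.items():
        i, j = tuple(pair)
        neighbors[i].append((j, k))
        neighbors[j].append((i, k))

    free = dict(total)  # start from F_i = C_i (nothing bound yet)
    for _ in range(max_iter):
        new = {i: total[i] / (1.0 + sum(free[j] / k for j, k in neighbors[i]))
               for i in total}
        # Largest update, measured relative to the total concentration.
        change = max(abs(new[i] - free[i]) / total[i] for i in total)
        free = new
        if change < tol:
            break
    return free


def bound_concentrations(free, K):
    """D_ij = F_i * F_j / K_ij for every edge."""
    result = {}
    for pair, k in K.items():
        i, j = tuple(pair)
        result[pair] = free[i] * free[j] / k
    return result
```

For a single pair A, B the result can be checked against the quadratic that follows from C_A = F_A + F_A F_B / K_AB; in a network, F_i + Σ_j D_ij should reproduce C_i for every protein.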
What is needed to model?

• A reliable network of reversible (non-catalytic) protein-protein binding interactions
  CHECK! e.g. physical interactions between yeast proteins in the BIOGRID database with 2 or more citations. Most are reversible: e.g. only 5% involve a kinase
• Total concentrations Ci of all proteins
  CHECK! genome-wide data for yeast in 3 Nature papers (2003, 2006) by the group of J. Weissman @ UCSF.
  VERY BROAD distribution: Ci ranges between 50 and 10^6 molecules/cell
  Left us with 1700 yeast proteins and ~5000 interactions
• in vivo dissociation constants Kij
  OOPS! High throughput experimental techniques are not there yet
  Let’s hope it doesn’t matter




The overall binding strength from the PINT database:
<1/Kij>=1/(5nM). In yeast: 1nM ~ 34 molecules/cell
Simple-minded assignment Kij=const=10nM
(also tried 1nM, 100nM and 1000nM)
Evolutionary-motivated assignment:
Kij=max(Ci,Cj)/20: Kij is only as small
as needed to ensure binding given
Ci and Cj
All assignments of a given average strength give
ROUGHLY THE SAME RESULTS
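
The two assignment rules above can be written down in a few lines. This is a sketch only: the function names and the edge-list layout are hypothetical, and the conversion factor is the slide's own estimate of 1 nM ~ 34 molecules/cell in yeast.

```python
MOLECULES_PER_NM = 34.0  # from the slide: in yeast 1 nM ~ 34 molecules/cell


def nm_to_molecules(conc_nm):
    """Convert a concentration in nM to molecules/cell."""
    return conc_nm * MOLECULES_PER_NM


def constant_K(edges, k_nm=10.0):
    """Simple-minded assignment: the same K_ij (in nM) on every edge.

    Returns K_ij in molecules/cell so it can be compared with C_i directly.
    """
    return {frozenset(edge): nm_to_molecules(k_nm) for edge in edges}


def evolutionary_K(edges, total, factor=20.0):
    """Evolutionary-motivated assignment: K_ij = max(C_i, C_j) / 20."""
    return {frozenset((i, j)): max(total[i], total[j]) / factor
            for i, j in edges}
```

With made-up abundances C_A = 2000 and C_B = 500 molecules/cell, the second rule gives K_AB = 100 molecules/cell, while the constant 10 nM rule gives 340 molecules/cell for every pair.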
Robustness with respect to
assignment of Kij
Bound concentrations: Dij
Spearman rank correlation: 0.89
Pearson linear correlation: 0.98
Free concentrations: Fi
Spearman rank correlation: 0.89
Pearson linear correlation: 0.997
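
For reference, the two correlation measures quoted above can be computed as in the sketch below, given two lists of Dij (or Fi) values obtained with two different Kij assignments. The slides do not say whether the values were log-transformed first, so that choice is left to the caller; the helper names are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

The rank correlation is insensitive to any monotonic rescaling of the concentrations, whereas the linear correlation is dominated by the most abundant proteins when the distribution is as broad as the one described here.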
Numerical study of
propagation of perturbations




We simulate a twofold increase of the
abundance C0 of just one protein
Proteins with equilibrium free
concentrations Fi changing by >20% are
significantly perturbed
We refer to such proteins i as
concentration-coupled to the protein 0
Look for cascading perturbations
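
A toy version of this numerical experiment is sketched below. The four-protein chain, its abundances, and its dissociation constants are made up for illustration; the fixed-point solver and the function names are assumptions, not the authors' implementation. The twofold increase and the 20% threshold are taken from the slide.

```python
def solve_free(total, edges, tol=1e-12, max_iter=100000):
    """Fixed-point iteration for F_i = C_i / (1 + sum_j F_j / K_ij).

    edges: dict (i, j) -> K_ij, one entry per interacting pair.
    """
    nbrs = {p: [] for p in total}
    for (i, j), k in edges.items():
        nbrs[i].append((j, k))
        nbrs[j].append((i, k))
    free = dict(total)
    for _ in range(max_iter):
        new = {i: total[i] / (1.0 + sum(free[j] / k for j, k in nbrs[i]))
               for i in total}
        done = max(abs(new[i] - free[i]) / total[i] for i in total) < tol
        free = new
        if done:
            break
    return free


def concentration_coupled(total, edges, source, fold=2.0, threshold=0.2):
    """Multiply C_source by `fold`; return relative changes of F_i and the
    set of proteins whose free concentration changes by more than threshold."""
    before = solve_free(total, edges)
    perturbed = dict(total)
    perturbed[source] *= fold
    after = solve_free(perturbed, edges)
    rel = {i: after[i] / before[i] - 1.0 for i in total if i != source}
    coupled = {i for i, r in rel.items() if abs(r) > threshold}
    return rel, coupled


# Toy chain P0 - P1 - P2 - P3 with made-up numbers (molecules/cell).
total = {"P0": 1000.0, "P1": 1000.0, "P2": 1000.0, "P3": 1000.0}
edges = {("P0", "P1"): 50.0, ("P1", "P2"): 50.0, ("P2", "P3"): 50.0}
rel, coupled = concentration_coupled(total, edges, "P0")
for name in ("P1", "P2", "P3"):
    print(name, round(rel[name], 3), name in coupled)
```

In this toy chain the sign of the change alternates from one neighbor to the next and its magnitude shrinks with distance from P0, which is the cascading behavior the scan looks for in the real network.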
Resistor network analogy




Conductivities ij – dimer (bound)
concentrations Dij
Losses to the ground iG – free (unbound)
concentrations Fi
Electric potentials – relative changes in free
concentrations (-1)L Fi/Fi
Injected current – initial perturbation C0
SM, K. Sneppen, I. Ispolatov, arxiv.org/abs/q-bio.MN/0611026;
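
One way to see where this mapping comes from is to linearize the mass-conservation equation for small perturbations. The sketch below is a reconstruction under that small-perturbation assumption; the symbols \phi_i, \psi_i and L_i are notation introduced here, and the sign trick in the last step works exactly only when every neighbor of i sits one step closer to or farther from protein 0 (no odd-length loops).

```latex
% Mass conservation for protein i, with D_{ij} = F_i F_j / K_{ij}:
C_i = F_i + \sum_j D_{ij}

% Small perturbation, with \phi_i \equiv \delta F_i / F_i
% and therefore \delta D_{ij} = D_{ij} (\phi_i + \phi_j):
\delta C_i = F_i \phi_i + \sum_j D_{ij} (\phi_i + \phi_j)

% Let L_i be the network distance from protein 0 and
% \psi_i \equiv (-1)^{L_i} \phi_i. If L_j = L_i \pm 1 for all neighbors j:
(-1)^{L_i} \delta C_i = F_i \psi_i + \sum_j D_{ij} (\psi_i - \psi_j)
```

The last line has the form of Kirchhoff's current law: the injected current (nonzero only at protein 0) equals the current lost to the ground through the conductance F_i plus the currents flowing to neighbors through the conductances D_ij.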
What did we learn from this
mapping?





The magnitude of perturbations decays exponentially
with the network distance (current is
divided over exponentially many links)
Perturbations tend to propagate along highly
abundant heterodimers (large σ_ij)
F_i/C_i has to be low to avoid “losses to the ground”
Perturbations flow down the gradient of C_i
Odd-length loops dampen the perturbations by
confusing (-1)^L δF_i/F_i
Exponential decay of perturbations
[Figure legend: O – real; S – reshuffled; D – best propagation. Labeled protein: HHT1]
SM, I. Ispolatov, PNAS in press (2007)
What conditions make some long chains good conduits for propagation of concentration perturbations while suppressing it along side branches?

Perturbations propagate along dimers with large concentrations

They cascade down the concentration gradient and are thus directional

Free concentrations of intermediate proteins are low
SM, I. Ispolatov, PNAS in press (2007)
Implications of our results
Cross-talk via small-world
topology is suppressed, but…




Good news: on average perturbations via
reversible binding rapidly decay
Still, the absolute number of concentration-coupled proteins is large
In response to external stimuli levels of
several proteins could be shifted. Cascading
changes from these perturbations could
either cancel or magnify each other.
Our results could be used to extend the list of
perturbed proteins measured e.g. in
microarray experiments
Intra-cellular noise




Noise is measured for total concentrations Ci
(Newman et al. Nature (2006))
Needs to be converted into biologically relevant
bound (Dij) or free (Fi) concentrations
Different results for intrinsic and extrinsic
noise
Intrinsic noise could be amplified (sometimes
as much as 30 times!)
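
The difference between the two kinds of noise can be illustrated on a single heterodimer. The sketch below is a toy Monte Carlo, not the analysis behind the slide: the abundances, the dissociation constant, the 10% Gaussian noise, and the modeling of "intrinsic" as independent and "extrinsic" as fully shared fluctuations are all assumptions made for the example.

```python
import random
from math import sqrt


def free_A(cA, cB, K):
    """Exact free concentration of A for one heterodimer A + B <-> AB.

    Positive root of F^2 + (cB - cA + K) F - K cA = 0, which follows from
    cA = F_A + F_A F_B / K and cB = F_B + F_A F_B / K.
    """
    b = cB - cA + K
    return 0.5 * (-b + sqrt(b * b + 4.0 * K * cA))


def cv(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return sqrt(var) / mean


def simulate(mean_A=1100.0, mean_B=1000.0, K=10.0, noise=0.1, n=20000, seed=0):
    """Return (CV of F_A with independent noise, CV of F_A with shared noise)."""
    rng = random.Random(seed)
    intrinsic, extrinsic = [], []
    for _ in range(n):
        # Intrinsic-like: the two totals fluctuate independently.
        a = mean_A * (1.0 + noise * rng.gauss(0.0, 1.0))
        b = mean_B * (1.0 + noise * rng.gauss(0.0, 1.0))
        intrinsic.append(free_A(a, b, K))
        # Extrinsic-like: one shared factor scales both totals together.
        s = 1.0 + noise * rng.gauss(0.0, 1.0)
        extrinsic.append(free_A(mean_A * s, mean_B * s, K))
    return cv(intrinsic), cv(extrinsic)


cv_intrinsic, cv_extrinsic = simulate()
print("input CV: 0.1")
print("CV of free A, independent noise:", round(cv_intrinsic, 2))
print("CV of free A, shared noise:", round(cv_extrinsic, 2))
```

When two tightly binding partners have similar abundances, the free concentration is roughly the difference of two large numbers, so independent fluctuations are strongly amplified, while fluctuations shared by both partners largely cancel.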
Could it be used for
regulation and signaling?


3-step chains exist in bacteria: anti-anti-sigma-factors → anti-sigma-factors → sigma-factors → RNA polymerase
Many proteins we find at the receiving end of
our long chains are global regulators (protein
degradation by ubiquitination, global
transcriptional control, RNA degradation, etc.)


Other (catalytic) mechanisms spread perturbations
even further
Feedback control of the overall protein abundance?
Future work
Kinetics
• How quickly is the equilibrium approached and restored?
• Dynamical aspects of noise
Non-specific vs specific
• How do specific interactions peacefully coexist with many non-specific ones?
Iaroslav Ispolatov
Research scientist
Ariadne Genomics
Kim Sneppen
NBI, Denmark
THE END
Genetic interactions


Propagation of concentration
perturbations is behind many genetic
interactions e.g. of the “dosage rescue”
type
We found putative “rescued” proteins
for 136 out of 772 such pairs (18% of
the total, P-value 10^-216)
SM, I. Ispolatov, PNAS in press (2007)
Genome-wide
protein binding networks



S. cerevisiae curated PPI network used in our study
Nodes - proteins
Edges - protein-protein bindings
Experimental data are binary while real interactions are graded → one deals only with topology
Going beyond topology and
modeling the binding
equilibrium and
propagation of perturbations
SM, K. Sneppen, I. Ispolatov, arxiv.org/abs/q-bio.MN/0611026;
SM, I. Ispolatov, PNAS in press (2007)
[Figure: histograms of total concentration Ci, bound concentrations Dij, and free concentration Fi for Kij=max(Ci,Cj)/20. Horizontal axis: concentration (molecules/cell), 10^0 to 10^6; vertical axis: histogram, 10^0 to 10^4.]
Indiscriminate cross-talk is
suppressed
What did we learn from
topology?
1. Broad distribution of the degree K of individual nodes
2. Degree-degree correlations and high clustering
3. Small-world property: most proteins are in the same cluster and are separated by a short distance (follows from 1. for <K^2>/<K> > 2)
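
The condition <K^2>/<K> > 2 in item 3 is the Molloy-Reed criterion for the existence of a giant connected component in a random network with a given degree sequence. A minimal sketch of how one might evaluate it from an edge list follows; the function names and the toy inputs are illustrative only.

```python
def degrees_from_edges(edges):
    """Count the number of binding partners (degree K) of every node."""
    degree = {}
    for i, j in edges:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return degree


def molloy_reed_ratio(degrees):
    """Return <K^2>/<K> for a list of node degrees."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    return mean_k2 / mean_k


# Toy example: a hub with four partners versus a three-node chain.
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
chain = [("a", "b"), ("b", "c")]
star_ratio = molloy_reed_ratio(list(degrees_from_edges(star).values()))
chain_ratio = molloy_reed_ratio(list(degrees_from_edges(chain).values()))
```

A broad degree distribution inflates <K^2> relative to <K>, which is why item 1 by itself pushes the ratio above the threshold.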
Protein binding networks
have small-world property
86% of proteins could be connected
83% in this plot
S. cerevisiae
Large-scale Y2H experiment
Curated dataset used in our study
Why does small-world matter?



Claims of “robustness” of this network
architecture come from studies of the Internet
where breaking up the network is undesirable
For PPI networks it is the OPPOSITE:
interconnected pathways are prone to
undesirable cross-talk
In a small-world network equilibrium
concentrations of all proteins in the same
component are coupled to each other
[Figure: network diagram with node groups labeled: RNA polymerase II; mRNA polyadenylation; protein sumoylation; G2/M transition of cell cycle; unfolded protein binding; mRNA, protein, rRNA export from nucleus; RNA polymerase I, III; 35S primary transcript processing; protein phosphatase type 2A. Additional labels: 3, 2, 1]
Propagation to 3rd neighbors
HSP82 → SSA1 → KAP95 → NUP60: -1.13
SSA2 → HSP82 → SSA1 → KAP95: -1.51
HSC82 → CPR6 → RPD3 → SAP30: -1.20
SSA2 → HSP82 → SSA1 → MTR10: -1.57
CDC55 → PPH21 → SDF1 → PPH3: -2.42
CDC55 → PPH21 → SDF1 → SAP4: -2.42
PPH22 → SDF1 → PPH21 → RTS1: -1.18
• Only 7 pairs in the DIP core network
• But in Krogan et al. dataset there are 84 pairs at d=3, 17 pairs at d=4, and 1 pair at d=5 (sic!). Total=102
• Reshuffled concentrations, same network: Total=16
CDC55| 2155 | 8600 | 1461 | protein biosynthesis* | protein phosphatase type 2A activity |
CPR6| 4042 | 18600 | 114 | protein folding | unfolded protein binding* |
HSC82| 4635 | 132000 | 4961 | telomere maintenance* | unfolded protein binding* |
HSP82| 6014 | 445000 | 115 | response to stress* | unfolded protein binding* |
KAP95| 4176 | 51700 | 41 | protein import into nucleus | protein carrier activity |
MTR10| 5535 | 6340 | 6 | protein import into nucleus* | nuclear localization sequence binding |
NUP60| 102 | 4590 | 1693 | telomere maintenance* | structural constituent of nuclear pore |
PPH21| 874 | 5620 | 95 | protein biosynthesis* | protein phosphatase type 2A activity |
PPH22| 930 | 4110 | 72 | protein biosynthesis* | protein phosphatase type 2A activity |
PPH3| 1069 | 2840 | 200 | protein amino acid dephosphorylation* | protein phosphatase type 2A activity |
RPD3| 5114 | 3850 | 269 | chromatin silencing at telomere* | histone deacetylase activity |
RTS1| 5389 | 300 | 80 | protein biosynthesis* | protein phosphatase type 2A activity |
SAP30| 4714 | 704 | 80 | telomere maintenance* | histone deacetylase activity |
SAP4| 2195 | 279 | 20 | G1/S transition of mitotic cell cycle | protein serine/threonine phosphatase activity |
SDF1| 6101 | 5710 | 451 | signal transduction | molecular function unknown |
SSA1| 33 | 269000 | 40441 | translation* | ATPase activity* |
SSA2| 3780 | 364000 | 83250 | response to stress* | ATPase activity* |
Propagation to 4th neighbors in Krogan nc
'RPS10A' 'SPC72' [ 1.4732]
'SEC27' 'URA7' [ 1.2557]
'HTB2' 'YBR273C' [ 1.3774]
'HTB2' 'TUP1' [ 1.2796]
'RPS10A' 'AIR2' [ 2.3619]
'HTB2' 'UFD2' [ 1.3717]
'HTB2' 'YDR049W' [ 1.3645]
'HTB2' 'PLO2' [ 1.2640]
'HTB2' 'YDR330W' [ 1.3774]
'RPN1' 'GAT1' [ 1.4277]
'HTB2' 'YFL044C' [ 1.3774]
'SEC27' 'STT3' [-1.2321]
'GIS2' 'STT3' [ 1.3437]
'HTB2' 'YGL108C' [ 1.3774]
'HTB2' 'UFD1' [ 1.3744]
'RPS10A' 'AIR1' [ 2.3833]
'HTB2' 'FBP1' [ 1.3576]
'HTB2' 'YMR067C' [ 1.3510]
AIR1| 2889 | mRNA export from nucleus* | molecular function unknown | nucleus*
AIR2| 916 | mRNA export from nucleus* | molecular function unknown | nucleus*
FBP1| 4207 | gluconeogenesis | fructose-bisphosphatase activity | cytosol
GAT1| 1857 | transcription initiation from RNA polymerase II promoter* | specific RNA polymerase II transcription factor activity* | nucleus*
GIS2| 5039 | intracellular signaling cascade | molecular function unknown | cytoplasm
HTB2| 136 | chromatin assembly or disassembly | DNA binding | nuclear nucleosome
PLO2| 1291 | telomere maintenance* | histone deacetylase activity | nucleus*
RPN1| 2608 | ubiquitin-dependent protein catabolism | endopeptidase activity* | cytoplasm*
RPS10A| 5667 | translation | structural constituent of ribosome | cytosolic small ribosomal subunit (sensu Eukaryota)
SEC27| 2102 | ER to Golgi vesicle-mediated transport* | molecular function unknown | COPI vesicle coat
SPC72| 78 | mitotic sister chromatid segregation* | structural constituent of cytoskeleton | outer plaque of spindle pole body
STT3| 1987 | protein amino acid N-linked glycosylation | dolichyl-diphosphooligosaccharide-protein glycotransferase activity | oligosaccharyl transferase c.
TUP1| 710 | negative regulation of transcription* | general transcriptional repressor activity | nucleus
UFD1| 2278 | ubiquitin-dependent protein catabolism* | protein binding | endoplasmic reticulum
UFD2| 932 | response to stress* | ubiquitin conjugating enzyme activity | cytoplasm*
URA7| 174 | phospholipid biosynthesis* | CTP synthase activity | cytosol
YBR273C| 534 | ubiquitin-dependent protein catabolism* | molecular function unknown | endoplasmic reticulum*
YDR049W| 1043 | biological process unknown | molecular function unknown | cytoplasm*
YDR330W| 1328 | ubiquitin-dependent protein catabolism | molecular function unknown | cytoplasm*
YFL044C| 1880 | protein deubiquitination* | ubiquitin-specific protease activity | cytoplasm*
YGL108C| 2073 | biological process unknown | molecular function unknown | cellular component unknown
YMR067C| 4506 | ubiquitin-dependent protein catabolism* | molecular function unknown | cytoplasm*
Weight of links


Perturbations sign-alternate
Σ_j D_ij/C_i = 1 − F_i/C_i < 1
thus perturbations always decay
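
The link between these weights and the decay can be made explicit with a short linear-response argument. This is a sketch valid for small perturbations; \phi_i \equiv \delta F_i / F_i is notation introduced here.

```latex
% Linearized mass conservation C_i = F_i + \sum_j D_{ij}, D_{ij} = F_i F_j / K_{ij}:
%   \delta C_i = F_i \phi_i + \sum_j D_{ij} (\phi_i + \phi_j)
% For a protein whose total concentration is unchanged (\delta C_i = 0):
\phi_i = -\sum_j \frac{D_{ij}}{C_i}\, \phi_j ,
\qquad
\sum_j \frac{D_{ij}}{C_i} = \frac{C_i - F_i}{C_i} = 1 - \frac{F_i}{C_i} < 1

% Hence the response of protein i is smaller than that of its
% most strongly perturbed neighbor, and of opposite sign on a chain:
|\phi_i| \le \Bigl(1 - \frac{F_i}{C_i}\Bigr) \max_j |\phi_j| < \max_j |\phi_j|
```

The minus sign gives the sign alternation, and the link weights D_ij/C_i summing to less than one gives the decay; the smaller F_i/C_i is, the less is lost at each step.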
Resistor network analogy
• j~Fj/Fj – potentials, Dij , Fj , Ci –currents
• Dij – conductivity between interacting nodes
• Fi – shunt conductivity to the ground
<1/Kd>=1/5.2nM
close to our choice
of 10nM
Data from PINT database (Kumar and Gromiha, NAR 2006)
How much data is out there?
Species | Set | nodes | edges | # of sources
S.cerevisiae | HTP-PI | 4,500 | 13,000 | 5
S.cerevisiae | LC-PI | 3,100 | 20,000 | 3,100
D.melanogaster | HTP-PI | 6,800 | 22,000 | 2
C.elegans | HTP-PI | 2,800 | 4,500 | 1
H.sapiens | LC-PI | 6,400 | 31,000 | 12,000
H.sapiens | HTP-PI | 1,800 | 3,500 | 2
H. pylori | HTP-PI | 700 | 1,500 | 1
P. falciparum | HTP-PI | 1,300 | 2,800 | 1
Breakup by experimental technique in yeast
BIOGRID database, S. cerevisiae
Affinity Capture-Mass Spec | 28172
Affinity Capture-RNA | 55
Affinity Capture-Western | 5710
Co-crystal Structure | 107
FRET | 43
Far Western | 41
Two-hybrid | 11935
Total | 46063
Sprinzak et al., JMB, 327:919-923, 2003
TAPMass-Spec
Yeast 2-hybrid
Christian von Mering, Roland Krause, Berend Snel, Michael Cornell, Stephen G. Oliver, Stanley Fields & Peer Bork
Nature, vol. 417, 399-403, 23 May 2002