Intersubunit contacts are often facilitated by specificity


Intersubunit contacts are often facilitated
by specificity-determining positions
Computational identification of protein
positions that possibly account for precise
recognition of the interaction partner


Abundance of sequence data
Little experimental information on protein
function
=> annotation by homology

Even less information on protein specificity
=> prediction of specificity-determining
positions (SDPs)
SDP (Specificity-Determining Position)

Alignment position that is conserved
within groups of proteins having the
same specificity (specificity groups) but
differs between them
SDP is not
equivalent to a
functionally
important
position!
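A minimal sketch of the definition above, taken in its strictest reading: a column is an SDP if every specificity group is fully conserved and no two groups share a residue. The function name and the toy columns are hypothetical. SDPpred itself uses a statistical score rather than this all-or-nothing test.

```python
def is_strict_sdp(column_by_group):
    """Return True if each group is conserved and all groups differ.

    column_by_group maps a specificity group name to the residues found
    in one alignment column for the sequences of that group.
    """
    residues = []
    for column in column_by_group.values():
        if len(set(column)) != 1:      # not conserved within the group
            return False
        residues.append(column[0])
    return len(set(residues)) == len(residues)  # groups must differ


# Conserved within groups, different between them: an SDP.
sdp_like = is_strict_sdp({"AQP": "GGG", "GLP": "DDD"})
# Conserved everywhere: possibly functionally important, but not an SDP.
fully_conserved = is_strict_sdp({"AQP": "GGG", "GLP": "GGG"})
# Variable inside one group: not an SDP.
variable = is_strict_sdp({"AQP": "GAG", "GLP": "DDD"})
```

The second case illustrates why an SDP is not the same thing as a functionally important position: a residue conserved across the whole family carries no information about specificity.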
What can we infer from SDPs?



Targets for protein functional redesign
Specificity signature
Sites of protein-protein interaction
Talk overview



SDPpred, an algorithm for identification of
SDPs
A studied example:
isocitrate/isopropylmalate dehydrogenases
Link to PPI
SDPpred
Multiple protein
alignment divided into
specificity groups
=== AQP ===
%sp|Q9L772|AQPZ_BRUME
-------------------------------------mlnklsaeffgtfwlvfggcgsa
ilaa--afp-------elgigflgvalafgltvltmayavggisg--ghfnpavslgltv
iiilgsts------------------------------slap-----------------qlwlfwvaplvgavigaiiwkgllgrd-------------------------------------…
=== GLP ===
%sp|P11244|GLPF_ECOLI
----------------------------msqt---stlkgqciaeflgtglliffgvgcv
aalkvag---------a-sfgqweisviwglgvamaiyltagvsg--ahlnpavtialwl
glilaltd------------------------------dgn--------------g-vpr
-flvplfgpivgaivgafayrkligrhlpcdicvveek--etttpseqkasl------------…
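A sketch of how an input in the layout shown above could be read, assuming (from this excerpt only) that a `=== NAME ===` line opens a specificity group, a line starting with `%` names a sequence, and the following lines are chunks of that aligned sequence. The function name is hypothetical and the real SDPpred input rules may differ.

```python
def parse_grouped_alignment(text):
    """Parse a grouped alignment into {group: {sequence_id: aligned_sequence}}."""
    groups = {}
    group = None
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("===") and line.endswith("==="):
            group = line.strip("= ")          # "=== AQP ===" -> "AQP"
            groups[group] = {}
            name = None
        elif line.startswith("%"):
            if group is None:
                raise ValueError("sequence found before any group header")
            name = line[1:]
            groups[group][name] = ""
        elif name is not None:
            groups[group][name] += line       # join the sequence chunks
    return groups


# Toy input (hypothetical identifiers and residues).
toy = """=== AQP ===
%seqA
--ml
nk-s
=== GLP ===
%seqB
ms-q
tst-
"""
parsed = parse_grouped_alignment(toy)
```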
SDPpred
SDPs: positions best
discriminating between
specificity groups
What is in the black box: the algorithm

Mutual information Ip reflects the extent to which an alignment position
tends to be an SDP.

$$I_p = \sum_{i} \sum_{\alpha} f_p(\alpha, i) \log \frac{f_p(\alpha, i)}{f_p(\alpha)\, f(i)}$$

where the sums run over all specificity groups $i$ and all amino acids $\alpha$, and
$f_p(\alpha, i)$ - ratio of occurrences of amino acid $\alpha$ in group $i$ in position $p$ to the height of the alignment column
$f_p(\alpha)$ - frequency of amino acid $\alpha$ in position $p$
$f(i)$ - fraction of proteins in group $i$
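A minimal sketch of the mutual information of one alignment column, computed from raw (unsmoothed) frequencies. The function name is hypothetical, the natural logarithm is an assumption (the slide does not state the base), and gaps are treated like any other symbol.

```python
import math
from collections import Counter


def mutual_information(column, groups):
    """Mutual information between residue and group label in one column.

    column: one residue per sequence; groups: the matching group labels.
    """
    n = len(column)                          # height of the alignment column
    joint = Counter(zip(column, groups))     # counts of (amino acid, group)
    aa_counts = Counter(column)
    group_counts = Counter(groups)
    total = 0.0
    for (aa, grp), count in joint.items():
        f_joint = count / n
        f_aa = aa_counts[aa] / n
        f_grp = group_counts[grp] / n
        total += f_joint * math.log(f_joint / (f_aa * f_grp))
    return total


labels = ["g1"] * 3 + ["g2"] * 3
perfect = mutual_information(list("AAAVVV"), labels)    # residue tracks group
conserved = mutual_information(list("AAAAAA"), labels)  # no group signal
```

With two equal groups, a perfectly segregating column reaches log 2, while a fully conserved column scores zero.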
Statistical significance of Ip.

Expected mutual information Ipexp of an alignment column.
Z-score:

$$Z_p = \frac{I_p - I_p^{exp}}{\sigma(I_p^{exp})}$$

(Mirny & Gelfand, 2002, J Mol Biol, 321(1))
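A sketch of the Z-score for one column, assuming the expected mutual information and its standard deviation are estimated by shuffling the residues of the column among the sequences. The shuffling scheme, the number of shuffles and the function names are assumptions made for illustration; the cited paper may define the expectation differently.

```python
import math
import random
from collections import Counter


def mutual_information(column, groups):
    """Mutual information between residue and group label (natural log)."""
    n = len(column)
    joint = Counter(zip(column, groups))
    aa_counts = Counter(column)
    group_counts = Counter(groups)
    return sum(
        (c / n) * math.log((c / n) / ((aa_counts[a] / n) * (group_counts[g] / n)))
        for (a, g), c in joint.items()
    )


def z_score(column, groups, n_shuffles=1000, seed=0):
    """(I_p - mean of shuffled I) / standard deviation of shuffled I."""
    rng = random.Random(seed)
    observed = mutual_information(column, groups)
    shuffled = list(column)
    samples = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)                # breaks the residue/group link
        samples.append(mutual_information(shuffled, groups))
    mean = sum(samples) / n_shuffles
    var = sum((s - mean) ** 2 for s in samples) / (n_shuffles - 1)
    if var == 0.0:                           # e.g. a fully conserved column
        return 0.0
    return (observed - mean) / math.sqrt(var)


labels = ["g1"] * 6 + ["g2"] * 6
z_sdp = z_score(list("AAAAAAVVVVVV"), labels)
z_mixed = z_score(list("AVAVAVAVAVAV"), labels)
z_conserved = z_score(list("AAAAAAAAAAAA"), labels)
```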

Are 5 SDPs with Z-score >10.5 better than 10 SDPs with Z-score >9.0?
Bernoulli estimator for selection of proper number of SDPs
With the Z-scores sorted in decreasing order, $Z_1 \ge Z_2 \ge \dots \ge Z_n$:

$$k^* = \arg\min_k P(\text{there are at least } k \text{ observed Z-scores } Z \ge Z_k) = \arg\min_k \left( 1 - \sum_{i=n-k+1}^{n} C_n^i\, q^i\, p^{\,n-i} \right)$$

$$p = P(Z \ge Z_k) = \frac{1}{\sqrt{2\pi}} \int_{Z_k}^{\infty} \exp(-Z^2/2)\, dZ, \qquad q = 1 - p$$
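A sketch of the cutoff selection, assuming Z-scores follow a standard normal distribution under the null model. For each rank k it evaluates the probability that at least k of the n scores reach the k-th largest one, and it picks the k with the smallest probability. The binomial upper tail is summed directly in log space, which equals the complement form on the slide but avoids rounding to zero for large Z-scores. Function names, the clamping constants and the toy scores are hypothetical.

```python
import math


def normal_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def log_prob_at_least(k, n, p):
    """log P(at least k of n independent scores exceed the threshold)."""
    p = min(max(p, 1e-300), 1.0 - 1e-15)     # keep the logarithms finite
    logs = [
        math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        + j * math.log(p) + (n - j) * math.log1p(-p)
        for j in range(k, n + 1)
    ]
    top = max(logs)                          # log-sum-exp for stability
    return top + math.log(sum(math.exp(v - top) for v in logs))


def select_cutoff(z_scores):
    """Return (k*, Z-score of the k*-th best position)."""
    z = sorted(z_scores, reverse=True)
    n = len(z)
    best_k = min(
        range(1, n + 1),
        key=lambda k: log_prob_at_least(k, n, normal_tail(z[k - 1])),
    )
    return best_k, z[best_k - 1]


# Toy scores: three clear outliers followed by background values.
scores = [12.0, 11.5, 11.0, 1.0, 0.5, 0.2, -0.3, -1.0, -0.5, 0.1]
k_star, cutoff = select_cutoff(scores)
```

On the toy scores the three outliers are selected together, because observing three scores that high by chance is less likely than observing only the single best one.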
Smoothed amino acid frequencies: a leucine is more a methionine than
a valine, and any arginine has a dash of lysine…

$$f(\alpha, i) = \frac{n(\alpha, i)}{n(i)} \quad\Rightarrow\quad \tilde{f}(\alpha, i) = \frac{n(\alpha, i) + \kappa \sum_{\beta=1}^{20} n(\beta, i)\, m(\beta \to \alpha) / \sqrt{n(i)}}{n(i) + \kappa \sqrt{n(i)}}$$
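A sketch of frequency smoothing with substitution-based pseudocounts. It assumes one plausible reading of the slide: each observed residue β lends a share m(β→α) of a pseudocount to residue α, and the total pseudocount weight for a group of n(i) sequences is κ·√n(i). The weighting, the parameter name `kappa`, and the three-letter toy matrix are assumptions for illustration, not values from SDPpred.

```python
import math

# Hypothetical substitution probabilities m[beta][alpha]; each row sums to 1.
# Leucine is closer to methionine than to valine, as in the slide's remark.
TOY_M = {
    "L": {"L": 0.80, "M": 0.15, "V": 0.05},
    "M": {"L": 0.15, "M": 0.80, "V": 0.05},
    "V": {"L": 0.05, "M": 0.05, "V": 0.90},
}


def smoothed_frequencies(counts, m, kappa=1.0):
    """Smoothed residue frequencies for one group in one column.

    counts: observed residue counts n(alpha, i); must not be empty.
    m: substitution probabilities m[beta][alpha].
    """
    n_i = sum(counts.values())
    root = math.sqrt(n_i)
    smoothed = {}
    for alpha in m:
        pseudo = sum(counts.get(beta, 0) * m[beta][alpha] for beta in m)
        smoothed[alpha] = (counts.get(alpha, 0) + kappa * pseudo / root) / (
            n_i + kappa * root
        )
    return smoothed


# A group of four sequences, all with leucine in this column.
freqs = smoothed_frequencies({"L": 4}, TOY_M, kappa=1.0)
```

Under these assumptions the smoothed frequencies still sum to one, and an unobserved methionine receives more weight than an unobserved valine.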
Other similar techniques






Evolutionary trace (Lichtarge et al. 1996, 1997)
Evolutionary rate shifts (Gaucher et al. 2002) 
Surface patches of slowly evolving residues
(Rate4Site, Pupko et al. 2002) 
PCA in sequence space (Casari et al. 1999, del
Sol Mesa et al. 2003)
Correlated mutations (Pazos and Valencia, 2002)
Prediction of functional sub-types (Hannenhalli
and Russell, 2000) and identification of PSDR
(Mirny and Gelfand, 2002)
Special features of SDPpred



Smoothed amino acid frequencies account for
functional (structural, chemical,
evolutionary, …) similarities among amino acids
Automatic cutoff setting -> no prior knowledge
about protein family
Does not require 3D structure -> use of
structural data solely for interpretation and
verification of results
– Kalinina OV, Mironov AA, Gelfand MS, Rakhmaninova AB. (2004)
Protein Sci 13(2): 443-56
– Kalinina OV, Novichkov PS, Mironov AA, Gelfand MS, Rakhmaninova
AB. (2004) Nucl Acids Res 32(Web Server issue): W424-8.
– http://math.belozersky.msu.ru/~psn/
Example: isocitrate/isopropylmalate
dehydrogenases (IDH/IMDH)


IDH: catalyzes the oxidation of isocitrate to α-ketoglutarate and CO2 (TCA cycle) using either NAD or
NADP as a cofactor in different organisms from
bacteria to higher eukaryotes
IMDH: catalyzes oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate
(leucine biosynthesis) in bacteria and fungi
IDH/IMDH: combinations of specificities
towards substrate and cofactor




NAD-dependent IDHs
NADP-dependent IDHs from bacteria and archaea (type I)
NADP-dependent IDHs from eukaryota (type II)
NAD-dependent IMDH
[Figure: the four groups, with taxon labels Eukaryota, Archaea, Bacteria, Mitochondria]
IDH/IMDH: selecting specificity groups
1. All NAD-dependent vs. all NADP-dependent
2. All IDHs vs. all IMDHs
3. Four groups: IDH (NADP) type I, IDH (NADP) type II, IDH (NAD), IMDH (NAD)
IDH/IMDH: predicted SDPs (cofactor-specific)
[Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, SDPs, subunit I, subunit II]
IDH/IMDH: predicted SDPs (substrate-specific)
[Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, SDPs, subunit I, subunit II]
IDH/IMDH: predicted SDPs (four groups)
[Figure: NADP-dependent IDH from E. coli (1ai2); labels: substrate, cofactor, SDPs, subunit I, subunit II]
IDH/IMDH: predicted
SDPs (overview)
IDH/IMDH: SDPs predicted for different groupings
All NAD-dependent vs. all NADP-dependent -> cofactor-specific SDPs
All IDHs vs. all IMDHs -> substrate-specific SDPs
Four groups
Color code: contacts substrate; contacts cofactor; contacts the other subunit; contacts substrate AND cofactor; contacts substrate AND the other subunit
[Figure: SDPs predicted under each grouping, colored by contact type. Residues shown: 31Tyr, 36Gly, 38Gly, 40Asp, 45Met, 97Val, 98Ala, 100Lys, 103Leu, 104Thr, 105Thr, 107Val, 115Asn, 152Phe, 154Glu, 155Asn, 158Asp, 161Ala, 162Gly, 164Glu, 208Arg, 229His, 231Gly, 232Asn, 233Ile, 241Phe, 245Gly, 287Gln, 300Ala, 305Asn, 308Tyr, 323Ala, 327Asn, 337Ala, 341Thr, 344Lys, 345Tyr, 351Val]
IDH/IMDH: SDPs in contact with
cofactor
[Figure: NADP-dependent IDH from E. coli (1ai2) with substrate (isocitrate) and cofactor (NADP)]
Nicotinamide nucleotide: 100Lys, 104Thr, 105Thr, 107Val, 337Ala, 341Thr. Substrate-specific and four-group SDPs, functionally not characterized.
Adenine nucleotide: 344Lys, 345Tyr, 351Val. Cofactor-specific SDPs, known determinants of specificity to cofactor.
Clusters of SDPs on the
intersubunit contact surface

in the IDH/IMDH family…
Cluster II
Cluster I
…and in other protein families

The LacI family of bacterial transcription factors

Bind specific operator sequences upon interaction with effector
molecules, mainly various sugars
Cluster I
Effector
Cluster II
DNA operator
LacI (lactose repressor) from E. coli (1jwl)

Bacterial membrane transporters from the MIP
family

Water and glycerol/water channels
Cluster I
Cluster II
Substrate
(glycerol)
Subunit I
GlpF (glycerol facilitator) from E. coli (1fx8)
Conclusions



SDPpred, a method for identification of amino
acids that account for differences in protein
specificity
Results obtained for several protein families of
different functional type agree with structural
and experimental data
A substantial fraction of SDPs are located on the
intersubunit contact interface, where they form
distinct spatial clusters





Olga V. Kalinina
Pavel S. Novichkov
Andrey A. Mironov
Mikhail S. Gelfand
Aleksandra B. Rakhmaninova



Department of Bioengineering
and Bioinformatics, Moscow
State University, Moscow,
Russia
Institute for Information
Transmission Problems RAS,
Moscow, Russia
State Scientific Center
GosNIIGenetika, Moscow,
Russia

Acknowledgements
 Leonid A. Mirny
 Olga Laikova
 Vsevolod Makeev
 Roman Sutormin
 Shamil Sunyaev
 Aleksey Finkelstein
Thank you!