22_Binkowski - PSI Structural Biology Knowledgebase
Protein Surface Analysis for
Functional Analysis and Prediction
T. Andrew Binkowski and Andrzej Joachimiak
2009 NIGMS Workshop: Enabling Technologies for Structural Biology
March 4-6, 2009
Outline
How Can Surface Analysis Aid Your Structural
Genomics Effort?
Protein Surfaces
Comparing Surfaces of Proteins
Surface Analysis in the Structural Genomics Pipeline
The Global Protein Surface Survey
Functional Inference in Proteins
Transfer function based on similarity to
a protein with known biological activity
Sequence
30-70%
Functional sites result from spatial
interactions of key residues in
diverse regions of primary sequence
Structure
Reveal more distant relationships
1 fold ~ many functions; vice versa
Example: generalized secondary
structural element
Different SSE can bring residues in
spatial proximity
(Jaroszewski & Godzik, ISMB 00)
3
Functional Inference in Proteins
Functional surfaces may be the most conserved structural features of
proteins
Surfaces performing identical biochemical activity can be found within different
protein scaffolds or in the absence of clear evolutionary relationships
Novel heme-monooxygenase
•12% sequence identity
• a/b vs. all a
•Experimentally verified activity
Exploit ability of proteins to preserve local spatial residue patterns
Presents another opportunity to infer insightful ideas about their biological function
and mechanisms
4
Surfaces of Proteins
Surface:
Local grouping of solvent accessible
atoms
Pockets:
Empty concavity on a protein surface
into which solvent can gain access
Identifying surfaces:
Methods:
Solvent accessibility, Geometry, Grids,
Spheres
Applications:
CASTp, Surfnet, Pocket, Ligsite, Pass
Our approach:
Computational geometry (alpha
shape)
CASTp, PDB, Swiss-Prot,
Catalytic Site Atlas
Ligand binding surfaces:
Exclusion contact surface
(solvent accessibility difference)
Edelsbrunner & Mücke, ACM Trans Graph, 1994; Edelsbrunner, Facello, Liang, Disc Appl Math, 1996; Liang, Edelsbrunner, Woodward, Protein Sci, 1998
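The "solvent accessibility difference" idea for ligand binding surfaces can be sketched as follows. This is a minimal illustration, not the alpha-shape method used by CASTp: it swaps in a simple Shrake-Rupley point-sampling estimate of solvent accessible area, and reports protein atoms that lose area when the ligand is present. The function names, the 1.4 Angstrom probe radius, and the 1.0 square Angstrom cutoff are assumptions made for the example.

```python
import math

PROBE = 1.4  # assumed water probe radius in Angstroms (conventional value)

def sphere_points(n=200):
    """Quasi-uniform points on a unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts

def sasa_per_atom(atoms, n_points=200):
    """Shrake-Rupley solvent accessible area per atom.

    atoms: list of (x, y, z, radius) tuples.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + PROBE
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                # A test point is buried if it lies inside another expanded sphere.
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + PROBE) ** 2:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r ** 2 * exposed / n_points)
    return areas

def ligand_contact_atoms(protein, ligand, min_loss=1.0):
    """Indices of protein atoms losing at least min_loss A^2 of area on binding."""
    free = sasa_per_atom(protein)
    bound = sasa_per_atom(protein + ligand)[:len(protein)]
    return [i for i, (f, b) in enumerate(zip(free, bound)) if f - b >= min_loss]
```

The cost is quadratic in the number of atoms, so a real structure would need a neighbor list; the sketch only shows the difference calculation itself.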
5
Global Protein Surface Survey
http://gpss.mcsg.anl.gov
6
Comparing Surfaces of Proteins
SurfaceScreen
Methodology for identifying similarly
shaped proteins and aligning them
Optimizes two components
Global Shape
Perceived similarity
Size and scale, independent of chemistry
Local physicochemical texture
Preserved atom/residue orientation
Conservation of chemical complementarity
Workflow (figure): Surface -> Global Surface Shape Filtering -> Surface Shape Alignment -> Constrained Spatial Surface Refinement -> Apply Scoring Functions
7
Comparing Surfaces of Proteins:
Global Shape Similarity
Surface Shape Signatures (SSS)
Represent signature of a surface
as distribution sampled from a
shape function (Osada et al., 2002)
Comparison of probability
distributions
Kolmogorov-Smirnov
Earth Mover’s Distance
ATP Binding sites
protein kinase CK2 from Z. mays (b)
phosphopantetheine
adenylyltransferase from E. coli (c)
maltose/maltodextrin transport
protein from E. coli (d,cyan chain A,
light blue chain B)
50 non-homologous sites (< 30%
sequence identity)
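A shape signature of this kind can be sketched with the D2 shape function from Osada et al. (distances between randomly chosen point pairs), followed by the two distribution comparisons named on the slide. This is an illustrative sketch only: the function names, sample counts, and the use of raw point coordinates as the "surface" are assumptions, and the earth mover's distance shown is the simple one-dimensional form for equal-sized samples.

```python
import math
import random

def d2_signature(points, n_samples=2000, seed=0):
    """Sorted sample of distances between random pairs of surface points (D2)."""
    rng = random.Random(seed)
    dists = []
    for _ in range(n_samples):
        a, b = rng.sample(points, 2)
        dists.append(math.dist(a, b))
    return sorted(dists)

def ks_statistic(sig_a, sig_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs.

    Both inputs must be sorted.
    """
    i = j = 0
    na, nb = len(sig_a), len(sig_b)
    gap = 0.0
    while i < na and j < nb:
        x = min(sig_a[i], sig_b[j])
        while i < na and sig_a[i] <= x:
            i += 1
        while j < nb and sig_b[j] <= x:
            j += 1
        gap = max(gap, abs(i / na - j / nb))
    return gap

def emd_1d(sig_a, sig_b):
    """1-D earth mover's distance for equal-sized sorted samples."""
    if len(sig_a) != len(sig_b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Because the signature depends only on pairwise distances, it is independent of orientation and of chemistry, which is what makes it usable as a fast global filter before any alignment is attempted.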
8
Spatial Surface Alignment Refinement
Combinatorial comparison of
residue sets in “neighborhood”
Maintain “like” correspondence
of types
Maximum common residues
Enumerate and evaluate
alignment orientations
Find optimal superposition
using SVD of correlation matrix
(Umeyama 1991)
Heme binding pockets of myoglobin from different organisms.
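The optimal superposition step can be sketched with NumPy. The code below is the standard SVD-based least-squares rotation and translation (the rigid-body case of Umeyama's method, without the scale factor). It assumes the residue correspondences have already been chosen by the combinatorial step, so both coordinate arrays are in matching order. The function name `superpose` and the choice of one representative point per residue are assumptions for the example.

```python
import numpy as np

def superpose(mobile, target):
    """Least-squares rigid superposition of `mobile` onto `target`.

    Both inputs are (N, 3) arrays with row i of one matched to row i of the other.
    Returns (rotation, translation, rmsd) such that
    mobile @ rotation.T + translation best matches target.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mu_m = mobile.mean(axis=0)
    mu_t = target.mean(axis=0)
    # 3x3 cross-covariance of the centered coordinate sets
    cov = (mobile - mu_m).T @ (target - mu_t)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so the result is a proper rotation (det = +1)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = mu_t - rotation @ mu_m
    aligned = mobile @ rotation.T + translation
    rmsd = float(np.sqrt(((aligned - target) ** 2).sum(axis=1).mean()))
    return rotation, translation, rmsd
```

In an enumeration of candidate correspondences, this function would be called once per candidate and the candidate with the lowest RMSD (or best score) kept.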
9
Evaluating Surface Alignments
RMSD Distance:
Estimate the probability of obtaining a specific
RMSD for a given number of aligned residues (n_res)
Compute random surface alignments (10^8) and build
lookup tables
RMSD variants:
cRMSD (coordinate)
oRMSD (orientation)
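The lookup-table idea can be sketched as an empirical null distribution per number of aligned residues. Everything here is a toy stand-in: the null model draws random points in a box and skips the superposition step, the trial count is far smaller than on the slide, and the names `random_rmsd`, `build_lookup`, and `p_value` are hypothetical.

```python
import bisect
import math
import random

def random_rmsd(n_res, rng, box=10.0):
    """RMSD between two random n_res-point sets (toy null model, no superposition)."""
    total = 0.0
    for _ in range(n_res):
        p = [rng.uniform(0.0, box) for _ in range(3)]
        q = [rng.uniform(0.0, box) for _ in range(3)]
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / n_res)

def build_lookup(n_res_values, n_trials=2000, seed=0):
    """Sorted null RMSD samples, keyed by number of aligned residues."""
    rng = random.Random(seed)
    return {n: sorted(random_rmsd(n, rng) for _ in range(n_trials))
            for n in n_res_values}

def p_value(table, n_res, observed_rmsd):
    """Fraction of random alignments with RMSD at or below the observed value."""
    null = table[n_res]
    return bisect.bisect_right(null, observed_rmsd) / len(null)
```

Keeping each table sorted makes a query a binary search, which is why precomputing the random alignments once pays off when many surface pairs are scored.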
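Surface Volume Overlap:
$V_{AB} = V_A + V_B - V_{A \cup B}$
$\mathrm{SVOT}_{AB} = \dfrac{V_{AB}}{V_{AA} + V_{BB} - V_{AB}}$
Interpretation of SVOT is not
straightforward
Need global and local
$\mathrm{lSVOT}_{ab} = \dfrac{V_{ab}}{V_{aa} + V_{bb} - V_{ab}}$
$\mathrm{gSVOT}_{AB} = \dfrac{V_{AB}}{V_{AA} + V_{BB} - V_{AB}}$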
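A Tanimoto-style volume overlap (shared volume divided by the volume of the union) can be sketched on a grid. This is an illustration under stated assumptions: volumes are approximated by counting grid points inside atom spheres, the self-overlap of a surface is taken to be its own volume, and the names `voxelize` and `svot` are hypothetical. The same function would serve for a local score (aligned residues only) or a global one (whole surfaces), depending on which atoms are passed in.

```python
import math

def voxelize(atoms, spacing=1.0):
    """Grid points (as integer index triples) that fall inside any atom sphere.

    atoms: list of (x, y, z, radius) tuples, already in a common frame.
    """
    cells = set()
    for x, y, z, r in atoms:
        lo = [math.floor((c - r) / spacing) for c in (x, y, z)]
        hi = [math.ceil((c + r) / spacing) for c in (x, y, z)]
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    cx, cy, cz = i * spacing, j * spacing, k * spacing
                    if (cx - x) ** 2 + (cy - y) ** 2 + (cz - z) ** 2 <= r * r:
                        cells.add((i, j, k))
    return cells

def svot(cells_a, cells_b):
    """Tanimoto overlap of two voxel sets: shared / (A + B - shared)."""
    v_ab = len(cells_a & cells_b)
    denom = len(cells_a) + len(cells_b) - v_ab
    return v_ab / denom if denom else 0.0
```

The score is 1.0 for identical volumes and 0.0 for disjoint ones, but intermediate values depend on the relative sizes of the two surfaces, which is one reason a single number is hard to interpret.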
10
Benchmarking Surface Alignments
11
Heme Binding Site Retrieval
Heme (iron-protoporphyrin IX)
Multi-functional (e.g., oxygen binding/transport,
electron transfer, and redox)
Binding on 20 different folds
Between proteins with <2% seq. id.
Query myoglobin (gray) against PDB
structures to identify hemoproteins
Figure labels: seq. & fold; surface analysis
Retrieval rate (area under ROC curve)
Sequence: 68.7%
Structure (SSM): 64.4%
Surface: 95.8%
Detection of convergent heme binding
site on IsdG from S. aureus
Missing characteristic sequence motif
12% seq id; different scaffold
Experimentally verified
monooxygenase activity
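The retrieval rate reported here is an area under the ROC curve. A minimal sketch of that calculation, using the rank-based identity (the probability that a randomly chosen true hit scores above a randomly chosen non-hit), is shown below. The function name and the convention that higher scores mean greater similarity are assumptions; the scores in the usage note are made-up values, not data from the talk.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from similarity scores and true/false labels.

    Higher score means more similar to the query; ties count as half a win.
    """
    pos = [s for s, is_hit in zip(scores, labels) if is_hit]
    neg = [s for s, is_hit in zip(scores, labels) if not is_hit]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])` gives 0.75, since three of the four hit/non-hit pairs are ranked correctly.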
12
ATP: Retrieval of a Flexible Ligand
Adenosine 5’-triphosphate multifunctional nucleotide (e.g., cell signaling, energy transfer)
58 unique EC classifications #.#.#.#
Conformational flexibility
Retrieval rates for 4 conformations (79.1%-85.4%); method is tolerant to flexible ligands
13
Prediction and Validation of
GDP Binding Surface
Structure of F420-0:gamma-glutamyl
ligase from A. fulgidus
Large binding surface was searched
to support functional predictions, and
a GDP binding surface was identified
Posed GDP based on superposition of
surfaces (red)
Co-crystallization experiments
validate the prediction
14
Surface Analysis in the Structural
Genomics Pipeline
15
Exploiting Protein Surfaces in
Structural Genomics
Developing surface-based
tools to address specific
needs of the structural genomics
pipeline
Ligands for co-crystallization
Aid in the assignment of
electron density
Functional annotation tools
Drive further studies (e.g.,
ligand binding, discovery)
Figure labels: Future Studies; Discovery
16
Crystallization/Structure Improvement
Workflow (figure): Partially Solved or Low Quality Structure -> Surface Identification -> Search GPSS for Binding Sites -> Co-crystallization Experiments
Introduction of GDP to F420-0:gamma-glutamyl ligase from A. fulgidus
improves resolution from 2.8 to 1.9 Angstroms and orders loop
regions.
17
Assisted Electron Density Assignment
Unidentified ligand density
Construct surface surrounding density
and search against ligand surface
library
Does not require entire structure
to be built
18
Assisted Electron Density Assignment
Applicable to ligands of various
molecular weights and sizes
Fructose (pdb id=1zx5)
NADP (pdb id=2ag8)
Suggest a list in cases of
ambiguity
19
Landscape Analysis: ATP
Classification based on surface similarity
shows functional families have preferred
(not necessarily unique) surfaces and
conformation
20
Automated Protein Kinase Classification
All-against-all surface comparison of all protein
kinases in the PDB
Color labeled by expert annotation (KinBase)
Surface clustering identifies:
Dual substrate specificity of CK2 proteins
Active/inactive states
Similarity detected between MAP p38 kinase and
Abelson leukemia virus tyrosine kinase (Abl) with
bound cancer drug STI-571
MAP kinase has unique DFG “out” conformation not
previously seen in ser/thr kinases
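An all-against-all comparison followed by grouping can be sketched as below. The slide does not state which clustering algorithm was used, so this example swaps in the simplest option: single-linkage grouping at a fixed distance cutoff, implemented with union-find. The function name, the cutoff, and the `distance` callback (which would wrap a surface similarity score) are all hypothetical.

```python
def cluster_by_threshold(names, distance, cutoff):
    """Group items whose pairwise distance is at or below `cutoff` (single linkage).

    names: list of item identifiers.
    distance: function taking two identifiers and returning a dissimilarity.
    Returns a sorted list of clusters, each a list of identifiers.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # All-against-all pass: merge the clusters of every sufficiently close pair.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance(a, b) <= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

Comparing the resulting groups against an expert annotation (as the slide does with KinBase colors) is then a matter of checking which labels co-occur within each cluster.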
21
Function Sleuth
Conserved protein of unknown
function (VCA0319) from V. cholerae
apc29617
Unique arrangement of common
structural motifs
Problematic for secondary structure
and fold analysis
Surface analysis identifies DNA
binding surface and 5 putative metal
binding sites
All 5 metal binding sites showed
strong preference for Mg
Putative metalloregulated repressor
with Mg-regulated mechanism of DNA
binding
22
Function Sleuth
1bdb
NAD
1hoh
MGD
2qwr
ANP
1jbw
ACQ
Target APC7761 (3fd3)
Agrobacterium tumefaciens str. C58
23
Function Sleuth
1i9c
Target APC61725 (3fz5)
Rhodobacter sphaeroides 2.4.1
Top 17 most similar surfaces bind B12
24
Global Protein Surface Survey
SurfaceScreen for PSI ‘function sleuth’
targets
Automated analysis of largest 5 surfaces
(per chain and unit)
Technical Note:
DOE INCITE on Blue Gene/P at ANL
http://gpss.mcsg.anl.gov
25
Conclusion
Comparing surfaces of proteins can be a useful
tool with many applications
Functional characterization
Assisted electron density assignment
Automated classification
Global Protein Surface Survey
http://gpss.mcsg.anl.gov
26
Acknowledgements
ANL/MCSG
H. An,
G. Babnigg,
L. Bigelow,
A. Binkowski,
C-s. Chang,
S. Clancy,
G. Cobb,
M. Cuff,
M. Donnelly,
C. Giometti,
W. Eschenfeldt,
Y. Fan,
C. Hatzos,
R. Hendricks
G. Joachimiak,
H. Li,
L. Keigher,
Y-c. Kim,
N. Maltseva,
E. Marland,
S. Moy,
R. Mulligan,
B. Nocek,
J. Osipiuk,
M. Schiffer,
A. Sather,
G. Shackelford,
L. Stols,
K. Tan,
C. Tesar,
R-y. Wu,
L. Volkart,
R-g. Zhang,
M. Zhou,
ANL/SBC
N. Duke,
S. Ginell,
F. Rotella
Univ. of Virginia
W. Minor,
M. Chruszcz,
M. Cyborowski,
M. Grabowski,
P. Lasota,
P. Miles,
M. Zimmerman,
H. Zheng
Univ. College
London @ EBI,
J. Thornton,
C. Orengo,
M. Bashton,
R. Laskowski,
D. Lee,
R. Marsden,
D. McKenzie,
A. Todd,
J. Watson
Northwestern Univ.
W. Anderson,
O. Kiryukhina
D. Miller,
G. Minasov,
L. Shuvalova,
X. Yang,
Y. Tang
G. Montelione,
Rutgers Univ., NESGC
T. Terwilliger,
Los Alamos, ITCSG
Z. Derewenda, Univ.
of Virginia, ITCSG
Z. Dauter, NCI
J. Liang, Univ.
of Illinois
D. Sherman, U. Michigan
Washington
Univ.
D. Fremont,
T. Brett,
C. Nelson,
Univ. of Texas
SWMC
Z. Otwinowski,
D. Borek,
A. Kudlicki,
A. Q. Mei,
M. Rowicka
Funding: NIH and DOE
Univ. of Toronto
A. Edwards,
C. Arrowsmith,
A. Savchenko,
E. Evdokimova,
J. Guthrie,
A. Khachatryan,
M. Kudrytska,
T. Skarina,
X. (Linda) Xu
Univ. of Chicago
O. Schneewind,
D. Missiakas,
P. Gornicki,
S. Koide, ITCSG
W-j. Tang,
B. Roux,
J. L. Robertson
M.R. Rosner,
T. Kossiakoff, ITCSG
V. Tereshko,
27
Thank you