2009/06/04 Lecture


Forces and Prediction of Protein Structure
Ming-Jing Hwang (黃明經)
Institute of Biomedical Sciences
Academia Sinica
http://gln.ibms.sinica.edu.tw/
Science 2005
Sequence - Structure - Function
MADWVTGKVTKVQ
NWTDALFSLTVHAP
VLPFTAGQFTKLGLE
IDGERVQRAYSYVN
SPDNPDLEFYLVTVP
DGKLSPRLAALKPG
DEVQVVSEAAGFFV
LDEVPHCETLWMLA
TGTAIGPYLSILR
Sequence/Structure Gap
Current (June 02, 2009) entries in protein sequence and structure
databases:
SWISS-PROT/TREMBL : 468,851/7,916,844
PDB : 57,835
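To put a number on the gap, the entry counts above can be turned into a ratio. This is only a quick arithmetic sketch; the variable names are made up here, and the counts are the ones quoted on the slide.

```python
# Entry counts quoted on the slide (2 June 2009).
swiss_prot = 468_851
trembl = 7_916_844
pdb = 57_835

# Total sequence entries versus structure entries.
sequences = swiss_prot + trembl
ratio = sequences / pdb

print(f"{sequences:,} sequence entries vs {pdb:,} structure entries")
print(f"roughly {ratio:.0f} sequences per known structure")
```

The ratio ignores redundancy within each database, so it is a rough indicator rather than a precise measure.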
[Figure: number of entries vs. year, for sequence and structure databases]
Anfinsen’s Protein Folding Experiment
(1957)
The secret of 3D is coded in 1D!
The Holy Grail of structural bioinformatics
MADWVTGKVTKVQNWTDALFSLTVHA
PVLPFTAGQFTKLGLEIDGERVQRAYS
YVNSPDNPDLEFYLVTVPDGKLSPRLA
ALKPGDEVQVVSEAAGFFVLDEVPHCE
TLWMLATGTAIGPYLSILRLGKDLDRFK
NLVLVHAARYAADLSYLPLMQELEKRY
EGKLRIQTVVSRETAAGSLTGRIPALIE
SGELESTIGLPMNKETSHVMLCGNPQ
MVRDTQQLLKETRQMTKHLRRRPGHM
TAEHYW
1D → 3D: Physics-based approach
Levitt’s lecture for S*
Molecular Mechanics (Force Field)
Levitt
1-microsecond MD simulation (980 ns)
- villin headpiece
- 36 a.a.
- 3000 H2O
- 12,000 atoms
- 256 CPUs (CRAY)
- ~4 months
- single trajectory
Duan & Kollman, 1998
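The figures on this slide give a feel for the time-scale problem. Below is a rough back-of-the-envelope sketch, assuming "~4 months" means about 120 days; the 1-millisecond target at the end is a hypothetical example, not a number from the slide.

```python
# Figures from the slide; the 120-day reading of "~4 months" is an assumption.
simulated_ns = 1000      # 1 microsecond of simulated time
wall_days = 4 * 30       # "~4 months" taken as 120 days
cpus = 256

ns_per_day = simulated_ns / wall_days   # throughput of the whole machine
cpu_days = cpus * wall_days             # total compute spent

def days_to_simulate(target_ns):
    """Wall-clock days needed at the same throughput (single trajectory)."""
    return target_ns / ns_per_day

print(f"throughput: {ns_per_day:.1f} ns of simulation per day")
print(f"compute used: {cpu_days:,} CPU-days")
# Hypothetical target: 1 millisecond = 1e6 ns of simulated time.
print(f"1 ms at this rate: {days_to_simulate(1e6) / 365:.0f} years")
```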
Massively distributed computing
Letters to nature (2002)
- engineered protein (BBA5)
- zinc finger fold (w/o metal)
- 23 a.a.
- solvation model
- thousands of trajectories, each
of 5-20 ns, totaling 700 μs
- Folding@home
- 30,000 internet volunteers
- several months, or ~a million
CPU days of simulation
Massively distributed computing
- SETI@home
- Folding@home
- Distributed folding
- Sengent’s drug design
- FightAIDS@home
- …

Levinthal’s paradox (1969)


If we assume three possible states for every flexible
dihedral angle in the backbone of a 100-residue protein,
the number of possible backbone configurations is 3^200.
Even an incredibly fast computational or physical
sampling in 10^-15 s would mean that a complete sampling
would take 10^80 s, which exceeds the age of the universe
by more than 60 orders of magnitude.
Yet proteins fold in seconds or less!
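The arithmetic behind the paradox can be checked in a few lines. This is a sketch under stated assumptions: 200 flexible backbone dihedrals (two per residue, phi and psi, for 100 residues), and an age of the universe of about 13.8 billion years, which is not a figure from the slide. Everything is done in log10 to keep the numbers readable.

```python
import math

# Assumption: two flexible backbone dihedrals (phi, psi) per residue.
residues = 100
angles = 2 * residues
states_per_angle = 3

# Number of backbone configurations, as a power of ten: 3^200.
log10_configs = angles * math.log10(states_per_angle)

# One configuration sampled every 10^-15 s.
log10_total_seconds = log10_configs - 15

# Assumption: age of the universe of about 13.8 billion years, in seconds.
age_universe_s = 13.8e9 * 365.25 * 24 * 3600
orders_beyond = log10_total_seconds - math.log10(age_universe_s)

print(f"configurations  ~ 10^{log10_configs:.1f}")
print(f"sampling time   ~ 10^{log10_total_seconds:.1f} s")
print(f"exceeds the age of the universe by ~{orders_beyond:.0f} orders of magnitude")
```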
Berendsen
Protein folding by MD
PROTEIN FOLDING:
A Glimpse of the Holy Grail?
Herman J. C. Berendsen*
"The Grail had many different manifestations
throughout its long history, and many have
claimed to possess it or its like". We might have
seen a glimpse of it, but the brave knights must
prepare for a long pursuit.
Energy landscapes of protein folding
Borman, C&E News, 1998
Protein-folding prediction technique
CGU: Convex Global Underestimation
- K. Dill’s group
Challenges of physics-based methods
- Simulation time scale
- Computing power
- Sampling
- Accuracy of energy functions
- Other factors
  - Metal/co-factor binding
  - Disulfide bond
  - …
Biology Can’t Wait! (Evolution can help)
One Big Family.
Similar 3D in homologous proteins
Ranganathan
Structure Prediction Methods
Homology modeling
Fold recognition
ab initio
[Figure: range of each method along a % sequence identity axis, 0 to 100]
Homology Modeling
Structural Genomics: HM as a workhorse
Baker & Sali, 2001
Our papers from HM and analysis











Tzou, W.-S., Hwang, M.-J., (1997) “A Model for Fis N-Terminus and Fis-Invertase Recognition,”
FEBS Lett. 401:1-5.
Cheng, Y.-S., Tang, T.K., Hwang, M.-J. (1999) “Amino Acid Conservation and Clinical Severity
of Human Glucose-6-Phosphate Dehydrogenase Mutations,” J. Biomed. Sci. 6:106-114.
Lai, H.-L., Lin, T.-H., Kao, Y.-Y., Lin, W.-J., Hwang, M.-J., Chern, Y. (1999) “The N-Terminal
Domain of Type VI Adenylyl Cyclase Mediates Its Inhibition by Protein Kinase C,” Mol.
Pharmacol., 56: 644-650.
Chen, J.-W., Luo, Y.-L., Hwang, M.-J.*, Peng, F.-C., Ling, K.-H.* (1999) “Territrem B, a
tremorgenic mycotoxin that inhibits acetylcholinesterase with a non-covalent yet irreversible
binding mechanism,” J. Biol. Chem., 274:34916-34923.
Liu, C.L., Tsai, C.C., Lin, S.C., Wang, L.I., Hsu, C.I., Hwang, M.-J., Lin, J.Y. (2000) “Primary
Structure and Function Analysis of the Abrus precatorius Agglutinin A Chain by Site-directed
Mutagenesis,” J. Biol. Chem., 275:1897-1901.
Yao, P.-L.; Hwang, M.-J.*; Chen, Y.-M.; Yeh, K.-W.* (2001) “Site-Directed Mutagenesis
Evidence for a Negatively Charged Trypsin Inhibitory Loop in Sweet Potato Sporamin.” FEBS
Lett., 496: 134-138.
Lin, T.H., Lai, H.L., Kao, Y.Y., Sun, C.N., Hwang, M.-J., Chern, Y. (2002) “Protein Kinase C
Inhibits Type VI Adenylyl Cyclase (ACVI) by Phosphorylating the Regulatory N Domain and
Two Catalytic C1 and C2 Domains,” J. Biol. Chem. 277:15721-15728.
Chen, M.-C., Hwang, M.-J., Chou, Y.-C., Chen, W.-H., Cheng, G., Nakano, H., Luh, T.-Y., Mai,
S.-C., Hsieh, S.-L. (2003) The Role of ASK1 in Lymphotoxin-beta Receptor-Mediated Cell Death.
J. Biol. Chem. 278:16073-16081.
Shih, S.-R,* Chiang, C, Chen, T.-C, Wu, C.-N, Hsu, J.-T, Lee, J.-C, Hwang, M.-J, Li, M.-L, Chen,
G.-W, Ho, M.-S. (2004) "Mutations at KFRDI and VGK domains of enterovirus 71 3C protease
affect its RNA binding and proteolytic activities." J Biomed Sci. 11:239-248.
Kao Y.-Y., Lai H.-L., Hwang M.-J., Chern Y.* (2004) "An Important functional role of the N
terminus domain of type VI adenylyl cyclase (ACVI) in Gαi-mediated inhibition." J Biol Chem.
279: 34440-34448.
Lin, D.-Y., Huang, Y.-S.,...Hwang, M.-J. ... and Shih, H.-M.* (2006) Role of SUMO-Interacting
Motif in Daxx SUMO Modification, Subnuclear Localization, and Repression of Sumoylated
Transcription Factors. Mol. Cell, 24: 341-354.
Fold recognition
Find, from a library of folds, the 3D template
that accommodates the target sequence best.
Also known as “threading” or “inverse folding”
Useful for twilight-zone sequences
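The idea of picking the best-fitting template from a fold library can be illustrated with a toy example. This is a minimal sketch, not a real threading method: the fold names, the buried/exposed environment strings, the hydrophobic residue set, and the +1/-1 scoring rule are all made up for illustration, and the placement is ungapped, whereas real fold recognition uses gapped sequence-to-structure alignment with much richer scoring.

```python
# Toy fold recognition: each template is reduced to a string of residue
# environments, "B" (buried) or "E" (exposed). All data here are hypothetical.
HYDROPHOBIC = set("AVLIMFWC")

def residue_score(aa, env):
    """+1 if a hydrophobic residue is buried or a polar one exposed, else -1."""
    return 1 if (aa in HYDROPHOBIC) == (env == "B") else -1

def thread(sequence, template_env):
    """Return (best score, offset) over all ungapped placements, or None."""
    n, m = len(sequence), len(template_env)
    if n > m:
        return None
    best = None
    for offset in range(m - n + 1):
        score = sum(residue_score(aa, template_env[offset + i])
                    for i, aa in enumerate(sequence))
        if best is None or score > best[0]:
            best = (score, offset)
    return best

def recognize_fold(sequence, library):
    """Score the sequence against every template and return the best one."""
    scored = {name: thread(sequence, env) for name, env in library.items()}
    scored = {name: s for name, s in scored.items() if s is not None}
    best_name = max(scored, key=lambda name: scored[name][0])
    return best_name, scored

# Hypothetical fold library and target sequence.
library = {
    "fold_A": "BBEEBBEEBBEE",
    "fold_B": "EEEEBBBBEEEE",
    "fold_C": "BEBEBEBEBEBE",
}
target = "KDLVIAEK"

best_name, scored = recognize_fold(target, library)
print(best_name, scored[best_name])
```

Here the target's polar-hydrophobic-polar pattern fits the buried core of the hypothetical fold_B, so that template scores highest.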
Fold recognition (aligning sequence to structure)
(David Shortle, 2000)
3D->1D score
On X-ray, NMR, and computed models
Reliability and uses of comparative models
Marti-Renom et al. (2000)
Pitfalls of comparative modeling



- Cannot correct alignment errors
- More similar to template than to true structure
- Cannot predict novel folds
Ab initio/new fold prediction
- Physics-based (laws of physics)
- Knowledge-based (rules of evolution)

CASP Experiments
One group dominates the ab initio (knowledge-based) prediction
One lab dominated in CASP4
The 1D → fragment → 3D approach
Primary
LGINCRGSSQCGLSGGNLMVRIRDQACGNQGQTWCPGERRAKV
CGTGNSISAYVQSTNNCISGTEACRHLTNLVNHGCRVCGSDPLYA
GNDVSRGQLTVNYVNSC
seq. to str. mapping
fragment
Tertiary
fragment assembly
The I-sites library (Baker’s group)
Some CASP4 successes
Baker’s group
Toward High-Resolution de Novo Structure
Prediction for Small Proteins
--Philip Bradley, Kira M. S. Misura, David Baker
(Science 2005)
The prediction of protein structure from
amino acid sequence is a grand challenge of
computational molecular biology. By using a
combination of improved low- and high-resolution conformational sampling methods,
improved atomically detailed potential
functions that capture the jigsaw puzzle–like
packing of protein cores, and high-performance computing, high-resolution
structure prediction (<1.5 angstroms) can be
achieved for small protein domains (<85
residues). The primary bottleneck to
consistent high-resolution prediction appears
to be conformational sampling.
3D to 1D?
Science 2003
A computer-designed protein (93 aa)
with 1.2 Å resolution
Structure prediction servers
http://bioinfo.pl/cafasp/list.html
Structural Bioinformatics:
Sequence/Structure Relationship
[Figure: percent identity axis, 0 to 100, with the twilight zone and midnight zone marked at the low end; protein sequences and protein structures observed in nature shown within all possible sequences of amino acids]
Hybrid approach for solving macromolecular
complex structures
Thank You!
(Rost, 1996)