Lessons on Protein Structure
from Lattice Model
HC Lee 李弘謙
Nanjing University
Nanjing, China
2002 May 22 – 25
What is a protein?
• Large molecule:
chain of amino acids
• Several tens to thousands of residues
• Folds to specific shape
• Biological machines
DNA & Gene
Now we know, for higher life forms:
one gene, many proteins
Gene to Protein
Transcription and translation
What do proteins do?
• Link genotype and phenotype
• Structural and functional
– Structural
• blood, muscle, bone, etc.
– Functional
• catalytic (enzyme), metabolic, neural, reproductive
Aberrant gene > malfunctioning protein > disease
Protein Conformation
Alpha helix
Beta sheets
HIV reverse transcriptase
Understanding protein folding
Driving Force for
Protein Folding
-Most important is interaction
of residues with water –
hydrophobic and hydrophilic
Miyazawa-Jernigan Statistical Interaction
Li-Tang-Wingreen’s
representation of MJ Matrix
two-body
one-body
Theoretical analysis
[Wang & Lee, PRL 84 (2000)]
MJ-matrix
Fit to one (a) and two-body (b) terms
Theory
Compare with MJ-matrix
Correct to first order; dominated by the
one-body term (hydrophobicity)
Lattice Model
-Simple way to learn something
about a very complex subject
Lattice model
• Represent space (or, in field theory, spacetime) by a discrete lattice.
• Represent a structure by a path on the lattice.
• A peptide is a string of residues.
• A peptide whose residues occupy a path is in a
state, or has a conformation.
• Residues may interact with each other
according to relative distance; or,
• In the mean-field model, a residue interacts only
with lattice sites.
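The idea of a structure as a path on a lattice can be sketched in a few lines. This is a minimal illustration, assuming a finite square lattice and a self-avoiding path (each site visited at most once); the function name `self_avoiding_paths` is hypothetical and not from the slides.

```python
def self_avoiding_paths(width, height, length):
    """Yield every self-avoiding path of `length` sites on a
    width x height square lattice, as a tuple of (x, y) sites.
    Paths are directed: a path and its reverse are both yielded."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(path, visited):
        if len(path) == length:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                yield from extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for x in range(width):
        for y in range(height):
            yield from extend([(x, y)], {(x, y)})


# Compact paths (those filling the whole lattice) on a 2x2 lattice.
compact = list(self_avoiding_paths(2, 2, 4))
print(len(compact))  # 8
```

Brute-force enumeration like this is only practical for small lattices, which is one reason lattice models use short chains.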
Random coil and compact path
Putting a binary peptide
on 2D lattice
Binary representation of
peptide:
0101011010010
110010110010
Mean-Field HP Model
• The most important interaction for protein
folding is that of residues with water: residues are
hydrophobic or hydrophilic.
• In a real protein in its native conformation,
hydrophobic residues like to be buried,
hydrophilic residues like to be exposed to
water.
• Simplest model: divide residues into
hydrophobic and hydrophilic, structure into
core and surface sites.
• Both peptide and structure are binary
sequences.
Structure-path on a 2D lattice
Pay attention only
to whether the
path is on a
core (1) or a
surface (0) site
Structure has a binary representation:
001100110000110000110011000011111100
(from Li et al. PRL 79 (1997) 765-768)
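The binary representation of a structure can be sketched as follows. This assumes that "core" means an interior lattice site and "surface" means a site on the boundary, which is consistent with the 6x6 string above (36 sites, 16 of them core). The helper name `structure_string` and the 4x4 snake path are hypothetical illustrations.

```python
def structure_string(path, width, height):
    """Binary structure string of a path: '1' = core site, '0' = surface site.
    Assumption: core sites are the interior sites of the lattice."""
    def is_core(site):
        x, y = site
        return 0 < x < width - 1 and 0 < y < height - 1

    return "".join("1" if is_core(site) else "0" for site in path)


# Hypothetical example: a snake path that fills a 4x4 lattice row by row,
# reversing direction on every other row.
snake = [(x if y % 2 == 0 else 3 - x, y) for y in range(4) for x in range(4)]
print(structure_string(snake, 4, 4))  # 0000011001100000

# Consistency check on the 6x6 string quoted on the slide.
slide = "001100110000110000110011000011111100"
print(len(slide), slide.count("1"))  # 36 16
```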
Designability of
Structures
-Very, very few structures
are good for proteins
Structure space >> observed structures
Protein Designability
The LTW model
The ground state of peptide p is the structure s closest to it in
n-dimensional hyperspace.
All peptides in the Voronoi volume of s have s as their ground state.
The Hamiltonian H = ½ (p – s)² is a mapping of
the set of peptides P to the set of structures S that
partitions P into equivalence classes labeled by s in S.
The target of each class is the ground state/conformation
of the class.
Designability of a structure is the number of peptides
in the class mapped to that structure
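For binary p and s, (p – s)² is the Hamming distance, so minimizing H means finding the nearest structure string. The sketch below counts designability by brute force over all binary peptides. It assumes that a peptide tied between two or more nearest structures has no unique ground state and is counted for none of them; the function names and the two-structure toy set are hypothetical.

```python
from itertools import product


def hamming(p, s):
    """Number of positions where two equal-length binary strings differ."""
    return sum(a != b for a, b in zip(p, s))


def designability(structures):
    """Map each structure string to the number of binary peptides
    whose unique nearest structure (in Hamming distance) it is."""
    n = len(structures[0])
    counts = {s: 0 for s in structures}
    for bits in product("01", repeat=n):
        p = "".join(bits)
        ranked = sorted((hamming(p, s), s) for s in structures)
        unique = len(ranked) == 1 or ranked[0][0] < ranked[1][0]
        if unique:  # assumption: ties give no ground state
            counts[ranked[0][1]] += 1
    return counts


# Toy structure set: of the 16 peptides of length 4, six are tied.
print(designability(["0000", "1111"]))  # {'0000': 5, '1111': 5}
```

The loop runs over 2^n peptides, so this direct count is feasible only for short chains.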
Voronoi volume
In hyperspace, all peptide sequences within the
Voronoi volume of a structure are closest to that
structure (from Li et al. PRL (1997)).
Very few structures have high designability
[Figure: number of structures vs. designability]
Li, Tang and Wingreen, PRL (1997)
Paths with high switchback
numbers have high designability
[Shih et al. & HCL, PRL 84 (2000)]
• Shortest possible Hamming distance between
two paths is proportional to the difference in
their switchback numbers (n10)
• Few paths have high n10
• A path with high n10 has a large Voronoi
volume, hence high designability
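A switchback count can be sketched on a structure string. The slides do not define n10, so this assumes it is the number of places where a core site (1) is immediately followed by a surface site (0); the function name is hypothetical.

```python
def switchback_number(structure):
    """Count occurrences of '1' immediately followed by '0'.
    Assumption: this is what the slides call n10."""
    return sum(1 for a, b in zip(structure, structure[1:])
               if a == "1" and b == "0")


# The 6x6 structure string from Li et al. quoted earlier in the slides.
slide = "001100110000110000110011000011111100"
print(switchback_number(slide))  # 6
```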
High switchback > high designability
Distribution of Hamming distance
Designability vs n10:
(a) 6x6 (b) 21-site triangular
Log distribution vs switchback number
Foldability of Peptides
-Vast majority of peptides
do not fold
Alpha helices like paths with
high switchback numbers
• Conformational degeneracy disfavors peptides with
long strings of identical or similar residues
• Hence proteins rarely have long strings of
contiguous hydrophobic or hydrophilic residues
• Alternating short stretches of hydrophobic and
hydrophilic residues yield structurally nondegenerate and robust conformations
• The 0011 switchback motif simulates an alpha helix on the
surface
• Empirically, most alpha helices are on the surface
Compare with real proteins
[Shih et al. & HCL, PRE 65 (2002)]
• Compare high-designability model peptides
with protein sequences in the PDB,
binarized by hydrophobicity
– Represent a peptide by the frequency of
occurrence of each binary word of
fixed length l = 2k
– There are 2^(2k) such words; put the frequencies on a
2^k x 2^k lattice
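The word-frequency representation can be sketched as below. Two choices here are assumptions not stated on the slides: words are read with a sliding window that advances one residue at a time, and a word's first k bits give the row and its last k bits the column of the lattice. The function name is hypothetical.

```python
def word_frequency_lattice(sequence, k):
    """Frequencies of all binary words of length 2k in `sequence`,
    laid out on a 2^k x 2^k grid (list of lists).
    Assumptions: overlapping windows with step 1; row = first k bits,
    column = last k bits."""
    word_len = 2 * k
    n_words = len(sequence) - word_len + 1
    if n_words < 1:
        raise ValueError("sequence is shorter than the word length")
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i in range(n_words):
        word = sequence[i:i + word_len]
        grid[int(word[:k], 2)][int(word[k:], 2)] += 1
    return [[count / n_words for count in row] for row in grid]


# Toy sequence: the words of length 2 are 00, 01, 11, once each.
for row in word_frequency_lattice("0011", 1):
    print(row)
```

Two peptides, or two sets of peptides, can then be compared through their grids rather than through the raw sequences.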
Highly foldable peptides in the HP model
resemble alpha-helices in real proteins
[Figure: overlap of binary sequence vs. oligomer length; curves: PDBAlpha-HP, PDBAll-HP, HP-LS, PDBAll - PDBAlpha]
[Shih et al. PRL 84 (2000)]
In the HP model, peptides that fold into
high-designability conformations
correspond to peptides that fold into
alpha helices in real proteins
Many models give designability
but not all are correct
• Any Hamiltonian (H) is a mapping of peptide space
(P) onto conformation space (C)
• For coarse-grained C, H partitions P into equivalence
classes, each class corresponding to a point in C
• Designability results from a highly skewed
distribution of the SIZES of the classes
• Example. The LS (Large-Small) model: structure
dominated by steric effect; small residues inside,
large residues outside. Almost same math as HP
model; has designability but wrong physics.
Highly foldable peptides in the LS model
do not resemble alpha-helices in real proteins
[Figure: overlap of binary sequence vs. oligomer length; curves: PDBAlpha-LS, PDBAll-LS, HP-LS, PDBAll - PDBAlpha]
[Shih et al. PRL 84 (2000)]
Unlike hydrophobicity,
the steric effect does not play a
dominant role
in the determination of native
structure
Folding Funnel
and
Free-energy Barrier
-Why is folding so
fast yet so slow?
Folding funnel (picture)
http://www.npaci.edu/envision/v15.4/proteinfolding.html
Free Energy, Entropy and Monte Carlo
Free energy and entropy
Free-energy barrier
(a) Binding energy increases with
compactness
(b) Entropy is lost rapidly as
binding energy increases
(c) Free-energy barrier formed
by competition between energy
gain and entropy loss
[Guan, Su, Shih & Lee (2000)]
[Figure: panels (a)-(c); axis labels: No. of contacts, |E/Enative|, Log (S), G = (E – TS)/Enative; annotations: annealing, barrier, low T, high T]
Getting over the barrier takes
all the folding time
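How a barrier can arise from the competition between energy gain and entropy loss can be illustrated with a toy calculation. Everything numerical here is an assumption for illustration only, not the data of Guan, Su, Shih & Lee: energy is taken as E(x) = -x with x = |E/Enative|, entropy as S(x) = s0 (1 - x)^a so that it falls fastest at small x, and G = E - TS in units where |Enative| = 1.

```python
def free_energy_profile(T, s0=4.0, a=2.0, n=101):
    """Toy G(x) = E(x) - T*S(x) on a grid of x = |E/E_native| in [0, 1].
    Assumed forms (not from the slides): E(x) = -x, S(x) = s0*(1-x)**a."""
    xs = [i / (n - 1) for i in range(n)]
    gs = [-x - T * s0 * (1.0 - x) ** a for x in xs]
    return xs, gs


def barrier(T, **kwargs):
    """Return (position of the maximum of G, its height above G at x = 0)."""
    xs, gs = free_energy_profile(T, **kwargs)
    top = max(range(len(gs)), key=gs.__getitem__)
    return xs[top], gs[top] - gs[0]


for T in (0.1, 0.2, 0.5):
    x_top, height = barrier(T)
    print(f"T={T}: barrier at x={x_top:.2f}, height={height:.4f}")
```

In this toy form the maximum sits where dG/dx = 0, that is at 1 - x = 1/(T s0 a) for a = 2, and it disappears when T s0 a < 1.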
Summary of lessons
• Average hydrophobic/hydrophilic property of
residues can be understood by simple physics.
• Lattice model useful for examining coarse-grained
phenomena.
• Long folding time caused by need to surmount free-energy barrier formed by rapid loss of entropy.
• Designability of structure is a direct consequence of
hydrophobic/hydrophilic dichotomy of residues.
Summary of lessons (cont’d)
• Very few structures are highly designable; those that
are have large switchback numbers.
• Very few peptides are foldable; many of those that
are alternate rapidly between hydrophobic and
hydrophilic residues.
• Highly foldable peptides folded into high-designability
structures form robust proteins.
• They fold easily into alpha-helices and to a lesser
extent into beta-sheets; hence alpha-helices are formed
very early in the folding process, then beta-sheets.
Molecular dynamics: atomistic description of protein
folding
-A one-gigaflop PC would need to run for
one million days to fold a
medium-small protein
Massively Distributed Computation
• Molecular dynamics.
– Atomistic-level simulation needed to understand protein
folding and function relevant to biology and drug design
• Annealing time very long
– Boltzmann probability:
one machine x 1 M days = 1 M machines x one day
• Starting a program of massively distributed
computation: use a screen-saver program for
simulation
•
of Vijay Pande, Stanford
The End
Thank you, everyone