Transcript N - KIAS

Topics to be Covered
• Introduction to Protein Folding
• Mechanism of folding and misfolding
• GroEL – biological machine (chaperone-assisted folding)
• Molecular motors: Polymer physics and Myosin V motility
Many Facets of Folding
1. Structure Prediction
2. Protein & Enzyme Design
3. Folding Kinetics & Mechanisms
4. Crowding & Confinement Effects
5. Relation to aggregation
6. Molecular Chaperones
7. Unfolded protein response (UPR)
Folding and clearance mechanisms are at the center stage
A Big Protein Folding Problem
Length ≈ 220 nm ≈ 700 water
Read the Genetic Code; Transcription; Produce
Proteins, Function, Degradation
A very large protein in water – complex
problem indeed! (about 100,000 waters)
Size ≈ 22nm
Pictures, Models, Approximations & Reality
A Bit of Philosophy
Rich History in Condensed Matter physics & Soft Matter
(Analytic Theory)
• Ising model for magnetic systems (Ni/also biology; 1920)
• Spin glasses – Edwards-Anderson model (CuMn alloy;
1975)
• Polymer statistics (Flory; 1948)
• Liquid Crystals (TMV) (Onsager 1949)
• BCS Theory (1956…)
Folding Kinetics
Experiments
o Protein engineering (TSE)
o SAXS/NMR (DSE)
o Fast folding (T-jump; P-jump; rapid mixing)
o SM FRET (folding/unfolding)
o LOT/AFM (force ramp; force quench)
Theory
• Statistical Mechanics (Energy Landscape)
• Minimal Models (Lattice/Off-Lattice)
• MD Simulations
• Bioinformatics (Evolutionary Imprint)
Outline
• How far can we go using polymer physics? (no force)
• Toy models and generic lessons
• Finite size effects: Universal relations
• Bringing “specificity” back: Phenomenological Models
Many facets of Protein Folding
How does a chain (necklace with different shape pearls) fold up and how
fast?
Can things go wrong and then what?
As structure gets organized, energy gets lowered
Minimum free energy (water, ions, cosolvents)
Anfinsen, over 50 years ago; Nobel Prize 1972
Computational approaches to Biological problems: 2013 Nobel Chemistry
RNA and some Proteins
[Figure labels: F, S]
$\Delta F_{i,NBA}/\Delta F_{ij} \gg 1$
I: Gradient to NBA dominates: most likely event under folding conditions. All other transitions less likely.
Page 881 of Textbook Chapter 18
Approximation to Reality!
Another Nobel Protein! (GFP)
Not all molecules take the same
route:
Folding is stochastic! At least 4
classes of folding trajectories
(Reddy)
Complicated Energy
Function
Thermodynamics of src-SH3 folding
Z. Liu, G. Reddy, E. O’Brien and dt
PNAS 2011
Green = Urea
Red= MTM predictions
Black = Experiments (Baker)
$\Delta G_{NU}[C] = \Delta G_{NU}[0] + m[C]$
m = (1.3 – 1.5) kcal/(mol·M)
Exp. m = 1.5 kcal/(mol·M)
Excellent Agreement!
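The linear extrapolation relation for the stability can be explored with a short numerical sketch. The value used for the stability in water, the temperature, and the sign convention (stability negative when the native state is favored) are illustrative assumptions and are not taken from the slide; only m = 1.5 kcal/(mol·M) comes from the text.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def delta_g_nu(c, dg0, m):
    """Linear extrapolation: dG_NU[C] = dG_NU[0] + m*[C], in kcal/mol."""
    return dg0 + m * c

def midpoint(dg0, m):
    """Denaturant concentration C_m at which dG_NU[C] = 0."""
    return -dg0 / m

def fraction_native(c, dg0, m, temperature=298.0):
    """Two-state native population at denaturant concentration c (assumed two-state)."""
    return 1.0 / (1.0 + math.exp(delta_g_nu(c, dg0, m) / (R_KCAL * temperature)))

# Hypothetical stability in water (kcal/mol); m from the slide.
DG0_ASSUMED = -4.5
M_VALUE = 1.5

if __name__ == "__main__":
    cm = midpoint(DG0_ASSUMED, M_VALUE)
    print("midpoint [C_m] in M:", cm)
    for c in (0.0, cm, 2 * cm):
        print(c, fraction_native(c, DG0_ASSUMED, M_VALUE))
```

With these assumed numbers the native fraction is one half exactly at the midpoint, which is how m and the midpoint together fix the stability in water.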
Characteristic Temperatures in Proteins
Random Coil (Flory): high T or [C]; $R_g \approx a_D N^{0.6}$
Compact: $T \approx T_\theta$ or $[C_\theta]$
Foldable: $\sigma = (T_\theta - T_F)/T_\theta$ small
Native State: $T \le T_F$, $[C_F]$; $R_g \approx a_N N^{0.33}$
Estimating Protein Size as a Function of N
High denaturant concentration (GdmCl or Urea)
Good solvent for polypeptide chain – may be!
Flory Theory: $F(R_g) \approx R_g^2/(N a^2) + v\,N^2/R_g^d$
(see de Gennes book)
$R_g \approx a N^{\nu}$, $\nu = 3/(d + 2)$
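The Flory exponent can be checked numerically by minimizing the free energy with respect to the radius of gyration for two chain lengths and reading off the slope on a log-log scale. This is a minimal sketch: the elastic term is taken as Rg^2/(N a^2), all prefactors are set to one, and the function names are hypothetical.

```python
import math

def flory_free_energy(rg, n, a=1.0, v=1.0, d=3):
    """Flory free energy in units of kT: chain elasticity plus excluded volume."""
    return rg**2 / (n * a**2) + v * n**2 / rg**d

def optimal_rg(n, a=1.0, v=1.0, d=3):
    """Minimize F over Rg by ternary search in log(Rg).

    F is a sum of two exponentials in log(Rg), hence convex, so the search is safe.
    """
    lo, hi = math.log(1e-3), math.log(1e6)
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        f1 = flory_free_energy(math.exp(m1), n, a, v, d)
        f2 = flory_free_energy(math.exp(m2), n, a, v, d)
        if f1 < f2:
            hi = m2
        else:
            lo = m1
    return math.exp(0.5 * (lo + hi))

def flory_exponent(d=3, n1=100, n2=10000):
    """Effective exponent nu from Rg at two chain lengths."""
    r1, r2 = optimal_rg(n1, d=d), optimal_rg(n2, d=d)
    return math.log(r2 / r1) / math.log(n2 / n1)

if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, flory_exponent(d), 3.0 / (d + 2))
```

Setting dF/dRg = 0 analytically gives Rg^(d+2) proportional to N^3, which is the same exponent 3/(d + 2) the numerical minimization recovers.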
Folded States Globular Proteins
• Maximally compact
• Largely Spherical
• $R_g \approx a_N N^{1/3}$
• So the sizes of proteins follow polymer laws – surprising!
Protein Collapse: Rg follows Flory law
“Unfolded”: $R_g^U = 2N^{0.6}$ (Kohn PNAS (05))
Folded: $R_g = 3N^{1/3}$ (Dima & dt JPCB (04))
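The two scaling laws for unfolded and folded proteins are easy to compare for a given chain length. A minimal sketch follows; the prefactors 2 and 3 and the exponents are from the slide, while the length unit is assumed here to be Å, which the slide does not state, and the function names are hypothetical.

```python
def rg_unfolded(n):
    """Unfolded-state radius of gyration, Rg = 2 N^0.6 (length unit assumed to be angstrom)."""
    return 2.0 * n**0.6

def rg_folded(n):
    """Folded-state radius of gyration, Rg = 3 N^(1/3) (same assumed unit)."""
    return 3.0 * n ** (1.0 / 3.0)

def expansion_ratio(n):
    """Ratio of unfolded to folded size; grows as N^(0.6 - 1/3)."""
    return rg_unfolded(n) / rg_folded(n)

if __name__ == "__main__":
    for n in (50, 100, 200, 400):
        print(n, round(rg_folded(n), 1), round(rg_unfolded(n), 1), round(expansion_ratio(n), 2))
```

The ratio grows only as N to the power 0.6 - 1/3, so for single-domain proteins the unfolded state is larger than the folded state by a modest factor rather than by orders of magnitude.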
RNA Folding: Tetrahymena ribozyme
RNA – Branched polymer
Tetrahymena ribozyme(...difficult)
Ion valence, size, shape
Rg scaling works for RNA too, including the ribosome!
Size Dependence of RNA
$R_g \approx 5.5 N^{1/3}$
Fairly decent (due to Hyeon)
Exponent may be larger: analogy to branched polymers (Ben Shaul, Gelbart, Knobler)
Illustrating Key ideas using Lattice models
Seems like an Absurd Idea!
Role of non-native
attractions
Multiple Folding Nuclei
Fast and slow tracks
K. A. Dill Protein Science (1995)
Even simpler
Folded lower in energy by
one unit
Blues Like Each other.
They gain one unit of energy
Multiple paths!
Toy model:
Explains protein
folding
A simple minded approach
4 types of monomers (H, P, +, -)
Monomer has 8 beads
# of sequences = $4^8$ (amylome)
# of conformations on cubic lattice = 1,841
http://dillgroup.org/#/code
HPSandbox
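Exact enumeration is what makes such lattice models useful: every conformation of a short chain can be listed and its energy computed. The sketch below is a deliberately simplified illustration and is not the HPSandbox code or the four-letter cubic-lattice model of the slide: it assumes a two-letter (H, P) chain on the 2D square lattice, with one unit of energy gained per non-bonded H-H contact. All function names are hypothetical.

```python
NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def walks(n_beads):
    """Yield every self-avoiding conformation of n_beads >= 2 on the square lattice.

    The first bond is fixed along +x, which removes the four lattice rotations
    (mirror images are still counted separately).
    """
    path = [(0, 0), (1, 0)]
    occupied = set(path)

    def extend():
        if len(path) == n_beads:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in NEIGHBORS:
            site = (x + dx, y + dy)
            if site not in occupied:
                path.append(site)
                occupied.add(site)
                yield from extend()
                path.pop()
                occupied.remove(site)

    yield from extend()

def energy(seq, conf):
    """Energy = -1 for each pair of H beads on adjacent sites that are not bonded."""
    index = {site: i for i, site in enumerate(conf)}
    e = 0
    for i, (x, y) in enumerate(conf):
        if seq[i] != "H":
            continue
        # Look only in +x and +y so that each adjacent pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                e -= 1
    return e

def spectrum(seq):
    """Map each energy level to the number of conformations that have it."""
    counts = {}
    for conf in walks(len(seq)):
        e = energy(seq, conf)
        counts[e] = counts.get(e, 0) + 1
    return counts

if __name__ == "__main__":
    print(4**8)                  # number of 8-bead sequences with 4 monomer types
    print(spectrum("HPPH"))      # tiny example that can be checked by hand
    print(spectrum("HPPHPPHH"))  # an arbitrary 8-bead HP sequence
```

For the four-bead chain HPPH there are nine conformations with the first bond fixed, and only the two U-shaped ones bring the terminal H beads into contact, which is the kind of hand-checkable result that makes toy models instructive.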
Order parameter description
Macroscopic System
Ferromagnetism: M
Nematic Phases: $S = P_2(\cos\theta)$
Smectic Phases: S, tilt angle
Spin Glasses: M; $q_{EA}$
Paramagnet: M = 0; $q_{EA} = 0$
Spin Glass: M = 0; $q_{EA} \neq 0$
Ferromagnet: $M \neq 0$; $q_{EA} \neq 0$
Physics dictates OP
Proteins: a lot of choices
OP is in the eye of the beholder
$\rho = N/R_g^3$; $\chi$ (overlap)
“Unfolded”: ($\rho$ small, $\chi$ big)
Compact non-native: ($\rho \approx O(1)$, $\chi$ big)
Native: ($\rho \approx O(1)$, $\chi$ small)
Other Choices
Helix/sheet content; distribution of contacts; …
Folding reaction as a phase transition: A rationale
N = number of amino acids
Order Parameter Description
$\rho = N/R_g^3$; $\chi$ = overlap with NBA (0 for NBA)
Unfolded (U), Collapsed Globules (CG); Folded (NBA)
U: $\rho$ small, $\chi$ large (“vapor”)
CG: $\rho \approx O(1)$, $\chi$ large (dense, no order: “liquid”)
NBA: $\rho \approx O(1)$, $\chi$ small (dense, ordered: “solid”)
Developing a “nucleation” picture
Free Energy of Creating a Droplet
$\Delta G(R) \approx -f R^3 + \sigma R^2$
(driving force + opposing force)
What are these forces in proteins?
Driving force: hydrophobic collapse; burying H bonds
Opposing: “droplet with nonconstant $\sigma$”; entropy loss due to looping
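The competition between a bulk term that scales as R^3 and a surface term that scales as R^2 produces a critical droplet size and a barrier. A minimal sketch follows, assuming constant coefficients; the names f (bulk driving term) and sigma (surface term) are placeholders, geometric factors are absorbed into them, and the numbers are purely illustrative.

```python
def droplet_free_energy(r, f, sigma):
    """Free energy of a droplet of radius r: bulk gain -f r^3 plus surface cost sigma r^2."""
    return -f * r**3 + sigma * r**2

def critical_radius(f, sigma):
    """Radius at the barrier top: d(dG)/dr = -3 f r^2 + 2 sigma r = 0, so r* = 2 sigma / (3 f)."""
    return 2.0 * sigma / (3.0 * f)

def barrier_height(f, sigma):
    """Free energy at r*: dG(r*) = 4 sigma^3 / (27 f^2)."""
    return 4.0 * sigma**3 / (27.0 * f**2)

if __name__ == "__main__":
    f, sigma = 1.0, 1.5  # illustrative values only
    r_star = critical_radius(f, sigma)
    print(r_star, barrier_height(f, sigma), droplet_free_energy(r_star, f, sigma))
```

Droplets smaller than r* tend to dissolve and larger ones grow; a smaller surface term or a larger driving term lowers the barrier, and a surface term that is not constant changes this simple picture.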
Tentative Models + Slight refinement
Cost of creating a region with $N_R$ ordered residues out of N?
Rugged Landscape with Many possibilities
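Some phenomenological Models
$\Delta G_{BW}(N_R) \approx -f(T) N_R + a^2 \sigma N_R^{2/3}$
$N_R^* \approx (8a^2\sigma/3f(T))^3$
$N_R^*$ too large for typical $\sigma$ and f(T) values
$\Delta G_{GT}(N_R) \approx$ [?]$_h$([?]$_h$ - 1)$N_R^2 + a^2 \sigma N_R^{2/3}$
$N_R^* \approx (8a^2\sigma/$[?]$_h)^{3/4}$
$N_R^* \approx 15$ or so…
Using experimental parameters
$N_R^* \approx 27$ or so..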
Folding trajectories to MFN to transition state ensemble (TSE)
Structures near barrier top or TSE
Simulations
Moving from one scenario to another – pressure jump…
Refinement (Hiding Ignorance)
$\Delta G(N_R) \approx$ -[?]$_1 N_R$ + [?]$N_R$ + [?]S (loop)
[?] [?] small barrier (downhill folding)
Surface tension cannot be a constant
Multiple Folding Nuclei (Structural
Plasticity)
Multi-domain proteins involve interfaces between
globular parts..
Finite Size Effects on Folding
Order parameters matter
Scaling of $\Omega_c$ with N (number of aa)
Two points:
1) $T_F$ = max in $\Delta\chi$ (susceptibility)
$\Delta\chi = T\,(d\langle\chi\rangle/dh)$; h = ordering field (analogy to magnetic system)
$\chi$ is dimensionless $\Rightarrow$ h ~ T (in proteins, or [C])
2) Efficient folding: $T_F \approx T_\theta$ (collapse temperature; Camacho & dt PNAS (1993)). $\Omega_c$ controlled by protein DSE at $T \approx T_F \approx T_\theta$
$R_g \sim (\Delta T/T_F)^{-\nu} \sim N^{\nu}$ (DSE a SAW & magnet analogy)
$\Delta T/T_F \sim 1/N$ (Result I)
Finite-size effects on $T_F$
Lattice models
Side Chains
$\Delta T/T_F \sim 1/N$
Experiments
Li, Klimov & DT Phys. Rev. Lett. (04)
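Scaling of $\Omega_c$ with N
Magnet-Polymer analogy
$\Omega_c = (T_F/\Delta T)\,[T_F\,(d\langle\chi\rangle/dT)]$
“dispersion in $T_F$” × “susceptibility”
$\Omega_c \approx N^{\zeta}$; $\zeta = 1 + \gamma$ (universal); $\gamma \approx 1.2$
Result II
$T \approx T_F \approx T_\theta$
$\Delta\chi \approx N^{\gamma}$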
Universality in Cooperativity
Li, Klimov, dt PRL (04)
$\Omega_c \sim N^{\zeta}$
Experiments
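A power-law relation between cooperativity and chain length is usually tested by a straight-line fit on a log-log plot. The sketch below shows such a fit using only the standard library. The data are synthetic, generated from an exact power law with exponent 1 + 1.2 purely to check the fitting routine; they are not experimental values, and the function name is hypothetical.

```python
import math

def fit_power_law(ns, values):
    """Least-squares fit of log(value) = log(prefactor) + exponent * log(N).

    Returns (exponent, prefactor).
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(v) for v in values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor

if __name__ == "__main__":
    gamma = 1.2                 # value quoted on the slide
    zeta = 1.0 + gamma
    ns = [20, 50, 100, 200]     # synthetic chain lengths
    omega = [n**zeta for n in ns]            # synthetic, noise-free cooperativity
    spread = [1.0 / n for n in ns]           # synthetic transition width / T_F
    print(fit_power_law(ns, omega))
    print(fit_power_law(ns, spread))
```

With real measurements the scatter around the fitted line, not just the exponent, indicates how well the universal scaling holds.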
Residue-dependent melting $T_m$: Holtzer Effect
Consequences of finite size
$f_m(T_{m,i}) = 0.5$
Lattice Models
Side Chains
Klimov & dt J. Comp. Chemistry (2002)
Is the melting temperature unique?
Finite-size effects!
$\Delta T$ large
Holtzer: Leucine Zipper, Biophys J 1997
Udgaonkar: Barstar, Monellin
Munoz: BBL, Nature 2006
Klimov & dt: β-hairpin, PNAS 2000
Residue dependent ordering Protein L
O’Brien, Brooks & dt Biochemistry (2009)
Spread decreases as N increases: finite-size effects
Summary So Far – Really with little work on a
complex problem
• Sizes of single domain proteins (folded and
unfolded) roughly follow Flory’s expectation
• Same holds good for RNA folded structures
• Nucleation Picture of Folding
• Finite size effects – theory matches
experiments
Part II: Protein Folding Kinetics
Organization of structure
Fluctuations due to finite-size
effects
[C]
Or
T
Changes in distributions at
various stages of folding
A Few Questions
• Mechanisms of Structural organization
• Nature of the Folding Nuclei
• Interactions that guide folding (native vs nonnative)
• Folding rates – dependence on N
Stages in folding
Camacho and dt, PNAS (1993); dt J. de Physique (1995)
[Figure: Random Coil, “Specific Collapse”, Native State; time scales $\tau_C$, $\tau_F$]
$\tau_F/\tau_C \approx (100 - 1000)$
Need for Quantitative Models
Fernandez, Rief..
Hyeon, Morrison, dt
Using mechanical
force to trigger
folding
smFRET trajectories
Eaton, Schuler, Haran…
Non-native interactions early (time scales of collapse) in folding; subsequently native interactions dominate
Camacho & dt, Proteins 22, 27-40 (1995); Cardenas-Elber (all atom simulations)
Dill type HP model: beads on a lattice
Native-centric (or Go) models appropriate!
Multiple protein folding nuclei and the transition state ensemble in two‐state proteins
Klimov and dt (2001)
MC simulations; 600 folding trajectories
Folding time: $\tau_A/\tau_A^{GO} \approx 3$
LMSC exact enumeration
Proteins: Structure, Function, and Bioinformatics
Volume 43, Issue 4, pages 465-475, 17 APR 2001 DOI: 10.1002/prot.1058
http://onlinelibrary.wiley.com/doi/10.1002/prot.1058/full#fig5
Transition State Ensemble: Neural Net
Klimov and dt, Proteins 2001
Go
ES NSB 2000
Equivalent to pfold
Multiple channels carry flux to the NBA
Multiple transition states connecting these channels
http://onlinelibrary.wiley.com/doi/10.1002/prot.1058/full#fig9
Bottom line: To get semi-quantitative results, Go-type models may be enough…
Folding Rate versus N
$k_F \approx k_0 \exp(-N^{\beta})$ with $\beta = 0.5$
Barriers scale sublinearly with N
Proteins: Hydrophobic residues buried in the interior (chain compact); polar and charged residues want solvent exposure (extended states). Frustration between conflicting requirements.
$P(\Delta G^{\ddagger}) \approx \exp(-(\Delta G^{\ddagger})^2/2N)$
$\langle \Delta G^{\ddagger} \rangle \approx N^{0.5}$ (analogy to glasses)
Fit to Experiments (80 Proteins Dill, PNAS 2012)
Reasonable given data from so many different laboratories
Even better for RNA (Hyeon, 2012)
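The sublinear barrier scaling implies a specific, testable dependence of the folding rate on chain length. A minimal sketch follows, using the exponent 0.5 from the slide; the prefactor k0 is left at an arbitrary value of one because only rate ratios are computed, and the function names are hypothetical.

```python
import math

def folding_rate(n, k0=1.0, beta=0.5):
    """Folding rate k_F = k0 * exp(-N^beta); k0 = 1 is an arbitrary placeholder."""
    return k0 * math.exp(-(n**beta))

def rate_ratio(n_small, n_large, beta=0.5):
    """How much faster a chain of n_small residues folds than one of n_large (k0 cancels)."""
    return folding_rate(n_small, beta=beta) / folding_rate(n_large, beta=beta)

if __name__ == "__main__":
    for n_small, n_large in ((25, 100), (100, 400)):
        print(n_small, n_large, rate_ratio(n_small, n_large))
```

Because the barrier grows as the square root of N, quadrupling the chain length from 25 to 100 residues adds 5 to the exponent, while going from 100 to 400 adds 10, far less than a barrier linear in N would give.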
At high [C] is DSE a Flory Coil?
It appears that high [C] is a Θ-solvent!
Protein collapse (O’Brien PNAS 2008)
$\sigma_{CT} = (C_\theta - C_m)/C_\theta$
$\delta = 2 + (\gamma - 1)/\nu$
$P(x) \sim x^{\delta} \exp(-x^{1/(1-\nu)})$
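The shape of the end-to-end distance distribution depends on solvent quality through the two exponents. The sketch below evaluates them for an ideal (Θ-solvent) chain and for a self-avoiding chain. The numerical exponent values used for the self-avoiding case are commonly quoted approximations and are assumptions here, not numbers from the slide; the function names are hypothetical.

```python
import math

def distribution_exponents(gamma, nu):
    """Return (delta, t) for P(x) ~ x^delta * exp(-x^t), with
    delta = 2 + (gamma - 1)/nu and t = 1/(1 - nu)."""
    delta = 2.0 + (gamma - 1.0) / nu
    t = 1.0 / (1.0 - nu)
    return delta, t

def p_unnormalized(x, gamma, nu):
    """Unnormalized distribution of the scaled end-to-end distance x."""
    delta, t = distribution_exponents(gamma, nu)
    return x**delta * math.exp(-(x**t))

def most_probable_x(gamma, nu):
    """Peak of P(x): d/dx [delta*ln(x) - x^t] = 0 gives x* = (delta/t)^(1/t)."""
    delta, t = distribution_exponents(gamma, nu)
    return (delta / t) ** (1.0 / t)

if __name__ == "__main__":
    # Ideal chain (Theta solvent): gamma = 1, nu = 1/2 gives a Gaussian form x^2 exp(-x^2).
    print(distribution_exponents(1.0, 0.5), most_probable_x(1.0, 0.5))
    # Self-avoiding chain (good solvent): approximate, assumed exponent values.
    print(distribution_exponents(1.16, 0.588), most_probable_x(1.16, 0.588))
```

Comparing a measured or simulated distribution against these two limiting forms is one way to decide whether the denatured state at high denaturant behaves as a Flory coil or as a Θ-solvent chain.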
Toy Model (Is the fibril structure encoded in monomer spectrum?) Prot Sci 2002; JCP 2008
4 types of monomers (H, P, +, -)
Monomer has 8 beads
# of sequences = $4^8$ (amylome)
# of conformations on cubic lattice = 1,841
Structure of “protofilament” + “fibril”
Single and double layer
Interplay of $E_{+-}$ and $E_{HH}$
a: Monomers parallel
b: Monomer alternate
c: Double layer
d: No fibril compact
Optimal growth temp
$\tau_{fib} = (10^4 - 10^n)\,\tau_F$
Largest n about 9
Seeding speeds up the rate of fibril formation
Growth rate depends on N* population $P_{N^*}$
Depends on sequence
Sequence + N* ensemble
fibril kinetics → monomer landscape encodes structure + growth rate
Lifshitz-Slyozov Growth Law
Supersaturated solution
J. Phys. Chem. Solids (1961)
$\tau_G \approx \tau_0 M^{1/3}$
Large clusters incorporate small oligomers
$M$ [?] $M_{n^*}$ [PF [?] Fibrils]