Transcript Lecture 9

EXPERIMENTAL AND THEORETICAL METHODS TO STUDY PROTEIN FOLDING
Experiments
• Thermal denaturation
• Chemical denaturation
• Mechanical unfolding
• Kinetic experiments
• Mutational studies
Techniques
• Differential scanning calorimetry (DSC)
• Spectroscopy
– Circular dichroism (CD)
– Fluorescence
– Nuclear magnetic resonance (NMR)
– Small angle X-ray (SAXS) and small angle neutron scattering (SANS)
• Atomic force microscopy (AFM)
Wild type
Acid-denatured wild type
L16A mutant
C-terminal peptide
Religa et al., J. Mol. Biol., 333, 977-991 (2003)
Φ-values
Φ ≈ 0: Mutation affects the folded state but not the transition state
Φ ≈ 1: Mutation affects both the folded state and the transition state
Matouschek A, Kellis JT, Serrano L, Fersht AR. (1989). Mapping the transition state and pathway of protein folding by protein engineering. Nature 340:122
Millet et al., Biochemistry 41, 321-325 (2002)
Structure of closed and open form of
the DnaK (Hsp70) chaperone
Fluorescence studies of closing and opening of Hsp70
Mapa et al., Molecular Cell 38, 89, 2010.
Theoretical studies of protein structure
and protein folding
• Need to express energy of a system as
function of coordinates
• Need an algorithm to explore the
conformational space
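Energy expression in empirical force fields
$E = E_s + E_b + E_{nb} + E_{el} + E_{tor}$
$E_s = \sum_i \frac{1}{2} k_i^d \left(d_i - d_i^0\right)^2$
$E_b = \sum_i \frac{1}{2} k_i^\theta \left(\theta_i - \theta_i^0\right)^2$
$E_{nb} = \sum_i \sum_{j<i} \varepsilon_{ij} \left[\left(\frac{r_{ij}^0}{r_{ij}}\right)^{12} - 2\left(\frac{r_{ij}^0}{r_{ij}}\right)^{6}\right]$
$E_{el} = 332 \sum_i \sum_{j<i} \frac{q_i q_j}{r_{ij}}$
$E_{tor} = \sum_i \left[\frac{V_i^{(1)}}{2}\left(1 + \cos\varphi_i\right) + \frac{V_i^{(2)}}{2}\left(1 - \cos 2\varphi_i\right) + \frac{V_i^{(3)}}{2}\left(1 + \cos 3\varphi_i\right)\right]$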
Partition of the energy of interactions with respect to topological distance
• Torsional interactions: E_tor
• 1,3-interactions: E_b only
• 1,5-interactions: E_el + E_VdW
Bond distortion energy
$E_s(d) = \frac{1}{2} k^d \left(d - d^0\right)^2$
[Figure: E_s(d) plotted against d, with the minimum at d = d0]
Typical values of d0 and kd
Bond           d0 [Å]    kd [kcal/(mol Å²)]
Csp3-Csp3      1.523     317
Csp3-Csp2      1.497     317
Csp2=Csp2      1.337     690
Csp2=O         1.208     777
Csp2-Nsp3      1.438     367
C-N (amide)    1.345     719
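As a quick worked example of the harmonic bond term, the sketch below evaluates E_s for a Csp3-Csp3 bond stretched by 0.1 Å, using the d0 and kd values listed above. The function name `bond_stretch_energy` is introduced here only for illustration.

```python
def bond_stretch_energy(d, d0, kd):
    """Harmonic bond-distortion energy E_s(d) = 1/2 * kd * (d - d0)**2 [kcal/mol]."""
    return 0.5 * kd * (d - d0) ** 2


# Csp3-Csp3 parameters from the table: d0 = 1.523 A, kd = 317 kcal/(mol A^2)
energy = bond_stretch_energy(d=1.623, d0=1.523, kd=317.0)
print(f"E_s = {energy:.3f} kcal/mol")
```

A 0.1 Å stretch already costs more than 1 kcal/mol, which is why bond lengths stay close to d0 in simulations at room temperature.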
Comparison of the actual bond-energy curve with that of
the harmonic approximation
Potentials that take into account the asymmetry of the bond-energy curve
Anharmonic potential: $E_s(d) = \frac{1}{2} k^d \left(d - d^0\right)^2 + \frac{1}{6} k'^d \left(d - d^0\right)^3$
Morse potential: $E_s(d) = D_e \left[1 - e^{-b\left(d - d_e\right)}\right]^2$
[Figure: harmonic, anharmonic, and Morse potentials; E [kcal/mol] plotted against d [Å]]
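The asymmetry can be checked numerically. The sketch below compares the harmonic and Morse curves for a Csp3-Csp3 bond. It assumes the Morse form D_e[1 - exp(-b(d - d_e))]^2 with its zero at the minimum; the well depth `De = 88.0` kcal/mol is an illustrative value, not taken from the lecture, and b is chosen so that both curves have the same curvature at the minimum (kd = 2 De b²).

```python
import math


def harmonic(d, d0, kd):
    """Harmonic bond energy, symmetric about d0."""
    return 0.5 * kd * (d - d0) ** 2


def morse(d, de, De, b):
    """Morse bond energy, zero at d = de and approaching De at large d."""
    return De * (1.0 - math.exp(-b * (d - de))) ** 2


kd, d0 = 317.0, 1.523            # Csp3-Csp3 values from the bond-parameter table
De = 88.0                        # assumed well depth [kcal/mol], illustration only
b = math.sqrt(kd / (2.0 * De))   # matches the Morse curvature at the minimum to kd

for d in (1.423, 1.523, 1.623, 2.5):
    print(f"d = {d:.3f} A  harmonic = {harmonic(d, d0, kd):8.2f}  "
          f"Morse = {morse(d, d0, De, b):8.2f}")
```

Compression is penalized more strongly by the Morse curve than by the parabola, and stretching less strongly, which is the asymmetry the harmonic approximation misses.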
Energy of bond-angle distortion
$E_b(\theta) = \frac{1}{2} k^\theta \left(\theta - \theta^0\right)^2$
[Figure: E_b(θ) plotted against θ, with the minimum at θ = θ0]
Typical values of θ0 and kθ
Angle              θ0 [degrees]   kθ [kcal/(mol degree²)]
Csp3-Csp3-Csp3     109.47         0.0099
Csp3-Csp3-H        109.47         0.0079
H-Csp3-H           109.47         0.0070
Csp3-Csp2-Csp3     117.2          0.0099
Csp3-Csp2=Csp2     121.4          0.0121
Csp3-Csp2=O        122.5          0.0101
Basic types of torsional potentials
• Single bond between sp3 carbons or between sp3 carbon and nitrogen. Example: C-C-C-C quadruplet, $E_{tor}(\varphi) = 1.6\left(1 + \cos 3\varphi\right)$
• Double or partially double bonds. Example: C-C=C-C quadruplet, $E_{tor}(\varphi) = 30\left(1 - \cos 2\varphi\right)$
• Single bond between electronegative atoms (oxygens, sulfurs, etc.). Example: C-S-S-C quadruplet, $E_{tor}(\varphi) = 3.5\left(1 + \cos 2\varphi\right) + 0.6\left(1 + \cos\varphi\right)$
[Figure: E_tor [kcal/mol], from 0 to 60, plotted against the dihedral angle [deg] for the three examples]
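To see where each type of torsional potential places its minima, the sketch below scans the three example potentials on a 1-degree grid. The signs inside the parentheses are an assumption made here (1 + cos 3φ for the single bond, 1 - cos 2φ for the double bond, 1 + cos 2φ and 1 + cos φ for the disulfide), chosen so that the minima fall at staggered, planar, and roughly perpendicular geometries, respectively. The function names are introduced only for this illustration.

```python
import math


def etor_single(phi):
    """C-C-C-C: threefold barrier, kcal/mol (assumed sign: 1 + cos 3*phi)."""
    return 1.6 * (1.0 + math.cos(3.0 * phi))


def etor_double(phi):
    """C-C=C-C: twofold barrier, kcal/mol (assumed sign: 1 - cos 2*phi)."""
    return 30.0 * (1.0 - math.cos(2.0 * phi))


def etor_disulfide(phi):
    """C-S-S-C: twofold plus onefold term, kcal/mol (assumed signs: both +)."""
    return 3.5 * (1.0 + math.cos(2.0 * phi)) + 0.6 * (1.0 + math.cos(phi))


def minima_deg(func):
    """Return the grid angles in [-180, 180) degrees at which func is lowest."""
    angles = range(-180, 180)
    values = [func(math.radians(a)) for a in angles]
    lowest = min(values)
    return [a for a, v in zip(angles, values) if v - lowest < 1e-9]


for name, func in (("C-C-C-C", etor_single),
                   ("C-C=C-C", etor_double),
                   ("C-S-S-C", etor_disulfide)):
    print(name, minima_deg(func))
```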
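Potentials imposed on improper torsional angles
$E_{tor} = V_2\left(1 - \cos 2\omega\right)$
$E_{tor} = V_3\left(1 + \cos 3\omega\right)$
[Figure: definition of the improper torsional angle ω for atoms A, B, X, X]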
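Nonbonded Lennard-Jones (6-12) potential
$E_{nb}(r) = \varepsilon\left[\left(\frac{r^0}{r}\right)^{12} - 2\left(\frac{r^0}{r}\right)^{6}\right] = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]$
$r^0 = 2^{1/6}\sigma$
$r_{ij}^0 = r_i^0 + r_j^0$
$\varepsilon_{ij} = \sqrt{\varepsilon_i \varepsilon_j}$
[Figure: E_nb [kcal/mol] plotted against r [Å]; the curve crosses zero at r = σ and reaches its minimum, -ε, at r = r0]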
Sample values of εi and r0i
Atom type          r0 [Å]   ε [kcal/mol]
C(carbonyl)        1.85     0.12
C(sp3)             1.80     0.06
N(sp3)             1.85     0.12
O(carbonyl)        1.60     0.20
H(bonded with C)   1.00     0.02
S                  2.00     0.20
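The combination rules and the two equivalent forms of the Lennard-Jones potential can be checked with a short sketch. It uses three of the sample atom types listed above and assumes that r0 is in Å and ε in kcal/mol; the dictionary and function names are introduced here for illustration only.

```python
import math

# (r0, eps) per atom type, taken from the sample values; units assumed A and kcal/mol
ATOM_TYPES = {
    "C(carbonyl)": (1.85, 0.12),
    "C(sp3)": (1.80, 0.06),
    "O(carbonyl)": (1.60, 0.20),
}


def pair_parameters(type_i, type_j):
    """Combine per-atom values: r0_ij = r0_i + r0_j, eps_ij = sqrt(eps_i * eps_j)."""
    r0_i, eps_i = ATOM_TYPES[type_i]
    r0_j, eps_j = ATOM_TYPES[type_j]
    return r0_i + r0_j, math.sqrt(eps_i * eps_j)


def lennard_jones(r, r0, eps):
    """6-12 potential in the (r0, eps) form: eps * [(r0/r)^12 - 2 (r0/r)^6]."""
    x = (r0 / r) ** 6
    return eps * (x * x - 2.0 * x)


r0, eps = pair_parameters("C(sp3)", "O(carbonyl)")
sigma = r0 / 2.0 ** (1.0 / 6.0)   # zero crossing of the potential
print(f"r0 = {r0:.2f} A, eps = {eps:.4f} kcal/mol, sigma = {sigma:.3f} A")
print(f"E(r0) = {lennard_jones(r0, r0, eps):.4f}, E(sigma) = {lennard_jones(sigma, r0, eps):.2e}")
```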
Other nonbonded potentials
Buckingham potential: $E_{nb}(r) = A \exp\left(-\frac{r}{\rho}\right) - \frac{C}{r^{6}}$
$E_{hb}(r) = \frac{C}{r^{12}} - \frac{D}{r^{10}}$
10-12 potential used in some force fields (e.g., ECEPP) for proton…proton-acceptor pairs
Sources of parameters
Energy contribution                Source of parameters
Bond and bond angle distortion     Crystal and neutron-diffraction data, IR spectroscopy
Torsional                          NMR and FTIR spectroscopy
Nonbonded interactions             Polarizabilities, crystal and neutron-diffraction data
Electrostatic energy               Molecular electrostatic potentials
All                                Energy surfaces of model systems calculated with molecular quantum mechanics
Solvent in simulations
• Explicit water
– TIP3P
– TIP4P
– TIP5P
– SPC
• Implicit water
– Solvent accessible surface area (SASA) models
– Molecular surface area models
– Poisson-Boltzmann approach
– Generalized Born surface area (GBSA) model
– Polarizable continuum model (PCM)
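TIP3P model: charges of -0.834 e on O and 0.417 e on each H; σ_O = 3.1507 Å, ε_O = 0.1521 kcal/mol
TIP4P model: charges of 0.00 e on O, 0.520 e on each H, and -1.040 e on the additional site M located 0.15 Å from O; σ_O = 3.1535 Å, ε_O = 0.1550 kcal/mol
H-O-H angle: 104.52°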
Solvent accessible surface area (SASA) models
$F_{solv} = \sum_{i}^{atoms} \sigma_i A_i$
σ_i – free energy of solvation of atom i per unit area
A_i – solvent accessible surface of atom i
Vila et al., Proteins: Structure, Function, and Genetics, 1991, 10, 199-218.
Comparison of the lowest-energy conformations of [Met5]enkephalin (H-Tyr-Gly-Gly-Phe-Met-OH) obtained with the ECEPP/3 force field in vacuo and with the SRFOPT model
[Panels: vacuum; SRFOPT]
Comparison of the molecular surfaces of the lowest-energy conformation of [Met5]enkephalin obtained without and with the SRFOPT model
[Panels: vacuum; SRFOPT]
Molecular surface area model
$F_{cav} = \gamma A$
γ – surface tension
A – molecular surface area
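Generalized Born surface area (GBSA) model
$F_{solv} = F_{cav} + E_{pol}^{GB}$
$E_{pol}^{GB} = -\frac{1}{2}\left(\frac{1}{\varepsilon_{in}} - \frac{1}{\varepsilon_{out}}\right)\sum_{i}\sum_{j}\frac{332\, q_i q_j}{f_{GB}(r_{ij})}$
$f_{GB}(r_{ij}) = \sqrt{r_{ij}^2 + R_i R_j \exp\left(-\frac{r_{ij}^2}{4 R_i R_j}\right)}$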
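The function f_GB interpolates between two limits, which a short sketch can confirm: for a single atom (r = 0, R_i = R_j) it returns the Born radius, and at large separation it tends to the interatomic distance, so the pair term becomes Coulomb-like. The function name `f_gb` and the test radii are introduced here for illustration.

```python
import math


def f_gb(r_ij, R_i, R_j):
    """Generalized Born effective distance sqrt(r^2 + Ri*Rj*exp(-r^2 / (4*Ri*Rj)))."""
    return math.sqrt(r_ij ** 2 + R_i * R_j * math.exp(-r_ij ** 2 / (4.0 * R_i * R_j)))


# Illustrative Born radii of 2 A for both atoms
for r in (0.0, 1.0, 3.0, 10.0, 50.0):
    print(f"r = {r:5.1f} A   f_GB = {f_gb(r, 2.0, 2.0):8.4f} A")
```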
All-atom
representation of
polypeptide chains
Coarse-grained
representation of
polypeptide chains
Coarse-grained force fields
Physics-based potentials (statistical-mechanical formulation)
$F(\mathbf{X}) = U(\mathbf{X}) = -RT \ln\left\{\frac{1}{V_Y}\int_{\Omega_Y} \exp\left[-E(\mathbf{X},\mathbf{Y})/RT\right] dV_Y\right\}$
$V_Y = \int_{\Omega_Y} dV_Y$
E(X,Y): all-atom energy function
X: primary variables present in the model
Y: secondary variables not present in the model (solvent, side-chain dihedral angles, etc.)
Statistical potentials
$W(x; c; s) = -RT \ln \frac{N_{obs}(x; c; s)}{N_{ref}(x; c; s)}$
x – geometric variables
c – residue types
s – sequence context
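A minimal sketch of how such a potential is obtained from histograms: each distance bin gets W = -RT ln(N_obs / N_ref). The counts below are made-up numbers for illustration, not data from the lecture, and the function name `statistical_potential` is hypothetical. Bins with zero counts would need separate handling, which this sketch leaves out.

```python
import math

R_KCAL = 0.0019872  # gas constant [kcal/(mol K)]


def statistical_potential(n_obs, n_ref, temperature=298.0):
    """Return W = -RT ln(N_obs / N_ref) for each bin; all counts must be positive."""
    rt = R_KCAL * temperature
    return [-rt * math.log(obs / ref) for obs, ref in zip(n_obs, n_ref)]


# Hypothetical pair counts in four distance bins
n_obs = [20, 180, 100, 90]
n_ref = [60, 100, 100, 100]
for w in statistical_potential(n_obs, n_ref):
    print(f"{w:+.3f} kcal/mol")
```

Bins observed more often than in the reference get a negative (favorable) energy; bins observed less often get a positive one.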
Leu-Leu pair
A – radial correlation function
B – reference distribution function
C-
Searching the conformational space
Low (lowest)-energy conformations:
• Local energy minimization
• Monte Carlo with minimization (MCM)
• Basin hopping
• Diffusion equation method (DEM)
• Smoothing energy surface
• Genetic algorithms
• Simulated annealing
Canonical conformational ensembles:
• Canonical MC
• Canonical MD
• Replica-exchange MC (REMC)
• Replica-exchange MD (REMD)
[Diagram: both groups of methods are built on Monte Carlo and molecular dynamics]
Local vs. global minimization
[Figure: f(x) plotted against x, with the start point, a local minimum, and the global minimum marked]
General scheme of local minimization of multivariate functions:
1. Choose the initial approximation x(0).
2. In the p-th iteration, compute the search direction d(p).
3. Locate x(p+1) as the minimum along the search direction (minimization of a function of one variable).
4. Terminate when convergence has been achieved or the maximum number of iterations has been exceeded.
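The four steps can be sketched as follows. This is only one possible instance of the scheme: it assumes the steepest-descent direction d(p) = -grad f(x(p)) and a golden-section search on a fixed interval for the one-dimensional minimization, neither of which is prescribed by the scheme itself. The test function and all names are introduced here for illustration.

```python
import math


def golden_section(g, lo=0.0, hi=1.0, tol=1e-8):
    """Minimize a unimodal one-variable function g on the interval [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if g(c) < g(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return 0.5 * (a + b)


def steepest_descent(f, grad, x0, max_iter=500, gtol=1e-6):
    x = list(x0)                                            # step 1: initial point
    for _ in range(max_iter):                               # step 4: iteration limit
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < gtol:      # step 4: convergence
            break
        d = [-gi for gi in g]                               # step 2: search direction
        alpha = golden_section(                             # step 3: 1D minimization
            lambda a: f([xi + a * di for xi, di in zip(x, d)]))
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x


def f(x):
    return (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)]


x_min = steepest_descent(f, grad, [0.0, 0.0])
print(x_min)
```

For this quadratic test function the exact line minimum always lies inside the search interval [0, 1]; a general function would need a bracketing step first.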
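[Figure: successive approximations x(0), x(1), x(2) approaching the minimum x* along the search directions d(1), d(2) in the (x1, x2) plane; inset: f(x(p) + a d(p)) as a function of a, with its minimum at a*]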
Deformation methods
Lowest-energy structure of gramicidin S computed with the ECEPP force field (M. Dygert, N. Go, H.A. Scheraga, Macromolecules, 8, 750-761 (1975)). This structure turned out to be identical with the NMR structure determined later.
The C-terminal part of HDEA protein found by global minimization of the UNRES coarse-grained effective energy function.
The N-terminal part of HDEA
Liwo et al., PNAS, 96, 5482–5485 (1999)
Comparison of the experimental structure of bacteriocin AS-48 from E. faecalis
with the structure obtained by global minimization of the UNRES force field
(Pillardy et al., Proc. Natl. Acad. Sci. USA., 98, 2329-2333 (2001))
“Potential energy” or “free energy”?
Nature (and a canonical simulation) finds the basin with the lowest free energy at a given temperature. This basin might, but does not have to, contain the conformation with the lowest potential energy.
The global-optimization methods are designed to find structures with the lowest potential energy, thus ignoring conformational entropy. Technically this corresponds to canonical simulations at 0 K.
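A toy two-basin model makes the distinction concrete. The sketch assumes that every conformation within a basin has the same energy, so the basin free energy is F = E - RT ln(n), where n is the number of conformations in the basin; the energies and conformation counts are made-up values for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant [kcal/(mol K)]


def basin_free_energy(energy, n_conformations, temperature):
    """F = E - RT ln(n) for a basin of n equal-energy conformations [kcal/mol]."""
    return energy - R_KCAL * temperature * math.log(n_conformations)


# Hypothetical basins: A is deeper but narrow, B is shallower but broad
basin_a = (-10.0, 1)
basin_b = (-8.0, 1000)

for temperature in (1.0, 100.0, 300.0):
    f_a = basin_free_energy(*basin_a, temperature)
    f_b = basin_free_energy(*basin_b, temperature)
    favored = "A" if f_a < f_b else "B"
    print(f"T = {temperature:5.1f} K  F_A = {f_a:7.2f}  F_B = {f_b:7.2f}  favored: {favored}")
```

Near 0 K the basin with the lowest potential energy wins, which is what global optimization finds; at 300 K the broader basin has the lower free energy.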
Comparison of minimum potential energies obtained in MD runs with the lowest values of the potential energy
PDB ID code (number of residues)   Emin (MD) [kcal/mol]   Eglob [kcal/mol]
1BDD (46)                          -409 (-414)            -597
1GAB (47)                          -461 (-501)            -669
1LQ7 (67)                          -658 (-652)            -937
1CLB (75)                          -740 (-709)            -1053
1E0G (48)                          -405 (-380)            -632
Results of Langevin dynamics simulations are in parentheses.
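Basic scheme of the Metropolis (canonical) Monte Carlo algorithm
1. Start from conformation Xo with energy Eo.
2. Perturb Xo: X1 = Xo + ΔX.
3. Compute the new energy E1.
4. If E1 < Eo, accept unconditionally: Xo = X1, Eo = E1. Return to step 2.
5. Otherwise, sample Y from U(0,1) and compute W = exp[-(E1-Eo)/kT]. If W > Y, accept: Xo = X1, Eo = E1. If not, keep Xo and Eo. Return to step 2.
[Figure: a move that lowers the energy is accepted unconditionally; a move that raises it is accepted with probability exp[-(E1-Eo)/kBT]]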
Calculation of averages
$\langle A \rangle = \frac{1}{N}\sum_{i=1}^{N} A_i$
The index i runs through all MC steps, including those in which new conformations have not been accepted.
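The acceptance test and the averaging rule can be sketched together for a one-dimensional model. The harmonic energy E(x) = x²/2, the uniform trial move, and the reduced units (kT = 1) are assumptions made for this illustration; for that model the exact canonical average of x² equals kT.

```python
import math
import random


def metropolis_average(energy, observable, x0, n_steps, kT=1.0, max_move=1.5, seed=0):
    """Canonical average of observable(x) from a Metropolis Monte Carlo run."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    total = 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-max_move, max_move)    # perturb the conformation
        e_new = energy(x_new)
        # downhill moves are always accepted; uphill moves with probability W
        if e_new < e or math.exp(-(e_new - e) / kT) > rng.random():
            x, e = x_new, e_new
        total += observable(x)    # counted at every step, also after a rejection
    return total / n_steps


mean_x2 = metropolis_average(lambda x: 0.5 * x * x, lambda x: x * x,
                             x0=0.0, n_steps=100000)
print(f"<x^2> = {mean_x2:.3f} (exact value for this model: 1)")
```

Leaving rejected steps out of the sum would bias the average toward high-energy conformations, because the old conformation must be counted again each time a move away from it is refused.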
Conformational space representation
in Monte Carlo methods
• Lattice representation; the centers of
interactions are on lattice nodes.
• Continuous; the centers are located in 3D
space.
An example of a lattice Monte Carlo trajectory
Sample MC trajectory of a good folder; Model 1a
A pathway of thermal unfolding of protein G simulated with the
CABS model and lattice Monte Carlo dynamics
Kmiecik and Koliński, Biophys. J., 94, 726-736 (2008)
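Molecular dynamics
$\frac{d^2\mathbf{r}_i}{dt^2} = \frac{d\mathbf{v}_i}{dt} = \mathbf{a}_i(t) = \frac{\mathbf{F}_i(\mathbf{r}(t))}{m_i} = -\frac{1}{m_i}\nabla_{\mathbf{r}_i} V(\mathbf{r}(t)), \quad i = 1, 2, \ldots, n$
$m\frac{d^2 x_i}{dt^2} = -\frac{\partial V}{\partial x_i}$
$\frac{d\mathbf{r}_i}{dt} = \mathbf{v}_i(t), \quad i = 1, 2, \ldots, n$
$\mathbf{r}(t_0) = \mathbf{r}_0, \quad \mathbf{v}(t_0) = \mathbf{v}_0$
$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\Delta t + \frac{1}{2}\mathbf{a}(t)\Delta t^2 + \ldots$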
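The Verlet algorithm:
$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\Delta t + \frac{1}{2}\mathbf{a}(t)\Delta t^2$
$\mathbf{r}(t - \Delta t) = \mathbf{r}(t) - \mathbf{v}(t)\Delta t + \frac{1}{2}\mathbf{a}(t)\Delta t^2$
Adding the two expansions:
$\mathbf{r}(t + \Delta t) + \mathbf{r}(t - \Delta t) = 2\mathbf{r}(t) + \mathbf{a}(t)\Delta t^2$
$\mathbf{r}(t + \Delta t) = 2\mathbf{r}(t) - \mathbf{r}(t - \Delta t) + \mathbf{a}(t)\Delta t^2$
$\mathbf{v}(t) = \frac{1}{2\Delta t}\left[\mathbf{r}(t + \Delta t) - \mathbf{r}(t - \Delta t)\right]$
$e(t) = O(\Delta t^4)$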
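The Velocity Verlet algorithm
Step 1:
$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t)\Delta t + \frac{1}{2}\mathbf{a}(t)\Delta t^2$
$\mathbf{v}\left(t + \frac{\Delta t}{2}\right) = \mathbf{v}(t) + \frac{1}{2}\mathbf{a}(t)\Delta t$
Step 2:
$\mathbf{a}_i(t + \Delta t) = -\frac{1}{m_i}\nabla_{\mathbf{r}_i} U(\mathbf{r}(t + \Delta t))$
$\mathbf{v}(t + \Delta t) = \mathbf{v}\left(t + \frac{\Delta t}{2}\right) + \frac{1}{2}\mathbf{a}(t + \Delta t)\Delta t$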
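The two steps translate almost line by line into code. The sketch below integrates a one-dimensional harmonic oscillator with unit mass and unit force constant, a model chosen here purely for illustration, and monitors the total energy; the function name `velocity_verlet` and the time step are likewise illustrative.

```python
def velocity_verlet(x, v, accel, dt, n_steps):
    """Integrate one degree of freedom; returns the list of (x, v) pairs."""
    a = accel(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt    # step 1: new position
        v_half = v + 0.5 * a * dt             # step 1: half-step velocity
        a = accel(x)                          # step 2: acceleration at the new position
        v = v_half + 0.5 * a * dt             # step 2: full-step velocity
        trajectory.append((x, v))
    return trajectory


# Harmonic oscillator with m = 1 and k = 1, so a(x) = -x and E = v^2/2 + x^2/2
trajectory = velocity_verlet(x=1.0, v=0.0, accel=lambda x: -x, dt=0.05, n_steps=2000)
energies = [0.5 * v * v + 0.5 * x * x for x, v in trajectory]
print(f"initial E = {energies[0]:.6f}")
print(f"largest deviation over the run = {max(abs(e - energies[0]) for e in energies):.2e}")
```

The energy error stays bounded over the whole run instead of drifting, which is the practical reason this integrator is preferred for long simulations.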
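The leapfrog algorithm:
$\mathbf{v}\left(t + \frac{\Delta t}{2}\right) = \mathbf{v}\left(t - \frac{\Delta t}{2}\right) + \mathbf{a}(t)\Delta t$
$\mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}\left(t + \frac{\Delta t}{2}\right)\Delta t$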
All three algorithms are symplectic, i.e., the total energy oscillates about a constant value (the “shadow Hamiltonian”) which is close but not equal to the initial energy. Many other higher-order algorithms which are more accurate in a single step (e.g., the Gear algorithm) lack this property.
Symplectic algorithms have also been designed for isokinetic (constant temperature) and isobaric (constant pressure) simulations; an extended Hamiltonian is considered in these cases.
Dependence of the kinetic, potential, and total energy on time for coarse-grained Ac-Ala10-NHMe (Khalili et al., J. Phys. Chem. B, 2005, 109, 13785-13797)
[Figure: kinetic energy, potential energy, and total energy [kcal/mol] plotted against time [ns], from 0.0 to 5.0 ns]
Berendsen’s thermostat (weak coupling with temperature bath)
$\mathbf{v} \leftarrow \mathbf{v}\sqrt{1 + \frac{\Delta t}{\tau}\left(\frac{f k T}{2 E_k} - 1\right)}$
$E_k = \frac{1}{2}\sum_{i=1}^{n} m_i\left(v_{xi}^2 + v_{yi}^2 + v_{zi}^2\right)$
f – number of degrees of freedom (3n)
τ – coupling parameter
Δt – time step
E_k – kinetic energy
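A sketch of the velocity scaling, assuming the scaling factor sqrt(1 + (Δt/τ)(f k T / (2 E_k) - 1)) and velocities expressed in units for which m v² comes out in kcal/mol. The masses and velocities are arbitrary illustrative numbers. With Δt = τ the kinetic energy is brought to f k T / 2 in a single step, which gives a convenient check.

```python
import math

K_B = 0.0019872  # Boltzmann constant per mole [kcal/(mol K)]


def kinetic_energy(masses, velocities):
    """E_k = 1/2 * sum_i m_i (vx^2 + vy^2 + vz^2)."""
    return 0.5 * sum(m * (vx * vx + vy * vy + vz * vz)
                     for m, (vx, vy, vz) in zip(masses, velocities))


def berendsen_scale(masses, velocities, target_temperature, dt, tau):
    """Return velocities rescaled by the weak-coupling factor."""
    f = 3 * len(masses)                       # degrees of freedom
    ek = kinetic_energy(masses, velocities)
    ratio = f * K_B * target_temperature / (2.0 * ek)
    scale = math.sqrt(1.0 + (dt / tau) * (ratio - 1.0))
    return [(scale * vx, scale * vy, scale * vz) for vx, vy, vz in velocities]


masses = [12.0, 16.0]                               # illustrative values
velocities = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]     # illustrative values
scaled = berendsen_scale(masses, velocities, 300.0, dt=1.0, tau=1.0)
print(kinetic_energy(masses, velocities), kinetic_energy(masses, scaled))
```

In practice τ is much larger than Δt, so each step removes only a small fraction of the temperature difference.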
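Langevin dynamics (for implicit solvent)
$m_i\frac{d^2 x_i}{dt^2} = -\frac{\partial V}{\partial x_i} - \gamma_i\frac{dx_i}{dt} + f_i^{rand}$
$\gamma_i = 6\pi\left(r_i + r_w\right)\eta_w$ (Stokes’ law)
$f_i^{rand} = \sqrt{\frac{2\gamma_i RT}{\Delta t}}\,N(0,1)$ (Wiener process)
Brownian dynamics
$\gamma_i\frac{dx_i}{dt} = -\frac{\partial E}{\partial x_i} + f_i^{rand}$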
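The Brownian dynamics equation can be integrated with a simple explicit Euler step, sketched below for one coordinate in reduced units (γ = 1, RT = 1). The harmonic energy E(x) = x²/2 and the Euler scheme are assumptions made for this illustration; for that model the equilibrium average of x² should be close to RT.

```python
import math
import random


def brownian_dynamics(grad_e, x0, gamma, rt, dt, n_steps, seed=0):
    """Overdamped dynamics: gamma * dx/dt = -dE/dx + f_rand, explicit Euler steps."""
    rng = random.Random(seed)
    x = x0
    positions = []
    for _ in range(n_steps):
        # random force with variance 2 * gamma * RT / dt, redrawn at every step
        f_rand = math.sqrt(2.0 * gamma * rt / dt) * rng.gauss(0.0, 1.0)
        x += (-grad_e(x) + f_rand) * dt / gamma
        positions.append(x)
    return positions


positions = brownian_dynamics(grad_e=lambda x: x, x0=0.0, gamma=1.0, rt=1.0,
                              dt=0.05, n_steps=100000)
mean_x2 = sum(x * x for x in positions) / len(positions)
print(f"<x^2> = {mean_x2:.3f} (close to RT = 1 for this model)")
```

Dividing the random-force variance by Δt makes the displacement per step scale as the square root of Δt, as expected for diffusion.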
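[Figure: time scales of protein motions on a logarithmic axis in seconds, from 10^-15 (femto) through 10^-12 (pico), 10^-9 (nano), 10^-6 (micro), and 10^-3 (milli) to 10^0; events marked, from fastest to slowest: bond vibration and all-atom MD step, sidechain rotation, loop closure, helix formation, folding of β-hairpins, protein folding]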
MD algorithm references:
1. Frenkel, D.; Smit, B. Understanding Molecular Simulation; Academic Press, 1996, Chapter 4.
2. Calvo, M. P.; Sanz-Serna, J. M. Numerical Hamiltonian Problems; Chapman & Hall: London, U. K., 1994.
3. Verlet, L. Phys. Rev. 1967, 159, 98.
4. Swope, W. C.; Andersen, H. C.; Berens, P. H.; Wilson, K. R. J. Chem. Phys. 1982, 76, 637.
5. Tuckerman, M.; Berne, B. J.; Martyna, G. J. J. Chem. Phys. 1992, 97, 1990.
6. Ciccotti, G.; Kalibaeva, G. Philos. Trans. R. Soc. London, Ser. A 2004, 362, 1583.
Regular and multiplexed replica exchange algorithm
• N replicas are simulated independently for a reasonably long time using standard canonical MC or MD
• Exchange of two neighboring replicas is attempted with the following probability:
$W(\mathbf{X}, \beta_m \mid \mathbf{Y}, \beta_n) = \begin{cases} 1 & \text{for } \Delta \le 0 \\ \exp(-\Delta) & \text{for } \Delta > 0 \end{cases}$
$\Delta = \left(\beta_m - \beta_n\right)\left[E(\mathbf{Y}) - E(\mathbf{X})\right]$
[Figure: regular and multiplexed replica exchange schemes]
Y. Rhee, V. Pande, Biophys. J. 84, 775, 2003
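The exchange criterion is short enough to sketch directly. Here X is the conformation currently at inverse temperature β_m and Y the one at β_n, with Δ = (β_m - β_n)[E(Y) - E(X)]; the function names and the numerical values are illustrative only.

```python
import math
import random


def exchange_probability(beta_m, beta_n, e_x, e_y):
    """Probability of swapping X (at beta_m) with Y (at beta_n)."""
    delta = (beta_m - beta_n) * (e_y - e_x)
    return 1.0 if delta <= 0.0 else math.exp(-delta)


def attempt_exchange(beta_m, beta_n, e_x, e_y, rng):
    """Return True if the swap is accepted."""
    return rng.random() < exchange_probability(beta_m, beta_n, e_x, e_y)


rng = random.Random(0)
# Colder replica (beta_m = 2) holds the higher-energy conformation: always swapped
print(exchange_probability(2.0, 1.0, e_x=-5.0, e_y=-10.0))
# Colder replica already holds the lower-energy conformation: swapped only rarely
print(exchange_probability(2.0, 1.0, e_x=-10.0, e_y=-5.0))
print(attempt_exchange(2.0, 1.0, -5.0, -10.0, rng))
```

Swaps that move the lower-energy conformation to the colder replica are always accepted, which lets conformations found at high temperature reach the low-temperature ensemble.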