lectures-week4

Transcript lectures-week4

Discussion topic for week 4: Protein folding
Levinthal's paradox presents an estimate for the time it would take
for a protein to fold assuming a minimum of two possible
conformations for each pair of amino acids.
For a 101 residue protein, there are 2^100 ~10^30 possible
conformations. If it takes 1 ps to sample each conformation, it
would take 10^18 s to sample the whole phase space to find the
absolute minimum of the free energy. This is longer than the age
of the universe, 5x10^17 s!
How do proteins manage to fold in seconds?
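The arithmetic behind the estimate can be checked with a few lines of Python. This is only a sketch of the numbers quoted above (two conformations per residue pair, 1 ps per sample, and the age of the universe taken as 5x10^17 s); the function and constant names are illustrative.

```python
# Levinthal estimate, using only the numbers quoted in the text above.
N_PAIRS = 100                # a 101-residue protein has 100 neighbouring pairs
STATES_PER_PAIR = 2          # minimum of two conformations per pair
TIME_PER_SAMPLE_S = 1e-12    # 1 ps to sample one conformation
AGE_OF_UNIVERSE_S = 5e17     # value quoted in the text


def levinthal_search_time(n_pairs=N_PAIRS, states=STATES_PER_PAIR,
                          dt=TIME_PER_SAMPLE_S):
    """Time in seconds needed to visit every conformation exactly once."""
    return float(states) ** n_pairs * dt


search_time = levinthal_search_time()
ratio = search_time / AGE_OF_UNIVERSE_S
print(f"conformations: {float(STATES_PER_PAIR) ** N_PAIRS:.2e}")
print(f"search time:   {search_time:.2e} s")
print(f"ratio to the age of the universe: {ratio:.1f}")
```

The exhaustive search time comes out of order 10^18 s, a few times the quoted age of the universe, which is the content of the paradox.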
Chemical Forces (Nelson, chap. 8)
Molecular machines in cells use chemical energy to function, and
most of the time they do chemical work, e.g. synthesize proteins.
To deal with this situation we need to consider more than one species
and allow exchange of particles as well as energy.
Let {Na, a=1, 2,…} denote the numbers of each species in a system
The entropy of the system is a function of this set: S(N1, N2, …)
We define the chemical potential for each species a as
$\mu_a = -T \left( \frac{\partial S}{\partial N_a} \right)_{E,\, N_b,\, b \neq a}$   (availability of particles)
Recall the definition of the temperature (modified from the fixed N case)
$\frac{1}{T} = \left( \frac{\partial S}{\partial E} \right)_{N_a}$   (availability of energy)
When two systems, A and B, exchange only energy, thermal equilibrium
is realized when $T_A = T_B$.
If they also exchange particles, their chemical potentials for each species
a must be equal as well:
$\mu_{A,a} = \mu_{B,a}$   (chemical equilibrium)
When two systems are not in chemical equilibrium, entropic forces
arising from the difference in chemical potentials drive the system to
equilibrium.
As a simple example consider an ideal gas. The entropy is given by
$S = k \ln \left[ \frac{\pi^{3N/2} (2 m E_K)^{3N/2} V^N}{(3N/2 - 1)!\, h^{3N}\, N!} \right]$   (Sackur-Tetrode formula)
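Rewrite S using Stirling's formula
$S = k \left[ \frac{3N}{2} \ln(2\pi m E_K) + N \ln V - \frac{3N}{2} \ln\frac{3N}{2} + \frac{3N}{2} - 3N \ln h - N \ln N + N \right]$
Differentiating with respect to N at fixed kinetic energy $E_K$:
$\left. \frac{\partial S}{\partial N} \right|_{E_K} = k \left[ \frac{3}{2} \ln(2\pi m E_K) + \ln V - \frac{3}{2} \ln\frac{3N}{2} - \frac{3}{2} + \frac{3}{2} - 3 \ln h - \ln N - 1 + 1 \right]$
$= \frac{3k}{2} \ln \left[ \frac{2\pi m}{h^2} \frac{2 E_K}{3N} \left( \frac{V}{N} \right)^{2/3} \right]$
Using $E_K = \frac{3}{2} N kT$ and the concentration $c = N/V$, this becomes
$\left. \frac{\partial S}{\partial N} \right|_{E_K} = \frac{3k}{2} \ln \left[ \frac{2\pi m kT}{h^2 c^{2/3}} \right]$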
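To obtain the chemical potential we need to keep the total energy
$E = E_K + N\epsilon$
fixed, where $\epsilon$ is the internal energy. This is achieved by
$\left. \frac{\partial S}{\partial N} \right|_{E} = \left. \frac{\partial S}{\partial N} \right|_{E_K} - \epsilon \left. \frac{\partial S}{\partial E_K} \right|_{N}$
The last term is just $-\epsilon/T$. Substituting, we obtain for the chemical potential
$\mu = -T \left[ \frac{3k}{2} \ln \left( \frac{2\pi m kT}{h^2 c^{2/3}} \right) - \frac{\epsilon}{T} \right] = \epsilon - \frac{3}{2} kT \ln \left( \frac{2\pi m kT}{h^2 c^{2/3}} \right)$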
Because we are interested in the number (or concentration) dependence
of the chemical potential, we separate that term:
$\mu = kT \ln\frac{c}{c_0} + \epsilon - \frac{3}{2} kT \ln \left( \frac{2\pi m kT}{h^2 c_0^{2/3}} \right) = kT \ln\frac{c}{c_0} + \mu^0(T)$
where $c_0$ is the reference concentration and
$\mu^0$ is the standard chemical potential.
The reference concentration is introduced for convenience; its choice
has no effect on the chemical potential. Convention:
For gases at STP: c0 = 1 mole/22 L = 0.045 M
For aqueous solutions: c0 = 1 mole/L = 1 M
Notation: [X] = cx/c0 (e.g., [X] = 1 refers to a 1 molar solution)
Rewrite the chemical potential as
$\frac{c}{c_0} = e^{(\mu - \mu^0)/kT}$   (activity)
For ideal gases, the activity is simply given by the relative concentration.
For solutions, the definition of $\mu^0$ is more complicated. But if we treat it
as a phenomenological parameter, we can use the same formulas for
dilute solutions.
We can generalize the chemical potential by including the potential
energy U of the particles in the internal energy:
$\mu^0 \to \mu^0 + U$
In the case of charged particles, $U = qV(r)$, and
$\mu = kT \ln\frac{c}{c_0} + \mu^0 + qV(r)$
is called the electrochemical potential.
Electrochemical equilibrium between two systems is achieved when $\mu_A = \mu_B$:
$kT \ln\frac{c_A}{c_0} + \mu^0 + qV_A = kT \ln\frac{c_B}{c_0} + \mu^0 + qV_B$
$q(V_A - V_B) = -kT \ln\frac{c_A}{c_B}$   (Nernst relation)
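As a numerical illustration of the Nernst relation, the sketch below evaluates V_A - V_B = -(kT/q) ln(c_A/c_B). The tenfold concentration ratio, the monovalent charge and the temperature of 298 K are assumptions chosen for the example, not values from the lecture.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C


def nernst_potential(c_a, c_b, q=E_CHARGE, temperature=298.0):
    """Return V_A - V_B in volts from q (V_A - V_B) = -kT ln(c_A / c_B)."""
    return -(K_B * temperature / q) * math.log(c_a / c_b)


# Assumed example: a monovalent cation ten times more concentrated on side B.
dv = nernst_potential(c_a=1.0, c_b=10.0)
print(f"V_A - V_B = {dv * 1e3:.1f} mV")
```

Only the concentration ratio enters, so any consistent concentration unit can be used. A positive result means side B sits at the lower potential, which is where the cations accumulate.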
Chemical reactions are controlled by the chemical potential of reactants:
high concentration or high internal energy means higher availability.
Generalization of the Boltzmann distribution for particle exchange:
Consider a small system “a” which can exchange both energy and
particles with a much larger system B. Fluctuations in $E_B$ and $N_B$ are
negligible but those in $E_a$ and $N_a$ could be large. As we have shown
before, the probability of “a” being in a state with $E_a$ and $N_a$ is
proportional to $\exp[S_B(E_B, N_B)/k]$:
$S_B(E_B, N_B) = S_B(E_{tot} - E_a, N_{tot} - N_a)$
$= S_B(E_{tot}, N_{tot}) - E_a \frac{\partial S_B}{\partial E_B} - N_a \frac{\partial S_B}{\partial N_B} + \dots$
$= S_B(E_{tot}, N_{tot}) - \frac{E_a}{T_B} + \frac{\mu_B N_a}{T_B}$
At equilibrium,
$T_a = T_B = T, \quad \mu_a = \mu_B = \mu$
Thus the probability of “a” being in a particular state j with $E_j$ and $N_j$ is
proportional to
$P(E_j, N_j) \propto e^{-(E_j - \mu N_j)/kT}$
Using the grand partition function,
$\mathcal{Z} = \sum_j e^{-(E_j - \mu N_j)/kT}$
the normalized probability becomes
$P(E_j, N_j) = \frac{1}{\mathcal{Z}} e^{-(E_j - \mu N_j)/kT}$
which is called the grand canonical distribution.
The number of particles “a” contains depends on $\mu$; the larger $\mu$ is, the
more particles “a” will have.
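The dependence of the particle number on the chemical potential can be seen in a minimal sketch of the grand canonical distribution. The two-state system used here (a single site that is either empty or holds one particle with energy -2 kT) is a hypothetical example, and energies are measured in units of kT.

```python
import math


def grand_canonical_probabilities(states, mu, kT=1.0):
    """states is a list of (E_j, N_j) pairs; returns P(E_j, N_j) for each."""
    weights = [math.exp(-(energy - mu * number) / kT)
               for energy, number in states]
    grand_z = sum(weights)          # grand partition function
    return [w / grand_z for w in weights]


def mean_particle_number(states, mu, kT=1.0):
    """Average N over the grand canonical distribution."""
    probs = grand_canonical_probabilities(states, mu, kT)
    return sum(p * number for p, (_, number) in zip(probs, states))


# Hypothetical system: empty site (E=0, N=0) or occupied site (E=-2 kT, N=1).
site = [(0.0, 0), (-2.0, 1)]
for mu in (-4.0, -2.0, 0.0):
    print(f"mu = {mu:+.1f} kT  ->  <N> = {mean_particle_number(site, mu):.3f}")
```

The mean occupancy rises monotonically with the chemical potential and passes through 1/2 when the chemical potential equals the site energy.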
Chemical reactions:
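As a simple case, consider a molecule which has two states with internal
energies $\epsilon_1$ and $\epsilon_2$ (e.g. an isomer). Assume $\Delta\epsilon = \epsilon_2 - \epsilon_1 > 0$.
The chemical potentials are
$\mu_i = kT \ln\frac{c_i}{c_0} + \mu_i^0, \quad \mu_i^0 = \epsilon_i + \dots, \quad i = 1, 2$
Chemical equilibrium: $\mu_1 = \mu_2$
$kT \ln\frac{c_1}{c_0} + \epsilon_1 + \dots = kT \ln\frac{c_2}{c_0} + \epsilon_2 + \dots$
$\Rightarrow \frac{c_2}{c_1} = e^{-\Delta\epsilon/kT}$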
Non-equilibrium cases:
1. $\mu_1 > \mu_2$: Reaction 1 → 2 proceeds (entropic forces do chemical work)
2. $\mu_1 < \mu_2$: Reaction 2 → 1 proceeds (chemical energy converted to heat)
To summarize, when two systems are at equilibrium:
• Temperatures and chemical potentials are equal, $T_1 = T_2$, $\mu_1 = \mu_2$
• Total entropy S is maximum
• Total free energy (F or G) is minimum
Example: Burning of hydrogen
$2H_2 + O_2 \rightleftharpoons 2H_2O$
Free energy change: $\Delta G = 2\mu_{H_2O} - 2\mu_{H_2} - \mu_{O_2}$
At equilibrium: $\Delta S = -\Delta G / T = 0$
Assuming ideal gas behaviour for all three participants, we can write
for the free energy change
$\Delta G = 2kT \ln\frac{c_{H_2O}}{c_0} + 2\mu^0_{H_2O} - 2kT \ln\frac{c_{H_2}}{c_0} - 2\mu^0_{H_2} - kT \ln\frac{c_{O_2}}{c_0} - \mu^0_{O_2}$
$= kT \ln \left[ \left( \frac{c_{H_2O}}{c_0} \right)^{2} \left( \frac{c_{H_2}}{c_0} \right)^{-2} \left( \frac{c_{O_2}}{c_0} \right)^{-1} \right] + 2\mu^0_{H_2O} - 2\mu^0_{H_2} - \mu^0_{O_2}$
$\Delta G = 0 \Rightarrow \frac{(c_{H_2O})^2 c_0}{(c_{H_2})^2 c_{O_2}} = e^{-(2\mu^0_{H_2O} - 2\mu^0_{H_2} - \mu^0_{O_2})/kT} = K_{eq}$
Here $K_{eq}$ is called the equilibrium constant of the reaction, and the ratio
$\frac{(c_{H_2O})^2}{(c_{H_2})^2 c_{O_2}} = \frac{K_{eq}}{c_0}$
is called the reaction quotient.
Often a log scale is used for $K_{eq}$:
$K_{eq} = 10^{-pK}, \quad pK = -\log_{10} K_{eq}$
For ideal gases (or dilute solutions), we can use the explicit expression
derived for the standard chemical potential
$\mu^0 = \epsilon - \frac{3}{2} kT \ln \left( \frac{2\pi m kT}{h^2 c_0^{2/3}} \right)$
$K_{eq} = e^{-(2\mu^0_{H_2O} - 2\mu^0_{H_2} - \mu^0_{O_2})/kT}$
$= e^{-(2\epsilon_{H_2O} - 2\epsilon_{H_2} - \epsilon_{O_2})/kT} \left( \frac{2\pi m_{H_2O} kT}{h^2 c_0^{2/3}} \right)^{3} \left( \frac{2\pi m_{H_2} kT}{h^2 c_0^{2/3}} \right)^{-3} \left( \frac{2\pi m_{O_2} kT}{h^2 c_0^{2/3}} \right)^{-3/2}$
$= e^{-(2\epsilon_{H_2O} - 2\epsilon_{H_2} - \epsilon_{O_2})/kT} \left[ \frac{h^2 c_0^{2/3}}{2\pi kT} \frac{m_{H_2O}^2}{m_{H_2}^2\, m_{O_2}} \right]^{3/2}$
At low T, the reaction favours H2O. As T increases, the H2 and O2 concentrations also increase.
From chemical data handbooks, $K_{eq}$ at room temperature is given by:
$K_{eq} = e^{-(2\mu^0_{H_2O} - 2\mu^0_{H_2} - \mu^0_{O_2})/kT} = e^{183}$
Clearly almost all the hydrogen will burn. Using the reaction quotient
$\frac{(c_{H_2O})^2 c_0}{(c_{H_2})^2 c_{O_2}} = \frac{[H_2O]^2}{[H_2]^2 [O_2]} = K_{eq}$
estimate the number of O2 molecules left from 1 mole of O2 gas, writing [x] for the remaining O2:
$\frac{[2]^2}{[2x]^2 [x]} \approx e^{183} \Rightarrow [x] = e^{-183/3} = 3 \times 10^{-27}$
$n(O_2) = [x] N_A = 3 \times 10^{-27} \times 6 \times 10^{23} \approx 0.002$
None at all!
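The estimate can be reproduced numerically. The sketch below follows the same simplification as the text: the reaction quotient reduces to 1/x^3 = K_eq, with ln K_eq = 183 taken from the handbook value quoted above. The function name is illustrative.

```python
import math

N_AVOGADRO = 6.022e23
LN_K_EQ = 183.0     # ln K_eq at room temperature, as quoted in the text


def leftover_o2(ln_k_eq=LN_K_EQ):
    """Relative O2 concentration x left at equilibrium.

    With [H2O] = 2, [H2] = 2x and [O2] = x, the reaction quotient
    [2]^2 / ([2x]^2 [x]) = 1 / x^3 equals K_eq, so x = K_eq^(-1/3).
    """
    return math.exp(-ln_k_eq / 3.0)


x = leftover_o2()
molecules_left = x * N_AVOGADRO
print(f"[x] = {x:.1e}")
print(f"O2 molecules left from one mole: {molecules_left:.4f}")
```

The expected number of surviving O2 molecules is far below one, in line with the conclusion that none are left.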
Generalization to arbitrary reactions:
Assume m species are involved in a reaction: k reactants and m-k products
$\nu_1 X_1 + \dots + \nu_k X_k \rightleftharpoons \nu_{k+1} X_{k+1} + \dots + \nu_m X_m$
where the $\nu_i$ are called the stoichiometric coefficients of the reaction.
The free energy difference is
$\Delta G = -\nu_1 \mu_1 - \dots - \nu_k \mu_k + \nu_{k+1} \mu_{k+1} + \dots + \nu_m \mu_m$
The reaction runs forward when $\Delta G < 0$ and backward if $\Delta G > 0$.
$\Delta G = 0$ corresponds to equilibrium. Again we separate the concentration
dependent part from the rest:
$\Delta G = -\nu_1 \left[ kT \ln\frac{c_1}{c_0} + \mu_1^0 \right] - \dots + \nu_m \left[ kT \ln\frac{c_m}{c_0} + \mu_m^0 \right]$
$= kT \left[ -\ln [X_1]^{\nu_1} - \dots + \ln [X_m]^{\nu_m} \right] - \nu_1 \mu_1^0 - \dots + \nu_m \mu_m^0$
Setting $\Delta G = 0$, we obtain
$\frac{[X_{k+1}]^{\nu_{k+1}} \cdots [X_m]^{\nu_m}}{[X_1]^{\nu_1} \cdots [X_k]^{\nu_k}} = e^{-\Delta G^0/kT} = K_{eq}$   (Mass action rule)
where $\Delta G^0$ is the standard free energy change
$\Delta G^0 \equiv -\nu_1 \mu_1^0 - \dots + \nu_m \mu_m^0$
The values of $\Delta G^0$ for formation of molecular species can be found in
chemistry handbooks (usually at standard conditions: 298 K and 1 atm).
When more than one reaction occurs at similar rates, there is a
separate mass action rule for each reaction, which implies relations
between the various $\mu_a$.
Reaction Kinetics:
Consider a typical reaction with rate constants $k_+$ and $k_-$
$X_2 + Y_2 \underset{k_-}{\overset{k_+}{\rightleftharpoons}} 2XY$
Intuitively we expect the forward and backward rates to be proportional to
the concentrations of the colliding molecules (first order in each):
$r_+ = k_+ c_{X_2} c_{Y_2}, \quad r_- = k_- (c_{XY})^2$
At equilibrium, $r_+ = r_-$:
$\frac{(c_{XY})^2}{c_{X_2} c_{Y_2}} = \frac{k_+}{k_-} = e^{-\Delta G^0/kT} = K_{eq}$   (mass action rule)
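A small numerical experiment illustrates how the rate equations lead to the mass action rule. The sketch integrates the single-step kinetics with a forward-Euler scheme until the concentrations stop changing. The rate constants, initial concentrations and step size are arbitrary assumptions chosen for illustration.

```python
def relax_to_equilibrium(k_plus, k_minus, c_x2, c_y2, c_xy,
                         dt=1e-3, steps=20000):
    """Forward-Euler integration of X2 + Y2 <-> 2XY with mass-action rates."""
    for _ in range(steps):
        net_rate = k_plus * c_x2 * c_y2 - k_minus * c_xy ** 2   # r+ - r-
        c_x2 -= net_rate * dt
        c_y2 -= net_rate * dt
        c_xy += 2.0 * net_rate * dt     # each forward event makes two XY
    return c_x2, c_y2, c_xy


K_PLUS, K_MINUS = 1.0, 0.5              # assumed rate constants
c_x2, c_y2, c_xy = relax_to_equilibrium(K_PLUS, K_MINUS, 1.0, 1.0, 0.0)
quotient = c_xy ** 2 / (c_x2 * c_y2)
print(f"(c_XY)^2 / (c_X2 c_Y2) = {quotient:.6f}")
print(f"k+ / k-                = {K_PLUS / K_MINUS:.6f}")
```

Once the net rate vanishes, the concentration ratio settles at k+/k-, independent of the starting concentrations.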
The above is true for single step reactions. For more complex reaction
mechanisms, concentration dependence of rates may be different.
In general, a reaction is of n'th order in species X if the rate depends on
its concentration as $(c_X)^n$.
An alternative 3-step mechanism for the previous reaction, which is
second order in X2 and zeroth order in Y2:
$X_2 + X_2 \rightleftharpoons 2X + X_2$   (slow, rate limiting step)
$X + Y_2 \rightleftharpoons XY_2$   (fast)
$X + XY_2 \rightleftharpoons 2XY$   (fast)
Each step must be in equilibrium:
$\frac{(c_X)^2 c_{X_2}}{(c_{X_2})^2 c_0} = K_{eq,1}, \quad \frac{c_{XY_2}\, c_0}{c_X c_{Y_2}} = K_{eq,2}, \quad \frac{(c_{XY})^2}{c_X c_{XY_2}} = K_{eq,3}$
Product:
$\frac{(c_{XY})^2}{c_{X_2} c_{Y_2}} = K_{eq,1} K_{eq,2} K_{eq,3} = K_{eq}$   (mass action rule is independent of mechanism)
Dissociation:
Salts, acids, bases and polar molecules readily dissolve in water because
the loss in potential energy is more than compensated by the interaction
of the charged parts with water molecules (charge-dipole and H-bond)
and gain in entropy.
Example: Dissociation of water
$H_2O \rightleftharpoons H^+ + OH^-$   (proton + hydroxyl)
From conductance measurements in pure water: $c_{H^+} = c_{OH^-} = 10^{-7}$ M
Mass action rule: $K_w = \frac{[H^+][OH^-]}{[H_2O]} = (10^{-7})^2 = 10^{-14}$   ($\Delta G^0 = 32\, kT$)
Adding an acid (e.g. HCl) increases [H+] and hence lowers [OH-]
Adding a base (e.g. NaOH) increases [OH-] and hence lowers [H+]
In chemistry, the amount of protons in a solution is described by its pH
$pH = -\log_{10} [H^+]$
• Pure water has pH = 7, which is called neutral pH
• Adding acids to water lowers pH. A solution with pH < 7 is called acidic
• Adding bases to water raises pH. A solution with pH > 7 is called basic
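The relations between pH, the ion product of water and the standard free energy change can be checked with a short sketch. It assumes K_w = 10^-14 as quoted above; the helper names are illustrative.

```python
import math

K_W = 1e-14     # [H+][OH-] in pure water, from the text


def ph_from_h(h_conc):
    """pH = -log10 [H+], with [H+] relative to 1 M."""
    return -math.log10(h_conc)


def oh_from_ph(ph):
    """[OH-] implied by the mass action rule [H+][OH-] = K_w."""
    return K_W / 10.0 ** (-ph)


def delta_g0_in_kt(k_eq):
    """Standard free energy change in units of kT, from K_eq = exp(-dG0/kT)."""
    return -math.log(k_eq)


print(f"pH of pure water:      {ph_from_h(1e-7):.1f}")
print(f"[OH-] at pH 3:         {oh_from_ph(3.0):.1e} M")
print(f"dG0 for dissociation:  {delta_g0_in_kt(K_W):.1f} kT")
```

The last line recovers the value of about 32 kT quoted for the dissociation of water.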
Common acidic and basic groups in organic molecules:
Carboxyl group: $-COOH \rightleftharpoons -COO^- + H^+$
Amine group: $-NH_3^+ \rightleftharpoons -NH_2 + H^+$
(protonated ⇌ deprotonated)
Of the 20 amino acids, aspartate and glutamate have acidic side chains
while arginine and lysine (and, more weakly, histidine) have basic side chains.
Probability of protonation of a side chain
$P_a = \frac{[COOH]}{[COOH] + [COO^-]} = \frac{1}{1 + [COO^-]/[COOH]}$
Equilibrium:
$\frac{[COO^-][H^+]}{[COOH]} = K_{eq,a} \Rightarrow \frac{[COO^-]}{[COOH]} = \frac{K_{eq,a}}{[H^+]}$
$P_a = \frac{1}{1 + K_{eq,a}/[H^+]}$
With $K_{eq,a} = 10^{-pK}$ and $[H^+] = 10^{-pH}$,
$P_a = \frac{1}{1 + 10^{-pK + pH}} = \frac{1}{1 + 10^{x_a}}, \quad x_a = pH - pK$
When pH = pK, $P_a = 1/2$.
Examples (at pH = 7):
Aspartic acid: $K_{eq} = 10^{-3.7} \Rightarrow P_a = 1/(1 + 10^{3.3}) \approx 0$   (has charge -e)
Arginine: $K_{eq} = 10^{-12.5} \Rightarrow P_a = 1/(1 + 10^{-5.5}) \approx 1$   (has charge +e)
The average charge on a side chain is determined by Pa
Acidic side chain: q = –e (1 – Pa)
Basic side chain: q = e Pa
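The protonation probability and the resulting average charge are easy to tabulate. The sketch below uses the pK values quoted in the examples (3.7 for aspartic acid, 12.5 for arginine) and assumes pH = 7; charges are expressed in units of e and the function names are illustrative.

```python
def protonation_probability(ph, pk):
    """P_a = 1 / (1 + 10^(pH - pK))."""
    return 1.0 / (1.0 + 10.0 ** (ph - pk))


def average_charge(ph, pk, acidic):
    """Mean side-chain charge in units of e.

    Acidic side chain: q = -(1 - P_a); basic side chain: q = +P_a.
    """
    p_a = protonation_probability(ph, pk)
    return -(1.0 - p_a) if acidic else p_a


PH = 7.0    # assumed solution pH
for name, pk, acidic in (("Asp", 3.7, True), ("Arg", 12.5, False)):
    p_a = protonation_probability(PH, pk)
    q = average_charge(PH, pk, acidic)
    print(f"{name}: P_a = {p_a:.6f}, <q> = {q:+.4f} e")
```

Sweeping the pH argument over a range such as 1 to 12 and summing the average charges of all side chains gives the kind of titration curve discussed next.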
Note that the pH of the solution controls the protonation state of a protein.
In titration experiments, the pH is varied over a wide range, e.g. 1-12.
When pH < pK of all the side chains, all are protonated (max + charge)
As pH increases and passes through the pK of an acidic side chain, q = 0 → -e
Beyond pH = 7, basic side chains start deprotonating, q = +e → 0
For pH > pK of all the side chains, all are deprotonated (max – charge)
Titration curve of ribonuclease. As pH is raised protein loses protons.
Electrophoresis:
As the titration curve indicates, apart from a critical pH value, proteins
carry a net charge and hence will move under an applied electric field.
This process is called electrophoresis.
A common application is separation of proteins, which is achieved by
setting the pH of the solution at the critical value of the protein we want to
separate and applying an electric field.
Varying pH and measuring the electrophoretic mobility, one can
determine the critical pH value precisely.
A famous example is Pauling’s finding of the cause of sickle-cell anemia.
Patients carry a defective hemoglobin that differs from the normal one by
a single point mutation, Glu → Val. Glu has charge -e (pK = 4.25), Val is neutral.
At pH = 6.9, the two proteins migrate in opposite directions!
Self-assembly of amphiphiles:
How do the cell membranes form?
Amphiphiles: molecules that have both hydrophilic (polar) and
hydrophobic (CH2 chain) parts (detergents, lipids)
[Figure: structures of sodium dodecyl sulfate (SDS) and phosphatidylcholine]
When detergents are added to oil-water mixtures, they form a boundary
between the two such that the polar head groups face water and
hydrophobic tails face oil
[Figure: oil-water interface stabilized by detergent; oil-water emulsion]
Micelle formation:
When detergent is added to pure water, the molecules form small spherical objects
just like in the emulsion case. The only difference is that the tails avoid
water by facing each other.
[Figure: osmotic pressure, P = ckT (McBain, 1944), with curves labelled N = 5 and N = 30]
Let the number of detergent molecules in a micelle be N, and denote
the concentration of micelles by cN and monomers by c1
The reaction is: (N monomers) ⇌ (micelle)
Mass action rule (MAR) at equilibrium: $c_N / (c_1)^N = K_{eq}$
$c_{tot} = c_1 + N c_N = c_1 \left[ 1 + N K_{eq} (c_1)^{N-1} \right]$
The experimentally measured quantity is the critical micelle concentration $c_*$,
which is defined as $c_{tot} = c_*$ when $c_{1*} = N c_{N*} = c_*/2$.
Substitute in MAR:
$\frac{c_*/2N}{(c_*/2)^N} = K_{eq} \Rightarrow N K_{eq} = (2/c_*)^{N-1}$
Substitute in $c_{tot}$:
$c_{tot} = c_1 \left[ 1 + (2 c_1 / c_*)^{N-1} \right]$
For $2c_1 \ll c_*$, $c_{tot} \approx c_1$, while for $2c_1 \gg c_*$, $c_{tot} \approx N c_N$
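To see how sharply the monomer concentration saturates, the relation between the total and monomer concentrations can be inverted numerically. The sketch below uses bisection, which is an implementation choice rather than part of the lecture; concentrations are in units of the critical micelle concentration, and N = 30 is taken from the figure label.

```python
def total_concentration(c1, c_star, n):
    """c_tot = c1 [1 + (2 c1 / c*)^(N-1)]."""
    return c1 * (1.0 + (2.0 * c1 / c_star) ** (n - 1))


def monomer_concentration(c_tot, c_star, n, iterations=200):
    """Invert total_concentration for c1 by bisection.

    total_concentration increases monotonically with c1 and c1 <= c_tot,
    so the root is bracketed by [0, c_tot].
    """
    low, high = 0.0, c_tot
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if total_concentration(mid, c_star, n) < c_tot:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


C_STAR, N_MICELLE = 1.0, 30
for c_tot in (0.1, 1.0, 10.0):
    c1 = monomer_concentration(c_tot, C_STAR, N_MICELLE)
    print(f"c_tot = {c_tot:5.1f} c*  ->  c1 = {c1:.4f} c*")
```

Below the critical concentration nearly all detergent stays monomeric, while well above it the monomer concentration stays pinned slightly above c*/2 and the excess goes into micelles.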
Coarse-grained models of lipid aggregation:
United atom models of lipids
Micelle formation (Klein et al. 2004)
Bilayer formation (Marrink et al. 2001)
Cooperative transitions in macromolecules (Nelson, chap. 9)
Biological molecules usually have two distinct conformations:
random coil form of the polypeptide chain and a folded compact form.
Examples:
• Helix-coil transition in a simple amino acid chain
• Full folding of a protein from random coil to a compact 3D structure
• An extreme example is the condensation of DNA, where the full
length of about 1 m is squeezed into a micron size nucleus.
An important parameter in characterizing the elasticity of polymers is the
persistence length, which determines the length scale for bending of the
chain of molecules. Persistence length is typically about 1 nm for
flexible polymers. In contrast, it is about 50 nm in DNA,
which is comparatively rigid.
Elasticity model of polymers:
If we model polymers as a continuous elastic object, there are three
possible deformations: a) bending, b) stretching, c) twisting (torsion)
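Because the covalent bonds in polymers are quite rigid and the torsional
motion is restricted, only the bending deformation is important:
$dE = \frac{1}{2} kT A \beta^2 ds$, where $\beta = \frac{d\hat{t}}{ds}$, $|\beta| = \frac{d\theta}{ds} = \frac{1}{R}$
Writing the bend constant A as the persistence length $L_p$,
$E = \frac{1}{2} kT L_p \int_0^{L_{tot}} \beta^2 ds \Rightarrow E = \frac{1}{2} kT L_p \frac{1}{R^2} \frac{2\pi R}{4} = \frac{\pi L_p}{4R} kT$   (for ¼ circle)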
Stretching of DNA (experimental data from DNA of lambda phage)
A force of a few pN is sufficient to fully stretch DNA from a random coil. At
65 pN DNA takes another form, where the backbone is straightened.
Two-state model of DNA stretching (freely jointed chain model in 1D)
Assume DNA consists of N segments of length Ls, which can be oriented
in +z or –z direction. Apply a force f in the z direction to stretch it.
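The corresponding potential is U = -fz, where z is the DNA length given by
$z = L_s \sum_{i=1}^{N} \sigma_i$, with $\sigma_i = \pm 1$
The probability of a particular configuration $[\sigma_1, \dots, \sigma_N]$ is given by the
Boltzmann factor
$P(\sigma_1, \dots, \sigma_N) = \frac{1}{Z} e^{f L_s \sum_{i=1}^{N} \sigma_i / kT}$
where Z is the partition function, the sum of the Boltzmann factors over all configurations:
$Z = \sum_{\sigma_1 = \pm 1} \cdots \sum_{\sigma_N = \pm 1} e^{f L_s \sum_{i=1}^{N} \sigma_i / kT}$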
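Average DNA extension under a load f
$\langle z \rangle = \sum_{\sigma_1 = \pm 1} \cdots \sum_{\sigma_N = \pm 1} P(\sigma_1, \dots, \sigma_N)\, z$
$= \frac{1}{Z} \sum_{\sigma_1 = \pm 1} \cdots \sum_{\sigma_N = \pm 1} e^{f L_s \sum_{i=1}^{N} \sigma_i / kT}\, L_s \sum_{i=1}^{N} \sigma_i$
$= kT \frac{d}{df} \ln \left[ \sum_{\sigma_1 = \pm 1} \cdots \sum_{\sigma_N = \pm 1} e^{f L_s \sum_{i=1}^{N} \sigma_i / kT} \right]$
$= kT \frac{d}{df} \ln \left[ \left( \sum_{\sigma_1 = \pm 1} e^{f L_s \sigma_1 / kT} \right) \cdots \left( \sum_{\sigma_N = \pm 1} e^{f L_s \sigma_N / kT} \right) \right]$
$= kT \frac{d}{df} \ln \left( e^{f L_s / kT} + e^{-f L_s / kT} \right)^N$
Taking the derivative with respect to f gives
$\langle z \rangle = N L_s \frac{e^{f L_s / kT} - e^{-f L_s / kT}}{e^{f L_s / kT} + e^{-f L_s / kT}}$
Introducing $L_{tot} = N L_s$,
$\frac{\langle z \rangle}{L_{tot}} = \tanh \frac{f L_s}{kT}$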
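The limiting cases:
1) High force ($f \gg kT/L_s$): $\langle z \rangle \to L_{tot}$
2) Low force ($f \ll kT/L_s$): $\langle z \rangle \approx L_{tot} \frac{f L_s}{kT} = \frac{f}{\kappa}$, with $\kappa = \frac{kT}{L_{tot} L_s}$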
At low force, a polymer behaves like a spring, obeying Hooke’s law
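The two limits can be checked numerically with a minimal sketch of the 1D two-state result, relative extension = tanh(f Ls/kT). The value kT = 4.1 pN nm for room temperature and the total length of 16000 nm are assumptions made for the example; the segment length of 35 nm is one of the fit values quoted for the lambda-phage data.

```python
import math

KT_PN_NM = 4.1      # assumed thermal energy at room temperature, pN nm


def relative_extension(force_pn, segment_nm, kt=KT_PN_NM):
    """<z> / L_tot = tanh(f L_s / kT) for the 1D freely jointed chain."""
    return math.tanh(force_pn * segment_nm / kt)


def spring_constant(total_length_nm, segment_nm, kt=KT_PN_NM):
    """Low-force Hooke constant kT / (L_tot L_s), in pN per nm."""
    return kt / (total_length_nm * segment_nm)


SEGMENT_NM = 35.0
for force in (0.01, 0.1, 1.0):
    ext = relative_extension(force, SEGMENT_NM)
    print(f"f = {force:4.2f} pN  ->  <z>/L_tot = {ext:.4f}")

# Assumed total length of 16000 nm, used only to illustrate the spring constant.
print(f"spring constant: {spring_constant(16000.0, SEGMENT_NM):.2e} pN/nm")
```

At 0.01 pN the extension follows the linear Hooke regime, whereas at 1 pN the chain is already essentially fully stretched, consistent with the statement that a few pN suffice.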
Comparison of theory with experimental data from lambda phage
[Figure: fitted curves; segment lengths quoted in the figure are Ls = 35 nm and Ls = 104 nm]
Long-dash curve: 1D cooperative chain model (includes elastic energy)
Short-dash curve: 3D freely jointed chain model
Helix-coil transition (experimental data from an artificial polypeptide)
• At a critical temperature polypeptide makes a transition from coil to helix
• Transition is sharpened with the number of residues (cooperative effects)
Energetics of helix-coil transition
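The free energy change in the transition is given by
$\Delta G = \Delta E_{bond} - T \Delta S_{tot}$
$\Delta E_{bond} = E_{helix} - E_{coil}$
$\Delta S_{tot} = \Delta S_{bond} + \Delta S_{conf}$
From experiments:
$\Delta E_{bond} > 0, \quad \Delta S_{conf} \approx -k \ln(3 \times 3) < 0, \quad \Delta S_{bond} > 0, \quad \Delta S_{tot} > 0$
Introduce a parameter
$\alpha = \frac{\Delta E_{bond} - T \Delta S_{tot}}{-2kT}$
which measures the favourability of extending the helix.
When $\alpha$ vanishes, extending the helix by one unit makes no change in
the free energy:
$\alpha = 0 \Rightarrow T_m = \Delta E_{bond} / \Delta S_{tot}$
$\alpha = \frac{\Delta E_{bond}}{2k} \left( \frac{1}{T_m} - \frac{1}{T} \right) = \frac{\Delta E_{bond}}{2k} \frac{T - T_m}{T\, T_m}$
Using a cooperative 1D freely jointed chain model, one obtains
$\theta = C_1 + \frac{C_2 \sinh\alpha}{\sqrt{\sinh^2\alpha + e^{-4\gamma}}}$
The curves in the previous figure are obtained by fitting this expression
to the data points.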
Protein folding:
• Primary sequence determines the folded structure
• The free energy gain from folding is about 20 kT
• Loss of entropy is compensated by H-bond formation and especially
hydrophobic interactions (Kauzmann, 1950s).
• Changes in the environment can lead to denaturation (unfolding) of
proteins. For example, proteins unfold
  - at both high (T > 50 °C) and low (T < 20 °C) temperatures
  - in nonpolar solvents
  - in the presence of small amounts of surfactants
MD simulations of protein unfolding at high temperatures (Daggett et al.)
[Figure: unfolding of engrailed homeodomain; folding time at 298 K, ~1 ms]
[Figure: unfolding of chymotrypsin inhibitor (Daggett et al.)]
Potential energy landscapes for protein folding:
a) Flat (i.e. Levinthal)
b) Ant trail
c) Smooth funnel
d) Rugged funnel
[Figure: rugged protein folding pathways from lattice calculations (Dill et al.)]
